I'm testing a real-time win probability site again this week. The current state of each ongoing game will be reported along with the probability of winning for each team. The probabilities are based on over 2000 games from the past 8 regular seasons. Last week, the model was fairly basic and simply considered score, time, and possession. This week, modifications for field position and seconds are included. You can think of each probability as "if the team with the ball had a first down at their current field position, this would be their chance of winning."
Next week, I hope to include down and distance to go. I'm also planning to include a timeline graph of each game's win probabilities.
Readers of this site are invited to check out the beta version during the games today. Be warned there will certainly be hiccups, bugs, and other problems throughout the day. So if it's not working one minute, it may be back up the next. It goes live shortly after the 1 o'clock kickoffs today. Comments and suggestions are more than welcome!
The link is wp.advancednflstats.com.
In-Game Win Probabilities Beta 0.2
By Brian Burke, published on 10/05/2008 in win probability
I'm assuming there isn't a variable for who is getting the ball at halftime. With Tennessee/Baltimore at the half, the win pct. is 50/50, but I would assume the team that gets the ball at halftime, in a tie game, would go on to win more than the other team. Maybe this is something that could be added down the line?
You're exactly right. It should be about 0.53 for BAL right now because they're due to receive after halftime.
For now, my program doesn't have a memory. Every game update just grabs the current state of the game, so it doesn't know who's going to have the kickoff.
When I start doing graphs, I'll establish a method of storing game states, so I hope to figure out a good way to identify and store who got the opening kickoff.
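To make the "memory" idea concrete, here is a rough Python sketch of storing game states. The names `GameState` and `GameHistory` and all of their fields are hypothetical, not the site's actual code. It assumes the first snapshot is grabbed at the opening kickoff, and that the team that did not receive the opening kickoff receives to start the second half.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class GameState:
    # One snapshot of a game, as grabbed on each update (field names are assumed).
    seconds_left: int
    home_score: int
    away_score: int
    possession: str  # "home" or "away"


@dataclass
class GameHistory:
    states: List[GameState] = field(default_factory=list)
    opening_receiver: Optional[str] = None

    def record(self, state: GameState) -> None:
        # Assumption: the first snapshot seen is at the opening kickoff,
        # so possession at that moment identifies the receiving team.
        if self.opening_receiver is None:
            self.opening_receiver = state.possession
        self.states.append(state)

    def second_half_receiver(self) -> Optional[str]:
        # The team that did not receive the opening kickoff (usual case).
        if self.opening_receiver is None:
            return None
        return "away" if self.opening_receiver == "home" else "home"
```

The stored list of states is also what a timeline graph would be drawn from.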
Also, I fixed some bugs since kickoff today. When the ball was inside the 20, there were 1 and 0 probs, and the halftime probs were buggy too. I think both are fixed now.
One other small flaw is that the program doesn't know whether the extra point has been kicked yet. So if a TD is scored, and it's 6-0 before the XP, the model will report the probabilities for a 6-0 lead, when in reality the XP is nearly a given, and there's a .99 probability that the score will be 7-0 in a moment. The fix is very complex, so I'm going to put that off until I do a few other bigger things.
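The hard part is knowing that an extra point is still pending. If that were known, the adjustment itself could be a simple weighted average of the two possible outcomes. A minimal Python sketch, where the function name and the example inputs are hypothetical and only the .99 figure comes from the comment above:

```python
def wp_before_extra_point(wp_if_made: float,
                          wp_if_missed: float,
                          p_make: float = 0.99) -> float:
    """Blend the WP for the two possible scores after a pending extra point.

    p_make defaults to the .99 figure quoted above; the two WP inputs
    would come from the model's lookups for the 7-0 and 6-0 states.
    """
    return p_make * wp_if_made + (1.0 - p_make) * wp_if_missed
```

For example, with made-up inputs of 0.70 for a 7-0 lead and 0.66 for a 6-0 lead, the blended value is 0.99 * 0.70 + 0.01 * 0.66 = 0.6996, which is very close to the 7-0 number.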
Can you go back and look up what the win probability was for the Colts before the 21 point comeback? 17 down without the ball with 5 to play has to be slim odds.
.03!
That fumble is what did it. One of the cool things I want to do is have a 'play of the game' kind of thing. It would be the play that had the biggest swing in WP. That fumble return might be the play of the week.
But it could be more obscure plays like the phantom roughing the passer call on Terrell Suggs to let the Titans stay alive on their game-winning TD drive. Without that call they would have been 4th and very long in their own territory with just a few minutes to go. Terrible call. It probably took the WP for Baltimore from .95 to .60.
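Once the WP after each play is stored, finding the "play of the game" is just a search for the largest change between consecutive values. A small Python sketch, where `biggest_swing` is a hypothetical helper and the input is assumed to be one team's WP after each play, in order:

```python
from typing import List, Tuple


def biggest_swing(wp_series: List[float]) -> Tuple[int, float]:
    """Return (index of the play, signed WP change) for the largest swing.

    The index refers to the later of the two values being compared.
    """
    if len(wp_series) < 2:
        raise ValueError("need at least two WP values")
    # Change in WP from each value to the next one.
    deltas = [after - before for before, after in zip(wp_series, wp_series[1:])]
    idx = max(range(len(deltas)), key=lambda i: abs(deltas[i]))
    return idx + 1, deltas[idx]
```

With a made-up series like `[0.90, 0.95, 0.60, 0.55]`, this picks out the drop from .95 to .60 as the biggest swing.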
Thanks, linked it here
http://www.stampedeblue.com/2008/10/5/628867/how-unlikely-was-the-colt
When you are calculating the WP, are you using the exact score difference, or greater than/equal to score difference?
For example, I'm sure there aren't too many cases where a team is down 16 points with 5 to go, but there are lots more when team is down at least 16 points with 5 to go.
I'm curious why you chose the method you did. I'll be interested to keep tracking it throughout the season.
Good question. Yes, there are a lot of cases in which the historical sample size is extremely low. In those cases, I used a couple of different techniques to estimate the WP.
First, I looked at the data graphically and attempted to minimize noise in low-sample cases by smoothing the data trends. Specifically, I looked at the WP for a single point difference over each time remaining. I mostly used second-order moving averages to smooth the curves.
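As a rough illustration of the smoothing idea, here is a Python sketch. The exact form of the "second-order" moving average isn't spelled out here, so this uses a plain centered moving average as a stand-in; the function name and the default window size are assumptions.

```python
from typing import List


def centered_moving_average(values: List[float], window: int = 3) -> List[float]:
    """Smooth a WP curve (one value per time bucket) with a centered window.

    Near the ends, the window shrinks to whatever neighbors exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

A noisy made-up curve like `[0.5, 0.8, 0.5, 0.8, 0.5]` comes out as roughly `[0.65, 0.6, 0.7, 0.6, 0.65]`, which flattens the jumps that come from small samples.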
Second, in many odd score-difference cases, such as 5-point leads and 12-point leads, I used some extrapolation between high-sample cases. For example, there are very few cases of 12-point differences, but lots of 11- and 13-point leads. In these cases, I extrapolated between the high-sample cases.
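Here is a small Python sketch of estimating a rare lead from its well-sampled neighbors. It assumes a straight-line estimate between the nearest known leads on either side (strictly speaking, interpolation), which may not match the exact technique used. The function name and the WP values in the example are made up.

```python
from typing import Dict


def interpolate_wp(lead: int, known_wp: Dict[int, float]) -> float:
    """Estimate WP for a lead from the nearest well-sampled leads around it.

    known_wp maps a point lead to its WP at one fixed time remaining.
    """
    if lead in known_wp:
        return known_wp[lead]
    lower = [k for k in known_wp if k < lead]
    upper = [k for k in known_wp if k > lead]
    if not lower or not upper:
        raise ValueError("lead is outside the range of known values")
    lo, hi = max(lower), min(upper)
    weight = (lead - lo) / (hi - lo)
    return known_wp[lo] + weight * (known_wp[hi] - known_wp[lo])
```

With made-up values of 0.80 for an 11-point lead and 0.86 for a 13-point lead, a 12-point lead comes out at 0.83.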
There are lots of ways to build a WP model for football, and they all have their advantages and flaws. This is just my first attempt, and others may prefer different approaches.
My priority is to preserve the unique flow of the game depending on particular score differences. For instance, 4-point differences are categorically different than 3-point leads because a TD is required rather than a FG. And I realize that extrapolation sacrifices some of the unique character of certain leads, but I think it's the best method available for now.
There is one greater than/less than part to the model. Over the past 8 full seasons, no team has ever come back from more than 21 points down at any point in a game. So I cut off the model at over 21 points.
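One way to read that cutoff, sketched in Python: every score difference beyond 21 points is lumped into a single "more than 21" bucket on each side before the lookup. The function name and the choice of 22 as the bucket label are assumptions for illustration.

```python
def bucket_score_diff(score_diff: int, cutoff: int = 21) -> int:
    """Collapse any lead or deficit beyond the cutoff into one bucket.

    Differences within the cutoff pass through unchanged; anything larger
    is mapped to cutoff + 1 (or its negative) so it shares one table entry.
    """
    limit = cutoff + 1
    return max(-limit, min(limit, score_diff))
```

So a 28-point lead and a 35-point lead would both be looked up as 22, while a 21-point lead stays as 21.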
With a 2000+ game data set, there are not too many cases that require extrapolation. Plus, by definition those are the cases that rarely occur in games, so we don't expect to see them very often. Still, they need to be addressed somehow.
I'll put together a full brief on the way the WP model works. It's not finished yet, but it looks very promising.