## In-Game Win Probabilities Beta

Today I'm testing a real-time win probability site. The status of each ongoing game will be reported along with the probability of winning for each team. The probabilities are based on over 2000 games from the past 8 regular seasons. For now, the model is fairly basic and considers score, time, and possession. Modifications for field position, down and distance, and other factors are in the works. It's an extremely challenging project, to say the least. But even now, the model is very revealing.

Readers of this site are invited to check out the beta version during today's games. Be warned: there will certainly be hiccups, bugs, and other problems throughout the day. So if it's not working one minute, it may be back up the next. It goes live shortly after the 1 o'clock kickoffs today. Comments and suggestions are more than welcome!

(Be patient. It's a very slow homemade server.)

### 25 Responses to “In-Game Win Probabilities Beta”

1. Anonymous says:

Looking forward to your project. Any adjustment for the game point spread? Clearly teams favored by 7 vs. getting 7 should be lined differently at most any point in the game.

BlueCards

2. Anonymous says:

Looks great. Really looking forward to watching how this works. Good work.

3. Brian Burke says:

Down for a bit. Back up as of 1:28pm.

4. Brian Burke says:

Still working.

It's very mobile-friendly, by the way. I'm headed out, but I'll be checking in with my Blackberry.

5. Brian Burke says:

Probabilities are incorrect at halftime.

6. Anonymous says:

Would it be useful or merely confusing to present confidence intervals?

7. Brian Burke says:

Down for an hour, but back up.

8. Brian Burke says:

Nanker: Nothing against confidence intervals, but it'd be a lot more math. That's why I have the "based on x games" line in there, as a poor substitute.

9. Nanker says:

2*sqrt(p*(1-p)/n) doesn't seem like that much math, but you're probably right that it's not worth the effort. n pretty much tells us all we need to know.
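For reference, the expression above is the usual normal-approximation margin of error for a proportion, with 2 standing in for 1.96. A minimal sketch, assuming `p` is the observed win fraction and `n` is the number of matching historical games (the function name is made up for illustration):

```python
import math

def win_prob_margin(p, n):
    """Approximate 95% margin of error for a win fraction p seen in n games."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 2 * math.sqrt(p * (1 - p) / n)

# Illustrative numbers only: a 0.60 win fraction over 25 comparable games.
p, n = 0.60, 25
margin = win_prob_margin(p, n)
low, high = max(0.0, p - margin), min(1.0, p + margin)
print(f"{p:.2f} +/- {margin:.2f} -> [{low:.2f}, {high:.2f}]")
```

This approximation gets unreliable when n is small or p is close to 0 or 1, which is exactly where the "based on x games" line matters most.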

10. Brian Burke says:

Good point. Thanks for the suggestion. I wasn't aware you don't need standard deviation for dichotomous variables to compute conf intervals.

By next week I'll have a much better probability model, so there won't be lots of "Based on 1 games" estimates.

Today was weird though. Odd scores such as 15-6 in the 1st quarter, and others that very rarely occur.

11. Brian Burke says:

One other problem--when a TD or FG occurs, the win probabilities flip. I might be able to fix that by the night game.

12. Nanker says:

I've spotted a couple of odd results, but it sounds like you're still tweaking the setup so I will wait till next week to pipe up. I did want to ask if the win probabilities are modelled or empirical?

13. Brian Burke says:

Purely empirical at this point. Just historical comparisons. For next week, I'm going to fix the bugs (halftime, immediately after scores), and add some smoothing and field position adjustments.

Then, I'll add adjustments for down/distance and other stuff.

Thanks for the feedback.

14. Anonymous says:

I think the Vikings' win% would go up if they tried to get Peterson the ball somehow. They didn't need to run more, but they do need to get Peterson the ball more. Why can't he do what Philly did when he was their offensive coordinator?

15. Anonymous says:

I agree, Brad Childress should be fired. He sucks.

16. Anonymous says:

I'm the worst coach in the NFL. Im also mentally retarded. I should get fired. I dont know how to coach. I try to lose most of the time.

17. Anonymous says:

If that was the real Childress, at least he knows he sucks.

18. Anonymous says:

Nice pick on Dallas

19. Brian says:

Brian you should check out fangraphs.com. They do a similar thing for baseball games, and they show the game probabilities in graphical format. Obviously you're just starting this thing up, but it might give you some ideas.

20. Brian Burke says:

Thanks. Yeah, I like that site. Doing the graphs is a little over my head though. I'm not a coder or web developer at all, but that's where I'd like to go eventually. I really like how they highlight particular plays that caused big swings in WP.

Protrade.com did this for NFL games in some form a couple years ago. ESPN.com even used their graphs in the game recaps. But they gave up on it.

Football is many times more difficult to model than baseball, so most of my efforts are going into the modeling for now.

21. Nanker says:

"I'm going to fix the bugs (halftime, immediately after scores) add some smoothing, field position adjustments.

Then, I'll add adjustments for down/distance and other stuff."

Are you going to make the details available? I know you usually publish all the nitty gritty, but some people seem to like keeping this stuff to themselves.

Right now, I have a win probability model using drive charts from 5 seasons as the dataset. Basically, it gives the win prob[*] at 1st and 10 given field position, time remaining, and score differential. That's about as much as you can do with drive charts, but I'm guessing you're using play by play as your dataset, so you can do a lot more. I'm hoping that you'll publish the details so I can set up a spreadsheet to output the results.

[* Win probability is not the only response variable I can use -- I've also got models to output probability of a TD, FG, score (of any kind), or safety -- all logistic regression models. I also have an Expected Pts model. These are all neat and useful, but the only problem is they only work for 1st and 10 situations. Play-by-play is required for within-possession modeling.]

22. Brian Burke says:

Yes, I'll publish the details, at least explain the process as best I can.

Yes, I'm using play by play, but I'm starting with simply score difference, possession, and minutes remaining. The problem is, just with those simple starting points, and even for a 2000+ game data set, the sub-sample sizes can get extremely small for empirical analysis. (We saw that yesterday quite a bit.)
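To illustrate the kind of purely empirical lookup described here, a rough sketch follows. It assumes each historical snapshot is reduced to score difference, whole minutes remaining, possession, and whether that team went on to win. The data layout and function names are invented for the example, not taken from the actual site.

```python
from collections import defaultdict

def build_table(snapshots):
    """Map (score_diff, minutes_left, has_ball) -> [wins, games]."""
    table = defaultdict(lambda: [0, 0])
    for score_diff, minutes_left, has_ball, won in snapshots:
        cell = table[(score_diff, minutes_left, has_ball)]
        cell[0] += int(won)
        cell[1] += 1
    return table

def win_prob(table, score_diff, minutes_left, has_ball):
    """Return (win fraction, sample size), or (None, 0) if nothing matches."""
    wins, games = table.get((score_diff, minutes_left, has_ball), (0, 0))
    if games == 0:
        return None, 0
    return wins / games, games

# Toy, made-up snapshots: (score_diff, minutes_left, has_ball, won)
snapshots = [
    (3, 10, True, True),
    (3, 10, True, True),
    (3, 10, True, False),
    (-7, 2, False, False),
]
table = build_table(snapshots)
wp, n = win_prob(table, 3, 10, True)
print(f"WP {wp:.2f}, based on {n} games")
```

Every extra key in the tuple (field position, down, distance) multiplies the number of cells, which is why the per-cell sample sizes shrink so quickly.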

First downs are keystones of the model because there are fewer 2nd downs, and even fewer 3rd and 4th downs. Plus, 2nd, 3rd, and 4th downs have varying 'to go' distances.

(I also did some extrapolation for seconds remaining yesterday, which is why you might have noticed some strange things like a .91 WP "Based on 5 games.")

So to break it into field position and then down/distance situations dices up the data into ever-more microscopic bits.

So here's my plan (copyrighted, of course!): split field position into 20-yard chunks or so (the chunks don't have to be equal in size), and calculate the WP for each chunk for each score difference/time remaining. Then I'll do some smoothing to reduce the (large amount of) noise. (I'm just planning to do a moving average for now, but I know there are some very sophisticated methods out there for smoothing.)
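A simple moving average of this kind might look like the sketch below. It assumes the smoothing runs across adjacent field-position chunks for a fixed score difference and time remaining; the plan doesn't say which dimension gets smoothed, so that part is a guess, and the WP values are made up.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Made-up raw WP per 20-yard chunk, own goal line to opponent's goal line
raw_wp = [0.40, 0.55, 0.45, 0.60, 0.70]
print(moving_average(raw_wp))
```

This version treats every chunk equally. Weighting each chunk by its number of games would be a natural refinement, since some cells rest on only a handful of games.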

Then I'd interpolate between the center points of the chunks. Say the ball is on a team's own 38. That's 12 yds short of the midpoint of the 40-to-40 chunk (the 50-yard line), and 8 yds beyond the midpoint of the 20-to-40 chunk (the team's own 30). That will help with smoothing too.
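The own-38 example works out as a straight linear interpolation: 8 of the 20 yards from one midpoint to the next, so a 60/40 blend of the two chunk values. A sketch, assuming field position is measured in yards from the offense's own goal line and using made-up chunk WPs:

```python
def interpolate_wp(yardline, midpoints, chunk_wp):
    """Linearly interpolate WP between chunk midpoints.

    yardline: yards from the offense's own goal line (0-100).
    Outside the first/last midpoint, the nearest chunk's value is used.
    """
    if yardline <= midpoints[0]:
        return chunk_wp[0]
    if yardline >= midpoints[-1]:
        return chunk_wp[-1]
    for i in range(len(midpoints) - 1):
        left, right = midpoints[i], midpoints[i + 1]
        if left <= yardline <= right:
            t = (yardline - left) / (right - left)
            return (1 - t) * chunk_wp[i] + t * chunk_wp[i + 1]

midpoints = [10, 30, 50, 70, 90]           # centers of five 20-yard chunks
chunk_wp = [0.40, 0.50, 0.60, 0.70, 0.80]  # made-up values
print(interpolate_wp(38, midpoints, chunk_wp))
```

Unequal chunk sizes work the same way, as long as `midpoints` stays sorted.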

For down and distance, I'd do some Markov stuff. Say it's 3rd and 3 and it's a 50/50 shot of converting. I'd compute:

.5 * having a 1st and 10 3+ yds down the field

plus

.5 * kicking (either a FG or giving the other team possession 38 yds down the field). [I'd also need some way of accounting for end-game 4th down desperation.]
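In code, that 3rd-down calculation is just a probability-weighted average of the WP in the two states the play can lead to. A sketch with invented numbers, where both WP inputs are from the offense's point of view and would come from the 1st-and-10 tables:

```python
def third_down_wp(p_convert, wp_first_down, wp_after_kick):
    """Blend the WP of the two outcomes, weighted by their probabilities."""
    return p_convert * wp_first_down + (1 - p_convert) * wp_after_kick

# Made-up WP values for the two resulting states
print(third_down_wp(0.5, wp_first_down=0.62, wp_after_kick=0.50))
```

The same blend extends to more than two outcomes (turnover, sack, penalty) as long as the outcome probabilities sum to 1.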

Time outs and kick return possibilities are far off in the distance for now.

23. Anonymous says:

Brian,
You certainly have my attention. I'll be sure to check in during next week's games. Good luck!

24. Brian Burke says:

Oops. Nothing for overtime in my code. Tune in next week. Offline until then. You might want to check in on Saturday. I'll be using some college games as dress rehearsals. I hope to have a far more sophisticated model in place by then.

25. Anonymous says:

Brian, after using your picks for each game this week, I think you have a solid system. I find these reads interesting and in depth. I'm surprised you haven't been offered a job setting the Vegas lines, haha.