In-Game Win Probability Engine 0.4

Improvements to the (near) live win probability scoreboard this week include a model for overtime and a fix for some halftime bugs. The site will also update more quickly and feature some minor cosmetic upgrades. There are also a number of behind-the-scenes improvements in the programming. The link is wp.advancednflstats.com. You can see graphs of last week's games here and week 6 games here. Feedback is always more than welcome. I'm currently drafting a post explaining how it works in case anyone is interested. Also, feel free to send in your suggestion for play of the week.

9 Responses to “In-Game Win Probability Engine 0.4”

1. Anonymous says:

Is there any way you can calculate expected points from a particular drive?

2. Brian Burke says:

Yeah, it would be easy. Great idea. Could also add probability of a 1st down for the current series.

3. Brian Burke says:

The play of the week might be another Rivers interception. Too bad. Otherwise, he's playing really well.

4. Mr.Ceraldi says:

Just had a question. Have I missed something obvious, or why do you start each graph at 50%? Shouldn't each in-game probability start at the percentage you calculate from your prediction model? I mean, aren't the probabilities of team A winning dependent on their inherent strength/rating at the beginning of the game? To put it another way, take last year's NE team and this year's Lions: surely the probability of winning at the same point in a game, under the exact same situation, would be much better for the stronger team, NE? Just wondering.
Thx

5. Anonymous says:

Wondering the same thing....

BlueCards

6. Brian Burke says:

That's a tough one. I've considered doing that, but how would it work? At some point in the game, the estimated relative team strengths no longer really matter (or were mistaken to begin with!). At what point? How would the 2 considerations of pre-game predictions and actual game state be combined until that point? Some sort of logistic or Bayesian algorithm I suppose.

I could do it I guess, but first I'd have to answer a lot of questions. For now, I'm still refining the in-game model, and I'm treating that and the pre-game model as separate.

Ideas are welcome!
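
One way the "logistic" combination mentioned above might look, as a minimal sketch and not the site's method: treat the pre-game prediction as a log-odds offset added to the in-game estimate, and let that offset fade as the game goes on. The function names and the linear fade are assumptions made for illustration; the question of when team strength stops mattering is exactly what the fade schedule would have to answer.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def blended_win_prob(pregame_p, ingame_p, frac_elapsed):
    """Blend a pre-game and an in-game win probability in log-odds space.

    pregame_p    -- pre-game win probability for the team (0 < p < 1)
    ingame_p     -- in-game win probability from a model that assumes
                    evenly matched teams (0 < p < 1)
    frac_elapsed -- fraction of regulation played, from 0.0 to 1.0

    ASSUMPTION: the weight on the pre-game log-odds decays linearly
    from 1 at kickoff to 0 at the final whistle. Any other decreasing
    schedule could be substituted.
    """
    if not (0.0 < pregame_p < 1.0 and 0.0 < ingame_p < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    if not 0.0 <= frac_elapsed <= 1.0:
        raise ValueError("frac_elapsed must be between 0 and 1")
    weight = 1.0 - frac_elapsed
    return inv_logit(logit(ingame_p) + weight * logit(pregame_p))
```

With this form, a graph would start at the pre-game number (the in-game model says 50% at kickoff, which contributes zero log-odds) and converge to the pure in-game estimate by the end of the game.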

7. Anonymous says:

Here is the idea:

Each data point needs to be hooked to a particular game line. Then use only those data points that correspond to the game in question. Or, more likely, exclude some data so the remainder averages to the game line in question.

The problem with the NFL is that this leaves you with a very small sample size...I have just done this for the NBA and come up with about 7500 games....I would guess only around 2000 or so NFL games could be broken down this way.

BlueCards

8. Brian Burke says:

I like the idea, but for football the sample sizes will quickly get much smaller than basketball. In the NBA, it would seem you really just have score difference, time remaining and possession to worry about. In football, add in field position, down, and to-go distance. The number of combinations of variables increases exponentially.

Factoring in the spread (from -10 to +10) divides up the sample even further. And you're right, I've got 2,016 games to work with, so the sub-samples could get really tiny.

But I completely agree that's a solid methodology, if only we had lots more games of data. Otherwise, to get any reasonable sample sizes, we'd have to group the data into really large chunks--20 yd increments instead of 10, or 5 min intervals instead of 1 min, etc. And even then, it would require lots of smoothing and interpolation.
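
To make the sample-size concern concrete, here is a rough sketch that counts how many distinct game-state cells a binning scheme creates. Only the 10 yd vs. 20 yd and 1 min vs. 5 min groupings and the -10 to +10 spread range come from the discussion above; every other bucket count (score difference, to-go distance) is an illustrative assumption, and `count_cells` is a hypothetical helper.

```python
import math

def count_cells(bins):
    """Number of distinct cells = product of the bucket counts per variable."""
    return math.prod(bins.values())

# Fine grouping: 10-yard field zones, 1-minute time intervals.
fine = {
    "field_position": 100 // 10,  # ten 10-yard zones
    "time_remaining": 60 // 1,    # sixty 1-minute intervals in regulation
    "down": 4,
    "possession": 2,
    "to_go": 3,                   # ASSUMED: short / medium / long
    "score_diff": 21,             # ASSUMED: 21 score-difference buckets
    "spread": 21,                 # whole-point lines from -10 to +10
}

# Coarse grouping: 20-yard zones and 5-minute intervals, all else unchanged.
coarse = dict(fine, field_position=100 // 20, time_remaining=60 // 5)

fine_cells = count_cells(fine)
coarse_cells = count_cells(coarse)
print(f"fine grouping:   {fine_cells:,} cells")
print(f"coarse grouping: {coarse_cells:,} cells")
print(f"reduction factor: {fine_cells / coarse_cells:.0f}x")
```

Under these assumed bucket counts, coarsening the field and clock bins cuts the number of cells tenfold, but the total still dwarfs the number of games available, which is why smoothing and interpolation would be needed on top.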

9. Anonymous says:

I agree that there is not much data to work with....However,

For spreads of (home team) -3...you could use all of your data, since this is about the average spread of an NFL game. For spreads that move from this point, you need only remove outlying spreads so the average falls in line...I find that the data has meaning between Home team -7 and home team +4....you are still using about 1/3 of your total data at this point.

Skipping this step returns win percentages that really don't mean much. For example, take the following:

Game line: Home team -3
Game line: Home team +3

Situation: Half time. Home team up by 3.

My numbers give the win percentage as follows:

Home Team favored by 3: win percentage: 72%
Home team +3: win percentage: 63%

Not separating the data gives a very wide range of answers...

BlueCards
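
The trimming step described in comment 9 ("remove outlying spreads so the average falls in line") might be sketched roughly as follows. This is one possible reading, not the commenter's actual procedure: the greedy drop-the-most-extreme rule, the tolerance, the function names, and the tuple layout are all assumptions, and the sample games are made up purely to show the mechanics.

```python
def trim_to_line(games, target_line, tol=0.25):
    """Drop the most extreme lines until the mean line is near the target.

    games       -- list of (home_line, home_won) tuples; a negative
                   home_line means the home team is favored
    target_line -- the game line we want the remaining sample to average
    tol         -- ASSUMED stopping tolerance, in points
    """
    kept = sorted(games, key=lambda g: g[0])
    while kept:
        mean_line = sum(g[0] for g in kept) / len(kept)
        if abs(mean_line - target_line) <= tol:
            break
        if mean_line > target_line:
            kept.pop()    # remove the largest line (biggest home underdog)
        else:
            kept.pop(0)   # remove the smallest line (biggest home favorite)
    return kept

def home_win_pct(games):
    """Share of games the home team won, or None for an empty sample."""
    if not games:
        return None
    return sum(1 for _, won in games if won) / len(games)

# Made-up games sharing one situation (e.g. home team up 3 at halftime).
sample = [(-10, True), (-7, True), (-7, True), (-3, True), (-3, False),
          (-3, True), (-3, True), (0, False), (3, True), (4, False), (7, False)]

subset = trim_to_line(sample, target_line=-3)
print(len(subset), home_win_pct(subset))
```

In practice each tuple would be a game with all of its in-game data points attached, so that trimming a game removes every situation observed in it.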