I've assembled a new and larger database and tested the most predictive variables for wins. Now it's time to apply the data to the upcoming 2007 NFL season.
Using data from the 2002-2006 seasons, I've run a logit regression, over every game played, on the efficiency stats most predictive of winning. The result is a model that correctly predicts the winner of 68.9% of games during the 5 most recent seasons.
For every game, the model produces a probability that each team would win. We can apply this model to future games as well, assuming we can estimate the values of each team's efficiency statistics. As any shrink will tell you, the best predictor of future behavior is past behavior. We'll use 2006 stats as our baseline for 2007. I realize this is highly imperfect, but it is the best predictor available. Although this method does not account for personnel changes, efficiency stats are relatively steady from year to year, much steadier than actual win totals. Previous-year 'expected win' totals are significantly better predictors of the following year's record than are previous-year actual wins. Plus, this is all just for fun.
I now have win probabilities for all 256 upcoming regular season games, and I've sorted them by team. For each team, I computed the probability of every possible sequence of game outcomes (there are 2^16 of them, or 65,536). The probabilities of all sequences that result in a given number of total wins are then summed. (There is 1 possible sequence for 16 wins, 16 sequences for 15 wins, 120 sequences for 14 wins, and so on.) Now I have estimated probabilities for every win total for each team.
For example, the Ravens have a breakdown of win probabilities that centers on 11 wins. Although they had 13 wins last year, they appeared to squeak out a couple of those wins on luck, and they had a relatively easy schedule. This year, they'll play a much tougher schedule. The table below lists the probabilities for each game on their '07 schedule based on last year's efficiency stats.
Vis   Home  VPROB  HPROB
---------------------------
BAL   CIN   0.54   0.46
NYJ   BAL   0.26   0.74
ARI   BAL   0.17   0.83
BAL   CLE   0.82   0.18
BAL   SF    0.69   0.31
STL   BAL   0.32   0.68
BAL   BUF   0.67   0.33
BAL   PIT   0.47   0.53
CIN   BAL   0.28   0.72
CLE   BAL   0.09   0.91
BAL   SD    0.32   0.68
NE    BAL   0.34   0.66
IND   BAL   0.49   0.51
BAL   MIA   0.57   0.43
BAL   SS    0.69   0.31
PIT   BAL   0.34   0.66
The next table lists the probability of winning each possible number of games. The 'cumulative' column lists the probability of the Ravens winning at least that number of games.
WINS  PROB  CUMULATIVE
---------------------------
16    0.00  0.00
15    0.01  0.01
14    0.03  0.04
13    0.09  0.14
12    0.17  0.30
11    0.21  0.52
10    0.20  0.72
9     0.15  0.87
8     0.08  0.95
7     0.03  0.99
6     0.01  1.00
5     0.00  1.00
4     0.00  1.00
3     0.00  1.00
2     0.00  1.00
1     0.00  1.00
0     0.00  1.00
The stats say 11 wins is the most probable of all win totals for Baltimore (21% probability). Keep in mind, however, that they're far more likely to finish with some other total (79%). The cumulative probabilities indicate they have roughly a 50/50 chance to finish with at least 11 wins (52%).
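For anyone who wants to reproduce the win-total breakdown, here is a minimal Python sketch of the enumeration described earlier. It is not the code used for this post. The function names (`win_total_distribution`, `at_least`) are made up for illustration, and it assumes the games are independent, so the probability of a sequence is just the product of the per-game probabilities. The inputs are the Ravens' per-game win probabilities from the schedule table (VPROB when Baltimore visits, HPROB when Baltimore is home).

```python
from itertools import product
from math import prod

# Ravens' win probability in each 2007 game, read from the schedule table
# (VPROB when Baltimore is the visitor, HPROB when Baltimore is at home).
ravens_probs = [0.54, 0.74, 0.83, 0.82, 0.69, 0.68, 0.67, 0.47,
                0.72, 0.91, 0.32, 0.66, 0.51, 0.57, 0.69, 0.66]

def win_total_distribution(game_probs):
    """Return a list where entry k is the probability of exactly k wins."""
    n = len(game_probs)
    dist = [0.0] * (n + 1)
    # Enumerate every win/loss sequence (2**n of them).
    # Assumption: games are independent, so sequence probabilities multiply.
    for outcome in product((0, 1), repeat=n):
        p_sequence = prod(p if won else 1.0 - p
                          for p, won in zip(game_probs, outcome))
        # Add this sequence's probability to the bucket for its win total.
        dist[sum(outcome)] += p_sequence
    return dist

def at_least(dist):
    """Return a list where entry k is the probability of k or more wins."""
    cumulative = []
    running = 0.0
    for p in reversed(dist):
        running += p
        cumulative.append(running)
    return cumulative[::-1]

dist = win_total_distribution(ravens_probs)
cumulative = at_least(dist)

# Print WINS, PROB, CUMULATIVE from 16 wins down to 0.
for wins in range(len(dist) - 1, -1, -1):
    print(f"{wins:2d}  {dist[wins]:.2f}  {cumulative[wins]:.2f}")
```

Because the per-game probabilities above are rounded to two decimals, the printed figures may differ slightly from the table. Brute force is fine for a 16-game schedule; for a much longer one, the same distribution can be built one game at a time instead of enumerating every sequence.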
It's funny to read this one ... now, knowing that Baltimore only won 5 games in 2007 (probability = 0.00).
I know! I should really delete these bad predictions. :)
Fortunately, I wised up before the season began.