Game probabilities for week 4 are listed below. The probabilities are based on an efficiency win model explained here. The model considers offensive and defensive efficiency stats including running, passing, sacks, turnover rates, and penalty rates. Team stats are adjusted for previous opponent strength.
Visitor | Home | Vprob | Hprob |
BAL | CLE | 0.48 | 0.52 |
CHI | DET | 0.17 | 0.83 |
GB | MIN | 0.61 | 0.39 |
HOU | ATL | 0.76 | 0.24 |
NYJ | BUF | 0.16 | 0.84 |
OAK | MIA | 0.19 | 0.81 |
STL | DAL | 0.05 | 0.95 |
SEA | SF | 0.70 | 0.30 |
TB | CAR | 0.78 | 0.22 |
DEN | IND | 0.44 | 0.56 |
KC | SD | 0.53 | 0.47 |
PIT | ARI | 0.68 | 0.32 |
PHI | NYG | 0.70 | 0.30 |
NE | CIN | 0.86 | 0.14 |
"Team stats are adjusted for previous opponent strength."
How did you do that?
Also, this blog is great, thanks for taking the time to do this.
Thanks.
Fair question. It's a little complicated and has to do with the math surrounding logistic regression.
Logistic (logit) regression produces a linear equation, a constant plus weighted variables, that adds up to a log odds ratio, i.e. the natural log of the ratio of the probability of winning to the probability of losing: ln(p / (1 - p)).
What I do is average the opponents' generic win probability (GWP) for each team. A team that has faced good teams so far would have a high Opp GWP.
I then go backwards through the logistic math process to calculate what log odds ratio would be needed to produce each team's Opp GWP.
I then take a fraction of that log odds ratio according to what week it is. For week 3, it's 2/3. For week 4 it will be 3/4. I take that fraction so that a team's good performance against its opponent does not count against itself.
I recalculate the odds ratio with the opponent adjustment included. Teams with easy opponents would see their probability of winning reduced. Teams with tough opponents so far would get a boost in my predictions.
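For readers who want to follow the arithmetic, here is a minimal Python sketch of the steps described above. It is not the actual model code. The function names are made up for illustration, and it assumes the scaled opponent log odds are simply added to the team's own log odds, which the explanation above does not spell out.

```python
import math

def logit(p):
    """Log odds of a probability: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Convert log odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def opponent_adjusted_gwp(team_gwp, opponent_gwps, week):
    """Hypothetical helper: adjust a team's GWP for past opponent strength.

    Assumption: the adjustment is added to the team's own log odds.
    """
    # Average the generic win probability of the opponents faced so far.
    opp_gwp = sum(opponent_gwps) / len(opponent_gwps)
    # Work backwards to the log odds that would produce that Opp GWP.
    opp_log_odds = logit(opp_gwp)
    # Take a fraction by week: 2/3 for week 3, 3/4 for week 4.
    fraction = (week - 1) / week
    # Recalculate the team's probability with the adjustment included.
    return inv_logit(logit(team_gwp) + fraction * opp_log_odds)

# Same raw GWP, different schedules (made-up numbers).
tough = opponent_adjusted_gwp(0.60, [0.70, 0.65, 0.60], week=4)
easy = opponent_adjusted_gwp(0.60, [0.30, 0.35, 0.40], week=4)
print(round(tough, 3), round(easy, 3))
```

With these made-up inputs, the team that faced tough opponents ends up above 0.60 and the team that faced easy opponents ends up below it. A schedule that averages exactly 0.50 leaves the probability unchanged.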
Do you include a home-field advantage? I think I'll use these in my pick 'em league where I have to also assign a confidence rating to each pick. Good stuff.
Yes, home field advantage is included.
Caution using these for pick 'em leagues, especially large ones. The more people in your league, the more likely someone will be lucky and get 70% or more of the games correct.
Using a mathematical model, even one this good, usually gets you second place because of the lucky fool theory. But, after several years, your cumulative record is almost guaranteed to be best.
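The league-size effect can be illustrated with a rough binomial sketch in Python. Everything in it is an assumption made for illustration: every picker is treated as independent with the same per-game accuracy, the 57% accuracy and 256-game season are placeholder inputs, and the function names are hypothetical. Real pickers' choices are correlated, so treat the result as a rough guide only.

```python
from math import comb, ceil

def prob_at_least(k, n, p):
    """P(at least k correct out of n games) for one picker with accuracy p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def prob_someone_hits(k, n, p, league_size):
    """P(at least one of league_size independent pickers gets k or more right)."""
    single = prob_at_least(k, n, p)
    return 1 - (1 - single) ** league_size

games = 256                  # assumed number of games picked in a season
target = ceil(0.70 * games)  # games needed to reach 70% correct
for size in (1, 10, 100, 1000):
    chance = prob_someone_hits(target, games, 0.57, size)
    print(size, chance)
```

The chance that somebody in the league reaches the target rises with every picker added, which is the point of the lucky fool theory: in a big league you are competing against the best of many random draws.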
Well, the "lucky pickers" are only an issue if you go with a common wisdom system. How close do your picks (include strength of choice) tend to correlate with common wisdom? You could compare with Vegas lines.
For example, you have Detroit winning 83% of the time, MIA 81%, TB 78%, KC 53%, and CLE 52%. Those would seem crazy to the general public.
Last year I was 65% correct and Vegas was 57% correct. 2006 was an unusually unpredictable year. Partly because home field advantage was nearly non-existent.
Part of the apparent 'craziness' of these predictions is due to the short season so far. With only 3 games, there is a relatively large margin of error around each team's central performance tendencies. How often will the Eagles score 56 points, or the Browns score 52?
But I think you might misunderstand the game-by-game predictions (probably due to my very dense explanations). Those are the probabilities either team will win THAT ONE game, not the rest of their games.
In the following post, you'll see that the model predicts Detroit would have only a .350 winning percentage from here on out.
I understand that those numbers are for this week's games only. What I was saying is while you and Vegas might have performed similarly last year, you might have gotten there much differently. Your picks for week three seem to go against common wisdom. You'll be right on some and wrong on some. While your expected number of correct picks will be a hair better than Vegas, your likelihood of doing significantly better or worse than Vegas is much higher than that of the average fan, who's right in line with Vegas.
As the season goes on, I'm sure you'll fall more in line with Vegas, but no more in line than a lucky fool with a touch of football knowledge.
Sorry, I misunderstood when you said "you have Detroit winning 83% of the time." I thought you meant for the rest of the season.
I can almost guarantee this model will beat Vegas over the long term. It certainly did last year. We'll see about 2007.
But with only 3 weeks of data, I'd hesitate to put any faith in these yet. Last year for week 5 (with 4 weeks of data), this model went 14 for 14, but then it fell to 8 out of 14 for week 6.
Brian,
Nice blog you have here. In several posts you've mentioned your win rate for the previous season and compared it to Vegas. By this I assume you mean the lines set for the games.
However, from what I can deduce, you are predicting 'straight-up', not 'against the spread', since your predictions make no reference to the line.
My first observation is that it is easier to pick SU as opposed to ATS, and that comparing SU picks to ATS picks is apples and oranges.
Additionally, the lines set by Vegas have as much to do with balancing the bets as they do with an attempt to predict the outcome of the game. The bookmakers do not want to be the gamblers, after all.
It's not that your performance is without value; quite the contrary. I'm just wondering about the precision of your comparison.
Mark
Mark-Regarding comparisons to Vegas, I am not picking against the spread. What I mean is that Vegas predicts a winner for each game (straight up) indirectly by assigning it the favorite. I consider that the universal "consensus" favorite.
When I compare the record of this model, I am comparing the straight-up consensus picks vs. the model's straight-up picks.
By the way, picking against the spread is almost always a 50/50 proposition. But picking winners straight up isn't that much easier. Vegas "consensus" straight up picks were correct 57% of the time last year.
I may do some work regarding spreads in the future, but for now I'm just using them as a measuring stick for my prediction model.
I ran a correlation between points given by home teams and their probability of winning according to your system. It's .432, which seems low, but I don't have a reference point.
I converted your win% into a rough point spread. Here are your "best bets" against the spread: Buffalo +3.5 over Jets, Detroit +2.5 over Chicago, Denver +10 over Indy, Tampa Bay +3 over Carolina, Kansas City +12 over San Diego.
Sky-Your best bets would have been 4 for 5 today.
haha, thanks for proving it works sky.
love the blog, it's a great read.
This analysis is terrific. I have been thinking of running a regression on NFL statistics for the last few years, but now I don't have to. I have a couple of comments on your methodology:
1) Due to the uncertainty in the first few weeks of a season with new coaches, players, etc., I suggest that you re-run your logistic regression with weekly data from weeks 5-13 only. I would throw out the first few weeks and the last few weeks. The first few weeks give poor data for the reasons I just stated, and the last few do too because teams that are out of the playoff hunt are "throwing" games by playing rookies, among other issues.
2) Do you have a more complete list of statistics and their correlations with win percentage?
3) Using point spreads - This will be very difficult. If you re-run the regression against covering the spread, we may see new variables appear with higher correlation.
4) I transformed your win probabilities into a point spread model by assuming 50% = 0 points and 100% = 14 points (I know this is arbitrary), but for week 4 it predicted 9/14 games correct (65%). I did not run it on the earlier weeks for the reasons above.
If you publish the standard deviations on the percentages, I can run Crystal Ball on the spreads to get a better result.
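The conversion in item 4 amounts to a straight line through (50%, 0 points) and (100%, 14 points). Here is a minimal Python sketch of that mapping, applied to a few of the home win probabilities from the table at the top. The function name is hypothetical and the 14-point ceiling is the arbitrary choice stated in item 4, not a fitted value.

```python
def prob_to_spread(win_prob, max_points=14.0):
    """Linear map from win probability to points: 50% -> 0, 100% -> max_points.

    A negative result means the team would be the underdog by that many points.
    """
    return (win_prob - 0.5) / 0.5 * max_points

# Home win probabilities taken from the week 4 table.
home_probs = {"CLE": 0.52, "DET": 0.83, "DAL": 0.95, "CIN": 0.14}
for team, prob in home_probs.items():
    print(team, round(prob_to_spread(prob), 2))
```

A linear map is the simplest possible choice. It treats the jump from 50% to 60% as worth the same number of points as the jump from 85% to 95%, which is one reason the result should be read as a rough spread only.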
Once again, great job, I look forward to the upcoming weeks.
I love this blog. Can you post your predictions by Friday morning though? Thanks.
Kevyn-Meant to reply earlier but lost your note in the shuffle.
1. Excellent idea. Next chance I get I'm going to try that.
2. Here: http://www.bbnflstats.com/2007/04/list-of-stats-and-correlations.html
3. Spread predictions are not terribly interesting to me. I realize there is a big market out there for that stuff, but I'm just a fan. I do have good spread data and might do the analysis one day soon, if only because I enjoy beating conventional wisdom.
4. Std dev for all probabilities is 0.29 following week 4. I'd expect that to decrease as the season wears on.