When picking winners in an office pool, I'd guess that most people start with the point spreads, or at least look at the records of each opponent, when making their predictions. Most people have some sort of baseline even if it's not Sagarin, DVOA, or the regression-based predictions on this site.
So I thought it would be interesting to look at the spreads, and how often they're correct in identifying winners. If someone needs to correctly pick a few upsets to win a pool, it might be good to know that some weeks are less predictable than others. You'd ideally want to pick upsets in weeks where the spread is less accurate.
In this installment I'll look at how well the spread does in picking winners by week. My theory was that the spread would be relatively less accurate in the early weeks of the season, when there is less information about team performance. There may also be a high degree of bias towards teams expected to be strong in the pre-season. Week 17 may be inaccurate too, due to the uncertainty of some playoff-bound teams resting their better players. Late season weather may also contribute to higher uncertainty and less predictability.
Using point spread data from the 2002-2007 seasons obtained here, I analyzed how often the spread was correct. Overall, the point spread favorites win 66.2% of the time. Weekly accuracy ranges from 59.0% in weeks 4 and 9 to 72.6% in week 12. The graph and table below list the weekly averages.
Week | Accuracy |
1 | 63.8% |
2 | 62.5% |
3 | 69.8% |
4 | 59.0% |
5 | 71.3% |
6 | 69.5% |
7 | 61.4% |
8 | 64.6% |
9 | 59.0% |
10 | 61.6% |
11 | 76.3% |
12 | 72.6% |
13 | 62.8% |
14 | 71.6% |
15 | 69.5% |
16 | 63.5% |
17 | 65.3% |
Total | 66.2% |
Although there appear to be substantial differences between some weeks, they are most likely random. The only statistically significant difference between any one week and the season average of 66.2% was week 12 with p=0.04. However, there are 17 weeks, so we should not be surprised to see a week or even two appear significant when there really is no systematic connection (a type I error).
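To put a rough number on that multiple-comparisons point, here is a minimal sketch. It assumes the 17 weekly tests are independent and that each uses the conventional 0.05 cutoff; both are simplifying assumptions of mine, and the function names are made up for illustration.

```python
# Assumptions (not from the data above): the weekly tests are independent
# and each one is judged against the same significance threshold.
ALPHA = 0.05   # conventional significance threshold
WEEKS = 17     # number of weekly comparisons against the season average


def prob_at_least_one_false_positive(alpha: float, tests: int) -> float:
    """Chance that one or more of `tests` independent tests dips below
    `alpha` purely by luck when there is no real effect."""
    return 1.0 - (1.0 - alpha) ** tests


def expected_false_positives(alpha: float, tests: int) -> float:
    """Average number of tests that look significant by luck alone."""
    return alpha * tests


if __name__ == "__main__":
    print(f"P(at least one 'significant' week): "
          f"{prob_at_least_one_false_positive(ALPHA, WEEKS):.3f}")
    print(f"Expected 'significant' weeks: "
          f"{expected_false_positives(ALPHA, WEEKS):.2f}")
```

Under those assumptions, a lone week at p=0.04 is about what chance alone would produce, which is why I don't read anything into week 12.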
The bottom line here is that no week can be viewed as particularly favorable for picking upsets. If you're behind in your office pool toward the end of the season and need to pick some upsets to make up ground, one week is as good as any other to start getting aggressive.
Next, I'll look at point spreads from a different angle and see how accurate they are at picking winners by the size of the spread. I'll also break it down into two types of games, home-underdogs and home-favorites, to see if there are any inefficiencies in how the spread accounts for home field advantage.
Would it be hard to figure out how accurate the spread is? That is, how often does the favorite win by more than the spread?
If all picks are simultaneous, you want to have your picks be in a relatively underrepresented portion of the result space.
However, a pool often involves multiple weeks of picks. The strategy from week to week changes based on your position relative to everyone else. There is a metagame of making your current week picks appropriate for where you are in the standings (and who your opponents are and their pick tendencies).
The interesting problem for me from a mathematical perspective is this: what pick strategy would you use if you only know the following?
a) s = the size of the field
b) n = the number of games
c) p = an accurate probability of each game
d) r = the range that p takes on in the set of games, represented by a list of all p (this is needed as different strategies might be optimal depending on the p-mix in the n games)
In this version of the problem, there is no metagame of outthinking your opponents based on their tendencies or the current standings. How do different functions f(s, n, p, r), each giving the probability that an underdog should be taken in a particular game, perform against each other? Is there a minimum share of the winnings you can guarantee, even if you tell everyone your f?
-Dan (for whatever reason, post under a name didn't work)
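One way to explore the question above is by simulation. The following is only a sketch under assumptions of my own: p is taken to be the favorite's win probability in each game, one entrant plays a candidate strategy against s-1 rivals who all share a second strategy, ties split the pot equally, and all the function names (`always_favorite`, `fade_close_games`, `hero_share`) are hypothetical.

```python
import random


def always_favorite(p):
    """Strategy: never take the underdog."""
    return 0.0


def fade_close_games(p, cutoff=0.60, rate=0.5):
    """Strategy: take the underdog half the time in near coin-flip games.
    The cutoff and rate are arbitrary illustrative values."""
    return rate if p < cutoff else 0.0


def make_picks(probs, strategy, rng):
    """Return one entry; True means the favorite was picked in that game."""
    return [rng.random() >= strategy(p) for p in probs]


def hero_share(probs, field_size, hero, rival, trials=20000, seed=0):
    """Estimate the average share of the pot won by one entrant using
    `hero` against field_size - 1 entrants who all use `rival`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        outcomes = [rng.random() < p for p in probs]  # True = favorite won
        entries = [make_picks(probs, hero, rng)]
        entries += [make_picks(probs, rival, rng) for _ in range(field_size - 1)]
        scores = [sum(pick == result for pick, result in zip(entry, outcomes))
                  for entry in entries]
        best = max(scores)
        if scores[0] == best:
            total += 1.0 / scores.count(best)  # ties split the pot equally
    return total / trials


if __name__ == "__main__":
    # Hypothetical slate of n = 8 games, given as favorite win probabilities.
    slate = [0.55, 0.58, 0.62, 0.66, 0.70, 0.74, 0.80, 0.85]
    s = 10
    share = hero_share(slate, s, fade_close_games, always_favorite)
    print(f"fair share: {1 / s:.3f}  estimated share: {share:.3f}")
```

A strategy is doing better than the field when its estimated share exceeds the fair share of 1/s. Swapping the `hero` and `rival` arguments, or passing the same strategy for both, gives a first look at whether a strategy still holds up once everyone knows it.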