Signal vs. Noise in Football Stats

In 2007, the Detroit Lions' defense racked up 13 interceptions in the first half of the season, the most in the NFL. The next-best teams had 11. It's reasonable to expect that the Lions would continue to generate high numbers of interceptions through the rest of the season, barring calamitous injuries.

I wouldn't necessarily expect them to continue to be #1 in the league, but I'd expect them to be near the top. And I'd be wrong. It turns out they had only 4 interceptions in their final 8 games, ranking dead last. So halfway through the season, if I were trying to estimate how likely the Lions were to win future games, I might have been better off ignoring defensive interceptions entirely.

Although turnovers are critical in explaining the outcomes of NFL games, defensive interceptions are nearly all noise and no signal. Over the past two years, defensive interceptions from the first half to the second half of a season correlate at only 0.08. In comparison, offensive interceptions correlate at 0.27. As important as interceptions are in winning, a prediction model should actually ignore a team's past record of defensive interceptions.

You might say that if defensive interception stats are adjusted for opponents' interceptions thrown, then the correlation would be slightly higher. I'd agree--but that's the point. Interceptions have everything to do with who is throwing, and almost nothing to do with the defense.

This may be important for a couple of reasons. First, our estimate of how good a defense is should no longer rest on how many interceptions it generates. Second, interception stats are probably overvalued when rating pass defenders, both free agents and draft prospects.

I've made this point about interceptions before when I looked at intra-season auto-correlations of various team stats. That's a fancy way of asking how consistent a stat is with itself over the course of a season. The more consistent a stat is, the more likely it is due to a repeatable skill or ability. The less consistent it is, the more likely the stat is due to unique circumstances or merely random luck.
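
To make the idea concrete, here's a minimal sketch of how a split-half self-correlation could be computed, assuming a hypothetical list of per-team, per-game rows (field names like "def_int" are illustrative, not the actual data set used here):

    # Split-half (intra-season) self-correlation for a team stat.
    # Assumes rows like {"team": "DET", "week": 1, "def_int": 2}, one per
    # team-game; field names are hypothetical. For per-play rates you'd
    # sum numerators and denominators separately rather than the rates.
    from collections import defaultdict
    import math

    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
        return num / den

    def split_half_correlation(rows, stat, split_week=8):
        first, second = defaultdict(float), defaultdict(float)
        for r in rows:
            half = first if r["week"] <= split_week else second
            half[r["team"]] += r[stat]
        teams = sorted(set(first) & set(second))
        return pearson([first[t] for t in teams], [second[t] for t in teams])

A correlation near zero means the stat behaves like noise over half a season; the closer it gets to the 0.4-0.6 range, the more it looks like a repeatable team ability.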

The table below lists various team stats and their self-correlation, i.e. how well they correlate between the first half and second half of a season. The higher the correlation, the more consistent the stat and the more it is a repeatable skill useful for predicting future performance. The lower the correlation, the more it is due to randomness.

Variable        Correlation
D Int Rate      0.08
D Pass          0.29
D Run           0.44
D Sack Rate     0.24
O 3D Rate       0.43
O Fumble Rate   0.48
O Int Rate      0.27
O Pass          0.58
O Run           0.56
O Sack Rate     0.26
Penalty Rate    0.58

In a related post, I made the case that although 3rd down percentage tended to be consistent during a season (0.43 auto-correlation), other stats such as offensive pass efficiency and sack rate were even more predictive of 3rd down percentage. In other words, first-half-season pass efficiency predicted second-half-season 3rd down percentage better than first-half-season 3rd down percentage itself.

But what about other stats? Are there other examples where a different stat is more predictive of something than that something itself? Below is a table of various team stats from the second half of a season and how well they are predicted by other stats from the first half of a season.

For example, take offensive interception rates (O Int). Offensive sack rates (O Sack) from the first 8 games of a season actually predict offensive interception rates from the following 8 games slightly better than offensive interception rates (0.28 vs. 0.27).
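
The mechanics are the same as the split-half sketch above; the only change is crossing one stat's first half with a different stat's second half. A rough sketch, reusing the hypothetical row format and the pearson() helper from earlier:

    # Correlation between first-half stat A ("With") and second-half
    # stat B ("Predicting"), as in the table below.
    def cross_half_correlation(rows, with_stat, predicting_stat, split_week=8):
        first, second = defaultdict(float), defaultdict(float)
        for r in rows:
            if r["week"] <= split_week:
                first[r["team"]] += r[with_stat]
            else:
                second[r["team"]] += r[predicting_stat]
        teams = sorted(set(first) & set(second))
        return pearson([first[t] for t in teams], [second[t] for t in teams])

    # e.g. cross_half_correlation(rows, with_stat="off_sack_rate",
    #                             predicting_stat="off_int_rate")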

Predicting    With        Correlation
D Fum         D Fum       0.33
D Fum         D Sack      0.15
D Fum         D Run       0.12
D Int         D Sack      0.08
D Int         D Int       0.08
D Int         D Pass      0.01
D Pass        D Pass      0.28
D Pass        D Sack      0.26
D Run         D Run       0.44
D Sack        D Sack      0.24
D Sack        D Pass      -0.07
O 3D Pct      O Sack      -0.53
O 3D Pct      O 3D Pct    0.43
O 3D Pct      O Int       -0.42
O 3D Pct      O Pass      0.42
O 3D Pct      O Run       0.08
O Fum         O Fum       0.48
O Fum         O Sack      0.24
O Int         O Sack      0.28
O Int         O Int       0.27
O Int         O Run       0.06
O Int         O Pass      -0.37
O Pass        O Pass      0.49
O Pass        O Sack      -0.33
O Pass        O Run       -0.10
O Run         O Run       0.56
O Run         O Pass      0.00
O Sack        O Pass      -0.40
O Sack        O Sack      0.26
O Sack        O Run       0.03
Pen           Pen         0.58
Pen           D Pass      -0.23
Pen           O Sack      -0.08


There are a thousand observations from this table. I still see new and interesting implications whenever I look it over.
  • Having a potent running game does not prevent sacks.
  • The pass rush predicts defensive pass efficiency as well as defensive pass efficiency itself.
  • Running does not "set up" the pass, and passing does not "set up" the run. They are likely independent abilities.
  • Offensive sack rates are much better predicted by offensive passing ability than previous sack rates.
  • Defensive sack rate predicts defensive passing efficiency, but defensive passing efficiency does not predict sack rate.
We see that many stats, such as passing and running efficiency, predict themselves fairly well. But even those stats might be better predicted by using a combination of themselves and related stats. For example, in my previous post I noted how accurately offensive 3rd down percentage could be predicted using passing efficiency, sack rate, and interception rate.
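
As a sketch of what that kind of combined prediction might look like, here's a plain least-squares fit of second-half 3rd down percentage on several first-half stats using NumPy; the inputs are hypothetical arrays, one row per team, and this is my illustration rather than the exact regression behind the earlier post:

    # Predict a second-half stat from several first-half stats at once
    # via ordinary least squares. X: one row per team, columns such as
    # first-half pass efficiency, sack rate, and int rate (hypothetical);
    # y: the same teams' second-half 3rd down percentage.
    import numpy as np

    def fit_multistat_model(X, y):
        X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])  # add intercept
        coefs, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
        return coefs  # [intercept, b1, b2, ...]

    def predict(coefs, x_row):
        return float(coefs[0] + np.dot(coefs[1:], x_row))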

The implications of these auto-correlations are numerous. Team "power" rankings and game predictions (both straight-up and against the spread) rely on a very simple premise--past performance predicts future performance. We now know that's not necessarily true for some aspects of football.

Lions head coach Rod Marinelli might be banging his head against the wall trying to understand how his defense was able to grab 13 interceptions through game 8, but only 4 more for the rest of the season. He's wasting his time. The answer is that in the first half of the season, the Lions played against QBs Josh McCown (2 Ints), Tarvaris Jackson (4 Ints), and Brian Griese twice (4, 3 Ints).

Safe Leads in NCAA Basketball


Bill James takes a look at when leads become insurmountable in college basketball. In other words, when should CBS cut away from the UNC-Mt. Saint Mary's game to show us the barn-burner between Vanderbilt and Siena?

James' formula uses the lead in points, who has the ball, and seconds remaining to tell us if the lead is completely insurmountable. Here it is in a nutshell:

  • x = (Lead - 3 + 0.5)^2 if the winning team has possession, or x = (Lead - 3 - 0.5)^2 if not
  • If x > the time remaining in seconds, the lead is insurmountable
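
Here's the rule as a tiny function (my own sketch of James' description, not his code):

    # Bill James' safe-lead check, per the description above.
    def lead_is_safe(lead_points, seconds_left, leader_has_ball):
        adjustment = 0.5 if leader_has_ball else -0.5
        x = (lead_points - 3 + adjustment) ** 2
        return x > seconds_left

    # Example: up 12 with the ball and 80 seconds left:
    # (12 - 3 + 0.5)^2 = 90.25 > 80, so the lead qualifies as safe.
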
Pretty cool. This is the kind of thing James is really good at. Unfortunately, I think he buys into a logical fallacy later in his article. He says that if a team is deemed to be "dead," that is to say too far behind, but it is able to climb back inside the limits of "insurmountability," it doesn't matter. The losing team is still dead.

I'd agree that it is highly unlikely that such a team would win, but I think James has been taken in by the gambler's fallacy. He writes "The theory of a safe lead is that to overcome it requires a series of events so improbable as to be essentially impossible. If the "dead" team pulls back over the safety line, that just means that they got some part of the impossible sequence—not that they have a meaningful chance to run the whole thing."

It seems to me that if a team climbs back into contention, it's in contention. If the events in a sequence are independent, it doesn't matter how lucky or how improbable the previous events were. They're water under the bridge. For example (from Wikipedia), the probability of flipping 21 heads in a row with a fair coin is 1 in 2,097,152, but the probability of flipping a head after having already flipped 20 heads in a row is simply 0.5.

The only thing that matters is the current situation. It's like saying, "There's no way they'll hit another 3-pointer. They just hit five in a row. They're due to miss."

What does this have to do with football? It would be interesting to look at something similar in the NFL. When is a lead so safe that a team should stop throwing? Or when is it so safe a team should only throw on 3rd down? And so on. Basically, when should a winning team stop trying to gain a bigger lead and start trying to simply prevent big mistakes?

The Office Pool 3

You might be wondering why I'm interested in NFL pick 'em pools in the middle of March Madness. Well, there are already plenty of statistical analyses on the NCAA basketball tournament. Here are a couple of sites to get your bracket filled out scientifically.

But for now, I've got the luxury of time before the NFL season starts, or even draft season, when I can think through these things. In the last post I looked at using the point spreads as a baseline for picking winners. I looked at the accuracy with which the point spread correctly favored straight-up winners. Over the past six seasons, the spread was accurate about 67% of the time, and no single week showed any statistically significant deviation from the overall average. In other words, the spread is no more or less accurate in early or late weeks than throughout the season.

The reason I analyzed spread accuracy by week was because when you're behind in a pick 'em pool, you'll probably have to gamble on some upsets in order to catch up. I wanted to know if it was to your benefit to go against spread favorites in any particular week. The answer is no.

But what about spread amounts? It certainly makes sense that games with +1 or -1 spreads will be less predictable than games with +14 or -14 spreads. But how much less predictable? Is there a point of inflection beyond which it never makes sense to go against the spread? Are there situations when it's basically a toss-up and the spread is no more accurate than the flip of a coin?

Below are the answers. The graph shows the accuracy of the spread in terms of predicting the straight-up winner for each spread amount. Data is from all regular season games from 2002-2007.

[Graph: accuracy of the spread in picking the straight-up winner, by spread amount, 2002-2007 regular seasons]

As expected, the spread predicts winners more accurately as the spread amount increases. Note the fan-shaped dispersion of the data points. Some spread amounts are far less common than others. For example, an 8-point spread is less common than a 7-point spread. At the least common spread amounts, there are fewer cases and therefore a wider range of accuracies.

Also notice how games with spreads less than 3 points are no better than 50% accurate. I wouldn't expect much better than 50% - 55%, but less than 50% is surprising.
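
For anyone who wants to reproduce this, here's a bare-bones sketch of the calculation, assuming a list of game records with the closing spread stated relative to the home team (the field names are made up):

    # Straight-up accuracy of the spread, grouped by spread size.
    # Assumes games like {"spread": -7.0, "home_score": 24, "away_score": 17},
    # where a negative spread means the home team is favored; names hypothetical.
    from collections import defaultdict

    def accuracy_by_spread(games):
        right, total = defaultdict(int), defaultdict(int)
        for g in games:
            if g["spread"] == 0 or g["home_score"] == g["away_score"]:
                continue  # pick-em games and ties have no favorite or no winner
            size = abs(g["spread"])
            home_won = g["home_score"] > g["away_score"]
            home_favored = g["spread"] < 0
            total[size] += 1
            right[size] += home_won == home_favored
        return {s: right[s] / total[s] for s in sorted(total)}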

The 'home underdog' phenomenon has been established in previous research. It may be that many observers underestimate the home field advantage created by weather conditions late in the season. But whatever the reason, the home underdog effect clearly exists. The graph below breaks up the spreads into home underdogs and home favorites.

[Graph: spread accuracy by spread amount, split into home underdogs and home favorites]

Notice the 40% accuracy of low-spread home underdog games. When the spread is below +2.5 points, the home underdog is not only likely to cover, but will probably win.

It makes sense to pick upsets in low-spread home underdog games. However, there have only been 70 such cases in the past 6 years, averaging about 11 games per year. If you go for the underdog in all 11, on average that gives you a 2-game edge over someone picking all favorites. It's not nearly enough of an advantage to guarantee you bragging rights around the office, but every little bit helps.

And if you need to catch up in the later weeks, games with low spreads and home favorites aren't much better than 50%. Going with upsets in such games would make sense because it's not a bad gamble and your leading opponent might be playing it safe by picking all favorites.

You'll want to be very careful picking against the favorite in games with spreads more than 3.5 points, for either home underdog or home favorite games. The odds against you climb rapidly beyond that point.

The Office Pool 2

When picking winners in an office pool, I'd guess that most people start with the point spreads, or at least look at the records of each opponent, when making their predictions. Most people have some sort of baseline even if it's not Sagarin, DVOA, or the regression-based predictions on this site.

So I thought it would be interesting to look at the spreads, and how often they're correct in identifying winners. If someone needs to correctly pick a few upsets to win a pool, it might be good to know that some weeks are less predictable than others. You'd ideally want to pick upsets in weeks where the spread is less accurate.

In this installment I'll look at how well the spread does in picking winners by week. My theory was that the spread would be relatively less accurate in the early weeks of the season, when there is less information about team performance. There may also be a high degree of bias toward teams expected to be strong in the pre-season. Week 17 may be inaccurate too, due to the uncertainty of some playoff-bound teams resting their better players. Additionally, late season weather may also contribute to higher uncertainty and less predictability.

Using point spread data from the 2002-2007 seasons obtained here, I analyzed how often the spread was correct. Overall, the point spread favorites win 66.2% of the time. Weekly accuracy ranges from 59.0% in weeks 4 and 9 to 72.6% in week 12. The graph and table below list the weekly averages.

[Graph: point spread accuracy by week, 2002-2007 regular seasons]

Week    Accuracy
1       63.8%
2       62.5%
3       69.8%
4       59.0%
5       71.3%
6       69.5%
7       61.4%
8       64.6%
9       59.0%
10      61.6%
11      76.3%
12      72.6%
13      62.8%
14      71.6%
15      69.5%
16      63.5%
17      65.3%
Total   66.2%


Although there appear to be substantial differences between some weeks, they are most likely random. The only statistically significant difference between any one week and the season average of 66.2% was week 12, with p=0.04. However, there are 17 weeks, so we should not be surprised to see a week or even two appear significant when there really is no systematic connection (a type I error).
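
The test itself is straightforward to sketch. Assuming per-week counts of correct favorites and games played (which would come from the same 2002-2007 data), here's a normal-approximation version of the check; it may differ slightly from the exact test behind the p=0.04 figure:

    # Two-sided test of each week's spread accuracy against the overall rate,
    # using a normal approximation to the binomial. "weekly" maps week number
    # to (favorites_correct, games); the counts are hypothetical placeholders.
    import math

    def week_pvalues(weekly, overall_rate=0.662):
        results = {}
        for week, (correct, games) in weekly.items():
            rate = correct / games
            se = math.sqrt(overall_rate * (1 - overall_rate) / games)
            z = (rate - overall_rate) / se
            p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # 2*(1 - Phi(|z|))
            results[week] = (rate, p)
        return results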

The bottom line here is that no week can be viewed as particularly favorable for picking upsets. If you're behind in your office pool toward the end of the season and need to pick some upsets to make up ground, one week is as good as any other to start getting aggressive.

Next, I'll look at point spreads from a different angle and see how accurate they are at picking winners by the size of the spread. I'll also break it down into two types of games, home-underdogs and home-favorites, to see if there are any inefficiencies in how the spread accounts for home field advantage.

The Office Pool 1

Say I'm in an office pool pick 'em contest. My 10 buddies and I pick NFL winners each week, and the guy with the best record at the end of the season wins. My office mates aren't particularly good at handicapping football games, so I figure that if I pick the consensus favorite in every game (the team favored by the spread), I'll have a great chance to come out on top over the long haul.

My office buddies have access to point spreads too. They tend to look at the spread (or at least look at each team's respective record, which is just as accurate) and then pick a couple of upsets each week. Over the past six seasons, the spread identifies winners correctly 66.2% of the time. So, normally, their upsets would be correct on average 33.8% of the time (100%-66.2%), but they won't be picking upsets in lopsided match-ups. (Although we can't always assume rationality, we will assume sanity.) So in their 34 chosen upsets (2 per week), my buddies will be right 45% of the time on average. I would have a 10% accuracy advantage in those games.

Doing the math, each of my buddies would average about a 63.4% accuracy rate (66.2% on the 222 games where we'd both pick the favorite, plus 45% on their 34 upset picks, out of 256 games). And I'd average 66.2% accuracy. Man, I can't wait to collect my winnings!

But wait. Because of luck, some would be slightly more accurate, and some would be less accurate. In fact, the only thing that really matters is how well each of them does on the 34 games where they deviate from the published favorite. In the other 222 games, we'd have identical picks. Of the 34 games in question, every game that one of my buddies gets right is a game I must have gotten wrong. One of my 10 friends needs to be correct more than 50% of the time in his 34 games to beat me.

The mathematical bottom line is, "How often is someone correct in at least 18 out of 34 trials with a 0.45 probability of being correct in any given trial?" The binomial distribution gives us the answer--it's 22.4% of the time. That's pretty good, right? I have a 77.6% chance of beating any one of my opponents. The only problem is that there are 10 of them.

The chance I would beat all 10 of my buddies is the conjunctive probability of beating each of them. It's 77.6% * 77.6%... and so on, for however many opponents I have. In this case, it's:

0.776^10 = 0.079

In other words, my chances of winning the office pool are just 7.9%--significantly less than a fair chance of 1 in 11. That's why just picking the favorites is a bad strategy. I'd actually be better off choosing the less accurate strategy of my buddies. At least then I'd have a fair chance at 1 in 11.
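
The whole calculation fits in a few lines; this just replays the arithmetic above:

    # Replaying the office-pool math above.
    from math import comb

    def p_at_least(k, n, p):
        # P(X >= k) for a binomial(n, p) random variable
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    buddy_expected_accuracy = (0.662 * 222 + 0.45 * 34) / 256   # ~0.634
    p_one_buddy_beats_me = p_at_least(18, 34, 0.45)             # ~0.224
    p_i_win_the_pool = (1 - p_one_buddy_beats_me) ** 10         # ~0.079, if opponents were independent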

I realize that it is counter-intuitive that a strategy that is less accurate overall is better than a more accurate strategy. But in a contest against several opponents, the more risky strategy--with a greater deviation of outcomes--may be best.

Note: Phil Birnbaum points out that the outcomes against each opponent are not independent of one another, and therefore the simple compound probability I calculated here is far too low. If one opponent happens to beat you, then the other opponents may be more likely to beat you as well, and vice versa. In the end, picking all favorites may be the better play. See his comments for an explanation.