In 2007, the Detroit Lions defense finished the first half of the season with 13 interceptions, the most in the NFL. The next-best teams had 11. It's reasonable to expect that the Lions would continue to generate high numbers of interceptions through the rest of the season, barring calamitous injuries.
I wouldn't expect them to necessarily continue to be #1 in the league, but I'd expect them to be near the top. And I'd be wrong. It turns out they only had 4 interceptions in their final 8 games, ranking dead last. So halfway through the season, if I were trying to estimate how good the Lions are in terms of how likely they are to win future games, I might be better off ignoring defensive interceptions.
Although turnovers are critical in explaining the outcomes of NFL games, defensive interceptions are nearly all noise and no signal. Over the past two years, defensive interceptions from the first half to the second half of a season correlate at only 0.08. In comparison, offensive interceptions correlate at 0.27. As important as interceptions are in winning, a prediction model should actually ignore a team's past record of defensive interceptions.
You might say that if defensive interception stats are adjusted for opponents' interceptions thrown, then the correlation would be slightly higher. I'd agree--but that's the point. Interceptions have everything to do with who is throwing, and almost nothing to do with the defense.
This may be important for a couple of reasons. First, our estimates of how good a defense is should no longer rest on how many interceptions it generates. Second, interception stats are probably overvalued when rating pass defenders, both free agents and draft prospects.
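The opponent adjustment mentioned above could take many forms. Here is a minimal sketch of one of them, not a method taken from this analysis: subtract the average interception rate that a defense's opponents usually throw from the defense's own interception rate. The function name and the toy numbers are hypothetical.

```python
def opponent_adjusted_int_rate(def_int_rate, opponent_off_int_rates):
    """Defensive interception rate minus the average rate its opponents throw.

    Assumption: a simple difference is used as the adjustment. A positive
    result means the defense intercepted more often than its opponents
    typically get intercepted.
    """
    expected = sum(opponent_off_int_rates) / len(opponent_off_int_rates)
    return def_int_rate - expected


# Made-up rates (interceptions per pass attempt), for illustration only.
raw_rate = 0.045
opponents = [0.050, 0.040, 0.045, 0.055]
print(opponent_adjusted_int_rate(raw_rate, opponents))
```

If most of a defense's raw rate disappears after an adjustment like this, that fits the argument that interceptions say more about the passer than about the defense.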
I've made this point about interceptions before when I looked at intra-season auto-correlations of various team stats. That's a fancy way of saying how consistent a stat is with itself during the course of a season. The more consistent a stat is, the more likely it is due to a repeatable skill or ability. The less consistent it is, the more likely the stat is due to unique circumstances or merely random luck.
The table below lists various team stats and their self-correlation, i.e. how well they correlate between the first half and second half of a season. The higher the correlation, the more consistent the stat and the more it is a repeatable skill useful for predicting future performance. The lower the correlation, the more it is due to randomness.
Variable | Correlation |
D Int Rate | 0.08 |
D Pass | 0.29 |
D Run | 0.44 |
D Sack Rate | 0.24 |
O 3D Rate | 0.43 |
O Fumble Rate | 0.48 |
O Int Rate | 0.27 |
O Pass | 0.58 |
O Run | 0.56 |
O Sack Rate | 0.26 |
Penalty Rate | 0.58 |
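For readers who want to see the mechanics, here is a minimal sketch of how a first-half vs. second-half self-correlation like those in the table could be computed. It assumes a plain Pearson correlation across teams. The function names and the toy numbers are hypothetical and are not the data behind the table.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_correlation(team_halves):
    """Correlate a stat's first-half values with its second-half values.

    team_halves maps team name -> (first_half_value, second_half_value).
    """
    first = [halves[0] for halves in team_halves.values()]
    second = [halves[1] for halves in team_halves.values()]
    return pearson(first, second)


# Made-up interception rates, NOT real NFL data; they only show the mechanics.
toy_rates = {
    "Team A": (0.045, 0.020),
    "Team B": (0.030, 0.033),
    "Team C": (0.025, 0.031),
    "Team D": (0.038, 0.027),
    "Team E": (0.021, 0.024),
}
print(round(split_half_correlation(toy_rates), 2))
```

A value near 1 would mean teams keep their relative standing from one half to the next. A value near 0, like the 0.08 for defensive interception rate, means the first half says almost nothing about the second.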
In a related post, I made the case that although 3rd down percentage tended to be consistent during a season (0.43 auto-correlation), other stats such as offensive pass efficiency and sack rate were even more predictive of 3rd down percentage. In other words, first-half-season pass efficiency predicted second-half-season 3rd down percentage better than first-half-season 3rd down percentage itself.
But what about other stats? Are there other examples where another stat is more predictive of something than that something itself? Below is a table of various team stats from the second half of a season and how well they are predicted by other stats from the first half of a season.
For example, take offensive interception rates (O Int). Offensive sack rates (O Sack) from the first 8 games of a season actually predict offensive interception rates from the following 8 games slightly better than first-half offensive interception rates themselves do (0.28 vs. 0.27).
Predicting | With | Correlation |
D Fum | D Fum | 0.33 |
D Fum | D Sack | 0.15 |
D Fum | D Run | 0.12 |
D Int | D Sack | 0.08 |
D Int | D Int | 0.08 |
D Int | D Pass | 0.01 |
D Pass | D Pass | 0.28 |
D Pass | D Sack | 0.26 |
D Run | D Run | 0.44 |
D Sack | D Sack | 0.24 |
D Sack | D Pass | -0.07 |
O 3D Pct | O Sack | -0.53 |
O 3D Pct | O 3D Pct | 0.43 |
O 3D Pct | O Int | -0.42 |
O 3D Pct | O Pass | 0.42 |
O 3D Pct | O Run | 0.08 |
O Fum | O Fum | 0.48 |
O Fum | O Sack | 0.24 |
O Int | O Sack | 0.28 |
O Int | O Int | 0.27 |
O Int | O Run | 0.06 |
O Int | O Pass | -0.37 |
O Pass | O Pass | 0.49 |
O Pass | O Sack | -0.33 |
O Pass | O Run | -0.10 |
O Run | O Run | 0.56 |
O Run | O Pass | 0.00 |
O Sack | O Pass | -0.40 |
O Sack | O Sack | 0.26 |
O Sack | O Run | 0.03 |
Pen | Pen | 0.58 |
Pen | D Pass | -0.23 |
Pen | O Sack | -0.08 |
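One way to scan the table for these cases is sketched below. The rows are copied from the table above. Comparing correlations by absolute value, so that a strong negative relationship counts as predictive, is an assumption of this sketch, and the function name is hypothetical.

```python
# (stat being predicted, first-half predictor, correlation), from the table above.
ROWS = [
    ("D Fum", "D Fum", 0.33), ("D Fum", "D Sack", 0.15), ("D Fum", "D Run", 0.12),
    ("D Int", "D Sack", 0.08), ("D Int", "D Int", 0.08), ("D Int", "D Pass", 0.01),
    ("D Pass", "D Pass", 0.28), ("D Pass", "D Sack", 0.26),
    ("D Run", "D Run", 0.44),
    ("D Sack", "D Sack", 0.24), ("D Sack", "D Pass", -0.07),
    ("O 3D Pct", "O Sack", -0.53), ("O 3D Pct", "O 3D Pct", 0.43),
    ("O 3D Pct", "O Int", -0.42), ("O 3D Pct", "O Pass", 0.42),
    ("O 3D Pct", "O Run", 0.08),
    ("O Fum", "O Fum", 0.48), ("O Fum", "O Sack", 0.24),
    ("O Int", "O Sack", 0.28), ("O Int", "O Int", 0.27),
    ("O Int", "O Run", 0.06), ("O Int", "O Pass", -0.37),
    ("O Pass", "O Pass", 0.49), ("O Pass", "O Sack", -0.33), ("O Pass", "O Run", -0.10),
    ("O Run", "O Run", 0.56), ("O Run", "O Pass", 0.00),
    ("O Sack", "O Pass", -0.40), ("O Sack", "O Sack", 0.26), ("O Sack", "O Run", 0.03),
    ("Pen", "Pen", 0.58), ("Pen", "D Pass", -0.23), ("Pen", "O Sack", -0.08),
]


def stronger_than_self(rows):
    """Return, per predicted stat, the other stats that beat its self-correlation.

    Strength is measured by absolute correlation (an assumption of this sketch).
    Ties with the self-correlation are not counted as stronger.
    """
    self_r = {target: r for target, predictor, r in rows if target == predictor}
    result = {}
    for target, predictor, r in rows:
        if predictor != target and abs(r) > abs(self_r[target]):
            result.setdefault(target, []).append((predictor, r))
    return result


for target, better in stronger_than_self(ROWS).items():
    print(target, "<-", better)
```

Under that absolute-value assumption, the scan flags third-down percentage, offensive interception rate, and offensive sack rate as stats that some other first-half stat predicts more strongly than they predict themselves.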
There are a thousand observations to be made from this table. I still see new and interesting implications whenever I look it over.
- Having a potent running game does not prevent sacks.
- The pass rush predicts defensive pass efficiency nearly as well as defensive pass efficiency itself (0.26 vs. 0.28).
- Running does not "set up" the pass, and passing does not "set up" the run. They are likely independent abilities.
- Offensive sack rates are much better predicted by offensive passing ability than by previous sack rates (-0.40 vs. 0.26).
- Defensive sack rate predicts defensive passing efficiency, but defensive passing efficiency does not predict sack rate.
The implications of these auto-correlations are numerous. Team "power" rankings and game predictions (both straight-up and against the spread) rely on a very simple premise--past performance predicts future performance. We now know that's not necessarily true for some aspects of football.
Lions head coach Rod Marinelli might be banging his head against the wall trying to understand how his defense was able to grab 13 interceptions through game 8, but only 4 more for the rest of the season. He's wasting his time. The answer is that in the first half of the season, the Lions played against QBs Josh McCown (2 Ints), Tarvaris Jackson (4 Ints), and Brian Griese twice (4, 3 Ints). By that count, those four games account for all 13 of the first-half interceptions.