Predictivity

Success Rate (SR) is a simple measure of whether or not a play improves an offense's expected net point potential. It essentially ignores the magnitude of a play's result, and instead focuses only on whether a play was simply a good outcome or bad outcome.

Although team SR statistics ignore important information when it comes to explaining past wins, they may be able to predict future outcomes better than other measures. A team's SR on run plays is particularly informative, because it is not sensitive to the low-frequency but high-impact events that are largely subject to randomness, such as long broken runs or turnovers.

Compared to simple running efficiency, run SR correlates better with winning (0.39 compared to 0.15). This is telling and helpful, but it only accounts for past outcomes. How well a statistic predicts future outcomes is not just about the parlor game of picking winners. Stats that predict future outcomes measure the signal of how good a team really is, underneath all the noise of randomness.

In other words, there is no 'right now.' There is no is. There is the known past, clouded by randomness, and there is the unknown future, clouded by uncertainty. Now is merely the ephemeral intersection between the past and future. When trying to measure team strength or player ability, the focus should be on how well a team or player is likely to play in the future.

That's why stats that minimize noise, even while sacrificing some of the signal, can be more predictive and better measures of true ability than stats that include all the signal and all the noise.

Consistency is the key to measuring how much signal is in a stat. A stat--say total TDs--explains a lot about past wins, but unless it can be counted on as a consistent measure of a team's future output, it isn't helpful. But if teams consistently scored the same number of TDs each game, we'd know a lot about which teams were truly better than the others.

I selected several team stats known to correlate well with winning and tested how consistent they were within a season. Consistency was measured by how well the stat correlates with itself. I broke each team's season into two alternating sets of 8 games, with set A made up of a team's #1, #3, #5, ... games and set B made up of a team's #2, #4, #6, ... games. A statistic's correlation coefficient between the two sets of games measures its consistency and how well we can rely on it as a predictor. The data consist of team stats from the 2002-2009 regular seasons.
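For readers who want to try the alternating-games split themselves, here is a minimal sketch in Python. It is not the code behind this post: the function names are made up for illustration, and the data are synthetic stand-ins (a hidden team level plus per-game noise), not the 2002-2009 team stats.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_consistency(seasons):
    """seasons: one list of per-game stat values per team-season, in game order.

    Set A holds games #1, #3, #5, ...; set B holds games #2, #4, #6, ...
    Returns the correlation between each team's set-A and set-B averages.
    """
    set_a = [sum(games[0::2]) / len(games[0::2]) for games in seasons]
    set_b = [sum(games[1::2]) / len(games[1::2]) for games in seasons]
    return pearson(set_a, set_b)


# Synthetic stand-in data (an assumption for illustration, NOT real NFL data):
# each team-season has a hidden "true" level, and every game adds noise to it.
rng = random.Random(0)


def fake_seasons(noise_sd, n_seasons=256, n_games=16):
    seasons = []
    for _ in range(n_seasons):
        true_level = rng.gauss(0.0, 1.0)
        seasons.append(
            [true_level + rng.gauss(0.0, noise_sd) for _ in range(n_games)]
        )
    return seasons


steady_stat = split_half_consistency(fake_seasons(noise_sd=1.0))
noisy_stat = split_half_consistency(fake_seasons(noise_sd=8.0))
print(f"low-noise stat  r(self) = {steady_stat:.2f}")
print(f"high-noise stat r(self) = {noisy_stat:.2f}")
```

The stat with little per-game noise correlates with itself far more strongly than the noisy one, even though both are built on the same kind of underlying team strength. That is the sense in which self-correlation measures how much signal a stat carries.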

A statistic that both correlates with winning and correlates with itself would be a reliable predictor of future wins. Put mathematically:

Predictivity = r(win) * r(self)

For example, team offensive net pass efficiency correlates with team win totals at 0.66, which is about as strong a correlation as you'll find. It is also fairly consistent throughout the season, correlating with itself at 0.55. The product of the two correlations is a 0.36 predictivity factor, which is the highest of the stats I measured.

In contrast, defensive net pass efficiency correlates with winning at -0.56, nearly as strong as its offensive counterpart. (The negative sign does not indicate a weak correlation. Instead it simply indicates that lower is better, so only the magnitude matters for predictivity.) But defensive net pass efficiency isn't very consistent throughout the season, correlating with itself at only 0.17. The resulting predictivity factor is only 0.09, one quarter of the predictive power of offensive pass efficiency.

The table below lists the full results of each of the selected team statistics. The stats are listed from most to least predictive.


Stat          r(win)   r(self)   Predictivity
O Pass Eff     0.66     0.55       0.36
Pass SR        0.58     0.60       0.35
Run SR         0.39     0.34       0.13
D Pass SR      0.37     0.29       0.11
D Pass Eff    -0.56     0.17       0.09
O Fum Rate    -0.33     0.24       0.08
D Run SR       0.21     0.32       0.07
D Int Rate     0.34     0.15       0.05
O Run Eff      0.15     0.33       0.05
Pen Rate      -0.11     0.39       0.04
O Int Rate    -0.47     0.08       0.04
D Fum Rate     0.20     0.18       0.04
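As a quick check on the arithmetic, the short Python sketch below recomputes each predictivity factor from the two rounded correlations in the table. Taking the absolute value of r(win) is my reading of how the negative correlations were handled, since every published factor is positive; the function name is just for illustration.

```python
def predictivity(r_win, r_self):
    """Product of the two correlations.

    abs() is an assumption inferred from the table: a negative r(win) only
    means "lower is better", and every published factor is positive.
    """
    return abs(r_win) * r_self


# (stat, r(win), r(self), predictivity as published in the table)
TABLE = [
    ("O Pass Eff", 0.66, 0.55, 0.36),
    ("Pass SR", 0.58, 0.60, 0.35),
    ("Run SR", 0.39, 0.34, 0.13),
    ("D Pass SR", 0.37, 0.29, 0.11),
    ("D Pass Eff", -0.56, 0.17, 0.09),
    ("O Fum Rate", -0.33, 0.24, 0.08),
    ("D Run SR", 0.21, 0.32, 0.07),
    ("D Int Rate", 0.34, 0.15, 0.05),
    ("O Run Eff", 0.15, 0.33, 0.05),
    ("Pen Rate", -0.11, 0.39, 0.04),
    ("O Int Rate", -0.47, 0.08, 0.04),
    ("D Fum Rate", 0.20, 0.18, 0.04),
]

for name, r_win, r_self, published in TABLE:
    computed = predictivity(r_win, r_self)
    print(f"{name:<11} computed {computed:.3f}   published {published:.2f}")

# Largest disagreement between the recomputed and published factors.
max_gap = max(abs(predictivity(w, s) - p) for _, w, s, p in TABLE)
print(f"largest gap: {max_gap:.4f}")
```

Every recomputed product lands within 0.01 of the published figure. The small differences are what you would expect if the published factors were calculated from unrounded correlations.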

There are implications for improving a prediction model. For example, replacing run efficiency with run SR would likely be a significant improvement. There are other implications too. Previously, I had found that offensive interceptions self-correlated much more strongly than defensive interceptions. That finding appears to have been a spurious result of using too few seasons' worth of data, as defensive interception rate appears more consistent than offensive interception rate.

At the very least, this tells us that if you had to choose only one stat to measure a team, it should be offensive pass efficiency. It's both highly predictive and simple to calculate.

28 Responses to “Predictivity”

  1. Deryl says:

    Does the fact that Offensive Net Pass Efficiency correlates so well with itself and Defensive Net Pass Efficiency doesn't indicate that a team's passing defense is significantly more impacted by the quality of its opponent than by its own defensive prowess?

  2. Anonymous says:

    Thanks for posting these correlations, Brian.

    These findings jibe with my intuition that QB's are penalized WAY too heavily for interceptions, by both the media and statheads alike. Metrics like DVOA, AY/A and the NFL Passer Rating just kill QB's for interceptions, which are essentially random, non-predictive events.

    It's unfortunate that the stats most often used to judge quarterbacks (comp %, and TD/INT) are also the most useless. These stats skew public perception of who's good and who's not.

    Look at Michael Vick this year. People point to his NFL high 105.3 Passer Rating as evidence of his greatness. But that rating is heavily inflated by his non-predictive 7/0 TD/INT ratio, which will almost certainly regress toward the mean as the season goes on.

    When are the "experts" going to realize that Net Yards Per Attempt is the way to go?

  3. Sampo says:

    Mike Vick is awesome! His EPA/P is the best in the league and his AYPA is the second best!

  4. Zach Myers says:

    Thanks Brian for this comparison of statistics. Hopefully some day ESPN and the rest of the sports world will catch up and start paying attention to what makes teams win.

    @Sampo
    I agree Vick is good. However, he is a good example of the inefficiency of passer rating.

  5. Guy says:

    Great analysis, Brian. FYI: I think you're missing Def run efficiency in your table.

  6. DSMok1 says:

    Brian: would you care to comment on how the bimodal shape of the passing play results interacts with your SR and Eff numbers (basically a mean and a percentile) and how that compares to the unimodal character of the rushing play results? SR and Eff, as percentile/threshold and mean, behave differently when looking at bimodal vs. unimodal distributions. For this reason, I wonder if some sort of Monte Carlo simulation might not be better?

  7. Brendan says:

    Great stuff Brian. One question: do you think that because offensive statistics correlate better with winning than their defensive counterparts, teams should pour the majority of their resources (draft picks, salary, etc.) into the offensive side of the ball? Or do you think that kind of approach would be untenable?

  8. Brian Burke says:

    Oops. Left off D run eff somewhere early on in the process. I might have to take some time to dig it up.

    I think pass eff is more predictive than run eff because you can't 'dial up' a deep run. Being able to throw (relatively) deep means a lot of things are going well for the offense, blocking, routes, qb, play calling.

    Running is the kind of event where once a RB breaks past the front line of defenders, it becomes a random number generator regarding how far down-field he gets. So future deep runs are a function of a team's ability to get the RB past the front 7, i.e. SR. Passing can be like that too in some cases, but much less often.

  9. Brian Burke says:

    Regarding pouring resources into the offense, I'd say yes to a limited extent. Yes because every play passes through the head and arm of your QB.

    In football, performance on your team is dependent on the averaged abilities of the 11 guys on the field at any time. But when you're passing the ball, the performance is disproportionately on the shoulders of the QB.

    Defenses are more alike. There is less variance among defenses than among offenses because of the single-point focus on the QB. Teams like the 2000 Ravens or '85 Bears come around maybe only once a generation.

    It's hardly a revelation to say you need a top QB, but it's true. Bill Polian should go on the lecture circuit or write a book about the secret to winning in the NFL: win the lottery by drafting a HoF QB (Manning or Kelly).

  10. football tickets says:

    Hey nice to read it. I was not aware of SR formula. It was really interesting for me to read this. Thanks for the share.

  11. Guy says:

    Brian: Noodling on this some more, don't you want predictivity to be the product of reliability (r-self) and win impact controlling for everything else? That is, Run SR and pass SR are fairly highly correlated, so each of their individual r(win) is capturing some of the other factor's win impact. So I think you want to try to isolate each element. (However, it's a bit tricky since there likely is a true causal relationship: increasing one SR probably improves the other.)

  12. Anonymous says:

    Great stuff, as always.

    Although wouldn't using Cronbach's alpha be a more robust measure of consistency/self-correlation? As I understand it, it measures the correlation between two halves of a data set but averaged across all possible ways of splitting the data set in two. By taking the decidedly non-random approach of alternating games, you could be opening up the analysis to all sorts of lurking variables.

  13. Chris says:

    To follow up on Anon. I have no idea about what Cronbach's alpha is, but one thought on a lurking variable w/ alternating variables is potential issues with home vs. away. What I mean is perhaps the NFL attempts to set up a certain home, away, home, away schedule as much as possible. (I'm speculating), if they do this enough, then you aren't getting a truly random sampling of games.

  14. Vince says:

    Why not just directly calculate the correlation between the statistic in one half of the season and wins in the other half of the season, instead of calculating the two correlations (r(win) and r(self)) and multiplying them?

  15. Jim Glass says:

    "Defenses are more alike. There is less variance among defenses than among offenses because of the single-point focus on the QB."

    There is less variance among Ds than Os, but an alternative explanation for the fact is that Ds must be "generalists" while Os are "specialists".

    To make the point clear consider college ball. Offense 1 plays the Air Raid spread offense throwing, throwing, throwing for 50 points per game if it can. Offense 2 plays the wishbone, running, running, running... Each plays only that, over and over in practice and games, specializing.

    But Defense A has to be able to play against both of them, and also against the Pro Set in all its many variations, and against who knows what else, the Single Wing? Its players have to be generalists who can play against anything, and they can spend only limited practice time against each different kind of offense.

    The same thing is true for Defenses B, C, D... all of them. As generalists one would logically expect their peak performance level to be lower than that of specialists, and might reasonably expect their bottom performance level to be higher than that of specialist systems that collapse for whatever reason. So the variance among Ds would be smaller than that among Os.

    I suggest the same principle is at work in the pros, in a milder form.

    I do this as a member of the small and apparently strange cult that believes the QB is much less important to a team's success than is generally considered (even while being the single most important player on the team by a lot -- with 35 or so regulars contributing to each game, there's no contradiction).

  16. Jim Glass says:

    I meant to add just above that the "single-point focus on the QB" and "generalist - specialist" explanations are not inconsistent, they could both be true and at work in the numbers.

  17. weinsteinium says:

    Ok, a nice simple number that has a fairly high predictive value. I like it. The only problem is that I have no idea whether 47% for a QB means that the player is helping or hurting his team. And the same value for a QB and RB mean completely different things (RB SR is much lower)

    So I took your nice and simple statistic and I obfuscated it (and hopefully illuminated it at the same time).

    I give you, "Normalized Success Rate"
    Player SR(%) Normalized SR
    12-A.Rodgers 52.9 133.921691
    9-D.Brees 52.6 132.5854683
    10-E.Manning 51.9 129.4676154
    18-P.Manning 51.8 129.0222078
    7-M.Vick 51.4 127.2405776
    9-D.Garrard 50.2 121.8956869
    12-T.Brady 49.9 120.5594642
    14-Sh.Hill 49.8 120.1140566
    9-C.Palmer 49.6 119.2232415
    4-K.Kolb 49.4 118.3324264
    8-M.Schaub 49.1 116.9962037
    9-T.Romo 49 116.5507961
    8-K.Orton 48.7 115.2145734
    17-P.Rivers 48.3 113.4329432
    6-S.Wallace 48.3 113.4329432
    7-B.Roethlisberger 48.1 112.5421281
    7-M.Cassel 47.8 111.2059054
    2-M.Ryan 47.2 108.53346
    5-J.Flacco 46.9 107.1972373
    7-C.Henne 46.7 106.3064222
    5-J.Freeman 46.1 103.6339769
    12-C.McCoy 46 103.1885693
    6-J.Cutler 45.1 99.17990124
    5-D.McNabb 45 98.73449368
    14-R.Fitzpatrick 44.3 95.61664075
    10-V.Young 44.1 94.72582563
    4-B.Favre 44.1 94.72582563
    3-J.Kitna 43.7 92.94419539
    11-A.Smith 43.7 92.94419539
    3-M.Moore 43.3 91.16256514
    5-B.Gradkowski 43.1 90.27175002
    3-D.Anderson 43.1 90.27175002
    8-M.Hasselbeck 42.5 87.59930465
    5-K.Collins 42.4 87.15389709
    8-S.Bradford 42.2 86.26308197
    6-M.Sanchez 42.1 85.81767441
    8-J.Campbell 39.4 73.79167025
    2-J.Clausen 33.1 45.7309939

    I took all QB's from 2000-2010 and calculated the standard deviation and then used that and the average to center the numbers around 100.

  18. weinsteinium says:

    Man, that looked much better before I posted it

    Anyone know how to make columns line up?

  19. Ian Simcox says:

    I've run a test over the 02-09 data set comparing how well wins correlate with themselves (in order to test the predictivity of wins as set out in the post). The self-correlation is 0.42. Obviously wins correlates perfectly with wins, so the predictivity of wins is 0.42, putting it top of the list. I was actually slightly surprised by this, I expected at least one of the stats to be better than wins as a predictor of wins.

  20. weinsteinium says:

    For comparison's sake, what do the net punting and net kickoff yards look like (offensive and defensive)? You can argue that blocked punts and missed field goals are essentially random and non-predictive, but I think that a lot of people feel that kickoff and punting yards might be significant.


    Here is the Normalized Success Rate for RB's. That's more interesting than the QB one because yards per attempt is highly correlated with winning while yards per rush isn't.

    A couple of interesting things come up like Marion Barber, Ladell Betts, Beanie Wells and Clinton Portis who have negative EPA/P but good success rates.

    At the other end of things, guys like LeGarrette Blount, Brandon Jackson & Thomas Jones have positive EPA/P but negative success rates. However none of their EPA/P are very positive so it's not as interesting as the guys with negative EPA/P but good success rates.

    Player NSR
    23-A.Foster 146.9
    23-S.Greene 138.8
    25-L.McCoy 135.7
    39-D.Woodhead 133.7
    25-J.Charles 132.2
    42-B.Green-Ellis 129.1
    27-B.Jacobs 127.1
    35-M.Tolbert 126.4
    40-P.Hillis 124.8
    29-M.Bush 122.9
    26-C.Portis 122.5
    46-L.Betts 122.5
    20-D.McFadden 121.3
    29-J.Addai 114.3
    34-R.Williams 114.3
    23-P.Thomas 112.0
    44-A.Bradshaw 112.0
    29-C.Ivory 110.5
    24-M.Barber 107.4
    28-A.Peterson 106.2
    21-L.Tomlinson 105.4
    44-J.Snelling 105.0
    21-F.Gore 103.9
    26-C.Wells 101.9
    23-W.McGahee 100.0
    34-T.Hightower 100.0
    32-M.Jones-Drew 99.6
    32-C.Benson 96.5
    33-M.Turner 96.1
    28-F.Jones 96.1
    24-R.Mathews 96.1
    32-B.Jackson 94.6
    44-J.Best 94.6
    23-R.Brown 93.8
    27-R.Rice 93.4
    31-D.Brown 92.2
    20-T.Jones 91.1
    20-J.Forsett 90.7
    29-C.Taylor 89.1
    21-C.Spiller 88.0
    28-C.Johnson 87.6
    34-R.Mendenhall 85.6
    46-R.Torain 84.9
    30-J.Kuhn 84.9
    27-L.Blount 84.9
    23-M.Lynch 83.3
    22-F.Jackson 81.0
    27-K.Moreno 80.6
    22-M.Forte 77.9
    34-D.Williams 72.8
    39-S.Jackson 70.1
    24-C.Williams 64.3
    24-M.Lynch 50.0
    28-C.Buckhalter 48.0
    26-L.Maroney 43.7
    28-J.Stewart 39.5

  21. Florida Danny says:

    brian, as a more philosophical point -- and with that 4th paragraph, i think you've opened the door to an on-topic philosophical discussion -- aren't these findings also evidence of the futility of NFL prediction?

    i mean, the most predictive stat on the list is hardly "predictive" at all in the sense of r-squared. and, as brendan said earlier, i don't think there's much more unique variance explained above and beyond offensive pass efficiency given shared variances and the resulting threat of multicollinearity.

    i just find that, having been doing these kinds of analyses for several years now, the ceiling of predictivity for team wins is pretty low; somewhere in the 50% range.

    your thoughts?

  22. Brian Burke says:

    I agree. I suspect that's part of what is so compelling about the NFL's product.

    But as I wrote, it's not always about predicting winners. It's about finding out what aspects of team performance are consistent and reliable indicators of a good team, and ultimately, where teams should put their resources.

    weinsteinium-I did a post earlier on break even SR. For QBs I think it's right at 46 or 47%. For RBs it's 44%. There's nothing wrong with normalizing it. But at the end of it all, it's about simplicity to me. I'm all about the units. EPA is net points, and WPA is in probability of winning games. SR is so simple: % of plays that help a team rather than hurt it.

  23. Michael Beuoy says:

    "In other words, there is no 'right now.' There is no is. There is the known past, clouded by randomness, and there is the unknown future, clouded by uncertainty. Now is merely the ephemeral intersection between the past and future."

    Dude, you just blew my mind. Seriously, that paragraph is just one of the many reasons I'm always on this site.

    Advanced NFL Stats has pretty much ruined all other football analysis for me.

  24. weinsteinium says:

    The two most interesting things I've seen looking at success rate and EPA/P are Chris Johnson last year and Ladell Betts this year.

    Last year Chris Johnson had a 0.11 EPA/P, which translated into a league-leading 49.7 EPA for the season. However, he only had a 38.6 Success Rate, which is below the league average of 40. Does that mean that he wasn't helping his team last year? Or does it mean that, based on his success rate, his yards per attempt weren't statistically sustainable? This year his Success Rate is down to 36.8 and his EPA/P is -0.02.

    Ladell Betts is the opposite: his EPA/P is a miserable -0.23 but his Success Rate is a healthy 45.8. How is that possible? My first guess was fumbles but he only has one. Maybe he is losing a lot of yards on some of his carries and then getting just enough to be successful on the other ones. Has anyone watched enough New Orleans games to know what is going on with him?

  25. Sampo says:

    @Michael

    So true.

  26. weinsteinium says:

    @Brian - According to your post, the break even point on running success rate is about 39: http://www.advancednflstats.com/2010/10/what-is-break-even-run-success-rate.html

    The average running back success rate from 2000-2010 (through week 10) is 40. I used the average when I calculated my Normalized Success Rate. Interesting that the average RB is mildly successful. That probably means something but I don't know what.

    The average Success Rate for a QB between 2000 and 2010 is 45.28, I couldn't find anything about a break even point for QB's.

    The average success rate for a TE is WAY, WAY higher, like 52%.

  27. Brian Burke says:

    Receiver stats like EPA and SR are going to be higher than for QBs or RBs because a WR or TE's name is never mentioned in the play-by-play for not getting open. Only when other things have gone well--pass protection, QB read, getting open--is a receiver called out.

    To answer the question about why I didn't just correlate stats from the 1st half of the season with wins from the 2nd half: I did it this way for 2 reasons. First, this method doubles the sample size. And second, it lets me learn specifically about the consistency of the various facets of the game.

  28. Ian Simcox says:

    I really enjoyed that article and it gave me a great idea for some of the stuff I do around English Premier League football. The self-correlation thing stood out as something I've never tested. Turns out that shots per game, which are well correlated with goals, also correlate with themselves at 0.80 - which I thought was an amazing correlation for something in the real world.

    They don't really keep enough stats to do analysis like you do here, but in terms of predicting match probabilities, number of shots is something I'm putting into my models. That kind of self-correlation can't be ignored.
