Published: 02 November 2014
Before tools such as EPA and WPA were available, I relied on team efficiency stats to estimate team strength. Yards per pass attempt or per run attempt worked out to be very good estimators of how good a team was, especially if ‘good’ is defined as being likely to win forthcoming games. Efficiency stats had the added benefit of being relatively simple, widely available, and easy to calculate.
Efficiency stats also worked well in regression analysis. In a regression model, it’s best if the predictor variables are independent of each other. In other words, the less each predictor variable correlates with the others, the more valid and reliable the resulting model will be. Passing and running efficiencies in the NFL correlate weakly. Over the past 10 seasons, offensive passing and running efficiencies for each team correlate at 0.09 (where 1 would mean lock-step correlation and 0 would mean complete independence).
Problems with efficiency stats
However tidy independence makes a regression model, it defies the logic of game theory. In football speak, it would mean there is little evidence that “the run sets up the pass” and vice-versa. In zero-sum two-strategy games, outcomes are optimized at an equilibrium (called the “minimax”) where the benefit of one strategy equals the benefit of the other strategy. Although this is provable mathematically, it makes intuitive sense too. If one option pays off better than another option, a player should choose the superior option more often until his opponent responds appropriately. Eventually, the payoffs will equalize at a stable value. Any deviation by either player from the optimum mix of strategies opens a hole for his opponent to exploit. Consequently, we should expect a reasonably strong correlation between passing and running efficiencies.
There are limited examples of football coaches “playing minimax” in situations when the payoffs are clear, but for the most part, that’s not the case. So what exactly are coaches optimizing? Why aren’t passing and running efficiencies more connected?
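To make the minimax idea concrete, here is a minimal sketch of a two-strategy zero-sum game between an offense (run or pass) and a defense (run defense or pass defense). The function name `solve_2x2_zero_sum` and the payoff numbers are made up for illustration and are not drawn from any real data. The point is only that, at the equilibrium mix, running and passing pay off equally.

```python
from fractions import Fraction


def solve_2x2_zero_sum(a, b, c, d):
    """Solve a 2x2 zero-sum game whose payoffs go to the offense.

                    defense: run-D    defense: pass-D
    offense: run          a                 b
    offense: pass         c                 d

    Returns (p_run, q_run_d, value): the offense's probability of running,
    the defense's probability of calling run-D, and the value of the game.
    """
    a, b, c, d = (Fraction(x) for x in (a, b, c, d))

    # If the best worst-case for the offense equals the best worst-case for
    # the defense, a pure-strategy saddle point exists and no mixing is needed.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        p_run = Fraction(1) if min(a, b) == maximin else Fraction(0)
        q_run_d = Fraction(1) if max(a, c) == minimax else Fraction(0)
        return p_run, q_run_d, maximin

    # Otherwise each side mixes so that the opponent is indifferent
    # between its two options.
    denom = a - b - c + d
    p_run = (d - c) / denom
    q_run_d = (d - b) / denom
    value = (a * d - b * c) / denom
    return p_run, q_run_d, value


# Hypothetical payoffs (say, average yards) for each matchup.
a, b, c, d = 2, 6, 9, 4
p_run, q_run_d, value = solve_2x2_zero_sum(a, b, c, d)

# Expected payoff of each offensive call against the defense's optimal mix.
run_payoff = q_run_d * a + (1 - q_run_d) * b
pass_payoff = q_run_d * c + (1 - q_run_d) * d

print("P(offense runs) =", p_run)        # 5/9
print("P(defense run-D) =", q_run_d)     # 2/9
print("run payoff =", run_payoff)        # 46/9
print("pass payoff =", pass_payoff)      # 46/9
```

With these invented numbers the offense runs 5 times in 9 even though the pass has the single best matchup, and both calls are worth the same 46/9 against the defense's best response. That equality is the sense in which minimax ties the two kinds of play together.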
Raw efficiency stats also point to the futility of the running game. Offensive running efficiency correlates with team wins at 0.15, a meager relationship compared to the 0.66 correlation of passing efficiency. This stark difference suggested that teams were investing far too much attention and resources into running the ball. My theory was that as the pass became more successful, with efficiency climbing and interceptions plummeting over recent decades, coaches were slow to catch on, clinging to outdated convention.
New Tools
With new tools and new stats, however, it's now clear that the relative futility of the running game is overstated by looking at efficiency stats alone. Chase Stuart of Pro-Football-Reference.com suggested that running efficiency might be too sensitive to long runs, which tend to be rare and relatively random. So I took a fresh look at things and found some very interesting results. It turns out that running correlates with winning to a much higher degree when we look at it using Success Rate (SR).
SR is a very simple concept in principle and has been around for decades. Each play is graded as either a success or not based on its outcome. For example, if a play gains 3 yards on 3rd and 2, that would be a success. But those same 3 yards would be a failure if the situation were 3rd and 4. In the seminal 1980s book Hidden Game of Football, the authors devised a simple rule of thumb based on their intuitive sense of football success. A success would be: on 1st down, a gain of 4 or more yards; on 2nd down, a gain that at least halved the distance to go; and on 3rd down, a conversion for a new set of downs.
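The rule of thumb above translates almost directly into code. This is a minimal sketch; the function name and argument names are my own, and it implements the rule literally as quoted. That means it says nothing about 4th down, and it does not special-case touchdowns or first downs gained on short 1st-down plays.

```python
def is_success_rule_of_thumb(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """Grade one play with the simple down-based rule of thumb."""
    if down == 1:
        # 1st down: a gain of 4 or more yards.
        return yards_gained >= 4
    if down == 2:
        # 2nd down: gain at least half of the distance to go.
        return 2 * yards_gained >= yards_to_go
    if down == 3:
        # 3rd down: convert for a new set of downs.
        return yards_gained >= yards_to_go
    # The quoted rule only covers downs 1 through 3.
    raise ValueError("rule of thumb is only defined for downs 1, 2 and 3")


# The two examples from the text: 3 yards gained on 3rd down.
print(is_success_rule_of_thumb(down=3, yards_to_go=2, yards_gained=3))  # True
print(is_success_rule_of_thumb(down=3, yards_to_go=4, yards_gained=3))  # False
```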
Although I have nothing against this rule of thumb and am generally a big fan of simplicity, my own definition of SR is different. Using the concept of Expected Points Added (EPA), successes can be defined more precisely. Any play that results in a positive change in an offense’s net point expectancy can be considered a success. This technique not only accounts for down and distance considerations, but for field position as well. For example, a play that gains 10 yards on 3rd and 12 would normally be considered a failure, but if those 10 yards put the team in field goal range, that might be considered a success. It also accounts for the effects of the shortened field in the red zone.
Importance of run Success Rate
Based on SR, running correlates with team wins at 0.40, much higher than the 0.15 correlation based on efficiency.
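Under the EPA-based definition, a team's run SR is simply the share of its running plays with positive EPA. The sketch below assumes each play already carries an EPA value from some expected-points model, which is not shown here. The tuple layout, the team label "AAA", and the EPA numbers are all invented for illustration.

```python
from collections import defaultdict


def epa_success_rate(plays):
    """Return {(team, play_type): success rate} for an iterable of plays.

    Each play is a (team, play_type, epa) tuple. A play counts as a
    success only if its EPA is strictly positive.
    """
    successes = defaultdict(int)
    totals = defaultdict(int)
    for team, play_type, epa in plays:
        key = (team, play_type)
        totals[key] += 1
        if epa > 0:
            successes[key] += 1
    return {key: successes[key] / totals[key] for key in totals}


# Made-up plays for one hypothetical team (not real data).
plays = [
    ("AAA", "run", 0.4),
    ("AAA", "run", -0.3),
    ("AAA", "run", 0.1),
    ("AAA", "run", -0.6),
    ("AAA", "pass", 1.2),
    ("AAA", "pass", -0.5),
    ("AAA", "pass", -2.1),
]

rates = epa_success_rate(plays)
print(rates[("AAA", "run")])   # 0.5
print(rates[("AAA", "pass")])
```

Note that the size of each gain or loss is deliberately thrown away: a play worth +0.1 and a play worth +1.2 both count as one success.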
SR reveals something even more interesting. I think it finally answers the question of what coaches are optimizing. Based on SR, passing and running correlate at 0.41, a much stronger relationship than the 0.09 correlation based on efficiency. Coaches are optimizing Success Rate, although they’re probably thinking of a simpler version of SR, much more similar to the rule of thumb than to my EPA-based definition.
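For readers who want to reproduce this kind of comparison, here is a minimal sketch of a correlation between two per-team series. It assumes the figures quoted in this article are ordinary Pearson correlation coefficients, which the text does not state explicitly. The helper name `pearson_r` and the sample success rates are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up run and pass success rates, one pair per team-season (not real data).
run_sr = [0.38, 0.41, 0.44, 0.40, 0.36, 0.43]
pass_sr = [0.42, 0.47, 0.45, 0.41, 0.43, 0.49]

print(f"run SR vs. pass SR: r = {pearson_r(run_sr, pass_sr):.2f}")
```

The same helper works for any pairing discussed here, such as run SR against team wins, as long as the two lists are aligned by team-season.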
So NFL coaches are playing minimax after all! They’re just using a very simple payoff function for the value of each play: either success or failure. The correlation between offensive run and pass EPA is smaller than for SR, at 0.35. This is remarkable because it suggests that coaches are not as sensitive to the magnitudes of the payoffs as they are to the simple dichotomy of “success.” This is understandable, because without an EPA model running in your brain, it’s impossible to accurately assess the value of the myriad possible outcomes on any given play. Coaches are human, and the easiest and simplest way to value outcomes is to say, “Yeah, I think we’re better off than before,” or “Nope, that didn’t work.”
This mindset is echoed in an old football saying, attributed to Texas coach Darrell Royal and still often repeated: “Three things can happen when you throw the football, and two of them are bad.” That says it all, doesn’t it? Coaches classify outcomes as good things and bad things, and then count them up. Coaches obviously know that an interception is a lot worse than an incompletion or a 1-yard stuff. But they don’t accurately account for the payoff and frequency of each possible outcome.
As I understand it, most coaching staffs grade player performance the same simple way. For every snap, players are graded-out either as successful or not, and their total for the game is compiled into an overall percentage grade. It should be no surprise that coaches think of entire plays in the same way.
Whether or not coaches admit it, could articulate it, or are even aware of it, they're predominantly thinking in a simple success-or-not paradigm.
Defensive Success Rate
Defensive SR shows relationships similar to those of offensive SR. Although defensive run efficiency correlates with team wins about the same as defensive run SR (-0.16 and -0.17, respectively), the correlation strengthens to -0.26 when we only look at runs from the first three quarters. This excludes the highly predictable run-out-the-clock runs in the 4th quarter that are usually short gains. Additionally, defensive run and pass SR correlate with each other at 0.41, stronger than the 0.31 for defensive run and pass efficiencies. Just like on offense, net pass efficiency is still king on the defensive side of the ball, correlating with team wins at -0.56. (Defensive stats usually correlate with winning negatively, because lower numbers are better.)
Implications
Coaches appear to be overly focused on play-level success (represented by SR) and not focused enough on drive-level (represented by EPA) and game-level success (represented by WPA). They’ll spend late nights in the film room dissecting every possible match-up for the slightest advantage on a single play, but they’ll ignore the numbers that suggest they should pass more or go for it on 4th down. They’re looking down at the sport from a 10-foot ladder when they should also be looking at it from the 10,000-foot level.
It’s possible that SR can help improve prediction models that currently rely solely on efficiency statistics. I’ve already tested several SR-based regressions, but for one reason or another, they fail to outperform the efficiency-based models. The problem is more complex with SR, because of the correlation between running and passing. Ideally, run and pass SR would have equal weights because of game-theoretic effects, and overall team SR would be the only thing that matters.
It’s also notable that passing efficiency, both raw efficiency and passing EPA, remains the most important facet of the game. No matter how it’s sliced and diced, passing reigns supreme. However, running is more important with regard to winning than previously indicated, and it does help “set up the pass” as game theory had predicted.
Does this mean that teams should be running as often as they do? No, not in most circumstances. There is a difference between descriptive and prescriptive analysis. This is descriptive, which means this is what coaches are doing to win games. If all the coaches adhere to the same conventional wisdom, their strategic flaws would virtually never be exposed. The prescriptive analysis remains the same. Generally, teams should be passing more often on 1st and 2nd down, and running more often on 3rd down and in the red zone.
The bottom line is that we should pay attention to run SR. It’s very likely a predictor of passing performance and of overall team success.