We've seen how a win probability model can help coaches make better decisions, but it can also help settle some water cooler debates around the office. One of the cool things we can do with a win probability model is compare teams and players based on how much they contribute to their chances of winning. If we sum up all of a defense's contributions toward winning games, we can truly rank defenses.
One team might have the best defense in terms of total yards, and another might have the best defense in terms of yards per play. Yet another might be the best in terms of points allowed. All of these ways of comparing defenses are flawed in one way or another. For example, a defense paired with a poor offense will frequently be defending short fields.
Win Probability Added (WPA) can account for these various considerations and provide a very good estimate of how good a squad actually was. WPA should be considered a 'narrative' stat. It measures what actually happened and doesn't attempt to do any more than that. I'll start by looking at defenses since 2000, as far back as my data goes.
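As a rough illustration of what "summing up a defense's contribution" means, here is a minimal sketch. It assumes each play is credited with the change in the opposing offense's win probability, which is why negative totals are good for a defense. The `Play` record, the `defensive_wpa` helper, and the sample numbers are all hypothetical and are not taken from the actual model or data.

```python
from dataclasses import dataclass


@dataclass
class Play:
    defense: str      # team on defense for this play
    wp_before: float  # opposing offense's win probability before the snap
    wp_after: float   # opposing offense's win probability after the play


def defensive_wpa(plays, team):
    """Sum the offense's win-probability change over every play `team` defended."""
    return sum(p.wp_after - p.wp_before for p in plays if p.defense == team)


# Made-up plays, for illustration only
plays = [
    Play("BLT", 0.50, 0.46),  # sack: the offense's chances drop
    Play("BLT", 0.46, 0.49),  # first down allowed: the offense's chances rise
    Play("TB", 0.60, 0.55),
]

blt_wpa = defensive_wpa(plays, "BLT")
print(round(blt_wpa, 2))
```

Under this sign convention, a defense that consistently lowers its opponents' chances accumulates a negative total over a season.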
The WPA figures listed for each year are raw totals, including playoff games. In a way, you can think of them as saying 'this is the number of games a team would win given a completely average offense and special teams.' For example, the 2000 Ravens defense had a -6.0 WPA and would have won 14 games (8+6) had their offense and special teams simply held serve, making absolutely no contribution toward winning. (Edit: Tom Tango pointed out my original thinking was in error. I had previously said 12. Thanks!)
I've also added a column for WPA per game to account for teams with playoff appearances. You could think of this number as how much a defense added to their team's chance of winning any given game. For example, the Baltimore defense (-0.17 per game) would turn a 50% chance of winning a game into a 67% chance.
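The arithmetic behind those two examples can be sketched as follows. The function names are hypothetical, and the 16-game baseline is a simplifying assumption, since the raw totals also include playoff games.

```python
GAMES_PER_SEASON = 16  # assumed baseline: regular-season length in these years


def wins_with_average_support(defense_wpa, games=GAMES_PER_SEASON):
    """Wins for a team whose offense and special teams are exactly average.

    An average team wins half its games; negative defensive WPA adds wins.
    """
    return games / 2 - defense_wpa


def win_chance(defense_wpa_per_game, baseline=0.50):
    """Single-game win chance after adding the defense's per-game contribution."""
    return baseline - defense_wpa_per_game


ravens_2000_wins = wins_with_average_support(-6.0)
baltimore_chance = win_chance(-0.17)

print(ravens_2000_wins)            # 8 + 6
print(round(baltimore_chance, 2))  # 50% plus 17 points
```

Subtracting the WPA, rather than adding it, reflects the sign convention used here: a negative number means the defense helped.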
Negative numbers are good, and positive numbers are bad for defenses.
Defense | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | Total | Per G |
BLT | -6.0 | -2.7 | -1.6 | -6.0 | -1.6 | -0.5 | -1.8 | 0.4 | -5.8 | -25.7 | -0.17 |
TB | -2.7 | -2.7 | -6.5 | -1.4 | -1.2 | -3.1 | 0.5 | -2.1 | -1.6 | -21.0 | -0.14 |
PHI | -2.8 | -1.5 | -3.9 | -1.3 | -0.8 | -1.2 | -0.2 | 1.0 | -4.5 | -15.4 | -0.09 |
CHI | -0.6 | -3.5 | 0.8 | 1.6 | -1.6 | -5.6 | -4.9 | -0.4 | 0.2 | -14.0 | -0.09 |
PIT | -1.4 | -1.9 | -1.1 | 0.3 | -2.1 | -2.5 | 1.0 | 0.0 | -5.8 | -13.6 | -0.09 |
MIA | -2.2 | -1.2 | -1.4 | -2.7 | -2.6 | 0.3 | -1.5 | 3.9 | -1.5 | -9.2 | -0.06 |
NE | 2.5 | -1.9 | 2.5 | -5.9 | -3.2 | 1.5 | -1.9 | -1.5 | 0.1 | -8.0 | -0.05 |
CAR | -0.4 | 2.9 | -4.8 | -0.8 | 2.2 | -2.7 | -0.9 | -0.5 | -0.7 | -6.0 | -0.04 |
TEN | -2.4 | 2.5 | 0.1 | -2.9 | 0.9 | 1.2 | 1.3 | -0.7 | -5.1 | -5.3 | -0.04 |
WAS | 0.1 | -1.3 | -1.4 | 1.9 | -3.2 | -2.0 | 2.2 | -1.1 | 0.0 | -5.1 | -0.03 |
NYG | -2.2 | -2.2 | 0.1 | 2.2 | 1.1 | -1.1 | 0.5 | -2.8 | 0.9 | -3.9 | -0.03 |
NYJ | -1.4 | -1.4 | 0.8 | 1.3 | -1.8 | 0.2 | 0.7 | 1.2 | -2.2 | -2.7 | -0.02 |
BUF | -1.5 | 2.3 | -0.1 | -1.3 | -3.2 | 2.8 | -1.0 | 0.1 | 0.5 | -1.6 | -0.01 |
GB | 0.7 | -1.6 | -0.8 | 1.3 | 1.4 | 0.4 | -2.7 | -1.8 | 1.9 | -1.5 | -0.01 |
DAL | 0.4 | -0.2 | -0.7 | -1.3 | 1.5 | -1.6 | 1.9 | 1.1 | -0.9 | -0.2 | 0.00 |
DEN | 0.8 | -1.0 | 0.9 | -0.1 | -1.9 | -0.8 | -2.2 | 1.2 | 3.8 | 0.4 | 0.00 |
SF | 4.1 | 1.9 | -0.6 | 0.3 | 1.2 | -0.7 | 0.1 | -0.8 | -1.4 | 3.9 | 0.03 |
JAX | 1.5 | 0.4 | 0.4 | 0.2 | 0.6 | -2.2 | -0.1 | 0.9 | 2.5 | 4.0 | 0.03 |
IND | 2.7 | 2.1 | -0.4 | -0.2 | 2.2 | -0.8 | -0.6 | -0.1 | 0.1 | 4.7 | 0.03 |
SL | 2.1 | -2.8 | -0.9 | -2.1 | 1.2 | 2.4 | 2.5 | 0.3 | 2.3 | 4.7 | 0.03 |
SEA | 3.0 | 0.7 | 3.1 | 0.4 | 0.0 | -1.2 | -0.2 | -2.1 | 1.3 | 4.8 | 0.03 |
SD | 0.4 | 3.0 | 0.4 | 1.7 | 0.6 | 1.3 | -1.6 | -2.0 | 1.4 | 4.9 | 0.03 |
ARZ | 2.4 | 1.1 | 1.7 | 3.0 | -2.5 | 0.2 | 1.0 | -0.8 | -0.3 | 5.5 | 0.04 |
CLV | 2.0 | -2.3 | 2.5 | 0.9 | -0.9 | 0.8 | 1.2 | 1.2 | 1.2 | 6.3 | 0.04 |
OAK | -1.3 | -1.3 | -0.1 | 1.2 | 3.8 | 3.1 | 0.1 | 1.1 | 1.6 | 8.1 | 0.05 |
ATL | 0.1 | 4.1 | -1.5 | 1.3 | -1.2 | 1.3 | 1.9 | 0.6 | 2.6 | 9.0 | 0.06 |
DET | -2.5 | 3.3 | 0.0 | 0.7 | 0.4 | -0.6 | 2.3 | 1.5 | 5.2 | 10.1 | 0.07 |
MIN | 2.9 | 2.6 | 5.2 | 0.9 | 4.3 | 0.5 | -2.3 | -1.9 | -1.8 | 10.2 | 0.07 |
CIN | 3.3 | 0.7 | 1.3 | 2.8 | -0.3 | 0.9 | 2.1 | 1.0 | 0.0 | 11.6 | 0.08 |
NO | 0.1 | 2.3 | 2.0 | 0.9 | 1.6 | 3.4 | 0.9 | 0.8 | 1.7 | 13.5 | 0.09 |
HST | | | -1.1 | 1.8 | 0.1 | 4.3 | 1.5 | 2.3 | 1.7 | 14.4 | 0.13 |
KC | 0.0 | 1.5 | 4.8 | 1.4 | 5.1 | 1.9 | 0.0 | -0.2 | 2.6 | 17.0 | 0.12 |
No surprises here. Baltimore's defense is undeniably the defense of the decade on the strength of their 2000, 2003, and 2008 squads. Tampa Bay is several games behind, both in terms of total WPA and per-game WPA. Tampa also owns the best single season at -6.5 WPA. Baltimore owns the next two best seasons at -6.0 WPA. Other notable seasons belong to the 2008 Titans and Steelers, and the 2003 Patriots.
Minnesota's 2002 defense and Detroit's 2008 defense tie as the worst with +5.2 WPA. The Kansas City defense of 2004 is close behind at +5.1 WPA. Also note that Houston's defense would be the overall worst on a per game basis.
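For anyone who wants to check these rankings, here is a minimal sketch using four rows copied from the table. The variable names are my own, and because the table values are rounded, the summed totals can differ from the published Total column by a tenth or two.

```python
YEARS = list(range(2000, 2009))

# Season-by-season defensive WPA, copied from the table (2000 through 2008)
wpa = {
    "BLT": [-6.0, -2.7, -1.6, -6.0, -1.6, -0.5, -1.8, 0.4, -5.8],
    "TB": [-2.7, -2.7, -6.5, -1.4, -1.2, -3.1, 0.5, -2.1, -1.6],
    "MIN": [2.9, 2.6, 5.2, 0.9, 4.3, 0.5, -2.3, -1.9, -1.8],
    "KC": [0.0, 1.5, 4.8, 1.4, 5.1, 1.9, 0.0, -0.2, 2.6],
}

# Flatten to (wpa, team, year) tuples so min/max compare on WPA first
seasons = [
    (value, team, year)
    for team, row in wpa.items()
    for year, value in zip(YEARS, row)
]

best = min(seasons)   # most negative WPA is the best defensive season
worst = max(seasons)  # most positive WPA is the worst

# Multi-year totals, ranked from best (most negative) to worst
totals = {team: round(sum(row), 1) for team, row in wpa.items()}
ranking = sorted(totals, key=totals.get)

print(best, worst)
print(ranking)
```

Among these four teams, the best and worst single seasons both fall in 2002: Tampa Bay at -6.5 and Minnesota at +5.2.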
"Win Probability Added (WPA) can account for these various considerations and provide a very good estimate of how good a squad actually was."
To me *good* implies something more along the lines of inherent, repeatable skill or ability. Instead, I would say WPA provides a very good estimate of how *valuable* a squad was. WPA is greatly influenced by context and situational effects, which are highly random. I would bet WPA is not very consistent from game to game or year to year and thus not as predictive as more conventional stats. As you note, it is a narrative stat, i.e. retrodictive.
I agree that WPA can be a very useful concept, but I don't want to see it presented as something it is not.
That's how it was presented in his post. Don't see your point.
"WPA should be considered a 'narrative' stat. It measures what actually happened and doesn't attempt to do any more than that."
What's with that first comment? It's like saying 'I agree with what you wrote. But if you had written something completely different, I would disagree with you.' No use splitting hairs over "good" vs. "valuable" in a 200-word blog post.
I love WPA in football analysis because it seems to me to be the only way to entirely account for context. Sure, the contextual factors mean you need a bigger sample size, but WPA gives you a ton to make up for this. It's retrodictive to a point, but its inclusive nature means WPA captures tons of stuff that simple component metrics don't and can't account for, things that may well be predictive given a large sample. I'm looking forward to more WPA analysis, especially to some form of run-pass analysis - I think WPA has the most potential to fairly conclusively answer the "passing paradox" and tell us how optimal teams are in their mix of running and passing.
It seems to me that this metric will rate a good defense higher the worse its own team's offense is (not the opposing offense). A defensive stop is worth a lot more in a 7-3 game than in a 35-3 game.
Adding to what JB H said, is there a way for you to calculate WPA/LI? For instance, the Patriots D in 2007 had a -1.5 WPA, but I'm sure their WPA/LI was near the high 2's/low 3's since they were involved in so many blowouts (hence not as much opportunity for their defense to help them).
Zach, interesting, but I think we'd be better off with an Expected Points rating. Doing WPA/LI would be somewhat circular (unless I misunderstand). LI has to be derived from WP, whichever particular method is used to estimate it. So when you do WPA/LI, you're making all plays with the same outcomes equivalent in terms of score and time remaining. I'd just go with Expected Points Added (EPA), which I think is simply what Football Outsiders' DPAR/DYAR is.
Actually, WPA/LI is not circular. But there are probably only a handful of people in the world who understand why I do WPA/LI, so this is a failing of mine that I can't explain it better than I have...