Recently I looked at team defense through the lens of Win Probability Added (WPA). Every play changes a team's chances of winning, and we can sum all the ups and downs into a total WPA for any player, team, offense, or defense. In this post, I'll look at team offense.
As with defenses, one offense might be the best in yards and another might be the best in yards per play. Yet another offense might be the best in terms of points. WPA can cut through all that, and tell us which squads made the biggest impacts on winning.
WPA captures the things that other stats cannot. For example, consider an offense leading by one point with 3 minutes left in the game. A series of modestly successful runs to convert a first down and kill the clock won't make fantasy fans happy, but it effectively clinches the win. A stop and a punt would typically give the opponent over a 30% chance of winning, so that grinding first down conversion may mean more in terms of winning than almost any other series all season.
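As a rough illustration of the bookkeeping, here is a minimal sketch of how per-play changes in win probability could be summed into a unit's WPA. The play list, the WP values, and the function name are all hypothetical; they are not taken from the actual model or from any real game.

```python
from collections import defaultdict

# Hypothetical plays: (unit credited, WP before the play, WP after the play),
# all from the perspective of the team being analyzed. Values are made up.
plays = [
    ("offense", 0.70, 0.74),  # modest run on first down
    ("offense", 0.74, 0.78),  # another short gain
    ("offense", 0.78, 0.93),  # first-down conversion that lets the clock run out
]

def total_wpa(plays):
    """Sum WP changes (after minus before) for each unit."""
    totals = defaultdict(float)
    for unit, wp_before, wp_after in plays:
        totals[unit] += wp_after - wp_before
    return dict(totals)

wpa = total_wpa(plays)
print(round(wpa["offense"], 2))
```

None of these plays would stand out in a box score, yet under these assumed values the series adds 0.23 WPA, most of it on the conversion itself.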
The WPA figures listed for each year are raw totals, including playoff games. They effectively say 'this is how many games above or below average a team would finish given a completely average defense and special teams.' For example, the 2000 Colts offense posted a +4.5 WPA. This suggests the Colts would have won 12.5 games (8 + 4.5) had their defense and special teams simply held serve, making absolutely no contribution toward winning.
I've also added a column for WPA per game to account for teams with playoff appearances. You could think of this number as how much an offense added to its team's chance of winning any given game. For example, the Indianapolis offense (+0.23 per game) would turn a 50% chance of winning a game into a 73% chance.
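The two conversions described above are simple enough to sketch in code. This is only an illustration: the function names are hypothetical, and the 8-win baseline assumes a 16-game regular season, which is an approximation because the season totals also include playoff games.

```python
AVERAGE_WINS = 8.0  # assumption: a .500 record over a 16-game regular season

def implied_wins(season_wpa, baseline=AVERAGE_WINS):
    """Wins implied by a unit's season WPA if the rest of the team is exactly average."""
    return baseline + season_wpa

def implied_win_prob(wpa_per_game, baseline=0.50):
    """Per-game win probability implied by WPA per game, clamped to [0, 1]."""
    return min(1.0, max(0.0, baseline + wpa_per_game))

print(implied_wins(4.5))                 # 2000 Colts offense -> 12.5
print(round(implied_win_prob(0.23), 2))  # Colts offense, 2000-2008 -> 0.73
```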
Offense | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | Total | Per G |
IND | 4.5 | 1.1 | 0.0 | 3.1 | 7.0 | 5.2 | 7.7 | 4.3 | 3.7 | 36.2 | 0.23 |
DEN | 3.3 | -1.7 | 1.3 | 3.8 | 0.7 | 3.2 | -0.6 | 2.7 | 3.3 | 15.8 | 0.10 |
NE | -2.8 | 1.1 | 0.2 | 0.2 | 4.0 | 2.8 | 0.0 | 6.5 | 1.7 | 13.5 | 0.08 |
KC | -0.1 | 0.6 | 4.5 | 5.1 | 3.8 | 4.4 | 0.4 | -3.3 | -2.4 | 12.8 | 0.09 |
PIT | 1.0 | 3.5 | 1.1 | -1.9 | 2.7 | 2.0 | 1.7 | 1.5 | 0.1 | 11.6 | 0.07 |
SD | -3.4 | 0.7 | -0.1 | 0.2 | 3.9 | 1.8 | 3.3 | 0.8 | 2.9 | 9.8 | 0.07 |
SEA | -0.3 | 1.2 | 2.2 | 3.0 | 0.8 | 3.8 | 0.0 | 0.8 | -1.4 | 9.8 | 0.06 |
NO | 1.8 | 0.9 | -0.1 | 0.6 | 1.5 | 0.0 | 1.2 | 0.6 | 2.9 | 9.4 | 0.06 |
MIN | 4.8 | 0.0 | 2.6 | 3.9 | 4.5 | 1.3 | -5.2 | -2.2 | -0.4 | 9.0 | 0.06 |
NYG | 1.0 | -0.5 | -0.3 | -0.2 | 0.7 | 1.0 | 0.9 | 0.9 | 3.7 | 7.0 | 0.05 |
SL | 4.7 | 4.4 | -0.1 | 0.4 | 2.6 | -0.8 | 1.7 | -2.9 | -3.8 | 6.0 | 0.04 |
GB | 0.6 | 1.3 | 0.4 | 2.9 | 1.9 | -2.9 | -1.6 | 1.7 | 1.8 | 6.0 | 0.04 |
TEN | 0.7 | 0.8 | 2.9 | 0.8 | 0.2 | -2.8 | 0.9 | 1.1 | -0.8 | 3.3 | 0.02 |
CIN | 0.0 | -0.6 | -3.3 | 2.6 | -0.7 | 4.4 | 2.6 | 0.8 | -3.2 | 2.3 | 0.02 |
ATL | -4.9 | 1.4 | 1.6 | -1.0 | 0.0 | 2.3 | 1.7 | -2.2 | 3.6 | 2.2 | 0.01 |
PHI | -0.7 | -0.7 | 0.5 | 1.9 | 2.8 | -4.0 | 3.1 | 0.7 | -3.1 | 0.3 | 0.00 |
JAX | 0.8 | -0.6 | -2.1 | -1.0 | -0.7 | 1.5 | -0.2 | 1.6 | -0.6 | -1.6 | -0.01 |
DAL | -1.7 | -2.0 | -3.7 | -0.7 | -1.4 | -0.3 | 2.1 | 4.8 | 1.0 | -2.0 | -0.01 |
NYJ | -1.6 | -0.3 | 2.6 | -0.7 | 1.7 | -3.1 | 1.4 | -3.1 | -0.1 | -3.4 | -0.02 |
BUF | 1.5 | -0.2 | 2.3 | -3.0 | -0.8 | 0.8 | -2.0 | -1.8 | -0.1 | -3.5 | -0.02 |
SF | 3.1 | 5.9 | 2.9 | 0.4 | -4.7 | -4.9 | -0.2 | -4.9 | -1.8 | -4.3 | -0.03 |
HST | | | -5.1 | -1.1 | -1.4 | -2.4 | -0.8 | 1.3 | 0.7 | -5.1 | -0.05 |
OAK | 2.3 | 1.3 | 4.3 | -3.6 | -0.6 | 0.5 | -3.9 | -2.6 | -2.5 | -5.3 | -0.03 |
CAR | -2.5 | -4.9 | -3.7 | 1.4 | 2.1 | 1.9 | -1.0 | -0.9 | 2.5 | -5.3 | -0.04 |
WAS | -1.2 | -0.9 | -0.5 | -0.9 | -3.9 | -0.6 | 0.4 | -0.9 | 0.2 | -8.4 | -0.06 |
MIA | 1.5 | -0.7 | 1.4 | -2.0 | -7.1 | -0.1 | -2.4 | -2.1 | 2.6 | -9.0 | -0.06 |
TB | 0.0 | -1.7 | 0.0 | -1.1 | -3.1 | -0.3 | -2.8 | -0.5 | -1.6 | -11.4 | -0.08 |
ARZ | -1.7 | -0.6 | -1.9 | -2.6 | -4.0 | -2.5 | 0.1 | -1.2 | 2.4 | -12.1 | -0.08 |
CLV | -1.7 | -2.6 | 0.2 | -2.3 | -3.3 | -2.1 | -3.3 | 2.1 | -4.3 | -17.6 | -0.12 |
BLT | -2.5 | -1.4 | -0.6 | -5.1 | -1.8 | -4.2 | 1.0 | -1.3 | -1.7 | -17.8 | -0.11 |
DET | -2.4 | -1.5 | -5.4 | -2.4 | -1.4 | -3.1 | -3.7 | 1.0 | -4.5 | -23.6 | -0.16 |
CHI | -2.3 | -1.4 | -4.2 | -0.8 | -5.9 | -2.9 | -2.7 | -3.4 | -0.8 | -24.8 | -0.17 |
I was surprised to see that the Colts' 2006 season (+7.7 WPA) eclipses the Patriots' 2007 season (+6.5), in which New England went undefeated in the regular season. The Patriots broke many offensive records that year, but running up the score doesn't add much to WPA. A late touchdown on top of a 0.98 WP doesn't make much of a difference.
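The ceiling effect can be shown with one line of arithmetic: since WP cannot exceed 1.0, the most any single play can add is whatever is left between the current WP and certainty. The helper below is a hypothetical illustration, not part of the WP model itself.

```python
def max_wpa_available(wp_before):
    """Largest WPA any single play can add: WP cannot rise above 1.0."""
    return 1.0 - wp_before

# The same touchdown is worth far less once the game is nearly decided.
for wp in (0.50, 0.80, 0.98):
    print(f"WP {wp:.2f} -> at most +{max_wpa_available(wp):.2f}")
```

At a WP of 0.98, even a perfect play can add no more than 0.02.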
The most inept offense of the decade so far belongs to Chicago, but Detroit might overtake them this year. The worst single-year offense belongs to the 2004 Dolphins (-7.1).
Nit-picky formatting note - You can keep your table cells from wrapping if you display a table cell like this:
<td nowrap='nowrap'>-23.6</td>
It's just a pet peeve of mine. I hate it when larger negative numbers render on two lines. If you are using CSS, you can keep all of your table cells from wrapping by doing this:
td {
  white-space: nowrap;
}
[And please feel free to delete this comment so the rest of your readers are not bored with HTML/CSS markup issues.]
Added it. Let me know if it's working--I never see wrapped cells in my browser. Thanks.
Would it be fair to say that the WPA numbers, for offenses and defenses as in these last two posts, comprise a method for taking a team's actual W-L record and dividing it into offensive and defensive W-L?
The two necessarily *have to* add up to actual W-L, right? (Or is special teams a third category, in which the *three* would have to add up to actual W-L?)
Yes, if special teams were included.
"This suggests the Colts offense would have won 12.5 games had their offense and special teams simply held serve, making absolutely no contribution toward winning."
You mean had their defense and special teams held serve, right?
Yep - that looks great now. Everything lines up perfectly. Thanks. ;-)
How do your offensive and defensive WPA correlate with actual wins and losses?
Pure genius. This stuff totally kicks the butt of the DVOA nonsense as seen on another football-"statistics" site.
I never thought you could make a system that would take into account garbage time and Patriots-style running up the score.
I'll tabulate WPA correlation with actual W-L in a future post.
WPA is not intended to replace DVOA. FO has its shortcomings but DVOA has its place.
Amazing. So the 2004 Dolphins could expect to win only 1 game even with average defense, while the 2006 Colts could expect to go 16-0 with average defense.
As a reality check, I averaged the WPA across each season. I was glad and impressed to see that they all average to 0, as should be the case (for every winner there's a loser).
Since special teams aren't factored into your win probabilities (I think you said this), does that mean that Offensive WPA + Defensive WPA = total team strength? Or does, for example, a bad offense lower a team's Defensive WPA?
I'm thinking in particular of the Seahawks' recent loss to Dallas. The defense didn't really play as badly as the numbers look, but they were on the field for the whole game.
Thanks,
- Happy
So the NE offensive contribution for 2000-2007 averages approximately 1.475 games per season with Tom Brady, whereas in 2008 with Cassel it was 1.7 games. I know they have more weapons now than they had earlier in the decade, but when you compare those two, and especially compare them to the Indianapolis ratings, it sure seems pretty cut and dried who should end up ahead in the argument for QB of the decade.
Happy-
The game prediction/team prediction model is a different animal. The WPA model can incorporate anything. It's just a matter of what we'd choose to look at. Also, remember that the numbers include playoff WPA, so the 2006 Colts would have to factor that in.
Anon-Regarding Brady vs. Manning, these numbers certainly imply Manning has carried his team. But in Brady's defense, he played with a much stronger defense, which as pointed out in a comment above makes a good offense less crucial. Plus, until 2007 Brady had lesser receivers.
And therein lies the flaw of WPA, whether in baseball or football. An offense/defense gets penalized if its team is good on the other side of the ball. It should not be used to attempt to ascertain true talent levels across teams.
I'm not saying it doesn't have value. It just doesn't in that respect.
Is that kind of poetic pessimism really warranted? I mean, if I invented the fork, would people say it's flawed because it doesn't cut your steak?
WP and WPA have their purpose. If you care about what actually happened in real games regarding what ultimately matters, then WP is useful. It's not a theoretical construct like other tools.
@Brian: I was responding to the Anonymous post above yours. I think it is fair to call it a flaw. There is no perfect statistic to entirely capture true unbiased value of a player. I was simply pointing out someone's incorrect use of the tool due to their being unaware of this flaw.
Regarding WPA in football, I agree it has its place. I'm happy to see someone finally assembled it for the NFL. I'm repeatedly impressed with FanGraphs' use of WPA regarding MLB, and I commend you for adapting it to the gridiron.
Also, I'm posting a comment on your site so I'm clearly a fan of your work.
"WP and WPA have their purpose. If you care about what actually happened in real games regarding what ultimately matters, then WP is useful. It's not a theoretical construct like other tools."
A theoretical construct... Just so that I understand correctly, the difference between your rankings and those of others (like DVOA) is that you don't award arbitrary points or percentages for, say, getting 5 yards on first and 10.
Instead, you let Win Probability decide how much any given play is worth?
Am I making any sense?
Juri, that's true, but I wouldn't call DVOA arbitrary. As far as I understand it, it's defense-adjusted "success points" given for advancing the ball as described in the book Hidden Game of Football. If you want to rank teams based on how they've played so far, DVOA might be what you want. If you want to predict future wins, my efficiency model might be the best tool (which is a theoretical construct too).
I think of these things like a toolbox. Each tool has its purpose. One of the criticisms I'd have for FO is that they have one or two tools and use them for everything. If all you have is a hammer, the whole world looks like a nail.
The elegance of WP in any sport is its usefulness as the one and only true measure of utility. If you care about winning, then WP is what you want to use when analyzing the value of plays or decisions.
I'm sorry to sound combative, Aaron. I appreciate your kind words. Certainly my own WP model has its flaws and defects. But as a concept WP is not "flawed." No one would say that a screwdriver is flawed because it doesn't hammer nails. But I agree it's not a panacea.
In my mind, there are 4 pure systems of utility in football. On the play level there's raw yards. On the series level, there is 1st down probability. On the drive level, there's expected points. And on the game level, there's WP.
A couple of nitpicks.
You wrote -- "Anon-Regarding Brady vs. Manning, these numbers certainly imply Manning has carried his team. But in Brady's defense, he played with a much stronger defense, which as pointed out in a comment above makes a good offense less crucial. Plus, until 2007 Brady had lesser receivers."
I don't know how pointing out that Brady had much less pressure on him to produce a win works in Brady's defense in a comparison with Manning.
And while Brady had lesser receivers until 2007, he always had far superior offensive lines. Given the choice, I'd go with the offensive line every time. The NFL is all about pass protection. Every week, the defensive coaches' major focus is how to get pressure on the QB and the offensive coaches focus on how to protect. Everything else pales in comparison.
In 2008 the Eagles offense put up a -3.1 WPA (28th overall). That's really interesting because they were a team that showed up as being slightly above average on the advanced statistics sites (13th in offensive DVOA and 11th in O-Rank on this site).
Do you have any idea why there is such a discrepancy between WPA and O-Rank? If I had to guess, it was their inability to run out the clock (and I don't mean that you have to run; as the 2002 Raiders showed, a 5-yard pass with a 70% completion rate is a lot like a run).
That might explain the difference between their W/L record last year and their DVOA (#1 overall) and GWP (#2 overall). Of course playing in the brutal NFC East is part of that also given that DVOA and GWP both correct for opponents and W/L doesn't.
@Stan - If you think about how WPA works, you'll see how the defense's effectiveness plays into Offensive WPA. Let's say the defense allows only 3 points and the offense has a pretty average day and puts up 21 points. Now imagine a game where the defense allows 3 points but the offense is playing really well and puts up 49 points. Shockingly, the Offensive WPA won't be much higher in the second case. 21 points was already enough for a 95% win probability; putting up another 28 points can't raise that much beyond 99%.
So offense in a close game is worth more than offense in a blowout, and the defense is going to be a big part of determining how close the game is.
Brian, will you post who are the best individual players using WPA? I would love to see that for QB, RB, and WR/TE
I'd like to, but unfortunately that's code-intensive. I'd have to write some scripts to capture each player's name from the play-by-play descriptions and tabulate WPA. I don't have the time this season to do that. What I can do easily right now is go in and search for one player at a time. So I could do "T.Brady" and "P.Manning" and compare their WPA.
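For a sense of what such a script might involve, here is a minimal sketch. It assumes player names appear in the descriptions as an initial, a period, and a surname (as in "T.Brady"); the sample descriptions and WPA values are invented, and crediting every named player with the full WPA of the play is a simplification rather than a settled attribution method.

```python
import re
from collections import defaultdict

# Assumption: player names look like initial, period, surname (e.g. "T.Brady").
NAME_PATTERN = re.compile(r"\b[A-Z]\.[A-Z][A-Za-z'-]+")

# Hypothetical rows: (play description, WPA for the offense on that play).
plays = [
    ("T.Brady pass short right to W.Welker for 7 yards", 0.02),
    ("T.Brady sacked at NE 35 for -6 yards", -0.03),
    ("P.Manning pass deep left to R.Wayne for 24 yards", 0.05),
]

def wpa_by_player(plays):
    """Credit each player named in a play with that play's full WPA."""
    totals = defaultdict(float)
    for description, wpa in plays:
        # set() avoids double-counting a name repeated within one description
        for name in set(NAME_PATTERN.findall(description)):
            totals[name] += wpa
    return dict(totals)

totals = wpa_by_player(plays)
for name in sorted(totals):
    print(name, round(totals[name], 2))
```

A real version would also need to separate offensive players from defenders named in the same description, and to handle name collisions between players who share an initial and surname.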
Would a team with a good offense but mediocre defense be able to "rack up" WPA points, since their mediocre defense will always make the games close, so that a high percentage of the drives will be high on the WPA scale just because the score is close all the time? IOW, a team scores 5 TDs, but every time they scored, the game was tied, and say they win 35-28.
Whereas another team with the exact same offensive skill scores 5 TDs, but wins 35-10, so their last 2 TDs don't mean much WPA-wise.
Just trying to get a feel for how this works.
Joe-Yes and no. A good offense with a bad/mediocre defense would be more likely to be seen as more valuable in terms of WPA. But then again, that offense has to actually make it happen and overcome the poor defense.
Baseball WPA uses Leverage Index to quantify how important a particular situation is in a game.
@Joe G:
In the game you describe, the offense is getting increased opportunities to accumulate WPA. The situations have a higher average LI.
As Brian alluded to, this is only an advantage for an above average offense. A below average offense would be getting increased opportunities to lose WPA. Still, 2 equally great offenses will not rank the same in WPA if put in environments of different leverage. The offense in the higher leverage environment will accumulate more WPA despite their equal talent levels. A better defense would lower the leverage since it would contribute to blowouts.
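The leverage effect in Joe's two games can be sketched with a toy model. Everything here is an assumption made for illustration: the logistic curve that turns score margin into WP is invented and ignores time remaining, field position, and down and distance, all of which a real WP model would use.

```python
import math

def toy_wp(margin, scale=7.0):
    """Toy win probability from score margin alone (hypothetical logistic curve)."""
    return 1.0 / (1.0 + math.exp(-margin / scale))

def offense_wpa(margins_before_td, td_points=7):
    """Sum the WP gained on each touchdown, given the margin just before it."""
    return sum(toy_wp(m + td_points) - toy_wp(m) for m in margins_before_td)

close_game = [0, 0, 0, 0, 0]   # wins 35-28: every TD comes with the score tied
blowout = [0, 7, 11, 11, 18]   # wins 35-10: most TDs extend an existing lead

print(round(offense_wpa(close_game), 2), round(offense_wpa(blowout), 2))
```

Both offenses score the same five touchdowns, but under this toy curve the one playing in tied games accumulates more WPA, because each score moves the WP further.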
I think I see a major flaw in the relevance of this article; correct me if I'm wrong.
Assuming WP takes into account the strength of the teams in question, won't WPA be a measurement of performance relative to expectation? In other words, WP should only rise (and WPA should only be positive) if a play exceeds expectations. Consequently, a team that is known to be dominant might enter a game with .9 WP and an early score will effectively limit their WPA to .1.
This would help explain why the 2007 Patriots do not lead this statistic and the 2006 Colts were +7.7, despite the fact that they only won 4 games over half (12-4). It's about beating expectations rather than beating a theoretical average.
Good question, Dan. No, it is measurement relative to league average. The reason the '07 Pats Offense doesn't lead the statistic is because they had a very good defense.
Thanks for the explanation, that makes sense and clearly holds meaning. I'm still seeing problems, though.
2006 Indy, for example, had +7.7 offense, and -.6 defense, but was only +4 in real games. Does this imply special teams cost them 4.3 games? Seems unreasonable, especially considering Indy had good special teams as well, IMO.
This seems heavily weighted towards close games and WP volatility.
Take, for example, two 49-48 shootouts. The first, where one team scores 48 points and then blows the lead, and the second where the teams alternate scoring all game long.
I would argue that these two games are very similar in terms of their reflection on the strengths of the teams.
However, in case 1, offense would be about +.5, defense would be +1.0. In case 2, offense would be very high (let's say ~3.0) and defense would be ~3.5.
I think it's this scaling issue that I have a problem with, and the fact that WPA is not bounded by -.5 and .5 for a single game. Should an offense be credited with adding 300% to the team's win percentage? It seems like values for teams, and even for different games of the same team, are not directly comparable, especially in cases where deviations from the average are large.
Amazing site, by the way. You have my dream job.