Troy Polamalu is one of my favorite players in the league. As a Ravens fan whose heart despises everything that is black and gold, I'm compelled to say that he is truly spectacular. (The picture to the left is his interception return for a touchdown that saved the AFC Championship game for Pittsburgh last year.) He's also a classy and selfless player who never showboats, a breath of fresh air in today's NFL. Unfortunately for Steelers fans, his injury this season has revealed just how great a player he is.
Polamalu has played in five games this season and has missed nine. Using the Win Probability (WP) model, we can calculate the Win Probability Added for any play (WPA). We can also sum WPA for any team, squad, or player to estimate the context-dependent contribution of that player to his team's chances of winning.
WPA for defensive players is particularly problematic. If a running back evades all ten other defenders but is tracked down 60 yards later at the 5-yard line by the eleventh, it's the tackler that will show up in the play-by-play and be charged with the loss in WP. That doesn't make sense.
But Polamalu's 2009 season provides us with a special opportunity. We can measure the WPA of the entire Steelers defense when Polamalu is playing and compare it to the WPA of the Steelers defense when he's not.
Essentially, this is a synthesis of two different concepts in advanced sports statistics: WPA and Plus/Minus. Win Probability Added may be familiar to readers here or to baseball sabermetric fans, but Plus/Minus may only be familiar to basketball and hockey fans. For example, in hockey, when a goal is scored, every player on the ice for the scoring team is credited with a plus, while every player on the ice for the team scored against gets a minus. This kind of stat is particularly useful for a player like a safety, who can deter a play by his very presence without actually defending a pass or making a tackle. He makes it easier for a teammate to make a play. This is also sometimes called a 'WOWY' stat--'With Or Without You.'
Combining these two concepts gives us WPA+/-, defined as the sum of the increases and decreases in WP for a player's team while he's on the field. For the vast majority of NFL players, this concept has very limited application. But when an undisputed difference-maker like Polamalu is suddenly sidelined, we can get a good idea of just how valuable he is.
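As a rough illustration of the definition above, WPA+/- can be sketched as a filtered sum over plays. The `Play` record, its field names, and the sample values are hypothetical; real play-by-play data does not list who is on the field. The sign here follows the player's own team, so positive is good for that team.

```python
from dataclasses import dataclass


@dataclass
class Play:
    wp_before: float  # team's win probability before the snap
    wp_after: float   # team's win probability after the play
    on_field: bool    # was the player of interest on the field? (assumed known)


def wpa_plus_minus(plays):
    """Sum the WP changes over the plays where the player was on the field."""
    return sum(p.wp_after - p.wp_before for p in plays if p.on_field)


# Made-up plays, for illustration only
plays = [
    Play(0.50, 0.55, True),
    Play(0.55, 0.48, True),
    Play(0.48, 0.60, False),  # player off the field, so this play is skipped
    Play(0.60, 0.58, True),
]

result = wpa_plus_minus(plays)
print(f"WPA+/-: {result:+.2f}")
```

When the on-field information is only known per game, as with an injured starter, the same sum collapses to adding up whole-game WPA for the games he played.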
For the five games in which Polamalu has played, the Steelers defense posted a -2.39 WPA, an average of -0.49 WPA/game. (Negative values are good for defenses.) To put that in perspective, a team starts with a 0.50 WP, so a net of another 0.50 WPA can add up to a win. You could say the Steelers defense, all by itself, could have won those five games with only a modest contribution from their offense.
In the nine games Polamalu has missed, the Steelers defense posted a +1.05 WPA, an average of +0.12 WPA/game. Losing Polamalu takes the Steelers defense from close-to-unbeatable to below-average.
You could say Polamalu has been worth a 0.61 WPA/game for Pittsburgh. Certainly part of that difference could be due to random variation or opponent strength, but it's very hard to discount the contribution of such a dynamic player. I bet if you polled the Steelers coaching staff, or their opponents' coaching staffs, they'd agree with this analysis.
The game-by-game breakdown of the Steelers defense WPA is provided in the table below. Games in which Polamalu participated are denoted by the asterisk.
Opponent | Def WPA |
TEN* | -0.61 |
CHI | 0.35 |
CIN | -0.21 |
SD | -0.09 |
DET | -0.17 |
CLE* | -0.48 |
MIN* | -0.68 |
DEN* | -0.40 |
CIN* | -0.22 |
KC | -0.07 |
BAL | -0.43 |
OAK | 1.09 |
CLE | -0.10 |
GB | 0.68 |
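The with/without split can be rechecked straight from the table. The sketch below transcribes the table values and marks the asterisked games; the variable and function names are my own. The totals come out to the -2.39 and +1.05 quoted above, while per-game averages computed from the rounded table values can differ from the quoted ones in the last digit.

```python
# (opponent, defensive WPA, Polamalu played?) transcribed from the table above
games = [
    ("TEN", -0.61, True), ("CHI", 0.35, False), ("CIN", -0.21, False),
    ("SD", -0.09, False), ("DET", -0.17, False), ("CLE", -0.48, True),
    ("MIN", -0.68, True), ("DEN", -0.40, True), ("CIN", -0.22, True),
    ("KC", -0.07, False), ("BAL", -0.43, False), ("OAK", 1.09, False),
    ("CLE", -0.10, False), ("GB", 0.68, False),
]


def summarize(games, played):
    """Return (game count, total WPA, WPA per game) for one group of games."""
    values = [wpa for _, wpa, p in games if p == played]
    total = sum(values)
    return len(values), total, total / len(values)


n_with, total_with, avg_with = summarize(games, True)
n_without, total_without, avg_without = summarize(games, False)

print(f"With Polamalu:    {n_with} games, total {total_with:+.2f}, avg {avg_with:+.3f}")
print(f"Without Polamalu: {n_without} games, total {total_without:+.2f}, avg {avg_without:+.3f}")
print(f"Difference per game: {avg_without - avg_with:+.3f}")
```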
Very interesting article, nice to quantify something that has been assumed. The effect of Troy's absence has been sorely felt in Steelers Nation - in fact, it has generated an expression on behindthesteelcurtain.com: Troy Would Have Had That, or TWHHT for short...
The WOWY EPA+/- (Expected Points Added) for Polamalu is:
-12.8 EPA/game with him
-1.6 EPA/game without him
A difference of 11.2 points per game. That's a lot of points!
I doubt it is possible to get data for the eleven men on the field for any given play, right?
If it were it would be great to apply this to all plays to all teams over a longer period.
EdBed
Practically meaningless without adjusting for the team the Steelers played. He got the Kerry Collins-led Tennessee Titans and the Derek Anderson-led Cleveland Browns, which were two of the worst quarterback/offense combinations in football.
The Steelers defensive output was essentially constant (-.22 to -.21) in the two Bengals games, one with Polamalu, one without. Pointing at this and saying "Polamalu is worth nothing" has about as much mathematical strength as pointing at this current study and saying "Polamalu makes a difference of 11.2 points per game".
Practically meaningless? We get it, Marver. You don't like WPA.
One important thing to keep in mind: WPA is not like most other stats. 99% of the stuff done here is inferential frequentist statistics. That means you look at how often things happened in the past to project what will likely happen in the future. It's like measuring average rainfall, its median, its standard deviation, etc., and then saying how much rain you expect next year.
With WPA, all we're doing is sticking the ruler in the bucket and seeing how much it rained. If you don't get that, it's ok. But that doesn't mean how much it rained last year is 'practically meaningless.'
My problem isn't with using WPA. It's with using a five game sample without adjusting, whatsoever, for the quality of the opponent. The only inference you can possibly make is the comparison between Bengals games since that's the only game(s) in which you have a consistent offense with which to compare.
If you revised the statistic to adjust for the opponent's typical defensive-WPA-against (like, the average WPA against the Kerry Collins Titans this season was -.34, so Polamalu created a difference of x%) then you'd at least be getting somewhere.
As it currently stands, there is almost no mathematical strength here making it, yes, meaningless.
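The opponent adjustment suggested above could be sketched like this. The function name is hypothetical, and the -0.34 baseline is only the commenter's illustrative figure, not a measured value; the -0.61 is the TEN game from the article's table. The sketch uses a plain difference rather than a percentage, since a percentage becomes unstable when an opponent's baseline is close to zero.

```python
def adjust_for_opponent(def_wpa, opp_avg_wpa_against):
    """Defensive WPA relative to what this opponent typically allows.

    Negative results mean the defense did better than a typical
    defense does against the same offense.
    """
    return def_wpa - opp_avg_wpa_against


# -0.61: Steelers defense vs. TEN (from the table)
# -0.34: hypothetical season baseline quoted in the comment, NOT real data
adjusted = adjust_for_opponent(-0.61, -0.34)
print(f"Opponent-adjusted defensive WPA: {adjusted:+.2f}")
```

Averaging the adjusted values separately for games with and without Polamalu would then give an opponent-adjusted version of the comparison.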
I think this is awesome. Revolutionary even. That guy's a know-it-all with an agenda.
Who else can we do this for? Your data goes back to 2000, right?
"WPA for defensive players is particularly problematic. If a running back evades all ten other defenders but is tracked down 60 yards later at the 5-yard line by the eleventh, it's the tackler that will show up in the play-by-play and be charged with the loss in WP. That doesn't make sense."
I'm assuming from this statement that your play-by-play database doesn't go so far as to show who was on the field during each play, correct? I'm asking because, if you knew the actual personnel on the field for each play, you could "charge" that 60-yard run against every defender on the field at the time so that it wouldn't unfairly target just the guy who managed to track him down at the end of the play.
If you could track the personnel on the field for each play, you could combine this with your WPA analysis to create a much-more-meaningful version of the +/- indicator that is used in basketball.
+/- is somewhat helpful in basketball, but it suffers some serious weaknesses because it only tracks a team's point production when a given player is on/off the court. But if you could track a team's WPA production when a given player is on/off the court (or football field) I would see that as being extremely insightful.
Can you fine-tune this even more? As a Steelers fan, I'm pretty sure he went out in the 2nd quarter of the second CIN game (and it was yet another game they lost in the 4th quarter), and I think he only participated in 3 full games, so there is another game he only played partially.
I'm not sure when exactly he was hurt, but in the first half of the game the PIT def had -0.13 WPA, and in the second half they had -0.08 WPA. For the 1st quarter it was -0.19, and for the 2nd quarter it was +0.06.
No, my play-by-play doesn't list exactly who's on the field on each play. It's no different from the play-by-play breakdowns you see on nfl.com or other sites. But the concept is there, and that's the key thing.
I don't know who else to do this for. Which star defenders have been injured mid-season? Ray Lewis? Brian Urlacher? Who do you guys suggest?
Lofa Tatupu, Marcus Trufant, and you could look at St. Louis after dealing Will Witherspoon. Once you get enough data looking at 'star' players at each position, you could even get a gauge on which positions had largest impact between 'star' and replacement level.
Just please adjust for opponent quality, turning it into a percentage increase/decrease; otherwise variance completely dominates.
"Practically meaningless without adjusting . . ."
". . . otherwise variance completely dominates."
Hyperbole does no service to your point, though at least you threw in the "practically". If the sample were 1000 games on either side, we would happily take the results at face value, assuming (rightly) that the opponent quality would pretty well even out. With the (much) smaller sample opponent quality is an issue, but it doesn't disqualify the results and it doesn't "completely dominate". How much it dominates is a question for research, not cocksure dismissal.
Adjusting for opponent quality would change the stat, and maybe improve it, at least for what you (and maybe Brian) want it to do. But the stat has validity without the adjustment, because it describes what it purports to describe, as Brian points out. It has this descriptive power even if it doesn't perfectly isolate Polamalu's contribution.
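One way to put a number on how much small-sample noise matters is an exact permutation test. That is a technique I'm bringing in here, not something from the article. The sketch below takes the 14 game values from the article's table, enumerates every way of labeling 5 of them as "with Polamalu" games, and counts how often the gap in averages is at least as favorable as the observed one. It says nothing about opponent quality; it only addresses how easily a gap this size arises from shuffling game labels.

```python
from itertools import combinations

# Defensive WPA per game, transcribed from the article's table
with_troy = [-0.61, -0.48, -0.68, -0.40, -0.22]
without_troy = [0.35, -0.21, -0.09, -0.17, -0.07, -0.43, 1.09, -0.10, 0.68]

all_games = with_troy + without_troy
total = sum(all_games)
n_with = len(with_troy)
n_without = len(without_troy)


def mean_gap(sum_with):
    """Average of the 'with' group minus average of the remaining games."""
    return sum_with / n_with - (total - sum_with) / n_without


observed = mean_gap(sum(with_troy))

splits = 0      # number of ways to pick 5 of the 14 games
as_extreme = 0  # splits with a gap at least as negative as the observed one
for idx in combinations(range(len(all_games)), n_with):
    splits += 1
    gap = mean_gap(sum(all_games[i] for i in idx))
    if gap <= observed + 1e-12:  # small tolerance for floating-point ties
        as_extreme += 1

p_value = as_extreme / splits
print(f"Observed gap: {observed:+.3f} WPA/game")
print(f"{as_extreme} of {splits} relabelings are at least as extreme (p = {p_value:.4f})")
```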
"Adjusting for opponent quality would change the stat, and maybe improve it"
The only scenario in which it wouldn't improve it is if the variance in opponent quality was 0. Since the NFL is certainly not a league that lacks variance, failing to adjust for opponent adds additional variance to a statistic that already has plenty due to variance in week-to-week team performance; we don't have a 1000 game sample to drown this out.
With easy access -- he possesses the WPA model -- to team WPA, it should be a relatively quick fix and would remove a sizable amount of error from the statistic.
I just hate to think Joe Steeler fan is going to quote Brian as saying "Polamalu is worth 11.2 points per game", point to this article and website, and subsequently cause people (with interpretive skills) to discredit other work here that is well-done, as the overwhelming majority is.
I don't dislike the idea here; I actually think it'd be very interesting to get WPA baselines for different positions and their replacement values. It would have numerous applications, specifically as it pertains to the NFL draft and roster construction in general.
But there's a drastic difference between having the Polamalu-less Steelers facing a Philip Rivers led offense and Aaron Rodgers led offense, compared to a Derek Anderson led Browns offense and a Kerry Collins led Titans offense. I would even bet that a Steelers defense WITH Polamalu would have a worse WPA against the first two than a Steelers defense without him would against the latter. That wouldn't mean he hurts the team's defense; it'd just be an artifact of the sample.
"The only scenario in which it wouldn't improve it is if the variance in opponent quality was 0."
There's the main sticking point, because the truth of that statement depends entirely on what you mean by "improve". You think the stat should isolate the effect of Polamalu's performance as much as possible, so therefore adjusting for opponent quality is "better" than not adjusting. This isn't a given - if true differences in opponent quality were minimal but variance in measurement was considerable, adjusting for opponent quality might simply further dilute the signal - but in any case, it's your definition of what you want the stat to be. Brian likes that WPA is an "as it happened" recounting of a game's events; you want adjusted WPA (or something) to better inform us about Polamalu's impact. One is better for your goals, one is better for Brian's. I shouldn't speak for him too much, but I think that's what he's saying. There are other factors driving the WPA stats, but the numbers are still meaningful despite that.
If I look up Peyton Manning's Y/A this year (7.9), that number isn't adjusted for opponent quality, weather, HFA, day of the week, or any other numerous factors that could (and no doubt do, if very modestly) impact performance. It's not useless because of that: it is, simply, what it is. It has some value for some purposes and people and different value for others.
I guess I'm interested in seeing both sides, really - what actually happened, and what happened with adjustments to better isolate Polamalu's impact - and of course you're right that adding adjustments could bring insight. But your out and out rejection of unadjusted WPA doesn't seem quite right to me.
Marver
Although this doesn't technically adjust for opposition offense, if you look at Brian's Team Efficiency Stats for this week he includes a column for Offense Rank.
Looking at the opposition's offense rank, we can see that the average Opposition O Rank 'With Troy' was 17.6, and 'Without Troy' it is 21.2. Added to the WPA stats from the article, essentially this means that the Steelers' defense are playing worse 'Without Troy' even though they are facing worse offenses.
Data for above numbers in CSV format
----------------------
Opp,Def WPA,Opp O RANK
With Troy
MIN,-0.68,10
TEN,-0.61,12
DEN,-0.4,16
CIN,-0.22,18
CLE,-0.48,32
Without Troy
SD,-0.09,2
GB,0.68,11
BAL,-0.43,13
CIN,-0.21,18
CHI,0.35,25
KC,-0.07,29
OAK,1.09,30
DET,-0.17,31
CLE,-0.1,32
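For anyone who wants to check the averages, here is a small sketch that parses the CSV above and averages both columns per group. The parsing approach and variable names are my own; it assumes any single-field row is a group label.

```python
import csv
import io

raw = """Opp,Def WPA,Opp O RANK
With Troy
MIN,-0.68,10
TEN,-0.61,12
DEN,-0.4,16
CIN,-0.22,18
CLE,-0.48,32
Without Troy
SD,-0.09,2
GB,0.68,11
BAL,-0.43,13
CIN,-0.21,18
CHI,0.35,25
KC,-0.07,29
OAK,1.09,30
DET,-0.17,31
CLE,-0.1,32
"""

groups = {}
current = None
reader = csv.reader(io.StringIO(raw))
next(reader)  # skip the header row
for row in reader:
    if len(row) == 1:    # a group label such as "With Troy"
        current = row[0]
        groups[current] = []
    elif len(row) == 3:  # a data row: opponent, defensive WPA, offense rank
        groups[current].append((row[0], float(row[1]), int(row[2])))

# Map each group label to (average defensive WPA, average opponent offense rank)
summary = {}
for label, rows in groups.items():
    avg_wpa = sum(r[1] for r in rows) / len(rows)
    avg_rank = sum(r[2] for r in rows) / len(rows)
    summary[label] = (avg_wpa, avg_rank)
    print(f"{label}: avg Def WPA {avg_wpa:+.3f}, avg Opp O Rank {avg_rank:.1f}")
```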
Can this method account at all for injuries to other key players on defense? Although Polamalu is undoubtedly the most valuable player to the Steelers' defense, defensive end Aaron Smith is also an extremely important player, and he got injured mid-season. That could exacerbate the difference that this method shows Polamalu made.
Steerfan-That's a good point. What you're pointing to is what the bball and hockey guys call 'Adjusted plus/minus.' You calculate +/- for each point or goal for each player like usual. But then you put it through an adjustment process that accounts for the +/- of the other guys on the floor or ice.
In football, with 11 guys rotating in and out every play, it would be very difficult, both in practical data-collection regards, and mathematically. But it could be done, and it might be very, very useful if it's done well. I'd want to look at adj +/- for EPA as well as for WPA.
Marver said, "The Steelers defensive output was essentially constant (-.22 to -.21) in the two Bengals games, one with Polamalu, one without. Pointing at this and saying "Polamalu is worth nothing" has about as much mathematical strength as pointing at this current study and saying "Polamalu makes a difference of 11.2 points per game"."
Your point might be valid if Polamalu played the entire Bengals game. In fact he actually played less than a quarter.
Whether you believe that the WPA is an accurate measure or not, there is no denying the negative impact that Polamalu's injury, and to a lesser degree the injury to Aaron Smith, has had on the Steelers defense.
Bob Sanders is another player who missed significant parts of several years. I'd like to see this analysis applied to him if you get the chance.
Hi there,
I've got a question regarding WPA. I'd appreciate if anyone could help me out with this, even if it might be stupid.
"Is WPA adjusted for game score?"
If I'm not mistaken a play made in a close contest adds more win probability for your team than a play made in a blowout. An interception made during a tie game is better than an interception made with your team down three scores.
In that regard the Defense isn't always in charge of controlling that factor. If your Special Teams or Offense give up points to the opposing team, then your Defense gets credited with less WPA on the ensuing plays, even though they're not the ones to blame.
You are correct. WP and WPA are heavily context-dependent and suffer all the problems associated with that. WPA is what I call a 'narrative' stat, something that tells you what happened and not necessarily what will or should happen. WPA emphasizes performance when it matters most and discounts performance in 'trash time.'
What it's good for is isolating which plays, players, and squads are really to credit or blame for wins and losses.
Thank you for your explanation, Brian. The concept of WP & WPA is quite cool. I enjoy reading about it, especially the article applying it to the MVP debate. Obviously the data needs to be interpreted and put in context, but it is nice to view football through a lens as objective as can be.
Keep up the good work!
"Pointing at this and saying "Polamalu is worth nothing" has about as much mathematical strength as pointing at this current study and saying "Polamalu makes a difference of 11.2 points per game"."
Except that there's a possibility of a safety being worth nothing. To insinuate, after examining all of Brian's other work, which points towards offensive dependency/predictability on outcomes, that one defensive player can make a difference as large as 11.2 points is incredibly far-fetched. From a common-sense point of view, there's no way one defensive player can make a difference of 11.2 points per game; otherwise, we'd expect to see much larger importance in defensive statistics than we exhibit.
I'm certainly not saying Polamalu isn't worth something, and I'm not saying he isn't a great player. And I'm not even saying WPA is a flawed statistic to use...just that, as it's used here, comparing two different-sized samples without accounting for the variance that comes with different sample sizes, or for the opponents (and the differences within said opponents, i.e., quarterback changes, etc.), makes it contextually empty. [As an aside, someone please explain what having a WPA above 1, like the Raiders game, means statistically.]
The alpha here is absurdly high.