Here is an alternative set of rankings based on Success Rate. (You can learn more about SR at this post, and up-to-date advanced team stats, including SR, are always available here.) Beginning this week, each team's SR is adjusted for opponent strength. The table below also lists each team's opponent SR, excluding the opponents' games against the team of interest itself, for offense, defense, and in total. I realize this may be a little confusing, so I shaded the columns. The shaded columns represent the team's own adjusted SR, while the white columns represent the team's opponent SR. Each team's total adjusted SR is in the far right column.
Each column is sortable. You can sort by division to see how each team shakes out within its own neighborhood. Note that the three top teams come from the NFC East, and that's not good news for my loyal Washington Post readers.
I can understand why Dallas and San Diego are ranked where they are, but Pittsburgh and San Francisco are big mysteries to me.
| Rank | Team | Div | Opp DefSR | Adj OffSR | Opp OffSR | Adj DefSR | Opp SR | Tot SR |
| 1 | | NE | 52.7 | 45.2 | 44.6 | 62.8 | 97.3 | 108.0 |
| 2 | | NE | 53.4 | 46.7 | 43.3 | 57.1 | 96.7 | 103.8 |
| 3 | | NE | 53.5 | 45.9 | 45.2 | 57.4 | 98.7 | 103.4 |
| 4 | | NS | 57.0 | 53.5 | 39.1 | 49.8 | 96.0 | 103.3 |
| 5 | | AW | 55.6 | 46.3 | 41.7 | 56.8 | 97.3 | 103.1 |
| 6 | | NW | 56.4 | 45.0 | 45.5 | 58.0 | 101.8 | 103.0 |
| 7 | | AN | 55.2 | 44.1 | 44.5 | 58.6 | 99.7 | 102.7 |
| 8 | | AS | 56.8 | 45.6 | 44.3 | 57.0 | 101.1 | 102.6 |
| 9 | | AE | 54.0 | 43.7 | 44.6 | 58.7 | 98.6 | 102.4 |
| 10 | | AW | 52.7 | 41.4 | 46.6 | 60.9 | 99.3 | 102.3 |
| 11 | | AE | 57.2 | 54.0 | 42.8 | 48.1 | 100.0 | 102.1 |
| 12 | | AE | 54.0 | 45.4 | 44.8 | 55.9 | 98.9 | 101.3 |
| 13 | | AS | 54.7 | 50.6 | 43.8 | 50.6 | 98.5 | 101.2 |
| 14 | | NN | 53.2 | 45.0 | 43.7 | 55.8 | 96.9 | 100.8 |
| 15 | | NN | 54.3 | 39.2 | 48.1 | 61.5 | 102.4 | 100.7 |
| 16 | | AN | 55.0 | 46.4 | 41.4 | 53.7 | 96.5 | 100.0 |
| 17 | | NW | 56.6 | 43.1 | 42.1 | 56.4 | 98.7 | 99.6 |
| 18 | | NS | 55.8 | 44.8 | 43.8 | 54.6 | 99.5 | 99.4 |
| 19 | | AW | 54.9 | 41.6 | 45.1 | 57.1 | 99.9 | 98.7 |
| 20 | | NN | 57.7 | 48.2 | 42.0 | 48.8 | 99.7 | 97.0 |
| 21 | | AW | 53.7 | 41.7 | 43.1 | 54.7 | 96.8 | 96.5 |
| 22 | | NN | 56.1 | 39.5 | 44.0 | 56.9 | 100.1 | 96.4 |
| 23 | | NE | 52.7 | 43.0 | 46.5 | 53.2 | 99.1 | 96.2 |
| 24 | | AS | 54.6 | 43.8 | 44.5 | 51.7 | 99.1 | 95.5 |
| 25 | | AN | 55.1 | 40.1 | 43.0 | 55.3 | 98.1 | 95.4 |
| 26 | | AN | 55.3 | 41.1 | 42.3 | 54.1 | 97.6 | 95.2 |
| 27 | | AS | 56.7 | 50.1 | 44.2 | 44.7 | 101.0 | 94.8 |
| 28 | | NW | 54.8 | 38.3 | 44.5 | 55.5 | 99.3 | 93.7 |
| 29 | | NW | 53.7 | 38.7 | 42.7 | 52.9 | 96.4 | 91.6 |
| 30 | | NS | 55.5 | 41.4 | 41.4 | 48.6 | 96.9 | 90.0 |
| 31 | | AE | 53.1 | 39.7 | 45.4 | 50.0 | 98.5 | 89.7 |
| 32 | | NS | 54.1 | 31.4 | 45.2 | 58.0 | 99.2 | 89.4 |
| Average | | | 54.9 | 43.9 | 43.9 | 54.9 | 98.7 | 98.7 |
 
What does your prediction model look like when you plug in SR as opposed to efficiency metrics?
I'd like to know that answer too. You probably already gave it elsewhere on the site, but which has a higher correlation to future success?
Also, McCarthy has said a couple of times that the Packers need to do better on third down. He could be witnessing the same thing, where the Packers are having trouble with Success Rate but are efficient overall, possibly due to big plays. Somebody probably already asked and you answered this as well, but can success rate and efficiency be combined for a better predictive model?
God I love this new SR stuff! Brilliant posts. Can't wait to see more. Couple questions/comments:
How does using adjusted SR affect its correlation with winning?
I know it's essentially the same, but instead of "total SR" you should have "Average SR". It's jarring to see a probability (the success rate) over 1 (or 100%). Sure, Average SR would just be Total SR divided by two, but it would say something: "This is the average rate of success for any play, defensively or offensively," or "This is the percentage of plays for this team that result in positive EPA," as opposed to "This is the sum of those two columns." I guess you could (should?) weight defense and offense unequally, but I'm not sure how you would pick the weights.
Separate question: how do you adjust for opponent SR? i.e. what method? When you adjust for opponent SR, is that opp SR itself an adjusted opp SR?
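The "Total" versus "Average" point can be made concrete with the first two rows of the table. This is only a small sketch of the arithmetic: the values are copied from the Adj OffSR and Adj DefSR columns, and the function names are invented for illustration.

```python
# Adjusted SR values (in percent) copied from the first two rows of the table.
rows = [
    {"rank": 1, "adj_off_sr": 45.2, "adj_def_sr": 62.8},
    {"rank": 2, "adj_off_sr": 46.7, "adj_def_sr": 57.1},
]


def total_sr(row):
    """Sum of the two adjusted columns, as in the table's far-right column."""
    return round(row["adj_off_sr"] + row["adj_def_sr"], 1)


def average_sr(row):
    """Unweighted mean of offense and defense; this always stays below 100%."""
    return round(total_sr(row) / 2, 1)


for row in rows:
    print(row["rank"], total_sr(row), average_sr(row))
```

The average carries the same ranking information as the total, but it reads as a rate. Unequal offense and defense weights would need a separate justification, which the post does not give.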
Alex-Good point about the totals. "Avg" is probably better. Maybe I'll go with "Overall" or something. Or how about some hyper-complex acronym, like OOATSRAA--Overall Opponent Adjusted Team Success Rate Above Average? The more complex and confusing, the less people can criticize it.
Will-Run SR correlates with winning better than run efficiency, but pass efficiency correlates better than Pass SR. This is likely due to the difference between teams with vertical passing games vs teams with dink-and-dunk passing games.
Overall however, the SR stats correlate with themselves over the course of a season better than efficiency stats do, which is one thing you look for when trying to predict rather than explain. I'm still finishing up some research on what combination of things should be most predictive.
Ultimately I suspect a SR-based model is more predictive than an efficiency-based model. There is also the possibility that a combination would be best, perhaps one that is based on pass efficiency and run SR.
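One way to check whether a stat "correlates with itself" is to split the season in two and correlate each team's first-half value with its second-half value. The sketch below assumes that split-half approach, which may not be the exact procedure used here. Every number in it is made up purely to show the computation and says nothing about real teams.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five teams (NOT real data).
first_half_sr = [45.0, 42.5, 47.1, 40.3, 44.0]    # offensive SR, weeks 1-8
second_half_sr = [44.2, 43.0, 46.5, 41.1, 43.6]   # offensive SR, weeks 9-17
first_half_eff = [5.9, 5.1, 6.4, 5.0, 5.6]        # yards per play, weeks 1-8
second_half_eff = [5.2, 5.6, 5.8, 5.3, 5.9]       # yards per play, weeks 9-17

sr_stability = pearson(first_half_sr, second_half_sr)
eff_stability = pearson(first_half_eff, second_half_eff)
print(f"SR self-correlation:         {sr_stability:.2f}")
print(f"Efficiency self-correlation: {eff_stability:.2f}")
```

A stat with a higher self-correlation carries more signal from one half of the season to the next, which is the property being described for prediction.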
May I ask if anything was done to linearize the SR's before the schedule adjustment?
No. But they're all so close to 50% that linearizing them would change almost nothing.
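If "linearize" here means a log-odds (logit) transform, which is an assumption about the question, the answer can be checked numerically. Near 50% the logit is almost a straight line, with slope 4 from its first-order expansion around 0.5. The function names are illustrative.

```python
from math import log


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return log(p / (1 - p))


def linear_approx(p):
    """First-order Taylor expansion of logit(p) around p = 0.5."""
    return 4 * (p - 0.5)


# Success rates in the range most of the table occupies.
for sr in (0.40, 0.45, 0.50, 0.55, 0.60):
    gap = logit(sr) - linear_approx(sr)
    print(f"{sr:.2f}  logit={logit(sr):+.4f}  linear={linear_approx(sr):+.4f}  gap={gap:+.4f}")
```

The gap grows as rates move away from 50%, so the approximation is weakest for the most extreme values in the table.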
Brian-
2 things:
1) Although I'm quite pleased that the opponent adjustments lift my Niners up to #6, I suspect the reason they're up so high is because you're weighting the opponent adjustments way too heavily. What's the correlation between opponent strength and team wins? That's what the opponent adjustment weight should be. God knows there's no way SF is the 6th-most successful team on a p-b-p basis this season, even if they'd played the 85 Bears, 72 Dolphins, and 07 Patriots in the first 6 weeks.
2) You know I love your stuff, so don't take this the wrong way, but how is this recent SR work of yours not a reinvention of the DVOA wheel? A replication study is valuable in its own right given how mysterious the methodological details of DVOA are; but as a substantive contribution, I'm not sure what SR-related methods or findings here are new. Is it the fact that you're using EPA to determine play success instead of the Carroll et al. guidelines?
What would a model look like that used EPA per play instead of simple success rate? I don't know if it should be overall offensive EPA per play or separate components for running and passing.
EPA already has built into it that yards gained past the first down marker are not as valuable as actually getting the first down. It seems like this would reward teams that can consistently pick up 3rd and 1s.
One problem, however, is that it could penalize teams that get a lot of first down gains of 10-16 yards (the EP value actually is less than only getting 9 yards on first down). I don't know how to deal with that.
I'm just curious what that would look like.
It's true that DVOA is a success rate based stat, but it isn't straight success. They give you more points for a bigger success (like 2 successes for 10 yards on 1st and 10) and prorated success for a decent gain that isn't a success (so some part of a success for 12 yards on 3rd and 15).
I think that it's interesting in that it tells us that success rate is important for runs but not so much for passes.
Danny, yes, I believe that would be the major difference. His definition of a success or failure is the +/- expected points added, which is a stat that (I believe) Brian developed using pbp data from the last several years.
Brian, what program do you use to do your calculations? I have been doing a lot in Excel, but when you do large regressions, Excel seems rather limited and slow.
Also, I was doing some reading on logit regression, and I have a question that I believe you can answer quickly: the guy in the link below has a tutorial on how to do logistic regression in Excel, but he does something that I find strange. He ADDS his probabilities and maximizes that, whereas to get the joint probability, you should multiply the probabilities. What is the right thing to do in this case?
http://blog.excelmasterseries.com/2010/04/using-logistic-regression-in-excel-to.html
Regarding DVOA, we'll never know without knowing what the model does. I do know some things, like it is a SR-based system but uses "bonus points" for bigger plays and over-weights red zone outcomes. It also over-weights high leverage plays and counts turnovers heavily without regressing them nearly enough. These are all things that would make a model far too over-fit to past events. Over-fit models like DVOA make rankings appear to match our intuitive estimations of team strength but lack true insight.
I implemented things like WPA, EPA, and SR as research tools to learn more about the inner workings of the game, things like the game theory aspects of play calling and risk-reward considerations. I'm interested in the decision making, the psychology, and the strategic doctrine involved at every level. Ranking players and teams or predicting games are happy byproducts of those tools, and it gets clicks, but it's not my purpose.
Maybe I'm wrong, but my sense is that DVOA's express purpose is to rank teams. It's nice message-board fodder for arguing that my team can beat up your team, but does it teach us anything about the sport itself?
What I really enjoy is building tools. I'd like to consider myself a tool maker rather than a 'hammerer.' I'd rather have other folks make use of the tools I build. But DVOA is just one tool. And like they say about a man who only owns a hammer--the whole world looks like a nail.
Shawn-I use GRETL. It's a free stats/regression package.
Andy-Using EPA for a model seems logical at first, but over the long run, it would be little different than just using team point differential. After all, that's how EPA is defined--the long run expected change in net points.
A WPA-based model would be strictly circular. Every team's WPA should equal half of its win total. By its very definition, every win nets +0.50 WPA.
Thanks for the reply Brian. To be honest with you, the game theory and decision-based stuff is what I'm most interested in when I come to your site. I definitely think that's something you've kind of cornered the market on, and it complements the more descriptive stuff that FO does.
Re overfitting, while I obviously agree with you that the focus should be on prospection rather than retrospection, I'll just make the general point that -- from what I've found -- R-squared's are unbelievably underwhelming in football stat models; to the point that overfitting seems like a very minor concern in the grand scheme of things. I was over on Wages of Wins yesterday reading up on WP48 just for my own edification, and almost fainted when I saw an R-squared over .90 for their model. That would be like manna from heaven in football stat analysis.
Any thoughts on what I said about overweighting the opponent adjustments being one potential problem vis-a-vis the Niners ranking?
It took me a while, but I finally see what you are saying. It seems that the EPA per play method would give heavy weight to actual touchdowns. Fourth and goal from the 1 would be a huge EPA/P swing, but it's still only one play, and it would get more credit than the other plays even though it could be considered very (trying not to use the "L" word)... unrepeatable.
So I guess the question on everyone's mind is: are passing YPA and run SR inter-related? If not, then can you use them together, and would that model predict good results?
Andy-Yes, they are correlated. In a regression model that would cause multicollinearity problems. However, that's normally ok as long as two things hold:
1. The general relationship between the predictor variables is true for all cases, and
2. The end-goal is the predictive result and not the weights of the coefficients.
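The second condition can be seen in a toy example. When two predictors move together, very different coefficient splits give nearly the same predictions, so the individual weights are unreliable even though the predicted result is stable. The standardized values below are hypothetical, and the two weight pairs are chosen by hand rather than fitted.

```python
# Hypothetical standardized predictors for six teams; run_sr tracks pass_ypa closely.
pass_ypa = [-1.2, -0.5, 0.0, 0.4, 0.9, 1.5]
run_sr = [-1.1, -0.6, 0.1, 0.3, 1.0, 1.4]


def predict(w_pass, w_run):
    """Linear prediction for each team from the two correlated predictors."""
    return [w_pass * p + w_run * r for p, r in zip(pass_ypa, run_sr)]


# Two very different sets of weights...
pred_a = predict(1.0, 0.0)   # all weight on passing
pred_b = predict(0.5, 0.5)   # weight split evenly

# ...produce almost identical predictions for every team.
max_gap = max(abs(a - b) for a, b in zip(pred_a, pred_b))
print(f"largest prediction difference: {max_gap:.3f}")
```

This only holds while the relationship between the predictors stays the same, which is the first condition in the list.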
Brian: What do you maximize when you do your regression to get your coefficients?
For a logit model? The dependent/outcome variable is whether a team won or lost a game. Each game is a case in the data.
Danny-Even if you remove the adjustment completely for SF, they still rank as an above-average team in terms of SR. Their defense is fine. It looks like their problem is that they're not getting big plays from the offense.
I understand that, but the goal is what? Do you multiply all the probabilities and maximize that value?
All I have to say is, "From your mouth to Alex Smith's ears to Vernon Davis's or Michael Crabtree's hands."
Shawn-I'm sorry, but I'm not sure what you are asking. I would suggest you look at a post entitled "How the model works-a detailed example" for a step-by-step explanation. Otherwise, you might want to google around for better explanations of logit regression than I can provide.
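On the sum-versus-product question raised above: in standard maximum likelihood fitting of a logit model, the quantity maximized is the product of each game's predicted probability of the outcome that actually occurred. In practice one maximizes the sum of the logs of those probabilities, which is the same objective. Summing log-probabilities is therefore equivalent to multiplying probabilities, whereas summing the raw probabilities is a different objective. The sketch below uses hypothetical games and hand-picked coefficients; it does not reproduce the linked tutorial or the model on this site.

```python
from math import exp, log

# Hypothetical games: (stat differential, 1 if the team won else 0).
games = [(-1.5, 0), (-0.5, 0), (-0.2, 1), (0.3, 0), (0.8, 1), (1.6, 1)]


def win_prob(x, b0, b1):
    """Logistic model: probability of a win given the stat differential x."""
    return 1 / (1 + exp(-(b0 + b1 * x)))


def outcome_prob(x, won, b0, b1):
    """Probability the model assigns to the outcome that actually happened."""
    p = win_prob(x, b0, b1)
    return p if won else 1 - p


def likelihood(b0, b1):
    """Product of the per-game outcome probabilities."""
    result = 1.0
    for x, won in games:
        result *= outcome_prob(x, won, b0, b1)
    return result


def log_likelihood(b0, b1):
    """Sum of the logs of the per-game outcome probabilities."""
    return sum(log(outcome_prob(x, won, b0, b1)) for x, won in games)


b0, b1 = 0.0, 1.0  # arbitrary coefficients, not fitted values
print(likelihood(b0, b1), exp(log_likelihood(b0, b1)))
```

The two printed values agree up to floating-point error, which is why software sums logs rather than multiplying many small probabilities.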
Brian, maybe I am missing something, but shouldn't the average opp DefSR be the same as the average adj OffSR, and likewise for the other 2 variables?
thanks
Defenses have an average SR of 54.9%, and offenses have a SR of 43.9%.
1.3% of the time, a play's EPA is estimated to be precisely 0.00, so the play is not considered a success for either side.
54.9% + 43.9% + 1.3% ≈ 100%, allowing for rounding.
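The three-way split described here follows from defining success by the sign of a play's EPA. A minimal sketch, with made-up EPA values for ten plays and an illustrative function name:

```python
# Hypothetical EPA values for ten plays (NOT real data).
play_epa = [0.45, -0.30, 1.10, 0.00, -0.75, -0.10, 0.20, -1.40, -0.05, 0.60]


def success_rates(epas):
    """Return (offense SR, defense SR, neither) as fractions of all plays.

    A play is an offensive success if EPA > 0, a defensive success if
    EPA < 0, and a success for neither side if EPA is exactly 0.
    """
    n = len(epas)
    offense = sum(1 for e in epas if e > 0) / n
    defense = sum(1 for e in epas if e < 0) / n
    neither = sum(1 for e in epas if e == 0) / n
    return offense, defense, neither


offense_sr, defense_sr, neither = success_rates(play_epa)
print(offense_sr, defense_sr, neither)  # 0.4 0.5 0.1
```

Because every play falls into exactly one of the three groups, the offensive and defensive rates sum to slightly less than 100% whenever some plays have an EPA of exactly zero.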
I see... thanks