Advanced Football Analytics (formerly Advanced NFL Stats): The Off-Season


The Off-Season

By Brian Burke

During the NFL off-season, fans are treated to countless, predictable stories of contract holdouts, mini-camp no-shows, and draft hype. But for number crunchers like me, the off-season is when we can do our best work. Without the weekly grind of game predictions, team rankings, playoff forecasts and play-calling analyses, I can focus on more meaningful projects.

Last year I was able to do some deeper research. I took a look at the run/pass balance question in depth. My series on the passing paradox put to rest a lot of the questions surrounding why most teams appear to run too often. Based on those articles, I also developed a way to measure coaching risk aversion.

The randomness of turnovers was another focus of my off-season research. I was able to predict the apparent struggles of the San Diego defense and refine my game prediction model.

The final nail was put into the coffin of the myth of running back overuse. The connection between the Wonderlic Test and QB performance was also thoroughly debunked.

During the annual soap opera, I showed how overrated Brett Favre really is, and that a lot of his apparent success has come not from his own skill, but from his receivers. The Jets learned the hard way just how true that is.

The draft analysis by position was, I think, both straightforward and useful. We learned there aren't a lot of 6th round Hall of Famer QBs to be had, and that the best RBs come from the top rounds just like any other position.

By far, the most interesting subject to me was game theory. I gobbled up books on it all summer. It's fascinating stuff, and its lessons go far beyond sports. My game theory articles have actually become part of lessons in some university classes. I frequently get bunches of clicks from college 'blackboard' sites.

I also got my hands on a play-by-play database, which opened up a broad array of possibilities for research. I examined the three primary measures of utility in football: first down probability, expected points, and win probability. I was also able to construct an in-game win probability model that became the basis of the funky graphs I made all season.

There are lots of interesting articles buried in the archives here, so please click around when you get the time. I realize most people would call this site a blog and expect near-daily postings, but I cringe whenever it's referred to that way. I'm not throwing out my personal opinions on topics of the day like the vast majority of blogs out there. But the blog model is very convenient for publishing.

So this off-season, what can you expect?

Optimum turnover rates--should some teams throw riskier passes to increase their chances of winning, even at the expense of more interceptions? Are some teams too timid?

How consistent are team statistics during the season? Another year of data answers a lot of questions. You'll see that 2008 was the year of consistency in defensive interceptions. There are significant ramifications for the ranking and prediction models.

I'm working on major improvements to the in-game win probability model. The model will be primarily based on LOWESS (locally weighted scatterplot smoothing, a local regression technique for noise reduction). I'll also go back through the 2000 season and can build WP graphs for any game. I'll be able to do some other neat stuff, such as find out 'what was the dumbest 4th down decision of the decade?'
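To give a feel for what LOWESS does, here is a minimal sketch of a locally weighted linear fit with tricube weights. It is an illustration only, not the actual model: the function names, the `frac` parameter, and the made-up score-lead data are all assumptions for the example.

```python
def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess_at(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit of ys on xs, evaluated at x0.

    frac is the share of points used as the local neighborhood.
    """
    k = min(len(xs), max(2, round(frac * len(xs))))
    # Bandwidth: distance from x0 to its k-th nearest neighbor.
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12
    weights = [tricube((x - x0) / h) for x in xs]
    sw = sum(weights)
    if sw == 0.0:
        # Degenerate neighborhood: fall back to the plain average.
        return sum(ys) / len(ys)
    # Weighted least squares for a straight line through the neighborhood.
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical data: score lead at some point in a game, and whether
# the leading/trailing team went on to win (1) or lose (0).
leads = [-14, -10, -7, -3, 0, 3, 7, 10, 14, 17]
won = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
for lead in (-7, 0, 7):
    print(lead, round(lowess_at(lead, leads, won), 3))
```

A local linear fit to 0/1 outcomes can stray slightly outside the 0-to-1 range near the edges of the data, so a real win probability model would need to clip or otherwise constrain the estimates.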

I've also been working on a fantasy projection system. Like my other efforts, it will be simple and open, and I'm confident it will be at least as accurate as any other out there. Plus, more positions will be added to my draft analysis. I'll also put together an 'Advanced NFL Stats' reading list featuring all the great books that have helped inspire some of the research I've done over the past couple of years. And those are just the things I'm working on right now, so stay tuned.

Just 184 days until the Hall of Fame game...

published on 2/05/2009

16 Responses to “The Off-Season”

Anonymous says:: Thursday, February 05, 2009; Is it possible to have a 4th down calculator similar to the WP where you enter the 4th down and distance and time remaining to see when a 4th down is logical?
Brian says:: Friday, February 06, 2009; Regarding the NFL draft, every year in the days after the draft, every website and newspaper gives each team a grade for how its draft went. I always thought it was a little silly to grade a team's draft before the players it took have even stepped on the field.

A better way to evaluate a draft would be to wait until the players had been in the league for at least 3 years. Now would be the time to grade teams on their 2006 draft. Have you ever done something like this, or do you know of a website that has? It would be interesting to know which teams do the best job in the draft.
Mark M says:: Friday, February 06, 2009; Where do you get your data from (for YAC, for instance)?

I am interested in whether it's possible to create offensive line stats, based on sacks, rushes, etc., to see which team's offensive line does the best job (although, as they say the game is won in the trenches, you could say the team with the most wins has the best o-line).

Stats such as 'yards after contact' would be useful, but I don't know if anyone collects them.
Brian Burke says:: Friday, February 06, 2009; Yeah, I could do a 4th down calculator. That would be a nice addition to the WP gadget. But in general, the Romer paper has a great chart that shows the break-even point for going for it/kicking based on field position and to-go distance.

Brian-There are so many draft sites, I'm sure at least one of them does that. I'm not sure how to feel about grading drafts. I think teams should be judged based on the information available at the time.

For example, Ryan Leaf is widely regarded as one of the biggest busts of all time. But at the time, almost everyone agreed he was a sure-fire talent. Some even ranked him ahead of Manning. Three years later it was obvious he didn't have the personality to handle the NFL.

To me, the Chargers made the right pick--based on the information available at the time. But some people would call it the worst pick ever. This is due to the combined effects of outcome bias and hindsight bias.

Say a team faces a 3rd and 3. A coach correctly estimates his team's chances of converting are 60% if he runs, and 50% if he passes. He calls a run and the RB gets stuffed. A lot of fans would instantly say, "Should have passed! I knew it!"

On the other hand, it might be interesting to see if some teams really do have an "eye" for real talent. But it should be remembered that a great deal of the variance in player outcomes will naturally be random, or at least 'unpredictable', and some teams will be luckier than others.

The ratio of unpredictable to predictable variance in player outcomes is probably so large that we may never be able to truly know which teams are better than others at identifying talent.
Brian Burke says:: Friday, February 06, 2009; Mark-The YAC data comes from myway.com. Profootballweekly.com and some other sites have the same data. SI.com has the best site for receiver YAC. You can get past years' stats by altering the year in the URL in your browser window.

Football Outsiders already does exactly what you suggest. They call it Adjusted Line Yards. It's one of their less flaky stats. It's quite clever, actually.
Anonymous says:: Friday, February 06, 2009; "I'll also go back through the 2000 season and can build WP graphs for any game. I'll be able to do some other neat stuff, such as find out 'what was the dumbest 4th down decision of the decade?'"

Here's another one-- calculate WPA for either just last year or every year since 2000. WPA (win probability added) is basically the combined effect of a player on his team's win probability throughout the year, and is one of the major sabermetric stats used.
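As a rough illustration of the WPA idea described above, the sketch below credits each play's change in win probability to one player and sums those changes. The play tuples, player labels, and the one-player-per-play attribution are assumptions made for the example; real football plays involve many players at once, which is exactly what makes individual attribution hard.

```python
from collections import defaultdict


def wpa_by_player(plays):
    """Sum win probability changes per player.

    plays: iterable of (player, wp_before, wp_after) tuples, where the
    win probabilities are for the player's own team.
    """
    totals = defaultdict(float)
    for player, wp_before, wp_after in plays:
        totals[player] += wp_after - wp_before
    return dict(totals)


# Hypothetical sequence of plays for one team.
example_plays = [
    ("QB A", 0.50, 0.58),  # completion for a first down
    ("RB B", 0.58, 0.55),  # run stuffed for a loss
    ("QB A", 0.55, 0.70),  # long touchdown pass
]
for player, wpa in wpa_by_player(example_plays).items():
    print(player, round(wpa, 2))
```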
Brian Burke says:: Friday, February 06, 2009; Zach-Neat idea. I know the baseball guys have a lot of fun with WPA. Individual stats in football are very different though because of the team dynamics. (Is Warner really so good, or is it that he had great WRs?) I still think it would be interesting to find out who has won the most games for his team.
Anonymous says:: Saturday, February 07, 2009; WPA (and expected points added) still, to me, seems useful as a direct method of comparison for situations or two players on the same team rather than two players on different teams. WPA and EPA could, I think, cast interesting light on decision-making (whether to run or pass) on any given down, distance, yardline, or any combination of those. It could therefore potentially provide more light on the "passing paradox" - it would be possible to see total EPA or WPA for all first-and-10 runs and all passes, say, to see if teams were choosing too conservatively or if their desire to run the ball was logically motivated by game situation/increased 1st down probability/decreased TO probability or what have you.
Anonymous says:: Saturday, February 07, 2009; "The final nail was put into the coffin of the myth of running back overuse."

Really? Listen, I think the FO study fails as a serious statistical study for the various reasons you identified. But if you show that the boy who cried wolf did so without seeing a wolf, that doesn't mean you can go on and claim that you've disproved the myth of the existence of wolves.

There is a large difference between showing that something with a small sample size isn't statistically significant, and disproving something altogether. The FO study may be imperfect, but pointing that out doesn't disprove that running back overuse can cause increased injuries.

I began to look at the issue at a closer level precisely because of my concerns with the claims of running back overuse, including the "Curse of 370". I was a skeptic when I first delved into the issue. I now think it more likely than not that increased injuries result from "overuse", but that it is a complex issue deserving of far more research and discourse. I don't think pronouncements of coffins and nails add to that discourse.

I would also like to think that I am searching for "truth", however clearly or unclearly we may divine it. So even though I've expressed such an opinion publicly, I would like to think that--much like a prosecutor who has a duty to disclose exculpatory evidence and change a position based on new evidence--I will still do so if further study changes that view. In fact, I have additional studies that I am hoping to work on this off-season to look at the workload issues.

So far, though, the evidence suggests (doesn't prove, but suggests) that immediate recent workload is correlated with higher injury rates soon thereafter. Even your rebuttal of FO notes that the number of games missed was greater for the 370+ group, even if it wasn't statistically significant given the sample size and relative change in games played.

So in addition, consider each of the following, which involve different data sets and show similar results when it comes to extreme workloads:

1) For the period 1995-2006, backs who had 138 or more carries (23.0 or more per game) in the first six games of a season got injured at a higher rate than those that had between 90 and 137 attempts and played in all six. Specifically, 22.2% of those with 150+ carries (n=9) and 12.5% of those with 138 to 149 carries (n=24) missed at least half of the remaining games that season, while only 5.6% of those with between 90 and 137 carries (n=179) missed at least half the remaining games. http://www.pro-football-reference.com/blog/?p=328

2) For the period 1995-2006, backs who averaged 25.0 or more carries over the final six weeks of one season were more likely to miss games at the start of the next season than all other backs who averaged 15.0 or more carries over the final six games. Specifically, 28.6% (n=14) of backs who averaged 25.0 or more carries over the final six games suffered a “season-ending injury” during the first six games of the following season, compared to 8.7% (n=23) for all backs who averaged between 23.0 and 24.9 carries, and 5.2% (n=153) for all backs who averaged between 15.0 and 22.9 carries at the end of the previous season. http://www.pro-football-reference.com/blog/?p=330

3) From 1978-2006, 13.5% (n=78) of backs who had at least one playoff game with 25 or more official rush attempts the previous postseason suffered a season-ending knee or other leg injury within the first six games of the following season. This severe injury rate is again higher than the baseline rate for other starting running backs. http://www.pro-football-reference.com/blog/?p=330

4) For the 2007 season (weeks 6-13), backs who had 24 or more carries in a single game (and more notably 28 or more carries) were more likely to appear on the injury report and to miss games immediately after, when compared to other starting running backs. Specifically, 76% of backs (n=17) who had 28 or more rush attempts in a game appeared on the injury report at least once over the next four weeks, while 51% (n=67) of other starting backs appeared on the injury report. 41% were listed as OUT on the report at least once, compared to 13.4% of others. 24% suffered a season-ending injury within a month, as compared with 5% of other starters. http://www.pro-football-reference.com/blog/?p=483

What is the response to each? sample size of the high workload group. (I'm fairly comfortable saying that the baseline serious injury rate, for backs with no recent high work games, is about 5% and certainly less than 10%). But it's certainly not that I massaged the data to best fit already conceived conclusions.
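One way to ask whether a gap like the one in item 2 could be chance given the small high-workload group is a one-sided Fisher exact test. The sketch below is an illustration, not part of the cited studies. The counts are an assumption: 4 of 14 and 8 of 153 are the whole numbers that round to the quoted 28.6% and 5.2%. The test also ignores that the carry cutoffs were chosen after looking at the data.

```python
from math import comb


def fisher_one_sided(a, n1, b, n2):
    """P(group 1 has at least a events) with both margins held fixed.

    a of n1 in group 1 had the event; b of n2 in group 2 had it.
    """
    total = n1 + n2
    events = a + b
    denom = comb(total, n1)
    upper = min(n1, events)
    tail = sum(
        comb(events, k) * comb(total - events, n1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Assumed counts inferred from the rounded percentages in item 2.
p_value = fisher_one_sided(4, 14, 8, 153)
print(f"one-sided p-value: {p_value:.4f}")
```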
Brian Burke says:: Saturday, February 07, 2009; JKL-You're absolutely right. I did not disprove running back overuse. With inferential statistics, you can never disprove anything, only fail to find evidence for something. And they're not the same thing.

What I am confident I did do was dismantle the Curse of 370 from top to bottom. I had originally written "Curse of 370" instead of "RB overuse" in this post. But the subject has become an emotional one for the FO guys, and I sometimes get some vitriolic emails whenever I criticize them, so I wrote 'RB overuse' just to make it less targeted (unless you click the link).

I don't have time at the moment to respond properly to your points above. I'll just make 1 or 2 general points of my own.

Let's say that the "no overuse" or "null" hypothesis is defined like this: there is a small and constant likelihood of injury for any RB on every play regardless of the number of previous carries in a season or previous season.

RBs who get a lot of carries in past games will tend to get a lot of carries in subsequent games. They are either stars or "feature" backs who don't platoon with other backs--Willie Parker, LT, LJ, guys like that.

So if you agree that guys with high numbers of previous carries will continue to get lots of future carries, they will have a higher chance of future injury. We should expect them to have more injuries than other backs simply because they have more opportunities to get hurt, without any overuse effect required to explain it.
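The exposure argument can be made concrete with a small calculation. Under the null hypothesis of a constant, independent injury chance p on every carry, the chance of at least one injury in n carries is 1 - (1 - p)^n, which rises with n even though nothing about the back has changed. The per-carry rate and the carry totals below are made-up numbers for illustration only.

```python
def injury_probability(carries, p_per_carry):
    """Chance of at least one injury over a number of carries,
    assuming a constant, independent injury chance on every carry."""
    return 1.0 - (1.0 - p_per_carry) ** carries


# Hypothetical per-carry injury rate; not an estimate from real data.
P_PER_CARRY = 0.001

# Hypothetical workloads over a 10-game stretch.
feature_back = injury_probability(25 * 10, P_PER_CARRY)
platoon_back = injury_probability(15 * 10, P_PER_CARRY)
print(f"feature back: {feature_back:.3f}")
print(f"platoon back: {platoon_back:.3f}")
```

Any test for an overuse effect would need to compare injuries per carry (or per unit of exposure), not injuries per back, to separate the two explanations.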

I'll leave it at that for now. I do trust your numbers, and I don't rule out that RB overuse may exist. But once you take into account my point above, my hunch is the effect is small.
Anonymous says:: Monday, February 09, 2009; Draft re-grades:

WalterFootball.com, an independent website, offers surprisingly detailed investigation of previous drafts:
http://walterfootball.com/draft2005G.php
http://walterfootball.com/draft2004G.php
http://walterfootball.com/draft2003G.php
Unknown says:: Monday, February 09, 2009; Hi Brian,

I stumbled onto your blog around the AFC Championship games and I find this site very educational. With the thought of Brett Favre being overrated, I was curious where Roethlisberger stacks up in his short career.

I have heard comparisons of Roethlisberger (especially in 2006 with his motorcycle injury), and then in other years he has benefited from a strong run game and is just a game manager. Is he a product of the "system", or is he a game winner you'd want to build a franchise around, statistically speaking?

Cheers!
Anonymous says:: Monday, February 09, 2009; If you're thinking about doing draft analysis, I have a specific request (isn't that rather grand of me, since I don't want to spend the time to do it myself?):

I am always extremely frustrated by the continual "butcher-case" mentality that lines prospects up like pieces of meat and puts them through drills that have no real bearing on the game of football and then crowns some as top prospects or others as marginal. I'm a big believer (although I don't have the data to prove it) that, all things being equal, measured, statistical success in college IS the best possible predictor of success in the pros. I'm not saying it's fail-safe. I'm just saying that I would rather have a guy who did great on his college team than a guy who played one marginal season but is 6'4", 240 lbs., and runs a 4.3 40.

Now I know that this is very difficult or impossible to quantify for some positions (like offensive line). But through your analysis you've found team stats that correlate to winning. Surely, there are individual stats that correlate to winning as well?

And if there are individual stats that correlate to winning, I would love to see if there is any correlation between those individual college stats and a player's total number of games started in the pros? (I'm nominating total-games-started as a generic way of indicating pro success because marginal guys typically only manage to stay around for a year or two and games-started allows you to kinda put all players at all positions on an even ground because a QB who started 100 games is probably, relatively speaking, about as good as a safety who also started 100 games.)
Anonymous says:: Monday, February 09, 2009; As I go back and read my previous comment, I feel the need to clarify somewhat. I'm NOT claiming that all guys who are successful in college should be successful in the pros. That's why I would love to see some rigorous analysis done to determine whether there are certain stats that may in fact be (somewhat) predictive of professional success.

For example, there have been QBs who had "great" years and led their teams to championship seasons while almost exclusively running the option (especially in the 60s and 70s). Some of the QBs racked up gaudy rushing numbers, very average passing numbers, and extremely low INTs (because they hardly ever threw the ball). Obviously, many of these QBs were not adequately suited for the NFL.

So I'm not saying that a guy should make it in the pros merely because he had a few good statistical categories in college or because he played for a powerhouse team.

But again, I'd much rather see a team go with a solid player who actually PRODUCED something in college rather than fall in love with the latest "workout wonder" and then wonder why all their top draft picks are busts.
Anonymous says:: Thursday, February 12, 2009; I'm with Ryan. I'd like to see how Big Ben really stacks up.
motorcycle chargers says:: Wednesday, July 22, 2009; Nice blog. i was searching for so long time. thanks. A+ work done.
