Posts filed under analysis
Leaving Free WP All Over the Field
So why do NFL coaches voluntarily leave WP out on the field?
Take yesterday's DEN-SEA game as an example. SEA was ahead 17-12 in the 4th quarter, and had the ball deep in their own territory with about 9 minutes to play. With the game clock running, they snapped the ball with 8, 5, 5, 8, and 10 seconds left on the play clock. That's a total of 36 seconds. Plus, there was a play in which the receiver could have just as easily remained in bounds. Because there was more than 5 minutes left in the game and the clock restarts after the ball is set, that may have only cost 10-15 seconds of play clock rather than up to 40 seconds. To be fair, let's say there was a total of 46 seconds SEA could have burned off the clock during their second-to-last drive with almost no effort or risk.
Texans Try Once, Fail Twice
Down 14-0 at the start of the second half to the New York Giants, the Houston Texans faced a 4th-and-1 on their own 46-yard line. At this point, with just a 9.0% chance to win, Bill O'Brien made the correct call to go for it. A successful conversion means a 12.9% win probability, while a punt means about an 8.6% chance to win. The break-even point for going for it is far below the estimated 65% conversion rate on 4th-and-1. Alfred Blue ran off right tackle and was stuffed, turning the ball over on downs. The Giants would kick a field goal to go up 17-0.
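The break-even conversion rate mentioned above can be sketched as follows. The post gives the win probability after a conversion (12.9%) and after a punt (8.6%), but not after a failed attempt, so the `wp_fail` value here is a hypothetical placeholder and the function name is only illustrative.

```python
def break_even_conversion_rate(wp_success, wp_fail, wp_punt):
    """Conversion probability at which going for it and punting give equal WP.

    Solves p * wp_success + (1 - p) * wp_fail = wp_punt for p.
    """
    if wp_success <= wp_fail:
        raise ValueError("wp_success must be greater than wp_fail")
    return (wp_punt - wp_fail) / (wp_success - wp_fail)


wp_success = 0.129  # from the post: WP after converting
wp_punt = 0.086     # from the post: WP after punting
wp_fail = 0.05      # ASSUMPTION: hypothetical WP after a failed attempt

rate = break_even_conversion_rate(wp_success, wp_fail, wp_punt)
print(f"Break-even conversion rate: {rate:.1%}")
```

If the expected conversion rate (about 65% on 4th-and-1, per the post) is above the break-even rate, going for it is the higher-WP choice.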
On the very next drive, Ryan Fitzpatrick led the Texans downfield to the Giants 9-yard line where they faced another 4th-and-1. With 6:13 left in the third, down 17, Bill O'Brien elected to take the chip-shot field goal. Even the commentators suggested he should be going for it. Obviously, the prior failure on fourth down should not have an effect on the Texans' decision this time. If it did, O'Brien would be judging his previous decision on the outcome, rather than the process. The only other logic could be that he figured they would need a field goal at some point, down 17 - common faulty logic in the NFL, as coaches should be doing whatever they can to maximize their chances of winning.
Two-Point Conversion in the KC-DEN game
NFL coaches typically adhere to what's known as the Vermeil Chart for making two-point decisions. The chart was created by Dick Vermeil when he was offensive coordinator for UCLA over 40 years ago. It's a very simple chart that looks only at score difference prior to any conversion attempt and does not consider time remaining, with one caveat: it applies only when the coach expects there to be three or fewer (meaningful) possessions left in the game.
With just over 7 minutes to play, there could be three possessions at most left, especially considering that at least one of those possessions would need to be a KC scoring drive for any of this to matter. (In actuality, there were only two possessions left, one for each team.) Even the tried-and-true Vermeil chart says go for two when trailing by 5. But it's not the 1970s any more and this isn't college ball, so let's apply the numbers and create a better way of analyzing go-for-two decisions.
With rare exceptions, I've resisted analyzing two-point conversion decisions with the Win Probability model because, as will become apparent, the analysis is particularly susceptible to noise. Now that we've got the new model, noise is extremely low, and I'm confident the model is more than up to the task.
First, let's walk through the possibilities for KC intuitively. If KC fails to score again or DEN gets a TD, none of this matters. Otherwise:
Chiefs Crawling Drive, Come Away With Nothing
The Chiefs lost to the Broncos 24-17 on Sunday and had a chance to at least tie the game at the very end. Kansas City kept Peyton Manning off the field for an enormous chunk of the second half. The Broncos offense had only two drives after halftime (not including the final kneel down), one for a punt, one for a field goal, totaling just 8:51 in possession. The longest drive came from the Chiefs at the very start of the second half, where they ran 23 plays, taking 10 minutes off the clock... and ultimately missed a field goal. This got me thinking, how does drive length (in minutes) affect the probability of a team scoring?
First, here's a look at the ridiculous drive using our Markov model:
Nick Foles and Interception Index Regression
With one week of the 2014 season in the books, Foles and McCown have already matched that combined total. While everyone should have expected both to regress from their remarkably turnover-free 2013 seasons, that does not tell us how far each should regress based on historical norms.
Analyzing Replay Challenges
Most challenges are now replay assistant challenges--the automatic reviews for all scores and turnovers, plus particular plays inside two minutes of each half. Still, there are plenty of opportunities for coaches to challenge a call each week.
The cost of a challenge is two-fold. First, the coach (probably) loses one of his two challenges for the game. (He can recover one if he wins both challenges in a game.) Second, an unsuccessful challenge results in a charged timeout. The value of the first cost would be very hard to estimate, but thankfully the event that a coach runs out of challenges AND needs to use a third is exceptionally rare. I can't find even a single example since the automatic replay rules went into effect.
So I'm going to set that consideration aside for now. In the future, I may try to put a value on it, particularly if a coach had already used one challenge. But even then it would be very small and would diminish to zero as the game progresses toward its final 2 minutes. In any case, all the coaches' challenges from this week were first challenges, and none represented the final team timeout, so we're in safe waters for now.
Every replay situation is unique. We can't quantify the probability that a particular play will be overturned statistically, but we can determine the breakeven probability of success for a challenge to be worthwhile for any situation. If a coach believes the chance of overturning the call is above the breakeven level, he should challenge. Below the breakeven level, he should hold onto his red flag.
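As a rough illustration of the breakeven idea, here is a minimal sketch. It assumes the only cost of a failed challenge is the charged timeout (the lost challenge itself is set aside, as above). The function name and all three win probabilities are hypothetical, not values from the model.

```python
def challenge_break_even(wp_overturned, wp_stands, wp_stands_minus_timeout):
    """Minimum overturn probability at which a challenge is worthwhile.

    Solves p * wp_overturned + (1 - p) * wp_stands_minus_timeout = wp_stands.
    """
    gain = wp_overturned - wp_stands            # WP gained if the call is reversed
    cost = wp_stands - wp_stands_minus_timeout  # WP lost to the charged timeout
    if gain <= 0:
        raise ValueError("an overturned call must improve win probability")
    return cost / (gain + cost)


# ASSUMPTION: hypothetical WP values for one situation.
p_star = challenge_break_even(
    wp_overturned=0.60,
    wp_stands=0.52,
    wp_stands_minus_timeout=0.51,
)
print(f"Challenge if the chance of an overturn exceeds {p_star:.1%}")
```

The larger the swing from an overturned call relative to the value of the timeout, the lower the breakeven probability.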
The 4th Down Bot Returns
As I mentioned last year, although the 4th down issue is growing mold with smarter fans, it remains the lowest hanging fruit on the football analytics tree. So it's nice to be able to automate things and not have to do the analysis myself. But on the other hand, we can add 'football analyst' to the list of jobs being taken over by robots.
The Bot will be faster, more accurate, and come with some new features this season. Here is a brief introduction. Here are a few notes on how it works. And here is his Twitter feed.
Season Projections Visualization
The method used to create the projections is explained here.
The viz is intended as one-stop shopping for the season outlook. The top window shows the probabilities each team will make the playoffs. Dark green indicates a playoff berth by winning the division, lighter green indicates a wildcard berth.
The three windows below are team-specific. Hover the cursor over (or tap) a team's column in the top chart to see its details below. The window on the left is a chart of win totals. The bars represent the probability the selected team will finish with a corresponding number of wins. The second window shows the same information presented in a different way. It's the cumulative probability of each win total. In other words, it's the probability the selected team will win at least that many games. The third window is a pie chart. (Yes, I know pie charts are the unloved orphans of the chart world.) It illustrates the probability each team will win its division.
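The relationship between the first two windows (probability of exactly N wins versus probability of at least N wins) can be sketched like this. The five-entry distribution is a made-up toy example, not real projection data, and the function name is illustrative.

```python
def at_least_probabilities(win_pmf):
    """Convert P(exactly k wins) into P(at least k wins) for each k.

    win_pmf[k] is the probability of finishing with exactly k wins.
    """
    result = []
    running = 0.0
    # Accumulate from the highest win total down to zero wins.
    for p in reversed(win_pmf):
        running += p
        result.append(running)
    result.reverse()
    return result


# ASSUMPTION: toy distribution over 0 to 4 wins, for illustration only.
toy_pmf = [0.1, 0.2, 0.4, 0.2, 0.1]
toy_at_least = at_least_probabilities(toy_pmf)
for wins, prob in enumerate(toy_at_least):
    print(f"P(at least {wins} wins) = {prob:.2f}")
```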
2014 Season Predictions for ESPN the Magazine
Then it got worse. "We want you to predict every score of every game."
I started doing some math in my head. There are 267 games in the season, including the playoffs, which means there are 2^267 different possible combinations of game outcomes in the season. While that might sound like a lot of different possibilities, it's even more than a human being could possibly fathom. Physicists and astronomers estimate there are about 10^80 atoms in the universe (that's 100 quinvigintillion to you and me). And the NFL season's 2^267 possible outcomes come to 2.4x10^80, or about 240 quinvigintillion. Put simply, there are more than twice as many possible outcomes to the NFL season as there are atoms in the universe. And that just refers to wins and losses, and doesn't even consider scores.
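The arithmetic is easy to check with exact integers. This sketch takes the 267-game count and the 10^80 atom estimate from the text; the variable names are only illustrative.

```python
game_count = 267            # games in the season, including playoffs
outcomes = 2 ** game_count  # each game has two possible winners (ties ignored)
atoms = 10 ** 80            # rough estimate of atoms in the universe

ratio = outcomes / atoms
print(f"2^{game_count} is about {outcomes:.2e}")
print(f"That is about {ratio:.2f} times the number of atoms")
```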
So how hard could it be?
Bayesian Draft Prediction Model
I've created a tool for predicting when players will come off the board. This isn't a simple average of projections. Instead, it's a complete model based on the concept of Bayesian inference. Bayesian models have an uncanny knack for accurate projections if done properly. I won't go into the details of how Bayesian inference works in this post and will save that for another article. This post is intended to illustrate the potential of this decision support tool.
Bayesian models begin with a 'prior' probability distribution, used as a reasonable first guess. Then that guess is refined as we add new information. It works the same way your brain does (hopefully). As more information is added, your prior belief is either confirmed or revised to some degree. The degree to which it is refined is a function of how reliable the new information is. This draft projection model works the same way.
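To make the idea concrete, here is a minimal sketch of one standard kind of Bayesian update: a normal prior combined with a normally distributed observation. This is not the draft model itself, whose details are not given here. The function name, the pick numbers, and the variances are all hypothetical.

```python
def update_normal(prior_mean, prior_var, obs, obs_var):
    """Combine a normal prior with one noisy observation.

    Each source is weighted by its precision (1 / variance), so more
    reliable information moves the estimate further.
    """
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_mean * prior_precision + obs * obs_precision)
    return post_mean, post_var


# ASSUMPTION: prior belief that a player goes around pick 20 (variance 36).
# A reliable projection (variance 9) says pick 10.
reliable_mean, reliable_var = update_normal(20.0, 36.0, 10.0, 9.0)
# An unreliable projection (variance 144) says the same pick 10.
noisy_mean, noisy_var = update_normal(20.0, 36.0, 10.0, 144.0)

print(f"After reliable projection: pick {reliable_mean:.1f}")
print(f"After unreliable projection: pick {noisy_mean:.1f}")
```

The same new information shifts the estimate much further when it comes from the more reliable source, which is the behavior described above.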
2013 Seahawks Defense: In the Conversation for Best Ever?
One of the best things about Expected Points Added is that it separates the contributions of offenses, defenses, and special teams. A defense with a very good offense will appear better in terms of other metrics because their opponents would tend to get possession in poor field position. Conversely, a defense sharing a locker room with a below-average offense won't seem as dominant.
Another feature of EPA is that it's measured in net points. It's not just a klugey stat transformed into an analog of net points. It is net point potential. When EPA says a defense is worth 5.0 points per game, that's universally understandable and comparable.
One drawback, at least in its current general implementation, is that EPA doesn't account for the changing nature of the NFL. The league is a moving target, as offenses consistently gain an ever firmer upper hand over defenses. Even over the last dozen years, offenses have gained several points of advantage. (How do we know exactly how much? EPA, that's how.) So defenses from a decade ago might appear better than today's defenses only because of how the league has evolved.
It's a trivial matter to account for the average EPA by year. That would allow us to compare apples to apples based on the "scoring environment" of the season. I'll do that below and see where SEA '13 fits in. But there's one other notion we should at least consider.
Super Bowl XVLXVLVLIIICDMVXXXIII Analysis
Secondly, although my numbers pointed to a SEA edge I did not see that coming. The game notched a 1.5 on the Excitement Index, the lowest of any SB in the data (since '99). The next lowest were the TB-OAK 2002 game and the BAL-NYG 2000 game, each at 2.7. There weren't many decisions to analyze because the game got out of hand so quickly, but I'll go over the little we can learn from last night.
Overall, the game hinged on the fundamentals. SEA's defense was faster, bigger, stronger. Even a layman like myself could tell SEA won because of lots and lots of individual matchup victories. They made tackles at first contact. Guys shook their blocks lightning fast. They swarmed to the screens, caved the pocket, and covered the receivers in stride. There weren't many blitzes or scheming contrivances. Instead it was plain old physical football. The only wrinkle I noticed was that SEA played more cover/man 2 than we expected, but that's not exactly something Manning shouldn't normally be able to handle.
The Challenges
The 2013 ANS All-Analytics Team
Without further ado, here are the 2013 awardees. Winners receive an invitation to play nerf touch football in my backyard. Airfare and hotel are not included. Click on the position headers to see the full stat table for each position.
MVP
Thomas Bayes Would Approve of Seattle's Defensive Tactics
Last week a WSJ article about the Seahawks' defensive backs claimed that they "obstruct and foul opposing receivers on practically every play." I took a deeper look into the numbers and found that as long as referees are reluctant to throw flags on the defense in pass coverage (as claimed in the article), holding the receiver is a very efficient defensive strategy despite the risk of being penalized.
The following is an analysis using the concepts of expected utility, expected cost, and Bayesian statistics.
The reason defensive holding is an optimal strategy comes down to one word. Economics. The referee's reluctance to call penalties on the defensive secondary is analogous to a market inefficiency. The variance in talent on NFL rosters, coaching staffs, and front offices between the best and worst teams in the league is probably very small. Successful teams win within a small margin. Seattle has found a way to exploit a relaxation in marginal constraints within the way the game is called that their competitors have not, and turned it into a competitive advantage.
If you think about committing a penalty in the same way as committing a crime, the expected utility is essentially the same. The expected utility (EU) for defensive holding is (opponent loss of down due to incomplete pass - probability of being penalized x cost of penalty). In other words, EU is the benefit of an incomplete pass minus the cost of the penalty times the probability of getting caught.
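The expected utility expression above can be sketched in a few lines. The benefit, penalty cost, and flag probabilities below are hypothetical placeholders (in arbitrary point units), not figures from the analysis.

```python
def holding_expected_utility(benefit, p_flag, penalty_cost):
    """EU = benefit of the incomplete pass - P(flag) * cost of the penalty."""
    return benefit - p_flag * penalty_cost


def break_even_flag_probability(benefit, penalty_cost):
    """Flag probability at which holding stops paying off (EU = 0)."""
    return benefit / penalty_cost


# ASSUMPTION: illustrative values only.
benefit = 0.5       # value of forcing an incompletion
penalty_cost = 1.5  # value lost when the flag is thrown

eu_lenient = holding_expected_utility(benefit, 0.1, penalty_cost)
eu_strict = holding_expected_utility(benefit, 0.5, penalty_cost)
threshold = break_even_flag_probability(benefit, penalty_cost)

print(f"EU with reluctant referees: {eu_lenient:+.2f}")
print(f"EU with strict referees: {eu_strict:+.2f}")
print(f"Holding pays while P(flag) is below {threshold:.1%}")
```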
Advanced Stat Breakdown for Super Bowl 48
Instead of reading a bunch of words about the Super Bowl matchup, where each phrase is trying its hardest to express some sort of numerical evaluation, wouldn't you prefer to see the numbers themselves collected into one giant eye chart? Well, if that appeals to you, you'll enjoy the SEA-DEN Matchup page.
NFCCG SF-SEA Observations
Unlike the AFC game, this one was all about 4th downs. HUGE leverage throughout the game. I know I can be a broken record on this stuff, but this game really hinged on some very interesting strategic decisions.
-SF 4th and 2 on the SEA 7, 1st qtr. They punted. Probably should have gone for it.
-SF 4th and goal on the SEA 1. They went for it. Great call.
-SF 4th and 6 on the SEA 46, 1st qtr. They punted. Probably should have gone for it.
-SEA 4th and 6 on SF 38, 26 sec in 2nd qtr. They went for it, converted then kicked a FG to end the half.
-SEA 4th and 7 on the SF 35, 4th qtr 14 min to play, down by 4. They went for it. Great call, except SEA burned a timeout to think things over, one they were reasonably likely to need later. Here's the thing: Timeouts are very valuable. If you can't decide between going for it or kicking or punting, you're probably very close to the point of indifference anyway. You may be better off making any quick decision and saving the timeout than you are making an optimum decision but wasting a timeout.
-SEA 4th and goal from the 1, 4th qtr 8:39 to play, up by 1. They went for it. Great call. Why? First, because they'll probably make it and virtually put the game away. And if they don't they're likely to leave the ball on the SF 1-yd line. That's not exactly a good place to be for an offense. I heard someone say that despite the math you can't take a chance like that against the SF defense. But as I noted last week, over the past 2 seasons SF has faced 15 (now 17) plays from the 1-yd line and allowed TDs on 10 of them. That's worse than league average. Don't get me wrong. I'm not saying that the SF defense is below average. Instead, the point is that good and bad teams aren't that different on any one given play. It's just that good or bad teams show up that way after accumulating very small advantages over several dozen individual plays in a game.
-Here's a weird one. SEA 4th and 11 on the SF 29, 4th qtr 3:43 to play, up by 3. My numbers say...punt? Yes, punt. Here's why:
AFCCG NE-DEN Observations
NE was going to need some fluky things to go their way to win--turnovers, a special teams play, or some terrible call by the refs. It never came.
Manning and the DEN passing game did have a fantastic day. Manning: +.48 WPA, +17.9 EPA, 9.3 AYPA, no turnovers, no sacks.
The NYT 4th Down Bot and I (funny how you never see the two of us together at the same time) agreed with every 4th down call during the game. Belichick knows what he's doing. I was disappointed to see DEN burn a timeout just prior to NE's 4th down conversion attempt. Teams should be better prepared for a 4th down attempt, particularly in situations like this: a 4th and short in or near the red zone. In a high-leverage situation like that, it's ok for a team with a significant lead to use a timeout, but in a closer game, it would be much more costly. (I'm working on a project to value timeouts in terms of WP now, and without any spoilers--they are very precious in the 2nd half.)
Obligatory Manning-Brady Post...But This One Is Cool, I Promise
The interactive visualization below charts each player's career. It's a special version of the QB viz I update weekly throughout the season. In this edition, I've selected only Manning and Brady for comparison, plus I've included postseason data.
The viz offers two unique and innovative ways of looking at each player, unashamedly stolen from the best baseball analytics site on the Web, Fangraphs. First, there is a plot of career cumulative Win Probability Added (WPA) from each QB's first year through his most recent year. It's an interesting way to compare the career trajectories of top passers because it's a cumulative chart.
Second, there is an "Nth best season" Expected Points Added (EPA) chart, which takes some thinking to understand because it's not plotted in chronological order. It plots each QB's seasons in order from his best EPA season through his worst EPA season. It's not cumulative, and just because it appears to trend downward does not mean the QBs are declining. I like it because re-ordering each season makes the separation between each player's performance clear to see.
Would Auburn Have Been Better Off Not Scoring a Touchdown?
...Nevertheless, Auburn had about an 81 percent chance of winning after Mason’s score, as teams in Florida State’s position are able to score a touchdown about 19 percent of the time. (These numbers are based on analogous situations in the NFL, though I’ve made slight adjustments for the differences in pace between college and the pros, and to include the chance of a kick return for a touchdown.)
So, the question we’re evaluating is whether having a first down at the 1-yard line would have given Auburn more than an 81 percent chance of winning. It’s a tricky question because it needs to be analyzed backwards...
(What I didn't explain in the article is that it's easiest to work backwards because the Auburn WP on 3rd down depends on the results of a potential 4th down. And the WP on 2nd down depends on the potential results of 3rd down, which in turn depend on 4th down. And so on.)
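That backward calculation can be sketched as a short loop that starts at 4th down and works toward 1st down. Every probability below is a hypothetical placeholder rather than a value from the article, and the function name is illustrative.

```python
def series_win_probability(p_td, wp_if_td, wp_if_stopped_on_4th):
    """Win probability at the start of each down in a goal-line series.

    p_td[d] is the chance of scoring on down d (index 0 = 1st down).
    wp_if_td[d] is the win probability after scoring on that down.
    A failed play on 4th down ends the series with wp_if_stopped_on_4th.
    """
    wp_next = wp_if_stopped_on_4th
    wp_by_down = []
    # Work backwards: 4th down first, then 3rd, 2nd, and 1st.
    for p, wp_td in zip(reversed(p_td), reversed(wp_if_td)):
        wp_next = p * wp_td + (1 - p) * wp_next
        wp_by_down.append(wp_next)
    wp_by_down.reverse()
    return wp_by_down


# ASSUMPTION: made-up numbers. WP after a TD rises on later downs because
# the opponent has less time left to respond.
wp_by_down = series_win_probability(
    p_td=[0.5, 0.5, 0.5, 0.5],
    wp_if_td=[0.84, 0.88, 0.92, 0.96],
    wp_if_stopped_on_4th=0.40,
)
for down, wp in enumerate(wp_by_down, start=1):
    print(f"WP entering down {down}: {wp:.2f}")
```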
For the Slate articles, I can't get away with as much math and as many equations as I like, so here's a table of the relevant probabilities I used. It was complicated because the deeper into the goal-line series Auburn went, the lower Auburn's chances of getting the TD went BUT the lower Florida State's chances of responding went too. This edition assumes Auburn goes for it on 4th down on what would be a single make-or-break play for the championship.
What if KC Had Just Kneeled Out the 2nd Half?
There were a total of 1,519 seconds left in the game. KC can burn 40 seconds between plays and 6 seconds during a typical play just by calling a super safe run (that stays in bounds) or even a kneel. Even if KC doesn't try to convert a single first down, they can burn 144 seconds on a series. However, IND can use its 3 timeouts to make one series only take 24 seconds off the game clock.
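One way to reproduce the 144-second and 24-second figures is sketched below. It assumes a series is four plays (three downs plus a punt) of 6 seconds each, with a 40-second runoff in each of the three gaps between them unless the clock is stopped. That breakdown is my reading of the numbers, not something the text spells out, and the names are illustrative.

```python
PLAY_SECONDS = 6            # from the text: time used by a typical safe play
BETWEEN_PLAYS_SECONDS = 40  # from the text: runoff between plays


def series_seconds(plays=4, clock_stoppages=0):
    """Game-clock seconds used by one series.

    ASSUMPTION: the 40-second runoff applies only to gaps between plays
    where the clock keeps running; each stoppage (such as an opponent
    timeout) removes one of those gaps.
    """
    gaps = plays - 1
    if not 0 <= clock_stoppages <= gaps:
        raise ValueError("clock_stoppages must be between 0 and plays - 1")
    return plays * PLAY_SECONDS + (gaps - clock_stoppages) * BETWEEN_PLAYS_SECONDS


full_series = series_seconds()                      # clock runs throughout
timeout_series = series_seconds(clock_stoppages=3)  # opponent uses 3 timeouts
print(full_series, timeout_series)
```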
Because IND is due to receive the kickoff at 13:39 in the 3rd, KC was guaranteed to have at least three possessions--one between each theoretical IND TD. That means that just by kneeling, KC can burn a total of 456 seconds (7:36) off the game clock, which leaves a total of 1,207 seconds of game time (20:07) and no timeouts for IND to score 4 touchdowns.