Showing posts with label analysis. Show all posts

Leaving Free WP All Over the Field

If you were a coach, would you voluntarily give up a down at some point in the game, just to be sporting? Ehh, let's just make it 3rd and 5 instead of 2nd and 5. Of course not. For a random play in the 2nd quarter, that would cost you about 0.02 WP (2% chance of winning) for no reason.

So why do NFL coaches voluntarily leave WP out on the field?

Take yesterday's DEN-SEA game as an example. SEA was ahead 17-12 in the 4th quarter, and had the ball deep in their own territory with about 9 minutes to play. With the game clock running, they snapped the ball with: 8, 5, 5, 8, and 10 seconds left on the play clock. That's a total of 36 seconds. Plus, there was a play in which the receiver could have just as easily remained in bounds. Because there was more than 5 minutes left in the game and the clock restarts after the ball is set, that may have only cost 10-15 seconds of play clock rather than up to 40 seconds. To be fair let's say there was a total of 46 seconds SEA could have burned off the clock during their second to last drive with almost no effort or risk.

Texans Try Once, Fail Twice

Down 14-0 at the start of the second half to the New York Giants, the Houston Texans faced a 4th-and-1 on their own 46-yard line. At this point, with just a 9.0% chance to win, Bill O'Brien made the correct call to go for it. A successful conversion means a 12.9% win probability, while a punt means about an 8.6% chance to win. The break-even point for going for it is far below an estimated 65% conversion rate on 4th-and-1. Alfred Blue ran off right tackle and was stuffed, turning the ball over on downs. The Giants would kick a field goal to go up 17-0.
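For anyone who wants to see the break-even arithmetic, here is a minimal sketch. The 12.9% and 8.6% figures and the 65% conversion estimate come from the numbers above; the win probability after a failed attempt is not quoted here, so the values swept below are purely hypothetical placeholders, and the function names are my own.

```python
def breakeven_conversion_rate(wp_success, wp_fail, wp_punt):
    """Conversion probability at which going for it equals punting.

    Solves p * wp_success + (1 - p) * wp_fail = wp_punt for p.
    """
    return (wp_punt - wp_fail) / (wp_success - wp_fail)


def indifference_wp_fail(p_convert, wp_success, wp_punt):
    """WP after a failed attempt at which going for it equals punting."""
    return (wp_punt - p_convert * wp_success) / (1 - p_convert)


WP_SUCCESS = 0.129  # WP after a conversion (quoted above)
WP_PUNT = 0.086     # WP after a punt (quoted above)
P_CONVERT = 0.65    # estimated 4th-and-1 conversion rate (quoted above)

# Hypothetical WP values after a failed attempt; the real figure is not given.
for wp_fail in (0.04, 0.05, 0.06, 0.07):
    p_star = breakeven_conversion_rate(WP_SUCCESS, wp_fail, WP_PUNT)
    print(f"WP after failure {wp_fail:.3f}: break-even conversion rate {p_star:.1%}")

# How low WP after a failure would have to be before punting is better.
threshold = indifference_wp_fail(P_CONVERT, WP_SUCCESS, WP_PUNT)
print(f"Going for it wins out if WP after a failure exceeds {threshold:.4f}")
```

The second function turns the question around: with a 65% conversion rate, going for it only loses to the punt if a failure leaves the Texans with well under a 1% chance to win.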

On the very next drive, Ryan Fitzpatrick led the Texans downfield to the Giants 9-yard line where they faced another 4th-and-1. With 6:13 left in the third, down 17, Bill O'Brien elected to take the chip-shot field goal. Even the commentators suggested he should be going for it. Obviously, the prior failure on fourth down should not have an effect on the Texans' decision this time. If that were the case, O'Brien would be judging his previous decision on the outcome, rather than the process. The only other logic could be that he figured they would need a field goal at some point, down 17 - common faulty logic in the NFL as coaches should be doing whatever they can to maximize their chances of winning.

Two-Point Conversion in the KC-DEN game

With 7:15 left in the 4th quarter against DEN, KC's Knile Davis ran for a 4-yard TD, narrowing the Broncos' lead to 21-16 pending the extra point or two-point conversion. Andy Reid elected for the extra point, and following the kick the Chiefs trailed by 4 points rather than the 3 or 5 points that would have resulted from a two-point try.

NFL coaches typically adhere to what's known as the Vermeil Chart for making two-point decisions. The chart was created by Dick Vermeil when he was offensive coordinator for UCLA over 40 years ago. It's a very simple chart that looks only at score difference prior to any conversion attempt and does not consider time remaining, with one caveat. It applies only when the coach expects there to be three or fewer (meaningful) possessions left in the game.

With just over 7 minutes to play, there could be three possessions at most left, especially considering that at least one of those possessions would need to be a KC scoring drive for any of this to matter. (In actuality, there were only two possessions left, one for each team.) Even the tried-and-true Vermeil chart says go for two when trailing by 5. But it's not the 1970s any more and this isn't college ball, so let's apply the numbers and create a better way of analyzing go-for-two decisions.

Except for rare exceptions I've resisted analyzing two-point conversion decisions with the Win Probability model because, as will become apparent, the analysis is particularly susceptible to noise. Now that we've got the new model, noise is extremely low, and I'm confident the model is more than up to the task.

First, let's walk through the possibilities for KC intuitively. If KC fails to score again or DEN gets a TD, none of this matters. Otherwise:

Chiefs Crawling Drive, Come Away With Nothing

The Chiefs lost to the Broncos 24-17 on Sunday and had a chance to at least tie the game at the very end. Kansas City kept Peyton Manning off the field for an enormous chunk of the second half. The Broncos offense had only two drives after halftime (not including the final kneel down), one for a punt, one for a field goal, totaling just 8:51 in possession. The longest drive came from the Chiefs at the very start of the second half, where they ran 23 plays, taking 10 minutes off the clock... and ultimately missed a field goal. This got me thinking, how does drive length (in minutes) affect the probability of a team scoring?

First, here's a look at the ridiculous drive using our Markov model:

Nick Foles and Interception Index Regression

Nick Foles and Josh McCown were two of last season's most pleasant surprises, emerging from obscurity to post two of the league's most efficient seasons. Both finished in the top 3 for Expected Points Added per Play, in large part because the two combined to throw just three interceptions.

With one week of the 2014 season in the books, Foles and McCown have already matched that combined total. While everyone should have expected both to regress from their remarkably turnover-free 2013 seasons, that does not tell us how far each should regress based on historical norms.

Analyzing Replay Challenges

The new WP model allows some nifty new applications. One of the more notable improvements is the consideration of timeouts. That, together with enhanced accuracy and precision, allows us to analyze replay challenge decisions. Here at AFA, we've tinkered with replay analysis before, and we've estimated the implicit value of a timeout based on how and when coaches challenge plays. But without a way to directly measure the value of a timeout the analysis was only an exercise.

Most challenges are now replay assistant challenges--the automatic reviews for all scores and turnovers, plus particular plays inside two minutes of each half. Still, there are plenty of opportunities for coaches to challenge a call each week.

The cost of a challenge is two-fold. First, the coach (probably) loses one of his two challenges for the game. (He can recover one if he wins both challenges in a game.) Second, an unsuccessful challenge results in a charged timeout. The value of the first cost would be very hard to estimate, but thankfully the event that a coach runs out of challenges AND needs to use a third is exceptionally rare. I can't find even a single example since the automatic replay rules went into effect.

So I'm going to set that consideration aside for now. In the future, I may try to put a value on it, particularly if a coach had already used one challenge. But even then it would be very small and would diminish to zero as the game progresses toward its final 2 minutes. In any case, all the coaches' challenges from this week were first challenges, and none represented the final team timeout, so we're in safe waters for now.

Every replay situation is unique. We can't quantify the probability that a particular play will be overturned statistically, but we can determine the breakeven probability of success for a challenge to be worthwhile for any situation. If a coach believes the chance of overturning the call is above the breakeven level, he should challenge. Below the breakeven level, he should hold onto his red flag.

The 4th Down Bot Returns

The 4th Down Bot is returning to the New York Times this season. You might recall we booted him up late last season, but this year he'll be around starting week one. At its heart, the bot is a fun application of the 4th Down Calculator feature here at AFA. It uses both the Expected Points model and the Win Probability model to estimate the best option for every 4th down as a game is in progress.

As I mentioned last year, although the 4th down issue is growing mold with smarter fans, it remains the lowest hanging fruit on the football analytics tree. So it's nice to be able to automate things and not have to do the analysis myself. But on the other hand, we can add 'football analyst' to the list of jobs being taken over by robots.

The Bot will be faster, more accurate, and come with some new features this season. Here is a brief introduction. Here are a few notes on how it works. And here is his Twitter feed.

Season Projections Visualization

I put together a season projection visualization that illustrates the playoff probabilities and win totals for each team. The numbers are based on the results of the season prediction project I did for ESPN The Magazine.

The method used to create the projections is explained here.

The viz is intended as one-stop shopping for the season outlook. The top window shows the probabilities each team will make the playoffs. Dark green indicates a playoff berth by winning the division, lighter green indicates a wildcard berth.

The three windows below are team-specific. Hover the cursor over (or tap) a team's column in the top chart to see its details below. The window on the left is a chart of win totals. The bars represent the probability the selected team will finish with a corresponding number of wins. The second window shows the same information presented in a different way. It's the cumulative probability of each win total. In other words, it's the probability the selected team will win at least that many games. The third window is a pie chart. (Yes, I know pie charts are the unloved orphans of the chart world.) It illustrates the probability each team will win its division.

2014 Season Predictions for ESPN the Magazine

ESPN asked me to predict the 2014 season for their NFL Preview edition of their magazine. I was very hesitant because predicting the season to any degree is extremely difficult. I'm even on record as proclaiming that all pre-season predictions are "worthless." (More on that below.) "You want me to predict which teams make the playoffs?" "Yes," they said, "in fact, we want you to predict the winner of all 267 games."

Then it got worse. "We want you to predict every score of every game."

I started doing some math in my head. There are 267 games in the season, including the playoffs, which means there are 2^267 different possible combinations of game outcomes in the season. While that might sound like a lot of different possibilities, it's even more than a human being could possibly fathom. Physicists and astronomers estimate there are about 10^80 atoms in the universe (that's 100 quinvigintillion to you and me). And the NFL season's 2^267 possible outcomes come to 2.4x10^80, or about 240 quinvigintillion. Put simply, there are more than twice as many possible outcomes to the NFL season as there are atoms in the universe. And that just refers to wins and losses, and doesn't even consider scores.
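If you'd rather not do that one in your head, a few lines of Python check the arithmetic. The 10^80 atom count is the rough estimate quoted above, not a precise figure, and the variable names are just my own.

```python
GAMES = 267                # regular season plus playoffs, as stated above
ATOMS_ESTIMATE = 10 ** 80  # rough estimate of atoms in the universe

# Each game has two possible winners, so the season has 2^267 win/loss outcomes.
outcomes = 2 ** GAMES
ratio = outcomes / ATOMS_ESTIMATE

print(f"Possible seasons: {outcomes:.2e}")
print(f"Seasons per atom in the universe: {ratio:.2f}")
```

Python integers have arbitrary precision, so 2^267 is computed exactly before it is formatted.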

So how hard could it be?

Bayesian Draft Prediction Model

Let's say you're a GM in need of a safety. You really like Ha Ha Clinton-Dix (FS Ala.) but are unsure if he'll still be on the board when you're on the clock. Do you need to trade up? How far? What if you're a GM with a high pick and would be willing to trade down if you're still assured of getting Clinton-Dix? How far down could you trade and still get your guy?

I've created a tool for predicting when players will come off the board. This isn't a simple average of projections. Instead, it's a complete model based on the concept of Bayesian inference. Bayesian models have an uncanny knack for accurate projections if done properly. I won't go into the details of how Bayesian inference works in this post and save that for another article. This post is intended to illustrate the potential of this decision support tool.

Bayesian models begin with a 'prior' probability distribution, used as a reasonable first guess. Then that guess is refined as we add new information. It works the same way your brain does (hopefully). As more information is added, your prior belief is either confirmed or revised to some degree. The degree to which it is refined is a function of how reliable the new information is. This draft projection model works the same way.

2013 Seahawks Defense: In the Conversation for Best Ever?

The SEA defense dominated the league's best offense in years to take home the championship. Where should they stand in relation to some of the great defenses in recent times? Could they be the best defense ever?

One of the best things about Expected Points Added is that it separates the contributions of offenses, defenses, and special teams. A defense with a very good offense will appear better in terms of other metrics because their opponents would tend to get possession in poor field position. Conversely, a defense sharing a locker room with a below-average offense won't seem as dominant.

Another feature of EPA is that it's measured in net points. It's not just a klugey stat transformed into an analog of net points. It is net point potential. When EPA says a defense is worth 5.0 points per game, that's universally understandable and comparable.

One drawback, at least in its current general implementation, is that EPA doesn't account for the changing nature of the NFL. The league is a moving target, as offenses consistently gain an ever firmer upper hand over defenses. Even over the last dozen years, offenses have gained several points of advantage. (How do we know exactly how much? EPA, that's how.) So defenses from a decade ago might appear better than today's defenses only because of how the league has evolved.

It's a trivial matter to account for the average EPA by year. That would allow us to compare apples to apples based on the "scoring environment" of the season. I'll do that below and see where SEA '13 fits in. But there's one other notion we should at least consider.

Super Bowl XVLXVLVLIIICDMVXXXIII Analysis

First of all, I'm getting tired of the Roman numerals. It was cool up until maybe Super Bowl XXV, but now it just hurts my brain.

Secondly, although my numbers pointed to a SEA edge I did not see that coming. The game notched a 1.5 on the Excitement Index, the lowest of any SB in the data (since '99). The next lowest were the TB-OAK 2002 game and the BAL-NYG 2000 game, each at 2.7. There weren't many decisions to analyze because the game got out of hand so quickly, but I'll go over the little we can learn from last night.

Overall, the game hinged on the fundamentals. SEA's defense was faster, bigger, stronger. Even a layman like myself could tell SEA won because of lots and lots of individual matchup victories. They made tackles at first contact. Guys shook their blocks lightning fast. They swarmed to the screens, caved the pocket, and covered the receivers in stride. There weren't many blitzes or scheming contrivances. Instead it was plain old physical football. The only wrinkle I noticed was that SEA played more cover/man 2 than we expected, but that's not exactly something Manning shouldn't normally be able to handle.

The Challenges

The 2013 ANS All-Analytics Team

The All-Analytics team returns. Like always, the awards are predominantly based on pure numbers, specifically Win Probability Added and Expected Points Added. The chart at the bottom of this post is provided for easy reference. It plots regular season WPA and EPA for the top 32 players at each position. You can look at past seasons as well. The players closest to the top right corner are the leaders at their position. That chart is available with running totals throughout the season in the Tools | Visualizations | Position Leaders link in the menu.

Without further ado, here are the 2013 awardees. Winners receive an invitation to play nerf touch football in my backyard. Airfare and hotel are not included. Click on the position headers to see the full stat table for each position.

MVP

Thomas Bayes Would Approve of Seattle's Defensive Tactics

The following is a guest article by Gary Montry, a professional applied mathematician. Editor's note: Gary uses net yardage as the measure of utility, and while we might prefer something like EP or WP, I think the general point of the article stands, and its strength is in the construction and solution to the problem. It's also a great refresher on conditional probabilities and Bayes' theorem.

Last week a WSJ article about the Seahawks' defensive backs claimed that they "obstruct and foul opposing receivers on practically every play." I took a deeper look into the numbers and found that as long as referees are reluctant to throw flags on the defense in pass coverage (as claimed in the article), holding the receiver is a very efficient defensive strategy despite the risk of being penalized.

The following is an analysis using the concepts of expected utility, expected cost, and Bayesian statistics.

The reason defensive holding is an optimal strategy comes down to one word. Economics. The referee's reluctance to call penalties on the defensive secondary is analogous to a market inefficiency. The variance in talent on NFL rosters, coaching staffs, and front offices between the best and worst teams in the league is probably very small. Successful teams win within a small margin. Seattle has found a way to exploit a relaxation in marginal constraints within the way the game is called that their competitors have not, and turned it into a competitive advantage.

If you think about committing a penalty in the same way as committing a crime, the expected utility is essentially the same. The expected utility (EU) for defensive holding is (opponent loss of down due to incomplete pass - probability of being penalized x cost of penalty). In other words, EU is the benefit of an incomplete pass minus the cost of the penalty times the probability of getting caught.
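As a rough sketch of that formula, the expected utility and the break-even flag probability can be written out as below. The yardage values are hypothetical placeholders chosen only to show the mechanics; they are not the figures used in the analysis, and the function names are my own.

```python
def expected_utility(benefit, p_flag, penalty_cost):
    """EU of holding: benefit of the incompletion minus expected penalty cost."""
    return benefit - p_flag * penalty_cost


def breakeven_flag_probability(benefit, penalty_cost):
    """Flag probability at which holding has zero expected utility."""
    return benefit / penalty_cost


# Hypothetical inputs, in net yards (illustrative only).
BENEFIT = 6.0        # assumed value of forcing an incomplete pass
PENALTY_COST = 15.0  # assumed total cost of a holding call, first down included

for p_flag in (0.1, 0.2, 0.4, 0.6):
    eu = expected_utility(BENEFIT, p_flag, PENALTY_COST)
    print(f"P(flag) = {p_flag:.1f}: EU = {eu:+.1f} net yards")

print(f"Break-even P(flag): {breakeven_flag_probability(BENEFIT, PENALTY_COST):.2f}")
```

Under any inputs of this shape, holding pays off as long as officials throw the flag less often than the break-even rate.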

Advanced Stat Breakdown for Super Bowl 48

Instead of reading a bunch of words about the Super Bowl matchup, where each phase is trying its hardest to express some sort of numerical evaluation, wouldn't you prefer to see the numbers themselves collected into one giant eye chart? Well, if that appeals to you, you'll enjoy the SEA-DEN Matchup page.

NFCCG SF-SEA Observations

As expected, this was a real defensive slugfest. The winning QB had -3.4 EPA. Kaepernick posted -0.28 WPA and 2.2 AYPA. Both offensive lines were beaten soundly. SF's notched -5.4 EPA and SEA's had -2.6 EPA.

Unlike the AFC game, this one was all about 4th downs. HUGE leverage throughout the game. I know I can be a broken record on this stuff, but this game really hinged on some very interesting strategic decisions.

-SF 4th and 2 on the SEA 7, 1st qtr. They punted. Probably should have gone for it.

-SF 4th and goal on the SEA 1. They went for it. Great call.

-SF 4th and 6 on the SEA 46, 1st qtr. They punted. Probably should have gone for it.

-SEA 4th and 6 on SF 38, 26 sec in 2nd qtr. They went for it, converted then kicked a FG to end the half.

-SEA 4th and 7 on the SF 35, 4th qtr 14 min to play, down by 4. They went for it. Great call except SEA burned a timeout that they were reasonably likely to need in order to think things over. Here's the thing: Timeouts are very valuable. If you can't decide between going for it or kicking or punting, you're probably very close to the point of indifference anyway. You may be better off making any quick decision and saving the timeout than you are making an optimum decision but wasting a timeout.

-SEA 4th and goal from the 1, 4th qtr 8:39 to play, up by 1. They went for it. Great call. Why? First, because they'll probably make it and virtually put the game away. And if they don't they're likely to leave the ball on the SF 1-yd line. That's not exactly a good place to be for an offense. I heard someone say that despite the math you can't take a chance like that against the SF defense. But as I noted last week, over the past 2 seasons SF has faced 15 (now 17) plays from the 1-yd line and allowed TDs on 10 of them. That's worse than league average. Don't get me wrong. I'm not saying that the SF defense is below average. Instead, the point is that good and bad teams aren't that different on any one given play. It's just that good or bad teams show up that way after accumulating very small advantages over several dozen individual plays in a game.

-Here's a weird one. SEA 4th and 11 on the SF 29, 4th qtr 3:43 to play, up by 3. My numbers say...punt? Yes, punt. Here's why:

AFCCG NE-DEN Observations

I thought the big story of the game wasn't how easily DEN moved the ball. We all expected that. The big story was DEN's defense, which held NE to just 3 points in the 1st half and 10 points through 55 minutes. Brady was held to -0.02 WPA. He did notch +8.3 EPA, but a lot of that was after the game was mostly decided.

NE was going to need some fluky things to go their way to win--turnovers, a special teams play, or some terrible call by the refs. It never came.

Manning and the DEN passing game did have a fantastic day. Manning: +.48 WPA, +17.9 EPA, 9.3 AYPA, no turnovers, no sacks.

The NYT 4th Down Bot and I (funny how you never see the two of us together at the same time) agreed with every 4th down call during the game. Belichick knows what he's doing. I was disappointed to see DEN burn a timeout just prior to NE's 4th down conversion attempt. Teams should be better prepared for a 4th down attempt, particularly in situations like this: a 4th and short in or near the red zone. In a high-leverage situation like that, it's ok for a team with a significant lead to use a timeout, but in a closer game, it would be much more costly. (I'm working on a project to value timeouts in terms of WP now, and without any spoilers--they are very precious in the 2nd half.)

Obligatory Manning-Brady Post...But This One Is Cool, I Promise

The risk of injury aside, it's inevitable we'll see either Manning or Brady in this season's Super Bowl. These two great players will be linked for all of football history. Even advanced stats aren't going to separate their performance from their teams'--the numbers are only the start of the conversation, not the end. But as long as the conversation is going to happen, we might as well start with the best numbers.

The interactive visualization below charts each player's career. It's a special version of the QB viz I update weekly throughout the season. In this edition, I've selected only Manning and Brady for comparison, plus I've included postseason data.

The viz offers two unique and innovative ways of looking at each player, unashamedly stolen from the best baseball analytics site on the Web, Fangraphs. First, there is a plot of career cumulative Win Probability Added (WPA) from each QB's first year through his most recent year. It's an interesting way to compare the career trajectories of top passers because it's a cumulative chart.

Second, there is an "Nth best season" Expected Points Added (EPA) chart, which takes some thinking to understand because it's not plotted in chronological order. It plots each QB's season in order from his best EPA season through worst EPA season. It's not cumulative, and just because it appears to trend downward does not mean the QBs are declining. I like it because re-ordering each season makes the separation between each player's performance clear to see.

Would Auburn Have Been Better Off Not Scoring a Touchdown?

Slate asked me to take a look at the possibility that Auburn would have a higher win probability had they not scored on Tre Mason's go-ahead touchdown run, and instead taken a knee at the 1-yard line with 1:19 to play. It was a difficult analysis, and required some unsatisfying assumptions, but in the end the results confidently pointed toward one conclusion.

...Nevertheless, Auburn had about an 81 percent chance of winning after Mason’s score, as teams in Florida State’s position are able to score a touchdown about 19 percent of the time. (These numbers are based on analogous situations in the NFL, though I’ve made slight adjustments for the differences in pace between college and the pros, and to include the chance of a kick return for a touchdown.)

So, the question we’re evaluating is whether having a first down at the 1-yard line would have given Auburn more than an 81 percent chance of winning. It’s a tricky question because it needs to be analyzed backwards...

(What I didn't explain in the article is that it's easiest to work backwards because the Auburn WP on 3rd down depends on the results of a potential 4th down. And the WP on 2nd down depends on the potential results of 3rd down, which in turn depend on 4th down. And so on.)
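To make the backwards logic concrete, here is a minimal sketch of that kind of calculation. It is not the exact model behind the Slate piece. Every probability below is a hypothetical placeholder except the 0.81 for scoring right away, and the function and variable names are my own.

```python
def goal_line_wp(p_td, wp_if_td, wp_if_stopped_on_4th):
    """Work backwards from 4th down to get the offense's WP on each down.

    p_td[d]      : chance of scoring the TD on down d
    wp_if_td[d]  : offense's WP if it scores on down d
    On a failed 4th down the offense is left with wp_if_stopped_on_4th.
    """
    wp_by_down = {}
    wp_after_failure = wp_if_stopped_on_4th
    for down in (4, 3, 2, 1):
        wp = p_td[down] * wp_if_td[down] + (1 - p_td[down]) * wp_after_failure
        wp_by_down[down] = wp
        wp_after_failure = wp  # failing on the prior down leads to this state
    return wp_by_down


# Hypothetical inputs. Scoring later leaves the opponent less time to respond,
# so WP after a TD rises with each down; 0.81 is the figure for scoring at once.
P_TD = {1: 0.50, 2: 0.50, 3: 0.50, 4: 0.50}
WP_IF_TD = {1: 0.81, 2: 0.88, 3: 0.93, 4: 0.97}
WP_IF_STOPPED = 0.02

for down, wp in sorted(goal_line_wp(P_TD, WP_IF_TD, WP_IF_STOPPED).items()):
    print(f"Down {down}: WP = {wp:.3f}")
```

The loop starts at 4th down because that is the only down whose value does not depend on a later one.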

For the Slate articles, I can't get away with as much math and as many equations as I'd like, so here's a table of the relevant probabilities I used. It was complicated because the deeper into the goal-line series Auburn went, the lower Auburn's chances of getting the TD went BUT the lower Florida State's chances of responding went too. This edition assumes Auburn goes for it on 4th down on what would be a single make-or-break play for the championship.

What if KC Had Just Kneeled Out the 2nd Half?

With 13:39 left in the 3rd quarter, KC led IND 38-10. Obviously, 3rd grade math tells us IND needed at least 4 touchdowns just to tie the game at this point. A little more arithmetic might illustrate just how badly KC bungled this game.

There were a total of 1,519 seconds left in the game. KC can burn 40 seconds between plays and 6 seconds during a typical play just by calling a super safe run (that stays in bounds) or even a kneel. Even if KC doesn't try to convert a single first down, they can burn 144 seconds on a series. However, IND can use its 3 timeouts to make one series only take 24 seconds off the game clock.
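The per-series figures can be reproduced with a quick sketch. I'm assuming a series means three safe plays plus a punt, with the clock running between plays unless the defense stops it with a timeout; the function and constant names are just for illustration.

```python
SECONDS_BETWEEN_PLAYS = 40  # clock burned before each snap, as stated above
SECONDS_PER_PLAY = 6        # typical duration of a safe run, kneel, or punt


def series_seconds(plays=4, defensive_timeouts=0):
    """Game clock used by one series of `plays` snaps (three downs plus a punt).

    The clock runs between consecutive plays unless the defense spends a
    timeout to stop it.
    """
    gaps = plays - 1
    running_gaps = max(gaps - defensive_timeouts, 0)
    return plays * SECONDS_PER_PLAY + running_gaps * SECONDS_BETWEEN_PLAYS


print("No timeouts used:", series_seconds(), "seconds")
print("Defense uses 3 timeouts:", series_seconds(defensive_timeouts=3), "seconds")
```

With no timeouts, the three gaps between the four snaps account for most of the time, which is why the defense's timeouts matter so much.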

Because IND is due to receive the kickoff at 13:39 in the 3rd, KC was guaranteed to have at least three possessions--one between each theoretical IND TD. That means that just by kneeling, KC can burn a total of 456 seconds (7:36) off the game clock, which leaves a total of 1,207 seconds of game time (20:07) and no timeouts for IND to score 4 touchdowns.

Special Playoff Team Viz

I put together a special version of the team stat visualization for the playoff field. It's a good, quick way to get a feel for how each contender compares on offense and defense, in the passing game and in the running game. The first three tabs depict the three key team stats: EPA, WPA, and SR. Two additional tabs break out run and pass EPA production.

What might be particularly useful is the week selector slider. It's handy for isolating recent trends or for isolating subsets of games, such as GB's games with and without a healthy Aaron Rodgers.

The dashed average lines for each stat represent the 2013 overall league average, not just the playoff field. Here's a version below, but a larger version can be found here.

Chargers Courageous Call & Playoff-Clinching Drive

Despite the controversy surrounding an illegal defense on the Chiefs' missed field goal at the end of regulation, the San Diego Chargers defied odds and clinched a postseason berth on Sunday. In overtime, Philip Rivers orchestrated a 17-play, nine-and-a-half minute field goal drive to start the extra quarter that ultimately sealed their win. The length of the drive, in this case, is just as important as the outcome as San Diego could advance with either a win or a tie.

Using our Markov model, let's take a look at the drive. Keep in mind, the model is best used for a standard drive when time and score differential would not greatly affect decision-making or play-calling. Since this was the opening drive of overtime, those standards will predominantly hold true, although not perfectly given the leverage of the situation.


Tony Romo and High Variance QBs

It's too bad we won't get to watch Tony Romo play for the N.F.C. East championship Sunday night due to injury, because he always seems to add drama to a game. Just look at his last two starts. In week 15 he helped blow a 23-point halftime lead with a -0.13 Win Probability Added (WPA) performance, but in week 16 he threw the game-winning TD pass in the final seconds and put up a +0.64 WPA performance to keep his team's playoff hopes alive.

Romo has a reputation as a choker, mostly due to his team's late-season under-performance. But taken as a whole, his numbers are very "clutch." He has ranked 8th, 5th, 6th, 3rd and 12th in QB WPA over his last 5 (non-injury) seasons. I don't buy the choker label, but I do buy the notion that he's a gambler and an improviser, which can cause wide swings in his game-to-game WPA numbers. So I thought I'd pull the stats and investigate how Romo compares in terms of his game-wise variance.

Variance measures how wildly a stat changes. In this analysis, I measured the variance of week-to-week WPA. If a player has lots of heroic games and lots of calamitous games, his WPA variance will be high. If a player is relatively consistent from week-to-week, regardless if he is consistently good or consistently bad, his WPA variance will be low.

For those who are unfamiliar, variance is an average of the squared deviation from each player's mean WPA. For statistical sticklers out there, I used variance of a population rather than variance of a sample--otherwise players with relatively few starts would have a slightly inflated result.
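Here is a small illustration of the difference between the two, using Python's standard `statistics` module. The only data are Romo's two most recent game WPAs quoted above, which is far too small a sample for any real analysis but shows the effect clearly.

```python
from statistics import mean, pvariance, variance

# Romo's last two game WPAs as quoted above (weeks 15 and 16).
game_wpa = [-0.13, 0.64]

avg = mean(game_wpa)
population_var = pvariance(game_wpa)  # divides squared deviations by n
sample_var = variance(game_wpa)       # divides squared deviations by n - 1

print(f"Mean WPA: {avg:.3f}")
print(f"Population variance: {population_var:.4f}")
print(f"Sample variance: {sample_var:.4f}")
```

The sample version divides by n - 1 instead of n, so the fewer games a QB has, the more it is inflated relative to the population version.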

This season, Romo ranks only 14th in game-to-game WPA variance out of the 41 QBs with 200 or more qualifying plays. He's also 14th over the past 5 seasons. If we go back to 1999, the earliest year of my data, he ranks 9th out of the 101 QBs with 800 or more qualifying plays. His early career was apparently very boom-and-bust.

Here are three tables of QB WPA variance. The first is for the 2013 season through week 16. The second is for the last five seasons. And the third for the period 1999-2013.

PIT Should Not Have Scored the TD

With the score tied and 1:51 to play, PIT had a 1st and 10 on the GB 17 yard line. In many circumstances a team can run down the clock and kick a short FG to win the game. PIT was near the 'Field Goal Choke Hold' zone, when it's better for the offense not to score a TD and better for the defense to allow a TD. But fortunately for GB, they had all 3 of their timeouts, and could be assured of getting the ball back with 1:27 to play if they made a stop and forced the FG. So with 3 timeouts remaining, the numbers say it never makes sense for a defense to intentionally allow the TD.

But GB jumped offside on the FG attempt, and gave PIT a 1st down and goal from the 5 with 1:35 to play. Now GB had only 1 timeout left, and it would have certainly made sense for PIT to refrain from scoring the TD, burn time off the clock, and kick an easy FG for the win.

The chart below illustrates when a defense would prefer to allow a TD. The black diamond represents the state of the game at the 1st and goal mark assuming PIT does not score a TD. The black line shows the win probability of the defense if they allow the TD.

Did Shanahan Make the Right Decision to Go for Two?

Following a TD to pull within an extra point of tying the game with just seconds left to play, WAS head coach Mike Shanahan elected to attempt a two-point conversion to win rather than enter overtime. It’s not fashionable to defend Shanahan around DC these days, but I think this was the right decision for a couple reasons.

Philbin Shows Up Phil

In a huge division game with significant playoff implications, the Dolphins trailed the Patriots 20-17 with a little over three minutes remaining and three timeouts in their back pocket. Facing a 4th-and-5 from their own 45-yard line, Jim Nantz asked Phil Simms whether he would punt in that situation, to which Simms replied "Absolutely, punt it!" The logic here is that with three timeouts remaining, the Dolphins could stop the Patriots and force a three-and-out, getting the ball back with another chance to drive down and tie or win the game.

Let's look at the baseline win probability numbers:

Payton Was Right to Decline

At least according to Expected Points, he was.

Here's the situation: At the beginning of the 3rd quarter against CAR, NO had a 1st and 10 at their own 16-yard line. They threw for a 7-yard gain, setting up a 2nd and 3 from their 23. But CAR was flagged for defensive holding, which would have given NO 5 yards and an automatic first down at their 21. NO head coach Sean Payton declined the penalty, to the bafflement of many, including the TV announcers.

The game did not hinge on this decision by any stretch. But it's worth taking a look at. The EP model is probably the right tool for the job in this case because it gives a much finer level of precision to down/distance/yd-line situations than the WP model or other approaches.

Using the handy-dandy WP calculator tool (which as a bonus is also an EP calculator), here are the relevant numbers:

What Kind of Teams Are Super Bowl Winners?

What's the profile of a Super Bowl winner in the modern era? Does defense win championships? Are they predominantly elite offenses? Do they have to be above average on both sides of the ball? Are champions always dominant in the regular season? Is your team out of the mix for the Lombardi Trophy?

Here's the plot of regular season Expected Points Added (EPA) for every team from 1999-2013. The horizontal axis represents their offensive EPA per game, and the vertical axis represents their defensive EPA per game. The best teams are in the upper-right quadrant, while the worst are in the lower-left. (Click to enlarge...it's suitable for framing!)

Seahawks Stumble, Should Have Allowed TD

In one of the most anticipated games of the week, the San Francisco 49ers took over down 17-16 to the Seattle Seahawks with 6:20 remaining. After a huge Frank Gore 51-yard run, the Niners lined up for a 1st-and-Goal from the 7-yard line with 2:39 remaining. Seattle had no timeouts remaining. Should the Seahawks have tried to intentionally allow the Niners to score a touchdown? Let's look at Brian's graph for this situation in his intentional TD study:

Saban's Hyperbola: Analyzing Alabama's Long FG Attempt

Way late to the party here, but let's do this because it's so interesting. As every football fan on the planet knows, Alabama attempted a 57-yd FG with 1 sec to play in regulation against Auburn. The kick fell short and was returned for a stunning game-winning TD. The consensus analysis seems to be that the FG attempt wasn't necessarily a bad decision, but the big mistake was that Alabama was not prepared with appropriate personnel to cover a potential return.

Let's look at the FG decision more closely. I won't use the WP model, but instead apply some math and logic. There were three options for Alabama:

1. Kneel
2. Hail Mary
3. Attempt the FG

Let's make some assumptions. First, OT is a 50/50 proposition. Alabama was favored in this game, but Auburn was playing strong. Plus, OT is a bit of a dice roll to begin with. Second, Hail Marys (Maries, Mary's?) from that range are probably no more successful in college than they are in the pros, which is around 5%. Lastly, for the sake of the argument, let's say there is zero chance of a defensive TD return on the Hail Mary.

We don't really know the probability of a successful FG attempt or the probability of a successful return or block & return from a range like that, especially in college ball from a kicker without many attempts. But let's set that aside for a moment.
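The assumptions above are enough to put numbers on the first two options and to write the third as a function of its unknowns. Here is a minimal sketch, not necessarily the calculation the full post goes on to use. The function names and the two FG parameters (`p_make`, `p_return_td`) are illustrative, and it assumes a returned or blocked-and-returned kick loses the game outright while any other miss goes to OT.

```python
OT_WP = 0.50              # assumption from the post: OT is a coin flip
HAIL_MARY_SUCCESS = 0.05  # assumption from the post: roughly the pro rate


def wp_kneel():
    """Kneeling sends the game to overtime."""
    return OT_WP


def wp_hail_mary():
    """Win outright on a completion, otherwise go to OT (no return risk assumed)."""
    return HAIL_MARY_SUCCESS + (1 - HAIL_MARY_SUCCESS) * OT_WP


def wp_field_goal(p_make, p_return_td):
    """Win on a make, lose on a return TD, otherwise go to OT.

    p_make and p_return_td are unknowns, not figures from the post.
    """
    p_overtime = 1 - p_make - p_return_td
    return p_make + p_overtime * OT_WP


print(f"Kneel:     {wp_kneel():.3f}")
print(f"Hail Mary: {wp_hail_mary():.3f}")
```

Under these assumptions the FG attempt beats kneeling whenever `p_make` exceeds `p_return_td`, and it beats the Hail Mary only when `p_make` exceeds `p_return_td` by more than 5 percentage points.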

New Feature: The NYT 4th Down Bot

When the season kicked off in September, I had promised several new and interesting features. The game graphs are now on NBC Sports, we've started a great podcast, added new site features, and added new contributors. But there was one feature that has been in the works until now.

In partnership with the New York Times, I'm pleased to unveil the bane of every football coach in America--The 4th Down Bot. At its heart, it is a real-time application of the 4th Down Calculator feature here at ANS. It uses both the Expected Points model and the Win Probability model to estimate the best option for every 4th down as a game is in progress.

Was Belichick Right to Take the Wind in OT?

I was surprised when Bill Belichick chose to take the second possession (and risk no possessions) in OT against Peyton Manning and a team that had scored 31 points in four quarters. Although the new OT format mitigates the advantage of the team with first possession, it's still there to the tune of about 56% to 44%.

The advantage of wind must have felt fairly strong to Belichick. His team captains thought he was crazy. At the time, it was impossible to tell from the comfort of my sofa how bad the wind was, but I was curious if we could see the effect statistically.

McCarthy Makes New OT Mistake

Kicking a field goal on the two is like kissing your sister.

I could not have said it better myself. Nothing is more annoying when trying to quantify a season and run simulations than coding for a tie. The Packers and Vikings tied 26-26 on Sunday after both teams kicked field goals in overtime. On the opening drive, Matt Flynn led the Packers to the Vikings 2-yard line before Mike McCarthy decided to kick a field goal, sending the game into the "chance to match down three" format. In the new overtime format, was this the optimal decision?

Did Marvin Lewis Make the Right Call on 4th Down in OT?

After a freak Hail Mary TD to tie the game, Cincinnati won the coin flip to start OT and drove to the BAL 33. Facing a 4th and 2, Lewis decided to go for the conversion rather than attempt a 51-yard FG or punt.

Under the old sudden death OT rules, every coach would have undoubtedly attempted the FG in that situation. But under the new OT format, things have changed. Because the opponent gets an opportunity to match a first-possession FG, or to trump it with a TD, long FG attempts are not the percentage play.

If you make it, you've given your opponent all four downs to cruise down the field to respond. Plus, there is no urgency like in other four-down desperation drives because the clock is not a factor. And if you miss the FG, you've given the ball to your opponent in decent field position while triggering sudden death rules. Now, an opponent FG would end the game.

Trestman's 4th and Inches Call

I received a few requests to analyze Marc Trestman's decision to go for it on 4th and inches from his own 32, up by 4 with 7:50 to play in the 4th quarter. So here goes:

Punting would hand GB the ball at or around their own 20 yard line, worth 0.71 WP for CHI.

A successful conversion means a 1st and 10 at CHI's own 33, which would give CHI a Win Probability (WP) of 0.79. And a failed conversion attempt gives GB the ball at the CHI 32, worth 0.51 WP for CHI. [That's a relatively high-leverage situation--a potential swing of 0.28 WP.]

The break-even conversion probability (x) required to make it worthwhile to go for the conversion can be found by setting the value of the punt equal to the total value of the conversion attempt:
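As a quick check of that arithmetic, here is a small sketch that solves punt = x * success + (1 - x) * failure for x, using the three WP figures quoted above. The function name is illustrative and not part of any ANS tool.

```python
def break_even_probability(wp_success, wp_failure, wp_alternative):
    """Conversion probability x at which going for it equals the alternative.

    Solves  wp_alternative = x * wp_success + (1 - x) * wp_failure  for x.
    """
    return (wp_alternative - wp_failure) / (wp_success - wp_failure)


# WP figures for CHI quoted in the post
wp_punt = 0.71     # GB takes over at or around its own 20
wp_convert = 0.79  # CHI 1st and 10 at its own 33
wp_fail = 0.51     # GB takes over at the CHI 32

x = break_even_probability(wp_convert, wp_fail, wp_punt)
print(f"Break-even conversion probability: {x:.2f}")
```

With these inputs x works out to 0.20 / 0.28, or about 0.71, so the attempt pays off if CHI expects to convert more than roughly 71% of the time.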

The 2013 Buccaneers Really Know How to Blow It

This post at Reddit noted that the 2013 Buccaneers have blown 4 games in which they had at least a 0.95 Win Probability (WP). This is the most blown games for any team since at least 1999, and there are still 8 games left to blow this season.

Most readers are familiar with the Comeback Factor (CBF) stat. It measures the unlikelihood of the win at the lowpoint of the game for the eventual winning team. For example, if a team comes back to win from a 0.05 WP, that would be a CBF of 20 (1/.05 = 20). A team that comes back from a 0.01 WP earns a CBF of 100.

On the flip side of that equation is the Blown Game Factor (BGF), a stat which measures how badly a team blows a game. If a team has a 0.95 WP and goes on to lose, its BGF is a 20. It's really no different than Comeback Factor--it's just measured from a different perspective.
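Both stats are simple reciprocals, which a couple of lines can make concrete. This is a sketch based only on the definitions above; the function names are illustrative.

```python
def comeback_factor(lowest_wp):
    """CBF: reciprocal of the eventual winner's lowest WP during the game."""
    return 1.0 / lowest_wp


def blown_game_factor(highest_wp):
    """BGF: reciprocal of the chance of losing at the eventual loser's WP peak."""
    return 1.0 / (1.0 - highest_wp)


# Examples from the text (rounded for display)
print(f"CBF from 0.05 WP: {comeback_factor(0.05):.1f}")
print(f"CBF from 0.01 WP: {comeback_factor(0.01):.1f}")
print(f"BGF from 0.95 WP: {blown_game_factor(0.95):.1f}")
```

Written this way, the symmetry is easy to see: a team's BGF at a given WP equals its opponent's CBF at the complementary WP.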

TB already has 4 games with a BGF of 20 or higher, meaning at one point they had at least a 0.95 WP. The table below lists all the teams in the database (since 1999) with 2 or more games with a BGF of at least 20. That's not the only way to measure total heartbreak, so I included some other numbers.

The Worst 8 - 0 Team of all Time?

CBS's Pete Prisco recently sent out a tweet saying the Chiefs "might be the worst 8 - 0 team I've ever seen."  I thought I would take him up on that by compiling some basic team stats of every 8 - 0 team in a Google spreadsheet and comparing them.

Regardless of what the stats say, you should familiarize yourself with this amazing gif which is probably the only reason the internet needs to exist and is proof-positive that Andy Reid's Chiefs are worth every bit of 8 and OHHH YEEEAAAA.

The spreadsheet will contain the following stats on each team:

Slate: Punt From the Opponent's 26?

I make my first appearance this season at Slate to propose that DAL may have been better off punting from the DET 26 than attempting a 44-yd FG. If DAL could have pinned DET at or inside their own 10, the numbers suggest punting might have been preferable to going for it or trying the FG. It may have even been preferable to making the FG. I also discuss DAL's holding penalty that was the start of the critical path toward an improbable comeback.

...But there’s an extra wrinkle. Strangely, Dallas would have preferred to keep Detroit within 3 points rather than extend its lead to 6. When desperate teams like the Lions with no timeouts remaining get into the outer rim of field goal range, they send in the field goal unit for a long-range attempt. This is an irrational decision, one I discovered the very first time I began looking at win probability numbers. Rather than try to win the game, teams in this situation settle for a tie—or rather, an attempted tie. Even if the field goal attempt is good, it only buys a 50–50 shot at the win in overtime...

Packers' Perfect Third Quarter

After a grown-man run from Adrian Peterson to end the first half, the Green Bay Packers opened the second half leading the dismal Vikings by only a touchdown, 24-17. Aaron Rodgers led the Pack on a 16-play, 80-yard touchdown drive that lasted over eight minutes. During the march, Green Bay converted on three third downs and a fourth down. Let's look at the progression of the drive using our Markov model:

The Myth of Playoff Peyton

Let's play a quick word association game.  I say "Peyton Manning in the playoffs," and you tell me what images your mind conjures up.

For someone who may be the greatest quarterback in NFL history, the picture isn't quite befitting. You probably imagined something like Peyton hanging his head in the snowy Foxboro winter, or slumped over following his pick-six against the Saints in the Super Bowl.  Or perhaps just a general expression of chagrin, like the hilariously petulant "Manning Face."

Yes, Manning does have a losing record in the playoffs for his career (though only eight have had more wins, but who has time to split hairs?).  The next time Manning loses a playoff game will give him the record for most career playoff losses, and fairly or not, that will always be a part of his legacy.

Most rational fans realize that Peyton has only thrown up a handful of postseason clunkers, and that such a small sample size should not significantly affect his standing as an all-time great.  They might defend by saying, "Yeah, Peyton's been a bit worse in the playoffs, but his regular-season numbers are so great that it doesn't matter."

Actually, even that statement would be false.  What people fail to realize is that Peyton has not been any worse in the postseason.  In fact, one could make a fairly convincing argument that he's been one of the two or three best playoff quarterbacks of this generation.

To illustrate this point, let's take off our "Embrace Debate" hats and let the numbers tell the story.

Was BAL's Onside Kick Attempt Smart?

Trailing with 13 minutes left in the 4th quarter vs. PIT, BAL kicked a FG to make the score 13-9. BAL then attempted a surprise onside kick, which was unsuccessful. What do the numbers say?

Surprise onside kick attempts are generally worth the gamble. Based on the Expected Points model, a normal kickoff is typically worth -0.42 EP, as the opponent has a 1st down at about their own 22. A recovered onside kick (1st down at one's own 45) is worth 1.77 EP. A failed onside kick (opponent's 1st down at one's own 45) is worth -2.36 EP. The break-even recovery probability would therefore be:
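The break-even point is the recovery probability p at which the expected value of the onside kick equals that of a normal kickoff: p * 1.77 + (1 - p) * (-2.36) = -0.42. A short sketch of that calculation, using only the three EP values quoted above (the function name is illustrative):

```python
def break_even_recovery(ep_recovered, ep_failed, ep_normal_kickoff):
    """Recovery probability p at which an onside kick matches a normal kickoff.

    Solves  ep_normal_kickoff = p * ep_recovered + (1 - p) * ep_failed  for p.
    """
    return (ep_normal_kickoff - ep_failed) / (ep_recovered - ep_failed)


# EP values from the kicking team's perspective, as quoted in the post
ep_normal = -0.42   # opponent 1st down at about its own 22
ep_recover = 1.77   # kicking team 1st down at its own 45
ep_fail = -2.36     # opponent 1st down at the kicking team's 45

p = break_even_recovery(ep_recover, ep_fail, ep_normal)
print(f"Break-even recovery probability: {p:.2f}")
```

With these inputs p comes out to 1.94 / 4.13, or about 0.47. This is an EP-only view and ignores score and time remaining, which a WP model would account for.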

Jets 'Push' Their Luck

After forcing overtime, the Jets stopped the Patriots on their first drive, reverting to the old OT format - sudden death. Geno Smith and the Jets moved downfield before being stopped for a 4th-and-7 from the New England 38. Rex Ryan had three viable options here, keeping in mind that the next score wins: Kick a low-probability (40% league-wide) 55-yard field goal, attempt to convert a low-probability (42% league-wide) 4th-and-7, or punt the ball deep and risk Tom Brady leading a game-winning drive.

The Jets elected to attempt the field goal. Nick Folk missed wide left, but in a crazy turn of events, New England was penalized 15-yards for an unsportsmanlike conduct "pushing" penalty. Before we get to the penalty, let's talk about the decision. While I almost always advocate going for it in no-man's land, in this situation, I was leaning toward the punt.

For this analysis, I used a combination of my Markov probabilities as well as Brian's overtime win probabilities.

Dolphins Should Have Intentionally Allowed the TD

Leading by a point, the MIA defense was on its heels as BUF drove into FG range with less than 3 minutes to play. With all 3 timeouts, MIA must have been feeling pretty good about its chances of getting the ball back with some time to retake the lead.

BUF ran the ball, forcing MIA to use its timeouts. With one timeout remaining and 2:37 to play, BUF faced a 3rd and 4 at the MIA 28. BUF ran the ball for a 10-yard gain, earning a fresh set of downs and forcing MIA to use its last timeout.

At this point, MIA would have preferred to allow BUF to score a TD. The odds favored scoring a TD of their own in response. Accordingly, BUF should have preferred to take a knee rather than score then.

Here is the chart, built using this research. The red dot is where MIA found itself. The dotted black line is the Win Probability curve for allowing the TD, and for reference, the teal line is the 20 yd line.

Analyzing the Patriots' 4th Down Calls

Kurt Bullard is a freshman at Harvard and a first year member of the Harvard Sports Analysis Collective. He intends to major in either Economics or Statistics. Go 'Cuse.

This Sunday, the Patriots found themselves down six points to the Saints with only 1:13 left in the game. In that span, Tom Brady was able to lead a drive down the field, connecting on a 17-yard pass to Kenbrell Thompkins in the left corner of the end zone to complete one of Boston’s two major come-from-behind wins of the day.

Hidden by the final drive were two controversial fourth down calls in the fourth quarter that happened at key moments in the game. The first came with 8:34 remaining in the game when the Patriots held a 20-17 lead over the Saints. Faced with a 4th-and-goal at the five yard line, New England opted to kick a field goal rather than try to score a touchdown to go up two scores.

The win probability and expected point value each suggest different optimal decisions. The expected point formula suggests that the Patriots ought to have gone for it, while the win probability calculator says otherwise, albeit by a slight margin.

Bengals Punt Away Regulation

Heading into the fourth quarter, the Cincinnati Bengals led the Buffalo Bills handily, 24-10. With just over 10 minutes left, the Bills drove down the field and found themselves in a 4th-and-8 from the 22-yard line. With that much time left on the clock, it is safe to assume that most NFL coaches would kick the field goal. In fact, with backup Thad Lewis under center, the probability of coaches kicking the field goal would increase. The Bills made a gutsy call and went for it, hitting Scott Chandler for a 22-yard strike and scoring. The numbers say it’s a toss up:

Which Teams Should Abandon the Run?

Yeah, yeah, yeah. It's a passing league. We got it. And still, according to the numbers, teams aren't passing enough. In the cases of some teams, it's painfully obvious that they should be passing more and running less. As a Ravens fan, I watched another game where nearly every run was simply a wasted down. Most of their paltry positive rushing yards seem to come from trash draw plays on long distances to gain, intended to mitigate very poor field position prior to a punt. It's like they're playing with two or three downs when everyone else gets four.

I wonder whether, at some point, an offense that is so much better at passing than running should abandon the run almost altogether. On top of the general imbalance in the league, some teams are just throwing away downs when calling conventional run plays. Of course, running and passing generally play off of each other in a game-theory sense. To be successful, passing needs the threat of running, and vice versa. But sometimes the cost of running is so high for some offenses that it would be worth the trade-off to forfeit the unpredictability and just pass nearly every down.

It sounds crazy, but take a look at the Expected Points Added per play so far this season (through the 1pm games on Sunday 10/13). The right-most column is the pass-run split. The bigger that number, the greater the imbalance. Pay particular attention to the teams highlighted in red:

The Truth Is Tony Romo Isn't a Choker

Last week, a premier quarterback threw a devastating fourth-quarter interception that effectively sealed his team's fate in a back-and-forth game.  This critical gaffe perpetuated what has become an increasingly disturbing trend of failures to "come through in the clutch," as critics like to deride.  Despite gaudy statistics, he has a recent track record of playing his worst in big games, raising questions as to whether or not his team's so-called championship window remains open.

Clearly, Tom Brady is the NFL's biggest choke artist.

That last example is just one way real and armchair analysts alike can place selective focus on certain facts to create a skewed perception.  Perhaps no player has had more damage done to his reputation in this manner than Tony Romo.  Romo played the game of his life against the Broncos last Sunday before a single ill-timed mistake reignited accusations of "choking."  As Grantland's Bill Barnwell put it, Romo did not have a perfect game, but rather the "perfect Tony Romo game."

But what does that even mean?  Most people know that Romo is not really as bad as his choker reputation implies.  More sensible fans may even understand that Romo receives far too much blame for the organizational and team-wide failures of the Cowboys.

However, do people know that Tony Romo may actually be the most clutch quarterback in the league, or at least very close to it?  To answer this, let's delve into the murky depths of "clutchness" through a couple different lenses.

Bengals' Big Time Drive

A week after losing to the Cleveland Browns, the Cincinnati Bengals were tasked with taking down the undefeated New England Patriots in Week 5. The game was not enjoyable to watch and had a baseball score of 6-3 heading into the fourth quarter. Needing to separate themselves from the Patriots, the Bengals put together a 15-play, 93-yard, nearly eight-minute drive, ultimately scoring a touchdown to go up two scores, 13-3.

Using our Markov model, we can look at the evolution of the drive:

When Should DAL Have Allowed DEN to Score a TD?

Once Demaryius Thomas crossed the line to gain at the DAL 14 yard line with 1:50 to play, the DAL defense should have intentionally allowed the TD. With 2 timeouts and 1:40+ to play, they would have had a better chance of winning than allowing DEN to choke the life out of them and kick an easy FG for the win.

That's based on this analysis of when teams should intentionally allow a TD when tied.

After Thomas gained a 1st down, DAL correctly called a timeout, setting up a 1st down and 10 at the DAL 11 with 1:49 to play. This chart illustrates DAL's chances of winning based on field position and time remaining with 1 timeout left.

Denver over Dallas: Instant Analysis

Sunday's game was a hard fought contest of wills, but in the end Denver pulled out the much-needed victory over Dallas. Question marks remain for both teams, however. Denver was finally able to find their identity and stick with what works. Although it's not time to panic yet in Dallas, they're going to need to find a way to get it done before it's too late.

Peyton Manning silenced his critics with a solid, although imperfect, outing. "For me it's not about the individual records," he said after the game. "It's all about the team. I'm just glad we were able to go out there and get the win today," he added. Peyton Manning needs to be careful not to look past next week's opponent. Their upcoming match-up will be nationally televised in prime time, a classic trap game.

Tony Romo looked confident at times, but eventually let the game slip out of his grasp. He said after the game, "We didn't bring our A-game today. We didn't step up when we needed to and make the plays we needed to make." Tony Romo showed a lot of courage in the face of adversity on Sunday. Now he needs to put it all together for Dallas to get back among the top teams in the league.

Dallas just can't continue to make the kinds of mistakes they made at this level of the sport. Dallas fans had higher hopes for the defense this season, but if they keep playing at this level significant changes will be needed in the offseason. The offense's performance was also less than inspiring, stalling numerous times in opponent territory.

Dallas lost that game more than Denver won it. Everyone knows football is a game of emotion, and clearly Denver showed they had the passion to come out on top. You could feel the electricity in the stadium from kickoff to the final whistle. It was a game for the ages.

Don't like this column? Generate your own with the ANS Sports Column Generator!

What Are You Doing, Chip?

Cards on the table, I'm a huge Eagles fan. As an NFL stats nerd, I could not have been more excited for Chip Kelly to make the transition to the big leagues. While I did not expect him to immediately institute his Oregon trademarks, I did expect to see him going for it more often on fourth down, especially in situations where the numbers called for it -- and generally, making decisions to maximize the Eagles' win probability.

It's four weeks into the season, and too many times I've asked my TV, "What are you doing, Chip?" Today against the Broncos, there were a couple of questionable decisions. Down 14-3, the Eagles were moving the ball very well to start the game. Vick and company strung together a 15-play drive that ended up with a 4th-and-4 from the Broncos 7-yard line. Using our Markov model, we can look at the progression of the drive: