(Much) More on 1st and 10 Run-Pass Balance

In a recent article I presented evidence that offenses should generally pass more often on first down. Accounting for both the potential gains and the potential risks of each type of play, passing tends to lead to a greater net point advantage than does running. The analysis was based on a concept known as Expected Points, which measures the average point advantage an offense can expect given a down-distance-field position situation.

Expected Points incorporates all the various factors such as turnovers, yardage gains, sacks, incompletions, first down conversions, scoring, and so on. But I thought it would be helpful to dig deeper to investigate how and why passing appears more advantageous. In this article, I'll present a series of graphs comparing running and passing on first down, each one looking at a different facet of the game.

Sunday Night 4th Down

Down by 3 with a little more than 3 minutes left in the game, Ravens head coach John Harbaugh faced a decision with a 4th and 5 at his own 48. Harbaugh decided to go for the first down. Both Al Michaels and Cris Collinsworth agreed Baltimore had no choice but to punt. Was it a good call?

Punts from the 46 net an average of 34 yards, which would put the Steelers at their own 20. In that situation, teams win 83% of the time. A punt gives the Ravens a 0.17 win probability (WP).

Conversion attempts for 4th and 5 situations outside the red zone are successful 48% of the time. An unsuccessful attempt leaves the Steelers with a first down at the Ravens' 46. Teams win 87% of the time from there, leaving the Ravens with a 0.13 WP. A successful conversion would, at worst, give the Ravens a first down at the Steelers 49, worth a 0.37 WP. On net, the conversion attempt is worth:
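The net value is a probability-weighted average of the two outcomes. Here is a minimal sketch using only the figures quoted above; the function name is mine, for illustration. Since 0.37 is the worst case after a conversion, the result is a conservative estimate.

```python
def conversion_attempt_wp(p_success, wp_success, wp_failure):
    """Expected win probability (WP) of going for it on 4th down."""
    return p_success * wp_success + (1.0 - p_success) * wp_failure

# Figures quoted in the text for the Ravens' 4th and 5.
go_for_it = conversion_attempt_wp(p_success=0.48, wp_success=0.37, wp_failure=0.13)
punt = 0.17

print(f"go for it: {go_for_it:.3f}  punt: {punt:.2f}")
```

With these inputs the attempt comes out at roughly 0.25 WP against 0.17 for the punt.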

Andy Reid: Hero

Reid was my goat of the year for 2008 for the season's single worst 4th down decision, but today he's my hero of the day (or at least for the 1st quarter of the 1 pm games). Reid started the game with an unexpected onside kick, which is an underused but very worthwhile tactic. This time it didn't work out, but the Eagles recovered nicely, and now lead 10 - 7.

The linchpin of the Eagles' 1st quarter TD drive was a successful 4th down conversion. With a 4th and 1 on the Redskins' 42, Reid went for it, made it, and went on to take the lead (for the moment).

Team Playoff Probabilities - Week 12

Courtesy of NFL-Forecast.com, here are the latest playoff forecasts. The tables below do not include results from the Thursday games.

It looks like the AFC teams remain relatively firm. As of now, the Colts, Bengals, Steelers, Patriots, Broncos, and Chargers are on the inside track. Baltimore, Jacksonville, and Houston are on the outside looking in.

Not much has changed for the NFC either. The top teams have a tight hold on playoff spots, but the wildcards are up for grabs. The Saints, Vikings, and Cardinals each have a good grip on their divisions if they keep playing well. The Cowboys are the class of the East at the moment, and have a good shot at a wildcard if they don't take the division. The other contenders include the Eagles, Giants, Packers, Falcons, and the 49ers.

Game Probabilities - Week 12

Weekly game probabilities are available now at the nytimes.com Fifth Down. This week I also discuss the major shortcoming of a purely quantitative approach to game predictions.

Happy Thanksgiving, everyone!

Team Efficiency Rankings - Week 12

The team rankings below are in terms of generic win probability. The GWP is the probability a team would beat the league average team at a neutral site. Each team's opponent's average GWP is also listed, which can be considered to-date strength of schedule, and all ratings include adjustments for opponent strength.

Offensive rank (ORANK) is offensive generic win probability, which is based on each team's offensive efficiency stats only. In other words, it's the team's GWP assuming it had a league-average defense. DRANK is a team's generic win probability rank assuming it had a league-average offense.

GWP is based on a logistic regression model applied to current team stats. The model includes offensive and defensive passing and running efficiency, offensive turnover rates, defensive interception rates, and team penalty rates. If you're scratching your head wondering why a team is ranked where it is, just scroll down to the second table to see the stats of all 32 teams.
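As a rough illustration of how a logistic model turns team stats into a GWP, here is a minimal sketch. The coefficients, stat names, and stat values are all made up for the example; they are not the fitted values behind the rankings. The only property carried over from the description above is that a team with exactly league-average stats comes out at 0.50.

```python
import math

# Hypothetical coefficients, invented for illustration only.
COEFS = {
    "off_pass": 0.45,            # offensive passing efficiency
    "off_run": 0.25,             # offensive running efficiency
    "def_pass": -0.45,           # passing efficiency allowed
    "def_run": -0.25,            # running efficiency allowed
    "off_turnover_rate": -20.0,  # offensive turnovers per play
    "def_int_rate": 15.0,        # defensive interceptions per pass
    "penalty_rate": -1.5,        # penalty yards per play
}

def generic_win_probability(team, league_avg, coefs=COEFS):
    """Probability of beating a league-average team at a neutral site."""
    logit = sum(c * (team[k] - league_avg[k]) for k, c in coefs.items())
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical stat lines.
league_avg = {"off_pass": 6.0, "off_run": 4.1, "def_pass": 6.0, "def_run": 4.1,
              "off_turnover_rate": 0.03, "def_int_rate": 0.03, "penalty_rate": 0.40}
good_passing_team = dict(league_avg, off_pass=7.0)

print(generic_win_probability(league_avg, league_avg))         # 0.5 by construction
print(generic_win_probability(good_passing_team, league_avg))  # above 0.5
```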

Offenses Run Too Often On 1st Down

NFL offenses generally run too often on 1st down. Accounting for the relative gains of each play type, and accounting for the risks of turnovers, offenses should pass more. There is currently an imbalance, where teams are too often running directly into defenses that are expecting runs.

Game theory tells us that when there are two strategy options, like run and pass, the expected payoffs for both options should be equal. You really don't need game theory to intuitively understand this. If one option yields a better payoff, then it should be chosen until the opponent responds with a strategy change of his own. Eventually, as the opponent responds, the payoffs for the two options equalize. The point at which the strategy mix equalizes payoffs is known as the minimax, or sometimes called the Nash equilibrium. The resulting strategy mix, or run-pass balance in this case, produces the best overall, long-run payoff.
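To make the equal-payoff idea concrete, here is a minimal sketch of a two-by-two zero-sum game solved for its mixed-strategy equilibrium. The payoff numbers are hypothetical, chosen only so that the equilibrium is interior; they are not estimates from NFL data.

```python
def equilibrium_mix(a, b, c, d):
    """Solve a 2x2 zero-sum game that has no pure-strategy saddle point.

    Payoffs to the offense:
        a = run vs. run defense    b = run vs. pass defense
        c = pass vs. run defense   d = pass vs. pass defense
    Returns (offense run share, defense run-defense share, game value).
    """
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate game: no interior mixed equilibrium")
    run_share = (d - c) / denom       # makes the defense indifferent
    run_def_share = (d - b) / denom   # makes the offense indifferent
    value = (a * d - b * c) / denom
    return run_share, run_def_share, value

# Hypothetical payoffs (say, expected points gained per play).
a, b, c, d = 0.0, 0.3, 0.5, 0.1
run_share, run_def_share, value = equilibrium_mix(a, b, c, d)

# Against the equilibrium defense, running and passing pay the same.
run_payoff = run_def_share * a + (1 - run_def_share) * b
pass_payoff = run_def_share * c + (1 - run_def_share) * d
print(run_share, run_def_share, run_payoff, pass_payoff)
```

The formulas only apply when both shares land strictly between 0 and 1. Otherwise one side has a pure strategy that is always better, and there is nothing to mix.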

When there are two strategy options and one of them yields a much higher payoff, it tells us two things. In this case, passing is more lucrative than running on 1st down, and this tells us: 1) offenses should be passing more often, and 2) for now, defenses should continue to be more biased toward stopping the run.

Last Thoughts on the 4th and 2

At the risk of being accused of milking this thing, here are some final few thoughts on the topic. I watched the Football Night in America segment and a few things struck me.

Let me say I have a lot of respect for Coach Dungy. I also think that convincing someone skeptical is difficult, and we shouldn't expect someone to immediately come around. In general though, coaches (and former players/analysts) benefit from a perception that football is some unknowable mystery, and that they are the only priests that can divine the true answers.

From Dungy's perspective, he's enjoyed a long career by doing it the conventional way. But the reason his way worked was because everyone else played the same way. To change his mind now might mean admitting, "I did it wrong all these years." That's a tough hurdle to overcome.

Team Playoff Probabilities - Week 11

Courtesy of Chris at NFL-Forecast.com, here are the latest playoff forecasts. The tables below do not include results from the Thursday night Dolphins-Panthers game.

It looks like the AFC teams are relatively firm. As of now, the Colts, Bengals, Steelers, Patriots, Broncos, and Chargers are on the inside track. Baltimore, Jacksonville, and Houston are on the outside looking in.

The top NFC teams have a tight hold on playoff spots, but the wildcards are up for grabs. The Saints, Vikings, and Cardinals each have a good grip on their divisions if they keep playing well. The Cowboys are the class of the East at the moment, and have a good shot at a wildcard if they don't take the division. The other contenders include the Eagles, Giants, Packers, Falcons, and even the 49ers.

ESPN Interview

ESPN's Sunday NFL Countdown will be doing a piece on 4th down decisions this weekend. The focus will center around Bill Belichick and his thoughts on the topic in an interview they did with him a few years ago. At the time, they also interviewed Dr. David Romer, author of 'Do Firms Maximize.' Yesterday, ESPN re-interviewed Romer, and interviewed me too. I'm anxious to see how it turns out.

I'm told it will probably air a few minutes before noon EST on Sunday. Set your DVRs now!

ESPN also talked about my take on '4th-and-2-gate' on the Monday Night Football pre-game earlier this week. A lot of people were harsh on Matt Millen and Steve Young for their comments, but I think they asked exactly the right questions. Young said, 'Is that in context? I'd want to see that in context.' I presume he wants to know if the score and time remaining were considered. Millen asked, 'But does that take into account the Colts offense?' They were rightfully skeptical, but they zeroed in on the heart of the matter immediately. I have to give them a lot of credit.

'Patriots Lead Colts at Halftime' - The Onion

The guys at the Onion have previously made their opinions known about 4th down decisions. This time they take on the Patriots' 4th down call from Sunday night. Too funny for words.



Belichick told reporters..."The only time they've been able to stop us is on short-yardage passing plays, so if we're careful to execute and avoid any situation where we give Peyton Manning excellent field position, I'm extremely confident we'll leave here with a 'W.'"

Thanks to Justin for the pointer.

Game Probabilities - Week 11

Weekly game probabilities are available now at the nytimes.com Fifth Down. This week I also look at what's plaguing the Jets, and how it affects their match-up against the Patriots this week.

Belichick 4th Down Follow-Up

Thanks to the commenters on the original post, all 200+ of you, for all the great criticisms and suggestions. The Internet is a big place, and although not every comment was helpful, I was pleased to see such a thoughtful and constructive debate. In this follow-up, I'll try to address some of the most common criticisms and questions. I suspect some skeptics will never be fully satisfied, but I feel many of the comments deserve a response.

1. You used 38 yds for the expected punt distance, the average punt distance for that region of the field. But Patriots punter Chris Hanson had averaged 44 yds on 4 punts. Why not use that distance instead?

Team Efficiency Rankings - Week 11

The team rankings below are in terms of generic win probability. The GWP is the probability a team would beat the league average team at a neutral site. Each team's opponent's average GWP is also listed, which can be considered to-date strength of schedule, and all ratings include adjustments for opponent strength.

Offensive rank (ORANK) is offensive generic win probability, which is based on each team's offensive efficiency stats only. In other words, it's the team's GWP assuming it had a league-average defense. DRANK is a team's generic win probability rank assuming it had a league-average offense.

GWP is based on a logistic regression model applied to current team stats. The model includes offensive and defensive passing and running efficiency, offensive turnover rates, defensive interception rates, and team penalty rates. If you're scratching your head wondering why a team is ranked where it is, just scroll down to the second table to see the stats of all 32 teams.

Should the Patriots Have Let the Colts Score?

After the Patriots' failed 4th down conversion attempt in the now epic game against the Colts, they wound up facing a Colts offense with a 1st down on the Pats' own 14-yard line. With 1:20 on the clock, the Patriots could have allowed the Colts to score an easy touchdown, yielding a 1-point lead, but giving the Patriots a shot to take the lead back with a possible FG drive. Or, they could have played straight defense, hoping to keep the Colts out of the end zone.

I'll take a look at league-wide averages, and then we can make adjustments from there. Against a team needing a TD, playing straight defense yields a TD about 62% of the time. This means the Patriots would have a 0.38 Win Probability (WP).

The Two-Minute Drill

In my recent posts on the Maurice Jones-Drew kneel-down and the Belichick 4th down decision I cited a few stats about how likely an offense that needs a touchdown to tie or win is to score one with about 2 minutes remaining in the game. Normally I like to back up obscure stats like that with some firmer context, and I was finally able to get around to it for these numbers.

The chart below graphs the percentage of time a team that needs a TD, defined as being down by 4 through 8 points, gets a TD on its current drive given a 1st down at various field positions with 2:00 +/- 15 seconds left. (I know that's quite the run-on sentence, but I don't know how else to say it.)

MJD Taking a Knee

With just under two minutes remaining against the Jets, Jaguars RB Maurice Jones-Drew broke free for a go-ahead TD. Instead of plunging into the end zone he took a knee at the one-yard line. This decision allowed the Jaguars to run out the clock before kicking a chip-shot FG for the win. A TD would have allowed the Jets almost two minutes to answer with a TD of their own. How important was MJD's decision?

The analysis was a little more complicated than I thought because had the Jaguars scored the TD, they would have gone for the 2-point conversion. With a single XP, a Jets TD wins. With a failed 2-pt conversion, a Jets TD also wins. But with a successful conversion, a Jets TD only ties, so there is nothing lost and a lot to gain for the Jags by going for the 2.

Belichick's 4th Down Decision vs the Colts

New England coach Bill Belichick is taking a lot of heat for his decision to attempt a 4th down conversion late in the game against the Colts. Indianapolis came back to win in dramatic fashion. Was the decision a good one?

With 2:00 left and the Colts with only one timeout, a successful conversion wins the game for all practical purposes. A 4th and 2 conversion would be successful 60% of the time. Historically, in a situation with 2:00 left and needing a TD to either win or tie, teams get the TD 53% of the time from that field position. The total WP for the 4th down conversion attempt would be:
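The arithmetic can be sketched from the two figures above. I'm assuming here that a successful conversion is worth a WP of 1.0 (it "wins the game for all practical purposes") and that after a failed attempt the Patriots win whenever the Colts do not score the TD. The function name is illustrative.

```python
def fourth_down_attempt_wp(p_convert, p_opponent_td_after_failure, wp_if_converted=1.0):
    """WP of going for it: win outright on a conversion, otherwise
    win only if the opponent fails to score a TD on the short field."""
    wp_if_failed = 1.0 - p_opponent_td_after_failure
    return p_convert * wp_if_converted + (1.0 - p_convert) * wp_if_failed

attempt_wp = fourth_down_attempt_wp(p_convert=0.60, p_opponent_td_after_failure=0.53)
print(f"{attempt_wp:.2f}")
```

Under those assumptions the attempt is worth about 0.79 WP.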

Game Probabilities - Week 10

Weekly game probabilities are available now at the nytimes.com Fifth Down.

Efficiency Rankings - Week 10

The team rankings below are in terms of generic win probability. The GWP is the probability a team would beat the league average team at a neutral site. Each team's opponent's average GWP is also listed, which can be considered to-date strength of schedule, and all ratings include adjustments for opponent strength.

Offensive rank (ORANK) is offensive generic win probability, which is based on each team's offensive efficiency stats only. In other words, it's the team's GWP assuming it had a league-average defense. DRANK is a team's generic win probability rank assuming it had a league-average offense.

GWP is based on a logistic regression model applied to current team stats. The model includes offensive and defensive passing and running efficiency, offensive turnover rates, defensive interception rates, and team penalty rates. If you're scratching your head wondering why a team is ranked where it is, just scroll down to the second table to see the stats of all 32 teams.

The Value of 1st Down and 5 Situations

In a recent post I illustrated how to decide whether to accept or decline a penalty based on expected point (EP) values. I compared 1st down and 5 situations, often resulting from a defensive offside, with 2nd down situations.

I had to assume a roughly linear point advantage for the 1st and 5 situations because there weren't nearly as many in the data as other situations. But when I actually plotted out the empirical average EP values for 1st and 5s at each yard line, I noticed something strange. Compared to the curve of 1st and 10 values (for which there is an abundance of data), the curve of 1st and 5 EP values was oddly shaped. As we'd expect, there is a fair advantage for 1st and 5s from a team's own 20 through the opponent's 45 or so. But from there all the way to the opponent's 5, the value of a 1st and 5 is apparently lower than that for a 1st and 10.

Second-Guessing Coughlin and Reid

Toni Monkovic at Fifth Down questions the late-game decision-making of Tom Coughlin and Andy Reid in their respective losses Sunday. Let's see what WP says.

The Giants had an opportunity to put away the Chargers with a touchdown to put the game out of reach. They went conservative, opting for the FG and a 6 point lead. The lead didn't hold up as the Chargers were able to drive for a game-winning TD.

After suffering a penalty on 1st down, the Giants had a 1st and goal from the Chargers' 14 yard line. After a run and a pass for no gain, the Giants faced a 3rd and goal from the 9. Run plays on 3rd and goal from the 9 are TDs 13% of the time, while pass plays are TDs 20% of the time. Passes from there are intercepted about 4% of the time.

Accept or Decline: 1st & 5 vs 2nd & Short

Your team has just run the ball for a hard-fought 8-yard gain on 1st down, bringing up a 2nd and 2. The defense was flagged for offsides, giving you the option of making it 1st and 5. Should your team accept the penalty, or decline it and take the gain?

To simplify things, let’s only consider situations between the 20-yard lines. Looking at each option in terms of the probability of converting for a 1st down, you should be indifferent. Both situations convert equally often, at an 85% rate. But 1st down probability isn’t the whole story.

Another Run-Pass Balance Study

Benjamin Alamar, author of the Passing Premium paper (critiqued here), takes a second stab at comparing the values of running and passing with new research. A brief outlining his methodology and findings was presented at a recent symposium on sports statistics. This time Alamar uses expected points as a measure of value, and compares the EP values gained by passes with those of runs. He defines risk as the probability each play type will result in a negative EP change, and finds that running is both less productive and "riskier" than passing. There are three fairly big problems in the methodology. Fortunately, all three can be rectified.

First, Alamar creates his EP values using a simple linear regression (see slide 10). Using down, distance, and yardline (plus other variables controlling for quarter and other effects), he produces an EP equation. This is a really bad way to estimate EP values. They're not linear, and there are any number of interacting effects within. The Levitt-Kovash paper, in contrast, uses a regression with quintic terms and full interactions to create their EP values. (I prefer a more direct method--looking at the data empirically and smoothing it using a method called LOESS.)
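For readers unfamiliar with LOESS, the idea is to fit a small weighted regression around each point of interest instead of one equation for the whole field. The sketch below is a simplified version (local linear fit, tricube weights, no robustness iterations) written in plain Python. The data are synthetic stand-ins for raw average EP by yard line, not real values.

```python
def loess_point(xs, ys, x0, frac=0.3):
    """Smoothed value at x0 from a weighted linear fit to the nearest points."""
    k = max(2, int(round(frac * len(xs))))
    # Keep the k observations closest to x0.
    nearest = sorted((abs(x - x0), x, y) for x, y in zip(xs, ys))[:k]
    # Bandwidth: just past the farthest neighbor, so it keeps a small weight.
    h = max(nearest[-1][0], 1e-12) * 1.01
    sw = swx = swy = swxx = swxy = 0.0
    for dist, x, y in nearest:
        w = (1.0 - (dist / h) ** 3) ** 3  # tricube weight
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    denom = sw * swxx - swx * swx
    if abs(denom) < 1e-12:  # all neighbors share one x: use the weighted mean
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x0

# Synthetic, made-up "raw EP by yard line": a curve plus alternating noise.
yard_lines = [float(y) for y in range(1, 100)]
raw_ep = [-1.0 + 0.0006 * y * y + (0.3 if int(y) % 2 else -0.3) for y in yard_lines]
smoothed = [loess_point(yard_lines, raw_ep, y) for y in yard_lines]
print(smoothed[49])
```

The `frac` parameter controls how much of the data each local fit sees: larger values give a smoother curve, smaller values follow the raw averages more closely.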

Roundup 11/8

A new site from the Harvard Sports Analysis Collective looks promising. (But what's with the term "Collective?" What happened to "Club?") Here they look at 2-point conversion decisions. They also look at what effect the new wedge rule might be having on kickoff distances (not much, especially if you consider the weather is about to get a lot worse). Do teams with RB committees run better than teams with a feature back? How much more accurate are FG attempts when kicked indoors? Overall, the site asks some neat questions and is definitely headed in the right direction.

Gregg Easterbrook on coaching: "One factor here is the Illusion of Coaching. We want to believe that coaches are super-ultra-masterminds in control of events, and coaches do not mind encouraging that belief. But coaching is a secondary force in sports; the athletes themselves are always more important. TMQ's immutable Law of 10 Percent holds that good coaching can improve a team by 10 percent, bad coaching can subtract from performance by 10 percent -- but the rest will always be on the players themselves, their athletic ability and level of devotion, plus luck."

I'd mostly agree with Easterbrook; however, I'd say that a good coach can make a good team 10% better, but a bad coach can absolutely ruin a good team. In most cases though, the NFL rarely features coaches bad enough to do that kind of damage simply because the league is considered the top of its profession. I think the bigger point is that modern coaching, beyond the leadership aspect, is mostly about the illusion of control.

Game Probabilities - Week 9

Weekly game probabilities are available now at the nytimes.com Fifth Down. This week I also lead in with a breakdown of the Chargers-Giants matchup.

Efficiency Rankings - Week 9

The efficiency rankings are beginning to settle down approaching the midpoint of the season. Big movers include San Diego climbing from 8 to 5 and Arizona dropping like a rock from 11 to 19.

The Cards played poorly against a bad team, and most teams are tightly bunched around the league average. So it doesn't take much movement in GWP to drop a number of spots from the top to the bottom of the "pack."

There continues to be a correlation between opponent-adjusted team strength and strength of schedule. This indicates that the death of parity is greatly exaggerated. It seems that so far this year there have been a disproportionate number of games between the very best and very worst teams. Ironically, this may indicate parity is stronger than ever. The scheduling system is supposed to give the top teams two extra games against other division leaders, and the bottom teams two extra games against the other dwellers. If a dweller or two turns into an elite team, or if an elite team suddenly becomes a doormat, the current disparity is what we'd get. Besides, parity isn't just about keeping teams from having very good or very bad records in any one year. It's also about making sure they aren't the same teams year after year.

Team Offense Win Probability Added 2000-2008

Recently I looked at team defense through the lens of Win Probability Added (WPA). Every play changes a team's chances of winning, and we can sum all the ups and downs into a total WPA for any player, team, offense, or defense. In this post, I'll look at team offense.
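The bookkeeping behind WPA is simple: each play is credited with the change in win probability it produced, and the changes are summed per unit. Here is a minimal sketch. The play-by-play rows and the `(unit, wp_before, wp_after)` layout are hypothetical, with WP measured from that unit's own team's point of view.

```python
from collections import defaultdict

def total_wpa(plays):
    """Sum per-play win probability changes for each credited unit.

    `plays` is an iterable of (unit, wp_before, wp_after) tuples.
    """
    totals = defaultdict(float)
    for unit, wp_before, wp_after in plays:
        totals[unit] += wp_after - wp_before
    return dict(totals)

# Made-up plays for illustration only.
plays = [
    ("Team A offense", 0.50, 0.56),  # long completion
    ("Team A offense", 0.56, 0.52),  # sack
    ("Team B offense", 0.48, 0.41),  # interception thrown
    ("Team A offense", 0.59, 0.70),  # touchdown
]
wpa = total_wpa(plays)
for unit, value in sorted(wpa.items()):
    print(f"{unit}: {value:+.2f}")
```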

As with defenses, one offense might be the best in yards and another might be the best in yards per play. Yet another offense might be the best in terms of points. WPA can cut through all that, and tell us which squads made the biggest impacts on winning.

WPA captures the things that other stats cannot. For example, consider an offense leading by one point with 3 minutes left in the game. A series of modestly successful runs to convert a first down and kill the clock won't make fantasy fans happy, but it effectively clinches the win. A stop and a punt would typically give the opponent over a 30% chance of winning, so that grinding first down conversion may mean more in terms of winning than almost any other series all season.

Team Defense Win Probability Added 2000-2008

We've seen how a win probability model can help coaches make better decisions, but it can also help settle some water cooler debates around the office. One of the cool things we can do with a win probability model is compare teams and players based on how much they contribute to their chances of winning. If we sum up all of a defense's contributions toward winning the game, we can truly rank defenses.

One team might have the best defense in terms of total yards, and another might have the best defense in terms of yards per play. Yet another might be the best in terms of points allowed. All of these ways of comparing defenses are flawed in one way or another. For example, a defense paired with a poor offense will be facing short fields frequently.

Win Probability Added (WPA) can account for these various considerations and provide a very good estimate of how good a squad actually was. WPA should be considered a 'narrative' stat. It measures what actually happened and doesn't attempt to do any more than that. I'll start by looking at defenses since 2000, as far back as my data goes.

An Open Letter to Dan Dierdorf

And to Brian Billick. And to all the other football analysts who use the word prideful. The Baltimore defense is not prideful. The Panthers offense is not prideful. I know you mean "full of pride," but there's already a word for that. It's called proud.

I realize that prideful has wormed its way into our lexicon, and now dictionaries even consider it a word, thanks primarily to your efforts. But please, what's wrong with just using proud?