## The 4th Down Study - Part 4

This is the fourth and final part of my article on 4th down decisions. In the first part, I reviewed the concept of Expected Points and the concept of expected utility. The second part detailed the kicking game and its expected values. The third part explored the value of 4th down conversion attempts. This final part puts all the concepts together. I also discuss some of the explanations for why coaches are so reluctant to go for it when they should.

Putting It All Together

To build a chart of general recommendations showing where teams should go for it or kick, we can simply repeat the analysis from Part 3 for each yard line and distance to go. We'll start by plotting the EP values for kicks from various yard lines. First, here are the values for punts:

And here are the values for FGs:

The graph for 'go for it' attempts is a little trickier. While punts are the same value regardless of distance to go, the value of a conversion attempt is highly dependent on it. The colored curves plotted below correspond to the EP values for each distance to go.

Now, let's put it all together and overlay the graphs for the kick values.

Wherever the value lines for the conversion attempt are above or overlap the value lines for kicking, the decision should normally be to go for it. Remember, we assumed a successful conversion would be exactly at the first down marker and no further, which means the tie goes to ‘go for it.’ The final graph below charts the recommended option for each field position and distance to go combination. On the line or below it, a coach should go for the 1st down.

That chart is the bottom line, the take-away. It says that coaches should normally be far more aggressive on 4th down.
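The decision rule described above (go for it wherever the conversion value is at or above the best kicking value) can be sketched in a few lines of Python. This is only an illustration: the function names and every number below are made up for the example and are not read off the charts.

```python
def go_for_it_value(p_convert, ep_if_converted, ep_if_failed):
    """Expected points of a conversion attempt, weighted by the success chance."""
    return p_convert * ep_if_converted + (1.0 - p_convert) * ep_if_failed


def recommend(ep_go, ep_punt, ep_fg):
    """Pick the option with the highest EP; a tie with the best kick goes to 'go'."""
    best_kick = max(ep_punt, ep_fg)
    if ep_go >= best_kick:
        return "go"
    return "fg" if ep_fg > ep_punt else "punt"


# Hypothetical inputs for a single field position and distance to go.
ep_go = go_for_it_value(p_convert=0.6, ep_if_converted=2.0, ep_if_failed=-1.5)
print(recommend(ep_go, ep_punt=0.3, ep_fg=-0.2))
```

Repeating the comparison for every yard line and distance to go, with real EP values in place of the placeholders, is what produces a recommendation chart.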

So Why Are Coaches So Stubborn?

If the benefit of going for it is so clear, why are coaches choosing to kick so often? The authors of The Hidden Game of Football suggest that the current 4th down doctrine in football is a hold-over from the early days of the sport. Back in the day, teams were lucky if they mounted one successful scoring drive all game. A good punt virtually ensured the opponent wouldn't score on their ensuing possession.

David Romer's explanation goes a step further. He suggests that coaches are thinking more about their job security than their team's chances of winning. Coaches know that if they follow age-old convention by kicking and lose, then the players get most of the blame. But if they defy convention and go for the 1st down and fail, even if it was the best decision, they'll take all the criticism.

I buy both of those explanations, plus I'll throw in my own take. In addition to the natural conservatism of coaches, I believe much of the reason why coaches don't go for the conversion more often can be explained by Prospect Theory. As I outlined in my Decision Theory article, people tend to fear a loss more than they value an equivalent gain. This built-in loss aversion means that coaches are biased toward kicks rather than conversion attempts. They understandably view an unsuccessful conversion attempt as a 'loss' and a successful one as a 'gain.' But people naturally tend to exaggerate the consequences of a loss, and this favors the conservative decision.

Do I expect coaches to do all this math on the sideline? Of course not. What I hope is that some coaches will one day see research like this and reset their baseline 4th down paradigm.

End Notes

The 37 yard line should be the nominal league-average boundary between FGs and punts.

All data are from official NFL gamebooks for all non-preseason games from 2000 through 2008.

This analysis only applies to ‘typical’ game situations when the score is relatively close, time is not expiring, and weather is not a large factor. With time expiring or if one team has a large lead, a different type of analysis is required. An analysis based on Win Probability can be generalized to any game situation.

This type of analysis can be tailored to any team’s specific characteristics, or opponent characteristics. For example, the Expected Points curve, 4th down conversion probability, and FG range and accuracy can be customized to produce a chart specific to a particular game.

1. Anonymous says:

What do the graphs look like if you substitute regressions for real data? I'm especially curious about the final graph, which has obvious abnormalities that would be smoothed out.

2. Brian Burke says:

Those are regressed. It's a LOWESS regression, which is actually a series of locally weighted regressions bound together. You have to balance the degree of the smoothing very carefully so as not to over-smooth. You can get very straight and pretty lines and lose a lot of signal.
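For readers unfamiliar with LOWESS, the idea of "a series of locally weighted regressions" can be sketched in plain Python. This is a simplified version (tricube weights, local straight-line fits, no robustness iterations) and is not the code used for the charts; the function name and the `frac` parameter are illustrative choices, and the sample data are made up.

```python
def lowess(xs, ys, frac=0.5):
    """Smooth ys by fitting a weighted straight line around each x in xs.

    frac is the share of points that get non-zero weight in each local fit;
    larger values smooth more heavily.
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours per local fit
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (guard against zero).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights fall to zero at the edge of the neighbourhood.
        w = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


# Made-up noisy values, smoothed lightly and heavily.
xs = list(range(10))
ys = [0.0, 1.4, 1.8, 3.3, 3.9, 5.2, 5.8, 7.1, 8.3, 8.8]
print(lowess(xs, ys, frac=0.3))
print(lowess(xs, ys, frac=0.9))
```

Raising `frac` toward 1 pushes every local fit toward one global straight line, which is the "very straight and pretty lines" end of the trade-off.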

3. Anonymous says:

It seemed on Monday that Belichick was following your advice. Was this game an outlier for him, or does he always coach with a more rational evaluation of risk?

4. Matt says:

If Belichick does seem to follow these rules much more often than all the other NFL coaches, could that be an argument to explain his consistent wins above the expected wins you derive? Using these strategies wouldn't really make the stats much better, but could definitely increase your wins by a couple a year. Maybe this is why Belichick consistently beats the odds, rather than cheating.

ps. I hate the pats, but I would like to rule this out before assuming someone is cheating... you know, innocent until proven guilty.

5. JKL says:

Random assorted thoughts:

-The recommended options chart doesn't seem to be clear on this at distances of 5 yards or less from the end zone, but it appears that the decision inside the five should almost always be "go", even with the reduced success rate from the compressed field.

-I'll add another reason, sort of related to yours, that coaches don't go for it enough. I think they are poor evaluators of risk in the kicking game. They assume kickers should always be perfect, and grouse when they are not, while not expecting that same perfection from the field players. That field goal might be the right call, if we assume the kicker will make it. You can't make that assumption. That punt from the opponent's 40 might be justified if you are guaranteed to down it inside the 5. But you're not, far from it, as your net punting values show. Punts and field goals should never get blocked or shanked or returned for touchdowns, except they are, all the time. So while coaches are risk averse, they don't seem to account for the negative outcomes that result from the kicking decision, while overemphasizing the negatives of the go for it decision.

-Some coaches will go for it in the mid-30's on their opponent's side more aggressively, but this shows "no man's land" (i.e., where punting and kicking are bad options) to be broader than most believe. The yards gained by a team past midfield but before it gets inside the opponent's 30 are, like a box of candy, empty calories. They don't ultimately help you much if you're not willing to add something to them, namely a willingness to go for it if need be. Do you happen to have the data for when a) a team punts from its opponent's side of the field, b) holds that opponent without a first down and forces a punt, and c) where that original punting team then takes over the next drive? My guess is they are ceding about 20 yards of field position on average in that "exchange of punts" scenario thanks to the difference in net punt yards + the yards the opponent gained short of a first down. And that's a best case scenario (well, short of the other team turning it over). Many teams will pick up at least one first down following that punt.

6. Anonymous says:

Awesome work, these posts are probably the best, simplest, easiest-for-the-lay-public to understand stuff I've ever read on 4th down conversion attempts. Would it be possible to see what the graph looks like using Win Probability rather than Expected Points? I guess that entails a different graph for every point in the game, so maybe just an example from the first quarter of scoreless games? It would be cool to see it near the end of the game too for various situations as I'm sure that gets a lot different, though obviously you're adding in tons of other variables at that point. Thanks for the excellent work.

7. Brian Burke says:

Thanks. You sound like my mom!

JKL-That's a great way to put it--"empty yards." I haven't really thought of it that way. You'd think that would show up as some sort of dip or flattening in the expected points curve just inside the 50. I've got the data you mention, but to code the logic to identify those drives would be tough.

Matt-I looked at the Pats' 4th down decisions a few weeks ago, and Belichick doesn't stand out as more aggressive than other coaches. However, he's consistently had a winning team, so he finds himself in desperate situations infrequently. This would make it look like he doesn't go for it often. Regarding the cheating--we know he did. It's just a question of how far it went.

8. James says:

This is a partial repost from part 3.

If a coach were to adopt this strategy, how much would that affect your EP model? Because it is based upon the past results of conservative coaches kicking and punting frequently, a coach that went for it on 4th down more often is less likely to kick a field goal, particularly as he gets closer to the end zone as 1st downs have increasingly higher EPs. Wouldn't this result in a different EP? Whether it's higher or lower I can't say, but it seems unlikely it would be the same.

9. Brian Burke says:

James-Yup. It would steepen the EP curve and create higher point values. See this post.

10. Dave says:

This might be a lot of work, but I'll suggest it anyway:

It would be interesting to compare your recommended 4th down strategy directly against what coaches do in practice on 4th downs. You could do it this way:

For all 4th down situations (pairs of "to go" and "distance from end zone") in which your system recommends going for it, calculate the proportions that coaches went for it, kicked a FG, and punted. Repeat these measurements for when your system recommends kicking a FG, and when it recommends punting. You could plot it on one graph as clustered columns (3 clusters of 3 columns).

You could further break it down into separate graphs by sections of the field, or by yards to go. Lots of ways here to graphically show just how often, and in precisely which ways, coaches are too conservative compared to the recommendations of your system.
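A comparison like the one proposed here could be tabulated along the following lines. This is a minimal sketch only: the `(recommended, actual)` pair format and the sample plays are hypothetical, not drawn from the gamebook data.

```python
from collections import Counter, defaultdict

CHOICES = ("go", "fg", "punt")


def tabulate(plays):
    """For each recommended option, return the share of each actual choice.

    plays is an iterable of (recommended, actual) pairs.
    """
    counts = defaultdict(Counter)
    for recommended, actual in plays:
        counts[recommended][actual] += 1
    table = {}
    for recommended, c in counts.items():
        total = sum(c.values())
        table[recommended] = {choice: c[choice] / total for choice in CHOICES}
    return table


# Hypothetical 4th down plays: (what the chart recommends, what the coach did).
sample_plays = [
    ("go", "punt"), ("go", "punt"), ("go", "go"), ("go", "fg"),
    ("punt", "punt"), ("fg", "fg"),
]
for recommended, shares in tabulate(sample_plays).items():
    print(recommended, shares)
```

Each row of the resulting table corresponds to one cluster of three columns in the suggested chart.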

11. Jeff Clarke says:

There's another interesting point about Belichick.

Getting fired is not a realistic fear in his case. People are a lot less risk averse when the stakes are lower.

If I offered you a coin flip, heads I pay you \$100, tails you pay me \$50, I can guarantee you would take it. The expected value is obvious.

If I changed it to heads I triple your life savings, tails you give me everything you own, you'd probably be a hell of a lot more hesitant.

Most coaches realize that their job security is very tenuous and don't want to do anything to risk it.

Belichick knows that one bad game won't cost him his career and can take a few more calculated risks.

Of course, there is an obvious chicken and egg thing going on here. Do winning coaches take more risk because they have job security? Or do they have job security because they win more games from calculated risks?
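The two coin flips in this comment can be put into numbers. The sketch below computes the plain expected value of each bet and, as one common way to model the hesitation over the second one, the expected logarithm of final wealth. The log-utility model and the 10,000 starting wealth are assumptions added for illustration; they are not part of the comment.

```python
import math


def expected_value(outcomes):
    """Expected change in wealth; outcomes is a list of (probability, change)."""
    return sum(p * change for p, change in outcomes)


def expected_log_wealth(wealth, outcomes):
    """Expected log of final wealth; minus infinity if any outcome means ruin."""
    total = 0.0
    for p, change in outcomes:
        final = wealth + change
        if final <= 0:
            return float("-inf")
        total += p * math.log(final)
    return total


wealth = 10_000  # hypothetical life savings
small_bet = [(0.5, 100), (0.5, -50)]           # win 100 or lose 50
big_bet = [(0.5, 2 * wealth), (0.5, -wealth)]  # triple the savings or lose it all

for name, bet in (("small", small_bet), ("big", big_bet)):
    print(name, expected_value(bet), expected_log_wealth(wealth, bet))
print("no bet", math.log(wealth))
```

Both bets have a positive expected value in dollars, but under the log-wealth assumption only the small one beats declining to play, because the big one carries a chance of total ruin.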

12. Jeff Clarke says:

Brian,

Can you clarify the opening slope on the graph?

It appears to be saying you should go for it less often within your own 6 yard line.

I think what it is really saying is always go for it on 4th and 4 from the 4 but since it doesn't have data for 4th and 5 from the 4 it puts that into the FG category.

If this is true then it's probably easier to just say always go for it inside the 6 and have the line come down when it starts to change at the 6.

If it's not true and I'm misreading the graph and the FG is the appropriate choice anywhere within the 6, that's an interesting finding which might deserve further follow up.

13. Rob says:

I posted something similar to the following on the "Are coaches too timid" article. In that one, you found the odd result that heavier favorites appear to play higher-variance strategies, even though as a favorite you'd generally want to avoid variance in the outcome. Things like this 4th down study might help to explain this.

You've come to the conclusion that coaches should go for it more often on 4th down, since doing so increases their expected points. But this strategy also has higher variance. So, a team adopting this strategy would win more, and therefore be a favorite more often, and also would be playing a higher variance strategy.

14. Anonymous says:

Actually, Brian, Spygate has already been cleared up. To those of you who don't understand the situation, let's be clear. Teams ARE allowed to film the other team's signals. The only dispute was that the angle of the shot (from the endzone) allowed a view of the field of play, sidelines, and signals all in one tape. Normally, teams film all of these aspects individually. (Yes, even YOUR team.)

In 2008, Scouts Inc. released an article stating:

"After reviewing the material released by the league, this much is clear: We saw nothing in that video that would allow us as a scouting department to provide a team with an unfair advantage over an opponent. Yes, preparation time was reduced and film study was streamlined, but not in a way that single-handedly turned the Patriots into one of the premier teams in the league."

Check out the article at:
http://sports.espn.go.com/nfl/news/story?id=3394809

15. Brian Burke says:

Thanks for clearing that up, Mr. Kraft.

16. Anonymous says:

To be sure I understand your final chart...

It's 4th and 2 on my own 16 yard line - I should go for it every time? (the line is at 3 yards there.)

That is a bit shocking...

17. Anonymous says:

And to further my just posted observation - another way to look at this is that the chart says ALWAYS go for fourth and one (or less), regardless of field position.

ALWAYS.

Wow, that is a bit of a revelation.

Of course, where does specific team knowledge come into play? These calculations were based on performance across the whole league. Shouldn't that be tempered by knowledge about a specific team - e.g., we have a weak OL going up against the best run D in the league, maybe we shouldn't go for 4th and 2 on our own 16??

18. Brian Burke says:

Yes. And Romer's paper was just as aggressive deep in your own territory. It is shocking, but keep in mind the other side of the equation. The opponent is probably going to score if you punt anyway.

How hard is it to get 2 or 3 yards? The median play in the NFL gets you at least 4.

Here's another thought: Imagine there is no punt in the rulebook, and then one day it's invented. A guy like me comes up to a coach and says, 'Kick the ball on every 4th down and the other team gets it 35 yds further down the field.'

The coach would think I'm crazy. "Wait, you want me to give up 25% of my opportunities for a first down on every series...just for 35 yards of field position? Do you realize how much that's going to kill our chances of scoring?"

19. James says:

Brian, thanks for the reference, that's exactly what I was looking for. I'm guessing you used the "Romer EP" when calculating when to go for it on 4th down in this post?

20. Brian Burke says:

I used the same methodology, but re-constructed it myself based on a much larger data set. I also used a different regression/smoothing technique.

21. Jeff Clarke says:

Team specific readings would obviously change the chart a little. But keep in mind the fact about averages means that if it says you should go slightly less than the average in some situations, then you should also go even more than the chart tells you to in others. Since coaches go significantly less than the chart in almost all situations, we know that some of the time they are making a huge mistake.

Also, remember that by punting you are ultimately gambling that you will hold them without a score and score the next time around. If you have the worst run offense and they have the best run defense that has to affect the future calculations and not just the present. You might still be better off gambling that you will score now instead of gambling that you will score later.

22. Anonymous says:

ESPN Mag ran an article by Michael Lewis (of Moneyball fame) on 4th down decisions a couple of years ago, citing the Romer paper and some quotes from coaches. The sidebar included a list of the coaches who were the most aggressive in going for it on 4th downs. Belichick and Parcells were at the top.

23. Tom G says:

As usual, great stuff

Also notice that once you are around mid-field, fourth and one is a pretty good spot to be in. Now if only the team would make the right decision. . .

24. DSMok1 says:

Excellent series, Brian.

One note--is there a difference between the "Expected Points" metric and the "at least score 3 points" metric? In baseball, bunting can make some sense if looking to score *a* run, but not looking at overall "expected runs". In other words, do the numbers change when looking at simply scoring *any* points? If I need a FG to win, then I should be far more cautious, right?

I'm just pointing out that expected points isn't the final word on close, late game decisions.

25. Ian says:

Even with the benefit of the maths, as a coach I'd still struggle to not punt on 4th and 1 from my own red-zone.

It does make an interesting question though. These numbers assume you're punting to an average returner. If you're punting to a Josh Cribbs or Devin Hester, I would imagine the case for going for it on 4th-and-1 increases massively.

26. Dave b says:

A couple points/questions

1. One thing I didn't see mentioned is the chance of a turnover (fumble) on a punt or kick return. Although I'm sure this wouldn't change the results, it would take a small chunk out of the advantage.

2. Using a previous poster's analogy: Let's say I can flip a loaded coin that comes up heads 60% of the time and tails 40% of the time. I get heads but I have to bet my life savings on the outcome. If we only flip one time, no way in hell am I going to flip the coin. Even if it was 80/20 I'm likely not going to risk it since the cost of ruin is so great. My question is: are there enough 4th down "go-for-it" situations to make it so that in any one season the coach that used this strategy is VERY likely to come out ahead? Coaches are on shorter and shorter leashes these days.

Given that in any one game sample size may be an issue it MAY also not be a good strategy for a heavy favorite to employ.
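The sample-size question can be explored with a simple binomial calculation. The sketch below treats every go-for-it decision as an independent, even-stakes bet won with probability `p`, and asks how likely it is that more than half of `n` such bets are won. That even-stakes framing is a deliberate simplification added for illustration (real 4th down payoffs are measured in EP and are not symmetric), and the attempt counts are arbitrary.

```python
from math import comb


def prob_majority_success(n, p):
    """Probability that more than half of n independent bets succeed."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Using the loaded coin from the comment (60% heads) over more and more flips.
for n in (1, 5, 15, 31, 101):
    print(n, round(prob_majority_success(n, 0.6), 3))
```

With a single flip the chance of coming out ahead is just `p`; it rises as the number of flips grows, which is the sense in which one game is a small sample and a season is a larger one.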

27. Anonymous says:

I think that a complete analysis should also include where the offensive team gets the ball back next. If you punt instead of going for it on your end of the field, then you may get the ball back in a better position the next time (if your expected yards gained by going for it is less than a punt). For example, if you punt and hold the opponents, then you get the ball back where you punted from (if their punt is the same length as yours). If, instead, you turn the ball over on downs and hold them and they punt, you get the ball 30 yards further back.

28. Jeff Clarke says:

I think there totally are enough "go for it" situations, but it's also a case of a memory game. You go for it. You get two yards. Three plays later, you get a highlight reel TD. People remember the TD, not the go decision. You go for it and miss, that is all anyone remembers.

That's how casual fans view the game and owners view it as casual fans do. Here is the thing though. Owners are not naive rubes. They understand the concept of calculated risk. It's probably what got them the hundreds of millions necessary to buy the team in the first place.

A coach should sit the owner down months before the season and explain the strategy to him in great detail. Explain why it works and why your memory plays tricks on you. Explain that you will almost certainly lose at least one or two games a year because of it, but you will also win two or three in ways that might not be obvious to casual fans.

29. Jeff Clarke says:

"I think that a complete analysis should also include where the offensive team gets the ball back next."

I think that this analysis does do that. My understanding is that EP is calculated based on when the next points are scored (and the ensuing kickoff). The next points could be immediate or 5 possessions away.

I think the data already accounts for that. If it doesn't, that is a major oversight...

30. Brian Burke says:

Correct. Expected Points already factors in the possibility of holding a defense to a punt and re-gaining the ball. EP factors in everything that can happen between a kickoff and a score.

31. mrparker says:

Great work Brian,

I have a couple of facets that I think need to be added to actually practicing the theory. First, as a coach I would need to understand that the expected points of giving the '07 Patriots the ball on any yard line would be drastically different from the average NFL team in the study. Also my expected points on any yard line would be drastically different should I be playing the '85 Bears. I would have to tailor strategy to caliber of opponent.

Secondly, what this paper is sorely lacking is risk assessment. Just because I know I have an advantage does not mean it is always prudent to risk what it would take to gain that advantage. I'm thinking in terms of the Kelly Formula. I haven't attempted to wrap my mind around finding the football equivalent to a bankroll but I'm assuming time and score would be the inputs.

Again nice work.
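For reference, the Kelly formula in its simplest form says that for a bet won with probability p that pays b-to-1, the fraction of bankroll to stake is f = p - (1 - p) / b. The sketch below just evaluates that expression; what would play the role of the bankroll in football is left open, as it is in the comment, and the function name is an illustrative choice.

```python
def kelly_fraction(p, b):
    """Kelly stake as a fraction of bankroll.

    p: probability of winning the bet
    b: net odds (units won per unit staked)
    A negative result means the bet has no edge and should be declined.
    """
    return p - (1.0 - p) / b


# The loaded 60/40 coin at even money, and a fair coin paying 2-to-1
# (win 100 against a stake of 50, as in the earlier coin-flip comment).
print(kelly_fraction(0.6, 1.0))
print(kelly_fraction(0.5, 2.0))
```

Even with a clear edge, the formula never recommends staking the whole bankroll on one flip, which is the risk-assessment point being raised here.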

32. Dr Obvious says:

To add on to mrparker's post above, it would also be important to understand the variance in some way. I would assume it would be greater using this method, but by how much?

33. Anonymous says:

There was an article in this week's Sports Illustrated about a high school coach in Arkansas that never punts or kicks field goals. The team has been very successful, winning 100 games this decade. It's on page 18 in the Scorecard section.

34. Anonymous says:

Go for it would increase the variance of outcomes in many cases.
Thus it might be a somewhat more beneficial strategy for the underdog team than the favorite.
Also the quality of a particular team's punter and field goal kicker would have some effect on the optimal strategy for a given team.
In some games the wind is a big factor, particularly depending which way one is kicking. (This is included under the broad category of weather.)
I know you were only presenting things for the average situation.
Nice work.

35. Jeff Clarke says:

"Secondly, what this paper is sorely lacking is risk assessment. Just because I know I have an advantage does not mean it is always prudent to risk what it would take to gain that advantage."

I guess that's the point that I was trying to get at when I compared coaches to mutual fund managers. Mutual funds need to be aware of risk and should pass up on some +EV moves if they increase variance too much.

This does not apply to coaches. You will walk out of the stadium with 100% of a win or 100% of a loss (** I know a tie is theoretically possible but it's so rare that it hardly matters). If you view your bankroll in WP, you can't say I have 41% now, I'll take the conservative course and keep the 41%. A mutual fund manager can do that. A coach can't. One way or the other, the variance will be huge.

The question isn't whether he should gamble or not. The question is which gamble should he take. Should he take the gamble that he will get the first right now or should he take the gamble that the defense will hold the opponent scoreless and he will score when he gets the ball back?

Risk is unavoidable. You have to play blackjack. Do you want to be the dealer or the player?

Gambling is a pretty good way to make a living if you own the casino.

36. Anonymous says:

Something important here isn't being discussed, which is that the chart would look a lot different if the data was actually being implemented, for one very specific reason.

The notion to almost always go for it on 4th and 2 or less, and usually go on 4th and 4 and less would completely change the way 3rd and short is called. You might have more rushing plays on 3rd and 3, 4, 5 and 6 designed to set up a 4th and 1 or 2, rather than having to play for the marker on 3rd.

37. Anonymous says:

I think game situation might play a larger role than accounted for here. Late in a game where I'm only up by 1 or 2 points I don't think going for it deep in my territory would be a good strategy. The optimal strategy might be to make sure the opponent got the ball with the least amount of expected points.

To put it another way, if I have a low probability of scoring (EP value less than 3) and it is late in the game, and there is a narrow lead, then it would probably be the best thing to seek to reduce the opponents EP rather than trying to enhance my EP.

It would be interesting to find out just how late in the game and just how narrow a lead would prompt this change in behavior.

38. Matt says:

That's a great point, anonymous. I am always so frustrated when my favorite team passes on 3rd and 4 or 5, and then decides to go for it on 4th down. If you knew you were going to go for four downs you should run on 3rd and mid to get a very easy situation. Coaches seem to think of 4th down as some separate entity and consider it "when they get there", rather than plan out a four down solution.

39. Anonymous says:

Well, the problem with that is that if a team is playing 4 downs, they often are trying to preserve the clock as well, which makes running far less attractive than passing.

40. Anonymous says:

Have you considered using the average EP after a punt, FG or attempt, rather than an average yard line to generate your data? For instance, on a punt, the offense, sometimes recovers the ball (muff or fumble) giving them a significant positive EP. How is this accounted for in your model?

Similarly, if the returner scores, how is this input into your model? There are also punt and FG blocks that do not appear to be accounted for and some of these could lead directly to 6.3 EP (TD minus KO).

Per my previous post, making the assumption that the offense converts at the minimum yardage when successful and the defense receives the ball at the original line of scrimmage on an unsuccessful conversion, oversimplifies the results. Using actual EP from every play would generate a much more robust model.

I believe excluding the data from the 2nd and 4th quarters seems to be egregious as this may provide a better sampling of actual results if teams were expected to be more aggressive on 4th downs and it would also bring to light if poorer teams (those that are behind) are better off punting or not.

Lastly, I believe this is something like the prisoner's dilemma. If all teams followed the chart above, then the value of gaining possession for the defensive team would go up because they wouldn't punt and this would reduce the value of going for it. In other words, one reason coaches punt on 4th down is because they know the other coach will too.

Bill Mallett

41. Eddie Bajek says:

Great stuff, I've been wanting to see some data like this for a long long time.

However, I am confused (having only had the time to skim the data) over why we are kicking the field goal on 4th and goal from the 2-6 yard lines. That doesn't seem to be consistent at all with the above graph.

42. Jeff Clarke says:

Bill Mallett,

I'm not sure I understand your point about the prisoner's dilemma. I do see how if every coach makes the same decision, the combined expected value is 0. If he kicks and you kick, you average out to 0. If you both go, you still average out to 0. But I think the point is that you have +EV and he doesn't if you go and he doesn't. So regardless of whether he goes or not, you should go. I guess that's sort of like the prisoner's dilemma but it would seem to be the other way around from what I'm saying.

The other thing is that in the prisoners' dilemma, you are worse off if you both confess, hence the paradox. You aren't worse off if you both go so I don't see the connection.

43. Anonymous says:

Jeff,

Referring to the prisoner's dilemma: the EP calculations that we are using are based on coaches' current tendency to kick on 4th down, i.e. going for it and not making it generates a low negative EP, because the assumption is the other coach doesn't go for it.

If we assume that the other team also goes for it every fourth down, their EP would be higher given a missed conversion. This would reduce the value of going for it.

So like the dilemma, if one coach always goes for it (rats on the other prisoner), it's to his advantage, but if both coaches go for it (rat on each other), no one wins. You would have to adjust the defenses EP to assume that they always go for it to see the true benefit of changing to this strategy.

Where it obviously differs from the prisoner's dilemma is that both coaches take the conservative approach (not ratting on the other) because they do not see the benefit in going for it on 4th down.

Bill Mallett

44. Eddie Bajek says:

Never mind, I'm an idiot.

45. Kos says:

As always, this is great stuff. I'm not sure if this is factored into the study or if it makes absolutely no difference at all, but does this assume the opposing team's strategy is equal to the 2009 "status quo strategy" of the NFL (i.e. basically never going for it on 4th down)? My understanding is that the EP totals are based on recent historic NFL data. If every coach woke up tomorrow and started playing a perfectly optimal EP strategy, would that change the chart significantly?

Similarly, how does the EP chart change if Team A fully adopts a perfect EP strategy while Team B plays 2009 "status quo" football? For example, if Team A always goes for it on 4th and 1 from any spot on the field, they are almost certainly going to make different play calls on 3rd and 4 than Team B, who is punting if they don't gain all 4 yards. Team A probably runs the ball significantly more on 3rd and 4 (because they know they will go for it if they get 2-3 yards) than Team B (who passes a lot because they are punting if they don't get the first down). Since they usually have two chances to convert 3rd and 4 situations, Team A "scores" more EP in the long run against Team B's sub-optimal strategy...so wouldn't that change Team A's strategy and make them even more aggressive?

46. Anonymous says:

Great work again Brian;

I know that psychological explanations are frowned upon. However, I was wondering if the 'conditioning' of the players would have to be changed as well.
Factors to consider:
1. Is there an observable letdown after a failed 4th down attempt? (Could we compare plays after a turnover to similar plays on a regular drive?) In fact, are turnovers even more strongly correlated to wins because of their 'unexpected' nature?
2. When you kick from midfield to try to pin a team in deep, does the receiving team choose plays that are more conservative (again by conditioning), therefore leading to an advantage to the kicking team that doesn't show up in statistical analysis?

( By conditioning I mean what players and coaches expect in given circumstances )
Dan

47. Jeff Clarke says:

Dan,

I've read several studies in several different sports that say that momentum is a myth. A team will perform no better in the future after a particularly positive play than it will after a particularly negative play. Of course, sometimes a team scores 3 TDs in a row. Just as often, the opposite happens. If momentum is a myth, then coaches should just go with the numbers.

Bill,

"If we assume that the other team also goes for it every fourth down, their EP would be higher given a missed conversion. This would reduce the value of going for it."

This is the part I don't follow you on. If the other team goes for it more often, their EP will be higher when they have the ball. Of course, that means they'll have a higher EP when you miss a conversion. It also means they'll have a higher EP when you punt to them. The fact that they will score more often with the ball would seem to maximize the importance of you keeping the ball yourself.

It is like the prisoner's dilemma in that you have the incentive to go for it regardless of what the other coach does. The key part of the dilemma is that both parties going makes both parties worse off. If every coach goes, every party is the same as if no coach goes. That's where I don't understand the corollary. The PD is all about the paradox. Individually you do what collectively makes no sense. I just don't see it with coaches. They aren't making the logical move individually and it doesn't really matter collectively.

48. Tom G says:

Very good article in Sports Illustrated this week applying this to the high school level. Never punt, never kick a field goal, always go for it on fourth down, never even kick off, nor bother returning punts

http://sportsillustrated.cnn.com/2009/writers/jon_wertheim/09/17/no.punt/index.html

49. Jeff Clarke says:

I was actually shocked that Jim Zorn made the correct move late in the Redskins game. The situation was blatantly obvious to me. Yet I believe 90% of coaches would have kicked there. Of course, the announcers said he was making a "huge risk" and were preparing themselves to blame him in the event the Skins lost the game.

Here is the situation. 1:50 left. St Louis has no timeouts. You have 4th and inches on the Rams 3 yard line. Skins up by 2 points. You go and make it. You win 100% of the time. You kick the ball and after the kickoff, they need to go 70 yards for the touchdown. But if you go for it and miss, they get the ball at the 3 yard line. They need to go 65 yards in order to get it into field goal position. You go for it and make it (which you would do 2/3 of the time) and you win the game right then.

In other words, for a rational person to kick it, he'd have to conclude that the probability of the Rams going 65 yards was 3 times higher than the probability of them going 70 yards.
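The break-even arithmetic in this comment can be sketched in a few lines of Python. This is only an illustration under the comment's own simplifications: the short field goal is assumed good, a kick from field goal range is assumed automatic, and the drive probabilities used in the usage note are made-up placeholders rather than measured values. The function names are hypothetical.

```python
def loss_prob_if_kick(p_drive_70_for_td: float) -> float:
    """Kick the FG (assumed good): the lead grows to 5, so the Rams
    must drive about 70 yards for a touchdown to win."""
    return p_drive_70_for_td


def loss_prob_if_go(p_convert: float, p_drive_65_for_fg: float) -> float:
    """Go for it: the Skins lose only if the conversion fails AND the Rams
    then drive about 65 yards into FG range (the FG is assumed automatic)."""
    return (1.0 - p_convert) * p_drive_65_for_fg


def breakeven_ratio(p_convert: float) -> float:
    """How many times likelier the 65-yard drive must be than the 70-yard
    drive before kicking becomes the better choice."""
    return 1.0 / (1.0 - p_convert)
```

With the comment's 2/3 conversion rate, `breakeven_ratio(2/3)` comes out at about 3, which is the "3 times higher" figure above. For instance, with placeholder drive probabilities of 0.30 (65 yards to FG range) and 0.15 (70 yards to a TD), `loss_prob_if_go(2/3, 0.30)` is smaller than `loss_prob_if_kick(0.15)`, so going for it would be the safer call under those assumed numbers.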

I know that adding up the yards isn't perfect. It's slightly more difficult to get yards in the end zone because the back of the end zone helps the defense take away options. On the other hand, with time running out they can use the whole end zone. If you are trying to get into fg range, you need to worry about completing the pass and getting out of bounds. I've also made the huge assumption that when you get into fg range, the kick is actually successful. That's hardly 100% true. All things considered, you could make a pretty good argument a team is actually better off in that situation with their opponent 65 yards away from the winning fg range than with them 70 yards outside of the winning touchdown.

If you offered a coach the choice of giving his opponent 3 different opportunities to go 65 yards or 1 to go 70, I'm pretty sure none would go for the "conservative" option of letting them have triple the probability. I just don't get why none of the announcers ever seem to see what the real huge risk is...

50. Anonymous says:

I understand the concept, and I have seen it before. I think that coaches are too conservative, especially between the 40s.

However, I think it is important to remember these numbers are based upon "average offenses" vs. "Average defenses".

To make a truly informed decision the coach would have to have a modifier based upon the specific teams involved. Which I would like to see done, because I think it would be significant.

I think the best interpretation of the data, is that between the 40s, facing 4-10<, that teams should be going for the first down more often.

Also somewhere in this conversation the concept of Variance needs to be brought into play.

51. Xaffeine says:

Nice work! I'd love to see you reproduce Romer's final figure, including standard-error bands, using your data.

Meanwhile, how do you interpret the two little humps around 65 and 80 yards out? I think they come from the punt value data. Do they indicate anything real or important?

52. Xaffeine says:

One more comment and a question.

Romer graphed teams' actual 4th-down decisions. It would be cool to see these plotted from your larger dataset.

Did you estimate conversion rates based on actual 4th-down plays or 3rd-down plays? I believe Romer used 3rd-down plays because there was not enough data on 4th-down attempts.

53. Brennan Sherry says:

Hey Brian,

Great work here, and you really explain what you're doing well. Transparency is very important when doing statistical analysis. I had a few comments about your analysis, and what some of the commenters have mentioned.

1. I think you should augment your EP when making decisions to take into account the emotional effects of going for it on fourth down. When a team goes for it on 4th and doesn't get it, there is a huge emotional gain for the other team. There probably is enough data to say EP(yard line X | previous play was a turnover on downs). I think this will give a more accurate measure of when to go for it.

2. Point 1 may be counterbalanced by something a few commenters have mentioned, which is that if a coach actually implements this and goes more aggressive, it changes his play calling on second and third downs such that his EP in different situations may go up, making it more worth it to go on 4th.

3. Lastly, to the commenters who have mentioned that you have to take into account your own team's skills when making these decisions, of course you do. This analysis is only there to show what to do when an average team is playing an average team in all respects, and to show that coaches on average may be too conservative. If you have a punter that can consistently blast 80 yard punts, or has the ability to drop a punt perfectly at the 5 yard line consistently, your decisions may deviate from these conclusions. Similarly if you're facing a team where you can get 2-3 yards at will while rushing, your decisions may deviate in the other direction.

54. LNG says:

Can you explain the effect of only using first and third quarter data? Especially when, for example, you use this study to defend Jim Zorn for going for it late in the game, precisely in a situation where you have clock issues or teams getting desperate. How do the numbers change? They just get a little less certain, with the same approximate results? Or is there a noticeable shift in the results?

55. Brian Burke says:

LNG-Good question. First, keep in mind this study is intended to outline a general baseline for "normal" football drives--when teams are not very far ahead or behind and when the clock is not yet a factor.

2nd and 4th quarter data is excluded because that's when the clock comes into play. The 4th quarter is also when teams start to become over-aggressive and over-conservative depending on the score.

Because I have such an abundance of data, I can afford to narrow the data to the 1st and 3rd quarters, and still have very reliable results.

For situations outside of "normal" situations, we have to use a different kind of analysis. In the Zorn article, for example, I don't use an (Expected Point) EP analysis like this one, but a win probability (WP) analysis. WP can account for the particulars of time and score.

56. Ben says:

Hey, I posted this yesterday in the part 2 comments but realized that this seems to be the live commenting page for the whole topic.

In part 2, when you say "a punt from a team's own 40 (60 yds from the end zone) nets around 37 yards, giving the opponent a 1st down at their own 23. This corresponds to 0.5 EP for the opponent, which is -0.5 EP for the punting team," are you just using the straight average net punt yardage from a team's own 40? I don't know how drastically this would affect your values, but for a more rigorous calculation you would need to use the probability distribution of net punt yardage to calculate the EP of a punt from, say the team's own 40.

For a simplistic example, when punting from your own 40, if there were a 30% probability of a net of 45 yards and a 70% chance of a net of 30 yards, then your EP would be -(.3 * (EP from your own 15) + .7 * (EP from your own 30)) which would not equal the EP of the average (EP is not linear).

Again, I don't know if you did this calculation behind the scenes, or if there simply isn't enough data to generate comprehensive distributions from every yard line (I would think there probably is, given the number of coaches who punt from inside opponents' territory).
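Ben's point can be illustrated with a short Python sketch comparing the EP taken over a distribution of net punt yardage with the EP taken at the average net. The `opponent_ep` curve below is a made-up convex placeholder, not the fitted EP values from the article, and the function names are hypothetical; only the 30%/70% outcome split comes from the comment.

```python
def opponent_ep(yards_from_own_goal: float) -> float:
    """Hypothetical, slightly curved EP for a 1st down at this spot.
    An assumption for illustration only, NOT the article's EP curve."""
    y = yards_from_own_goal
    return -0.5 + 0.04 * y + 0.0002 * y ** 2


def punt_ep_from_distribution(own_yard_line, outcomes):
    """EP for the punting team, averaging the opponent's EP over each
    (probability, net_yards) outcome."""
    total = 0.0
    for prob, net_yards in outcomes:
        opponent_spot = 100 - (own_yard_line + net_yards)
        total += prob * opponent_ep(opponent_spot)
    return -total


def punt_ep_from_mean(own_yard_line, outcomes):
    """EP for the punting team using only the average net yardage."""
    mean_net = sum(prob * net_yards for prob, net_yards in outcomes)
    return -opponent_ep(100 - (own_yard_line + mean_net))


# Ben's example: 30% chance of a 45-yard net, 70% chance of a 30-yard net.
BEN_OUTCOMES = [(0.3, 45), (0.7, 30)]
```

Calling both functions with `own_yard_line=40` and `BEN_OUTCOMES` gives two different numbers because the placeholder curve is not a straight line. With this particular curve the gap is on the order of a hundredth of a point, but the size of the gap depends entirely on how curved the real EP function is.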

57. Brian Burke says:

Ben-You are correct. Using the complete distribution would be more accurate. The EP curve bends slightly toward each goal line.

I chose not to use the full distributions for a couple reasons. First, it multiplies the complexity of the study several-fold. Second, the difference to the final results would be very minor. I wanted to make the study as straightforward as possible, so that even someone without a strong math background could grasp it.

58. Guy says:

Brian: I'm not sure it's fair to assume that 4th down conversion rates would be the same as current 3rd down conversion rates, which are the basis of many of your conversion estimates. The incentives for the offensive team change at 4th down: the cost of failing to convert rises. At 3rd down a team is not only seeking a conversion, they may also take a higher-risk attempt at a longer yardage gain. But at 4th down (same field position), the value of getting just enough yards to convert rises relative to the value of a larger gain. And -- here's the trick -- the defense knows all this as well. So the defense should change strategy at 4th down, devoting resources to minimizing the probability of a conversion, even at risk of larger gain.

For example, this analysis says a team should go for it at 4th and 2 on their own 12. In that case, where preventing conversion offers a huge reward, wouldn't the defense pursue a different strategy than it might on 3rd and 2? Isn't it likely they would succeed in reducing the conversion frequency a bit?

I realize the 4th down attempt data is limited. But simply projecting the 3rd down data seems problematic. Maybe a look at success rates near the goal line -- how often do teams score on 3rd and goal from the 3? the 4? -- would provide some insight into likely outcomes in similar all-or-nothing situations on 4th down.

59. Brian Burke says:

Guy-Great point. I do have a great deal of 4th down data; unfortunately, it's subject to considerable bias due to desperation situations and other factors.

I don't completely assume 3rd down = 4th down. I graphed both and compared them. For the most part, I can confidently say 3rd and 4th down conversion rates are equal. Where the 4th down data was extremely thin, such as on 4th and longs, I used 3rd down rates and adjusted them conservatively (lower) if there was a general discrepancy between 3rd and 4th. But those adjustments were extremely slight.

To answer your point fully, if a defense does pursue a different strategy on 3rd and 2, it is likely committing an error away from the Nash equilibrium, leaving an even greater opportunity for conversion to the offense. But in practice, we don't see an increase. Conversion rates are virtually the same. So either defenses are not adjusting strategies, or offenses are not taking advantage of the error.

60. Guy says:

Thanks, Brian. I agree that the consistency of 3rd and 4th down conversion rates does suggest that defensive teams don't change strategy to reduce the chance of "just enough" conversion gains. Or if they do, the offense is likely adjusting as well to maximize its chance of making that gain (sacrificing some probability of a long gain) and the two changes offset each other.

Still, I'm curious: is there any data on how often teams score from various distances within 10 yards of the end zone? It wouldn't only be 3rd and 4th down attempts that would be instructive, as teams are presumably trying to gain exactly that number of yards on each attempt (with no benefit of a longer gain). That's not precisely what happens on 4th down attempts, of course -- a long gain still has additional value. And the offense has other constraints at the end of the field. But it might give you another read on worst-case 4th down conversion rates.

61. Anonymous says:

Someone posted earlier about how it's not good strategy to go for it on 4th and 1 deep in your own territory late in the game with a 1 or 2 pt lead. I would think your WP would be higher going for it rather than punting. If you punt from your own 15, an average punt would give your opponent the ball around midfield. His chance of scoring a winning FG or TD would be pretty high (especially if that QB's last name is Brady or Manning). It sounds nuts, but a coach should always go for it in that situation.

62. Anonymous says:

Sorry to be so late, and thanks for the corrections to my earlier postings.

I still think there's a flaw in the analysis. It doesn't look at comparable situations. For simplicity, assume the positive EP for a team means they score now, rather than first.
In two cases (punting and failed fourth down attempt), the analysis ends by looking at how many points the defense is expected to score. In the other case (going for it and making) the analysis stops with the offense scoring, and the defense doesn't get a try with the football.
For parity, it seems that you should subtract from the EP of the offense when 4th down is successful the EP of the defense when it receives the ball in average kickoff-return position.
If you remove the assumption about EP meaning scoring immediately, then life is more complicated. Perhaps one solution is to prorate the EP for the defense from the kickoff-return position (EP-D) by the probability that they didn't score first. For example, if the EP for the offense being successful on 4th down is based on the offense scoring first 60% of the time, then use 60% of the EP-D.
Another problem is time constraints. Since a positive EP may mean scoring on the 5th possession, the defense may never actually get a chance on a kickoff after the offense scores. One solution is to somehow discount the EP-D, based on how likely they are to have enough time. Another solution is to use something more limited than EP, say a probable result of the current drive (either -- some number of points scored OR turning the ball over to the defense at a particular point). Then, under the assumption of the posting that time isn't a constraint, there will generally be enough time for two drives.
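The prorating idea proposed in this comment can be written out as a one-line adjustment. The sketch below only restates the commenter's suggestion; it is not the method used in the study. The function name and the sample numbers in the usage note are hypothetical.

```python
def prorated_success_ep(ep_offense: float,
                        p_offense_scores_first: float,
                        ep_defense_after_kickoff: float) -> float:
    """Commenter's proposed adjustment: subtract the opponent's
    kickoff-return EP (EP-D), scaled by how often the offense is the
    team that scores first after a successful conversion."""
    return ep_offense - p_offense_scores_first * ep_defense_after_kickoff
```

For example, with an assumed offensive EP of 2.0, the comment's 60% figure, and an assumed EP-D of 0.7, the call `prorated_success_ep(2.0, 0.6, 0.7)` subtracts 60% of 0.7 from 2.0.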

63. Remy says:

Hmm. I'd think I agree with Anon above, although the language is confusing (defense scoring? You mean the team currently on defense getting the ball back and subsequently scoring I think).

I also think there's another factor in play, and that's "getting & playing with a lead". This is somewhat akin to the excellent "momentum hypothesis" commented on here, in that teams perhaps psychologically, or even tactically, can play better when they have a lead, even if it's a relatively small one.

So let's call it the "taking a lead" hypothesis. I'd be interested to see that teams that decide to punt more often (to help avoid being behind) or kick field goals more often than they 'should' on 4th down in terms of pure EP do better in the long run.

eg. teams that take a 3, 6 or 10 point lead.

The trouble is that teams may well take a lead because they were simply better and going to win anyway, so maybe you're probably better to look at other performance indicators (defensive pass defense, 3rd down conversion on offense etc) before and after "playing with a lead" rather than just win rates. ie: do teams really play better with a lead, or is it just a perception/myth?

I know that if the strengths of a team are defense and running game, taking the "3" or punting do seem to become much better decisions than the 4th down gamble.

64. Anonymous says:

I'm so glad to read this. For years, it has driven me crazy to see teams (well, mainly my Vikings teams) settle for FGs when they have fourth-and-goal from the 1 or 2 yard line. I certainly never put all this mathematical or statistical analysis into it, but it just seemed intuitively obvious that:

(a) TDs are (assuming a successful PAT) worth more than twice as much as FGs;

(b) Even if you make that fourth-and-goal-short significantly less than half the time (which is kind of weak), you are pinning the other team way way back in their own end zone. Even if you don't get a safety out of it, on average given the ebb and flow of field position, this should be worth three points all by itself compared to kicking off;

(c) So in poker terms (and I sense there are a lot of fellow poker players about here), the attempt to get into the end zone from the one or two yard line is, at least, a freeroll!

-Alan

65. Dan says:

Great work, discussion and graphs. Minor quibble with your final chart. I believe it is impossible to have a, say, 4th and 2 from your own five yard line (think about it). So rather than "go for it," there are some portions of your chart that should be shaded out as N/A territory.

Same goes for the other side of the chart, although it looks like you've half-accounted for it there.

66. Brian Burke says:

Good catch. I did that for the opponent's goal line (e.g. no 4th and 5s from the 2), but completely missed it for a team's own goal line.

67. Anonymous says:

Very interesting analysis. The 37yard line representing the boundary between punts and field goals is interesting and equates to a 55yard FGA. Accepting these stats are purely from Q1 and Q3, a kicker's effective range must also come into play. KER however is not consistent as late in a half, the decision is going to be skewed by individual records at longer distances, side of pitch, wind/weather and altitude

We need to look at the graphs slightly differently therefore to obtain base criteria. It is pointless punting on or inside the 37 yet the FGA option other than late in a half is equally unattractive until we get down to an EP of around 1.0; let's call it the 31 or a 49yarder for sake of argument
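The yard-line-to-kick-distance conversions used throughout this comment (37 to 55, 31 to 49) all imply adding 18 yards to the line of scrimmage. A minimal Python sketch of that conversion follows. The 18-yard offset is inferred from the comment's own figures, and reading it as 10 yards of end zone plus roughly 8 for the snap and hold is an assumption. The function names are hypothetical.

```python
# Offset implied by the comment's figures (37 -> 55, 31 -> 49).
FGA_OFFSET_YARDS = 18


def fga_distance(yards_from_goal_line: int) -> int:
    """Field goal attempt length for a given line of scrimmage."""
    return yards_from_goal_line + FGA_OFFSET_YARDS


def line_of_scrimmage_for(fga_yards: int) -> int:
    """Inverse: the line of scrimmage that produces a kick of this length."""
    return fga_yards - FGA_OFFSET_YARDS
```

The same offset reproduces the other pairs quoted further down in the comment, such as the 17 yard line for a 35-yarder and the 1-6 yard lines for 19 to 24 yard attempts.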

Now we know that rushing on 3rd or 4th is no gimme though there is a fair chance with less than two to go, so facing 4th and 4 or more, which in theory lies wholly within the go-for-it area of the graph in defensive territory, for sake of argument, depends purely on the confidence of offense to complete a pass long enough to pick up a FD. I imagine Def 2-6 is a no-brainer FGA in most situations due to the lack of room for receivers

On the face of it, therefore, 38 to 50 yards out is a punting situation unless c1.5yards-to-go (rush) or c7yards-to-go (pass) for it must be nett yards per pass play to allow for sacks, not average per completion. That said, average field position off a punt also needs to be factored in...

For 32-37 yards it is a no-brainer attempt-the-pass or possibly a short rush rather than the 50-55yard FGA, though this is influenced by the KER and other factors

For 7-31 yards/25 to 49yard FGAs, all other things being equal, IT IS BETTER TO ATTEMPT THE FOURTH DOWN CONVERSION with just under four yards to go, or roughly half the nett pass average

For 1-6 yards/19 to 24 yards, it all gets very blurred for a multiplicity of factors. Defense know that more than a yard out, a pass would be almost certain yet there is little room for receivers to work. In fact if a TD is needed in this territory, a QB sneak or designed play is probably the best option

In conclusion, all things being equal, once range and other factors combine to produce a measurable probability of a FG being missed, rather than what could be construed a random event, such as a missed EP, it is better to attempt the fourth down conversion with less than four yards to go. This would not occur with a line of scrimmage on or inside the 6 nor unlikely and if our decision is based on an EP of 2.0 then we're looking tops around the 17 or a 35yarder. Being positive and unless the points situation dictates - and in Q1 or Q3 at least - with the LofS 16-31 (34-49 FGA equivalent), it is better to attempt the conversion when four or fewer yards are required.

Like the influence on overtime when the KO was moved back from the 35 to the 30, taking over at the spot of kick with a failure has also had an influence

68. Dave Mc says:

I would argue the use of the term "Probability" when you're actually presenting mathematical "Possibility".
Even a fantasia rate of 70% leaves a failure risk of 30%.
That risk would be considered monumentally dangerous in any endeavor where loss is importantly avoided.
(for instance a vaccine given to a child that has a 1 in 3 chance of being fatal is an unacceptable risk)

The question as to why coaches don't make the attempts also ignores the local application.
A 4th Down attempt by the New England Patriots is not equivalent to an attempt made by the Kansas City Chiefs.

Left out of your demonstration, also, is the effect of failure on Time Management, Field Position, Point Differentials, and the intangibles of Morale and Rhythm.

I can only applaud your definition for one variable in the full equation of the decision process to make the attempt.
They can't rely on what is possible.
They must rely on what they deem to be probable.

69. Anonymous says:

I would think the underdog should be even more aggressive, while the favorite might be more inclined to do some kicking on close calls as that should lower the variance and increase the likelihood of victory.
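The variance argument in this comment can be sketched with a simple model. Assume, purely for illustration, that the final scoring margin is normally distributed, and that playing more or less aggressively changes only the spread of that margin and not its mean. Both are simplifying assumptions that are not taken from the study, and the function name and the sample numbers in the usage note are hypothetical.

```python
import math


def win_probability(mean_margin: float, sd_margin: float) -> float:
    """P(final margin > 0) when the margin is modeled as
    Normal(mean_margin, sd_margin). Uses the standard normal CDF."""
    z = mean_margin / sd_margin
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under this model a 7-point underdog does better with the wider spread, since `win_probability(-7, 16)` is larger than `win_probability(-7, 12)`, while a 7-point favorite does better with the narrower one. A real aggressive strategy would also shift the mean, which this sketch deliberately ignores.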

70. DJ says:

Would it be possible for you to look at how individual coaches (or just teams, if that's easier) fare against this curve (i.e. what percentage of their fourth downs do they make the "right" decision on, holding the data to the 1st and 3rd quarters again)? It would be interesting to get a number for "lost points per season" for the more conservative teams.

71. Stefan says:

Dave Mc, this study is based on actual data, which in turn is based on the sum of all of the variables you claim he isn't accounting for. I agree further studies should be done against each variable separately, especially time remaining in the half, as this is one variable that can cause numbers to change greatly from the mean.

72. Anonymous says:

According to what I've read, that high school coach that never kicks or punts does something else really important. He also spends considerable time convincing his players why not kicking is the best approach. This removes any morale aspect. His players all understand that, even when they don't make a first down, the decision is still the best one.

73. Anonymous says:

As a high school coach, I'd be curious to see how the numbers change for the high school game. Obviously our kickers don't bring the same leg to the game, so we don't average 40 yard punts or kick field goals from the 35, but our offenses and defenses tend to vary quite a bit more.
I suspect that the "Go for it" range would just be much wider, maybe ranging from about the -30 to the +15 or so, but I'd love to see if the math would back that up.

74. Brian Burke says:

thehurt-You should look up Kevin Kelly, a high school coach who *never* punts. I've got some random links about him around here if you do a search.

75. Anonymous says:

This is a very interesting post. But it seems to me that this analysis does not take into account the decreasing marginal utility of points in many situations. The three points of a field goal are more valuable to a team than the additional four a touchdown would yield, right? This effect is probably stronger later in games rather than earlier. At what point in the game would coaches need to consider the nonlinear value of points?

76. Mr. Red says:

Would the line on the last graph shift up a couple yards for NCAA football? I know that NCAA football is more of an offensive game, and the kickers/punters aren't as good as their NFL counterparts. I think the breakeven point for going for it on fourth down would be much lower.

77. TheGlennDavid says:

Hi Brian,

LOVE the post. Just found your blog for the first time today and I already know where the next several days of my life are going.

I wanted your thoughts on how you would adjust the final graph for a team that doesn't have much that's average -- the NE Pats.

The offense is almost always in the top 5 for the league, and the defense in the bottom 5. If I understand your article right:

1) The top-tier offense has a better than average EP for all of their first downs. That makes 4th down attempts more valuable than average.

2) A defense that is 31st in yards means that the EP on a punt are reduced. That makes punts worse than average.

How drastic an impact do you think that has on the graph?

78. Anonymous says:

83. Lu says:

Hi there

Great article; just wanted to ask - to make this even more tactically useful for coaches, building on some of the above posts (particularly the one on the Pats with their excellent offence and less than excellent defense), I'd imagine they'd ask for this to be adjusted to their own team's performance levels (offence, defense, punter, kicker, special teams), and possibly adjusted for the opponent's performance (same categories; I imagine the 2012 Jets' EP for a given field position is significantly less than the 2011 Saints'). Is it possible to build up a meaningful sample set that would yield an FG - Go for it - Punt graph for a given game, or at least "fudge factors" to know how much the FG - Go for it - Punt zones stretch or shrink based on opponent performance?

84. Anonymous says:

Ever since Payton's well documented use of the surprise onside kick in the super bowl, I have been thinking about how strategy can relate to decisions that advanced analytics have shown to be better options than the status quo. It is too frustrating to me to think that these smart individuals who coach for a living would simply ignore such compelling evidence. Is it possible that some coaches have bought into these studies but defy their suggestions for other reasons?

The more times a team uses a specific strategy, the more the utility of the strategy diminishes. If a team already expects to win home-field advantage in the playoffs [an admittedly rare situation but any low pressure situation would work (such as a blowout)], I think it is unequivocally the "right" decision to make the "wrong" decision and maintain the element of surprise for future situations of greater importance. Was Payton "hustling" all season to shock opponents in the Super Bowl? What other reasons exist to ignore these studies' suggestions?

Hope you stat geeks like the parentheses inside of brackets.

85. Unknown says:

You missed an important statistic, which is something coaches consider. If you go for it and fail, what is the likelihood of the opposing team scoring based on the field position?

86. EpicWestern says:

^Tai Phong Doo, it was considered; see the EP explanation.

I'm very impressed with the overall methodology, but I feel the fourth down conversion rate is way too high. For example, 56% for 4th and 3 at the 50 has to be too high when 2 point conversion attempts are only around 45%. I understand the extra space makes things more difficult on the defense but it shouldn't be that much of a margin.

Rather than have the 4th down conversion rate be fixed, I think it would be far better for a user to be able to plug in his own numbers and see what would result. For example, if 45% was plugged in instead of 56% in the previous article, it's absolutely not correct to go for it.
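The "plug in your own numbers" idea can be sketched as a small Python calculator. The EP inputs in the usage note are hypothetical placeholders chosen only to show the mechanics; they are not the values from the article's graphs, and the function names are made up for this example. Ties go to "go for it", following the article's convention.

```python
def go_for_it_ep(p_convert: float, ep_success: float, ep_fail: float) -> float:
    """Expected points of a conversion attempt with a user-supplied
    conversion probability."""
    return p_convert * ep_success + (1.0 - p_convert) * ep_fail


def breakeven_conversion_rate(ep_success: float, ep_fail: float,
                              ep_kick: float) -> float:
    """Conversion probability at which going for it equals kicking."""
    return (ep_kick - ep_fail) / (ep_success - ep_fail)


def recommend(p_convert: float, ep_success: float, ep_fail: float,
              ep_kick: float) -> str:
    """Return the higher-EP option; a tie goes to 'go for it'."""
    if go_for_it_ep(p_convert, ep_success, ep_fail) >= ep_kick:
        return "go for it"
    return "kick"
```

With placeholder inputs of `ep_success=2.0`, `ep_fail=-2.0`, and `ep_kick=0.0`, the break-even conversion rate is 0.5, so a 56% estimate recommends going for it and a 45% estimate recommends kicking. With different EP inputs the break-even point moves, which is why the conversion rate cannot be judged on its own.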

87. Anonymous says:

I have coached for 9 years, and you failed to think!! Your data is just too simplified for a complicated game like football. If you go for it and do not make it, it has a depressing effect on the team (sometimes more in away games or under certain situations); they tend to play worse, and the other team plays better due to the gained momentum. And if you make it, the positive effect is not as large a gain. Many go-for-it attempts that are successful still end up with a punt or field goal later in the same drive, so no gain was made but an unneeded risk was taken. The NFL keeps these records of increased scoring after turnovers. Or if you watch football, you should know this happens and increases the chance of the other team scoring more than no turnover at the same spot on the field. I have no doubt that whoever wrote this article belongs in the office!! Why don't you look at the true data, which clearly shows that an unsuccessful attempt at going for it greatly increases the chance of the other team scoring, the same as a fumble or interception would, whereas a punt usually does not change the momentum. As I said, the NFL has this data, and so do the owners and coaches!!

88. Unknown says:

Interesting to note that according to the first three graphs of expected point values, when faced with a 4th and 15 from a team's own 1 yard line, a field goal attempt has the highest EP value. This seems strange, as a FG attempt from there would lead to a first down for the opponent at the 1 yard line 100% of the time, while a punt would lead to a first down for the opponent at the 40 yard line or worse a vast majority of the time. Is there an explanation for this?

89. Anonymous says:

Unless I am missing something, there should be a multiplier effect for a team adopting this approach. That is, they will increase the expected point value of a given position on the field. This, in turn, would shift the boundary between "Go For It" and "Punt" or "FG" upward, meaning they should be even more aggressive than the graph suggests.

90. Anonymous says:

Tommy

Are you looking at "own one" yard line or one yard from opponent goal line?

91. Anonymous says:

As anonymous says three messages up, there is normally a momentum shift during a (unintentional) turnover. But I think if it could be explained to the players that, yes, playing this way our opponents will score more points per game, but we will still outscore them in the end, then they will accept the occasional failure as the cost of doing business. They will not get so down, and the opposing team will feel more relief than exhilaration at stopping us. I think the truth is, that by playing a four down offense, you are not going to get into many "punting" situations anyway. You'll be scoring on most every drive where there is not a fumble or INT.