I'm not sure I buy your numbers. Where do they come from anyway?
The win probability (WP) numbers are from actual teams playing actual games. They’re average win rates based on recent NFL history. They are not theoretical or simulated outcomes. They’re simply how often teams win given certain situations. In many cases where there are too few games for a reliable estimate, other similar situations are used to smooth the gaps. There are some sophisticated statistical tools that help make sure we do that right, but underneath it all are actual games and actual wins and losses.
Football is a game of momentum. If a team goes for it and fails, it will lose all its momentum and its players will be deflated. 
Momentum in sports, for the most part, is an illusion. People naturally expect to see events alternate more often than they really do. Streaky outcomes are a natural part of the world, and momentum is not needed to explain them. Flip a coin a few times and you’ll see that there are streaks of the same result, and no one would ever say the coin had momentum.
Here’s a quarter. Go ahead and start flipping now. I’ll wait. Half of all sets of two flips will be the same result twice in a row. One in four sets of three flips will be a streak of the same result…And one out of eight times you’ll see four straight of the same result. There are at least 20 drives in an NFL game, plenty of opportunity to witness perfectly natural streaks.
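For anyone who would rather check the coin-flip figures than take them on faith, here is a minimal sketch. It assumes a fair coin and independent flips; the function names are made up for illustration, and treating 20 drives as 20 coin flips is only an analogy, not a model of real drives.

```python
from fractions import Fraction


def prob_all_same(n_flips: int) -> Fraction:
    """Chance that n fair flips all land the same way (all heads or all tails)."""
    # Exactly 2 of the 2**n equally likely sequences are a single unbroken streak.
    return Fraction(2, 2 ** n_flips)


def prob_streak_at_least(k: int, n: int) -> Fraction:
    """Chance that n fair flips contain a run of k or more identical results (k >= 2)."""
    # Count sequences with NO run of length k, tracking the current run length.
    # ways[r] = number of streak-free sequences whose current run has length r.
    ways = [0] * k
    ways[1] = 2  # after one flip: H or T, run length 1
    for _ in range(n - 1):
        nxt = [0] * k
        nxt[1] = sum(ways)           # next flip differs: run resets to 1
        for r in range(1, k - 1):
            nxt[r + 1] = ways[r]     # next flip matches: run grows, but stays below k
        ways = nxt
    return 1 - Fraction(sum(ways), 2 ** n)


for n in (2, 3, 4):
    print(n, "flips all the same:", prob_all_same(n))   # 1/2, 1/4, 1/8

print("run of 4+ somewhere in 20 flips:", float(prob_streak_at_least(4, 20)))
```

Under those assumptions, a streak of four or more shows up somewhere in 20 flips roughly three times out of four, with no momentum involved.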
Despite this illusion, emotion can be an important part of a sport like football. No one can deny that. And I agree that large successes and failures are an important part of players’ emotional states. But the momentum/emotion argument always assumes failure. It assumes momentum/emotion can only be lost. It can be gained too, perhaps from a 4th down conversion. Besides, what happens to your opponent’s emotional state when your team successfully converts a 4th down? 
But that offense was only 1 for 3 in converting short yardage situations in the game. No way the chance of conversion was close to 75%. More like 33%!
If I flip a coin four times and get heads three times, does that mean the coin is weighted three quarters towards heads? Or does it mean that it’s a fair 50/50 coin, but sometimes you get variable results, especially when you have so few trials? Although prior results from the particular game might be helpful, they are far more likely to be due to the effect of small sample size. The long-run probabilities are far more likely to be representative of the true chance of success. 
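To put a rough number on the small-sample point, the sketch below asks how often an offense whose true conversion rate really is 75% would go 1 for 3 or worse. It assumes each attempt is an independent trial with the same success chance (a binomial model), which is a simplification; the helper names are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent tries, each with chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n independent tries."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# A true 75% offense converting only 0 or 1 of 3 short-yardage tries.
print(prob_at_most(1, 3, 0.75))
```

Under that assumption the answer is a bit under 16%, so a 1-for-3 day is not strong evidence that the true rate is anywhere near 33%.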
The math is too confusing.
Once the WP numbers are in place, the math is 5th grade arithmetic. If you can multiply and add decimals, you're all set.
I don’t understand how you can say the decision to go for it was worth a "net WP"? Either they make it or they don’t.
Let's use a gambling analogy. Assume every dollar in your pocket is equally valuable to you. I offer you the choice between a certain $10 or a coin flip for $20. What’s better? They’re theoretically equivalent. What if I offer you the choice between $10 or a 75/25 proposition for $20? I’m sure you see that the gamble is preferable to the certain ten bucks (.75*20 = $15).
Now, what if I offer you a certain $40 or a 75/25 proposition at $60 (.75*60 = $45)? Further, what if I sweeten the deal and say, even if you lose the 75/25 gamble, I’ll still give you $15? Well, that clinches it. The gamble is the better deal.
Ok...now...pretend you’re a football coach, and I, the football god, offer you a certain 40% chance at winning the game or a 75/25 gamble at a 60% chance at winning. Which should you prefer? You’d prefer the gamble. And if I sweeten the deal and let you have a 15% chance at winning even if you lose the gamble, that’s just a bonus. The gamble gives you the best overall chance to win. See what I did there?
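The arithmetic behind the gamble is short enough to write out. This is a minimal sketch using only the illustrative numbers from the analogy (a certain 0.40, versus a 75/25 shot at 0.60 with 0.15 as the consolation); the function name is made up, and real WP values would come from the historical tables.

```python
def expected_value(p_success: float, value_success: float, value_failure: float = 0.0) -> float:
    """Weighted average of the two outcomes of a gamble."""
    return p_success * value_success + (1 - p_success) * value_failure


# Dollars: a certain $40 vs. a 75/25 shot at $60, with $15 if the gamble loses.
print(expected_value(0.75, 60, 15))        # 48.75, better than the certain 40

# Win probability: same numbers, read as chances to win out of 100.
wp_sure_thing = 0.40
wp_gamble = expected_value(0.75, 0.60, 0.15)
print(round(wp_gamble, 4))                  # 0.4875
print(round(wp_gamble - wp_sure_thing, 4))  # net WP of the gamble: 0.0875
```

The "net WP" of a decision is just that last line: the expected WP of one option minus the expected WP of the other.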
But dollars and winning football games aren’t the same thing.
In this case, they are. Recall my assumption above—“every dollar is equally valuable to you.” The same is true for chances to win a football game. Think of WP as chances to win out of 100. A 0.40 WP is 40 wins out of every hundred games. So 20 chances is exactly twice as good as 10 chances to win, and 40 chances to win is exactly twice as good as 20. And so on. Every chance of winning is equally valuable as the next. (Economists call this linear utility, and the gamble analogy above is known as expected utility.)
But if they go for it and fail, they’ll definitely lose. You can’t bet the whole game on one play!
Sure you can, when the odds are in your favor. The speed at which you win or lose is irrelevant. 
If you fail, handing the other team the ball in field goal range is suicide!
It’s not suicide. Turnovers happen. Just last Monday night, we saw a team lose a fumbled quarterback snap when a game-winning field goal was well within range. Field goals get missed. Just yesterday, 23 of the 75 attempted field goals were missed, including a potential game-winning kick that was only a 22-yarder! (The line of scrimmage was the 4.) And last year in a Falcons-Saints overtime game, the Saints missed a supposedly automatic 38-yard FG, allowing the Falcons to steal the win. It happens--not often--but enough to matter.
Well, since it didn’t work out and the team lost, that proves going for it was the wrong decision.
By that logic, the team that missed the 22-yard game-winning field goal should have gone for it, right? Or should they have punted from the 4? That kind of thinking is called outcome bias. 
OK. I understand everything you just said. I understand where the numbers come from. I understand the math and the concepts, but I still can’t bring my head around to think that going for a super-risky fourth down was the smart thing to do.
So far we’ve only been looking at it from the side of the coach who has to make the decision. Let’s put ourselves in the shoes of a fan of the other team. Consider what was going through the head of a Saints fan yesterday as Mike Smith decided to go for it in OT: 
Yay! We made the stop on 3rd down! Here comes the punt team. We’re gonna’ get the ball back and win this thing...Hey wait, what’s going on?… 
Where’s the punter going?... Why is their offense…? Crap, they’re going for it….Oh, $#!+ ! They're going to make this. It's so short.
$#!+! $#!+! $#!+!...Oh, thank God.
If you gave the Saints coaches the choice between receiving the punt and letting the Falcons roll the dice on 4th and inches, they’d take the punt every time and twice on Sunday. That tells you something, doesn’t it?
How to Talk to a Skeptic about Risky 4th Downs
By
Brian Burke







I think the sample size is very relevant info., esp. for rare game situations (OT).
It would be pretty interesting to take the choices coaches have actually made over time to construct their cumulative utility curve. I'm sure it's not linear, but just what does it look like?
That might also reveal when/where coaches are most -EV risk averse. I suspect it's when the odds of losing the game are brought forward in time to be closer to the decision point. i.e. they'd rather have a 43% chance of losing at some future point not associated with the decision than a 40% chance of losing as a seemingly direct result of the decision.
Nicely explained.
This is a really good article. I am a Falcons fan and have had trouble explaining to my friends why I support Mike Smith in his decision. Do you think Mike Smith knew about these types of probabilities or he was just being bold?
...I, the football god,...
Classic line.
Hockey offers an excellent example of where "going for it" is accepted as the right call. In the last minute of play (at times in the last two minutes) the coach will pull the goalie for an extra skater.
Over the past 50 years, this has become common practice. Fans and sports writers do not question the tactic. Everyone understands that a goal scored into the empty net loses the game unequivocally. Everyone also understands that the right play is to pull the goalie as that offers an increased chance of scoring a goal to tie the game.
The numbers in hockey are a little more drastic. You are running out of time and you are going to lose unless you take a chance which succeeds.
The football analyses are somewhat more subtle (else we wouldn't have all this discussion.)
As a nerdy football fan, I'd like to thank you for posting this info, and putting this site together. Been checking it out since last season.
But...sample size...what is the N for 4th-1 attempts at the 29, or behind the 50 for that matter? What's the confidence level? I'll buy the probability numbers as a baseline at least, but there has to be a noise factor equal to the stdev, and/or opponent adjustments, to be able to claim that this was/wasn't the right call with confidence.
In short, a calculated 5% difference in outcomes is small given some large errors lurking outside the assumptions.
-- boxchain
I think this article is exactly correct re: coin flip probabilities.
The small issue that I have in general with this analysis applied to football is that it doesn't take into account the volatility of the probabilities. A coin flip is always 50pct with 0 volatility. Saying that teams convert 4th and inches at 99pct or a 4th and 10 at 5pct is fine as a general rule of thumb, but as we know it's NOT the same trial every time. Different offensive/defensive schemes are used, and I don't believe your data goes to this granularity - it assumes that 4th and inches is always defended the same way, which might be a good assumption, but does it hold within a >90pct confidence interval???
This is not to say that the analysis is wrong or bad or anything of the sort. I just think the exact expected value of a decision comes into question when you are talking about something as dynamic as a sport. To me, if you are talking about a difference in win probability of 1pct, that is not enough of a "MARGIN OF ERROR" to support that decision, or rather it needs more analysis depending on the situation.
And speaking of flipping coins, I heard that the Saints have lost the last 12 coin tosses. Not sure if that factored into any decision making during the game, but it's interesting in itself
-- boxchain (who would post as such if site would let me)
I loved the line about punting instead of kicking a 22 yard field goal.
"Hockey offers an excellent example of of where "going for it" is accepted as the right call. In the last minute of play (at times in the last two minutes) the coach will pull the goalie for an extra skater."
I read somewhere that the percentage play is to pull the goalie with 2 or 3 minutes to go.
"I suspect it's when the odds of losing the game are brought forward in time to be closer to the decision point. i.e. they'd rather have a 43% chance of losing at some future point not associated with the decision than a 40% chance of losing as a seemingly direct result of the decision."
Rob that's a REALLY interesting thought. I hope some research is done in that regard.
I have bought into the math on these decisions for some time. You can slowly see NFL head coaches accepting it as well. Teams going for it on 4th when down big but in a poor spot, instead of punting and basically giving up (Vikings tonight) are on the increase, 4th and short attempts are on the increase, so the stats guys are having an impact.
To the non-stats guy, I think the greatest point was asking what were the Saints wishing for? Clearly the punt! Great way to put it in perspective.
"I read somewhere that the percentage play is to pull the goalie with 2 or 3 minutes to go. "
It is and when coaches do it earlier than a minute to go, they get criticized.
It's actually a pretty good parallel to football. Even the most risk-averse old-timers realize that you have to go into desperation mode at some point; they just wait way too long.
Here's a good rule of thumb:
When you hear an announcer say "there's plenty of time left, they don't have to panic or get desperate...", not only do they have to get desperate, they really should have gotten desperate at least 5 minutes ago.
Lcsonka39: absolutely. The WP numbers from here (or any other similar analysis) give a baseline from which to start the decision making process. The coach can then apply local knowledge which either pushes the decision towards going for it ("they're missing their top DT and we've made good yards running at the replacement all day") or punting ("our O-line is banged up and our top short yardage back is out").
"Typically" 74% of 4th and 1's are converted.
Does anyone think that a situation where the Offense is thinking "If we don't get this we lose"....and the Defense is thinking "If we stop this we win"...is atypical?
Let's take as given that it's statistically defensible, and likely a modest *win*, to go on this 4th and short, as with most 4th and shorts. Don't you then have to mentally adjust for the very specific context?
In the similar-odds Belichick call, it sure felt like the field position wasn't going to radically change Manning's odds of scoring the winning TD. In this game, it did not feel like the very good Saints offense was a lock to take the ball the 40ish yards to get into FG range. Nor were the Falcons running all over the mediocre Saints run D. In fact, I believe this was the only run they called in OT. I also read somewhere that while the Saints are poor at short yardage D, the Falcons are poor at short yardage O.
Other variables? The Saints are a smart, aggressive team when it comes to knowing risk/reward and were unlikely to pull a Herman Edwards and play for a 48 yard kick. They have a kicker who's better than league average from 40 yds and in, and any kicker is better in a dome.
The math here all shows the decision was in the ballpark, but all things considered, it was the wrong one imho. And no, I am not exhibiting outcome bias. I felt that way at the time, and I honestly can't remember the last time I disagreed with an aggressive 4th down call.
"Does anyone think that a situation where the Offense is thinking "If we don't get this we lose"....and the Defense is thinking "If we stop this we win"...is atypical? "
Honestly, I don't. 4th and 1 is almost always a very important play. In this case, it was an extremely important play. I don't think that changes the basic math though.
It seems like what you're saying is that on a typical 4th down, one of the teams is going to say "this is only very important and not extremely important...lets not give it 100%". I don't think that happens.
"It seems like what you're saying is that on a typical 4th down, one of the teams is going to say "this is only very important and not extremely important...lets not give it 100%"."
Not quite that, more:
Just like every putt is important, and every free throw is important and every at-bat is important....there's something different, (in attitude, stress level, performance, and ultimately results) when the putt is on the 72nd hole, the free-throw is with 1 second left, the at-bat is in the bottom of the ninth. In all three of these situations, I'm sure the athlete is giving 100%...he just might not see typical results.
Heck, maybe in this situation, the success rate is higher than 74% because the defense suffers more from the stress/pressure than an offense.
An excellent post!
The human mind is conditioned to see patterns, whether or not they're there. This is why momentum is more myth than reality.
I cannot express how awesome a read this was.
A million thanks.
Adam: The plays that the Saints run to attempt a shorter field goal can gain no yardage, lose yardage and exit field goal range, or result in a turnover as well. The 82% chance that Brian cites takes all that into consideration.
Note that the 74% chance of converting 4th and 1 includes the chance of converting 4th and the full yard. 4th and inches is a much surer bet, and I think that strongly skews the odds in Atlanta's favor if the playcall was better. Practically any QB can lunge for a few inches, and under the scrum the referees will give them the first down almost every time. Handing the ball off to Turner in a slower developing play with 11 defenders in the box was the poor decision - not the decision to go for it.
Your theory is overly simplified. While percentages play a part, football has so many nuances that it cannot be compared to a coin flip, any more than genetic coding can be.
Brian and statisticians,
Have you considered compiling EP/WP information based on assorted playcalls?
For instance, getting EPA and WPA splits for run-action pass plays, draws, iso, toss sweep r/l, 3-5-7 step drops, boot action, etc,etc, et al? Conversely for defense, 4 man pressure with m2m behind, cover3-4-2, man (although this might be impossible without 'all22' tape).
It would be labor intensive relative to poring over pbp data, but could yield meaningful information.
As your community grows, maybe there will be someone willing to break that off to chew, if it is worth the chew of course.
I understand what he says but I disagree that all teams would have the same 'average chance' to convert a 4th down.
Because teams don't go for a lot of 4th downs, it's a very small sample size. Look instead at the best teams on 3rd down and compare them to the worst teams on 3rd down: light years of difference. So how could all teams have the same chance to convert a 4th down? It does not seem right to me.
I also think comparing a coin to a human being playing in a sports game is different.
Coins do not have energy, motivation, desire, coaching, complacency, skill, or ability to perform, all things that can vary from game to game for humans.
I'm a big believer in the percentages but I don't think it is as cut and dried as the site tries to make it out to be.
Just a heads up the 4th Down calculator seems to be down. Not sure if it's just being overwhelmed by site traffic.
I realize that the 82% Brian quotes takes everything into consideration. That number is *only* 82% because some teams use a sub-optimal strategy of running into the line 2-3 times and attempting a long FG. The Saints are an unlikely team to use a suboptimal strategy, so perhaps the specific odds of the Saints winning the game IF they get 1st and 10 on the opponent's 29 in OT are greater than Brian's number. What's more, they have a kicker without a big leg, but with better than average accuracy if he gets just a little closer. And finally, it's in a dome, which helps kicking accuracy overall.
Brian's numbers reflect an average team in average conditions. The Saints likely have a better chance of winning the game than those odds suggest. Of course they also might have a better than 58% chance of winning if the Falcons simply punt. There's no way to know the exact adjustment to make for the Saints, so the best we can say is they're all within a range where you can argue that either going for it on 4th, or not, is a reasonable decision.
And one final thought. Unless I missed it, all the numbers assume a punt simply gains the 38 net yards and we run the numbers from there, but what about the punt itself? Is that factored into the numbers we're looking at? Maybe the Saints jump offsides or run into the kicker, maybe they fumble the punt? On the flip side, maybe they block the punt, or run it back for a TD. Those seem less common though, especially the block (and these results are more likely embedded in the net punt distance calculation).
Bill Simmons remarked that if the stats are right, why is it that the two times this has come up in the last three years, the 4th down decision was wrong?
People are much more likely to forget the times when it works out, because it does not become a big controversy. That's the availability heuristic.
I've also heard several sports pundits say that since Atlanta had just held the Saints to a three and out, that means they should have trusted their defense. That would be the recency effect (if Atlanta had held them to three and outs for most of the game, perhaps that would've been more appropriate to consider).
While the numbers and theory here are correct, I think there is too much variance in the data, as others have pointed out, to make any decisions strictly based on the numbers. You would at least have to factor in some adjustments for if a team is above or below average or use team stats. Unfortunately there isn't enough data from just a single team.
Over at Fangraphs, they analyze bunts all the time. Since baseball has a much larger sample size, they use the individual reliever's and hitter's splits rather than the league average. They then use the average bunt success rate and ad-hoc adjust for the batter's skill. This gives us a much better picture of what is actually happening, rather than a generic league average situation.
I think it might be better to use league-regressed opponent-adjusted individual team data when computing these situations. This would at least give us a better sense of what the success rates would be. The other part of the equation, WP, is a different story.
@probablepicks,
Simmons also brought up a complaint he's made many times before—that statistical analysis took all the fun out of baseball, and now it's threatening to do the same with football. He might be right about baseball. I've pretty much stopped watching, in part because of all the pitching changes and four-hour games, which I'm guessing are at least somewhat related to increased reliance on statistics.
But wouldn't an analytical approach to football make the game more fun, not less? Where baseball statisticians frown on stolen base attempts (which are always exciting) and high pitch counts (which keep the games moving), football statisticians frown on punts and field goals, which nobody likes anyway. And if the complaint is that reliance on statistics removes the element of instinct-based decision-making, that's also not a concern with football. Sample sizes will always be smaller, and, as the comment threads on this site have proven, there will never be a shortage of outside factors people can cite to support deviating from the numbers in one direction or another.
I understand, to some extent, the blind adherence to tradition we see in a lot of coaches and pundits (I'm sure I fall into that trap myself from time to time), but it baffles me that someone like Simmons—who generally favors intelligent, creative innovations, especially those that would make sports more entertaining—can't get behind applying basic analytical tools to football.
Simmons is A LOT more of a traditionalist than he sometimes pretends to be. He makes a lot of noises about respecting the "stat heads", but really he is just a pundit like most other NFL blow-hards.
He likes stat heads when they agree with him and can support his gut/wild speculation, and dislikes them when they disagree with his gut/wild speculation. That is the entire pattern to find there.
That his opportunistic use of stats is seen as being "stats friendly" is just a sign of how math phobic and traditional much of the sports media still is. In some ways he is almost worse because he is clearly smart enough to understand, but just lacks that module in the brain that requires his beliefs actually be supported by evidence.
I think it is so funny that people assume Brian Burke doesn't think coaches should adjust for the offense/defense. They just so badly misunderstand what people like him are saying.
As a point on some of the 'adjusting for team specifics' comments, I've had a look into the numbers in my database to see what effect efficiency stats have on 4th and 1 conversion percentages.
Turns out the biggest driver of success is the defense rushing yards allowed per play. As an example, 4th and 1 attempts against teams that allowed 3.5 or fewer yards per rush were successful 61% of the time, compared to 70% of attempts against teams that allowed 4.5 or more yards per carry (for some reason, my numbers show that a 4th and 1 involving two average teams would be successful 70% of the time, lower than Brian's average of 74%).
Looking at the current 4th and 1 that everyone is talking about, plugging the efficiency stats for Atlanta and New Orleans into my model, the outcome is a 75% chance of converting, well above average.
Brian,
Do you have a set of reasons to provide to "football" people when they ask about this. The points "football" people would make in a discussion like this would be:
*NFL defenses can form a brick wall at the point of attack when they NEED to on one play that will likely decide the game...with what amounts to an 11 on 9 edge because the QB isn't blocking for the RB...or the RB stays out of the play on a QB sneak. The bigs lock up the bigs...others fill the gaps...and the QB or RB are basically stood up behind the line of scrimmage. I know everyone can visualize this because we see it all the time on running plays.
*NFL Run defense stats overall are often based on choices teams make about how they generally defend the whole field...and aren't a good indicator for what a defense will do on one play when they know a run is coming. "Bend but don't break" defenses have different ypc allowed averages than others who stuff the line generally...but EVERYONE knows how to stuff the line when the time comes, so it doesn't matter how many longer runs came on other plays when defenses were more spread out.
*Not all 4th and one's are created equal...particularly when in garbage time, or when a team is right on the cusp between field goal range or having to punt in the second or third quarter or something. So, "game on the line" instances are pretty rare...and including all the other 4th and one's don't have any sort of prediction value for when a game is truly on the line and a defense can go 11 on 9 to build a wall.
The above article seems aimed at discussions with radio show callers...but many football people are very skeptical about this analytical framework (saw Mike Golic give it a big raspberry Monday afternoon on ESPN, and he has more experience at the point of attack than us statheads do. He was representing the majority position from former coaches/players as best I could tell).
The football people who attended the Sloan conference when this was discussed re NE/Indy were very skeptical of the stat position...and the skeptics represented the elite of NFL performance and strategy (Polian of Indy, for example, who's been way ahead of the field in clock management, was skeptical).
What compelling evidence do you have for NFL decision-makers (the likes of whom attend the Sloan conference to see what statheads are doing) that their knowledge of how defenses build brick walls on rare "game on the line" running plays should be overruled by numbers that include plays where brick walls weren't being built with "game on the line" intensity?
Jeff Fogle and Stevekirsch have it right.
A mathematical formula like this may give you somewhat of a general idea what to do but shouldn't be the end-all to decide.
When are most 4th downs gone for? In garbage time, when the losing team is desperate and the winning team has a huge lead, and the goal of the winning team is not necessarily to stop the losing team on 4th down but to protect against a long pass play that leads to a quick score.
Those situations are completely different from Atlanta's situation, and their results cannot be of much help in looking at the situation in this game.
Mike M - See Brian's post on 4th down runs:
http://www.advancednflstats.com/2011/11/qb-sneak-vs-rb-dive.html
Not sure what the numbers are for pass, but here are the conversion numbers for rushing:
3rd and 1 - 72%
4th and 1 - 72%
Jeff - If defenses truly had the ability to build a "brick wall" when the game is on the line, why don't they do that on every 4th and inches? That makes no sense.
Mike B.
If the game isn't on the line, you don't want to sell out and risk putting the game on the line by allowing a big play on a surprise pass...or in the instances where the runner breaks through the brick wall anyway (not suggesting there's a 0% success rate when defenses try to build the brick wall...but if the game is on the line, the team on defense is probably going to lose if he breaks through anyway---not at all the same thing for a defense sitting on a 10-point lead and the ball near midfield or something).
NFL defenses have different approaches when an opponent isn't within one score. NFL defenses have different approaches in the second or third quarters than they do in the final moments of a close game or in overtime when the immediate impact of a play carries so much weight on a result.
Using "all" fourth and short plays to create expectations for this specific fourth and short example is stirring the full spectrum of situations into mud. It hides the realities in play rather than describing them.
And "debunking" the poor arguments of people who never read "Innumeracy" dodges the best arguments coming from the people who are most closely involved in fourth down decision-making.
Jeff,
There is no reason to assume a 4th down play must always be a sneak or a dive and the defense knows this so sells out to stop it 100%. If that was the case every 4th and 1 play would be a play action to a TE for the first until defenses changed. You cannot have it both ways.
Optimal strategy certainly includes having some play action and general pass plays on 4th and 1.
Jeff - if defenses don't want to 'sell out' normally against a big play, does this mean that when they deploy the 'brick wall' defense on the crucial 4th down play they ought to be susceptible to the big play?
So ideally, the offense should, with the game on the line, run a play-action or something else to go down the field, well away from 'the brick wall'.
@Jeff Fogle,
You're probably right that "many football people are very skeptical about this line of analytical framework", and you're definitely right that "not all 4th and one's are created equal", but that's an argument for improving the analytical framework, not dismissing it. If Golic or Polian or whoever else wanted to know if the numbers are any different when the data is more closely tailored to a particular situation, they're more than welcome to do the math themselves. The fact that they don't seem interested in trying tells me their real complaint is with something other than the methodology.
That said, I think your "brick wall" point (like a lot of "but this situation is different so the numbers don't apply"-style arguments) is flawed because it only looks at the strategy considerations from one perspective. Yeah, on a crucial 4th-and-1 the defense can go all-out to stop the run down the middle, and that would decrease the offense's chance of converting with a run down the middle, but it would also greatly increase the offense's chance of converting with a run to the outside or a pass. Just as the Falcons' 4th-and-1 strategy looks dumb in retrospect because it didn't work, the Saints' strategy looks smart because it did work. But the Saints were taking a huge risk—if the Falcons had done anything other than run down the middle, there's a decent chance they would've won the game on that play.
Ah, I see Anonymous and Ian beat me to it. Should've reloaded first.
The biggest takeaway I can see from situations such as these is that the decision to go for it is often pretty close (.47 WP vs .42 WP in the case of the Atl/NO game). The key to being a successful coach over the long term is not always getting the tough decisions right; it’s always getting the easier decisions right, where the EV difference is greater. These individual decisions may have a smaller impact on an individual play, but there are far more of them and as a group they have a much larger impact on the game. The tough decisions are more noticeable because we can match a single play to an outcome, but they are only moderately relevant to long term winning. Tough decisions are tough largely because they are so close. It’s fun to analyze them (and to make fun of coaches for being nitty and generally risk averse), but I find that they are often blown out of proportion. If there weren’t tiny mistakes earlier on, then perhaps there would be no need to talk about the large mistake at the end. So, given the variability in using team specific data vs. league average and other noise, the takeaway should be that if it’s close then either decision is probably fine. It’s screwing up the ones that aren’t close that is going to kill you, even if they don’t make Sportscenter.
In other news, Brian thanks for the site and all your work. I really appreciate and look forward to your insight and analysis.
Appreciate the responses:
Agree with the general theme that offenses should spread things out more to reduce the brick wall building. You'll see the smartest teams do that (remember how often Brady was throwing TD passes to an eligible lineman in goal line situations awhile back before defenses adjusted?)
If the offense is in a clear run formation (not with everyone spread out), and has a history of trying to be macho...then the defense is much less afraid of a pass. That's where game theory comes in (saw that mentioned somewhere along the way). Atlanta didn't create much mystery in this current example. They did put a receiver in motion...but he then stopped within the box making it clear that he wasn't an option (and created even one less blocker because he was behind the offensive line so the guy responsible for him could fill a gap).
I think this is at the heart of much of the negative response to the play. People saying things like "I don't like going for it....but if you are going to go for it...that's a HORRIBLE play to run." No disguise, and New Orleans was able to build the brick wall.
I think the ideal "minimal" sample would be "game on the line," "offense in clear run formation." Start from there and build up. I don't think looking at all fourth-and-ones in all parts of the field in all game situations helps much here.
And, I don't think it's the job of Golic or Polian to "do the math" to debunk an advanced stats site that's making a claim that is influenced by irrelevant samples in its sample size. (Polian does a lot of math by the way, he's been way ahead of the curve on clock management for years...FO's after the fact "estimated wins" undershot their actual win totals for every year of the Dungy/Caldwell/Manning combo..which B. Barnwell later said in a Grantland article was statistically significant after 4 years while it lasted about twice that--going from memory, something like that). It's the job of the advanced stat site to pin things down as accurately as possible given both statistical and "real football" indicators...and to humbly remember what Taleb said about the value of models in "The Black Swan."
Appreciate the conversation. I think most would agree here that offensive creativity is important on fourth and one...particularly when the game is on the line. A play-by-play breakdown of the samples with formations and play results would help pin that down. Maybe the ultimate lesson will be something like "Going for it was the right call mathematically for Atlanta if they spread things out to keep the opposing defense honest, but was mathematically a horrible call if all they planned to do was hope their back could bust through what became an 11-8 blocking disadvantage with the receiver in motion stopping behind the line of scrimmage. Because NFL defenses have the ability to create a brick wall when needed, you need to dissuade them from doing so."
Mike Golic might know a thing or two about stopping a 4th and 1. But why on earth would anyone ask him when to go for a 4th and 1?
"Football" guys like Golic are interesting to listen to about things like what's it like to be a player, to be injured, to be in a locker room, or to take a hit. But for the life of me, I'll never understand what makes guys like him think they are experts on anything beyond tackling people.
Everyone agrees the coach needs to "play the percentages" in situations like this. The particulars matter, but only at the margin. Everyone pretends to know what the percentages are, but they are kidding themselves. I'm just the guy giving you "the percentages." Ignore them at your own peril.
All the groaning and flailing by guys like this is just BS denial.
Wow...
Who better than a defender who's played 4th and 1's to ask about what works and what doesn't work on 4th and 1?
Football isn't just tackling. It's schematics, assignments, working as a team to attack vulnerabilities in the offense. Football defenders are experts on a lot of things besides tackling people.
The percentages you're giving may be strongly influenced by stuff that has nothing to do with this particular situation. A small subset of relevant data...merged into a much bigger set of irrelevant data doesn't lead to better information.
Characterizing critiques from people who live inside a sport as "BS denial" when they're talking about assertions from outsiders that are polluted for various reasons represents a stunning lack of perspective.
BS denial?
If that was a joke I didn't get, I apologize. Otherwise...a stunning post. Disheartening in the extreme...
I didn't start as clearly as I could have. You were asking about "when" people should go for it. I think NFL defenders who have been involved in various 4th and 1 situations through their careers might have a sense of when defenses are best suited to stop one individual play when they HAVE to stop that play...in addition to knowing how to tackle.
Jeff, I'd probably ask the offensive coordinator, the head coach, an offensive lineman, the quarterback, the running back... basically anyone who has had to either design or run a play. Does Golic know what his weaknesses are as a player/defender? Maybe... but I'm confident his opposing offensive coordinator does.
Also, if some random fighter pilot came out and said something completely radical against the consensus but had the data to back it up I'd give him a listen. Then instead of dismissing him outright or saying his data is bunk I'd either have to ignore it or come up with the data to refute him. Saying, "No, I don't think that applies because that's what I think" isn't going to cut it.
A quick question about the money analogy: shouldn't we take utility into account?
For example, if I were to ask the average person whether they would take a guaranteed one million dollars or a 50/50 chance of two million, most people would take the one million. This implies that these two options are not actually equal, and that the utility of the guaranteed one million is greater than the expected utility of the 50/50 shot at two million - diminishing marginal utility.
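To make the diminishing-marginal-utility point concrete, here is a minimal sketch. It assumes a logarithmic utility of total wealth and a hypothetical starting wealth of $50,000; both are illustrative assumptions, not anything taken from the article or from real survey data.

```python
import math

BASE_WEALTH = 50_000  # hypothetical starting wealth, for illustration only


def log_utility(wealth):
    """Assumed concave utility: each extra dollar is worth a bit less than the last."""
    return math.log(wealth)


def expected_utility(outcomes, base_wealth):
    """outcomes: list of (probability, payoff) pairs; returns probability-weighted utility."""
    return sum(p * log_utility(base_wealth + payoff) for p, payoff in outcomes)


def expected_dollars(outcomes):
    """Plain expected value in dollars, ignoring utility."""
    return sum(p * payoff for p, payoff in outcomes)


sure_thing = [(1.0, 1_000_000)]
coin_flip = [(0.5, 2_000_000), (0.5, 0)]

eu_sure = expected_utility(sure_thing, BASE_WEALTH)
eu_flip = expected_utility(coin_flip, BASE_WEALTH)

# Both options are worth the same in expected dollars,
# but the guaranteed million has the higher expected utility.
print(expected_dollars(sure_thing) == expected_dollars(coin_flip))
print(eu_sure > eu_flip)
```

Under these assumptions the two options tie in expected dollars while the sure thing wins on expected utility, which is the preference described above. A different utility curve or starting wealth would change the size of the gap, though any concave curve gives the same ordering.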
Jeff - Formations and schemes aren't timeouts. You don't have to save them until the end of the game. Are you saying that defenses deliberately play sub-optimally on "normal" third downs or fourth downs? Just because they don't HAVE to make a stop?
Besides, in the ATL-NO example, NO didn't HAVE to make a stop. If ATL converted, NO would have still had a decent shot at winning the game. Not that it matters from the standpoint of how likely the conversion on 4th and 1 was. What matters was ATL's choice of play, NO's chosen defensive scheme and personnel, and (of course) execution on both sides of the ball.
Anon: this has been discussed. WPA all has the same utility because you can only use it for one thing: to get to a win. You can't save .55 WPA until the end of the season.
That said, if a team is vastly superior, they may want to take a slightly less profitable option in terms of WPA in favor of decreasing variance, while an inferior team may want to take the higher variance, but more profitable option. However, that seems like a consideration that would only exist in earlier-game situations. Once you get to 4th quarter or overtime in a close game, the more profitable option is the way to go even if it is higher variance in terms of WPA. The teams are too close and the number of decisive plays left too small to leave any potential WPA off the board.
"Everyone agrees the coach needs to "play the percentages" in situations like this. The particulars matter, but only at the margin. Everyone pretends to know what the percentages are, but they are kidding themselves. I'm just the guy giving you "the percentages." Ignore them at your own peril."
This is always the point of highest comedy/irony in these discussions.
Multiple times when this comes up you will see traditionalists construct arguments in the following way:
1) I cannot agree with those stat geeks (guys figuring out the percentages). You have to do the traditional thing.
2) Appeal to authority/expertise.
3) Claim about how this particular situation was different from every other situation ever.
4) Claim that you have to stick to the percentages and do what works (as opposed to the stat heads who are clearly arguing you should go with your gut and not do what works?!?).
The whole time completely oblivious to the fact they just contradicted themselves in the course of about 15 seconds.
It is the same kind of people who won't believe the Monty Hall Problem outcomes unless you demonstrate them with a deck of cards. They don't understand math and so they don't believe it.
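For anyone without a deck of cards handy, the Monty Hall result can be checked by brute force. This is a small sketch that enumerates every equally likely combination of car location and first pick under the standard rules (the host always opens a goat door that is not the contestant's pick). The function name is just a label chosen for this example.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def monty_hall_win_rates():
    """Enumerate every equally likely (car, first pick) pair and count wins per strategy."""
    stay_wins = 0
    switch_wins = 0
    cases = list(product(DOORS, DOORS))
    for car, pick in cases:
        # Host opens a door that is neither the contestant's pick nor the car.
        # When pick == car the host has two options; which one he opens
        # does not change whether switching wins, so we take the first.
        opened = next(d for d in DOORS if d != pick and d != car)
        # Switching means taking the one door that is neither picked nor opened.
        switched = next(d for d in DOORS if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    total = len(cases)
    return Fraction(stay_wins, total), Fraction(switch_wins, total)


stay, switch = monty_hall_win_rates()
print(stay, switch)  # 1/3 2/3
```

Because it counts cases instead of sampling them, there is no randomness to argue about: staying wins in 3 of the 9 cases and switching wins in the other 6.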
Actual anon- the phenomenon you are talking about is not decreasing marginal utility. It is loss aversion. There has been a ton of psychological research done on this (which unfortunately I don't have any links to right now, but I'm sure you can find w/ a quick Google search)
To James: yes, ask everybody. Get as much relevant information as you can. Raw numbers by themselves don't tell nearly the story that some think they do. Very important to talk to people on both sides of the ball.
And, if a random fighter pilot was telling you about the general back-to-back tendencies in the NBA without accounting for Game Two being in Denver and involving a short-handed visitor...then yes, his data should be dismissed. It includes a lot of irrelevant pollution involving back-to-backers that weren't at altitude and didn't involve shorthanded teams. Same thing if he's talking about strikeouts in a Milwaukee Brewers home game using league-wide data that doesn't acknowledge how poor visibility gets when the roof is shut there. A commentator can dismiss bad data outright when it's clearly not accounting for key elements that are in play...he can say "get back to work and do a better study" without being obligated himself to do the study.
Mike B: What's "optimal" changes with the situation. It's "optimal" to not risk giving up a big play when you lead by 10 points, so you live with giving up a little play. When the game is on the line, what's optimal changes.
It's true that New Orleans wasn't life and death if they didn't stop that play. But, it was probably pretty close. Saints stop, and they're already in field goal range. Falcons make it, and they're 35-40 yards away from a field goal try against a tiring defense that's been on the field for 16 of the last 19 plays and 29 of the last 38 plays. Importantly, the Saints knew they could basically WIN the game with a stop...which became easy when Atlanta tried to run into an 11-man brick wall.
To anonymous: seeing that bad samples are part of a study doesn't represent people who "don't understand math." People who do understand math are having a problem with the percentages being meaningful here...in addition to people who played or coached football at the NFL level.
Brian, I love that you were a fighter pilot, b/c when a rambunctious commenter says "random fighter pilot," it sounds so dang cool. I wish I could be a random fighter pilot. Or even a not random one. It might actually be demeaning if you were a random person, or a random actuary....but when he says fighter pilot, it just makes me want to listen and believe you. :-)
Jeff Fogle has it right. The priority of the defense changes based on game situations.
In order to have meaningful data one needs to have similar game situations where the defense has the same priority.
If a team leads by 17 pts in the 4th quarter with 7 minutes left and its opponent goes for a 4th and 1, the priority of the defense is different in this situation than in the Atlanta/NO game situation.
But because this situation makes up many 4th down situations, this data is not as meaningful as it should be when applied to the Atlanta/NO game situation.
Of course the team ahead by 17 pts is trying to stop its opponent on 4th and 1, but its priority is to protect the entire field against a big play or quick score. The goal is not to build a brick wall but to force time off the clock as the opponent marches downfield, because then it would be almost impossible to lose the game.
I have no clue why others can't understand this simple principle of how NFL games are played.
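The arithmetic behind this objection can be sketched as follows. The counts are invented purely for illustration and are not real NFL data. Whether actual 4th-and-1 results split this way is exactly what is in dispute in this thread.

```python
def pooled_rate(strata):
    """strata: dict of situation -> (attempts, conversions); returns the overall conversion rate."""
    attempts = sum(a for a, _ in strata.values())
    conversions = sum(c for _, c in strata.values())
    return conversions / attempts


# Hypothetical counts, made up only to show how pooling works.
fourth_and_one = {
    "defense protecting a big lead": (300, 240),  # 80% converted in this made-up bucket
    "game on the line": (100, 60),                # 60% converted in this made-up bucket
}

overall = pooled_rate(fourth_and_one)
close_game = pooled_rate({"game on the line": fourth_and_one["game on the line"]})

# The pooled figure sits between the two buckets, pulled toward the larger one.
print(overall, close_game)
```

With these made-up numbers the pooled rate is 75% while the game-on-the-line bucket alone is 60%. The sketch only shows that a pooled average can differ from a subset's rate when the subsets differ. It says nothing about how large that gap is in real games, and the smaller bucket also carries more sampling noise.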
Jeff: correct me if I'm wrong, but you seem to be suggesting NO were more motivated because they knew that a stop would basically win the game. But you ignore the same factor applying to Atlanta. If that's what you're saying, then knowing that a failure essentially loses you the game surely provides as much of a boost to Atlanta as it does NO.
All that happened is they gambled on a conservative play call from ATL. If it had been anything other than a run up the middle, any kind of misdirection, we probably wouldn't even be talking about it.
In fact Ian, research suggests that one who loses x amount (a loss) will lose more satisfaction than another person will gain satisfaction from a windfall of x amount (a win). So, if there is an emotional edge here, it is all Atlanta's. As they would be more primed to avoid the loss than the Saints would be to gain the win. (In other words, not only should we question Jeff's application of emotion to the situation, but if anything, should view it in the reverse manner). And that is now two comments in the same thread where I've brought up loss aversion. So, here's the wikipedia link this time that discusses some of the major relevant studies.
http://en.wikipedia.org/wiki/Loss_aversion
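As a rough illustration of what loss aversion means numerically, here is a sketch of the prospect-theory value function using the commonly cited parameter estimates from Tversky and Kahneman's 1992 paper (curvature 0.88, loss-aversion coefficient 2.25). Applying it to how motivated a football team feels is this thread's extrapolation, not something the function itself establishes.

```python
ALPHA = 0.88   # curvature for gains and losses (commonly cited 1992 estimate)
LAMBDA = 2.25  # loss-aversion coefficient (commonly cited 1992 estimate)


def prospect_value(x):
    """Subjective value of a gain (x >= 0) or loss (x < 0) relative to a reference point."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * (-x) ** ALPHA


gain = prospect_value(100)
loss = prospect_value(-100)

# A loss of 100 weighs more than a gain of 100, by a factor of LAMBDA.
print(abs(loss) / gain)
```

With equal curvature on both sides, a loss of a given size is felt about 2.25 times as strongly as a gain of the same size under these parameter values.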
This seems like a good place to add a comment about a Bears move. Ahead 31-20 with exactly 2 minutes to go, and San Diego now out of timeouts, they faked a punt on 4th and 8.
The intended receiver was wide open, but punter Adam Podlesh threw a little high.
The announcers panned the call, but I actually liked it. If the play succeeds, you can run out the clock and definitely win.
Even if it fails, the Chargers need to score a TD, recover an onside kick, and get another score to force overtime (or 2 TDs to win). But the probability of success on the play, especially given the surprise element, is likely quite high.
Of course, given the score and time remaining, it's likely that the call makes very little difference. As it happens, Phil Rivers threw an interception on the very next play, and the Bears still ran out the clock.
So it may not get much press...
The decision to go on fourth down is not only motivated by the likelihood of converting the fourth down. NFL coaches tend to be risk averse, because taking risk and failing gets you fired. The coach doesn't get fired for missing the 22-yard field goal. The coach might for going on fourth-and-1 after getting stuffed three times in three previous plays.
Phenomenal article. I will definitely be stealing some of these quotes for my "conventional wisdom" friends who think that because the Packers can't run the ball, they should always punt on 4th and 1.
Wow! I just found this site. Great read and good info!