Last Thoughts on the 4th and 2

At the risk of being accused of milking this thing, here are a few final thoughts on the topic. I watched the Football Night in America segment and a few things struck me.

Let me say I have a lot of respect for Coach Dungy. I also think that convincing someone skeptical is difficult, and we shouldn't expect someone to immediately come around. In general though, coaches (and former players/analysts) benefit from a perception that football is some unknowable mystery, and that they are the only priests that can divine the true answers.

From Dungy's perspective, he's enjoyed a long career by doing it the conventional way. But the reason his way worked was that everyone else played the same way. To change his mind now might mean admitting, "I did it wrong all these years." That's a tough hurdle to overcome.

It starts with something like this--planting a seed of doubt that the conventional wisdom might not be so wise. A smart young coach will one day figure it out, maybe in college, and everyone else will be forced to catch up. It's amazing that this particular 4th down was a failure, but it has done more to advance the subject than any other 4th down play, including all the successful ones.

I was particularly struck by Bob Costas' remarks. Costas said the crowd at Lucas Oil Stadium went from jubilation to stunned silence when they saw the Patriots offense lining up for the 4th down. Costas said that, from their perspective, they would rather have accepted the ball 40 yds back than give the Patriots a good chance to end the game with a simple 2-yd gain.

Think about what that tells you about Belichick's decision. 'Do what your opponent fears most' might not be a bad way to make football decisions.

Now, to correct some bad info from Dungy and Dan Patrick: The 4th down conversion rates I used are based on data from 1st and 3rd quarters when the score is within 10 pts. That way 'desperation' and 'prevent' plays are excluded.

Also, as I made clear in the original post, my numbers are league baselines. But once you start increasing the Colts' chances of scoring from either field position, the go-for-it option just gets better and better relative to punting. The stronger you gauge the Colts offense, the better the decision becomes.

I talked with a very well-known reporter at length on Friday, and he's still very skeptical. And that's ok. He was very interested in all the mitigating circumstances surrounding the play--the personnel, the events earlier in the game, the timeout, the personal histories. And that's great. Those considerations matter to some extent, but they're all relative to some baseline. There needs to be something concrete underneath all those other extenuating factors, otherwise there is nothing off of which to make those adjustments.

You might not buy my numbers, but the battle has already been won. Even the skeptics are now just arguing over what the particular numbers should be. They've unwittingly accepted the framework that there are numbers, and there is a way to use them to help make the best decision.

Lastly, let me say I am not a rocket scientist despite being called that on two different networks today. That was just my major in college. I consider myself a tactician, not a statistician or mathematician. My day job is as a naval tactics consultant. To me, stats and math are tools for better decisions, not ends in themselves. That's one thing I'm trying to get across--you really don't need fancy calculus, just some percentages and an open mind.

77 Responses to “Last Thoughts on the 4th and 2”

  1. Anonymous says:

    Tell Mr. King hello! (just taking a guess...)

  2. Anonymous says:

    In the end it was still one of the dumbest calls ever. That's why Bill has made the punt call in that situation every single time until that game, and he's had much better success with the punt call.

    Situationally it is still 0 for 3 in the past 20-25 years. That's a 0% win percentage for the team that goes for it. Small sample size yes, but still it hasn't been run successfully ever in that situation. Obviously recent rule changes like the defensive headsets have driven down short yardage success in these types of situations, which is ignored. I don't care what the success rate was 40 rule changes ago in 2003. I care what they are today.

    And again, what makes the call even worse is taking the lower percentage play (pass over run).

    The NFL has always accepted that there "are numbers". The game is predicated on it. But taking baseline numbers such as those used here without regard to different situations, which are much more important, is lunacy.

    By your own account, in a 7-0 game, facing Jamarcus Russell and the Raiders, on your 2nd string QB, with Ray Guy as your punter, this is the right call statistically. By your own account, even if the team calling the play is 0-30 that season in short yardage conversions, this is the call that will statistically give them the best chance for the win.

    HUGE PROBLEM.

    The situation has so much more to do with whether the call was "right" or "wrong" and in this place, it is that reason that the experts who have lived the game call it the wrong call. 9% chance? That is nothing in comparison to who was out there and what they were doing.

    Yeah Costas said the crowd quieted. Put a guy out there with 200 cartons of cigarettes, and have him smoke for the next 15 years, no one cares. Put a guy out there playing russian roulette once, everyone will tune in. Granted the guy smoking runs a lot better chance of dying first. But the one play method definitely creates the tension.

    Do what your opponent fears most??? How'd that work out for Bill? How'd that work out for Cutler this year? How'd that work out for Barry Sanders for 10 years? How about do what wins.

    Will a coach do this one day? Probably not. Maybe he tries it and he wins a couple superbowls off the bat. But the likelihood is, if he makes a few playcalls like we saw Belichick make, by season's end he's no longer a head coach.

    Especially when these once a season type plays that have so much success become commonplace for that coach. It doesn't take long for the NFL to make adjustments. Yeah, a back may average 6 yards a carry on draw plays. Give up the regular running plays for only draws, and watch them catch on and that back's numbers fall.

  3. Anonymous says:

    Brian Burke's analysis is convincing. The data he uses is qualified to consider game scenarios that are relevant to end game decision making and finds the odds are in favor of going for it more often. If these types of coaching moves involve tradeoffs between field position and ball possession, Brian's research suggests possession becomes the more attractive option later in the game. My only concern is that the data may not account for home field advantage. If it's true teams are less successful converting 4th downs on the road, the advantage Belichick's "gamble" may have had could evaporate. Any thoughts on this?

  4. James says:

    wlcarriv beat me to it, but I had the same guess as him.

    Anonymous's post hardly deserves the response, but I enjoy how he ignored that it was Tom Brady under center and how they averaged 6.9 yards per play that game. Oh well. As Brian said, people are rarely convinced immediately.

    Keep fighting the good fight!

  5. Zach says:

    If for nothing else, I wish the fourth down conversion would've been successful so we could see Dungy and Dilfer and Rodney Harrison praising Belichick for his guts.

    Right after the game ended, Dilfer said there was "no way" he'd be saying that it would've been a good call had it converted...but we all know he would've changed his mind.

  6. Jeff Clarke says:

    The real question is how will people respond to the first situation where the opposite happens.

    You have the mathematical advantage for going for it. You punt and then lose.

    How vocal will the punt-first crowd be in arguing that just because we punted all the time before, we should punt all the time now without even bothering to analyze the situation?

    My guess is that some of the same people saying "should have punted" will be saying "should have gone".

  7. Anonymous says:

    I still don't get how punting was considered the clear best option. There were 2 mins on the board against the Colts offense. My guess is that it is considered better to lose passively than to take a chance to win.

    James,

    My comment to anonymous poster would have been, "how many superbowl winning teams have you coached for again?", but that's just me.

  8. zlionsfan says:

    Some people will never be convinced because their position is based on opinion. If you demonstrate that over time, decision A is better than decision B, and they say "but you just have to go with decision B", well, there isn't much you can say to that. If they're interested in judging a decision by the final result of the game, well, there's really no point in them reading any kind of analysis, because they don't need any. Jim Caldwell and Sean Payton are currently the best coaches in the game. "Anyone" can see that.

    For the rest of us, well, this is what we like to read, and I'm not so sure that Belichick's strategy will take that long to spread. Maybe it's just coincidence, but there were several fourth-down attempts today, and not all of them were in "by the book" situations.

  9. James Sinclair says:

    Zach,

    I respectfully disagree. If it had been successful there would've been some talk about how it was "risky," but it worked out. Dilfer would probably say Belichick got lucky; others would just shrug and say "I don't get it, but he must know what he's doing." And the whole thing would be quickly forgotten.

    There's no way this would be such a big deal if it had worked--the media is much more interested in criticizing than praising.

  10. Dylan says:

    I don't think the whole thing would've been forgotten. I agree the media loves criticizing, but it also loves heaping praise on the best. This only would've sent Belichick from God to Titan. He's also the only coach that can even get away with this type of decision without having his job being questioned. It's almost as if he did this just because he could.

  11. Becephalus says:

    The really horrifying thing about this whole fiasco is that it illuminates the quality of reporting and respect for the truth you get from the media in this country. These are huge businesses that must by their nature have some people in decision-making positions who understand math and science, and yet no one steps in and simply says "look guys you are wrong, stop misinforming people".

    I don't doubt for a second that if our political problems were as easy to analyze, the political talking heads would ignore the data just as blatantly as the football ones do.

  12. Jeff Clarke says:

    Becephalus,

    Do you ever read fivethirtyeight.com?

    It's basically the political version of this website. It's pretty good. They definitely do a better job analyzing the specific issues than the regular pundits. The writers are liberals and sometimes fall into advocacy journalism on issues, but their numerical analysis (i.e. who wins what state and why) is pretty objective.

    This is one of the areas where the internet is so helpful. The mainstream media needs to keep everything at soundbite level. It's just a lot easier to say "he disrespected his defense" than to explain a long math problem and how you got to all of the component variables (other long math problems).

    The internet and pages like this are putting pressure on the mainstream media. Twenty years ago, people would have accepted Tony Dungy's criticism "you need to follow the percentages" at face value. Pages like this one forced the follow up question: Exactly what percentages are you looking at. Which caused the response "The percentages are baloney". If you think about it, that is a major step back on his part.

  13. potter says:

    This isn't going to be popular, but here goes.

    Firstly, Brian, congratulations on the exposure; it's well deserved.

    However, your win probability of 79% for going for it on 4th down only applies to the long-term expectation. Therefore, you have to ask how often Belichick is going to get the opportunity to call such a play. If it's not very often, then you're hoping that long-term probability advantages will manifest themselves in a small number of trials.

    And there's no guarantee that they will. It's the flip side of trying to estimate a player's long-term average by watching a very small number of his attempts.

    This isn't just my hunch, it's basic probability. Take out any stats textbook and sooner rather than later you'll read something along the lines of

    "If you repeat a trial just once then knowledge of the average outcome will not be very useful. Over limited repeated trials the variability of the outcome is more important than the average".

    By simply using the long-term average to evaluate the play, other more valid tools are being neglected. And that raises the very real possibility that Tony Dungy was correct when he said that the play was too risky, even if the reasons he used to justify his statement were incorrect or badly articulated.

    potter.

  14. LamKram says:

    Potter: I don't understand what the number of trials or variance has to do with anything. Going for it gave the Pats a higher probability of winning than punting, but neither was 100%. End of story.

    Basic blackjack strategy says to hit on a 16 if the dealer shows a face card. Over the long run, hitting will win more often than standing. But hitting gives you a greater than 50% chance of busting immediately, while standing lets you sit back and see what happens to the dealer. If your whole stake was riding on that one hand, wouldn't you still hit?

  15. Anonymous says:

    NO! No, because if we had more info (e.g. had counted cards), we might know that forthcoming hands may be more favorable and elect to stand. This in effect keeps the bankroll alive for at least one more chance. That's exactly what the Pats gave up! They bet the house and LOST!

    The 4&2 analysis is very shortsighted and simplistic. More can be determined because it is near the end of the game with limited, but computable possibilities.

    The probabilities are incorrect. The empirical data is incomplete, and worse what was used is irrelevant to this exact situation.

  16. LamKram says:

    OK, then, to stretch the analogy to the breaking point...

    Counting the cards showed that the dealer was not likely to bust (the Pats' defense was running on fumes - punting the ball to the Colts would not definitely guarantee victory). Hitting (going for it) was the better option, even (especially) given the specific game situation.

    "The probabilities are incorrect. The empirical data is incomplete, and worse what was used is irrelevant to this exact situation."

    Brian has explained the sources of his probabilities over and over. His empirical data is based on hundreds of NFL games. He has explained convincingly how the numbers were relevant to this exact situation.

    And even if you throw out his probabilities and plug in "common sense" estimates (as long as you are being honest and reasonable with those estimates), the numbers STILL say go for it.

  17. Pat Laffaye says:

    (I'm the Anonymous poster between LamKram's posts)

    Well, respectfully I am NOT convinced.

    I think you've missed my point. Brian is looking at one and only one play in his analysis and I'll stand by my quote you referenced.

    What if I proved you and Brian wrong using the data supplied by Brian himself or the NFL Gamebooks going back 10 seasons?

    And I'll back up what I'm saying. If I'm wrong and can be convinced that "going for it" was in fact the correct play, I will do the following:

    I'll donate a full week of my time to a football-based statistical website such as Brian Burke's (this one obviously) or Ken Massey's in any way they see fit.

    I'm saying this crucial play is worth a much more in-depth look (i.e. beyond the one play); on the surface, the analysis presented thus far appears to have been done quickly with minimal research.

  18. Anonymous says:

    Do you weight more recent data greater than 5-8 year-old data?

    The numbers from the WP equation only need to be adjusted a little to make this a bad call, even based on Brian's faulty modelling which is based on faulty data.

    The 60% estimate is too high; 55% is more realistic and maybe still too high (Pats presently (2009) are converting 4ths at 50%, Colts allowing 46%, 2pt conversion at 44%, so 60% is obviously high based on recent data). This change in data from 9 and 5 years ago could be due to more attempts recently which have failed due to the "go 4 it" doctrine.

    53% for Colts to score from Pats 28 is too low; I believe Colts score from the Pats 28 2 out of 3 times so 67%. I haven't verified it, but I've read on another site that, this season, the Colts are 5 for 5 scoring TDs when starting drives on opponent's half.

    Making those two adjustments, we have WP=.69 "going 4 it" v. .70 punt. BAD CALL.
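
As a rough check on the arithmetic in this comment, the sketch below plugs both sets of figures into the simple win-probability formula that comment 23 writes out (WP for "go for it" = P1 + (1-P1)*(1-P28), WP for "punt" = 1 - P66). It is an assumption that this comment used the same formula; the function and variable names are illustrative only.

```python
def wp_go(p_convert, p_td_short):
    """Win probability when going for it: convert, or fail and still stop the TD."""
    return p_convert + (1 - p_convert) * (1 - p_td_short)


def wp_punt(p_td_long):
    """Win probability when punting: the opponent fails to score a TD on the long field."""
    return 1 - p_td_long


# Baseline figures quoted in the thread: 60% conversion, 53% TD from the 28, 30% TD after a punt.
baseline_go = wp_go(0.60, 0.53)
baseline_punt = wp_punt(0.30)

# This comment's two adjustments: 55% conversion, 67% TD from the 28 (punt figure unchanged).
adjusted_go = wp_go(0.55, 0.67)
adjusted_punt = wp_punt(0.30)

print(f"baseline: go {baseline_go:.4f} vs punt {baseline_punt:.4f}")
print(f"adjusted: go {adjusted_go:.4f} vs punt {adjusted_punt:.4f}")
```

Under that assumption the adjusted inputs give roughly 0.6985 for going for it against 0.70 for punting, a gap of about 0.15 percentage points, while the baseline inputs give roughly 0.79 against 0.70.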

    Moreover, the Punt WP is too low because Hanson definitely punts farther than 38 net. INDOORS. hello. Hanson averaged 44 net with a long 55 net from his own 20 that put the ball at the Colts 25. So from the Pats 28? Colts 20? 17? Hanson is a past pro bowler who can get off a 50+ net punt INDOORS with little effort. Sure he doesn't usually punt that far, because often that would mean a touchback. With Brady and a high-powered offense, Hanson is rarely punting from his own 10, 20, 30.

    Your .30 Colts probability to score TD does not account for TOs remaining as you stated in that article. The difference between 1 and 3 time outs with 2 mins and a 2min stop is vast. Peyton had to throw to the sidelines, the D knows that. Picked Peyton off twice in the 4th. Once on the drive previous to the last score, which included a bogus 38 yard pass interference penalty. Peyton threw a couple more ints against Baltimore, which brings me to Peyton's 2009 performance.

    Before the Pats, Peyton never threw a pass this regular season against a pass defense in the top half of the NFL. Look at the teams he played against. Most ranked in the high 20's or worse. The Pats had the 4th best pass defense before last week's game.

    I really wish you'd open your dataset up for peer-review. As it is now, it's just a black, hocus-pocus box from which you can pull any number you want with no way for skeptics to verify. It amazes me how many "mathematicians" have accepted your datasets without question. Same for Zeus: black box predictions.

    People understand the statistical concepts upon which your modelling is based, but they disagree with the relevance of baselines taken from league averages over 5-9 years.

    I also realize that 2 months ago you publically touted the success any coach who made aggressive 4th and short calls would achieve. Belichick's crash and burn result using your "go 4 it" doctrine must be embarrassing.

    It appears a re-evaluation of your "doctrine", instead of a condescending and patronizing defense of it, would be in order.

    Do you really think that Belichick makes the same call during the playoffs? During a Super Bowl? Ever again against the Colts?

    Note: I had trouble posting in Firefox. I typed out something almost twice as long and it disappeared when I went to post it. I did that twice, and almost gave up. A google search informed me that it is a Firefox/blogger issue and I should be able to post this using IE. Fingers crossed and I will see what I left out. It would be interesting to find out that the more technically savvy users of Firefox have been prohibited from posting responses here while the IE droolers have run amok through comment access bias.

  19. Zach says:

    I love how all the skeptics are using the 10 fourth-down attempts the Patriots have had this year over 9-10 years of data.

    To anonymous above: "I also realize that 2 months ago you publically touted the success any coach who made aggressive 4th and short calls would achieve. Belichick's crash and burn result using your "go 4 it" doctrine must be embarrassing."

    Are you serious? It was one fourth-down attempt. If he does that once a game (which is what Brian is talking about), then in the long run it will be more beneficial than punting.

  20. Anonymous says:

    I can't wait to see the "technically savvy" Firefox user two posts up flip a coin.

    "You *said* it was 50/50 heads and tails, but I flipped it once and it came up heads! How does your fancy math explain *that*? It's time for a peer-review, Brian. Open up that black box!"

  21. Anonymous says:

    I really like Pat's thoughts (even though I disagree with the conclusion)
    Variance is an important factor here in that we're not playing in a long run scenario. "Going for it", even if it results in a higher win-probability may not be the "right" call if you

    It's like a gambler who is simply trying not to lose his shirt at a casino. He should be betting full odds at the craps table, doubling down on his 11 against a dealer 10, but he can't take that chance because if he loses (which will happen a pretty high percentage of the time), he won't be able to play anymore.

    He can play a strategy of low variance, in which his expected outcome is lower, but he still has some chance to win and he's eliminated (or at least severely reduced) the chance of going broke immediately. He's taken himself out of the big wins, but also out of the big losses, which allows him to play longer (even with a lower expected return)


    I think the disagreement I see (at least among intelligent people, who don't preach the gospel of conventional wisdom) comes from the fact that they think of the goals differently.

    Belichick's goal was to maximize his chance of winning (which most of us seem to agree with), while most other coaches would have gone for a strategy to prevent the loss (a lower variance play).

    That's not incorrect in any specific instance (I'm sure Pat would agree that it is in the long run...), and thus we will never win a full week of his statistical service...

  22. James Sinclair says:

    "It's like a gambler who is simply trying not to lose his shirt at a casino. He should be betting full odds at the craps table, doubling down on his 11 against a dealer 10, but he can't take that chance because if he loses (which will happen a pretty high percentage of the time), he won't be able to play anymore."

    I've said this elsewhere on this site, and I'll say it here: if you're concerned that the cost of losing is greater than the value of winning, the appropriate decision is to not play the game in the first place. Once the game starts, you're either going to win or lose, so any decision that doesn't increase your chance of winning has to increase your chance of losing.

  23. LamKram says:

    Pat: I’ll give it a shot. I assume you agree with the basic math for calculating the win probability for going for it and for punting. That’s just probability 101 (if you don’t agree with the math, then there is really no way to convince you).

    I’ll write it this way:

    WP for “go for it” = P1 + (1-P1)*(1-P28)
    WP for “punt” = 1 – P66

    Where P1 is the probability of the Patriots converting the 4th and 2, P28 is the probability of the Colts scoring a TD from the 28 yard line (if the conversion failed), and P66 is the probability of the Colts scoring a TD from the “66 yard line” after a punt (where a typical punt would end up). This assumes that if the conversion is successful, the Patriots will win with 100% certainty. This is not strictly true. There is a small but finite chance that the Colts would still get the ball back and score a TD in the closing seconds. However, there is also a small but finite chance that following a failed conversion and a Colts TD, the Pats would get the ball back and score a TD or FG for the win. These factors should roughly cancel out.

    If you set the two winning percentages equal to each other, you can solve for the “break even” P1, the 4th down conversion probability where going for it or punting gives an equal chance of winning the game. And that is:

    P1be = (P28 – P66)/P28.

    If you think your actual chances of getting the 2 yards are higher than P1be, then you should go for it. Otherwise you should punt.

    According to Brian’s empirical data for teams in NEARLY IDENTICAL game situations as the Colts were in (i.e., needing a TD to tie or win with 2 minutes left to play from the X yard line), P28 = 0.53 and P66 = 0.30. This results in P1be = 0.43 or 43%. If the Patriots had a better than 43% chance of getting the 2 yards, then going for it was the right call. In fact, Brian estimates P1 to actually be 60% based on his data, so Belichick made the right call.
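
The break-even figure above can be reproduced with a few lines of Python. This is only a sketch of the formula as written in this comment, using the 0.53 and 0.30 inputs quoted here; the function names are illustrative and not from the original analysis.

```python
def break_even_conversion(p28, p66):
    """Conversion probability at which going for it and punting give equal win probability.

    Derived by setting P1 + (1 - P1) * (1 - P28) equal to 1 - P66 and solving for P1.
    """
    return (p28 - p66) / p28


def wp_go(p1, p28):
    """Win probability for going for it, per the formula in this comment."""
    return p1 + (1 - p1) * (1 - p28)


P28 = 0.53  # TD probability from the 28 after a failed conversion (quoted above)
P66 = 0.30  # TD probability after a typical punt (quoted above)

p1_be = break_even_conversion(P28, P66)
print(f"break-even conversion rate: {p1_be:.3f}")

# Sanity check: at the break-even rate the two options are equal.
print(f"go at break-even: {wp_go(p1_be, P28):.3f}, punt: {1 - P66:.3f}")

# With the 60% conversion estimate, going for it comes out ahead.
print(f"go at 60%: {wp_go(0.60, P28):.3f}")
```

The break-even rate works out to about 0.434, so any conversion estimate above roughly 43% favors going for it under these inputs.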

    “But…”, you might say, “Peyton Manning is not a typical quarterback. You can’t give him a short field!” In other words, P28 is probably higher than 0.53. Fine. But then if you increase P28, you have to also increase P66, right? If he’s better than average from the 30, isn’t he also better than average from the 70? So let’s assume that Brian’s numbers are basically correct, but add a “Peyton Manning Factor”, Xpm, to each probability. So that 53% chance from 30 yards is really 63% for P.M. and 30% is really 40% (or whatever).

    Then the break even P1be comes out to be:
    P1be = ((P28+Xpm)-(P66+Xpm))/(P28+Xpm)

    Or, simplifying:
    P1be = (P28-P66)/(P28+Xpm).

    You might notice that as long as Xpm is greater than zero (meaning Peyton Manning is better, not worse, than average), then whatever you choose for Xpm, the break even probability becomes LOWER. So going for it becomes an even better option. You can also apply the Xpm a different way – by multiplying (not adding) each probability by Xpm (Xpm = 1.2, say, meaning P.M. scores TDs 20% more often than average in each situation). In that case, the Xpm factor will just completely cancel out and you are back with the original equation (and P1be = 43%).
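
Both versions of the "Peyton Manning Factor" are easy to check numerically. The sketch below sweeps a few additive and multiplicative values of Xpm; the specific values tried are arbitrary choices for illustration, not figures from the original analysis.

```python
import math


def break_even(p_short, p_long):
    """Break-even conversion probability: (P28 - P66) / P28."""
    return (p_short - p_long) / p_short


P28, P66 = 0.53, 0.30  # baseline TD probabilities quoted in this comment

# Additive factor: the same Xpm is added to both probabilities.
additive_xpm = (0.0, 0.1, 0.2, 0.3)
additive = [break_even(P28 + x, P66 + x) for x in additive_xpm]

# Multiplicative factor: both probabilities are scaled by the same Xpm.
multiplicative_xpm = (1.0, 1.2, 1.5)
multiplicative = [break_even(P28 * k, P66 * k) for k in multiplicative_xpm]

for x, be in zip(additive_xpm, additive):
    print(f"additive Xpm={x:.1f}: break-even {be:.3f}")
for k, be in zip(multiplicative_xpm, multiplicative):
    print(f"multiplicative Xpm={k:.1f}: break-even {be:.3f}")

# The multiplicative factor cancels, so every value matches the baseline.
print(all(math.isclose(be, additive[0]) for be in multiplicative))
```

The additive break-even falls as Xpm grows, and the multiplicative one stays at the baseline value, which is the behavior described above.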

    The only way to make the break even probability higher than 43% is to assume that Peyton Manning is better than average from the 30 but average or worse than average from the 70. That is not a reasonable assumption. Equivalently, you might assume the Patriots' defense was better than average at defending a long field, but worse than average at defending a short field. Again, not reasonable.

    Bottom line: if you think the Patriots' chances of gaining 2 yards were higher than 43%, then going for it was the right option.

  24. Anonymous says:

    I believe this current controversy only demonstrates the weight of Habit in the minds of most people. If something is not a habitual norm, it is 'wrong'.

    Many aspects of NFL play could be improved if people were not so married to habit. I believe that any team in the NFL could make a huge improvement if they would teach their defensive players to strip the ball on every play, all the time, instead of focusing on tackling. But it is habitual to think only of slowing down the opponents' offense instead of getting them off the field NOW and without them getting a punt to improve their field position.

    I do not believe it is possible for a running back or receiver to consistently hang on to the ball if it is being smacked 2-3 times every play. And even if the defense is successful only 10% of the time in causing a fumble, this more than makes up for the extra yardage the offense makes due to less emphasis on tackling. So what if the offense goes for a long drive if they end up losing possession on a fumble? And with their players getting nervous about losing the ball, even more turnovers may result from their lack of confidence.

    Why is the QB option never run in the NFL, but it is run nationwide in college? Dangerous to the QB? USE MORE THAN ONE THEN. Have a running QB and a Pocket QB and alternate them all the time during the drive. A bit like what the Eagles have been trying to get Vick to do.

    The QB option is a real threat if you have a speedy QB. It certainly presents matchup problems which can overwhelm any defense, simply by creating more options for how the play can unfold to take advantage of defensive alignment.

    Why don't NFL linemen get rotated more frequently? Why play with winded 1st stringers and only use the backups if somebody gets injured? I would use a large rotating pool of linemen to keep fresh legs in the game at all times. Don't tell me this would not be crucial in the 4th quarter.

    Habit holds us back I think.

  25. Anonymous says:

    This reminds me of the internet poker revolution.

    Before the internet boom a guy could make some good money playing decent sound weak/tight poker.

    However with the boom came education, and the higher stakes games got tougher. And the older Weak/tight players who refused to adapt... became losing players.

    The NFL is similar to poker in that it is a game of exploiting small edges. I don't think a coach should base a decision on math alone (as there are a lot of intangible variables in football), however they should be familiar with the general percentages and factor them into their thought process.

  26. Anonymous says:

    So much silliness in here. Statistics from 5000 years ago for coin-flipping would be exactly the same as coin-flipping statistics today.

    Do you believe that the Lions or the Titans or the Rams can flip a coin as well as the Colts or the Pats?

    Take a look at NFL.com here: http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=OPP&offensiveStatisticCategory=null&defensiveStatisticCategory=GAME_STATS&season=2009&seasonType=REG&tabSeq=2&qualified=true&Submit=Go

    Team defense stats for defending against 4th conversions range from 11 to 75%, and you want to just plug 60% in because it sounds good, or because someone told you that's what the number is? No need to question that number's accuracy or relevance, just plug it in and let's go for it.

    If you really believe that coin-flipping statistics are in any way similar to NFL stats taken from several years past, then you cannot be reasoned with.

    Experienced, knowledgeable, skilled, and wise coaches take percentages into account when making decisions. They also take injuries, personnel matchup, weather, wind, fatigue, momentum, stamina, grit, and toughness into account when making calls. They don't just plug random numbers into an equation and call plays.

    Mr. Burke, if you do nothing else, please inform your fans that your NFL dataset is not based on the same rigor and probabilistic certainty as coin-flipping.

    Even that graph that plots the 2 minute drill is unreliable. Looks to me like the sample size might be too small and that is why it's choppy and you "attempt" to reduce noise by adding the linear approximation, which only distorts the .70 number plugged into the equation.

    And then Mr. Burke makes this admission:

    Brian Burke said...

    "I meant to mention that. No, unfortunately this does not factor in varying numbers of timeouts remaining. Consider it an average of a typical number of TOs left. We know more timeouts is better, and fewer is worse, but I can't say exactly how much with out a much, much deeper analysis."

    I applaud Brian and Zeus and others for bringing statistical analysis to the gridiron, but by your own admission, "much, much deeper analysis" is necessary before you, or anyone else, can "say exactly" what these numbers tell us.

  27. Anonymous says:

    addendum:

    I know that Zeus was trying to sell their system to NFL teams for $100K+, so they have an obvious reason to fudge the numbers and keep their data private. I've checked their site; more voodoo, hocus-pocus football somersaults.

    Zeus had the Rams as the greatest play callers (CCI index) and the Giants as close to the worst for the year the Giants won the Super Bowl. Brian had Pats 99% WP (to win) midway through the 2nd qtr, up by only 10 points against Peyton. Really? Get a grip.

    I do believe that Belichick made this call based on numbers, literature and sales pitches he has endured from Zeus & Co. There has been a fanatical push for more aggressive 4th & short calls for some time. Zeus even admits that's why they put their Zeus computer program together.

    Zeus Co was created by Bower, a physicist, and Frigo, a backgammon player, as a way to confirm their preconceived beliefs that play calling should be more aggressive.

    From NYT:
    "That is what inspired Bower and Frank Frigo, like Bower a backgammon player and a football fan, to begin working on a program in 2001 that would bring order to the thinking of fans who beseech coaches to go for it on fourth down and boo when the punter runs onto the field.

    “People second-guess coaches for the decisions they make, and people act like they know what they’re supposed to do,” Bower said recently. “The question we had is, do they really know what they’re talking about? We set about to see if we could build a model to answer these questions.”

    It appears that Bower held beliefs BEFORE looking at the data and modelling NFL dynamics.

    NYT:
    "Bower said the program seemed to confirm his belief that coaches were usually too conservative in calling plays."

    To design an unbiased system it would behoove the designers to begin without any preconceived beliefs about optimal 4&2 play calling. Reading their site, Bower and Frigo sound like 4&short zealots. Anyone who disagrees with their ZEUS computer is labeled idiotic, boneheaded, futile, flawed, weak, insecure, fearful, and a host of other niceties I don't care to recall. Read their Zeus computer site for yourself.

  29. Anonymous says:

    addendum:

    Finally, you should all recall how badly things can go when physicists are allowed to apply all their number-crunching, statistical firepower to areas outside the better-defined boundaries of physics and physical science.

    Newton, Einstein, Quantum Mechanics, String Theory - Fine.

    Statistically based financial models, designed by physicists, to reduce and virtually remove all risk from trading mortgage-backed securities, credit default swaps, and derivatives on Wall Street? Really, really NOT fine.

    Physicists statistically modelling NFL games? Almost worse than trillion-dollar Wall Street bank bailouts.

    I do have one other theory. Bower went to Purdue in Indiana and got his PhD in Astrophysics from Indiana University, so he has probably been gunning to take the Pats down a few pegs in the AFC, and concocted this bad science to fool Belichick. The Colts' owner probably paid him dearly to concoct this lunacy. OK, just kidding there. I'm really sick of this now, because I just realized that this lunacy likely cost the Pats an unprecedented, undefeated season.

    2007.

    "After the Patriots lost Super Bowl XLII by a mere three points, sportswriters everywhere began to look back on one particular play that might have cost the team their perfect season. After an impressive opening drive in the third quarter, the Patriots, then leading 7-3, stalled at the Giants' 31-yard line. Instead of sending in Stephen Gostkowski to kick a field goal, though, Coach Bill Belichick left his offense on the field to try to convert on 4th and 13. Tom Brady's pass fell harmlessly to the ground, and Belichick's reputation as strategic genius took a hit."

    "Even though the Patriots lost possession and eventually the game, it turns out that Belichick made the right play call. In fact, by calling for a pass attempt, he increased his team's chances of winning ever so slightly. How can we know that? Because of a sophisticated modeling program called ZEUS. Developed by Frank Frigo and Chuck Bower—a couple of champion backgammon players—ZEUS is a simulator that can take any critical play-calling decision in any NFL game and tell you which option results in the highest chance of a particular team winning the game. At the fork in the road that is the coach's choice of play calls, ZEUS can simulate hundreds of thousands of games based on the various possibilities. As it turns out, in the simulations where Belichick called for a pass as opposed to a field-goal attempt, his team won 1.2% more often."

    Bower, Frigo, Burke: Please keep your weird and very bad science the hell out of New England.

  30. feralboy12 says:

    In my experience, anyone who takes a skeptical viewpoint, against the grain, on a subject that people commonly get emotional about (like politics, religion, or football) encounters the sort of information-resistant thinking that Dungy and others are showing here. Long-held opinion always trumps numbers.
    That being said, I think looking at these percentages is useful in a general way in informing your decision, but I'm unsold on how meaningful past results are in a specific football situation. The arguments over the real odds of converting a 4th-and-2 are telling. The fact is, every previous 4th-and-2 involved a slightly different combination of factors (different personnel, for starters), and like any other situation with lots of variables and coupled components, those slight differences in initial conditions make for highly divergent results further down the line (sound familiar?).
    The specific circumstances matter more than historical percentages.
    As I wrote in my blog that nobody read, the circumstance that would matter most to me is having Tom Brady, Wes Welker, Randy Moss and Kevin Faulk...if history says it's not completely crazy (and it does) and I have what I believe are advantageous variables, I have no problem going for it. Brady needs two yards to win--who's betting against him? Of course, I'm not a rocket scientist, unless you count model rockets.
    Note to anonymous--if you alternate between a runner and a thrower at QB, aren't you maybe telegraphing your intentions just a little bit?

  31. Alex says:

    anonymous: please keep your weird and very bad posts and logical skills the hell out of anflstats comments.

    thanks in advance,
    alex

  32. Anonymous says:

    LamKram said:

    "“But…”, you might say, “Peyton Manning is not a typical quarterback. You can’t give him a short field!” In other words, P28 is probably higher than 0.53. Fine. But then if you increase P28, you have to also increase P66, right? If he’s better than average from the 30, isn’t he also better than average from the 70?

    So let’s assume that Brian’s numbers are basically correct, but add a “Peyton Manning Factor”, Xpm, to each probability. So that 53% chance from 30 yards is really 63% for P.M. and 30% is really 40% (or whatever)."

    No, the above is not correct. You cannot scale Peyton's TD% at the Pats 28 equally to Peyton's TD% at his own 30 or 25 or 20.

    A 10% increase over 30% would represent a 33% improvement.

    A 10% increase over 53% would represent a 19% improvement.

    That is fuzzy math.

    I may be wrong here, but that is how I see it.

  33. LamKram says:

    I think everyone is missing the point. The mainstream media talking heads are labelling Belichick's decision as among the worst coaching decisions in the history of football. Brian Burke's analysis shows that, making some reasonable assumptions based on historical data, the decision was the right one. Even if you disagree with specific inputs to his analysis, you have to admit that "going for it" wasn't ridiculous or stupid. At the very least it was a close call between the two options, and Belichick should not be vilified.

  34. Jeff Clarke says:

    " I just realized that this lunacy likely cost the Pats an unprecedented and undefeated season."

    One team plays unconventionally and goes 18-1. All the other teams play conventionally and go under .500.

    You could make a good case that the reason why they lost that one game was the calculated risk didn't work out. But you'd also have to acknowledge that taking those calculated risks probably had something to do with why they won the eighteen games they won that season.

    Belichick did basically the same thing in Atlanta this year. It worked and they won. Where was the uproar over "the worst call in history" then?

    There is something called "hindsight bias" and it's definitely present here. He picked A and won. That was the right decision. No argument there. He picked A and lost. It was so obvious he should have picked B.

  35. LamKram says:

    "No, the above is not correct. You cannot scale Peyton's TD% at the Pats 28 equally to Peyton's TD% at his own 30 or 25 or 20.

    A 10% increase over 30% would represent a 33% improvement.

    A 10% increase over 53% would represent a 19% improvement."

    Right, so instead (as I said in my post) you can apply the same multiplier to the two probabilities. For example, multiply by 1.2. Then 30% becomes 36% (a 20% improvement), and 53% becomes 63.6% (also a 20% improvement). Then the "Peyton Manning factor" just completely cancels itself out and you're back to the original numbers.

    In reality I believe that the improved TD probability for a better than average QB WILL be more dramatic for a long field than for a short field. To give an extreme example, consider 1st and goal from the 1 yard line. According to Burke's numbers, the TD probability for an average QB is about 70%. The best it could possibly be, for the greatest imaginable QB (say, the unholy, genetically engineered, cloned hybrid of Peyton Manning, Tom Brady, Bart Starr, Johnny Unitas, and Dan Marino) is 100% - a 42% improvement. On the other hand, 1st and 10 from your own 10 is a 16% TD probability for an average QB, but you can easily imagine a 32% probability for a phenomenally great QB - a 100% improvement.
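
    A minimal sketch of why a common multiplier cancels, under an assumed win-probability model that is consistent with the numbers quoted in this thread: a successful conversion is treated as a win, a failed one lets the opponent score with probability `p_short`, and a punt lets them score with probability `p_long`. The function name and the break-even framing are illustrative and are not taken from the original analysis.

```python
def break_even_conversion(p_short: float, p_long: float) -> float:
    """Conversion rate at which going for it and punting give equal win probability.

    Assumed model (an illustration, not the original author's code):
      WP(go)   = c + (1 - c) * (1 - p_short)
      WP(punt) = 1 - p_long
    Setting the two equal gives c = 1 - p_long / p_short.
    """
    return 1.0 - p_long / p_short


# League-baseline touchdown probabilities quoted in the thread.
P_SHORT, P_LONG = 0.53, 0.30

baseline = break_even_conversion(P_SHORT, P_LONG)
# Same multiplier on both probabilities: the ratio, and so the break-even, is unchanged.
scaled = break_even_conversion(P_SHORT * 1.2, P_LONG * 1.2)
# Same additive bump on both: the ratio rises, so the break-even falls.
shifted = break_even_conversion(P_SHORT + 0.10, P_LONG + 0.10)

print(f"baseline   {baseline:.3f}")
print(f"times 1.2  {scaled:.3f}")
print(f"plus 0.10  {shifted:.3f}")
```

    Under this model only the ratio of the two scoring probabilities matters, so a shared multiplier leaves the required conversion rate untouched, while an equal additive bump lowers it.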

  36. LamKram says:

    Of course, from the 1 yard line, the ability of the QB isn't as important as the team as a whole. So replace "QB" in my post above with "team" or "offense" and the point is still valid.

  37. Jeff Clarke says:

    "A 10% increase over 30% would represent a 33% improvement.

    A 10% increase over 53% would represent a 19% improvement.

    That is fuzzy math.

    I may be wrong here, but that is how I see it."

    I think you are wrong here.

    Let's assume that instead of adjusting the league averages for Manning, the Colts had Superman playing quarterback.

    Superman would obviously make the TD 100% of the time no matter where he played.

    So his increase would be

    47% over 53%...an 88% improvement

    70% over 30%... a 233% improvement.

    If you look at it this way, it's actually a lot easier to increase 10% from 30%, because you are farther away from the ceiling.

    Perhaps the better question is what percentage of touchdowns Manning would get that other QBs wouldn't.

    Superman would get 100% of the TDs others wouldn't get, so he'd get an extra 47% when others miss 47% and an extra 70% when others miss 70%.

    Manning isn't Superman. Let's say he is 25% better than league average (I just made up this number for the example...I don't know what the real number is).

    You would expect his % to increase from 30% to 47.5% (17.5% = .25 * 70%) from the 65.

    He'd only increase to 65% from the 28 (12% = .25 * 47%).
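
    The "share of the misses" adjustment in this comment can be sketched as follows. The function name is hypothetical, and the 25% figure is the made-up number from the comment, not a measured one.

```python
def adjusted_td_probability(baseline: float, share_of_misses: float) -> float:
    """Raise a baseline TD probability by converting a share of the failed drives.

    `share_of_misses` is the hypothetical fraction of otherwise-failed drives
    that the better offense turns into touchdowns (0 = league average,
    1 = never fails).
    """
    return baseline + share_of_misses * (1.0 - baseline)


results = {}
for label, base in (("long field", 0.30), ("short field", 0.53)):
    better = adjusted_td_probability(base, 0.25)   # made-up 25% figure from the comment
    perfect = adjusted_td_probability(base, 1.0)   # the Superman case: always scores
    results[label] = (better, perfect)
    print(f"{label}: baseline {base:.2f} -> 25% of misses {better:.4f}, all misses {perfect:.2f}")
```

    Because the bump scales with the room left below 100%, the long-field probability gains more in absolute terms than the short-field one.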

  38. Jeff Clarke says:

    Darn LamKram posted the same thing I was thinking slightly earlier...

  39. LamKram says:

    Jeff: ha ha!

    Another argument why Manning is probably more dangerous than average on a long field than on a short field (relatively speaking): On a long field with time running out, the accuracy of the passer is key. The Colts would be running a lot of long sideline routes. On a short field, there is more opportunity for the running game (which the Colts did, in fact, use) and short routes, so there will be less distinction among different QBs as to the probability of a TD.

    In fact, given the way the final drive played out (short pass to Reggie Wayne and YAC for 15 yards, Joseph Addai run up the middle for 13 yards), it wasn't really "Peyton Manning on a short field" that the Patriots had to worry about, was it?

  40. Pat Laffaye says:

    LamKram: I am not going to dispute Brian's probability formula or your math. All is fine there.

    What I DON'T AGREE WITH are the numbers you guys used for P1, P28 & P66.

    Furthermore, P28 is really P29 because the Pats gained 1 yard on the 4&2. Also, as mentioned by others, Hanson averaged 44 yards per punt, so an average punt would place the ball on the IND 27, making the true estimate P73.

    I also mentioned on another thread that FOUR timeouts were used on that NE series, which again the model does not account for.

    Prove I'm wrong with real empirical data, and I'll be glad to work free for a week.

  41. Daniel Jepson says:

    Bravo to Brian and everyone else who has been vociferously championing the case for rational analysis of 4th-down decisions. The fact that people with entrenched opinions have simply kept their heads buried in the sand should be neither surprising nor discouraging - as Brian noted, even the fact that so much dialogue has taken place is an important first step. One could say that 4th-down analysis is today where elementary sabermetric concepts in baseball such as OPS were ten years ago.

    Also, to see why the punt becomes a less favorable option against an elite offense, call the probabilities involved P(S), the probability of a score from a short field, and P(L), the probability of a score from a long field. For reasons that you can probably work out for yourselves, a lower-bound estimate of P(L) in terms of P(S) - i.e., an estimate that is favorable to the case for punting - is P(L) = P(S)^r, where r is the ratio of the yards needed for a touchdown in the two cases. So as P(S) declines, P(L) declines faster, giving greater equity to the punting option against an inferior offense. You could call this (with a big hat-tip to Brian, of course) the punting paradox: the more disastrous, in absolute terms, a fourth-down failure would be in this situation, the better a gamble "going for it" becomes, all else being equal.
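
    A rough sketch of the lower-bound relation P(L) = P(S)^r described in this comment. The break-even formula relies on an assumed model (a conversion wins the game, a failure lets the opponent score with probability P(S), a punt lets them score with probability P(L)); the function names and the sample P(S) values are illustrative only.

```python
def long_field_lower_bound(p_short: float, yards_short: float, yards_long: float) -> float:
    """P(L) = P(S)^r with r = yards_long / yards_short, per the comment's lower bound."""
    return p_short ** (yards_long / yards_short)


def break_even_conversion(p_short: float, p_long: float) -> float:
    """Assumed model: WP(go) = c + (1 - c) * (1 - p_short), WP(punt) = 1 - p_long."""
    return 1.0 - p_long / p_short


# 28 and 66 yards to go are the field positions discussed in the thread.
rows = []
for p_short in (0.40, 0.53, 0.65, 0.80):
    p_long = long_field_lower_bound(p_short, 28, 66)
    rows.append((p_short, p_long, break_even_conversion(p_short, p_long)))

for p_short, p_long, needed in rows:
    print(f"P(S)={p_short:.2f}  P(L)={p_long:.3f}  break-even conversion={needed:.3f}")
```

    In this sketch the conversion rate needed to justify going for it falls as the opposing offense gets stronger, which is the "punting paradox" in numerical form.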

  42. Jeff Clarke says:

    "Also, as mentioned by others, Hanson averaged 44 yards per punt, so an average punt would place the ball on the IND 27, making the true estimate P73."

    Ridiculously small sample size. Those others seem to want to use one game's punts instead of an entire season's because it helps their case. If you had a .270 hitter that went 3 for 4 on a night, would you really think he had a 75% chance of getting a hit in his 5th at-bat? Regression to the mean is definitely expected here.

    Hanson averaged 35.5 net yards a punt for the season as of that moment. It has actually gone down since then as you can clearly see in the ESPN link below. I guess his "hot" streak ended against the Jets.

    Saying 66 yards to go was based on a 38 yard net for that punt, which accounted for the fact that it was indoors and a touchback wasn't a factor. Honestly, I think we were generous with a 38 yard expected punt.

    I'll tell you what, give me your phone number and we can arrange a bet on Hanson's next punt. Over/Under on 44 yards net.
    I'll take the under.

    http://espn.go.com/nfl/statistics/player/_/stat/punting

    "I also mentioned on another thread that FOUR timeouts were used on that NE series, which again the model does not account for."

    The Indy timeout situation is the important one. You're right. The model doesn't account for the timeouts. It only uses the average. I'd be curious to see what the average timeouts were in that situation.

    If a team gets the ball down 4-8 points with two minutes to go, it's a reasonable assumption that they probably already used several of their timeouts on defense. Indy still had 1 left. I'd be willing to bet that 1 is actually either above or really near average for that situation.

    I can't empirically prove that but I think I did prove the punting stat, so I guess you only work free for me for half a week.

  43. mermel says:

    It could be that both Colts fans and Pats fans wanted them to punt if they are both risk averse over probabilities of winning. So the argument that Belichick was right because Indy fans preferred a punt is faulty. I bet bars in Boston also fell deathly silent. That being said, Belichick was right in maximizing the chances of winning.

  44. Daniel Jepson says:

    I think you hit on something important here: risk aversion is pretty obviously the reason why football practice and thought-processes have gotten stuck in such a rut when it comes to fourth-down decisions. This is understandable to a certain degree - risk-aversion is something of a hard-wired tendency for humans when making choices under uncertainty. But that doesn't make it any less of a liability when dealing with quantities, such as win probability, that are not subject to decreasing marginal utility.

  45. Anonymous says:

    I think you omitted the fact that if the Colts scored, the Patriots could then score again. And a Colts drive starting from where the turnover on downs occurred would most likely leave much more time on the clock than one starting from where they received the punt. I think this would slightly further enhance the Patriots' probabilities in your estimate.

  46. potter says:

    LamKram.

    Assuming the issue is BB going for it on 4th and 2 from his own 28, up by 6 with 2:08 left against the Colts, and not 4th downs in general.

    Accepting that he converts 60% of the time.

    However, on one play he either converts (100%) or he doesn't (0%).

    He converts for any number of reasons. It's a great play, it's a lousy defensive play, it's incomplete but there's a roughing-the-passer call. You can go on and on.

    If he doesn't convert, again it's for any number of reasons. Running too shallow a route, a great defensive play, a bobbled catch that's spotted short...

    Does the 60% conversion rate help you much if you want to know what will happen on just one play? I'd say it doesn't; it just tells you that if you can repeat the play a great many times, you'll convert on 60% of those attempts.

    On just one play you just have to hope good things happen or bad things don't; you convert because you got lucky.

    It's a sample size issue. Even if the 60% conversion rate is robust, you aren't going to have enough opportunities to go for it on 4th and 2 from your own 28, up by 6 with 2:08 left, for the 60% to kick in.

    If you're feeling lucky, that's fine, but be aware of where failure to convert leaves you compared to the punting option and be aware that you are trusting a lot to luck.

    Potter

  47. Jeff Clarke says:

    "If you're feeling lucky, that's fine, but be aware of where failure to convert leaves you compared to the punting option and be aware that you are trusting a lot to luck."

    But what about the luck that you need in order to win if you punt?

    You are correct. Luck plays a huge role in whether any decision is correct. And to a certain extent, going is definitely a gamble. On the other hand, punting is just as much of a gamble. It's an even worse gamble because the odds are worse.

    The question the pro-punt crowd keeps asking is: would you play a single hand of blackjack for $100,000 even if you got to be the dealer and had the odds in your favor?

    It's the wrong question.

    Say you were forced to play a single hand of blackjack, but you got to choose between being the dealer and being the player.

    Which side of the table would you sit on?

  48. Anonymous says:

    Jeff Clarke said...

    "Belichick did basically the same thing in Atlanta this year. It worked and they won. Where was the uproar over "the worst call in history" then?"

    There were questions raised when BB did that. Google is your friend.

    Regardless, the situations are not similar.

    In Atlanta, it was the 3rd qtr with plenty of time remaining. A Falcons TD would only put them up by 1 with plenty of time remaining. It was the 3rd game of the season against an out-of-conference opponent. Playoff home field advantage was not on the line. It was an acceptable risk.

    Jeff Clarke said...

    "Ridiculously small sample size. Those others seem to want to use one game's punts instead of an entire season's because it helps their case. If you had a .270 hitter that went 3 for 4 on a night, would you really think he had a 75% chance of getting a hit in his 5th at-bat? Regression to the mean is definitely expected here.

    Hanson averaged 35.5 net yards a punt for the season as of that moment."

    Another bad analogy. Batting and punting are NOTHING alike. Different pitchers and different pitches make it difficult to hit the ball. Hanson will probably connect with the football 99% of the time. The only question is how far he will punt. Indoors (Lucas field's roof was closed) means ideal punting conditions. What are punters' stats INDOORS from their own 29? Not stats from the 45 or 50 where the punter will intentionally shorten his punt to nix a touchback.

    Statistics from years ago are useless. Technical analysis of the stock market recognizes the importance of 5, 10, 15, 50, 100 day moving averages of price data. Even 5, 10, 15, and hourly moving averages are used to make price data analysis more sensitive and accurate. Show me a stock trader who relies on 9 year moving averages, and I'll show you a broke mofo.

    Finally, why has nobody addressed the fact that the Pats had the 4th best pass defense in the league before that game?

    Head buried in the sand? Ya, I'd say so.

  49. Anonymous says:

    Can you take a shot at analyzing the OT between the Steelers and Chiefs this weekend? Pittsburgh had a 4th and 5 from the KC 38 and punted, but would going for it have been an option to consider there?

  50. Anonymous says:

    I'd like to see the analysis of this:

    "After the Patriots lost Super Bowl XLII by a mere three points, sportswriters everywhere began to look back on one particular play that might have cost the team their perfect season. After an impressive opening drive in the third quarter, the Patriots, then leading 7-3, stalled at the Giants' 31-yard line. Instead of sending in Stephen Gostkowski to kick a field goal, though, Coach Bill Belichick left his offense on the field to try to convert on 4th and 13. Tom Brady's pass fell harmlessly to the ground, and Belichick's reputation as strategic genius took a hit."

    How anyone can statistically justify "going for it" on 4th and 13, when a fg would've added 3pts, would be interesting.

    4th & 13?? Come on. Wake up.

    There was NO data for 4th and 2 on the Pats 28 AGAINST the Colts. NONE.

  51. Jeff Clarke says:

    "How anyone can statistically justify "going for it" on 4th and 13, when a fg would've added 3pts, would be interesting."

    If you stop with your anger and actually consider the question, you might understand why Belichick went for it.

    A fg would have added 3 points. How many points would a missed fg have added? See, that is the problem: you seem to implicitly assume a 100% success rate on a 48 yard field goal. If he kicked it, he would have only had about a 60% chance at making it. If he went for it, he'd have about a 35% chance at making it (need to look up that number...I'm estimating from memory), but if he did convert the fourth down, he'd be in a position where a fg was practically guaranteed and a td was very likely.

    Ultimately, missing the 4th down didn't cost him any more than missing the fg would have, and the reward for making the 4th down was a lot higher than the reward for making the field goal. If you assume success on the field goal, he obviously made the wrong decision. If you are realistic about probabilities, you realize Belichick knows what he is doing.

  52. Anonymous says:

    Jeff Clarke said...

    "If he kicked it, he would have only had about a 60% chance at making it."


    I'm guessing the 60% "statistic" comes from all kickers over 9 years?

    Gostkowski is 100% at 50+ yds (3 attempts) and 64% at 40-49 yds over 3 1/2 years (81.8% 2008) mostly in New England and other venues.

    Boston is a windier city than Chicago, though Chicago got the name. I don't know Foxborough's wind stats but I imagine they're similar. The point is that Gostkowski's FG stats would have been negatively impacted playing mostly in Foxborough. The Pats/Giants SB was played in Phoenix, with the roof closed.

    Please tell me that 60% number is not a league average over 9 years.

  53. Josh Robben says:

    I think in the end, the bigger picture value of this whole thing is that he didn't just fall back on "what does everyone normally do" or "what is best for my reputation". Instead, he went with what he thought would give his team the best chance of winning the game. Clearly it can be debated whether it truly was the best call from a probability standpoint, but all football fans should appreciate the fact that he still TRIED it, personal consequences be damned, rather than just automatically going the conservative route. NFL coaching mentality needs more of that, not less. Here's hoping the fallout from this doesn't set the rest of the coaches back 10 years just for the sake of avoiding media scrutiny.

  54. Brian Burke says:

    For the record, I think the 4th and 13 in the Super Bowl vs. the Giants was a bad decision. I would have gone for it up to about 4th and 6 there, which most people would still say is crazy.

    From the 31, a 48-yd FG is good about 66% of the time over the past 9 seasons.

  55. Brian Burke says:

    "There was NO data for 4th and 2 on the Pats 28 AGAINST the Colts. NONE."

    There was also no data for a punt from the Pats' 28 against the Colts.

    So I guess we should just use a Magic 8-Ball to decide? Ouija Board? Why is punting the automatic good idea?

  56. Jeff Clarke says:

    OK...I admit I did the 60% number from memory. I stand corrected on it actually being 66%. My 35% estimate was also incorrect. It should be 23%.

    I would tell you that if you are using kickers' personal averages and not league averages you are practically crying out for regression to the mean to slap you across the face. Remember Vanderjagt. He was the best kicker in the league one year. A couple years later he was out of the league. Accept the fact that all kickers are basically the same and that if one misses or makes it, he basically got lucky/unlucky within the confines of league probability and you will probably be good. Use extremely small sample sizes and over-extrapolate and you're asking for trouble.

    Brian posted an article on this topic awhile ago...

    So back to the 4th and 13.

    Here were the probs:

    Kick: 66% prob of 2.3 EP (3 pts - 0.7 pts the Giants score after the kickoff)

    34% prob of -0.7 EP

    Expected value: 1.28 EP

    Go for it: 23% prob of 4.1 EP (based on EP from the 16-yard line after conversion)

    77% prob of -0.7 EP

    Expected value: 0.4 EP

    It does appear that Belichick made a mistake here and that mistake cost him about 0.9 EP. However, everybody is always talking about situational probability. Perhaps Belichick knew something we didn't. Maybe Gostkowski told him he strained his leg on the last kickoff. All coaches (even Belichick) lean towards passivity if there is any doubt. It is just so rare to see a coach make a mistake on the over-aggressive side that I have to wonder if something else was at play here. But I agree, if there wasn't extra information, he made a mistake.
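
    The expected-points arithmetic in this comment can be checked with a short sketch. The helper name is hypothetical, and the failure probability for going for it is taken as the complement of the 23% conversion rate.

```python
def expected_points(outcomes):
    """Probability-weighted average of (probability, expected points) pairs."""
    total_prob = sum(p for p, _ in outcomes)
    assert abs(total_prob - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * ep for p, ep in outcomes)


# Figures quoted in the comment: 48-yard FG made 66% of the time,
# 2.3 EP after a make, -0.7 EP after a miss or a failed conversion.
kick = expected_points([(0.66, 2.3), (0.34, -0.7)])
# Failure probability taken as the complement of the 23% conversion rate.
go_for_it = expected_points([(0.23, 4.1), (0.77, -0.7)])

print(f"kick      {kick:.2f} EP")
print(f"go for it {go_for_it:.2f} EP")
print(f"gap       {kick - go_for_it:.2f} EP")
```

    The inputs are league averages as quoted in the thread, so the gap says nothing about situational factors such as an injured kicker.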

  57. Jeff Clarke says:

    BTW...I haven't done an exact analysis of the probabilities, but I'm pretty sure that Kubiak's decision not to try and get 10 more yards against a prevent defense with 8 seconds to go last night cost the Texans far more than even the most absurdly pessimistic probabilities say Belichick's decision cost them with 4th and 2.

    He took a little bit of heat for it, but ultimately was able to get off with the pat excuse "I had confidence in my kicker." This translates into "How was I to know he'd f__k up? Blame him...not me" A passive coach always has a ready made scapegoat. An aggressive coach never does.

    It worked. Kubiak took a little heat, but nobody really focused on how he probably cost his team the game. Passive is always better than aggressive if your real goal is not to win, but to avoid being blamed for the loss.

  58. potter says:

    Jeff Clarke,
    "But what about the luck that you need in order to win if you punt?"

    OK here's why you need to be lucky.

    Going for it gives you a 78% chance of winning the game... in the long run.

    So we persuade NWE and Indy to replay the 'go for it' option a couple of hundred times. We draw a graph axis ranging from a WP for NWE of zero at one end to 100 at the other. After every trial we record the WP for NWE of that actual play by means of a dot.

    We'll get a lot of dots around the 47% mark (corresponding to a 4th down failure) and slightly more at the 100% mark (corresponding to a 4th down success). There'll also be a couple of random dots corresponding to a couple of freakish outcomes. If we do a weighted average of these percentages, we'll get 78%. So we draw a vertical line there, just to remind ourselves.

    Punting gives a 70% chance, again in the long run.

    So we get the punt and return teams to go through the punt a couple of hundred times. This time we'll get the majority of dots concentrated around the 70% mark. Do the weighted average and your vertical line will pretty much bisect the clump of dots.

    Now rewind back in time to the Sunday night game and ask the question: should BB go for it on 4th and 2?

    The stats guys have without reservations said yes and they've used the relative positions of the lines to justify their call.

    Tony Dungy (whether he realises it or not) has said: to know what I'm getting on the go-for-it call, I have to stick all those dots into a bag, shut my eyes and pick one. If I'm lucky it's a 100% dot, but if I'm unlucky it's a 47% one.

    On the punt he's almost certain to get a 70% one and overall he prefers that to the risk of being unlucky and pulling a 47% dot.

    If you don't like his choice, that's fine. But you have to say why, and you cannot use the "over a large number of trials, I'll win more by going for it" argument, because unless you plan on finding yourself facing 4th and 2 from your own 28 with 2 minutes left a dozen times a season, this kind of call doesn't have a long term. You have to justify your choice based on the position of the dots (the variability).

    potter
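    The weighted average described in this comment can be checked with a little arithmetic. The sketch below uses only the figures quoted above (47% after a failed try, 100% after a conversion, 78% overall, 70% for the punt) and ignores the "freakish" dots. The function and variable names are illustrative assumptions, not anything from the original analysis.

```python
# Win probabilities (WP) quoted in the comment above.
# All names here are hypothetical, chosen only for illustration.
WP_GO_SUCCESS = 1.00   # dot drawn when the 4th down is converted
WP_GO_FAIL = 0.47      # dot drawn when the 4th down fails
WP_GO_AVERAGE = 0.78   # weighted average quoted for going for it
WP_PUNT = 0.70         # long-run figure quoted for punting


def expected_wp(p_convert: float) -> float:
    """Weighted average of the two kinds of dots for a given conversion rate."""
    return p_convert * WP_GO_SUCCESS + (1 - p_convert) * WP_GO_FAIL


def conversion_rate_for(target_wp: float) -> float:
    """Conversion rate at which the weighted average equals target_wp."""
    return (target_wp - WP_GO_FAIL) / (WP_GO_SUCCESS - WP_GO_FAIL)


# Share of 100% dots implied by the quoted 78% average.
implied_rate = conversion_rate_for(WP_GO_AVERAGE)

# Share of 100% dots at which going for it merely ties the punt.
break_even_rate = conversion_rate_for(WP_PUNT)

print(f"implied conversion rate:    {implied_rate:.3f}")
print(f"break-even conversion rate: {break_even_rate:.3f}")
```

    Under these quoted figures, the go-for-it line sits to the right of the punt line whenever the conversion rate is above the break-even value, which is the comparison the two vertical lines are meant to show.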

  59. Anonymous says:

    Brian Burke said...

    "There was also no data for a punt from the Pats' 28 against the Colts.

    So I guess we should just use a Magic 8-Ball to decide? Ouija Board? Why is punting the automatic good idea?"

    If there are no good and applicable data to support a decision, then the decision maker will rely on experience, knowledge and common sense instead of a Magic 8-Ball or Ouija Board.

    I want to cross the street. I have no data from previous crossings at this intersection, on this day, with this weather, with this traffic pattern to tell me if I should cross now or wait for the light. What do I do? With no data to make this decision for me, I rely on personal observation, past experience, knowledge, and wisdom. These combine together into what is commonly known as Common Sense. Would it make sense for me to make a street crossing decision on a rainy day on data culled from 100's of attempts to cross on a sunny day? Would it make sense to make that decision on statistical averages of sunny and rainy days?

    Nobody ever said that a decision to punt is the "AUTOMATIC" good idea. Coaches weigh many factors when making a play call decision. Some coaches probably weigh these factors better than others.

    I've been hearing constantly here and other places (Zeus and 4th Down Doctrine followers) that coaches are just worried about job security and that they forego making decisions to improve their chances of winning "to play it safe" and "keep their job". There is little to support this -WP coaching narrative besides just making the claim. Coaches want to win games. As many as possible. What better way to ensure coaching job security than to win as many games as possible? Besides that, coaches are already looking at healthy retirement packages and contract termination clauses. Some teams and coaches clearly recognize that they will not be contenders for the SB. They have not spent enough money perhaps, but it seems to me that coaches want to win, and claims to the contrary are just unsupported attempts to explain why coaches do not always follow the statistics offered here.

    Saber baseball stats benefit from a far more robust dataset than football stats. One season has 162 games with thousands of pitches and thousands of at bats for specific players and teams. One Season. Recent data. No need to go back 9 years to create a statistically significant dataset.

    Is it possible to tweak your dataset to weight more recent data heavier than older data? Maybe you do that already.


    Jeff Clarke,

    Thanks for making the SB 4th & 13 analysis. I was curious. Thank you for the intellectual honesty. How do you explain that ZEUS, Bower, and Frigo claim BB made the right call?

    On Vanderjagt:

    Vanderjagt suffered a groin injury in Dallas after he was replaced by Vinatieri in Indy. He had a so-so season after that injury I believe, not sure, I didn't check.

    Vanderjagt is still the most accurate FG kicker in NFL history. I'm really not sure what your point is. Vanderjagt claimed he wanted to play in Toronto and that he wouldn't accept any NFL offers. His wife and kids are in Toronto.

    Another thing to note about Vanderjagt is that he amassed his stellar, record-setting kicking accuracy in the RCA Dome, INDOORS!

  60. Anonymous says:

    Addendum.

    Also, I'm still waiting for someone to address the fact that the Pats has the 4th best pass defense in the NFL and how that fact would affect the math and WP equation.

    Peyton's previous 8 games were played against teams with horrible pass defenses. Check it out. The best he played against was 16th or 17th.

    Belichick signed Bodden, McGowan, Chung, Arrington, Butler, Lockett and Springs for a reason: to put together a young and deep secondary that could be freely substituted to remain fresh against no-huddle Peyton. It was the key addition to the Pats pass defense. There are few, if any, teams with such young and deep defensive backs. It would've been great to see them shut Peyton down on that final drive. LOL...just saw a story calling Bodden the 'Michael Jordan' of the Pats secondary.

    Recall that Peyton was picked twice in the 4th and that his last scoring drive included a very questionable 31 yard pass interference penalty. All these people saying the Pats' defense was "gassed" are bonkers.


    Anyway, keep up the good work. Sorry if I seemed angry. I am angry. Very. Not at you or Brian. Bill is the Coach not you. It's his fault, not yours. He is supposed to take in all advice, game conditions, player readiness, play specifics, stats, and then apply his experience, knowledge, wisdom and common sense, and then make the right call. He did not. He made a horrible call.

  61. Anonymous says:

    correction:

    In the 1st sentence above, I meant to say the Pats HAD the 4th best pass defense before the Indy game, not HAS.

  62. Anonymous says:

    This is an interesting story on the 4th down theory. Admittedly, this is a high school team. But it's a variation on a theme, I suppose.

    http://highschool.rivals.com/content.asp?CID=892888

  63. Jeff Clarke says:

    Potter,

    "If I'm lucky it's a 100% dot,but if I'm unlucky it's a 47% one.

    On the punt he's almost certain to get a 70% one and overall he prefers that to the risk of being unlucky and pulling a 47% dot."

    Whether you realize it or not, what you are doing is utility analysis. It's very popular in the financial world. With good reason. I'd much rather have a million dollars than a 50% chance at 2 million and a 50% chance at being flat broke. The marginal utility of the 2 millionth dollar is not equal to the marginal utility of the first dollar. This isn't true with WP. It's just not. All you care about is winning the game. You don't care how. You need to maximize your probability.

    It makes no sense here if your ultimate goal is to win the game. Stop and think about it. He is still going to have to reach into the bag and pull out a ball with the 70% solution. All you are doing is postponing that pull for about 2 minutes. That isn't worth anything in the long run. You think it's great because it's safe, but it's not. It's the ultimate in fool's gold.
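    A rough sketch of the distinction being drawn here, under stated assumptions: the square-root utility of money and the arbitrary win/loss values are illustrative choices, not anything from the thread. With a concave utility of money the sure million beats the coin flip, but when there are only two outcomes (win or lose), any utility that values a win above a loss ranks the options purely by win probability.

```python
import math


def expected_utility(outcomes, utility):
    """outcomes is a list of (probability, result) pairs."""
    return sum(p * utility(x) for p, x in outcomes)


# Money example from the comment: a sure $1M versus a 50/50 shot at $2M or nothing.
sure_million = [(1.0, 1_000_000)]
coin_flip = [(0.5, 2_000_000), (0.5, 0)]

# Assumption: square root as a stand-in for diminishing marginal utility of money.
eu_sure = expected_utility(sure_million, math.sqrt)
eu_flip = expected_utility(coin_flip, math.sqrt)

# Game example: results are just "win" or "loss", using the 70% and 78% figures
# quoted earlier in the thread.
punt = [(0.70, "win"), (0.30, "loss")]
go_for_it = [(0.78, "win"), (0.22, "loss")]


def game_utility(result):
    # Assumption: arbitrary values; only "a win is worth more than a loss" matters.
    return 10.0 if result == "win" else -3.0


eu_punt = expected_utility(punt, game_utility)
eu_go = expected_utility(go_for_it, game_utility)

print(f"money: sure thing {eu_sure:.1f} vs coin flip {eu_flip:.1f}")
print(f"game:  punt {eu_punt:.2f} vs go for it {eu_go:.2f}")
```

    Changing the two numbers in `game_utility` rescales the results but cannot reverse the ordering as long as the win is valued above the loss.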

  64. Jeff Clarke says:

    "He had a so-so season after that injury I believe, not sure, I didn't check. Vanderjagt is still the most accurate FG kicker in NFL history. I'm really not sure what your point is. "

    He wasn't the most accurate FG kicker in NFL history that season. That was my point. Go to Wikipedia and look up the topic "regression to the mean". I think you will understand more. I've looked at this in great detail and kickers are highly subject to it.

    As a matter of fact, it's difficult to say with any degree of confidence that the man that appears to be the best kicker in the NFL is any better than the man that appears to be the worst kicker in the NFL. The sample size is just too small. It's the equivalent of handing out the batting title after the first week of baseball. You are almost certain to give it to the wrong person.
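    To give a feel for the sample-size point, here is a minimal sketch using the binomial distribution. The 80% true accuracy and the 30 attempts per season are assumed round numbers for illustration, not figures from the thread or from any kicker's record.

```python
import math

# Assumptions for illustration only: a kicker whose true accuracy is 80%
# and who gets 30 field goal attempts in a season.
TRUE_RATE = 0.80
ATTEMPTS = 30


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k makes in n independent attempts."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Standard error of the observed season percentage.
std_error = math.sqrt(TRUE_RATE * (1 - TRUE_RATE) / ATTEMPTS)

# Chance that this perfectly average kicker posts a 90%+ season (27 or more makes).
p_looks_elite = sum(binomial_pmf(k, ATTEMPTS, TRUE_RATE) for k in range(27, ATTEMPTS + 1))

# Chance that the same kicker posts a 70%-or-worse season (21 or fewer makes).
p_looks_poor = sum(binomial_pmf(k, ATTEMPTS, TRUE_RATE) for k in range(0, 22))

print(f"standard error of season FG%: {std_error:.3f}")
print(f"chance of a 90%+ season:      {p_looks_elite:.3f}")
print(f"chance of a 70%-or-worse one: {p_looks_poor:.3f}")
```

    Under these assumptions, a single season's percentage can swing by several points in either direction through chance alone, which is why one year's leaderboard says little about true skill.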

  65. Brian Burke says:

    Forget about football and 4th downs for a second. Let's just play a game called "WinOrLose." In WinOrLose, I give you 2 options and tell you the historic chances that each option will win the game based on the last several hundred times it was played. You choose your option, and then we roll the dice and see if you win.

    Ok. Ready? Here goes.

    Option A, which has won 70/100 times.
    Option B, which has won 79/100 times.

    Which one would you choose?

  66. Anonymous says:

    "I've been hearing constantly here and other places (Zeus and 4th Down Doctrine followers) that coaches are just worried about job security and that they forego making decisions to improve their chances of winning "to play it safe" and "keep their job". There is little to support this -WP coaching narrative besides just making the claim. "

    You've got to be kidding me. There is you to support that claim. You are the support.

    What would have happened if Belichick punted and lost?

    You'd blame somebody else.

    What happened if Belichick went for it and lost?

    You blame him.

    Barry Switzer made the correct coaching move and won a Super Bowl. Belichick made the correct coaching moves and won three Super Bowls. Yet people still think that they are idiots and criticize them for a couple of moves that didn't work out. The NFL is not quite the meritocracy that I think you think it is. Yes, winning=job security in the larger sense, but coaches figure out that losing is a hell of a lot better with a scapegoat than without one.

    If it comes down to lose in the traditional way and have someone else to blame 30% of the time or lose and have people like you screaming about it endlessly 20% of the time, what do you think they choose?

    What would you choose?

  67. Jeff Clarke says:

    You're leaving out a key detail....

    With Option A, 30% of the time your spouse gets yelled at but you don't.

    With Option B, 21% of the time you get yelled at but your spouse doesn't.

    Is it really "for richer or for poorer"?

    Is it really teamwork before selfishness?

  68. Brian Burke says:

    Do ex-spouses count?

  69. Jeff Clarke says:

    lol....I should have said co-workers. Ultimately, who really gives a shit about their co-workers' job status more than their own?

    Anonymous that said he relied on "experience, knowledge and common sense" and that Belichick made a horrible decision there....I'm curious what you think the probabilities were...

    Did any of you guys see that Yale went for 4th and 22 from their own territory with the lead against Harvard?

    I actually sort of had a feeling that the Ivy League would start trying the Pulaski strategy. I didn't think it would be this dramatic or this early.

  70. Sam says:

    "Now, to correct some bad info from Dungy and Dan Patrick: The 4th down conversion rates I used are based on data from 1st and 3rd quarters when the score is within 10 pts. That way 'desperation' and 'prevent' plays are excluded."

    This was not a first or third quarter decision.

  71. Brian Burke says:

    Right. That's the point. The numbers are based on plays that matter. They are not "trash" plays like Dungy says. As far as I know, 2 yards is about 6 feet no matter what quarter you're in.

  72. Anonymous says:

    What I'm about to say is going to sound horribly elitist and condescending...I don't really mean it that way. It's an important point to make though, and if it offends anybody, well, I'm sorry.

    I've talked to a bunch of people about this since it happened. Nearly everybody that has an advanced degree in anything math related (economics, biology, engineering, etc.) agrees with Belichick.
    Nearly everybody that never went farther than high school math disagrees with Belichick. The correlation is very strong.

    I guess it all comes down to multivariate regression. How much do you know about it? How much do you trust it? It's fairly easy to say "4th and 2 with 2:00 left, with this offense and this defense in a dome with a lead". We've never seen this before. Therefore we have no samples. Therefore, we can't estimate the probability. Therefore, the whole exercise is a waste of time. The scientific community recognizes that this is an enormous cop-out. If this logic was allowed to hold in other fields....

    Ultimately, a lot of different math guys have put together models of football. None of them have remotely agreed with the conventional wisdom. The conventional crowd is basically starting to realize that the averages are saying go more often. Sometimes they even admit it outright, before launching into a diatribe about models. They say coaches have to consider all of the specifics of the situation. True enough. But if the averages are saying "go" and coaches are always saying "punt", then clearly they are getting the specifics wrong on a very regular basis.

    It's basically impossible to come up with a model that says Belichick should have punted without also reaching the conclusion that other coaches made even bigger mistakes in punting in other situations. In other words, coaches are like the parents of Lake Wobegon, where every child is above average. In coaches' minds, every 4th down conversion opportunity is below average.

  73. Jim Glass says:

    I haven't seen this mentioned above, though I may have missed it, since it is pretty plain...

    Whatever one thinks of the fourth-down call, one mistake Belichick unquestionably made -- since he admitted it later -- was not realizing on third down that he could put the game in his pocket by getting two yards in two plays.

    If he was going to go for it on 4th, he certainly should have realized that on 3rd -- that's looking only one move ahead, not so far for one of the world's top and highest-paid game players. (He admitted after the game that going for it on 4th hadn't occurred to him until he was already there.)

    He didn't optimize his play calls to get (only) 2 yards in two chances.

    The Pats averaged 4+ yds per carry, and on 3rd down Indy had to defend the pass with Brady having already thrown for 370 yards. A run on 3rd down would have had an excellent chance of getting the two, and if it didn't it would probably have left the Pats with less than the two-to-go on 4th down that they ultimately faced, improving their chance on 4th.

    I've got to figure if he'd optimized to get "2 in 2" the chance of success would have been at least 85%.
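    As a rough sanity check on the "2 in 2" figure, here is a sketch under two loud assumptions: the two plays are treated as independent tries with the same success rate, and that rate is backed out of the 78% and 47% win probabilities quoted in comment 58. Neither assumption comes from this comment, and the real situation (a run that gains part of the distance, a shorter 4th down) is not modeled.

```python
import math

# Figures quoted earlier in the thread (comment 58).
WP_GO_AVERAGE = 0.78   # long-run WP quoted for going for it
WP_GO_FAIL = 0.47      # WP quoted after a failed 4th down

# Assumption: a conversion is worth 100% WP, so the single-try rate is implied.
single_try_rate = (WP_GO_AVERAGE - WP_GO_FAIL) / (1.0 - WP_GO_FAIL)


def chance_in_two_tries(p: float) -> float:
    """Chance of at least one success in two independent tries at rate p."""
    return 1 - (1 - p) ** 2


# Chance of gaining the 2 yards somewhere in two tries at the implied rate.
two_try_chance = chance_in_two_tries(single_try_rate)

# Per-try rate that would be needed to reach 85% over two tries.
rate_needed_for_85 = 1 - math.sqrt(1 - 0.85)

print(f"implied single-try rate:   {single_try_rate:.3f}")
print(f"chance over two tries:     {two_try_chance:.3f}")
print(f"rate needed to reach 85%:  {rate_needed_for_85:.3f}")
```

    On these simplified numbers the two-try chance lands a little short of 85%, so the estimate above leans on the play calls being better suited to the situation than two independent average tries.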

    As a bonus, if he'd had this coherent foresight he could've run most of another minute off the clock, and/or forced Indy to use its last time out -- the Pats' last three plays used only 18 seconds -- improving his defensive chances if he didn't get the first down. (Manning hit the winning pass with only 16 seconds left, if he'd had 50 fewer seconds to work with...?)

    So while I'll say going for it on 4th was the right call in that situation, Belichick's play calling left something to be desired in getting to that situation.

  74. Daniel Jepson says:

    Jim - you're absolutely right. At the time, I was hoping he would mix things up with a third down run, and then treat it as four-down territory barring a loss of yardage - and I bet I was far from the only one.

    In fact, I think you can actually make the same case regarding the infamous 3rd-and-4 in the '06 AFCCG. They had basically abandoned the run since midway through the second quarter....it was a perfect time for one of those draw plays to Faulk. Even if it didn't pick up the first outright, it likely would have made "going for it"* a very appetizing proposition - especially as they were near midfield at the time.

    I also thought (in real time, not hindsight) that they should have considered going for it when they faced 4th and 2 on their first possession of the second half, though this case is probably less objectively defensible. The defense had just been driven on all the way down the field twice, on either side of half-time, and the offense absolutely had to put something better than a three-and-out together in order to allow them time to regroup. And if the fourth down try had failed, at least the subsequent touchdown would have happened quickly. As it was, there was another long touchdown drive following the punt, at the end of which the defense had been on the field for a rather ridiculous number of snaps.

  75. Daniel Jepson says:

    Oh, I forgot about my asterisk. I was going to say, as a tangent, that I really hate using colloquial phrases in serious discussion, but I really have no idea what the alternative to "going for it" is here. "Attempting to convert" just sounds a little ponderous.

  76. Anonymous says:

    "Option A, which has won 70/100 times.
    Option B, which has won 79/100 times.

    Which one would you choose?"

    But BB isn't playing a game where he can play 100's of times to churn over his advantage. He's much more in a Deal or No Deal situation where he's down to his last two boxes (one containing an Indy stop and the other with a NWE 1st down) and the banker's offered him the punt.

  77. Jeff Clarke says:

    "but BB isn't playing a game where he can play 100's of times to churn over his advantage.He's much more in a Deal or No Deal situation where he's down to his last two boxes (one containing an Indy stop and the other with a NWE 1st down) and the banker's offered him the punt."

    I think you are completely missing the point.

    The banker's offer on Deal or No Deal is the safe choice. He offers you $200k. You leave with $200k. If the Patriots could have left with 70% of a W and 30% of a L, it would have been analogous. The point is that punting was just as much of a gamble, only it was a far worse gamble mathematically.

    The better question would be something similar to the Monty Hall problem (google it). In either option, you are gambling but you have much greater odds of winning by not doing what the vast majority of people do.
