'Touching' the Passer and Belichick on 4th Down

Should it be called “roughing” the passer or “touching” the passer?

The talk of the NFL today is about the two roughing the passer calls in the Baltimore-New England squeaker yesterday. Judging from comments around the blogosphere (plus Rodney Harrison and Tony Dungy), the verdict on whether those calls were justified is pretty clear. In this post, I’ll mathematically prove that those calls were errors…No, just kidding. But what I will do is look at how crucial those calls were to the Patriots’ win. I'll also take a look at a critical 4th down call.

Both drives in question ended in touchdowns for the Patriots, worth a total of 14 points in their 26-20 victory over the Ravens. It might be tempting to just subtract 14 from 26 and claim the Ravens would have won, but it’s obviously not that simple. From the point we change anything within the game, everything after that would unfold differently. We need to look at it probabilistically.

Unlike penalties like pass interference or holding, Sunday’s two “roughing” the passer calls (is my Baltimore bias showing through?) did not affect the outcome of the play itself. So we can look at the natural outcome of those plays and estimate the Pats’ WP assuming the penalties were not called.

The first penalty came as the Patriots trailed by 4 with 3:35 left in the first quarter. A 3rd down and 9 pass from the Baltimore 37 fell incomplete. Without the penalty the Pats’ WP would have been 0.28, but with the 15-yard penalty their WP was 0.48, a difference of 0.20.

The second penalty came with 5:16 left in the second quarter. This time the Pats were up by 3 points. Their 2nd down and 11 pass from the Baltimore 43 was incomplete. Without the penalty the Pats’ WP would have been 0.63, but with the penalty their WP became 0.76, a difference of 0.13. Since New England would still have had a 3rd and 11 opportunity to convert (such attempts succeed 30% of the time), the difference in WP was not as stark as for the first penalty.

 Oh, the Humanity! The first, and most decisive roughing call.

In total, that’s a difference of 0.33 WP, which is either substantial or extremely substantial depending on how you look at it. To win, a team needs a 1.00 WP, so one way to think of it is that the calls accounted for about a third of the game outcome. But each team starts with about a 0.50 WP, so maybe we’re talking about a much greater proportion.
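
For anyone who wants to check the arithmetic, here is a minimal Python sketch that tallies the two swings from the WP figures quoted above. The variable and function names are illustrative only, and the two baselines (1.00 for a full win, 0.50 for the net gain from an even start) are simply the two ways of looking at it described in the previous paragraph, not outputs of the WP model.

```python
# WP figures quoted in the post, as (without penalty, with penalty).
penalties = {
    "first roughing call (1st quarter)": (0.28, 0.48),
    "second roughing call (2nd quarter)": (0.63, 0.76),
}


def wp_swing(without_penalty, with_penalty):
    """Change in the Patriots' WP attributable to the call."""
    return round(with_penalty - without_penalty, 2)


swings = {name: wp_swing(*wp) for name, wp in penalties.items()}
total_swing = round(sum(swings.values()), 2)

# Two ways to scale the total: against a full win (1.00 WP),
# or against the 0.50 a team must gain from an even start.
share_of_full_win = total_swing / 1.00
share_of_net_gain = total_swing / 0.50

for name, swing in swings.items():
    print(f"{name}: {swing:+.2f} WP")
print(f"total: {total_swing:+.2f} WP")
print(f"share of a full win: {share_of_full_win:.0%}")
print(f"share of the net 0.50 gain: {share_of_net_gain:.0%}")
```

Measured against a full win, the two calls come to roughly a third of the outcome; measured against the 0.50 a team has to gain from an even start, they come to roughly two thirds.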

Thus concludes my biased rant. Admittedly, you can probably pick out a couple of bad calls from any game and show how much they changed the outcome. The problem is that these particular calls were judgment calls subject to vague interpretation. The NFL understandably wants to protect its marquee players, but in the process it’s allowing games to be decided by subjective opinions of officials as much as the play on the field.

I agree with Chase Stuart's distinction between violations of the rules of football (like off-sides) and 'disincentive' penalties (like roughing the passer). A better solution would be to have relatively severe fines and suspensions instead of in-game penalties for players who are truly unsportsmanlike. This would allow a detailed review of each play and not put game officials in the position of making split-second judgment calls that often decide game outcomes.

Another critical moment of the game was Belichick’s decision to go for it on 4th down. Following the first roughing the passer penalty, the Patriots faced a 4th and 1 from Baltimore’s 3-yard line toward the end of the first quarter. The Patriots converted the 1st down with a 2-yard plunge by RB Sammy Morris, and they scored a go-ahead touchdown in the ensuing series.

Had they gone for the near-automatic field goal, the Pats would have had a 0.35 WP. Going for it, which could be expected to be successful 68% of the time, was worth 0.46 WP, a clear advantage of 0.11. The successful conversion gave them a 0.51 WP, and New England never relinquished the advantage for the rest of the day.
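
The go-for-it number is an expected value: the WP after a conversion and the WP after a failure, each weighted by its chance of happening. The post does not quote the WP after a failed attempt, so the sketch below backs it out from the three figures that are quoted (0.68, 0.51 and 0.46). Treat that implied value as an assumption rather than a number from the WP model; the function names are hypothetical as well.

```python
def go_for_it_wp(p_convert, wp_success, wp_failure):
    """Expected WP of going for it: a probability-weighted average."""
    return p_convert * wp_success + (1.0 - p_convert) * wp_failure


def implied_failure_wp(p_convert, wp_success, wp_go):
    """Solve the expected-value equation for the unquoted failure WP."""
    return (wp_go - p_convert * wp_success) / (1.0 - p_convert)


# Figures quoted in the post.
P_CONVERT = 0.68      # chance of converting 4th and 1
WP_SUCCESS = 0.51     # WP after the successful conversion
WP_GO = 0.46          # expected WP of going for it
WP_FIELD_GOAL = 0.35  # WP after taking the field goal

# Assumption: backed out from the figures above, not quoted in the post.
wp_failure = implied_failure_wp(P_CONVERT, WP_SUCCESS, WP_GO)
advantage = round(WP_GO - WP_FIELD_GOAL, 2)

print(f"implied WP after a failed attempt: {wp_failure:.3f}")
print(f"check, expected WP of going for it: "
      f"{go_for_it_wp(P_CONVERT, WP_SUCCESS, wp_failure):.2f}")
print(f"advantage over the field goal: {advantage:+.2f}")
```

If the quoted figures are taken at face value, the implied WP after a failed attempt comes out to about 0.35, roughly the same as kicking the field goal, which would explain why going for it looks like nearly all upside in this spot.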

39 Responses to “'Touching' the Passer and Belichick on 4th Down”

  1. Anonymous says:

    To be fair, Wright's sack in which his hand grazed Flacco's mask was a worse call than either of these -- what do the numbers look like on that?

    Also, the "below the knees" penalty was definitely correct -- had Brady's cleat been planted, instead of slightly off the ground, that could have ended his season (again). Just because his leg swung away when hit doesn't mean the hit was OK -- it was plainly dirty.

  2. John Candido says:

    Given your past analysis, that roughly 50% of the game of football is decided by luck, it would be interesting to see how much this percentage would decline given your proposition of non-game changing fines for roughing penalties. Just given how the numbers were affected by this example alone, it seems like it would be major.

  3. Anonymous says:

    A very general comment, inspired by your analysis of the 4th and 1.

    William Krasker at http://www.footballcommentary.com/ has done a lot of evaluations of "controversial" coaching calls.

    My impression from reading them is that it was very rare for such a call, no matter how seemingly dumb or seemingly obvious, to make more than 5% difference in expected winning probability. But my impression from reading some of your articles is that you often find very much higher differential outcomes.

    Is that your sense as well? And if so, do you have any comment on why it might be?

    Thanks for a very thoughtful site.

  4. Brian Burke says:

    A couple reasons: the Football Commentary is different from mine in that it is a "reverse-induction dynamic programming model." I'm not an expert, but it's been explained to me that this type of model requires both opponents to make optimum decisions for the rest of the game.

    My own model is purely empirical, meaning it's based on actual win rates given various combinations of time, score, field position, down, and distance. But I don't think that's the biggest reason for the difference.

    When I write up an analysis, I selectively choose situations I know to be high-leverage situations. There's nothing wrong with that. I'm not going to take time to write up every call that might have swung the WP by 0.01.

  5. Anonymous says:

    I'm surprised by the liberties you're taking in interpreting the winning probability changes due to the penalties. I think if you're going to add one to the other to come up with an aggregate .33 change, you should compare this to all other positive plays that New England had. .33/.5 sounds huge, but if you compare .33 to all the positive plays NE had you might have something like .33/10, so that the penalties only made up 3.3% of New England's good plays. You just need more perspective on everything else that happened in the game.

  6. Anonymous says:

    the WP changes you give here are ridiculous. there is no way the Pats went from a 20% chance of winning to a 48% chance b/c of 1 penalty near midfield in the 1st quarter.

  7. Brian Burke says:

    Consider a situation in which some football deity magically bestowed Baltimore with 42 points at halftime. That would have made their WP 0.99, a boost of 0.85.

    For the entire game, the WP "needle" moved a total of 6.4 "games" worth of WP. (This is what my 'Excitement Index' (EI) is.) Would you think that the deity's bestowal really only determined 13% of the game? (.85/6.4)

  8. Eric says:

    how do the two penalty plays compare to New England's other big plays? a quick look at the chart shows the failed 4th down conversions pretty important, NE's successful conversion, the fumble, the INT, etc.

  9. Brian Burke says:

    Anon-I'm sorry that it disagrees with your intuitive sense of who's going to win a game. I'll just delete my database and ask you your guess anytime I need some analysis.

    The 0.20 WP is due to a situation where the Ravens would have a > 3-point lead plus possession. In the 1st quarter, leads between 4 and 6 points without possession typically indicate a 0.67 WP. That Baltimore would have the lead and the ball moves the needle an extra 0.13 for the 0.80 total.

    The 0.48 situation is one in which the Patriots have a 1st and 10 inside FG range, likely either making it a 1-point game or taking a 3-point lead.

  10. Brian Burke says:

    Eric-You can check out the entire WP graph and see the change for any play.

  11. Joe25 says:

    I'm a Charger fan and remember getting screwed by Ed Hochuli last year so I feel your pain. Those calls did swing the outcome. Can you divulge how much WP NE produced and how much WP BAL produced so we can get a more accurate picture of the magnitude of a .33 shift in WP? Thanks.

  12. Anonymous says:


    if you believe the WP changes are truly that drastic then you are wasting your time here and should be trading in game. if you can buy the Ravens at 5-1, get a penalty, then sell them at a coinflip you will be a millionaire soon.

    the truth is that odds never change this drastically, or close to it. either the markets are not even close to being efficient or you are misinterpreting your data and/or drawing faulty conclusions.

  13. Nate says:

    Hey Brian,

    I'm also skeptical that you can simply add these WP deltas.

    Suppose Baltimore builds up a huge lead, giving it a WP of 99%. Then a magical deity switches the scores, giving New England the 99% win probability. Somehow, Baltimore fights back and regains its original 99% WP. But the magical deity interferes again, giving New England the 99% WP. New England holds its lead and wins the game.

    The magical deity has interfered twice, and both times the WP delta was 98%. But I don't think you can say the deity was 196% responsible for New England's victory.


    I also have a question about WP. Suppose the Niners are beating the Cardinals by 21 points at halftime. My understanding is that in essence, you look at historical games where a team was up by 21 points at halftime and see how often that team won the game. But this is a biased sample -- more often than not, the team that's winning is the stronger team. If you have a priori knowledge that the Niners and Cardinals are equivalently strong, the actual probability of the Niners winning will be smaller than your calculated WP.

    Does that all sound right? If so, it seems like some of these uses of WP are problematic. The model is implicitly assuming the 48%WP Pats are a stronger team than the 28%WP Pats. But it's not like calling the penalty somehow made the Pats weaker.

    - Nate

    P.S. Since this is my first time posting, I want to say I love your site. There are tons of insightful observations here, and it's definitely increased my understanding of the math behind the game.

  14. Brian Burke says:

    You could be right, Nate. I see your point. Maybe we should ask the magical deity himself, the referee in that game yesterday!

    Yes, your understanding of the bias in the model is correct. The WPs are really just average win rates for teams in similar situations. There's some interpolation and smoothing, but otherwise that's it (for now). So usually, but not always, bad teams are the ones losing and good teams are the ones winning. Then again, if you're trailing by 21 points at halftime, chances are good you are the inferior team.

  15. Anonymous says:

    Just one comment - if those "disincentive" penalties were removed in favor of fines only, it would mean that players that were willing to play dirty (*cough*flozell*cough*) could do it repeatedly, every game, and get the win and just pay a few thousand in fines. I guarantee that between "performance incentives" and extra payment for playoff/ SB appearances/ wins, they'd find it well worth it to commit those penalties and eat the fines. When you're making millions, a few thousand (or even tens of thousands) are worth it when it could potentially pay back 10x or 100x that in the end.

  16. DSMok1 says:

    Brian--that sounds like a Bayesian statistics question.

    Suppose we have an a priori knowledge of how good a team is. It is possible to develop a "standard error" for this evaluation. It is also possible to use Bayesian updating to create a running evaluation of how good a team is throughout a game. That would probably be overkill, however... It would perhaps be possible to create a "biased" WP model based on the a priori understanding of the teams' relative strengths. All of this would perhaps create a better predictive model during the course of the game. I don't know if that's useful, though, come to think of it.

  17. Brian Burke says:

    Yes, I agree. I have some ideas how to fold that into the model. You'd start with a good pre-game estimate of the WP. But no matter how good the team was on paper, a 14-point deficit with 2 minutes left still means they're going to lose. It's a matter of how the pre-game advantage decays as the game goes on.

    Based on my research on how home field advantage decays throughout a game, I have an idea of the shape of the curve.


  18. Jeff Clarke says:

    There's another call from the last week that I'm curious about:

    Browns have 4th and 10 with the game tied and 19 seconds left on the Cincinnati 40 yard line. They have no timeouts. Cincinnati has 2.

    It seems to me like if they go for it and get 10 yards and out of bounds, they have a shot at making the game winning field goal. If they go for it and miss, the Bengals get one play where they need to go 30 yards to set up a shot at the game winning field goal.

    By punting, it seems to me like Mangini was saying that the odds of his team getting 10 yards on one play were less than the odds of the Bengals getting 30 yards on one play.

    It's actually worse than this. Let's just assume that the odds of either event occurring are 30%. If the Browns convert their 10 yard pass 30% of the time and the Bengals convert their 30 yard pass 30% of the time, the Browns would win 30% of the time pre-overtime, but the Bengals would only win 21% (30% success * 70% of the time that they get the ball when the Browns failed).

    It seems to me that for this move to make any sense at all, Mangini would have to think that the Bengals' chances of making a 30 yard pass were significantly higher than the Browns' chances at making a 10 yard pass. I know the Bengals are a better team, but that cuts both ways. You have to assume the better team will win more than 50% of the games in overtime. It's hard to make a logical case that was the right move. I'd like to see someone try.

    There was a supreme irony in the way the game ended in overtime. The Bengals were in almost exactly the same situation. 4th and 11 from the 41 yard line. It was even the same side of the field. They went for it and two plays later, they kicked the game winning field goal. In a way, their move was even ballsier. Part of the calculation from the Browns point of view was that the Bengals would only have 1 play if they failed. When the Bengals went for it, there was 1:04 left in overtime.

    I seriously think this might contend for the worst 4th down call ever award. I'm curious what everybody else thinks.

  19. GeorgeJD says:

    It's always a shame when the refs decide a game. An interesting stat would be to see how roughing the passer and pass interference calls have altered game situations (WP).

  20. Dave M says:

    I don't think Mangini was too far off that the Bengals had a better chance to get 30 yards in one play than the Browns had to get 10 yards. It's not like they're relatively equal teams.

  21. Unknown says:


    Appreciate all of the work you put in here. This is my fall / winter "fangraphs". Would it be possible to make the WP charts available in a table format?

  22. Anonymous says:


    Regarding the "disincentive" penalties. A major focus of the English Premiership (soccer league) over the last couple of years has been the issue of dissent toward the game officials. In soccer, field position and possession are much more fluid commodities; they change frequently and to a great extent, and as such applying a distance penalty for such an infringement is of limited disincentive (there was even an unsuccessful trial recently). They have been fining and suspending offenders for years, but as it is still a problem, that is clearly not an effective solution either.


    You suggest that the 0.33 WP is a third of the game outcome, yet in the comments you tell us that the total change is 6.4 and that the deity effect is thus 13%. Doesn't that mean the penalties had just over a 5% impact (.33/6.4)?


    On that deity interference: I suspect that if Baltimore were bestowed 42 points at the half then the EI would be lower than it was. Wouldn't the change in WP of each subsequent play initially be minimal (there is little difference between 42-7 and 42-0, for example)?


  23. Jeff Clarke says:

    Dave M,

    If that was true, then the Bengals would have had about a 95% chance of winning in overtime. As a matter of fact, it actually makes it an even worse call if Mangini was convinced his team is that bad relative to the Bengals.

    Could you beat Kobe Bryant in a basketball shooting competition?

    If you both had 1 shot, there is a reasonable possibility you make it and he doesn't. If you both have 30 shots, he is pretty close to guaranteed to win.

    What if you had 1 shot from 10 feet? You make it and he doesn't even get a shot. You miss it and he needs to sink one from 30.

    Would you take that or would you rather play him with many shots on an equal court?

    The randomness factor increases as the iterations decrease. If you know you are inferior, you should always pick one play over many.

    The truth is the Browns aren't that inferior to the Bengals. We see teams we think are far inferior beating teams we think are superior all the time. How about the 2008 Dolphins vs the Patriots? For that matter, most people expected the Bengals to be just as bad as the Browns. It was only because they had a huge upset against the Steelers that people started to change their opinions.

    The difference in talent between any two NFL teams is not as large as it seems. Over the course of a 60 minute game, the better team will usually win. Over one play, the difference is practically negligible.

    The worst NFL team imaginable will get 10 yards far more frequently than the best will get 30.

  24. Dave M says:

    The Browns *are* that bad, they are that inferior. We're talking about one of the worst teams in the league, the Bengals are a solid team. Mangini--who I am no fan of--had every right to believe his offense is that inept, because it is that inept. Coming into the game they scored one offensive TD, and it was in garbage time.

  25. Brian Burke says:

    Jeff's right. On any one given play, the difference among teams is minuscule. The only reason we (somewhat) reliably see good teams beat bad teams is that the tiny advantage accrues over dozens of plays.

  26. Brian Burke says:

    On the point about how much the penalties affected the game: I think the guys suggesting it's really .33 out of 6.4 are missing an important point. Yes, the WP needle moved a total of 6.4, but it moved only a *net* of 0.5. All the other movement during the game canceled itself out. Both teams start at 0.5 and the winner finishes with 1.0.

    Say you go into a casino with $50, win some then lose some, and win some more and lose some more. But at the end of the day you come home with $100. You may have won and lost an absolute total of $640 or so, but you net $50.

    Now let's say an external agent (a deity) had given you an extra $33 at some point during the day. Of your winnings, your net gain, that $33 represents 2/3rds. No one would ever think it's 33/640.

    Also, keep in mind there is a distinction between the natural back and forth of a game, and the intervention of an external agent. Unlike penalties like pass interference or holding, the 'roughing' calls did not affect the outcomes of the plays. The effect of those penalties was external to the game itself.

    Often, there may be bad calls on both sides of the ball, and they would tend to cancel out. But not in this case.

  27. Anonymous says:

    Thanks for clarifying that, I didn't read your comment correctly.

    Not sure where else to post this:

    Can you tell me why Matt Schaub kneeling with 1:30 left in the game on Sunday resulted in a 0.19 change in WP?


  28. Brian Burke says:

    Weird. No idea. There's some bug somewhere in my code that only appears in strange places when one team has a >21 point lead. The exact same thing happened here.

  29. Brian Burke says:

    Fixed. Thanks.

  30. chris says:

    Your analogy w/ poker is not an accurate way to think about things. Adding $33 could add much more of an advantage than just that $33, allowing you to bully, or giving you more chips to double up with. The doubling then means that $33 turned into $66. It also works the other direction where you may miss out on opportunities b/c you play things differently with a bigger chip stack.

    Also, I believe you have to look at things in an absolute sense. Otherwise what you are saying is that given 5 or so horrible calls in a game they could account for more than 100% of the outcome. Or let's ignore horrible calls. Let's look at all positive pass plays and they would account for more than 1.00 of the WP. There's clearly a flaw in this method.

  31. Brian Burke says:

    No, Chris. No one said anything about poker. And there isn't "clearly a flaw in this method."

    Technically, if you had a high number of very flawed plays as you suggest, you'd need to add them within a logistic transformation. But for amounts within .75 or so, it's basically linear and the transformation isn't necessary.

  32. Brian Burke says:

    Maybe this will be clearer: Let's say some authority gave one team a 7-point advantage to start the game. That changes the game from .50/.50 to about .67/.33, a difference of .17 to both teams.

    Now, as the game ensues, there are ups and downs throughout the game. In all there is about 6.4 units of total travel in the WP graph.

    Would anyone say that the 7-point advantage was worth .17/6.4 (2.7%)? Of course not. You'd say it was worth 17%, pure and simple. The fact that the external "providence" occurs within the game instead of before it doesn't change its importance.

  33. Luis DeLoureiro says:

    I'm from NE and I like (not love) the Pats and Brady. But, I have to say that the best analogy I heard when referring to Brady's behavior was a soccer player flopping. I understand you do what you need to do to win - but, there's a point where you come off as an entitled baby.
    Oh - and, I'm a HUGE euro soccer fan as well. I just want to be clear that I wasn't trashing soccer....

  34. Luis DeLoureiro says:

    Brian - I've been trying to explain EXACTLY your last point for a couple of years to anyone who would listen.
    At my site, I have a section called "deceptive facts".....it's not exactly an accurate title. But, that's where I say the qb rating stinks. It's also where I analyzed the statement heard over and over again after week 1 - only x% of teams who lose in week 1 make the play-offs. That's a stat that makes people say - "wow, that's incredible..." But, from a quant point of view, it makes perfect sense.
    The next "deceptive fact" I want to tackle is "teams who score first win x% of the time".....duh!!
    And, I think you explained it perfectly in your text. A team who is given a 7 point advantage changes the odds from 50/50......it's as simple as that.

  35. David Kociemba says:

    As a teacher, I can tell you the best way to discipline a student is to do it as soon as possible after the infraction so that they draw a connection between the behavior and the punishment. If you wait until several days later to punish them for the behavior, it has much less impact: you've missed the teachable moment. In addition, when the system of collective punishment has been bought into as "fair" by all involved whatever the individual judgment, such collective punishments have a tendency to get others punished to teach the offender not to do that act again.

    Reinforcement and discipline have to be tightly connected to the action for behaviorism to work.

    So, no, fines are not an effective way to provide disincentives for behavior.

  36. Tom says:

    You say The second penalty came with 5:16 left in the second quarter. This time the Pats’ were up by 3 points. Their 2nd down and 11 pass from the Baltimore 43 was incomplete.

    actually Brady completed the pass.

  37. Luis DeLoureiro says:

    I believe the title of this post should actually be "ALMOST Touching the Passer"

  38. Tom Brady says:

    I am a wuss and a very overrated qb, who looks for bad calls to get me out of tough situations that i put those bunch of bums i play with in trouble. I really dont know why people are so angry with me getting those flags were good plays by my part the way i flicked my wrist is the reason we won

  39. Unknown says:

    1- The statistics on 'roughing the passer' calls speak for themselves.



    2- Is Joey Porter insinuating that all the officials are dirty and in Tom Brady's pocket?

    3- I believe any rules that prevent permanent or career ending damage to a player, QB or otherwise, should be considered a good thing. Injury rules in the NFL not only protect the quarterback but they also protect the other players. Let us not forget players like Darryl Stingley, Korey Stringer, Fast Eddie (Reed), Joe Theismann, Napoleon McCallum and so many more. Do we learn nothing from these injuries? Making the game safer does not make the game 'pussy', it makes it smart.

    4- And that makes me think that Tom Brady is very smart and Joey Porter is a very angry man with a big green monster on his back.

    5- Joey Porter has been fortunate in that he has never had a major NFL injury. Not so fortunate in his personal life with being shot and family hardships. I know people who include his little girl in their prayers.

    6- So Joey, my advice: do what you do best and it won't cost you a dime. Give Tom Brady the finger and let it drop.
