As I noted in my game commentary, if you need to call a timeout to think over your options, the situation is probably not far from the point of indifference where the options are nearly equal in value. And timeouts have significant value, particularly in situations like this example--late in the game and trailing by less than a TD--because you'll very likely need to stop the clock in the end-game, either to get the ball back or during a final offensive drive. Would Carroll have been better off making a quick but sub-optimum choice, rather than making the optimum choice but burning a timeout along the way?
Here's another common situation. A team trails by one score in the third quarter. It's 3rd and 1 near midfield and the play clock is near zero. Instead of taking the delay of game penalty and facing a 3rd and 6, the head coach or QB calls a timeout. Was that the best choice, or would the team be better off facing 3rd and 6 but keeping all of its timeouts?
Both questions hinge on the value of a timeout, which has been something of a white whale of mine for a while. Knowing the value of a timeout would help coaches make better game management decisions, including clock management and replay challenges.
In this article, I'll estimate the value of a timeout by looking at how often teams win based on how many timeouts they have remaining. It's an exceptionally complex problem, so I'll simplify things by looking at a cross section of game situations--3rd quarter, one-score lead, first down near midfield. First, I'll walk through a relatively crude but common-sense analysis, then I'll report the results of a more sophisticated method and see how both approaches compare.
I essentially want to know the value of a timeout in terms of Win Probability (WP), which is the universal linear utility function of the sport. If we can measure the value of something in terms of WP, we can apply that knowledge in all kinds of ways.
To estimate a timeout's value, we just want a good estimate of how often a team wins in some situation with n timeouts compared to how often a team wins in the same situation with n-1 timeouts. The difference in how often a team can be expected to win in each case is our result--the WP value of the timeout.
In football, where a season consists of only 16 games, there isn't as much data as we'd like to make reliable WP estimates just by counting how often a team wins for a given game state based on score, time, down, distance, field position. And splitting the data up further according to timeouts remaining thins the data out far beyond the point of statistical reliability. So we need to aggregate, remove the noise, and then de-aggregate the data for reliable results.
We need to make a few assumptions to help analyze things. First, like the guy in the AT&T commercial might ask, Is having more timeouts better or having fewer timeouts better? More! It's obvious, but it means that whenever we observe higher win rates associated with fewer timeouts, we can be confident there is significant noise. Second, we'll treat timeouts as essentially equivalent: the difference in value between having 3 timeouts vs having 2 timeouts is essentially the same as the difference between having 2 vs 1, and so on. We know this probably isn't true, since there are good reasons to think that the difference between 1 timeout vs 0 timeouts is larger than the other differences, but it will simplify the problem greatly.
To see the WP effect of a timeout, I started with common situations that we would be most interested in. I also tried to isolate as many variables as reasonable. I aggregated all situations where the offense was up by 0 through 7 points and had a first down between their own 40 and the opponent's 40 in the 3rd quarter.
Normally, I would not aggregate situations this way for many reasons. But in this analysis, I am less concerned with the actual WP estimate than I am with the difference in WP estimates in like situations based on timeouts remaining. Also, as you'll see later, we can account for bias created by the aggregation.
There are 4 timeout states for each team (0...3 timeouts remaining), which makes 16 potential combinations of timeout states. Here is a matrix of how often we see each combination of timeouts remaining for the offense and defense in the selected cross-section of data. The columns represent the offense's timeouts left, and the rows represent the defense's timeouts left. For example, there were 290 actual situations where the offense had all 3 timeouts and the defense had 2 timeouts.
Count
Def \ Off | 0 | 1 | 2 | 3 |
0 | | | | 2 |
1 | | 2 | 6 | 20 |
2 | | 14 | 41 | 290 |
3 | | 27 | 294 | 3063 |
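A tally like this can be produced by counting (defense, offense) timeout pairs across the selected snaps. Here is a minimal sketch in Python. The record layout and the handful of sample plays are hypothetical, made up for illustration, and are not the article's data.

```python
from collections import Counter

# Hypothetical play records as (defense_timeouts, offense_timeouts).
# These sample rows are invented for illustration only.
plays = [(3, 3), (3, 3), (2, 3), (3, 2), (3, 3), (2, 3)]


def tally_timeout_states(records):
    """Count how often each (defense, offense) timeout combination occurs."""
    return Counter(records)


counts = tally_timeout_states(plays)

# Print as a 4x4 matrix: rows = defense timeouts, columns = offense timeouts.
print("Def \\ Off | " + " | ".join(str(o) for o in range(4)) + " |")
for d in range(4):
    row = " | ".join(str(counts.get((d, o), 0)) for o in range(4))
    print(f"{d} | {row} |")
```

Cells with no plays come out as zero here. With real data, those are the thin cells the analysis has to be careful about.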
There were very few cases of 1 or fewer timeouts for either team, so for this primary analysis we'll rely on the difference in value between 3 and 2 timeouts. In fact, I'll delete the 0 row and column from here out.
Here are the raw average win rates for the offense (as opposed to the smoothed, modeled WP estimate) for each combination of timeout states. Like we would expect, having more 2nd-half timeouts usually results in winning more often in a close game. The offense tends to win more often as the defense's number of timeouts decreases. For example, after the defense has used its first timeout, offenses go from winning 69% of the time to winning 73% of the time.
Raw Win Rates
Def \ Off | 1 | 2 | 3 |
1 | | 0.50 | 0.80 |
2 | 0.64 | 0.63 | 0.73 |
3 | 0.67 | 0.63 | 0.69 |
Likewise, offenses win less often as they use their timeouts. Going from 3 to 2 timeouts reduces the offense's raw win rate from 69% to 63%. Right off the bat, we can make a very rough approximation: the value of a timeout in similar situations is near 3, 4, or 5% or so. I was honestly surprised at this result. I was expecting something smaller.
But we aggregated a lot of qualitatively different situations--large swaths of score differences, field position, and the entire 3rd quarter! Frankly, I'm not worried about the score difference problem and field position problem. Not that they don't affect WP--It's that they don't correlate with TOs remaining (r=.017 for yd line and .005 for score difference), so they're not going to introduce bias into the analysis of average differences. It's true that there are more snaps near an offense's own 40 than its opponent's 40, so the midpoint of the swath of data won't be exactly midfield. But that's ok, at least for now.
What I am concerned about is bias caused by game time. It's not too difficult to understand that there tend to be fewer timeouts remaining as the half progresses (r=.302). Teams never gain timeouts. And as time goes on, the same lead in terms of points will bring an improved chance of winning. For example, a 3 point lead has a higher WP at the end of the 3rd quarter than it does at the beginning of the 3rd quarter. So there will be a slight illusion that makes it appear that having fewer timeouts, either for the offense or defense, means a higher WP for the offense.
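For readers who want to run the same kind of check on their own play-by-play data, here is a minimal sketch of a correlation calculation. It assumes the r values quoted above are ordinary Pearson coefficients, which the text doesn't actually specify. The sample values and variable names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-snap values, invented for illustration only.
timeouts_left = [3, 3, 2, 3, 2, 3]
minutes_left = [29.0, 26.5, 18.0, 24.0, 21.5, 16.0]

r = pearson_r(timeouts_left, minutes_left)
print(f"r = {r:.3f}")
```

A variable with r near zero against timeouts remaining can be averaged over without much worry. One with a clearly nonzero r, like game time, needs to be accounted for.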
Here is a similar matrix of timeouts remaining for defenses and offenses in the 3rd quarter that lists the average minutes remaining in the game. For example, there is an average of about 23 min left in the game (8 min in the quarter) when both teams have all 3 timeouts, and an average of about 20 min left when either the offense or defense has 2 timeouts left.
Average Game Time
Def \ Off | 1 | 2 | 3 |
1 | | 16.1 | 18.1 |
2 | 18.5 | 18.1 | 20.1 |
3 | 17.1 | 19.8 | 23.4 |
To account for the bias caused by the relationships between time, timeouts remaining, and WP, we can calculate the difference between the raw win rates and the average expected WPs for our subsample. The current WP does not (directly) consider timeouts. In effect, this is a crude multivariate regression. But instead of dumping a bunch of variables into regression software and pressing the 'I-believe-in-analytics' button, we can transparently see if what's going on makes sense.
The matrix below shows the average expected WP for each timeout state combination. For example, when both teams have all 3 timeouts, the average time remaining was 23.4 minutes which equates to a .70 expected WP. As you can see, the WPs are relatively flat, meaning that they don't climb very much for a given lead through the 3rd quarter, and even its slight increase is fairly linear. The big divergence begins shortly into the 4th quarter. (That's one reason I chose the 3rd quarter to begin attacking the timeout value problem.)
Expected Win Probability
Def \ Off | 1 | 2 | 3 |
1 | | 0.72 | 0.72 |
2 | 0.72 | 0.72 | 0.71 |
3 | 0.72 | 0.71 | 0.70 |
We finally arrive at the results. The final matrix below lists the raw win rate above expected for each timeout combination. Keep in mind the very small sample sizes in everything other than the 3-3, 2-3, and 3-2 cells.
Win Rate above Expected
Def \ Off | 1 | 2 | 3 |
1 | | -0.22 | 0.08 |
2 | -0.07 | -0.08 | 0.02 |
3 | -0.05 | -0.08 | -0.02 |
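The subtraction behind this matrix can be sketched in a few lines of Python for the three well-sampled cells, keyed here as (defense timeouts, offense timeouts). The inputs are the two-decimal figures from the tables above, so the recomputed 3-3 cell comes out at -.01 rather than the published -.02, presumably because the published values were calculated from unrounded numbers.

```python
# Values copied from the article's tables for the three well-sampled cells,
# keyed as (defense_timeouts, offense_timeouts).
raw_win_rate = {(3, 3): 0.69, (2, 3): 0.73, (3, 2): 0.63}
expected_wp = {(3, 3): 0.70, (2, 3): 0.71, (3, 2): 0.71}


def win_rate_above_expected(raw, expected):
    """Subtract the timeout-unaware expected WP from the raw win rate, cell by cell."""
    return {cell: round(raw[cell] - expected[cell], 2) for cell in raw}


residuals = win_rate_above_expected(raw_win_rate, expected_wp)
for cell, value in sorted(residuals.items()):
    print(cell, f"{value:+.2f}")
```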
If we focus only on the differences among the 3-3, 2-3, and 3-2 cells (defense's timeouts listed first), we can get a rough approximation for the value of a 2nd-half timeout in the 3rd quarter when the offense is ahead by one score:
-When both teams have all 3 timeouts, the offense ends up winning 2% less often than expected.
-When the offense has all 3 timeouts but the defense has only 2, the offense wins 2% more often than expected.
-When the defense has all 3 timeouts but the offense has only 2, the offense wins 8% less often than expected.
When the defense uses a timeout, it costs:
wp(2,3) - wp(3,3) = .02 - (-.02) = .04
And when the offense uses a timeout, it costs:
wp(3,3) - wp(3,2) = -.02 - (-.08) = .06
Given the assumptions, a fair rough estimate would be that the value of the first timeout is .05 WP.
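The arithmetic can be checked with a short Python sketch using the published win-rate-above-expected values, keyed as (defense timeouts, offense timeouts). Taking a simple unweighted average of the two costs is my assumption about how the .05 figure was reached.

```python
# Published "win rate above expected" values,
# keyed as (defense_timeouts, offense_timeouts).
above_expected = {(3, 3): -0.02, (2, 3): 0.02, (3, 2): -0.08}

# Offense's gain when the defense spends its first timeout (3-3 -> 2-3).
defense_timeout_cost = above_expected[(2, 3)] - above_expected[(3, 3)]

# Offense's loss when it spends its own first timeout (3-3 -> 3-2).
offense_timeout_cost = above_expected[(3, 3)] - above_expected[(3, 2)]

# Simple unweighted average of the two (an assumption, see lead-in).
timeout_value = (defense_timeout_cost + offense_timeout_cost) / 2

print(f"defense: {defense_timeout_cost:.2f}")
print(f"offense: {offense_timeout_cost:.2f}")
print(f"average: {timeout_value:.2f}")
```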
In Practice
Back to the NFC Championship Game situation. Down by 4 on the SF 37 with 14 minutes to play, SEA faced a 4th and 7. Going for it yields a .35 WP and punting yields a .32 WP, according to the general (timeout unaware) model. But the Seahawks chose neither option. They chose the call-a-timeout-then-go-for-it option instead, which was worth .35 WP - .05 WP = .30 WP--worse than had they punted right away.
(Of course, this might be the worst example in the world, as they went on to score the go-ahead TD on that very next play.)
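The same comparison works for any decision where a timeout buys a better choice. Here is a minimal Python sketch using the figures quoted above. The function name and the break-even framing are mine, and it assumes, as the example does, that calling the timeout doesn't change the play's chance of success.

```python
# WP figures quoted in the article (timeout-unaware model)
# and the rough first-timeout value estimated above.
GO_FOR_IT_WP = 0.35
PUNT_WP = 0.32
TIMEOUT_VALUE = 0.05


def net_wp_after_timeout(option_wp, timeout_value=TIMEOUT_VALUE):
    """WP of an option once the cost of burning a timeout is subtracted."""
    return option_wp - timeout_value


go_after_timeout = net_wp_after_timeout(GO_FOR_IT_WP)

# The timeout only pays for itself if the better choice beats the quick
# alternative by more than the timeout is worth.
break_even_edge = GO_FOR_IT_WP - PUNT_WP

print(f"go for it after timeout: {go_after_timeout:.2f}")
print(f"punt immediately:        {PUNT_WP:.2f}")
print(f"timeout worth it: {break_even_edge > TIMEOUT_VALUE}")
```

Put another way, with a timeout worth about .05 WP, the choice it buys has to be more than .05 WP better than the quick alternative. Here the edge is only about .03.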
Up Next
This was a very crude first approximation. In the second part of this article, I'll conduct a more sophisticated analysis using a regression model that will account for the full mix of variables including field position, score, and time that were aggregated in this first approximation. If the results of the more advanced method confirm the common sense estimate, we can have confidence in its approach, and ultimately produce a generalizable method for estimating the value of a timeout across all values of score, time and field position.
Good article...I like the question, the writing, and the approach. I especially loved "But instead of dumping a bunch of variables into regression software and pressing the 'I-believe-in-analytics' button..."
One note--I think that playing at home or away will be correlated with number of timeouts left. It's harder to get a play off in Seattle for opposing teams, and you might burn timeouts. It's also harder to win in Seattle. Will you take that into account in the next article?
So, one thing that popped into my mind reading this article.
Supposing that generally using a timeout early in the 2nd half is a worse idea than taking a delay of game (as your analysis implies may well be the case), could that not correlate with further bad coaching decisions down the stretch?
So fewer timeouts correlating to lower win rate than WP predicts might not be solely due to the timeout itself? A bit of a convoluted thought, but thought I'd share it...
How about a team leaves the field at halftime and is receiving the ball. For some reason they're not ready to play. Call a timeout at the beginning of the 3rd quarter or just take the delay of game?
What percentage of games does a team finish with 0 timeouts?
Brian, I saw this post about 4th and 7 and thought you would be discussing running 4 vertical plays on an offside. Can you run some kind of an analysis on what an offense should do in a situation where there is an offside? Should they run their normal play or should the QB look to throw a low percentage deep pass? On that 4th and 7, the receivers saw the offside so they immediately changed their routes from the normal routes to running vertical routes downfield.
Brian, is it possible that you are actually seeing a correlation that is not due to causation? For example, could having fewer than all 3 timeouts midway through the 3rd quarter be an indication that the team/coach has poor clock management skills in general, which would be correlated with lower WP?
In any case, very interesting. Looking forward to part 2...
Chase-In a one-score game where the score is within 8, the off has 0 TOs 36% of the time and the def has 0 TOs 28% of the time. Both have 0 7% of the time. That's a bit rough because the data is by snap and not a picture at exactly 0 seconds remaining. I filtered the sample by <40 seconds to play.
Lam and Chris-There could be some variance carried along in a lurking 'bad coach' factor. Can't rule it out, but it would be impossible to tell, at least until we start analyzing things like this.
Now that I think about it, it may be more likely that early timeouts are burned because of crowd noise (Dan's point in the 1st comment), so we may be capturing some variance due to HFA in the timeout variable.
"...was worth .35 WP - .05 WP = .30 WP--worse than had they punted right away."
It seems possible that the success rate of a play may be affected by having an extra minute to think about it. In a high leverage situation, that could easily be worth quite a bit, maybe more than the value of the timeout.
Thanks, Brian. That surprises me. My gut would have told me the percentage of games where a team finished with 1+ timeouts would have been a bit higher. I thought your WP estimate was too high, but in light of this evidence, maybe not.
Maybe in games that end with the score >8 points that's the case?
Good point above. I also think a lot of timeouts are called when the play clock runs down because the defense shows something unkind to the called play. Getting into some other play against another defensive look has real value.
Chase-My gut agrees. You'll see in part 2 that the estimates are more modest.
> Likewise, offenses win less often as they use their timeouts. Going from 3 to 2 timeouts reduces its raw win rate from 69% to 63%.
My instinct is that this conclusion probably has a post-hoc issue similar to "offenses that execute 10 or more run plays in the 4th quarter win more often". Both of those items may correlate to winning, but they are also things that teams do when they are already ahead.
Could you look at an offense's success rate on downs immediately after timeouts, and compare it to the same offense's success rate in identical situations?
It was the kicker who influenced the call in Seattle.
http://www.cbssports.com/nfl/eye-on-football/24415427/clutch-decision-by-seahawks-k-steven-hauschka-led-to-fourth-down-td
A lot of what I was thinking has been mentioned in the comments (change in success rate for calling a timeout - this would probably have to be a bayesian analysis, HFA, timeouts lost being correlated with poor coaching).
My other question is shouldn't the expected change in win probability from a timeout also have the percentage of games where a timeout would actually make a difference included? This might be addressed slightly by just looking at the different situations in the 3rd quarter, but a timeout in terms of clock management at the end of the game would only have a significant difference in select situations (close games, not blowouts or games where the losing team is already on the field and does not need to stop the opponent to save time).
Does the basic WP model not consider how many timeouts the teams have, to arrive at WP in late game situations? If not, that seems like a huge weakness of the model. If it did, you could get a fairly solid approximation of the timeout value by looking at the difference in WP between late-game situations where everything is equal besides the timeouts, and then weighting those situations by how common they are.
I think we're dangerously close to overthinking some of these biases. After all, being up by X points suggests a lot of things. Part of that lead might be due to bad coaching/game management etc...
HFA gets really small by the 3rd quarter too. Not that these effects can't exist. It's just that a) they're probably very small if they do exist, and b) we may have to accept them to have any good approximation of TO value.
Also, I get the impression from some of the comments that people read this to say the universal value of a TO is ~ .05. That's not what the article says. It only addresses the TO value in the subsample (3rd quarter, offense up by 1 score).
As others have pointed out, you also may be seeing a reverse causation here. Bad coaching/playing/losing makes teams less likely to win but more likely to burn time outs earlier. It could be the bad coaching that drives down the winning percentage rather than using the time outs.
Like Brian said, a lot of folks here are over-thinking this. Some of the comments are giving WAAYYY too much credit to coaches for engineering a win or loss by using timeouts.
The same thing could be said of the score or field position or a lot of things, but we can still be confident of causation because of the plain rules of the game.
It's not overthinking it to speculate that incompetent coaches are both 1) more likely to lose close games and 2) burn timeouts earlier. The cause of both is bad coaching, rather than the timeout lost causing the drop in WP.
The biggest problem with using all of one's timeouts too early is losing the ability to challenge. Although that has been somewhat mitigated by the rule that all scoring plays and turnovers are automatically reviewed, it's still a big deal. I would posit (without having yet read part 2) that later in the game, a timeout's "worth" gains value. I don't know for sure, but I would bet that there are ~10 games per year where a team runs out of time in their "comeback" attempt because a player is tackled in bounds, and the offense cannot get off another play. I'm not talking about times where team A takes the lead with less than 20 secs, I'm talking where they have 1 TO, 1:10 left, and the ball on their 20. Another scenario would be where the defense cannot stop the offense from running the clock down to nearly nothing before kicking a game-winning FG (Philly-NO playoff game scenario).