There’s a new study on run-pass balance based on game theory minimax equilibrium. The study is called "Professionals Do Not Play Minimax: Evidence from Major League Baseball and the National Football League," and it’s from Kenneth Kovash and Steven Levitt (of Freakonomics fame).
The authors created their own version of Expected Points as their measure of play success. Using a giant regression model that accounts for all sorts of confounding variables, they find passes lead to more success than runs. Game theory would say that, ideally, both strategies should yield the same amount of success.
Additionally, game theory recommends that the strategy mix be unpredictable. The authors find that running and passing on successive plays are negatively correlated, meaning they tend to alternate rather than be random, something Doug Drinen discovered (and I ignorantly rediscovered) previously.
The authors also look at Major League Baseball and do a similar analysis with similar results for pitch selection. Phil Birnbaum’s excellent critique is here. Thanks also go to Phil for the heads up on the study.
I’ll do a full review of the football part of the study when I get more time to digest it, but I wanted to point it out to everyone. On first read, I’ve got some points of concern, but nothing as fatal as Phil’s critique. Unfortunately it’s $5 to download, so if anyone is aware of an un-gated copy, please leave a link in the comments.
A New Academic Study on Game Theory and Run-Pass Balance
By Brian Burke, published on 9/29/2009 in game theory, research, reviews, run-pass balance, strategy
Hi Brian - Unrelated to this post and somewhat random... Gregg Easterbrook (TMQ) claims that a kickoff fumble is the worst turnover possible, because a team just scored and now they have the ball again. Is there a way to see if that's true? I would imagine it might be because a fumbled kickoff would typically give the kicking team great field position, but is it worse than, say, a fumble on the first play after a kickoff?
Seems to me that's an error in logic.
The value of a play isn't conditional on what happened in the past. What's the difference between a fumble on the kickoff recovered by the kicking team at the 20 and a fumble on the first play of a drive from the 20? What if you just intercepted a 20-yd pass into the end zone for a touchback and then fumble on the next play?
All of those situations give the ball to the opponent at your own 20. Yeah, a score-then-fumble is a bad combo, but the value of the fumble part isn't enhanced because of previous events.
Turnovers near either end zone are more costly than toward midfield due to the non-linearity of the EP curve. (It steepens inside the 20s.)
That's a weird thing for Easterbrook to say because he is generally a scientific guy. It sounds like he is playing on a momentum argument. Points are points. Field position is field position. Possession is possession.
I don't think it really matters how you get points, field position or possession. All it matters is that you have them.
Obviously, a team is better off with 14 extra points than without them, but whether the 14 points come on two consecutive plays or with 20 plays in between, I'm not sure I see why it matters.
On the study, I haven't read it yet...but let me comment anyway. I hate it when people do that but I'm a hypocrite.
I think this has a lot to do with the gambler's fallacy and how people misinterpret true randomness. Studies have shown that when people try to replicate random patterns off the top of their head, they are fairly predictable and that it's reasonably easy to tell the difference between a true random pattern and a human's attempt at being random.
People think that random events should not form streaks nearly as often as they do. So in an attempt to "randomize", they overcorrect and make sure there are no streaks.
They think that throwing the same pitch three times in a row is predictable. Of course, if you go out of your way not to throw the same pitch three times in a row, you are being even more predictable than if you sometimes throw three in a row and sometimes don't.
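One rough way to check for this kind of over-correction is the lag-1 correlation of a play-call (or pitch) sequence. A value near zero is what independent calls would produce, while a clearly negative value means the caller alternates too much. The sketch below is only an illustration: the function name `lag1_correlation` and the toy sequences are assumptions, not data or methods from the study.

```python
def lag1_correlation(calls):
    """Correlation between each call and the next one (1 = pass, 0 = run)."""
    current, following = calls[:-1], calls[1:]
    n = len(current)
    mean_cur = sum(current) / n
    mean_fol = sum(following) / n
    cov = sum((a - mean_cur) * (b - mean_fol)
              for a, b in zip(current, following)) / n
    var_cur = sum((a - mean_cur) ** 2 for a in current) / n
    var_fol = sum((b - mean_fol) ** 2 for b in following) / n
    if var_cur == 0 or var_fol == 0:
        raise ValueError("sequence has no variation, correlation is undefined")
    return cov / (var_cur * var_fol) ** 0.5

# Toy sequences (invented): strict alternation versus long streaks.
alternating = [1, 0] * 10
streaky = [1] * 10 + [0] * 10

print(lag1_correlation(alternating))  # strongly negative
print(lag1_correlation(streaky))      # strongly positive
```

A caller who "never throws three in a row" would land on the negative side of zero, which is the pattern the study reports.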
The negative correlation between successive pitches/plays could be an error by pitchers & offenses of being too predictable, or it could reflect an error by batters & defenses of tending to expect the pitcher/offense to repeat what they just did (or being more prepared for it, because they've just been primed to see it). Is there data to test whether pitchers/offenses suffer because of this predictability in their playcalling?
"Game theory would say that, ideally, both strategies should yield the same amount of success."
I don't think this is true. The mix of running vs passing is part of what determines the success rate of each (a defense will adjust if it knows you pass 70% of the time rather than 56%, even if you do randomize your play calling more). You do not necessarily want each type of play to have the same level of success; you want to choose the combination of runs and passes that maximizes total success. This is dependent on the marginal success of a run play and a pass play at each combination of the two. It could be the case that increasing pass plays lowers the success of such plays more than it raises the success of run plays, and that could lower total success (I'm just using "success" to keep in the terminology of the blog post; and I'm not arguing that I know passing more isn't right).
http://gravityandlevity.wordpress.com/2009/05/28/braesss-paradox-and-the-ewing-theory/
See here for a blog post that applies this concept to basketball. Notice that the optimal strategy isn't to have Ewing shoot until his percentage matches that of his teammates; it is to have him shoot until the total FG% is maximized.
I haven't read the paper (refuse to shell out $5), so maybe they address these topics. But this is a topic I think would be really interesting to address, keeping in mind that you probably do not want passing success to equal running success.
Birnbaum makes a similar mistake when he claims:
"But it *can* tell you that you should adjust your strategy until the OPS-after-fastball is exactly equal to the OPS-after-non-fastball."
Great find Brian. I plunked down the dumb $5 for this. Anyway, I don't think I'm up to the challenge of determining if their "constructed success metric" properly takes into account all the variables, and so I don't know if I can critique the hyper specifics (that such and such inefficiency equals a point a game which equals half a loss a season, etc). But their conclusions in broad strokes seem banal, as you point out Brian: (1) teams ought to pass more than they do; and (2) the serial correlation from down to down seems out of whack and suboptimal, with teams running too often after a bad pass play and vice versa more than they should.
In that regard there doesn't seem to be much new. I did find footnote 13 interesting, which basically says (your comment page won't let me copy/paste): the Alamar paper found an extreme disparity of 1.8 yards per attempt differential, based on stats from PFR, but they conducted their own review based on PFR's stats (and specifically the "adj net yards gained per pass attempt"), and found that a disparity of 0.5 still existed, which was "consistent with [Levitt and Kovash's] findings."
Lastly, I saw that comment from TMQ and it seemed off base. Yet in context it also could have just been that, in the flow of the game -- even as a fan -- it seems rather backbreaking. Which is true, even if not scientifically valid or anything. I'm not even arguing perception becoming reality, just that a fumble on a kickoff is sure to elicit a "Well that sucked."
I read the summary, not the full article, but one thing the summary does not address is the value of running down the game clock and avoiding interceptions by rushing with a lead. Teams do not always have an incentive to maximize points scored.
So, would the ideal method of play-calling be a list of pass plays, a list of rush plays, a chart showing the ideal mix given different down/distances and a random number generator?
The defense wouldn't be able to predict the play because it's essentially random. All you as a coach know is your expected utility is the same regardless of rush or pass, and you don't really care what type of play you end up with.
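As a minimal sketch of that idea, a chart of pass probabilities plus a random number generator could look like the following. The chart entries, the distance buckets, and the name `call_play` are all hypothetical, made up for illustration rather than estimated from any data.

```python
import random

# Hypothetical chart: (down, distance bucket) -> probability of calling a pass.
# These numbers are invented for illustration, not estimated from NFL data.
PASS_PROBABILITY = {
    (1, "long"): 0.55,
    (2, "short"): 0.40,
    (3, "long"): 0.85,
}

def call_play(down, distance, rng, chart=PASS_PROBABILITY):
    """Return "pass" or "run" by drawing against the chart's pass probability."""
    p_pass = chart[(down, distance)]
    return "pass" if rng.random() < p_pass else "run"

rng = random.Random(2009)  # seeded only so the example is repeatable
calls = [call_play(3, "long", rng) for _ in range(10)]
print(calls)
```

Because each call is an independent draw, streaks of the same play type will show up about as often as chance says they should.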
On the Easterbrook kickoff return fumble (he calls it a krumble) it is a momentum argument, added on to a field position one. Essentially he is grouping the TD/FG, the XP and the kick return into one large 'net game position'. Therefore TDs followed by a fumble on the kickoff return have more value than a touchdown and normal return.
Yes, the value of the return is unaffected by what happened before - but the game situation depends on what's just happened and I think that is what Easterbrook's krumble theory tries to look at.
Dave-
I hear what you're saying, but the game theory equilibrium does dictate that the utility of each strategy will be equal across strategies. So in the Ewing example, if a bball team is spreading the ball properly, each play the team calls will have the same payoff in the long run. This assumes the defense is also playing at their equilibrium strategy mix.
So the mix might be: let Ewing post-up 80% of the time and kick out for a 3-pointer 20% of the time. If a team does this in the right proportion, the points per possession for both strategies would be equal. If 'let Ewing post-up' has a higher payoff per possession than the 3-pointer, then the offense should feed it to Ewing more often, and vice versa.
If the defense is not at its equilibrium strategy mix, then the offense can exploit it. Usually in this case, there will be a single offensive strategy that dominates all the others. The offense should hammer on that strategy until the defense adjusts back to equilibrium.
In the NFL right now, that would mean pass, pass, and then pass some more until the defense adjusts, and then mix passes and runs at a ratio where the expected payoffs of both strategies are equal. When they're equal, the offense locks in the best long-run payoff it can guarantee itself, no matter what the defense does.
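To make the equal-payoff claim concrete, here is a small sketch of a 2x2 zero-sum game solved with the standard closed-form mixed-strategy formulas. The payoff numbers are invented purely for illustration, the function name `solve_2x2` is an assumption, and the formulas only apply when the game has no pure-strategy saddle point.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d):
    """Mixed equilibrium of a 2x2 zero-sum game with offensive payoffs
        [[a, b],   row 1 = run,  columns = (run defense, pass defense)
         [c, d]]   row 2 = pass
    Returns (P(offense runs), P(defense plays run defense), game value)."""
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("saddle point exists; a pure strategy is optimal")
    denom = a - b - c + d
    p_run = Fraction(d - c, denom)
    q_run_defense = Fraction(d - b, denom)
    value = Fraction(a * d - b * c, denom)
    return p_run, q_run_defense, value

# Invented payoffs (think of them as average yards), not real NFL numbers.
a, b, c, d = 3, 5, 8, 4
p_run, q_run_def, value = solve_2x2(a, b, c, d)

# Expected payoff of each offensive play type against the equilibrium defense.
run_payoff = q_run_def * a + (1 - q_run_def) * b
pass_payoff = q_run_def * c + (1 - q_run_def) * d

print(p_run, q_run_def, value)   # 2/3 1/6 14/3
print(run_payoff, pass_payoff)   # 14/3 14/3
```

In this toy game the run and the pass earn the same expected payoff once the defense is at its equilibrium mix, which is the sense of "equal" used above.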
Brian:
You are correct in situations where the marginal and average success of plays is the same. But that's not always the case. Take stolen bases in baseball for example. Teams should optimally attempt steals whenever the chance of success (given baserunner, pitcher, catcher) is equal to or greater than the break-even point (at which the WP gain from successes equals the cost of failure). If a team does that, the average gain from steal attempts will be greater than the gain from non-attempts. But that doesn't mean the team should steal more, because additional attempts will be made by slower runners and/or in more difficult circumstances (left-handed pitchers), and will fall short of the break-even point.
Similarly, in basketball all shot attempts do not have the same success rate. The chance of success varies by the second, as men become open, etc. If Ewing is used optimally, it means that each extra/marginal shot attempt you give him should be no better than giving it to a teammate. But his AVERAGE production may still be higher than other players, because his skill allows him to take more high-probability shots per game.
Now in football, marginal and average may be the same for runs and passes. I don't know, but seems plausible. But let's say that pass success rates depended heavily on the defensive alignment, and the decision to pass/run was often made after the QB saw the defensive alignment. Many pass attempts would then be made when success probability was high. In that case, all pass attempts are not equal: each additional pass attempt would be less likely to succeed than prior attempts (as with stolen bases), and it would be possible for the average pass to be better than the average run and yet still be equal at the margins.
Thinking some more about this, if equilibrium required equal average success, shouldn't it then be true that every running back on a team would have the same average yardage gain? (If not, the team theoretically should be giving the ball more to the back with the highest gains). And shouldn't every receiver on a team have the same average pass value (completion/yardage)? And every single player in the NBA, or at least every player on a given team, should then have the same effective field goal percentage.
But of course we don't see anything remotely like that. It's hard to believe this all reflects massive failures by teams to apportion opportunities correctly. Much more plausible, to me, is the idea that a player's productivity at the margins is not the same as his average productivity.
If:
-their sample sizes ( # of carries) were big enough
-each RB were used in identical situations (down, distance, field position)
-and each yard gained were equally valuable,
then the answer is yes.
But that never happens. Most back up RBs are specialists--3rd down pass catchers or short-yardage bruisers for example. Also, the sample size for any single RB is far too small. And lastly, we need a valid linear measure of utility, and yards don't suffice.
Yeah, I'm sticking to my guns here, and agree with Guy's point about average vs marginal success.
Brian, I agree that *marginal* utility should be equal across play types (I'll continue to keep it simple as run vs pass). But that still does not suggest that the average success for each should be equal. It means that, at the optimal strategy, for the slightest change in mix, the gain (or loss) in the success from runs is equally offset by the loss (or gain) in the success from passes. This may or may not happen at the point in which the average success of a run play and a pass play are equal. It seems more likely that they would be different, given that any success measure has only 1 equal value, but infinite inequalities.
The point of the Ewing link that I provided above is that you clearly do not want expected points per possession to be equal when determining the mix of Ewing's shots. The conclusion is:
"As you can see, the team is most efficient when Patrick takes only about 21% of the team’s shots, just slightly more than everyone else. It seems ridiculous at first: in such a game Patrick would be shooting 60% while his teammates shot only 45%; surely he should be getting more shots. But the added benefit of keeping Patrick more poorly-defended pays off, and his team’s shooting percentage improves to about 48.5%."
That's because for the extra shots that put him over 21% of the mix, he shoots less than 45%, even though his total average stays above 45% when you include all of his shots (up to a point... eventually his total average shooting % also falls under 45% as his shots increase to a very high % of the mix). So at the point at which Ewing has taken 21% of the team's shots, the team would be better off having a teammate who shoots 45% take the next shot, even though Ewing's average for all of his shots is still at 60%. He shoots a very high percentage when his mix of the shots is low, but it's a downward sloping function, which means that his marginal shooting percentage as his mix increases is below his average shooting percentage.
That's about as clearly as I can explain it, and I certainly think it applies to football run/pass mix. I also think it's an unintuitive, but important bit of strategic thinking that applies widely to any sport in which the mixture of tactics affects the success rate of those tactics.
I've read just about everything I can on 2-player zero sum game theory, and I've never even come across marginal utility considerations. I think what the Ewing example shows is that it pays to exploit a defense that may not be playing at the minimax equilibrium.
Another point is that you can't divide Ewing's shots into 2 groups like that. You don't have his first 21% of his shots and then the rest of the shots above this percentage. You never know where you are on the way to his total share. The defense will play according to the probability he will get the next shot, and not according to whether or not he's had so many shots so far in a game or season or whatever. Their expectation of that probability is based on his overall percentage, not marginal percentage.
Ultimately, I think our difference of opinion boils down to a definition of utility. For example, there may be some marginal advantage to the run that "sets up the pass," where a run might actually cost yards or points so that eventually the pass will be even more effective. But that advantage provided by the run has to be credited to the total utility of the run. Ultimately the total utilities have to be equal.
You're right about the "next shot" idea... I was just trying to lay out what the margin is in this case, but that is a sloppy way to word it. I will take one more stab at explaining this, then I will agree to disagree if you are still not convinced:
Back to the Ewing example. OK, don't think of it as the "next shot". But you do have to define "the margin". So basically think of it like this: there are varying mixes of Ewing shots as a percentage of his team's total shots, from 0% to 100%. Each combination results in a different shooting percentage for Ewing, and hence different shooting percentage for the entire team. Ewing's shooting percentage starts very high when he takes a very small percentage of shots. But as his shots take a larger percentage of the team shots, his percentage drops as he commands more attention from the defense. His teammates in this example are good for a FG% of 45% at any mix of the shots. When the Knicks are game planning, they can plan for the mix of shots that will come from Ewing. At 21% of the team's shots, his total shooting percentage is 60%. But if they increase his shots as a mix past 21%, his shooting percentage on the additional shots is below 45%, even as his total shooting percentage for the game remains higher. Here, the margin is the extra shots he takes compared to a game plan with Ewing taking a different percentage of the team's shots (generally speaking, the margin can be identified by asking “as compared to what?”). They are not identifiable, individual shots; but the mere act of taking a higher mix of shots draws more attention from the defense and lowers his total percentage, thus giving the extra shots a lower FG%. Starting from a mix of 0%, they want to increase Ewing’s mix of the team’s shots until the expected FG% from an incremental Ewing shot equals that of his teammates, in this case 45% (and we’ll just ignore 3 pointers for simplicity). Again that incremental shot is not an identifiable shot; it's simply the act of shooting one more time during the game than he otherwise would.
I just whipped up a quick spreadsheet to help demonstrate Ewing’s performance as his mix increases (I made up my own data similar to the example I had linked to):
http://i553.photobucket.com/albums/jj369/dave_econ/Ewing_Total_Marginal-1.jpg
It's a set of game plans (a subset of all possible game plans; note that I switch from 5% to 1% increments right around 21% of the mix), whereby the margin is defined as the extra shots taken and extra expected made shots by Ewing. It demonstrates how his FG% drops below 45% for his marginal shots when his mix is over 21%. Right around the 21% mix mark is where the marginal utility of having Ewing take an extra shot (over the course of the game) equals the marginal utility of his teammates taking an extra shot (where they both equal 45%). At that point, where the marginal utility of giving an extra shot to Ewing or his teammates is equal, the team is indifferent to the choice, even though Ewing’s FG% for the game will be much higher. But once Ewing’s percentage of the mix passes the point at which his next shot has a 45% FG%, they would rather have his teammates take an extra shot, even though Ewing’s FG% for the entire game remains above 45%.
The blue dots on the graph represent his total FG% across different mixes of the Knicks' shots, and the red dots are the FG% of his marginal shots. It is the marginal utility that is relevant when deciding the mix of shots taken by Ewing. And I think the same applies to the pass/run mix question.
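The average-versus-marginal distinction can also be sketched in a few lines of Python. This is a toy model under loudly stated assumptions: Ewing's FG% is taken to fall linearly with his share of the shots, with coefficients picked only so that the optimum lands near the 21% and 60% figures quoted above. It is not the model from the linked post or the spreadsheet, and all names are hypothetical.

```python
TEAMMATE_FG = 0.45  # teammates' FG% at any mix, as in the example above

def ewing_fg(share):
    """ASSUMED: Ewing's average FG% declines linearly with his shot share."""
    return 0.75 - (0.15 / 0.21) * share

def team_fg(share):
    """Team FG% when Ewing takes `share` of the shots."""
    return share * ewing_fg(share) + (1 - share) * TEAMMATE_FG

def marginal_ewing_fg(share, step=1e-6):
    """FG% on Ewing's incremental shots: slope of his made shots per team shot."""
    made = lambda s: s * ewing_fg(s)
    return (made(share + step) - made(share)) / step

# Search a grid of game plans (0% to 100% in 0.1% steps) for the best team FG%.
shares = [i / 1000 for i in range(1001)]
best_share = max(shares, key=team_fg)

print(best_share)                     # share of shots that maximizes team FG%
print(ewing_fg(best_share))           # Ewing's average FG% at that share
print(marginal_ewing_fg(best_share))  # FG% on his marginal shots at that share
print(team_fg(best_share))            # resulting team FG%
```

Under these made-up numbers the team does best where Ewing's marginal FG% has fallen to his teammates' 45%, even though his average is still around 60%.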
Ok. Thanks, Dave and Guy. I'll take a look at this and digest it. If I don't respond it's not that I'm ignoring. I just need to wrap my head around it.
Let me try to define the Ewing idea in a different way. Don't think of it as next shot at all. Rather think of it as type of shots. If Ewing were to take 45 shots in a game (sorry not an NBA guy so don't know if my numbers are realistic), presumably there would be a lot of fadeaway jumpers, three pointers, shots against double teams. However, if rather than take these shots where his FG% is for example 21% he passed to his teammates they could presumably get a better shot off. However, if he's only taking 21 shots per game, perhaps it's because he's playing within himself with 1 on 1 post up moves, layups and slams. The same thing applies in football. It may be too simplistic to always say run or pass more, b/c we need to be specific about the types of runs or passes, not just the number.
Dave: The problem with your Ewing example is the assumption that only Ewing, and not the other players, has a usage/efficiency tradeoff. Why would that be? In game theory, the assumption is that the defense can choose between a "Ewing defense" and a non-Ewing defense. That gives you four possible outcomes on each play. It can be shown mathematically (I now see) that no matter what those four values are, there will be an equilibrium at which neither side can gain by changing its distribution of strategies, and the average outcome on both strategies is identical.
That said, I'm not sure this applies to basketball. And that's because in basketball, the offense decides its play (Ewing or non-Ewing shot) with knowledge of the defensive position at that moment. That's very different from the game theory model, in which both sides are blind. If the defense devotes resources to Ewing, it's not (always) too late for him to let another player shoot. What teams do is let the player who seems to have the highest probability shot take the shot, within the 24-second constraint. Star players will get more high-probability shots, while lesser players will shoot when the star doesn't have a better shot. In those conditions, I don't think equilibrium requires the star player and non-stars to be equally productive.
I would imagine that a 75% pass 25% run mix would be optimal