Is Sabermetrics Helping to Ruin Baseball?

Caution: A lot of opinion to follow. And please understand I love baseball and I obviously am a believer in analytics.

Imagine a world in which baseball teams had no idea how well or poorly a player could be expected to perform from year to year. All players would be total blank-slate mysteries. In this world, it wouldn't matter at all how high or low a team’s payroll was. A team that spent $1M on payroll would be just as likely to win a championship as a team that spent $200M. The correlation between payroll and wins would be zero.

Now imagine the opposite scenario in which teams had perfect crystal-ball foresight about exactly how well a player could be expected to perform. Now the team with the highest payroll would always have the most wins because there would be a 1:1 direct linkage between performance and pay. This would make the correlation between payroll and wins a perfect 1. We wouldn't even need to play out the season. Just mail the trophy to the team with the highest payroll and save everyone the trouble.

In reality we’re about halfway between the first case and the second case. We are neither clueless about expected performance nor in possession of perfect predictive knowledge. MLB's correlation between team payroll and wins has been trending at about r = 0.4. It varies from season to season, but its recent central tendency appears to be right around 0.4.

The problem that analytics causes is this: The better the sabermetrics (or the more uniformly adopted it becomes), the better the crystal ball becomes and the closer we move toward a correlation of 1...which, by the way, is a bad thing. Of course, we'd never see a correlation of 1 because of injuries and simple randomness, but it could get much higher as it did in earlier eras. The correlation has been higher and lower in years past, varying for reasons that have nothing to do with analytics. But the point is that sabermetrics, to the degree it is effective, makes the correlation higher than it otherwise would be, and this gives an ever greater advantage to big-payroll teams.
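To make the thought experiment concrete, here is a minimal toy simulation in Python. Everything in it is an assumption for illustration, not a model of real MLB data: payroll is a standardized number, `forecast_quality` (0 = blank slate, 1 = crystal ball) controls how much of a team's true talent its payroll actually buys, and `luck_sd` stands in for injuries and randomness.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def simulate_correlation(forecast_quality, luck_sd=1.0, n_team_seasons=60000, seed=0):
    """Toy model (hypothetical): correlation between payroll and results.

    forecast_quality: 0.0 means teams cannot predict performance at all,
                      1.0 means payroll buys exactly the talent it pays for.
    luck_sd:          size of season-to-season randomness relative to talent.
    """
    rng = random.Random(seed)
    payrolls, results = [], []
    for _ in range(n_team_seasons):
        payroll = rng.gauss(0, 1)  # standardized payroll
        # Talent is partly what the payroll bought, partly forecast error.
        forecast_error = math.sqrt(1 - forecast_quality ** 2) * rng.gauss(0, 1)
        talent = forecast_quality * payroll + forecast_error
        # Season result is talent plus luck (injuries, bounces, and so on).
        result = talent + luck_sd * rng.gauss(0, 1)
        payrolls.append(payroll)
        results.append(result)
    return pearson(payrolls, results)


if __name__ == "__main__":
    for quality in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(quality, round(simulate_correlation(quality), 2))
```

In this toy setup the expected correlation is forecast_quality / sqrt(1 + luck_sd^2), so better forecasting pushes the payroll-wins correlation up, and luck keeps it below 1 even with a perfect crystal ball.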

Game Probabilities - Week 9

Game probabilities for week 9 are up at the New York Times. This week I take a quick look at how much things have changed in New England and Pittsburgh.

Pittsburgh's defense ranks 26th in the league in yards allowed per play, and its offense is struggling to run the ball. The Steelers rank 29th in the league in running Success Rate--the proportion of running plays that result in an improved scoring potential. Only the Giants, Jaguars, and Ravens rank behind the Steelers in running success...

Podcast Episode 7 - Virgil Carter

Joining the show this week is Virgil Carter, the man many consider to be the founding father of advanced football statistics. Dave, Brian and Virgil look back at Virgil's playing days when he was suiting up at quarterback for the Bears while studying for his MBA at Northwestern in the off-season. It was there at Northwestern that he published his first paper, Operations Research on Football. That paper introduced the idea of expected point value based on game situation, an idea that is still at the core of advanced football analysis. Virgil also talks about what it was like to play under head coach Paul Brown, and why if it weren't for him, Bill Walsh might have never needed to create his "West Coast" passing offense. 

If you want to learn more about Virgil and his fascinating career both on and off the field, check out the following links:


-Sports Illustrated article from October, 1972: "Handy Pair of Brainy Bengals"
-Virgil's graduate research paper: Operations Research on Football
-Pro-Football-Reference Player Page: Virgil Carter

Subscribe on iTunes and Stitcher

Playoff Projections - Week 8

Just like past years, all of the numbers below come from Chris Cox at NFL-forecast.com. His app uses the win probabilities from the ANS team efficiency model to run a Monte Carlo simulation of the remaining NFL games thousands of times. Based on current records, our estimates of team strength, and knowledge of the NFL's tiebreaking procedures, we can come up with some pretty interesting predictions of how each team will fare come the end of the season. If you want to use a different model or just fiddle with the numbers by hand, go ahead and download the app yourself.
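For readers curious what a Monte Carlo simulation of the remaining schedule looks like in principle, here is a minimal Python sketch. It is not the NFL-forecast.com app or its method: the function name, team names, records, and win probabilities are all hypothetical, and ties at the top are broken by a coin flip rather than by the NFL's real tiebreaking procedures.

```python
import random
from collections import Counter


def simulate_division(current_wins, remaining_games, n_sims=10000, seed=0):
    """Estimate each team's chance of finishing with the most wins.

    current_wins:    dict of team -> wins so far (hypothetical input).
    remaining_games: list of (home, away, p_home_win) tuples, where
                     p_home_win would come from some game-probability model.
    """
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        wins = dict(current_wins)  # fresh copy of the standings for each run
        for home, away, p_home_win in remaining_games:
            winner = home if rng.random() < p_home_win else away
            wins[winner] += 1
        best = max(wins.values())
        leaders = [team for team, w in wins.items() if w == best]
        # Simplification: random tiebreak instead of the real NFL rules.
        titles[rng.choice(leaders)] += 1
    return {team: titles[team] / n_sims for team in current_wins}


if __name__ == "__main__":
    # Made-up standings and probabilities, for illustration only.
    standings = {"Team A": 5, "Team B": 4, "Team C": 3}
    schedule = [
        ("Team A", "Team B", 0.55),
        ("Team B", "Team C", 0.60),
        ("Team C", "Team A", 0.45),
    ]
    print(simulate_division(standings, schedule))
```

Swapping in a different model only means changing the `p_home_win` values; the simulation loop itself stays the same.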

What is the deal with these numbers?

This is our first look at playoff projections this year so it seems like a good time to talk about how good these predictions really are. Like all models, this one has limitations. Here are some of the issues that will come up.

Team Efficiency Rankings: Week 8

If you've read this column before, you know these rankings are all about cancelling out the noise of past perception and evaluating every team's performance based solely on their data from this season.  That does have some limitations, such as not being able to account for injuries, but once the sample size of games gets large enough, it's a fairly accurate predictor of future success.

Halfway through the season, there are no more huge swings waiting to happen.  That does not mean teams cannot rise or fall—last year, the 3-5 Bengals and Redskins made the playoffs while the 7-1 Bears and 6-2 Giants missed the dance—but the teams near the top are quite likely to stay there.

If we accept that, then perhaps we should start taking our new number one team a bit more seriously.

Slate: Punt From the Opponent's 26?

I make my first appearance this season at Slate to propose that DAL may have been better off punting from the DET 26 than attempting a 44-yd FG. If DAL could have pinned DET at or inside their own 10, the numbers suggest punting might have been preferable to going for it or trying the FG. It may have even been preferable to making the FG. I also discuss DAL's holding penalty that was the start of the critical path toward an improbable comeback.

...But there’s an extra wrinkle. Strangely, Dallas would have preferred to keep Detroit within 3 points rather than extend its lead to 6. When desperate teams like the Lions with no timeouts remaining get into the outer rim of field goal range, they send in the field goal unit for a long-range attempt. This is an irrational decision, one I discovered the very first time I began looking at win probability numbers. Rather than try to win the game, teams in this situation settle for a tie—or rather, an attempted tie. Even if the field goal attempt is good, it only buys a 50–50 shot at the win in overtime...

When the Defense Should Decline a Penalty After a Loss Part 2 (2nd Downs)

I recently looked at when it made sense for the defense to decline a 10-yard holding penalty following a 1st down play for no gain or a loss. It turned out that defenses should generally prefer to decline after a loss of 3 or more yards.

First downs are easier to analyze because they almost always begin with 10 yards to go. Unfortunately, 2nd downs aren't so cooperative. It's amazing how thin the data gets sliced up. Most downs aren't losses, even fewer have holding penalties, and rarely are they declined. Still, there are enough cases for a solid analysis using 1st-down conversion probability as the bottom line.

Put simply, a defense would prefer to decline a penalty on a 2nd down play whenever the resulting 3rd down situation leads to a conversion less often than the 2nd down plus the 10 yards.

The chart below plots conversion probability for 2nd and 3rd down situations. The red line illustrates the conversion probability of 3rd down and X to go situations. For example, 3rd down and 7 situations are converted about 40% of the time.

The green line illustrates 2nd down situations, but slightly differently. It plots conversion probabilities for 2nd down and X plus 10 yards. For example, 2nd and 13 (i.e. 3 + 10 yds) situations are converted 45% of the time. The black line is the smoothed line fitted to the 3rd down conversion rates. I plotted things this way because it's the actual comparison we're interested in, given a gain of zero yards.
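The decision rule can be written as a short Python sketch. The two conversion rates are the ones quoted above (3rd and 7 at about 40%, 2nd and 13 at 45%); every other entry in the charted curves is left out rather than guessed. The function name and the assumption that the penalty is marked off 10 yards from the previous spot are mine, for illustration only.

```python
# Conversion rates quoted in the text; other distances are deliberately omitted.
THIRD_DOWN_CONVERSION = {7: 0.40}    # 3rd and 7 converts about 40% of the time
SECOND_DOWN_CONVERSION = {13: 0.45}  # 2nd and 13 converts about 45% of the time


def should_decline(to_go, yards_lost, third=None, second=None, penalty_yards=10):
    """Return True if the defense should decline holding on a 2nd-down play.

    to_go:      yards to go before the 2nd-down snap.
    yards_lost: yards the offense lost on the play (0 for no gain).
    Assumes (simplification) the penalty is enforced from the previous spot.
    """
    third = THIRD_DOWN_CONVERSION if third is None else third
    second = SECOND_DOWN_CONVERSION if second is None else second
    # Decline: the play stands, so the offense faces 3rd down from the new spot.
    p_if_declined = third[to_go + yards_lost]
    # Accept: 2nd down is replayed with the penalty yardage added.
    p_if_accepted = second[to_go + penalty_yards]
    # The defense wants whichever option converts less often.
    return p_if_declined < p_if_accepted


if __name__ == "__main__":
    # 2nd and 3, offense loses 4 yards with a hold on the play:
    # decline -> 3rd and 7 (40%), accept -> 2nd and 13 (45%).
    print(should_decline(to_go=3, yards_lost=4))
```

With those two quoted rates, a 4-yard loss on 2nd and 3 is a case where declining leaves the offense in the worse spot. A fuller version would need the complete conversion curves from the chart.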

Packers' Perfect Third Quarter

After a grown-man run from Adrian Peterson to end the first half, the Green Bay Packers opened the second half leading the dismal Vikings by only a touchdown, 24-17. Aaron Rodgers led the Pack on a 16-play, 80-yard touchdown drive that lasted over eight minutes. During the march, Green Bay converted on three third downs and a fourth down. Let's look at the progression of the drive using our Markov model: