World Cup Penalty Kicks, Wimbledon Serves, and Intuitive Algebra

Maybe June should be "other sport month" here at Advanced NFL Stats. Except for Albert Haynesworth's principled, valiant stand against being forced to play defensive tackle a foot and a half to the right of where he is accustomed, there's not much going on in the NFL. Fortunately, there are plenty of other sports going on, including the world's biggest events in soccer and tennis.

Soccer and tennis offer two of the best examples of simple two-strategy zero-sum game theory. Soccer offers us the penalty kick, when a player kicks to the left or right extreme side of the goal so hard that the goalkeeper must simultaneously guess a direction to lunge. Tennis gives us the serve, where the server aims for either the extreme forehand or backhand side of his opponent's service box. Both examples give us the opportunity to examine how well experts are able to approximate the optimum strategy mix.
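To make the "optimum strategy mix" concrete, here is a minimal sketch of how the equilibrium mix in a two-strategy zero-sum game can be computed from the indifference condition: each side mixes so that the opponent gains nothing by favoring either option. The function name is hypothetical and the scoring probabilities are made-up numbers for illustration, not measured penalty-kick data.

```python
def solve_2x2_zero_sum(a, b, c, d):
    """Mixed-strategy equilibrium of a 2x2 zero-sum game.

    Payoffs are the kicker's probability of scoring:

                    keeper dives left   keeper dives right
        kick left          a                   b
        kick right         c                   d

    Returns (p_kick_left, q_dive_left, value).
    """
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate game: no unique interior mix")
    # Kicker's mix makes the keeper indifferent between diving left and right.
    p_kick_left = (d - c) / denom
    # Keeper's mix makes the kicker indifferent between kicking left and right.
    q_dive_left = (d - b) / denom
    if not (0 <= p_kick_left <= 1 and 0 <= q_dive_left <= 1):
        raise ValueError("a pure strategy is optimal; the indifference formula does not apply")
    # Scoring probability when both sides play their equilibrium mix.
    value = (a * d - b * c) / denom
    return p_kick_left, q_dive_left, value


# Hypothetical scoring probabilities (assumed for illustration only).
p, q, v = solve_2x2_zero_sum(0.60, 0.95, 0.90, 0.70)
print(f"kick left {p:.3f}, dive left {q:.3f}, scoring rate {v:.3f}")
```

With these assumed numbers the kicker should go left a bit more than a third of the time, even though neither side is "better" in every situation. Comparing a mix like this with what players actually do is the kind of check the penalty kick and the serve allow.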

Play-by-Play Data

I've recently completed a project to compile publicly-available NFL play-by-play data. It took a while, but now it's ready.

The resulting database comprises nearly all non-preseason games from the 2002 through [edit: 2012] seasons. I have not performed any analysis on the data, so what you'll get are only the basics--time, down, distance, yard line, play description, and score. It's almost exactly what I started with. I'll leave any analysis up to you.

Actual vs. Theoretical WP at the World Cup

In response to a few questions on my last post regarding my World Cup win probability (WP) model, here are some actual numbers to chew on. An anonymous commenter pointed us to actual win rates at (a fun site by the way). I've graphed the actual rates below.

The actual win rates are for 708 games stretching all the way back to 1930. The theoretical WPs based on a Poisson distribution are the solid lines, and the actual rates are the little triangles and squares. Keep in mind these are the WPs for the trailing team.
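For readers who want to see how a Poisson-based WP for the trailing team might be put together, here is a rough sketch. It is not the model behind the graph. It assumes both teams score independently at the same constant rate, looks only at the end of regulation, and uses a placeholder rate of 1.3 goals per team per 90 minutes. The function names and the default rate are assumptions.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def trailing_team_outcomes(deficit, minutes_left, goals_per_90=1.3, max_goals=20):
    """Return (win, draw, loss) probabilities for the team trailing by `deficit`.

    goals_per_90 is an assumed scoring rate per team, not a fitted value.
    max_goals truncates the sums; the tail beyond it is negligible here.
    """
    lam = goals_per_90 * minutes_left / 90.0
    pmf = [poisson_pmf(k, lam) for k in range(max_goals + 1)]
    win = draw = loss = 0.0
    for own in range(max_goals + 1):
        for opp in range(max_goals + 1):
            prob = pmf[own] * pmf[opp]
            margin = own - opp - deficit  # final margin for the trailing team
            if margin > 0:
                win += prob
            elif margin == 0:
                draw += prob
            else:
                loss += prob
    return win, draw, loss


# Example: down one goal at halftime.
win, draw, loss = trailing_team_outcomes(deficit=1, minutes_left=45)
print(f"win {win:.3f}, draw {draw:.3f}, loss {loss:.3f}")
```

Calling the function for each minute of the match and each deficit would trace out curves of the same general shape as the theoretical lines, though the exact values depend on the assumed scoring rate.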

Open Wide for Some Soccer!

"Fast kickin', low scorin',...and ties--you bet!"

I just returned home to the good ol' USA after a week overseas. Evidently, there is some sort of sporting tournament underway in some other country somewhere. It's confusing because sometimes they call this sport "football," except that it's the same sport my daughter played when she was five. Too funny! Still, it is considered a real sport by some, and so it's my job to suck the fun out of it by creating a Win Probability (WP) model, making you realize there's no chance your favorite team can come back to win.

NASCAR Game Theory

When we talk about game theory in sports we almost always talk about zero-sum games. Whatever one team gains, the other loses, whether it’s yards, wins, or even win probability. It’s all very tidy.

One sport that features some non-zero-sum games is auto racing. I’ll admit that until recently I was a sports snob. Spending four hours watching cars make left turns and have their tires changed was never my idea of a good use of time. But now I have to say I am intrigued by NASCAR.

Drivers accumulate points throughout the season based on their finishing positions, and the top 10 drivers qualify for a playoff-type system in the final few races. Every race is worth the same number of points. The winner gets 185 points, the second place finisher gets 170 points, and the third place finisher gets 165 points. You can see the full table here. We can agree that points are not the only consideration in the value of winning a race, but for the sake of discussion I’ll limit the value just to the points.

Before I get into the meat of NASCAR’s non-zero sum game, consider the classic Ultimatum game in which two players divide a sum between themselves. The first player decides how to divide the sum and makes a single offer to the second player, who can accept or reject the offer. If the second player accepts the offer, they split the sum accordingly, but if he rejects the offer, neither player receives a payout.
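The payoff rule of the Ultimatum game is simple enough to write down directly. The sketch below is only an illustration: the function name and the idea of a fixed "minimum acceptable offer" for the second player are assumptions made to keep the example small.

```python
def ultimatum_payoffs(total, offer, responder_minimum):
    """Return (proposer_payoff, responder_payoff) for one round.

    total: the sum to be divided
    offer: the amount the first player offers the second player
    responder_minimum: the smallest offer the second player will accept
                       (an assumed decision rule, for illustration)
    """
    if not 0 <= offer <= total:
        raise ValueError("offer must be between 0 and total")
    if offer >= responder_minimum:
        # Offer accepted: the sum is split as proposed.
        return total - offer, offer
    # Offer rejected: neither player receives anything.
    return 0, 0


accepted = ultimatum_payoffs(total=100, offer=40, responder_minimum=30)
rejected = ultimatum_payoffs(total=100, offer=10, responder_minimum=30)
print(accepted, sum(accepted))  # payoffs add up to the full sum
print(rejected, sum(rejected))  # payoffs add up to zero
```

The combined payoff depends on what the players do, which is what makes the game non-zero-sum: a rejection does not transfer anything to the other player, it destroys the whole sum.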

MIT/Sloan Panel

Here is the video of the panel at the MIT sports analytics conference I referred to in my post about how Bill Polian doesn't get it. The discussion is educational throughout, and despite my criticism of Bill Polian, he has some very wise things to share. The participants are Mark Cuban, Jonathan Kraft, Daryl Morey, Polian, and Bill Simmons, with Michael Lewis as the moderator. It's over an hour long and worth your time. I've embedded it below, but first let me share what I took to be the most interesting points:

Bill Walsh on Randomizing

In his early Stanford days, Bill Walsh had already cracked the code on how un-random football coaches (and almost all people) are. From "Controlling the Ball with the Passing Game":

"We know that if they don't blitz one down, they're going to blitz the next down. Automatically. When you get down in there, every other play. They'll seldom blitz twice in a row, but they'll blitz every other down. If we go a series where there haven't been blitzes on the first two downs, here comes the safety blitz on third down."

Most NFL offenses tend to alternate rather than randomize. Walsh knew defenses were just as predictable decades ago.
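One rough way to tell alternating from randomizing is to count how often consecutive calls differ. The sketch below assumes a sequence of calls coded as "B" (blitz) and "N" (no blitz); the coding, the function names, and the example sequences are hypothetical. If calls were independent with blitz probability p, consecutive calls would differ with probability 2p(1-p), which is at most one half.

```python
def switch_rate(calls):
    """Fraction of consecutive pairs of calls that differ."""
    if len(calls) < 2:
        raise ValueError("need at least two calls")
    switches = sum(1 for prev, cur in zip(calls, calls[1:]) if prev != cur)
    return switches / (len(calls) - 1)


def expected_switch_rate(p):
    """Expected switch rate if each call is an independent blitz with probability p."""
    return 2 * p * (1 - p)


# Hypothetical sequences, made up for illustration.
every_other_down = "NBNBNBNB"
print(switch_rate(every_other_down))  # strict alternation switches on every down
print(expected_switch_rate(0.5))      # independent 50/50 calls switch half the time
```

A defense that blitzes "every other down" would show a switch rate near one, far above anything independent random calls could produce, and that gap is exactly the pattern Walsh describes exploiting.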