Comments on Advanced Football Analytics (formerly Advanced NFL Stats): Playoff Probabilities: Week 9

As I see it, there are three things I want to do w...

2011-11-08T21:39:56.273-05:00

As I see it, there are three things I want to do with this program:

1. Estimate the seeding slate for each conference (i.e. division winners & wild cards; this one will sample instead of iterating over all configurations).

2. Finish up computation of division winners (i.e. generalize code to iterate over all divisions, instead of just concentrating on the AFC South).

3. Compute seeding slate for entire conference (not sampled). This should be practical once there are fewer than about 40 games left in the season (per conference), so around week 12 or 13, which gives me a few weeks to work on this.

Chris, assuming I get #1 or #3 done, would you be interested in collaborating? (Actually, just having someone to discuss issues with and check my understanding would be fantastic!)

Ian: It's encoded in my comment about joint pr...

2011-11-07T18:51:40.312-05:00

Ian: It's encoded in my comment about joint probabilities.

E[g(X)] = sum[all x elem of X] g(x)f(x)

where
X = random variable for NFL seasons
g(x) = goals we're interested in, e.g. Team A wins division
f(x) = probability mass function, probability that a particular season x occurs

Now, I'm attacking this directly by taking a series of N samples of X (x_1, x_2 ... x_N) and computing f(x_i)g(x_i), i.e.

E[g(X)] ~ (1/f_s) sum[x elem X_s] g(x)f(x)

where X_s is the set of N samples of X
and f_s = sum[x elem X_s] f(x)

This is basically Monte Carlo integration.
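
A minimal sketch of that estimator, assuming each remaining game is decided by one fair random bit and each sampled season is then weighted by its joint probability f(x). The names `weighted_estimate`, `win_probs` and `goal` are made up for illustration and are not taken from anyone's actual program.

```python
import random

def weighted_estimate(win_probs, goal, n_samples, seed=0):
    """Estimate E[g(X)] from uniformly sampled seasons weighted by f(x).

    win_probs[i] is the assumed probability that the home team wins game i.
    goal(outcome) returns 1 (or True) if the event of interest occurred.
    """
    rng = random.Random(seed)
    n_games = len(win_probs)
    weighted_hits = 0.0  # running sum of g(x) * f(x)
    total_weight = 0.0   # running sum of f(x), i.e. f_s
    for _ in range(n_samples):
        bits = rng.getrandbits(n_games)  # one fair bit per game
        f_x = 1.0
        outcome = []
        for i, p in enumerate(win_probs):
            home_won = bool((bits >> i) & 1)
            outcome.append(home_won)
            # Joint probability of this season is the product over games.
            f_x *= p if home_won else 1.0 - p
        weighted_hits += goal(outcome) * f_x
        total_weight += f_x
    return weighted_hits / total_weight
```

Dividing by f_s rather than by N is what lets uniformly drawn seasons stand in for seasons drawn according to f.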

This was a good exercise, since now I know exactly what I'm computing!

And now this leads me to understand that what Chris is doing is Monte Carlo simulation with importance sampling. So he should require fewer samples than I do to converge on the correct answers.

Sam - sorry brains being slow again. How do I use ...

2011-11-07T04:36:07.048-05:00

Sam - sorry brains being slow again. How do I use a random 0 or 1, each with probability 0.5, to simulate a result where a team wins 40 percent of the time?

Importance sampling done correctly actually speeds...

2011-11-06T23:05:03.068-05:00

Importance sampling done correctly actually speeds up convergence. I'm not quite sure I'm doing it correctly though :-) (it's been over 20 years since college...)

Sam -- OK, now I get it. That's interesting. I...

2011-11-06T10:21:46.515-05:00

Sam -- OK, now I get it. That's interesting. I can see where it would be much faster to generate x simulations. So for me, the interesting case is when you are running some kind of MC scheme to estimate the probabilities when the sample space is too large to sample exhaustively. So the difference between the two schemes is that your method samples the entire probability space uniformly while my method samples the most likely areas of the probability space most intensely. I wonder if your method would require extra simulations in order to get "enough" samples in the most relevant regions? It still might be faster, even if it does.

Chris: I decide who wins each game, then multiply ...

2011-11-06T05:06:16.666-05:00

Chris: I decide who wins each game, then multiply all the game winners' win probabilities together to find the joint probability of that particular configuration of winners, then accumulate the joint probabilities for each winner. At the end, the total probability space is divided into the accumulated probabilities for each team (I think... right now this is moot since I iterate over all configurations, so the total probability = 1 and this method is obviously correct). I think this is a form of importance sampling.
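
The exhaustive version of this can be sketched as follows, assuming a small number of remaining games and a caller-supplied `winner_of` function that maps a configuration of game results to a division winner. Both names are hypothetical, and the tiebreaker logic is left entirely to `winner_of`.

```python
from itertools import product

def enumerate_seasons(win_probs, winner_of):
    """Accumulate the joint probability of every configuration per winner.

    win_probs[i] is the assumed probability that the home team wins game i.
    winner_of(outcome) returns a label for whoever wins under that outcome.
    """
    totals = {}
    # Visit all 2**n configurations of home-win / home-loss.
    for outcome in product((True, False), repeat=len(win_probs)):
        joint = 1.0
        for home_won, p in zip(outcome, win_probs):
            joint *= p if home_won else 1.0 - p
        team = winner_of(outcome)
        totals[team] = totals.get(team, 0.0) + joint
    return totals
```

Because every configuration is visited exactly once, the accumulated values sum to 1 without any normalization step.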

Gah, the strength of victory tiebreaker looks to be annoying to implement. Right now my only handling of three way ties is to detect that there is one.

Things get a whole lot more complicated when you s...

2011-11-06T01:15:34.504-04:00

Things get a whole lot more complicated when you start looking at full playoff seedings 1-6 in each conference. Strength of victory tiebreakers routinely become important.

Not to mention three way ties, which sometimes require multiple iterations through the tiebreakers. The logic is very tricky.

I don't do any tiebreakers past strength of schedule. I just flip a coin at that point. If the playoff race is interesting to me, I will go through all of the tiebreakers manually and figure out who has the advantage for any tiebreakers that go beyond strength of schedule, but those are truly rare instances.

I consider the probability of individual game outcomes as predicted by Brian's team efficiency ratings. In that case, each team has a continuous decimal probability of winning each game that can range from 0 to 1. Random bits won't resolve that; you need random numbers.
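
As a rough illustration of that approach (not the actual nfl-forecast code), each game can be decided by comparing a uniform random number with the home team's win probability, and the estimate is then the plain fraction of simulated seasons in which the event happens. The names `simulate_fraction`, `win_probs` and `goal` are assumptions for this sketch.

```python
import random

def simulate_fraction(win_probs, goal, n_seasons, seed=0):
    """Fraction of simulated seasons in which goal(outcome) is true.

    win_probs[i] is the assumed probability that the home team wins game i.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_seasons):
        # A uniform draw in [0, 1) falls below p with probability p.
        outcome = [rng.random() < p for p in win_probs]
        hits += bool(goal(outcome))
    return hits / n_seasons
```

Likely seasons are drawn more often here, so no per-season weighting is needed.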

Ian: in this monte carlo simulation, you only need...

2011-11-06T00:43:21.090-04:00

Ian: in this monte carlo simulation, you only need a binary decision for each game, so (pseudo)random bits (iid uniform with probability 0.5) are all that are necessary. Modern stream ciphers are a pretty good way to generate random bits fast.

[I originally had a longer response here, but...] ...

2011-11-06T00:38:08.986-04:00

[I originally had a longer response here, but...]

I can see you've spent a significant time on your UI--mine currently is a text editor where I adjust code and data tables and recompile.

I currently handle the first two within-division tiebreakers (head-to-head and best in-division record). I detect when those aren't enough, and so far the remaining ties account for under 1% of probability. Adding even this amount of tie-breaking did significantly increase run time; I didn't keep records, but I'd place it around 50%. Since other tiebreakers occur so rarely, I can afford them to be relatively expensive, though the next two in order seem straightforward and fast. The fifth (strength of victory), I have no idea what it means (* googling just came up with a definition...), and the sixth (strength of schedule) I think I know, but some better explanation would be appreciated. What are you doing about the remaining tiebreakers: do you just freeze current statistics, or do some sort of calculation or simulation for remaining games?
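
A simplified sketch of that two-step check for a two-team tie, assuming win counts are enough to compare records (no tied games, equal numbers of games played). The function name and both dictionaries are hypothetical.

```python
def break_two_way_tie(team_a, team_b, head_to_head_wins, division_wins):
    """Return the team that wins the tiebreak, or None if still tied.

    head_to_head_wins[(x, y)] is the number of times x beat y.
    division_wins[x] is the number of wins x has inside its division.
    """
    # Step 1: head-to-head results between the two teams.
    a_h2h = head_to_head_wins.get((team_a, team_b), 0)
    b_h2h = head_to_head_wins.get((team_b, team_a), 0)
    if a_h2h != b_h2h:
        return team_a if a_h2h > b_h2h else team_b
    # Step 2: record within the division.
    a_div = division_wins.get(team_a, 0)
    b_div = division_wins.get(team_b, 0)
    if a_div != b_div:
        return team_a if a_div > b_div else team_b
    # Unresolved: the caller can track this probability separately.
    return None
```

Returning `None` rather than picking a side keeps the unresolved probability in its own bucket, as described above.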

It's totally believable that you've spent ...

2011-11-05T20:44:44.052-04:00

It's totally believable that you've spent more time on the GUI than on the simulation--right now, my user interface is making changes to code or data tables and recompiling...

What are you doing for the tie breakers? Right now, I don't handle all of the tie breakers, but handling tie breakers quickly is one of the things I've had to spend a lot of time on. Right now, I'm just handling head-to-head and win-loss-tie within division. I keep track of ties not handled by the above separately so I'm not incorrectly adding probabilities. Adding support for the tiebreakers I do handle increased run time by about 50% I think.

Adding win-loss-tie within conference is pretty straightforward. I'm pretty sure I've got a fast common-games computation figured out. I have no idea what "strength of victory" is. I think "strength of schedule" is the ranking from last season used for "strength of schedule" scheduling in current season, which would be simple to implement, if my supposition is correct.

How are you handling the other tie breakers which involve points/touchdowns scored? Do you compute expected points and touchdowns for the rest of the season?

Ian: most (but not all) stream ciphers essentially generate a stream of (key-dependent) pseudo-random bits, which are xor-ed against the plain text to produce the cipher text. For a stream cipher to be good, the bits it generates need to be indistinguishable from a stream of random bits. In the past, most stream ciphers were designed to be fast and cheap in hardware, but in the last decade, a number of stream ciphers were designed to run fast both in hardware and in software. You can read about (and get reference implementations of) some good modern stream ciphers here -- this was a project of the EU to identify/develop some good stream ciphers.

I just did a crude performance test. I compared th...

2011-11-05T18:26:03.062-04:00

I just did a crude performance test. I compared the time required for simulations of the remaining 9 weeks of the season to the time required to simulate 1 week. The results are that the time to simulate 1 week is 81% of the time required to simulate 9 weeks.

So that strongly suggests, at least for my implementation, that the time required to run the tie breakers is far greater than the time required for game simulations.
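
One way to put a number on that, assuming run time is a fixed cost plus a cost proportional to the number of weeks simulated, T(w) = fixed + per_week * w. That linear model is an assumption of this sketch, not something the test itself establishes.

```python
def game_share_of_runtime(ratio, short_weeks=1, long_weeks=9):
    """Share of the long run spent on per-week work under a linear model.

    ratio is T(short_weeks) / T(long_weeks), e.g. 0.81 for 1 week vs 9 weeks.
    """
    # Normalize T(long_weeks) to 1; the extra weeks account for 1 - ratio.
    per_week = (1.0 - ratio) / (long_weeks - short_weeks)
    return per_week * long_weeks

share = game_share_of_runtime(0.81)
print(f"per-week work: {share:.1%}, fixed cost: {1 - share:.1%}")
```

Under that assumption, a bit over a fifth of the 9-week run scales with the number of weeks, and the rest is fixed overhead such as the tie breakers.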

While undoubtedly increases in performance can be achieved, a sole focus on speeding up the game simulations is unlikely to achieve the expected gain.

For the record, the effort I've put into game ...

2011-11-05T13:56:58.433-04:00

For the record, the effort I've put into game simulation is negligible compared to the effort for the GUI and the tie-breaker procedures.

Sam - how does the random bit generation work? Tha...

2011-11-05T11:58:16.819-04:00

Sam - how does the random bit generation work? That's not a tactic I'm familiar with but sounds a useful thing to know if doing complex monte carlo simulations.

BTW, modern fast stream ciphers (which make excell...

2011-11-04T16:16:19.577-04:00

BTW, modern fast stream ciphers (which make excellent random bit generators) run at speeds around 1 bit/cpu cycle in software, so a million random bits can be generated in under a thousandth of a second on modern gigahertz CPUs.

Ian, don't forget you don't need to comput...

2011-11-04T16:05:11.951-04:00

Ian, don't forget you don't need to compute entire seasons, just remainder of seasons. Also, you only need random bits, which is a lot more compact than numbers.

Sorry, forgot to say - as 256 games x 5,000 season...

2011-11-04T14:55:02.329-04:00

Sorry, forgot to say - as 256 games x 5,000 seasons is ~1.3 million numbers, you just generate one number between 1 and 700,000 and then use that as a seed to pick whether you take pre-gen'ed numbers 11 to 1,280,010 or whatever.

On developing nfl-forecast as a quicker web app, a...

2011-11-04T14:52:50.702-04:00

On developing nfl-forecast as a quicker web app: as there are 256 games in an NFL season, you could have a pre-generated list of, say, 2 million random numbers to use instead of generating a new match result each time (random numbers are fairly processor intensive to generate).

All you'd need to do then is generate one number, and then use that to decide where to start reading in the random number table.

I haven't tested, but it seems that would make it much quicker to simulate seasons.
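
A small sketch of the idea, untested for speed: build the table once, then for each run pick a single random starting offset and read consecutive numbers from there. The function names and the `games_per_season` parameter are made up for this example.

```python
import random

def build_table(size, seed=0):
    """Pre-generate `size` uniform random numbers in [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]

def draws_for_run(table, n_seasons, games_per_season=256, rng=random):
    """Return one contiguous slice of the table, enough for a whole run."""
    needed = n_seasons * games_per_season
    if needed > len(table):
        raise ValueError("table too small for this run")
    # The only random number generated per run is the starting offset.
    start = rng.randrange(len(table) - needed + 1)
    return table[start:start + needed]

table = build_table(2_000_000)
draws = draws_for_run(table, 5_000)  # 1,280,000 numbers for 5,000 seasons
```

With a 2 million entry table and 1.28 million numbers per run, there are about 720,000 possible starting points, so different runs reuse overlapping stretches of the same numbers.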

Chris, I was wrong. It does look like 82%. I think...

2011-11-04T10:30:32.311-04:00

Chris, I was wrong. It does look like 82%. I think that I just miscounted a bar. Based on misreading that game I think I overestimated the differences between the two systems. That being said, ideally the percentages would match exactly.

Btw, big fan of nfl-forecast. I always have it open when I read articles predicting future scenarios. Thanks for being one of those productive people. Would be nice to see a web-based version, maybe attached to Brian's stellar site.

5000 simulations in 10 to 15 seconds sounds rather...

2011-11-04T03:31:40.894-04:00

5000 simulations in 10 to 15 seconds sounds rather slow to me. I haven't benchmarked it, but my program's got to be running at least tens of millions of seasons/sec, and probably closer to 50 million on a rather slow CPU (for a single division).

5000 simulations can be run in 10 to 15 seconds an...

2011-11-04T01:24:28.232-04:00

5000 simulations can be run in 10 to 15 seconds and typically get you within half a percent of the results achieved by 50,000 simulations (which is what my published forecasts are based on). If you are analyzing a large number of scenarios, you will appreciate getting the results back in 15 seconds compared to more than 2 minutes.

Ian B -- where do you see 77% for NO over TB in the NFL Forecast software? I read it at 82% which is very close to 83% given by Brian. I calibrated the home field advantage a few years ago based on Brian's published predictions. On average, this week my HFA is about 1 to 2% below Brian's. I don't think this is a significant difference, but I could modify my calibration to more closely match his if this trend is consistent over a number of weeks.

The state space is so large, I do wonder how accur...

2011-11-03T23:16:57.626-04:00

The state space is so large, I do wonder how accurate 5000 samples can be. That said, the last time I compared the results with my division champion program, which enumerates the entire state space*, they came out only a few percentage points different.

*(Possible after week 6 since each division had around 32 games remaining.)

Ian: You're right--the software uses a differe...

2011-11-03T17:46:48.842-04:00

Ian: You're right--the software uses a different algorithm than Brian does to convert team efficiency ratings to game probabilities, so the probabilities differ slightly.

In general, the algorithm gives somewhat less of a home-field advantage than Brian's model--usually on the order of 1-3 percentage points worth of probability. This may affect things on the margins, but it leaves the overall picture intact. (Since teams play both home and away games, the deviations--which are small to begin with--tend to cancel each other out as opposed to compounding.)

Hi Josh, I just wanted to point out another exampl...

2011-11-03T16:59:12.814-04:00

Hi Josh, I just wanted to point out another example of what I mentioned earlier. In your article you write that Philadelphia has a 72% chance of winning this weekend. In the nfl-forecast software, it looks like Philly has a 68-69% chance. Other games have much larger differences. My point is that if individual games are off by 3-5%, the summaries that you end up with could be way off.

The numbers used for the main tables are done usin...

2011-11-03T15:29:07.915-04:00

The numbers used for the main tables are done using a modified version of the software that does 50,000 simulation runs, giving standard errors ranging from about .10 percentage points for values close to zero or one and .22 percentage points for values close to 50%.

As for the software, 5,000 runs gives percentages with standard errors ranging from .3 percentage points to .7 (assuming I have the math right).
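
For anyone who wants to check the math, here is a quick sketch using the usual binomial standard error, sqrt(p(1-p)/N). Taking p = 0.05 as a stand-in for "close to zero" is my own assumption.

```python
import math

def standard_error_pct(p, n_runs):
    """Standard error, in percentage points, of an estimated probability p."""
    return 100.0 * math.sqrt(p * (1.0 - p) / n_runs)

for n_runs in (50_000, 5_000):
    low = standard_error_pct(0.05, n_runs)
    mid = standard_error_pct(0.50, n_runs)
    print(f"{n_runs} runs: {low:.2f} to {mid:.2f} percentage points")
```

The error shrinks with the square root of the number of runs, which is why ten times as many runs buys only about a threefold improvement.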

That might seem like a lot, but you have to consider what we're using it for.

Does it matter whether we estimate Baltimore's chances of making the playoffs as 75% or 76%? To me, not so much. I'm much more interested in the broad strokes of the playoff picture and conditional probabilities -- How does a particular scenario affect a team's chances? What are the high leverage games that can really swing a team's probability one way or the other?

To answer these questions, five thousand simulations seems to strike a good balance between precision and speed.

And ultimately, as was pointed out, these are estimates done using projections that will never be 100% accurate. Which is one of the reasons why the numbers are presented rounded off, so as to avoid the appearance of accuracy/precision that just isn't there.

Since the numbers are only given to percent-level ...

2011-11-03T14:19:05.664-04:00

Since the numbers are only given to percent-level accuracy, I suppose you'd want the Monte Carlo to be accurate to rounding error on that number. So for an error of half a percent, root-N gives 40 thousand.

On the other hand, we haven't estimated the systematic errors in the model; if those are bigger than the statistical error, you'd just be wasting your time adding more simulations. That is, you're calculating the wrong number very precisely. Given what we know about luck and injury in the NFL, I'm skeptical that the numbers can get better than 5%ish, so 5000 simulations would be adequate under that assumption.
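
The root-N estimate above can be written out as a short sketch. The first form is the crude 1/sqrt(N) rule; the optional `p` argument switches to the binomial form p(1-p)/N, which is my addition for comparison and not part of the original estimate.

```python
def runs_needed(target_error, p=None):
    """Approximate number of simulations for a given error (as a fraction).

    With p=None, use the crude rule error ~ 1/sqrt(N).
    With p given, use the binomial rule error ~ sqrt(p * (1 - p) / N).
    """
    if p is None:
        return 1.0 / target_error ** 2
    return p * (1.0 - p) / target_error ** 2

print(round(runs_needed(0.005)))       # crude rule, half a percent
print(round(runs_needed(0.005, 0.5)))  # binomial worst case at p = 0.5
print(round(runs_needed(0.05)))        # crude rule, 5% target
```

The crude rule is the more conservative of the two. A 5% target needs only a few hundred runs by either rule, so 5000 simulations clear that bar comfortably.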