## "Expert" Predictions

Gregg Easterbrook of ESPN.com writes a yearly column poking fun at all the terrible predictions from the previous NFL season. Here is his latest. It's long but highly entertaining. Unfortunately, it also makes a pretty good case that people like me with complicated mathematical models for predicting games are wasting our time. And the "experts" out there are doing even worse.

### Predictions are Usually Terrible

His best line is "Just before the season starts, every sports page and sports-news outlet offers season predictions -- and hopes you don't copy them down." Unfortunately for them, he does.

Easterbrook's examples of horrible predictions underscore the fact that pre-season NFL predictions are completely worthless. Before the 2007 season I made the same point by showing that guessing an 8-8 record for every team is at least as accurate as the "best" pre-season expert predictions or even the Vegas consensus. (Pay no attention to my own prediction attempt last June, before I realized how futile it is.)

Unlike Easterbrook, most of us don't write our predictions down. It's easy to forget how wrong we were and how overconfident we were. So many of us go on making bold predictions every year.

### Proof I'm (Almost) Wasting My Time

The most interesting part of the column might be the "Isaacson-Tarbell Algorithm." It's a system suggested by two of Easterbrook's readers last summer for predicting individual games. Just pick the team with the better record, and if both teams have the same record, pick the home team. According to Easterbrook, the Isaacson-Tarbell system would have been correct 67% of the time, about the same as the consensus Vegas favorites. It's devilishly simple: it requires no fancy computer models or expert knowledge, and it would have beaten almost every human "expert" with a newspaper column, TV show, or website.
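
For the curious, the rule is short enough to sketch in a few lines of Python. This is only an illustration: the `Team` class and `pick_winner` function are names I made up, tied games are ignored, and treating a team with no games played as .500 is my own assumption for Week 1.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Team:
    name: str
    wins: int = 0
    losses: int = 0

    def win_pct(self) -> Fraction:
        games = self.wins + self.losses
        # Assumption: a team that has not played yet counts as .500.
        return Fraction(1, 2) if games == 0 else Fraction(self.wins, games)


def pick_winner(home: Team, away: Team) -> str:
    """Pick the team with the better record; on equal records, pick the home team."""
    if away.win_pct() > home.win_pct():
        return away.name
    return home.name


print(pick_winner(Team("Home", 5, 3), Team("Away", 6, 2)))  # Away
print(pick_winner(Team("Home", 4, 4), Team("Away", 4, 4)))  # Home
```

Exact fractions are used instead of floats so that "same record" is a true equality check rather than a rounding accident.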

(Actually, I'm going to give credit for inventing the algorithm to my then-6-year-old son, who is an avid football fan (wonder why?). He devised that very same system during the 2006 season for a weekly pick 'em contest against my regression model and his grandfather. I'm sure many young fans have followed the same principle over the years.)

The model I built was accurate about 71% of the time last year. Is the extra 4% accuracy (10 games) worth all the trouble? Probably not (for a sane person) but I'll keep doing it anyway. Actually, I think 4% is better than it sounds. Why? Well, a monkey could be 50% correct, and a monkey who understood home field advantage could be 57% correct. The question is how far above 57% a prediction system can get.

And there are upsets. No system, human or computer-based, could predict 100% accurately. They can only identify the correct favorite. Sometimes the better team loses. From my own and others' research, it looks like the best model could only be right about 75-80% of the time. So the real challenge is now "how far above 57% and how close to 80% can a system get?" There are only 23 percentage points of range between zero predictive ability and perfect predictive ability. Within that range, 4% is quite significant.
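
To put numbers on that, here is a rough sketch of how much of the usable range each system covers. The 57% floor and the 67% and 71% accuracies come from the text above; taking 80% (the top of the 75-80% estimate) as the ceiling is an assumption, and `share_of_range` is just an illustrative helper name.

```python
def share_of_range(accuracy: float, floor: float = 0.57, ceiling: float = 0.80) -> float:
    """Fraction of the gap between 'no skill' and 'best possible' that a system covers."""
    return (accuracy - floor) / (ceiling - floor)


for label, acc in [("Isaacson-Tarbell / Vegas", 0.67), ("my model", 0.71)]:
    print(f"{label}: {share_of_range(acc):.0%} of the achievable range")

# The 4-point gap between the two, as a share of the 23-point range
print(f"gap: {share_of_range(0.71) - share_of_range(0.67):.0%}")
```

With a lower ceiling, say 75%, the same 4 points would look even bigger.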

Phil Birnbaum of the Sabermetric Research blog makes the point that experts should not be evaluated on straight-up predictions but on predictions against the spread. I'm not sure that's a good idea, and I think I have a better suggestion.

Phil's point is that there are very few games in which a true expert would have enough insight to correctly pick against the consensus. Therefore, there aren't enough games to distinguish the real experts from the pretenders. His solution is to always pick against the spread.

I don't agree. The actual final point difference of a game has as much to do with the random circumstances of "trash time" as with any true difference in team ability. A better alternative may be to have experts weight their confidence in each game as a way to compare their true knowledge.

Consider a hypothetical example Phil Birnbaum cited about an .800 team facing a .300 team. The true .800 team vs. true .300 team match-up is actually fairly rare. As Phil has eloquently pointed out previously, the .800 team may just be a .600 team that's been a little lucky, and the .300 team could really be a .500 team that's been a little unlucky. There are many more "true" .500 and .600 teams than .300 and .800 teams, so this kind of match-up is far more common than you'd expect. And if the ".500" team has home field advantage, we're really talking about a near 50/50 match-up. Although the apparent ".800" team may still be the true favorite, a good expert can recognize games like this and set his confidence levels appropriately.

### Computer Models vs. "Experts"

Game predictions are especially difficult early in the season, before we really know which teams are good. Over the past 2 years of running a prediction model, I've noticed that math-based prediction models (that account for opponent strength) do better than expert predictions in about weeks 3-8. The math models are free of the pre-season bias about how good teams "should" be. Teams like the Ravens and Bears, which won 13 games in 2006, were favored in games by experts far more than their early performance in 2007 warranted. Unbiased computer models could see just how bad they really would turn out to be.

But later in the season, the human experts come around to realizing which teams are actually any good. The computer models and humans do about equally well at this point. Then when teams lose star players due to injury, the human experts can usually outdo the math models which have difficulty quantifying sudden discontinuities in performance.

And in the last couple weeks, when the best teams have sewn up playoff spots and rest their starters, or when the "prospect" 2nd string QB gets his chance to show what he can do for his 4-10 team, the human experts have a clear advantage. By the end of the season, the math models appear to do only slightly better than experts, but that's only really due to the particularities of NFL playoff seedings.

### In Defense of Human Experts

Humans making predictions are often in contests with several others (like the ESPN experts). By picking the favorite in every game, you are guaranteed to come in first...over a several-year contest. But in a single-season contest, you'd be guaranteed to come in 2nd or 3rd to the guy that got a little lucky.

The best strategy is to selectively pick some upsets and hope to be that lucky guy. Plus, toward the end of the year, players that are several games behind are forced to aggressively pick more and more upsets hoping to catch up. Both of those factors have the effect of reducing the overall accuracy of the human experts. The comparison between math models and experts can often be unfair.

### In Defense of Mathematical Predictions

Lastly, in defense of the computer models, the vast majority of them aren't done well and give the rest a bad name. There is an enormous amount of data available on NFL teams, and people tend to take the kitchen-sink approach to prediction models. I started out doing that myself. But if you can identify what part of team performance is repeatable skill and what is due to randomness particular to non-repeating circumstances, you can build a very accurate model. I'm learning as I go along, and my model is already beating just about everything else. So I'm confident it can be even better next season.

### 23 Responses to “"Expert" Predictions”

1. Phil Birnbaum says:

"Beating the consensus" on straight-up picks amounts to the same thing as "Picking underdogs and winning more than half of them."

Right? Because if my strategy as a tout is to always pick favorites, the only way to beat me is to pick strategic underdogs and win more than I do.

Do you think you can do that? I'd be willing to bet you can't, at even odds.

Truth be told, I can't lose the bet ... I just watch your picks, and hedge by betting the same as you (at greater than 50/50 market odds), so I win either way.

But I'd still be willing to bet you you can't.

2. Phil Birnbaum says:

And further to your point about the spread: one way to meet your objection is just to have the touts pick at market odds, and see if they made money at the end of the year.

Do you think touts can do that? I'm not willing to bet you here, because I think that with your expertise, you have a positive expectation on these bets (but I'd bet it's a small one).

3. Patrick Nance says:

I really do enjoy this blog but your second-to-last paragraph is inexcusably wrong, at worst, and very misleading, at best.

Until a clear end-game picture has presented itself, the best strategy is to pick every game according to what you perceive to be the more likely outcome.

One should never pick the less likely outcome simply to "out-luck the inevitably lucky." Nobody on Earth knows whether 0 or 256 (or any number in between) of the NFL games in a season will go according to expectation. The best we can do is to pick according to expectation until we have a clearer picture of what is needed to win in the end-game. In effect, we do not necessarily pick games according to the likely outcome but rather to the better expected value relative to our contest.

To illustrate when you would accept an unlikely outcome for the sake of expected value, let's say I presented to you a coin-flipping contest. The coin we use comes up tails 99% of the time and heads 1% of the time. You and I both may pick either heads or tails (so, we may make the same pick). I offer to give you \$10,000 if and only if you make the correct pick and I make the wrong one. Now, you know that I will pick tails and there is nothing you can do to change my mind. You would effectively be forced to pick heads, even though the chance that heads is flipped is wildly unlikely. Why? Because picking heads is the only way that your expected value is non-zero (remember: a tie is essentially a loss).
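
The coin-flip payoff above can be checked with a tiny expected-value sketch. The 99/1 coin and the \$10,000 prize are from the example; the function name and the string labels are illustrative only.

```python
def expected_value(my_pick: str, rival_pick: str = "tails",
                   p_tails: float = 0.99, prize: float = 10_000) -> float:
    """Expected payout when the prize is paid only if I am right AND my rival is wrong."""
    if my_pick == rival_pick:
        # Same pick: we are right or wrong together, so the prize is never paid.
        return 0.0
    p_mine = p_tails if my_pick == "tails" else 1 - p_tails
    # Different picks: if mine comes up, the rival's is necessarily wrong.
    return p_mine * prize


print(expected_value("tails"))  # matching the rival pays nothing
print(expected_value("heads"))  # the unlikely pick is the only one with positive value
```

Picking heads is worth about \$100 in expectation, while picking tails alongside the rival is worth nothing.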

Early on in an NFL picks contest, an individual game's maximized expected value in terms of the contest will very closely mirror its expected outcome. That is to say that your best decision for a game early on is to go with what you feel is the most likely outcome. Later, however, this can become blurred. The best decisions to make in terms of expected value may not be the decisions which have the highest likelihoods of succeeding (as I illustrated in my coin-flipping example).

If you're protecting a lead, for instance, your primary objective should be to duplicate the picks of your competitors so as to prevent them from gaining on your lead. On the other hand, if you're behind, you might have incentive (especially in a single-pay-out structure) to pick differently than the leader so you can have a chance to gain on him. Although it's certainly arguable when these decisions start becoming clear enough to make, having been a veteran of a spread-picks contest for six years running, I feel like week 15 is when people tend to start altering their normal strategy for the sake of trying to finish in the money.

There should not be a hard-and-fast strategy of intentionally "picking some upsets," no matter what. You have no clue whether anybody else in your contest will get lucky or not. We might as well say that the best strategy is to "be omniscient".

4. Phil Birnbaum says:

In fairness, I should acknowledge that beating an algorithm is easier than beating the market. But picking at the market favorite is a legitimate strategy that's available to any tout.

5. Brian Burke says:

Perhaps you're right over the short term. But over the long term, someone who beats the consensus is correctly identifying games where the consensus is wrong. Good examples of incorrect consensus favorites are teams with very high pre-season expectations or teams on a "streak."

I guess it depends on the definition of long and short term. Using Tango's system we could calculate how many games we'd need to be confident in separating good experts from the chaff. I'd guess it might be more than one 256 game season. I haven't read "Fooled by Randomness" yet, but somewhere a reviewer mentioned that the author calculated it would take many many years to be able to tell good fund managers from the lucky ones.

I agree that betting the game odds (the money line I think they call it) is a better way of grading experts than the spread. It's essentially the same as the "confidence" system I suggested. There are already pick 'em leagues that work that way.

By the way, you'd probably appreciate the Prediction Tracker site I link to on my main page. It underscores your theory. You can see where someone who predicted well one year does poorly the next.

6. Phil Birnbaum says:

If I may ask, how good do you think anyone can be at picking winners, as compared to the Vegas line? That is, out of 256 games per season, how many Vegas underdogs should really be overdogs?

7. Brian Burke says:

Phil, good question. I think I see where you're going. I'd say 20% of games are upsets in which the true underdog wins. Vegas spreads are correct about 67% of the time, so they miss 33%. Subtracting the 20% of true upsets, I'd estimate about 13% of games--33 games in a regular season.

Randomness would really swamp any 33 game sample. You'd need a few seasons to tell a good expert from a lucky one.

8. Brian Burke says:

Patrick-

Glad you like the site. I think we generally agree. When I say "selectively pick some upsets," I mean pick the underdog in games you feel the consensus is incorrect, and not blindly picking the favorite in every game.

Early in a contest, you should pick to maximize your long-term accuracy. Then, in the end-game (the last couple of weeks or so), if you're behind, you'll need to become aggressive and pick winners where your opponent has picked losers, just like you suggest.

However, NFL outcomes are fairly random. If you have more than a few contestants, there is a fairly high likelihood someone will be luckier than you are good.

There may only be a small chance that any one contestant will beat you by luck, but that's not the question. What's the compound probability that no one in your league beats you by luck? The bigger the league, the more you're at the mercy of luck. You have to hope to be a little lucky too.
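
A quick sketch of that compound probability, under the simplifying assumption that each rival's luck is independent of the others'. The 5% per-rival chance is a made-up figure for illustration, not an estimate.

```python
def prob_nobody_beats_you(p_single: float, rivals: int) -> float:
    """Chance that none of `rivals` independent contestants out-lucks you."""
    return (1 - p_single) ** rivals


P_SINGLE = 0.05  # hypothetical chance that any one rival beats you by luck
for rivals in (1, 5, 10, 25):
    print(rivals, round(prob_nobody_beats_you(P_SINGLE, rivals), 3))
```

Even a small per-rival chance compounds quickly as the league grows. In practice the picks are correlated, so independence is only a rough approximation.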

9. Phil Birnbaum says:

Yup, you'd need a LOT of games. Suppose the average underdog-who-should-be-an-overdog is .550. That means that of those 33 games, you'd expect to be 18-15. That's an advantage of 1.5 (whole) games per season, assuming you correctly pick out each of those 33 games.

1.5 games out of 256 is a winning percentage increase of only .006 (.012 doubled against the touts who pick the opposite). That's why I argue that straight-up picks are a difficult way to figure out who the best predictors are: you're comparing a .506 guy with a .494 guy.

10. Phil Birnbaum says:

Sorry, assuming that favorites win 67 percent of games, I should have said that you're comparing the .676 guy to the .664 guy.

11. Brian Burke says:

True. But, no one really knows which 33 games we're talking about. So the danger isn't just failing to identify an "overdog" (love that term by the way), but incorrectly guessing one too. The spread between a good expert and bad one could therefore be much bigger than your hypothetical suggests.

At the least, wouldn't that double the difference? That gives the better predictor a 6-game advantage (+3 for him, -3 for the slouch).

Plus, the assumptions are beginning to pile up--mine about the 33 games, yours about a .550 average. Not saying they're wrong, but we don't know they're right either.

So let's look at it from a fresh (and purely statistical) direction. Say Expert A is correct at predicting 8 more games in a full season than Expert B. A's accuracy would be 71% and B's would be 67%.

To statistically test if A is a truly better predictor than B, we can use a 2-proportion t-test. The probability that the experts are actually equal and A only appears better due to chance is 0.164. It's not good enough for the FDA to approve heart medication, but it's fairly solid.

To get a 0.05 significance level, Expert A would need to be correct in just over 73% of games. In a 2-season 512 game set, he would need just over a 71% accuracy rate.
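
For anyone who wants to reproduce the comparison, here is a minimal sketch. It uses a one-sided two-proportion z-test with a pooled standard error (a normal approximation), which is my assumption about the test; it gives a p-value close to the 0.164 quoted above for 71% vs. 67% over 256 games each. The function name is illustrative.

```python
import math


def two_proportion_p_value(p_a: float, p_b: float, n_a: int, n_b: int) -> float:
    """One-sided p-value for 'A is no better than B' (pooled z-test, normal approximation)."""
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Upper-tail probability of the standard normal distribution
    return 0.5 * math.erfc(z / math.sqrt(2))


print(round(two_proportion_p_value(0.71, 0.67, 256, 256), 3))  # one season each
print(round(two_proportion_p_value(0.71, 0.67, 512, 512), 3))  # two seasons each
```

Doubling the sample shrinks the standard error by a factor of sqrt(2), which is why the same 4-point edge looks more convincing over two seasons.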

However, I do agree that scoring straight up winners is not the best measure of NFL expertise. I just disagree that the spread is the best way.

12. Brian Burke says:

Sorry, I didn't make clear I wasn't talking about the 'tout' but was comparing a 'good' expert and 'lesser' one.

13. Phil Birnbaum says:

All agreed, *except* that I think 8 games way, way overestimates the advantage the expert has over the slouch.

Mostly, that's because I'm not sure I buy the idea that the slouch is going to buck the odds in 33 (or whatever) games when he doesn't know what he's doing. Taking 33 random games and betting the underdog is something an intelligent (but punditorially-challenged) slouch should never do.

14. Anonymous says:

Firstly, great blog.

How you evaluate a prediction system's an interesting problem.

Personally I think just using win% success is too blunt an instrument.

If you make a prediction you're usually putting a probability on an event happening. Team A has an 80% chance of winning the game and team B therefore has a 20% chance.

To follow through on this example, a team with an 80% chance of winning the game will win, on average, by around 11 points.

So if you compare this theoretical margin of victory with the game's actual margin of victory you can see how close the prediction was in this single event.

Repeat the procedure over a large number of games and you end up with an average (and a standard deviation) for the difference between your predicted and the actual margin of victory.

It's susceptible to garbage time scores (although beaten teams are just as likely to pull back leads as teams are to run up the score). But it does credit the very good prediction where your narrow fav loses by a point, whereas a straight "did the fav win" method counts it as a failure.

As a pretty tough benchmark, the Vegas line's predicted m.o.v. was out from the actual m.o.v. by about 10.5 points per game over the 2007 season.
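
A rough sketch of that scoring procedure follows. Taking the absolute difference as the "difference" is an assumption, the margins below are made-up numbers rather than real games, and `margin_errors` is an illustrative name.

```python
from statistics import mean, stdev


def margin_errors(predicted, actual):
    """Mean and standard deviation of |predicted - actual| margin of victory."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return mean(errors), stdev(errors)


# Hypothetical margins from the favorite's point of view (negative = favorite lost)
predicted = [11, 3, 7, 1]
actual = [10, -1, 21, -1]

avg_error, sd_error = margin_errors(predicted, actual)
print(f"average miss: {avg_error:.1f} points (SD {sd_error:.1f})")
```

Note how the last game, a narrow favorite losing by a point, costs only 2 points here, whereas a straight win/loss tally would count it as a complete miss.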

Once again great site....I intend to read it all.

15. Mr.Ceraldi says:

Great article and great site Brian-

I'm still trying to get my head around your articles on random/luck in sports.

I found this post from Tango via one of your past links..

It seems to be saying that the result of any "ONE" game in the NFL is 93% due to the 'luck factor'... and it's only the cumulative effect of the better team's advantage that appears over a large number of games. Is this a correct interpretation?

Sorry if I'm missing the obvious?

How much luck in a single game?

Remembering that…
var(obs) = var(true) + var(random)
... in a single contest, var(random) is .5^2 for every sport with two equal opponents facing each other.

In MLB, var(true) is .06^2. The luck factor is therefore .5^2 / (.5^2+.06^2) = .986. So, choosing any single random game, and the outcome can be mostly attributed to luck. That is, we have really no idea, just by looking at the final score, which team is actually better.

In NHL, var(true) is .083^2, meaning the luck factor is .973.

The NBA luck factor, with var(true) of .134^2, is .933.

The NFL luck factor, with var(true) of .143^2, is .924.

Another way to look at these “luck factor” numbers: If you were to take the “true talent” winning percentage of all the MLB winning teams, their true winning percentage would be .507. (You can figure this as 1 minus half the luck factor.) So, in the NFL, with the luck factor being .924, the “true” winning percentage of all teams that win in a random week is .538.

So, the random variation, “the breaks”, just don’t even out in a single game, and really overwhelms whatever data you have. (Again, for a single game, without knowing about the history of either opponent.)

Literally, anything can happen, which is of course why we love watching sports.
#4 —tangotiger, 09/01 @ 09:55 AM
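
The figures in the quoted post can be reproduced with a few lines. This is only a sketch of the arithmetic as quoted: the true-talent SDs are the ones given above, and the function names are illustrative.

```python
def luck_factor(sd_true: float, sd_random: float = 0.5) -> float:
    """var(random) / (var(true) + var(random)) for a single game."""
    return sd_random ** 2 / (sd_random ** 2 + sd_true ** 2)


def winners_true_pct(sd_true: float) -> float:
    """'True' winning percentage of single-game winners: 1 minus half the luck factor."""
    return 1 - luck_factor(sd_true) / 2


SD_TRUE = {"MLB": 0.06, "NHL": 0.083, "NBA": 0.134, "NFL": 0.143}
for league, sd in SD_TRUE.items():
    print(league, round(luck_factor(sd), 3), round(winners_true_pct(sd), 3))
```

The rounded results match the .986, .973, .933 and .924 luck factors and the .507 (MLB) and .538 (NFL) winner percentages quoted in the post.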

19. Brian Burke says:

It's an interesting method and I'm not nearly as wise as Tango, so I'm reluctant to criticize it.

I agree with your interpretation. Another way to understand his result is that if you have 2 NFL teams and only know the result of one game between them, you can only be about 7% sure that the winner was the better team.

In reverse, if we omnisciently knew ahead of time which team is better, we could only be correct predicting games 57% of the time (50% luck + 7% talent). That's clearly wrong. Even if we factor in home field advantage, in which the home team wins 57% of the time, we'd still only be at 64% accuracy in our predictions. The spread usually does at least this well, and good math models regularly can beat the spread by a few percentage points.

Of course, we don't have omniscient knowledge of which teams are superior, but we can still predict winners significantly better than Tango's methodology says we should. So there might be something missing in his method.

I think one missing ingredient might be found in his statement that "in a single contest, var(random) is .5^2 for every sport with two equal opponents." He's right, but NFL games do not typically feature equal opponents.

The standard deviation of the binary outcome of a random event that is lopsided 80/20 is not 0.5. It's 0.4. So his var(random) assumes the NFL is made up of completely equal teams, which is not the case.

By the way, if you repeat his math, what do you get? I might be missing something, but my calculations are very different. If:

var(obs) = var(true) + var(rand), then:

0.19^2 = var(true) + 0.125^2
0.0361 = var(true) + 0.015625
var(true) = 0.0361 - 0.015625
var(true) = 0.0205

Therefore, the "luck factor" is:

LF = var(rand)/(var(true) + var(rand))

LF = 0.015625/(0.0205 + 0.015625)
LF = 0.433

and not 0.934! Where did I mess up?

20. Brian Burke says:

Ok. Now I see the difference. In one place, Tango uses the random variance in a 16 game season, then switches to the random variance of one single game. The difference is:

var(rand)season = (.5*.5/16)
var(rand)1game = (.5*.5)

Would var(true) also get the same treatment? Is var(true)1game different than var(true)season? I guess that's the assumption Tango makes, that they're equal. It's hard to conceptualize this.

The bottom line is that if the luck factor really were 0.934, then the actual distribution of season win totals would be nearly identical to the 100% luck distribution, which it's not. It's significantly flatter, indicating a much stronger var(true) and a much weaker var(rand). (see my posts on luck and nfl outcomes for histograms)

For example, if Tango is right, we'd see a lot fewer NFL teams with 13 or more wins in a season, and we'd rarely see teams with 4 or fewer.

With a luck factor of .934, we'd see only 1.2 teams per year with 13 or more wins (according to a binomial distribution with p=0.566). In reality, we usually see 3 or sometimes 4.
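
Here is a sketch of that binomial tally. The p=0.566 and the 16-game season are taken from the comment above; applying the same p to every one of 32 teams is my reading of how the 1.2 figure was obtained, not something the comment spells out. It needs Python 3.8+ for `math.comb`.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def expected_teams(k_min: int = 13, games: int = 16, p: float = 0.566, teams: int = 32) -> float:
    """Expected number of teams reaching k_min wins if every team wins each game with probability p."""
    return teams * prob_at_least(k_min, games, p)


print(round(expected_teams(), 1))       # with p = 0.566
print(round(expected_teams(p=0.5), 1))  # a pure coin-flip league, for comparison
```

The first figure comes out at roughly 1.2 teams per year, in line with the number quoted above.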

21. Mr.Ceraldi says:

brian;

In baseball, 1 true SD = .060. For 1 game, 1 random SD = .5. That'll give you an observed SD of sqrt(.06^2+.50^2) = .504

So, the reliability is (.06/.504)^2 = .014, and the luck share is 100% - 1.4% = 98.6%

One baseball game tells you almost nothing of your team

For the NFL, Brian, I double-checked the math and it works. The last few years in the NFL, the SD is .19, which makes var(observed) = .19^2.

The random standard deviation is sqrt(.5*.5/16), where 16 is the number of games for each team.

So, var(random) = .125^2

Solve for var(true):
var(obs) = var(true) + var(rand)
.19^2 (0.0361) = var(true) + .125^2 (0.0156)
SD(true) = sqrt(.0361 - .0156) = sqrt(.0205) = .143

var(true), in this case, is .143^2.
The NFL luck factor, with var(true) of about .14^2, is therefore 0.5^2/(0.5^2+0.14^2) = 92.7%

Not sure where your math went awry? Maybe you can find the error with the help of my calculations.

I appreciate your feedback. I really respect Tango for his groundbreaking work on MLB and luck in baseball.

I am really torn about your comment that Tango assumes random events are between equal teams, and that the SD should thus be .4 rather than .5. For me this is the crux of the issue.

(My work is with the NHL, where the competitive balance is much closer... ridiculously close this year: SD of w% .057, favorites w% 53.5%, home team w% 52.6%.)

(BTW the NHL is almost becoming a living, breathing example of a 'pure luck' league.)

(In fact, when I did a crude comparison of the actual NHL to a luck league, the difference in SD of w% between the two was about .05. I don't have your access to simulation software, so I couldn't reproduce your great graphs, but my guess is the NHL graph would resemble a league where results are 80 to 85% luck, as compared to your 52.5% for the NFL.)

So I am quicker to accept Tango, because for the most part the NHL has most of its games between equal teams.

My guess is that the variation in skill level in the NFL is not as wide as observed. Perhaps 16 games is simply not long enough to get a clear picture.

dan

22. Brian Burke says:

Eureka. I think I resolved the discrepancy.

Tango's luck factor is the ratio of the variance of luck to the total observed variance.

But variance itself is an abstract concept, with meaning only in a purely statistical context. It is not "real," but is the square of something real. In this discussion, the unit of a standard deviation is "games," but the unit of variance is "games-squared." Tango's luck factor is a ratio of games-squared, not a meaningful number in any real-world sense.

(Just as R-squared isn't the right measure of a real effect, but its square root, r, is. (See Phil's post here http://sabermetricresearch.blogspot.com/2007/10/r-squared-abuse.html))

So it was incorrect of me to interpret Tango's luck factor as a % of wins due to luck. To think of % games won by luck, the appropriate real-world measure is a ratio of standard deviations, not variances. Accordingly, the % of games won by luck would be:

SD(luck)/(SD(luck) + SD(true))

which for the NFL, with a var(true) of .143^2, is:
0.5/(0.5 + .143) = 0.777

This would mean that true talent trumps luck 1 - 0.777 = 22.3% of the time.

The better team wins by luck exactly 50% of the time, and wins by force of talent (on average) 22.3% of the time. Therefore, the better NFL team wins 72.3% of the time. (Subject to an adjustment due to home field advantage.)

My luck analysis from last summer estimated 76%, so perhaps we're not very far off.

In the NHL, with a var(true) of .083^2, the "real-world" luck factor is:

.5/(.5+.083) = .857

The better hockey club wins 50% of the time by luck and 14.3% of the time by force of talent, for a total of 64.3%.
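
The same arithmetic as a short sketch, assuming the true-talent SDs quoted in this thread (.143 for the NFL, .083 for the NHL) and a single-game luck SD of .5. The function names are illustrative, and this is the SD-ratio reading proposed in this comment, not Tango's variance-based luck factor.

```python
def sd_luck_share(sd_true: float, sd_luck: float = 0.5) -> float:
    """Luck's share when measured as a ratio of standard deviations."""
    return sd_luck / (sd_luck + sd_true)


def better_team_win_pct(sd_true: float) -> float:
    """50% by luck plus the share of games decided by talent."""
    return 0.5 + (1 - sd_luck_share(sd_true))


for league, sd in {"NFL": 0.143, "NHL": 0.083}.items():
    print(league, f"luck share {sd_luck_share(sd):.3f}, better team wins {better_team_win_pct(sd):.1%}")
```

The results land within rounding of the 72.3% and 64.3% figures above.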

23. Anonymous says:

This might not be the right place, especially with the draft machine kicking into full gear while March Madness is going on, but I ran across this blog and decided to share:

http://myespn.go.com/s/conversations/show/story/3267103