Analytics have become a major part of political campaigns. Candidates now employ a large number of analysts who hunt for every possible point of leverage. They target potential voters in ever-narrowing slices, matching messages of particular styles and emphasis to particular demographic audiences.
One of the most visible applications of analytics in this year’s election is the FiveThirtyEight blog, created by Nate Silver. For those who may not be familiar with Nate, he is a noted baseball sabermetrician best known for creating the PECOTA prediction model. His election forecast has favored Obama more heavily than most others throughout the election season, and he has been the target of criticism recently.
In defending his approach, Nate and others have explained his probabilistic reasoning with examples from football. The 90% or so chance Nate gives Obama to win the Electoral College is, for some reason, put in football terms…Romney is down by 3 with 2 minutes to play…or Romney is down 7 with 5 minutes to play…or something along those lines. (I think that's ironic given that even football experts don't seem to have a good grasp of situational probabilities.)
I disagree with those analogies, but not because I have any better reason to think that either candidate will win. I think the situation is more like this: Romney is down by a very small number of points with 1 minute to play, and we don’t know who has the ball or where the line of scrimmage is. Or maybe it’s more like this: Romney is down by a point or two and has just snapped a long field goal attempt, and no one has a very good idea which way the wind is blowing. If it was blowing just like it was last game, the kick will almost certainly come up short. But if the wind is blowing more like it did two games ago, he’ll probably make the kick and win.
The distinction is in the level of uncertainty of the model inputs. In a football game, we know the score, time, possession, down, and distance with absolute precision. In contrast, in an election as close as this year's, one of the most critical inputs is voter enthusiasm, which can only be estimated and rests heavily on uncertain assumptions.
Black Swan author Nassim Taleb coined the term Ludic Fallacy to describe the mistake of projecting the certainty of game analysis onto real-world analysis. (Ludus is Latin for 'game.') When we analyze sports and other games, we can be relatively certain of our conclusions because they are bounded systems. For example, in football the field is always 100 yards long, there are always 4 downs, regulation time is always 60 minutes, and a touchdown always gives you 6 points plus an extra point (and sometimes 2). My win probability model is complex and flawed enough even with those rules as fixed constants. Imagine trying to make a win probability estimate not knowing how long the game might be, or how many downs each team will have, or how many points each score might be worth. Or imagine that any of those things might change at any moment.
The real world is much messier than any game. The real world is an open system of infinite complexities and interactions, and often subject to the fickle whims of human emotion. The confidence we have in understanding and predicting outcomes in games with their bounded constraints should not be extrapolated into the untidy realm of the real world. The confidence we gain from modeling relatively simple systems like a sports season can quickly become overconfidence when making, say, economic projections or predictions of sociological trends.
One of the deficiencies of any statistical/probabilistic model (including any of my own) is that it is unfalsifiable. According to philosopher Karl Popper, when something is impossible to falsify, it's not really science at all. If I say the Giants have a 65% chance of beating the Steelers but they lose (which they did), no one can say I was wrong. After all, I said the Steelers had a 35% chance. Only after a large number of probability estimates are tested against true outcomes can we assess the accuracy of a model. And by then, the nature of the social process or system may have changed to such a degree as to render the model obsolete. Sports and other games usually provide enough tests to give us the sample sizes we need for confident evaluation, but do recessions...terror attacks...presidential elections?
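To make that concrete, here is a minimal sketch (in Python) of the kind of evaluation I'm describing: scoring a batch of probability estimates against realized outcomes with the Brier score. The forecasts and outcomes below are invented purely for illustration, not anyone's actual predictions.

```python
# Score a batch of probability forecasts against realized outcomes.
# All numbers here are invented for illustration only.

forecasts = [0.65, 0.90, 0.55, 0.80, 0.70]  # predicted P(event)
outcomes  = [0,    1,    1,    1,    0]     # 1 = happened, 0 = didn't

# Brier score: mean squared error of the probabilities.
# 0.0 is perfect; a constant 50/50 guess scores 0.25.
brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
print(f"Brier score: {brier:.3f}")
```

The catch is that the score only becomes meaningful as the number of tested forecasts grows, which is exactly what presidential elections, arriving once every four years, don't give us.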
There are some unfair attacks on Nate and his model to be sure, but the folks who are skeptical of predictions from analytic models of complex systems are not 'against science' or 'anti-math' as they are often portrayed. They have a sound epistemological basis for caution. I put myself in the ranks of skeptics who question the ability of economic and sociological models to make sound predictions with high levels of certainty.
That’s not to say we can’t learn anything from modeling real world systems or we shouldn’t try. Of course analytics can make useful information out of data. But because of the interaction of uncertainty and complexity, we are far better at explaining the past than predicting the future. Explanation is easier because there is far less uncertainty in the inputs, and the relationships between variables can be known. Just ask the mortgage-backed securities analysts on Wall Street about the dangers of over-certain projections in real world systems.
This is not a criticism of how Nate's model works or what 538 does. On the contrary, I applaud and admire it. I have no reason to doubt Nate Silver plays things straight with his numbers. Nor do I have reason to doubt that, given the assumptions inherent in the model and the assumptions of the poll inputs, the probabilities of the 538 model are accurate.
My only commentary is a warning about how certain we can be about projections of unbounded, real world systems. In an election this close, a swing of just a few percentage points of enthusiasm and turnout would change the projection from 90% Obama to 90% Romney. There seems to me to be too much leverage in something so uncertain to be confident of anything beyond the 60% level. This is strictly my personal philosophical stance, and I don't pretend to be an expert on polls.
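To see how much leverage I mean, consider a deliberately toy model: suppose the decisive vote margin is normally distributed around some expected lead, and the win probability is just the chance that the margin stays positive. The lead and standard deviation below are assumed for illustration only; this is not how 538 computes anything.

```python
from statistics import NormalDist

def win_prob(expected_lead, sd):
    """P(margin > 0) when margin ~ Normal(expected_lead, sd)."""
    return 1 - NormalDist(mu=expected_lead, sigma=sd).cdf(0)

sd = 2.0  # assumed uncertainty (in points) of the decisive margin
for lead in (2.5, 0.0, -2.5):  # a few-point swing in enthusiasm/turnout
    print(f"lead {lead:+.1f} pts -> P(win) = {win_prob(lead, sd):.0%}")
```

Under these assumptions, a swing of a couple of points in the expected margin moves the headline number from roughly 90/10 to 10/90. That is the leverage I'm worried about.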
Nate Silver has recently written a book about prediction, and I'd bet that he's well aware of the points I'm making and probably writes about them more eloquently than I could. I look forward to reading it. Perhaps the 538 prediction model accounts for the uncertainty I'm pointing to, but the top-line poll inputs the model relies on almost certainly do not. Unfortunately, the model is proprietary, so we can't know for sure.
I don't usually do this, but I'm going to go out on a limb and make a prediction in the presidential race. It's 50/50. Prove me wrong! The kick is in the air, and it all depends on the wind.
As has been written elsewhere, it's not just 538 that has Obama as a substantial favorite. The betting market at Pinnacle has Obama as a 75% favorite, at Intrade Obama is 63%, and other models like the Princeton Election Consortium have Obama at 99% at this point.
So various different methods (socio/economic models, poll aggregation, wisdom of crowds) are all converging on a similar point.
I think the correct analogy is estimating the result of a game from looking at the stats of the game. Silver looked at 77 state races that had likely-voter polls from at least three distinct firms in the final two weeks before the election. A simple average of those polls 'predicted' the winner 74 of 77 times. That's essentially why the correct analogy is closer to 'guessing' the game result from looking at its stats.
Brian,
I can't prove you wrong, but I can assume that your 50/50 prediction means that you'd be willing to take either side of an even-money bet. If so, I'll take Obama and you can have Romney. I'm a longtime fan of both you and Nate, and I can assure you that the margin of error reported in the polls is not the only source of uncertainty in his model. One relevant issue seems to be your tendency toward a frequentist interpretation of probabilities (i.e. "Only after a large number of probability estimates are tested against true outcomes can we assess the accuracy of a model."), compared to Nate's more Bayesian leanings.
However, it is possible to look at the "success rate" of candidates who lead by an average of 1 point in national polls two days ahead of the election. You are right, though, that we'll have to assume that voters, in the last two days of this election, will behave similarly to how voters in previous elections have behaved in the final days, and there is certainly no guarantee of that. I guess the question is, do we have any reason to believe otherwise?
Brian, I do think you're misinterpreting Silver's results a bit. I'm pretty sure his model accounts for the fact that there are many unknowns, including uncertainty about turnout (which would be reflected in polls appearing, ex post, to be biased toward one of the candidates). In fact, it is that uncertainty that allows Romney to still have a 15% chance of winning the electoral college in 538's model. Otherwise, it would probably be close to 0%, due to the overwhelming poll data pointing to an Obama victory in the electoral college. And since 538's model tries to account for uncertainties and reduces the confidence of its predictions accordingly, if Silver thinks something has an 85% chance of happening, I don't see why that 85% chance is any different than a football model that predicts an 85% chance of a team winning. In other words, I think you're taking a relatively apples-to-apples comparison and trying to make it look like it's apples-to-oranges. Silver would be a very bad statistician if he did not account for an error term in coming up with his forecast. And his track record would imply that he's not bad at math.
" but the folks who are skeptical of predictions from analytic models of complex systems are not 'against science' or 'anti-math' as they are often portrayed"
Indeed, where I learned about math and science, a lack of skepticism is what is against science, not vice versa.
I think the more important point is that Silver (or Wang, DeSart and Holbrook, Jackman, Linzer, etc.) sheds way more light onto the state of the race than simply looking at a single national tracking poll (which is what the majority of pundits are doing). Most of the criticism of the 538 model is that people say "But X poll says it's 50-50! How can you say Obama has a lead?" (to which the answer is "We have these things called states, the electoral college, and state-level polling."), not the more justified criticisms you've laid out.
I also think you may overstate the randomness of elections. The proof that your 50-50 prediction is wrong is exactly in all the state-level polling that Silver and the political science community relies upon. While it's possible that these polls are wrong, if they're correct, then there are very few ways for Romney to get to 270 with the state-level polling data (with a large enough number of polls in each state to substantially limit margin of error overall) and the constraints of the electoral college.
Also - you're certainly correct that a couple small things could sway the election. But that goes both ways. There's no reason to think that Romney benefits more from enthusiasm than Obama.
Pretty much everyone knows which way the wind blows in about 40 states. For the other 10, if we take all the polls at face value, then the only question is whether the score will get run up (FL), not who will win. Your position seems to be that we don't know enough to know anything. Nate is about halfway in between those two extremes. He sees that Romney needs about a 2% tailwind, one that the polls didn't detect, to have a decent shot at making the kick. He infers the odds of such a scenario at least partially by looking at the historical accuracy of polls, which is why his model is significantly different than 100/0. It is certainly possible the kick will go through, but it isn't 50/50.
My guesses:
1. EV 303-235
2. PV 50.2-49.8
3. 20% of your readers are also fans of Silver.
Just started reading Nate's book a few days ago, but so far he's already mentioned a lot of what has been discussed here. He definitely identifies the huge impact of uncertainty (rather than risk, which you would find in bounded games as mentioned) in terms of prediction models.
Once again Brian's key philosophical point is overlooked...
We have no way of knowing how accurate Nate's or Brian's models are for another 300 or so elections.
(Sample size, sample size, sample size.)
The approach is flawed. There is simply not enough testable evidence to support the use of polls.
Pollsters have been 'wrong' many times in the past. Or have they? That is Brian's point.
Re falsifiability: if you were just looking at strictly wins/losses in individual states, I would agree with you that there would be no way to say that Silver was right or wrong in his estimates, but he does provide us with a more specific way to evaluate him: he provides estimates of vote share in individual states, along with margin of error figures. If his method is wrong, shouldn't his vote share estimates consistently (say, >5% of the time) fall outside the margin of error?
Also, to respond to Anonymous, in a way we can look at Silver's model for 300 elections -- by looking at how his model performs in every Presidential, House, Senate, etc. election from 2008-present. If it's wrong at an unacceptable rate, then we'll know not to trust it.
"the betting market at Pinnacle has Obama as a 75% favorite, at Intrade Obama is 63%"
Betting markets just measure the opinions of bettors. Obama paying $10 for $6.30 the day before the election does not mean there is a 63% chance he will win; it just means bettors are generally willing to wager on those terms, and half of them are going to be wrong tomorrow.
"and other models like the Princeton Election Consortium have Obama at 99% at this point." And various economic models assure us he is doomed.
I'd guess the 20% crossover reader estimate is low. I've been reading Nate Silver since late 2007/early 2008, when he was posting under a pseudonym on another political website, before he even founded fivethirtyeight.com. Then again, I was reading here back when you had on the order of a dozen readers. I have a talent for finding the bloggers with the good statistics. ;)
The 50/50 estimate is sort of laughable, Brian. When you say "There seems to me to be too much leverage in something so uncertain to be confident of anything beyond the 60% level", you seem to be ignoring that people actually have prediction records on this stuff, and historically, people have predicted even large national elections far more accurately than at the 60% level. There are times they get things wrong (as we would expect!) but then again, Nate's analysis is far more sophisticated than most, and has been far more accurate in past predictions.
Of course, you make the 50% "prediction" because it cannot be disproven by results. Brian the commenter has already pointed out that percentages translate to odds, though. I would be happy to make a head-to-head wager on the results against you at even odds. There would be some difficulty in finding a way to put the money in escrow so we don't have to trust each other to pay up, but assuming we could I would be willing to wager up to a few thousand dollars against you. That limit is more of a measure of how risk-averse I am than anything else. I suppose I could conduct some arbitrage and bet a smaller amount for Romney on intrade, in which case I'd be happy to put my life savings on the line.
Of course, you wouldn't be interested in supporting my arbitrage, so let's put this a simpler way. If you think the odds are truly 50/50, then why don't you bet on Romney on Intrade, up to the limit of your risk-aversion? The spread between your predicted probability and the implied probability on that site is enormous.
But anyway, even this line of argument sells Silver's analysis short. Nate Silver is not just predicting a binary outcome. He's predicting popular vote margins in all 50 states. If you think he's wrong and that this is a coin flip, then you're really saying that you think every state (or at least a critical set of them) are essentially tied.
The reason Nate Silver is so well-regarded and was hired by the NY Times isn't because he predicted Obama would win in 2008. Lots of people did that and they didn't need fancy statistical models to do it. He is highly regarded because he predicted the primary results with unusual accuracy, then he called 49 of 50 states correctly in the general election and was much closer on the popular vote margins than anyone else.
"Also, to respond to Anonymous, in a way we can look at Silver's model for 300 elections -- by looking at how his model performs in every Presidential, House, Senate, etc. election from 2008-present."
Ah, hindcasting. If I had a nickel for every guy who thought his perfect hindcast meant he could predict the stock market...
Yes, you could calculate an average error over that period but that doesn't really help you as much as you might think. For one thing, you're assuming ceteris paribus, but in fact polling is very different than it was in 1980. In fact, polling is significantly different than it was in 1997 -- far more people have cell phones and response rates have fallen 75%. You can argue "well, we'll just adjust the model then" but then you have a new model for every election and you're back to square one.
"He is highly regarded because he predicted the primary results with unusual accuracy, then he called 49 of 50 states correctly in the general election"
This is just nonsense. We're supposed to be impressed he correctly predicted who would take California and Kansas? There were only four states where the averages were within 2.5% and he got three of them right. One out of four guys flipping quarters did just as well. Also, very late in 2010 Nate was still calling only a 50% chance the GOP took the House.
Don't get me wrong, Nate's a fine sports stats guy, but the notion he's done anything special in terms of predicting elections is silly.
"This is just nonsense. We're supposed to be impressed he correctly predicted who would take California and Kansas? There were only four states where the averages were within 2.5% and he got three of them right."
I suppose you should read Brian's criticism of the "curse of 370". You're picking some awfully convenient endpoints there.
You are also implying that everyone knew with near certainty what the results were going to be in a bunch of "swing states" where the margin topped 2.5%. Florida, Ohio, Iowa, New Hampshire, Colorado, Minnesota, Pennsylvania, Nevada, Virginia, and Wisconsin were all considered "in play" in 2008. In other words, you are implying that everyone knew with near certainty that Obama was going to win an electoral college landslide. Not only is that fairly plainly not the case, that conflicts with Brian's supposition that these elections are too complex to predict.
And again, Nate didn't just predict outcomes. He predicted popular vote margins in these states, including ones like California and Kansas. So even in the cases where the outcome was not in doubt, you can assess the accuracy of his models.
"Also, very late in 2010 Nate was still calling only a 50% chance the GOP took the House."
I'm not sure what you mean by late, but over a month before the election he was predicting historically huge losses for the Democrats and roughly a 70% chance of a takeover. Polling picks up later in the congressional races than it does in the national ones, so I'm not sure how meaningful his predictions earlier in that cycle were anyway. He SHOULD have been hedging towards 50% in June or July, for the same reasons that Brian articulates.
Just to drive the point home... here are state-by-state popular vote percentage predictions for the 2008 election from fivethirtyeight:
http://web.archive.org/web/20081219154455/http://www.fivethirtyeight.com/2008/11/todays-polls-and-final-election.html
Just looking at error on the margin in those swing states I mentioned above:
MO: McCain +.1%
NC: Obama +.7%
IN: McCain +2.5%
FL: McCain +1.1%
OH: McCain +1.5%
VA: McCain +.7%
CO: Obama +2.3%
IA: Obama +2.2%
NH: Obama +.2%
MN: McCain +.1%
PA: McCain +2.1%
As you can see, he favored McCain more often when he missed, which makes sense since he missed the popular vote by about 1% in McCain's favor.
You can call these terrible predictions if you like, but the point is that these (along with the error bars around them) are a more meaningful measure of the prediction than just cherry-picking an arbitrary endpoint (or the final result).
I also checked into the 2010 predictions a bit more, and I'm really not sure what you mean by the 50% prediction. He had HUGE error bars around that election - basically, he acknowledged that it is a lot harder to predict the results in the house than in the senate or presidential elections. However, even with those large error bars, the 95% confidence interval was all in the Republican column. He was predicting massive Republican gains.
Last post without a response, I promise.
In the process of finding those 2010 predictions, I got to this post, where Nate more or less explicitly responds to the argument Brian is making.
http://fivethirtyeight.blogs.nytimes.com/2010/09/29/the-uncanny-accuracy-of-polling-averages-part-i-why-you-cant-trust-your-gut/
"In other words, you are implying that everyone knew with near certainty that Obama was going to win an electoral college landslide."
Yes.
"Not only is that fairly plainly not the case"
What an odd statement. You can see it fairly plainly in the poll averages. That's why virtually everyone not related to John McCain was calling the election for Obama.
"that conflicts with Brian's supposition that these elections are too complex to predict."
2008 wasn't close. It was very, very easy to predict who would win the 2008 election even without much precision. Brian's point is that it is very difficult to predict who will win the 2012 election.
The myth of Silver's 2008 performance also rests on the mistaken notion that he was playing from the same deck as everyone else. He was not. He was getting detailed polling information from the Obama campaign which gave him a significant leg up.
"You can call these terrible predictions if you like, but the point is that these (along with the error bars around them) are a more meaningful measure of the prediction than just cherry-picking an arbitrary endpoint (or the final result)."
I agree in principle, but it's somewhat hilarious to make that argument while arguing for a model's validity from a single election year.
Re 2010, he was very late seeing the GOP taking the House, and missed the actual number of GOP seats by 10. Not terrible, but really nothing special either. Yes, elections are hard to predict, but we didn't need 538 to tell us that, either.
I agree it was very easy to predict an Obama victory the day before that election, which is why Silver rated it roughly 99% likely that Obama would win. His prediction this year is that Romney has something like twelve times the likelihood of winning that McCain did.
However, again, you are vastly underestimating the degree of actual uncertainty there was about not just the result of the election (while nearly everyone predicted Obama winning, most had much greater uncertainty than Silver did) but the number of states in play. Several reputable pollsters predicted McCain wins in Ohio, Florida, and Virginia, which you are now looking back on as obvious Obama wins.
The point I'm trying to make, repeatedly, is that:
- Silver didn't make one simple, easy, binary pick (i.e. picking Obama to win the 2008 election). He made dozens of picks about margins of victory and probabilities of victory, plus a bunch of predictions about margins and probabilities on senate races.
- We should judge him on how accurate those many predictions have been.
I'd add that just as important as how accurate his predictions have been is the question of how accurate his predictions of the accuracy of his predictions have been. As he pointed out in the post I linked to, if the candidates he listed as 90% to win always won, it would mean that his 90% predictions were too low. He SHOULD get a 90/10 election wrong 10% of the time, or else he's not accurately forecasting those probabilities. (Not incidentally, Brian runs into exactly the same issues when posting his predictions about NFL game results. He should be wrong on 60/40 picks 40% of the time, or else his model is poorly calibrated.)
So, we have tons and tons of Nate Silver predictions to look at. If you want to critique him, the right question is not: "did he get it right?", because he didn't actually predict the results with 100% certainty. The right question is "when he says something has an 86.3% chance of happening, did it happen around that often?".
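A sketch of what that check might look like, with invented forecast/outcome pairs standing in for a real track record:

```python
from collections import defaultdict

# Hypothetical (forecast, outcome) pairs; 1 means the event happened.
history = [(0.9, 1), (0.9, 1), (0.9, 0), (0.6, 1), (0.6, 0), (0.6, 1), (0.6, 1)]

buckets = defaultdict(list)
for p, outcome in history:
    buckets[round(p, 1)].append(outcome)  # group forecasts into 10% bins

# A well-calibrated forecaster's 90% events happen about 90% of the time.
for p in sorted(buckets):
    hits = buckets[p]
    print(f"forecast {p:.0%}: happened {sum(hits)}/{len(hits)} times "
          f"({sum(hits) / len(hits):.0%} observed)")
```

With only a handful of forecasts per bin the observed frequencies bounce around a lot, which is the sample-size problem Brian raised in the original post.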
---
Again, I'm not sure what you mean by "very late seeing" the 2010 swing. I haven't looked back to see when he first predicted it, but it was over a month out where he was essentially in the right range. And the point is that congressional elections are hard to predict, in part because the polling information is so sparse. His Senate and Presidential predictions are much better because he has better data. It's worth noting that he hasn't even really tried to predict the house this year, probably because he knows how hard it is.
What is this "detailed polling data" he was getting in 2008? All the data he used was posted publicly. Do you think he has a "significant leg up" this year, as well? If so, shouldn't we trust him more, not less?
I think some of us are missing Brian's point with the last paragraph. Replace 50/50 with 99.9999/0.0001 and the idea is equally valid. Probabilistic predictions can never be proven correct or incorrect - and that includes both the short term and the long term. Any individual may have their own subjective criteria for accepting or rejecting a probabilistic model, but the operative word there is subjective.
Excellent comments. Every single one. That says a lot about the readers and commenters here.
To clarify my ungainly essay in the original post, here are my real points:
1. I think highly of Nate Silver and 538.
2. But I think there is too much uncertainty in some very high-leverage inputs to be as certain as 538 appears to be.
3. Real-world models are far worse at making predictions than we'd generally like to admit. I think the ability to explain is sometimes confused with the ability to predict.
4. Part of the reason for our overconfidence in making predictive models in open, dynamic, real-world systems comes from the success we've had at analyzing games.
5. And part of the reason for our overconfidence is that none of these models are falsifiable. Not a single plausible model can be, nor has ever been, proven wrong.
6. One way to mitigate our overconfidence is to be publicly explicit about all the assumptions that go into any model, and by extension the assumptions that go into its inputs.
This post reads like one of the comments that Brian likes to eviscerate now and then: "I don't trust your fancypants numbers and my gut tells me different!"
But you've taught me too well. What reason do you have to convince me that the polls showing a consistent lead for Obama in Ohio are wrong?
Silver includes a systematic uncertainty (which rolls up things like turnout, enthusiasm, polling model errors, etc.) into his estimate of election likelihood; as mentioned above, this very term is why Silver gives Romney a ~15% chance instead of a ~1% chance, which is the number you'd get by taking the polls purely at face value.
Brian's criticism can be turned into a quantitative question along these lines:
How much larger would that term have to be to turn a Romney victory into a 40% probability, and what is the quantitative evidence for or against the term being that large?
(You can, from various plots Silver has shown, estimate that he currently has this term with standard deviation in the neighborhood of 3-4%, and it would have to be 5-7% to make it a 60-40 tossup. 5-7% seems like an awful lot to me but I've certainly never studied the question.)
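(For anyone who wants to redo that back-of-envelope, here's a toy normal-model version. The 1.5-point lead is an assumed, illustrative figure, not anything taken from 538:)

```python
from statistics import NormalDist

lead = 1.5  # assumed lead in points of the decisive margin (illustrative only)
for target in (0.15, 0.40):  # desired P(trailing candidate wins)
    # Solve Phi(-lead / sigma) = target for the systematic sd sigma.
    sigma = -lead / NormalDist().inv_cdf(target)
    print(f"P(upset) = {target:.0%} needs systematic sd of about {sigma:.1f} pts")
```

Under these toy numbers, moving the upset chance from 15% to 40% requires roughly quadrupling the systematic error term, which is why 5-7% "seems like an awful lot."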
I may be wrong, but it seems to me that one could have more confidence in Nate's model than in a game model. In the game, literally one event, as Brian mentions, the wind blowing, could change the outcome of the game. In the upcoming election, many things would have to change between now and tomorrow night for Romney to win. Additionally, most of those events or sets of events would have to work in Romney's favor.
Someone mentioned the cell phone problem with polling. If anything, not properly polling cell phone users without a landline, or properly weighting for them, would lead to an advantage in polling for the Republican, rather than the Democrat, if the demographics of the vote are marginally similar to previous elections.
I think your analogy is inherently flawed. I could change it if I'd like: like a game, we know the time (it differs by state, but it's early voting plus Nov. 6). We know the rules (most states are winner-take-all, with exceptions for Maine and Nebraska), etc.
The question is uncertainty, of course, but there's a lot of uncertainty in your game data as well - did a key player get hurt, what's the weather like, etc.
Part of the point of 538 is to quantify some of that uncertainty. Of course he's using backwards looking data to do so, but then again, so are you in your probability model.
You bring up enthusiasm as a problem but it's just that sort of issue that Silver is trying to solve. Looking back on the elections for which we have data, how important is it really? We see that the major poll that seems to weight it most highly in their LV model, Gallup, seems to get it wrong when it differs from the consensus (and indeed seems to have larger than plausible swings).
And that's just the point - Silver is providing a counterweight to the pundits who often run with a story regardless, and sometimes in spite of, the evidence - see Romney's momentum for weeks after the polls had stabilized. Is it perfect? No, of course not. Does it adequately account for uncertainty? That's a much more difficult question but it most definitely doesn't ignore it.
The analogy is fine. The distinction is that there is no uncertainty in the inputs to a sports model, but there are considerable uncertainties in the polling inputs.
Whoever wins tomorrow and whichever poll happens to stumble on the right numbers proves nothing. Besides, we already know who's going to win. Remember...correlation always means causation. Prove me wrong!
Nate's book is basically all about how he is not that great at what he does/did. I think you'd like it.
For all the attention Silver gets over his model from 2008, people seem all too eager to forget how wrong he was in 2010 when he completely missed the Republican wave which delivered the house.
He also incorrectly called 67% of the Senate races that year.
The state polls were the problem. Garbage in, garbage out. That's the point here. The model doesn't do enough to account for these variations.
This just sounds too innumerate to be real.
We are only a week from saving hundreds of lives by making predictions in an open, dynamic, real-world system that has very little to do with games.
I'd guess a plausible model is proven false when it is determined to no longer be plausible, but then it doesn't apply anymore. Neat.
If Silver, Wang, Linzer, and everyone else are wrong about calling the overall winner (in the electoral college), it won't be their fault so much as a systematic failure in polling in swing states, one which would basically drive every polling company out of business (yes, we could thank Obama for that). That is distinctly possible, but is it a 1%, 15%, or 50% possibility?
With the overwhelming number of polls in Ohio and other swing states, the statistical models being used by all these secondary people (Silver, Wang) don't even matter when picking the overall winner. The conclusion is plain as day.
The overall winner is the only presidential prediction that really matters, but that isn't the only prediction being made. Some models will fare better with picking states (or congressional districts), margins, county by county results, or any of a number of different things. Even among those models that correctly pick the winner, there will be winners and losers.
Just because there are only two possible outcomes doesn't mean that the outcomes are 50/50. The possibility of a black swan should not completely invalidate all of the other factors (like people telling us who they are voting for) that informed the prediction.
538's swing state probabilities seem too extreme (far from 50).
The blog puts 4 swing states at above 80% likelihood for Obama:
Virginia (80%), New Hampshire (85%), Iowa (85%), and Colorado (80%).
Yet most of the poll data I've seen has the two candidates within the margin of error for those states.
When I run a Monte Carlo simulation using more even odds that still all favor Obama (taken from the Intrade state odds), I still get an Obama victory 70-75% of the time, but the electoral college MOV is nowhere near what Nate Silver is predicting.
I hope he's right, or his supporters are going to get killed on the electoral college voting on Intrade.
Larry Sabato predicts a more moderate 290-248 victory for Obama. That prediction matches the mode from my simulations.
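For anyone who wants to try the same exercise, here's a minimal version of that kind of Monte Carlo. The three-state map, safe-state total, and win probabilities below are made up to keep the sketch short; note that flipping states independently, as this does, ignores the correlated polling error discussed elsewhere in this thread.

```python
import random

# Toy map: state -> (electoral votes, assumed P(Obama win)). Numbers invented.
states = {"OH": (18, 0.75), "FL": (29, 0.50), "VA": (13, 0.65)}
BASE_OBAMA_EV = 250  # assumed EVs from safe states (hypothetical)
NEEDED = 270

def simulate(trials=100_000):
    wins = 0
    for _ in range(trials):
        ev = BASE_OBAMA_EV
        for votes, p in states.values():
            if random.random() < p:  # independent coin flip per state
                ev += votes
        wins += ev >= NEEDED
    return wins / trials

print(f"P(Obama >= 270 EV): {simulate():.1%}")
```

With these invented inputs the toy sim lands in the same 70-75% neighborhood, but the spread of electoral-vote outcomes depends heavily on how correlated you assume the states to be.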
Just as completion percentage and passing yards are a noisy measure of a QB's true abilities, so is polling data a noisy indicator of the electorate's true preferences. In both models, there is uncertainty in the thing you are truly trying to measure.
"For all the attention Silver gets over his model from 2008, people seem all to eager to forget how wrong he was in 2010 when he completely missed the Republican wave which delivered the house."
This has been repeated by a couple people, so I can only assume that it's a common talking point somewhere on the interwebs.
It has a nice ring, but again, it's not actually true. The very first forecasts he released, in early September of 2010, predicted a Republican takeover.
"He also incorrectly called 67% of the Senate races that year."
Wow. This is not only wrong, it's hilariously wrong. There were 33 races, and he correctly predicted 30 of them - that's 91%, not 33%.
Now, given that many races are very easy to call, missing 3 is actually pretty bad - there were maybe 9 or 10 races that were in any serious doubt in the closing months of the campaign, so his hit rate was much lower than on Senate races in 2008.
It might bear noting that all three misses were in favor of Republicans. And the misses Silver had in the 2008 elections, small though they were, took the form of underestimating the margins for the Democrats. He did underestimate the Republican swing in 2010 (although he was close), but overall there sure doesn't seem to be any left lean in his predictions.
Brian - the idea that the sports model has certain inputs and the 'real world' has uncertain inputs is a bit arbitrary. Yes the rules of a game are fixed - but so too are the rules of the election (when the votes are cast, how they're counted, how electoral votes are divvied up). But, both are also loaded with uncertainty.
Now you can certainly argue that there is more uncertainty in an election - and you'd undoubtedly be right. But that's a difference in scale, not kind. Both a football prediction model and Silver's model have some fixed and many uncertain inputs.
I take your point about the inability to falsify a probabilistic prediction. But what else would you suggest? You seem to be heading toward a 'perfect is the enemy of the good' argument. Is polling perfect? By no means, but that doesn't mean it doesn't tell us something.
Nate Silver would be the first to tell you the limits of what he's doing. Indeed he spells it out in his blog and his book. At the same time, I think there is quite a bit of value in it.
I think many (not meaning you) fundamentally misunderstand the type of Bayesian analysis he's doing and thus don't really understand his numbers - thus the arguments about how many states or races he 'got right.' People seem to think that if he called 7 states 57% one way and they all went the way he called them, it would be a good result for him (which of course it wouldn't - he should only get the direction right on about 4 of those).
At this point his model is essentially predicting roughly a 10% chance that the polls are wrong. Is that number right? I don't have the data to argue one way or the other though I'm sure he does. Is it difficult to prove? Very much so - the best he can do is show how previous polling has done.
After rereading your post, I'm pretty sure I got it wrong the first time. I was thinking more in terms of what might be more properly called parameters (time left, rules), while you were saying that the polls held uncertainty that the score of a game does not, right?
In that case, you're certainly correct. But there is still a lot of uncertainty in both.
Possibly more importantly, the whole point of Bayesian analytics is to quantify and deal with that very uncertainty. Silver is trying to quantify polling uncertainty and combine it with the other things that we know about elections to come up with a number that incorporates all that we understand about the election - not just polling but the history of a state, the effects of the economy on elections, etc.
Again, I take your point about the probability not being falsifiable and it's certainly a legitimate one. But I would still argue that we gain a lot from that sort of analysis - often far more than we would from either a naive reading of the polls or, God help us, just listening to the pundits pontificate (which somehow always ends up with why their guy is actually winning).
Here's a question for you - is your objection primarily that he attaches a specific number to it? "Polls show Obama almost certain to win" doesn't seem like it would garner nearly the objections that "Obama has an 86% chance of winning" does.
I'll say this: when all is said and done, Nate's either gonna be celebrated as an infallible genius, or ridiculed as a total dunce, and neither narrative will really be correct, lol.
what I have learned reading the post and all the comments:
1) There are some very smart people who care about football scores,
2) The level of knowledge of statistics in these comments is intimidating to someone uneducated in such things but who loves the logic of it all.
3) I'm fairly certain having finished reading all this I am far less certain of what I thought I might know.
For the fun of it:
Obama 332...my prediction. My sense is Silver's more right than Brian Burke thinks he is. I lean left politically and I am an optimist more than a pessimist (until Christian Ponder has to make a third down pass - then I'm not so optimistic).
jmaron
Improbably, as of 1:15 on the east coast, it looks like Nate Silver is going to hit 100% this year. Although, to be fair, his final prediction was that Obama was only a 50.3% favorite in Florida, so for all practical purposes he didn't make a prediction there.
Still, overall, the story here (from a stat-nerd prediction game perspective, of course) is that the polls were actually pretty darn accurate this year.
Looks like the non-ludic terms amount to a percent or less.
All great points, yet in the end: SILVER WAS GOLD!
Brian, how about revisiting this topic with a new look this time, given Silver's outstanding election swing-state predictions?
T.Dennis
So, it's the genius narrative! What's the over-under on when Michael Lewis writes a book about Silver?
If you call Nate Silver a genius, you really need to call Simon Jackman, Josh Putnam, Sam Wang, Drew Linzer, and probably others geniuses as well. There were a number of people looking at the polls, creating estimates and error bars around each state's result, and running monte carlo simulations to produce a national estimate. They had varying degrees of complexity (whether that's good or not) and some used some other inputs as well, but they ALL ended up with the same map, and that map turned out to be correct.
@Tarr: If you think I'm calling Silver a genius, you need to read all the comments before replying, lol.
Has anyone recognized the glaring (and I do mean glaring) flaw in the Nate Silver analysis?
Brian, is there a better way to communicate with you rather than a message getting lost in this comment list as an anonymous poster?
It might make for a good article, and could possibly go national.
"Has anyone recognized the glaring (and I do mean glaring) flaw in the Nate Silver analysis?"
Ah yes, I see it has been mentioned. And indeed, looking at the 50 states for 2012 and 2008, we see that Silver is 99 for 100. While many of his probability estimates are essentially 100% or 0%, there are several in the 60-90% range. It is unlikely that all of those would be "correct".
Additionally, one can calculate how unlikely it is to get 99 out of 100 probabilities correct. When I get a chance, I will post it.
Anon,
The short answer as to why is that those probabilities are actually highly correlated.
That said, it is likely that he overestimated the uncertainty.
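A quick simulation shows why correlation changes the math so much. Below, each hypothetical lean-Obama state's margin gets a shared national error plus a small state-specific error; when most of the error is shared, sweeping every one of them is common, whereas independent errors make a sweep rare. All numbers are invented for illustration:

```python
import random

TRIALS = 20_000
LEADS = [2.0] * 10  # ten hypothetical states, each with a 2-point expected lead

def sweep_rate(shared_sd, state_sd):
    """Fraction of trials in which ALL ten states go to the leader."""
    sweeps = 0
    for _ in range(TRIALS):
        national_err = random.gauss(0, shared_sd)  # one draw shared by all states
        if all(lead + national_err + random.gauss(0, state_sd) > 0
               for lead in LEADS):
            sweeps += 1
    return sweeps / TRIALS

print(f"independent errors:  {sweep_rate(0.0, 3.0):.0%} sweep rate")
print(f"mostly shared error: {sweep_rate(2.5, 1.0):.0%} sweep rate")
```

Going 99 for 100 is damning evidence against a set of independent 60-90% forecasts, but much weaker evidence against correlated ones.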
Gotta give credit where it's due. And it is due.
With regards to the request to revisit. I think the article stands as written: "This is not a criticism of how Nate's model works or what 538 does. On the contrary, I applaud and admire it. I have no reason to doubt Nate Silver plays things straight with his numbers. Nor do I have reason to doubt that, given the assumptions inherent in the model and the assumptions of the poll inputs, the probabilities of the 538 model are accurate."
Harvard Sports Analysis did a review of how Silver did with his predictions, and it reached a rather different conclusion than some here seem to be reaching.
http://harvardsportsanalysis.wordpress.com/2012/11/08/nate-silver-and-forecasting-as-an-imperfect-science/
Tarr,
It is reasonable to guess that he did indeed overestimate the uncertainty, in light of predicting Indiana in 2008 as 100% certain and being wrong.
But the point remains: if he predicts Obama to win at 60% and Romney at 40%, then Romney has to win 40% of those. He won 0% of those. The probabilities were wrong.
I haven't crunched the numbers to see how wrong it is, but this is one of those rare cases where you can take probability estimates and determine whether they were consistent with results, because of his amazing "success".
To summarize with an analogy: he modeled his coin as 60% heads, 40% tails, flipped it 100 times, and it came up heads 99 times. (Note: made-up numbers simply to illustrate the point; I'll have to look at the details.)
Boston Chris - read the update to that article. Aside from ignoring the correlation between the states, the author initially forgot that the margin of error is two-tailed and thus understated it by a factor of 2. In the update you find that 48 of 50 states were within Silver's margin-of-error projections (and assuming a 95% confidence interval, you wouldn't expect him to get all 50), so he did very well.
Anon - you would be correct if and only if the states were independent, and they are not; as Silver predicts, they are highly correlated. The simple way to think of it is that it's less that there are, say, 6 states that are each 60/40 Obama than that it's 60/40 that Obama wins those 6 states together. The coin flip analogy simply doesn't work - each coin flip is an independent result; election results are not, and neither the model nor common sense looks at them that way. What you have to look at is not whether the states went 60/40 one way or the other but whether the results were within his predicted margin of error (which they were in 48 of 50 states).
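As a rough check on that 48-of-50 figure: if each state's 95% interval were treated as independent (they aren't; this is only a rough check), the count of states landing inside their intervals follows a binomial distribution.

```python
from math import comb

n, p = 50, 0.95  # 50 states, 95% confidence interval per state
# P(at least 48 of 50 results fall inside their intervals),
# assuming independence across states (an assumption).
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(48, n + 1))
print(f"P(>= 48 of 50 inside): {prob:.0%}")
```

That comes out to roughly a coin flip, i.e. 48 of 50 is almost exactly what well-calibrated 95% intervals should produce.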
@ Marshall, thanks for pointing out the update. Unfortunately my RSS feed doesn't send me a 2nd link after an update or I would have gladly mentioned it myself. The stats are beyond me, I just wanted to share another perspective and that's why I included the link rather than make any argument myself. Glad to see it was updated.
"in light of predicting indiana in 2008 as 100% certain, and being wrong."
Do you have a source for this? I recall he missed the vote percent by only a point. I would be surprised if that translated into 100% certainty.
Mike,
Anon is wrong, and I have no idea where he got that from. Silver gave McCain a 70% chance of winning Indiana in 2008.
http://web.archive.org/web/20081219154455/http://www.fivethirtyeight.com/2008/11/todays-polls-and-final-election.html
Also, anon, you seem to be thinking that "overestimate the uncertainty" means the opposite of what it means.
Anyway, others have made the point, as I did, that these are highly correlated probabilities. One way to think of this is that the error of a given state's margin is a sum of the errors in the national popular vote margin, the regional margin, and the state margin.
Thanks Tarr. At this point, if some random internet commenter claimed that Nate Silver breathed air and walked on two feet, I would ask for a citation.