Monday, November 06, 2006
From Poll Margin to Wins: Polls as Predictors
The usual way to look at poll accuracy is to subtract the poll result from the vote result. But an alternative is to look at how the probability that a candidate wins depends on the margin they have in the pre-election polls. Since American elections are "winner-take-all" within districts, this is a good way of looking at the practical power of polls to predict winners.
After all, a statistician would say that a poll predicting 51% for a loser who actually got 49% was better than a poll predicting 51% for a winner who got 55%. That's right from one point of view, but not from the perspective of picking the winner correctly. Here I take a look at the latter view of what is important.
The data are from all statewide polls for Senate, Governor or President from 2000 and 2002.
The figure above plots results by poll margin. The x-axis shows the Dem minus Rep margin in the polls. The y-axis plots the percent of races the Dem ACTUALLY won for each margin we saw in the polls. So imagine I take all polls that found a 5 point lead for the Dem. The y-axis plots the proportion of those polls with a 5 point lead in which the Dem actually DID win. I do this separately for each type of race: Governor, Senate and President. The dots show there is a lot of variation, but the pattern of points and the black trend line through the data show how the predictive accuracy varies over margins from -30 to +30.
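For readers who want to try the same tabulation on their own poll data, here is a minimal sketch of the calculation behind each dot in the figure. It is not the code used to make the plot. The function name `win_rate_by_margin`, the input layout (one pair per poll) and the sample numbers are all assumptions made up for illustration.

```python
from collections import defaultdict

def win_rate_by_margin(polls):
    """Share of polls at each Dem-minus-Rep margin in which the Dem won.

    polls: iterable of (margin, dem_won) pairs, one pair per poll
           (hypothetical layout, not the original data format).
    """
    counts = defaultdict(lambda: [0, 0])  # margin -> [Dem wins, total polls]
    for margin, dem_won in polls:
        bucket = counts[round(margin)]    # group polls by whole-point margin
        if dem_won:
            bucket[0] += 1
        bucket[1] += 1
    return {m: wins / total for m, (wins, total) in sorted(counts.items())}

# Made-up illustrative polls, NOT the 2000/2002 data
sample = [(5, True), (5, False), (5, True), (-10, False), (0, True), (0, False)]
rates = win_rate_by_margin(sample)
```

Running this once per race type (Governor, Senate, President) and plotting the rates against the margins would give one set of dots per race type. The trend line is a separate smoothing step that this sketch does not attempt.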
One interesting feature is that a margin of zero (a tied poll) produces a 50-50 split in wins with remarkable accuracy. There is nothing I did statistically to force the black trend line to go through the "crosshairs" at the (0, .5) point in the graph, but it comes awfully close. So a tied poll really does predict a coin-flip outcome.
The probability of a win rises or falls rapidly as the polls move away from a margin of zero. By the time we see a 10 point lead in the poll for the Dem, about 90% of the Dems win. When we see a 10 point margin for the Rep, about 90% of Reps win. That symmetry is also not something I forced with the statistics. It represents the simple and symmetric pattern in the data.
More practically, it means that polls rarely miss the winner when a candidate has a 10 point lead, but they DO miss about 10% of the time.
A 5 point lead, on the other hand, turns out to be right only about 60-65% of the time. So bet on a candidate with a 5 point lead, but don't give odds. And for 1 or 2 point leads (as in some of our closer races tomorrow) the polls are only barely better than 50% right in picking the winner. That should be a sobering thought to those enthused by a narrow lead in the polls. Quite a few of those "leaders" will lose. Of course, an equal proportion of those trailing in the polls will win.
So read the polls. They are a lot better than nothing. But don't take that 2 point lead to the bank. Doing so is a failure to appreciate the practical consequences of the margin of error.
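To put a rough number on that margin of error, here is a small sketch using the standard textbook formula for the sampling error of a lead (the difference between two candidates' shares in the same poll). It assumes a simple random sample, which real polls only approximate. The function name `lead_margin_of_error` and the example poll (48% to 46% with 600 respondents) are hypothetical and do not come from the data above.

```python
import math

def lead_margin_of_error(p_leader, p_trailer, n, z=1.96):
    """Approximate 95% margin of error on the lead (p_leader - p_trailer).

    Assumes a simple random sample of n respondents; both shares come
    from the same poll, so their errors are negatively correlated.
    """
    lead = p_leader - p_trailer
    variance = (p_leader + p_trailer - lead ** 2) / n
    return z * math.sqrt(variance)

# Hypothetical poll: 48% vs 46% among 600 respondents
moe = lead_margin_of_error(0.48, 0.46, 600)
```

For this made-up poll the margin of error on the lead comes out to several points, well above the 2 point lead itself. Sampling error is only one part of why narrow leads pick winners barely better than a coin flip, but it is the easiest part to quantify.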