Wednesday, January 09, 2008

Polling Errors in New Hampshire

Hillary Clinton's stunning win over Barack Obama in New Hampshire is not only sure to be a legendary comeback but equally sure to become a standard example of polls picking the wrong winner. By a lot.

There is a ton of commentary already out on this, and much more to come. Here I simply want to illustrate the nature of the poll errors, which helps clarify the issues. I'll be back later with some analysis of these errors, but for now let's just look at the data.

In the chart, the "cross-hairs" mark the outcome of the race, 39.1% Clinton, 36.4% Obama. This is the "target" the pollsters were shooting for.

The "rings" mark errors of 5, 10 and 15 percentage points. Normal sampling error alone would put a scatter of points inside the "5-ring", if everything else were perfect.
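For readers who want to reproduce the rings, here is a minimal sketch of how a poll could be scored against the target. It assumes the rings are circles measured as straight-line distance in percentage points on the Clinton and Obama axes, and that sampling error follows the textbook simple-random-sample formula. The function names, the example poll and the sample size of 600 are hypothetical illustrations, not anything taken from the actual polls.

```python
import math

# Actual result from the post: (Clinton %, Obama %).
ACTUAL = (39.1, 36.4)

def radial_error(poll, actual=ACTUAL):
    """Straight-line distance, in percentage points, from a poll to the outcome."""
    return math.hypot(poll[0] - actual[0], poll[1] - actual[1])

def ring(poll, actual=ACTUAL, rings=(5, 10, 15)):
    """Smallest ring containing the poll, or None if it falls outside all of them."""
    distance = radial_error(poll, actual)
    for radius in rings:
        if distance <= radius:
            return radius
    return None

def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error, in points, for one candidate's share
    under simple random sampling (an idealization of real polls)."""
    return 100 * z * math.sqrt(share * (1 - share) / n)

# Hypothetical poll (not one of the real New Hampshire polls): low and to the left.
hypothetical_poll = (34.0, 33.0)
print(round(radial_error(hypothetical_poll), 1), ring(hypothetical_poll))

# n=600 is an assumed sample size, used only for illustration.
print(round(margin_of_error(0.39, 600), 1))
```

With a sample of that assumed size, the margin of error on a single candidate's share comes out at roughly 4 points, which is consistent with expecting a well-behaved poll to land inside the 5-ring.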

In fact, most polling shoots low and to the left, though often within or near the 5-ring. The reason is undecided voters in the survey. Unless the survey organization "allocates" these voters by estimating a vote for them, some 3-10% of respondents in a typical election survey are left out of the final vote estimate. Some measures of survey accuracy divide the undecided, either evenly across candidates or proportionately across them. There is good reason to do that, but that is a topic for another post. What the pollsters publish are (almost always) the unallocated numbers, and so it seems fair to plot here the percent of the vote the pollster published, not one with the undecided reallocated.
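The two allocation rules mentioned above are easy to state in code. This is only a sketch: the function name and the example poll are hypothetical, and treating "Other" as a single candidate who receives a share of the undecided is my simplifying assumption.

```python
def allocate_undecided(shares, undecided, method="proportional"):
    """Return vote shares with the undecided percentage redistributed.

    shares:    dict of candidate -> published percent (undecided excluded)
    undecided: percent of respondents who gave no preference
    method:    "even" splits the undecided equally across candidates;
               "proportional" splits them in proportion to current support
    """
    if method == "even":
        bonus = undecided / len(shares)
        return {name: pct + bonus for name, pct in shares.items()}
    if method == "proportional":
        decided_total = sum(shares.values())
        return {name: pct + undecided * pct / decided_total
                for name, pct in shares.items()}
    raise ValueError(f"unknown method: {method!r}")

# Hypothetical published numbers, not taken from any real poll.
poll = {"Clinton": 30.0, "Obama": 36.0, "Other": 26.0}
undecided = 8.0

for method in ("even", "proportional"):
    allocated = allocate_undecided(poll, undecided, method)
    print(method, {name: round(pct, 1) for name, pct in allocated.items()})
```

Either rule moves every point up and to the right on the chart, which is why unallocated polls tend to sit low and left of the target.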

What we see for the Democrats is quite stunning. The polls actually spread very evenly around the actual Obama vote. Whatever went wrong, it was NOT an overestimate of Obama's support. The standard trend estimate for Obama was 36.7%, the sensitive estimate was 39.0% and the last five poll average was 38.4%, all reasonably close to his actual 36.4%.

It is the Clinton vote that was massively underestimated. Every New Hampshire poll was outside the 5-ring. Clinton's trend estimate was 30.4%, with the sensitive estimate even worse at 29.9% and the last five poll average at 31.0%, compared to her actual vote of 39.1%.
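To put numbers on that contrast, the short sketch below subtracts the actual vote from each of the three estimates quoted above. The figures are the ones given in this post; the variable and function names are just illustrative.

```python
# Actual New Hampshire Democratic result, in percent.
ACTUAL = {"Clinton": 39.1, "Obama": 36.4}

# The three pre-election estimates quoted in the post, in percent.
ESTIMATES = {
    "standard trend": {"Clinton": 30.4, "Obama": 36.7},
    "sensitive trend": {"Clinton": 29.9, "Obama": 39.0},
    "last five polls": {"Clinton": 31.0, "Obama": 38.4},
}

def signed_errors(estimates, actual):
    """Estimate minus actual, in percentage points (negative = underestimate)."""
    return {
        label: {name: round(est[name] - actual[name], 1) for name in actual}
        for label, est in estimates.items()
    }

for label, errors in signed_errors(ESTIMATES, ACTUAL).items():
    print(f"{label:16s} Clinton {errors['Clinton']:+.1f}  Obama {errors['Obama']:+.1f}")
```

Every Clinton error is an underestimate of more than 8 points, while the Obama errors range from a fraction of a point to under 3 points and run in the opposite direction.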

So the clear puzzle that needs to be addressed is whether Clinton won on turnout (or Obama's was low), whether last-minute decisions broke overwhelmingly for Clinton, or whether the pollsters' likely voter screens mis-estimated the makeup of the electorate. Or perhaps the weekend hype led to a feeding frenzy of media coverage that was very favorable to Obama and very negative towards Clinton, which depressed her support in the polls but oddly did not lower her actual vote.

On the Republican side we see a more typical pattern, and with better overall results. About half of the post-Iowa polls were within the 5-ring for the Republicans, and most of the rest within the 10-ring.

As expected, errors tend to be low and left, but the overall accuracy is not bad. This fact adds to the puzzle in an important way:

If the polls were systematically flawed methodologically, then we'd expect similar errors with both parties. Almost all the pollsters did simultaneous Democratic and Republican polls, with the same interviewers using the same questions with the only difference being screening for which primary a voter would participate in. So if the turnout model was bad for the Democrats, why wasn't it also bad for the Republicans? If the demographics were "off" for the Dems, why not for the Reps?

This is the best reason to think that the failure of polling in New Hampshire was tied to swiftly changing politics rather than to failures of methodology. However, we can't know until much more analysis is done, and more data about the polls themselves become available.

A good starting point would be for each New Hampshire pollster to release their demographic and crosstab data. This would allow both sample composition and voter preferences within demographic groups to be compared across polls. Another valuable bit of information would be voter preference by day of interview.

In 1948 the polling industry suffered its worst failure when it confidently predicted Truman's defeat. In the wake of that polling disaster, the profession responded positively by appointing a review committee, which produced a book-length report on what went wrong, how it could have been avoided and what "best practices" should be adopted. The polling profession was much the better for that examination and report.

The New Hampshire results are not on the same level of embarrassment as 1948, but they do represent a moment when the profession could respond positively by releasing the kind of data that will allow an open assessment of methods. Such an assessment may reveal that in fact the polls were pretty good, but the politics just changed dramatically on election day. Or the facts could show that pollsters need to improve some of their practices and methods. Pollsters have legitimate proprietary interests to protect, but big mistakes like New Hampshire mean there are times when some openness can buy back lost credibility.