Thursday, January 31, 2008

Super Tuesday Polling at a Glance

Updated: 2/5/08, 1:58pm est

No time for commentary this morning, so here is an explanation.

It is ironic and annoying that the most important date on the primary schedule is also the date with the fewest polls per state. Just as the campaigns are struggling to run 22 simultaneous campaigns, so pollsters and the media have invested little in comprehensive polling of the Super Tuesday states. Even large states such as New York and California have fewer than 10 polls since January 1, far fewer than we saw last week in Florida, for example. As a result, we have many states with no data at all, preventing a comprehensive overview of the prospects for Tuesday. Even where we do have polls, we lack enough to consistently estimate the trend with data taken since Iowa. Where we can estimate trends, we've done so on the "regular" state pages. You should go there for the best trend estimates we can manage with so little data.

The charts here are a way of seeing the entire set of Super Tuesday states (where we have polling) at a glance.

Rather than plot the usual trends with so few data points, each poll is a point and the darker the point the more recent the poll. The points are also scaled in size to be proportional to the number of delegates at stake in the state.

Instead of a trend estimate, this plot highlights the median of all post-Iowa polling in the state. The shading of points will then let your eye tell you whether there is a visible trend around that median. Be your own data analyst!
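For readers who want to try this at home, here is a minimal Python sketch of the three ingredients of the chart: the post-Iowa median, the recency shading, and the delegate scaling. The poll numbers and state names are made up for illustration, and the function names are my own. It is not the code behind the actual charts.

```python
from datetime import date
from statistics import median

IOWA_CAUCUS = date(2008, 1, 3)  # polls ending after this date count as "post-Iowa"

# Made-up polls for illustration only: (state, last field date, candidate share in %)
EXAMPLE_POLLS = [
    ("State A", date(2007, 12, 15), 30.0),  # pre-Iowa, so it is ignored
    ("State A", date(2008, 1, 10), 41.0),
    ("State A", date(2008, 1, 20), 45.0),
    ("State A", date(2008, 1, 30), 43.0),
    ("State B", date(2008, 1, 25), 38.0),
    ("State B", date(2008, 2, 1), 42.0),
]


def post_iowa_medians(polls, cutoff=IOWA_CAUCUS):
    """Median share per state, using only polls that ended after the cutoff."""
    by_state = {}
    for state, end_date, share in polls:
        if end_date > cutoff:
            by_state.setdefault(state, []).append(share)
    return {state: median(shares) for state, shares in by_state.items()}


def recency_shade(end_date, oldest, newest):
    """Shade from 0.0 (lightest, oldest poll) to 1.0 (darkest, newest poll)."""
    span = (newest - oldest).days
    if span == 0:
        return 1.0
    return (end_date - oldest).days / span


def point_area(delegates, area_per_delegate=1.0):
    """Marker area proportional to the delegates at stake (the scale is arbitrary)."""
    return area_per_delegate * delegates


print(post_iowa_medians(EXAMPLE_POLLS))
```

The median is used here for the same reason as in the charts: with only a handful of polls per state, it is less sensitive to a single odd poll than an average would be.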

When more states become available, they'll be added to updated charts. If a state is missing, we don't have polling for it. (If you think we've missed a state with polling data, let us know!)

Florida Republican Poll Errors

Polling had a pretty good night in Florida on Tuesday. While early polls understated both McCain's and Romney's vote, the polls got better as election day closed in, reflecting a (measured) upward trend in both candidates' support. By the last day or two of polling most polls were within the ten-ring, and a number were close to the five-ring.

There was more disagreement among the polls as to who was ahead, and this included some late polls that put Romney ahead of McCain. However, most of these polls reflected a race "too-close-to-call", rather than egregious errors about the leader, with one large exception towards the lower right in the plot.

As we've seen all year, the polls got the vote for third and fourth place finishers very accurately, almost all within the five-ring.

Monday, January 28, 2008

Florida Republican Endgame

The polls in Florida point to a very close contest between Mitt Romney and John McCain. As of Monday's polling used here (there will be new final polls available Tuesday morning) Romney has a small lead based on our trend estimators (both standard and sensitive). But that lead is so small that a "dead heat" is probably still a good characterization of the race.

As we've seen in several previous states, the final vote has often broken strongly in favor of the winner with little going to the second place finisher. But who will that winner be? Other than our trend estimates, the data have little more to tell us on that score. With Ron Paul polling consistently at between 4 and 5 points, there are at least 10% of voters yet to make up their minds. A strong surge could boost one of these front-runners from about 30% to near 40% and a very significant win. An even split of undecideds will make for a close finish with both around 33-35%.

Both Romney and McCain have been gaining ground in Florida, but Romney's rise has been consistently sharper than McCain's. Moreover, there is little evidence that McCain has enjoyed a post-South Carolina spurt in Florida. Nor does Romney's Nevada win seem to have helped him beyond his already considerable upward trend.

As the Florida race narrowed to a two way contest for first place, Giuliani and Huckabee have subsided into the 12-15 percent range. They too could go either way for 3rd and 4th, though at the moment Giuliani has a small advantage for third place. As mentioned above, Ron Paul trails with 4-5% for fifth place.

The chart below shows how much the polls in Florida have overlapped since the South Carolina primary and Nevada caucus 10 days ago. While the trend estimates are slightly distinct, the huge overlap of polls for Romney and McCain shows that we should remain quite uncertain as to who is "really" ahead. Sometime Tuesday night, we'll find out.

Sunday, January 27, 2008

South Carolina Poll Errors

The polls had a bad day on Saturday in grossly underestimating the support for Barack Obama, though they nailed the Clinton and Edwards votes quite well. Not one poll came within the ten-ring, and the final poll of the primary understated Obama's vote by nearly 15 points. Several polls flirted just inside the 20-ring, and one hapless example of the consequences of poor question wording, the Clemson University poll, understated Obama support by nearly 30 points. The Clemson poll allowed 36% to remain "undecided", hopelessly biasing downward their estimates of candidate support, but especially so for Obama.

My colleague Mark Blumenthal has posted a nice comparison of these South Carolina results with those "terrible" polls from New Hampshire. Below is the same poll error chart for New Hampshire, but scaled the same as the one for South Carolina above.

The New Hampshire results were mostly inside the 10 ring and all were inside the 15 ring. And a couple even touched the 5 ring. Judged by distance from the bullseye, New Hampshire doesn't look that bad, certainly not compared to South Carolina.

But there is another difference, and this is where New Hampshire was terribly wrong and South Carolina not so bad: All but one of the New Hampshire polls had the wrong leader. None of the South Carolina polls, not even Clemson's, got the leader wrong.

So while the distance from the bullseye was quite a bit worse in South Carolina, the creation of confounded expectations was not. It was the expectations that were created and then confounded that make New Hampshire a polling disaster, while there has been little said about the polling errors in South Carolina. (Except here, where we care about such things all the time!)

The other interesting comparison is the parallel that the number 2 finisher in both South Carolina and New Hampshire was quite well estimated. The SC polls got Clinton within normal margin of error. And the New Hampshire polls also got the 2nd place finisher there, Obama, within reasonable error.

The problem in both cases is in the substantial underestimate of the first place finisher's vote. The final choices of late deciding voters are a challenge for all polling, and perhaps especially so in primaries where there is no "party identification" to come home to if you can't make up your mind. In New Hampshire the Clinton win rested on significantly more voters supporting her than expected. In South Carolina it was the magnitude of the victory, rather than first place itself, that confounded the polling.

Increases in voter turnout in this cycle may be part of the story (a 75% increase in South Carolina), but here we see those late deciders breaking for different candidates, and yet in both cases for the ultimate winner. Second place results may on average be slightly low compared to the polls, but the first place "bonus" seems quite strong. At least for the Democrats. In the Republican South Carolina primary, both first and second place finishers were a bit underestimated, so there was not the same asymmetric error for first place. The New Hampshire Republican race also about equally understated the votes for first and second. The relatively lightly polled Michigan Republican race shows a somewhat greater underestimate of first place (Romney) than of second place (McCain). And in Nevada, with only 3 late polls, Romney was dramatically underestimated, while Ron Paul finished second but was only moderately underestimated.

So perhaps these reflect pollsters' difficulty in discerning the likely behavior of undecided voters, or perhaps these are last minute decisions to vote by "not-so-likely" voters who are screened out of the sample but who turn out for the ultimate winner in larger than expected numbers.

Turning to 2nd and 3rd place, the chart below shows that the polls had a pretty good day predicting the Clinton and Edwards votes. Despite some chatter about a late Edwards surge and a Clinton fall (including some evidence in our sensitive trend estimates that such a movement was occurring) most of the late polls were within the five-ring for 2nd and 3rd place, and all got the order of finish right.

Friday, January 25, 2008

South Carolina Democratic Endgame

The South Carolina polling continues to show a substantial lead for Obama, while Edwards' rise hints that he could challenge Clinton for second place.

At the moment, Clinton continues to hold a six-point advantage over Edwards, but Edwards has been rising while Clinton has been moving down. Obama, meanwhile, has been fairly steady at around 40-44% support, though with some hint of a small decline in the sensitive estimator. Note however that the Clemson University poll included here had an amazing undecided rate of 36%. That makes every candidate in their poll look lower than in all other polls that have a much lower rate of undecided. The level of undecided is quite sensitive to how the poll is conducted, including whether respondents are pushed as to whether they "lean" towards a candidate. The Clemson poll apparently didn't push at all among undecided voters. We'd be making a mistake to read their data as indicating a decline of support for anyone.

A second place for Edwards would, of course, be good news for his campaign, while Clinton would no doubt argue she had conceded the state to account for a third place finish. But Edwards still has some ground to make up, and late deciding voters remain an unknown-- if they are unhappy with either Clinton or Obama, Edwards can benefit simply by not being one of them. This may be especially true among independents who vote in the Democratic race, and the expected handful of Republicans who show up. (Republicans and independents can vote in the Democratic primary only if they did NOT vote in last week's Republican primary.)

I think the more compelling story of South Carolina will be the exit poll results. Obama has appealed to white voters in previous primaries and caucuses. The pre-election polls have found him getting as low as 10% of the white vote in South Carolina. The potential for racial polarization in this Southern state could damage his ability to transcend race as a basis of voting. Paradoxically, there has been speculation that Clinton can win the votes of black women, a result that could reduce polarization in the exit poll. We'll know much more about how voters decided by Saturday night.

Monday, January 21, 2008

South Carolina and Nevada Poll Errors

Polling for the South Carolina Republican primary mostly got the winner right, but large undecided percentages prior to the election accounted for a general underestimate of the final vote for both McCain and Huckabee. Late polling also found the race closer than the eventual outcome.

Polling for third and fourth place was less successful in detecting Fred Thompson's final strength and the drop of support for Mitt Romney, who abandoned South Carolina to spend time in Nevada, where he scored a strong first place finish. While almost all the polls finished inside the "5-ring", correctly seeing a close fight for the 3-4 spots, all but one poll got the order of Thompson-Romney wrong (and the one that got that order right substantially missed the magnitude of both votes).

In the Nevada Republican caucuses the polls wildly underestimated Romney's final strength of 51% of the vote. (Note that the rings here have to be rescaled to include the very large errors.) Even the best of the three Nevada polls was more than 15 points off on Romney. The earliest of the three polls, taken before Romney's win in Michigan, was over 30 points low.

McCain and Huckabee ignored Nevada, essentially conceding the state to Romney, but the polling still failed to pick up the magnitude of his support there.

The polling likewise failed to capture Ron Paul's second place strength. Preelection polls put Paul at about 7% compared to his finish of 13.7%.

In terms of erroneous expectations, the polling also put McCain well ahead of Paul, uniformly getting the 2nd and 3rd place finishers wrong.

On the Democratic side, the final poll was inside the "10-ring", and the polling improved as the caucus approached. Here the surprisingly poor showing of John Edwards and the Democrats' caucus reallocation rules for non-viable candidates helped boost the final percentages away from the polls. The three most recent polls all got the order of finish correct.

Friday, January 18, 2008

South Carolina Republican Endgame

South Carolina is looking quite interesting for the Republicans. There has been a decent amount of polling since Iowa, with Huckabee getting a brief bounce but subsequently subsiding a bit, while McCain has gained since the first of the year. There is little evidence that Romney benefited significantly from his Michigan win. Fred Thompson has risen a bit, based on the sensitive estimator, but still trails in fourth place.

If we try to pick these data apart a bit, the sensitive red estimator is trying hard to fit Huckabee's varying fortunes. His December rise and Iowa bump are picked up, but if those were real then so is the decline we see to his current level of about 22%. Whether he has been trending up, down, or flat depends on how wide your view of the polling is. The blue estimator has the longest run view, and still sees him improving over his showing in 2007, but that was ages ago politically. The sensitive red estimator is showing a downturn since Iowa, but if you squint hard and ignore the earliest post Iowa polls you might believe you can still see some rise in the last week or 10 days. But do note the bottom line: the two different trend estimates still put his current support at between 21.6% and 23.3%. So while they disagree on immediate trends, they end up close to the same bottom line.

For McCain, there is little dispute that he has surged from the low teens in early December to somewhere in the mid-to-upper 20s today. The sensitive estimator thinks the rate of climb since Iowa has been more rapid than does the blue estimator, but again both put his support between 26.9% and 29.3%.

One big question in South Carolina is whether conservative criticism of both Huckabee and McCain is having any effect. If Thompson is benefiting from that, his polls only modestly show it. The sensitive estimate suggests a rise from about 10% to about 14%, but there is no polling evidence for a surge that would allow him to compete for first place.

Finally, Romney's Michigan win seemed to help him in Nevada (based only on 3 polls, I should add) but there is no evidence of a bounce in South Carolina. After spending Wednesday and part of Thursday in the state, Romney appeared to concede the race and moved on to Nevada to campaign, where his chances look better. The Romney trends are also in complete agreement: No substantial trend, and both agree on 16%.

One big warning here. South Carolina has shown surprising numbers of undecided. The Fox poll today, for example, has an unbelievable 19% undecided. That is HUGE. And, if you add up the trend estimates for the four top candidates here, you get... 81%, leaving the same 19% unallocated. In Fox's poll, less than 10% pick Paul, Giuliani or Hunter combined, so there is a gigantic amount of room for those last minute deciders to break one way or the other.
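The "add them up" arithmetic can be checked in a few lines of Python. One assumption to flag loudly: the post quotes ranges for McCain (26.9% to 29.3%) and Huckabee (21.6% to 23.3%) without saying which end belongs to which estimator. The sketch below assumes the sensitive values are 29.3 for McCain and 21.6 for Huckabee, together with the roughly 14% for Thompson and 16% for Romney mentioned above.

```python
# Assumed sensitive trend estimates in percent (see the caveat in the text above).
SENSITIVE_TRENDS = {
    "McCain": 29.3,    # assumed to be the sensitive end of the 26.9-29.3 range
    "Huckabee": 21.6,  # assumed to be the sensitive end of the 21.6-23.3 range
    "Romney": 16.0,
    "Thompson": 14.0,  # "about 14%" in the text
}


def allocated_total(trends):
    """Combined support of the listed candidates, rounded to a whole percent."""
    return round(sum(trends.values()))


def unallocated(trends):
    """Whole percent of the vote not assigned to any listed candidate."""
    return 100 - allocated_total(trends)


print(allocated_total(SENSITIVE_TRENDS), unallocated(SENSITIVE_TRENDS))
```

Under those assumptions the four estimates sum to 80.9, which rounds to 81 and leaves 19 points unallocated.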

With a history of under-the-radar negative campaigning in the state, and evidence of the same this year, it would be surprising if we don't see some important shifts here at the end.

In New Hampshire and Michigan, those late shifts have benefited the first place finisher disproportionately.

This chart is an attempt to illustrate the variation in the polling over the past week. (A comment raised some good questions about it, so let me try to explain it better.) The horizontal axis is the current sensitive trend estimate for each of the top four candidates. The vertical axis is the actual poll results since the Michigan primary. The point is that the sensitive trend puts the order as McCain, Huckabee, Romney and Thompson. But the vertical spread of the points for each candidate shows how much variation we've seen from poll to poll. While most of the McCain polls are higher than most of the Huckabee polls, there is some overlap. Likewise most of Huckabee's results are higher than Romney's. But Romney and Thompson show considerable overlap. The less the overlap, the more reasonable it is to believe the separation between candidate trends is reliable, and the more overlap the greater the uncertainty. If all the polls showed exactly what the trend estimate shows, then the points would all be on the diagonal line, and there would be no disagreement at all. The fact that most of McCain's polls are below the 45 degree line shows that he has been trending up, while Huckabee's points are mostly above the diagonal, consistent with his recent downward trend. Romney and Thompson are about equally above and below the diagonal, showing little trend over the last eight polls since Michigan.

But the simple point is that the more spread you see vertically, the more uncertainty. And the more overlap between pairs of candidates, the more uncertainty.
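If you would rather count than squint, the two ideas in this chart (points above or below the diagonal, and overlap between candidates) can be put into numbers. This is only a sketch: the poll values are invented for illustration, and the overlap measure (the share of poll pairs in which the trailing candidate matches or beats the leader) is one simple choice of mine, not something used to build the chart.

```python
def share_above_trend(polls, trend):
    """Fraction of a candidate's polls that sit above the diagonal (poll > trend)."""
    return sum(p > trend for p in polls) / len(polls)


def overlap_fraction(leader_polls, trailer_polls):
    """Fraction of (leader, trailer) poll pairs where the trailer is at or above the leader."""
    pairs = [(lead, trail) for lead in leader_polls for trail in trailer_polls]
    return sum(trail >= lead for lead, trail in pairs) / len(pairs)


# Invented numbers, for illustration only.
rising_polls, rising_trend = [25.0, 27.0, 28.0, 30.0], 29.0      # mostly below trend
falling_polls, falling_trend = [21.0, 23.0, 24.0, 26.0], 22.0    # mostly above trend

print(share_above_trend(rising_polls, rising_trend))    # 0.25
print(share_above_trend(falling_polls, falling_trend))  # 0.75
print(overlap_fraction(rising_polls, falling_polls))    # 0.0625
```

A candidate who is trending up has older, lower polls sitting under the current trend estimate, so the share above the diagonal is small. An overlap fraction near zero means the ordering of the two candidates is reasonably secure, while a value near one half means the polls cannot separate them.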

For me, that 19% undecided in the Fox poll is scary. And fun!

Nevada Endgame

The Nevada caucuses are upon us, but the polling is scant. As the graphs make clear, we have only four polls since New Years, so the trend estimates here should be taken with more than the usual grain of salt. With just 4 polls, the sensitive red estimator is going to try hard to come close to all the polls, making Clinton and Edwards look like they are experiencing huge trends. Obama's four polls are more clustered, so the red trend is a little better behaved, but still tries hard to find a trend when there truly isn't enough data to support one.

Also, there was no Nevada polling between December 3 and this week, so we have NO IDEA what happened during that time. Both trends are fit to all the data (including polls in 2007 not shown in the plots.) So the blue line is the best conservative guess given all the data in 2007 plus the four polls in 2008. The red line sensitive estimator is, in this case, I think hopeless.

If you force me to choose, I'd take blue in this case. With so little data, you want to be conservative. But you can see the poll-to-poll variation is large for Clinton and Edwards. Perhaps a lot has been changing this week, but you can't be sure it isn't just pollster variation.

And one more caution: as a caucus with very low expected turnout, polling Nevada is at least as perilous as Iowa, if not more so.

For the Republicans, we have only three polls taken this week. The only good news here (statistically speaking) is that McCain, Huckabee and Thompson are all tightly clustered without the large variation we saw for the Dems. Romney, on the other hand, shows a big gap between the first poll, completed BEFORE his win in Michigan and the two polls taken since then. Looks like a bump.

The lack of wild variation lets both trends come closer to the polls, but again just three polls is a ridiculously small number to base much on. I'd stay conservative with Blue here.

It is ironic that where uncertainty is intrinsically great, because of low turnout and the importance of organization, we have also magnified that uncertainty with so few polls. Makes it hard to be an odds-maker in Nevada.

Thursday, January 17, 2008

An Emerging Republican Consensus? Can it be?

Four events. Three winners. South Carolina and Nevada up in the air. Can there possibly be any reason to think Republicans are settling on a consensus candidate?

Strange as it seems, there is one clearly emerging candidate---he who was declared dead in July: John McCain. (See my premature obituary for McCain here.)

Given McCain's losses in Michigan and Iowa, his one New Hampshire victory is hardly a reason to believe he is an emerging consensus. But the trends in polling across the nation and nine early states point to McCain as the unique candidate in the Republican field who has been strongly rising across all states. Further, his rise is not a "bump" from an early win: the rise predates the primaries, a crucial point.

While McCain does not lead in all these states (and has now lost two of three) his polls now put him in the competitive range in each of these states.

The most striking and compelling feature of the chart is the simultaneous upturn across all states, especially following the long term decline in McCain's support over most of 2007. Why McCain, and why then?

Part of the answer must be the fall of Rudy Giuliani in the fourth quarter of 2007.

After leading in national polls throughout the first three quarters, Giuliani's support took a sharp turn downward in the late fall, closely associated with the timing of the indictment November 8th of his long time friend, partner and associate, Bernard Kerik. (I also think failing to compete in early primaries, and then doing quite badly, is a contributing recent cause of Giuliani's decline. Late win strategies do not have a good track record... ask John Connally.)

McCain's rise comes after Giuliani's decline begins. Given that both candidates appeal more to moderate and somewhat conservative Republicans (as opposed to the conservative base of the party) it is likely that these voters turned from Giuliani and found McCain the most attractive among the remainder of the field.

McCain also shares with Giuliani the advantage of perceived "electability". As Giuliani's fortunes fell, McCain emerged as the candidate Republicans see as having the best chance of defeating any Democrat in November. In primaries, perceived viability is an important asset.

We can also see support for McCain surge in favorability ratings among Republicans. In the Diageo/Hotline national poll taken January 10-12, McCain scores a remarkably high 78% favorable rating among Republicans (with 31% very favorable). Giuliani is at 67% (with only 15% very favorable), Mitt Romney at 55% and Mike Huckabee at 53%. Granted this poll's timing reflects the New Hampshire win and not the Michigan loss, these are still notably high ratings for a candidate who has often alienated important elements of the Republican party.

McCain led the vote choice among Republican primary voters in the Diageo/Hotline survey by 32% to Huckabee's 17% and Romney's 15%. (McCain also led in the Pew poll 1/9-13 by 29%-20% over Huckabee, and in Gallup/USAToday by 33%-19%. Again, none of these reflect Michigan's impact.)

On electability, McCain was rated most likely to defeat the Democrat by 42% to 17% for Giuliani in the Pew poll. That reversed the result from Pew's November survey that had Giuliani most likely to win by 45% to 16% for McCain.

Among Republican primary voters, more also see McCain as most likely to win the Republican nomination, regardless of their own preference. Diageo/Hotline has McCain with 82% saying likely to win, 25% very likely and 57% somewhat likely. Compare Huckabee at 56% (6%/50%), Romney 53% (6%/47%) and Giuliani 45% (7%/38%).

Also surprising, given McCain's testy relations with so many Republican groups, is the relatively small number who would refuse to vote for him. McCain suffers only 9% of Republicans who would never vote for him in the Pew poll. Huckabee is at 8%, Giuliani at 15% and Romney at a devastating 20%.

Since his nadir in November, McCain has achieved a remarkable recovery among Republican voters. Perceptions of him have changed substantially in the last three months, both in favorability and in electability as well as in support. And those changes are reflected across all of these states and the national polling as well. That is hard to argue with.

But perhaps we still should argue a bit. There are two things that may yet halt the McCain victory. One is the opposition of important organized groups within the Republican party. While the rank and file may have come over to McCain, those groups bitterly opposed to campaign finance reform remain adamant in their opposition, and they wield great influence among party elites as well as some grass roots organizations. The second barrier is the calendar which looks to pit McCain against Huckabee in South Carolina on Saturday. (Romney has apparently conceded South Carolina in favor of a Nevada effort. Fred Thompson has yet to rise in South Carolina polling despite a surprisingly animated debate performance.) South Carolina was McCain's Waterloo in 2000, and its large religious base should be Huckabee's best shot at a win since Iowa. A McCain win can spur his campaign on, but a loss can shake the progress he's made with voter perceptions of his support within the party. (I should add a third threat: McCain's mouth. The "straight talk" he prizes has gotten him in trouble before, and can again.)

A McCain nomination will certainly shake many powerful elements of the party. Whether they can prevent that depends in part on the development of a common choice among the alternatives to McCain. Romney remains anathema to 20% of Republicans. Huckabee is opposed by economic conservatives. Thompson has failed to emerge, and Giuliani has other problems. So while many Republican groups despise John McCain, it is not clear they can unite in embracing one of the alternatives who has also proven attractive to voters.

The candidate profile most like McCain's is Huckabee's. His sharp rise in November and December was similar in appearance, covering a number of states and national polls. But Huckabee's rise stalled in most states and has taken a fall generally since mid-December. His poor third place finishes in New Hampshire and Michigan dampened any Iowa momentum, and he must now win South Carolina based on overwhelming support among conservative Christians to remain in competition. He probably couldn't have a better state to try to pick up that win, but it is a make or break opportunity.

Romney's win in Michigan was a life preserver, but not necessarily a life saver. The Romney decision to concede South Carolina must be bitter given the substantial spending he committed to the state. It is also proof that he has failed to overcome opposition from Christian evangelicals who remain very reluctant to embrace a candidate of the Mormon faith.

Still, Romney's poll profile is at least generally upward sloping. It lacks the amazing coherence of McCain's (or even Huckabee's.) One can still imagine Romney picking up wins in the trench warfare of February 5th, and making the delegate count a serious battle after. But he does not appear to have unified his support across a variety of states.

And Fred Thompson remains the disappointment of the year. In paper qualifications, both in office holding and in ideology, Thompson's campaign came from central casting. But the actor failed to grow into the role. Here the uniformity of polling is a uniform decline across all states since the peak, just before the official announcement in September.

So when we survey the Republican field, only one trend stands out unambiguously and uniformly across states: McCain's rise and emergence as the only candidate doing better and better almost everywhere.

In July I said a McCain recovery would be a miracle of Biblical proportions. Now I've seen, yet I still find it hard to believe.

Wednesday, January 16, 2008

Michigan Poll Errors

(Darker points are polls taken closer to primary date, lighter points were taken earlier. All polls were completed after the New Hampshire primary.)

The Republican primary polls for Michigan did better than those for New Hampshire, but generally underestimated support for Mitt Romney. Only one poll got both Romney's and McCain's vote within the five point error ring.

All three polls completed on January 14th, the day before the primary, got McCain's vote within a small error, but the Romney errors ranged from small to quite large for these last polls. Earlier polling was generally further from the final vote, suggesting that a trend to Romney was partially captured by the late polls, but only imperfectly.

For the second and third place finishers, the polls were quite good for McCain vs. Huckabee, with all three final polls inside the five-ring and all eight polls inside the 10 ring.

The pattern of underestimating the winner's percentage while doing quite well on the 2nd and 3rd place finishers suggests that much of the undecided category in the surveys eventually went to Romney, boosting his total beyond the support registered in the poll, while barely adding to the 2nd and 3rd place totals.

Monday, January 14, 2008

Michigan Endgame

Michigan for the Republicans is a critical test for Mitt Romney. After two "silvers" in Iowa and New Hampshire (and a mini-gold in Wyoming which got little attention or credit) Romney badly needs a win in his original home state. With a win, Romney can become the third winning Republican going into the Nevada and South Carolina events this weekend.

But John McCain also needs a Michigan win. Coming back from the dead to win New Hampshire was a huge achievement for McCain, but he needs to prove he is for real outside of his best state from 2000. That year he also won Michigan, thanks in large part to Democratic cross-over votes. This year Democrats are again free to cross over for McCain, thanks both to open primary rules and the fubar Michigan Democratic primary stripped of all meaning by breaking party rules to move ahead in the voting. With no meaningful Democratic vote, those Dems who supported McCain eight years ago are again free to do so this time. Whether or not that happens is just one more nightmare problem for pollsters: how many Democrats will in fact vote in the Republican primary?

And Mike Huckabee could certainly use a strong finish to show that Iowa wasn't his first and last hurrah.

The numbers look not so good for Huckabee. There is the barest hint of a post Iowa bump for him. Rather his gains in November and December seem to be all the rise he's gotten in Michigan. Since Iowa, Huckabee's poll support in fact seems to be falling, as captured by the sensitive red estimator (but not the slow-to-change blue line.) Whether blue at 17% or red at 14%, Huckabee looks likely to be a distant third in Michigan.

McCain and Romney on the other hand are neck and neck and the polling variation is so large there is no way to declare either a leader. Romney has the slightest of leads in the sensitive estimator at 26.1% to McCain's 25.6%, but that difference is meaningless given the spread in polling. McCain has clearly picked up considerable support since Iowa and New Hampshire, but so has Romney. McCain appears to be gaining more rapidly but with so little time since New Hampshire it is impossible to get a reliable estimate of the rate of gain in the last 5 days.

The bottom right panel of the chart shows clearly how uncertain the top two spots in Michigan are. The blue Romney dots mix in with the red McCain dots, overlapping so much that there is clearly no reason to think one is ahead of the other. Even some Huckabee results are within the range of Romney and McCain support.

Most of Huckabee's higher polls are older, and his recent downward trend means that more of his polls are above his current trend estimate. For Romney and McCain, the polls are evenly scattered above and below trend.

Republicans have plenty of reason to turn out Tuesday. They can help make or break the candidacies of McCain or Romney. Those pesky independents and Democrats remain the huge unknown. See Mark Blumenthal's analysis of the recent polling and how many Democrats are included in the various samples for a good look at how squishy that number is.

Wednesday, January 09, 2008

Polling Errors in New Hampshire

Hillary Clinton's stunning win over Barack Obama in New Hampshire is not only sure to be a legendary comeback but equally sure to become a standard example of polls picking the wrong winner. By a lot.

There is a ton of commentary already out on this, and much more to come. Here I simply want to illustrate the poll errors, which show the nature of the problem and help clarify the issues. I'll be back later with some analysis of these errors, but for now let's just see the data.

In the chart, the "cross-hairs" mark the outcome of the race, 39.1% Clinton, 36.4% Obama. This is the "target" the pollsters were shooting for.

The "rings" mark 5%, 10% and 15% errors. Normal sampling error would put a scatter of points inside the "5-ring", if everything else were perfect.

In fact, most polling shoots low and to the left, though often within or near the 5-ring. The reason is undecided voters in the survey. Unless the survey organization "allocates" these voters by estimating a vote for them, some 3-10% in a typical election survey are left out of the final vote estimate. Some measures of survey accuracy divide the undecided, either evenly across candidates or proportionately across them. There is good reason to do that, but that is a topic for another post. What the pollsters publish are (almost always) the unallocated numbers, so it seems fair to plot here the percent of the vote the pollster published, not one with undecided reallocated.
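For readers who want to see what the two allocation rules mentioned above do to a poll, here is a minimal sketch. The function names and the poll numbers are hypothetical, invented for illustration only; this is not the method used for the charts, which plot the unallocated numbers.

```python
def allocate_even(shares, undecided):
    """Split the undecided share equally among the listed candidates."""
    bonus = undecided / len(shares)
    return {name: pct + bonus for name, pct in shares.items()}


def allocate_proportional(shares, undecided):
    """Split the undecided share in proportion to each candidate's support."""
    total = sum(shares.values())
    return {name: pct + undecided * pct / total for name, pct in shares.items()}


# Hypothetical poll (made-up numbers, not from any real survey).
poll = {"A": 40.0, "B": 30.0, "C": 20.0}
undecided = 10.0

for label, rule in (("even", allocate_even), ("proportional", allocate_proportional)):
    allocated = rule(poll, undecided)
    print(label, {name: round(pct, 1) for name, pct in allocated.items()})
```

Either rule moves every candidate up and to the right, toward the target. Proportional allocation helps the leader most, while even allocation gives the largest relative boost to trailing candidates.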

What we see for the Democrats is quite stunning. The polls actually spread very evenly around the actual Obama vote. Whatever went wrong, it was NOT an overestimate of Obama's support. The standard trend estimate for Obama was 36.7%, the sensitive estimate was 39.0% and the last five poll average was 38.4%, all reasonably close to his actual 36.4%.

It is the Clinton vote that was massively underestimated. Every New Hampshire poll was outside the 5-ring. Clinton's trend estimate was 30.4%, with the sensitive estimate even worse at 29.9% and the 5 poll average at 31.0%, compared to her actual vote of 39.1%.
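As a rough check on the three summary estimates quoted above, one can compute how far each lands from the cross-hairs. This sketch assumes the rings are circles, so that the "error" is the straight-line distance in percentage points between the estimate and the outcome; that reading of the chart, and the function names, are assumptions made for illustration.

```python
import math

# Actual New Hampshire Democratic result, as given in the post.
ACTUAL = {"clinton": 39.1, "obama": 36.4}


def miss_distance(clinton, obama, actual=ACTUAL):
    """Straight-line distance (percentage points) from the cross-hairs."""
    return math.hypot(clinton - actual["clinton"], obama - actual["obama"])


def ring(distance, rings=(5, 10, 15)):
    """Smallest ring containing the point, or None if outside all rings."""
    for r in rings:
        if distance <= r:
            return r
    return None


# (Clinton, Obama) estimates quoted in the post.
estimates = {
    "standard trend": (30.4, 36.7),
    "sensitive trend": (29.9, 39.0),
    "last five polls": (31.0, 38.4),
}

for name, (clinton, obama) in estimates.items():
    d = miss_distance(clinton, obama)
    print(f"{name}: {d:.1f} points from target, ring = {ring(d)}")
```

Under this assumption nearly all of each miss comes from the Clinton axis, which is the point of the paragraph above.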

So the clear puzzle that needs to be addressed is whether Clinton won on turnout (or Obama's was low), or whether last-minute decisions broke overwhelmingly for Clinton. Or whether the pollsters' likely voter screens mis-estimated the makeup of the electorate. Or whether the weekend hype led to a feeding frenzy of media coverage that was very favorable to Obama and very negative towards Clinton, which depressed her support in the polls but oddly did not lower her actual vote.

On the Republican side we see a more typical pattern, and with better overall results. About half of the post-Iowa polls were within the 5-ring for the Republicans, and most of the rest within the 10-ring.

As expected, errors tend to be low and left, but the overall accuracy is not bad. This fact adds to the puzzle in an important way:

If the polls were systematically flawed methodologically, then we'd expect similar errors with both parties. Almost all the pollsters did simultaneous Democratic and Republican polls, with the same interviewers using the same questions with the only difference being screening for which primary a voter would participate in. So if the turnout model was bad for the Democrats, why wasn't it also bad for the Republicans? If the demographics were "off" for the Dems, why not for the Reps?

This is the best reason to think that the failure of polling in New Hampshire was tied to swiftly changing politics rather than to failures of methodology. However, we can't know until much more analysis is done, and more data about the polls themselves become available.

A good starting point would be for each New Hampshire pollster to release their demographic and cross tab data. This would allow sample composition to be compared and for voter preferences within demographic groups to be compared. Another valuable bit of information would be voter preference by day of interview.

In 1948 the polling industry suffered its worst failure when it confidently predicted Truman's defeat. In the wake of that polling disaster, the profession responded positively by appointing a review committee which produced a book-length report on what went wrong, how it could have been avoided and what "best practices" should be adopted. The polling profession was much the better for that examination and report.

The New Hampshire results are not on the same level of embarrassment as 1948, but they do represent a moment when the profession could respond positively by releasing the kind of data that will allow an open assessment of methods. Such an assessment may reveal that in fact the polls were pretty good, but the politics just changed dramatically on election day. Or the facts could show that pollsters need to improve some of their practices and methods. Pollsters have legitimate proprietary interests to protect, but big mistakes like New Hampshire mean there are times when some openness can buy back lost credibility.

Tuesday, January 08, 2008

New Hampshire Pre- and Post-Iowa

Here is one more way to look at the impact of Iowa. The plot shows each poll (tracking polls are included only when their samples don't overlap, e.g. every 3rd day for a 3-day track). Polls ending on the same day are separated in the chart, though the order is arbitrary within a day.
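The rule for tracking polls can be sketched in a few lines. This is only an illustration of the non-overlap idea, assuming each daily release covers a fixed number of interview days ending on its release date; the function name and the sample data are hypothetical.

```python
from datetime import date, timedelta


def non_overlapping(releases, window_days):
    """Keep tracking releases whose interview windows do not overlap.

    `releases` is a list of (end_date, result) pairs, one per daily release.
    Each release is assumed to cover the `window_days` days ending on end_date.
    """
    kept = []
    last_end = None
    for end, result in sorted(releases, key=lambda r: r[0]):
        start = end - timedelta(days=window_days - 1)
        # Keep a release only if its first interview day falls after
        # the last interview day of the previously kept release.
        if last_end is None or start > last_end:
            kept.append((end, result))
            last_end = end
    return kept


# Hypothetical 3-day tracking poll released daily (made-up numbers).
track = [(date(2008, 1, d), 30 + d) for d in range(1, 8)]
for end, result in non_overlapping(track, window_days=3):
    print(end, result)
```

This version keeps the earliest release and works forward. Anchoring on the most recent release and working backward would be an equally reasonable choice and would select a different set of days.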

McCain had already captured the lead over Romney by the Iowa caucus. While three post-Iowa polls still find a Romney lead, the majority of polls put McCain in the lead. Moreover, the height of the bars is quite similar before and after Iowa, indicating little effect.

That lack of change is interesting since Romney might have been expected to fall due to his disappointing second place finish in Iowa. McCain's 4th place in Iowa wasn't read as a "loss", in the bizarre expectations game. But regardless of their finishes, Iowa seems to have not moved the New Hampshire Republican electorate at all.

On the Democratic side, the large pro-Obama bounce is obvious. If it didn't show up in the first day or two post-Iowa, it clearly appears after that. A Clinton lead of some 10-15 points became an Obama lead of some 8-10 points.

Last Day of the New Hampshire Endgame

The last New Hampshire polls are in, and there is little change of trend from Monday. The Obama rise continues to look steady and strong. The sensitive trend estimator in red continues to fit the sharp turn in Obama support better than the slower-to-change blue estimator, though both end up close to one another in estimating current support.

The current estimates are 39.0% for Obama and 29.9% for Clinton. If you like the more stable standard blue estimate, the numbers are 36.7% and 30.4% respectively.

There is the tiniest of hints that Romney's decline and McCain's surge have both flattened just a bit in the sensitive red estimator, though the actual difference between the red and blue trend estimates is trivial for each candidate.

The sensitive red trend puts McCain at 33.4% and Romney at 27.5%, while the standard blue estimate has it 34.2% to 27.5%.

The first bit of suspense tonight will be whether McCain succeeds in holding a lead over Romney. The trend data say yes, but there is some considerable variation in the McCain poll results.

On the Democratic side, the question should be the size of the Obama win.

Edwards looks to be a distant third, a result that should be damaging to his campaign to emerge as one of the two "change" candidates.

Huckabee appears to have utterly failed to capitalize on his Iowa success. That is a big deal, in my view, because his campaign needed dramatic successes to bootstrap itself into a national effort. Iowa alone is not enough. Can he recapture the Iowa momentum in Michigan or Nevada or South Carolina in the face of a new look at McCain? (Or a reborn Romney should an upset happen in NH.)

The Thompson campaign gave up in NH, and Giuliani's collapse there just continues the question of whether his once high flying campaign can survive the early series of losses he now seems set for.

Some of these questions will be answered tonight.

Monday, January 07, 2008

New Hampshire Endgame

The New Hampshire endgame polling presents an interesting contrast. The Republican race shows virtually no hint of an "Iowa Bounce." The Democratic race, on the other hand, is showing a huge bounce for Obama and a drop for Clinton. Edwards is largely unaffected.

The charts also show the better performance of the sensitive red-line estimator when things are as dynamic as they have been since Thursday. The red estimator catches the upturn in Obama support pretty well, while the blue estimator is trying hard to keep up, but its "slow to change" nature means it totally misses the timing of the upswing.

If anyone were actually asking if there has been an Obama bounce, surely they are no longer asking.

The Republican side is a bit more sedate, probably because the leader there, John McCain, was hardly a top finisher in Iowa. The upward trend for McCain, and the downward one for Romney, predated the Iowa caucuses. At most the trends we saw earlier have largely continued. Huckabee appears to be the candidate without a bounce, in fact.

These are dynamics we've seen before when Iowa has had an impact. The short interval between Iowa and New Hampshire has been much debated. One side says it doesn't allow enough time for an Iowa bounce to be fully felt. I'm of the opposite opinion. The short interval maximizes the effect of Iowa by not allowing time for losers in Iowa to retool their approach and for "added scrutiny" of the Iowa winner to slow their climb. The issue for Clinton and Romney is how to halt what is beginning to look like disastrous slides. With more time between events they would be better able to recover. If New Hampshire is a second loss for both, then both campaigns have to find ways to recover by South Carolina.

Friday, January 04, 2008

Iowa 2008 Trend, Entrance Poll and Outcome

Last week I took a look at how the poll trend estimates did in the 2004 Iowa Democratic Caucus. This morning we have two new data points for comparison.

It has been a long day and night, so I won't say much tonight. We will look at the polls in more detail after a bit of sleep.

The bottom line in 2004 was that the polls under-estimated winners and over-estimated losers. (See the plot below for the 2004 comparison.) This year again the poll trend substantially underestimated the size of Obama's win. Clinton was quite well estimated, and Edwards did significantly better than the poll trend estimated.

The complex reallocation of preferences in the Democratic caucus also affected the entrance poll, which was quite close for Obama and Clinton but underestimated Edwards' final share of delegates.

The lower tier of candidates all finished below their poll trend estimates, though at such low levels of support that none of the errors are large.

On the Republican side, with a simpler form of voting at the caucus, the polls did a bit better, except again for substantially underestimating the winner, Mike Huckabee. Other candidates ended up with shares of the caucus vote pretty close to their poll trend estimates. Ron Paul did a little better than the poll trend and Giuliani a little worse.

Iowa Entrance Poll Results: Republicans

The entrance poll tables can be found at MSNBC here. The widths of the bars are proportional to the size of the groups.

Thursday, January 03, 2008