Monday, November 26, 2007

Zogby Internet Poll Trial Heats are Odd

A new Zogby Interactive poll, conducted using volunteers over the internet, has produced some odd results for trial heats involving Senator Clinton against all four top Republican opponents. What makes this especially odd is that the results are not equally unusual for Obama.

This poll was reported by Reuters' John Whitesides, who also reports on the Reuters-sponsored polling Zogby does by conventional telephone methods. The similarities in the reports make it hard to tell, but these results are apparently independent work by Zogby Interactive rather than part of the Reuters-Zogby polling partnership. Likewise, Zogby's website posts the results without mentioning who sponsored the work, so presumably Reuters did not.

The Zogby poll was conducted 11/21-26/07 with 9150 respondents who had agreed to take part in Zogby's online polling. This is not a normal random sample of the population. More on the technical issues below.

The hugely surprising result is that the Zogby poll finds Sen. Hillary Clinton losing to all four top Republicans in head-to-head trial heats. What makes that surprising is that Clinton LEADS all four of those Republicans, by between 3.8 and 11.6 points, in the trend estimates based on all other polling. Zogby also has Clinton losing to former Arkansas Governor Mike Huckabee by 5 points. There are too few Clinton-Huckabee trial heat polls from other organizations for me to compute a trend estimate for that comparison.

The chart above shows all the trial heat data from national polling and the estimated trend lines for each pairing. The data points for the new Zogby data are indicated in the charts as "Zogby Inet" in blue for Clinton and red for each Republican.

What is immediately clear is that the Zogby Clinton numbers are well below the estimated trend for Clinton in each of the four comparisons. Clinton is consistently 8-10 points below her trend estimate based on other polling.

In contrast, the Republican results are quite close to the trend estimate in most cases: Giuliani is at 43 in Zogby, with a trend of 44; Romney is at 43 in Zogby, 38.3 in trend; Thompson is at 44 in Zogby, 41.3 in trend; and McCain is at 42 in Zogby, 42.7 in trend. Those Republican numbers show about the kind of normal noise we see around the trend estimate, so they don't seem out of line.

Why then is Clinton so far down in comparison to other polls? The Reuters story doesn't note that these results are far from other polling, and instead uses the theme that Clinton is declining to frame these Zogby results:

The results come as other national polls show the race for the Democratic nomination tightening five weeks before the first contest in Iowa, which kicks off the state-by-state nomination battles in each party.

Some Democrats have expressed concerns about the former first lady's electability in a race against Republicans. The survey showed Clinton not performing as well as Obama and Edwards among independents and younger voters, pollster John Zogby said.

While this is certainly a theme of recent reporting, boosted by a pre-Thanksgiving ABC/WP poll showing Obama leading Clinton in Iowa, it is striking that no other poll has found recent results as far from the trend estimates as are Zogby's results and that the Reuters story fails to note that fact.

One answer to why Clinton does so badly MIGHT be that the poll has too few Democrats and thus biases its results. But if that were so, we'd expect Obama to also underperform his trend estimates. That doesn't happen, as the chart below makes clear.

The Zogby results for Obama are all quite close to his trend estimate from all polls:
Zogby has Obama at 46% vs Giuliani, while the trend puts him at 44.3. Against Romney Zogby has Obama at 46%, while trend says 46.6. Against Thompson Zogby has Obama at 47, while trend is 47.0, and against McCain Zogby has Obama at 45 while trend puts him at 43.4.

This is clearly not consistent with a general anti-Democratic bias in the Zogby Internet poll. It is also clear from the graph that the Obama pairings find the Republicans running quite close to their trend estimates, just as they did against Clinton.
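As a rough check on these comparisons, here is a minimal Python sketch that computes "Zogby minus trend" for the figures quoted in this post. The numbers come from the text above; the variable and function names are my own, and the mean absolute deviation is only a convenience summary, not part of the trend estimation itself.

```python
# (Zogby share, trend estimate) as quoted in this post.
# Republican shares are from the pairings against Clinton.
republicans_vs_clinton = {
    "Giuliani": (43, 44.0),
    "Romney": (43, 38.3),
    "Thompson": (44, 41.3),
    "McCain": (42, 42.7),
}
# Obama's own share in each pairing against the named Republican.
obama_vs_republicans = {
    "Giuliani": (46, 44.3),
    "Romney": (46, 46.6),
    "Thompson": (47, 47.0),
    "McCain": (45, 43.4),
}


def deviations(pairs):
    """Return Zogby minus trend for each pairing, rounded to 0.1 point."""
    return {name: round(zogby - trend, 1) for name, (zogby, trend) in pairs.items()}


def mean_abs(devs):
    """Average size of the deviations, ignoring their sign."""
    return sum(abs(d) for d in devs.values()) / len(devs)


rep_dev = deviations(republicans_vs_clinton)
obama_dev = deviations(obama_vs_republicans)

for label, devs in (("Republicans vs Clinton", rep_dev), ("Obama", obama_dev)):
    print(label, devs, f"mean |dev| = {mean_abs(devs):.1f}")
```

On these figures the Republican deviations run from -1.0 to +4.7 points and Obama's from -0.6 to +1.7, with mean absolute deviations of roughly 2.3 and 1.0 points. Both are far smaller than the 8-10 point shortfall reported for Clinton.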

(Trial heats against Edwards are not very common recently, so the Zogby results for him lack much polling for comparison.)

And so we are left with a puzzle: What is it about these respondents that so strongly affects Clinton's support but no one else's?

We can probably rule out one easy explanation: That Clinton has suddenly collapsed and Zogby is just the first to find it. The reason is internal to the Zogby result. If Clinton really has suddenly become 10 points less attractive, we'd expect all four Republicans paired against her to do BETTER than their trend estimates when facing her. But what happens is Clinton goes down and they don't do any better. That is hard to reconcile with a real change in Clinton's support. (A tortured version would say Clinton must have collapsed among Dems who now say they are undecided while refusing to move towards any of the Republicans. But that isn't usually what happens in real data when one candidate declines sharply. Usually the other moves up at least a bit, drawing not only from unhappy partisans but especially from independents who now are disenchanted with the former front-runner. So while you could make the math work with this story, it doesn't seem very well supported by the data.)

The Zogby Internet polling has a questionable track record in statewide races for Senate and Governor in 2006, where it often greatly overestimated the competitiveness of races compared to conventional phone polls taken at the same time. One way to make sense of those problems turns out not to help much here. It is reasonable that the people who volunteer to take political polls over the internet are considerably more interested in politics (and likely more strongly partisan) than is a random sample of likely voters. That should be expected to lead to fewer people with "don't know" responses, as better informed and more partisan respondents are likely both to know more about the candidates and to have made up their minds sooner than a proper random sample. That helps explain why Zogby's 2006 internet polls looked as they did.

But this does no good in Clinton's case. What we see is that MORE internet respondents are undecided about their vote between Clinton and the four Republicans than the trend estimates, based on less involved and less partisan phone samples, show. The Zogby undecided rates for the Clinton pairings are 20, 17, 17 and 16% (plus 17% undecided in the Huckabee comparison). The comparable undecided rates based on the trend estimates are 8.2, 12.8, 9.0 and 10.6. That is an average undecided rate of 17.5 in Zogby vs 10.15 in the trends. The Zogby undecided rate is also slightly lower for the Obama pairings than for Clinton's: 17, 13, 14 and 13, plus 14 for Huckabee. How could it be that a sample that is almost certainly more involved, knowledgeable and partisan can be LESS decided about Clinton, the single best known figure in the race? Again, a tortured story might be constructed, but I think a simpler explanation is that this result is not consistent within the Zogby data itself, or in comparison with outside polling.
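The averages quoted above are easy to reproduce. The following is a minimal sketch in Python using only the undecided rates listed in this post; the list names are my own, and the Huckabee pairings are left out because there is no trend estimate to compare them with.

```python
# Undecided rates (percent) for the four main pairings, as quoted in this post.
clinton_zogby = [20, 17, 17, 16]          # Zogby Interactive, Clinton pairings
clinton_trend = [8.2, 12.8, 9.0, 10.6]    # trend estimates, Clinton pairings
obama_zogby = [17, 13, 14, 13]            # Zogby Interactive, Obama pairings


def mean(values):
    """Simple arithmetic mean."""
    return sum(values) / len(values)


zogby_avg = mean(clinton_zogby)
trend_avg = mean(clinton_trend)
obama_avg = mean(obama_zogby)
gap = zogby_avg - trend_avg

print(f"Clinton pairings, Zogby undecided: {zogby_avg:.2f}")
print(f"Clinton pairings, trend undecided: {trend_avg:.2f}")
print(f"Zogby minus trend:                 {gap:.2f}")
print(f"Obama pairings, Zogby undecided:   {obama_avg:.2f}")
```

This gives 17.5 and 10.15 for the Clinton pairings, a gap of 7.35 points, and 14.25 for the Obama pairings. Since the shares in a pairing have to add up to 100, a shortfall for one candidate must appear either as a gain for the opponent or as more undecideds. Here most of Clinton's shortfall seems to land in the undecided column.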

Where does this leave us? Puzzled. If these results came from voting machines, I'd suspect that something in the ballot design or the recording mechanism caused a modest but consistent undercount of the Clinton support. The effect seems confined only to that one candidate, and not to any others, Democrats or Republicans. And there was no boost in support for the Republicans paired against Clinton. In this case, I'm similarly inclined to wonder whether the Zogby online survey had a glitch that caused a systematic "undervote" for Clinton. Certainly if my research assistant brought me these results, I'd want to check the software for mistakes before I published them.

Let's assume the Zogby organization has checked for any such possible mistakes or glitches and has ruled that out. (One would assume they were as surprised by the data as anyone and since their reputation is on the line, would have checked very carefully before releasing the data.) Is there any reasonable model of how candidate preferences are evolving that might explain this result, and the stability of Republicans paired against Clinton AND the stability of Obama support and that of his Republican pairings?

Without access to the raw data it is impossible to test any speculation here. But here is one possibility: Internet polls, presumably including Zogby's, use weighting to adjust for non-representativeness in their volunteer respondents. (There is a huge debate about whether this, and more sophisticated approaches, can produce generalizable population estimates with good statistical properties, but we'll leave that for another day.) Clinton has more support among women and somewhat older people. Both those groups are likely to be underrepresented in any pool of internet respondents. As a result the responses of those with these characteristics who ARE present in the sample are likely to be weighted up quite a bit to reach population proportions in the weighted sample. If the relatively few older women who are in the sample are ALSO atypical in other ways that both make them volunteer for internet surveys AND be less disposed to support Clinton than are non-internet volunteering older women, then weighting these respondents up won't properly capture Clinton's support and will lead to a systematic underestimate of her support.
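To make the weighting story concrete, here is a toy sketch in Python. Every number in it is hypothetical and invented purely for illustration; none comes from Zogby or any other poll, and the two-cell setup and names such as `Cell` and `weighted_estimate` are my own assumptions, not a description of Zogby's actual weighting procedure. It shows that weighting an underrepresented group up to its population share recovers the true figure only if the panel members in that group resemble the rest of the group.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    pop_share: float      # share of the target population (hypothetical)
    sample_share: float   # share of the volunteer sample (hypothetical)
    true_support: float   # candidate support in the whole cell (hypothetical)
    panel_support: float  # support among panel volunteers in the cell (hypothetical)


def weight(cell):
    """Weight that scales the cell from its sample share up to its population share."""
    return cell.pop_share / cell.sample_share


def weighted_estimate(cells):
    """Weighted mean of panel support across cells."""
    num = sum(c.sample_share * weight(c) * c.panel_support for c in cells)
    den = sum(c.sample_share * weight(c) for c in cells)
    return num / den


def true_support(cells):
    """Population-level support implied by the hypothetical cell values."""
    return sum(c.pop_share * c.true_support for c in cells)


# Scenario A: older women on the panel look like older women in general.
typical = [
    Cell("older women", 0.25, 0.10, 0.60, 0.60),
    Cell("everyone else", 0.75, 0.90, 0.45, 0.45),
]
# Scenario B: older women on the panel are atypical and less supportive.
atypical = [
    Cell("older women", 0.25, 0.10, 0.60, 0.45),
    Cell("everyone else", 0.75, 0.90, 0.45, 0.45),
]

for label, cells in (("typical panel", typical), ("atypical panel", atypical)):
    print(label,
          f"truth = {true_support(cells):.4f}",
          f"weighted = {weighted_estimate(cells):.4f}")
```

With these made-up inputs the older-women cell gets a weight of 2.5. In scenario A the weighted estimate matches the true value of 48.75%. In scenario B it comes out at 45%, an underestimate of 3.75 points, even though the weights are exactly the same. The weights correct the composition of the sample, not the opinions of the people in it.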

That could do it, but it sounds pretty tortured to me.

I'd check the software one more time.

And based on the large outliers the Clinton results produce, I'd hold off on the Reuters headline until I saw some confirmation from other polls.