Sunday, August 24, 2008

How Pollsters Affect Poll Results

Who conducts the poll affects the results. Somewhat. These are called "house effects" because they are systematic effects due to the survey "house," or polling organization. It is perhaps easy to think of these effects as "bias," but that is misleading. The differences are due to a variety of factors that represent reasonable differences in practice from one organization to another.

For example, how you phrase a question can affect the results, and an organization usually asks the question the same way in all its surveys. This creates a house effect. Another source is how the organization treats "don't know" or "undecided" responses. Some push hard for a position even if the respondent is reluctant to give one. Other pollsters take "undecided" at face value and don't push. The latter get higher rates of undecided, but more important, they get lower levels of support for both candidates as a result of not pushing for how respondents lean. And organizations differ in whether they typically interview adults, registered voters or likely voters. The differences across those three groups produce differences in results. Which is right? It depends on what you are trying to estimate: opinion of the population, of people who can easily vote if they choose to do so, or of the probable electorate. Not to mention the vagaries of identifying who is really likely to vote. Finally, survey mode may matter. Is the survey conducted by random digit dialing (RDD) with live interviewers, by RDD with recorded interviews ("interactive voice response" or IVR), or by internet using panels of volunteers who are statistically adjusted in some way to make inferences about the population?

Given all these and many other possible sources of house effects, it is perhaps surprising the net effects are as small as they are. They are often statistically significant, but rarely are they notably large.

The chart above shows the house effect for each polling organization that has conducted at least five national polls on the Obama-McCain match-up since 2007. The dots are the estimated house effects and the blue lines extend out to a 95% confidence interval around the effects.

The largest pro-Obama house effect is that of Harris Interactive, at just over 4 points. The poll most favorable to McCain is Rasmussen's Tracking poll at just less than -3 points. Everyone else falls between these extremes.

Now let's put this in context. We are looking at effects on the difference between the candidates, so that +4 from Harris is equivalent to two points high on Obama and two points low on McCain. Taking half the estimated effect above gives the average effect per candidate. The average effects are at most 2 points per candidate. Not trivial, but not huge.
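The halving described above is simple arithmetic. Here is a minimal sketch of it in Python. The function name is made up for illustration, and the even split between the two candidates is the same simplifying assumption used in the paragraph above, not something estimated separately.

```python
def per_candidate_effect(margin_effect):
    """Split a house effect on the Obama-minus-McCain margin evenly
    across the two candidates (an assumed, simplified decomposition).

    Returns (obama_shift, mccain_shift) in percentage points.
    """
    half = margin_effect / 2.0
    return half, -half


# A +4 effect on the margin is two points high on Obama and two low on McCain.
obama_shift, mccain_shift = per_candidate_effect(4.0)
print(obama_shift, mccain_shift)  # 2.0 -2.0
```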

Estimating the house effect is not hard. But knowing where "zero" should be is very hard. A house effect of zero is saying the pollster perfectly matches some standard. The ideal standard, of course, is the actual election outcome. But we don't know that now, only after the fact in November. So the standard used here is the house effect relative to our Pollster Trend Estimate. If a pollster consistently runs 2 points above our trend, their house effect would be +2.

The house effects are calculated so that the average house effect is zero. This doesn't depend on how many polls a pollster conducts. And it doesn't mean the pollster closest to zero is the "best". It just means their results track our trend estimate on average. That can also happen if a pollster gyrates considerably above and below our trend, but balances out. A nicer result is a poll that closely follows the trend. But either pattern could produce a house effect near zero. For example, Democracy Corps and Zogby have very similar house effects near -1. But look at their plots below and you see that Democracy Corps has followed our trend quite closely, though about a point below the trend. Zogby has also been on average a point below trend, but his polls have shown large variation around the trend, with some polls near-outliers above the trend and others near-outliers below it. The net effect is the same as for Democracy Corps, but the variability of Zogby's results is much higher.
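To make the distinction between a pollster's average offset and its variability concrete, here is a rough sketch. It is not our actual estimator: it assumes the house effect is simply the mean of each pollster's deviations from the trend, recentered so the unweighted average across pollsters is zero. The pollster names and numbers are invented for illustration.

```python
from statistics import mean, pstdev


def house_effects(residuals_by_pollster):
    """Mean deviation from trend per pollster, recentered to average zero.

    residuals_by_pollster maps pollster -> list of (poll margin - trend margin).
    The centering is unweighted, so a pollster's number of polls does not
    change where zero falls.
    """
    raw = {p: mean(r) for p, r in residuals_by_pollster.items()}
    center = mean(list(raw.values()))
    return {p: m - center for p, m in raw.items()}


def spread(residuals):
    """Variability of a pollster's deviations around its own average."""
    return pstdev(residuals)


# Invented residuals: two pollsters with the same average but different noise.
residuals = {
    "Steady": [-1.5, -0.5, -1.0, -1.0, -1.0],
    "Noisy": [-6.0, 4.0, -5.0, 3.0, -1.0],
    "Third": [2.0, 2.0, 2.0, 2.0, 2.0],
}
effects = house_effects(residuals)
for name, values in residuals.items():
    print(name, effects[name], round(spread(values), 2))
```

In this toy data "Steady" and "Noisy" get the same house effect, but the spread of "Noisy" is far larger, which is the pattern described above.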

Incidentally, the Democracy Corps poll is conducted by the Democratic firm of Greenberg Quinlan Rosner Research in collaboration with Democratic strategist James Carville. Yet the poll has a negative house effect of -1. Does this mean the Democracy Corps poll is biased against Obama? No. It means they use a likely voter sample, which typically produces modestly more pro-Republican responses than do registered voter or adult samples. Assuming that the house effect necessarily reflects a partisan bias is a major mistake.

How can you use these house effects? Take a pollster's latest results and subtract the house effect from their reported Obama minus McCain difference. That puts their results in the same terms as all others, centered on the Trend Estimate. This is especially useful if you are comparing results from two pollsters with different house effects. Removing those house differences makes their results more comparable.
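As a small sketch of that subtraction (the function name and the two example pollsters are hypothetical, and the numbers are made up rather than taken from the chart):

```python
def adjust_margin(reported_margin, house_effect):
    """Put a poll on the trend's scale by removing the pollster's house effect.

    Both arguments are in points on the Obama-minus-McCain margin.
    """
    return reported_margin - house_effect


# Hypothetical pollster A: reports Obama +6, house effect +4.
# Hypothetical pollster B: reports Obama -1, house effect -3.
adjusted_a = adjust_margin(6.0, 4.0)
adjusted_b = adjust_margin(-1.0, -3.0)
print(adjusted_a, adjusted_b)  # 2.0 2.0
```

Two polls that look seven points apart as reported can tell the same story once each pollster's typical offset is taken out.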

What impact do house effects have on our Trend Estimate? A little. Our estimator is designed to resist big effects of any single pollster, but it isn't infallible, especially when some pollsters do far more polls than others or when one pollster dominates during some small period of time. We can estimate house effects, adjust for these, and reestimate our trend with house effects removed. The result runs through the center of the polls, but doesn't allow the number of polls done by an organization to be as influential.
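The sketch below illustrates why adjusting first can matter when one pollster dominates a stretch of time. It uses a plain average over a window of polls as a stand-in for the trend estimator, which is an assumption for illustration only and not how our trend is computed. The pollster names, margins and house effects are all invented.

```python
from statistics import mean


def window_average(polls, house_effects=None):
    """Average margin over a window of polls, optionally removing house effects.

    polls is a list of (pollster, margin) pairs. A simple mean stands in
    for the real trend estimator here.
    """
    effects = house_effects or {}
    return mean(margin - effects.get(pollster, 0.0) for pollster, margin in polls)


# Invented window: four polls from a daily tracker, one from another pollster.
polls = [("Tracker", -1.0)] * 4 + [("Other", 5.0)]
effects = {"Tracker": -3.0, "Other": 3.0}

raw = window_average(polls)                # pulled toward the frequent pollster
adjusted = window_average(polls, effects)  # house effects removed first
print(round(raw, 2), round(adjusted, 2))
```

With the house effects removed, the frequent pollster no longer drags the average toward its own typical offset.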

The results are shown in the chart below. The blue line is our standard estimator and the red line is the estimate with house effects removed. With house effects removed, the current trend stands at +2.0, while the standard estimator, which ignores house effects, produces an estimate of +1.7. A little different, but given the range of variability across polls and the uncertainty as to where the race "really" stands, this is not a big effect.

The impact of house effects isn't always this small. Looking back along the trend we see that the red and blue lines diverged by as much as 1 point in late June, an effect due significantly to the large number of Rasmussen and Gallup tracking polls during that time and few polls with positive house effects in that period. A smaller but still notable divergence occurred in late February and early March.

The bottom line is that there are real and measurable differences between polling organizations, but the magnitude of these effects is considerably less than some commentary would suggest. Many of the house effect estimates above are not statistically different from zero. Even ignoring that, the range of effects is rather small, though of course in a tight race the differences may be politically important. Finally, the effect on our Trend Estimate is detectable but does not lead to large distortions, even if we can see some noticeable differences at some times.

The charts below move through all the pollsters and plot their poll results compared to the standard trend and the trend with house effects removed. Pollsters with fewer than 5 polls are all lumped together as "Other" pollsters. Once they get to our minimum number of polls, we'll have house effects for them too.