Monday, May 07, 2007

Partisan variation in 2007 polling

The Newsweek poll released this weekend was quickly criticized for under-representing Republican partisans. For the adult sample, Newsweek had 22% Republicans and 35% Democrats, while for registered voters their numbers were 24% and 36% for Reps and Dems. Independents were 36% of adults and 37% of registered, with 7% of adults and 3% of registered refusing or unable to answer the question.

How out of line are these results? The graph above shows the distribution of party id for all polls taken since January 1, 2007. The groups are defined by three populations (Adults, Registered Voters and Likely Voters) and by whether the party ID question is "leaned", that is, whether it counts as partisans those who say they "lean" to one party or the other after first saying they are independent or non-partisan. In the legend for the graph, A.N is Adults, Not leaned; RV.L is Registered Voters, Leaned; and so on.

It is clear that the distribution of Republicans, Democrats and independents depends quite a bit on the population and the question wording. For the adult, not leaned, sample that Newsweek used for presidential approval, the 22% Republican share they found is at the low end of samples in 2007, where the median is 25%. Among registered voters, not leaned, the median in 2007 is 29%, compared to Newsweek's 24%. For Democrats, the Newsweek adult sample at 35% matches the median of 35%, while among registered voters their 36% compares to a median of 35%.

So the Newsweek results are a bit low (3 points below the median) for adult Republicans, and lower still (5 points below the median) among registered voters, but quite close to the median for Democrats. While the adult Republican number is the lowest of the year in comparable samples and questions, it is only 3 points from the median. That reflects the relatively low spread of estimates of Republicans in the adult population this year, where half of all polls fall between 24% and 28% and only 3 of 24 polls fall above 30%. Among registered voters with unleaned questions, half of all polls fall between 27% and 32%, with four polls above 32% and six polls below 27%. Newsweek's registered voter sample is the second lowest reading for Republicans in this group.
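The gaps quoted above are simple differences in percentage points between Newsweek's shares and the 2007 medians for unleaned questions. A minimal sketch of that arithmetic, using only the numbers given in this post (the dictionary layout and variable names are just for illustration):

```python
# Party ID shares in percent, taken from the figures quoted in the post.
newsweek = {
    ("adults", "Rep"): 22, ("adults", "Dem"): 35,
    ("registered", "Rep"): 24, ("registered", "Dem"): 36,
}
median_2007 = {
    ("adults", "Rep"): 25, ("adults", "Dem"): 35,
    ("registered", "Rep"): 29, ("registered", "Dem"): 35,
}

# Gap in percentage points: Newsweek minus the 2007 median.
gaps = {key: newsweek[key] - median_2007[key] for key in newsweek}

for (population, party), gap in gaps.items():
    print(f"{population:10s} {party}: {gap:+d} points")
```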

The relationship between percent Republican and percent Democrat is not terribly strong within categories of population and question wording, as shown below.

If anything, there is a positive correlation between the two partisan groups, ranging from .44 to .81 for adults and registered voters. (The four cases of likely voters have little variation and produce negative correlations, which I discount.) The positive correlation is a good indication that some questions or survey organizations produce more or fewer independents, driving both Reps and Dems down or up simultaneously.
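To see why varying numbers of independents would produce a positive correlation, here is a small simulation sketch. Everything in it is an assumption made for illustration, not an estimate from the polls discussed here: the range of the independent share, the Republican share of partisans, the noise level, and the number of simulated polls are all hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible
reps, dems = [], []
for _ in range(200):  # hypothetical number of simulated polls
    # Hypothetical: the share of independents varies widely across polls,
    # standing in for differences in question wording or organization.
    independents = rng.uniform(0.25, 0.45)
    # Hypothetical: Republicans are about 42% of partisans, with small noise.
    rep_share_of_partisans = rng.gauss(0.42, 0.02)
    partisans = 1.0 - independents
    reps.append(partisans * rep_share_of_partisans)
    dems.append(partisans * (1.0 - rep_share_of_partisans))

r = pearson(reps, dems)
print(f"correlation between Rep and Dem shares: {r:.2f}")
```

Under these assumptions, the variation in independents moves both partisan shares in the same direction and outweighs the trade-off between them, so the correlation comes out positive. If the independent share were held fixed instead, every extra Republican would have to come at the expense of a Democrat and the correlation would be negative.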

The variation we see above is a reminder that party identification is a central concept, but one whose measurement varies quite a lot with population and question wording (and probably with "house effects" due to survey organization, a topic ignored in this post).

Given the sample size of each partisan group, the range of variation we see across polls is not surprising, and that raises an important point. One of the first things we ask about a poll is what its partisan split is. That is an important diagnostic to have available, and Newsweek deserves to be praised for openly disclosing this bit of data. All pollsters SHOULD release this information, but a number do not. The current criticism of Newsweek is exactly the kind of disincentive these organizations face in releasing it. If your party id is even a bit off, it becomes an immediate target for criticism and partisan cries of "fraud". That view is uninformed. Normal sampling variation will produce a range of party id estimates, and the data above show that in general this variation is well behaved, doing about what we'd expect given the sample sizes involved. But it does provide a tempting target for anyone unhappy with a particular poll.
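As a rough guide to how much sampling variation alone can move a party id estimate, here is a sketch of the usual 95% margin of error for a proportion. It assumes a simple random sample, and the sample size of 1000 is hypothetical, since this post does not report the sample size of any poll; real polls with weighting and design effects will typically vary somewhat more.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)

n = 1000      # hypothetical sample size, not taken from any poll in the post
p_rep = 0.25  # median Republican share among adults, unleaned, from the post

moe = margin_of_error(p_rep, n)
print(f"margin of error: +/- {100 * moe:.1f} points")
```

With these assumed inputs the margin is roughly plus or minus 2.7 points, so readings a few points either side of the 25% median are what sampling alone would produce.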

This is not to say we should not scrutinize pollsters based on their data. That is entirely what we do here, of course! And I think I was the first to point out that the latest Newsweek poll appears to be an outlier on presidential approval. But that conclusion was based on a great deal of evidence from over 1400 polls and a consistent methodology I apply to every new poll regardless of source and regardless of results.

Finally, it is easy to overstate the impact of the distribution of party id on other survey estimates, such as presidential approval or horserace results. While party id is a powerful individual level predictor of those things, a discrepancy of 3 to 5 percentage points in partisanship, as in this case, does not translate directly into a similar discrepancy in every question with partisan overtones. Partisans are not completely homogeneous in their views, so the impact of variation in party id is mitigated by the variation within each partisan category in approval or candidate preference. If we want to analyze a poll's results, it is better to focus directly on the question of interest, rather than on another variable like party id which may influence but not entirely determine the result we care about.
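A back-of-the-envelope sketch shows why the effect is diluted. The party id mix below is Newsweek's adult sample as reported above; the approval rates within each group are purely hypothetical numbers chosen for illustration, and so is the assumption that the "missing" 3 points of Republicans were counted as independents.

```python
# HYPOTHETICAL approval rates within each group; not from any poll.
approval_within = {"Rep": 0.75, "Dem": 0.10, "Ind": 0.30, "Other": 0.30}

# Newsweek adult sample party id mix, as reported in the post.
newsweek_mix = {"Rep": 0.22, "Dem": 0.35, "Ind": 0.36, "Other": 0.07}

# Alternative mix with Republicans at the 2007 median of 25%.
# ASSUMPTION: the extra 3 points come out of the independents.
median_mix = {"Rep": 0.25, "Dem": 0.35, "Ind": 0.33, "Other": 0.07}

def overall_approval(mix):
    """Overall approval as the mix-weighted average of group approval rates."""
    return sum(share * approval_within[group] for group, share in mix.items())

shift = overall_approval(median_mix) - overall_approval(newsweek_mix)
print(f"change in overall approval: {100 * shift:.2f} points")
```

Under these assumptions, moving 3 points of the sample from independents to Republicans changes overall approval by only 3 points times the 45-point gap between the two groups' approval rates, or about 1.35 points. The shift equals the party id discrepancy only if the two groups differ by a full 100 points in approval, which never happens in practice.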

For further analysis of variation in party id across pollsters see here, and for variation over time see here.