Monday, April 21, 2008

Pennsylvania Undecided Voters

Senator Clinton currently holds a 6-point lead over Senator Obama in Pennsylvania, 49%-43%, based on our Pollster Trend Estimate. But that leaves about 8 percent undecided. What they do will determine whether Clinton expands her lead compared to the polls, or whether the undecided narrow, or possibly reverse, that lead.
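To make the stakes concrete, here is a minimal arithmetic sketch of how different splits of the undecided would move the margin. It uses only the 49%, 43% and 8% figures above; the function names are my own, and the assumption that every undecided voter ends up with one of the two candidates is a simplification.

```python
def final_margin(clinton, obama, undecided, clinton_share):
    """Clinton-minus-Obama margin after allocating the undecided.

    clinton_share is the fraction of undecided voters going to Clinton;
    the rest are assumed (a simplification) to go to Obama.
    """
    clinton_total = clinton + undecided * clinton_share
    obama_total = obama + undecided * (1.0 - clinton_share)
    return clinton_total - obama_total


def break_even_share(clinton, obama, undecided):
    """Fraction of the undecided Clinton needs for an exact tie."""
    return 0.5 - (clinton - obama) / (2.0 * undecided)


# Trend estimate figures from the post: 49% Clinton, 43% Obama, 8% undecided.
for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(share, final_margin(49, 43, 8, share))

print(break_even_share(49, 43, 8))
```

Under these assumptions the margin can land anywhere from Obama by 2 to Clinton by 14, and Obama would need more than seven of every eight undecided voters to reverse the lead.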

My partner at Pollster, Mark Blumenthal, has looked at this using aggregate polling data here and in his column here.

In this post I take a look at the individual level, though using data that are three weeks old, so use caution in extrapolating to tomorrow's electorate.

Using data from the Time/SRBI poll of Pennsylvania, conducted 4/2-6/08, I estimate a model of support for Obama compared to Clinton. I use "the usual suspects" as variables predicting vote: partisanship, gender, race, Hispanic ethnicity, region of the state, age, education, religion and income. The data at that time found an eight point Clinton lead, a bit higher than today's trend estimate.

Using the coefficients for "decided" voters, I can estimate the probable vote of the undecided 11% of voters in the poll. This gives us a look at how they would be expected to behave IF they behave like those who have already picked a candidate. (Note the "if" here. As with all models, this assumes the variables have the same influence among the undecided as among the decided.)
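For readers who want to see the mechanics, the approach can be sketched as follows: fit a model on the decided respondents only, then apply its coefficients to the undecided. This is an illustration, not the actual analysis. It assumes a logistic regression (the post does not name the model form), uses a single predictor (age) instead of the full list, and runs on SYNTHETIC respondents generated inside the code, not the Time/SRBI data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, steps=500, lr=0.5):
    """Fit a logistic regression by batch gradient ascent on the log-likelihood."""
    k, n = len(X[0]), len(X)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, X):
    """Estimated probability of an Obama vote for each row of X."""
    return [sigmoid(sum(b * v for b, v in zip(beta, xi))) for xi in X]


def make_row(age_decades):
    # [intercept, age in decades centered at 50 years]
    return [1.0, age_decades - 5.0]


random.seed(0)

# SYNTHETIC decided voters: in this toy setup older voters lean Clinton (y = 0).
decided_X = [make_row(random.uniform(2, 8)) for _ in range(400)]
decided_y = [
    1 if random.random() < sigmoid(-0.2 - 0.4 * x[1]) else 0 for x in decided_X
]

# SYNTHETIC undecided voters, skewed older than the decided group.
undecided_X = [make_row(random.uniform(4, 8)) for _ in range(50)]

beta = fit_logit(decided_X, decided_y)   # coefficients from decided voters only
p_decided = predict(beta, decided_X)
p_undecided = predict(beta, undecided_X)  # applied to the undecided

print(sum(p_decided) / len(p_decided), sum(p_undecided) / len(p_undecided))
```

The key design point is that the undecided never enter the fitting step. Their estimated probabilities come entirely from coefficients learned on people who named a candidate, which is exactly the "if" flagged above.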

The plot above shows the distribution of the estimated probability of voting for Obama. Values close to zero indicate voters very likely to support Clinton, while values close to 1 indicate very likely Obama supporters. Those close to .5 are flipping a coin. The shape of the distribution gives a sense of where voters "lump up" in their estimated preferences.

The black line plots the distribution among those who reported a vote preference. The red line plots the distribution of estimated support among those who said they were undecided in early April.
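Curves like these two can be produced by smoothing each group's estimated probabilities into a density. Below is a minimal sketch assuming a Gaussian kernel density estimate, which may not be the smoother used for the plot. The probabilities are HYPOTHETICAL placeholders, not values from the poll, and the bandwidth is an arbitrary choice.

```python
import math


def gaussian_kde(values, grid, bandwidth=0.08):
    """Evaluate a Gaussian kernel density estimate of `values` at each grid point."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values) / norm
        for g in grid
    ]


# Probability of voting for Obama runs from 0 (Clinton) to 1 (Obama).
grid = [i / 100 for i in range(101)]

# HYPOTHETICAL estimated probabilities, for illustration only.
decided_p = [0.15, 0.22, 0.30, 0.35, 0.41, 0.48, 0.55, 0.63, 0.78, 0.90]
undecided_p = [0.12, 0.20, 0.27, 0.33, 0.38, 0.44, 0.52]

decided_density = gaussian_kde(decided_p, grid)      # the "black line"
undecided_density = gaussian_kde(undecided_p, grid)  # the "red line"
```

Plotting each density against `grid` gives one line per group; a red curve sitting to the left of the black one would indicate an undecided group tilted toward Clinton.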

The key point is that the undecided resemble the decided, with a small shift to the left, suggesting they were as a group somewhat more likely to support Clinton. In these data, the primary difference between undecided and decided voters was age, with older voters more likely to say they hadn't decided. As we've seen in virtually every exit poll, older voters are more likely to support Clinton, so the finding here, that the undecided lean a bit more towards Clinton, is consistent with those exit polls.

Now again for the caveats. These data are three weeks old. The model requires the assumption that undecided voters ultimately behave like those who decided. A different choice of predictor variables could change the results. And so on.

The goal here is NOT, NOT, NOT a prediction of tomorrow's vote. Much may have changed since the first week of April.

The point is to illustrate what we can learn about undecided voters beyond the simple fact that they say "undecided". In this case, the data suggest they are not wildly different from those who decided, but their older age makes it more likely that they ultimately lean more to Clinton.

The Time/SRBI data are archived at the Roper Center for Public Opinion Research. I am solely responsible for the analysis here.