Friday, December 21, 2007

Comparing Primary Trend Estimates

Lots of exciting movement in the polling in both parties has pushed me to review the bidding on trend estimators. As regular readers know, the "blue" line estimator that is our standard is deliberately tuned to be a bit conservative. It requires a good bit of evidence that a change in direction is "real" before the trend will move sharply. With lots of polling, this estimator has an excellent track record of finding turning points of opinion while not chasing wild geese.

But in a hot primary, with relatively few polls each week (this week has been an exception!), it is reasonable to ask if there is short term change taking place that "old blue" just isn't quick enough to catch. So let's take a look at two alternatives.

Before we go there, let's remember that the variation across polls is quite large compared to some of the changes in support we are talking about. The variation around the trend estimates here is about +/-5 points while the trend differences are often a point or two. That means we are using quite noisy polls to estimate trends that vary much less than do the individual polls. (Mark Blumenthal has spent much of the fall discussing the variation in poll methodology and the implications this has for how uncertain we should be about individual results.)
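A rough sketch of the arithmetic behind that point: if poll errors were independent, the noise in an average of n polls would shrink only as the square root of n. The numbers below are assumptions for illustration and are not from the polling data: the +/-5 point spread is treated as roughly two standard deviations, so a single poll gets a standard deviation of about 2.5 points, and the function name is hypothetical. House effects would break the independence assumption, so treat the output as optimistic.

```python
import math


def standard_error_of_average(poll_sd, n_polls):
    """Standard deviation of the mean of n independent polls.

    Assumes every poll has the same standard deviation and that poll
    errors are independent (no shared house effects).
    """
    if n_polls < 1:
        raise ValueError("n_polls must be at least 1")
    return poll_sd / math.sqrt(n_polls)


# Assumption: +/-5 points is about two standard deviations for one poll.
POLL_SD = 2.5

for n in (1, 5, 10, 20):
    se = standard_error_of_average(POLL_SD, n)
    print(f"{n:2d} polls -> standard error of about {se:.2f} points")
```

Even under these generous assumptions, an average of a handful of polls is still uncertain by about a point, which is the same size as many of the trend differences discussed here.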

So let's think of the single most sensitive alternative estimator we could pick: the latest poll. That would certainly move rapidly, and so be "responsive". But it would also reflect individual "house" effects due to polling organization and practices. It would also be highly unstable as an estimate of support, because individual polls vary over that approximate +/- 5 point or more range we see in the plots. In effect, this most sensitive possible estimator would just connect the dots and produce a plot that looks a lot like an earthquake on a seismograph. Lots of noise, but hard to see the systematic trend.

So we'd like to smooth out this random variation (and non-random variation due to house effects). One option is to take a rolling average. The more polls in the average the smoother the result, and you can take your pick of 5, 10 or more polls to smooth over. Of course, the more you include in the average, the more out of date the average is, because it includes some polls taken a while back. I've chosen a 5 poll average here, because it should be quite sensitive yet still gain some of the advantages of averaging. Using more polls would smooth more, but defeat the purpose of having an especially sensitive estimate of trend.
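A minimal sketch of a 5 poll rolling average, for readers who want to try the idea themselves. It assumes a trailing window (each point averages the most recent polls up to and including that one) and averages whatever is available before a full window exists; the exact windowing used for the plots is not specified here, and the poll numbers are made up.

```python
def rolling_average(values, window=5):
    """Trailing moving average over the most recent `window` values.

    Early points, before a full window is available, average however
    many values exist so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Hypothetical poll readings for one candidate, in date order.
polls = [30, 34, 28, 33, 31, 36, 29, 35, 38, 33]

for raw, smooth in zip(polls, rolling_average(polls, window=5)):
    print(f"poll {raw:2d}  5-poll average {smooth:.1f}")
```

Raising `window` to 10 would smooth more, at the cost of leaning on older polls.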

The second comparison uses the same "local regression" methodology that is our standard approach here, but sets the degree of smoothing to about half that of the standard blue estimator. This "red line" estimator is more sensitive than the standard, but not as prone to jumping around as the moving average. "Red" should detect short term change more quickly than "blue", but it will also chase phantom changes due to flukes of a few polls that happen to be too high or too low. (The 5 poll average will be even more susceptible to this.)
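To illustrate how the degree of smoothing changes responsiveness, here is a bare-bones local linear regression with tricube weights. It is a sketch of the general technique, not the code behind the plots: the `span` parameter name, the values 0.8 and 0.4 (chosen only so that one is half the other), and the data are all hypothetical.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at u >= 1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_trend(x, y, span):
    """Fitted value at each x[i] from a weighted line through its neighbours.

    span is the fraction of all points used in each local fit; smaller
    spans smooth less and react faster to recent movement.
    """
    n = len(x)
    k = min(n, max(2, math.ceil(span * n)))  # neighbours per local fit
    fitted = []
    for x0 in x:
        # Bandwidth is the distance to the k-th nearest point.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        w = [tricube(abs(xi - x0) / h) if h > 0 else float(xi == x0) for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical series: flat at 30, then three late readings at 40.
days = list(range(20))
support = [30.0] * 17 + [40.0] * 3

blue = local_linear_trend(days, support, span=0.8)  # more smoothing
red = local_linear_trend(days, support, span=0.4)   # about half the smoothing
print(f"latest estimate: blue {blue[-1]:.1f}, red {red[-1]:.1f}")
```

The less-smoothed fit follows the late jump more closely. It would follow a jump produced by a couple of fluke polls just as readily, which is the tradeoff described above.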

So what do we see when we compare these estimators in IA, NH, SC and the US?

For the most part, all are in substantial agreement about big trends over the full year. The red line and black moving average show more variation than does blue, and may have picked up some "real" short term change that blue considered noise. But across all 44 state x candidate x party comparisons, the agreement among estimators is pretty close most of the time.

For the Dems in Iowa, all three estimators are in quite close agreement right now. The estimates are within about a point of each other. A sharp eye can see some differences of trajectory. For example, Clinton in Iowa is trending down in the blue estimator, red sees a very recent upward trend, and the black moving average fluctuates erratically. But zoom in and the differences are about half a point or so. My experience is that you just can't reliably estimate such small differences, but feel free to make your own call here.

On the Republican side in Iowa you see similar agreement with small differences near the end. Small differences for Romney and Huckabee show up when comparing the red and blue estimators: red and the moving average see a bit of an upturn for Romney and a downturn for Huckabee in the most recent polls that the blue estimator isn't convinced of. As with the Dems, none of these differences is very large.

In New Hampshire, the picture is essentially the same for the Dems. Blue sees Clinton moving down, Obama up and Edwards gaining more slowly. Red and the MA think Clinton has turned back up in the last few days, while Obama has stalled or turned down. But all these differences are again matters of at most a percentage point difference in estimated current support.

For Republicans in New Hampshire, there are somewhat bigger differences for Romney and McCain, though the trends agree for the other candidates. Red and MA think Romney took a turn down recently by as much as a couple of points, while blue sees a continuing upward trend. The blue and red estimators are still within 2 points of each other, but a real difference in upward or downward momentum would, of course, be important.

For McCain, the latest couple of polls show a substantial spike in support, and red and MA chase that spike, leading to the largest difference we've seen so far among the estimators. All three see McCain gaining, but red puts him about 4 points higher than does the standard blue trend, while MA is a point lower than red.

In South Carolina, where there has been less polling and more noise, we see the biggest differences of all. It is worth appreciating what a huge range of results we've seen in recent SC polling for Clinton and Obama. I am not willing to believe that the true level of support has really varied between 20 and 45 points! But this makes the estimation especially tricky (and is why I prefer the stability that the blue estimator provides in the face of extremely noisy polling).

Blue sees Clinton as flat for some time, in the process splitting the difference between some quite high polls and other quite low ones. Either her support has suddenly collapsed (but with simultaneous high and low polls) or the best bet is what the blue line estimates. Red and the MA, in contrast, see a recent downturn from about 43 to about 35, a major drop.

For Obama, all three see an upward trend, but with red and MA moving up much more sharply than blue. Blue puts Obama at 31, while red would go for 35 or 36.

For SC Republicans the big difference is with Huckabee, where red and MA see a very large recent gain, while blue agrees on the sharp trend but doesn't put the support as high yet. Blue puts Huckabee at about 20, while red and MA would go as high as 28.

There are differences of recent trend for Romney and Giuliani, but these are quite small, more on the order of the Iowa differences.

Finally, on the national scene, we gain the advantages of more dense polling. The Democratic trends are quite close to one another. And on the Republican side, even the rapid rise of Huckabee is picked up quite well and with close agreement among all three estimators.

The bottom line is that most of the differences we see among the estimators are small-- on the order of a point or two in the estimates. The apparent differences in the most recent trends strike me as generally being too small to reliably distinguish. What constitutes a "real" change in trend is hard to define, but I think most of the current differences are too small to put a lot of faith in.

In the case of the large differences in South Carolina, I'm inclined to pay more attention to the very large spread across individual polls, and demand clearer evidence of change, but the red line and the moving average are perhaps telling us that real change is taking place. More polls would help, but in their absence I think a more prudent reading is that it is hard to know exactly what is happening. (And again I'd refer readers to Mark's posts on the differences across pollsters in practices and methods.)

But the bottom line is this is a fun game. I'll be updating with all three trend lines so you can pick your favorite and place your bets accordingly. Starting January 3 we'll begin to see how the polls and the trends line up with actual votes.