Tuesday, May 01, 2007

Pres08: A closer look at primary trends

Who is up and who is down in recent presidential nomination polling? What are the current trends? The question is not as trivial to answer as it might seem. If we look at different polls, we can find some bouncing up while others bounce down. Commentators often reach different conclusions because they are comparing different polls. None of the recent polls shows a significantly different order of finish-- all have Clinton and Giuliani in first place and Obama and McCain in second, with Edwards and Gore together and Romney, Fred Thompson and Gingrich mixed together. But the gaps between the candidates, and who has moved up or down since the last poll, vary quite a bit across polls.

The goal of my kind of analysis is to avoid the trap of focusing on only one or a couple of polls. My approach is quite skeptical of the evidence provided by any single poll, but quite confident in the information from all the polls taken together. The problem is how to combine the polls to get a good estimate of what is "really" happening, and not to be deceived by the random variation from poll to poll. So let's see how the presidential nomination races are shaping up when we take all the polls seriously.

Regular readers know that my "standard" trend estimate is the blue line in the charts. This is a line that is calculated to go through the "middle" of the data, with an average error of zero, meaning the points below the line balance the points above the line. "Old Blue", as I affectionately call this line, is deliberately conservative in the sense that it takes quite a bit of new polling data to convince it to change trend direction. The reason for this is that we know there is quite a bit of noise in the polls (just look at the spread of points around the line!) so when a new poll comes in high it might mean an upturn in support, but it is just as likely that it simply reflects random noise and the next poll is as likely to come in low. If we allow the trend estimate to chase each new data point too much, we'll just plot random noise rather than the best estimate of the trend in support. Experience with these and other data (such as presidential approval) has shown that Old Blue is seldom misled about new trends, though it does take a while (about a dozen polls) to notice changing trends.
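
To see why a single high poll is weak evidence of an upturn, here is a minimal simulation sketch. The true support of 35% and the sample size of 800 are illustrative assumptions, not figures from any poll discussed here, and the code assumes simple random sampling, which real polls only approximate.

```python
import math
import random

def sampling_se(p, n):
    """Standard error, in percentage points, of a poll share under simple random sampling."""
    return 100 * math.sqrt(p * (1 - p) / n)

def simulate_polls(p, n, polls, seed=0):
    """Simulate poll results (in percent) when true support never changes."""
    rng = random.Random(seed)
    return [100 * sum(rng.random() < p for _ in range(n)) / n
            for _ in range(polls)]

# Hypothetical setup: support fixed at 35%, 800 respondents per poll.
results = simulate_polls(p=0.35, n=800, polls=200)
print(f"standard error: {sampling_se(0.35, 800):.1f} points")
print(f"lowest poll: {min(results):.1f}, highest poll: {max(results):.1f}")
```

Even though support never moves in this simulation, individual polls land several points apart, which is the kind of spread a trend estimate should not chase.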

While it is good to avoid responding too much to a single poll, it is also true that Old Blue may stick to a trend longer than it should. A more sensitive estimator would notice a change in direction more quickly, and would jump on the new trend while it is still news-- and before others notice it. "Ready Red" is the answer to this. The red line in the charts is twice as sensitive to change as is Old Blue. As a result it will pick up changes in momentum more quickly, letting us spot new trends early. Unfortunately, it will also sometimes be misled and will think it sees a new trend when in fact none exists-- just a few polls that happen to be "down" or "up" but which really don't represent any significant shift.

Of course you can adjust the sensitivity of the trend estimator to anything between Ready Red and Old Blue (or outside them too, for that matter) to see how much difference the sensitivity makes. There is no perfect way to choose a "best" estimator. I've settled on the more conservative Blue estimator as my standard because I find the hasty red estimator has often jumped the gun on presidential approval trends, which more data has subsequently shown were not really changing. But that doesn't mean we shouldn't examine the more sensitive trend estimate-- it tells us a lot, even if we have to be a bit cautious. (I'd rather be right but slow. Others prefer to be quick, and adjust to mistakes as necessary. Both approaches have their virtues.)
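
For readers who want to experiment, here is a minimal sketch of a trend estimator with adjustable sensitivity. It assumes a local linear fit with tricube weights (a loess-style smoother), which this post does not confirm as the method behind the charts. The function name `local_linear`, the `span` parameter, and the poll series are all made up for illustration; reading the .7 and .35 degrees of smoothing mentioned later in this post as loess-style spans is also an assumption.

```python
def tricube(u):
    """Tricube weight: near 1 for close polls, falling to 0 at the window edge."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_linear(xs, ys, x0, span):
    """Estimate the trend at x0 from a weighted line through the nearest polls."""
    n = len(xs)
    k = min(n, max(3, round(span * n)))          # number of polls in the window
    h = sorted(abs(x - x0) for x in xs)[k - 1]   # distance to the k-th nearest poll
    h = max(h, 1e-9) * 1.0001                    # keep the k-th poll's weight above zero
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Synthetic series (NOT real polls): flat at 35, then sliding over the
# last nine days, with alternating +/-1 point "noise".
days = list(range(40))
support = [35 - 0.3 * max(0, d - 30) + (1 if d % 2 else -1) for d in days]

blue = local_linear(days, support, x0=39, span=0.70)  # conservative
red = local_linear(days, support, x0=39, span=0.35)   # sensitive
print(f"conservative estimate: {blue:.1f}, sensitive estimate: {red:.1f}")
```

With this made-up series the smaller span ends lower because it gives the late slide more weight. That is the trade-off described above: quicker to react, easier to fool.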

The chart above shows the data and the trend estimates for the top Democrats. Republicans are in the chart below. In addition to Old Blue and Ready Red, there are a number of gray trend lines (81 in each figure). These show the estimated trends for levels of sensitivity from quite a bit MORE sensitive than Ready Red to MORE conservative than Old Blue. If you can see the gray lines, this means that at least for some levels of sensitivity the estimated trend differs from either Blue or Red. If you cannot see the gray lines, or only barely, this means that the estimated trend hardly depends on the amount of sensitivity and the many gray lines all lie under Red or Blue in the plot, and so are covered up. This typically happens when there is a smooth, steady trend with no bends in it.

So enough statistics; let's look at the politics.

In the Democratic race, the Old Blue estimator says that Clinton has been flat since January, after a bit of decline in 2006 and a slight rebound late in 2006. Not much action.

But if we look at Ready Red, the Clinton campaign appears to be falling off in recent polling, declining by about 3 or 4 points since her peak in January. That isn't a large drop, but it does suggest that the stable picture of Old Blue may be masking some short term decay.

The Obama campaign is similarly interesting when we compare the two trend estimates. Old Blue sees a sharp rise in early 2007 with a slower but still upward trend recently. Red sees more of a plateau in recent polls, with some indecisive and quite small up and down bounces. If I believe Red, I say Obama has stalled. If I believe Blue, I say he has slowed but is still moving up a little. (If I'm really crazy, I say the last little uptick in Ready Red suggests Obama is about to move up again, but that would be giving an awful lot of weight to the very last polls on a sensitive estimator. I'm not that crazy.)

The two trends are in pretty close agreement for Gore, but with Red suggesting a slight downturn at the very end, while Blue says the trend remains up a bit. Again, the difference is driven by only the polls at the very end and I'm not willing to bet much on them.

The Edwards campaign could take heart in Red's somewhat higher rate of climb in support compared to Blue. Both agree Edwards has been moving up, but Red sees the upturn as sharper and ending at a higher level. The best that can be said here for Red is that this trend has been supported by more polls than the last little change for Obama or for Gore.

As for the numbers, the estimates are not far apart regardless of which estimator we pick.

Clinton: 35.7 (Blue)/34.1 (Red)
Obama: 24.0/25.1
Edwards: 15.1/16.6
Gore: 15.5/14.0

On the Republican side, Giuliani has enjoyed a long and sustained rise based on Old Blue, but suffers a recent downturn if Ready Red is to be believed. If the sensitive estimate is right, Giuliani's standing has declined by more than five points since early February. If Blue is right, then we shouldn't be hasty: Giuliani has continued to gain, though at a slower rate than in late 2006.

Red and Blue agree that 2007 has been a bad time for the McCain campaign. After a flat 2006, McCain has dropped over five points in both estimators. Sensitive Red thinks there may be a chance of a recent reversal of that slide, but Old Blue remains entirely unconvinced that McCain's fortunes are improving.

Blue and Red also agree that Gingrich has suffered a bit of a recent decline (more or less coinciding with talk of a possible Fred Thompson candidacy). This is a nice example of both estimators reaching the same conclusion, even with late trends. Red sees Gingrich slightly worse off than does Blue, but the difference is slight.

Likewise, both estimators are largely in agreement that Romney has sustained his tortoise-like slow but steady increase. Despite some campaign gaffes, both trends remain up, with Red being a little more bullish than Blue.

Fred Thompson lacks enough data to provide a fair assessment of the estimators, but who can ignore him at this point? The only rational approach is to be conservative in the face of very limited data for a trend estimate. By that account, Old Blue says the sudden possibility of a Thompson campaign has generated 10 points of support, but with no evidence of a trend either way since polling on Thompson began. The Red estimator jumps around-- there just isn't enough data for a sensitive trend.

The current estimates for each trend are:

Giuliani: 34.8 (Blue)/30.4 (Red)
McCain: 19.4/21.3
Romney: 9.4/10.5
FThompson: 9.9/10.1
Gingrich: 8.3/7.3
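
The gap between the two estimators can be tabulated directly from the numbers listed in this post. The short sketch below does only that arithmetic; the variable names are arbitrary, and nothing here re-estimates a trend.

```python
# Current estimates as listed in this post: (Old Blue, Ready Red).
estimates = {
    "Clinton": (35.7, 34.1),
    "Obama": (24.0, 25.1),
    "Edwards": (15.1, 16.6),
    "Gore": (15.5, 14.0),
    "Giuliani": (34.8, 30.4),
    "McCain": (19.4, 21.3),
    "Romney": (9.4, 10.5),
    "FThompson": (9.9, 10.1),
    "Gingrich": (8.3, 7.3),
}

# Positive gap: conservative Blue is higher than sensitive Red.
gaps = {name: round(blue - red, 1) for name, (blue, red) in estimates.items()}
widest = max(gaps, key=lambda name: abs(gaps[name]))

for name, gap in sorted(gaps.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:10s} Blue minus Red: {gap:+.1f}")
print("largest disagreement:", widest)
```

A positive gap means the sensitive estimator sees the candidate lower than the conservative one does, which is what a recent downturn would produce.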

We can check the sensitivity of these estimates to the amount of smoothing used to estimate the trend. Here I use 81 separate estimates of the current standing of each candidate, with the smoothing ranging from MORE sensitive than Red to MORE conservative than Blue. This is a wider range than I think anyone would reasonably want. The most sensitive end produces trends that jump around way more than anyone could believe, and the most conservative fit is basically just a straight line with hardly any change at all. But somewhere between these limits of silliness is a range of reasonable estimates. If the bottom line estimate for a candidate is pretty compact, then the amount of smoothing doesn't matter. If the estimates are spread out, then we at least know that sensitivity matters and we should be cautious. The summary of the data is presented below.
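
The sweep described above can be sketched as follows. The grid endpoints (0.20 to 1.00, which gives 81 levels in steps of 0.01) are an assumption, since the exact limits are not stated here. The estimator `recent_mean` is a deliberately simple stand-in (the mean of the most recent share of polls), not the trend estimator behind the charts, and both series are synthetic.

```python
def span_grid(lo=0.20, hi=1.00, steps=81):
    """Evenly spaced smoothing levels, most sensitive (lo) to most conservative (hi)."""
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]

def recent_mean(ys, span):
    """Stand-in estimator: mean of the most recent `span` share of the polls."""
    k = max(1, round(span * len(ys)))
    return sum(ys[-k:]) / k

def sweep(ys, spans, estimator=recent_mean):
    """Current-standing estimate at every smoothing level."""
    return [estimator(ys, s) for s in spans]

def spread(values):
    """Distance between the highest and lowest estimate."""
    return max(values) - min(values)

# Synthetic series (NOT real polls).
steady = [20.0 + (0.5 if i % 2 else -0.5) for i in range(40)]   # no trend, small wobble
late_drop = [35.0 - 0.4 * max(0, i - 30) for i in range(40)]    # flat, then a late slide

spans = span_grid()
print(f"steady series spread:    {spread(sweep(steady, spans)):.2f}")
print(f"late-drop series spread: {spread(sweep(late_drop, spans)):.2f}")
```

A compact spread says the choice of smoothing hardly matters for that series; a wide one is the signal to be cautious.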

The top half of the plot shows that the estimates for most candidates are in fact within a fairly small range regardless of how sensitive the estimates happen to be. Gore and Edwards are quite close, with some overlapping estimates of support. But Obama and Clinton are clearly distinct from each other and from Edwards and Gore. Similarly, Gingrich, Thompson and Romney show similar estimates and considerable overlap, while McCain is clearly above them and Giuliani clearly ahead of McCain.

Giuliani stands out among all the candidates in demonstrating more dependence on the sensitivity of the estimator. His box is more spread out than those of other candidates in the top half of the plot. In the lower half, which plots the distribution of all the estimates, Giuliani shows a bi-modal distribution. If we pick a more sensitive estimator, Giuliani's support falls in the lower "hump" of the distribution, while less sensitive estimators suggest a stronger standing, producing the right hump. This difference is not trivial-- the more sensitive estimator says Giuliani is at about 30% support, while the more conservative one says 35%. No other candidate shows as large a discrepancy. This is due to the rather substantial downturn that Ready Red sees in Giuliani's polling over the last three months, but which Old Blue is still reluctant to accept. Which is right? Well, that's the whole point here: If you are a bit more daring, believe what Red has to say. If you like to buy municipal bonds, go with Blue.
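
A crude way to check a set of estimates for two humps is to bin them and count clusters of occupied bins. The sketch below does that; the bin settings and both lists of 81 values are hypothetical, made up only to show one two-hump case and one compact case, and counting clusters separated by empty bins is a rough heuristic rather than a formal test of bimodality.

```python
def histogram(values, lo, hi, bins):
    """Count how many values fall in each of `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        i = min(bins - 1, max(0, int((v - lo) / width)))
        counts[i] += 1
    return counts

def count_humps(counts):
    """Count runs of occupied bins separated by at least one empty bin."""
    humps, inside = 0, False
    for c in counts:
        if c > 0 and not inside:
            humps += 1
        inside = c > 0
    return humps

# Hypothetical estimates (NOT taken from the charts): 81 values each.
two_humps = [30.0 + 0.02 * i for i in range(40)] + [34.6 + 0.01 * i for i in range(41)]
compact = [20.0 + 0.01 * i for i in range(81)]

print("two-hump case:", count_humps(histogram(two_humps, lo=28, hi=37, bins=18)))
print("compact case: ", count_humps(histogram(compact, lo=18, hi=27, bins=18)))
```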

One final way to look at sensitivity is to plot the estimated support for each candidate against the degree of smoothing used for each of the 81 estimates. Low values mean less smoothing and more sensitive trends, while high values mean more smoothing and more conservative, less sensitive trends.

The good news from my point of view is that most of the lines do not demonstrate a strong relationship between amount of smoothing and the estimated support. While there is a little movement, it isn't sharp for anyone. The Giuliani line is the one showing the greatest variation across degree of smoothing, as I already noted. The upshot of this is that while I constantly worry about how much my estimates are affected by my preference for Old Blue, the data show that mostly it doesn't matter a lot, and certainly not within a reasonable range of smoothing. (In the plot above, Old Blue is a degree of smoothing of .7, while Ready Red is at .35.)

It is good to compare Old Blue and Ready Red-- both offer helpful insights into the nomination race. Your acceptance of the risk of being too slow to recognize change versus the risk of chasing phantom blips should help you decide which to give more credence.