Thursday, September 20, 2007

Short term trend, long term trend and sensitivity

The trend estimator we use for the presidential nomination races (and for other trends as well) is designed to capture "real" changes in support while ignoring the inevitable random noise that is the result of random sampling of opinion. The "standard" trend estimator (which I fondly call "old Blue") is deliberately conservative in the sense that it takes a good bit of evidence before the estimator will believe a change in the polls is real. In the case of presidential approval, this estimator has an excellent track record of catching changes in approval while not thinking it sees a change when there isn't one there. (For a fuller explanation of the estimators see this post.)

But I've also used a more sensitive estimator ("ready Red") that catches changes more quickly but is more often fooled by random noise into believing a change has occurred that subsequent polls show didn't happen. I've posted that comparison, along with a more extensive comparison of a range of sensitivity settings, here, and have updated it with each new set of polls.

What has become clear is that "old Blue" has been too conservative this summer, and has missed an important change in the trend of Rudy Giuliani's support. I want to use this to illustrate what the different estimators reveal and what they smooth out, and to explain a move to a somewhat more sensitive estimator.

To illustrate this issue I've computed three different trend estimates, plotted in the figure above. The blue line is the most conservative and least sensitive. The red line is more sensitive, while the black line is the most sensitive of all. (Red is twice as sensitive as Blue, and Black is twice as sensitive as Red.)

The current estimates of Giuliani's support are Blue: 27.3%, Red: 28.3% and Black: 27.8%. The estimates are all within one point of each other. In the figure it is clear that the three lines usually overlap each other pretty closely, and so much so in Romney's case that they are all but indistinguishable.

The caveat is that the lines can and do disagree with each other when support makes a change of direction. Giuliani's polling has done that at least twice since January. After rising in late 2006 and early 2007, his support peaked late in the first quarter and fell for a while. But the Red and Black estimators agree that the fall was rather sharp and halted by the end of the second quarter after which there has been a mild but noticeable upturn. The more conservative Blue estimator caught the decline but has not shown the subsequent upturn. As a result, it has continued to see a long term decline, rather than a recent stabilization and some improvement. Even though the three estimators currently are within a percentage point of each other in estimating his support, the short term dynamics of the Giuliani campaign are not well represented by the Blue line.
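The way a conservative estimator rounds off a turning point can be sketched with a toy smoother. This is only an illustration under stated assumptions, not the estimator behind the plots: it uses a Gaussian-weighted local linear fit, treats "twice as sensitive" as halving a hypothetical bandwidth (60, 30 and 15 days), and runs on a synthetic series with made-up noise rather than real polls.

```python
import numpy as np

def local_linear_trend(days, support, bandwidth):
    """Gaussian-weighted local linear fit, evaluated at each poll date."""
    days = np.asarray(days, dtype=float)
    support = np.asarray(support, dtype=float)
    fitted = np.empty_like(days)
    for i, d0 in enumerate(days):
        u = (days - d0) / bandwidth
        w = np.exp(-0.5 * u**2)  # nearby polls get the most weight
        # Weighted least squares for a + b * (day - d0); a is the fit at d0.
        X = np.column_stack([np.ones_like(days), days - d0])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], support * sw, rcond=None)
        fitted[i] = coef[0]
    return fitted

# Hypothetical bandwidths in days: each step halves the previous one.
BANDWIDTHS = {"Blue": 60.0, "Red": 30.0, "Black": 15.0}

# Synthetic support series (not real polling data): a steady decline
# that turns into a mild upturn at day 180.
days = np.arange(0, 261, 4, dtype=float)
truth = np.where(days < 180, 32.0 - 0.04 * days, 24.8 + 0.03 * (days - 180))

# Add made-up sampling noise with a standard deviation of 2 points.
rng = np.random.default_rng(0)
polls = truth + rng.normal(0.0, 2.0, size=days.size)

trends = {name: local_linear_trend(days, polls, bw)
          for name, bw in BANDWIDTHS.items()}
for name, fit in trends.items():
    print(f"{name}: latest estimate {fit[-1]:.1f}")
```

Running the same function on the noiseless `truth` series shows the trade-off directly: the wider the bandwidth, the further the fitted value at day 180 sits above the true low point, which is the kind of smoothing-over the Blue line does to the end of Giuliani's decline.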

The Blue estimator was set to be especially conservative in the face of few, unevenly spaced polls, as we had in 2006 and the first half of 2007. As polling has become more frequent over the summer, that extra degree of conservatism became a liability rather than an asset. With enough new polls, a more sensitive estimator can be used. And both Red and Black pick up clear changes in Giuliani's trend.
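A back-of-the-envelope calculation shows why more frequent polling makes a narrower window affordable. The sketch below is a simplification, not the estimator itself: it treats the trend estimate as a plain average of the polls in a window, and the single-poll standard deviation, polling rates and window lengths are all hypothetical.

```python
import math

def window_standard_error(poll_sd, polls_per_month, window_days):
    """Approximate standard error of a simple average of the polls in a window."""
    n_polls = polls_per_month * window_days / 30.0
    return poll_sd / math.sqrt(n_polls)

POLL_SD = 3.0  # hypothetical sampling sd of a single poll, in points

# Sparse polling with a wide window versus dense polling with a narrow one.
sparse_wide = window_standard_error(POLL_SD, polls_per_month=4, window_days=60)
dense_narrow = window_standard_error(POLL_SD, polls_per_month=16, window_days=15)
dense_wide = window_standard_error(POLL_SD, polls_per_month=16, window_days=60)

print(f"4 polls/month, 60-day window:  {sparse_wide:.2f}")
print(f"16 polls/month, 15-day window: {dense_narrow:.2f}")
print(f"16 polls/month, 60-day window: {dense_wide:.2f}")
```

Under these assumptions, quadrupling the polling rate lets the window shrink to a quarter of its length with no loss of precision, because both cases average the same number of polls. Keeping the wide window would be more precise still, but at the cost of reacting slowly to real changes.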

You can see more modest differences for other candidates. Red and Black think McCain had a modest bump in late 2006 and early 2007, but Blue doesn't. And if you focus on the very latest data, Red sees a flattened trend, Black thinks there is a bit of an upturn, and Blue still has McCain in decline. This kind of disagreement at the end of the series, based on few polls, is common. There isn't enough data to be sure of the trend into the near future. Still, the McCain estimates are currently Blue: 13.2%, Red: 14.2% and Black: 15.4%, a range of 2.2 points.

The Romney trend has been so stable that all three estimators agree almost exactly, except at the very end. The most sensitive Black estimator sees a blip up at about the time of the Iowa straw poll, which is believable, but Red and Blue don't see it. And Black sees a bit of decline since then, which Red barely hints at and Blue doesn't see at all. The bottom line is Blue: 9.7%, Red: 9.0% and Black: 8.2%, a 1.5 point range.

The Thompson trends show a bit more short term variability, in part due to fewer polls, but perhaps also reflecting some real short term changes. Both Black and Red see two short term dips in Thompson support (of a couple of points) that Blue smooths out, and Blue doesn't detect the post-announcement blip that Red and Black pick up. The bottom line for Thompson is Blue: 20.1%, Red: 20.9% and Black: 21.2%, a range of 1.1 points in the estimates.

The Democratic trends have been a bit more stable and so the discrepancies between estimators are generally less. Red and Black see a little more dynamic in Clinton's trends early in 2007 and a small blip or two for Obama. And Red and Black think Edwards was up a couple of points in April while Blue doesn't see it. And Richardson hasn't had enough national movement for there to be any discrepancies.

As for the current estimates, Clinton is at Blue: 39.7%, Red: 40.1% and Black: 39.3%, a range of 0.8 points. Obama is at Blue: 21.8%, Red: 21.5% and Black: 22.0%, a 0.5 point range. Edwards is a bit more variable at Blue: 12.2%, Red: 13.7% and Black: 14.3%, a 2.1 point range (and an example of the more sensitive estimators thinking there is a bit of an upturn in Edwards' support).

The estimates of current support vary a bit for each candidate, but far less than the variation among recent polls. That is the point of using the estimator, after all: to remove random variation and get a better estimate of where support stands. For every candidate, the three estimates are within 2.2 points of each other at the moment, which I think is pretty good agreement regardless of how sensitive an estimator we use.
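The spread between estimators is simple arithmetic on the figures quoted in this post, and a few lines are enough to recompute it. The dictionary layout and variable names below are just one convenient way to hold the numbers.

```python
# Current estimates quoted in this post, in percent: (Blue, Red, Black).
estimates = {
    "Giuliani": (27.3, 28.3, 27.8),
    "McCain":   (13.2, 14.2, 15.4),
    "Romney":   (9.7, 9.0, 8.2),
    "Thompson": (20.1, 20.9, 21.2),
    "Clinton":  (39.7, 40.1, 39.3),
    "Obama":    (21.8, 21.5, 22.0),
    "Edwards":  (12.2, 13.7, 14.3),
}

# Range for each candidate: highest estimate minus lowest, rounded to
# one decimal place to avoid floating point residue.
ranges = {name: round(max(vals) - min(vals), 1)
          for name, vals in estimates.items()}

# Candidate on whom the three estimators disagree the most.
widest = max(ranges, key=ranges.get)
print(f"Widest disagreement: {widest}, {ranges[widest]:.1f} points")
```

The widest gap comes out as McCain's 2.2 points, which is the figure used above as the bound on disagreement across all candidates.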

But the visual pattern of the trend, in Giuliani's case especially, is not well captured by the most conservative Blue estimator. Now that there is more frequent polling, and therefore less chance that a single poll or two will distort the trend, I think we are safe using a somewhat more sensitive estimator for the trends. We'll be a bit riskier in order to also catch shorter term changes better.

As a result of this analysis, we'll be updating our trend plots for the national data to reflect a somewhat more sensitive estimator. However, we won't go too far on that, since we do believe there is a lot of noise in the polls and we want our trend estimates to resist chasing each new poll.

The state trends, however, will continue to use the more conservative estimator until enough polling becomes available to allow us to increase the sensitivity. Some states are developing a good bit of data, but even New Hampshire has only 42 Republican polls, while nationally we have 130.

And as we've been doing all along, we'll assess sensitivity and write posts on differences between the estimators that may suggest short term changes of interest. Even if we can be fooled by a sensitive estimator, it is always fun to speculate about what it might mean, if there really is a change.