Friday, January 27, 2006

Palestinian Exit Polls Faced Impossible Task

Distribution of margin separating district candidates in Palestinian Legislative Council elections.

In multi-member district voting, the percentage of the vote separating candidates indicates the intrinsic statistical difficulty of estimating winners from exit polls. For example, in a district with 5 seats, the critical question is how large the vote margin is between the 5th and 6th candidates. If that margin is large, you can be more confident that the 5th (and 1st-4th) candidates will win, and that the 6th (and 7th-nth) will lose. However, with many candidates and multiple seats, the margin separating the top candidates is likely to be rather small, meaning that the margin of error for any exit poll will almost certainly be larger than the margin between candidates. This difficulty would affect any exit poll, regardless of problems of response bias, practical difficulties in conducting the poll and so on.

As of Friday at 11:00 CST (19:00 Ramallah) the Palestinian Central Elections Commission (CEC) has not released the final and complete vote tally. They have, however, released the preliminary counts for the winning candidates. That is not the ideal data for this exercise, but is enough to at least illustrate my point and make the case while we wait for the final results.

For each of the 16 districts, I have sorted the winners by their total vote, then calculated the percentage of the vote separating each in order. So for example, if the top candidate won 35% of the vote and the second candidate won 33% then the difference between them would be 2%. Likewise if the third place candidate won 30% then the difference between him or her and the number 2 candidate would be 3% and so on. I calculate this for each of the candidates reported to win a seat by the CEC.
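The calculation just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the script used for the figures; the function name `adjacent_gaps` is made up here, and the input is the 35/33/30 example from the paragraph above.

```python
def adjacent_gaps(shares):
    """Return the gap between each candidate and the next one down.

    `shares` holds one vote share (in percent) per winning candidate.
    The list is sorted from highest to lowest before differencing.
    """
    ranked = sorted(shares, reverse=True)
    # Pair each candidate with the one ranked immediately below.
    return [higher - lower for higher, lower in zip(ranked, ranked[1:])]


# The example from the text: 35%, 33% and 30% give gaps of 2 and 3 points.
print(adjacent_gaps([33, 35, 30]))
```

A district with k winners yields k - 1 gaps, so a single-seat district contributes nothing to the distribution.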

Now, this is not the ideal data! What we really want is the vote for the candidate who just failed to get elected. So if there are 5 seats, we want the vote for the number 6 candidate to compare to the number 5. But those data aren't in yet. I'm using the gaps between winners as an indication of how closely bunched winning candidates were. It is possible that the winners are closely bunched but that all losers fell far behind them. If that turns out to be the case then my argument here falls apart. But until we get the full data, this is the best we have to work with.

So the task of the exit poll is to estimate the vote separating pairs of candidates in order to estimate who has more votes. That estimate is obviously subject to sampling error, which depends on the sample size, and to non-sampling errors that depend on response errors (voters who fail to accurately report their vote), non-response, and any other problems that prevent a completely accurate measurement of voters' actual behavior in the voting booth.

The Palestinian Center for Policy and Survey Research reports that its exit poll of 17573 voters had a margin of error of +/- 4% for the national list vote, and +/- 5-7% for the district vote, depending on the number of candidates (and the sample size in the district). So the margin of error for each candidate is 5-7%. The margin of error for the difference between two candidates is approximately twice that, so a gap must exceed roughly 10-14% to be statistically significant at the 95% confidence level (and U.S. exit polls usually require much MORE confidence than that).
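As a rough check on the "approximately twice" rule, here is a sketch of the textbook calculation. It assumes simple random sampling and that each respondent names exactly one candidate, with n respondents and true shares p_1 and p_2. Neither assumption comes from the poll's documentation, and multi-seat ballots and clustered samples would change the details.

```latex
\mathrm{Var}(\hat p_1 - \hat p_2) = \frac{p_1 + p_2 - (p_1 - p_2)^2}{n},
\qquad
\mathrm{MOE}_{\mathrm{diff}} = 1.96 \sqrt{\frac{p_1 + p_2 - (p_1 - p_2)^2}{n}}

% Two-candidate case, p_1 + p_2 = 1:
\mathrm{MOE}_{\mathrm{diff}} = 1.96 \sqrt{\frac{4\,p_1 (1 - p_1)}{n}}
  = 2 \times 1.96 \sqrt{\frac{p_1 (1 - p_1)}{n}}
  = 2 \times \mathrm{MOE}_{\mathrm{single}}
```

Under these assumptions the factor is exactly 2 only when the two candidates split the whole vote. For evenly matched candidates who each hold a smaller share it falls between about 1.4 and 2, which would not change the conclusion here.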

So what are the chances of reliably estimating who is ahead? The top figure shows that in almost 40% of the cases, any adjacent pair of candidates (that is 1st & 2nd, 2nd & 3rd, 3rd & 4th, etc) were separated by less than a percentage point. Another 30% were separated by one to two percentage points. Only about 5% of adjacent pairs were separated by 5 or more percentage points, and even a gap of that size is well within any reasonable margin of error for an exit poll.

Now, let me stress that this doesn't mean that the next candidate, the one not elected, cannot be far behind those who do win. We just don't have that data yet. Suppose a plurality of voters cast their votes for all Hamas candidates, and a smaller group of voters cast their ballots for all Fatah candidates. Then the gaps between winning Hamas candidates would be small, but the gap between Hamas and Fatah could be large, and the exit poll could detect that.

But consider that while candidates group by parties, they also have individual constituencies that may boost a particular candidate regardless of party. So some candidates do better than their party and some fall behind the party. While Hamas won most of the district seats, Fatah candidates do mix in with these winners. So we can use that to see how much gap there is between winners of different parties where this happened.

The figure below shows the gap between adjacent candidates by election district for those districts with more than one seat. I also exclude those winning candidates who were part of the quota of Christian seats. Their votes usually fall far behind the last winning Muslim candidate.

Jenin is a good example of party mixing. The top vote getters were Hamas, Fatah, Hamas and Fatah with votes of 30761, 29059, 27857 and 26909. Since there were other candidates for Hamas and Fatah on the ballot, I'd certainly expect that the 5th place candidate was close behind the 26909 of the fourth place winner. And the gap separating all these candidates is less than 2.5% of the vote. No exit poll could possibly detect such differences reliably.

In Ramallah, the top four winners were Hamas while the 5th winner was Fatah. The votes for 4th and 5th were 30679 and 22045, a gap of 3% of the total votes cast.

In Gaza the big gap comes between the 5th and 6th place candidates. Here the top five are all Hamas, while the 6th-8th place finishers were all independents. Even here though the gap between 5th and 6th is only 6%.
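To put these district gaps next to the exit poll's stated precision, here is a small sketch. It uses the rule of thumb from earlier in the post (the margin for a difference is roughly double the 5-7% per-candidate margin) and the gaps quoted above for Jenin, Ramallah and Gaza. The function names are made up for illustration, and the Jenin figure is an upper bound, since the text only says "less than 2.5%".

```python
def difference_threshold(candidate_moe_pct):
    """Rule of thumb from the post: double the per-candidate margin of error."""
    return 2 * candidate_moe_pct


def is_detectable(gap_pct, candidate_moe_pct):
    """True if the gap exceeds the approximate margin for a difference."""
    return gap_pct > difference_threshold(candidate_moe_pct)


# Gaps in percentage points, as quoted in the text.
district_gaps = {
    "Jenin": 2.5,     # upper bound: "less than 2.5%" across the top four
    "Ramallah": 3.0,  # 4th vs 5th winner
    "Gaza": 6.0,      # 5th vs 6th winner
}

for moe in (5, 7):  # the reported range of district-level margins of error
    for district, gap in district_gaps.items():
        verdict = "detectable" if is_detectable(gap, moe) else "not detectable"
        print(f"{district}: gap {gap}% vs threshold {difference_threshold(moe)}% -> {verdict}")
```

Even the largest of these gaps, Gaza's 6%, falls short of the 10% threshold implied by the most optimistic district margin of error.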

So, when we ask exit polls to do the impossible, we shouldn't be too surprised if the results are less than satisfying. In U.S. exit polling, few races are called solely on the basis of exit poll results in anything less than blow-out races. If the results are at all close, actual tabulated votes from sample precincts are used to augment the exit poll results. In very close races, the lead rarely becomes statistically significant even with the combination of exit poll and precinct returns, so a call has to await more complete county-level vote reports.

So attempting to call the district results in the multi-member Palestinian Legislative Council races was probably not a good idea. While I eagerly await the full vote results to test whether this claim holds for the candidates who fell just short of being elected, I think the current preliminary evidence is strong enough to remind us of the limits of exit polls even in theory, ignoring all the practical problems that also make them less perfect than theory would allow.

Links: See Matthew Shugart's discussion of these problems from a different perspective at Fruits and Votes, and Mark Blumenthal's take at MysteryPollster.