That's a long story. Let me say first of all that the main point of the exit polls is not to project who will win the election -- although the exit poll interviews are combined with vote count data in order to make projections. Mostly, the exit polls are used to provide information about who voted for whom and why. This purpose explains why the exit poll questionnaires are so long -- containing up to 30 questions.
In 2004, the exit pollsters chose anywhere from 15 to 55 precincts in each state for their exit poll sample: a total of 1480 precincts nationwide. Interviewers were sent to the corresponding polling places, some of which host more than one precinct. At some polling places, interviewers were instructed to approach every voter; at most polling places, they were instructed to approach every "Xth" voter, where X could be anywhere from 2 to 10. This instruction is intended to provide a random sample within the precinct; the value of X is chosen to obtain about 100 completed questionnaires per polling place.
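To make the "every Xth voter" idea concrete, here is a minimal sketch in Python. The formula and the function names are my own illustration, not the pollsters' documented procedure: it assumes the expected turnout at a polling place is known in advance and that roughly half of the approached voters complete the questionnaire.

```python
def sampling_interval(expected_voters, target_completes=100,
                      response_rate=0.5, max_interval=10):
    """Pick X so that approaching every Xth voter yields about
    target_completes questionnaires.

    Assumptions (hypothetical, for illustration only): expected_voters
    is known ahead of time, and response_rate is the share of
    approached voters who agree to take part.
    """
    approaches_needed = target_completes / response_rate
    interval = round(expected_voters / approaches_needed)
    # X = 1 means "approach every voter"; the text gives 10 as the largest X.
    return max(1, min(max_interval, interval))


def voters_to_approach(expected_voters, interval, start=1):
    """1-based positions in the stream of exiting voters to approach."""
    return list(range(start, expected_voters + 1, interval))


if __name__ == "__main__":
    x = sampling_interval(expected_voters=1000)
    print("approach every", x, "th voter")
    print(voters_to_approach(20, x))
```

A small polling place ends up with X = 1 (approach everyone), and a very large one is capped at X = 10, which matches the range described above.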
The interviewers ask each approached voter to complete the exit poll questionnaire (attached to a clipboard). Slightly over half of the approached voters agreed to complete the questionnaire. Three times during the day, the interviewers call in tallied results for reported votes (presidential and often statewide races), plus all the answers on a subsample of questionnaires. In some states with many early/absentee voters, the exit pollsters also conduct telephone surveys of these early voters. (In vote-by-mail Oregon, only a telephone survey is conducted.)
Back at 'exit poll central,' the exit pollsters use these data in two ways: to make projections and tabulations. Projections are estimates of the vote shares in each state (and DC), together with measures of the uncertainty in those estimates. These projections initially are based on the interview tallies, which are compared with past results from the same precincts. Sometimes the interview tallies alone suffice to "call" the winner of a state with great confidence, especially if the race was not expected to be very close to begin with. Often the interview tallies are inconclusive. In these more competitive states, the exit pollsters continually update their projections to incorporate actual vote counts. The analysts examine both "quick counts" from the exit poll precincts and selected other precincts, and cumulative county-level totals. As quick counts become available, they replace the interview data in the projections.
Tabulations "break down" the vote share by various categories: for instance, gender, family income, or religious affiliation. These tabulations are computed for each state and for a special national sample. Preliminary tabulations were released for each state as the polls closed, and later updated to match the updated projections -- in effect, to match the official returns. (Actually, the initial tabulations were weighted to match the composite projections, which incorporate not only interview data but pre-election expectations.) National tabulations were updated several times, incorporating interview data and official returns from various states as they became available. Such tabulations, as posted on CNN.com, became the basis for much of the exit poll controversy.
It depends, of course. Most attempts to argue that exit polls are highly accurate strangely steer around U.S. national exit polls; Steve Freeman, for instance, in his well-known 2004 paper, focuses on Germany and Utah. There is no single measure of exit poll accuracy, and even if there were, we wouldn't know what it equaled for all past U.S. exit polls. If you look closely, many of the arguments boil down to assertions that the exit polls should be accurate.
One useful accuracy measure is the "Within Precinct Error" (WPE), which basically equals the difference, in percentage points, between the exit poll margin and the official margin for each precinct. By convention, WPE is positive if the Republican candidate does better in the exit poll than in the official count, and negative if the Democratic candidate does better in the exit poll. (For instance, if Kerry led in a precinct exit poll by 5 points, but trailed in the official count by 3 points, the WPE would be -8 points.)
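The sign convention is easy to get backwards, so here is a minimal sketch of the calculation in Python. The function name and the specific vote shares are hypothetical; only the 5-point exit poll lead, the 3-point official deficit, and the resulting -8 come from the example above.

```python
def within_precinct_error(exit_rep, exit_dem, official_rep, official_dem):
    """WPE in percentage points for one precinct.

    Each margin is Republican minus Democratic, so the result is
    negative when the Democrat does better in the exit poll than
    in the official count.
    """
    exit_margin = exit_rep - exit_dem
    official_margin = official_rep - official_dem
    return exit_margin - official_margin


if __name__ == "__main__":
    # Hypothetical shares: Kerry leads the exit poll by 5 points
    # and trails the official count by 3 points.
    print(within_precinct_error(exit_rep=47.5, exit_dem=52.5,
                                official_rep=51.5, official_dem=48.5))
```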
[1/6/07: WPE, of course, isn't a perfect accuracy measure -- for obvious and non-obvious reasons. The obvious reason, and the reason that motivates the entire exit poll debate, is that a large "error" could reflect an inaccurate vote count instead of (or as well as) an inaccurate poll result. So, like all measures, WPE has to be interpreted cautiously.]
We know that in the last five large-scale presidential exit polls*, the average WPE has always been substantially negative, overstating the Democratic performance: -2.2 points in 1988, -5.0 points in 1992, -2.2 points in 1996, -1.8 points in 2000, and -6.5 points in 2004. (See page 34 of the evaluation report.) So, as far as we can tell from WPE, no recent exit poll has been accurate within the margin of error. And the 1992 survey was almost as far off as the 2004 survey. As Mark Blumenthal has pointed out, the documentary The War Room confirms that the actual 1992 exit poll projections -- at least at midday -- overstated Bill Clinton's performance. Few people noticed at the time: partly because the exit polls were not leaked on the Internet, partly because the discrepancies only altered the magnitude of Clinton's victory.
* Fine print: (1) I said "large-scale" because the Los Angeles Times has conducted national presidential exit polls, but those are much smaller. (2) In 1988, each network conducted its own exit polls; the WPE figure here is for CBS, the exit poll on which Warren Mitofsky worked. Later exit polls have been conducted by a series of network-sponsored consortiums.
Another useful accuracy measure is the Call 3 (end-of-day) "Best Geo Error" for each state. The Best Geo Error is the discrepancy between the vote share estimate, based on interview data, and the official returns. (Like WPE, the Best Geo Error is reported as a difference in margins, in percentage points.) The vote share estimates can vary substantially from the raw percentages used to calculate WPE; they incorporate information on turnout, estimates of early and absentee voting (often based on telephone polls), and comparisons with past races. The average state Best Geo Error in 2004 was -5.0 points, somewhat smaller in magnitude than the mean WPE of -6.5 points.
Probably not, although they certainly may contribute. Greg Palast offers an estimate of 3.6 million (or "3,600,380 to be exact") uncounted ballots in 2004 alone. In Palast's account, these include about 1.4 million spoiled ballots (ballots for which no presidential vote was counted, such as the infamous "hanging chad" ballots in Florida) -- also known as "residual votes" or "overvotes and undervotes." They also include about 1.1 million uncounted provisional ballots and over half a million absentee ballots. These figures do not add up to 3.6 million, and it isn't clear where they come from, how accurate they are, or what proportion of these ballots were cast for Kerry. (Electionline.org reported approximately 528,000 uncounted provisional ballots nationwide, although these figures were known to be incomplete.) We can at least say that many ballots go uncounted in each election, and there is good reason to believe that these uncounted ballots are disproportionately Democratic. (It is very hard to say how disproportionately Democratic they are.)
Whatever advantage uncounted ballots have conferred on Republican candidates in the past, they are unlikely to account for much of the exit poll discrepancies. There is no obvious relationship between uncounted ballots and exit poll results. For instance, New Hampshire has had double-digit exit poll discrepancies (Within Precinct Error) in three of the last four presidential elections (evaluation report, page 33), but its residual vote rate was 1.7% in 2000 and 1.2% in 2004. (Incidentally, 2000 was the election without a double-digit WPE in New Hampshire. Also, an extensive manual recount in New Hampshire in 2004 indicated -- as characterized by an activist who had pressed for the recount -- "minimal and statistically insignificant" differences [additional details here]. This result tends to support the inference that the exit poll result was substantially erroneous and should not be attributed to either uncounted ballots or vote-switching.)
One might suppose that uncounted ballots could at least account for a mean WPE of about -2, arguing like this: if 3% of votes are never counted, and if these uncounted votes skew 80:20 to Democrats (2.4% of total vote to 0.6%), then they cost Democrats about 1.8% on the margin. However, the arithmetic is less favorable to this analysis than one might suppose, because Democratic votes lost in heavily Democratic (or heavily Republican) precincts have minimal impact on expected WPE in those precincts. For instance, if a precinct's voters actually cast 90% of their votes for the Democratic candidate, but fully 5% of those Democratic votes go uncounted, the Democrat will end up with about 89.5% of the counted vote: a margin of about 79 points instead of 80, for an expected WPE of about -1 in that precinct.
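The precinct arithmetic can be checked with a short Python sketch. It rests on simplifying assumptions of my own: a two-candidate race, an exit poll that perfectly reflects the votes as cast, and losses that hit only the stated share of each candidate's ballots. The function name is hypothetical.

```python
def expected_wpe_from_uncounted(dem_share, dem_loss_rate, rep_loss_rate=0.0):
    """Expected WPE (points) in a two-candidate precinct when some
    ballots go uncounted.

    dem_share is the Democratic share of votes as cast (0 to 1).
    Assumes the exit poll matches the votes as cast exactly.
    """
    rep_share = 1.0 - dem_share
    dem_counted = dem_share * (1.0 - dem_loss_rate)
    rep_counted = rep_share * (1.0 - rep_loss_rate)
    total_counted = dem_counted + rep_counted

    exit_margin = rep_share - dem_share  # Republican minus Democratic
    official_margin = (rep_counted - dem_counted) / total_counted
    return 100.0 * (exit_margin - official_margin)


if __name__ == "__main__":
    # The 90% Democratic precinct from the text, losing 5% of Democratic votes.
    print(round(expected_wpe_from_uncounted(0.90, 0.05), 2))
    # The same loss rate in an evenly split precinct, for comparison.
    print(round(expected_wpe_from_uncounted(0.50, 0.05), 2))
```

Under these assumptions the lopsided precinct shows an expected WPE of a bit under one point, while the same 5% loss in an evenly split precinct would produce roughly two and a half points. That is why losses concentrated in heavily Democratic precincts move WPE so little.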
One certainly can't rule out a priori that fraud (and/or spoilage) might account for at least part of the 1992 exit poll discrepancies. But as far as I know, no one has made a serious attempt to argue that George H. W. Bush committed double-digit fraud in New Jersey that year (as the exit polls might be taken to suggest) -- or, more generally, to explain how Bush stole perhaps 5 million net votes that year, and why he bothered. In the end, the argument seems circular at best. People who began by asserting that exit polls are accurate end up asserting that the 1992 exit polls possibly may have been accurate. And then, in my experience, they change the subject. If anyone can direct me to any sort of fraud investigation triggered by the 1992 exit polls, please do.
[1/6/07: TIA seems mystified by my analysis of how spoiled ballots would affect WPE. I realize that intuitively, if the vote count is altered by two points on the margin, the 'exit poll result' should be 'off' by two points. In fact, spoiled ballots concentrated in heavily Democratic precincts would affect the turnout assumptions in the exit poll projection models -- but, as I demonstrated, they would have much less effect on WPE, which is what we are discussing here. For this reason and the others stated, it is unconvincing to attribute the observed past WPEs to spoiled ballots or to fraud.]
Here is how Mitofsky International's website puts it: "[Mitofsky's] record for accuracy is well known. 'This caution in projecting winners is a Mitofsky trademark, one which has served him well...,' said David W. Moore, the managing editor of the Gallup Poll in his book, The Super Pollsters." In other words, Mitofsky very rarely "called" or predicted the winner incorrectly. (Mitofsky died on September 1, 2006; as of this writing, the page I have cited is still active.)
If Mitofsky's calls were rarely wrong, doesn't this mean that the exit poll data must be highly accurate? No, it doesn't. One reason for Mitofsky's success was that he avoided making calls in close races based on interview data alone. Edison/Mitofsky (the firms that jointly conducted the 2004 exit poll) did not make any incorrect projections in 2004. Perhaps people who believe that the exit polls evince fraud should take Mitofsky's "caution in projecting winners" more seriously.
[1/6/07: TIA's response to this passage exemplifies the circularity of many exit poll arguments. He says that Mitofsky's "final exit poll accurately projects a fraudulent recorded vote count; the preliminary exit polls closely matche the TRUE vote. Butt very few individuals get to see those numbers, since they are 'not for public viewing'. Only spreadsheet bloggers get to use them." (Errors in original.) Oh-kay. So, TIA doesn't know what the preliminary exit poll results are, but he knows that they closely match the TRUE vote? Thus, the claim that we know exit polls have been accurate quickly degenerates into the assertion that exit polls must have been accurate.]
It depends on what one means by "the exit polls" and "won." As I explained above, there are really 51 different exit polls (if one counts the telephone-only poll in Oregon), one for each state plus D.C. For each state exit poll, we now know the final projection based on interview data alone (called the Call 3 Best Geo), as well as the pollsters' estimate of the uncertainty in each projection. (See the table on pages 21-22 of the exit poll evaluation report.) The final interview-only projection for Ohio showed Kerry ahead by 6.5 points with a "standard error" of 3.9 points. Using the conventional 95% standard for "margin of error," the margin of error would be 7.8 points. Using the 99.5% standard that the exit pollsters used as the first (not only) criterion for a "call status," the margin of error was over 10 points. So, Kerry's apparent lead in Ohio was within the margin of error. Kerry led in three other interview-only projections in states that Bush eventually won; all three were also within the margin of error. The election was too close to call based on exit poll data alone.
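For readers who want to check these figures, here is a minimal Python sketch that converts a standard error into a margin of error. It assumes a normal approximation and a two-sided interval, which is my reading rather than anything the evaluation report spells out; the function name is hypothetical. The text rounds the 95% multiplier up to 2, which is why it gives 7.8 where the exact multiplier gives a slightly smaller number.

```python
from statistics import NormalDist


def margin_of_error(standard_error, confidence=0.95):
    """Two-sided margin of error under a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * standard_error


if __name__ == "__main__":
    ohio_lead = 6.5            # Kerry's interview-only lead, in points
    ohio_standard_error = 3.9  # from the evaluation report, as cited above

    for level in (0.95, 0.995):
        moe = margin_of_error(ohio_standard_error, level)
        verdict = "within" if ohio_lead < moe else "outside"
        print(f"{level:.1%} standard: +/- {moe:.1f} points; lead is {verdict}")
```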
However, the national sample had about 12,000 respondents, and it gave John Kerry about a three-point margin. If the national exit poll were a simple random sample, its 95% margin of error on the margin would be about 1.8 points -- so Kerry's lead appears to be outside the margin of error. The pollsters did not calculate an uncertainty estimate for the national sample, because they do not make projections for the popular vote. If they had, even Kerry's lead in the national sample would probably have been within the margin of error, at least using the 99.5% standard.
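The simple-random-sample figure can be reproduced with the textbook formula for the difference between two shares from the same sample. This Python sketch is only that textbook calculation, not the pollsters' method; the 51/48 split is a hypothetical stand-in for "about a three-point margin," and the function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def srs_margin_moe(share_a, share_b, n, confidence=0.95):
    """Margin of error (in points) on the lead share_a - share_b,
    assuming a simple random sample of n respondents.

    Uses Var(a - b) = (a + b - (a - b)**2) / n for two shares
    estimated from the same multinomial sample.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    variance = (share_a + share_b - (share_a - share_b) ** 2) / n
    return 100.0 * z * sqrt(variance)


if __name__ == "__main__":
    # Hypothetical shares giving a three-point lead with 12,000 respondents.
    print(round(srs_margin_moe(0.51, 0.48, 12_000), 2))
```

Because exit polls are clustered by precinct rather than drawn as simple random samples, this number understates the real uncertainty.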
(Note that the concept of "margin of error" is widely misunderstood: see point 3.9 below.)
[1/6/07: TIA asserts that the 99.5% standard is, more or less, "a canard that was first suggested by the Mystery Pollster." Well, no. For instance, the 2001 Konner et al. review of Election Night 2000 notes on p. 13 that the CNN/CBS Decision Team used a 99.5% standard as one of its criteria for call status. TIA goes on to point out, again, that the exit poll discrepancies cannot be attributed to random chance. Nonetheless, as I noted, the exit poll in Ohio did not indicate a Kerry lead beyond the margin of error, even using a 95% criterion; the election was too close to call based on the exit polls. There is nothing paradoxical or sneaky about saying both these things.]
TruthIsAll sometimes has argued that the exit polls should be treated as simple random samples (like drawing marbles from a hat). In this instance, the margin of error for Ohio, with a reported sample size of 2040, would be about 4.5 points on the margin using the 95% standard. There are two problems with this reasoning. First, the exit polls are not simple random samples; they are limited to a relatively small number of precincts (49 in Ohio), and this limitation increases the statistical uncertainty. Second, the pollsters do not use a textbook formula to calculate their margins of error. Instead, they examine the actual deviations of their exit poll samples from past results in the same precincts. Ideally, all these deviations would be of the same size, in the same direction. (For instance, hypothetically, the poll might show Bush doing 2 points better everywhere in 2004 compared to 2000 -- although a result that neat would be extraordinarily unlikely.) The greater the variability in these deviations, the larger the margin of error.
So, the pollsters' estimates of uncertainty (margins of error) were relatively large because the precinct-level results varied widely, compared with past returns. This wide variation could be an indicator of problems with the exit poll interviews. (One source of variation is that, as I mentioned, some of the polling places contain multiple precincts. Because the interviewers have no way to tell which voters come from which precincts, they interview voters from all the precincts -- but the interview results are compared with past returns from the "intended" precinct only.)
That table (on page 2 of the national methods statement) applies to percentages in the tabulations, not to the vote projections.
[1/6/07: TIA adjures us to "[r]ead the Edison - Mitofsky notes at the bottom of this 12:22am NEP screen shot," but the screenshot isn't visible. If TIA has documentary evidence that the standard errors reported in the evaluation report (pp. 21-22) are faked, that would certainly be interesting.]
Yes: overall, and in many states, the exit poll results differed from the official results by beyond the margin of error, overstating Kerry's performance. (This overstatement is often called red shift, meaning that the "red candidate" Bush did better in the official returns than in the exit polls.) For instance, I noted above that in Ohio, Kerry led in the best interview-only estimate by 6.5 points with a "standard error" of 3.9 points. A 95% confidence interval for the margin is about double the size of the standard error: plus or minus 7.8 points. So Kerry's lead was less than the margin of error, and Ohio was too close to call based on the interview data (even if the pollsters accepted nominal 95% confidence, which they don't). However, since Bush officially won Ohio by 2.1 points, the exit poll discrepancy in Ohio (based on this estimate) was 8.6 points. That discrepancy is beyond the margin of error, at least at a 95% confidence level.
As I mentioned earlier, it turns out that at least the last five presidential exit polls have had overall discrepancies (measured as Within Precinct Error) outside the margin of error, but the 2004 discrepancies were the largest. We don't know how many states were outside the margin of error.
[1/6/07: TIA unaccountably responds with an outright misstatement: that "THE MOE WAS EXCEEDED IN SIXTEEN STATES, ALL IN FAVOR OF BUSH - NONE FOR KERRY." That has nothing to do with my point, but it also isn't true. The exit poll error in South Dakota was 6.8 points with a standard error of 3.3 -- a statistically significant (at the 95% level) overstatement of Bush's vote share.]
Margins of "error" refer to random sampling error. Most survey researchers would say that results outside the calculated margin of error most likely evince non-sampling error in the survey, such as non-response bias, sampling bias, or measurement error. The statistical "margin of error" assumes an unbiased sample, but competent survey researchers are rarely in a position to assume that they actually have unbiased samples.
Many people are under the mistaken impression that larger surveys are inherently more accurate. Larger surveys do have smaller margins of (sampling) error, but they are not inherently less vulnerable to non-sampling error. For instance, the 1936 Literary Digest presidential poll had a huge sample size of over 2.2 million respondents (out of 10 million post cards mailed), giving it a nominal margin of error of less than 0.1%. In the poll, Alf Landon held a dominating lead over Franklin Delano Roosevelt, with 57% of the projected vote. In the actual election, Landon got under 37%. The statistical odds against this outcome appear to be larger than anything ever reported by TruthIsAll. But the sample was not random: the mailing lists used to address the post cards tended to favor more prosperous voters, and in 1936 this class bias turned out to be catastrophic.
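A short Python sketch shows how small the nominal sampling error was, and how many standard errors the actual miss amounted to. It treats the poll as if it were a simple random sample, which is exactly the assumption that failed, and it uses 37% as an upper bound on Landon's actual share. The function name is hypothetical.

```python
from math import sqrt


def nominal_standard_error(share, n):
    """Standard error of a proportion, assuming a simple random sample."""
    return sqrt(share * (1.0 - share) / n)


if __name__ == "__main__":
    respondents = 2_200_000
    poll_share = 0.57     # Landon's share in the Literary Digest poll
    actual_share = 0.37   # upper bound; Landon got under 37%

    se = nominal_standard_error(poll_share, respondents)
    print(f"nominal 95% margin of error: {100 * 1.96 * se:.3f} points")
    print(f"miss in standard errors: {(poll_share - actual_share) / se:.0f}")
```

Under the simple-random-sample assumption, the poll missed by hundreds of standard errors. No amount of sample size protects against a biased sample.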
(People sometimes misinterpret these points as arguing that we should assume that fraud did not occur in 2004 -- whereupon they protest that other evidence points to fraud in 2004. But my point here has nothing to do with whether fraud occurred in 2004. It is about whether large survey errors should be interpreted as evidence of fraud.)
[1/6/07: TIA points out that modern surveys are much better than the Literary Digest poll was. That is true, but misses the point entirely. The point, again, is that statistical "margins of error" don't incorporate possible non-sampling error. TIA apparently chooses to believe that non-sampling error does not exist, but that is his article of faith, not the opinion of survey researchers. TIA further concludes, "If all the votes were counted and everyone eligible was allowed to vote, the Democrats would win every election." This assertion suggests the limits of his faith in "scientific polling," since it certainly isn't supported by survey research.]
No, the largest exit poll discrepancies were generally not in battleground states. Using the "IM WPE" statistic (which averages the WPEs for all precincts in each state, as opposed to other methods that trim extreme values), the largest discrepancies were in Mississippi (-18.5), Connecticut (-16.0), Delaware (-15.9), Vermont (-15.2), and New Hampshire (-14.0). Using the actual interview-only projections, the largest discrepancies were in Vermont (-16.5), Delaware (-16.0), New York (-13.9), New Hampshire (-13.6), and Mississippi (-13.1). Of these six states, only New Hampshire was a battleground state. (It is true, however, that the average discrepancy in the battleground states was larger than the average discrepancy in other states. Edison/Mitofsky report that at the precinct level, the average WPE was -7.9 for precincts in 11 "swing states," and 'only' -6.1 for precincts in other states.)
[1/6/07: TIA complains that my "margin-based" measure is "misleading," but why? Every discussion of "Within-Precinct Error" is based on a difference in margin. For that matter, most discussions of pre-election polls are based on margins: who is ahead by how much. Next, TIA asserts, "the NY exit poll said that Kerry won by 63-36%." Actually, the interview-only best estimate gave Kerry 65.1 percent of the vote. TIA apparently is citing an estimate that incorporates the pre-election polls -- which seems like a rookie mistake to be making over two years later. And yet, in his response to section 4.11, TIA will assert, "How likely is it that Kerry won NY by over 30 points? Very." Hmm.]