Elections are fast approaching, and you might be wondering how reliable the results from exit polls can really be. In other words, can you tell with a certain level of confidence which candidate will win before the official results are known? In particular, consider the state governor elections.
Let’s suppose for the sake of simplicity that only two candidates are running: Candidate A and Candidate B. If we split the population of state voters into two classes (those voting for Candidate A and those voting for Candidate B), then the population can be characterized by the proportion of people voting for Candidate A (p) and the proportion voting for Candidate B (1 – p).
In the absence of any historical data about the popularity of each candidate, you could consider the worst-case scenario, which would occur in a very tight race where Candidate A gets just over 50% of the votes (say, 51%). We consider this a pessimistic scenario because the sample size needed to demonstrate that a candidate received the majority of votes (i.e., more than 50%) decreases as the difference between the true proportions of votes for Candidates A and B widens. Thus, assuming that the true proportion of votes for Candidate A is 0.51 results in a larger sample size than you would need if this proportion were larger than 0.51.
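To make the pessimistic scenario concrete, here is a minimal sketch of how the required sample size shrinks as the race gets less tight. It assumes the textbook normal approximation for a one-sided 95% lower bound, which is not necessarily the algorithm a given statistics package uses, and the function name `sample_size_lower_bound` is purely illustrative.

```python
import math
from statistics import NormalDist


def sample_size_lower_bound(p, margin, confidence=0.95):
    """Approximate n so that a one-sided lower bound sits `margin` below p.

    Assumption: normal approximation to the binomial, not an exact method.
    """
    z = NormalDist().inv_cdf(confidence)  # one-sided critical value, about 1.645 at 95%
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# The margin of error is the gap between the assumed true proportion and 0.5.
sizes = {p: sample_size_lower_bound(p, p - 0.5) for p in (0.51, 0.52, 0.55, 0.60)}
for p, n in sizes.items():
    print(f"true proportion {p:.2f}: about {n} voters")
```

Under this approximation, the 0.51 case needs several thousand voters, while a proportion of 0.55 needs only a few hundred.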
An alternative to a formal hypothesis test is to calculate the sample size needed to produce a confidence interval that indicates, with a stated level of confidence, that the proportion of votes received by Candidate A is above 0.5. If the true proportion of votes going to Candidate A is really 0.51, then the sample size is a function of this interval’s margin of error. The margin of error is the difference between the proportion estimate (based on the data) and the lower bound estimated by the confidence interval. In this particular scenario, the lower bound must not fall below 0.50, so the margin of error is 0.51 – 0.50 = 0.01. You can calculate the required sample size even before collecting any data!
Minitab Statistical Software’s Sample Size for Estimation command (under the Stat > Power and Sample Size menu) allows you to calculate the sample size needed to construct a 95% confidence interval with a 1% margin of error. If the planned proportion is estimated to be 0.51 with a margin of error of 0.01 (under Options set it to estimate a Lower Bound confidence interval), the sample size needed is approximately 7,000.
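As a rough cross-check on that figure, the sample size can be worked out by hand with the normal approximation. This is a sketch under that assumption, not a description of the software's own calculation. Here z is the one-sided 95% critical value and E is the margin of error:

$$
n \approx \frac{z_{0.95}^{2}\, p\,(1-p)}{E^{2}} = \frac{(1.645)^{2}\,(0.51)(0.49)}{(0.01)^{2}} \approx 6{,}760
$$

This lands in the same range as the roughly 7,000 quoted above. A method that does not rely on the normal approximation can ask for a somewhat larger sample.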
We can then proceed to collect a random (representative) sample of voters in a state and include them in this exit poll study. If p were truly 0.51, you would expect the lower confidence bound to have a margin of error very close to 0.01.
Suppose that of the 6,864 voters sampled, 3,528 voted for Candidate A. Then the lower confidence bound exhibits, as expected, a margin of error of about 1% (subtracting the lower bound 0.503987 from the estimate 0.513986 gives 0.009999).
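The arithmetic behind this check can be sketched in a few lines. The example below assumes a simple normal-approximation (Wald) lower bound. The bound it produces differs slightly, in the fourth decimal place, from the 0.503987 quoted above, presumably because the software uses a different interval method. That explanation is an assumption, not something stated in this post.

```python
import math
from statistics import NormalDist

n, votes_a = 6864, 3528          # sample size and votes for Candidate A, from the text
p_hat = votes_a / n              # estimated proportion for Candidate A

# One-sided 95% lower bound under the normal approximation (an assumption).
z = NormalDist().inv_cdf(0.95)
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - margin

print(f"estimate    = {p_hat:.6f}")
print(f"margin      = {margin:.6f}")
print(f"lower bound = {lower:.6f}")
print("lower bound above 0.5:", lower > 0.5)
```

Either way, the lower bound stays above 0.5, which is what the study was sized to show.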
Assumptions are important when running a statistical test. In this case, we have assumed that voters can only choose between two alternatives and that votes cannot be nullified or voided. In addition, we assume the population is large enough to be considered infinite (a reasonable assumption if the voting-age population in a state is in the millions). In a future post, I will explore how to compensate for finite populations.