Predicting the U.S. Presidential Election: Evaluating Two Models (Part One)

You may have read about statistical models that claim to predict the outcome of the upcoming Presidential election. It’s easy to imagine that these models are complicated and contain many demographic, sociological, economic, and political factors. However, I was surprised to read in an article that two simple models supposedly generate accurate predictions.

Both of these models use stock market data. One model is based on the Dow Jones and the other on the S&P 500. Statistics are best when they are a hands-on experience, so while neither study included the data, I obtained both the stock market data and election data so we can try these models ourselves using Minitab statistical software!

We’ll evaluate both models. If the models are satisfactory, we’ll use them to make predictions for the upcoming Presidential election. Today, we’ll evaluate the Dow Jones model and tomorrow the S&P model. You can get the worksheet for the Dow Jones model here.

Model 1: Three-Year Change in the Dow Jones Industrial Average

The first model comes from a recent study titled “Social Mood, Stock Market Performance and U.S. Presidential Elections” by Prechter, et al.

The researchers find a positive, significant relationship between several outcomes for presidential elections that have an incumbent and the percentage change in the Dow Jones over a 3 year period. Each three-year period extends from November 1 after the previous election through October 31 of the year of the election.
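To make the predictor concrete, here is a minimal Python sketch of the percentage-change calculation for one three-year window. The function name and the index levels are made up for illustration; they are not actual DJIA values and are not taken from the study.

```python
def percent_change(start_value: float, end_value: float) -> float:
    """Return the percentage change from start_value to end_value."""
    if start_value == 0:
        raise ValueError("start_value must be nonzero")
    return (end_value - start_value) / start_value * 100.0


# Hypothetical index levels, for illustration only (not real DJIA closes).
start_level = 1000.0  # level on November 1 after the previous election
end_level = 1250.0    # level on October 31 of the election year

print(percent_change(start_level, end_level))
```

The result is expressed in percent, which matches the units of the predictor used in the regression later in this post.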

Their theory states that the stock market is a proxy variable for social mood, not that the stock market directly affects voting. The stock market is a good measure of social mood because if society feels positive enough to invest more money in the stock market, they are presumably happy with the status quo, which could favor the incumbent.

The Dow Jones Industrial Average (DJIA) data back to 1897 was easy to obtain from the Federal Reserve Economic Data (FRED) web site. For elections prior to 1896, I use the Foundation for the Study of Cycles data set that the study used. This data set uses market data from earlier indices to create a longer DJIA.

The study looks at the percentage change in the DJIA over different lengths of time (2-4 years) and how it correlates to different election outcomes. The researchers also include the traditional big three predictors of Presidential elections: economic growth, inflation, and unemployment. The study concludes that the three-year change in the DJIA is the best predictor. Further, when the DJIA predictor is included in the model, the other “Big Three” predictors become insignificant.

I’m going to test the three-year model to determine if it can predict whether the incumbent wins or loses. I also include other election outcomes in the worksheet if you want to try those out.

Assessing the Model with Binary Logistic Regression

Because the election outcome for the incumbent only has two possible values (Win or Lose), we need to use Binary Logistic Regression. And, we have one predictor, the percentage DJIA change over three years. We get the results below.

Binary logistic output for the Dow Jones model

The p-value for the Dow Jones change is 0.025, which is statistically significant at the conventional 0.05 level. Further, the odds ratio is 1.10, which indicates that every one percentage point increase in the three-year Dow Jones change is associated with the odds of the incumbent winning being multiplied by 1.10 (a 10% increase in the odds). The goodness-of-fit tests (not shown) all have very high p-values (greater than 0.6), so they give no evidence that the model fits poorly.
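An odds ratio acts on the odds of winning rather than directly on the probability, so the change in probability depends on where you start. The short Python sketch below illustrates this using the 1.10 odds ratio from the output; the helper names and the starting probabilities are my own choices for illustration, not values from the model.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


def shift_probability(p: float, odds_ratio: float, units: float = 1.0) -> float:
    """Probability after the predictor rises by `units`, given the odds ratio per unit."""
    new_odds = probability_to_odds(p) * odds_ratio ** units
    return odds_to_probability(new_odds)


ODDS_RATIO = 1.10  # odds ratio reported in the output above

# Illustrative starting probabilities (not taken from the model).
for start in (0.20, 0.50, 0.80):
    after = shift_probability(start, ODDS_RATIO)
    print(f"{start:.2f} -> {after:.4f}")
```

The same one-point rise in the Dow Jones change moves a 50% probability more than it moves a 20% or an 80% probability, which is why the odds ratio is best read as a multiplier on the odds.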

We should also look at the concordant/discordant pairs in the output:

Binary logistic output for the Dow Jones model

This portion of the output indicates whether the predicted event probabilities of the binary logistic regression model match the observed outcomes. To do this, Minitab compares all pairs of observations that have different response values (Win or Lose) and their predicted event probabilities.

  • If the predicted probability of success is higher for the observation corresponding to a "success," the pair is considered concordant.
  • If the predicted probability of success is higher for the observation corresponding to a "failure," the pair is considered discordant.
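The pair comparison described above can be sketched in a few lines of Python. This is a simplified illustration rather than Minitab's implementation: it counts exact ties separately, and the outcomes and probabilities are invented for the example.

```python
from itertools import product


def pair_summary(outcomes, probabilities):
    """Count concordant, discordant, and tied pairs.

    outcomes: list of bool, True for a "success" (Win), False for a "failure" (Lose).
    probabilities: predicted probability of success for each observation.
    """
    wins = [p for y, p in zip(outcomes, probabilities) if y]
    losses = [p for y, p in zip(outcomes, probabilities) if not y]
    if not wins or not losses:
        raise ValueError("need at least one success and one failure")

    concordant = discordant = tied = 0
    # Compare every success with every failure.
    for p_win, p_loss in product(wins, losses):
        if p_win > p_loss:
            concordant += 1
        elif p_win < p_loss:
            discordant += 1
        else:
            tied += 1

    return {
        "pairs": len(wins) * len(losses),
        "concordant": concordant,
        "discordant": discordant,
        "tied": tied,
    }


# Made-up example data, not the election worksheet.
example_outcomes = [True, True, True, False, False]
example_probabilities = [0.9, 0.7, 0.4, 0.5, 0.2]

summary = pair_summary(example_outcomes, example_probabilities)
print(summary)
print(f"percent concordant: {100.0 * summary['concordant'] / summary['pairs']:.1f}")
```

In this toy data set, the only discordant pair is the win predicted at 0.4 compared with the loss predicted at 0.5.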

For the Dow Jones model, 86.3% of the pairs are concordant, which is excellent! There are nearly seven times as many concordant pairs as discordant pairs. In other words, the predicted event probabilities generally agree with the observed outcomes.

Predicting the Election with Minitab

The authors of the study specifically don’t use the model to predict the election's outcome because, for them, it's a study designed to determine the stronger influence in voting behavior. But why let that stop us from using their model to make a prediction? We have determined that the three-year percentage change in the DJIA is a significant predictor and that the model produces accurate event probabilities.

So, we just need to enter the percentage change in the Dow Jones from November 1, 2009 to October 31, 2012 (33.8%) in the Prediction subdialog for binary logistic regression. Enter 33.8 and we get the following prediction output.

Prediction based on the binary logistic regression of the Dow Jones model

The model predicts that President Obama has a 95% chance of being re-elected. The confidence interval (59, 99) is very wide, but it is entirely above 50 percent. Further, from the concordant pairs, we know that the probability is generally correct. This probability of re-election may seem high given the tight race in the polls. However, this prediction is based entirely on the Dow Jones, which has increased significantly while President Obama has been in office.
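Behind the scenes, a binary logistic prediction converts the linear predictor (intercept plus slope times the Dow Jones change) into a probability with the inverse logit function. The Python sketch below shows that step under stated assumptions: the slope is the value implied by the rounded odds ratio of 1.10, and the intercept is a placeholder because the fitted intercept is not reported in the text. The number it prints is therefore not the prediction shown in the output above.

```python
import math


def predicted_probability(intercept: float, slope: float, x: float) -> float:
    """Inverse logit: turn the linear predictor into an event probability."""
    log_odds = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-log_odds))


# Slope implied by the rounded odds ratio of 1.10 reported earlier.
slope = math.log(1.10)

# Placeholder only: the fitted intercept is NOT given in this post.
intercept = 0.0

# Three-year percentage change in the Dow Jones entered for the prediction.
dow_change = 33.8

# Illustrative value only; it depends on the placeholder intercept.
print(predicted_probability(intercept, slope, dow_change))
```

To reproduce the actual prediction, you would replace the placeholder intercept and the rounded slope with the coefficients from the fitted model.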

I'll close with a table that puts this prediction in the context of all Presidential elections since 1828 with an incumbent candidate. The table is sorted by the probability that the incumbent is re-elected. You can see how the high probabilities are associated with "Won" and the low probabilities are associated with "Lost." The middle probabilities are a bit mixed up, as you'd expect. The green row indicates the prediction for the upcoming election.

I'm pretty impressed by the fact that a single predictor works so well for something as complex as a Presidential election.

Probabilities that the incumbent is re-elected

In the next post, we'll look at a different model that uses the S&P 500 over a different length of time.  I was surprised by those results!



Name: Josh S • Monday, November 5, 2012

I'm not sure this metric is any more valid than the ones listed here: http://www.xkcd.com/1122/
These may all be 'statistically significant' without having a real grounding in reality.

Name: Jim Frost • Monday, November 5, 2012

Hi Josh, thanks for your comment. There are a couple of key differences between the historical precedents on that web site and statistical analysis.

For one thing, the electoral precedents are not metrics (a metric is a standard of measurement for a quantitative analysis). Instead this type of argument is an informal statement along the lines of "no President has been re-elected if . . .". And, these are followed by some observation that supposedly determines the outcome of the election. In other words, if you just know this one fact, you'll know the outcome of the election.

The stock market data are different, particularly when you apply statistical analysis. We've measured changes in the stock market for nearly 200 years. We then statistically relate these changes to nearly two centuries of Presidential elections and their outcomes.

Further, the statistical model does not suggest that changes in the stock market entirely predict the outcome. In fact, good statistics quantify the uncertainty. In this case, we have both the confidence interval for the prediction and the concordant/discordant pair analysis, which assesses how well the probabilities have matched historical reality.

The last table shows how the model produces probabilities, derived from changes in the Dow Jones, that generally match the actual outcomes. However, these are probabilities, not certainties. So, you'll also notice that there are candidates who have won when the probability was low and candidates who have lost when the probability was high.

However, the general trend for the probabilities is that higher probabilities correspond to a win and low probabilities correspond to a loss.

These assessments are how we ground the analysis in reality. It's possible that some day the Dow Jones model will no longer predict election outcomes. However, we have yet to see that in the historical record.


Name: Daniya • Saturday, May 31, 2014

Dear Jim,
thank you very much, for your article, it's very helpful, I am doing research in the same topic, mostly following Robert Prechter's paper. The difficulty is, I couldn't find the data for stock market before 1896. Could you please send it to my e-mail? Thank you very much in advance.
Kind regards,
