Predicting the U.S. Presidential Election: Evaluating Two Models (Part Two)

Minitab Blog Editor 02 November, 2012

Yesterday, I presented a model that uses Dow Jones data to predict the winner in Presidential elections that have an incumbent. Today, I test a model that uses S&P 500 data. (Here are the data for today's blog that you can use in Minitab Statistical Software.)

Model 2: The Three Month Change in the S&P 500

The second model is presented by Sam Stovall, Chief Equity Strategist at S&P Capital IQ, in his paper, “The Presidential Predictor: Stock Price Performances Have Typically Presaged Victors.” Unlike the Dow Jones study, this paper is vaguely written and presents unhelpful statistics. Also, the author did not perform a single hypothesis test to determine whether his model is statistically significant.

Initially, it appears that the author is writing about incumbents being re-elected or replaced. However, what he writes about and the numbers that he presents relate to two different election scenarios. Nine times, he mentions incumbents being re-elected or replaced. In fact, he explicitly links his numbers to incumbents. For example, "Rising [S&P 500] prices have typically signaled the reelection of the incumbent, while falling prices have pointed to his replacement, all to the tune of an average 82% accuracy rate over the past 100+ years."

Only once does he vaguely talk about changes in the party of the President. However, when looking deeper into his numbers, it turns out the numbers that he presents relate to whether an election changes the party affiliation of the President. This is a different question and data set, because not all elections have an incumbent. To cover all bases, I’ll test both cases, incumbents and party changes.

S&P 500 Data

Clearly, I didn’t start out with a good impression of this study...but let’s collect the data and see what the testing reveals.

I found daily open and close prices at Yahoo Finance, but only back to 1956. (That’s when the S&P 500 index started, it turns out.) For these data, I captured the percentage change from July 31 to October 31 prior to each Presidential election.
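If you want to compute the same kind of three-month change outside Minitab, here is a minimal Python sketch. The function name and the closing values are hypothetical and for illustration only; they are not actual index data.

```python
def percent_change(start_close: float, end_close: float) -> float:
    """Return the percentage change from start_close to end_close."""
    if start_close <= 0:
        raise ValueError("start_close must be positive")
    return (end_close - start_close) / start_close * 100.0


# Hypothetical closing values (NOT actual S&P 500 data).
july_31_close = 1000.0
october_31_close = 1019.4

change = percent_change(july_31_close, october_31_close)
# The analysis only distinguishes Positive from Negative changes;
# treating an exact zero as Negative is an arbitrary choice in this sketch.
direction = "Positive" if change > 0 else "Negative"
print(f"{change:.2f}% -> {direction}")
```

The sign of the result is what feeds the chi-square tables, while the percentage itself is the continuous predictor for the logistic regression.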

As I did for the DJIA, I found a data set that estimates the S&P 500 prior to its existence. Unlike the pseudo-DJIA data, the pseudo-S&P 500 data comprise over half the data points.

The author did not state which pseudo-S&P 500 data set he used, but the one I could find only has monthly averages rather than specific values for specific days. However, these data produce three-month periods of advances and declines that match the study’s, at least as far as I can tell.

S&P 500 and Elections with an Incumbent

We’ll test the theory that when the S&P 500 rises from July 31 to October 31 prior to the election, the incumbent is typically re-elected and that when the S&P 500 declines over that time, the challenger wins.

So we're looking at two things: the S&P 500 change can be Negative or Positive, and the incumbent can Win or Lose. These two categorical variables produce a 2x2 table. Consequently, we’ll use the chi-square test of association to determine whether these variables are related.

[Minitab output: Chi-square test of Presidential elections with an incumbent]

The p-value of 0.226 is not significant, so we have no evidence that the value of one variable is related to the value of the other. These data don’t support the hypothesis that the three-month changes in the S&P 500 predict Presidential elections that have an incumbent. Additionally, three cells have expected counts of less than 5, which suggests that we don't have enough data. This lack of data is probably why the author used party-change data, because including every election increases the sample size.

You also can compare the actual count in each table cell (top number) to the expected count (bottom number). The expected count indicates the number you expect to see if there is no relationship between the variables. You can see how close the expected and actual counts are for all cells in the table above.
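To make the expected counts concrete, here is a small Python sketch of the Pearson chi-square calculation for a 2x2 table, using only the standard library. The counts are hypothetical and are not the election data from the table above; the function name is also just for illustration. It uses the uncorrected Pearson statistic and flags low expected counts with the common "less than 5" rule of thumb.

```python
import math


def chi_square_2x2(table):
    """table: [[a, b], [c, d]] of observed counts.

    Returns (expected, chi2, p_value) for the Pearson chi-square test.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value


# Hypothetical counts (NOT the data from this post).
# Rows: Positive, Negative. Columns: Win, Lose.
observed = [[9, 3], [2, 4]]
expected, chi2, p_value = chi_square_2x2(observed)

low_cells = sum(1 for row in expected for e in row if e < 5)
print("Expected counts:", [[round(e, 2) for e in row] for row in expected])
print(f"Chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
print(f"Cells with expected count below 5: {low_cells}")
```

With a table this small, most expected counts fall below 5, which is the same kind of warning Minitab raises for the incumbent data.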

The Positive row shows that 80% of elections that had a Positive S&P 500 also saw the incumbent winning. This percentage looks promising, but it is deceptive. For comparison, look at the All row: you'll see that 74% of all elections with an incumbent candidate result in re-electing the incumbent. That's why you need to perform a hypothesis test!

Finally, I performed a binary logistic regression analysis on the incumbent outcome with the three-month percentage change of the S&P 500 as the single, continuous predictor (similar to the Dow Jones model yesterday). The p-value for the S&P 500 predictor was not significant (0.295).
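For readers curious about what a single-predictor binary logistic regression does under the hood, here is a rough Python sketch that fits the model by Newton-Raphson and computes a Wald p-value for the slope. The data are hypothetical, the function names are made up for this example, and the Wald test is an assumption on my part: it is a common choice, but it may not match Minitab's output exactly.

```python
import math


def score_and_information(x, y, b0, b1):
    """Return the score (gradient) and information sums at (b0, b1)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = score_and_information(x, y, b0, b1)
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of the slope from the inverse information matrix
    _, _, h00, h01, h11 = score_and_information(x, y, b0, b1)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided Wald test
    return b0, b1, se_b1, p_value


# Hypothetical data (NOT the election data from this post):
# percentage change in the index, and whether the incumbent won (1) or lost (0).
pct_change = [-8, -5, -3, -1, 0, 1, 2, 4, 6, 9]
incumbent_won = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

b0, b1, se_b1, p_value = fit_logistic(pct_change, incumbent_won)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, slope p-value = {p_value:.3f}")
```

Note that this simple Newton-Raphson loop assumes the outcomes are not perfectly separated by the predictor; with separated data the estimates would not converge.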

S&P 500 and Elections that Produce a Party Change

Now, we'll see if the S&P 500 can predict changes in the party that controls the White House. There have been only 19 Presidential elections since 1900 with an incumbent candidate. For the party change analysis, we can now use the data from all 28 elections since 1900.

Below is the Chi-Square analysis.

[Minitab output: Chi-square analysis of the S&P 500 and party changes]

The p-value of 0.029 is significant, which suggests that there is a relationship between the two variables.

The green boxes indicate cells where the observed counts are greater than the expected counts. The red boxes indicate cells where the observed counts are less than the expected counts. Also, notice how the row percentages are reversed for the Negative and Positive rows. Collectively, these patterns and the significant p-value suggest that a positive S&P 500 favors the incumbent's party while a negative S&P 500 favors the opposition.

However, we still see the warning about low expected counts. We should be careful with our interpretation. So, I analyzed the data using binary logistic regression with a continuous predictor, the percentage change in the S&P 500. The output is below:

[Minitab output: Binary logistic regression model of party change for all Presidential elections since 1900]

The p-value for the S&P 500 predictor is 0.064, which just barely misses the 0.05 cutoff. We'll continue to the concordant and discordant pairs to see what we get.

[Minitab output: Measures of association from binary logistic regression for the party change model]

Here, the concordant pairs outnumber the discordant pairs by 3 to 1. That's not bad, but not as good as the Dow Jones model, where it was 7 to 1.
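If concordant and discordant pairs are new to you, the idea is simple: pair every observation where the event occurred with every observation where it did not, and check whether the model gave the event observation the higher predicted probability. Here is a short Python sketch with hypothetical probabilities and outcomes (not the fitted values from this model).

```python
def pair_counts(probs, outcomes):
    """Count concordant, discordant, and tied (event, non-event) pairs."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = discordant = ties = 0
    for p_event in events:
        for p_nonevent in nonevents:
            if p_event > p_nonevent:
                concordant += 1   # model ranks the event higher: agrees
            elif p_event < p_nonevent:
                discordant += 1   # model ranks the non-event higher: disagrees
            else:
                ties += 1
    return concordant, discordant, ties


# Hypothetical predicted probabilities and observed outcomes (1 = event).
probs = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
outcomes = [1, 1, 0, 1, 0, 0]

concordant, discordant, ties = pair_counts(probs, outcomes)
print(f"Concordant: {concordant}, Discordant: {discordant}, Ties: {ties}")
```

A higher ratio of concordant to discordant pairs means the model ranks the observed events above the non-events more consistently.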

Finally, we'll make a prediction using the 1.94% change in the S&P 500 from July 31 to October 31, 2012.

[Minitab output: Prediction using the binary logistic regression model for party change]

This model suggests that there is only a 37% chance that the party that resides in the White House will change with the election. However, the confidence interval for the event probability spans both sides of 50%.
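To show how a logistic model turns a predictor value into an event probability with a confidence interval, here is a minimal Python sketch. The coefficients and standard error below are hypothetical placeholders, not the values from the fitted model, and building the interval on the logit scale is one common approach that I am assuming here; the exact method behind the interval reported above may differ.

```python
import math


def expit(eta: float) -> float:
    """Inverse logit: convert a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values for illustration only (NOT the fitted model).
intercept = -0.2
slope = -0.1
se_linear_predictor = 0.45   # assumed standard error of intercept + slope * x

x = 1.94                     # three-month percentage change
eta = intercept + slope * x
probability = expit(eta)

# Approximate 95% interval, built on the logit scale and then transformed.
lower = expit(eta - 1.96 * se_linear_predictor)
upper = expit(eta + 1.96 * se_linear_predictor)

print(f"Event probability: {probability:.3f}")
print(f"Approximate 95% CI: ({lower:.3f}, {upper:.3f})")
print("Interval spans 50%:", lower < 0.5 < upper)
```

When the interval contains 50%, as in this made-up example, the model cannot confidently call the outcome either way.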

Based on the concordant/discordant pairs, the predicted event probability is correct more often than not. However, while these results "lean Obama," as the TV pundits say, they're not as strong as the Dow Jones model we looked at yesterday.

Closing Thoughts

It is very interesting to see how the stock market data were able to model the outcomes of U.S. Presidential elections. Both models predict that Barack Obama will be re-elected; however, the Dow Jones model made this case more strongly than the S&P 500 model.

It's important to note that the two sets of data represent changes in the market over timespans of different lengths. The Dow Jones model looked at the three years prior to the election, while the S&P 500 model only looked at three months. It would be interesting to see if three-year changes in the S&P 500 would produce results similar to those of the Dow Jones model.

Finally, assessing the S&P 500 study unintentionally revealed how careful you must be when conducting and writing about quantitative studies. To do this effectively, you must have clear operational definitions of the variables, use proper hypothesis testing, and clearly relate the findings to the data that you actually use!