On Friday, October 5, the Bureau of Labor Statistics (BLS) released the September Employment Situation Summary, and it was a doozy! Right before the Presidential election, the unemployment rate dramatically dropped to below 8% for the first time since January 2009. Because of the startling nature and contradictory information contained in the report, these job numbers were received with some skepticism.
Previously, I’ve blogged about changes in the quarterly GDP and how important it is to understand the larger context of the inherent variability of the changes, the imprecise nature of the estimates, and the various revisions to the estimates. These considerations all apply to the jobs reports. However, with the job creation data, we have the advantage of having two independent measures that we can use together.
In this post, I’ll focus on understanding both job creation measures and how they fit into the larger context. What can we learn from the two different measures? I’ll even make a couple of predictions, including one for the next jobs report that comes out just before the election!
You can find the data for this blog here.
The Two Measures of Job Creation
The BLS jobs report contains two measures of employment growth, the payroll survey and the household survey. It’s the household survey that is causing the ruckus. After all, it reported a surprising 873,000 new jobs in September 2012, which is the highest number since June 1983. This number is the reason why unemployment dropped by an unusually large amount, from 8.1% to 7.8%. Meanwhile, the payroll survey reported a relatively meager 114,000 new jobs, which is relatively consistent with recent values.
What can we make of this? Let’s compare the two measures.
The payroll survey covers 486,000 business establishments while the household survey covers 60,000 households. The household survey is clearly much smaller, so its sampling error should be larger than that of the payroll survey. Further, the household survey is subject to assumptions about the subpopulations and how they flow in and out of employment. These assumptions can change dramatically and further compound the errors.
The time series plot shows how the household data generally follow the payroll data but are much noisier. The BLS states that a monthly change in the household data must exceed +/- 436,000 jobs to be statistically significant, while for the payroll data it only needs to exceed +/- 90,000 jobs. This difference reflects the difference in noise.
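To make those thresholds concrete, here is a minimal Python sketch that checks the two September 2012 figures against them. The function and constant names are made up for illustration. The only numbers used are the ones quoted in this post, in thousands of jobs.

```python
# BLS significance thresholds quoted above, in thousands of jobs.
SIGNIFICANCE_THRESHOLD = {"household": 436, "payroll": 90}


def is_significant(survey: str, monthly_change: float) -> bool:
    """Return True if the monthly change exceeds the survey's threshold in absolute value."""
    return abs(monthly_change) > SIGNIFICANCE_THRESHOLD[survey]


# September 2012 figures quoted earlier in the post (thousands of jobs).
household_sig = is_significant("household", 873)
payroll_sig = is_significant("payroll", 114)
print(household_sig, payroll_sig)
```

Both September changes clear their thresholds. The household threshold is almost five times as wide as the payroll threshold, though, so a household change that would be very large by payroll standards can still be indistinguishable from noise.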
At this point, you may well be wondering whether the smaller and more erratic household survey has any value. You wouldn’t be alone in wondering this. After all, Alan Greenspan has his own doubts. The payroll data are widely considered to be more reliable. Even so, the household data still provide valuable information that should not be discounted, as detailed in this statistical study by George Perry.
The household survey includes some groups that the payroll data do not. For example, the payroll data does not capture the self-employed and start-up companies. In general, the payroll data doesn't adequately account for the birth and death of businesses. Additionally, if a person has two jobs, the payroll data counts him as two employed people while the household survey counts him correctly as one employed person.
Looking at the cumulative job creation after the recession of the early 2000s, the textbook, Macroeconomics, by N. Gregory Mankiw states:
In short, both surveys bring some unique information to the table.
Blip, or Trend?
The noisy household data makes it difficult to determine whether a particularly high value is a blip caused by random error or a new trend. However, there are some hints in the data. Below is the time series plot again, but with two circled groups. The groups appear to be unusually high household values compared to the payroll values. These values occurred during a recovery after a recession. Importantly, the payroll data has not detected the job gains at this point.
Are the circled data points important, or merely random fluctuations in noisy data? Let’s quantify it!
Modeling the Relationship Between the Payroll and Household Jobs Created Data
In the time series plot, we just eyeballed the difference between the two measures of jobs created. Now, we’ll create a regression model for how these two measures move together. After experimenting with various predictors, including GDP data and lagged variables, I found that a very simple model with Household as the response and Payroll as the single predictor worked as well as any other model that I tried. Below is a fitted line plot that shows how monthly changes in the payroll data are associated with monthly changes in the household data.
I can hear you chuckling at that low R-squared (24.3%)! However, the relationship is strong enough to help us. The predictor (Payroll) is very significant (p < 0.001) but it doesn’t account for much of the variance. This situation can occur when you have noisy data. And, we know the household data are noisy!
Regression calculates the line that you see in the graph. It also analyzes the variances so we can detect unusual distances from the line to the data points (the standardized residuals). The low R-squared indicates that the data points tend to fall farther from the line than they would in a model with a higher R-squared. Consequently, a point has to be relatively far from the line in this model to stand out from the noise and be classified as unusual. Quantifying that distance, even for a weak relationship, can be far more useful than not quantifying it at all.
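For readers who want to see the mechanics outside of Minitab, here is a rough Python sketch of a one-predictor regression and its standardized residuals. It uses the usual textbook formula, residual / (s * sqrt(1 - leverage)), and is not taken from Minitab's code. The data are synthetic and are not the BLS series. The variable names, the noise levels, and the planted spike are all assumptions chosen to mimic a weak, noisy relationship.

```python
import math
import random


def standardized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, standardized residuals)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    # Leverage of each point in a one-predictor model.
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    std = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]
    return b0, b1, std


# Synthetic data only (NOT the BLS series): weak linear signal, heavy noise.
random.seed(0)
payroll = [random.gauss(100, 150) for _ in range(200)]
household = [0.9 * p + random.gauss(0, 250) for p in payroll]
household[50] = 0.9 * payroll[50] + 1500  # plant one obvious positive spike

b0, b1, std_resid = standardized_residuals(payroll, household)
# Flag large positive residuals, using the "greater than 2" rule of thumb.
flagged = [i for i, r in enumerate(std_resid) if r > 2]
print(flagged)
```

Even with this much noise, the planted spike at index 50 is flagged, because its distance from the line is large relative to the typical scatter. A handful of ordinary points will usually be flagged too, which is expected when the cutoff is 2.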
Do Spikes Occur Randomly, or Are They Part of a Pattern?
I’ve run the same model in General Regression and predicted the September household employment based on the payroll estimate of 114,000 jobs created. The prediction interval for a single new household observation extends from 433,000 jobs lost to 613,000 jobs created. That is an extremely wide interval thanks to the noisy data. However, the reported value of 873,000 jobs created falls well outside this interval. In other words, it’s unusual even given the large amount of error in the data.
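As a back-of-the-envelope check on those numbers, the sketch below recovers the fitted value from the midpoint of the interval and expresses the September figure in approximate standard-error units. It assumes a 95% interval and a critical value of about 1.96. Neither is stated in this post, so treat the result as a rough approximation.

```python
# Figures quoted in the post, in thousands of jobs.
PI_LOW, PI_HIGH = -433.0, 613.0  # prediction interval at payroll = 114
OBSERVED = 873.0                 # reported household change, September 2012

# A prediction interval is symmetric about the fitted value.
predicted = (PI_LOW + PI_HIGH) / 2
half_width = (PI_HIGH - PI_LOW) / 2
outside = not (PI_LOW <= OBSERVED <= PI_HIGH)

# ASSUMPTION: 95% interval, critical value about 1.96 (large sample).
T_CRIT = 1.96
approx_se = half_width / T_CRIT
approx_z = (OBSERVED - predicted) / approx_se

print(predicted, half_width, outside, round(approx_z, 1))
```

Under those assumptions the model predicts roughly 90,000 household jobs for September, and the reported value sits about 2.9 approximate standard errors above that. This is close to the standardized residual of 2.96 reported for September 2012. The two are not expected to match exactly, because a standardized residual and a prediction interval scale the error slightly differently.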
I’ll use the standardized residuals to identify the unusually large observations because they are particularly helpful for finding outliers. Minitab statistical software classifies any standardized residual that is greater than 2 as an unusually large residual. I’ve listed all 6 of these large residuals in the table below to see if we can learn anything from them.
Date | Standardized residual | Economic Status
April 1991 | 2.33 | Recovery: recession of the early 1990s
September 2001 | 2.88 | Recovery: recession of the early 2000s
February 2002 | 3.12 | Recovery: recession of the early 2000s
September 2002 | 2.35 | Recovery: recession of the early 2000s
November 2007 | 2.12 | Just before the recession of the late 2000s
September 2012 | 2.96 | ?
There are 5 unusually large values prior to the current one. What we need to do is determine whether the previous spikes tended to occur randomly or as part of a pattern. This assessment will help us interpret the most recent spike.
Of the 5 previous spikes, four (80%) occurred during the recovery period after a recession. One occurred just before the recession that started in late 2007. Five samples are not enough to perform a formal statistical analysis. However, historically speaking, spikes in the household data tend to occur during economic recoveries.
Predictions
The predictions that follow don’t factor in any potential shocks to the system. For instance, both the European debt crisis and the Great Fiscal Cliff present large risks to the U.S. economy.
Prediction 1: We are in a slow recovery period
We’ve just observed a rare spike in the household data that falls outside the expected range for even this noisy data set. Historically, these spikes have occurred more often during recovery periods. Relatedly, the cumulative household data had persistently captured the recovery from the previous recession earlier than the payroll data.
By itself, the household spike in conjunction with the historical pattern isn’t enough to suggest a recovery. However, the spike didn’t happen in a vacuum. We have other positive indicators, and recently they’ve started to occur simultaneously.
Employment, sales of existing homes, consumer confidence, and car sales are all up. In October, the Thomson Reuters/University of Michigan index of consumer sentiment jumped to a 5 year high. Also in October, the number of Americans filing for unemployment benefits dropped to a 4-1/2 year low. New jobless claims dropped to 339,000, the lowest since February 2008. When new claims fall below 375,000, it typically indicates that hiring is strong enough to lower the unemployment rate.
Prediction 2: Jobs lost in the household data for the October jobs report
While looking at the unusual spikes in the household data, I also looked to see what happened in the month after the spike. I found that in all 5 prior cases, the month after a spike showed a loss. In fact, these months had an average standardized residual of -1.4 and an average job loss of 400,000!
Regression residuals should not have patterns like this. So, I included a binary predictor to identify the months that follow a positive spike. This predictor is significant (p = 0.001) and the coefficient is -386. This coefficient indicates that months that follow a positive spike have an average job loss of 386,000 compared to months that don’t follow a spike, while holding the payroll value constant.
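Here is a rough Python sketch of what adding such a binary (0/1) predictor looks like. It fits an intercept, a payroll slope, and an "after spike" indicator by ordinary least squares using only the standard library. The data are synthetic and are not the BLS series. The -386 used to generate them is simply the coefficient quoted above, and all names, noise levels, and the placement of the flagged months are assumptions.

```python
import random


def solve(a, b):
    """Solve the small linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data only (NOT the BLS series), in thousands of jobs.
random.seed(1)
n = 240
payroll = [random.gauss(100, 150) for _ in range(n)]
after_spike = [1 if i % 20 == 1 else 0 for i in range(n)]  # hypothetical flagged months
household = [0.9 * p - 386 * d + random.gauss(0, 250)
             for p, d in zip(payroll, after_spike)]

# Design matrix columns: intercept, payroll, after-spike indicator.
rows = [[1.0, p, float(d)] for p, d in zip(payroll, after_spike)]
intercept, slope, after_spike_coef = ols(rows, household)
print(round(slope, 2), round(after_spike_coef))
```

The coefficient on the indicator estimates how much lower the household number is in flagged months than in other months with the same payroll value. That is the same interpretation given to the -386 above. With only a few flagged months and this much noise, the estimate will wander noticeably around the value used to generate the data.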
On the Friday before Election Day, the household data is likely to indicate a job loss and a resulting increase in the unemployment rate.
Closing Thoughts
We saw how the noisy household employment data reduces the explanatory power of a regression model. However, this data set captures information that the payroll data can miss, which gives it value. Looking more closely at these data, I made two predictions that, on the surface, appear to be contradictory. The historical tendency is both that spikes tend to occur in the monthly household data during a recovery and that the subsequent month experiences a greater than normal job loss.
On Friday, November 2, we can confirm or reject my second prediction when the BLS releases its next job report. However, the first prediction will take longer to confirm. That confirmation will be possible only after we have enough data to take a historical look back to the present times!
What are your predictions?
Data notes:
- All data are from the BLS. The household data for all Januaries are unusual. The BLS makes annual adjustments to changes in the population assumptions and lumps all of the resulting (large) job changes into the month of January. This makes it impossible to use all of the raw data for statistical analyses. Consequently, the BLS has also released a smoothed data set for researchers that adds in the adjustments over time. However, this data set only goes up to December 2011, and we need the latest data. Therefore, I’ve excluded the month of January from the analysis. Because we’re looking at the monthly changes rather than the cumulative data, the missing Januaries don’t present a problem. When I analyzed the smoothed data for all months and compared the results to the regular data without January, the results were nearly identical.
- To model the household job creation data, I tried both the payroll and GDP data. I found that using GDP and lagged GDP predictors produced a model as good as the model that contains the payroll data. There were no gains from using both the GDP and payroll data. There are clearly many other data sets that I could have tried for potential predictors, but I would need to be a full-time economist to have sufficient time!
- After including the binary predictor to identify months that followed a spike, the standardized residuals for the subsequent months were no longer extreme. Typically, if you find usable information in the residuals (non-random patterns), you should incorporate that knowledge into the model. When modeling time-series data with regression, you need to be particularly aware of time-dependent effects like this. Including the binary predictor did not appreciably change the large standardized residuals for the months that actually had the spikes.