What the Heck Is Best Subsets Regression, and Why Would I Want It?

Last time, we used stepwise regression to come up with models for the gummi bear data. Stepwise regression is a great tool, but it has a downside: when we use stepwise selection in design of experiments, especially if we focus on only the last step, we can miss interesting models that might be useful.

One way to look at more models is to use Minitab’s Best Subsets feature. Instead of identifying a single model based on statistical significance, Minitab’s Best Subsets feature shows a number of different models, as well as some statistics to help us compare those models.

To get the idea, let’s look at some smaller models.

The best 4-predictor model has an R-squared of 80.2%, an adjusted R-squared of 79%, a Mallows' Cp of 50.1, and an S of 10.131.

The variables column (Vars) shows how many terms are in the model. In this case, I asked Minitab to print the two best models of each size, from 1 to 4 terms. Here are the statistics that Minitab provides to help you choose a model:

  • R² is for when you compare models with the same number of terms. Higher is better.
  • Adjusted R² is for when you compare models with different numbers of terms. Higher is better.
  • Mallows’ Cp is for when you compare models with the same or different numbers of terms, provided all of the models came from the same initial set of terms. If you change the predictors and run best subsets a second time, you cannot use Mallows’ Cp to compare the models. The smaller the Mallows’ Cp, the better it is for prediction. The closer it is to the printed number of variables + 1 (for the intercept term), the less biased the estimates of the coefficients are.
  • S is an estimate of the variability about the regression line. Smaller is better.
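
If you want to see where these four numbers come from, here is a minimal sketch in Python with NumPy. It is a brute-force search that fits every combination of terms, which is an assumption made for clarity and not necessarily how Minitab searches. The function names (fit_sse, best_subsets), the term labels A through E, and the data are all made up for illustration; the data are synthetic, not the gummi bear measurements.

```python
from itertools import combinations

import numpy as np


def fit_sse(X, y):
    """Fit y on the columns of X plus an intercept; return the error sum of squares."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets(X, y, names, max_terms, keep=2):
    """Return the `keep` highest-R2 models of each size from 1 to max_terms."""
    n, k = X.shape
    sst = float(((y - y.mean()) ** 2).sum())
    # Mallows' Cp is scaled by the MSE of the model with every candidate term,
    # which is why Cp values from different candidate sets are not comparable.
    mse_full = fit_sse(X, y) / (n - k - 1)
    table = []
    for p in range(1, max_terms + 1):
        fits = []
        for cols in combinations(range(k), p):
            sse = fit_sse(X[:, list(cols)], y)
            fits.append({
                "vars": p,
                "terms": [names[c] for c in cols],
                "r2": 100 * (1 - sse / sst),
                "adj_r2": 100 * (1 - (sse / (n - p - 1)) / (sst / (n - 1))),
                "cp": sse / mse_full - n + 2 * (p + 1),
                "s": (sse / (n - p - 1)) ** 0.5,
            })
        fits.sort(key=lambda f: f["r2"], reverse=True)
        table.extend(fits[:keep])
    return table


# Synthetic data for illustration only (hypothetical terms A-E).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
y = 3 + 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=40)
names = ["A", "B", "C", "D", "E"]

for row in best_subsets(X, y, names, max_terms=4):
    print(row["vars"], row["terms"], round(row["r2"], 1),
          round(row["adj_r2"], 1), round(row["cp"], 1), round(row["s"], 3))
```

One thing the sketch makes visible: the model that contains every candidate term always has a Cp equal to its number of terms + 1, so Cp can only flag bias in the smaller models.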

In the output above, the model with

  • the position of the fulcrum
  • the angle of the catapult
  • the number of rubber band windings, and
  • the interaction between the position of the catapult, the position of the gummi bear, and the number of rubber band windings

has the best statistics. However, the statistics for the other 4-predictor model are so close that it would be hard to say that one is practically better than the other.

With a smaller number of models to work with, you can use Minitab to check the predicted R² values. Predicted R² is similar to the other R²-type statistics, but it estimates how well a model predicts new observations. That is often the most appealing criterion to use, because we want the model to predict what will happen when we launch a new gummi bear. As it turns out, the predicted R² values are nearly equal for both four-term models: 77.09%. Based on these statistics, there’s no reason to think that one model will outperform the other, even though their predictions can vary considerably.
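
For the curious, predicted R² is built on the PRESS statistic: leave out each observation in turn, predict it from a model fit to the rest, and add up the squared prediction errors. The sketch below, in Python with NumPy, uses the standard leverage shortcut so that the model is fit only once. The function name predicted_r2 and the data are hypothetical, and the data are synthetic, not the gummi bear measurements.

```python
import numpy as np


def predicted_r2(X, y):
    """Return predicted R-squared (as a proportion) for y regressed on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverages: the diagonal of the hat matrix H = A (A'A)^-1 A'.
    leverage = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    # A residual divided by (1 - leverage) equals the error made when that
    # observation is left out of the fit and then predicted.
    press = float(((resid / (1 - leverage)) ** 2).sum())
    sst = float(((y - y.mean()) ** 2).sum())
    return 1 - press / sst


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 2))
y_demo = 1 + X_demo @ np.array([2.0, -1.0]) + rng.normal(size=20)

print(round(100 * predicted_r2(X_demo, y_demo), 2))
```

Because every observation is predicted by a model that never saw it, predicted R² is never higher than the ordinary R² for the same model.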

Next time, I plan to look at some larger models to see if we can do better on the predictions. If you’re ready for more statistics now, take a look at what Jim Colton can show you about some common misconceptions about R².


Name: Shilpa • Friday, December 27, 2013

I am trying to run a logistic regression with 11 predictor variables and a response variable which is binary (yes=1 no=0).
I want to perform best subsets regression using Minitab, but I am not quite sure if that can be done only for linear regression or for logistic as well. There is nowhere I can find the answer to my question.
Also, is there anywhere in Minitab where we specify what kind each variable is? For example, categorical vs. linear.

Name: Cody Steele • Thursday, January 2, 2014

Hi Shilpa,

The implementation of Best Subsets Regression that is in Minitab is intended for use with continuous responses. Another difference that's important to consider is that binary logistic regression in Minitab uses maximum likelihood estimation for the coefficients while best subsets regression uses least squares estimation. While the results could be similar, different estimation methods can lead to different conclusions.

In Minitab, the worksheet does not let you specify whether a numeric variable is categorical or not. You can typically specify whether a numeric predictor variable is continuous or categorical when you run your analysis. In Minitab analyses, text variables are always treated as categorical, so if your variable contains descriptive labels, Minitab will know that the predictor is a categorical variable.

I hope this helps with your question!


Name: Ben • Sunday, September 7, 2014

I see that the Best Subsets output above shows interactions, but I can't find how to get interaction terms in Best Subsets for Minitab 17. How can I do that? Thanks!

Name: Cody Steele • Monday, September 15, 2014

Interaction terms are not a normal part of the best subsets algorithm. I used Minitab’s calculator to multiply the predictors together to create columns that held the interaction terms. Then, I entered those columns into Best Subsets. If you want to see more about what the calculator can do, go to this list of topics:

How much work this is depends on the number of variables, but keep this in mind: Interaction terms create a lot of multicollinearity. If you create formulas for the interactions with DOE data, use the coded units. If you use uncoded units or if you’re not working from a designed experiment, use Calc > Standardize on your predictors to subtract the mean and divide by the standard deviation. For an introduction to multicollinearity, go to “What is multicollinearity?”

I hope that helps!
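
To illustrate the advice in the reply above about standardizing before you multiply, here is a minimal sketch in Python with NumPy. It assumes that standardizing means subtracting the mean and dividing by the sample standard deviation. The function names and the two predictors (a hypothetical angle and a hypothetical number of windings) are made up for illustration, and the data are synthetic.

```python
import numpy as np


def standardize(col):
    """Subtract the mean and divide by the sample standard deviation."""
    return (col - col.mean()) / col.std(ddof=1)


def interaction(a, b, standardized=True):
    """Build an interaction column as the product of two predictor columns."""
    if standardized:
        a, b = standardize(a), standardize(b)
    return a * b


# Synthetic, uncoded predictors for illustration only.
rng = np.random.default_rng(2)
angle = rng.uniform(30, 60, size=200)
windings = rng.uniform(2, 8, size=200)

raw = interaction(angle, windings, standardized=False)
std = interaction(angle, windings)

# Correlation between each interaction column and one of its parent predictors.
corr_raw = np.corrcoef(raw, angle)[0, 1]
corr_std = np.corrcoef(std, standardize(angle))[0, 1]
print(round(corr_raw, 2), round(corr_std, 2))
```

The product of the uncoded columns is strongly correlated with the predictors it was built from, while the product of the standardized columns is much less so. That is the multicollinearity the standardizing step is meant to reduce.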
