# William of Occam Chooses a Model

*Entities should not be multiplied unnecessarily.*

— William of Occam, *Quodlibeta Septem*

We’ve had a chance now to explore Best Subsets Regression and Stepwise Regression in Minitab statistical software. Both techniques are ways of quickly looking at lots of candidate models so that you can identify promising ones.

We’ve seen that statistical significance and model fit statistics can’t guarantee a fit as good as we’re looking for. With stepwise regression, we found a model with unsatisfactory residuals until we considered extra terms. With best subsets regression, we had to understand that there are drawbacks to choosing the best model according to the fit statistics.

So what does a good compromise model look like for the gummi bear data? Let’s look more closely at some of the models we’ve seen that have had random residuals.

Once we included all possible interactions in the stepwise procedure, we came up with a model with 10 terms besides the blocks:

| Main effects | Interactions |
| --- | --- |
| Position of fulcrum | Windings by angle |
| Windings | Position of gummi by windings |
| Position of catapult | Position of catapult by angle |
| | Position of catapult by position of gummi |
| | Position of fulcrum by position of windings by angle |
| | Position of catapult by position of gummi by position of windings by angle |
| | Position of catapult by position of gummi by position of fulcrum by windings by angle |

For comparison, the best 10-term model from best subsets regression has these terms:

| Main effects | Interactions |
| --- | --- |
| Angle | Position of catapult by windings |
| Windings | Position of gummi by windings |
| Position of fulcrum | Position of fulcrum by windings |
| | Position of fulcrum by angle |
| | Position of catapult by windings by angle |
| | Position of catapult by position of gummi by position of windings by angle |
| | Position of catapult by position of gummi by position of fulcrum by windings by angle |

Another model of interest would be a "parsimonious" model. If you've heard of "Occam's Razor," you might have learned that it concerns parsimony, or succinctness. In this situation, Occam's Razor means we should select the model that satisfies our assumptions with the fewest terms.

William of Occam's idea of shaving away superfluous assumptions helps in two ways. For the parsimonious model, we’ll only need enough terms to bring the residuals into a random state. So we won't unnecessarily multiply the number of terms. We will also use the simplest terms that we can. That means we won't unnecessarily multiply the number of factors within a term: terms that involve the fewest factors go into the model first.
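To make the two-part rule concrete, here is a minimal Python sketch that ranks the three models compared in this post. The term lists are copied from the post; the short factor names, the `complexity` key, and the tie-break on the highest interaction order are my own assumptions about how to encode "fewest terms, then simplest terms." It also assumes every candidate already has random-looking residuals, which is a judgment made from the residual plots and not something this code checks.

```python
# Hypothetical short names for the five factors in the gummi bear experiment.
C, G, F, W, A = "catapult", "gummi", "fulcrum", "windings", "angle"

# Each term is a tuple of factors; its length is the interaction order.
# (F, W, A) stands for the term the post lists as
# "position of fulcrum by position of windings by angle".
models = {
    "stepwise": [
        (F,), (W,), (C,),
        (W, A), (G, W), (C, A), (C, G),
        (F, W, A), (C, G, W, A), (C, G, F, W, A),
    ],
    "best subsets": [
        (A,), (W,), (F,),
        (C, W), (G, W), (F, W), (F, A),
        (C, W, A), (C, G, W, A), (C, G, F, W, A),
    ],
    "parsimonious": [
        (C,), (G,), (F,), (W,), (A,),
        (G, W), (G, A), (F, W), (F, A),
    ],
}


def complexity(terms):
    """Sort key: number of terms first, then the highest interaction order."""
    return (len(terms), max(len(term) for term in terms))


def most_parsimonious(candidates):
    """Return the name of the candidate model with the smallest complexity key."""
    return min(candidates, key=lambda name: complexity(candidates[name]))


choice = most_parsimonious(models)
print(choice, complexity(models[choice]))
```

Both 10-term models reach five-factor interactions, while the third model stops at two-factor interactions with 9 terms, so it wins on both parts of the key.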

Once the position of the fulcrum is in the model, no other single term explains anywhere close to the same amount of variation, but we have to add more terms than you might expect to get the residuals to look random. We start with the main effects. Then we add the two-factor interactions that explain the most variation in the data, one at a time, until the residuals do look random. We don't consider any interactions with three or more factors because we don't need them.

| Main effects | Interactions |
| --- | --- |
| Position of catapult | Position of gummi by windings |
| Position of gummi | Position of gummi by angle |
| Position of fulcrum | Position of fulcrum by windings |
| Windings | Position of fulcrum by angle |
| Angle | |
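The build-up that produced this model (all main effects first, then the two-factor interactions that explain the most variation, one at a time) can be sketched outside Minitab as a greedy forward selection. This is only an illustration under stated assumptions: the data are synthetic (a two-level design with made-up effect sizes, not the gummi bear measurements), blocks are ignored, and the function names are hypothetical. The code does not decide when to stop; it returns the residuals at each step so that you can still judge randomness from the residual plots, as in the post.

```python
import itertools

import numpy as np


def fit_r2(X, y):
    """Ordinary least-squares fit; returns (R-squared, residuals)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / ss_total, resid


def forward_two_factor(factors, y, max_added):
    """Start from intercept + all main effects, then greedily add the
    two-factor interaction that raises R-squared the most.

    factors: dict mapping factor name -> 1-D array of coded levels.
    Returns a list of (pair, r_squared, residuals), one entry per step.
    """
    names = list(factors)
    cols = [np.ones(len(y))] + [factors[name] for name in names]
    candidates = {
        (a, b): factors[a] * factors[b]
        for a, b in itertools.combinations(names, 2)
    }
    path = []
    for _ in range(min(max_added, len(candidates))):
        best = max(
            candidates,
            key=lambda pair: fit_r2(np.column_stack(cols + [candidates[pair]]), y)[0],
        )
        cols.append(candidates.pop(best))
        r2, resid = fit_r2(np.column_stack(cols), y)
        path.append((best, r2, resid))
    return path


# Synthetic stand-in data: a 2^5 full factorial with invented effects.
names = ["catapult", "gummi", "fulcrum", "windings", "angle"]
design = np.array(list(itertools.product([-1.0, 1.0], repeat=5)))
factors = {name: design[:, i] for i, name in enumerate(names)}
rng = np.random.default_rng(0)
y = (
    30.0
    + 5.0 * factors["fulcrum"]
    + 2.0 * factors["windings"]
    + 3.0 * factors["gummi"] * factors["angle"]
    + 1.5 * factors["fulcrum"] * factors["windings"]
    + rng.normal(scale=0.5, size=len(design))
)

path = forward_two_factor(factors, y, max_added=4)
for pair, r2, _ in path:
    print(pair, round(r2, 4))
```

In practice you would plot the stored residuals after each step and stop at the first model whose residuals look random, which is the stopping rule the post relies on.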

The parsimonious model has the advantage of being easier to understand than the models that included more complex interactions. It’s difficult to understand why any interaction would involve the position of the catapult: move the catapult back and the gummi bear should fall short by the same amount. It seems like how high the gummi is on the catapult arm should have the same effect no matter where the catapult is. The interactions in the parsimonious model are easier to explain. For example, the position of the gummi bear could interact with the angle of the catapult because both affect the height of the gummi bear at launch.

But which model is right? Well, it depends on what you value, but first, let’s do a practical comparison of these models. Here are the predictions stored at all of the points in the design:

At many of the points in the design, the predictions are quite close together.

But what we really want the model for is hitting a CD placed a certain distance away from the catapult.

Say we choose 30 inches for the target distance. For the parsimonious model, Minitab’s Response Optimizer suggests settings close to those shown below. (I’ve simplified them a little bit because I don’t want to have to get the angle to 4.30551 degrees with my protractor.)

Here are the predictions from all 3 models with these factor settings:

The difference between the predictions at these settings is about 0.5 inches. The confidence intervals and the prediction intervals aren’t too far apart, either. While there are distances where you’d get bigger differences in the fits, it’s quite possible that no matter which model you use, the settings for your design are going to be the same. When all of the models are about the same from a practical standpoint, I’ll work with the parsimonious one. Thanks, William of Occam!

Hey, did that Response Optimizer reference make you curious? If so, you can take a closer look at how Ford used Response Optimization in Minitab to launch the 2011 Fiesta successfully.