# Multiple Regression Analysis and Response Optimization Examples using the Assistant in Minitab

In Minitab, the Assistant menu is your interactive guide to choosing the right tool, analyzing data correctly, and interpreting the results. If you’re feeling a bit rusty with choosing and using a particular analysis, the Assistant is your friend!

Previously, I’ve written about the linear model features in Minitab. In this post, I’ll work through a multiple regression analysis example and optimize the response variable to highlight the new features in the Assistant.

## Choose a Regression Analysis

As part of a solar energy test, researchers measured the total heat flux. They found that heat flux can be predicted by the position of the focal points. We’ll use the new features in the Assistant to correctly position the focal points.

I’ve used this example dataset for a previous post about prediction intervals. It now includes an additional variable to highlight the Assistant’s capabilities.

In Minitab, go to Assistant > Regression, and you’ll see the interactive decision tree. You can click the diamonds for more information about how to choose and for examples of the analyses.

We have three X variables (predictors) and want to fit a regression model and to optimize the response variable. Following the tree takes us to Optimize Response at the bottom right.

Our response variable is HeatFlux and the X variables are the East, South, and North focal points. From my previous post, we’ve determined that we want to target the heat flux value of 234, but the Assistant can also maximize or minimize the response. We’ll also have the Assistant help us check for interaction effects and curvature.

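Click the Optimize Response button and fill in the dialog that appears: enter HeatFlux as the response, enter East, South, and North as the X variables, and set the goal to hit a target value of 234.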

The Assistant takes our candidate X variables and produces a regression model using stepwise regression. Let's take a look at the reports that the Assistant provides.

## Summary Report

This Summary Report tells us that our regression model is statistically significant with a P value less than 0.001 and has an R-squared value of 96.15%. Great! The comments section indicates which variables were included in the model. In this case, the Assistant includes East, South and North, along with several polynomial terms to model curvature and several interaction terms.

## Effects Report

The Effects Report graphically illustrates all of the interaction and main effects that are in the regression model. The lines are curved when the Assistant includes a polynomial term to fit a curve.

For example, the East*South interaction is significant, which indicates that the effect one variable has on heat flux depends on the setting of the other variable. If South is at a low setting (31.84), increasing East reduces the heat flux. However, if South is at a high setting (40.55), increasing East increases the heat flux.
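To make the idea concrete, here is a minimal Python sketch of how an interaction term changes a slope. The two coefficients are hypothetical, chosen only so that the sign flip matches the description above; they are not the coefficients from the Assistant's model.

```python
# Hypothetical coefficients for illustration only (not from the Assistant's model).
B_EAST = -18.0        # coefficient for the East main effect
B_EAST_SOUTH = 0.5    # coefficient for the East*South interaction term


def east_slope(south):
    """Change in predicted heat flux per unit increase in East, at a given South.

    In a model containing b1*East + b12*East*South, the slope for East
    is b1 + b12*South, so it depends on where South is set.
    """
    return B_EAST + B_EAST_SOUTH * south


for south in (31.84, 40.55):
    print(f"South = {south}: East slope = {east_slope(south):+.3f}")
```

With these made-up numbers, the East slope is negative at the low South setting and positive at the high one, which is the same pattern you see as crossing lines in an interaction plot.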

## Diagnostic Report

The Diagnostic Report displays the residuals versus fitted values and identifies unusual points that we should investigate. Based on the criteria for large residuals, expect roughly 5% of the observations to be flagged as having a large residual. So, the two we have are not necessarily problematic. There are also two points with unusual X values. You can click the points to see which rows of the worksheet they are in.
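Where does "roughly 5%" come from? Here is a small sketch, assuming that a residual is flagged when it is more than 2 standard deviations from zero and that the residuals are approximately normal. The cutoff of 2 is my assumption for illustration; check the Assistant's documentation for the exact rule it applies.

```python
import math


def share_beyond(cutoff):
    """Two-sided tail probability of a standard normal beyond +/- cutoff."""
    return math.erfc(cutoff / math.sqrt(2))


CUTOFF = 2.0  # assumed flagging threshold, in standardized-residual units
share = share_beyond(CUTOFF)
print(f"Expected share of observations flagged: {share:.1%}")
```

Under those assumptions, a bit under 5% of observations are flagged even when nothing is wrong, so a couple of flagged points in a modest dataset is not alarming by itself.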

## Model Building Report

The Model Building Report shows the details about how the Assistant built the regression model, the regression equation, which variables contribute the most information, and whether the X variables are correlated with each other. North contributes the most information to this model. East is not significant but the Assistant includes it because it is part of a higher-order term.

The Assistant always watches your back. For example, when it builds your model, it uses standardized X variables because standardization removes most of the correlation between linear and higher-order terms, which reduces the chance of adding these terms unnecessarily. The final model is displayed in unstandardized (natural) units.
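You can see the effect of standardization with a few lines of Python. This is only a sketch using hypothetical, evenly spaced settings rather than the heat flux data, and the `pearson` helper is written out by hand for illustration.

```python
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical predictor settings, evenly spaced from 31 to 41.
east = [float(v) for v in range(31, 42)]

# Correlation between the raw linear term and its square.
raw_r = pearson(east, [v ** 2 for v in east])

# Standardize (subtract the mean, divide by the standard deviation), then square.
m, s = mean(east), pstdev(east)
z = [(v - m) / s for v in east]
std_r = pearson(z, [v ** 2 for v in z])

print(f"raw:          corr(x, x^2) = {raw_r:.4f}")
print(f"standardized: corr(z, z^2) = {std_r:.4f}")
```

In the raw units, the linear and squared terms are almost perfectly correlated. After standardizing, the correlation essentially vanishes here because the made-up settings are symmetric around their mean. With real, less symmetric data, some correlation usually remains, which is why standardization removes most of it rather than all of it.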

## Prediction and Optimization Report

The Prediction and Optimization Report shows the Assistant’s solutions for obtaining our targeted value of 234. The optimal settings for the focal points are East 37.82, South 31.84, and North 16.01. For these settings, the model predicts a heat flux of 234 with a prediction interval of 216 to 252. The Assistant also provides alternate solutions for you to consider using your subject area expertise.

## Report Card

Finally, the Report Card prevents us from overlooking issues that could make the results unreliable. For example, we should collect a larger sample size and check the unusual residuals. Normality is not an issue for our data. And, there is a reminder that we should perform confirmation runs to validate the optimal values.

The methods used in the Assistant are based on established statistical practice and theory, referenced guidelines in the literature, and simulation studies performed by statisticians at Minitab. For details, read the technical white paper for Multiple Regression in the Assistant.

If you're just learning about regression, check out my regression tutorial!

Name: George Canning • Tuesday, July 15, 2014

I am stuck. Let's assume I have some data. The variables are Year and the Sales for each year. The years go back to 2005. However, I want to use 2005 as the base year and figure out how much my sales have gone up since that time. How do I get multiple regression to do this in Minitab?

Name: Jim Frost • Thursday, July 17, 2014

Hi George,

You don't mention specifically what is making you stuck, but I can cover two general routes you can try--regression and time series analysis.

If you try regression, you'll need to be extra careful about incorporating time-related effects in your model. You may need to include lagged variables to account for these effects. A lagged variable is one that holds the value from a previous observation, which you use when you think that value influences the current observation. You'll also have to be sure to check the residuals versus order plot to look for time-related effects that your model misses. You can also perform the Durbin-Watson test to see if adjacent observations are correlated.
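As a rough sketch of those two ideas outside of Minitab, here is how a lagged variable and the Durbin-Watson statistic can be computed in Python. The sales figures and residual sequences are made up for illustration, and `lag` and `durbin_watson` are hypothetical helper names, not Minitab functions.

```python
def lag(series, k=1):
    """Shift a series by k >= 1 observations.

    The first k entries have no predecessor, so they are filled with None.
    """
    return [None] * k + list(series[:-k])


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


sales = [100, 110, 125, 130, 150, 160]  # hypothetical yearly sales
prev_sales = lag(sales)                 # last year's sales as a candidate predictor

trending = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # neighbors tend to share a sign
alternating = [1.0, -1.0, 1.0, -1.0]          # neighbors tend to flip sign

print("lagged sales:", prev_sales)
print("DW, trending residuals:   ", durbin_watson(trending))
print("DW, alternating residuals:", durbin_watson(alternating))
```

The statistic ranges from 0 to 4. Values near 2 suggest that adjacent residuals are uncorrelated, values well below 2 suggest positive correlation, and values well above 2 suggest negative correlation.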

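As for using year as a predictor, in order to get more friendly coefficients, try recoding the years so that 2005 = 0, 2006 = 1, etc. Mathematically, the fit is equivalent using the original or recoded years, and the coefficient for a linear year term is the same either way. The difference is in the constant: with the recoded years, it is the predicted sales for the base year 2005 rather than an extrapolation all the way back to year 0.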
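Here is a quick check of the recoding idea in Python, using made-up sales figures and a hand-written simple least-squares fit (`fit_line` is a hypothetical helper for illustration, not a Minitab command).

```python
from statistics import mean


def fit_line(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


years = [2005, 2006, 2007, 2008, 2009, 2010]
sales = [100, 110, 125, 130, 150, 160]      # hypothetical sales figures
recoded = [yr - 2005 for yr in years]       # 2005 = 0, 2006 = 1, ...

intercept_raw, slope_raw = fit_line(years, sales)
intercept_rec, slope_rec = fit_line(recoded, sales)

print(f"original years: intercept = {intercept_raw:.2f}, slope = {slope_raw:.4f}")
print(f"recoded years:  intercept = {intercept_rec:.2f}, slope = {slope_rec:.4f}")
```

The slope, which is the average yearly change in sales, is identical in both fits. Only the intercept changes: with the original years it is a large negative extrapolation to year 0, while with the recoded years it is the fitted sales for 2005, which is a natural base-year reference.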

It might be easier to use a time series analysis because these analyses incorporate the time related effects. Choosing the correct time series analysis depends on the characteristics of your data. You'll need to find out if your data has a trend and/or seasonal patterns. Use Stat > Time Series > Time Series Plot to graphically display the data and look for these patterns.

After graphing your data, Minitab's Help system can help you choose the correct analysis. In the menu, go to Stat > Time Series and pick one of the analyses, say Single Exponential Smoothing. Click the Help button in the dialog that appears and then click Overview in the top left of the help topic.

The Time Series Overview is your guide for choosing the correct time series analysis based on the characteristics you see in your data. It includes examples of what to look for and how to choose.

I hope this helps! Don't hesitate to write again as you work through this!
Jim

Name: Mark • Tuesday, August 12, 2014

So I'm currently using Assistant on Minitab 13 to try and optimize a response, in this case oil production. The part where I'm stuck is determining which 5 "Continuous X variables" to use when I have many more than that. Is there another process on Minitab that I need to use in order to reduce the number of variables I should look at?

Thanks for the help!

Name: Jim Frost • Tuesday, August 19, 2014

Hi Mark,

I assume you mean Minitab 17 because that is when these features became available.

You should use the regular Regression command, which doesn't have the limit of 5 predictor (X) variables. In the Minitab menu, go to: Stat > Regression > Regression > Fit Regression Model.

Here, enter your response and all of your predictors. If you want Minitab to check for higher-order terms and interactions like the Assistant Menu, you'll need to include all of those additional terms by clicking on the Model button and specifying the interactions and polynomials there. Click OK in the Model dialog.

Next, click the Stepwise button and choose a Stepwise method. Usually choosing "Stepwise" from the dropdown is a good choice because it does both forward selection and backward elimination as needed.

If you choose a stepwise procedure, the terms that you specify in the Model dialog are candidates for the final model. This replicates the functionality of the Assistant Menu and you can use more than 5 predictors.
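If you have a lot of predictors, it helps to see how quickly the candidate list grows. The sketch below enumerates one plausible candidate set: linear terms, squared terms, and two-way interactions. That choice of terms is my assumption for illustration, and `candidate_terms` is a hypothetical helper, not a Minitab command.

```python
from itertools import combinations


def candidate_terms(predictors):
    """List linear terms, squared terms, and all two-way interactions."""
    linear = list(predictors)
    squares = [f"{p}*{p}" for p in predictors]
    interactions = [f"{a}*{b}" for a, b in combinations(predictors, 2)]
    return linear + squares + interactions


terms = candidate_terms(["East", "South", "North"])
print(len(terms), "candidate terms:")
for term in terms:
    print(" ", term)
```

Three predictors give nine candidate terms under this scheme. Ten predictors would give 65, which is a good reason to let the stepwise procedure do the screening.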

If you then narrow it down to 5 predictor variables (you can still have more than 5 interaction and polynomial terms), you can go back to the Assistant Menu and use those variables there. If you have more than 5 predictors, just continue with the regular Regression analysis. You can still optimize the response variable(s) using Stat > Regression > Regression > Response Optimizer. There are also other neat things you can do with your model in the Regression menu, such as create surface plots.

I hope this helps!
Jim