
Regression Analysis

Blog posts and articles about regression analysis techniques applied to Lean and Six Sigma quality improvement projects.

By Lion "Ari" Ondiappan Arivazhagan, guest blogger. An alarming number of borewell accidents, especially involving little children, have occurred across India in the recent past. This is the second in a series of articles on borewell accidents in India. In the first installment of the series, I used the G-chart in Minitab Statistical Software to predict the probabilities of innocent children... Continue Reading
In part 1 of this post, I covered how Six Sigma students at Rose-Hulman Institute of Technology cleaned up and prepared project data for a regression analysis. Now we're ready to start our analysis. We’ll detail the steps in that process and what we can learn from our results. What Factors Are Important? We collected data about 11 factors we believe could be significant: Whether the date of... Continue Reading
By Peter Olejnik, guest blogger. Previous posts on the Minitab Blog have discussed the work of the Six Sigma students at Rose-Hulman Institute of Technology to reduce the quantities of recyclables that wind up in the trash. Led by Dr. Diane Evans, these students continue to make an important impact on their community. As with any Six Sigma process, the results of the work need to be evaluated. A... Continue Reading
If you wanted to figure out the probability that your favorite football team will win their next game, how would you do it?  My colleague Eduardo Santiago and I recently looked at this question, and in this post we'll share how we approached the solution. Let’s start by breaking down this problem: There are only two possible outcomes: your favorite team wins, or they lose. Ties are a possibility,... Continue Reading
Choosing the correct linear regression model can be difficult. After all, the world and how it works is complex. Trying to model it with only a sample doesn’t make it any easier. In this post, I'll review some common statistical methods for selecting models, complications you may face, and provide some practical advice for choosing the best regression model. It starts when a researcher wants to... Continue Reading
Stepwise regression and best subsets regression are both automatic tools that help you identify useful predictors during the exploratory stages of model building for linear regression. These two procedures use different methods and present you with different output. An obvious question arises. Does one procedure pick the true model more often than the other? I’ll tackle that question in this post. Fi... Continue Reading
Using a sample to estimate the properties of an entire population is common practice in statistics. For example, the mean from a random sample estimates that parameter for an entire population. In linear regression analysis, we’re used to the idea that the regression coefficients are estimates of the true parameters. However, it’s easy to forget that R-squared (R²) is also an estimate.... Continue Reading
You need to consider many factors when you’re buying a used car. Once you narrow your choice down to a particular car model, you can get a wealth of information about individual cars on the market through the Internet. How do you navigate through it all to find the best deal?  By analyzing the data you have available.   Let's look at how this works using the Assistant in Minitab 17. With the... Continue Reading
We like to host webinars, and our customers and prospects like to attend them. But when our webinar vendor moved from a pay-per-person pricing model to a pay-per-webinar pricing model, we wanted to find out how to maximize registrations and thereby minimize our costs. We collected webinar data on the following variables: Webinar topic; Day of week; Time of day – 11 a.m. or 2 p.m.; Newsletter promotion –... Continue Reading
I’ve written about the importance of checking your residual plots when performing linear regression analysis. If you don’t satisfy the assumptions for an analysis, you might not be able to trust the results. One of the assumptions for regression analysis is that the residuals are normally distributed. Typically, you assess this assumption using the normal probability plot of the residuals. Are... Continue Reading
Previously, I showed why there is no R-squared for nonlinear regression. Anyone who uses nonlinear regression will also notice that there are no P values for the predictor variables. What’s going on? Just like there are good reasons not to calculate R-squared for nonlinear regression, there are also good reasons not to calculate P values for the coefficients. Why not—and what to use instead—are the... Continue Reading
In Blind Wine Part I, we introduced our experimental setup, which included some survey questions asked ahead of time of each participant. The four questions asked were: On a scale of 1 to 10, how would you rate your knowledge of wine? How much would you typically spend on a bottle of wine in a store? How many different types of wine (merlot, riesling, cabernet, etc.) would you buy regularly (not as... Continue Reading
Previously, I’ve written about when to choose nonlinear regression and how to model curvature with both linear and nonlinear regression. Since then, I’ve received several comments expressing confusion about what differentiates nonlinear equations from linear equations. This confusion is understandable because both types can model curves. So, if it’s not the ability to model a curve, what is the... Continue Reading
In regression analysis, you'd like your regression model to have significant variables and to produce a high R-squared value. This low P value / high R² combination indicates that changes in the predictors are related to changes in the response variable and that your model explains a lot of the response variability. This combination seems to go together naturally. But what if your regression model... Continue Reading
In Minitab, the Assistant menu is your interactive guide to choosing the right tool, analyzing data correctly, and interpreting the results. If you’re feeling a bit rusty with choosing and using a particular analysis, the Assistant is your friend! Previously, I’ve written about the new linear model features in Minitab 17. In this post, I’ll work through a multiple regression analysis example and... Continue Reading
If betting wasn't allowed on horse racing, the Kentucky Derby would likely be a little-known event of interest only to a small group of horse racing enthusiasts. But like the Tour de France, the World Cup, and the Masters Tournament, even those with little or no knowledge of the sport in general seem drawn to the excitement over its premier event—the mint juleps, the hats...and of course,... Continue Reading
In April 2012, I wrote a short paper on binary logistic regression to analyze wine tasting data. At that time, François Hollande was about to get elected as French president and in the U.S., Mitt Romney was winning the Republican primaries. That seems like a long time ago… Now, in 2014, Minitab 17 Statistical Software has just been released. Had Minitab 17 been available in 2012, would I have... Continue Reading
Nonlinear regression is a very powerful analysis that can fit virtually any curve. However, it's not possible to calculate a valid R-squared for nonlinear regression. This topic gets complicated because, while Minitab statistical software doesn’t calculate R-squared for nonlinear regression, some other packages do. So, what’s going on? Minitab doesn't calculate R-squared for nonlinear models... Continue Reading
We released Minitab 17 Statistical Software a couple of days ago. Certainly every new release of Minitab is a reason to celebrate. However, I am particularly excited about Minitab 17 from a data analyst’s perspective.  If you read my blogs regularly, you’ll know that I’ve extensively used and written about linear models. Minitab 17 has a ton of new features that expand and enhance many types of... Continue Reading
If you regularly perform regression analysis, you know that R² is a statistic used to evaluate the fit of your model. You may even know the standard definition of R²: the percentage of variation in the response that is explained by the model. Fair enough. With Minitab Statistical Software doing all the heavy lifting to calculate your R² values, that may be all you ever need to know. But if you’re... Continue Reading
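
The standard definition quoted in the last excerpt (the percentage of variation in the response that is explained by the model) can be made concrete with a few lines of code. The following is a minimal sketch in Python, not Minitab's implementation: the function names `fit_line` and `r_squared` and the sample data are hypothetical, chosen only for illustration, and the model is assumed to be a simple one-predictor least-squares line.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)                # total variation
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    return 1 - ss_error / ss_total


# Made-up illustrative data, not taken from any of the posts above.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
print(f"R-squared: {r_squared(y, fitted):.1%}")
```

Because the value is computed from a sample, it is itself an estimate, which is the point the earlier excerpt on R-squared makes.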