Regression Analysis

Blog posts and articles about regression analysis methods applied to Lean and Six Sigma projects.

For one reason or another, the response variable in a regression analysis might not satisfy one or more of the assumptions of ordinary least squares regression. The residuals might follow a skewed distribution or the residuals might curve as the predictions increase. A common solution when problems arise with the assumptions of ordinary least squares regression is to transform the response... Continue Reading
Analysis of variance (ANOVA) can determine whether the means of three or more groups are different. ANOVA uses F-tests to statistically test the equality of means. In this post, I’ll show you how ANOVA and F-tests work using a one-way ANOVA example. But wait a minute...have you ever stopped to wonder why you’d use an analysis of variance to determine whether means are different? I'll also show how... Continue Reading
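To make the "variance to compare means" idea concrete, here is a minimal sketch of the one-way ANOVA F-statistic computed by hand. It is not taken from the post itself: the function name `one_way_anova_f` and the sample data are hypothetical, chosen only for illustration. The F-value is the ratio of the variability between group means to the variability within groups.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of numbers, one inner list per group.
    Hypothetical helper for illustration only.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: how far observations sit from their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k

    # Mean squares are variances; F is their ratio.
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up data: three groups whose means are 2, 3, and 4.
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_value, df_b, df_w)
```

A large F suggests the group means differ by more than the within-group scatter would explain; turning F into a p-value requires the F-distribution with the two degrees of freedom returned here.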

Depending on how often and when you use statistical software like Minitab, there may be specific tools or a group of tools you find yourself using over and over again. You may have to do a monthly report, for instance, for which you use one tool in our Basic Statistics menu, another in Quality Tools, and a third in Regression. But there are a lot of functions and capabilities in our software, and... Continue Reading
About a year ago, a reader asked if I could try to explain degrees of freedom in statistics. Since then, I’ve been circling around that request very cautiously, like it’s some kind of wild beast that I’m not sure I can safely wrestle to the ground. Degrees of freedom aren’t easy to explain. They come up in many different contexts in statistics—some advanced and complicated. In mathematics, they're... Continue Reading
I’ve written about R-squared before and I’ve concluded that it’s not as intuitive as it seems at first glance. It can be a misleading statistic because a high R-squared is not always good and a low R-squared is not always bad. I’ve even said that R-squared is overrated and that the standard error of the estimate (S) can be more useful. Even though I haven’t always been enthusiastic about... Continue Reading
When running a binary logistic regression and many other analyses in Minitab, we estimate parameters for a specified model based on the sample data that has been collected. Most of the time, we use what is called Maximum Likelihood Estimation. However, based on specifics within your data, sometimes these estimation methods fail. What happens then? Specifically, during binary logistic regression, an... Continue Reading
What is an interaction? It’s when the effect of one factor depends on the level of another factor. Interactions are important when you’re performing ANOVA, DOE, or a regression analysis. Without them, your model may be missing an important term that helps explain variability in the response! For example, let’s consider 3-point shooting in the NBA. We previously saw that the number of 3-point... Continue Reading
In my last post, I looked at viewership data for the five seasons of HBO’s hit series Game of Thrones. I created a time series plot in Minitab that showed how viewership rose season by season, and how it varied episode by episode within each season. My next step is to fit a statistical model to the data, which I hope will allow me to predict the viewing numbers for future episodes. I am going to... Continue Reading
In statistics, there are things you need to do so you can trust your results. For example, you should check the sample size, the assumptions of the analysis, and so on. In regression analysis, I always urge people to check their residual plots. In this blog post, I present one more thing you should do so you can trust your regression results in certain circumstances—standardize the continuous... Continue Reading
In the world of linear models, a hierarchical model contains all lower-order terms that comprise the higher-order terms that also appear in the model. For example, a model that includes the interaction term A*B*C is hierarchical if it includes these terms: A, B, C, A*B, A*C, and B*C. Fitting the correct regression model can be as much of an art as it is a science. Consequently, there's not always a... Continue Reading
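The definition of a hierarchical model above can be checked mechanically. The following is a minimal sketch, assuming terms are written as strings with `*` between factor names (as in the A*B*C example); the helper names `parse_term`, `missing_lower_order_terms`, and `is_hierarchical` are hypothetical and are not part of Minitab or any library.

```python
from itertools import combinations


def parse_term(term):
    """Split a term such as 'A*B*C' into a frozenset of factor names."""
    return frozenset(part.strip() for part in term.split("*"))


def missing_lower_order_terms(terms):
    """List the lower-order terms a model needs in order to be hierarchical."""
    model = {parse_term(t) for t in terms}
    missing = set()
    for term in model:
        factors = sorted(term)
        # Every non-empty proper subset of a term's factors must also be a term.
        for size in range(1, len(factors)):
            for subset in combinations(factors, size):
                if frozenset(subset) not in model:
                    missing.add("*".join(subset))
    # Sort by order (number of factors), then alphabetically.
    return sorted(missing, key=lambda t: (t.count("*"), t))


def is_hierarchical(terms):
    """True when every higher-order term has all of its lower-order terms."""
    return not missing_lower_order_terms(terms)


print(is_hierarchical(["A", "B", "C", "A*B", "A*C", "B*C", "A*B*C"]))
print(missing_lower_order_terms(["A", "B", "A*B*C"]))
```

The first call uses exactly the term list from the example, so it reports a hierarchical model; the second drops several terms and lists what would have to be added back.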
How deeply has statistical content from Minitab blog posts (or other sources) seeped into your brain tissue? Rather than submit a biopsy specimen from your temporal lobe for analysis, take this short quiz to find out. Each question may have more than one correct answer. Good luck! Which of the following are famous figure skating pairs, and which are methods for testing whether your data follow a... Continue Reading
If you perform linear regression analysis, you might need to compare different regression lines to see if their constants and slope coefficients are different. Imagine there is an established relationship between X and Y. Now, suppose you want to determine whether that relationship has changed. Perhaps there is a new context, process, or some other qualitative change, and you want to determine... Continue Reading
When you work in data analysis, you quickly discover an irrefutable fact: a lot of people just can't stand statistics. Some people fear the math, some fear what the data might reveal, some people find it deadly dull, and others think it's bunk. Many don't even really know why they hate statistics—they just do. Always have, probably always will. Problem is, that means we who analyze data need to com... Continue Reading
The College Football Playoff technically doesn't start until December 31st, but in reality it started Saturday night in Indianapolis. The winner of the Big Ten Championship Game was in the playoff, while the loser was out. The stakes couldn't have been higher. So the competitors need to make sure they gain every advantage they can. And that's where 4th down decisions come in. With a lot of... Continue Reading
This week is the annual Thanksgiving holiday in the United States, a period where we are encouraged to eat turkey and cranberries, then consider the blessings in our lives before falling into a comfortable pre-football nap. That includes many of us here at Minitab. Consequently, we won't have new posts for you over the next two days. But one of the things I'm grateful for is having had the... Continue Reading
Did you ever wonder why statistical analyses and concepts often have such weird, cryptic names? One conspiracy theory points to the workings of a secret committee called the ICSSNN. The International Committee for Sadistic Statistical Nomenclature and Numerophobia was formed solely to befuddle and subjugate the masses. Its mission: To select the most awkward, obscure, and confusing name possible... Continue Reading
By Matthew Barsalou, guest blogger. A problem must be understood before it can be properly addressed. A thorough understanding of the problem is critical when performing a root cause analysis (RCA), and an RCA is necessary if an organization wants to implement corrective actions that truly address the root cause of the problem. An RCA may also be necessary for process improvement projects; it is... Continue Reading
In Part 5 of our series, we began the analysis of the experiment data by reviewing analysis of covariance and blocking variables, two key concepts in the design and interpretation of your results. The 250-yard marker at the Tussey Mountain Driving Range, one of the locations where we conducted our golf experiment. Some of the golfers drove their balls well beyond this 250-yard marker during a few of... Continue Reading
In Part 3 of our series, we decided to test our 4 experimental factors, Club Face Tilt, Ball Characteristics, Club Shaft Flexibility, and Tee Height in a full factorial design because of the many advantages of that data collection plan. In Part 4 we concluded that each golfer should replicate their half fraction of the full factorial 5 times in order to have a high enough power to detect... Continue Reading
With Speaker John Boehner resigning, Kevin McCarthy quitting before the vote for him to be Speaker, and a possible government shutdown in the works, the Freedom Caucus has certainly been in the news frequently! Depending on your political bent, the Freedom Caucus has caused quite a disruption for either good or bad. Who are these politicians? The Freedom Caucus is a group of approximately 40... Continue Reading