Blog posts and articles about regression analysis methods applied to Lean and Six Sigma projects.

For
one reason or another, the response variable in a regression
analysis might not satisfy one or more of
the assumptions of ordinary least squares regression. The
residuals might follow a skewed distribution or the
residuals might curve as the predictions increase. A common
solution when problems arise with the assumptions of ordinary least
squares regression is to transform the response...
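
To make the idea of transforming the response a little more concrete, here is a minimal sketch. The teaser does not say which transformation the post uses, so the natural log here is an assumption, and the numbers are made up for illustration. It applies the transform to a right-skewed set of response values and compares the skewness before and after.

```python
import math

def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed response values (not from the post).
response = [1, 2, 3, 5, 8, 13, 40, 120]

# Assumed transformation: natural log. Other choices (square root,
# Box-Cox) are equally plausible for a real data set.
log_response = [math.log(v) for v in response]

raw_skew = skewness(response)
log_skew = skewness(log_response)
print(f"skewness before: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

In a real analysis you would refit the model on the transformed response and check the residual plots again, rather than looking at the response alone.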

Analysis of variance (ANOVA) can determine whether the means of
three or more groups are different. ANOVA uses F-tests to
statistically test the equality of means. In this post, I’ll show
you how ANOVA and F-tests work using a one-way ANOVA example.
But wait a minute...have you ever stopped to wonder why you’d
use an analysis of variance to determine whether
means are different? I'll also show how...
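
As a rough illustration of why an analysis of *variance* can say something about *means*, the sketch below computes a one-way ANOVA F-statistic by hand. The data are hypothetical and the function name is mine, not Minitab's. The F-statistic is the ratio of the variability between group means to the variability within groups.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A large F means the group means are spread out more than the within-group noise would suggest. Turning F into a p-value requires the F distribution with these degrees of freedom, which this sketch leaves out.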

Depending on how often and when you use statistical software like
Minitab, there may be specific tools or a group of tools you
find yourself using over and over again. You may have to do a monthly report, for
instance, for which you use one tool in our Basic Statistics menu,
another in Quality Tools, and a third in
Regression.
But there are a lot of functions and capabilities in our
software, and...

About
a year ago, a reader asked if I could try to explain
degrees of freedom in statistics. Since then,
I’ve been circling around that request very cautiously, like it’s
some kind of wild beast that I’m not sure I can safely wrestle to
the ground.
Degrees of freedom aren’t easy to explain. They come up in many
different contexts in statistics—some advanced and complicated. In
mathematics, they're...

I’ve written about R-squared before and I’ve concluded that it’s
not as intuitive as it seems at first glance. It can be a
misleading statistic because a high R-squared is not always good and a low
R-squared is not always bad. I’ve even said that R-squared is overrated and that the standard error of the estimate (S) can be
more useful.
Even though I haven't always been enthusiastic about...
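
For readers who want to see the two statistics side by side, here is a small sketch that fits a simple linear regression and computes both R-squared and S. The data and function names are hypothetical. It also shows one reason S can be easier to interpret: it is expressed in the units of the response, so rescaling the response changes S but leaves R-squared alone.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared_and_s(x, y):
    """Return (R-squared, S) for a simple linear regression of y on x."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst
    s = math.sqrt(sse / (len(x) - 2))  # two estimated coefficients
    return r2, s

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r2, s = r_squared_and_s(x, y)
r2_scaled, s_scaled = r_squared_and_s(x, [10 * yi for yi in y])
print(r2, s, r2_scaled, s_scaled)
```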

When running a binary logistic regression and many other
analyses in Minitab, we estimate parameters for a specified model
based on the sample data that has been collected. Most of the time,
we use what is called Maximum Likelihood Estimation. However, based
on specifics within your data, sometimes these estimation methods
fail. What happens then?
Specifically, during binary logistic regression, an...

What is an interaction? It’s when the effect of one factor
depends on the level of another factor. Interactions are important
when you’re performing ANOVA, DOE, or a regression analysis.
Without them, your model may be missing an important term that
helps explain variability in the response!
For example, let’s consider 3-point shooting in the NBA. We
previously saw that the number of 3-point...
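
The definition above can be checked with nothing more than cell means. The sketch below uses a made-up 2x2 layout (factors A and B, each at a "low" and "high" level) and is not the NBA data from the post. If the effect of A is the same at both levels of B, there is no interaction. If it differs, the difference is the interaction.

```python
# Hypothetical cell means, keyed by (level of A, level of B).
cell_means = {
    ("low", "low"): 10.0,
    ("high", "low"): 12.0,
    ("low", "high"): 11.0,
    ("high", "high"): 19.0,
}

def effect_of_a(means, b_level):
    """Change in the response when A goes from low to high, at a fixed level of B."""
    return means[("high", b_level)] - means[("low", b_level)]

effect_at_b_low = effect_of_a(cell_means, "low")
effect_at_b_high = effect_of_a(cell_means, "high")

# A nonzero difference means the effect of A depends on the level of B.
interaction = effect_at_b_high - effect_at_b_low
print(effect_at_b_low, effect_at_b_high, interaction)
```

Here A raises the response by 2 when B is low but by 8 when B is high, so a model with only the main effects of A and B would miss that pattern.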

In my last post, I looked at
viewership data for the five seasons of HBO’s hit series Game of
Thrones. I
created a time series plot in Minitab that showed how
viewership rose season by season, and how it varied episode by
episode within each season.
My next step is to fit a statistical model to the data, which
I hope will allow me to predict the viewing numbers for future
episodes.
I am going to...

In statistics, there are things you need to do so you can trust
your results. For example, you should check the sample size, the
assumptions of the analysis, and so on. In regression analysis, I
always urge people to check their residual plots.
In this blog post, I present one more thing you should do so you
can trust your regression results in certain
circumstances—standardize the continuous...
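
The teaser is cut off before it names the circumstances, so treat the following as an assumption on my part: one common case is a model with polynomial or interaction terms, where a predictor and its square are strongly correlated. The sketch below uses hypothetical data and only centers the predictor (subtracts the mean), which is one form of standardizing, and compares the correlation between the predictor and its squared term before and after.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical continuous predictor.
x = list(range(1, 11))

# Center the predictor by subtracting its mean.
mean_x = sum(x) / len(x)
centered = [v - mean_x for v in x]

raw_r = pearson_r(x, [v ** 2 for v in x])
centered_r = pearson_r(centered, [v ** 2 for v in centered])
print(f"raw: {raw_r:.3f}, centered: {centered_r:.3f}")
```

The correlation drops to zero here only because the example values are symmetric around their mean. With real data, centering usually reduces the correlation without eliminating it.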

In the world of linear models, a hierarchical model contains all
lower-order terms that comprise the higher-order terms that also
appear in the model. For example, a model that includes the
interaction term A*B*C is hierarchical if it includes these terms:
A, B, C, A*B, A*C, and B*C.
Fitting the correct regression model can be as
much of an art as it is a science. Consequently, there's not always
a...
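
The definition of a hierarchical model is mechanical enough to check in code. This is a minimal sketch with hypothetical helper names, assuming terms are written as factor names joined by `*`, as in the A*B*C example above.

```python
from itertools import combinations

def lower_order_terms(term):
    """List every lower-order term contained in a term such as 'A*B*C'."""
    factors = term.split("*")
    terms = []
    for order in range(1, len(factors)):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms

def is_hierarchical(model_terms):
    """True if every lower-order term of every model term is also in the model."""
    present = set(model_terms)
    return all(
        lower in present
        for term in model_terms
        for lower in lower_order_terms(term)
    )

needed = lower_order_terms("A*B*C")
full_model = needed + ["A*B*C"]
sparse_model = ["A", "B", "A*B*C"]

print(needed)
print(is_hierarchical(full_model), is_hierarchical(sparse_model))
```

The sketch assumes factor names within a term always appear in the same order, so it would not recognize `B*A` as the same term as `A*B`.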

How deeply has statistical content from Minitab blog posts (or
other sources) seeped into your brain tissue? Rather than submit a
biopsy specimen from your temporal lobe for analysis, take this
short quiz to find out. Each question may have more than one
correct answer. Good luck!
Which
of the following are famous figure skating pairs, and which are
methods for testing whether your data follow a...

If you perform linear regression analysis, you might need to
compare different regression lines to see if their constants and
slope coefficients are different. Imagine there is an established
relationship between X and Y. Now, suppose you want to determine
whether that relationship has changed. Perhaps there is a new
context, process, or some other qualitative change, and you want to
determine...

When you work in data analysis, you quickly discover an
irrefutable fact: a lot of people just can't stand
statistics. Some people fear the math, some fear what the data
might reveal, some people find it deadly dull, and others think
it's bunk. Many don't even really know why they hate
statistics—they just do. Always have, probably always
will.
Problem is, that means we who analyze data need to
com...

The
College Football Playoff technically doesn't start until December
31st, but in reality it started Saturday night in Indianapolis. The
winner of the Big Ten Championship Game was in the playoff, while
the loser was out. The stakes couldn't have been higher. So the
competitors need to make sure they gain every advantage they can.
And that's where 4th down decisions come in. With a lot of...

This week is the annual Thanksgiving holiday in the United
States, a period when we are encouraged to eat turkey and
cranberries, then consider the blessings in our lives before
falling into a comfortable pre-football nap. That includes many of
us here at Minitab.
Consequently,
we won't have new posts for you over the next two days. But
one of the things I'm grateful for is having had the...

Did
you ever wonder why statistical analyses and concepts often have
such weird, cryptic names?
One conspiracy theory points to the workings of a secret
committee called the ICSSNN. The International Committee for
Sadistic Statistical Nomenclature and Numerophobia was formed
solely to befuddle and subjugate the masses. Its mission: To select
the most awkward, obscure, and confusing name possible...

By Matthew Barsalou, guest
blogger
A problem must be understood before it can be properly
addressed. A thorough understanding of the problem is critical when
performing a
root cause analysis (RCA) and an RCA is necessary if an
organization wants to implement corrective actions that truly
address the root cause of the problem. An RCA may also be necessary
for process improvement projects; it is...

In Part 5 of our series, we began the analysis of
the experiment data by reviewing analysis of covariance and
blocking variables, two key concepts in the design and
interpretation of your results.
The
250-yard marker at the Tussey Mountain Driving Range, one of the
locations where we conducted our golf experiment. Some of the
golfers drove their balls well beyond this 250-yard marker during a
few of...

In
Part 3 of our series, we decided to test our 4
experimental factors, Club Face Tilt, Ball Characteristics, Club
Shaft Flexibility, and Tee Height in a full factorial design
because of the many advantages of that data collection plan.
In Part 4 we concluded that each golfer
should replicate their half fraction of the full factorial 5 times
in order to have a high enough power to detect...
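
To picture the design described above, here is a rough sketch that enumerates a full factorial for the four factors and then takes a half fraction. Two things are assumptions on my part, not details from the series: that each factor has two levels (coded -1 and +1), and that the half fraction is chosen with the defining relation I = ABCD. The actual split used in the golf experiment may differ.

```python
from itertools import product

factors = [
    "Club Face Tilt",
    "Ball Characteristics",
    "Club Shaft Flexibility",
    "Tee Height",
]

# Assumed: two levels per factor, coded -1 (low) and +1 (high).
full_factorial = [
    dict(zip(factors, levels))
    for levels in product((-1, 1), repeat=len(factors))
]

def level_product(run):
    """Product of the coded levels in one run."""
    result = 1
    for level in run.values():
        result *= level
    return result

# Assumed defining relation I = ABCD: keep runs whose level product is +1.
half_fraction = [run for run in full_factorial if level_product(run) == 1]

replicates = 5  # from the text: each half fraction is replicated 5 times
runs_per_golfer = len(half_fraction) * replicates
print(len(full_factorial), len(half_fraction), runs_per_golfer)
```

Under these assumptions the full design has 16 runs, each half fraction has 8, and 5 replicates give 40 drives per golfer.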

With
Speaker John Boehner resigning, Kevin McCarthy quitting before the
vote for him to be Speaker, and a possible government shutdown in
the works, the Freedom Caucus has certainly been in the news
frequently! Depending on your political bent, the Freedom Caucus
has caused quite a disruption for either good or bad.
Who are these politicians? The Freedom Caucus is a group of
approximately 40...