Blog posts and articles about the statistical method called Linear Regression and its use in real-world quality projects.

Overfitting a model is a real problem you need to beware of when
performing regression analysis. An overfit model can result in
misleading regression coefficients, p-values,
and R-squared statistics. Nobody wants that,
so let's examine what overfit models are, and how to avoid falling
into the overfitting trap.
Put simply, an overfit model is too complex for the data you're
analyzing. Rather than... Continue Reading

Maybe you're just getting started with analyzing data. Maybe
you're reasonably knowledgeable about statistics, but it's been a
long time since you did a particular analysis and you feel a little
bit rusty. In either case, the Assistant menu in Minitab Statistical Software
gives you an interactive guide from start to finish. It will help
you choose the right tool quickly, analyze your data... Continue Reading

In statistics, as in life, absolute certainty is rare. That's
why statisticians often can't provide a result that is as specific
as we might like; instead, they provide the results of an analysis
as a range, within which the data suggest the true answer lies.
Most of us are familiar with "confidence intervals," but that's
just one of several different kinds of intervals we can use to
characterize the... Continue Reading

Previously, I’ve written about when to choose nonlinear regression and
how to model curvature with both linear and
nonlinear regression. Since then, I’ve received several
comments expressing confusion about what differentiates nonlinear
equations from linear equations. This confusion is understandable
because both types can model curves.
So, if it’s not the ability to model a curve, what is the... Continue Reading

As someone who has collected and analyzed real data for a
living, I find the idea of using simulated data for a Monte Carlo
simulation a bit odd. How can you improve a real product
with simulated data? In this post, I’ll help you understand the
methods behind Monte Carlo simulation and walk you through a
simulation example using Companion by Minitab.
Companion by Minitab is a software platform that... Continue Reading
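To make the idea concrete, here is a minimal Monte Carlo sketch in Python. It is not the Companion by Minitab workflow from the post; the two-part assembly, the normal distributions, and the spec limits are all hypothetical values chosen for illustration. The point is only that you draw random inputs many times, push them through a transfer function, and summarize the simulated outputs.

```python
import random


def simulate_out_of_spec(n_trials, seed=1):
    """Estimate the fraction of simulated assemblies outside the spec limits.

    All numbers below are hypothetical: two part lengths are drawn from
    normal distributions, added together, and compared with assumed limits.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    out_of_spec = 0
    for _ in range(n_trials):
        part_a = rng.gauss(10.0, 0.1)  # assumed mean and standard deviation
        part_b = rng.gauss(5.0, 0.1)   # assumed mean and standard deviation
        total = part_a + part_b        # the transfer function: a simple sum
        if not (14.7 <= total <= 15.3):  # assumed lower and upper spec limits
            out_of_spec += 1
    return out_of_spec / n_trials


if __name__ == "__main__":
    fraction = simulate_out_of_spec(20_000)
    print(f"Estimated fraction out of spec: {fraction:.4f}")
```

Changing the assumed standard deviations and rerunning shows how sensitive the out-of-spec rate is to each input, which is how simulated data can inform decisions about a real product.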

Dear Readers,
As 2016 comes to a close, it’s time to reflect on the passage of time
and changes. As I’m sure you’ve guessed, I love statistics and
analyzing data! I also love talking and writing about it. In fact,
I’ve been writing statistical blog posts for over five years, and
it’s been an absolute blast. John Tukey, the renowned statistician,
once said, “The best thing about being a statistician... Continue Reading

by Matt Barsalou, guest blogger
I know that Thanksgiving is always on the fourth Thursday in
November, but somehow I failed to notice it was fast approaching
until the Monday before Thanksgiving. This led to frantically
sending a last-minute invitation, and a hunt for a turkey.
I live in Germany and this greatly complicated the matter. Not
only is Thanksgiving not celebrated, but also actual turkeys... Continue Reading

Since the release of Minitab
Express in 2014, we’ve often received questions in technical
support about the differences between Express and Minitab 17.
In this post, I’ll attempt to provide a comparison between these
two Minitab products.
What Is Minitab 17?
Minitab 17 is an all-in-one graphical and statistical analysis
package that includes basic analysis tools such as hypothesis
testing,... Continue Reading

Face it, you love regression analysis as much as I do.
Regression is one of the most satisfying analyses in Minitab:
get some predictors that should have a relationship to a response,
go through a model selection process, interpret fit statistics like
adjusted R² and predicted R², and make
predictions. Yes, regression really is quite wonderful.
Except when it’s not. Dark, seedy corners of the data... Continue Reading

You’ve performed multiple linear regression and have settled on a model
which contains several predictor variables that are statistically
significant. At this point, it’s common to ask, “Which variable is
most important?”
This question is more complicated than it first appears. For one
thing, how you define “most important” often depends on your
subject area and goals. For another, how you collect... Continue Reading

Design of Experiments (DOE) is the perfect tool to efficiently
determine if key inputs are related to key outputs. Behind the
scenes, DOE is simply a regression analysis. What’s not simple,
however, is all of the choices you have to make when planning your
experiment. What X’s should you test? What ranges should you select
for your X’s? How many replicates should you use? Do you need
center... Continue Reading

In my last post, we took the red pill and dove
deep into the unarguably fascinating and uncompromisingly
compelling world of the matrix plot. I've stuffed this post with
information about a topic of marginal interest...the marginal
plot.
Margins are important. Back in my English composition days, I
recall that margins were particularly prized for the inverse linear
relationship they maintained with... Continue Reading

Suppose you’ve collected data on cycle time, revenue, the
dimension of a manufactured part, or some other metric that’s
important to you, and you want to see what other variables may be
related to it. Now what?
When I graduated from college with my first statistics degree,
my diploma was bona fide proof that I'd endured hours and hours of
classroom lectures on various statistical topics, including
l... Continue Reading

For one reason or another, the response variable in a regression
analysis might not satisfy one or more of
the assumptions of ordinary least squares regression. The
residuals might follow a skewed distribution or the
residuals might curve as the predictions increase. A common
solution when problems arise with the assumptions of ordinary least
squares regression is to transform the response... Continue Reading

In my last post, I looked at
viewership data for the five seasons of HBO’s hit series Game of
Thrones. I created a time series plot in Minitab that showed how
viewership rose season by season, and how it varied episode by
episode within each season.
My next step is to fit a statistical model to the data, which
I hope will allow me to predict the viewing numbers for future
episodes.
I am going to... Continue Reading

In this post, I’ll address some common questions we’ve received
in technical support about
the difference between fitted and data means, where to find each
option within Minitab, and how Minitab calculates each.
First, let’s look at some definitions. It’s useful to have an example, so
I’ll be using the Light Output data set from Minitab’s Data Set
Library, which includes a description of the sample... Continue Reading

In the world of linear models, a hierarchical model contains all
lower-order terms that comprise the higher-order terms that also
appear in the model. For example, a model that includes the
interaction term A*B*C is hierarchical if it includes these terms:
A, B, C, A*B, A*C, and B*C.
Fitting the correct regression model can be as
much of an art as it is a science. Consequently, there's not always
a... Continue Reading
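The definition of a hierarchical model given above can be checked mechanically. The sketch below is a minimal Python illustration, not anything Minitab provides; the function names `lower_order_terms` and `is_hierarchical` are made up for this example, and terms are assumed to be written as factor names joined by `*`.

```python
from itertools import combinations


def lower_order_terms(term):
    """List every lower-order term implied by an interaction such as 'A*B*C'."""
    factors = term.split("*")
    terms = []
    # Orders 1 .. k-1: main effects first, then two-way interactions, and so on.
    for order in range(1, len(factors)):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


def is_hierarchical(model_terms):
    """Return True if every term's lower-order terms also appear in the model."""
    # Compare terms as sets of factors so that 'A*B' and 'B*A' match.
    present = {frozenset(t.split("*")) for t in model_terms}
    for term in model_terms:
        for lower in lower_order_terms(term):
            if frozenset(lower.split("*")) not in present:
                return False
    return True


if __name__ == "__main__":
    print(lower_order_terms("A*B*C"))
    # ['A', 'B', 'C', 'A*B', 'A*C', 'B*C']
    print(is_hierarchical(["A", "B", "C", "A*B", "A*C", "B*C", "A*B*C"]))  # True
    print(is_hierarchical(["A", "B", "A*B*C"]))  # False
```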

How deeply has statistical content from Minitab blog posts (or
other sources) seeped into your brain tissue? Rather than submit a
biopsy specimen from your temporal lobe for analysis, take this
short quiz to find out. Each question may have more than one
correct answer. Good luck!
Which of the following are famous figure skating pairs, and which are
methods for testing whether your data follow a... Continue Reading

If you perform linear regression analysis, you might need to
compare different regression lines to see if their constants and
slope coefficients are different. Imagine there is an established
relationship between X and Y. Now, suppose you want to determine
whether that relationship has changed. Perhaps there is a new
context, process, or some other qualitative change, and you want to
determine... Continue Reading
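As a rough illustration of the setup described above, the Python sketch below fits a separate least-squares line to each of two conditions and reports how far apart the constants and slopes are. The data and the names `y_before` and `y_after` are invented for this example. It only measures the differences; deciding whether they are statistically significant takes more, typically a single regression with an indicator variable for the condition and its interaction with X, which this sketch does not attempt.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (constant, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    return constant, slope


# Hypothetical data: the same X values observed under two conditions.
x = [1, 2, 3, 4, 5]
y_before = [2.1, 3.9, 6.2, 7.8, 10.1]
y_after = [2.0, 5.1, 7.9, 11.2, 13.8]

const_before, slope_before = fit_line(x, y_before)
const_after, slope_after = fit_line(x, y_after)

print(f"Before: y = {const_before:.2f} + {slope_before:.2f} x")
print(f"After:  y = {const_after:.2f} + {slope_after:.2f} x")
print(f"Difference in constants: {const_after - const_before:.2f}")
print(f"Difference in slopes:    {slope_after - slope_before:.2f}")
```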

With Speaker John Boehner resigning, Kevin McCarthy quitting before the
vote for him to be Speaker, and a possible government shutdown in
the works, the Freedom Caucus has certainly been in the news
frequently! Depending on your political bent, the Freedom Caucus
has caused quite a disruption for either good or bad.
Who are these politicians? The Freedom Caucus is a group of
approximately 40... Continue Reading