Blog posts and articles about the statistical method called Linear Regression and its use in real-world quality projects.

In my previous post, I showed you that the coefficients are
different when choosing (-1,0,1) vs (1,0) coding schemes for
General Linear Model (or Regression). We used the two different
equations to calculate the same fitted values. Here I will focus on
showing what the different coefficients represent.
Let's use the data and models from the last blog post:
We can display the means for
each level... Continue Reading
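
The point that two coding schemes give different coefficients but identical fitted values can be checked with a small sketch. The data, the level names, and the choice of the last level as the reference (or -1) level are all assumptions made up for illustration; they are not the data or the defaults from the post.

```python
# Made-up data: one categorical factor with three levels (assumption).
data = {"A": [10.0, 12.0, 11.0], "B": [15.0, 14.0, 16.0], "C": [20.0, 22.0, 21.0]}
levels = list(data)

def indicator_row(level):
    """(1,0) coding, with the last level as the reference (assumption)."""
    return [1.0] + [1.0 if level == l else 0.0 for l in levels[:-1]]

def effect_row(level):
    """(-1,0,1) coding: the last level is coded -1 in every column."""
    if level == levels[-1]:
        return [1.0] + [-1.0] * (len(levels) - 1)
    return [1.0] + [1.0 if level == l else 0.0 for l in levels[:-1]]

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(row_fn):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    X = [row_fn(lv) for lv in levels for _ in data[lv]]
    y = [v for lv in levels for v in data[lv]]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coefs, row_fn, level):
    """Fitted value for one level under the given coding."""
    return sum(c * x for c, x in zip(coefs, row_fn(level)))

b_ind = fit(indicator_row)  # intercept = mean of the reference level
b_eff = fit(effect_row)     # intercept = average of the level means

for lv in levels:
    print(lv, predict(b_ind, indicator_row, lv), predict(b_eff, effect_row, lv))
```

Under (1,0) coding each coefficient is a difference from the reference level's mean; under (-1,0,1) coding each coefficient is a difference from the average of the level means. The fitted value for every level is the same in both cases.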

Since Minitab 17 Statistical Software launched in February 2014,
we've gotten great feedback from many people who have been using
the General Linear Model and Regression tools.
But in speaking with people as part of Minitab's Technical
Support team, I've found many are noticing that there are two
coding schemes available with each. We frequently get calls from
people asking how the coding scheme you... Continue Reading

By Erwin Gijzen, Guest Blogger
In my previous post, we assessed the out-of-spec level for a
process with capability analysis and visualized process variability
using a control chart. Our goal is to reduce variability, but when
a process has a multitude of categorical and continuous variables,
identifying root causes can be a huge challenge. Analyzing
covariance—using the statistical technique... Continue Reading

By Erwin Gijzen, Guest Blogger
People who work in quality improvement know that the root causes
of quality issues are hard to find. A typical production process
can contain hundreds of potential causes. Additionally, companies
often produce products with multiple quality requirements, such as
dimensions, surface appearance, and impact resistance.
With so many variables, it’s no wonder many companies... Continue Reading

We’ve been pretty excited about March Madness here at Minitab.
Kevin Rudy’s been busy creating his regression model and
predicting the winners for the 2015 NCAA Men’s Basketball
Tournament. But we’re not the only ones. Lots of folks are
doing their best analysis to help you plan out your bracket now
that the tip-offs for the round of 64 are just a day away. As you
ponder your last-minute changes,... Continue Reading

As someone who has collected and analyzed real data for a living,
I find the idea of using simulated data for a Monte Carlo
simulation a bit odd.
How can you improve a real product with simulated data? In this
post, I’ll help you understand the methods behind Monte Carlo
simulation and walk you through a simulation example using
Devize.
What is Devize, you ask? Devize is Minitab's
exciting new,... Continue Reading

Choosing the correct linear regression model can be difficult.
After all, the world and how it works is complex. Trying to model it
with only a sample doesn’t make it any easier. In this post, I'll
review some common statistical methods for selecting models, discuss
complications you may face, and provide some practical advice for
choosing the best regression model.
It starts when a researcher wants to... Continue Reading

Stepwise regression and best subsets regression are both
automatic tools that help you identify useful predictors during the
exploratory stages of model building for linear regression. These
two procedures use different methods and present you with different
output.
An obvious question arises. Does one procedure pick the true
model more often than the other? I’ll tackle that question in this
post.
Fi... Continue Reading

Using a sample to estimate the properties of an entire population
is common practice in statistics. For example, the mean from a
random sample estimates that parameter for an entire population. In linear
regression analysis, we’re used to the idea that the regression coefficients are estimates of the
true parameters. However, it’s easy to forget that R-squared
(R²) is also an estimate.... Continue Reading
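
One way to see that R-squared is an estimate is to draw repeated samples from the same population and watch it change. The sketch below assumes a simple straight-line model with made-up parameters (intercept 2.0, slope 0.5, noise standard deviation 2.0); none of these values come from the post.

```python
import random
from statistics import mean

def r_squared(x, y):
    """R-squared for a simple linear regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    # With one predictor, R-squared equals the squared correlation.
    return sxy ** 2 / (sxx * syy)

def simulate(n_samples=200, n_obs=20, seed=1):
    """Draw repeated samples from one assumed population and record R-squared."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        x = [rng.uniform(0, 10) for _ in range(n_obs)]
        y = [2.0 + 0.5 * xi + rng.gauss(0, 2.0) for xi in x]
        results.append(r_squared(x, y))
    return results

values = simulate()
print("min:", min(values), "mean:", mean(values), "max:", max(values))
```

Every sample comes from the same underlying relationship, yet the sample R-squared differs from draw to draw, just as a sample mean or a regression coefficient would.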

I’ve written about the importance of checking your residual plots when performing
linear regression analysis. If you don’t satisfy the assumptions
for an analysis, you might not be able to trust the results. One of
the assumptions for regression analysis is that the residuals are
normally distributed. Typically, you assess this assumption using
the normal probability plot of the residuals.
Are... Continue Reading

Previously, I showed why there is no R-squared for nonlinear regression. Anyone
who uses nonlinear regression will also notice that there are no P
values for the predictor variables. What’s going on?
Just like there are good reasons not to calculate R-squared for
nonlinear regression, there are also good reasons not to calculate
P values for the coefficients.
Why not—and what to use instead—are the... Continue Reading

Previously, I’ve written about when to choose nonlinear regression and
how to model curvature with both linear and
nonlinear regression. Since then, I’ve received several
comments expressing confusion about what differentiates nonlinear
equations from linear equations. This confusion is understandable
because both types can model curves.
So, if it’s not the ability to model a curve, what is the... Continue Reading

There is more than just the p value in a probability plot—the
overall graphical pattern also provides a great deal of useful
information. Probability plots are a powerful tool to better
understand your data.
In this post, I intend to present the main principles of
probability plots and focus on their visual interpretation using
some real data.
In probability plots, the data density distribution... Continue Reading

In regression analysis, you'd like your regression model to have
significant variables and to produce a high R-squared value. This
low P value / high R² combination indicates that changes
in the predictors are related to changes in the response variable
and that your model explains a lot of the response variability.
This combination seems to go together naturally. But what if
your regression model... Continue Reading

You know what really gets on my nerves? A lot of things.
That slow, slinky way that cats walk by. Grrrr.
The rude, abrupt arrival of delivery persons in their
obnoxiously loud trucks. (Why do they always pull up
just as I’m settling down for a nap?) Grrrr.
Total strangers who reach down and poke me with fat, clumsy
fingers that reek of antibacterial soap.
Grrrr.
And this one always gets my dander up:... Continue Reading

In Minitab, the Assistant menu is your interactive guide to choosing
the right tool, analyzing data correctly, and interpreting the
results. If you’re feeling a bit rusty with choosing and using a
particular analysis, the Assistant is your friend!
Previously, I’ve written about the new linear model features in Minitab 17. In
this post, I’ll work through a multiple regression analysis example
and... Continue Reading

Nonlinear regression is a very powerful
analysis that can fit virtually any curve. However, it's not
possible to calculate a valid R-squared for nonlinear regression.
This topic gets complicated because, while Minitab statistical software doesn’t calculate R-squared for
nonlinear regression, some other packages do.
So, what’s going on?
Minitab doesn't calculate R-squared for nonlinear models... Continue Reading

By popular demand, Release 17 of Minitab
Statistical Software comes with a new graphical analysis called
the Bubble Plot.
This exploratory tool is great for visualizing the relationships
among three variables on a single plot.
To see how it works, consider the total medal count by country
from the recently completed 2014 Olympic Winter Games. Suppose I
want to explore whether there might be a... Continue Reading

We released Minitab 17 Statistical Software a couple of days ago.
Certainly every new release of Minitab is a reason to celebrate.
However, I am particularly excited about Minitab 17 from a data
analyst’s perspective.
If you read my blogs regularly, you’ll know that I’ve
extensively used and written about linear models. Minitab 17 has a
ton of new features that expand and enhance many types of... Continue Reading

R-squared gets all of the attention when it comes to determining
how well a linear
model fits the data. However, I've stated previously that R-squared is overrated. Is there a different
goodness-of-fit statistic that can be more helpful? You bet!
Today, I’ll highlight a sorely underappreciated regression
statistic: S, or the standard error of the regression. S provides
important information that... Continue Reading
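
As a rough illustration of the statistic named above, S is the square root of the residual sum of squares divided by the residual degrees of freedom, so it is expressed in the units of the response. The sketch below computes S alongside R-squared for a simple linear regression; the five data points are made up for illustration.

```python
from math import sqrt
from statistics import mean

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope and intercept for y = intercept + slope * x.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)      # residual sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

r_squared = 1.0 - sse / sst
# Two parameters (intercept and slope) are estimated, so the residual
# degrees of freedom are n - 2.
s = sqrt(sse / (len(x) - 2))

print("R-squared:", r_squared)
print("S:", s)
```

R-squared is a unitless proportion, while S reports the typical size of a residual on the response scale.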