Blog posts and articles about statistics principles and how they apply to quality improvement methods like Lean and Six Sigma.

Do you recall my “putting the cart before the horse” analogy in
part 1 of this blog series? The comparison is simple.
We all, at times, put the cart before the horse in relatively
innocuous ways, such as eating dessert before dinner, or deciding
what to wear before we’ve been invited to the party. But performing
some tasks in the wrong order, such as
running a statistical... Continue Reading

Once upon a time, when people wanted to compare the standard
deviations of two samples, they had two handy tests available, the
F-test and Levene's test.
Statistical lore has it that the F-test is so named because
it so frequently fails you.1
Although the F-test is suitable for data that are normally
distributed, its sensitivity to departures from
normality limits when and where it can be used.
Leve... Continue Reading

Along with the explosion of interest in visualizing data over
the past few years has been an excessive focus on how attractive
the graph is at the expense of how useful it is. Don't get me
wrong...I believe that a colorful, modern graph comes across better
than a black-and-white, pixelated one. Unfortunately, however, all
the talk seems to be about the attractiveness and not the value of
the... Continue Reading

About a year ago, a reader asked if I could try to explain
degrees of freedom in statistics. Since then,
I’ve been circling around that request very cautiously, like it’s
some kind of wild beast that I’m not sure I can safely wrestle to
the ground.
Degrees of freedom aren’t easy to explain. They come up in many
different contexts in statistics—some advanced and complicated. In
mathematics, they're... Continue Reading

Like so many of us, I try to stay healthy by watching my weight.
I thought it might be interesting to apply some statistical
thinking to the idea of maintaining a healthy weight, and the
central limit theorem could provide some particularly useful
insights. I’ll start by making some simple (maybe even simplistic)
assumptions about calorie intake and expenditure, and see where
those lead. And then... Continue Reading

You have a column of categorical data. Maybe it’s a column of
reasons for production downtime, or customer survey responses, or
all of the reasons airlines give for those riling flight delays.
Whatever type of qualitative data you may have, suppose you want to
find the most common categories. Here are three different ways to
do that:
1. Pareto Charts
Pareto Charts easily help you separate the vital... Continue Reading
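
For readers who want to see the idea behind ranking categories outside of Minitab, here is a minimal Python sketch of the tally that underlies a Pareto chart: count each category, sort from most to least common, and track the cumulative share. The downtime reasons below are hypothetical data made up for illustration; they do not come from the post.

```python
from collections import Counter

# Hypothetical downtime reasons, for illustration only.
reasons = ["jam", "changeover", "jam", "no operator", "jam", "changeover", "power"]

counts = Counter(reasons)          # tally each category
total = sum(counts.values())

cumulative = 0
# most_common() returns categories sorted from most to least frequent
for category, n in counts.most_common():
    cumulative += n
    print(f"{category}: {n} ({cumulative / total:.0%} cumulative)")
```

The cumulative column is what lets you see how few categories account for most of the observations.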

If you need to assess process performance relative to some
specification limit(s), then process capability is the tool to use.
You collect some accurate data from a stable process, enter those
measurements in Minitab, and then choose Stat > Quality Tools >
Capability Analysis/Sixpack or Assistant > Capability Analysis.
Now, what about sorting the data?
I’ve been asked “why does Cpk change when I... Continue Reading

In my time at Minitab, I’ve gotten a good understanding of what
types of graphs users create. Everyone knows about histograms, bar
charts, and time series plots. Even relatively less familiar plots
like the interval plot and
individual value plot are still used quite often.
However, one of the most underutilized graphs we have available is
the area graph. If you’re not familiar with an Area... Continue Reading

In an earlier post, I shared an overview of acceptance sampling, a
method that lets you evaluate a sample of items from a larger batch
of products (for instance, electronics components you've sourced
from a new supplier) and use that sample to decide whether to
accept or reject the entire shipment.
There are two approaches to acceptance sampling. If you do it by
attributes, you... Continue Reading

If you're just getting started in the world of quality
improvement, or if you find yourself in a position where you
suddenly need to evaluate the quality of incoming or outgoing
products from your company, you may have encountered the term
"acceptance sampling." It's a statistical method for evaluating the
quality of a large batch of materials from a small sample of items,
which statistical
softwar... Continue Reading

In my last post, I walked through the steps to install Minitab 17
on a Mac using Apple Boot Camp.
Minitab 17 can also be installed on a Mac using desktop
virtualization software.
In addition to your Mac, you’ll need:
An ISO copy of Windows 7 or later
Minitab 17 Statistical Software
Desktop virtualization software allows you to install and use
Windows on your Intel-based Mac without requiring... Continue Reading

While Minitab 17 is currently a Windows-only application, some
people who need to use it have only a Mac available for the
installation.
It is possible to run Minitab 17 on a Macintosh, though the
steps involved in the installation can seem a little daunting at
first. In the Technical Support department, we sometimes hear
reluctance in people’s voices when we throw... Continue Reading

Not long ago, I couldn’t abide
statistics. I did respect
it, but in much the same way a
gazelle respects a lion. Most of my early experiences with
statistics indicated that close encounters resulted in pain, so I
avoided further contact whenever possible.
So how is it that today I write about statistics? That’s simple:
it merely required completely reinventing the way I thought about
and approached... Continue Reading

There are many reasons why a distribution might not be
normal/Gaussian. A non-normal pattern might be caused by several
distributions being mixed together, by a drift over time, by one or
several outliers, by asymmetrical behavior, by out-of-control
points, and so on.
I recently collected the scores of three different teams (the
Blue team, the Yellow team and the Pink team) after a laser... Continue Reading

Since it's the Halloween season, I want to share how a classic
horror film helped me get a handle on an extremely useful
statistical distribution.
The film is based on John W. Campbell's classic novella "Who Goes
There?", but I first became familiar with it from John
Carpenter's 1982 film The Thing.
In the film, researchers in the Antarctic encounter a predatory
alien with a truly frightening... Continue Reading

Step 3 in our DOE problem-solving methodology is to determine how
many times to replicate the base experiment plan. The discussion in
Part 3 ended with the conclusion that our 4 factors could best be
studied using all 16 combinations of the high and low settings for
each factor, a full factorial. Each
golfer will perform half of the sixteen possible combinations and
each golfer’s data could stand as... Continue Reading
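
To make the "16 combinations" concrete, here is a small Python sketch that enumerates a two-level full factorial for four factors. The factor labels A through D are placeholders I've assumed for illustration, since the actual factor names are not given in this excerpt.

```python
from itertools import product

# Hypothetical factor labels; the real factor names are not given in this excerpt.
FACTORS = ("A", "B", "C", "D")
LEVELS = ("low", "high")

def full_factorial(factors, levels):
    """Return every combination of levels across the factors as a list of dicts."""
    return [dict(zip(factors, combo))
            for combo in product(levels, repeat=len(factors))]

runs = full_factorial(FACTORS, LEVELS)
print(len(runs))          # number of runs in the base design
for run in runs[:3]:      # show the first few runs
    print(run)
```

With two levels per factor, the run count doubles with each added factor, which is why replication decisions matter so much for the total size of the experiment.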

I read trade publications that cover everything from banking to
biotech, looking for interesting perspectives on data analysis and
statistics, especially where they pertain to quality
improvement.
Recently I read a great blog post from Tony Taylor, an analytical
chemist with a background in pharmaceuticals. In it, he discusses
the implications of the FDA's updated guidance for industry analytical... Continue Reading

September 17 marked the release of new information from the
American Community Survey (ACS) from the U.S. Census Bureau. Here’s
a bar chart of what the press releases looked like for that
day:
Clearly there was a theme in play, one that was great news for
major metropolitan areas. The Census Bureau even released a graph showing that the percentage of people
within the 25 most populous metropolitan... Continue Reading

You run a capability analysis and your Cpk is bad. Now what?
First, let’s define what “bad” is. In simple terms, the smaller the
Cpk, the more defects you have. So the larger your Cpk is, the
better. Many practitioners use a Cpk of 1.33 as the gold standard,
so we’ll treat that as the gold standard here, too.
Suppose we collect some data and run a capability analysis using
Minitab
Statisti... Continue Reading
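
As a rough illustration of how a Cpk value is compared with the 1.33 benchmark, here is a minimal Python sketch using the textbook form: the smaller of the distances from the mean to each specification limit, divided by three standard deviations. Everything below is an assumption made for illustration: the measurements and specification limits are made up, and the sketch uses the overall sample standard deviation as a stand-in for sigma. Minitab estimates within-subgroup variation for Cpk, so its results will generally differ from this simplified calculation.

```python
from statistics import mean, stdev

def cpk(data, lsl, usl):
    """Smaller of the two one-sided capability indices.

    Uses the overall sample standard deviation as a simplifying assumption.
    """
    mu = mean(data)
    sigma = stdev(data)
    return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))

# Hypothetical measurements and specification limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
value = cpk(measurements, lsl=9.0, usl=11.0)
print(round(value, 2), "meets 1.33" if value >= 1.33 else "below 1.33")
```

Because the index takes the minimum of the two sides, a process that drifts toward either limit gets a lower Cpk even if its spread stays the same.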

Statisticians say the darndest things. At least, that's how it
can seem if you're not well-versed in statistics.
When I began studying statistics, I approached it as a language.
I quickly noticed that compared to other disciplines, statistics
has some unique problems with terminology, problems that don't
affect most scientific and academic specialties.
For example, dairy science has a highly... Continue Reading