Blog posts and articles about statistical principles and their application in quality improvement methods such as Lean and Six Sigma.

As someone who has collected and analyzed real data for a
living, I find the idea of using simulated data for a Monte Carlo
simulation a bit odd. How can you improve a real product
with simulated data? In this post, I’ll help you understand the
methods behind Monte Carlo simulation and walk you through a
simulation example using Companion by Minitab.
Companion by Minitab is a software platform that... Continue Reading
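
To make the idea concrete, here is a minimal sketch of a Monte Carlo simulation in Python. It is not the Companion by Minitab workflow the post goes on to describe. The assembly model, the part dimensions, their distributions, and the spec limits are all hypothetical values chosen for illustration. The point is only the mechanism: draw simulated inputs from assumed distributions, push them through a model of the product, and count how often the output misses the spec.

```python
import random


def simulate_gap(n_trials=100_000, seed=1):
    """Estimate the fraction of assemblies whose gap falls outside spec.

    Hypothetical model (an assumption, not from the post):
        gap = housing - (part_a + part_b)
    """
    rng = random.Random(seed)
    lower, upper = 0.05, 0.35  # hypothetical spec limits for the gap
    out_of_spec = 0
    for _ in range(n_trials):
        # Each input is drawn from an assumed normal distribution.
        housing = rng.gauss(10.20, 0.03)
        part_a = rng.gauss(6.00, 0.02)
        part_b = rng.gauss(4.00, 0.02)
        gap = housing - (part_a + part_b)
        if not (lower <= gap <= upper):
            out_of_spec += 1
    return out_of_spec / n_trials


if __name__ == "__main__":
    print(f"Estimated out-of-spec fraction: {simulate_gap():.5f}")
```

Changing an input's standard deviation and rerunning shows how much each input contributes to defects, without building a single real part.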

Choosing the right type of subgroup in a control chart is
crucial. In a rational subgroup, the variability within a subgroup
should reflect only common-cause, random, short-term variability and
represent “normal,” “typical,” natural process variation, whereas
differences between subgroups are useful for detecting shifts and
drifts over time (due to “special” or “assignable” causes).
Variation within... Continue Reading

Grocery shopping. For some, it's the most dreaded household activity. For
others, it's fun, or perhaps just a “necessary evil.”
Personally, I enjoy it! My co-worker, Ginger, a content manager
here at Minitab, opened my eyes to something that made me love
grocery shopping even more: she shared the data behind her family’s
shopping trips. Being something of a data nerd, I really geeked out
over the... Continue Reading

Earlier, I wrote about the
different types of data statisticians typically encounter. In
this post, we're going to look at why, when given a choice in the
matter, we prefer to analyze continuous data rather than
categorical/attribute or discrete data.
As a reminder, when we assign something to a group or give it a
name, we have created attribute or
categorical data. If we count something,
like... Continue Reading

You run a capability analysis
and your Cpk is bad. Now what?
First, let’s start by defining
what “bad” is. In simple terms, the smaller the Cpk, the more
defects you have. So the larger your Cpk is, the
better. Many
practitioners use a Cpk of 1.33 as the gold standard, so we’ll
treat that as the gold standard here, too.
Suppose we collect some data and run a capability analysis using
Minitab
Statisti... Continue Reading
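
As a rough illustration of what the index measures, the sketch below computes Cpk from a sample and a pair of spec limits. It is a simplification, not Minitab's calculation: it uses the overall sample standard deviation, whereas Minitab bases Cpk on within-subgroup variation, so the two can differ. The function name and the sample data are hypothetical.

```python
import statistics


def cpk(data, lsl, usl):
    """Return a simplified Cpk: distance from the mean to the nearer
    spec limit, in units of three standard deviations.

    Assumption: overall sample standard deviation is used here.
    """
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return min(usl - mean, mean - lsl) / (3 * sd)


if __name__ == "__main__":
    sample = [9.0, 10.0, 11.0]  # hypothetical measurements
    print(cpk(sample, lsl=4.0, usl=16.0))  # centered process
    print(cpk(sample, lsl=4.0, usl=13.0))  # mean closer to the upper limit
```

Because the index uses the nearer spec limit, a process that drifts off center gets a lower Cpk even if its spread is unchanged.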

In Part 1 of Gauging Gage, I looked at how adequate a
sampling of 10 parts is for a Gage R&R Study and provided
some advice based on the results.
Now I want to turn my attention to the other two factors in the
standard Gage experiment: 3 operators and 2 replicates.
Specifically, what if instead of increasing the number of parts in
the experiment (my previous post demonstrated you would need... Continue Reading
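
For reference, the standard crossed design mentioned above (10 parts, 3 operators, 2 replicates) can be enumerated in a few lines of Python. This is only a sketch of the run list, not of the Gage R&R analysis itself. The function name is hypothetical, and shuffling all runs together is one simple randomization choice, not necessarily the scheme the author uses.

```python
import itertools
import random


def crossed_gage_design(n_parts=10, n_operators=3, n_replicates=2, seed=0):
    """Return every part/operator/replicate combination in shuffled order."""
    runs = [
        {"part": p, "operator": o, "replicate": r}
        for p, o, r in itertools.product(
            range(1, n_parts + 1),
            range(1, n_operators + 1),
            range(1, n_replicates + 1),
        )
    ]
    # Randomize the measurement order (an illustrative choice).
    random.Random(seed).shuffle(runs)
    return runs


if __name__ == "__main__":
    runs = crossed_gage_design()
    print(len(runs), "measurements in the standard design")
    print(len(crossed_gage_design(n_operators=4)), "with a fourth operator")
```

Counting runs this way makes the cost of each change explicit: every extra operator adds parts × replicates measurements, and every extra replicate adds parts × operators.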

by Kevin Clay, guest blogger
In transactional or service processes, we often deal with
lead-time data, and usually that data does not follow the normal
distribution.
Consider a Lean Six Sigma project to reduce the lead time
required to install an information technology solution at a
customer site. It should take no more than 30 days—working 10 hours
per day Monday–Friday—to complete, test and... Continue Reading

"You take 10 parts and have 3 operators measure each 2
times."
This standard approach to a Gage R&R experiment is so
common, so accepted, so ubiquitous that few people ever question
whether it is effective. Obviously one could look at whether
3 is an adequate number of operators or 2 an adequate number of
replicates, but in this first of a series of posts about
"Gauging Gage," I want to look at... Continue Reading

Everyone who analyzes data regularly has the experience of
getting a worksheet that just isn't ready to use. Previously I
wrote about tools you can use to
clean up and eliminate clutter in your data and
reorganize your data.
In this post, I'm going to
highlight tools that help you get the most out of messy data by
altering its characteristics.
Know Your Options
Many problems with data don't become... Continue Reading

You've collected a bunch of
data. It wasn't easy, but you did it. Yep, there it is, right
there...just look at all those numbers, right there in neat columns
and rows. Congratulations.
I hate to ask...but what are you
going to do with your data?
If you're not sure precisely
what to do with the data you've got, graphing it is a
great way to get some valuable insight and direction. And a good
graph to... Continue Reading

In my last post, I wrote about
making a cluttered data set easier to work with by removing
unneeded columns entirely, and by displaying just those columns you
want to work with now. But
too much unneeded data isn't always the problem.
What can you do when someone
gives you data that isn't organized the way you need it to be?
That happens for a variety of
reasons, but most often it's because the... Continue Reading

Isn't it great when you get a set of data and it's perfectly
organized and ready for you to analyze? I love it when the people
who collect the data take special care to make sure to format it
consistently, arrange it correctly, and eliminate the junk,
clutter, and useless information I don't need.
You've never received a data set in such perfect condition, you say?
Yeah, me neither. But I can... Continue Reading

In its industry guidance to companies that manufacture drugs and
biological products for people and animals,
the Food and Drug Administration (FDA) recommends three stages for
process validation:
Process Design,
Process Qualification, and Continued Process Verification. In
this post, we will focus on that third stage.
Stage 3: Continued Process Verification
Per the FDA guidelines, the goal of... Continue Reading

People can make mistakes when they test a hypothesis with
statistical analysis. Specifically, they can make either Type I or
Type II errors.
As you analyze your own data and test hypotheses, understanding
the difference between Type I and Type II errors is extremely
important, because there's a risk of making each type of error in
every analysis, and the amount of risk is in your
control.
So
if... Continue Reading
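
One way to see that the Type I risk is in your control is to simulate many experiments in which the null hypothesis is true and count how often a test rejects it anyway. The sketch below assumes a two-sided z-test with a known standard deviation of 1. The function name, sample size, and number of experiments are hypothetical choices, not taken from the post.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=30, n_experiments=5000, seed=7):
    """Fraction of true-null experiments that a two-sided z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_experiments):
        # The null is true: data really come from a mean-0, sd-1 normal.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: rejecting a true null
    return rejections / n_experiments


if __name__ == "__main__":
    for a in (0.10, 0.05, 0.01):
        print(f"alpha={a:.2f}  simulated Type I rate={type_i_error_rate(a):.3f}")
```

The simulated rejection rate should land close to whatever alpha you choose, which is the sense in which the analyst sets the Type I risk.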

Welcome to the Hypothesis Test Casino! The featured game of the
house is roulette. But this is no ordinary game of
roulette. This is p-value roulette!
Here’s how it works: We have two roulette wheels, the Null wheel
and the Alternative wheel. Each wheel has 20 slots (instead of the
usual 37 or 38). You get to bet on one slot.
What happens if the ball lands in the slot you bet on? Well,
that depends... Continue Reading

Like many, my introduction to 17th-century French philosophy came at the
tender age of 3+. For that is when I discovered the
Etch-a-Sketch®, an entertaining ode to Descartes' coordinate plane.
Little did I know that the seemingly idle hours I spent doodling
on my Etch-a-Sketch would prove to be excellent training for the
feat that I attempt today: plotting an Empirical Cumulative
Distribution... Continue Reading
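
For readers who want the arithmetic behind such a plot, here is a minimal Python sketch of the empirical cumulative distribution function: sort the data, then pair each value with the fraction of observations at or below it. The function name and the sample values are hypothetical, and this is not the Minitab procedure the post walks through.

```python
def ecdf(data):
    """Return (value, cumulative proportion) pairs for a step plot.

    Tied values appear once per observation; the last pair for a tied
    value carries the correct cumulative proportion.
    """
    xs = sorted(data)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


if __name__ == "__main__":
    for value, proportion in ecdf([3, 1, 2, 5, 4]):  # hypothetical data
        print(value, proportion)
```

Drawing horizontal and vertical segments between successive pairs gives the staircase shape of the plot.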

My colleague Cody Steele wrote a post that
illustrated how
the same set of data can appear to support two contradictory
positions. He showed how changing the scale of a graph that
displays mean and median household income over time drastically
alters the way it can be interpreted, even though there's no change
in the data being presented.
When we analyze data, we need to present the results in... Continue Reading

Right now I’m enjoying my daily dose of morning joe. As the steam rises
off the cup, the dark rich liquid triggers a powerful enzyme
cascade that jump-starts my brain and central nervous system,
delivering potent glints of perspicacity into the dark crevices of
my still-dormant consciousness.
Feels good, yeah! But is it good for me? Let’s see what the
studies say…
Drinking more than 4 cups of coffee... Continue Reading

Statistics can be challenging, especially if you're not
analyzing data and interpreting the results every day. Statistical
software makes things easier by handling the arduous
mathematical work involved in statistics. But ultimately, we're
responsible for correctly interpreting and communicating what the
results of our analyses show.
The p-value is probably the most frequently cited
statistic. We... Continue Reading

As a person who loves baking (and eating) cakes, I find it
bothersome to go through all the effort of baking a cake when the
end result is too dry for my taste. For that reason, I decided to
use a designed experiment in Minitab to help me reduce the moisture
loss in baked chocolate cakes, and find the optimal settings of my
input factors to produce a moist baked chocolate cake. I’ll share
the... Continue Reading