Blog posts and articles about the role of the normal distribution in statistics, data analysis, and quality improvement.

Here is a scenario involving process capability that we’ve seen
from time to time in Minitab's technical support department. I’m
sharing the details in this post so that you’ll know where to look
if you encounter a similar situation.
You need to run a capability analysis. You generate the output
using Minitab Statistical Software. When you look at the results, the Cpk is
huge and the histogram in... Continue Reading

In my last post, we took the red pill and dove
deep into the unarguably fascinating and uncompromisingly
compelling world of the matrix plot. I've stuffed this post with
information about a topic of marginal interest...the marginal
plot.
Margins are important. Back in my English composition days, I
recall that margins were particularly prized for the inverse linear
relationship they maintained with... Continue Reading

Earlier this month, PLOS.org published an article titled "Ten Simple
Rules for Effective Statistical Practice." The 10 rules are good
reading for anyone who draws conclusions and makes decisions based on
data, whether you're trying to extend the boundaries of scientific
knowledge or make good decisions for your business.
Carnegie Mellon University's Robert E. Kass and several co-authors
devised... Continue Reading

For one reason or another, the response variable in a regression
analysis might not satisfy one or more of
the assumptions of ordinary least squares regression. The
residuals might follow a skewed distribution or the
residuals might curve as the predictions increase. A common
solution when problems arise with the assumptions of ordinary least
squares regression is to transform the response... Continue Reading

For hundreds of years, people have been improving their
situation by pulling themselves up by their bootstraps. Well, now
you can improve your statistical knowledge by pulling yourself up
by your bootstraps. Minitab Express has 7 different bootstrapping
analyses that can help you better understand the sampling
distribution of your data.
A sampling distribution describes the likelihood of... Continue Reading
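
To make the bootstrapping idea concrete, here is a minimal sketch in plain Python rather than Minitab Express. It resamples a small data set with replacement and uses the spread of the resampled means to approximate the sampling distribution of the mean. The data values, the function names `bootstrap_means` and `percentile_interval`, and the choice of 2,000 resamples are illustrative assumptions, not anything taken from the post or from Minitab.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=2000, seed=0):
    """Return the mean of each resample drawn with replacement from data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, confidence=0.95):
    """Simple percentile interval: cut off equal tails of the sorted values."""
    ordered = sorted(values)
    alpha = (1 - confidence) / 2
    lo_index = int(alpha * (len(ordered) - 1))
    hi_index = int((1 - alpha) * (len(ordered) - 1))
    return ordered[lo_index], ordered[hi_index]


# Hypothetical measurements, made up for illustration only.
sample = [12.1, 9.8, 11.4, 10.6, 13.2, 9.1, 10.9, 12.7, 11.8, 10.2]

means = bootstrap_means(sample)
low, high = percentile_interval(means)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap percentile interval for the mean: ({low:.2f}, {high:.2f})")
```

The percentile interval is only one of several ways to summarize a bootstrap distribution; it is used here because it is the easiest to read.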

Once upon a time, when people wanted to compare the standard
deviations of two samples, they had two handy tests available, the
F-test and Levene's test.
Statistical lore has it that the F-test is so named because
it so frequently fails you.1
Although the F-test is suitable for data that are normally
distributed, its sensitivity to departures from
normality limits when and where it can be used.
Leve... Continue Reading
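
As a rough illustration of the two tests named above, the sketch below computes the F-test's variance ratio and a Levene-type statistic in plain Python. It assumes the median-centered (Brown-Forsythe) form of Levene's test, which may differ from the exact variant the post goes on to discuss, and it stops at the test statistics without computing p-values. The function names and data are hypothetical.

```python
import statistics


def variance_ratio(a, b):
    """F statistic for two samples: ratio of the sample variances."""
    return statistics.variance(a) / statistics.variance(b)


def levene_statistic(*groups):
    """Levene-type W statistic using absolute deviations from each group's median."""
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = statistics.fmean([x for d in deviations for x in d])
    # Spread of the group mean deviations around the overall mean deviation.
    between = sum(len(d) * (statistics.fmean(d) - grand_mean) ** 2 for d in deviations)
    # Spread of individual deviations around their own group mean deviation.
    within = sum((x - statistics.fmean(d)) ** 2 for d in deviations for x in d)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical samples: the second is three times as spread out as the first.
narrow = [1, 2, 3, 4, 5]
wide = [3, 6, 9, 12, 15]

print("variance ratio:", variance_ratio(wide, narrow))
print("Levene-type W:", levene_statistic(narrow, wide))
```

Because the Levene-type statistic works on absolute deviations from the median instead of squared deviations from the mean, it is less affected by skewed or heavy-tailed data than the variance ratio.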

In the first part of this series, we looked at a case study where
staff at a hospital used ATP swab tests to test 8 surfaces for
bacteria in 10 different hospital rooms across 5 departments. ATP
measurements below 400 units pass the swab test, while measurements
greater than or equal to 400 units fail the swab test and require
further investigation.
I offered two tips on exploring and visualizing... Continue Reading

Working with healthcare-related data often feels different than
working with manufacturing data. After all, the common thread among
healthcare quality improvement professionals is the motivation to
preserve and improve the lives of patients. Whether collecting data
on the number of patient falls, patient length-of-stay, bed
unavailability, wait times, hospital-acquired infections, or
readmissions,... Continue Reading

T-tests are handy hypothesis tests in statistics when you want to
compare means. You can compare a sample mean to a hypothesized or
target value using a one-sample t-test. You can compare the means
of two groups with a two-sample t-test. If you have two groups with
paired observations (e.g., before and after measurements), use the
paired t-test.
How do t-tests work? How do t-values fit in? In this... Continue Reading
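
For readers who want to see where a t-value comes from, here is a minimal sketch of the three t-statistics described above, written in plain Python. Each one is a difference divided by its standard error. The two-sample version assumes the unequal-variance (Welch) form, which is an assumption on my part, and the function names and data are made up for illustration. The sketch stops at the t-value and does not compute p-values.

```python
import math
import statistics


def one_sample_t(sample, target):
    """t-value for comparing a sample mean to a hypothesized or target value."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - target) / standard_error


def two_sample_t(a, b):
    """t-value for comparing two group means (Welch form, unequal variances)."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.fmean(a) - statistics.fmean(b)) / standard_error


def paired_t(before, after):
    """t-value for paired observations: a one-sample test on the differences."""
    differences = [y - x for x, y in zip(before, after)]
    return one_sample_t(differences, 0.0)


# Hypothetical before-and-after measurements on the same five items.
before = [10.2, 11.5, 9.8, 12.1, 10.9]
after = [10.9, 11.9, 10.6, 12.4, 11.8]

print("one-sample t vs. target 10:", one_sample_t(before, 10))
print("two-sample t:", two_sample_t(after, before))
print("paired t:", paired_t(before, after))
```

Note that the paired test is just a one-sample test applied to the differences, which is why pairing can detect a shift that the two-sample test misses when the items themselves vary a lot.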

About a year ago, a reader asked if I could try to explain
degrees of freedom in statistics. Since then,
I’ve been circling around that request very cautiously, like it’s
some kind of wild beast that I’m not sure I can safely wrestle to
the ground.
Degrees of freedom aren’t easy to explain. They come up in many
different contexts in statistics—some advanced and complicated. In
mathematics, they're... Continue Reading

Five-point Likert scales are commonly associated with surveys and are
used in a wide variety of settings. You’ve run into the Likert scale if
you’ve ever been asked whether you strongly agree, agree, neither
agree nor disagree, disagree, or strongly disagree about something.
The worksheet to the right shows what five-point Likert data look
like when you have two groups.
Because Likert item data are... Continue Reading

In my last post, I discussed how a DOE was
chosen to optimize a chemical-mechanical polishing process in
the microelectronics industry. This important process improved the
plant's final manufacturing yields. We selected an experimental
design that let us study the effects of six process parameters in
16 runs.
Analyzing the Design
Now we'll examine the analysis of the DOE results after the
actual... Continue Reading

Like so many of us, I try to stay healthy by watching my weight.
I thought it might be interesting to apply some statistical
thinking to the idea of maintaining a healthy weight, and the
central limit theorem could provide some particularly useful
insights. I’ll start by making some simple (maybe even simplistic)
assumptions about calorie intake and expenditure, and see where
those lead. And then... Continue Reading

There's nothing like a boxplot, aka box-and-whisker diagram, to
get a quick snapshot of the distribution of your data. With a
single glance, you can readily intuit its general shape, central
tendency, and variability.
To easily compare the distribution of data between groups, display
boxplots for the groups side by side. Visually compare the central
value and spread of the distribution for each... Continue Reading

How deeply has statistical content from Minitab blog posts (or
other sources) seeped into your brain tissue? Rather than submit a
biopsy specimen from your temporal lobe for analysis, take this
short quiz to find out. Each question may have more than one
correct answer. Good luck!
Which of the following are famous figure skating pairs, and which are
methods for testing whether your data follow a... Continue Reading

When you work in data analysis, you quickly discover an
irrefutable fact: a lot of people just can't stand
statistics. Some people fear the math, some fear what the data
might reveal, some people find it deadly dull, and others think
it's bunk. Many don't even really know why they hate
statistics—they just do. Always have, probably always
will.
Problem is, that means we who analyze data need to
com... Continue Reading

There are many reasons why a distribution might not be
normal/Gaussian. A non-normal pattern might be caused by several
distributions being mixed together, a drift over time, one or
several outliers, asymmetrical behavior, out-of-control points,
and so on.
I recently collected the scores of three different teams (the
Blue team, the Yellow team and the Pink team) after a laser... Continue Reading

Control charts are a fantastic tool. These charts plot your
process data to identify common cause and special cause variation.
By identifying the different causes of variation, you can take
action on your process without over-controlling it.
Assessing the stability of a process can help you determine
whether there is a problem and identify the source of the problem.
Is the mean too high, too low,... Continue Reading

By Matthew Barsalou, guest blogger
A problem must be understood before it can be properly
addressed. A thorough understanding of the problem is critical when
performing a root cause analysis (RCA), and an RCA is necessary if an
organization wants to implement corrective actions that truly
address the root cause of the problem. An RCA may also be necessary
for process improvement projects; it is... Continue Reading

Since it's the Halloween season, I want to share how a classic
horror film helped me get a handle on an extremely useful
statistical distribution.
The film is based on John W. Campbell's classic novella "Who Goes
There?", but I first became familiar with it from John
Carpenter's 1982 film The Thing.
In the film, researchers in the Antarctic encounter a predatory
alien with a truly frightening... Continue Reading