The Pareto chart is a graphic representation of the 80/20 rule, also
known as the Pareto principle. If you're a quality improvement
specialist, you know that the chart is named after the early
20th-century economist Vilfredo Pareto, who discovered that roughly 20%
of the population in Italy owned about 80% of the property at that
time.
You probably also know that the Pareto principle was... Continue Reading
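
As a rough illustration of what a Pareto chart displays, the sketch below sorts categories from largest to smallest count and accumulates their share of the total. The defect categories and counts are invented for this example, and `pareto_table` is a hypothetical helper name, not part of any particular software package.

```python
def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows


# Hypothetical defect counts, made up purely for illustration
defects = {
    "scratches": 50,
    "dents": 30,
    "misalignment": 10,
    "discoloration": 6,
    "other": 4,
}

for category, count, cum_pct in pareto_table(defects):
    print(f"{category:15s} {count:3d} {cum_pct:6.1f}%")
```

In this made-up data set, the two largest of the five categories account for 80% of the defects, which is the kind of pattern the chart is meant to expose.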

About a year ago, a reader asked if I could try to explain
degrees of freedom in statistics. Since then,
I’ve been circling around that request very cautiously, like it’s
some kind of wild beast that I’m not sure I can safely wrestle to
the ground.
Degrees of freedom aren’t easy to explain. They come up in many
different contexts in statistics—some advanced and complicated. In
mathematics, they're... Continue Reading

I live with a German national, who often tells me that we Americans
spend way too much of our lives at work. He also
frequently comments that we work much less efficiently than Germans
do, during the increased time we’re at work.
Which reminds me—I need to pay my water bill online...
Okay, I’m back. Quick, wasn’t it? So convenient. Now, where was
I? Oh, work habits.
After checking the hourly weather... Continue Reading

There's nothing like a boxplot, aka box-and-whisker diagram, to
get a quick snapshot of the distribution of your data. With a
single glance, you can readily intuit its general shape, central
tendency, and variability.
To easily compare the distribution of data between groups, display
boxplots for the groups side by side. Visually compare the central
value and spread of the distribution for each... Continue Reading

How deeply has statistical content from Minitab blog posts (or
other sources) seeped into your brain tissue? Rather than submit a
biopsy specimen from your temporal lobe for analysis, take this
short quiz to find out. Each question may have more than one
correct answer. Good luck!
Which of the following are famous figure skating pairs, and which are
methods for testing whether your data follow a... Continue Reading

Did you ever wonder why statistical analyses and concepts often have
such weird, cryptic names?
One conspiracy theory points to the workings of a secret
committee called the ICSSNN. The International Committee for
Sadistic Statistical Nomenclature and Numerophobia was formed
solely to befuddle and subjugate the masses. Its mission: To select
the most awkward, obscure, and confusing name possible... Continue Reading

If you use ordinary linear regression with a response of count data,
it may work out fine (Part 1), or you may run into some problems
(Part 2).
Given that a count response could be problematic, why not use a
regression procedure developed to handle a response of counts?
A Poisson regression analysis is designed to analyze a
regression model with a count response.
First, let's try using Poisson... Continue Reading

My previous post showed an example of using
ordinary linear regression to model a count response. For that particular count data, shown by the blue
circles on the dot plot below, the model assumptions for linear
regression were adequately satisfied.
But frequently, count data may contain many values equal to or
close to 0. Also, the distribution of the counts may be
right-skewed. In the quality field,... Continue Reading

Ever use dental floss to cut soft cheese? Or Alka Seltzer to
clean your toilet bowl? You can find a host of nonconventional uses for ordinary objects
online. Some are more peculiar than others.
Ever use ordinary linear regression to evaluate a response
(outcome) variable of counts?
Technically, ordinary linear regression was designed to evaluate
a continuous response variable. A continuous... Continue Reading

I've never understood the fascination with selfies.
Maybe it's because I'm over 50. After surviving the slings and
arrows of half a century on Earth, the minute or two I spend in
front of the bathroom mirror each morning is more than
enough selfie time for me.
Still, when I heard that Microsoft had an online app that estimates
the age of any face in a photo, I was intrigued.
How would the app... Continue Reading

It’s usually not a good idea to rely solely on a single
statistic to draw conclusions about your process. Do that, and you
could fall into the clutches of the “duck-rabbit” illusion shown
here:
If you fix your eyes solely on the duck, you’ll miss the
rabbit—and vice-versa.
If you're using Minitab Statistical Software for capability analysis, the
capability indices Cp and Cpk are good examples of... Continue Reading
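
To make the point about relying on a single statistic concrete, here is a minimal sketch using the common textbook definitions Cp = (USL - LSL) / 6σ and Cpk = min(USL - mean, mean - LSL) / 3σ. The spec limits, means, and standard deviation are hypothetical values chosen for illustration, and σ is simply taken as given; how a particular software package estimates it is not covered here.

```python
def cp(lsl, usl, sigma):
    """Potential capability: spec width relative to process spread (ignores centering)."""
    return (usl - lsl) / (6 * sigma)


def cpk(lsl, usl, mean, sigma):
    """Capability that accounts for centering: distance to the nearer spec limit."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Hypothetical spec limits and process parameters, for illustration only
LSL, USL, SIGMA = 10.0, 16.0, 1.0

for label, mean in [("centered", 13.0), ("off-center", 15.0)]:
    print(f"{label:10s} Cp = {cp(LSL, USL, SIGMA):.2f}  Cpk = {cpk(LSL, USL, mean, SIGMA):.2f}")
```

Both hypothetical processes have the same Cp, because Cp does not depend on the process mean. Only Cpk changes when the process drifts toward a spec limit, so looking at either index alone can hide half the picture.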

I always knew I was different. Even as a kid.
“Is that me? Way out there in left field?” I asked the doc.
“Yes,” he nodded, as he looked at my chart. “I used brushing to
identify you on the graph.”
I wasn’t sure I liked getting brushed. It felt like my true
identity was being detected and displayed in a window for all to
see.
The doctor must have sensed my discomfort.
“It’s not uncommon—even for those... Continue Reading

Right now I’m enjoying my daily dose of morning joe. As the steam rises
off the cup, the dark rich liquid triggers a powerful enzyme
cascade that jump-starts my brain and central nervous system,
delivering potent glints of perspicacity into the dark crevices of
my still-dormant consciousness.
Feels good, yeah! But is it good for me? Let’s see what the
studies say…
Drinking more than 4 cups of coffee... Continue Reading

If you’re not a statistician, looking through statistical output
can sometimes make you feel a bit like Alice in
Wonderland. Suddenly, you step into a fantastical world
where strange and mysterious phantasms appear out of nowhere.
For example, consider the T and P in your t-test results.
“Curiouser and curiouser!” you might exclaim, like Alice, as you
gaze at your output.
What are these values,... Continue Reading

"He looks just like his father...and mother!"
Popular morphing sites online let you visualize the
hypothetical offspring of some very unlikely couples.
The baby of Albert Einstein and Kim Kardashian
(Kimbert?) would presumably look something like the image
shown at right.
What happens if you morph the features of two different
graphs?
For example, what would the baby of a time series plot and... Continue Reading

The word kurtosis sounds like a painful, festering
disease of the gums. But the term actually describes the shape of a
data distribution.
Frequently, you'll see kurtosis defined as how sharply "peaked"
the data are. The three main types of kurtosis are shown below.
Lepto means "thin" or "slender" in Greek. In
leptokurtosis, the kurtosis value is high.
Platy means "broad" or "flat"—as in duck-billed
pl... Continue Reading
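
For readers who want to see a kurtosis value computed, the sketch below uses the simple moment-based excess kurtosis, m4 / m2² - 3, where m2 and m4 are the second and fourth central moments. This is one common definition; statistical packages often apply small-sample adjustments, so their reported values may differ. The two data sets are invented for illustration.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (about 0 for normal data)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Invented samples: one evenly spread, one mostly identical with two extreme values
flat = [1, 2, 3, 4, 5]
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]

print(excess_kurtosis(flat))          # negative: platykurtic
print(excess_kurtosis(heavy_tailed))  # positive: leptokurtic
```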

Do you suffer from PAAA (Post-Analysis Assumption Angst)? You’re
not alone.
Checking the required assumptions for a statistical
analysis is critical. But if you don’t have a Ph.D. in statistics,
it can feel more complicated and confusing than the primary
analysis itself.
How does the cuckoo egg data, a common sample data set often used to teach
analysis of variance, satisfy the following
formal... Continue Reading

If you teach statistics or quality statistics, you’re probably already
familiar with the cuckoo egg data set.
The common cuckoo has decided that raising baby chicks is a
stressful, thankless job. It has better things to do than fill the
screeching, gaping maws of cuckoo chicks, day in and day out.
So the mother cuckoo lays her eggs in the nests of other bird
species. If the cuckoo egg is similar... Continue Reading

You know what really gets on my nerves? A lot of things.
That slow, slinky way that cats walk by. Grrrr.
The rude, abrupt arrival of delivery persons in their
obnoxiously loud trucks. (Why do they always pull up
just as I’m settling down for a nap?) Grrrr.
Total strangers who reach down and poke me with fat, clumsy
fingers that reek of antibacterial soap.
Grrrr.
And this one always gets my dander up:... Continue Reading

These days, my memory isn't what it used to be. Besides that, my memory
isn't what it used to be.
But my incurable case of CRS (Can't Remember Stuff) is
not nearly as bad as that of the exponential distribution.
When modelling failure data for reliability analysis, the
exponential distribution is completely memoryless. It retains no
record of the previous failure of an item.
That might sound like a... Continue Reading
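
The memoryless property can be stated as P(T > s + t | T > s) = P(T > t): having already survived s time units does not change the chance of surviving t more. The sketch below checks this numerically from the exponential survival function exp(-rate * t). The failure rate and times are arbitrary values picked for illustration.

```python
import math


def survival(t, rate):
    """P(T > t) for an exponential distribution with the given failure rate."""
    return math.exp(-rate * t)


def conditional_survival(s, t, rate):
    """P(T > s + t | T > s): chance of lasting t more units after surviving s."""
    return survival(s + t, rate) / survival(s, rate)


# Arbitrary illustrative values: failure rate per hour, hours already survived
RATE = 0.05
for already_survived in (0, 10, 100):
    p = conditional_survival(already_survived, 10, RATE)
    print(f"survived {already_survived:3d} h -> P(10 more hours) = {p:.4f}")
```

Under these assumptions the printed probability is the same for every value of `already_survived`, which is exactly what "memoryless" means.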