Stories and real-world examples that show you how to apply statistics and statistical software to solve problems.

The two previous posts in this series focused on manipulating data
using Minitab’s calculator and the Data menu. In this third and final
post, we continue to explore helpful features for working with text
data and will focus on some features in Minitab’s Editor menu.
Using the Editor Menu
The Editor menu is unique in that the options displayed
depend on what is currently active (worksheet, graph,... Continue Reading

My previous post focused on
manipulating text data using Minitab’s calculator.
In this post we continue to explore some of the useful tools for
working with text data, and here we’ll focus on Minitab’s Data
menu. This is the second in a 3-part series, and in the final post
we’ll look at the new features in Minitab’s Editor menu.
Using the Data Menu
When I think of the Data menu, I think... Continue Reading

With Minitab, it’s easy to create graphs and manage numeric,
date/time and text data. But Minitab’s enhanced data manipulation
features make it easier to work with text data, too.
This is the first of three posts in which I'm going to focus on
various tools in Minitab that are useful when working with text data,
including the Calculator, the Data menu, and the Editor menu.
Using the Calculator
You... Continue Reading

Choosing the right type of subgroup in a control chart is
crucial. In a rational subgroup, the variability within a subgroup
should reflect only common causes (random, short-term variability) and
represent “normal,” “typical,” natural process variation, whereas
differences between subgroups are useful to detect drifts in
variability over time (due to “special” or “assignable” causes).
Variation within... Continue Reading

You run a capability analysis and your Cpk is bad. Now what?
First, let’s start by defining what “bad” is. In simple terms, the
smaller the Cpk, the more defects you have. So the larger your Cpk
is, the better. Many practitioners use a Cpk of 1.33 as the gold
standard, so we’ll use that benchmark here, too.
Suppose we collect some data and run a capability analysis using
Minitab Statisti... Continue Reading
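To make the 1.33 benchmark concrete, here is a minimal Python sketch of the textbook Cpk formula, min(USL - mean, mean - LSL) / (3 * s). The measurements and specification limits are made up for illustration, and the sketch uses the overall sample standard deviation, which is a simplifying assumption: capability software typically estimates a within-subgroup standard deviation for Cpk, so the numbers may not match Minitab's output.

```python
from statistics import mean, stdev


def cpk(measurements, lsl, usl):
    """Textbook Cpk: distance from the mean to the nearer spec limit, in units of 3*s."""
    center = mean(measurements)
    # Assumption: overall sample standard deviation, not a within-subgroup estimate.
    spread = stdev(measurements)
    return min(usl - center, center - lsl) / (3 * spread)


# Hypothetical measurements and spec limits, for illustration only.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
value = cpk(sample, lsl=9.0, usl=11.0)
verdict = "meets" if value >= 1.33 else "falls below"
print(f"Cpk = {value:.2f}, which {verdict} the 1.33 benchmark")
```

Because the formula takes the minimum of the two distances, a process that sits close to one specification limit gets a low Cpk even when its spread is small.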

Histograms are one of the most common graphs used to display numeric
data. Anyone who takes a statistics course is likely to learn about
the histogram, and for good reason: histograms are easy to understand
and can instantly tell you a lot about your data.
Here are three of the most important things you can learn by
looking at a histogram.
Shape—Mirror, Mirror, On the Wall…
If the left side of a... Continue Reading

Did you ever wonder why statistical analyses and concepts often have
such weird, cryptic names?
One conspiracy theory points to the workings of a secret
committee called the ICSSNN. The International Committee for
Sadistic Statistical Nomenclature and Numerophobia was formed
solely to befuddle and subjugate the masses. Its mission: To select
the most awkward, obscure, and confusing name possible... Continue Reading

'Twas the season for toys recently, and Christmas day found me
playing around with a classic, the Etch-a-Sketch. As I noodled with
the knobs, I had a sudden flash of recognition: my drawing reminded
me of the Empirical CDF Plot in Minitab Statistical Software. Did you just ask,
"What's a CDF plot? And what's so empirical about it?" Both very
good questions. Let's start with the first, and we'll... Continue Reading

Data mining can be helpful in the exploratory phase of an
analysis. If you're in the early stages and you're just figuring
out which predictors are potentially correlated with your response
variable, data mining can help you identify candidates. However,
there are problems associated with using data mining to select
variables.
In my previous post, we used data mining to settle on
the following... Continue Reading

True or false: When comparing a parameter for two sets of
measurements, you should always use a hypothesis test to determine
whether the difference is statistically significant.
The answer? (drumroll...) True!
...and False!
To understand this paradoxical answer, you need to keep in mind
the difference between samples, populations, and descriptive and
inferential statistics.
Descriptive Statistics and... Continue Reading

Today, September 16, is World Ozone Day. You don't hear much about the
ozone layer any more.
In fact, if you’re under 30, you might think this is just
another trivial, obscure observance, along the lines of International Dot Day (yesterday) or National Apple Dumpling Day (tomorrow).
But there’s a good reason that, almost 30 years ago, the United
Nations designated today as a day to raise... Continue Reading

You’ve performed multiple linear regression and have settled on a
model that contains several predictor variables that are statistically
significant. At this point, it’s common to ask, “Which variable is
most important?”
This question is more complicated than it first appears. For one
thing, how you define “most important” often depends on your
subject area and goals. For another, how you collect... Continue Reading

In regression, "sums of squares" are used to represent
variation. In this post, we’ll use some sample data to walk through
these calculations.
The sample data used in this post is available within Minitab by
choosing Help > Sample Data, or File > Open Worksheet > Look in
Minitab Sample Data folder (depending on your version of Minitab).
The dataset is called ResearcherSalary.MTW, and contains data... Continue Reading
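As a rough illustration of how sums of squares split up variation, the Python sketch below fits a simple one-predictor least-squares line and computes the total, regression, and error sums of squares. The x and y values are made up for illustration and are not taken from ResearcherSalary.MTW, and the single-predictor setup is an assumption chosen to keep the arithmetic short.

```python
from statistics import mean


def sums_of_squares(x, y):
    """Return (SS total, SS regression, SS error) for a simple least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    ss_total = sum((yi - y_bar) ** 2 for yi in y)            # variation in y around its mean
    ss_regression = sum((fi - y_bar) ** 2 for fi in fitted)  # variation explained by the line
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # leftover variation
    return ss_total, ss_regression, ss_error


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ss_total, ss_regression, ss_error = sums_of_squares(x, y)
print(ss_total, ss_regression, ss_error)
print("R-squared:", ss_regression / ss_total)
```

For a least-squares fit with an intercept, the regression and error sums of squares add up to the total, which is why their ratio to the total can be read as a proportion of variation explained.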

I blogged a few months back about three different Minitab tools
you can use to examine your data over time. Did you know that
you can also use a simple run chart to display how your
process data changes over time? Of course those “changes” could be
evidence of special-cause variation, which a run chart can help you
see.
What’s special-cause variation, and how’s it different from
common-cause... Continue Reading

While some posts in our Minitab blog focus on
understanding t-tests and t-distributions, this post will focus
more simply on how to hand-calculate the t-value for a one-sample
t-test (and how to replicate the p-value that Minitab gives
us).
The formulas used in this post are available within Minitab
Statistical Software by choosing the following menu path:
Help > Methods and Formulas
> Basic... Continue Reading
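For readers who want to check a hand calculation, here is a small Python sketch of the standard one-sample t statistic, t = (sample mean - hypothesized mean) / (s / sqrt(n)), with n - 1 degrees of freedom. The sample values and the hypothesized mean are made up for illustration. The sketch stops at the t-value: turning it into a p-value requires the t distribution, which Python's standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t-value, degrees of freedom) for a one-sample t-test against mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # s / sqrt(n)
    t_value = (mean(sample) - mu0) / standard_error
    return t_value, n - 1


# Hypothetical sample and hypothesized mean, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_value, df = one_sample_t(sample, mu0=4)
print(f"t = {t_value:.4f} with {df} degrees of freedom")
```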

An outlier is an observation in a data set that lies a substantial
distance from other observations. These unusual observations can
have a disproportionate effect on statistics such as the mean,
which can lead to misleading results.
Outliers can provide useful information about your data or process,
so it's important to investigate them. Of course, you have to find
them first.
Finding... Continue Reading

Businesses are getting more and more data from existing and
potential customers: whenever we click on a web site, for example,
it can be recorded in the vendor's database. And whenever we use
electronic ID cards to access public transportation or other
services, our movements across the city may be analyzed.
In the very near future, connected objects such as cars and
electrical appliances will... Continue Reading

For hundreds of years, people have been improving their
situation by pulling themselves up by their bootstraps. Well, now
you can improve your statistical knowledge by pulling yourself up
by your bootstraps. Minitab Express has 7 different bootstrapping
analyses that can help you better understand the sampling
distribution of your data.
A sampling distribution describes the likelihood of... Continue Reading
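To give a feel for the general idea, here is a minimal Python sketch of bootstrapping a sample mean: resample the data with replacement many times, compute the mean of each resample, and treat the collection of means as an approximate sampling distribution. This is a generic illustration, not a description of how Minitab Express implements its analyses; the data, the function names, the seed, and the simple percentile interval are all assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_means(sample, n_resamples=1000, seed=1):
    """Means of resamples drawn with replacement, each the same size as the sample."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatable runs
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, confidence=0.95):
    """Simple percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(values)
    tail = (1 - confidence) / 2
    lower = ordered[round(tail * len(ordered))]
    upper = ordered[round((1 - tail) * len(ordered)) - 1]
    return lower, upper


# Hypothetical data, for illustration only.
data = [12.1, 9.8, 11.4, 10.6, 13.2, 10.1, 9.5, 12.7, 11.0, 10.9]
means = bootstrap_means(data)
low, high = percentile_interval(means)
print(f"Approximate 95% interval for the mean: {low:.2f} to {high:.2f}")
```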

Analysis of variance (ANOVA) can determine whether the means of
three or more groups are different. ANOVA uses F-tests to
statistically test the equality of means. In this post, I’ll show
you how ANOVA and F-tests work using a one-way ANOVA example.
But wait a minute...have you ever stopped to wonder why you’d
use an analysis of variance to determine whether
means are different? I'll also show how... Continue Reading
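As a hint at why an analysis of variance can say something about means, the Python sketch below computes the one-way ANOVA F statistic as the ratio of two variance estimates: the mean square between groups over the mean square within groups. The three groups are made up for illustration, and the function name is hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical groups, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print("F =", one_way_f(groups))
```

When the group means are far apart relative to the scatter inside each group, the numerator grows and F becomes large, which is the evidence against equal means.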

In statistics, t-tests are a type of hypothesis test that allows
you to compare means. They are called t-tests because each t-test
boils your sample data down to one number, the t-value. If you
understand how t-tests calculate t-values, you’re well on your way
to understanding how these tests work.
In this series of posts, I'm focusing on concepts rather than
equations to show how t-tests work.... Continue Reading