Data analysis gives you the keys to manufacturing the best product, providing the best services, or answering an academic research question. I’ll share practical tidbits that may help you do just that.
In statistics, there are things you need to do so you can trust
your results. For example, you should check the sample size, the
assumptions of the analysis, and so on. In regression analysis, I
always urge people to check their residual plots.
In this blog post, I present one more thing you should do so you
can trust your regression results in certain
circumstances—standardize the continuous... Continue Reading
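The teaser above is cut off, so the exact procedure is not shown here. As a minimal sketch, assuming "standardize" means subtracting the mean and dividing by the sample standard deviation (the function name `standardize` and the sample values are hypothetical):

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation.

    Assumption: this is the usual z-score form of standardization;
    the truncated post may use a different variant (e.g. centering only).
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]

# Hypothetical predictor values, for illustration only.
z = standardize([2.0, 4.0, 6.0, 8.0])
print(z)
```

After this transformation the values have a mean of zero and a sample standard deviation of one.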
In the world of linear models, a hierarchical model contains all
lower-order terms that comprise the higher-order terms that also
appear in the model. For example, a model that includes the
interaction term A*B*C is hierarchical if it includes these terms:
A, B, C, A*B, A*C, and B*C.
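The definition above can be checked mechanically. Here is a small sketch, assuming terms are written as strings of distinct factor names joined by `*`; the helper names `lower_order_terms` and `is_hierarchical` are hypothetical, not part of any statistics package:

```python
from itertools import combinations

def lower_order_terms(term, sep="*"):
    """List every lower-order term implied by an interaction such as 'A*B*C'.

    Assumes the factors in the term are distinct (no polynomial terms like A*A).
    """
    factors = term.split(sep)
    terms = []
    for order in range(1, len(factors)):
        for combo in combinations(factors, order):
            terms.append(sep.join(combo))
    return terms

def is_hierarchical(model_terms, sep="*"):
    """Return True if each term's lower-order terms also appear in the model."""
    # Compare terms as sets of factors so that 'A*B' and 'B*A' match.
    present = {frozenset(t.split(sep)) for t in model_terms}
    for term in model_terms:
        for lower in lower_order_terms(term, sep):
            if frozenset(lower.split(sep)) not in present:
                return False
    return True

print(lower_order_terms("A*B*C"))
# ['A', 'B', 'C', 'A*B', 'A*C', 'B*C']
```

For example, `is_hierarchical(["A", "B", "A*B*C"])` returns `False` because C and the two-way interactions are missing.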
Fitting the correct regression model can be as
much of an art as it is a science. Consequently, there's not always
a... Continue Reading
If you perform linear regression analysis, you might need to
compare different regression lines to see if their constants and
slope coefficients are different. Imagine there is an established
relationship between X and Y. Now, suppose you want to determine
whether that relationship has changed. Perhaps there is a new
context, process, or some other qualitative change, and you want to
determine... Continue Reading
Control charts are a fantastic tool. These charts plot your
process data to identify common cause and special cause variation.
By identifying the different causes of variation, you can take
action on your process without over-controlling it.
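To make the idea concrete, here is a rough sketch that assumes an individuals (I) chart, where the control limits sit three estimated standard deviations from the mean and the standard deviation is estimated from the average moving range divided by 1.128. The function names and the sample data are hypothetical, and other chart types use different limits:

```python
from statistics import mean

def individuals_chart_limits(data):
    """Return (lower limit, center line, upper limit) for an I chart."""
    center = mean(data)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    # 1.128 is the d2 unbiasing constant for moving ranges of size 2.
    sigma_hat = mean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

def out_of_control_points(data):
    """Return the indexes of points outside the control limits."""
    lcl, _, ucl = individuals_chart_limits(data)
    return [i for i, x in enumerate(data) if x < lcl or x > ucl]

# Hypothetical process measurements, for illustration only.
sample = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 13.0, 10.0, 9.9, 10.1]
print(individuals_chart_limits(sample))
print(out_of_control_points(sample))
```

Points inside the limits are treated as common cause variation. A point outside them, such as the 13.0 reading in the sample, is a candidate for special cause variation and worth investigating.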
Assessing the stability of a process can help you determine
whether there is a problem and identify the source of the problem.
Is the mean too high, too low,... Continue Reading
As Halloween approaches, you are probably taking the necessary steps to protect
yourself from the various ghosts, goblins, and witches that are prowling
around. Monsters of all sorts are out to get you, unless they’re
sufficiently bribed with candy offerings!
I’m here to warn you about a ghoul that all statisticians and
data scientists need to be aware of: phantom degrees of freedom.
These phantoms... Continue Reading
With Speaker John Boehner resigning, Kevin McCarthy quitting before the
vote for him to be Speaker, and a possible government shutdown in
the works, the Freedom Caucus has certainly been in the news
frequently! Depending on your political bent, the Freedom Caucus
has caused quite a disruption for either good or bad.
Who are these politicians? The Freedom Caucus is a group of
approximately 40... Continue Reading
An exciting new study sheds light on the relationship between P
values and the replication of experimental results. This study
highlights issues that I've emphasized repeatedly—it is crucial to
interpret P values correctly, and significant
results must be replicated to be trustworthy.
The study also supports my disagreement with the decision
by the Journal of Basic and Applied Social Psychology to
b... Continue Reading
Repeated measures designs don’t fit our impression of a typical
experiment in several key ways. When we think of an experiment, we
often think of a design that has a clear distinction between the
treatment and control groups. Each subject is in one, and only one,
of these non-overlapping groups. Subjects who are in a treatment
group are exposed to only one type of treatment. This is the... Continue Reading
In regression analysis, overfitting a model is a real problem. An overfit model
can cause the regression coefficients, p-values, and R-squared to be misleading. In this post,
I explain what an overfit model is and how to detect and avoid this
problem.
An overfit model is one that is too complicated for your data
set. When this happens, the regression model becomes tailored to
fit the quirks and... Continue Reading
Scientists who use the Hubble Space Telescope to explore the
galaxy receive a stream of digitized images in the form of binary
code. In this state, the information is essentially worthless:
these 1s and 0s must first be converted into pictures before the
scientists can learn anything from them.
The same is true of statistical distributions and parameters that are used to describe sample data. They... Continue Reading
In my previous post, I wrote about the hypothesis testing ban in
the Journal of Basic and Applied Social Psychology. I
showed how P values and confidence intervals provide important
information that descriptive statistics alone don’t provide. In
this post, I'll cover the editors’ concerns about hypothesis
testing and how to avoid the problems they describe.
The editors describe hypothesis testing... Continue Reading
Banned! In February 2015, editor David Trafimow and associate
editor Michael Marks of the Journal of Basic and Applied Social
Psychology declared that the null hypothesis statistical
testing procedure is invalid. They promptly banned P values,
confidence intervals, and hypothesis testing from the journal.
The journal now requires descriptive statistics and effect
sizes. They also encourage large... Continue Reading
The 2016 presidential race is becoming more real. We’ve had several
announcements with Ted Cruz, Rand Paul, Hillary Clinton, and Marco
Rubio officially entering the race to be President. While the
prospective Democratic candidates are down to one, or at most a
few, the Republican field is extra-large this election cycle. The
first order of business for a GOP candidate is to survive the
nomination... Continue Reading
In this series of posts, I show how hypothesis tests and
confidence intervals work by focusing on concepts and graphs rather
than equations and numbers.
Previously, I used graphs to show what statistical significance really
means. In this post, I’ll explain both confidence intervals and
confidence levels, and how they’re closely related to P values and
significance levels.
How to Correctly... Continue Reading
This is a companion post for a series of blog posts about
understanding hypothesis tests. In this series, I create a
graphical equivalent to a 1-sample t-test and confidence interval
to help you understand how it works more intuitively.
This post focuses entirely on the steps required to create the
graphs. It’s a fairly technical and task-oriented post designed for
those who need to create the... Continue Reading
What do significance levels and P values mean in hypothesis
tests? What is statistical significance anyway? In this
post, I’ll continue to focus on concepts and graphs to help you
gain a more intuitive understanding of how hypothesis tests work in
statistics.
To bring it to life, I’ll add the significance level and P value
to the graph in my previous post in order to perform a graphical
version of... Continue Reading
Hypothesis testing is an essential procedure in statistics. A
hypothesis test evaluates two mutually exclusive statements about a
population to determine which statement is best supported by the
sample data. When we say that a finding is statistically
significant, it’s thanks to a hypothesis test. How do these tests
really work and what does statistical significance actually mean?
In this series of... Continue Reading
It’s safe to say that most people who use statistics are more
familiar with parametric analyses than nonparametric analyses.
Nonparametric tests are also called distribution-free tests because
they don’t assume that your data follow a specific distribution.
You may have heard that you should use nonparametric tests when
your data don’t meet the assumptions of the parametric test,
especially the... Continue Reading
Minitab is the leading provider of software and services for quality
improvement and statistics education. More than 90% of Fortune 100 companies
use Minitab Statistical Software, our flagship product, and more students
worldwide have used Minitab to learn statistics than any other package.
Minitab Inc. is a privately owned company headquartered in State College,
Pennsylvania, with subsidiaries in the United Kingdom, France, and
Australia. Our global network of representatives serves more than 40
countries around the world.