P Value

Blog posts and articles about how to use and interpret the P Value statistic in quality improvement efforts.

If you want to convince someone that at least a basic understanding of statistics is an essential life skill, bring up the case of Lucia de Berk. Hers is a story that's too awful to be true—except that it is completely true. A flawed analysis irrevocably altered de Berk's life and kept her behind bars for a full decade, and the fact that this analysis targeted and harmed just one person makes it... Continue Reading
In the world of linear models, a hierarchical model contains all lower-order terms that comprise the higher-order terms that also appear in the model. For example, a model that includes the interaction term A*B*C is hierarchical if it includes these terms: A, B, C, A*B, A*C, and B*C. Fitting the correct regression model can be as much of an art as it is a science. Consequently, there's not always a... Continue Reading
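To make the hierarchy rule concrete, here is a minimal sketch that lists the lower-order terms a hierarchical model needs for a given interaction. The function name `required_lower_order_terms` is hypothetical, chosen for illustration; it is not part of Minitab or any statistics library.

```python
from itertools import combinations


def required_lower_order_terms(factors):
    """List every lower-order term implied by the interaction of `factors`.

    Hypothetical helper for illustration only.
    """
    terms = []
    # Orders 1 .. k-1: main effects first, then two-way interactions, and so on.
    for order in range(1, len(factors)):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


lower_terms = required_lower_order_terms(["A", "B", "C"])
print(lower_terms)  # ['A', 'B', 'C', 'A*B', 'A*C', 'B*C']
```

For the interaction A*B*C, the sketch returns the same six terms named in the excerpt above.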

How deeply has statistical content from Minitab blog posts (or other sources) seeped into your brain tissue? Rather than submit a biopsy specimen from your temporal lobe for analysis, take this short quiz to find out. Each question may have more than one correct answer. Good luck! Which of the following are famous figure skating pairs, and which are methods for testing whether your data follow a... Continue Reading
If you perform linear regression analysis, you might need to compare different regression lines to see if their constants and slope coefficients are different. Imagine there is an established relationship between X and Y. Now, suppose you want to determine whether that relationship has changed. Perhaps there is a new context, process, or some other qualitative change, and you want to determine... Continue Reading
I’ve written a fair bit about P values: how to correctly interpret P values, a graphical representation of how they work, guidelines for using P values, and why the P value ban in one journal is a mistake. Along the way, I’ve received many questions about P values, but the questions from one reader stand out. This reader asked, why is it so easy to interpret P values incorrectly? Why is the common... Continue Reading
There are many reasons why a distribution might not be normal/Gaussian. A non-normal pattern might be caused by several distributions being mixed together, or by a drift in time, or by one or several outliers, or by an asymmetrical behavior, some out-of-control points, etc. I recently collected the scores of three different teams (the Blue team, the Yellow team and the Pink team) after a laser... Continue Reading
P-values are frequently misinterpreted, which causes many problems. I won't rehash those problems here since my colleague Jim Frost has detailed the issues involved at some length, but the fact remains that the p-value will continue to be one of the most frequently used tools for deciding if a result is statistically significant. You know the old saw about "Lies, damned lies, and... Continue Reading
Back when I was an undergrad in statistics, I unfortunately spent an entire semester of my life taking a class, diligently crunching numbers with my TI-82, before realizing 1) that I was actually in an Analysis of Variance (ANOVA) class, 2) why I would want to use such a tool in the first place, and 3) that ANOVA doesn’t necessarily tell you a thing about variances. Fortunately, I've had a lot more... Continue Reading
I have two young children, and I work full-time, so my adult TV time is about as rare as finding a Kardashian-free tabloid.  So I can’t commit to just any TV show. It better be a good one. I was therefore extremely excited when Netflix analyzed viewer data to find out at what point watchers get hooked on the first season of various shows. Specifically, they identified the episode at which 70% of... Continue Reading
As Halloween approaches, you are probably taking the necessary steps to protect yourself from the various ghosts, goblins, and witches that are prowling around. Monsters of all sorts are out to get you, unless they’re sufficiently bribed with candy offerings! I’m here to warn you about a ghoul that all statisticians and data scientists need to be aware of: phantom degrees of freedom. These phantoms... Continue Reading
In Part 5 of our series, we began the analysis of the experiment data by reviewing analysis of covariance and blocking variables, two key concepts in the design and interpretation of your results. The 250-yard marker at the Tussey Mountain Driving Range, one of the locations where we conducted our golf experiment. Some of the golfers drove their balls well beyond this 250-yard marker during a few of... Continue Reading
By Matthew Barsalou, guest blogger. Teaching process performance and capability studies is easier when actual process data is available for the student or trainee to practice with. As I have previously discussed at the Minitab Blog, a catapult can be used to generate data for a capability study. My last blog on using a catapult for this purpose was several years ago, so I would like to revisit... Continue Reading
In Part 3 of our series, we decided to test our 4 experimental factors, Club Face Tilt, Ball Characteristics, Club Shaft Flexibility, and Tee Height in a full factorial design because of the many advantages of that data collection plan. In Part 4 we concluded that each golfer should replicate their half fraction of the full factorial 5 times in order to have a high enough power to detect... Continue Reading
With Speaker John Boehner resigning, Kevin McCarthy quitting before the vote for him to be Speaker, and a possible government shutdown in the works, the Freedom Caucus has certainly been in the news frequently! Depending on your political bent, the Freedom Caucus has caused quite a disruption for either good or bad.  Who are these politicians? The Freedom Caucus is a group of approximately 40... Continue Reading
Step 3 in our DOE problem solving methodology is to determine how many times to replicate the base experiment plan. The discussion in Part 3 ended with the conclusion that our 4 factors could best be studied using all 16 combinations of the high and low settings for each factor, a full factorial. Each golfer will perform half of the sixteen possible combinations and each golfer’s data could stand as... Continue Reading
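As a rough illustration of the run counts described above, the sketch below enumerates all 16 high/low combinations of four factors and splits them into two halves of 8. The factor names come from the Part 3 summary on this page. The split rule, the sign of the four-way product of coded levels, is an assumption made here for illustration; the excerpt does not say how each golfer's half was chosen.

```python
from itertools import product
from math import prod

# Factor names as listed in the Part 3 summary; levels coded -1 (low) and +1 (high).
factors = ["Club Face Tilt", "Ball Characteristics", "Club Shaft Flexibility", "Tee Height"]

# Full factorial: every combination of low/high across the four factors.
full_factorial = list(product([-1, 1], repeat=len(factors)))

# Assumed split (not stated in the source): group runs by the sign of the
# product of all four coded levels, a common way to form two half fractions.
half_a = [run for run in full_factorial if prod(run) == 1]
half_b = [run for run in full_factorial if prod(run) == -1]

print(len(full_factorial), len(half_a), len(half_b))  # 16 8 8
```

Together the two halves cover every run of the full factorial exactly once.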
An exciting new study sheds light on the relationship between P values and the replication of experimental results. This study highlights issues that I've emphasized repeatedly—it is crucial to interpret P values correctly, and significant results must be replicated to be trustworthy. The study also supports my disagreement with the decision by the Journal of Basic and Applied Social Psychology to b... Continue Reading
Repeated measures designs don’t fit our impression of a typical experiment in several key ways. When we think of an experiment, we often think of a design that has a clear distinction between the treatment and control groups. Each subject is in one, and only one, of these non-overlapping groups. Subjects who are in a treatment group are exposed to only one type of treatment. This is the... Continue Reading
If you use ordinary linear regression with a response of count data, it may work out fine (Part 1), or you may run into some problems (Part 2). Given that a count response could be problematic, why not use a regression procedure developed to handle a response of counts? A Poisson regression analysis is designed to analyze a regression model with a count response. First, let's try using Poisson... Continue Reading
My previous post showed an example of using ordinary linear regression to model a count response. For that particular count data, shown by the blue circles on the dot plot below, the model assumptions for linear regression were adequately satisfied. But frequently, count data may contain many values equal or close to 0. Also, the distribution of the counts may be right-skewed. In the quality field,... Continue Reading
Ever use dental floss to cut soft cheese? Or Alka Seltzer to clean your toilet bowl? You can find a host of nonconventional uses for ordinary objects online. Some are more peculiar than others. Ever use ordinary linear regression to evaluate a response (outcome) variable of counts? Technically, ordinary linear regression was designed to evaluate a continuous response variable. A continuous... Continue Reading