Guest Post: Pruning Your Hypothesis Testing Decision Tree

Most people who have taken a statistics class, whether through Six Sigma training, a college course, or elsewhere, learned about the assumptions underlying each test. And boy are there a lot of assumptions!

Chances are a rather large and complicated flowchart was presented to help students navigate to the right hypothesis test – in my original course the chart was so big it had to be printed on 11x17 paper and folded in half to fit into the materials. Now, for many just learning about data analysis, understanding how to perform and interpret the results of the appropriate test was a tall enough task. But checking many of those assumptions required additional hypothesis tests.

Guest blogger Joel Smith is the Director of Rapid Continuous Improvement at Keurig Dr. Pepper as well as the co-author of the Applied Statistics Manual. Before joining Keurig Dr. Pepper, he worked at Minitab for 13 years.

Joel will also be hosting a panel on Leading Successful Data Analysis at the 2019 Minitab Insights Conference! Learn more and register today to see engaging speakers, hands-on training with Minitab experts and opportunities to share your knowledge and learn from others.

For example, suppose I want to know whether two methods for performing a process differ in the average time needed to complete it. Following one of those decision trees, I’m likely to find that I need to test for equal variances and normality before performing the test. So first I open Minitab Statistical Software and go to Stat > Basic Statistics > Normality Test.

I get these results:

[Image: probability plot for Method A]

[Image: probability plot for Method B]

Uh-oh! That's definitely nonnormal.
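For readers curious about what sits behind a normality p-value, here is a minimal Python sketch of the Anderson-Darling statistic, which is the default in Minitab's Normality Test. It uses only the standard library. The completion times are hypothetical numbers invented for illustration, not the post's data set, and the 0.752 cutoff is the commonly tabulated 5% critical value for the adjusted statistic when the mean and standard deviation are estimated from the sample.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """Return (A^2, adjusted A^2) for a normal fit with estimated mean and sd."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    adjusted = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    return a2, adjusted

# Hypothetical, right-skewed completion times in minutes (assumed data).
method_a = [11.2, 11.5, 11.9, 12.0, 12.3, 12.4, 12.8, 13.1,
            13.5, 14.2, 15.0, 16.4, 18.9, 24.7, 41.3]

a2, adjusted = anderson_darling_normal(method_a)
print(f"A^2 = {a2:.3f}, adjusted A^2 = {adjusted:.3f}")
if adjusted > 0.752:
    print("Evidence of nonnormality at roughly the 5% level")
else:
    print("No strong evidence against normality")
```

A large statistic means the sorted data stray far from the fitted normal curve, mostly in the tails, which is the same thing the probability plot shows visually.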

Not to worry though. My trusty old flowchart covers the case where the data is nonnormal. It tells me I should go perform a Mann-Whitney test (it fails to mention the Mann-Whitney test requires similarly-shaped distributions, but luckily Minitab has your back and lets you know in a ToolTip in the menu). But when I get the results I notice the hypothesis test was for the median and not the mean, which is what I really wanted:

[Image: Mann-Whitney session output]
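To see why the Mann-Whitney test speaks to medians and ranks rather than means, it helps to look at what it computes. The following is a rough standard-library sketch, not Minitab's implementation: the data are made up, and the p-value uses the plain normal approximation with no continuity or tie correction, so it would not match Minitab's output digit for digit.

```python
from statistics import NormalDist, median

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney(a, b):
    """Return (U for sample a, two-sided p-value by normal approximation)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5  # ignores tie correction
    z = (u_a - mean_u) / sd_u
    return u_a, 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical completion times in minutes (assumed data).
method_a = [12.1, 13.4, 11.8, 15.2, 12.9, 14.0, 30.5, 12.4]
method_b = [16.3, 18.9, 17.2, 21.5, 16.8, 19.4, 45.0, 17.7]

u, p = mann_whitney(method_a, method_b)
print(f"median A = {median(method_a)}, median B = {median(method_b)}")
print(f"U = {u}, approximate two-sided p = {p:.4f}")
```

Notice that only the ordering of the observations enters the calculation. The one very slow run in each group changes the means a lot but barely moves the ranks, which is exactly why the test does not answer a question about average time.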

Well now I face a decision the tree didn’t cover. Do I use a nonparametric test that isn’t even testing what I really want? Or just ignore my normality results and press ahead?

I choose the latter, go to Stat > Basic Statistics > 2 Variances, and run that test to get this result:

[Image: Test and CI for Two Variances output]

At first I’m confused by the two p-values, but Minitab’s help quickly clarifies that I should use Levene’s test in this case.
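For anyone wondering what Levene's test does differently, here is a small sketch of the median-centered form (often called the Brown-Forsythe variant) for two groups, written with the Python standard library only. The data are hypothetical, and the helper `t_two_sided_p` is my own illustrative function that integrates the t density numerically; with two groups the Levene statistic is a squared t statistic, so that is enough to get a p-value.

```python
import math
from statistics import mean, median

def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student's t p-value via Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_area)

def levene_two_sample(a, b):
    """Return (W, p) comparing spread via absolute deviations from each median."""
    med_a, med_b = median(a), median(b)
    dev_a = [abs(v - med_a) for v in a]
    dev_b = [abs(v - med_b) for v in b]
    n1, n2 = len(a), len(b)
    grand = mean(dev_a + dev_b)
    between = n1 * (mean(dev_a) - grand) ** 2 + n2 * (mean(dev_b) - grand) ** 2
    within = (sum((d - mean(dev_a)) ** 2 for d in dev_a)
              + sum((d - mean(dev_b)) ** 2 for d in dev_b))
    w = (n1 + n2 - 2) * between / within  # F statistic with (1, N - 2) df
    return w, t_two_sided_p(math.sqrt(w), n1 + n2 - 2)

# Hypothetical completion times in minutes (assumed data).
method_a = [12.1, 13.4, 11.8, 15.2, 12.9, 14.0, 13.1, 12.4]
method_b = [10.3, 25.9, 17.2, 31.5, 8.8, 19.4, 28.0, 14.7]

w, p = levene_two_sample(method_a, method_b)
print(f"Levene W = {w:.2f}, p = {p:.4f}")
```

Because it works on distances from the median rather than on squared distances from the mean, this statistic is much less sensitive to skewed or heavy-tailed data than the classic F-test for variances.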

So at this point I failed normality but found the suggested test in that case didn’t answer my question, and then I failed equal variances. My chart conveniently didn’t mention what to do in that case…it said “consult your Master Black Belt.”

I AM the Master Black Belt! 😟

Out of options, I press ahead and go to Stat > Basic Statistics > 2-Sample t. In the Options subdialog I notice I have the option to assume equal variances, and by default it isn’t even checked!

[Image: 2-Sample t dialog]
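Leaving "Assume equal variances" unchecked corresponds to what is usually called Welch's t-test. The sketch below contrasts it with the pooled (equal variances) version on hypothetical data, using only the Python standard library. The sample values and the numerical p-value helper `t_two_sided_p` are my own illustrative assumptions, not anything taken from Minitab.

```python
import math
from statistics import mean, variance

def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student's t p-value via Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2 * total * h / 3)

def welch_t(a, b):
    """Return (t, df) without assuming equal variances (Welch-Satterthwaite df)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def pooled_t(a, b):
    """Return (t, df) for the classic test that assumes equal variances."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical completion times in minutes: A is tight, B is small and noisy.
method_a = [12.1, 13.4, 11.8, 15.2, 12.9, 14.0, 13.1, 12.4]
method_b = [10.3, 25.9, 17.2, 31.5, 19.4]

for name, (t, df) in (("Welch", welch_t(method_a, method_b)),
                      ("Pooled", pooled_t(method_a, method_b))):
    print(f"{name}: t = {t:.3f}, df = {df:.2f}, p = {t_two_sided_p(t, df):.4f}")
```

In this made-up example the pooled test lets the large, tight sample vouch for the small, noisy one and reports a smaller p-value than Welch's version. Welch's degrees of freedom shrink to reflect the uncertainty in the noisier group, which is why not assuming equal variances is a sensible default. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test.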

I’m feeling better about the equal variances thing now, and then I remember that, rather than relying exclusively on the p-value from a normality test, I can also use the “fat pencil test”: if a fat pencil laid along the line on the probability plot would cover the points, the data are close enough to normal.

[Image: fat pencil illustration]

Now I’m feeling even better. I get my results:

[Image: session output with the test results]

Proudly I put together my presentation and excitedly share what I’ve learned at my tollgate review

... only to be immediately questioned about the nonnormality and unequal variances!

Frustrated, I throw up my hands and decide hypothesis testing is too difficult and complicated.

But wait …

Was all of that even necessary? What if the data is nonnormal and the variances aren’t equal and a 2-sample t-test is performed anyway? Are the answers wrong?

When developing the Assistant menu, the experts at Minitab had to answer these very questions and, in doing so, found many of the formal, traditional assumptions to be not particularly important, especially if the right type of test was performed. For most practitioners, the giant, confusing decision tree of assumptions and hypothesis tests could be greatly simplified, and practitioners could act with much greater confidence.
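One way to probe this kind of question yourself is a small simulation. The sketch below is my own illustration, not Minitab's research method: it draws two skewed samples that share the same true mean but have different spreads, runs the unequal-variance (Welch) 2-sample t-test on each pair, and counts how often the test wrongly declares a difference at the 5% level. The distributions, sample size, seed, and helper names are all assumptions chosen for the example.

```python
import math
import random
from statistics import mean, variance

def t_two_sided_p(t, df, steps=200):
    """Two-sided Student's t p-value via Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2 * total * h / 3)

def welch_p(a, b):
    """Two-sided p-value of the 2-sample t-test without assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_two_sided_p(t, df)

def false_alarm_rate(n=30, trials=2000, alpha=0.05, seed=1):
    """Share of trials where the test rejects although the true means are equal."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.expovariate(1 / 10) for _ in range(n)]     # skewed, mean 10, sd 10
        b = [5 + rng.expovariate(1 / 5) for _ in range(n)]  # skewed, mean 10, sd 5
        if welch_p(a, b) < alpha:
            rejections += 1
    return rejections / trials

rate = false_alarm_rate()
print(f"Observed false alarm rate at alpha = 0.05: {rate:.3f}")
```

If the test were badly broken by nonnormality and unequal variances, the observed rate would sit far from 0.05. Try changing `n` or the two distributions to see where the test starts to struggle; the exact figure depends on the seed and on the scenario you simulate.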

For those of us who train and coach others in statistics, this is a hugely important development in our use of the tools! In fact, it was a major driving force behind Matt Barsalou and me authoring the Applied Statistics Manual (available from ASQ or Amazon!). Frequently referencing Minitab’s research, the book presents the reader with only the information necessary to correctly perform and understand the results of a test, so they can quickly get back to what is really important: improving quality and processes.

Want to follow along in Minitab Statistical Software? Download the Data Set
(It's OK if you don't already have it – download a 30-day free trial)