# Hypothesis Testing

Blog posts and articles about hypothesis testing, especially in the course of Lean Six Sigma quality improvement projects.

The 1949 film A Connecticut Yankee in King Arthur's Court includes the song “Busy Doing Nothing,” and this could be written about the Null Hypothesis as it is used in statistical analyses. The words to the song go: “We're busy doin' nothin' / Workin' the whole day through / Tryin' to find lots of things not to do.” And that summarises the role of the Null Hypothesis perfectly. Let me explain why. What's... Continue Reading
One highlight of writing for and editing the Minitab Blog is the opportunity to read your responses and answer your questions. Sometimes, to my chagrin, you point out that we've made a mistake. However, I'm particularly grateful for those comments, because they permit us to correct inadvertent errors. I feared I had an opportunity to fix just such an error when I saw this comment appear on one of... Continue Reading

People can make mistakes when they test a hypothesis with statistical analysis. Specifically, they can make either Type I or Type II errors. As you analyze your own data and test hypotheses, understanding the difference between Type I and Type II errors is extremely important, because there's a risk of making each type of error in every analysis, and the amount of risk is in your control.    So if... Continue Reading
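
The claim that the amount of risk is in your control can be illustrated with a small simulation. The sketch below is only an illustration, not taken from the post: it assumes a two-sided z-test with a known standard deviation of 1, and the function name `type_i_error_rate` and its default values are hypothetical. It draws samples for which the null hypothesis is true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis.

    Assumptions (illustrative only): data are normal with known sigma = 1,
    and the null hypothesis (mean = 0) really is true for every sample.
    """
    rng = random.Random(seed)
    # Critical value for a two-sided test at significance level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic: (sample mean - hypothesized mean) / (sigma / sqrt(n)).
        z = (mean(sample) - 0.0) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a false positive, i.e. a Type I error
    return rejections / trials


if __name__ == "__main__":
    for a in (0.01, 0.05, 0.20):
        print(f"alpha = {a:.2f}: estimated Type I error rate = {type_i_error_rate(alpha=a):.3f}")
```

Under these assumptions the estimated rate should land close to the chosen alpha, which is the sense in which the Type I risk is set by the analyst. Type II risk also depends on sample size and effect size and is not simulated here.
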
Welcome to the Hypothesis Test Casino! The featured game of the house is roulette. But this is no ordinary game of roulette. This is p-value roulette! Here’s how it works: We have two roulette wheels, the Null wheel and the Alternative wheel. Each wheel has 20 slots (instead of the usual 37 or 38). You get to bet on one slot. What happens if the ball lands in the slot you bet on? Well, that depends... Continue Reading
Statistics can be challenging, especially if you're not analyzing data and interpreting the results every day. Statistical software makes things easier by handling the arduous mathematical work involved in statistics. But ultimately, we're responsible for correctly interpreting and communicating what the results of our analyses show. The p-value is probably the most frequently cited statistic. We... Continue Reading
To make objective decisions about the processes that are critical to your organization, you often need to examine categorical data. You may know how to use a t-test or ANOVA when you’re comparing measurement data (like weight, length, revenue, and so on), but do you know how to compare attribute or counts data? It's easy to do with statistical software like Minitab. One person may look at this bar... Continue Reading
In Parts 1 and 2 of this blog series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results.  The common... Continue Reading
In Part 1 of this blog series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results.  The common data... Continue Reading
If you’re not a statistician, looking through statistical output can sometimes make you feel a bit like Alice in Wonderland. Suddenly, you step into a fantastical world where strange and mysterious phantasms appear out of nowhere.   For example, consider the T and P in your t-test results. “Curiouser and curiouser!” you might exclaim, like Alice, as you gaze at your output. What are these values,... Continue Reading
Data mining can be helpful in the exploratory phase of an analysis. If you're in the early stages and you're just figuring out which predictors are potentially correlated with your response variable, data mining can help you identify candidates. However, there are problems associated with using data mining to select variables. In my previous post, we used data mining to settle on the following... Continue Reading
I watched an old motorcycle flick from the 1960s the other night, and I was struck by the bikers' slang. They had a language all their own. Just like statisticians, whose manner of speaking often confounds those who aren't hep to the lingo of data analysis. It got me thinking...what if there were an all-statistician biker gang? Call them the Nulls Angels. Imagine them in their colors, tearing... Continue Reading
True or false: When comparing a parameter for two sets of measurements, you should always use a hypothesis test to determine whether the difference is statistically significant. The answer? (drumroll...) True! ...and False! To understand this paradoxical answer, you need to keep in mind the difference between samples, populations, and descriptive and inferential statistics.  Descriptive Statistics and... Continue Reading
There may be huge potential benefits waiting in the data in your servers. These data may be used for many different purposes. Better data allows better decisions, of course. Banks, insurance firms, and telecom companies already own a large amount of data about their customers. These resources are useful for building a more personal relationship with each customer. Some organizations already use... Continue Reading
In 2011 we had solar panels fitted on our property. In the last few months we have noticed a few problems with the inverter (the equipment that converts the electricity generated by the panels from DC to AC, and manages the transfer of unused electricity to the power company). It was shutting down at various times throughout the day, typically when it was very sunny, resulting in no electricity being... Continue Reading
So the data you nurtured, that you worked so hard to format and make useful, failed the normality test. Time to face the truth: despite your best efforts, that data set is never going to measure up to the assumption you may have been trained to fervently look for. Your data's lack of normality seems to make it poorly suited for analysis. Now what? Take it easy. Don't get uptight. Just let your data... Continue Reading
Have you ever accidentally done statistics? Not all of us can (or would want to) be “stat nerds,” but the word “statistics” shouldn’t be scary. In fact, we all analyze things that happen to us every day. Sometimes we don’t realize that we are compiling data and analyzing it, but that’s exactly what we are doing. Yes, there are advanced statistical concepts that can be difficult to understand—but... Continue Reading
While some posts in our Minitab blog focus on understanding t-tests and t-distributions, this post will focus more simply on how to hand-calculate the t-value for a one-sample t-test (and how to replicate the p-value that Minitab gives us). The formulas used in this post are available within Minitab Statistical Software by choosing the following menu path: Help > Methods and Formulas > Basic... Continue Reading
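
As a rough companion to the hand calculation, the sketch below computes a one-sample t-value with the standard textbook formula, t = (sample mean - hypothesized mean) / (s / sqrt(n)). It is not Minitab's code; the function name `one_sample_t` and the data values are made up for illustration. The p-value step is left out because it needs a t-distribution CDF, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return the one-sample t-value for `data` against hypothesized mean `mu0`."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = mean(data)            # sample mean
    s = stdev(data)               # sample standard deviation (n - 1 denominator)
    standard_error = s / sqrt(n)  # standard error of the mean
    return (x_bar - mu0) / standard_error


if __name__ == "__main__":
    # Hypothetical measurements, tested against a hypothesized mean of 5.0.
    measurements = [5.1, 4.9, 5.3, 5.0, 5.2]
    t_value = one_sample_t(measurements, 5.0)
    print(f"t = {t_value:.4f} with {len(measurements) - 1} degrees of freedom")
```

For these made-up numbers the sample mean is 5.1, the sample variance is 0.025, and the t-value works out to the square root of 2, roughly 1.414, on 4 degrees of freedom.
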
Analysis of variance (ANOVA) can determine whether the means of three or more groups are different. ANOVA uses F-tests to statistically test the equality of means. In this post, I’ll show you how ANOVA and F-tests work using a one-way ANOVA example. But wait a minute...have you ever stopped to wonder why you’d use an analysis of variance to determine whether means are different? I'll also show how... Continue Reading
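
One way to see why an analysis of variance can speak to differences in means is to compute the F statistic directly. The sketch below uses the standard one-way ANOVA definition, F = (between-group mean square) / (within-group mean square). The function name `one_way_f` and the example groups are hypothetical, and this is not the worked example from the post.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)                           # number of groups
    n_total = sum(len(g) for g in groups)     # total number of observations
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean(x for g in groups for x in g)
    # Between-group variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: how far observations sit from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical data: three groups of three observations each.
    groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(f"F = {one_way_f(groups):.3f}")
```

With these made-up groups, the between-group mean square is 3 and the within-group mean square is 1, so F is 3. A large F means the group means are spread out more than the scatter inside the groups would suggest.
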
Among the most underutilized statistical tools in Minitab, and I think in general, are multivariate tools. Minitab offers a number of different multivariate tools, including principal component analysis, factor analysis, clustering, and more. In this post, my goal is to give you a better understanding of the multivariate tool called discriminant analysis, and how it can be used. Discriminant... Continue Reading
Once upon a time, when people wanted to compare the standard deviations of two samples, they had two handy tests available, the F-test and Levene's test. Statistical lore has it that the F-test is so named because it so frequently fails you. Although the F-test is suitable for data that are normally distributed, its sensitivity to departures from normality limits when and where it can be used. Leve... Continue Reading