Three Dangerous Statistical Mistakes

It's all too easy to make mistakes involving statistics. Powerful statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors -- but correctly interpreting the results of an analysis can be even more challenging.

No one knows that better than Minitab's technical trainers. All of our trainers are seasoned statisticians with years of quality improvement experience. They spend most of the year traveling around the country (and around the world) to help people learn to make the best use of Minitab software for analyzing data and improving quality.
A few years ago, Minitab trainers compiled a list of common statistical mistakes, the ones they encountered over and over again. Being somewhat math-phobic myself, I expected these mistakes would be primarily mathematical. I was wrong: every mistake on their list involved either the incorrect interpretation of the results of an analysis, or a design flaw that made meaningful analysis impossible.

Here are three of their most commonly observed mistakes that involve drawing an incorrect conclusion from the results of analysis. (I'm sorry to say that, yes, I have made all three of these mistakes at least once in my time.)

Statistical Mistake 1. Not Distinguishing Between Statistical Significance and Practical Significance

It's important to remember that statistics can reveal a statistically significant difference that has no discernible effect in the "real world." In other words, the mere fact that a difference exists doesn't make it important. And you can waste a lot of time and money trying to "correct" a statistically significant difference that doesn't matter.

Let's say you love Tastee-O's cereal. The factory that makes them weighs every cereal box at the end of the filling line using an automated measuring system. Say that 18,000 boxes are filled per shift, with a target fill weight of 360 grams and a standard deviation of 2.5 grams.

With a sample that large, a two-sided hypothesis test at the 0.05 significance level can detect a shift of just 0.06 grams in the mean fill weight about 90% of the time. But just because that 0.06 gram shift is statistically significant doesn't mean it's practically significant. A 0.06 gram difference probably amounts to two or three Tastee-O's -- not enough to make you, the customer, notice or care.
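To see where a figure like "90% of the time" can come from, here is a minimal sketch of the power calculation. It assumes a two-sided, one-sample z-test at alpha = 0.05 with a known standard deviation, which the example above doesn't spell out, and the function name `power_two_sided_z` is just an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(shift, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a mean shift.

    Assumes the standard deviation (sigma) is known and the sample
    mean is approximately normal.
    """
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_shift = shift / standard_error
    # Probability the test statistic lands in either rejection region.
    upper = 1 - std_normal.cdf(z_crit - z_shift)
    lower = std_normal.cdf(-z_crit - z_shift)
    return upper + lower


# Values from the cereal example: 18,000 boxes, sigma = 2.5 g, shift = 0.06 g.
power = power_two_sided_z(shift=0.06, sigma=2.5, n=18_000)
print(f"Power to detect a 0.06 g shift: {power:.3f}")
```

Under those assumptions the result comes out at roughly 0.90. The huge sample size is what makes such a tiny shift detectable, and it has nothing to do with whether the shift matters.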

In most hypothesis tests, we know that the null hypothesis is not exactly true. In this case, we don’t expect the mean fill weight to be precisely 360 grams -- we are just trying to see if there is a meaningful difference. Instead of a hypothesis test, the cereal maker could use a confidence interval to see how large the difference might be and decide if action is needed.
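Here is a rough sketch of the confidence interval approach. The observed mean (360.06 grams) and the "practical" limit of 0.5 grams are hypothetical numbers made up for illustration; a real cereal maker would choose the limit based on what customers or regulators actually care about.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is treated as known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


TARGET = 360.0           # target fill weight from the example (grams)
OBSERVED_MEAN = 360.06   # hypothetical shift mean, for illustration only
PRACTICAL_LIMIT = 0.5    # hypothetical: differences below this don't matter

low, high = mean_confidence_interval(OBSERVED_MEAN, sigma=2.5, n=18_000)

# Statistically significant: the interval excludes the target.
statistically_significant = not (low <= TARGET <= high)
# Practically significant: the difference could plausibly exceed the limit.
largest_plausible_difference = max(abs(low - TARGET), abs(high - TARGET))
could_matter = largest_plausible_difference > PRACTICAL_LIMIT

print(f"95% CI for mean fill weight: ({low:.3f}, {high:.3f}) grams")
print(f"Statistically significant: {statistically_significant}")
print(f"Could exceed the practical limit: {could_matter}")
```

With these made-up inputs, the interval excludes 360 grams yet sits well inside the practical limit, so the difference is statistically significant but too small to call for action.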

Statistical Mistake 2. Stating That You've Proved the Null Hypothesis

In a hypothesis test, you pose a null hypothesis (H0) and an alternative hypothesis (H1). Then you collect data, analyze it, and use statistics to assess whether the data support the alternative hypothesis. A p-value above 0.05 indicates that "there is not enough evidence to conclude H1 at the 0.05 significance (alpha) level."

In other words, even if we do not have enough evidence in favor of the alternative hypothesis, the null hypothesis may or may not be true.

For example, we could flip a fair coin 3 times and test:

H0: Proportion of Heads = 0.40

H1: Proportion of Heads ≠ 0.40

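In this case, an exact test is guaranteed to give a p-value higher than 0.05: with only three flips, even the most extreme possible outcome (three heads) has a p-value of about 0.064. Therefore we cannot conclude H1, even though we know H0 is false, because a fair coin's true proportion of heads is 0.50. Not being able to conclude H1 doesn't prove that H0 is correct or true! This is why we say we "fail to reject" the null hypothesis, rather than "accept" it.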
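You can check the coin example by listing every possible outcome. This sketch assumes an exact two-sided binomial test that adds up the probabilities of all outcomes no more likely than the one observed; other two-sided conventions exist, and the helper names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_p_value(k, n, p0):
    """Exact binomial p-value: total probability, under H0, of every
    outcome that is no more likely than the observed one."""
    observed = binom_pmf(k, n, p0)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        prob
        for prob in (binom_pmf(i, n, p0) for i in range(n + 1))
        if prob <= observed * (1 + 1e-9)
    )


n_flips, p_null = 3, 0.40
p_values = {
    heads: exact_two_sided_p_value(heads, n_flips, p_null)
    for heads in range(n_flips + 1)
}

for heads, p in p_values.items():
    print(f"{heads} heads out of {n_flips}: p-value = {p:.3f}")
print(f"Smallest possible p-value: {min(p_values.values()):.3f}")
```

No outcome of three flips can push the p-value below 0.05, so this test can never reject H0, whatever the coin's true proportion of heads.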

Statistical Mistake 3. Assuming Correlation = Causation

Simply put, correlation is a linear association between two variables. For example, a house's size and its price tend to be highly correlated: larger houses have higher prices, while smaller houses have lower prices.

But while it's tempting to observe the linear relationship between two variables and conclude that a change in one is causing a change in the other, that's not necessarily so -- statistical evidence of correlation is not evidence of causation.

Consider this example: data analysis has shown a strong correlation between ice cream sales and murder rates. When ice cream sales are low, the murder rate is low. When ice cream sales are high, the murder rate is high.

So could we conclude that ice cream sales lead to murder? Or vice versa? Of course not! This is a perfect example of correlation not equaling causation. Yes, the murder rate and ice cream sales are correlated. In the summer months, both are high. In the winter months, both are low. So when you think beyond the correlation, the data suggest not that the murder rate and ice cream sales affect each other, but rather that both are affected by another factor: the weather.
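A small simulation makes the point concrete. All of the numbers below are invented for illustration: temperature drives both simulated series, and neither has any direct effect on the other. The partial correlation formula then shows how much association is left once temperature is accounted for.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy, r_xz, r_yz = pearson_r(xs, ys), pearson_r(xs, zs), pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical data: temperature is the only real driver of both series.
temperature = [rng.uniform(0, 30) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
murders = [0.5 * t + rng.gauss(0, 2) for t in temperature]

raw_r = pearson_r(ice_cream, murders)
adjusted_r = partial_r(ice_cream, murders, temperature)

print(f"Correlation, ice cream vs. murders: {raw_r:.2f}")
print(f"Same, controlling for temperature:  {adjusted_r:.2f}")
```

The raw correlation comes out strong even though the simulation contains no causal link between the two series, and it all but disappears once temperature is controlled for.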

If you've ever misinterpreted the significance of a correlation between variables, at least you've got company: the media is rife with examples of news stories that equate correlation and causation -- especially when it comes to the effects of diet, exercise, chemicals and other factors on our health!

Have you ever jumped to the wrong conclusion after looking at statistics?