Let's say you love Tastee-O's cereal. The factory that makes them weighs every cereal box at the end of the filling line using an automated measuring system. Say that 18,000 boxes are filled per shift, with a target fill weight of 360 grams and a standard deviation of 2.5 grams.
With 18,000 boxes weighed per shift, a hypothesis test is sensitive enough to detect a shift of just 0.06 grams in the mean fill weight about 90% of the time. But just because that 0.06 gram shift is statistically significant doesn't mean it's practically significant. A 0.06 gram difference probably amounts to two or three Tastee-O's -- not enough to make you, the customer, notice or care.
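To see where a figure like "0.06 grams, 90% of the time" can come from, here is a minimal sketch of the power calculation. It assumes a two-sided one-sample z-test at a 5% significance level, with one shift's 18,000 boxes as the sample; the article does not state the test or the significance level, so treat both as assumptions. The function name `power_two_sided_z` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(shift, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    shift: true difference between the actual mean and the target (grams)
    sigma: known standard deviation of a single box (grams)
    n:     number of boxes weighed
    alpha: significance level (0.05 is an assumption, not from the article)
    """
    z = NormalDist()                      # standard normal distribution
    se = sigma / sqrt(n)                  # standard error of the mean
    z_crit = z.inv_cdf(1 - alpha / 2)     # two-sided critical value
    d = shift / se                        # shift measured in standard errors
    # Probability that the test statistic lands in either rejection region
    return (1 - z.cdf(z_crit - d)) + z.cdf(-z_crit - d)


standard_error = 2.5 / sqrt(18_000)
power = power_two_sided_z(shift=0.06, sigma=2.5, n=18_000)

print(f"standard error of the mean: {standard_error:.4f} g")
print(f"power to detect a 0.06 g shift: {power:.2f}")
```

Under these assumptions the power comes out at roughly 0.90. The standard error of the mean is under 0.02 grams, which is why such a tiny shift stands out even though each individual box varies by about 2.5 grams.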
In most hypothesis tests, we know that the null hypothesis is not exactly true. In this case, we don't expect the mean fill weight to be precisely 360 grams -- we are just trying to see if there is a meaningful difference. Instead of relying on a hypothesis test alone, the cereal maker could use a confidence interval to estimate how large the difference might be and decide whether it is big enough to warrant action.
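As a rough illustration of the confidence-interval approach, the sketch below builds a 95% interval for the mean fill weight and compares it with a tolerance. The observed mean of 360.06 grams and the 0.5 gram tolerance are hypothetical values chosen for the example, not figures from the factory, and the name `mean_confidence_interval` is likewise made up.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """z-based confidence interval for a mean with known sigma."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


TARGET = 360.0          # target fill weight from the example (grams)
TOLERANCE = 0.5         # hypothetical: largest deviation nobody would care about
observed_mean = 360.06  # hypothetical mean of one shift's 18,000 boxes

low, high = mean_confidence_interval(observed_mean, sigma=2.5, n=18_000)

# Statistically significant: the interval excludes the target.
statistically_significant = not (low <= TARGET <= high)
# Practically relevant: some plausible value of the mean exceeds the tolerance.
could_exceed_tolerance = max(abs(low - TARGET), abs(high - TARGET)) > TOLERANCE

print(f"95% CI for the mean: ({low:.3f}, {high:.3f}) g")
print(f"statistically significant: {statistically_significant}")
print(f"could exceed tolerance:    {could_exceed_tolerance}")
```

With these made-up numbers the interval excludes 360 grams, so the difference is statistically significant, yet every value in the interval sits well inside the tolerance. The interval shows the size of the difference, which a bare "significant or not" verdict hides.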
In other words, failing to find enough evidence in favor of the alternative hypothesis does not prove the null hypothesis: it may or may not be true.
For example, we could flip a fair coin 3 times and test:
But while it's tempting to observe the linear relationship between two variables and conclude that a change in one is causing a change in the other, that's not necessarily so -- statistical evidence of correlation is not evidence of causation.
Consider this example: data analysis has shown a strong correlation between ice cream sales and murder rates. When ice cream sales are low, the murder rate is low. When ice cream sales are high, the murder rate is high.
So could we conclude that ice cream sales lead to murder? Or vice versa? Of course not! This is a perfect example of correlation not equaling causation. Yes, the murder rate and ice cream sales are correlated. In the summer months, both are high. In the winter months, both are low. So when you think beyond the correlation, the data suggest not that the murder rate and ice cream sales affect each other, but rather that both are affected by another factor: the weather.
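The lurking-variable explanation can be checked with a small simulation. The sketch below generates entirely synthetic data in which temperature drives both quantities and neither affects the other; every coefficient is invented for illustration and none of it is real sales or crime data. It then compares the raw correlation with the correlation left over after the linear effect of temperature has been removed from both series (a partial correlation).

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: temperature influences both series; they never
# influence each other. All numbers here are invented.
temperature = [rng.uniform(0, 35) for _ in range(500)]
ice_cream = [10 + 2.0 * t + rng.gauss(0, 8) for t in temperature]
murders = [5 + 0.3 * t + rng.gauss(0, 2) for t in temperature]

raw = pearson(ice_cream, murders)
partial = pearson(residuals(ice_cream, temperature),
                  residuals(murders, temperature))

print(f"raw correlation:                   {raw:.2f}")
print(f"after controlling for temperature: {partial:.2f}")
```

The raw correlation is strong, while the correlation after controlling for temperature should be close to zero. That is the pattern you would expect when a third factor drives both variables.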
If you've ever misinterpreted the significance of a correlation between variables, at least you've got company: the media is rife with examples of news stories that equate correlation and causation -- especially when it comes to the effects of diet, exercise, chemicals and other factors on our health!
Have you ever jumped to the wrong conclusion after looking at statistics?