Bewildering Things Statisticians Say: "Failure to Reject the Null Hypothesis"

Subcultures have languages all their own. Teen gangs, statisticians, gamers, music buffs, sports nuts, furries...all use terminology that baffles outsiders. The arcane language helps identify kindred spirits: using the correct phrase proves you belong. The proper buzzwords can gain you admittance to the right professional circles...or the wrong biker bars. Maybe both.

Not knowing them can get you into serious trouble. When you enter a dangerous place (like the data analysis arena), you need at least a basic grasp of the jargon the local toughs use. 

I'm not comparing any particular group of statisticians to a street gang, but the discipline definitely has its own language, one that can seem impenetrable and obtuse. It's all too easy for a seasoned vet of the stats battlefield to confound newcomers who aren't hep to the lingo of data analysis.

Like that gent over there...the big guy wearing the Nulls Angels jacket, the analyst everyone calls "Tiny." He's always telling war stories about how he "failed to reject the null hypothesis." 

Looking at the phrase from a purely editorial vantage, "failing to reject the null hypothesis" is cringe-worthy. Doesn't "failure to reject" amount to a double negative? Isn't it just a more high-falutin', roundabout way of saying "accept"? At minimum, "failure to reject" is clunky phrasing.

Maybe so. But from a statistical perspective, it's undeniably accurate. Replacing "failure to reject" with "accept" would be wrong. 

In this case, Tiny and the rest of those bad-boy statisticians in the Nulls Angels have a good reason to talk the way they do. 

What Is the Null Hypothesis, Anyway? 

There are many different kinds of hypothesis tests, including one- and two-sample t-tests, tests for association, tests for normality, and many more. If you're using Minitab statistical software, you have direct access to all of these tests through the Stat menu. If you want a little statistical guidance, the Assistant can lead you through many of the most commonly used hypothesis tests step-by-step.

In a hypothesis test, you're going to look at two propositions: the null hypothesis (or H0 for short), and the alternative (H1). The alternative hypothesis is what we hope to support. The null hypothesis, in contrast, is presumed to be true, until the data provide sufficient evidence that it is not. 

A similar idea underlies the U.S. criminal justice system: you've heard the phrase "Innocent until proven guilty"? In the statistical world, the null hypothesis is taken for granted until the data provide enough evidence to reject it in favor of the alternative. The null hypothesis is never proven true; you simply fail to reject it.

How Do We "Fail to Reject" the Null Hypothesis? 

The degree of statistical evidence we need in order to “prove” the alternative hypothesis is the confidence level. The confidence level is simply 1 minus alpha, also referred to as the significance level. Alpha is the Type I error rate: the probability of incorrectly rejecting a null hypothesis that is actually true. The typical alpha value of 0.05 corresponds to a 95% confidence level: we're accepting a 5% chance of rejecting the null even if it is true. (When hypothesis-testing life-or-death matters, we can lower the risk of a Type I error to 1% or less.)
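
To see what that 5% risk looks like in practice, here is a minimal simulation sketch in Python (not Minitab, and not part of the original analysis). It assumes a two-sided one-sample z-test with a known standard deviation, and the helper names z_test_p_value and type_i_error_rate are made up for illustration. Every sample is drawn from a population where the null hypothesis really is true, so each rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, null_mean, sigma):
    """Two-sided p-value for a one-sample z-test (sigma assumed known)."""
    z = (fmean(sample) - null_mean) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, trials=2000, n=30, seed=1):
    """Share of tests that reject H0 even though H0 is true in every trial."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # The null is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, null_mean=0.0, sigma=1.0) <= alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Share of true nulls rejected at alpha = 0.05: {rate:.3f}")
```

Over many repetitions the rejection share should settle close to alpha. Lowering alpha to 0.01 should pull it down to roughly 1%.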

Regardless of the alpha level we choose, any hypothesis test has only two possible outcomes: 

  1. Reject the null hypothesis (p-value <= alpha) and conclude that the alternative hypothesis is true at the 95% confidence level (or whatever level you've selected).
  2. Fail to reject the null hypothesis (p-value > alpha) and conclude that not enough evidence is available to suggest the null is false at the 95% confidence level.
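
The two outcomes above can be sketched as a tiny Python function. The name decide is hypothetical, and the "less than or equal to" comparison simply follows the rule as written in the list. Notice that "accept H0" is not among the possible return values.

```python
def decide(p_value, alpha=0.05):
    """Return the verdict of a hypothesis test; 'accept H0' is never an option."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


# Three illustrative p-values: clearly small, exactly at alpha, clearly large.
for p in (0.003, 0.05, 0.20):
    print(p, "->", decide(p))
```

A p-value exactly equal to alpha lands on the "reject" side here only because of the comparison chosen above; some texts use a strict "less than" instead.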

In the results of a hypothesis test, we typically use the p-value to decide whether the data provide enough evidence to reject the null hypothesis. If the p-value is very low (typically below 0.05), statisticians say "the null must go."

If We Don't Accept the Alternative Hypothesis, Don't We Have to Accept the Null Hypothesis? 

This still doesn't explain why a statistician can't say "we accept the null hypothesis," as a certain unnamed, wet-behind-the-ears, statistically-challenged editor might have suggested to Tiny.


Here's the bottom line: even if we fail to reject the null hypothesis, it does not mean the null hypothesis is true. That's because a hypothesis test does not determine which hypothesis is true, or even which is most likely: it only assesses whether the available evidence is strong enough to reject the null hypothesis.
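
One way to make this concrete is to ask how often a test fails to reject a null hypothesis that is actually false. The sketch below is a minimal illustration under stated assumptions: a two-sided one-sample z-test with known standard deviation, a hypothetical helper named prob_fail_to_reject, and made-up numbers (a true mean of 0.2 tested against a null mean of 0).

```python
from statistics import NormalDist


def prob_fail_to_reject(true_mean, null_mean, sigma, n, alpha=0.05):
    """Chance that a two-sided z-test fails to reject H0: mean == null_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the test statistic is shifted when the true mean differs from H0.
    shift = (true_mean - null_mean) / (sigma / n ** 0.5)
    # We fail to reject whenever the test statistic falls inside (-z_crit, z_crit).
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# The null (mean = 0) is false in both cases: the true mean is 0.2.
small = prob_fail_to_reject(true_mean=0.2, null_mean=0.0, sigma=1.0, n=10)
large = prob_fail_to_reject(true_mean=0.2, null_mean=0.0, sigma=1.0, n=1000)
print(f"n = 10:   fail to reject with probability {small:.3f}")
print(f"n = 1000: fail to reject with probability {large:.6f}")
```

With only 10 observations, the test fails to reject this false null most of the time, while with 1,000 observations it almost never does. A "fail to reject" result can therefore reflect too little data rather than a true null.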

"Null Until Proven Alternative"

Look at it in terms of "innocent until proven guilty" in a courtroom: As the person analyzing data, you are the judge. The hypothesis test is the trial, and the null hypothesis is the defendant. The alternative hypothesis is like the prosecution, which needs to make its case beyond a reasonable doubt (say, with 95% certainty).

If the evidence presented doesn't prove the defendant is guilty beyond a reasonable doubt, you still have not proved that the defendant is innocent. But based on the evidence, you can't reject that possibility.

So how would that verdict be announced? It enters the court record as "Not guilty."  

That phrase is perfect: "Not guilty" doesn't mean the defendant is innocent, because that has not been proven. It just means the prosecution couldn't prove its case to the necessary, "beyond a reasonable doubt" standard. It failed to convince the judge to abandon the assumption of innocence.

If you follow that rationale, then you can see that "failure to reject the null" is just the statistical equivalent of "not guilty." In a trial, the burden of proof falls to the prosecution. When analyzing data, the entire burden of proof falls to the sample data you've collected. Just as "not guilty" is not the same thing as "innocent," neither is "failing to reject" the same as "accepting" the null hypothesis. 

So the next time you're looking to hang around at the local Nulls Angels clubhouse, remember that "failing to reject the null" is not "accepting the null." Knowing the difference just might get Tiny to buy you a drink.  


Name: Savvas Zannetos • Wednesday, January 30, 2013

Amazing blog on statistics, excellent article. But I do have one comment. Although not many textbooks write it down, Sir Ronald Fisher proposed the "less than 0.05" criterion of statistical significance (α=0.05). Thus we reject the null hypothesis if the p value is less than alpha, and we fail to reject if the p value is equal to or more than alpha. You reported it vice versa. Of course in practice this will make very little difference. I mean if I get a p value of exactly 0.05 then I will probably repeat the experiment.

Name: karen • Wednesday, January 30, 2013

If p is exactly 0.05, you may consider rejecting the null with confidence marginally less than 95%. Rather than bear the expense of repeating the experiment, consider the relative practical marginal difference between a p-value at .049 and .05, and weigh the practical risk of a Type I error--i.e., rejecting H0 if it is actually true. To me, even a p-value at 0.04 looks a lot like 0.05 and 0.06, on a practical 'risk' basis.

There is nothing mathematically sacred about the 0.05 threshold--people use it because they LOVE 95% confidence in their decisions.

Name: Gregory Murray • Friday, February 8, 2013

Thanks for this great article!!!!! I will now do more searching on this blog to increase my understanding and use of Statistics.

Name: Ian Wardell • Monday, September 15, 2014

Unfortunately you don't seem to have explained what the null hypothesis is!

People make claims like for example reincarnation is meaningless. I ask why. They say that the meaninglessness position is the "null hypothesis" and I need to show that it is meaningful.

Basically what I'd like to know is where we have 2 alternative hypotheses, what justifies one being the "null hypothesis"? And why is it the default position?

Name: Eston • Tuesday, September 16, 2014

Thanks for the comment, Ian. I believe I did explain what the null hypothesis is; however, you seem to be asking why it's CALLED the null hypothesis, which is a fair question. It's implied in my post above, but basically it's called the null hypothesis because it's the hypothesis that reflects the NEUTRAL state in the situation -- in other words, that no relationship exists, or that the notion you are investigating (presumably, that a relationship exists) is a non-starter, or null. Hope this helps to clarify. Thanks again for writing, good question.
