One of the more misunderstood concepts in statistics is alpha, more formally known as the significance level. Alpha is typically set before you conduct an experiment. When the p-value calculated from a hypothesis test is less than the significance level (α), the results are so unlikely to have occurred by chance alone that the more plausible explanation is the effect being studied. That the results are unlikely to have happened by chance is what we mean by the phrase “statistical significance,” not to be confused with practical significance.
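To make that decision rule concrete, here is a minimal sketch in Python; the alpha and p-value below are placeholder numbers for illustration, not results from any particular study:

```python
# Minimal illustration of the alpha / p-value decision rule.
alpha = 0.05     # significance level, chosen before the experiment
p_value = 0.03   # hypothetical p-value calculated from a hypothesis test

if p_value < alpha:
    print("Statistically significant: unlikely to be due to chance alone")
else:
    print("Not statistically significant at this alpha")
```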
There was a wonderful example of how confusing alpha can be when the National Institutes of Health canceled trials for an HIV vaccine. The headline from U.S. News & World Report reads “HIV Vaccine Study Cancelled: Recipients More Likely to Catch Virus Than Those Given Placebo.” As is often the case with language, the headline offers more than one interpretation. One possible reading is that receiving this potential HIV vaccine caused more subjects to get HIV. This reading is suggested by the beginning of the subhead to the article: “Review of large-scale study found alarming.” The idea that a vaccine meant to prevent HIV would cause more people to get the virus makes "alarming" an understatement.
However, the subhead closes with a different tone, noting that the trial had a “non-statistically significant” result. A non-statistically significant result doesn’t seem nearly as alarming (because it’s not). The NIH press release about this doesn’t give us all of the information we’d need to reproduce their math, such as the number of subjects who had had the vaccine for at least 28 weeks. But we can probably get a good approximation from the total number of subjects.
|                    | Placebo Group | Vaccine Group |
|--------------------|---------------|---------------|
| Total subjects     | 1,244         | 1,250         |
| New HIV infections | 30            | 41            |
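From these counts we can compute the observed infection proportion in each group. Here's a quick sketch in Python using only the numbers in the table above:

```python
# Observed infection proportions from the trial counts above.
placebo_infections, placebo_total = 30, 1244
vaccine_infections, vaccine_total = 41, 1250

p_placebo = placebo_infections / placebo_total   # about 2.4% of the placebo group
p_vaccine = vaccine_infections / vaccine_total   # about 3.3% of the vaccine group

print(f"Placebo: {p_placebo:.6f}  Vaccine: {p_vaccine:.6f}")
print(f"Difference (placebo - vaccine): {p_placebo - p_vaccine:.8f}")
```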
The NIH press release also didn’t report the alpha level they used to determine that these results were not significant. By tradition, alpha is 0.05. The value 0.05 probably has its roots in the paper “The Statistical Method in Psychical Research” published by Sir Ronald Fisher in 1929 in the Proceedings of the Society for Psychical Research. Fisher writes, “it is a common practice to judge a result significant, if it is of such magnitude that it would have been produced by chance not more frequently than once in twenty trials. This is an arbitrary, but convenient, level of significance for the practical investigator.”
So let’s use 0.05 as alpha for now, and use a two-proportions hypothesis test to see whether there’s a statistically significant difference in the proportion of HIV infections between the participants who received the vaccine and those who received the placebo. In Minitab Statistical Software, that’s the 2 Proportions test under Stat > Basic Statistics, run on the summarized counts from the table above. Minitab provides the following output:
```
Test and CI for Two Proportions

Sample   X     N  Sample p
1       30  1244  0.024116
2       41  1250  0.032800

Difference = p (1) - p (2)
Estimate for difference:  -0.00868424
95% CI for difference:  (-0.0217291, 0.00436056)
Test for difference = 0 (vs not = 0):  Z = -1.30  P-Value = 0.192

Fisher's exact test: P-Value = 0.228
```
As you can see in the output above, the resulting Fisher's exact test p-value is 0.228. Because 0.228 is greater than our alpha of 0.05 (and not even close to it), we fail to reject the null hypothesis: a difference in infection proportions of this size could easily arise by chance, so the data don’t support the idea that the vaccine is responsible for it.
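If you don’t have Minitab available, you can check these numbers with open-source tools. Here’s a sketch in Python; I’m assuming scipy and statsmodels (neither appears in the original analysis), and the two-sided Fisher’s exact p-value can differ slightly from Minitab’s in the last digits depending on how each implementation defines the two-sided test:

```python
from math import sqrt

from scipy.stats import fisher_exact, norm
from statsmodels.stats.proportion import proportions_ztest

# Counts from the trial table: (infections, total) per group.
placebo_infections, placebo_total = 30, 1244
vaccine_infections, vaccine_total = 41, 1250

# Two-proportions z-test of difference = 0 vs. not = 0. statsmodels pools the
# proportions for the standard error, which gives essentially the same Z here.
z_stat, p_value = proportions_ztest(
    count=[placebo_infections, vaccine_infections],
    nobs=[placebo_total, vaccine_total],
)
print(f"Z = {z_stat:.2f}, P-Value = {p_value:.3f}")   # roughly Z = -1.30, P = 0.192

# Wald 95% confidence interval for the difference in proportions,
# using unpooled standard errors (matches the interval shown above).
p1 = placebo_infections / placebo_total
p2 = vaccine_infections / vaccine_total
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / placebo_total + p2 * (1 - p2) / vaccine_total)
z_crit = norm.ppf(0.975)
print(f"95% CI for difference: ({diff - z_crit * se:.7f}, {diff + z_crit * se:.7f})")

# Fisher's exact test on the 2x2 table of infected vs. not infected.
table = [
    [placebo_infections, placebo_total - placebo_infections],
    [vaccine_infections, vaccine_total - vaccine_infections],
]
odds_ratio, fisher_p = fisher_exact(table, alternative="two-sided")
print(f"Fisher's exact test: P-Value = {fisher_p:.3f}")   # roughly 0.228
```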
The downside is that we still don’t have a vaccine for HIV. But the idea that the vaccine contributes to the contraction of HIV isn’t supported by the NIH data. Understanding what alpha is helps us feel more confident about what the results really mean.