Statistics can be unintuitive. What’s a large difference? What’s a large sample size? When is something statistically significant? You might think you know, based on experience and intuition, but you can’t be sure until you run the proper statistical tests and see what the data are actually telling you!
Even experts can get tripped up by their hunches, as we'll see.
In my family, we’re huge fans of the Mythbusters. This fun Discovery Channel show mixes science and experiments to prove or disprove various myths, urban legends, and popular beliefs. Are daddy longlegs spiders really super venomous? Can diving underwater protect you from an explosion or from being shot? Are toilets really the cleanest place in the house? What is the fastest way to cool a beer? They often find a way to work in impressive explosions, one of their hallmarks. Thanks to Mythbusters, my 7-year-old daughter was able to explain to me that you can identify the explosive ANFO because it’s made out of pellets!
I love the Mythbusters because they make science fun. They find ways to test the myths and go to extensive efforts to rule out competing variables. The hosts go through extensive planning and small-scale testing before conducting the full-sized experiment. The Mythbusters’ skilled crew and well-stocked workshop can build a rig or robot to test virtually anything in a controlled and repeatable fashion. They also place a strong focus on collecting data and using those data to make decisions about the myths. This show is a fun way to bring the scientific method alive for our young daughter. Good stuff!
Having said that, I did catch them making a statistical mistake during an episode we watched recently. I’m pointing this mistake out only to highlight how non-intuitive statistics can be, and not to put down the hard work of the Mythbusters.
The Myth: Yawning is Contagious
This episode tested the myth that yawning is contagious—so if you see someone yawn, you’re more likely to yawn yourself. They recruited 50 people who thought they were being considered for an appearance on the show. One by one, each subject spoke with the recruiter who either yawned, or not, during the spiel. The subjects then sat by themselves in an isolation room and were told to wait. While in the isolation room for a set amount of time, unbeknownst to them, the Mythbusters watched to see if they yawned.
- Of the 16 subjects who were not exposed to a yawn, 4 (25%) yawned while waiting. I’ll call this the non-yawn group.
- Of the 34 subjects who were exposed to a yawn, 10 (29%) yawned. I’ll call this the yawn group.
Jamie Hyneman, one of the hosts, concluded that because of their large sample size (n=50), the 4-percentage-point difference was meaningful. They didn’t run a statistical test; instead, the call was based on his intuition about the statistical power that the sample size gave them. Let’s test this out a bit more rigorously.
Testing the Myth with the Two Proportions Test
To test their data, we’ll need to use the two proportions test in Minitab (Stat > Basic Statistics > 2 Proportions). We can use summarized data rather than data in a worksheet.
Fill in the main 2 Proportions dialog like this:
The Mythbusters wanted to test whether the proportion for the yawn group was greater than the non-yawn group. So we need to perform a one-sided test, which also provides a little more statistical power.
Click Options and choose greater than as the alternative hypothesis to determine whether the first proportion is greater than the second proportion.
We get the following output:
You’ll see that there are two p-values. Fisher’s exact test is designed for small sample sizes. The note about the normal approximation and small sample sizes indicates that we should use Fisher’s exact test, which gives a P-Value of 0.513. This value is greater than any reasonable alpha level (typically 0.05), so we can’t reject the null hypothesis.
Conclusion: the data do not show that there is a higher proportion of yawning subjects in the yawn group than in the non-yawn group. Further, far from being large, the sample is actually flagged as small by Minitab.
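If you’d like to double-check this result outside of Minitab, the same one-sided Fisher’s exact test can be sketched in Python with scipy (this is a re-check of the published counts, not the Mythbusters’ own analysis):

```python
# Re-checking the Mythbusters data with scipy's Fisher's exact test.
from scipy.stats import fisher_exact

# Rows: yawn group, non-yawn group; columns: yawned, did not yawn.
table = [[10, 24],   # exposed to a yawn:     10 of 34 yawned
         [ 4, 12]]   # not exposed to a yawn:  4 of 16 yawned

# One-sided test: is the yawn group's proportion the greater one?
odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"Fisher's exact one-sided p-value: {p_value:.3f}")  # 0.513
```

The p-value matches Minitab’s Fisher’s exact P-Value of 0.513, well above any reasonable alpha level.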
Power and Sample Size: How Large is Large Enough?
Fans of the show know that when they can’t confirm a myth, the Mythbusters find an exaggerated way to replicate the myth to show the extreme conditions that are necessary to make the myth happen. This method is a great way to increase the number of explosions they get to show!
As much as I want to, I can’t give you an impressive explosion for the blog post finale! However, I can give you a startling answer to the question of how large a sample the Mythbusters needed to have a good chance of detecting a difference of 29% versus 25%. The answer is so large that you might just end up waving your arms around like Adam Savage!
To figure this out, we’ll use Minitab’s Power and Sample Size calculation for Two Proportions (Stat > Power and Sample Size > 2 Proportions). We’ll use the proportions from the study and a power of 0.8, which is a good standard value, as I’ve discussed here.
In a nutshell, a power of 0.8 indicates that a study has an 80% chance of detecting a difference between the 2 populations if that difference truly exists.
Fill in the dialog like this:
Under Options, choose Greater than (p1 > p2). We get the following results:
The results show that the Mythbusters needed a whopping 1,523 subjects per group (3,046 total) to have an 80% chance of detecting the small difference in population proportions! That's a far cry from the 50 subjects that they actually had. Why is this so large? There are two main reasons.
First, the effect size is small and that requires a larger sample. Second, the data for this test are categorical rather than continuous. The subjects either yawned or did not yawn while in the isolation room. Generally speaking, any given amount of categorical data represents less useful information than the same amount of continuous data. Consequently, you need a larger sample size when you're analyzing categorical data.
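For readers without Minitab, the same sample-size calculation can be sketched with the textbook normal-approximation formula for comparing two proportions, using scipy only for the normal quantiles. (The formula below is a standard one and happens to reproduce the 1,523 figure; I'm not claiming it is the exact internal computation Minitab performs.)

```python
import math
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a one-sided two-proportions z-test,
    via the textbook normal-approximation formula (equal group sizes)."""
    z_alpha = norm.ppf(1 - alpha)   # one-sided critical value
    z_beta  = norm.ppf(power)       # quantile for the desired power
    p_bar   = (p1 + p2) / 2         # pooled proportion under equal n
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

print(n_per_group(0.29, 0.25))  # 1523 per group
```

Doubling that gives the 3,046 total subjects reported above.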
Retrospective Power Analysis
We can also take the results of the study and use them to determine how much power the study had. To do this, we input the sample size and the estimate of each proportion from the study into the power and sample size dialog. Of course, we don’t know the true values of the population proportions, but the study provides the best estimates that we have at this point.
For this study, Minitab calculates a power of 0.09. This value indicates that there was less than a 10% chance of detecting such a small difference, assuming that the difference truly exists. Therefore, insignificant results are to be expected for this study regardless of whether the difference truly exists or not.
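The retrospective power can be approximated the same way in Python, plugging in the actual group sizes (34 and 16) and the observed proportions. This simple unpooled normal approximation is a sketch, not Minitab's exact algorithm, but it lands on essentially the same answer:

```python
import math
from scipy.stats import norm

def power_two_proportions(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a one-sided two-proportions z-test,
    using a simple unpooled normal approximation."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # SE of difference
    z_alpha = norm.ppf(1 - alpha)                            # one-sided cutoff
    # Probability that the z-statistic clears the critical value
    # when the true difference is p1 - p2.
    return 1 - norm.cdf(z_alpha - (p1 - p2) / se)

print(round(power_two_proportions(0.29, 34, 0.25, 16), 2))  # ~0.09
```

At roughly 9% power, a non-significant result was the overwhelmingly likely outcome no matter what the truth is.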
Closing Thoughts: The Mythbusters Need Minitab
Given the results of the 2 Proportions Test and the power analysis, we can conclude:
- There is no evidence that yawns are contagious.
- The study had inadequate power to detect a difference.
Coming from the university world of academic research projects, I would say that the Mythbusters conducted a pilot study. These are small experiments designed to gather initial estimates (such as the proportions) and determine the feasibility of conducting a larger study. At this point, the main result is that the study, as it was performed, was not up to the task at hand: it could not reasonably detect a difference of the size that is likely to exist, if there is any difference at all.
That does not mean that this project was a waste of time, though, because you don’t know this until you do at least some research.
In the research world, the question now would be whether further research is worthwhile. This determination is different for each research project. You need to balance the effect size (small in this case), the benefits (negligible), and the additional costs (very large for a much larger sample size). So, I'd guess that a large follow-up study is unlikely to happen!
We remain huge fans of the Mythbusters! This case study only serves to highlight the fact that conducting research and data analysis is a tricky business that can trip up even the experts! That’s why you need Minitab Statistical Software in your corner. The Mythbusters should look into getting a copy!