How Much Data Do You Really Need? Check Power and Sample Size

Minitab Blog Editor | 20 March, 2012

Topics: Minitab Statistical Software, Power and Sample Size, Data Analysis

Gathering data is like tasting fine wine — you need the right amount. With wine, too small a sip keeps you from accurately assessing a subtle bouquet, but too large a sip overwhelms the palate. We can’t tell you how big a sip to take at a wine-tasting event, but when it comes to collecting data, Minitab Statistical Software’s Power and Sample Size tools can tell you how much data you need to be sure about your results.

To make sound decisions based on statistical analysis, you need to be sure you can trust your results. We can measure this using statistical power—the likelihood that your test will identify a significant difference or effect when one truly exists. The statistical power you need varies based on your goals and your resources. Testing critical airplane parts, for example, demands a higher degree of certainty than testing DVD players.

You can use Minitab Statistical Software’s Power and Sample Size tools to make sure you collect enough data to conduct a reliable analysis, without wasting resources by collecting more data than you need. You can also assess the power of tests that have already been run, and estimate the sample size you need to obtain a specific margin of error.

Understanding Power and Sample Size

Minitab’s Power and Sample Size tools help you balance your need for statistical power with the expense of gathering data by answering this question: How much data do you need? This deceptively simple question can take many forms.

  • How many samples do you need to determine if the average thickness of paper from one supplier is the same as that of paper from another supplier?
  • How many people should you sample to be 95% confident that the proportion of people supporting a candidate is within 3% of its true value?
  • Can you trust the t-test conclusion that indicates the average test scores of two school districts are not different?
  • How many replicates does your experiment need if you want to have at least an 85% chance of detecting the factors that significantly affect your manufacturing process?
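To make the polling question above concrete, here is a minimal Python sketch of the textbook normal-approximation formula for a proportion's margin of error, n = z² · p(1 − p) / E². It is not Minitab's own calculation, which may use a different (for example, exact) method. The function name is hypothetical, "within 3%" is read as 3 percentage points, and p = 0.5 is assumed as the worst case because the true proportion is unknown in advance.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Normal-approximation sample size for estimating a proportion.

    margin     -- desired half-width of the confidence interval (0.03 = 3 points)
    confidence -- confidence level of the interval
    p          -- assumed proportion; 0.5 is the most conservative choice
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up, because a fractional respondent cannot be sampled
    return math.ceil(z**2 * p * (1 - p) / margin**2)


print(sample_size_for_proportion(margin=0.03))  # 1068
```

Under these assumptions, roughly a thousand respondents are needed. A planning value of p further from 0.5 would reduce that number.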

Ideally, you want to collect enough data to ensure you have sufficient power to draw sound conclusions.

Using Minitab Statistical Software to Determine Power and Sample Size

Minitab gives you tools to estimate sample size and power for the following statistical tests:

  • Sample Size for Estimation
  • 1-Sample Z
  • 1- and 2-Sample t
  • Paired t
  • 1 and 2 Proportions
  • 1- and 2-Sample Poisson Rate
  • 1 and 2 Variances
  • One-Way ANOVA
  • 2-Level Factorial Design
  • Plackett-Burman Design
  • General Full Factorial Design

Minitab's power and sample size capabilities allow you to examine how different test properties affect each other. For example, with a two-sample t-test you can calculate:

  • Sample sizes—the number of observations in each sample.
  • Differences (effects)—the minimum difference between the mean for one population and the mean for the other that you can detect.
  • Power—the probability of detecting a significant difference when one truly exists.

[Figure: Minitab Power and Sample Size dialog]

If you enter values for any two of these properties, Minitab will calculate the third. For instance, if you specify values for the minimum difference and power, Minitab will determine the sample size required to detect the specified difference at the specified level of power.
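The "enter two, solve for the third" relationship can be sketched in a few lines of Python. This is an illustration only, not Minitab's implementation: it uses a normal approximation for a two-sided, two-sample comparison with equal group sizes and a known common standard deviation, whereas exact t-test calculations account for the estimated standard deviation and give slightly larger sample sizes. All function names and the example values (a difference of 0.5, a standard deviation of 1.0, 80% power, alpha of 0.05) are assumptions made for the sketch.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample(n_per_group, difference, sigma, alpha=0.05):
    """Approximate power given sample size and difference."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Difference expressed in standard errors of the difference in means
    shift = abs(difference) / (sigma * math.sqrt(2 / n_per_group))
    # Chance of rejecting in either tail of the two-sided test
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


def sample_size_two_sample(difference, sigma, power, alpha=0.05):
    """Approximate per-group sample size given difference and power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    # Closed form that ignores the negligible far-tail rejection probability
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / abs(difference)) ** 2)


def detectable_difference(n_per_group, sigma, power, alpha=0.05):
    """Approximate smallest detectable difference given sample size and power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_beta) * sigma * math.sqrt(2 / n_per_group)


n = sample_size_two_sample(difference=0.5, sigma=1.0, power=0.80)
print(n)                                    # 63 per group
print(power_two_sample(n, 0.5, 1.0))        # at least 0.80
print(detectable_difference(n, 1.0, 0.80))  # just under 0.5
```

Holding the standard deviation and significance level fixed, each function inverts the same relationship, which is why fixing any two of the three quantities determines the third.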

Prospective and Retrospective Power and Sample Size

Calculating statistical power before you collect data to ensure that your hypothesis test will detect significant effects is called a “prospective” study. For example, suppose your company makes cereal, and you need to determine whether the box-filling process is meeting requirements. You want to be sure the mean fill weight of the process does not differ from the target weight of 365 grams by more than 2.5 grams. Using a standard deviation of 4.58 grams and a power of 85%, how many cereal boxes do you need to sample?  The more samples you test, the better the chance you’ll detect such a difference if it exists—but if you test too many samples, your test will take longer and cost more than necessary.

Using Minitab’s Power and Sample Size for 1-Sample t reveals that you need to sample only 33 cereal boxes to detect a difference of 2.5 grams or more with a power of 85%.

[Figure: Minitab Power and Sample Size output for the 1-Sample t example]
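As a rough cross-check of the cereal example, the sketch below applies a common hand-calculation shortcut: the normal-approximation sample size plus a small-sample correction of z²/2 (often attributed to Guenther). This is not how Minitab computes the result; exact calculations of this kind typically rely on the noncentral t distribution. The function name is hypothetical, and the two-sided test with a significance level of 0.05 is an assumption, since the example does not state one.

```python
import math
from statistics import NormalDist


def one_sample_t_size(difference, sigma, power, alpha=0.05):
    """Approximate n for a two-sided 1-sample t-test.

    Uses the normal-theory formula plus a z^2/2 correction for estimating
    the standard deviation from the sample (an approximation, not an exact
    noncentral-t calculation).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    effect = abs(difference) / sigma             # standardized effect size
    n_normal = ((z_alpha + z_beta) / effect) ** 2
    return math.ceil(n_normal + z_alpha**2 / 2)


# Values from the cereal example: 2.5 g difference, 4.58 g standard deviation
print(one_sample_t_size(difference=2.5, sigma=4.58, power=0.85))  # 33
```

Under these assumptions the approximation agrees with the 33 boxes reported above. Without the correction term, the plain normal formula would suggest 31, which is slightly too few for a t-test.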

You can also use Minitab to understand the power of tests that have already been conducted. This is called a “retrospective” study. For example, a parts manufacturer compares the weight of parts made with two steel formulations, and the results are not statistically significant with a p-value of 0.05. Using Minitab, the manufacturer can calculate this test’s power based on the sample size, the minimum difference they want to be able to detect, and the standard deviation to determine if they can rely on the results of their analysis. If the power to detect this difference is low, they may want to modify the experiment by sampling more parts to increase the power and re-evaluate the formulations. However, if the power is high, they may conclude that the two steel formulations are not different, and forgo additional data collection.

Minitab Makes Power and Sample Size Easy

The Power and Sample Size tools in Minitab make it easier than ever to be sure you can count on the results of your analyses.

If you’re not already using the power of Minitab to get the maximum value from your data, download a free, fully functional 30-day trial of Minitab Statistical Software today.