This article explains why a standard Gage R&R cannot adequately assess the capability of many measurement systems and demonstrates that when a standard study is not enough, an Expanded Gage R&R is an ideal tool to comprehensively characterize your measurement system.
The Limitations of Traditional Gage R&R Studies
If you can’t trust your measurement system, you can’t trust the data it produces. That’s why Measurement Systems Analysis (MSA) is a key component of establishing, improving and maintaining quality systems. Whether you’re engaged in a Six Sigma project or an ISO-9000 certification, an MSA helps you identify problems with your measurement system and determine if you can trust your data.
The most common type of MSA is the Gage Repeatability and Reproducibility (Gage R&R) study. Most Gage R&R studies assess the effects of two factors, typically Operator and Part, on variation in your measurement system. Such a study can help you answer a variety of questions, including:
- Is your measurement system sensitive enough?
- Is your measuring tool consistent?
- Are the people taking the measurements consistent?
However, the effects of Operator and Part frequently are not enough to provide a complete understanding of the measurement system. Adding a third variable (typically “Gage”) to the standard study is often required.
When three or more factors are included in the analysis, we call the study an Expanded Gage R&R. In the two example scenarios below, a third factor is crucial to understanding the system.
- An electronics manufacturer makes voltage regulators on 3 production lines, each with its own gaging system. Faced with an unacceptably high reject rate, the quality manager suspects the measurement system is at fault, but each gage has been calibrated to its own standard and passed its Gage R&R with flying colors. The manager conducts an Expanded Gage R&R that includes the three gages as well as Operator and Part. The calculated %Tolerance — the proportion of the tolerance that is taken up by the measurement system variability — is 79%. A %Tolerance greater than 30% is considered unacceptable. After the manufacturer calibrates the gages to one standard, rejects are virtually eliminated.
- A California machine shop produces stainless steel parts to extremely tight tolerances for use in robotic surgical instruments. Customers require verification of the capability of the shop's dimensional measurement systems. Because any measurement technician could use any of dozens of gages, a standard Gage R&R could not demonstrate capability. The shop conducted an Expanded Gage R&R, including Operator, Part and Gage. The Total Gage R&R %Tolerance of 3% was so low that the shop was able to reduce QA sample size while maintaining the same level of quality.
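The %Tolerance figures quoted in these scenarios can be illustrated with a minimal sketch. It assumes the common convention that the measurement system spread is taken as 6 standard deviations and compared with the width of the specification band; the function name, the multiplier default and the input numbers are hypothetical and are not taken from either case.

```python
def percent_tolerance(sigma_gage_rr: float, lower_spec: float,
                      upper_spec: float, k: float = 6.0) -> float:
    """Share of the tolerance band taken up by measurement system spread.

    sigma_gage_rr is the Total Gage R&R standard deviation.
    k is the study-variation multiplier (6 is an assumed default).
    """
    tolerance = upper_spec - lower_spec
    if tolerance <= 0:
        raise ValueError("upper_spec must be greater than lower_spec")
    return 100.0 * k * sigma_gage_rr / tolerance


# Hypothetical inputs, chosen only to show the arithmetic.
pct = percent_tolerance(sigma_gage_rr=0.05, lower_spec=4.0, upper_spec=6.0)
print(f"%Tolerance = {pct:.1f}%")

# The article treats a %Tolerance greater than 30% as unacceptable.
print("unacceptable" if pct > 30 else "within the 30% guideline")
```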
What are the 5 Main Differences Between Standard and Expanded Gage R&R Studies?
- The expanded study allows additional factors such as Gage, Laboratory, Location, etc., to be evaluated, in addition to Operator and Part.
- The design can be unbalanced, meaning that – unlike the standard study – missing data points are allowed in the analysis for an expanded study.
- The interactions of the additional factors with Operator and Part can also be evaluated.
- The sampling plan for the expanded study can quickly grow beyond a reasonable size, so the number of levels of at least one factor usually has to be reduced. For example, reducing the number of parts from 10 to 5 is a common approach.
- The study can include factors that are either fixed or random, for additional flexibility. In Gage studies, if you intentionally select certain levels of interest – such as the most experienced and least experienced operators – then the factor is fixed. If you randomly select the levels to represent the overall population, the factor is random. Typical Gage studies calculate results assuming all factors are random. But treating fixed factors as random factors can result in over- or under-emphasizing their importance.
Experiences with Expanded Gage R&R in the Field
Minitab has helped dozens of companies implement Expanded Gage studies to correctly assess their measurement system and improve quality, from surface roughness at Corning, Inc., to coating thickness at AkzoNobel. We have learned that simply running a separate standard Gage R&R at each of the levels of the extra variable is rarely an efficient design for answering the questions of interest.
To help more quality practitioners reap the benefits of this powerful tool, let’s take a step-by-step look at how to design, analyze and interpret the results of an Expanded Gage R&R Study. We will use a system for measuring film thickness from the microelectronics industry for illustration.
Process and Data Collection for Expanded Gage R&R Studies
Photoresist coating is used in the microelectronics industry to etch integrated circuits for microprocessors, RAM, etc., onto silicon wafers.[1] We need to assess the measurement system for the thickness of this photoresist coating. The thickness affects how coated silicon wafers perform in microelectronics, so obtaining accurate measurements is critical.
The data collection plan is outlined below (only 1 branch is shown for simplicity):

- 5 wafers are randomly selected to represent the typical process performance.
- 3 operators are randomly selected.
- 3 gages are randomly selected.
- Each operator will measure each wafer with each gage twice.
In a standard Gage R&R plan, we would select 10 wafers at random to represent process performance. If a standard study were run for each of the 3 gages, the total sample size would be:
(10 Parts) x (3 Operators) x (2 Replicates) x (3 Gages) = 180 measurements
That is an unacceptably large sample size. By decreasing the number of parts (wafers) from 10 to 5, the entire study can be completed in 90 measurements.
Changing the sampling plan is commonly required to reduce the size of the Expanded Gage R&R study to a manageable level. This is an important difference between a standard and an expanded study. Later, we will demonstrate that reducing the number of parts from 10 to 5 did not compromise the quality of our calculations.
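The sample-size arithmetic above is simply the product of the number of levels of each factor and the number of replicates. A minimal sketch of that calculation follows; the helper name `total_measurements` is illustrative only.

```python
from math import prod
from typing import Mapping


def total_measurements(levels: Mapping[str, int], replicates: int) -> int:
    """Run count for a fully crossed study: product of levels times replicates."""
    return prod(levels.values()) * replicates


# A standard 10-part plan repeated on each of the 3 gages.
full_plan = total_measurements({"Part": 10, "Operator": 3, "Gage": 3}, replicates=2)

# The reduced plan used in this article: 5 wafers instead of 10.
reduced_plan = total_measurements({"Part": 5, "Operator": 3, "Gage": 3}, replicates=2)

print(full_plan, reduced_plan)  # 180 90
```

The same helper makes it easy to see how quickly a fourth factor, such as Laboratory, would multiply the run count again.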
Entering the Data for Expanded Gage R&R Studies
As can be seen in the worksheet for this study’s 90-row dataset, each operator measures each wafer on each of the three gages, twice. Each row has a column that identifies the Operator, Gage, Wafer and Thickness reading. Even though missing data is not allowed in a Standard Gage R&R, an expanded study accommodates missing data, as seen in Row 10 below.
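For readers preparing a similar worksheet outside Minitab, the layout described above (one row per reading, with columns for Operator, Gage, Wafer and Thickness) can be sketched as follows. The `Replicate` and `RunOrder` columns and the randomized run order are assumptions added for illustration; the article does not state how the run order was chosen.

```python
import itertools
import random


def build_worksheet(n_wafers=5, n_operators=3, n_gages=3, n_replicates=2, seed=0):
    """Return one row per planned reading in long (stacked) format."""
    rows = [
        {"Operator": op, "Gage": gage, "Wafer": wafer,
         "Replicate": rep, "Thickness": None}  # None = not yet measured / missing
        for op, gage, wafer, rep in itertools.product(
            range(1, n_operators + 1),
            range(1, n_gages + 1),
            range(1, n_wafers + 1),
            range(1, n_replicates + 1),
        )
    ]
    # Assumed practice: shuffle so readings are not taken in a systematic order.
    random.Random(seed).shuffle(rows)
    for run, row in enumerate(rows, start=1):
        row["RunOrder"] = run
    return rows


worksheet = build_worksheet()
print(len(worksheet))  # 90
print(worksheet[0])
```

A reading that cannot be taken simply keeps `Thickness` empty, which matches the way an expanded study tolerates missing data.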
To carry out the analysis in Minitab, choose Stat > Quality Tools > Gage Study > Gage R&R Study (Expanded). Complete the dialog box as shown below. The analysis treats Operator, Part and Gage as random factors because each of these factor levels (e.g. each operator) was randomly sampled from a larger population. (If our measurement system had only two gages and our main goal was to compare them to each other, then our analysis should consider Gage as a fixed factor,[2] and we would identify it as a fixed factor in the dialog box.)
Next, we select the terms we wish to evaluate by clicking Terms… and adding all main effects (Wafer, Operator and Gage) as well as all second-order terms — Wafer*Operator, Wafer*Gage, and Operator*Gage. By including “Gage” in the study, not only do we determine the variability due to the gage main effect, but also its interaction with the other two variables, Operator and Part. Finally, we select the graphs we would like to evaluate by clicking Graphs… and completing the dialog box as shown.


Then click OK to close the dialog boxes, and Minitab will perform the analysis.
Interpreting the Results of the Expanded Gage R&R Study
Minitab provides a great deal of numeric and graphical output. Let’s evaluate the two most important data tables first. The ANOVA (Analysis of Variance) table shows which sources of variation were statistically significant. Factors with p-values less than 0.05 in the ANOVA table below are statistically significant.
The ANOVA output indicates that part-to-part variation, gage-to-gage variation, the Wafer*Operator interaction, and the Wafer*Gage interaction are statistically significant. The high p-values for Operator and the Operator*Gage interaction indicate that these two sources of variation are not statistically significant, and therefore will not be of concern when trying to reduce the variability of the measurement system. (Although Wafer-to-Wafer variability is statistically significant, part-to-part variation is not a key concern in this study because we are focusing on the measurement system.)
It is also important to evaluate the ANOVA table for the number of degrees of freedom (an indicator of the number of repeat measurements) available to estimate the repeatability of the gage. Here we see 57 degrees of freedom, well above the 30 to 45 recommended by simulation studies.[3] The more degrees of freedom, generally, the better the estimate. Therefore, the reduced number of Parts in the study has not hindered our ability to estimate the contribution of the gage repeatability to the overall variation of the measurement system.
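As a rough planning aid, the degrees of freedom left for repeatability can be estimated before any data are collected. The sketch below assumes a fully crossed design with all main effects and two-way interactions in the model, and assumes every cell keeps at least one reading so that all terms remain estimable. The function name is hypothetical. The value of 57 quoted above is read from Minitab's output for the actual data set, which includes at least one missing reading, and the sketch does not attempt to reproduce it.

```python
from itertools import combinations
from math import prod
from typing import Mapping


def repeatability_df(levels: Mapping[str, int], n_observations: int,
                     max_order: int = 2) -> int:
    """Error degrees of freedom: total df minus df of all model terms."""
    model_df = 0
    for order in range(1, max_order + 1):
        for term in combinations(levels, order):
            # df of a term is the product of (levels - 1) over its factors.
            model_df += prod(levels[factor] - 1 for factor in term)
    return (n_observations - 1) - model_df


plan = {"Wafer": 5, "Operator": 3, "Gage": 3}
print(repeatability_df(plan, n_observations=90))  # balanced plan, no missing rows
print(repeatability_df(plan, n_observations=89))  # one reading missing
```

Under these assumptions each missing reading removes one degree of freedom from the repeatability estimate, which is why a modest amount of missing data still leaves the study above the recommended range.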

Next we’ll examine the Gage Evaluation table. The Automotive Industry Action Group[4] has set guidelines for %Study Variation and Number of Distinct Categories at a maximum of 30% and a minimum of 5 categories, respectively. Here we see that this measurement system only narrowly meets both guidelines.
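The two measures can be sketched from variance components as follows. The sketch assumes the commonly used definitions: %Study Variation is the measurement system standard deviation as a percentage of the total standard deviation, and Number of Distinct Categories is the square root of 2 times the ratio of part to measurement system standard deviation, truncated to an integer and never reported below 1. The function names and the variance values are hypothetical and are not the values from this study.

```python
import math


def percent_study_variation(var_measurement: float, var_part: float) -> float:
    """Measurement system standard deviation as a percent of total."""
    var_total = var_measurement + var_part
    return 100.0 * math.sqrt(var_measurement / var_total)


def number_of_distinct_categories(var_measurement: float, var_part: float) -> int:
    """Truncated sqrt(2) * sigma_part / sigma_measurement, with a floor of 1."""
    ndc = math.sqrt(2.0 * var_part / var_measurement)
    return max(1, int(ndc))


# Hypothetical variance components (Total Gage R&R and part-to-part).
var_ms, var_part = 1.0, 15.0

pct_sv = percent_study_variation(var_ms, var_part)
ndc = number_of_distinct_categories(var_ms, var_part)
print(f"%Study Var = {pct_sv:.1f}%, distinct categories = {ndc}")

# Guidelines quoted in the article: at most 30% and at least 5 categories.
print("meets guidelines" if pct_sv <= 30 and ndc >= 5 else "does not meet guidelines")
```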
The Gage Evaluation table also shows the relative importance of each of the sources of variation. Gage and Wafer*Gage are the two strongest contributors to the overall variation, each accounting for about 15% of Study Variation. We can see the contribution of Gage to the variation in the main effects plot below. The average reading by gage varies from 111 to 123 microns.
However, this is not the full story, because the Wafer*Gage interaction was also a strong contributor to the measurement system variation, as shown in the figure below.

The general agreement seen in the three gages on wafers 3 and 5 indicates there is not a consistent bias between the three gages. However, Gage 1 has a strong positive bias for wafers 1 and 4. Even though the measurement system is acceptable, determining why the gage exhibited bias when measuring wafers 1 and 4 — and fixing this problem — will reduce overall variation in the measurement system.
Finally, we return to the question of the effect of reducing the number of parts from 10 to 5. Our capability estimators, % Study Variation and Number of Distinct Categories, are a function of the part-to-part variability, which can be calculated from the parts in the study or from historical data. With only 5 parts, one would expect more reliable results from using the historical standard deviation. The ratio of the measurement system variation to the process variation calculated from historical data is called % Process and is shown in the Gage Evaluation table. The general specification on % Process (less than 30%) is the same as that for % Study Variation. When reducing the number of parts below 10, entering a historical standard deviation and focusing on % Process instead of % Study Variation is strongly recommended. In this way, the size of the study can be reduced without concern that the quality of the results has been compromised. In this case, we see that % Process and % Study Variation are nearly equal. Therefore, our conclusions remain the same.
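The % Process calculation described above can be sketched in a few lines. The function name and the input values are hypothetical; the only assumption about the method is the one stated in the text, namely that the measurement system standard deviation is divided by a process standard deviation taken from historical data rather than from the parts in the study.

```python
def percent_process(sigma_measurement: float, sigma_historical: float) -> float:
    """Measurement system spread as a percent of historical process spread."""
    if sigma_historical <= 0:
        raise ValueError("historical standard deviation must be positive")
    return 100.0 * sigma_measurement / sigma_historical


# Hypothetical values: Total Gage R&R sigma and a historical process sigma.
pct_process = percent_process(sigma_measurement=1.0, sigma_historical=4.0)
print(f"%Process = {pct_process:.1f}%")

# Same guideline as for %Study Variation: below 30%.
print("meets guideline" if pct_process < 30 else "does not meet guideline")
```

Because the denominator no longer depends on the handful of parts in the study, this ratio is less sensitive to which 5 wafers happened to be sampled.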
Actionable Conclusions From the Expanded Gage R&R Study
- The Expanded Gage R&R study has provided a comprehensive assessment of the measurement system for the photoresist thickness measurement. With a Number of Distinct Categories of 5, the system meets the minimum acceptance criteria for a measurement used to study the process.
- Because Gage and the Wafer*Gage interaction were the strongest contributors to the measurement variation, determining the cause of the differences between gages, particularly for certain parts, will reduce overall measurement system variation. The within-gage repeatability was also a reasonably large source of variation. Identifying ways to make the gage more repeatable will also reduce variation in the system.
Conclusion
As we have seen, a standard Gage R&R cannot adequately assess the capability of many measurement systems. When a standard study is not enough, an Expanded Gage R&R is an ideal tool to comprehensively characterize your measurement system.
References
[1] Johnson, L., and S. P. Bailey (2012), “Implementing an Expanded Gage R&R Study.” ASQ World Conference on Quality and Improvement, Anaheim, Ca.
[2] Dolezal, K. K., R. K. Burdick, and N. J. Birch (1998). “Analysis of a Two-Factor R&R Study with Fixed Operators.” Journal of Quality Technology, Vol 30, p163.
[3] Zuo, Y. (2009). “Effect of Sample Size on Variance Component Estimates in Gage R&R Studies.” Minitab Technical White Paper.
[4] AIAG Measurement Systems Analysis, Reference Manual, 3rd ed. (2003). Automotive Industry Action Group, Southfield, Mich.