Back when I used to work in Minitab Tech Support, customers often asked me, “What’s the difference between Cpk and Ppk?” It’s a good question, especially since many practitioners default to using Cpk while overlooking Ppk altogether. It’s like the '80s pop duo Wham!, where Cpk is George Michael and Ppk is that other guy.

Poofy hairdos styled with mousse, shoulder pads, and leg warmers aside, let’s start by defining rational subgroups and then explore the difference between Cpk and Ppk.

Rational Subgroups

A rational subgroup is a group of measurements produced under the same set of conditions. Subgroups are meant to represent a snapshot of your process. Therefore, the measurements that make up a subgroup should be taken from a similar point in time. For example, if you sample 5 items every hour, your subgroup size would be 5.

Formulas, Definitions, Etc.

The goal of capability analysis is to ensure that a process is capable of meeting customer specifications, and we use capability statistics such as Cpk and Ppk to make that assessment. If we look at the formulas for Cpk and Ppk for normal (distribution) process capability, we can see they are nearly identical:
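
For reference, the standard textbook forms are sketched below. The symbols are chosen here for illustration: USL and LSL are the upper and lower specification limits, x-bar is the process mean, and the two sigmas are the within and overall standard deviation estimates.

```latex
\begin{aligned}
C_{pk} &= \min(\mathrm{CPU}, \mathrm{CPL}), &
\mathrm{CPU} &= \frac{\mathrm{USL} - \bar{x}}{3\,\sigma_{\text{within}}}, &
\mathrm{CPL} &= \frac{\bar{x} - \mathrm{LSL}}{3\,\sigma_{\text{within}}} \\
P_{pk} &= \min(\mathrm{PPU}, \mathrm{PPL}), &
\mathrm{PPU} &= \frac{\mathrm{USL} - \bar{x}}{3\,\sigma_{\text{overall}}}, &
\mathrm{PPL} &= \frac{\bar{x} - \mathrm{LSL}}{3\,\sigma_{\text{overall}}}
\end{aligned}
```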

The only difference lies in the denominator for the Upper and Lower statistics: Cpk is calculated using the WITHIN standard deviation, while Ppk uses the OVERALL standard deviation. Without boring you with the details surrounding the formulas for the standard deviations, think of the within standard deviation as the average of the subgroup standard deviations, while the overall standard deviation represents the variation of all the data. This means that:

Cpk:

  • Only accounts for the variation WITHIN the subgroups
  • Does not account for the shift and drift between subgroups
  • Is sometimes referred to as the potential capability because it represents the potential your process has to produce parts within spec, presuming there is no variation between subgroups (i.e., over time)

Ppk:

  • Accounts for the OVERALL variation of all measurements taken
  • Theoretically includes both the variation within subgroups and also the shift and drift between them
  • Is where you are at the end of the proverbial day
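
As a rough sketch of how the two statistics are computed, the Python below calculates both from a list of subgroups. It assumes the within standard deviation is the pooled subgroup standard deviation with no unbiasing constant applied, which may differ from the defaults in your software. The function name `capability` and the parameters `lsl` and `usl` are illustrative, not taken from any particular package.

```python
import math
import statistics


def capability(subgroups, lsl, usl):
    """Return (cpk, ppk) for a list of subgroups, each a list of measurements.

    Assumption: within sd is the pooled subgroup standard deviation,
    with no unbiasing constant applied.
    """
    all_values = [x for sg in subgroups for x in sg]
    mean = statistics.fmean(all_values)

    # Within: pool the squared deviations from each subgroup's own mean.
    ss_within = 0.0
    df_within = 0
    for sg in subgroups:
        sg_mean = statistics.fmean(sg)
        ss_within += sum((x - sg_mean) ** 2 for x in sg)
        df_within += len(sg) - 1
    sd_within = math.sqrt(ss_within / df_within)

    # Overall: ordinary sample standard deviation of every measurement.
    sd_overall = statistics.stdev(all_values)

    # Distance from the mean to the nearer specification limit.
    nearest = min(usl - mean, mean - lsl)
    return nearest / (3 * sd_within), nearest / (3 * sd_overall)


# Hypothetical data: two subgroups of three measurements each.
cpk, ppk = capability([[9, 10, 11], [9, 10, 11]], lsl=4, usl=16)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

Only the denominator differs between the two results; the numerator (the distance from the mean to the nearer spec limit) is shared.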

Examples of the Difference Between Cpk and Ppk

For illustration, let's consider a data set where 5 measurements were taken every day for 10 days.

Example 1 - Similar Cpk and Ppk

As the graph on the left side shows, there is not a lot of shift and drift between subgroups compared to the variation within the subgroups themselves. Therefore, the within and overall standard deviations are similar, which means Cpk and Ppk are similar, too (at 1.13 and 1.07, respectively).

Example 2 - Different Cpk and Ppk

In this example, I used the same data and subgroup size, but I shifted the data around, moving it into different subgroups. (Of course we would never want to move data into different subgroups in practice – I’ve just done it here to illustrate a point.)

Since we used the same data, the overall standard deviation and Ppk did not change. But that’s where the similarities end.

Look at the Cpk statistic. It’s 3.69, which is much better than the 1.13 we got before. Looking at the subgroups plot, can you tell why Cpk increased? The graph shows that the points within each subgroup are much closer together than before. Earlier I mentioned that we can think of the within standard deviation as the average of the subgroup standard deviations. So less variability within each subgroup equals a smaller within standard deviation. And that gives us a higher Cpk.
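
The same effect can be reproduced with a small sketch. The data here are made up (20 evenly spaced values, not the data set from the graphs), and the within standard deviation is assumed to be the pooled subgroup standard deviation. Regrouping the same values changes the within estimate while the overall estimate stays put.

```python
import math
import statistics


def pooled_within_sd(subgroups):
    """Pooled standard deviation across subgroups (assumed within estimate)."""
    ss = 0.0
    df = 0
    for sg in subgroups:
        sg_mean = statistics.fmean(sg)
        ss += sum((x - sg_mean) ** 2 for x in sg)
        df += len(sg) - 1
    return math.sqrt(ss / df)


# Made-up measurements: 20 values from 10.0 to 11.9 in steps of 0.1.
values = [10 + 0.1 * i for i in range(20)]

# Arrangement 1: every subgroup spans the whole range of the data.
mixed = [values[j::4] for j in range(4)]

# Arrangement 2: the same values, grouped so neighbors sit together.
clustered = [values[i:i + 5] for i in range(0, 20, 5)]

overall_sd = statistics.stdev(values)  # identical for both arrangements
print(f"overall sd:           {overall_sd:.3f}")
print(f"within sd, mixed:     {pooled_within_sd(mixed):.3f}")
print(f"within sd, clustered: {pooled_within_sd(clustered):.3f}")
```

Because Cpk divides by the within standard deviation, the clustered arrangement yields the larger Cpk even though not a single measurement changed.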

To Ppk or Not to Ppk

And here is where the danger lies in only reporting Cpk and forgetting about Ppk like it’s George Michael’s lesser-known bandmate (no offense to whoever he may be). We can see from the examples above that Cpk only tells us part of the story, so the next time you examine process capability, consider both your Cpk and your Ppk. And if the process is stable with little variation over time, the two statistics should be about the same anyway.

(Note: It is possible, and okay, to get a Ppk that is larger than Cpk, especially with a subgroup size of 1, but I’ll leave that explanation for another day.)

Comments for Process Capability Statistics: Cpk vs. Ppk

Name: Omar Mora
Time: Tuesday, June 26, 2012

Michelle, thanks for this post. Long-term vs. short-term capability and rational subgroups are extremely important concepts.
Looking forward to your "Cpk-larger-than-Ppk-when-subgroup-size-of-1" article.
If possible, please consider covering confidence intervals for Cpk and/or Ppk in a future post.

Name: Arun
Time: Wednesday, June 27, 2012

Nice, clear thoughts. Liked it.

Keep it up buddy!

Name: Quentin
Time: Friday, July 20, 2012

Great explanation. I second the comment by Omar on the "Cpk-larger-than-Ppk-when-subgroup-size-of-1" topic. This is a very common question. I'll be looking for it.

Name: Chuck Sauder
Time: Monday, October 15, 2012

Really liked the article.
My question is how Minitab calculates different values for Cpk and Ppk when there are no subgroups (subgroup size = 1).

Name: Michelle Paret
Time: Monday, October 15, 2012

Chuck, I'm glad you liked the article.

Good question about Cpk vs. Ppk when subgroup size =1. In this case, Minitab uses the average moving range to calculate the within stdev (and Cpk), not the typical stdev formula which is used to calculate the overall stdev (and Ppk).

Name: Mike Lickley
Time: Monday, November 26, 2012

Great article, thank you. Am I correct in thinking that if I run a test and vary process variables, I should use the Ppk? Since the subgroups are not the same, the Cpk is not a true reflection of the variability, as I am introducing variability by changing the process. Thank you.

Name: Quentin
Time: Thursday, November 29, 2012

Very nice post. I googled "CPK and PPK" and found this. Much better than Wikipedia's explanation. So here I am, a SAS programmer who is going to start following a Minitab blog!

Name: Michelle Paret
Time: Wednesday, December 5, 2012

Mike, if you are varying process variables then it's likely your process will not be stable, which is one of the important assumptions for capability analysis. In addition, if you are introducing variability, then the overall stdev (used to calculate Ppk) will not be representative of the variation your process exhibits at any given time. I would suggest getting your process to a stable state and then collecting data to evaluate the process capability of the current, stable process.

Quentin, I'm happy to hear the explanation provided was helpful. Thank you for following our blog.

Name: Kerry Kearney
Time: Monday, December 17, 2012

Great article, not sure if the "...subgroup size of 1" article is available yet.

My question:

If we are collecting data in no particular order and using a subgroup size of one, can we hope to get a Cpk that has any connection to reality? Slightly alter the order of the data and we get a different Cpk...

Name: Michelle Paret
Time: Tuesday, December 18, 2012

Kerry, I'm glad the article was helpful. Great question about what to do when the data was recorded in no particular order. When the subgroup size is 1, within stdev is calculated using the average moving range. In other words, Minitab looks at the range between row1 and row2, then row2 and row3, etc. Minitab assumes the data are in chronological order. That is why changing the order of the data affects the average moving range and thus Cpk.
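
A minimal sketch of that calculation, assuming the common control-chart convention of dividing the average moving range of consecutive pairs by the constant d2 = 1.128. The function name and the data are hypothetical, and any unbiasing options in your software may change the exact figure.

```python
import statistics

D2_FOR_PAIRS = 1.128  # standard control-chart constant for ranges of 2 points


def moving_range_sd(values):
    """Estimate within sd from the average moving range of consecutive rows."""
    ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return statistics.fmean(ranges) / D2_FOR_PAIRS


# The same six made-up measurements in two different row orders.
in_order = [1, 2, 3, 4, 5, 6]
reordered = [1, 6, 2, 5, 3, 4]

print(f"within sd, in order:  {moving_range_sd(in_order):.3f}")
print(f"within sd, reordered: {moving_range_sd(reordered):.3f}")
print(f"overall sd (either):  {statistics.stdev(in_order):.3f}")
```

The overall standard deviation ignores row order, so Ppk is unaffected, while the moving-range estimate (and therefore Cpk) depends entirely on which rows are adjacent.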

If you do not know in what order the data were collected, I highly recommend using Assistant > Capability Analysis > Capability Analysis > Snapshot. Minitab will then provide you with only the statistics (e.g. Ppk) that are applicable.

(And I haven't gotten around to writing the "Ppk may be larger than Cpk when n=1" post yet. Hopefully I will have time one of these days...)

Name: Vahid
Time: Wednesday, January 16, 2013

Is this formula right?
δ^2 overall = δ^2 within + δ^2 between

Name: Michelle Paret
Time: Wednesday, January 16, 2013

Vahid, for process capability for the normal distribution, the overall stdev is calculated using the typical stdev formula (e.g. use Stat > Basic Statistics > Display Descriptive Statistics). Depending on what options you have selected, the formula might also divide by c4 (i.e. stdev overall = stdev/c4) where c4 is an unbiasing constant.
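
For readers curious about c4, here is a small sketch using its standard closed form in terms of the gamma function. Passing the total number of observations as the argument is an assumption made for illustration, and the function names are hypothetical.

```python
import math
import statistics


def c4(n):
    """Unbiasing constant c4 for a sample of n observations (n >= 2)."""
    # c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2)
    # lgamma avoids overflow for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2 / (n - 1)) * math.exp(log_ratio)


def unbiased_overall_sd(values):
    """Sample standard deviation divided by c4 (assumes n = len(values))."""
    return statistics.stdev(values) / c4(len(values))


print(f"c4(5)  = {c4(5):.4f}")
print(f"c4(50) = {c4(50):.4f}")
```

The constant approaches 1 as the sample grows, so the adjustment matters mostly for small data sets.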

Name: Ramesh
Time: Tuesday, February 12, 2013

Thanks for the article. For normal data the formula is easy to understand. Would you please elaborate on the difference between Cpk and Ppk for non-normal data?

Name: Michelle Paret
Time: Tuesday, February 12, 2013

Ramesh, this is a good question and one that comes up often. When you choose a nonnormal distribution to model your data, Minitab cannot calculate within-subgroup capability metrics such as Cpk. For a detailed explanation of why nonnormal distributions preclude a within-subgroup analysis, please see http://www.minitab.com/support/documentation/Answers/NoWithinSubgroupCapability.pdf.

Name: Matthew Copeland
Time: Tuesday, February 12, 2013

Most places I work (have worked) have copious amounts of data and are not doing logical sampling.
They also tend to set the subgroup size to 1.
In that case I advise that the “overall” or Ppk is the real number. The Cpk is the “right of the process.”
Great stuff. Write more, please.

Name: Ravikumar
Time: Friday, February 22, 2013

Hi, great article!

My query is: when calculating either Cpk or Ppk for a particular parameter, do I have to report both values to assure my customer that future production will be of good quality? Currently I am quoting the Ppk. Please suggest.

Name: Michelle Paret
Time: Monday, February 25, 2013

I would leave it up to your customer as to whether or not you report just Ppk or both Cpk and Ppk. It's possible your customer is most interested in Ppk since it reflects the current state of your overall process.
