I once had a boss who had difficulty understanding many, many things. When I need to discuss statistical concepts with people who don't have a statistical background, I like to think about how I could explain things so even my old boss would get it.
My boss and I shared a common interest in rock and roll, so that's the device I'll use to explain one of the workhorses of quality statistics, the Pareto chart. I'd tell my boss to imagine that instead of managing a surly gang of teenaged restaurant employees, he's managing a surly rock and roll band, the Zero Sigmas. The band did a 100-date tour last year, and before going on the road again, he wants to see what the most frequent mishaps were, in hopes that things might run a bit more smoothly. He's got a table of data, but it's a little difficult to figure out what the raw numbers mean.
He needs to create a Pareto chart with that data. It's a very straightforward tool, but therein lies the danger: Because it looks like a standard bar chart, the Pareto chart can be misinterpreted. A well-intentioned (but statistics-impaired) boss may take one look at it and, assuming it's a regular old run-of-the-mill bar chart, imagine he's got it all figured out without actually thinking about what he's seeing. He'll want to make sure he really understands what the Pareto chart reveals.
In my boss's defense, there's really not much difference between a Pareto chart and a regular old run-of-the-mill bar chart, except that the Pareto chart ranks your defects (or whatever it is you're measuring) from largest to smallest.
From a quality improvement perspective, this is important because it can help you identify which quality problems are the most critical in terms of volume, expense, or other factors. Once you've prioritized your challenges, you can focus improvement efforts where they'll have the largest benefits.
Organizations tend to use Pareto charts in one of two ways.
In quality-speak, we say the Pareto chart separates the "vital few" problems from the "trivial many." In other words, it gives you an easy way to visualize which problems have the biggest impact on your organization.
When you look at the data in a Pareto chart, you might find out, for instance, that even though there's a perception that customers complain more frequently about, say, shipping speed, your greatest volume of complaints is really about the voicemail system. Knowing that can help you tackle the problem that's most important to the most customers first.
In keeping with the rock-and-roll theme, the Pareto chart will help us see which incidents on last year's tour kept the Zero Sigmas from rocking audiences to the fullest.
When you create a Pareto chart in our statistical software, your data must include the name of each defect. These names can be text or numeric. If your data are summarized in a table, you must also include a column of frequencies or counts, with a nonnegative numeric value for each defect.
Let's say you've identified and tallied 9 types of mishaps that occurred with some regularity during last year's tour. You can arrange the data in a Minitab worksheet like this:
To create a chart that shows the frequencies of these incidents graphically, we just select Stat > Quality Tools > Pareto Chart and enter Incident as our Defects data and Count as our Frequencies data. Minitab produces the following graph:
The right Y-axis shows the percent of the total mishaps accounted for by each type of incident, while the left Y-axis shows the count of those incidents. The red line indicates cumulative percentage, which can help you judge the added contribution of each category. The bars show the count (and the percentage of total) for each category. Below the bars, the counts, percents, and cumulative percents are listed for each incident category.
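To make the numbers behind the chart concrete, here is a minimal Python sketch of how the count, percent, and cumulative percent for each category could be computed. It is not how Minitab does it internally; the function name `pareto_table` and the tallies in `example_counts` are hypothetical and are not the Zero Sigmas' tour data.

```python
def pareto_table(counts):
    """Return rows of (category, count, percent, cumulative percent), largest first."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Rank categories from largest to smallest count, as a Pareto chart does.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for name, count in ranked:
        running += count  # running total feeds the cumulative percent line
        rows.append((name, count, 100 * count / total, 100 * running / total))
    return rows


# Hypothetical tallies for illustration only; these are NOT the tour data.
example_counts = {
    "Broken string": 10,
    "Late start": 50,
    "Forgot lyrics": 25,
    "Blown amp": 15,
}

for name, count, pct, cum_pct in pareto_table(example_counts):
    print(f"{name:<15}{count:>5}{pct:>8.1f}{cum_pct:>8.1f}")
```

The percent column corresponds to the bar heights read against the right Y-axis, and the cumulative column corresponds to the red line, which always ends at 100%.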
You'll notice the last grouping is labeled "Other." Your raw data didn't include an "Other" category, but by default Minitab combines all categories whose counts represent less than 5% of the total defect count into this "Other" category.
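The grouping rule as described here can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not Minitab's actual implementation; the function name `combine_small_categories`, its `threshold` and `label` parameters, and the sample tallies are all hypothetical.

```python
def combine_small_categories(counts, threshold=0.05, label="Other"):
    """Merge categories whose share of the total is below threshold into one group."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    kept = {}
    other = 0
    for name, count in counts.items():
        if count / total < threshold:
            other += count  # too small to show on its own
        else:
            kept[name] = count
    # Rank the remaining categories from largest to smallest.
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    if other > 0:
        ranked.append((label, other))  # the combined group always goes last
    return ranked


# Hypothetical tallies for illustration only; these are NOT the tour data.
sample = {
    "Late start": 50,
    "Forgot lyrics": 30,
    "Out-of-tune guitar": 14,
    "Dropped pick": 3,
    "Tripped on cable": 2,
    "Lost setlist": 1,
}

for name, count in combine_small_categories(sample):
    print(name, count)
```

With these made-up numbers, the three smallest categories each account for less than 5% of the 100 incidents, so they collapse into a single "Other" bar with a count of 6.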
In this example, 27.9% of the incidents involved the Zero Sigmas starting their gig late, which they did every single night of the tour. Another 22.3% of incidents involved the band's singer, Hy P. Value, forgetting the lyrics to his own songs. The combined, or cumulative, percentage for starting late and forgetting lyrics is about 50%, and if you add in the guitars going out of tune, you've accounted for a whopping 67.9% of the incidents that plagued the band's 100-date tour.
In terms of overall numbers of incidents, it looks like these are the three areas you should focus on if you want the Zero Sigmas to kick out the jams more efficiently on the next tour. This illustrates how you would create a Pareto chart for the first use above: to determine the most commonly occurring types of defects.
Although the Pareto chart is easy to create, understand and use, it does have some limitations:
About that last bullet... this example looks at the overall counts of incidents that happened on the Zero Sigmas' recent tour. But how do we know those are the incidents that had the biggest impact on the tour's overall rock-and-roll awesomeness? That's exactly the kind of great question my old boss would never have thought to ask, and it's the question I'll answer in my next post.