Data Analysis

Blog posts and articles with tips for analyzing data for quality improvement methodologies, including Six Sigma and Lean.

"Data! Data! Data! I can't make bricks without clay." (Sherlock Holmes, in Arthur Conan Doyle's The Adventure of the Copper Beeches) Whether you're the world's greatest detective trying to crack a case or a person trying to solve a problem at work, you're going to need information. Facts. Data, as Sherlock Holmes says. But not all data is created equal, especially if you plan to analyze as part of... Continue Reading
Choosing the right type of subgroup in a control chart is crucial. In a rational subgroup, the variability within a subgroup should reflect only common causes: the random, short-term variability that represents “normal,” “typical,” natural process variation. Differences between subgroups are then useful for detecting drifts in variability over time (due to “special” or “assignable” causes). Variation within... Continue Reading

Earlier, I wrote about the different types of data statisticians typically encounter. In this post, we're going to look at why, when given a choice in the matter, we prefer to analyze continuous data rather than categorical/attribute or discrete data.  As a reminder, when we assign something to a group or give it a name, we have created attribute or categorical data.  If we count something, like... Continue Reading
You run a capability analysis and your Cpk is bad. Now what? First, let’s start by defining what “bad” is. In simple terms, the smaller the Cpk, the more defects you have. So the larger your Cpk is, the better. Many practitioners use a Cpk of 1.33 as the gold standard, so we’ll treat that as the gold standard here, too. Suppose we collect some data and run a capability analysis using Minitab Statisti... Continue Reading
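For readers who want to see the arithmetic behind that number, here is a minimal Python sketch of the usual Cpk formula: the distance from the process mean to the nearer specification limit, divided by three standard deviations. The function name, the sample measurements, and the spec limits are hypothetical. The sketch also uses the overall sample standard deviation as a simplification, whereas capability software typically estimates within-subgroup variation for Cpk, so treat it as an illustration and not a reproduction of Minitab's calculation.

```python
import statistics


def cpk(values, lsl, usl):
    """Approximate Cpk from raw measurements (hypothetical helper).

    Assumption: uses the overall sample standard deviation, which is a
    simplification of the within-subgroup estimate used in practice.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    upper = (usl - mean) / (3 * sd)  # room between the mean and the upper limit
    lower = (mean - lsl) / (3 * sd)  # room between the mean and the lower limit
    return min(upper, lower)         # the nearer limit decides the index


# Made-up measurements and spec limits, for illustration only
measurements = [9.8, 10.0, 10.2, 10.1, 9.9]
value = cpk(measurements, lsl=9.4, usl=10.6)
print(f"Cpk = {value:.2f}, meets 1.33 benchmark: {value >= 1.33}")
```

With these made-up numbers the index falls a little short of the 1.33 benchmark mentioned above, which is the "now what?" situation the post goes on to discuss.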
In Part 1 of Gauging Gage, I looked at how adequate a sampling of 10 parts is for a Gage R&R Study and provided some advice based on the results. Now I want to turn my attention to the other two factors in the standard Gage experiment: 3 operators and 2 replicates.  Specifically, what if instead of increasing the number of parts in the experiment (my previous post demonstrated you would need... Continue Reading
"You take 10 parts and have 3 operators measure each 2 times." This standard approach to a Gage R&R experiment is so common, so accepted, so ubiquitous that few people ever question whether it is effective.  Obviously one could look at whether 3 is an adequate number of operators or 2 an adequate number of replicates, but in this first of a series of posts about "Gauging Gage," I want to look at... Continue Reading
Everyone who analyzes data regularly has the experience of getting a worksheet that just isn't ready to use. Previously I wrote about tools you can use to clean up and eliminate clutter in your data and reorganize your data.  In this post, I'm going to highlight tools that help you get the most out of messy data by altering its characteristics. Know Your Options Many problems with data don't become... Continue Reading
You've collected a bunch of data. It wasn't easy, but you did it. Yep, there it is, right there...just look at all those numbers, right there in neat columns and rows. Congratulations. I hate to ask...but what are you going to do with your data? If you're not sure precisely what to do with the data you've got, graphing it is a great way to get some valuable insight and direction. And a good graph to... Continue Reading
In my last post, I wrote about making a cluttered data set easier to work with by removing unneeded columns entirely, and by displaying just those columns you want to work with now. But too much unneeded data isn't always the problem. What can you do when someone gives you data that isn't organized the way you need it to be?   That happens for a variety of reasons, but most often it's because the... Continue Reading
Isn't it great when you get a set of data and it's perfectly organized and ready for you to analyze? I love it when the people who collect the data take special care to make sure to format it consistently, arrange it correctly, and eliminate the junk, clutter, and useless information I don't need.   You've never received a data set in such perfect condition, you say? Yeah, me neither. But I can... Continue Reading
In its industry guidance to companies that manufacture drugs and biological products for people and animals, the Food and Drug Administration (FDA) recommends three stages for process validation: Process Design, Process Qualification, and Continued Process Verification. In this post, we will focus on that third stage. Stage 3: Continued Process Verification Per the FDA guidelines, the goal of... Continue Reading
Like many, my introduction to 17th-century French philosophy came at the tender age of 3+. For that is when I discovered the Etch-a-Sketch®, an entertaining ode to Descartes' coordinate plane. Little did I know that the seemingly idle hours I spent doodling on my Etch-a-Sketch would prove to be excellent training for the feat that I attempt today: plotting an Empirical Cumulative Distribution... Continue Reading
My colleague Cody Steele wrote a post that illustrated how the same set of data can appear to support two contradictory positions. He showed how changing the scale of a graph that displays mean and median household income over time drastically alters the way it can be interpreted, even though there's no change in the data being presented. When we analyze data, we need to present the results in... Continue Reading
To make objective decisions about the processes that are critical to your organization, you often need to examine categorical data. You may know how to use a t-test or ANOVA when you’re comparing measurement data (like weight, length, revenue, and so on), but do you know how to compare attribute or counts data? It’s easy to do with statistical software like Minitab.  One person may look at this bar... Continue Reading
by Rehman Khan, guest blogger There are many articles giving Minitab tips already, so to be different I have done mine in the style of my books, which use example-based learning. All ten tips are shown using a single example. If you don’t already know these 10 tips you will get much more benefit if you work along with the example. You don’t need to download any files to work along—although, if you... Continue Reading
In its industry guidance to companies that manufacture drugs and biological products for people and animals, the Food and Drug Administration (FDA) recommends three stages for process validation. While my last post covered statistical tools for the Process Design stage, here we will focus on the statistical techniques typically utilized for the second stage, Process Qualification. Stage 2: Process... Continue Reading
In the first part of this series, we saw how conflicting opinions about a subjective factor can create business problems. In part 2, we used Minitab's Assistant feature to set up an attribute agreement analysis study that will provide a better understanding of where and when such disagreements occur.  We asked four loan application reviewers to reject or approve 30  selected applications, two... Continue Reading
'Twas the season for toys recently, and Christmas day found me playing around with a classic, the Etch-a-Sketch. As I noodled with the knobs, I had a sudden flash of recognition: my drawing reminded me of the Empirical CDF Plot in Minitab Statistical Software. Did you just ask, "What's a CDF plot? And what's so empirical about it?" Both very good questions. Let's start with the first, and we'll... Continue Reading
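As a rough illustration of what an empirical CDF is, the sketch below sorts a sample and pairs each value with the fraction of observations at or below it. Plotted as steps, those pairs give the staircase shape that resembles an Etch-a-Sketch drawing. The function name and the sample values are hypothetical, and this is a plain-Python sketch of the general idea, not Minitab's implementation.

```python
def ecdf(values):
    """Return (value, cumulative proportion) pairs for a sample.

    Hypothetical helper: each sorted value is paired with the share of
    observations that are less than or equal to it.
    """
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Made-up sample, for illustration only
for x, p in ecdf([3, 1, 2, 5]):
    print(f"{x}: {p:.2f}")
```

Each step rises by 1/n at every observed value, so the curve always ends at 1.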
Previously, I discussed how business problems arise when people have conflicting opinions about a subjective factor, such as whether something is the right color or not, or whether a job applicant is qualified for a position. The key to resolving such honest disagreements and handling future decisions more consistently is a statistical tool called attribute agreement analysis. In this post, we'll... Continue Reading
While there are many graph options available in Minitab’s Graph menu, there is no direct option to generate a waterfall chart. This type of graph helps visualize the cumulative effect of sequentially introducing positive or negative values. In this post, I’ll show you the steps to follow to make Minitab display a waterfall chart even without a "waterfall chart" tool. If you don’t already have... Continue Reading
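The "cumulative effect" idea can be sketched in a few lines of Python: each bar in a waterfall chart starts where the previous bar ended, so the bar positions are just running totals of the changes. The function name and the example figures are hypothetical, and the sketch only computes bar positions; it does not reproduce the Minitab steps the post describes.

```python
from itertools import accumulate


def waterfall_bars(changes):
    """Return (start, end) positions for each bar in a waterfall chart.

    Hypothetical helper: the first bar starts at 0 and every later bar
    starts at the running total of the changes before it.
    """
    ends = list(accumulate(changes))  # running total after each change
    starts = [0] + ends[:-1]          # each bar begins at the prior total
    return list(zip(starts, ends))


# Made-up changes: a starting value, a decrease, then an increase
print(waterfall_bars([100, -30, 20]))
```

A bar whose end is lower than its start represents a negative contribution, which is what gives the chart its stepping-down and stepping-up look.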