
Stats

Blog posts and articles about statistics principles and how they apply to quality improvement methods like Lean and Six Sigma.

Do you recall my “putting the cart before the horse” analogy in part 1 of this blog series? The comparison is simple. We all, at times, put the cart before the horse in relatively innocuous ways, such as eating your dessert before you’ve eaten your dinner, or deciding what to wear before you’ve been invited to the party. But performing some tasks in the wrong order, such as running a statistical... Continue Reading
Once upon a time, when people wanted to compare the standard deviations of two samples, they had two handy tests available, the F-test and Levene's test. Statistical lore has it that the F-test is so named because it so frequently fails you.¹ Although the F-test is suitable for data that are normally distributed, its sensitivity to departures from normality limits when and where it can be used. Leve... Continue Reading

Along with the explosion of interest in visualizing data over the past few years has been an excessive focus on how attractive the graph is at the expense of how useful it is. Don't get me wrong...I believe that a colorful, modern graph comes across better than a black-and-white, pixelated one. Unfortunately, however, all the talk seems to be about the attractiveness and not the value of the... Continue Reading
About a year ago, a reader asked if I could try to explain degrees of freedom in statistics. Since then, I’ve been circling around that request very cautiously, like it’s some kind of wild beast that I’m not sure I can safely wrestle to the ground. Degrees of freedom aren’t easy to explain. They come up in many different contexts in statistics—some advanced and complicated. In mathematics, they're... Continue Reading
Like so many of us, I try to stay healthy by watching my weight. I thought it might be interesting to apply some statistical thinking to the idea of maintaining a healthy weight, and the central limit theorem could provide some particularly useful insights. I’ll start by making some simple (maybe even simplistic) assumptions about calorie intake and expenditure, and see where those lead. And then... Continue Reading
You have a column of categorical data. Maybe it’s a column of reasons for production downtime, or customer survey responses, or all of the reasons airlines give for those riling flight delays. Whatever type of qualitative data you may have, suppose you want to find the most common categories. Here are three different ways to do that. 1. Pareto Charts: Pareto charts easily help you separate the vital... Continue Reading
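As a rough illustration of the counting behind a Pareto chart, the sketch below tallies a column of categories, sorts them from most to least frequent, and adds a cumulative percentage. The function name `pareto_table` and the sample delay reasons are hypothetical and are not taken from the post. This is plain Python, not Minitab's implementation.

```python
from collections import Counter


def pareto_table(categories):
    """Return (category, count, cumulative percent) rows, most frequent first."""
    # most_common() sorts categories by descending count
    counts = Counter(categories).most_common()
    total = sum(n for _, n in counts)
    rows = []
    running = 0
    for category, n in counts:
        running += n
        rows.append((category, n, 100 * running / total))
    return rows


# Hypothetical sample data: reasons given for flight delays
delay_reasons = ["weather"] * 5 + ["crew"] * 3 + ["mechanical"] * 2

for category, n, cumulative in pareto_table(delay_reasons):
    print(f"{category:<12}{n:>3}{cumulative:>8.1f}%")
```

The cumulative column is what lets you read off how few categories account for most of the observations.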
If you need to assess process performance relative to some specification limit(s), then process capability is the tool to use. You collect some accurate data from a stable process, enter those measurements in Minitab, and then choose Stat > Quality Tools > Capability Analysis/Sixpack or Assistant > Capability Analysis. Now, what about sorting the data? I’ve been asked “why does Cpk change when I... Continue Reading
In my time at Minitab, I’ve gotten a good understanding of what types of graphs users create. Everyone knows about histograms, bar charts, and time series plots. Even relatively less familiar plots like the interval plot and individual value plot are still used quite often. However, one of the most underutilized graphs we have available is the area graph. If you’re not familiar with an Area... Continue Reading
In an earlier post, I shared an overview of acceptance sampling, a method that lets you evaluate a sample of items from a larger batch of products (for instance, electronics components you've sourced from a new supplier) and use that sample to decide whether or not you should accept or reject the entire shipment. There are two approaches to acceptance sampling. If you do it by attributes, you... Continue Reading
If you're just getting started in the world of quality improvement, or if you find yourself in a position where you suddenly need to evaluate the quality of incoming or outgoing products from your company, you may have encountered the term "acceptance sampling." It's a statistical method for evaluating the quality of a large batch of materials from a small sample of items, which statistical softwar... Continue Reading
In my last post, I walked through the steps to install Minitab 17 on a Mac using Apple Boot Camp. Minitab 17 can also be installed on a Mac using desktop virtualization software. In addition to your Mac, you’ll need: a copy of a Windows 7 or later ISO, and Minitab 17 Statistical Software. Desktop virtualization software allows you to install and use Windows on your Intel-based Mac without requiring... Continue Reading
While Minitab 17 is currently a Windows-only application, there are people who only have a Mac available for the installation who also find they need to use Minitab 17. It is possible to run Minitab 17 on a Macintosh, though the steps involved in the installation can seem a little daunting at first. In the Technical Support department, we sometimes hear reluctance in people’s voices when we throw... Continue Reading
Not long ago, I couldn’t abide statistics. I did respect it, but in much the same way a gazelle respects a lion. Most of my early experiences with statistics indicated that close encounters resulted in pain, so I avoided further contact whenever possible. So how is it that today I write about statistics? That’s simple: it merely required completely reinventing the way I thought about and approached... Continue Reading
There are many reasons why a distribution might not be normal/Gaussian. A non-normal pattern might be caused by several distributions being mixed together, or by a drift in time, or by one or several outliers, or by an asymmetrical behavior, some out-of-control points, etc. I recently collected the scores of three different teams (the Blue team, the Yellow team and the Pink team) after a laser... Continue Reading
Since it's the Halloween season, I want to share how a classic horror film helped me get a handle on an extremely useful statistical distribution. The film is based on John W. Campbell's classic novella "Who Goes There?", but I first became familiar with it from John Carpenter's 1982 film The Thing. In the film, researchers in the Antarctic encounter a predatory alien with a truly frightening... Continue Reading
Step 3 in our DOE problem solving methodology is to determine how many times to replicate the base experiment plan. The discussion in Part 3 ended with the conclusion that our 4 factors could best be studied using all 16 combinations of the high and low settings for each factor, a full factorial. Each golfer will perform half of the sixteen possible combinations and each golfer’s data could stand as... Continue Reading
I read trade publications that cover everything from banking to biotech, looking for interesting perspectives on data analysis and statistics, especially where it pertains to quality improvement. Recently I read a great blog post from Tony Taylor, an analytical chemist with a background in pharmaceuticals. In it, he discusses the implications of the FDA's updated guidance for industry analytical... Continue Reading
September 17 marked the release of new information from the American Community Survey (ACS) from the U.S. Census Bureau. Here’s a bar chart of what the press releases looked like for that day: Clearly there was a theme in play, one that was great news for major metropolitan areas. The Census Bureau even released a graph showing that the percentage of people within the 25 most populous metropolitan... Continue Reading
You run a capability analysis and your Cpk is bad. Now what? First, let’s start by defining what “bad” is. In simple terms, the smaller the Cpk, the more defects you have. So the larger your Cpk is, the better. Many practitioners use a Cpk of 1.33 as the gold standard, so we’ll treat that as the gold standard here, too. Suppose we collect some data and run a capability analysis using Minitab Statisti... Continue Reading
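To make the "smaller Cpk, more defects" idea concrete, here is a minimal sketch of the usual textbook formula, Cpk = min(USL − mean, mean − LSL) / (3σ). The function name, measurements, and specification limits are hypothetical. As a simplifying assumption, σ is estimated with the overall sample standard deviation; Minitab's Cpk uses a within-subgroup estimate, so its reported value can differ from this one.

```python
from statistics import mean, stdev


def cpk(measurements, lsl, usl):
    """Distance from the mean to the nearer spec limit, in units of 3 sigma.

    Assumption: sigma is the overall sample standard deviation, which is a
    simplification of the within-subgroup estimate used for Cpk in Minitab.
    """
    mu = mean(measurements)
    sigma = stdev(measurements)
    return min(usl - mu, mu - lsl) / (3 * sigma)


# Hypothetical measurements and specification limits
data = [9.8, 10.1, 10.0, 9.9, 10.2]
value = cpk(data, lsl=9.4, usl=10.6)

verdict = "meets" if value >= 1.33 else "falls short of"
print(f"Cpk = {value:.2f}, which {verdict} the 1.33 benchmark")
```

Because the mean sits in the numerator and σ in the denominator, a low Cpk can come from an off-center process, too much variation, or both.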
Statisticians say the darndest things. At least, that's how it can seem if you're not well-versed in statistics. When I began studying statistics, I approached it as a language. I quickly noticed that compared to other disciplines, statistics has some unique problems with terminology, problems that don't affect most scientific and academic specialties. For example, dairy science has a highly... Continue Reading