Statistics

Blog posts and articles about statistical principles in quality improvement methods like Lean and Six Sigma.

As someone who has collected and analyzed real data for a living, the idea of using simulated data for a Monte Carlo simulation sounds a bit odd. How can you improve a real product with simulated data? In this post, I’ll help you understand the methods behind Monte Carlo simulation and walk you through a simulation example using Companion by Minitab. Companion by Minitab is a software platform that... Continue Reading
Choosing the right type of subgroup in a control chart is crucial. In a rational subgroup, the variability within a subgroup should encompass common causes, random, short-term variability and represent “normal,” “typical,” natural process variations, whereas differences between subgroups are useful to detect drifts in variability over time (due to “special” or “assignable” causes). Variation within... Continue Reading

7 Deadly Statistical Sins Even the Experts Make

Do you know how to avoid them?

Have you ever tried to install ventilated shelving in a closet?  You know: the heavy-duty, white- or gray-colored vinyl-coated wire shelving? The one that allows you to get organized, more efficient with space, and is strong and maintenance-free? Yep, that’s the one. Did I mention this stuff is strong?  As in, really hard to cut?  It seems like a simple 4-step project. Measure the closet, go the... Continue Reading
Grocery shopping. For some, it's the most dreaded household activity. For others, it's fun, or perhaps just a “necessary evil.” Personally, I enjoy it! My co-worker, Ginger, a content manager here at Minitab, opened my eyes to something that made me love grocery shopping even more: she shared the data behind her family’s shopping trips. Being something of a data nerd, I really geeked out over the... Continue Reading
If you regularly perform regression analysis, you know that R2 is a statistic used to evaluate the fit of your model. You may even know the standard definition of R2: the percentage of variation in the response that is explained by the model. Fair enough. With Minitab Statistical Software doing all the heavy lifting to calculate your R2 values, that may be all you ever need to know. But if you’re... Continue Reading
In Parts 1 and 2 of Gauging Gage we looked at the numbers of parts, operators, and replicates used in a Gage R&R Study and how accurately we could estimate %Contribution based on the choice for each.  In doing so, I hoped to provide you with valuable and interesting information, but mostly I hoped to make you like me.  I mean like me so much that if I told you that you were doing... Continue Reading
Earlier, I wrote about the different types of data statisticians typically encounter. In this post, we're going to look at why, when given a choice in the matter, we prefer to analyze continuous data rather than categorical/attribute or discrete data.  As a reminder, when we assign something to a group or give it a name, we have created attribute or categorical data.  If we count something, like... Continue Reading
You run a capability analysis and your Cpk is bad. Now what? First, let’s start by defining what “bad” is. In simple terms, the smaller the Cpk, the more defects you have. So the larger your Cpk is, the better. Many practitioners use a Cpk of 1.33 as the gold standard, so we’ll treat that as the gold standard here, too. Suppose we collect some data and run a capability analysis using Minitab Statisti... Continue Reading
by Kevin Clay, guest blogger In transactional or service processes, we often deal with lead-time data, and usually that data does not follow the normal distribution. Consider a Lean Six Sigma project to reduce the lead time required to install an information technology solution at a customer site. It should take no more than 30 days—working 10 hours per day Monday–Friday—to complete, test and... Continue Reading
Everyone who analyzes data regularly has the experience of getting a worksheet that just isn't ready to use. Previously I wrote about tools you can use to clean up and eliminate clutter in your data and reorganize your data.  In this post, I'm going to highlight tools that help you get the most out of messy data by altering its characteristics. Know Your Options Many problems with data don't become... Continue Reading
In Part 1 of this blog series, I compared Six Sigma to a diamond because both are valuable, have many facets and have withstood the test of time. I also explained how the term “Six Sigma” can be used to summarize a variety of concepts, including philosophy, tools, methodology, or metrics. In this post, I’ll explain short/long-term variation and between/within-subgroup variation and how they help... Continue Reading
You've collected a bunch of data. It wasn't easy, but you did it. Yep, there it is, right there...just look at all those numbers, right there in neat columns and rows. Congratulations. I hate to ask...but what are you going to do with your data? If you're not sure precisely what to do with the data you've got, graphing it is a great way to get some valuable insight and direction. And a good graph to... Continue Reading
In my last post, I wrote about making a cluttered data set easier to work with by removing unneeded columns entirely, and by displaying just those columns you want to work with now. But too much unneeded data isn't always the problem. What can you do when someone gives you data that isn't organized the way you need it to be?   That happens for a variety of reasons, but most often it's because the... Continue Reading
B'gosh n' begorrah, it's St. Patrick's Day today! The day that we Americans lay claim to our Irish heritage by doing all sorts of things that Irish people never do. Like dye your hair green. Or tell everyone what percentage Irish you are. Despite my given name, I'm only about 15% Irish. So my Irish portion weighs about 25 pounds. It could be the portion that hangs over my belt due to excess potatoes... Continue Reading
Isn't it great when you get a set of data and it's perfectly organized and ready for you to analyze? I love it when the people who collect the data take special care to make sure to format it consistently, arrange it correctly, and eliminate the junk, clutter, and useless information I don't need.   You've never received a data set in such perfect condition, you say? Yeah, me neither. But I can... Continue Reading
Predictions can be a tricky thing. Consider trying to predict the number rolled by 2 six-sided dice. We know that 7 is the most likely outcome. We know the exact probability each number has of being rolled. If we rolled the dice 100 times, we could calculate the expected value for the number of times each value would be rolled. However, even with all that information, we can't definitively predict... Continue Reading
In its industry guidance to companies that manufacture drugs and biological products for people and animals, the Food and Drug Administration (FDA) recommends three stages for process validation: Process Design, Process Qualification, and Continued Process Verification. In this post, we we will focus on that third stage. Stage 3: Continued Process Verification Per the FDA guidelines, the goal of... Continue Reading
People can make mistakes when they test a hypothesis with statistical analysis. Specifically, they can make either Type I or Type II errors. As you analyze your own data and test hypotheses, understanding the difference between Type I and Type II errors is extremely important, because there's a risk of making each type of error in every analysis, and the amount of risk is in your control.    So if... Continue Reading
Welcome to the Hypothesis Test Casino! The featured game of the house is roulette. But this is no ordinary game of roulette. This is p-value roulette! Here’s how it works: We have two roulette wheels, the Null wheel and the Alternative wheel. Each wheel has 20 slots (instead of the usual 37 or 38). You get to bet on one slot. What happens if the ball lands in the slot you bet on? Well, that depends... Continue Reading
Like many, my introduction to 17th-century French philosophy came at the tender age of 3+. For that is when I discovered the Etch-a-Sketch®, an entertaining ode to Descartes' coordinate plane. Little did I know that the seemingly idle hours I spent doodling on my Etch-a-Sketch would prove to be excellent training for the feat that I attempt today: plotting an Empirical Cumulative Distribution... Continue Reading