Stats

Blog posts and articles about statistics principles and how they apply to quality improvement methods like Lean and Six Sigma.

Histograms are one of the most common graphs used to display numeric data. Anyone who takes a statistics course is likely to learn about the histogram, and for good reason: histograms are easy to understand and can instantly tell you a lot about your data. Here are three of the most important things you can learn by looking at a histogram.  Shape—Mirror, Mirror, On the Wall… If the left side of a... Continue Reading
by Matthew Barsalou, guest blogger.  The old saying “if it walks like a duck, quacks like a duck and looks like a duck, then it must be a duck” may be appropriate in bird watching; however, the same idea can’t be applied when observing a statistical distribution. The dedicated ornithologist is often armed with binoculars and a field guide to the local birds and this should be sufficient. A... Continue Reading

Have you ever wanted to know the odds of something happening, or not happening? It's the kind of question that students are frequently asked to calculate by hand in introductory statistics classes, and going through that exercise is a good way to become familiar with the mathematical formulas that underlie probability (and hence, all of statistics). But let's be honest: when class is over, most... Continue Reading
In its industry guidance to companies that manufacture drugs and biological products for people and animals, the Food and Drug Administration (FDA) recommends three stages for process validation. While my last post covered statistical tools for the Process Design stage, here we will focus on the statistical techniques typically utilized for the second stage, Process Qualification. Stage 2: Process... Continue Reading
Have you ever wished your control charts were better?  More effective and user-friendly?  Easier to understand and act on?  In this post, I'll share some simple ways to make SPC monitoring more effective in Minitab. Common Problems with SPC Control Charts I worked for several years in a large manufacturing plant in which control charts played a very important role. Virtually thousands of SPC... Continue Reading
'Twas the season for toys recently, and Christmas day found me playing around with a classic, the Etch-a-Sketch. As I noodled with the knobs, I had a sudden flash of recognition: my drawing reminded me of the Empirical CDF Plot in Minitab Statistical Software. Did you just ask, "What's a CDF plot? And what's so empirical about it?" Both very good questions. Let's start with the first, and we'll... Continue Reading
In my last post on DMAIC tools for the Define phase, we reviewed various graphs and stats typically used to define project goals and customer deliverables. Let’s now move along to the tools you can use in Minitab Statistical Software to conduct the Measure phase. Measure Phase Methodology The goal of this phase is to measure the process to determine its current performance and quantify the problem.... Continue Reading
When you’re working in Minitab and prepping your data for analysis, it’s common to group data into categories that imply a specific order, such as Low, Medium, High or Beginning, Middle, End. But if the data were to appear in a different order in tables and graphs (for example, Beginning, End, Middle), the result could be confusing, and might distract from your message. Fortunately, with Minitab’s va... Continue Reading
If you’re familiar with Lean Six Sigma, then you’re familiar with DMAIC. DMAIC is the acronym for Define, Measure, Analyze, Improve and Control. This proven problem-solving strategy provides a structured 5-phase framework to follow when working on an improvement project. This is the first post in a five-part series that focuses on the tools available in Minitab Statistical Software that are most... Continue Reading
Dear Readers, As 2016 comes to a close, it’s time to reflect on the passage of time and changes. As I’m sure you’ve guessed, I love statistics and analyzing data! I also love talking and writing about it. In fact, I’ve been writing statistical blog posts for over five years, and it’s been an absolute blast. John Tukey, the renowned statistician, once said, “The best thing about being a statistician... Continue Reading
In Part 1 of this blog series, I wrote about how statistical inference uses data from a sample of individuals to reach conclusions about the whole population. That’s a very powerful tool, but you must check your assumptions when you make statistical inferences. Violating any of these assumptions can result in false positives or false negatives, thus invalidating your results.  The common data... Continue Reading
Statistical inference uses data from a sample of individuals to reach conclusions about the whole population. It’s a very powerful tool. But as the saying goes, “With great power comes great responsibility!” When attempting to make inferences from sample data, you must check your assumptions. Violating any of these assumptions can result in false positives or false negatives, thus invalidating... Continue Reading
Since the release of Minitab Express in 2014, we’ve often received questions in technical support about the differences between Express and Minitab 17.  In this post, I’ll attempt to provide a comparison between these two Minitab products. What Is Minitab 17? Minitab 17 is an all-in-one graphical and statistical analysis package that includes basic analysis tools such as hypothesis testing,... Continue Reading
The ultimate goal of most quality improvement projects is clear: reducing the number of defects, improving a response, or making a change that benefits your customers. We often want to jump right in and start gathering and analyzing data so we can solve the problems. Checking your measurement systems first, with methods like attribute agreement analysis or Gage R&R, may seem like a needless waste... Continue Reading
We’ve got a plethora of case studies showing how businesses from different industries solve problems and implement solutions with data analysis. Take a look for ideas about how you can use data analysis to ensure excellence at your business! Boston Scientific, one of the world’s leading developers of medical devices, is just one organization that has shared its story. A team at their Heredia,... Continue Reading
Data mining uses algorithms to explore correlations in data sets. An automated procedure sorts through large numbers of variables and includes them in the model based on statistical significance alone. No thought is given to whether the variables and the signs and magnitudes of their coefficients make theoretical sense. We tend to think of data mining in the context of big data, with its huge... Continue Reading
In regression, "sums of squares" are used to represent variation. In this post, we’ll use some sample data to walk through these calculations. The sample data used in this post is available within Minitab by choosing Help > Sample Data, or File > Open Worksheet > Look in Minitab Sample Data folder (depending on your version of Minitab).  The dataset is called ResearcherSalary.MTW, and contains data... Continue Reading
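As a rough illustration of how sums of squares represent variation, here is a minimal Python sketch for a simple one-predictor least-squares fit. The data values and the function name `sums_of_squares` are made up for this example; they are not taken from ResearcherSalary.MTW or from Minitab's output.

```python
# Hypothetical data for illustration only (NOT the ResearcherSalary.MTW sample data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def sums_of_squares(x, y):
    """Return (SS total, SS regression, SS error) for a simple linear fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares slope and intercept for one predictor.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    # Total variation of y around its mean.
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    # Variation explained by the fitted line.
    ss_regression = sum((fi - y_bar) ** 2 for fi in fitted)
    # Variation left over in the residuals.
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_total, ss_regression, ss_error


ss_total, ss_regression, ss_error = sums_of_squares(x, y)
r_squared = ss_regression / ss_total
print(f"SS total={ss_total:.4f}  SS regression={ss_regression:.4f}  SS error={ss_error:.4f}")
print(f"R-squared={r_squared:.4f}")
```

For a least-squares fit with an intercept, the total sum of squares splits into the regression and error parts, which is why the ratio of regression to total can be read as the share of variation explained.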
See if this sounds fair to you. I flip a coin. Heads: You win \$1. Tails: You pay me \$1. You may not like games of chance, but you have to admit it seems like a fair game. At least, assuming the coin is a normal, balanced coin, and assuming I’m not a sleight-of-hand magician who can control the coin. How about this next game? You pay me \$2 to play. I flip a coin over and over until it comes up heads. Your... Continue Reading
Figures lie, so they say, and liars figure. A recent post at Ben Orlin's always-amusing mathwithbaddrawings.com blog nicely encapsulates why so many people feel wary about anything related to statistics and data analysis. Do take a moment to check it out; it's a fast read. In all of the scenarios Orlin offers in his post, the statistical statements are completely accurate, but the person offering... Continue Reading
Often, when we start analyzing new data, one of the very first things we look at is whether certain pairs of variables are correlated. Correlation can tell you whether two variables have a linear relationship, and how strong that relationship is. This makes sense as a starting point, since we're usually looking for relationships and correlation is an easy way to get a quick handle on the data set we're... Continue Reading
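To make the idea of correlation as a measure of linear relationship concrete, here is a small Python sketch of the Pearson correlation coefficient. The function name `pearson_r` and the data are hypothetical, chosen only for illustration; this is not how Minitab computes or reports correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfectly linear pair and a perfectly curved (quadratic) pair.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [2 * xi + 1 for xi in x]
curved = [xi ** 2 for xi in x]

print("linear pair:", pearson_r(x, linear))
print("curved pair:", pearson_r(x, curved))
```

The curved pair is included because the coefficient only captures linear association: a strong but symmetric nonlinear relationship can still produce a coefficient near zero, so a plot is worth checking alongside the number.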