Statistics Help

Blog posts and articles that offer tips about the statistics used in lean and six sigma quality improvement projects.

In my last post, I discussed what the "Number of Distinct Categories" means in gage R&R output. Another common question with Gage Crossed is what table to look at when assessing your measurement system. By default, Minitab gives a %Contribution table and %Study Variation table. Which one should you use when assessing where the variation is mostly coming from? Well, you could use either of them. ... Continue Reading
They were careless people, Tom and Daisy—they smashed up things and creatures and then retreated back into their money or their vast carelessness, or whatever it was that kept them together, and let other people clean up the mess they had made. — F. Scott Fitzgerald As Nick learns in The Great Gatsby, you can't be careless about your friends or you'll create a big mess. You don't want to be riding... Continue Reading

Riddle: What two tools in Minitab can be used to perform the same analysis on your data? Well, there are probably a few pairs that can be mentioned, but I am going to focus on Discriminant Analysis and Binary Logistic Regression. These tools can be used to predict group membership. If we look at exh_mvar.mtw, located in Minitab’s sample data folder, we have the perfect data set to use. Here is a... Continue Reading
Most people are familiar with the concept of statistics based on exposure to everyday information, such as people polls, election results, weather or sports stats, commercial product comparisons, etc. Someone collects a set of data, does some number-crunching, and produces for us some interesting statistics. These sorts of sample statistics are called ‘descriptive statistics’. Descriptive... Continue Reading
Adhering to the proper assumptions in any statistical analysis is very important. And there seems to be an assumption for everything. For this post, I’d like to clear up some confusion about one particular assumption for assessing normality. A data set is normally distributed when the data itself follows a uni-modal bell-shaped curve that is symmetric about its mean. This graph, created from the... Continue Reading
We humans do have a tendency to succumb to gold rush fever. And this can happen even in the left-brained, rational field of statistics. After we collect our data, it’s difficult to resist the urge to desperately dash for p-values, as if they were 70% off at Macy’s the day after Thanksgiving. But no matter how well-versed you are in statistics, it’s good practice to get into the habit of intuitively... Continue Reading
In an earlier post, I focused on using Minitab to present the coupon data I collected from my e-mail inbox into a bar chart. The bar chart made it easy for me to visually analyze which days of the week are better or worse for receiving the best coupons from my favorite retailers. As a reminder, here’s how I ranked each coupon’s worthiness: Not worth your time; Offering average savings (A “noteworthy”... Continue Reading
Time series plots help us see variations over time by displaying observations on the y-axis against equally spaced time intervals on the x-axis. In Lean Six Sigma, they can show you the before-and-after effects of a process change. I recently used a time series plot to see how the price of gold has changed over time. After reaching record highs recently, gold prices have dipped, although some... Continue Reading
Sometimes, statistical terms can seem like they were zapped down from outer space by sadistic, mealy-mouthed aliens: R-squared adjusted, heteroskedasticity, 3-parameter Weibull distribution. But not all statistics terminology should leave you feeling woozy and glassy-eyed. Some terms actually make intuitive sense. Knowing those terms can help you get a handle on output that may seem fuzzy at... Continue Reading
Careful data analysis is always important, but sometimes we need to quickly get a sense of the relationship between variables or factors. It’s also true that pictures speak louder than raw data – you may have analyzed every last scrap of your data and run every possible test to confirm your analysis, but an effective graph shows people what your data mean in much less time than a collection of... Continue Reading