Angst Over ANOVA Assumptions? Ask the Assistant.

Do you suffer from PAAA (Post-Analysis Assumption Angst)? You’re not alone.

Checking the required assumptions for a statistical analysis is critical. But if you don’t have a Ph.D. in statistics, it can feel more complicated and confusing than the primary analysis itself.

How does the cuckoo egg data, a common sample data set often used to teach analysis of variance, satisfy the following formal assumptions for a classical one-way ANOVA (F-test)?

  • Normality
  • Homoscedasticity
  • Independence

Are My Data (Kinda Sorta) Normal?

To check the normality of each group of data, a common strategy is to display...

A Fun ANOVA: Does Milk Affect the Fluffiness of Pancakes?

by Iván Alfonso, guest blogger

I'm a huge fan of hot cakes—they are my favorite dessert ever. I’ve been cooking them for over 15 years, and over that time I’ve noticed many variations in texture, flavor, and thickness. Personally, I like fluffy pancakes.

There are many brands of hotcake mix on the market, all with very similar formulations. So I decided to investigate which ingredients and inputs may influence the fluffiness of my pancakes.

Potential factors could include the type of mix used, the type of milk used, the use of margarine or butter (of many brands), the amount of mixing time, the...

Using Probability Plots to Understand Laser Games Scores

There is more than just the p value in a probability plot—the overall graphical pattern also provides a great deal of useful information. Probability plots are a powerful tool to better understand your data.

In this post, I intend to present the main principles of probability plots and focus on their visual interpretation using some real data.

In probability plots, the distribution of the data is transformed into a linear plot. To do this, the cumulative distribution function (the so-called CDF, which accumulates all probability below a given threshold) is used (see the graph below). For a normal...

Exponential: How a Poor Memory Helps to Model Failure Data

These days, my memory isn't what it used to be. Besides that, my memory isn't what it used to be. 

But my incurable case of CRS (Can't Remember Stuff) is not nearly as bad as that of the exponential distribution.

When modelling failure data for reliability analysis, the exponential distribution is completely memoryless. It retains no record of the previous failure of an item.

That might sound like a bad thing. But this special characteristic makes the distribution extremely useful for modelling the behavior of items that have a constant failure rate.
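In symbols, memorylessness means P(T > s + t | T > s) = P(T > t): an item that has already survived s units of time has the same chance of lasting t more units as a brand-new item. Here is a minimal Python sketch of that check, using the exponential survival function S(t) = exp(-rate * t). The failure rate of 0.5 and the function names are assumptions chosen only for illustration.

```python
import math

def survival(t, rate):
    """P(T > t) for an exponential lifetime with a constant failure rate."""
    return math.exp(-rate * t)

def conditional_survival(s, t, rate):
    """P(T > s + t | T > s): chance of lasting t more units after surviving s."""
    return survival(s + t, rate) / survival(s, rate)

RATE = 0.5  # hypothetical failure rate, chosen only for illustration

fresh = survival(2.0, RATE)                   # a new item lasting 2 more units
aged = conditional_survival(10.0, 2.0, RATE)  # an item that is already 10 units old

# The two probabilities agree up to floating-point rounding.
print(fresh, aged, math.isclose(fresh, aged))
```

The age of the item cancels out of the ratio, which is exactly what a constant failure rate implies.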

Using the Exponential Distribution to Model...

Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?

One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.

However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.

In this post, I want not only to introduce you to Welch’s ANOVA, but also to highlight some interesting research that we perform here at Minitab that guides the implementation of features in our statistical software.
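To make the contrast concrete, here is a rough Python sketch that computes both test statistics from scratch. It uses the standard textbook formulas, not Minitab's implementation, and the data and function names are made up for illustration. It returns the F statistic and its degrees of freedom only. Turning these into a p-value would need an F distribution, which the standard library does not provide.

```python
from statistics import mean, variance

def classic_f(groups):
    """Fisher's one-way ANOVA F statistic (assumes equal group variances)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2

def welch_f(groups):
    """Welch's F statistic; each group is weighted by n / s^2.

    Requires every group to have a nonzero sample variance.
    """
    k = len(groups)
    weights = [len(g) / variance(g) for g in groups]
    w_total = sum(weights)
    weighted_mean = sum(w * mean(g) for w, g in zip(weights, groups)) / w_total
    numerator = sum(w * (mean(g) - weighted_mean) ** 2
                    for w, g in zip(weights, groups)) / (k - 1)
    lam = sum((1 - w / w_total) ** 2 / (len(g) - 1)
              for w, g in zip(weights, groups))
    denominator = 1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    # Returns F, numerator df, and the (usually fractional) denominator df.
    return numerator / denominator, k - 1, (k ** 2 - 1) / (3 * lam)

# Made-up data: three groups with visibly different spreads.
groups = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.5, 12.5, 8.5, 11.5, 9.5],
    [11.0, 15.0, 7.0, 13.0, 9.0, 11.0],
]
print("classic:", classic_f(groups))
print("welch:  ", welch_f(groups))
```

The key difference is the weighting: the classic test pools all groups into a single variance estimate, while Welch's version lets each group keep its own variance and adjusts the denominator degrees of freedom to compensate.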

One-Way ANOVA Assumptions

Like any...

The Best European Football League: What the CTQ’s and Minitab Can Tell Us

by Laerte de Araujo Lima, guest blogger

In a previous post (How Data Analysis Can Help Us Predict This Year's Champions League), I shared how I used Minitab Statistical Software to predict the 2013-2014 season of the UEFA Champions League. This involved the regression analysis of main critical-to-quality (CTQ) factors, which I identified using the “voice of the customer” suggestions of some friends.

Since that post was published, my friends have stopped discussing the UEFA Champions League; they were convinced by the results I shared.

But now they’ve challenged me to use Six Sigma tools to...

I Think I Can, I Know I Can: A High-Level Overview of Process Capability Analysis

Remember "The Little Engine That Could," the children's story about self-confidence in the face of huge challenges? In it, a train engine keeps telling itself "I think I can" while carrying a very heavy load up a big mountain. Next thing you know, the little engine has done it...but until that moment, the outcome was uncertain.

It's a wonderful story for teaching kids about self-confidence. But from a quality and customer service viewpoint, it's a horror story: if your business depends on taking the load up the hill, you want to know you can do it.

That's where capability analysis comes in. 


Creating a Custom Report using Minitab, part 2

Now that you’ve seen how to automatically import data and run analyses in my previous post, let’s create the Monthly Report!

I will be using a Microsoft Word Document (Office 2010) and adding bookmarks to act as placeholders for the Graphs, statistics, and boilerplate conclusions.

Let’s go through the steps to accomplish this:

  • Open up an existing report that you have previously created in Microsoft Word.
  • Highlight a section of the document where you would like to place the created Minitab graph or statistic.
  • Go to the Insert tab, click the Bookmark link, and type in the name of what you will be...

How to Handle Extreme Outliers in Capability Analysis

Transformations and non-normal distributions are typically the first approaches considered when the Normality test fails in a capability analysis. These approaches do not work when there are extreme outliers because they both assume the data come from a single common-cause variation distribution. But because extreme outliers typically represent special-cause variation, transformations and non-normal distributions are not good approaches for data that contain extreme outliers.

As an example, the four graphs below show distribution fits for a dataset with 99 values simulated from a...

Is Your Statistical Software FDA Validated for Medical Devices or Pharmaceuticals?

We're frequently asked whether Minitab has been validated by the U.S. Food and Drug Administration (FDA) for use in the pharmaceutical and medical device industries.

Minitab does extensive testing to validate our software internally, but Minitab’s statistical software is not—and cannot be—FDA-validated out-of-the-box.

Nobody's can.

It is a common misconception that software vendors can go through a certification process to achieve FDA software validation. It's simply not true.

Software vendors who claim their products are FDA-validated should be scrutinized. It is up to the software purchaser to...

Are Atlanta's Winters Getting Colder and Snowier?

Atlanta was a mess on January 28th, 2014. Thousands were trapped on the roads overnight while others managed to get to roadside stores to camp out. Thousands of students were forced to spend the night in their schools, and the National Guard was called in to get them home. Many wondered how less than three inches of snow could cripple the city, particularly when Atlanta had experienced a similar storm in 2011.

This traumatic event, the recollection of recent snow storms, and now the current storm prompted some to wonder whether Atlanta has been experiencing more cold and snow than before. How...

Gauging Gage Part 3: How to Sample Parts

In Parts 1 and 2 of Gauging Gage we looked at the numbers of parts, operators, and replicates used in a Gage R&R Study and how accurately we could estimate %Contribution based on the choice for each. In doing so, I hoped to provide you with valuable and interesting information, but mostly I hoped to make you like me. I mean like me so much that if I told you that you were doing something flat-out wrong and had been for years and probably screwed some things up, you would hear me out and hopefully just revert to being indifferent towards me.

For the third (and maybe final) installment, I...

Using nonparametric analysis to visually manage durations in service processes

My main objective is to encourage greater use of statistical techniques in the service sector and present new ways to implement them.

In a previous blog, I presented an approach you can use to identify process steps that may be improved in the service sector (quartile analysis). In this post I'll show how nonparametric distribution analysis may be implemented in the service sector to analyze durations until a task is completed.

Knowing how much time you need to complete a task may be very useful when assessing process efficiency, and is an important factor in many businesses.

Consider a...

See How Easily You Can Do a Box-Cox Transformation in Regression

For one reason or another, the response variable in a regression analysis might not satisfy one or more of the assumptions of ordinary least squares regression. The residuals might follow a skewed distribution or the residuals might curve as the predictions increase. A common solution when problems arise with the assumptions of ordinary least squares regression is to transform the response variable so that the data do meet the assumptions. Minitab makes the transformation simple by including the Box-Cox button. Try it for yourself and see how easy it is!
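For readers who like to see the arithmetic, here is a small Python sketch of the Box-Cox transformation in its textbook form: (y^lambda - 1) / lambda when lambda is not zero, and ln(y) when lambda is zero. This is an illustration only; a given software package may use a slightly different parameterization. The response values below are hypothetical.

```python
import math

def box_cox(y, lam):
    """Textbook Box-Cox power transformation of one strictly positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(y)  # limiting case as lambda approaches 0
    return (y ** lam - 1) / lam

# Hypothetical right-skewed responses, for illustration only.
response = [1.2, 1.9, 2.4, 3.8, 7.5, 21.0]

# Smaller lambda values pull in the long right tail more strongly.
for lam in (1, 0.5, 0, -0.5):
    transformed = [round(box_cox(y, lam), 3) for y in response]
    print(f"lambda={lam}: {transformed}")
```

In practice, lambda is chosen by the software (typically by maximizing a likelihood) rather than by trial and error as in this loop.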

The government in Queensland,...

Explaining the Central Limit Theorem with Bunnies & Dragons

When I think about the Central Limit Theorem (CLT), bunnies and dragons are just about the last things that come to mind. However, that’s not the case for Shuyi Chiou, whose playful CreatureCast.org animation explains the CLT using both fluffy and fire-breathing creatures.

Per the article that accompanied this video in The New York Times:

“Many real-world observations can be approximated by, and tested against, the same expected pattern: the normal distribution. In this familiar symmetric bell-shaped pattern, most observations are close to average, and there are fewer observations further from...

Normality Tests and Rounding

All measurements are rounded to some degree. In most cases, you would not want to reject normality just because the data are rounded. In fact, the normal distribution would be a quite desirable model for the data if the underlying distribution is normal since it would smooth out the discreteness in the rounded measurements.

Some normality tests reject normality a very high percentage of the time due to rounding, even when the underlying distribution is normal (Anderson-Darling and Kolmogorov-Smirnov), while others seem to ignore the rounding (Ryan-Joiner and chi-square).

As an extreme example of how data that is...

Anderson-Darling, Ryan-Joiner, or Kolmogorov-Smirnov: Which Normality Test Is the Best?

Minitab Statistical Software offers three tests for Normality: Anderson-Darling (AD), Ryan-Joiner (RJ), and Kolmogorov-Smirnov (KS). The AD test is the default, but is it the best test at detecting Non-Normality? Let's compare the ability of each of these normality tests to detect non-normal data under three different scenarios. We'll use simulated data for each, but they reflect common situations you're likely to encounter if you're analyzing data for quality improvement.

Scenario 1 – The manufacturing process produces large outliers from time to time. In this simulation, 29 values are...

A correspondence table for nonparametric and parametric tests

Much of the data that one can collect and analyze follows a normal distribution (the famous bell-shaped curve). In fact, the formulae and calculations used in many analyses simply take it for granted that our data follow this distribution; statisticians call this the "assumption of normality."

For example, our data need to meet the normality assumption before we can accept the results of a one- or two-sample t (Student) or z test. Therefore, it is generally good practice to run a normality test before performing the hypothesis test.

But wait...according to the Central Limit Theorem, when the...

The Gentleman Tasting Coffee: A Variation on Fisher’s Famous Experiment

by Matthew Barsalou, guest blogger

In the 1935 book The Design of Experiments, Ronald A. Fisher used the example of a lady tasting tea to demonstrate basic principles of statistical experiments. In Fisher’s example, a lady made the claim that she could taste whether milk or tea was poured first into her cup, so Fisher did what any good statistician would do—he performed an experiment.

The lady in question was given eight random combinations of cups of tea with either the tea poured first or the milk poured first. She was required to divide the cups into two groups based on whether the milk or...