Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?

One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.

However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.

In this post, I not only want to introduce you to Welch’s ANOVA, but also highlight some interesting research that we perform here at Minitab that guides the implementation of features in our statistical software.
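
To make the comparison between the two tests concrete, here is a minimal Python sketch that computes the F statistic for both on the same data. It assumes the textbook formulas for Fisher's and Welch's one-way ANOVA. The function names and the sample data are hypothetical, and this is not a description of how Minitab implements either test. It uses only the standard library, so it stops at the F statistic and degrees of freedom rather than a p-value.

```python
from statistics import mean, variance


def classic_anova_f(groups):
    """Fisher's one-way ANOVA. Returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def welch_anova_f(groups):
    """Welch's one-way ANOVA. Returns (F, df_between, df_within).

    Each group is weighted by n / s^2, so no pooled variance is used.
    Every group needs at least two values and a nonzero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df_between = k - 1
    df_within = (k ** 2 - 1) / (3 * lam)  # usually not an integer
    return numerator / denominator, df_between, df_within


# Hypothetical data: three groups with visibly different spreads
groups = [
    [10.1, 9.8, 10.3, 10.0, 9.9],
    [11.5, 8.2, 13.0, 9.1, 12.4],
    [10.9, 11.2, 11.0, 11.1, 10.8],
]
print("Classic:", classic_anova_f(groups))
print("Welch:  ", welch_anova_f(groups))
```

The classic test pools the within-group variances into a single estimate, while Welch's version lets each group keep its own variance, which is why its denominator degrees of freedom depend on the data.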

One-Way ANOVA Assumptions

Like any...

The Best European Football League: What the CTQ’s and Minitab Can Tell Us

by Laerte de Araujo Lima, guest blogger

In a previous post (How Data Analysis Can Help Us Predict This Year's Champions League), I shared how I used Minitab Statistical Software to predict the 2013-2014 season of the UEFA Champions League. This involved a regression analysis of the main critical-to-quality (CTQ) factors, which I identified using the “voice of the customer” suggestions of some friends.

Since that post was published, my friends have stopped discussing the UEFA Champions League; they were convinced by the results I shared.

But now they’ve challenged me to use Six Sigma tools to...

Introducing the Bubble Plot

When you're evaluating a dataset, graphical analysis can be very important. While an analysis like a regression or ANOVA can be backed up by numbers, being able to visualize how your dataset is behaving can be even more convincing than a group of p-values—especially to those who aren’t trained in statistics.

For example, let’s look at a few variables we think may be correlated. In this specific example, we will take the Unemployment Rate and the Crime Rate for each state in the U.S. We have 3 columns of data in Minitab: C1, which contains the State Name; C2, which contains the Crime Rate; and...

How to Handle Extreme Outliers in Capability Analysis

Transformations and non-normal distributions are typically the first approaches considered when the Normality test fails in a capability analysis. These approaches do not work when there are extreme outliers because they both assume the data come from a single common-cause variation distribution. But because extreme outliers typically represent special-cause variation, transformations and non-normal distributions are not good approaches for data that contain extreme outliers.

As an example, the four graphs below show distribution fits for a dataset with 99 values simulated from a...

(We Just Got Rid of) Three Reasons to Fear Data Analysis

Today our company is introducing Minitab 17 Statistical Software, the newest version of the leading software used for quality improvement and statistics education. So, why should you care? Because important people in your life -- your co-workers, your students, your kids, your boss, maybe even you -- are afraid to analyze data. There's no shame in that. In fact, there are pretty good reasons for people to feel some trepidation (or even outright panic) at the prospect of making sense of a set of data.

I know how it feels to be intimidated by statistics. Not long ago, I would do almost...

Fix Problems in Regression Analysis with Partial Least Squares

Face it, you love regression analysis as much as I do. Regression is one of the most satisfying analyses in Minitab: get some predictors that should have a relationship to a response, go through a model selection process, interpret fit statistics like adjusted R² and predicted R², and make predictions. Yes, regression really is quite wonderful.

Except when it’s not. Dark, seedy corners of the data world exist, lying in wait to make regression confusing or impossible. Good old ordinary least squares regression, to be specific.

For instance, sometimes you have a lot of detail in your data, but not...

Understanding ANOVA by Looking at Your Household Budget

by Arun Kumar, guest blogger

One of the most commonly used statistical methods is ANOVA, short for “Analysis of Variance.” Whether you’re analysing data for Six Sigma-style quality improvement projects, or perhaps just taking your first statistics course, a good understanding of how this technique works is important.

A lot of concepts are involved in any analysis using ANOVA and its subsequent interpretation. You’re going to have to grapple with terms such as Sources of Variation, Sum of Squares, Mean Squares, Degrees of Freedom, and F-ratio—and you’ll need to understand what statistical...

Making a Difference in How People Use Data

A colleague of mine at Minitab, Cheryl Pammer, was recently featured in "A Statistician's Journey," a monthly feature that appears in the print and online versions of the American Statistical Association's AMSTAT News magazine.  

Each month, the magazine asks ASA members to talk about the paths they took to get to where they are today. Cheryl is a "user experience designer" at Minitab. In other words, she's one of the people who help determine how our statistical software does what it does, and who try to make it as helpful, useful, and beneficial as possible. Cheryl is always looking for ways to...

Use Analysis of Means to Classify Baseball Parks

When I first got interested in looking at baseball park factors, I only wanted to know which parks benefited hitters and which benefited pitchers. Once I got started, I became interested in the difference between ESPN's published formula and its results, and in whether there were obvious reasons for the variation in park factors from year to year.

But today I’m returning to the original question: which parks are hitters’ parks, and which are pitchers’ parks?

We already know that the mean and median are inadequate by themselves. For example, consider AT&T Park, where the mean suggests a pitchers’...

Using Multi-Vari Charts to Analyze Families of Variations

When trying to solve complex problems, you should first list all the suspected variables, then identify the few critical factors and separate them from the trivial many, which are not essential to understanding the cause.

Many statistical tools, such as ANOVA, regression, and designed experiments (DOEs), enable you to efficiently identify the statistically significant effects and converge on the root cause of a problem. In this post, though, I am going to focus on a very simple graphical tool, one that is very intuitive, can be used by virtually anyone, and does not...

Coach Bill Belichick: A Statistical "Hoodie" Analysis, Part 2

by Bob Yoon, guest blogger

Yesterday's post shared how an analysis of Bill Belichick's hoodie-wearing patterns found no statistically significant difference in New England Patriots wins whether he wore sleeved or sleeveless hoodies, or whether the hoodie was from Reebok or Nike.

Since these hypothesis tests failed to reject the null hypothesis, I combined these factors under “grey hoodie” and started a new Minitab worksheet.

But when I took a look at all the different outfits Belichick wore, there were still too many variables for a good analysis. I then decided to split this category into two: Type and...

Using Design of Experiments to Minimize Noise Effects

All processes are affected by various sources of variation over time. Products that are designed around optimal settings will, in reality, tend to drift away from those ideal settings during the manufacturing process.

Environmental fluctuations and process variability often cause major quality problems. Focusing only on cost and performance is not enough. Sensitivity to deterioration and process imperfections is an important issue. It is often not possible to completely eliminate variation due to uncontrollable factors (such as temperature changes, contamination, humidity, and dust).


Regression Analysis: How Do I Interpret R-squared and Assess the Goodness-of-Fit?

After you have fit a linear model using regression analysis, ANOVA, or design of experiments (DOE), you need to determine how well the model fits the data. To help you out, Minitab statistical software presents a variety of goodness-of-fit statistics. In this post, we’ll explore the R-squared (R²) statistic and some of its limitations, and uncover some surprises along the way. For instance, low R-squared values are not always bad and high R-squared values are not always good!

What Is Goodness-of-Fit for a Linear Model?

Definition: Residual = Observed value - Fitted value
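
As a rough illustration of the definition above, the following Python sketch fits a straight line by ordinary least squares, computes the residuals, and turns them into R-squared. The formula used, R² = 1 − (sum of squared residuals) / (total sum of squares), is the usual one for a least-squares model with an intercept and is an assumption here rather than something quoted from the post. The function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor. Returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(observed, fitted):
    """Residual = observed value - fitted value, one per data point."""
    return [o - f for o, f in zip(observed, fitted)]


def r_squared(observed, fitted):
    """Share of the variation in the observed values explained by the fit."""
    mean_observed = sum(observed) / len(observed)
    ss_error = sum(r ** 2 for r in residuals(observed, fitted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_error / ss_total


# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
print("Residuals:", residuals(y, fitted))
print("R-squared:", r_squared(y, fitted))
```

Small residuals relative to the overall spread of the response push R-squared toward 1, while residuals as large as that spread push it toward 0.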

Linear regression...

Normal: The Kevin Bacon of Distributions

When you learned statistics, most of what you learned was centered around the Normal distribution.  Maybe you became close friends and you later found out his birth name was Gaussian, but either way you probably just call him Normal.

You might know Normal’s a pretty popular guy with plenty of relationships with other distributions. There are some obvious connections, like how e^Normal is Lognormal, but I thought I’d share some less obvious ones.

You probably already know that by subtracting his mean and dividing by his standard deviation you get Standard Normal.
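
The standardization step just described can be sketched in a few lines of Python. The function name, the parameter values (a mean of 100 and a standard deviation of 15), and the simulated sample are all hypothetical, chosen only for illustration.

```python
import random
from statistics import mean, stdev


def standardize(values, mu, sigma):
    """Subtract the mean and divide by the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [(v - mu) / sigma for v in values]


# Simulate draws from a Normal with mean 100 and standard deviation 15,
# then standardize them with those same parameters.
random.seed(1)
draws = [random.gauss(100, 15) for _ in range(10_000)]
z_scores = standardize(draws, mu=100, sigma=15)

# The standardized sample should have a mean near 0 and a spread near 1.
print("mean of z:", mean(z_scores))
print("stdev of z:", stdev(z_scores))
```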

What if you squared Standard...

Lean Six Sigma in Healthcare: Improving Patient Satisfaction

For providers like Riverview Hospital Association, serving Wisconsin Rapids, Wis. and surrounding areas, recent changes in the U.S. healthcare system have placed more emphasis on improving the quality of care and increasing patient satisfaction. “In this era of healthcare reform, it is even more essential for providers to have a systematic method to improve the way care is delivered,” says Christopher Spranger, director of Lean Six Sigma and Quality Improvement at Riverview Hospital Association. “We have had a Lean Six Sigma program in place for four years, and we are continuously working on...

What Statistical Software Should You Choose: Three More Critical Questions

Earlier I wrote about four important questions you should ask if you're looking at using statistical software to analyze data in your organization, especially if you're hoping to improve quality using methods like Six Sigma. But there are other points to consider as well. If you're in the market for statistical software, be sure to investigate these questions, too!

What Types of Statistical Analysis Will They Be Doing? 

The specific types of analysis you need to do could play a big part in determining the right statistical software for your organization. The American Statistical Association's softwa...

Real-life Data Analysis: How Many Licks to the Tootsie Roll Center of a Tootsie Pop?

by Cory Heid, guest blogger

Almost all of us have tried a Tootsie Pop at some point. I’m willing to bet that most of us also thought, “I wonder how many licks it does take to get to the center of the Tootsie Pop?” If you haven’t wondered about this, here’s the classic commercial that may get you more curious:

Personally, I was not very satisfied with the owl's answer of “3,” so I decided to continue the little boy’s quest to find the number of licks required to reach the center of a Tootsie Pop.


Looking around the ‘net, I found that other studies done by student researchers at various...

Evaluating a Gage Study With One Part

Recently, Minitab News featured an article that talked about how to perform a Gage R&R Study with only one part. This prompted many users to contact our technical support team with questions about next steps, like these:

  • What can I do with the output of a Gage study with only one part? 
  • How can I use the variance component estimates to obtain meaningful information about my measurement system?

By themselves, the variance component estimates from the ANOVA output for a Gage study with just one part are not particularly useful. However, if we combine what we’ve learned about the variance for...

Forget Statistical Assumptions - Just Check the Requirements!

One of the most poorly understood concepts in the use of statistics is the idea of assumptions. You've probably encountered many of these assumptions, such as "data normality is an assumption of the 1-sample t-test."  But if you read that statement and believe normality is a requirement of the 1-sample t-test, then you have missed a subtle and important characteristic of assumptions and need to read on...

An "assumption" is not necessarily a "requirement"!

To understand where this idea of assumptions comes from, let's forget about statistics for a minute and imagine we sell bikes online. We...

Gummi Bear DOE: What Do the Blocks Mean?

Last time I used design of experiments to look at the gummi bear data, I interpreted the center point data. The data say that I won’t need any square or cubic terms to get a good fit to the data. Traditionally, the next effect to look at in design of experiments is the block effect.

I was worried that there would be a wearout effect acting on my catapult, so I changed popsicle sticks and rubber bands periodically. I also simply didn’t have time to collect all of my data at the same time, so the blocks represent different days. Moreover, I collected the data for the third block in a...