Angst Over ANOVA Assumptions? Ask the Assistant.

Do you suffer from PAAA (Post-Analysis Assumption Angst)? You’re not alone.

Checking the required assumptions for a statistical analysis is critical. But if you don’t have a Ph.D. in statistics, it can feel more complicated and confusing than the primary analysis itself.

How does the cuckoo egg data, a common sample data set often used to teach analysis of variance, satisfy the following formal assumptions for a classical one-way ANOVA (F-test)?

  • Normality
  • Homoscedasticity
  • Independence
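
As a rough illustration of the homoscedasticity item above, here is a minimal Python sketch that compares the sample standard deviations of the groups. The measurements are made up for illustration (they are not the cuckoo egg data), and the "largest SD no more than about twice the smallest" cutoff is a common rule of thumb assumed here, not a formal test.

```python
from statistics import stdev

# Hypothetical measurements for three groups (NOT the real cuckoo egg data).
groups = {
    "A": [22.0, 23.9, 20.9, 23.8, 25.0],
    "B": [23.3, 22.1, 24.0, 22.5, 23.0],
    "C": [21.1, 20.0, 22.3, 21.0, 20.5],
}

def sd_ratio(samples):
    """Return the ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in samples.values()]
    return max(sds) / min(sds)

ratio = sd_ratio(groups)
# Rule of thumb (an assumption here): a ratio above about 2 hints at unequal variances.
print(f"SD ratio = {ratio:.2f}; roughly equal spread: {ratio <= 2}")
```

A ratio near 1 is reassuring. A large ratio is a cue to look at a formal equal-variances test or a method that does not assume equal variances.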

Are My Data (Kinda Sorta) Normal?

To check the normality of each group of data, a common strategy is to display...

Where Did All the World Cup Goals Go? Find Out with a 2-Sample Poisson Rate Test

A few weeks ago I looked at the number of goals being scored in the World Cup. At the time, the average was 2.9 goals per game, the highest since 1970. Unfortunately for spectators who enjoyed the higher-scoring games, this did not last.

By the end, the average had fallen to 2.7 goals per game, matching the 1998 World Cup. After such a high-scoring start, scoring fell off and ended up looking much like other recent World Cups.
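
To show what comparing two goals-per-game rates can look like, here is a minimal Python sketch of an exact conditional test for two Poisson rates. It is one standard way to compare such rates, not necessarily the calculation Minitab performs, and the goal counts and game counts below are hypothetical.

```python
from math import comb

def poisson_rate_test(count1, games1, count2, games2):
    """Two-sided exact conditional test that two Poisson rates are equal.

    Given the total count, count1 is Binomial(total, games1 / (games1 + games2))
    under the null hypothesis of equal rates per game.
    """
    total = count1 + count2
    p0 = games1 / (games1 + games2)
    probs = [comb(total, k) * p0**k * (1 - p0) ** (total - k) for k in range(total + 1)]
    observed = probs[count1]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))

# Hypothetical counts: 140 goals in 48 early games vs. 35 goals in 16 later games.
p_value = poisson_rate_test(140, 48, 35, 16)
print(f"p-value = {p_value:.3f}")
```

A small p-value would suggest the two scoring rates really differ. A large one means the drop could plausibly be ordinary game-to-game variation.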

What happened?

Comparing the Group Stage to the Knockout Stage

After 15 straight days of games in the group stage, there...

The Five Coolest Things You Can Do When You Right-click a Graph in Minitab Statistical Software

Minitab graphs are powerful tools for investigating your process further and removing any doubt about the steps you should take to improve it. With that in mind, you’ll want to know every feature of Minitab graphs that can help you share and communicate your results effectively. While many ways to modify your graph are on the Editor menu, some of the best features become available when you right-click your graph.

Here are the five coolest things you can do when you right-click a graph in Minitab Statistical Software.

Send graph to...

Once your graph is ready for your report or presentation,...

Common Statistical Mistakes You Should Avoid

It's all too easy to make mistakes involving statistics. Powerful statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors, but correctly interpreting the results of an analysis can be even more challenging.

No one knows that better than Minitab's technical trainers. All of our trainers are seasoned statisticians with years of quality improvement experience. They spend most of the year traveling around the country (and around the world) to help people learn to make the best use of Minitab software for analyzing data...

Hockey Penalties, Fans Booing, and Independent Trials

We’re in the thick of the Stanley Cup playoffs, which means hockey fans are doing what seems to be every sports fan's favorite hobby...complaining about the refs! While most complaints, such as “We’re not getting any of the close calls!” are subjective and hard to get data for, there's one question that we should be able to answer objectively with a statistical analysis: Are hockey penalties independent trials? That is, does the team that the next penalty will be called on depend on the team that any previous penalties were called on?
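
One simple way to start on that question is to count how often a penalty goes against the same team as the previous one. The Python sketch below does just that. The sequence of calls is hypothetical, and the 50% benchmark assumes both teams are equally likely to be penalized on any given call.

```python
def same_team_rate(calls):
    """Fraction of penalties called on the same team as the previous penalty."""
    pairs = list(zip(calls, calls[1:]))
    if not pairs:
        raise ValueError("need at least two penalties")
    return sum(prev == nxt for prev, nxt in pairs) / len(pairs)

# Hypothetical sequence of penalties in one game: H = home team, A = away team.
calls = list("HAHAAHAHHA")
print(f"Same team as previous call: {same_team_rate(calls):.0%}")
```

If penalties were independent trials, this rate should hover near the benchmark over many games. Deciding whether an observed gap is more than chance takes a formal test on far more data than one game.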

Think of flipping a coin. Even if it comes up heads 10 times...

Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?

One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.

However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.

In this post, I not only want to introduce you to Welch’s ANOVA, but also highlight some interesting research that we perform here at Minitab that guides the implementation of features in our statistical software.
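
For readers who want to see the arithmetic, here is a minimal Python sketch of the Welch's ANOVA test statistic and its degrees of freedom, written from the standard textbook formulas. It is an illustration, not Minitab's implementation, and the sample data are hypothetical.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA on a list of samples."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]  # n_i / s_i^2
    total_w = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total_w  # weighted grand mean
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k**2 - 1))
    return f_stat, k - 1, (k**2 - 1) / (3 * lam)

# Hypothetical samples with visibly different spreads.
samples = [[4.1, 5.0, 4.6, 4.9], [6.2, 9.8, 3.5, 8.1, 7.0], [5.5, 5.7, 5.4, 5.9]]
f_stat, df1, df2 = welch_anova(samples)
print(f"F = {f_stat:.3f} on {df1} and {df2:.1f} degrees of freedom")
```

Each group is weighted by its size divided by its own variance, so noisy groups count for less. Turning F into a p-value needs the F distribution, which Python's standard library does not provide.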

One-Way ANOVA Assumptions

Like any...

Equivalence Testing for Quality Analysis (Part II): What Difference Does the Difference Make?

My previous post examined how an equivalence test can shift the burden of proof when you perform a hypothesis test of the means. This allows you to more rigorously test whether the process mean is equivalent to a target or to another mean.

Here’s another key difference: To perform the analysis, an equivalence test requires that you first define, upfront, the size of a practically important difference between the mean and the target, or between two means.
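
To make the role of that upfront difference concrete, here is a minimal Python sketch of a one-sample equivalence check. It asks whether a confidence interval for (mean minus target) falls entirely inside the chosen limits. It uses a large-sample normal approximation rather than the t distribution, the name `delta` and the data are hypothetical, and it is not Minitab's implementation.

```python
from statistics import NormalDist, mean, stdev

def is_equivalent(sample, target, delta, alpha=0.05):
    """Large-sample check: is the (1 - 2*alpha) CI for mean - target inside (-delta, delta)?"""
    n = len(sample)
    diff = mean(sample) - target
    # Normal approximation (an assumption); small samples call for the t distribution.
    margin = NormalDist().inv_cdf(1 - alpha) * stdev(sample) / n**0.5
    return -delta < diff - margin and diff + margin < delta

# Hypothetical fill weights with a target of 10.0 units.
weights = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95] * 5
print(is_equivalent(weights, target=10.0, delta=0.5))
```

The same data can pass with a generous `delta` and fail with a tight one, which is why the practically important difference has to be settled before the analysis.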

Truth be told, even when performing a standard hypothesis test, you should know the value of this difference. Because you can’t really evaluate...

I Think I Can, I Know I Can: A High-Level Overview of Process Capability Analysis

Remember "The Little Engine That Could," the children's story about self-confidence in the face of huge challenges? In it, a train engine keeps telling itself "I think I can" while carrying a very heavy load up a big mountain. Next thing you know, the little engine has done it...but until that moment, the outcome was uncertain.

It's a wonderful story for teaching kids about self-confidence. But from a quality and customer service viewpoint, it's a horror story: if your business depends on taking the load up the hill, you want to know you can do it.

That's where capability analysis comes in. 


Analyzing College Football Overtimes

Two weeks ago Penn State and Michigan played in a quadruple-overtime thriller that almost went into a 5th overtime. Had Penn State coach Bill O’Brien kicked a field goal in the 4th overtime instead of going for it on 4th and 1, the game would have continued. But the Nittany Lions converted the 4th down (which, by the way, wasn’t a gamble) and went on to score the game-winning touchdown in the 4th overtime.

Watching this game got me asking a bunch of questions. How many college football overtime games go into 4 overtimes? Did Penn State still have home-field advantage since they were playing at...

Using Hypothesis Tests to Bust Myths about the Battle of the Sexes

In my home, we’re huge fans of Mythbusters, the show on Discovery Channel. This fun show mixes science and experiments to prove or disprove various myths, urban legends, and popular beliefs. It’s a great show because it brings the scientific method to life. I’ve written about Mythbusters before to show how, without proper statistical analysis, it’s difficult to know when a result is statistically significant. How much data do you need to collect and how large does the difference need to be?

For this blog, let's look at a more recent Mythbusters episode, “Battle of the Sexes – Round Two.” I...

When Should NHL Goalies Get Pulled?

Even the best NHL goalies can get pulled several times each season. Do they really have cold streaks, or is a drop in save percentage on a given day part of normal random variation?

My colleague Doug Gorman and I decided to find out using our favorite statistical software package.  

Control Charts for Coaching Decisions

We used a control chart approach to determine if coaching decisions to pull goalies are supported by sound statistical rules, or if they seem to be more emotional reactions.
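
As an illustration of the idea, here is a minimal Python sketch that computes 3-sigma lower control limits for game-by-game save percentage, using the standard P chart formula. Treating save percentage as a P chart is an assumption made here, and the game log is hypothetical.

```python
from math import sqrt

def lower_control_limits(saves, shots):
    """Return (overall save proportion, per-game 3-sigma lower control limits)."""
    p_bar = sum(saves) / sum(shots)
    # The limit varies by game because each game has a different number of shots.
    limits = [max(0.0, p_bar - 3 * sqrt(p_bar * (1 - p_bar) / n)) for n in shots]
    return p_bar, limits

# Hypothetical game log for one goalie: saves made and shots faced per game.
saves = [27, 31, 18, 25, 30]
shots = [30, 33, 24, 27, 32]

p_bar, limits = lower_control_limits(saves, shots)
for made, faced, lcl in zip(saves, shots, limits):
    flag = "below limit" if made / faced < lcl else "ok"
    print(f"{made}/{faced} = {made / faced:.3f}  LCL = {lcl:.3f}  {flag}")
```

A game that falls below its limit is unusual relative to the goalie's own baseline. A game above the limit is consistent with normal random variation.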

We generated 3-sigma lower control limits for each of 10 goalies based on their 2011-2012 game-by-game save...

Getting Started with Factorial Design of Experiments (DOE)

When I talk to quality professionals about how they use statistics, one tool they mention again and again is design of experiments, or DOE. I'd never even heard the term before I started getting involved in quality improvement efforts, but now that I've learned how it works, I wonder why I didn't learn about it sooner. If you need to find out how several factors are affecting a process outcome, DOE is the way to go. 
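
To give a feel for what a factorial design looks like, here is a minimal Python sketch that lists every run of a two-level full factorial design. The factor names and levels are hypothetical, and a real experiment would also randomize the run order.

```python
from itertools import product

# Hypothetical factors, each with two levels.
factors = {
    "temperature": [150, 180],
    "time": [10, 20],
    "pressure": ["low", "high"],
}

def full_factorial(factor_levels):
    """Return one run (dict of factor -> level) per combination of levels."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]

runs = full_factorial(factors)
for i, run in enumerate(runs, start=1):
    print(i, run)
```

Three factors at two levels each give 2 x 2 x 2 = 8 runs, and every factor changes across the runs rather than one at a time.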

Somewhere in school you probably learned, like I did, that when you do an experiment you need to hold all the factors constant except for the one you're studying. That seems simple...

Using Minitab to Choose the Best Ranking System in College Basketball

Life is full of choices. Some are simple, such as what shirt to put on in the morning (although if you’re like me, it’s not so much of a “choice” as it is throwing on the first thing you grab out of the closet). And some choices are more complex. In the quality world, you might have to determine which distribution to choose for your capability analysis or which factor levels to use to bake the best cookie in a design of experiments. But all of these choices pale in comparison* to the most important decision you have to make each year: which college basketball teams to pick during March...

Will the Weibull Distribution Be on the Demonstration Test?

Over on the Indium Corporation's blog, Dr. Ron Lasky has been sharing some interesting ideas about using the Weibull distribution in electronics manufacturing. For instance, check out this discussion of how dramatically an early first-failure can affect an analysis of a part or component (in this case, an alloy used to solder components to a circuit board). 

This got me thinking again about all the different situations in which the Weibull distribution can help us make good decisions. The main reason Weibull is so useful is that it's very flexible in fitting different types of data, because it...

Understanding Type 1 and Type 2 Errors from the Feline Perspective: All Mistakes Are Not Equal!

Serving cat food? I sure hope you've set your alpha level high enough.

"Bad kitty!" That's a phrase you almost never hear, but even we cats make the occasional mistake. I was reminded of this recently as I watched my human trying to analyze some data. People frequently make mistakes when they test a hypothesis with data analysis. Specifically, they can make either Type I or Type II errors.   

When I first started reading my human's statistics textbooks a few years ago, this idea seemed awfully silly to me. We cats appreciate being direct, and you either get the answer correct or you don't. I...

Do NFL Teams Have a Greater Home Field Advantage on Thursday Night?

When Alex Smith travels to Seattle, he has to go up against 67,000 screaming Seahawk fans that make Seattle one of the loudest stadiums in football. When Joe Flacco goes into Pittsburgh, he has to overcome 65,000 Steelers fans clad in black and gold and waving Terrible Towels. And when Matt Schaub plays in Jacksonville he has to, well...people do go to football games in Jacksonville, right?

Either way, all three scenarios have one thing in common. The home field advantage is exactly the same.

Whether you have a sold out stadium full of rambunctious fans, or the stadium is half full, the home...

The Stats Cat on Sample Size, Statistical Power, and the Revenge of the Zombie Salmon

Marlowe the Stats Cat here. That guy I share my house with left his laptop unattended again, and I spent the evening searching the web for news about one of my favorite subjects: salmon. Yum. But I wound up getting more than a collection of cool salmon pictures...I also got a better understanding of the role the size of a dataset plays when you're doing a hypothesis test.  

You see, my search led me to this paper that summarized a 2009 analysis of neuroimaging data collected from a frozen salmon. Yes, you did read that correctly: some people with Ph.D.s actually ran an MRI on a dead fish....

The Problem With P-Charts: Out-of-control Cycle LaneYs!

Since we introduced new control charts in Minitab 16.2, I’ve been waiting to come across some real data I could use to showcase their awesome power. My friends, this day has come! I am about to reveal a perhaps unconventional use of the Laney P' chart to investigate national cycling data in the UK. So we’re not looking at any real process here, which is how the P' chart is usually used, just data from a national study about cycling habits. Why? Because this dataset gives us a prime example (see what I did there…?) of overdispersion issues that can cause problems when we use standard P control...

Gummi Bear DOE: Replicates and Center Points, Part 2

Last time, we talked about center points and replicates in design of experiments. It turns out that both are tools that you can use to increase the probability of finding a statistically significant difference. But what we really want to know is, how many center points and replicates should be in the gummi bear experiment? To answer that question, we have to estimate the standard deviation of the distances the gummi bears go.

Estimating the Standard Deviation When You Do Design of Experiments

I do have an old data set from some students launching gummi bears that I can use. Historical data is a...

Busting the Mythbusters with Statistics: Are Yawns Contagious?

This looks like a typical Mythbusters experiment!

Statistics can be unintuitive. What’s a large difference? What’s a large sample size? When is something statistically significant? You might think you know, based on experience and intuition, but you really don’t know until you actually run the analysis. You have to run the proper statistical tests to know what the data are telling you!

Even experts can get tripped up by their hunches, as we'll see.

In my family, we’re huge fans of the Mythbusters. This fun Discovery Channel show mixes science and experiments to prove or disprove various myths,...