
Angst Over ANOVA Assumptions? Ask the Assistant.

Do you suffer from PAAA (Post-Analysis Assumption Angst)? You’re not alone.

Checking the required assumptions for a statistical analysis is critical. But if you don’t have a Ph.D. in statistics, it can feel more complicated and confusing than the primary analysis itself.

How does the cuckoo egg data, a common sample data set often used to teach analysis of variance, satisfy the following formal assumptions for a classical one-way ANOVA (F-test)?

  • Normality
  • Homoscedasticity
  • Independence
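For readers who want a rough first look at the equal-variance (homoscedasticity) assumption before any formal test, here is a minimal Python sketch. The egg lengths are made-up placeholder values, not the real cuckoo data, and `spread_ratio` is a hypothetical helper name. It simply compares the largest and smallest group standard deviations; a common rule of thumb treats a ratio well above 2 as a warning sign, but that threshold is a convention, not a formal test.

```python
from statistics import mean, stdev

# Hypothetical egg lengths (mm) for three host species.
# These are placeholder values, NOT the real cuckoo egg data.
groups = {
    "host_a": [22.0, 23.9, 20.9, 23.8, 25.0, 24.0],
    "host_b": [21.7, 22.6, 20.9, 21.6, 22.2, 22.5],
    "host_c": [19.8, 22.1, 21.5, 20.9, 22.0, 21.0],
}


def spread_ratio(samples):
    """Return (largest group sd) / (smallest group sd)."""
    sds = [stdev(values) for values in samples.values()]
    return max(sds) / min(sds)


# Per-group summary: sample size, mean, and standard deviation.
for name, values in groups.items():
    print(f"{name}: n={len(values)}, mean={mean(values):.2f}, sd={stdev(values):.2f}")

ratio = spread_ratio(groups)
print(f"largest sd / smallest sd = {ratio:.2f}")
```

This only screens for unequal spread. Normality and independence need their own checks, such as probability plots and a look at how the data were collected.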

Are My Data (Kinda Sorta) Normal?

To check the normality of each group of data, a common strategy is to display...

How Deadly Is this Ebola Outbreak?

The current Ebola outbreak in Guinea, Liberia, and Sierra Leone is making headlines around the world, and rightfully so: it's a frightening disease, and last week the World Health Organization reported that its spread is outpacing the response. Nearly 900 of the more than 1,600 people infected during this outbreak have died, including some leading medical professionals trying to stanch the outbreak's spread. And yesterday, one of the American doctors who contracted the disease arrived back in the U.S. for treatment.

Many sources state that Ebola virus outbreaks have a case fatality rate of up to...

A Fun ANOVA: Does Milk Affect the Fluffiness of Pancakes?

by Iván Alfonso, guest blogger

I'm a huge fan of hot cakes; they are my favorite dessert ever. I’ve been cooking them for over 15 years, and over that time I’ve noticed many variations in texture, flavor, and thickness. Personally, I like fluffy pancakes.

There are many brands of hotcake mix on the market, all with very similar formulations. So I decided to investigate which ingredients and inputs may influence the fluffiness of my pancakes.

Potential factors could include the type of mix used, the type of milk used, the use of margarine or butter (of many brands), the amount of mixing time, the...

Cuckoo for Quality: A Birdseye View of a Classic ANOVA Example

If you teach statistics or quality statistics, you’re probably already familiar with the cuckoo egg data set.

The common cuckoo has decided that raising baby chicks is a stressful, thankless job. It has better things to do than fill the screeching, gaping maws of cuckoo chicks, day in and day out.

So the mother cuckoo lays her eggs in the nests of other bird species. If the cuckoo egg is similar enough to the eggs of the host bird, in size and color pattern, the host bird may be tricked into incubating the egg and raising the hatchling. (The cuckoo can then fly off to the French Riviera, or...

Where Did All the World Cup Goals Go? Find Out with a 2-Sample Poisson Rate Test

A few weeks ago I looked at the number of goals that were being scored in the World Cup. At the time there were 2.9 goals per game, which was the highest since 1970. Unfortunately for spectators who enjoyed the higher-scoring games, this did not last.

By the end, the average had fallen to 2.7 goals per game, the same average as in the 1998 World Cup. After such a high-scoring start, the goals per game fell off and ended up being pretty similar to other recent World Cups.
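To make the idea of comparing two scoring rates concrete, here is a minimal Python sketch of a two-sample Poisson rate comparison using a normal approximation. The function name and the goal and game counts are illustrative placeholders, not figures taken from this post, and the calculation is not necessarily the method Minitab uses for its 2-Sample Poisson Rate test.

```python
import math


def two_sample_poisson_rate_test(count1, exposure1, count2, exposure2):
    """Compare two Poisson rates with a normal-approximation z-test.

    count = number of events (goals); exposure = number of units (games).
    Returns (rate1, rate2, z, two-sided p value).
    """
    rate1 = count1 / exposure1
    rate2 = count2 / exposure2
    # For a Poisson count, the estimated variance of count/exposure
    # is count / exposure**2.
    se = math.sqrt(count1 / exposure1**2 + count2 / exposure2**2)
    z = (rate1 - rate2) / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate1, rate2, z, p_value


# Illustrative placeholder counts: 136 goals in 48 games vs. 35 goals in 16 games.
rate1, rate2, z, p_value = two_sample_poisson_rate_test(136, 48, 35, 16)
print(f"rate 1 = {rate1:.2f}, rate 2 = {rate2:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

With small counts, an exact method is usually preferable to this approximation.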

What happened?

Comparing the Group Stage to the Knockout Stage

After 15 straight days of games in the group stage, there...

Blind Wine Part IV: The Participants

In Part I, Part II, and Part III we shared our experiment, the survey results, and the experimental results. To wrap things up, we're going to see if the survey results tied to the experimental results in any meaningful way...

First, we look at whether self-identified knowledge correlated to the total number of correct appraisals:

We have no evidence of a relationship (p = 0.795). So we'll look at the number correct by how much each participant usually spends:

Again, no evidence of a relationship (p = 0.559).

How about how many types each regularly buys?

There appears to be something here,...

Blind Wine Part III: The Results

In Part I and Part II we learned about the experiment and the survey, respectively. Now we turn our attention to the results...

Our first two participants, Danielle and Sheryl, enter the conference room and are given blindfolds as we explain how the experiment will proceed. As we administer the tasting, the colors of the wine are obvious but we don't know the true types, which have been masked as "A," "B," "C," and "D."

As Danielle and Sheryl proceed through each tasting, it is easy to note that they start off correctly identifying the color of each wine; it is also obvious that tasting...

How to Interpret a Regression Model with Low R-squared and Low P values

In regression analysis, you'd like your regression model to have significant variables and to produce a high R-squared value. This low P value / high R-squared combination indicates that changes in the predictors are related to changes in the response variable and that your model explains a lot of the response variability.

This combination seems to go together naturally. But what if your regression model has significant variables but explains little of the variability? It has low P values and a low R-squared.
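A small simulation can show how a predictor can be clearly significant while the model still explains little of the variability. The Python sketch below is only an illustration under assumed settings: the true slope (0.5), noise level (standard deviation 5), sample size, and seed are arbitrary choices, and the p value uses a normal approximation to the t distribution, which is reasonable only because the sample is large.

```python
import math
import random

random.seed(1)  # arbitrary seed so the run is repeatable

# Simulated data: a real but modest trend buried in a lot of noise.
n = 2000
x = [random.uniform(0, 10) for _ in range(n)]
y = [2.0 + 0.5 * xi + random.gauss(0, 5) for xi in x]

# Ordinary least squares for a single predictor.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
syy = sum((yi - mean_y) ** 2 for yi in y)

slope = sxy / sxx
intercept = mean_y - slope * mean_x
sse = syy - slope * sxy          # residual sum of squares
r_squared = 1 - sse / syy

# Standard error and test statistic for the slope.
se_slope = math.sqrt(sse / (n - 2) / sxx)
t_stat = slope / se_slope
# Normal approximation to the t distribution (assumes large n).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"slope = {slope:.3f}, R-squared = {r_squared:.3f}, p = {p_value:.2e}")
```

The slope is estimated precisely because there are many observations, yet individual points scatter widely around the line, so R-squared stays low.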

At first glance, this combination doesn’t make sense. Are the significant predictors still...

Do the Data Really Say Female-Named Hurricanes Are More Deadly?

A recent study has indicated that female-named hurricanes kill more people than male-named hurricanes. Of course, the title of that article (and other articles like it) is a bit misleading. The study found a significant interaction between the damage caused by the storm and the perceived masculinity or femininity of the hurricane names. So don’t be confused by stories that suggest all female-named hurricanes are deadlier than male-named hurricanes. The study actually found no effect of masculinity/femininity for less severe storms. It was the more severe storms where the gender of the name had a...

The Five Coolest Things You Can Do When You Right-click a Graph in Minitab Statistical Software

Minitab graphs are powerful tools for investigating your process further and removing any doubt about the steps you should take to improve it. With that in mind, you’ll want to know every feature of Minitab graphs that can help you share and communicate your results effectively. While many ways to modify your graph are on the Editor menu, some of the best features become available when you right-click your graph.

Here are the five coolest things you can do when you right-click a graph in Minitab Statistical Software.

Send graph to...

Once your graph is ready for your report or presentation,...

Multiple Regression Analysis and Response Optimization Examples using the Assistant in Minitab 17

In Minitab, the Assistant menu is your interactive guide to choosing the right tool, analyzing data correctly, and interpreting the results. If you’re feeling a bit rusty with choosing and using a particular analysis, the Assistant is your friend!

Previously, I’ve written about the new linear model features in Minitab 17. In this post, I’ll work through a multiple regression analysis example and optimize the response variable to highlight the new features in the Assistant.

Choose a Regression Analysis

As part of a solar energy test, researchers measured the total heat flux. They found that heat...

Using Probability Plots to Understand Laser Games Scores

There is more than just the p value in a probability plot—the overall graphical pattern also provides a great deal of useful information. Probability plots are a powerful tool to better understand your data.

In this post, I intend to present the main principles of probability plots and focus on their visual interpretation using some real data.

In probability plots, the data density distribution is transformed into a linear plot. To do this, the cumulative density function (the so-called CDF, cumulating all probabilities below a given threshold) is used (see the graph below). For a normal...
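The CDF-to-straight-line idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the post's own procedure: the scores are simulated placeholders, the function names are hypothetical, and the plotting positions (i - 0.3) / (n + 0.4) are one common median-rank choice among several.

```python
import math
import random
from statistics import NormalDist


def normal_plot_points(data):
    """Pair each sorted value with the standard normal quantile of its plotting position."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.3) / (n + 0.4)  # median-rank plotting position (assumed choice)
        points.append((value, std_normal.inv_cdf(p)))
    return points


def straightness(points):
    """Pearson correlation of the plotted pairs; values near 1 mean a nearly straight line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


random.seed(3)  # arbitrary seed
# Simulated placeholder scores, NOT real laser game data.
scores = [random.gauss(300, 40) for _ in range(200)]
points = normal_plot_points(scores)
r = straightness(points)
print(f"probability plot correlation = {r:.3f}")
```

Plotting the pairs in `points` gives the probability plot; systematic curvature or a few points far off the line are the visual patterns to look for.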

Common Statistical Mistakes You Should Avoid

It's all too easy to make mistakes involving statistics. Powerful statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors, but correctly interpreting the results of an analysis can be even more challenging.

No one knows that better than Minitab's technical trainers. All of our trainers are seasoned statisticians with years of quality improvement experience. They spend most of the year traveling around the country (and around the world) to help people learn to make the best use of Minitab software for analyzing data...

Five Guidelines for Using P values

There is high pressure to find low P values. Obtaining a low P value for a hypothesis test is make or break because it can lead to funding, articles, and prestige. Statistical significance is everything!

My two previous posts looked at several issues related to P values:

In this post, I’ll look at whether P values are still helpful and provide guidelines on how to use them with these issues in mind.

Sir Ronald A Fisher

Are P Values Still Valuable?

Given...

Hypothesis Testing and P Values

by Matthew Barsalou, guest blogger

Programs such as Minitab Statistical Software make hypothesis testing easier, but no program can think for the experimenter. Anybody performing a statistical hypothesis test must understand what p values mean with regard to their statistical results, as well as the potential limitations of statistical hypothesis testing.

A p value of 0.05 is frequently used during statistical hypothesis testing. This p value indicates that if there is no effect (or if the null hypothesis is true), you’d obtain the observed difference or more in 5% of studies due to random...
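One way to see what the 0.05 cutoff means when the null hypothesis is true is to simulate many studies in which there really is no effect. The Python sketch below is an illustration under assumed settings (a one-sample z-test with known standard deviation, 30 observations per study, 4,000 simulated studies, an arbitrary seed), and `one_sample_z_p_value` is a hypothetical helper name.

```python
import math
import random
from statistics import mean

random.seed(42)  # arbitrary seed so the run is repeatable


def one_sample_z_p_value(sample, mu0, sigma):
    """Two-sided p value for a z-test of the mean with known sigma."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return math.erfc(abs(z) / math.sqrt(2))


# Simulate studies in which the null hypothesis (mean = 0) is exactly true.
n_studies = 4000
false_alarms = 0
for _ in range(n_studies):
    sample = [random.gauss(0.0, 1.0) for _ in range(30)]
    if one_sample_z_p_value(sample, mu0=0.0, sigma=1.0) < 0.05:
        false_alarms += 1

false_alarm_rate = false_alarms / n_studies
print(f"share of null studies with p < 0.05: {false_alarm_rate:.3f}")
```

The share should land close to 5%, which is the long-run rate of significant results produced by random variation alone when there is no effect.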

Dividing a Data Set into Training and Validation Samples

Adam Ozimek had an interesting post April 15th on the Modeled Behavior blog at Forbes.com. He observed that one of the advantages of big data is how easy it is to get test data to validate a model that you built from sample data.

Ozimek notes that he is “for the most part a p-value checking, residual examining, data modeling culture economist,” but he’s correct to observe that if you can test your model on real data, then you should.

What I’ll describe is certainly not the only way to divide data in Minitab Statistical Software. Still, I think it’s pretty good if I do say so myself. Want...
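For readers working outside Minitab, the same idea of a random split into training and validation samples can be sketched in Python. This is not the Minitab procedure the post goes on to describe; the function name, the 30% validation fraction, and the seed are assumptions chosen for illustration.

```python
import random


def split_rows(rows, validation_fraction=0.3, seed=0):
    """Randomly split rows into (training, validation) lists.

    The fraction and seed are illustrative defaults, not recommendations.
    """
    rng = random.Random(seed)   # local generator keeps the split repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Placeholder data: row numbers 0..99 stand in for real observations.
training, validation = split_rows(range(100))
print(len(training), "training rows;", len(validation), "validation rows")
```

Fit the model on `training` only, then check its predictions against `validation`, which the model has never seen.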

Proving My Toddler Really Doesn’t Know her Left Foot from her Right

“Do it myself!”

If only I had a nickel for every time I heard that phrase from my toddler in a given day. From throwing away trash, to putting frozen waffles in the toaster, to feeding the dog, I hear it so often that I could possibly retire with all the nickels I’d collect.

And of course, I hear this proclamation every single time my 2-year-old puts on her shoes.

What happens when a toddler tries to put on their own shoes? Well, at least in the case of my little one, the left shoe goes on the right foot, and the right shoe on the left foot, followed by a triumphant “Do it myself! Yay!!!” And the...

Not All P Values are Created Equal

The interpretation of P values would seem to be fairly standard between different studies. Even if two hypothesis tests study different subject matter, we tend to assume that you can interpret a P value of 0.03 the same way for both tests. A P value is a P value, right?

Not so fast! While Minitab statistical software can correctly calculate all P values, it can’t factor in the larger context of the study. You and your common sense need to do that!

In this post, I’ll demonstrate that P values tell us very different things depending on the larger context.

Recap: P Values Are Not the Probability of...

Hockey Penalties, Fans Booing, and Independent Trials

We’re in the thick of the Stanley Cup playoffs, which means hockey fans are doing what seems to be every sports fan's favorite hobby...complaining about the refs! While most complaints, such as “We’re not getting any of the close calls!” are subjective and hard to get data for, there's one question that we should be able to answer objectively with a statistical analysis: Are hockey penalties independent trials? That is, does the team that the next penalty will be called on depend on the team that any previous penalties were called on?

Think of flipping a coin. Even if it comes up heads 10 times...
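The independence question posed above can be explored by tabulating which team each penalty was called on against which team the previous one was called on. The Python sketch below is one possible approach, not necessarily the analysis used in this post: the penalty sequence is simulated, the team labels "A" and "B" and the function names are hypothetical, and the chi-square test is only approximate here because consecutive pairs overlap.

```python
import math
import random


def transition_table(calls):
    """Count (previous team, next team) pairs in a sequence of penalty calls."""
    teams = sorted(set(calls))
    table = {(a, b): 0 for a in teams for b in teams}
    for previous, current in zip(calls, calls[1:]):
        table[(previous, current)] += 1
    return table


def chi_square_2x2(table, teams=("A", "B")):
    """Chi-square test of independence for a 2x2 table (no continuity correction)."""
    a, b = teams
    n11, n12 = table[(a, a)], table[(a, b)]
    n21, n22 = table[(b, a)], table[(b, b)]
    n = n11 + n12 + n21 + n22
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    chi2 = n * (n11 * n22 - n12 * n21) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability for chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


random.seed(7)  # arbitrary seed
# Simulated placeholder sequence: each call is independent of the last.
calls = [random.choice("AB") for _ in range(500)]
table = transition_table(calls)
chi2, p_value = chi_square_2x2(table)
print(table)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

If real penalty calls tend to alternate between teams, the off-diagonal counts would be noticeably larger than independence predicts and the p value would be small.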

Selecting the Right Quality Improvement Project

I wrote a post a few years back on the difficulties that can ensue when you’re just trying to get started on your Lean Six Sigma or quality improvement initiative. It can become especially difficult when you have many potential projects staring at you, but you aren’t quite sure which one will give you the most bang for your buck.

A project prioritization matrix can be a good place to start when you need to choose which projects to focus on, as it can help you logically select optimal improvement projects against their weighted value, based on your company’s predefined metrics. The matrix can...