Where Did All the World Cup Goals Go? Find Out with a 2-Sample Poisson Rate Test

A few weeks ago I looked at the number of goals that were being scored in the World Cup. At the time there were 2.9 goals per game, which was the highest since 1970. Unfortunately for spectators who enjoyed the higher-scoring games, this did not last.

By the end, the average had fallen to 2.7 goals per game, the same average as in the 1998 World Cup. After such a high-scoring start, the goals per game fell off and ended up being pretty similar to other recent World Cups.

What happened?

Comparing the Group Stage to the Knockout Stage

After 15 straight days of games in the group stage, there...
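For readers who want to see the arithmetic behind a 2-sample Poisson rate test, here is a minimal sketch in Python. It assumes the common normal-approximation form with an unpooled standard error, which may differ from the exact method a given statistics package uses. The function name and the goal counts are hypothetical and are not the actual World Cup totals.

```python
import math
from statistics import NormalDist


def two_sample_poisson_rate_test(count1, exposure1, count2, exposure2):
    """Two-sided z-test for a difference between two Poisson rates.

    Uses the normal approximation with an unpooled standard error
    (an assumption; other variants of the test exist).
    """
    rate1 = count1 / exposure1
    rate2 = count2 / exposure2
    # Var(count / exposure) for a Poisson count is rate / exposure.
    se = math.sqrt(rate1 / exposure1 + rate2 / exposure2)
    z = (rate1 - rate2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate1, rate2, z, p_value


# Hypothetical counts for illustration only: goals and games in two stages.
rate1, rate2, z, p_value = two_sample_poisson_rate_test(
    count1=60, exposure1=20, count2=40, exposure2=20
)
print(f"rate1={rate1:.2f}, rate2={rate2:.2f}, z={z:.2f}, p={p_value:.4f}")
```

Here the exposure is the number of games played, so each rate is in goals per game.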

Blind Wine Part IV: The Participants

In Part I, Part II, and Part III we shared our experiment, the survey results, and the experimental results. To wrap things up, we're going to see if the survey results tied to the experimental results in any meaningful way...

First, we look at whether self-identified knowledge correlated with the total number of correct appraisals:

We have no evidence of a relationship (p = 0.795). So we'll look at the number correct by how much each participant usually spends:

Again, no evidence of a relationship (p = 0.559).

How about how many types each regularly buys?

There appears to be something here,...

Blind Wine Part III: The Results

In Part I and Part II we learned about the experiment and the survey, respectively. Now we turn our attention to the results...

Our first two participants, Danielle and Sheryl, enter the conference room and are given blindfolds as we explain how the experiment will proceed. As we administer the tasting, the colors of the wine are obvious but we don't know the true types, which have been masked as "A," "B," "C," and "D."

As Danielle and Sheryl proceed through each tasting, it is easy to note that they start off correctly identifying the color of each wine; it is also obvious that tasting...

How to Interpret a Regression Model with Low R-squared and Low P values

In regression analysis, you'd like your regression model to have significant variables and to produce a high R-squared value. This low P value / high R-squared combination indicates that changes in the predictors are related to changes in the response variable and that your model explains a lot of the response variability.

This combination seems to go together naturally. But what if your regression model has significant variables but explains little of the variability? It has low P values and a low R-squared.

At first glance, this combination doesn’t make sense. Are the significant predictors still...

Do the Data Really Say Female-Named Hurricanes Are More Deadly?

A recent study has indicated that female-named hurricanes kill more people than male-named hurricanes. Of course, the title of that article (and other articles like it) is a bit misleading. The study found a significant interaction between the damage caused by the storm and the perceived masculinity or femininity of the hurricane names. So don’t be confused by stories that suggest all female-named hurricanes are deadlier than male-named hurricanes. The study actually found no effect of masculinity/femininity for less severe storms. It was the more severe storms where the gender of the name had a...

The Five Coolest Things You Can Do When You Right-click a Graph in Minitab Statistical Software

Minitab graphs are powerful tools for investigating your process further and removing any doubt about the steps you should take to improve it. With that in mind, you’ll want to know every feature of Minitab graphs that can help you share and communicate your results effectively. While many ways to modify your graph are on the Editor menu, some of the best features become available when you right-click your graph.

Here are the five coolest things you can do when you right-click a graph in Minitab Statistical Software.

Send graph to...

Once your graph is ready for your report or presentation,...

Multiple Regression Analysis and Response Optimization Examples using the Assistant in Minitab 17

In Minitab, the Assistant menu is your interactive guide to choosing the right tool, analyzing data correctly, and interpreting the results. If you’re feeling a bit rusty with choosing and using a particular analysis, the Assistant is your friend!

Previously, I’ve written about the new linear model features in Minitab 17. In this post, I’ll work through a multiple regression analysis example and optimize the response variable to highlight the new features in the Assistant.

Choose a Regression Analysis

As part of a solar energy test, researchers measured the total heat flux. They found that heat...

Using Probability Plots to Understand Laser Games Scores

There is more than just the p value in a probability plot—the overall graphical pattern also provides a great deal of useful information. Probability plots are a powerful tool to better understand your data.

In this post, I intend to present the main principles of probability plots and focus on their visual interpretation using some real data.

In probability plots, the data density distribution is transformed into a linear plot. To do this, the cumulative density function (the so-called CDF, cumulating all probabilities below a given threshold) is used (see the graph below). For a normal...
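To make the CDF transformation concrete, here is a minimal sketch of how the points of a normal probability plot can be computed. It assumes Benard's median rank approximation for the plotting positions, which is one common choice and not necessarily the one used in the post. The function name and the scores are made up for illustration.

```python
from statistics import NormalDist


def normal_probability_points(data):
    """Return (theoretical normal quantile, observed value) pairs.

    If the data are roughly normal, these points fall close to a straight line.
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position via Benard's median rank approximation (an assumption).
        p = (i - 0.3) / (n + 0.4)
        # The inverse CDF turns the cumulative probability into a quantile.
        points.append((std_normal.inv_cdf(p), value))
    return points


# Made-up scores for illustration only.
scores = [12.1, 9.8, 11.4, 10.3, 10.9]
for quantile, score in normal_probability_points(scores):
    print(f"{quantile:+.3f}  {score}")
```

Plotting the second column against the first gives the familiar probability plot; curvature or isolated points away from the line are the visual cues worth reading.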

Common Statistical Mistakes You Should Avoid

It's all too easy to make mistakes involving statistics. Powerful statistical software can remove a lot of the difficulty surrounding statistical calculation, reducing the risk of mathematical errors, but correctly interpreting the results of an analysis can be even more challenging.

No one knows that better than Minitab's technical trainers. All of our trainers are seasoned statisticians with years of quality improvement experience. They spend most of the year traveling around the country (and around the world) to help people learn to make the best use of Minitab software for analyzing data...

Five Guidelines for Using P values

There is high pressure to find low P values. Obtaining a low P value for a hypothesis test is make or break because it can lead to funding, articles, and prestige. Statistical significance is everything!

My two previous posts looked at several issues related to P values:

In this post, I’ll look at whether P values are still helpful and provide guidelines on how to use them with these issues in mind.

Sir Ronald A Fisher

Are P Values Still Valuable?

Given...

Hypothesis Testing and P Values

by Matthew Barsalou, guest blogger

Programs such as Minitab Statistical Software make hypothesis testing easier, but no program can think for the experimenter. Anybody performing a statistical hypothesis test must understand what p values mean with regard to their statistical results, as well as the potential limitations of statistical hypothesis testing.

A p value of 0.05 is frequently used during statistical hypothesis testing. This p value indicates that if there is no effect (or if the null hypothesis is true), you’d obtain the observed difference or more in 5% of studies due to random sampling...
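A small simulation can illustrate the "5% of studies" statement. The sketch below assumes a two-sided z-test with a known standard deviation of 1 and a true mean of 0, so the null hypothesis holds in every simulated study. The function name, sample size, and number of studies are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist


def false_positive_rate(n_studies=2000, sample_size=30, alpha=0.05, seed=0):
    """Fraction of simulated studies with p < alpha when the null is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    significant = 0
    for _ in range(n_studies):
        # Every study samples from a population with no effect (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        # z statistic for H0: mean = 0 with known sigma = 1.
        z = mean * math.sqrt(sample_size)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            significant += 1
    return significant / n_studies


print(false_positive_rate())
```

The printed fraction should land close to 0.05, with some run-to-run variation if the seed is changed.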

Dividing a Data Set into Training and Validation Samples

Adam Ozimek had an interesting post on April 15th on the Modeled Behavior blog at Forbes.com. He observed that one of the advantages of big data is how easy it is to get test data to validate a model that you built from sample data.

Ozimek notes that he is “for the most part a p-value checking, residual examining, data modeling culture economist,” but he’s correct to observe that if you can test your model on real data, then you should.

What I’ll describe is certainly not the only way to divide data in Minitab Statistical Software. Still, I think it’s pretty good if I do say so myself. Want...
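Outside of Minitab, the same idea can be sketched in a few lines of Python. This is not the method described in the post, only a generic random split under assumed settings: the function name, the 30% validation fraction, and the seed are all hypothetical.

```python
import random


def split_rows(rows, validation_fraction=0.3, seed=1):
    """Randomly split rows into (training, validation) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    # Shuffle a copy so the caller's data stay in their original order.
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Ten row indices stand in for a real data set.
training, validation = split_rows(range(10))
print(len(training), len(validation))
```

Fixing the seed keeps the split reproducible, which matters if you want to rebuild the model later and validate it on the same held-out rows.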

Proving My Toddler Really Doesn’t Know her Left Foot from her Right

"Do it myself!"

If only I had a nickel for every time I heard that phrase from my toddler in a given day. From throwing away trash, to putting frozen waffles in the toaster, to feeding the dog, I hear it so often that I could possibly retire with all the nickels I’d collect.

And of course, I hear this proclamation every single time my 2-year-old puts on her shoes.

What happens when a toddler tries to put on their own shoes? Well, at least in the case of my little one, the left shoe goes on the right foot, and the right shoe on the left foot, followed by a triumphant “Do it myself! Yay!!!” And the...

Not All P Values are Created Equal

The interpretation of P values would seem to be fairly standard between different studies. Even if two hypothesis tests study different subject matter, we tend to assume that you can interpret a P value of 0.03 the same way for both tests. A P value is a P value, right?

Not so fast! While Minitab statistical software can correctly calculate all P values, it can’t factor in the larger context of the study. You and your common sense need to do that!

In this post, I’ll demonstrate that P values tell us very different things depending on the larger context.

Recap: P Values Are Not the Probability of...

Hockey Penalties, Fans Booing, and Independent Trials

We’re in the thick of the Stanley Cup playoffs, which means hockey fans are doing what seems to be every sports fan's favorite hobby...complaining about the refs! While most complaints, such as “We’re not getting any of the close calls!” are subjective and hard to get data for, there's one question that we should be able to answer objectively with a statistical analysis: Are hockey penalties independent trials? That is, does the team that the next penalty will be called on depend on the team that any previous penalties were called on?

Think of flipping a coin. Even if it comes up heads 10 times...

Selecting the Right Quality Improvement Project

I wrote a post a few years back on the difficulties that can ensue when you’re just trying to get started on your Lean Six Sigma or quality improvement initiative. It can become especially difficult when you have many potential projects staring at you, but you aren’t quite sure which one will give you the most bang for your buck.

A project prioritization matrix can be a good place to start when you need to choose which projects to focus on, as it can help you logically select optimal improvement projects against their weighted value, based on your company’s predefined metrics. The matrix can...

How to Correctly Interpret P Values

The P value is used all over statistics, from t-tests to regression analysis. Everyone knows that you use P values to determine statistical significance in a hypothesis test. In fact, P values often determine what studies get published and what projects get funding.

Despite being so important, the P value is a slippery concept that people often interpret incorrectly. How do you interpret P values?

In this post, I'll help you to understand P values in a more intuitive way and to avoid a very common misinterpretation that can cost you money and credibility.

What Is the Null Hypothesis in Hypothesis...

Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?

One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.

However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.

In this post, I not only want to introduce you to Welch’s ANOVA, but also highlight some interesting research that we perform here at Minitab that guides the implementation of features in our statistical software.

One-Way ANOVA Assumptions

Like any...

Equivalence Testing for Quality Analysis (Part II): What Difference Does the Difference Make?

My previous post examined how an equivalence test can shift the burden of proof when you perform a hypothesis test of means. This allows you to more rigorously test whether the process mean is equivalent to a target or to another mean.

Here’s another key difference: To perform the analysis, an equivalence test requires that you first define, upfront, the size of a practically important difference between the mean and the target, or between two means.

Truth be told, even when performing a standard hypothesis test, you should know the value of this difference. Because you can’t really evaluate...

Equivalence Testing for Quality Analysis (Part I): What are You Trying to Prove?

With more options come more decisions.

With equivalence testing added to Minitab 17, you now have more statistical tools to test a sample mean against a target value or another sample mean.

Equivalence testing is extensively used in the biomedical field. Pharmaceutical manufacturers often need to test whether the biological activity of a generic drug is equivalent to that of a brand name drug that has already been through the regulatory approval process.

But in the field of quality improvement, why might you want to use an equivalence test instead of a standard t-test?

Interpreting Hypothesis...