Where Did All the World Cup Goals Go? Find Out with a 2-Sample Poisson Rate Test

A few weeks ago I looked at the number of goals being scored in the World Cup. At the time the average was 2.9 goals per game, the highest since 1970. Unfortunately for spectators who enjoyed the higher-scoring games, this did not last.

By the end, the average had fallen to 2.7 goals per game, the same as in the 1998 World Cup. After such a high-scoring start, scoring fell off and the tournament ended up looking pretty similar to other recent World Cups.

What happened?

Comparing the Group Stage to the Knockout Stage

After 15 straight days of games in the group stage, there had been 2.8 goals scored per game. But when the knockout stage started, the number of goals per game dropped to 2.2. Will a 2-sample Poisson rate test show that this is a significant difference? I’m using a Poisson rate test instead of a 2-sample t-test because goals are counts of occurrences, not a continuous variable like length. You can get the data I used here.

2-sample Poisson rate test

The p-value for this hypothesis test is 0.144, which is greater than 0.05. That means we can’t conclude the difference is significant.
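If you don't have Minitab handy, here is a rough sketch of the same kind of test in Python. It uses a normal approximation with an unpooled standard error, which is only one of several ways to compare two Poisson rates. The goal and game totals are not taken from the worksheet; they are assumed values chosen to be consistent with the 2.8 and 2.2 goals per game quoted above, and the function name is just for illustration.

```python
import math

def poisson_rate_test(count1, n1, count2, n2):
    """Two-sided normal-approximation test for a difference in two Poisson rates."""
    rate1 = count1 / n1
    rate2 = count2 / n2
    # The variance of an estimated Poisson rate is rate / n, i.e. count / n**2.
    se = math.sqrt(count1 / n1**2 + count2 / n2**2)
    z = (rate1 - rate2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate1, rate2, z, p_value

# ASSUMED totals (goals, games), picked to match the rates quoted in the post.
group = (136, 48)
knockout = (35, 16)

rate_g, rate_k, z, p = poisson_rate_test(*group, *knockout)
print(f"group rate    = {rate_g:.2f} goals per game")
print(f"knockout rate = {rate_k:.2f} goals per game")
print(f"z = {z:.2f}, p-value = {p:.3f}")
```

With these assumed totals the approximation gives a p-value of about 0.14, in line with the result above.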

However, let me point out two things. First, we have a pretty small sample size for games in the knockout stage. It’s possible that a real difference exists, but our test doesn’t have enough power to detect it. Second, the knockout stage contains a pretty big outlier. Remember when Germany destroyed Brazil 7-1? An 8-goal game is not very typical in soccer. And if we remove that observation...

2-sample Poisson rate test

Voila! The average goals per game for the knockout stage drops to 1.8, and the difference becomes statistically significant since the p-value is less than 0.05.
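To see how much that single game matters, here is a small Python sketch that puts an approximate 95% confidence interval around the difference in rates, with and without the 8-goal game. As before, the totals are assumed values consistent with the rates in this post rather than the actual worksheet, and the interval uses a plain normal approximation.

```python
import math

def rate_difference_ci(count1, n1, count2, n2, z_crit=1.96):
    """Approximate CI for (rate1 - rate2) using a normal approximation."""
    diff = count1 / n1 - count2 / n2
    se = math.sqrt(count1 / n1**2 + count2 / n2**2)
    return diff, diff - z_crit * se, diff + z_crit * se

# ASSUMED totals: 136 goals in 48 group games, 35 goals in 16 knockout games.
group_goals, group_games = 136, 48
ko_goals, ko_games = 35, 16

with_outlier = rate_difference_ci(group_goals, group_games, ko_goals, ko_games)
# Drop the 7-1 game: 8 fewer goals, 1 fewer game.
without_outlier = rate_difference_ci(group_goals, group_games, ko_goals - 8, ko_games - 1)

for label, (diff, low, high) in [("all knockout games", with_outlier),
                                 ("without the 7-1 game", without_outlier)]:
    print(f"{label}: difference = {diff:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval includes zero when all 16 knockout games are counted and sits entirely above zero once the 8-goal game is dropped, which mirrors the two p-values.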

Is it possible that teams play differently in the knockout stage than they do in the group stage? After all, in the group stage, the margin you win by matters, since goal differential is one of the tiebreakers. But once you get to the knockout stage, a 1-0 win is just as good as a 4-0 win. Teams might play defensively after taking a lead in hopes of not allowing an equalizer. And heavy underdogs might also play a conservative, defensive game, hoping to go the distance at 0-0 and take their chances on penalty kicks.

But before we go saying there is a difference between the two stages, let’s look at some more data.

The 2010 World Cup

I went back 4 years and collected the goals scored per game in the 2010 World Cup, then ran another 2-sample Poisson rate test.

2-sample Poisson rate test

Not only is the difference not significant, the rate of goals scored per game in the knockout stage is actually higher than the rate in the group stage. So throw out any theories about teams playing more conservatively in the knockout stage!

It appears that from game to game, goals in soccer can be pretty random. In a small sample, you can have a run of high-scoring games or one very high-scoring game, as we saw with Germany and Brazil. But trying to determine a reason for the pattern can be folly. Sometimes, you just have to chalk things up to random variation!
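One way to get a feel for that randomness: suppose knockout games really were scored at the same rate as group games. How often would a 16-game stretch produce as few goals as we saw? The sketch below answers that with a Poisson distribution. The totals are again assumed values consistent with the rates in this post, and treating every game as an independent draw at one common rate is a simplification.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i    # P(X = i) from P(X = i - 1)
        total += term
    return total

# ASSUMED values: group-stage rate of 136 goals in 48 games,
# and 35 goals observed in 16 knockout games.
group_rate = 136 / 48
knockout_games = 16
knockout_goals = 35

# The total over 16 independent Poisson games is itself Poisson
# with mean 16 * rate.
prob_low = poisson_cdf(knockout_goals, group_rate * knockout_games)
print(f"P(35 or fewer goals in 16 games at the group rate) = {prob_low:.3f}")
```

This is a one-sided probability, so it is not directly comparable to the two-sided p-values above, but it shows that a low-scoring 16-game run is not especially rare even when nothing has changed.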

Right, Brazil?


Name: Rob • Friday, July 18, 2014

Hay, wouldn't u need to use a paired test of some kind cause the groups are related?! Knockout group is a subset of the group stage.

Name: Kevin Rudy • Tuesday, July 22, 2014

The knockout stage does include the same teams as the group stage, but different games. For example, the United States played Belgium in the knockout stage, but played Ghana, Portugal, and Germany in the group stage. So because the games were all different, the knockout stage is not a subset of the group stage.

A paired t-test is appropriate when you're measuring the same item under two different conditions. So if the exact same games were played in the two stages, a paired t-test would be appropriate. But because they were different, we can't use it.

Name: Peg Pennington • Thursday, July 24, 2014

Kevin, Thanks for blog. I'm glad you did the follow up because I was sure the Brazil game would push the average up. I guess that's why we look at data.

I'm still on MT-16 and when I tried to get the data I'm not seeing anything. Am I doing something wrong? I'd really like to share this with my class.

Thanks Peg

Name: Carly Barry • Thursday, July 24, 2014

Hi Peg, We followed up with you about accessing this Minitab dataset via the email address you gave when you commented.

Don't hesitate to let us know if you're still having trouble.

Thanks for reading the Minitab Blog!
