What If Major League Baseball Had a 16-Game Season?

Kevin Rudy 15 April, 2016

When it comes to statistical analyses, collecting a large enough sample size is essential to obtaining quality results. If your sample size is too small, confidence intervals may be too wide to be useful, linear models may lack necessary precision, and control charts may get so out of control that they become self-aware and rise up against humankind.


Okay, that last point may have been exaggerated, but you get the idea.

However, sometimes collecting a large sample size is easier said than done. Financial or time constraints often limit the number of observations we can collect. And in the world of sports, there is no better example of this than the NFL.

Football is a violent sport, so the players need a week to rest and recover between games. This time constraint limits the regular season to only 16 games. This is very small compared to the other major American leagues: hockey, basketball, and baseball. The NHL and NBA both play an 82-game season, while MLB plays 162 games!
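To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. It uses the normal-approximation margin of error for a proportion, and it assumes a hypothetical team whose true winning percentage is .500 and whose games are independent of each other. Neither assumption is exactly true for real teams, and the function name is made up for illustration.

```python
import math

def win_pct_margin(games, true_pct=0.5, z=1.96):
    """Approximate 95% margin of error for an observed winning percentage.

    Assumes independent games and a fixed true winning percentage
    (an illustrative simplification, not a claim about real teams).
    """
    return z * math.sqrt(true_pct * (1 - true_pct) / games)

for league, games in [("NFL", 16), ("NBA/NHL", 82), ("MLB", 162)]:
    margin = win_pct_margin(games)
    print(f"{league}: {games} games, winning percentage +/- {margin:.3f}")
```

Under these assumptions, the margin for a 16-game season works out to roughly plus or minus .245, about three times as wide as the margin for a 162-game season.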

But we never consider the sample size when we consider the best and worst teams in the NFL. It's not uncommon to see teams with sub-par records come back and have a great record the following year, or vice versa. We'll often credit/blame coaches and quarterbacks, but did you ever hear a sports analyst just say "Hey, sometimes crazy things can happen over a 16-game sample"? And we're almost at the point in the MLB season where most teams have played 16 games.

That makes me wonder, what would baseball look like if they only played 16 games?

Looking at Major League Baseball as a 16-Game Season

I took the previous 10 seasons and recorded every MLB team's record in their first 16 games. I also looked at their final record to get a good estimate of their "true" winning percentage. The fitted line plot below shows the relationship between a baseball team's winning percentage in their first 16 games and their final winning percentage.

Fitted Line Plot

The relationship isn't completely random, as a higher winning percentage in your first 16 means you're more likely to have a better final winning percentage. But it's not a very strong relationship, as only 20.2% of the variation in a team's final winning percentage is explained by their winning percentage in the first 16 games.
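If you want to get a feel for why the relationship is so weak, a small simulation can help. The sketch below does not use the real MLB records from the plot. It invents 300 team-seasons (30 teams over 10 years), gives each one a "true" winning percentage drawn from a normal distribution, and plays out 162 independent games. The spread of talent (`talent_sd=0.06`) and the function names are assumptions chosen for illustration, so the R-squared it prints will not match the 20.2% above.

```python
import random

def r_squared(xs, ys):
    """R-squared of a simple linear regression (squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def simulate_seasons(n_teams=300, talent_sd=0.06, early=16, full=162, seed=1):
    """Simulate team-seasons; return (first-16 win pct, final win pct) lists.

    talent_sd is an assumed spread of true team quality, not an estimate
    taken from real MLB data.
    """
    rng = random.Random(seed)
    early_pct, final_pct = [], []
    for _ in range(n_teams):
        # Each simulated team gets a fixed "true" winning percentage.
        p = min(max(rng.gauss(0.5, talent_sd), 0.0), 1.0)
        results = [rng.random() < p for _ in range(full)]
        early_pct.append(sum(results[:early]) / early)
        final_pct.append(sum(results) / full)
    return early_pct, final_pct

early_pct, final_pct = simulate_seasons()
print(f"Simulated R-squared: {r_squared(early_pct, final_pct):.3f}")
```

Try changing `early` from 16 to 82 to see how much more a half season tells you about the final record than the first 16 games do.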

Observations toward the bottom right show teams that started off with a very strong record but ended up in the bottom of the league. You can see that the Colorado Rockies have a habit of doing this. But the more interesting teams are in the top left corner. These are teams that started out slow but ended up as one of the best teams in the league. In fact, there were 31 teams that started .500 or worse in their first 16 games and ended up making the playoffs. That's 35% of all playoff teams in the last 10 years! Four teams that stand out are the Rays, Rockies, Rangers, and Phillies. The Rays, Rockies and Rangers were all sub-.500 and in last place in their division after the first 16 games—and all three ended up making it to the World Series that same year. And the Phillies were 8-8 and in third place after 16 games of the 2008 season. That would have put them out of the playoffs. But with a larger sample, they finished first in their division and ended up winning the World Series!

Can a Small Sample be Good?

In the world of sports, a small sample size isn't necessarily a bad thing. Small samples definitely make things entertaining. For example, just compare the first round of the NCAA tournament (single elimination) to the first round of the NBA playoffs (7-game series). The former has upsets galore (Middle Tennessee St over Michigan St anyone?) that would be nearly impossible in a 7-game series. And the variance that can occur in an NFL regular season certainly contributes to it being more popular than the marathon that is the MLB regular season. Larger samples help determine who the better team is, but the unpredictability that we love in sports is greatly helped by smaller samples.

Of course, the quality world varies greatly from sports entertainment. Usually, we want all the observations we can get to improve the reliability of our results. Just make sure that you don't collect such a large sample that statistically significant results have no practical importance in your situation. Luckily, Minitab Statistical Software offers power and sample size analyses to help you determine how much data to collect. You want enough data to ensure you'll get reliable results without spending extra time and money on unnecessary observations.
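As a rough illustration of the "how much data?" question, the sketch below uses the textbook formula for the sample size needed to estimate a proportion within a chosen margin of error, n = z²p(1−p)/E². This is a simplified stand-in written in Python, not the calculation Minitab performs, and the function name is hypothetical.

```python
import math

def sample_size_for_proportion(margin, expected_pct=0.5, z=1.96):
    """Observations needed to estimate a proportion within +/- margin.

    Uses the normal approximation at roughly 95% confidence (z = 1.96).
    expected_pct = 0.5 is the most conservative (largest) choice.
    """
    n = (z ** 2) * expected_pct * (1 - expected_pct) / margin ** 2
    return math.ceil(n)

for margin in (0.10, 0.05):
    n = sample_size_for_proportion(margin)
    print(f"Margin of +/- {margin:.2f}: about {n} observations")
```

Under these assumptions, pinning a proportion down to within .05 takes about 385 observations, while settling for .10 takes only about 97. Halving the margin roughly quadruples the data you need, which is why it pays to decide how much precision you actually need before you start collecting.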

And remember, when that NFL team comes out of nowhere to win their division next year, it could be the coach, it could be the quarterback...or it could just be the sample size!