3 Things Baseball Can Teach Us About Control Charts

Control charts are some of the most useful tools in statistical science. They track process statistics over time and detect when the mean or standard deviation changes from its historical level. The signals that control charts send about special causes can help you zero in on the fastest ways to improve any process, whether you’re making tires, building turbines, or trying to improve patient care.

I’ve mentioned before that I’m a baseball fan. For the past several years, I’ve been noticing articles about the Year of the Pitcher in Major League Baseball (2010, 2011, 2012, 2013, 2014). That repetition suggests a shift to me, and I thought “What a great way to illustrate some neat things that you can do with control charts!”

Here are a few things to remember about control charts that you can illustrate with Major League Baseball data from 1969 to 2013, courtesy of numbers from www.baseball-reference.com.

Use meaningful units in control charts

We'll start in 1969 because that’s when new rules decreased the height of the pitcher’s mound in Major League Baseball parks. However, there have been some other notable changes in the game over the years that mean that we have to be sensible about the data that we plot. For example, if we make an I-MR chart of hits, we see some special causes right away:

The I-MR chart of hits shows changes in 1973 and 1993.

Four points are out of control on the MR chart because of strike-shortened seasons in 1981 and 1994. One technique when you know the reason for an out-of-control point is to exclude those samples from calculating the control limits. That way, the control limits represent expected process variation. But in statistics we like to use as much of our data as possible. If you want to keep the data from those years, an alternative to throwing them out is to plot a different variable. I used Minitab’s Calculator to create a column that contains the number of hits per at bat. A shortened season reduces hits and at bats together, so the ratio stays comparable from year to year.
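To make the arithmetic behind an I-MR chart concrete, here is a minimal Python sketch. It assumes the common textbook approach (moving ranges of length 2 with the constants d2 = 1.128 and D4 = 3.267); Minitab's own settings and estimation options may differ. The function names `hits_per_at_bat` and `imr_limits` are made up for this illustration.

```python
from statistics import mean

# Textbook constants for moving ranges of length 2 (an assumption of this
# sketch, not a statement about Minitab's configuration).
D2 = 1.128   # converts the average moving range into a sigma estimate
D4 = 3.267   # upper-limit multiplier for the MR chart


def hits_per_at_bat(hits, at_bats):
    """Turn season totals into a rate so short seasons stay comparable."""
    return [h / ab for h, ab in zip(hits, at_bats)]


def imr_limits(values):
    """Return center lines and 3-sigma limits for the I and MR charts."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    half_width = 3 * mr_bar / D2
    return {
        "i_center": center,
        "i_lcl": center - half_width,
        "i_ucl": center + half_width,
        "mr_center": mr_bar,
        "mr_ucl": D4 * mr_bar,
    }
```

Passing the output of `hits_per_at_bat` to `imr_limits` gives the limits for the rate chart; passing raw hit totals gives the limits for the count chart, where a shortened season shows up as a large moving range.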

Set the baseline

The I chart shows unusual points in 1972 and a series beginning in 1994.

The control chart above shows the number of hits per at bat. You still see some out-of-control points on the chart, but they no longer correspond to the strike-shortened seasons. The first out-of-control point is 1972. Not coincidentally, the American League instituted the designated hitter in 1973, so a corresponding increase in hits per at bat makes sense.

The next out-of-control signal comes in 1994. The most popular explanation, given that 4 of the next 7 points are out of control, is that this marks the beginning of the steroid era in baseball. The steroid theory holds that, beginning in 1994, use of performance-enhancing drugs reached a tipping point where its effects became statistically visible in the game. Another explanation is that 1993 is when baseball began playing games in the thin air of Colorado, where Mile High Stadium was a hitter-friendly precursor to Coors Field.

In cases like this, you have to decide whether it’s fair to compare all of this variation on one chart. If you know that there has been a change in the rules, then you would expect to see corresponding out-of-control points. In fact, because limits calculated from every season are centered on a blend of the different eras, we might not be getting enough out-of-control points to show the changes precisely.

The same logic applies to any process: typically, you want to calculate control limits from a stable baseline. For example, if you calculate the control limits using the years 1973-1993, then 3 of the 4 years without the designated hitter are out of control and 6 of the 15 years 1994 to 2007 are unusual. The out-of-control points show when the process was different from the baseline years 1973-1993:

Setting the baseline with 1973-1993 shows the pre-designated hitter cutoff and some steroids years.
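The baseline idea can be sketched in a few lines of Python: estimate the limits from a chosen span of years only, then compare every year against them. This is an illustration under assumptions, not Minitab's implementation. It uses the textbook moving-range constant d2 = 1.128, checks only the "beyond 3 sigma" rule, and the function name `baseline_signals` is hypothetical.

```python
from statistics import mean

D2 = 1.128  # textbook constant for moving ranges of length 2 (assumed)


def baseline_signals(years, values, start, end):
    """Flag years outside I-chart limits estimated from start..end only.

    Assumes `years` is sorted and the baseline years are consecutive,
    so that moving ranges compare adjacent seasons.
    """
    base = [v for y, v in zip(years, values) if start <= y <= end]
    if len(base) < 2:
        raise ValueError("baseline needs at least two observations")
    mr_bar = mean([abs(b - a) for a, b in zip(base, base[1:])])
    center = mean(base)
    half_width = 3 * mr_bar / D2
    lcl, ucl = center - half_width, center + half_width
    # Every year is judged against the baseline limits, not just the
    # years that were used to estimate them.
    flagged = [y for y, v in zip(years, values) if v < lcl or v > ucl]
    return lcl, ucl, flagged
```

Calling it with `start=1973, end=1993` and then with `start=1994, end=2013` on the same hits-per-at-bat column corresponds to the two baseline choices discussed here.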

If you calculate the control limits using the years 1994-2013, then the MR chart shows precipitous changes in 1973 and 1993:

Using the steroid years to set the baseline shows that years before and after looked relatively unusual.

The easiest way to create control charts

Of course, when you have different things to compare, you might want to look for points that are unusual relative to the process that they should fit. For example, you would want different control limits before and after you improve a process. Minitab’s Assistant Menu makes this easy with Before/After control charts. With the baseball statistics, a before/after control chart lets us look for points that are unusual within an era. Let’s set the dividing line at 1993 and just use the post-designated hitter years:

Setting different control limits for different stages lets you see when the steroid era ends.

In the first era, there are no out-of-control points. In the second era, the years 2010-2013 are unusual, marking the return of the pitching domination that so many people have noticed. The Assistant Menu also performs a statistical test to verify that the mean batting average is significantly greater in the second era than in the first.
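The staged limits behind a before/after chart can be approximated as follows. This is only a sketch under assumptions: it does not reproduce the Assistant's report or its statistical tests, it uses the textbook constant d2 = 1.128, it treats the split year as the last year of the first stage, and the name `staged_limits` is hypothetical.

```python
from statistics import mean

D2 = 1.128  # textbook constant for moving ranges of length 2 (assumed)


def staged_limits(years, values, split_year):
    """Compute separate I-chart limits before and after split_year.

    Years up to and including split_year form the "before" stage;
    that boundary choice is an assumption of this sketch.
    """
    stages = {"before": [], "after": []}
    for y, v in zip(years, values):
        stages["before" if y <= split_year else "after"].append((y, v))

    report = {}
    for name, points in stages.items():
        vals = [v for _, v in points]
        if len(vals) < 2:
            raise ValueError(f"stage '{name}' needs at least two observations")
        mr_bar = mean([abs(b - a) for a, b in zip(vals, vals[1:])])
        center = mean(vals)
        half_width = 3 * mr_bar / D2
        report[name] = {
            "center": center,
            "lcl": center - half_width,
            "ucl": center + half_width,
            # A point is unusual only relative to its own stage's limits.
            "signals": [y for y, v in points if abs(v - center) > half_width],
        }
    return report
```

Because each stage gets its own center line and limits, a year is flagged only when it is unusual for its own era, which is the comparison the before/after chart is designed to make.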


Control charts are a powerful tool for understanding your processes. Minitab makes control charting easy, whether you want to compare different eras in baseball or different phases in your process. And the Assistant Menu makes comparisons even easier by providing all of the information you need in a single report, ready for you to export to a presentation.

Ready for more? Check out our webcast on using control charts!


Photograph of Donald "Zack" Greinke by Keith Allison, used under Creative Commons Attribution-Share Alike 2.0 Generic license.
