Lessons from a Statistical Analysis Gone Wrong, Part 3
If you've read the first two parts of this tale, you know it started when I published a post that involved transforming data for capability analysis. When an astute reader asked why Minitab didn't seem to transform the data outside of the capability analysis, it revealed an oversight that invalidated the original analysis.
I removed the errant post. But to my surprise, the reader who helped me discover my error, John Borneman, continued looking at the original data. He explained to me, "I do have a day job, but I'm a data geek. Plus, doing this type of analysis ultimately helps me analyze data found in my real work!"
I want to share what he did, because it's a great example of how you can take an analysis that doesn't work, ask a few more questions, and end up with an analysis that does work.
Another Look at the Original Analysis
At root, the original post asked, "What is the probability of any horse beating Secretariat's record?" A capability study with Secretariat's winning time as the lower spec limit would provide an estimate of that probability, but as the probability plot below indicates, the data was not normal:
So we ran Stat > Quality Tools > Capability Analysis > Normal and selected the option to apply the Johnson transformation before calculating capability. Minitab returned a capability analysis, but the resulting graph doesn't explicitly note that the Johnson transformation was not used.
Note the lack of information about the transformation in the preceding graph. If you don't see details about the transformation, it means the transformation failed. But I failed to notice what wasn't there. I also neglected to check the Session Window, which does tell you the transformation wasn't applied:
Applying the Transformation by Itself
When you select the Johnson transformation as part of the capability analysis in Minitab, the transformation is just a supporting player to the headliner, capability analysis. The transformation doesn't get a lot of attention.
But using Stat > Quality Tools > Johnson Transformation places the spotlight exclusively on the transformation, and Minitab highlights whether the transformation succeeds—or, in this case, fails.
When I looked at this data, I saw that it wasn't normally distributed. But Borneman noticed something else: the data had an ordinal pattern—the race times fell into buckets that were one full second apart.
That means the data lacked discrimination: it was not very precise.
While ordinal data can be used in many analyses, poor discrimination often causes problems when trying to transform data or fit it to a common distribution. Capability studies, where the data at the tails is important, really shouldn't be performed with ordinal data—especially when there is low discrimination.
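To make the discrimination point concrete, here is a minimal Python sketch. The times are synthetic (drawn from an arbitrary normal distribution, not the Derby records), and `count_distinct` is a hypothetical helper. It only illustrates how rounding to whole seconds collapses many distinct measurements into a handful of buckets.

```python
import random

def count_distinct(values):
    """Number of distinct measurement values in a sample."""
    return len(set(values))

# Synthetic winning times in seconds (NOT the real Derby data):
# an arbitrary normal distribution, recorded to 1/100 of a second.
rng = random.Random(42)
precise_times = [round(rng.gauss(122.0, 1.5), 2) for _ in range(100)]

# The same times rounded to the nearest full second.
rounded_times = [round(t) for t in precise_times]

print("distinct values, precise:", count_distinct(precise_times))
print("distinct values, rounded:", count_distinct(rounded_times))
```

The rounded sample keeps only a few distinct values, which is the kind of bucketed pattern Borneman spotted in the original data.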
What Can We Do If the Data Is Truly Ordinal?
Other techniques are still available for ordinal data, particularly graphical tools such as box plots and time series plots. And if you wish to compare two groups of data and the data is ordinal with more than 10 categories, you can use ANOVA, a t-test, or even nonparametric tests such as Mood's median test.
Playing out the "what if" scenario that this data was ordinal, Borneman used this approach to see if there was a difference between the Kentucky Derby winning times in races run between 1875 and 1895 and those between 1896 and 2015.
"The race was 1.5 miles until 1896, when it was shortened to 1.25 miles," Borneman says when looking at the results. "So obviously we'd expect to see a difference, but it's a good way to illustrate the point."
Ordinal data is valuable, but given its limited discrimination, it can only take you so far.
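For readers who want to see what a two-group comparison like this looks like outside Minitab, here is a rough Python sketch of a two-sample Mood's median test using only the standard library. The function `moods_median_test` is a hypothetical helper, it applies no continuity correction, and the two samples are synthetic stand-ins with arbitrary centers, not the actual Derby times.

```python
import math
import random
from statistics import median

def moods_median_test(sample_a, sample_b):
    """Two-sample Mood's median test, without continuity correction.

    Returns (chi_square, p_value). Values equal to the grand median are
    counted as "not above". Assumes at least one value lies above the
    grand median, so no expected count is zero.
    """
    grand_median = median(list(sample_a) + list(sample_b))
    sizes = [len(sample_a), len(sample_b)]
    # 2x2 table: rows are above / not above the grand median.
    above = [sum(x > grand_median for x in s) for s in (sample_a, sample_b)]
    not_above = [n - a for n, a in zip(sizes, above)]
    total = sum(sizes)

    chi_square = 0.0
    for row in (above, not_above):
        row_total = sum(row)
        for count, n in zip(row, sizes):
            expected = row_total * n / total
            chi_square += (count - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Synthetic samples (NOT the real data): 21 "early" and 120 "later" times,
# with arbitrary, well-separated centers to mimic two race distances.
rng = random.Random(0)
early_era = [rng.gauss(160.0, 3.0) for _ in range(21)]
later_era = [rng.gauss(123.0, 2.0) for _ in range(120)]

chi_square, p_value = moods_median_test(early_era, later_era)
print(f"chi-square = {chi_square:.2f}, p-value = {p_value:.2g}")
```

A small p-value here says the two groups do not share a median, which is exactly what you would expect when the race distance changed between them.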
What Can We Do If the Data Is Not Truly Ordinal?
Borneman soon realized that the original data must have been rounded, and more precise data might not be ordinal. "Races clock the horse's speed more accurately than to the nearest second," he says. "In fact, I found that the Derby has clocked times to the nearest 1/100 of a second since 2001. The race was timed to the 1/4 second from 1875 to 1905, and to the 1/5 second from 1906 to 2000."
He found Kentucky Derby winning race times with more precise measurements, and not the rounded times:
Then he compared the rounded and non-rounded data. "The dot plot really shows the differences in discrimination between these two data sets," he says.
Does the New Data Fit the Normal Distribution?
Borneman wondered if the original analysis could be revisited with this new, more precise data. But a normality test showed the new data also was not normally distributed, and that it didn't fit any common distribution.
However, running the Johnson Transformation on this data worked!
That meant the more detailed data could be used to perform the capability analysis that failed with the original, rounded data.
An Even More Dramatic Result
Running the capability study using the Johnson transformation and using Secretariat's time as the lower spec limit, Borneman found that the probability of another horse getting a time less than 119.4 seconds is 0.32%.
This is quite a difference from the original analysis, which found about a 5% chance of another horse beating Secretariat's time. In fact, it adds even more weight to the original post's argument that Secretariat was unique among Triple Crown winners.
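If you're curious how a tail probability comes out of a transformed capability study, here is a rough Python sketch of the idea. It assumes a Johnson SU form, z = gamma + delta * asinh((x - xi) / lambda), where z is treated as standard normal. Minitab chooses among several Johnson families and estimates the parameters itself, so the family and the parameter values below are hypothetical placeholders, not the fit from this analysis, and the printed number will not reproduce the probability reported above.

```python
import math
from statistics import NormalDist

def johnson_su_z(x, gamma, delta, xi, lam):
    """Map a raw value x to a z-score with a Johnson SU transformation."""
    return gamma + delta * math.asinh((x - xi) / lam)

def prob_below(lower_spec, gamma, delta, xi, lam):
    """Estimated probability of a value below the lower spec limit,
    assuming the transformed data follows a standard normal distribution."""
    z = johnson_su_z(lower_spec, gamma, delta, xi, lam)
    return NormalDist().cdf(z)

# Hypothetical parameters, chosen only for illustration (NOT fitted to
# the Derby data and NOT the values Minitab estimated).
params = dict(gamma=-1.0, delta=1.5, xi=121.0, lam=1.0)

# Secretariat's winning time in seconds, used as the lower spec limit.
secretariat_time = 119.4
p = prob_below(secretariat_time, **params)
print(f"estimated P(time < {secretariat_time}) = {p:.4%}")
```

The point of the sketch is the mechanism: once a transformation makes the data look normal, the spec limit is pushed through the same transformation and the tail area is read off the normal curve.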
Now, it should be noted that using a capability study to assess the chance of a future horse beating Secretariat's time is a bit, well, unorthodox. It may make for a fun blog post, but it does not account for the many factors that change from race to race.
"And as my wife—a horse rider and fanatic—pointed out, we also don't know what type of race each jockey and trainer ran," Borneman told me. "Some trainers have the goal to win the race, and not necessarily beat the fastest time."
Borneman's right about this being an off-label use of capability analysis. "On the other hand," he notes, "Secretariat's time is definitely impressive."
What Have I Learned from All This?
In the end, making this mistake reinforced several old lessons, and even taught me some new ones. So what am I taking away from all of this?
- Graphs are great, but you can't assume they tell the whole story. Check all of the feedback and results available.
- Know what the output should include. This is especially important if it's been a while since you performed a particular analysis. A quick peek at Minitab Help or Quality Trainer is all it takes.
- Try performing the analyses in different ways. If I had performed this capability analysis using the Assistant in addition to using the Stat menu, for example, I would have discovered the problem earlier. And it would only have taken a few seconds.
And here's the biggest insight I'm taking from this experience:
- When your analysis fails, KEEP ASKING QUESTIONS. The original analysis failed because the data could not be transformed. But by digging just a little deeper, Borneman realized that rounded data was inhibiting the successful transformation. And by asking variations on "what if," he demonstrated that you can still get good insights—even when your data won't behave the way you'd hoped.
I'm glad to learn these lessons, even at the cost of some embarrassment over my initial mistake. I hope sharing my experience will help you avoid a similar situation.