I enjoy putting together simple data sets that you can use to build your confidence in statistics. I tend to like fairly old things: Shakespeare (1564-1616), Poe (1809-1849), and gummi bears (invented 1922). But I have some modern interests too. One of those, which first appeared in about 2009, is Minecraft.
If you like Minecraft, then here’s a data set that you can use to practice a few things in Minitab Statistical Software. One of the nicest things about Minitab is that even though this spreadsheet is saved in Google Docs, you can copy and paste the data directly into Minitab.
Change Data with the Time Series Menu
The data set contains cumulative values. For example, the number of buys goes from 533,451 on 11/2/2010 to 547,544 on 11/4/2010. Because each row is a running total, the data set doesn't directly show the average number of people who buy Minecraft per day, which is a neat number to check out.
Fortunately, Minitab Statistical Software has an easy way to get the daily data from the cumulative rows. (If you're not already using Minitab, you can download a free 30-day trial.) Try this:
- Choose Stat > Time Series > Differences.
- In Series, enter Date.
- In Store differences in, enter ‘Days between rows’.
- Click OK.
- Press Ctrl+E to reopen the Differences dialog box:
- In Series, enter buys.
- In Store differences in, enter ‘Buys per row’.
- Click OK.
- Choose Calc > Calculator.
- In Store result in variable, enter ‘Buys per day’.
- In Expression, enter ‘Buys per row’/‘Days between rows’.
- Click OK.
Without writing any complicated formulas, you now have a column that shows the average number of times per day that people bought Minecraft between each pair of consecutive dates in the data set.
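If you'd like to check the arithmetic outside Minitab, the Differences and Calculator steps above can be sketched in a few lines of Python. This is only an illustration, not what Minitab does internally: the function name `buys_per_day` is my own, and the sketch uses just the two rows quoted earlier rather than the full data set. It also drops the first row, where Minitab would store a missing value.

```python
from datetime import date

# The two rows quoted in the text; the full data set has many more.
rows = [
    (date(2010, 11, 2), 533_451),
    (date(2010, 11, 4), 547_544),
]

def buys_per_day(rows):
    """Return (date, average buys per day since the previous row) pairs.

    Hypothetical helper for illustration. The first row has no
    predecessor, so it produces no result.
    """
    result = []
    for (prev_date, prev_buys), (cur_date, cur_buys) in zip(rows, rows[1:]):
        days_between = (cur_date - prev_date).days  # like 'Days between rows'
        buys_between = cur_buys - prev_buys         # like 'Buys per row'
        result.append((cur_date, buys_between / days_between))
    return result

per_day = buys_per_day(rows)
for day, average in per_day:
    print(day.isoformat(), average)  # prints: 2010-11-04 7046.5
```

So between 11/2/2010 and 11/4/2010, 14,093 copies were bought over 2 days, or about 7,047 per day.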
Use Graphs to Show What Was Hidden
First, let’s look at scatterplots of both the original cumulative data and the new per-day data for the number of people who bought Minecraft. These are the steps I used:
- Choose Graph > Scatterplot.
- Choose Simple. Click OK.
- In row 1, enter buys for the Y variable and Date for the X variable.
- In row 2, enter ‘Buys per day’ for the Y variable and Date for the X variable.
- Click OK.
The first scatterplot of the cumulative data shows a fairly straight line, indicating that the number of people who buy Minecraft per day stays about the same. Nothing looks unusual.
The second scatterplot of the data per day shows two data points that clearly stand out from the others.
A little research quickly reveals why these two data points are unusual. You might guess that the first, December 20, 2010, is related to eager Christmas shoppers. However, it turns out that December 20 was the day that Minecraft reached beta status. On December 21, the price increased and the company no longer promised that future updates would be free. So people had a heavy incentive to buy by December 20.
The second date, 8/14/2011, was the Sunday of Minecraft Wedding Weekend. At first, this sounds like an event where everyone decorated their Minecraft worlds in white and had cake, but actually, this was the weekend that Mojang owner Markus Persson got married. To celebrate, if you bought Minecraft that weekend, you got a code for a free copy.
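The scatterplot makes the two spikes easy to spot by eye. If you wanted to flag points like these in code instead, one common approach (my own choice here, not something Minitab's scatterplot does) is to measure how far each value sits from the median, in units of the median absolute deviation. The sketch below uses made-up numbers, not the Minecraft data, and the function name `flag_unusual` and the threshold of 5 are assumptions for illustration.

```python
from statistics import median

def flag_unusual(values, threshold=5.0):
    """Return the indexes of values far from the median.

    Distance is measured in median absolute deviations (MADs).
    Hypothetical helper; the threshold is an arbitrary choice.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if abs(v - center) / mad > threshold]

# Made-up numbers for illustration only; NOT the Minecraft data.
toy_buys_per_day = [7000, 7200, 6900, 7100, 25000, 7050, 6950, 21000, 7150]
print(flag_unusual(toy_buys_per_day))  # prints: [4, 7]
```

A check like this only tells you which points are unusual. As with the graph, you still have to do the research to find out why.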
What You Learned
So you’ve had a chance to practice with Minitab, building your confidence so that you’ll be ready to make solid decisions with your own data. You saw how to change cumulative data into per-day data. You also got to spot unusual observations in a scatterplot.
Like seeing the value of finding unusual data? Examine the graph Newcrest Mining Ltd. uses to show when the time comes to replace a fuel injector, part of a strategy expected to save $835,000 in a year.