What to Do When Your Data's a Mess, part 3
Everyone who analyzes data regularly has had the experience of getting a worksheet that just isn't ready to use. In previous posts, I wrote about tools you can use to clean up your data and eliminate clutter, and about tools for reorganizing it.
In this post, I'm going to highlight tools that help you get the most out of messy data by altering its characteristics.
Know Your Options
Many problems with data don't become obvious until you begin to analyze it. A shortcut or abbreviation that seemed to make sense while the data was being collected, for instance, might turn out to be a time-waster in the end. What if abbreviated values in the data set only make sense to the person who collected it? Or a column of numeric data accidentally gets coded as text? You can solve those problems quickly with statistical software packages.
Change the Type of Data You Have
Here's an instance where a data entry error resulted in a column of numbers being incorrectly classified as text data. This will severely limit the types of analysis that can be performed using the data.
To fix this, select Data > Change Data Type and use the dialog box to choose the column you want to change.
One click later, and the errant text data has been converted to the desired numeric format.
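If you ever need to do the same kind of conversion outside Minitab, the idea can be sketched in a few lines of Python. This is only an illustration, not what Minitab does internally: the `to_numeric` function and the sample values are made up for this example, and it assumes that entries which can't be read as numbers should become missing values.

```python
def to_numeric(column):
    """Convert a column of text values to floats.

    Assumption for this sketch: entries that cannot be parsed as
    numbers are treated as missing and stored as None.
    """
    converted = []
    for value in column:
        try:
            converted.append(float(value.strip()))
        except ValueError:
            converted.append(None)
    return converted


# Hypothetical column that was accidentally stored as text
text_column = ["12.5", " 7", "3.25", "n/a"]
numeric_column = to_numeric(text_column)
print(numeric_column)  # [12.5, 7.0, 3.25, None]
```

Keeping the unparseable entries as missing values, rather than dropping them, keeps the column the same length as the rest of the worksheet.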
Make Data More Meaningful by Coding It
When this company collected data on the performance of its different functions across all its locations, it used numbers to represent both locations and units.
That may have been a convenient way to record the data, but unless you've memorized what each set of numbers stands for, interpreting the results of your analysis will be a confusing chore. You can make the results easier to understand and communicate by coding the data.
In this case, we select Data > Code > Numeric to Text...
And we complete the dialog box as follows, telling the software to replace the numbers with more meaningful information, like the town each facility is located in.
Now you have data columns that can be understood by anyone. When you create graphs and figures, they will be clearly labeled.
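The same numeric-to-text coding amounts to a lookup table, which can be sketched in Python as follows. The town names, the codes, and the `recode` helper are all hypothetical and chosen only for illustration. In this sketch, any code missing from the lookup table is left unchanged so it is easy to spot later.

```python
# Hypothetical lookup table: location code -> town name
location_names = {1: "Springfield", 2: "Riverton", 3: "Lakeside"}


def recode(column, mapping):
    """Replace each value with its label; leave unmapped values as they are."""
    return [mapping.get(value, value) for value in column]


# Hypothetical column of location codes; 4 has no label in the table
locations = [1, 3, 2, 1, 4]
coded = recode(locations, location_names)
print(coded)  # ['Springfield', 'Lakeside', 'Riverton', 'Springfield', 4]
```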
Got the Time?
Dates and times can be very important in looking at performance data and other indicators that might have a cyclical or time-sensitive effect. But the way the date is recorded in your data sheet might not be exactly what you need.
For example, if you want to see whether the day of the week has an influence on the activities in certain divisions of your company, a list of dates in MM/DD/YYYY format won't be very helpful.
You can use Data > Date/Time > Extract to Text... to identify the day of the week for each date.
Now you have a column that lists the day of the week, and you can easily use it in your analysis.
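For comparison, here is a minimal Python sketch of the same day-of-week extraction, assuming the dates are stored as text in MM/DD/YYYY format. The `day_of_week` function and the sample dates are invented for this example.

```python
from datetime import datetime

# Index matches datetime.weekday(), where Monday is 0 and Sunday is 6
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(date_text):
    """Return the weekday name for a date written as MM/DD/YYYY."""
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return DAY_NAMES[parsed.weekday()]


# Hypothetical column of dates
dates = ["01/15/2024", "07/04/2023", "12/25/2022"]
days = [day_of_week(d) for d in dates]
print(days)  # ['Monday', 'Tuesday', 'Sunday']
```

Using a fixed list of names rather than a locale-dependent format code keeps the output in English regardless of the system's regional settings.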
Manipulating for Meaning
These tools are commonly seen as a way to correct data-entry errors, but as we've seen, you can use them to make your data sets more meaningful and easier to work with.
There are many other tools available in Minitab's Data menu, including an array of options for arranging, combining, dividing, fine-tuning, rounding, and otherwise massaging your data to make it easier to use. Next time you've got a column of data that isn't quite what you need, try using the Data menu to get it into shape.