Getting the Most Out of Your Text Data, Part 1
With Minitab, it’s easy to create graphs and manage numeric, date/time and text data. But Minitab’s enhanced data manipulation features make it easier to work with text data, too.
This is the first of three posts in which I'm going to focus on various tools in Minitab that are useful when working with text data, including the Calculator, the Data menu, and the Editor menu.
Using the Calculator
You might be surprised to hear that Minitab’s Calculator is just as useful with text as it is with numbers. Here are just a few things you can use it for:
ISOLATE CRITICAL INFORMATION
Sometimes it’s helpful to extract individual words or characters from text data for use in isolation. For example, if we have a column of product ID’s and need just part of the letters or numbers that are part of the text string in a column, the LEFT, MID and RIGHT functions in the calculator can be very useful.
The LEFT function in the calculator will extract values from a text string beginning with the leftmost and will stop at the number of characters we specify. In the example above, we could complete the Calc > Calculator dialog box to pull out the two characters on the left side (AB or BC) by completing the dialog box as shown in the example below:
The RIGHT function works in exactly the same way as the LEFT function, except that RIGHT extracts characters beginning with the rightmost. Here we’re pulling out the 4 characters beginning from the right side:
Similarly, we can use the MID function in the calculator to extract the number of characters we want from the middle of a text string. With the MID function, we enter the text column plus the position of the first character we want to extract, then the number of characters we want to extract. In this example we want to extract the 2 characters between the hyphens. In that case the first character we want is the fourth, so we’d complete the Calculator dialog box like this:
COMBINE DATA FOR ADDED MEANING
In other cases, the whole can be greater than the sum of its parts. For example, if values for Month, Day and Year are stored in separate columns, we may want to combine these into a single column:
The month, day and year of each observation was originally recorded in separate columns, which complicates graphing. Fortunately, the calculator can be used to combine the three columns into a single column. To do that, we can use the CONCATENATE function:
The empty space between the double quotes will add a space between the Month and Day, and the comma plus the empty space will add a comma after the Day and a space before the year:
REPLACE INCORRECT PORTIONS OF TEXT DATA
A consistent recording error doesn’t have to result in time-consuming hand-corrections. In this example an operator who handles product returns has noted the incorrect year portion of the catalog code that is used to reference the item:
The calculator’s SUBSTITUTE function can be used to replace the incorrect Spring 13 with Spring 14. The calculator will find the text and replace it with the new text that we specify:
These are just a few of the useful functions included in Minitab’s Calculator. To see a complete list of Minitab’s calculator functions with explanations and examples of how to use each, open Minitab and go to Help > Help.
If you’re already using the calculator in Minitab, the easiest way to access the same information is to click the Help button in the lower-left corner of the calculator
In my next post, we’ll explore some of the text data manipulation features that Minitab offers in the Data menu.