# Ghostbusting Spaces in Minitab

It’s been almost 5 years since I used a quotation from *Ghostbusters *to introduce one of my early blog posts. But as we’re getting a few bits of entertainment news about the next installment in the *Ghostbusters *franchise, I thought it might be a good time to talk about busting ghosts in Minitab.

In the Minitab sense, ghosts are spaces that are in your data that you can’t see. The busting action is slightly different for text data and numeric data.

## Remove Spaces from Text Data

Let’s look at text data first. Let’s say that you have data that look like this:

It *looks *like the same value repeated over and over again. It *looks *like there’s nothing there but the word “Ghostbusters.”

This is the moment where you get your PKE meter out. Or rather, this is where you tally your data to see what's going on. That moment’s going to be like this:

- Choose
**Stat > Tables > Tally Individual Variables**. - In
**Variables**, enter*Title*. Click**OK**.

The count of the single word should be the same as the number of rows in the data. It’s not. There are ghosts there that you can’t see.

**Tally for Discrete Variables: Title **

Title Count

Ghostbusters 5

Ghostbusters 6

Ghostbusters 7

Ghostbusters 5

Ghostbusters 8

Ghostbusters 9

Ghostbusters 5

Ghostbusters 7

Ghostbusters 3

Ghostbusters 4

Ghostbusters 11

Ghostbusters 8

Ghostbusters 8

Ghostbusters 7

Ghostbusters 5

Ghostbusters 2

N= 100

Tally, standing in for your PKE meter, shows a stairstep pattern because of the extra spaces at the end of the word. Extra spaces that you can't see in the data window unless you highlight a cell.

One way to fix the problem would be to go through the data window clicking on the individual cells to find the ones with extra spaces to delete. Fortunately, Minitab gives you a better way to make all of these values the same. You can use the calculator function TRIM:

- Choose
**Calc > Calculator**. - In
**Store Result in Variable**, enter*Title2*. - In
**Expression**, enter*Trim(Title)*. Click**OK**.

If you tally the new values, you get the result you expect when all of the values are the same:

**Tally for Discrete Variables: Title2 **

Title2 Count

Ghostbusters 100

N= 100

All those ghosts are in the trap, not in your data.

## Remove Spaces from Numeric Data

If you wanted, you could trim a series of numbers to eliminate spaces, but a bigger problem remains unsolved. For example, if you paste some numbers formatted as text from Excel, you might get a column like this:

These numbers *look *like numbers. It doesn’t *look *like there are extra spaces that you can’t see, but the spaces are there. We can see them if we tally them:

**Tally for Discrete Variables: Numbers**

Numbers Count

9 55

9 8

9 11

9 10

9 8

9 8

N= 100

But the column is also formatted as text. That’s what the “–T” next to the C9 means in the worksheet. If we want to analyze the numbers, find the mean or create a scatterplot, then we need the numbers to be formatted as numbers. If we use the TRIM function, the numbers will still be text. For this, we’re going to have to use something a little bit stronger:

No, not crossing the streams. (Egon said that crossing the streams was *bad*.) Instead, we change the column format.

- Put the cursor in the column of numbers.
- Right-click and select
**Format Column**. - In Choose type, select
**Automatic numeric**. - Click
**OK**.

**Tally for Discrete Variables: Numbers**

Numbers Count

9 100

N= 100

In one step, you’ve converted the format of the column and eliminated the extra spaces. The ghosts are all gone.

## Wrap-up

If you have to work with data, especially data that someone types, then you’re going to have to deal with messy data. The TRIM function and the ability to change column formats are two tools that Minitab gives you to get the ghosts out of your data.

Have other problems? Minitab has lots of other tools to simplify the cleaning that you’ll need to do for your data. If you’re ready for a few more tricks, check out 3 Ways to Clean Up Data So You Can Promote Public Dialog and 3 Features to Make You Glad You're You When You Have to Clean Data in Minitab.

Ghostbusters *images: TM & © 2014 Columbia Pictures Industries Inc. *