Exploring Healthcare Data, Part 2

Meredith Griffith | 03 May, 2016

Topics: Healthcare

In the first part of this series, we looked at a case study in which hospital staff used ATP swab tests to check 8 surfaces for bacteria in 10 different hospital rooms across 5 departments. ATP measurements below 400 units pass the swab test, while measurements of 400 units or more fail and require further investigation.

I offered two tips on exploring and visualizing data using graphs, brushing, and conditional formatting.

  1. Evaluate the shape of your data.
  2. Identify and investigate outliers.

By performing these preliminary explorations on the swab test data, we discovered that the mean ATP measurement would not be effective for testing whether surfaces showed statistically significant differences in contamination levels. This was due to the data being highly skewed by extreme outliers.

We then identified where these unusually high-ATP measurements were discovered in the hospital. These findings provide valuable information for appropriately focusing process improvement efforts on particular hospital rooms, departments, and surfaces within those rooms.

Now that we've seen how much some simple exploration and visualization tools can reveal, let's run through three more tools that will help you explore your own healthcare data in order to draw actionable insights.

If you’d like to follow along and haven’t already downloaded the data from the first post, you can download it and explore it yourself. If you don’t yet have Minitab 17, you can download the free, 30-day trial.

Tip #3: Manipulate the data

The swab test data that the hospital staff collected and recorded are unstacked: the response measurements are spread across multiple columns rather than stacked together in one column. To do additional data visualization and a more formal analysis, you need to rearrange the data. We can accomplish this by stacking rows.

The ATP Stacked.MTW worksheet in the downloadable Minitab project file above already has the data reshaped for you. But you can manipulate the data on your own using the ATP Unstacked.MTW worksheet. Just navigate to Data > Stack > Rows, and complete the dialog as shown:

[Image: health care data, Stack Rows dialog to prepare for analysis]

Stacking all rows of your data, and storing the associated column subscripts (or column names) in a separate column, produces three columns: one containing all ATP measurements, one containing the Surface category for each measurement, and one containing the Room Number.
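If you work outside Minitab, the same reshaping can be sketched in Python with pandas. This is only an illustration under stated assumptions: the column layout is inferred from the description above, and the room numbers and ATP values are invented rather than taken from the hospital worksheet.

```python
import pandas as pd

# Hypothetical unstacked layout: one row per room, one column per surface.
# The ATP values are made up for illustration only.
unstacked = pd.DataFrame({
    "Room": ["101", "102", "401"],
    "Bed Rails": [120, 95, 830],
    "Phone": [60, 410, 1500],
    "Call Button": [75, None, 390],  # None marks a missing measurement
})

# Stack the surface columns into a single ATP column. The former column
# names go into Surface, and Room is repeated on every stacked row.
stacked = unstacked.melt(id_vars="Room", var_name="Surface", value_name="ATP")

print(stacked)
```

The row order here differs from Minitab's (pandas stacks one surface column at a time), but each row still pairs one ATP measurement with its room and surface.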

With stacked data, you are properly set up to perform formal analyses in Minitab—this is an important step as you work with your data, as most Minitab analyses require columns of stacked data. We won’t tackle a formal analysis here, but rest assured that you are set up to do so!

Tip #4: Extract information from your original data set

Once your data are stacked, you can use the functions in Calc > Calculator and Data > Recode to create new variables from information already contained in your original data, and then explore and analyze those variables.

For instance, we know the first character of each room number denotes the department. You can use the ‘left’ function in Calc > Calculator to extract the left-most character from the Room column, and store the result in a new column labeled Department. You can do this by filling out the Calculator dialog as shown:

[Image: manipulating health care data, Calculator dialog]
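For comparison, here is a rough pandas counterpart to taking the left-most character. The room numbers below are hypothetical; the only thing carried over from the case study is the convention that the first character identifies the department.

```python
import pandas as pd

# Hypothetical room numbers stored as text.
stacked = pd.DataFrame({"Room": ["101", "102", "401", "502"]})

# Keep the left-most character of Room as the Department label.
# astype(str) guards against room numbers that were read in as integers.
stacked["Department"] = stacked["Room"].astype(str).str[:1]

print(stacked)
```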

You also know that ATP measurements below 400 ‘pass’ the ATP swab test. Recoding ranges of ATP values to text to indicate which values ‘Pass’ and which values ‘Fail’ can be useful when visualizing the data. You can do this by filling out the Data > Recode > To Text dialog as shown:

[Image: health care data, Recode dialog box]
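The pass/fail rule can also be written as a short pandas sketch. The ATP values are invented; the 400-unit cutoff comes from the case study, with values below 400 passing and values of 400 or more failing. Missing measurements are left missing.

```python
import pandas as pd

# Made-up ATP readings, including the boundary value and a missing reading.
atp = pd.Series([120.0, 399.0, 400.0, 1500.0, None], name="ATP")

# right=False makes the bins [0, 400) and [400, inf),
# so exactly 400 falls in the Fail bin.
recoded = pd.cut(
    atp,
    bins=[0, 400, float("inf")],
    right=False,
    labels=["Pass", "Fail"],
)

print(pd.DataFrame({"ATP": atp, "Recoded ATP": recoded}))
```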

Finally, you can use these new variables to create a stacked bar chart that shows, for each department, the counts of measurements that passed, failed, or were missing. Using the ATP Stacked.MTW worksheet, navigate to Graph > Bar Chart > Stack. Verify that the Bars represent drop-down shows the default selection, Counts of unique values. Click OK. Select Department and Recoded ATP as Categorical variables, and click OK.

Minitab produces the following graph:

[Image: bar chart of health care ATP swab test data]

The bar chart reveals that:

  • Department 4 has the highest count of ATP measurements that failed the swab test.
  • Based on that count, the sanitation team should consider focusing its initial efforts on department 4 as it continues to investigate problems with room-cleaning procedures.
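The counts behind a stacked bar chart like this one are a cross-tabulation of Department against the recoded result. A minimal pandas sketch, using invented rows rather than the hospital data, might look like this:

```python
import pandas as pd

# Made-up stacked rows; None marks a measurement that was not recorded.
stacked = pd.DataFrame({
    "Department": ["1", "1", "1", "4", "4", "4"],
    "Recoded ATP": ["Pass", "Pass", "Fail", "Fail", "Fail", None],
})

# Label missing results explicitly so they get their own column of counts,
# then count rows for every Department / result combination.
counts = pd.crosstab(
    stacked["Department"],
    stacked["Recoded ATP"].fillna("Missing"),
)

print(counts)
```

If matplotlib is installed, `counts.plot(kind="bar", stacked=True)` draws the table as a stacked bar chart.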

Tip #5: Obtain important statistics that describe your data

So far we’ve arranged the data for more formal analyses, identified which department has the most contaminated surfaces, and compared how many measurements in each department passed or failed the ATP swab test. Next, we can display descriptive statistics to see how mean and median bacteria levels vary across surfaces and across departments.

Using the ATP Stacked.MTW worksheet, navigate to Stat > Basic Statistics > Display Descriptive Statistics. Enter ATP as the Variable and Department as the By variable, and click OK. Press Ctrl + E to reopen the Display Descriptive Statistics dialog, and replace Department with Surface as the By variable. Click OK. The following output appears in Minitab’s Session Window.

[Image: health care data descriptive statistics]

[Image: health care data swab tests descriptive statistics]

The descriptive statistics reveal helpful information:

  • These statistics allow for easy comparison of mean and median ATP measurements as well as the variation of ATP measurements, either by department or by surface.
  • Notice that mean ATP measurements are much higher than median ATP measurements in both sets of descriptive statistics. This is because the data are right-skewed: a few very high readings pull the mean upward while the median barely moves. Analyses that assume normally distributed data, such as t-tests to compare means, might not be the best tools for formally analyzing these data. Comparing medians might offer more insight.
  • Both sets of descriptive statistics highlight which departments and surfaces to focus on for investigation and process improvement efforts. For instance, department 4 has the highest median ATP presence, while Bed Rails, Phone, and Call Button—the touch points closest to a sick patient in a hospital bed—appear to be the most problematic surfaces to sanitize. Process improvement efforts can begin with this information.
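To see the same mean-versus-median pattern outside Minitab, you could compute grouped statistics with pandas. The sketch below uses invented ATP values, chosen so that one large reading per department skews the data to the right. It shows the kind of summary described above and does not reproduce the hospital results.

```python
import pandas as pd

# Made-up stacked data: each department has one unusually high reading.
stacked = pd.DataFrame({
    "Department": ["1", "1", "1", "4", "4", "4"],
    "ATP": [50.0, 100.0, 900.0, 200.0, 300.0, 4000.0],
})

# Count, mean, median, and standard deviation of ATP by department.
summary = stacked.groupby("Department")["ATP"].agg(
    ["count", "mean", "median", "std"]
)

print(summary)
```

In this toy data the mean sits well above the median in both departments, which is the signature of right-skewed measurements. Swapping `"Department"` for a Surface column would give the by-surface version.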

What Else Can You Do with Your Data?

What you’ve seen in this two-part blog post is just the beginning. But consider how much of this initial exploration is actionable! By having this foundation for visualizing and manipulating your data, you’ll be well on your way to investigating and testing root causes, and more efficiently performing analyses that yield trustworthy results.

If you’re interested in how other healthcare organizations use Minitab for quality improvement, check out our case studies.