Working at the Edge of Human Knowledge, Part Three: Data Validation

Minitab Blog Editor 03 November, 2011

The diligence required to obtain and validate good data became apparent to me very early at the biomechanics lab. Imagine a young guy who's eager not to mess up. There is this nagging fear that a lot of mistakes in research happen when you miss something or do something incorrectly at the outset, and it bites you in the derriere later. You fear that during data analysis you'll uncover a problem you can’t fix. I didn’t want that to happen on my watch.

I quickly learned that this fear was a well-founded one!

As part of a bone density study, we planned to measure each subject’s activity. Our subjects were going to wear activity monitors for 12 hours, on randomly scheduled days, multiple times per year. These activity monitors use accelerometers to measure movement, and they are sophisticated enough to distinguish natural human movement from artificial movement, such as riding in a car. They are also very durable and easy to use: the researcher doesn’t adjust anything. And they have been well validated in the scientific literature using very sophisticated analyses.

In short, no one expected any problems with these trusted devices. I thought this would be a nice, simple place to start verifying measurements before the study to avoid problems later. It was a good place to start, but not as simple as expected!

The data from the activity monitors don’t translate to an exact picture of the activities that the subject performs. However, you can see the scores rise and fall with activity levels, and compare scores to see where each subject’s activity level falls within your sample of activity scores. To make sure that our activity monitors were working correctly, I had pilot subjects wear the devices for a quick measurement system analysis. Sure enough, greater activity produced higher readings, just as expected. So far, so good. I thought I’d move on to creating a standard procedure.

To collect good data, you need standard procedures for setting up and using measurement equipment, so I wanted to establish these standards for the activity monitors. The devices are worn on a belt around the subject’s waist. An obvious standard was a fixed position for the device on each subject’s waist, so that readings wouldn't be higher or lower between subjects because of inconsistent positioning.

There was no literature on the differences in positioning, so I did a pilot study of my own. This time I had the subjects wear multiple devices all around their waist. I wanted to quantify the potential risk by seeing how much the readings would vary by position on each subject.

As the data came in, at first it appeared that positioning was very important. The high and low readings on a subject often differed by 15%. This was more than I expected, but there was no research to compare it to at the time. However, with more data came more insight. The positions that gave the high and low readings varied from subject to subject. Finally, it became obvious that the low readings did not follow any particular waist position, but they did follow particular devices.

In other words, several of our monitors tended to read too low, by varying degrees.
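For readers who like to see the logic spelled out, here is a minimal Python sketch of that kind of check. It is not the analysis I ran at the lab: the readings, the device names (D1 to D3), and the helper functions are all invented for illustration, and it assumes the devices were rotated through the waist positions so that device and position can be told apart. Each reading is scaled by its subject's own average, because subjects differ in how active they are, and then the scaled readings are averaged by position and by device.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings, invented for illustration (not data from the study).
# Each tuple: (subject, waist position, device id, activity count).
# Device "D3" is made to read about 15% low, and the devices are rotated
# through the positions so that device and position are not confounded.
readings = [
    ("S1", "front", "D1", 1000), ("S1", "left", "D2", 1000), ("S1", "back", "D3", 850),
    ("S2", "front", "D2", 2000), ("S2", "left", "D3", 1700), ("S2", "back", "D1", 2000),
    ("S3", "front", "D3", 1275), ("S3", "left", "D1", 1500), ("S3", "back", "D2", 1500),
]


def relative_means(rows, key_index):
    """Average reading per group after scaling each reading by its subject's mean."""
    by_subject = defaultdict(list)
    for subject, _position, _device, count in rows:
        by_subject[subject].append(count)
    subject_mean = {s: mean(counts) for s, counts in by_subject.items()}

    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3] / subject_mean[row[0]])
    return {key: mean(values) for key, values in groups.items()}


def spread(group_means):
    """Gap between the highest and lowest group average."""
    return max(group_means.values()) - min(group_means.values())


by_position = relative_means(readings, 1)  # grouped by waist position
by_device = relative_means(readings, 2)    # grouped by device id

print("By position:", by_position, "spread:", round(spread(by_position), 3))
print("By device:  ", by_device, "spread:", round(spread(by_device), 3))
```

With these made-up numbers, the per-position averages come out essentially equal, while D3 sits well below the other two devices. That is the same signature I eventually saw in the real data: a large spread that follows the device rather than the position.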

I contacted the manufacturer and sent the devices back so they could check them out. It turned out the manufacturer had recently switched suppliers for a component and it was causing problems. (Apparently that company was not using Minitab Statistical Software for quality management!)

When I re-tested the repaired monitors, the differences between positions on a subject were all less than a couple of percent, which meant there was no practical difference between positions. So, it appeared that all of our monitors were now working correctly and that position wasn’t a big issue. I established a standard procedure using standard belts that fit the monitors perfectly and were infinitely adjustable to each subject’s waist. I did this to prevent the monitors from flopping around due to a loose fit.
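The before-and-after comparison can be sketched the same way. The numbers below are invented, the 2% cutoff is only my stand-in for "a couple of percent", and expressing the spread as a percentage of the highest reading is one possible definition, not necessarily the one used in the study.

```python
def percent_spread(values):
    """Gap between the highest and lowest reading, as a percentage of the highest."""
    highest = max(values)
    return (highest - min(values)) / highest * 100


# Hypothetical readings from one subject wearing three monitors at once
# (invented for illustration, not data from the study).
before_repair = [1000, 1000, 850]
after_repair = [1000, 990, 985]

# Assumed cutoff standing in for "a couple of percent".
THRESHOLD_PERCENT = 2.0

for label, values in (("before", before_repair), ("after", after_repair)):
    pct = percent_spread(values)
    verdict = "ok" if pct < THRESHOLD_PERCENT else "investigate"
    print(f"{label}: spread {pct:.1f}% -> {verdict}")
```

A check like this is cheap to rerun whenever devices come back from repair or a new batch arrives, which is the point of having a standard procedure in the first place.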

This experience was both unsettling and positive. It was unsettling because it confirmed my fears: something you miss definitely can come back and bite you later -- even if you're dealing with a very simple situation. Further, some data problems are subtle and they don’t show up until you check the data several different ways. It was also a positive experience because it kept me on my toes and ready for bigger challenges that were to come!