Working at the Edge of Human Knowledge, Part Two: Data Collection

Minitab Blog Editor 20 October, 2011

Last week, I wrote about the excitement of working in an environment where your job is to push back the boundary of what is unknown. The interaction between messy reality and neat, usable data is an interesting place that ties the lofty goals of scientists to the nitty-gritty world.

This time, I’m going to bring it to life on a more personal level. While my experiences are in academic research, Six Sigma green belts working on a quality improvement initiative experience a very similar learning curve in their quests to expand their company’s knowledge base.

My background was first in statistical analysis, and then I transitioned to research. My first big research project was extremely eye-opening, because I was responsible both for ensuring that we generated good, clean data and for analyzing it! That was when I saw just how much effort it takes to generate good data.

Collecting good data is a process, not a single measurement event. This process requires that you have standard procedures and measurement instruments that work together in perfect harmony to produce data that you actively verify is correct. If you have a bad procedure or a bad instrument, your data is bad. Everything has to be perfect. You have to get directly involved in the nitty-gritty details of the process and the measuring instruments to ensure perfection.

So, back to my first big research project... I was the one full-time person on the data side. I worked with a large team of very talented experts, who each contributed a part of their time to this project. These experts included electrical and mechanical engineers, programmers, electricians, shop technicians, nutritionists, and a bone densitometer operator, among others. Together, we developed and tested hardware, software, assessments, and procedures that would produce an array of different types of data. We also had a full-time nurse on the project who interacted with the subjects on a daily basis. She gave us feedback about how suitable our devices and survey assessments were for our subjects, and she also administered the surveys and fitted the monitoring equipment. Out of this milieu, I had to be sure that we produced a mountain of trustworthy data of many different types. It was quite a balancing act!

Good data is a tough taskmaster. I was more of a numbers person, but just knowing statistics isn’t good enough. No statistical analysis can save you if you have bad data. So, for the sake of good data, I learned a lot of new skills and quickly got my hands dirty in the tiny details of data collection.

There is a lot of work to do before you even begin collecting data.

For example, I learned how to use and extract data from equipment such as force plates and bone densitometers. I learned how to strip wires, solder, and choose the right electrical connector for each application, and I picked up the basics of circuit boards. I learned about different programming languages so I could work with the programmer to ensure that data collection was just right. I even worked with a battery company to design a custom dual-voltage battery that fit inside a small space to meet our unique needs. I learned a wealth of information about nutrition and how to assess nutritional intake. I studied how different positioning of measurement devices could affect the results. Whew! Fortunately, I love learning new things!

Along the way, I checked and rechecked pilot data as we changed things in order to monitor the improvement in data quality. I also wrote standard procedures to ensure consistent data collection.

It was a great learning experience. And these are the critical lessons I learned:

  • Collecting good data is a process, not an event.
  • You will spend more time determining the best way to collect the data than actually collecting the data.
  • You must be determined, adaptable, and willing to learn a lot of new things.
  • Don’t assume anything. Check and double-check all of your data streams. Verify everything.
  • You can learn statistics in school, but there’s nothing like having a multi-million dollar project on the line to really know statistics inside and out!

In my next post, I’ll give one example of how verifying seemingly simple data often involves more than meets the eye. This was a case where the rubber met the road, and I smelled smoke!