Gummi Bear Statistics: Introduction to Covariates

Cody Steele | 14 March, 2012

Topics: Design of Experiments - DOE

"There is a fifth dimension, beyond that which is known to man. It is a dimension as vast as space and as timeless as infinity. It is the middle ground between light and shadow, between science and superstition, and it lies between the pit of man's fears and the summit of his knowledge. This is the dimension of imagination. It is an area which we call...The Twilight Zone."

In my last entry, I told you that I don’t have a thermometer or a hygrometer in my office. That means that I have to treat the temperature and humidity of the location where I collect data as variables that I can neither manipulate nor measure. Thus, I use blocking to keep track of those variables. That way, I can distinguish the effects of the variables that I can’t measure from the effects of the variables that I manipulate through design of experiments methods.

But let’s enter Serling’s dimension of imagination for a moment. Let’s imagine that I have a hygrometer, a thermometer, maybe even a barometer. I’m still not willing to build a campfire in the office so that I can manipulate temperature, but I can measure it. So what strategy would I use for variables that I can measure, but not manipulate?

I could still block, but I would be giving up information. Instead, I can record the temperature and humidity for each run and then account for the effect of those covariates on the response variables. Because a covariate uses the actual measured values rather than just a group label, it gives me an even clearer picture of the variables that I’m studying than I would get if I used blocks.
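To see why recording a covariate helps, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the numbers are made up (not data from my experiment), the names `factor_level`, `temperature_c`, and `response` are hypothetical, and the response is built without noise as 10 + 2 × factor + 0.5 × temperature so that the true factor effect is known to be 2. The sketch fits an ordinary least-squares model with the factor and the covariate, and compares the result with a naive difference of group means that ignores temperature.

```python
# Sketch only: made-up, noise-free numbers chosen so the true effects are known.

def solve_linear_system(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                scale = m[r][col] / m[col][col]
                m[r] = [v - scale * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(design_rows, y):
    """Ordinary least squares through the normal equations X'X b = X'y."""
    k = len(design_rows[0])
    xtx = [[sum(r[i] * r[j] for r in design_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical runs: a two-level factor that I manipulate (0 = low, 1 = high)
# and a temperature that I can only measure. The office happened to be warmer
# during the high-level runs.
factor_level = [0, 0, 0, 0, 1, 1, 1, 1]
temperature_c = [20.0, 21.0, 22.0, 23.0, 22.0, 23.0, 24.0, 25.0]

# Assumed true model: the factor adds 2, each degree adds 0.5.
response = [10.0 + 2.0 * f + 0.5 * t
            for f, t in zip(factor_level, temperature_c)]

# Naive estimate: difference of group means, ignoring temperature.
high = [y for f, y in zip(factor_level, response) if f == 1]
low = [y for f, y in zip(factor_level, response) if f == 0]
naive_effect = mean(high) - mean(low)

# Covariate-adjusted estimate: model columns are intercept, factor, temperature.
design_rows = [[1.0, float(f), t] for f, t in zip(factor_level, temperature_c)]
intercept, factor_effect, temperature_effect = fit_least_squares(
    design_rows, response)

print("naive factor effect:   ", round(naive_effect, 3))
print("adjusted factor effect:", round(factor_effect, 3))
print("temperature effect:    ", round(temperature_effect, 3))
```

In this made-up example, the naive comparison mixes the 2-degree difference in average temperature into the factor effect and reports 3 instead of 2. The model with the covariate separates the two. Real data has noise, so the separation would not be this clean.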

What does it all mean? It means we now have four strategies for handling variables:

  • Manipulate—for variables we can manipulate and measure. These are the variables we’ll learn the most about by including them as factors with design of experiments.
  • Leave the same—for variables that will stay the same while we’re collecting data, whether we force them to stay the same or not.
  • Block—for variables that we can neither measure nor manipulate, but that will change while we’re collecting data.
  • Include covariates—for variables that we can measure but not manipulate, and that will change while we’re collecting data.
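The list above can be read as a small decision rule. Here is a sketch of it in Python. The function name `choose_strategy` and its three yes/no parameters are hypothetical, and the order in which the questions are asked is my assumption. The list also doesn't say what to do with a variable that changes and that we can manipulate but not measure, so the sketch raises an error for that case rather than guess.

```python
def choose_strategy(can_measure, can_manipulate, will_change):
    """Map three yes/no facts about a variable to one of the four strategies.

    Assumption: a variable that stays the same during data collection is
    checked first, because the other three strategies all describe variables
    that change (on their own or because we change them).
    """
    if not will_change:
        return "leave the same"
    if can_measure and can_manipulate:
        return "manipulate"
    if can_measure:
        return "include as covariate"
    if not can_manipulate:
        return "block"
    # Changes, can be manipulated, cannot be measured: not covered by the list.
    raise ValueError("the four strategies do not cover this combination")


# Office temperature without a thermometer: it changes, and I can do nothing
# about it.
print(choose_strategy(can_measure=False, can_manipulate=False, will_change=True))

# The same temperature once I imagine owning a thermometer.
print(choose_strategy(can_measure=True, can_manipulate=False, will_change=True))
```

The first call returns "block" and the second returns "include as covariate", which is the switch that the imagined thermometer makes possible.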

There’s one more strategy that we’re going to talk about next time: randomization. Randomization is especially important because it also deals with the variables that I couldn’t even think of when I was making my fishbone diagram.

Want to keep your confidence high? Remember that some folks use the term "concomitant variable" instead of covariate. That way, you're ready for the discussion no matter what field you're in.