The language of statistics is a funny thing, but there's usually little to laugh at when misunderstandings arise between statisticians and non-statisticians. We see the consequences frequently in the media, when new studies (which often contradict previous ones) are breathlessly reported as if their findings were incontrovertible facts.
Similar, though less visible, misinterpretations abound in meeting rooms throughout the business world. When people who work with data and know statistics share their analyses with colleagues who aren't well-versed in the world of data, the message that gets received may be very different from the one the analyst tried to send.
There are two equally vital solutions to this problem. One is encouraging and instilling greater statistical literacy in the population. Obviously, that's a big challenge that can't be solved by any one statistician or analyst. But we individuals can control the second solution, which is to pay more attention to how we present the results of our analyses, and enhance our sensitivity to the statistical knowledge possessed by our audiences.
I've written about the challenges of statistical communication before, but I've been thinking about it anew after a friend sent me a link to this post and subsequent discussion about replacing the term "statistical significance." I won't speculate on the likelihood of that proposal, but it felt like a good time to review some words or phrases that mean one thing in statistical vernacular, but may signify something very different in a popular context.
Here's what I came up with, presented in a tabular form:
Say the word... | Statisticians mean... | Most people mean... |
Assumptions | Constraints within which we can do a particular analysis, such as data needing to follow a normal distribution. | Bias, prejudices, opinions or foregone conclusions about the topic or question under discussion. |
Confidence | A measurement of the uncertainty in a statistical analysis. | The strength with which a person believes or places faith in his or her abilities or ideas. |
Confounded | Variables whose effects cannot be distinguished. | Confused, perplexed, or inconvenient. |
Critical value | The cutoff point for a hypothesis test. | A measurement, sum, or number with great practical importance, such as a minimum cash balance in a checking account. |
Dependent | A variable that's beyond our control—such as the outcome of an experiment. | An outcome or thing we can control or influence. "Going to the party is dependent on completing my work." |
Independent | A factor we can control or manipulate. | An outcome or thing we cannot control or influence. "They will make the decision independent of whatever we might recommend." |
Interaction | When the effect of one factor depends on the level of another. | Communications and social engagements with others. |
Mean | The sum of all the values in your data divided by the number of values (Σx/n). | An adjective signifying hostility or, in slang, positivity: "That mean response surprised us all." |
Mode | The most frequent value in a data set. | A manner or method of performing a task. "You'll finish faster if you change your operating mode." |
Median | A data set's middle value. | Intermediate or average. So-so. |
Normal | Data that follow a bell-shaped curve. | Something that is commonplace, ordinary, plain, or unexceptional. |
Power | The probability that a test will detect an effect that really exists. | Degree of control or influence. |
Random | A sample captured such that all individuals in a population have an equal chance of selection. | Unpredictable; beyond control. |
Range | The difference between the lowest and highest values in a data set. | An array or collection. |
Regression | Predicting one variable based on the values of other variables. | Retreat or loss. Moving backwards. |
Residuals | The differences between observed and fitted values. | Leftovers. Scraps. |
Significance | An indication that the observed results would be unlikely if chance alone were at work. | Importance or seriousness. |
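For readers who like to see definitions in action, here is a minimal Python sketch of four of the statistical senses from the table: mean, median, mode, and range. The data values are made up purely for illustration, and the variable names are my own; the sketch uses only Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical data set, invented purely for illustration.
values = [2, 3, 3, 5, 7, 10]

# Mean: the sum of all the values divided by the number of values.
data_mean = mean(values)

# Median: the middle value (with an even count, the average of the two middle values).
data_median = median(values)

# Mode: the most frequent value in the data set.
data_mode = mode(values)

# Range: the difference between the highest and lowest values.
data_range = max(values) - min(values)

print("mean:", data_mean)
print("median:", data_median)
print("mode:", data_mode)
print("range:", data_range)
```

Even in this tiny example, the mean, median, and mode are three different numbers, which is worth remembering when a colleague asks for "the average."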
Can you add to my list? What statistical terms have complicated your efforts to communicate results?