Data Analysis

Blog posts and articles with tips for analyzing data for quality improvement methodologies, including Six Sigma and Lean.

Last time, I shared some useful tools for handling date and time data. But Minitab has many other useful tools for manipulating date/time data that you might not be aware of. Let’s take a look at a few more helpful tips and tricks. Extracting Information from a Date/Time Column If you look under the Data menu, you’ll notice Extract from Date/Time > To Numeric or To Text. This function allows you to... Continue Reading
You’ve probably heard about the different “types” of learning styles that exist. Some people call themselves “visual” learners, who find learning easiest with pictures, maps, and graphs. Other people say they learn best through hearing or actually doing an activity. I’d have to lump myself into the visual learner category because I find visual representations the easiest way to quickly grasp... Continue Reading

Collecting information for data analysis is like tasting fine wine—you want the right amount. Take too small a sip and you won't be able to assess it properly: you won't have enough information! But if you take a giant swig, your palate will be overwhelmed. That amount is just way more than you really need to make a solid recommendation. So, how big a sip should you take? I'm no wine expert, so... Continue Reading
“How do you write your blogs?” someone asked me the other day. “It’s really simple,” I replied. “I just apply the infinite monkey theorem.” According to the infinite monkey theorem, if enough monkeys type randomly on a keyboard for a long enough time (infinity), they will be almost certain to produce any given text: a play by Shakespeare, the U.S. Constitution, or Minitab Help. A key premise is the... Continue Reading
Variability can make things difficult whether you are performing data analysis for a quality improvement initiative or for an academic study. Recently, I detailed how variability reduces your statistical power. As promised, this will help you solve a mystery. One of the many things I love about research is the unexpected mysteries. You get to be Sherlock Holmes! When you’re exploring the unknown, ... Continue Reading
I was playing around with the power and sample size graphs in Minitab recently, and I noticed something interesting. Power, for the uninitiated, is usually described as the likelihood that you will find a significant effect or difference when one truly exists. There is a lot of good content on Power in the Minitab Help, StatGuide and Glossary. In any case, rather than simply describe what I found,... Continue Reading
The diligence required to obtain and validate good data became apparent to me very early at the biomechanics lab. Imagine a young guy who's eager not to mess up. There is this nagging fear that a lot of mistakes in research happen when you miss something or do something incorrectly at the outset, and it bites you in the derriere later. You fear that during data analysis you'll uncover a problem you... Continue Reading
Last week, I wrote about the excitement of working in an environment where your job is to push back the boundary of what is unknown. The interaction between messy reality and neat, usable data is an interesting place that ties together the lofty goals of scientists to the nitty-gritty world. This time, I’m going to bring it to life on a more personal level. While my experiences are in academic... Continue Reading
People like to say that seeing is believing, but the fact is that sometimes simply “seeing” isn’t enough. Whenever you’re making important decisions, like during a Lean Six Sigma project, it’s important to take a close look at the data. Because if you don’t, sometimes seeing is deceiving! Consider college basketball. Since 1990, the team ranked #1 in the AP preseason poll has won the national... Continue Reading
I’d like to give you a chance to win some money. Tell me which of these games you’d rather play: Game 1: We flip a coin once. If it lands on tails, I’ll give you 100 bucks. Game 2: We flip a coin 10 times. If it lands on tails at least one time, I’ll give you 100 bucks. If you said “Doh” and picked Game 2, either you’re Homer Simpson or you already have a good intuitive understanding of an important... Continue Reading
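For readers who want to check the intuition behind the two games, here is a minimal sketch in Python. It assumes a fair coin and independent flips (the excerpt does not state either explicitly), and the helper name `prob_at_least_one_tail` is made up for this illustration.

```python
from fractions import Fraction


def prob_at_least_one_tail(flips: int, p_tail: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of seeing at least one tail in `flips` independent flips.

    Uses the complement rule: P(at least one tail) = 1 - P(no tails at all).
    The fair-coin default (p_tail = 1/2) is an assumption of this sketch.
    """
    return 1 - (1 - p_tail) ** flips


# Game 1: a single flip. Game 2: ten flips, win on any tail.
game1 = prob_at_least_one_tail(1)
game2 = prob_at_least_one_tail(10)

print(f"Game 1: {game1} = {float(game1):.4f}")
print(f"Game 2: {game2} = {float(game2):.4f}")
```

Under those assumptions, Game 1 pays out half the time, while Game 2 pays out unless all ten flips land heads, which happens only once in 1,024 tries.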
In my last post, I detailed a study where the regression analysis seemed to show that higher calcium intake was associated with reduced injuries among our subjects. I had taken the data we collected for our main study and tried to use it to see if there were patterns amongst those who experienced the knee pain. A post hoc analysis like this can often give you good results but it can lead you... Continue Reading
Most Lean Six Sigma projects use data analysis to examine and reduce the prevalence of defects in a product or service. Experienced quality practitioners know that defects can sometimes feel abstract when you analyze them, until you’re the customer who experiences them firsthand. On my first trip to Rome, my luggage never showed up in the terminal. So instead of gazing in awe at the Sistine... Continue Reading
In a previous post, I told you how omitting the subject's weight led to a surprising result in a preliminary regression analysis of the effects of physical activity on bone density. It's a good example of why you need to be very careful about the data you collect: factors you DO NOT include sometimes have a big influence on the ones you DO include! Let's look at how this played out in my... Continue Reading
If you’re a quality improvement expert, you already know that statistics can be a powerful tool in your quest to reduce the cost of poor quality and save money. But statistics might be even more powerful than you think—it may actually have helped someone win a lottery jackpot. According to Business Insider, a former mathematics professor with a Ph.D. in statistics has won the Texas lottery... Continue Reading
Riddle: What two tools in Minitab can be used to perform the same analysis on your data? Well, there are probably a few pairs that can be mentioned, but I am going to focus on Discriminant Analysis and Binary Logistic Regression. These tools can be used to predict group membership. If we look at exh_mvar.mtw, located in Minitab’s sample data folder, we have the perfect data set to use. Here is a... Continue Reading
When we collect data for quality improvement projects, we try to identify and measure all the factors that could influence our outcome. In scientific research studies, we try to neatly organize, classify, and measure all relevant aspects of the subject matter in order to quantify the relationship between all significant variables. The problem is that reality can be messy and hard to measure,... Continue Reading
We humans do have a tendency to succumb to gold rush fever. And this can happen even in the left-brained, rational field of statistics. After we collect our data, it’s difficult to resist the urge to desperately dash for p-values, as if they were 70% off at Macy’s the day after Thanksgiving. But no matter how well-versed you are in statistics, it’s good practice to get into the habit of intuitively... Continue Reading
It’s summer and maybe you’re traveling to Florida’s coast for a beach vacation (or just wishing you were, like I am)! I got the chance to talk to Dr. Henry Briceño from the Southeast Environmental Research Center (SERC) about how he and his team use statistics to monitor Florida’s water quality. The center’s research projects throughout South Florida have provided a basis for management decisions... Continue Reading
As I mentioned before, accurate instruments won’t yield good data if you haven’t answered three fundamental questions that should precede every measurement system analysis. Whether you're doing a Six Sigma project, or data analysis in support of a research project or some other goal, you need to be careful about how, when and where you gather the data. Here’s how I learned that lesson the hard way. I... Continue Reading
Sometimes, statistical terms can seem like they were zapped down from outer space by sadistic, mealy-mouthed aliens: R-squared adjusted, heteroskedasticity, 3-parameter Weibull distribution. But not all statistics terminology should leave you feeling woozy and glassy-eyed. Some terms actually make intuitive sense. Knowing those terms can help you get a handle on output that may seem fuzzy at... Continue Reading