Data Analysis

Blog posts and articles with tips for analyzing data for quality improvement methodologies, including Six Sigma and Lean.

I was recently asked a couple of questions about stability studies in Minitab. Question 1: If I enter a lower and an upper spec in the Stability Study dialog window, why do I see only one confidence bound per fitted line on the resulting graph? Shouldn’t there be two? You use a stability study to analyze the stability of a product over time and to determine the product's shelf life. In order to... Continue Reading
September 17 marked the release of new information from the American Community Survey (ACS) from the U.S. Census Bureau. Here’s a bar chart of what the press releases looked like for that day: Clearly there was a theme in play, one that was great news for major metropolitan areas. The Census Bureau even released a graph showing that the percentage of people within the 25 most populous metropolitan... Continue Reading
Step 1 in our DOE problem-solving methodology is to use process experts, literature, or past experiments to characterize the process and define the problem. Since I had little experience with golf myself, this was an important step for me. This is not an uncommon situation. Experiment designers often find themselves working on processes that they have little or no experience with. For example, a... Continue Reading
You run a capability analysis and your Cpk is bad. Now what? First, let’s start by defining what “bad” is. In simple terms, the smaller the Cpk, the more defects you have. So the larger your Cpk is, the better. Many practitioners use a Cpk of 1.33 as the gold standard, so we’ll treat that as the gold standard here, too. Suppose we collect some data and run a capability analysis using Minitab St... Continue Reading
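For readers who want to see the arithmetic behind that teaser, here is a minimal sketch of a Cpk calculation in Python. It is not taken from the post: the function name `cpk`, the spec limits, and the measurements are all hypothetical. It also simplifies by using the overall sample standard deviation, whereas Minitab estimates within-subgroup variation for Cpk, so treat the result as a rough approximation.

```python
import statistics


def cpk(data, lsl, usl):
    """Approximate Cpk: distance from the mean to the nearer spec limit,
    in units of three standard deviations.

    Assumption: sigma is the overall sample standard deviation, a
    simplification of the within-subgroup estimate used by Minitab.
    """
    mean = statistics.mean(data)
    sigma = statistics.stdev(data)
    upper = (usl - mean) / (3 * sigma)  # capability against the upper spec
    lower = (mean - lsl) / (3 * sigma)  # capability against the lower spec
    return min(upper, lower)


# Hypothetical measurements and spec limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
value = cpk(measurements, lsl=9.0, usl=11.0)

# Compare against the 1.33 benchmark mentioned in the post.
verdict = "meets" if value >= 1.33 else "falls below"
print(f"Cpk = {value:.2f}, which {verdict} the 1.33 benchmark")
```

Taking the minimum of the two sides is what makes Cpk sensitive to centering: a process that drifts toward one spec limit is penalized even if its spread stays the same.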
You know what the big thing is in the data analysis world—"Big Data." Big, big, big, very big data. Massive data. ENORMOUS data. Data that is just brain-bendingly big. Data so big that we need globally interconnected supercomputers that haven't even been built yet just to contain one one-billionth of it. That's the kind of big data everybody's so excited about.  Whatever. There's no denying that... Continue Reading
I recently guest lectured for an applied regression analysis course at Penn State. Now, before you begin making certain assumptions—because as any statistician will tell you, assumptions are important in regression—you should know that I have no teaching experience whatsoever, and I’m not much older than the students I addressed. I’m just 5 years removed from my undergraduate days at Virginia Tech,... Continue Reading
As we broke for lunch, two participants in the training class began to discuss, debate, and finally fight over a fundamental task in golf—how to drive the ball the farthest off the tee. Both were avid golfers and had spent a great deal of time and money on professional instruction and equipment, so the argument continued through the lunch hour, with neither arguer stopping to eat. Several other... Continue Reading
This summer, I created a model to determine the correct 4th down decision. But whether it’s for business or some personal interest, creating a model is just the starting point. The real benefits come from applying your model. And for the Big Ten 4th down calculator, the time to apply the model is now! On Saturday night, Penn State and Rutgers officially kicked off conference play for the 2015 Big... Continue Reading
Repeated measures designs don’t fit our impression of a typical experiment in several key ways. When we think of an experiment, we often think of a design that has a clear distinction between the treatment and control groups. Each subject is in one, and only one, of these non-overlapping groups. Subjects who are in a treatment group are exposed to only one type of treatment. This is the... Continue Reading
When I started out on the blog, I spent some time showing some data sets that would make it easy to illustrate statistical concepts. It’s easier to show someone how something works with something familiar than with something they’ve never thought about before. Need a quick illustration to share with someone about how to summarize a variable in Minitab? See if they have a magazine on their desk, and... Continue Reading
Whatever industry you're in, you're going to need to buy supplies. If you're a printer, you'll need to purchase inks, various types of printing equipment, and paper. If you're in manufacturing, you'll need to obtain parts that you don't make yourself.  But how do you know you're making the right choice when you have multiple suppliers vying to fulfill your orders?  How can you be sure you're... Continue Reading
If you use ordinary linear regression with a response of count data, it may work out fine (Part 1), or you may run into some problems (Part 2). Given that a count response could be problematic, why not use a regression procedure developed to handle a response of counts? A Poisson regression analysis is designed to analyze a regression model with a count response. First, let's try using Poisson... Continue Reading
My previous post showed an example of using ordinary linear regression to model a count response. For that particular count data, shown by the blue circles on the dot plot below, the model assumptions for linear regression were adequately satisfied. But frequently, count data may contain many values equal or close to 0. Also, the distribution of the counts may be right-skewed. In the quality field,... Continue Reading
Rare events inherently occur in all kinds of processes. In hospitals, there are medication errors, infections, patient falls, ventilator-associated pneumonias, and other rare, adverse events that cause prolonged hospital stays and increase healthcare costs.  But rare events happen in many other contexts, too. Software developers may need to track errors in lines of programming code, or a quality... Continue Reading
Imagine a multimillion-dollar company that released a product without knowing the probability that it would fail after a certain amount of time. “We offer a 2-year warranty, but we have no idea what percentage of our products fail before 2 years.” Crazy, right? Anybody who wanted to ensure the quality of their product would perform a statistical analysis to look at the reliability and survival of... Continue Reading
To make objective decisions about the processes that are critical to your organization, you often need to examine categorical data. You may know how to use a t-test or ANOVA when you’re comparing measurement data (like weight, length, revenue, and so on), but do you know how to compare attribute or count data? It's easy to do with statistical software like Minitab. One person may look at this bar... Continue Reading
There's more data available today than ever before, and with statistical software such as Minitab it only takes a couple of seconds to get some significant insights, whether it concerns how to make your business run better or national politics.  For instance, if we look back at the last 9 presidential elections (1980 to 2012), there are some interesting correlations between the percent of state... Continue Reading
When we take pictures with a digital camera or smartphone, what the device really does is capture information in the form of binary code. At the most basic level, our precious photos are really just a bunch of 1s and 0s, but if we were to look at them that way, they'd be pretty unexciting. In its raw state, all that information the camera records is worthless. The 1s and 0s need to be converted... Continue Reading
If you want to use data to predict the impact of different variables, whether it's for business or some personal interest, you need to create a model based on the best information you have at your disposal. In this post and subsequent posts throughout the football season, I'm going to share how I've been developing and applying a model for predicting the outcomes of 4th down decisions in Big... Continue Reading
As you may know, we added Bubble Plots to Minitab's menu of meaningful graphs in Release 17. If you are familiar with them, I think you'll agree that Bubble Plots make a perfect addition to the pantheon of impressive and powerful plots that you can produce in Minitab. They’re great. Of course, they would have been even greater if they used my idea...but that’s spilt milk under the bridge now. If you haven’t... Continue Reading