
Data Analysis

Blog posts and articles with tips for analyzing data for quality improvement methodologies, including Six Sigma and Lean.

When I blogged about automation back in March, I made my husband out to be an automation guru. Well, he certainly is. But what you don’t know about my husband is that while he loves to automate everything in his life, sometimes he drops the ball. He’s human; even I have to cut him a break every now and then. On the other hand, instances of hypocrisy in his behavior tend to make for a good story.... Continue Reading
You need to consider many factors when you’re buying a used car. Once you narrow your choice down to a particular car model, you can get a wealth of information about individual cars on the market through the Internet. How do you navigate through it all to find the best deal? By analyzing the data you have available. Let's look at how this works using the Assistant in Minitab 17. With the... Continue Reading

Here is a scenario involving process capability that we’ve seen from time to time in Minitab's technical support department. I’m sharing the details in this post so that you’ll know where to look if you encounter a similar situation. You need to run a capability analysis. You generate the output using Minitab Statistical Software. When you look at the results, the Cpk is huge and the histogram in... Continue Reading
If you've used our software, you’re probably used to many of the things you can do in Minitab once you’ve fit a model. For example, after you fit a response to a given model for some predictors with Stat > DOE > Response Surface > Analyze Response Surface Design, you can do the following: Predict the mean value of the response variable for new combinations of settings of the predictors. Draw... Continue Reading
Design of Experiments (DOE) is the perfect tool to efficiently determine if key inputs are related to key outputs. Behind the scenes, DOE is simply a regression analysis. What’s not simple, however, is all of the choices you have to make when planning your experiment. What X’s should you test? What ranges should you select for your X’s? How many replicates should you use? Do you need center... Continue Reading
In the great 1971 movie Willy Wonka and the Chocolate Factory, the reclusive owner of the Wonka Chocolate Factory decides to place golden tickets in five of his famous chocolate bars, and allow the winners of each to visit his factory with a guest. Since the factory restarted production after three years of silence, no one has come in or gone out. Needless to say, there is enormous interest in... Continue Reading
Last Tuesday night, Major League Baseball announced the rosters for tomorrow's All-Star Game in San Diego. Immediately, as I'm sure was anticipated, people began talking about who made it and who didn't, who got left out, and who shouldn't have made it. As a fun little exercise, I decided to take a visual look at the All-Star teams, to see what kind of players were selected. I looked at position... Continue Reading
When you perform a statistical analysis, you want to make sure you collect enough data that your results are reliable. But you also want to avoid wasting time and money collecting more data than you need. So it's important to find an appropriate middle ground when determining your sample size. Now, technically, the Major League Baseball regular season isn't a statistical analysis. But it does kind... Continue Reading
Earlier this month, PLOS.org published an article titled "Ten Simple Rules for Effective Statistical Practice." The 10 rules are good reading for anyone who draws conclusions and makes decisions based on data, whether you're trying to extend the boundaries of scientific knowledge or make good decisions for your business. Carnegie Mellon University's Robert E. Kass and several co-authors devised... Continue Reading
by Matthew Barsalou, guest blogger
Control charts plot your process data to identify and distinguish between common cause and special cause variation. This is important, because identifying the different causes of variation lets you take action to make improvements in your process without over-controlling it. When you create a control chart, the software you're using should make it easy to see where... Continue Reading
You often hear the data being blamed when an analysis is not delivering the answers you wanted or expected. I was recently reminded that the data chosen or collected for a specific analysis is determined by the analyst, so there is no such thing as bad data, only bad analysis. This made me think about the steps an analyst can take to minimise the risk of producing analysis that fails to answer... Continue Reading
An outlier is an observation in a data set that lies a substantial distance from other observations. These unusual observations can have a disproportionate effect on statistical analysis, such as the mean, which can lead to misleading results. Outliers can provide useful information about your data or process, so it's important to investigate them. Of course, you have to find them first. Finding... Continue Reading
It’s not easy to get data ready for analysis. Sometimes, data that include all the details we want aren’t clean enough for analysis. Even stranger, sometimes the exact opposite can be true: Data that are convenient to collect often don’t include the details that we want when we analyze them. Let’s say that you’re looking at the documentation for the National Health and Nutrition Examination Survey... Continue Reading
Businesses are getting more and more data from existing and potential customers: whenever we click on a web site, for example, it can be recorded in the vendor's database. And whenever we use electronic ID cards to access public transportation or other services, our movements across the city may be analyzed. In the very near future, connected objects such as cars and electrical appliances will... Continue Reading
Remember the classic science fiction film The Matrix? The dark sunglasses, the leather, computer monitors constantly raining streams of integers (inexplicably in base 10 rather than binary or hexadecimal)? And that mind-blowing plot twist when Neo takes the red pill from Morpheus' outstretched hand? Well, to me, there's one thing even more mind-blowing than the plot of The Matrix: the Matrix Plot.... Continue Reading
Time series data is proving to be very useful these days in a number of different industries. However, fitting a specific model is not always a straightforward process. It requires a good look at the series in question, and possibly trying several different models before identifying the best one. So how do we get there? In this post, I'll take a look at how we can examine our data and get a feel... Continue Reading
There may not be a situation more perilous than being a character on Game of Thrones. Warden of the North, Hand of the King, and apparent protagonist of the entire series? Off with your head before the end of the first season! Last male heir of a royal bloodline? Here, have a pot of molten gold poured on your head! Invited to a wedding? Well, you probably know what happens at weddings in the show. ... Continue Reading
Suppose you’ve collected data on cycle time, revenue, the dimension of a manufactured part, or some other metric that’s important to you, and you want to see what other variables may be related to it. Now what? When I graduated from college with my first statistics degree, my diploma was bona fide proof that I'd endured hours and hours of classroom lectures on various statistical topics, including l... Continue Reading
This is an era of massive data. A huge amount of data is being generated from the web and from customer relations records, not to mention from sensors used in the manufacturing industry (semiconductor, pharmaceutical, petrochemical companies and many other industries).
Univariate Control Charts
In the manufacturing industry, critical product characteristics get routinely collected to ensure... Continue Reading
Do you recall my “putting the cart before the horse” analogy in part 1 of this blog series? The comparison is simple. We all, at times, put the cart before the horse in relatively innocuous ways, such as eating your dessert before you’ve eaten your dinner, or deciding what to wear before you’ve been invited to the party. But performing some tasks in the wrong order, such as running a statistical... Continue Reading