How I Learned to Love Statistics

Some people take to statistics and data analysis naturally. They're attracted to numbers and aren't intimidated by formulas full of arcane symbols drawn from long-dormant languages.

That's not me. 

My name is Eston Martz, and I gravitate to words, not numbers. In school I was the kid completely unfazed by William Faulkner and James Joyce. But statistics? It stupefied me. I feared it.

And that's a shame, because analyzing data is an extremely powerful tool to help us understand the world--which is why statistics is central to quality improvement methods like Lean Six Sigma.  

While working as a...

Studying Old Dogs with New Statistical Tricks Part II: Contour Plots and Cracking Bones

Yesterday I wrote about how paleontologist Zhijie Jack Tseng used 3D surface plots created in Minitab Statistical Software to look at how the skulls of hyenas and some extinct dogs with similar dining habits fit into a spectrum of possible skull forms that had been created with 3D modelling techniques.

What's interesting about this from a data analysis perspective is how Tseng took tools commonly used in quality improvement and engineering and applied them to his research into evolutionary morphology.

We used Tseng's data to demonstrate how to create and explore 3D surface plots yesterday, so...

Studying Old Dogs with New Statistical Tricks: Bone-Cracking Hypercarnivores and 3D Surface Plots

A while back my colleague Jim Frost wrote about applying statistics to decisions typically left to expert judgment; I was reminded of his post this week when I came across a new research study that takes a statistical technique commonly used in one discipline and applies it in a new way.

The study, by paleontologist Zhijie Jack Tseng, looked at how the skulls of bone-cracking carnivores--modern-day hyenas--evolved. They may look like dogs, but hyenas in fact are more closely related to cats. However, some extinct dog species had skulls much like a hyena's. 

Tseng analyzed data from 3D...

No Matter How Strong, Correlation Still Doesn't Imply Causation

There's been a really interesting conversation about correlation and causation going on in the LinkedIn Statistics and Analytics Consultants group. 

This is a group with a pretty advanced appreciation of statistical nuances and data analysis, and they've been focusing on how the understanding of causation and correlation can be very field-dependent. For instance, evidence supporting causation might be very different if we're looking at data from a clinical trial conducted under controlled conditions as opposed to observational economic data.

Contributors also have been citing some pretty...

Will the Weibull Distribution Be on the Demonstration Test?

Over on the Indium Corporation's blog, Dr. Ron Lasky has been sharing some interesting ideas about using the Weibull distribution in electronics manufacturing. For instance, check out this discussion of how dramatically an early first-failure can affect an analysis of a part or component (in this case, an alloy used to solder components to a circuit board). 

This got me thinking again about all the different situations in which the Weibull distribution can help us make good decisions. The main reason Weibull is so useful is that it's very flexible in fitting different types of data, because it...

Explaining Quality Statistics So Your Boss Will Understand: Weighted Pareto Charts

Failure to properly calibrate this machine will result in defective rock and roll. 

In my last post, I imagined using the example of a rock and roll band -- the Zero Sigmas -- to explain Pareto charts to my music-loving but statistically challenged boss. I showed him how easy it was to use a Pareto chart to visualize defects or problems that occur most often, using the example of various incidents that occurred on the Zero Sigmas' last tour.

The Pareto chart revealed that starting performances late was far and away the Zero Sigmas' most frequent "defect," one that occurred every single night of...

The Diversity (and Consistency) of Quality Improvement: the 2013 ASQ ITEA Presentations

I'm in the airport at Indianapolis, waiting to go home after three exciting days at the 2013 American Society for Quality World Conference.  As I write this, it's Wednesday evening after the conference has closed, and it turns out my flight has been delayed.

This could give me ample opportunity to muse about the quality issues that might keep me from reaching central Pennsylvania tonight. But I'm kind of pumped up, so I'm more interested in thinking about what I've experienced and seen over the past few days. This is the kind of event that makes you want to keep focusing on the positive, not...

Talking Design of Experiments (DOE) and Quality at the 2013 ASQ World Conference

The 2013 ASQ World Conference is taking place this week in Indianapolis, Indiana, and it's been a treat to see how our software was used in the projects highlighted in many of the presentations. As a supporter of the conference, a key event for quality practitioners around the world, Minitab was proud to sponsor one of the presentations that seemed to get a lot of attendees talking. Scott Sterbenz, a Six Sigma leader from Ford Motor Company, delivered a presentation entitled "Leveraging Designed Experiments for Success," which explained how to make designed experiments succeed with examples...

Explaining Quality Statistics So Your Boss Will Understand: Pareto Charts

I once had a boss who had difficulty understanding many, many things. When I need to discuss statistical concepts with people who don't have a statistical background, I like to think about how I could explain things so even my old boss would get it. 

My boss and I shared a common interest in rock and roll, so that's the device I'll use to explain one of the workhorses of quality statistics, the Pareto chart. I'd tell my boss to imagine that instead of managing a surly gang of teenaged restaurant employees, he's managing a surly rock and roll band, the Zero Sigmas. The band did a 100-date tour...

Explaining Quality Statistics So My Boss Will Understand: Measurement Systems Analysis (MSA)

As a teenaged dishwasher at a local eatery, I had a boss who'd never washed dishes in a restaurant himself. I once spent 40 minutes trying to convince him that forks and spoons should go in their holders with the business end up, while knives should go in point-down. Whatever I said, he didn't get it. We were ordered to put forks and spoons in the holders with the handles up.

The outraged wait staff soon made clear what I hadn't: you can't immediately tell the difference between a fork and a spoon when all you can see is the handle! Explaining that in the right way would have minimized wasted...

Enough Is Enough! Handling Multicollinearity in Regression Analysis

In regression analysis, we look at the correlations between one or more input variables, or factors, and a response. We might look at how baking time and temperature relate to the hardness of a piece of plastic, or how educational levels and the region of one's birth relate to annual income. The number of potential factors you might include in a regression model is limited only by your imagination...and your capacity to actually gather the data you imagine.

But before throwing data about every potential predictor under the sun into your regression model, remember a thing called multicollinearity...

What Statistical Software Should You Choose: Three More Critical Questions

Earlier I wrote about four important questions you should ask if you're looking at using statistical software to analyze data in your organization, especially if you're hoping to improve quality using methods like Six Sigma. But there are other points to consider as well. If you're in the market for statistical software, be sure to investigate these questions, too!

What Types of Statistical Analysis Will They Be Doing? 

The specific types of analysis you need to do could play a big part in determining the right statistical software for your organization. The American Statistical Association's softwa...

Getting Started with Factorial Design of Experiments (DOE)

When I talk to quality professionals about how they use statistics, one tool they mention again and again is design of experiments, or DOE. I'd never even heard the term before I started getting involved in quality improvement efforts, but now that I've learned how it works, I wonder why I didn't learn about it sooner. If you need to find out how several factors are affecting a process outcome, DOE is the way to go. 

Somewhere in school you probably learned, like I did, that when you do an experiment you need to hold all the factors constant except for the one you're studying. That seems simple...

Choosing Statistical Software: Four Questions You Should Ask

Data.  Analysis. Statistics. It seems like everybody is talking about the importance of doing data analysis, whether it's analytics for predicting consumer behavior or looking at critical metrics for Six Sigma and other data-driven quality improvement programs. Not only do we have more data available to us than ever before, we're also blessed...and/or cursed...with an enormous range of software options to help us make sense out of all this data we're trying so hard to understand. 

Your options for doing data analysis run the gamut—from a pencil, paper and calculator costing a couple of bucks...

Why Isn't This "Six Sigma" Project Improving Quality?

Whether you're a quality improvement veteran or you're just starting to do research about what quality improvement methods are available today, you've seen headlines and articles that explain why Six Sigma and other data-driven quality improvement methods don't work.

Typically these pieces have an attention-grabbing headline, like Six Sigma Initiative Fails to Save the Universe, followed by a dissection of a deployment or project that failed—usually in spectacular fashion—to achieve its goals. 

"There!" the writer typically crows. "See? It's obvious Six Sigma doesn't work!" What makes these...

Why the Weibull Distribution Is Always Welcome

In college I had a friend who could go anywhere and fit right in. He'd have lunch with a group of professors, then play hacky-sack with the hippies in the park, and later that evening he'd hang out with the local bikers at the toughest bar in the city. Next day he'd play pickup football with the jocks before going to an all-night LAN party with his gamer pals. On an average weekend he might catch an all-ages show with the small group of straight-edge punk rockers on our campus, or else check out a kegger with some townies, then finish the weekend by playing some D&D with his friends from the...

A Story-based Approach to Learning Statistics (and Statistical Software)

Want to learn more about analyzing data? Try taking a page from Aesop's book. 

Well...really, I'm suggesting taking multiple pages from Minitab's book, but my suggestion stems from an idea that Aesop epitomizes.  

Aesop was no fool. When he wanted to convey even the heaviest of lessons, he didn't waste time detailing the intellectual and philosophical arguments behind them. He didn't argue, cajole, or berate. He didn't lecture or pontificate. 

He told a story. 

Minitab uses the same approach in Meet Minitab, the introductory guide to data analysis and quality statistics using our statistical...

Bewildering Things Statisticians Say: "Failure to Reject the Null Hypothesis"

Subcultures have languages all their own. Teen gangs, statisticians, gamers, music buffs, sports nuts, furries...all use terminology that baffles outsiders. The arcane language helps identify kindred spirits: using the correct phrase proves you belong. The proper buzzwords can gain you admittance to the right professional circles...or the wrong biker bars. Maybe both.

Not knowing them can get you into serious trouble. When you enter a dangerous place (like the data analysis arena), you need at least a basic grasp of the jargon the local toughs use. 

I'm not comparing any particular group of...

FMEA: A Good Way to Save Yourself Some Grief

In the past couple of years, I've noticed a new acronym popping up across the Web. In case you've not yet encountered it, "FML" typically appears in social media updates about something gone awry.  As in, "The cat ate my homework. FML!"  Or, "My production line just broke down, and now the company is going to be short on a major order. FML!" 

This acronym reminds me of an abbreviation used in Lean Six Sigma and quality improvement: FMEA.  It's short for "Failure Modes and Effects Analysis," which basically means "look very, very carefully at how and why stuff can go wrong."

FMEA: Failure Modes...

Monte Carlo Is Not as Difficult as You Think

Before I started studying statistics, references to a mysterious "Monte Carlo Method" made it seem like the most cryptic thing in the data-analysis universe. People were developing programs dedicated solely to Monte Carlo, and offering special workshops and seminars. It seemed so great and terrible that someone like me—mere mortal that I am—would never be able to understand it. 

Fast-forward a few years, and now that I have some experience with it, I'm wondering why Monte Carlo has the reputation it does. The fact of the matter is, at least from a data analysis perspective, Monte Carlo...

Choosing the Right Distribution Model for Reliability Data

Recently I've been refreshing my knowledge of reliability analysis, which is the use of data to assess a product's ability to perform over time. Quality engineers typically use reliability analysis to predict the likelihood that a certain percentage of products will fail over a given amount of time.   

Statistical software will do the calculations involved in a reliability analysis, but there's a catch: first, you must choose a distribution to model your data. Put plainly, you need to tell the software to base its analysis on the normal distribution, the Weibull distribution, or perhaps some...