# Stats

Blog posts and articles about statistics principles and how they apply to quality improvement methods like Lean and Six Sigma.

Not long ago, I couldn’t abide statistics. I did respect it, but in much the same way a gazelle respects a lion. Most of my early experiences with statistics indicated that close encounters resulted in pain, so I avoided further contact whenever possible. So how is it that today I write about statistics? That’s simple: it merely required completely reinventing the way I thought about and approached... Continue Reading
There are many reasons why a distribution might not be normal/Gaussian. A non-normal pattern might be caused by several distributions being mixed together, or by a drift in time, or by one or several outliers, or by an asymmetrical behavior, some out-of-control points, etc. I recently collected the scores of three different teams (the Blue team, the Yellow team and the Pink team) after a laser... Continue Reading


Since it's the Halloween season, I want to share how a classic horror film helped me get a handle on an extremely useful statistical distribution. The film is based on John W. Campbell's classic novella "Who Goes There?", but I first became familiar with it from John Carpenter's 1982 film The Thing. In the film, researchers in the Antarctic encounter a predatory alien with a truly frightening... Continue Reading
Step 3 in our DOE problem solving methodology is to determine how many times to replicate the base experiment plan. The discussion in Part 3 ended with the conclusion that our 4 factors could best be studied using all 16 combinations of the high and low settings for each factor, a full factorial. Each golfer will perform half of the sixteen possible combinations and each golfer’s data could stand as... Continue Reading
I read trade publications that cover everything from banking to biotech, looking for interesting perspectives on data analysis and statistics, especially where it pertains to quality improvement. Recently I read a great blog post from Tony Taylor, an analytical chemist with a background in pharmaceuticals. In it, he discusses the implications of the FDA's updated guidance for industry analytical... Continue Reading
September 17 marked the release of new information from the American Community Survey (ACS) from the U.S. Census Bureau. Here’s a bar chart of what the press releases looked like for that day: Clearly there was a theme in play, one that was great news for major metropolitan areas. The Census Bureau even released a graph showing that the percentage of people within the 25 most populous metropolitan... Continue Reading
Statisticians say the darndest things. At least, that's how it can seem if you're not well-versed in statistics. When I began studying statistics, I approached it as a language. I quickly noticed that compared to other disciplines, statistics has some unique problems with terminology, problems that don't affect most scientific and academic specialties. For example, dairy science has a highly... Continue Reading
Just 100 years ago, very few statistical tools were available and the field was largely unknown. Since then, there has been an explosion of tools available, as well as ever-increasing awareness and use of statistics. While most readers of the Minitab Blog are looking to pick up new tools or improve their use of commonly applied ones, I thought it would be worth stepping back and talking about one... Continue Reading
Last month the ESPN series Outside the Lines reported on major league pitchers suffering serious injuries from being struck in the head by line drives, and efforts MLB is making towards having protective gear developed for pitchers. You can view the report here if you'd like: A couple of things jump out at me from the clip: The overwhelming majority of pitchers are not interested in wearing... Continue Reading
In Minitab Statistical Software, putting a regression line on a scatterplot is as easy as choosing a picture with a regression line on a scatterplot: A neat trick is that you can also add calculated lines onto a scatterplot for comparison or other communication purposes. Here’s a demonstration. United States Sentencing Guidelines The United States Sentencing Guidelines say how people who... Continue Reading
In my previous post, I showed you that the coefficients are different when choosing (-1,0,1) vs (1,0) coding schemes for General Linear Model (or Regression). We used the two different equations to calculate the same fitted values. Here I will focus on showing what the different coefficients represent. Let's use the data and models from the last blog post: We can display the means for each level... Continue Reading
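
The point that two coding schemes give different coefficients but the same fitted values can be sketched for the simplest case, a model with one categorical factor. This is only an illustration under textbook definitions of (1, 0) and (-1, 0, 1) coding, not the data or models from the post: the level names, the level means, and the function names are all hypothetical.

```python
from statistics import mean

# Hypothetical level means for a three-level factor (illustrative values only).
level_means = {"A": 10.0, "B": 12.0, "C": 17.0}

def indicator_coding(means, reference):
    """(1, 0) coding: the intercept is the reference-level mean and each
    coefficient is that level's difference from the reference level."""
    intercept = means[reference]
    coefs = {lvl: m - intercept for lvl, m in means.items() if lvl != reference}
    return intercept, coefs

def effect_coding(means):
    """(-1, 0, 1) coding: the intercept is the unweighted mean of the level
    means and each coefficient is that level's difference from it. The last
    level's coefficient equals minus the sum of the others; it is listed
    here explicitly for convenience."""
    intercept = mean(means.values())
    coefs = {lvl: m - intercept for lvl, m in means.items()}
    return intercept, coefs

def fitted(intercept, coefs, level):
    """Fitted value for a level; a level with no coefficient contributes 0."""
    return intercept + coefs.get(level, 0.0)

ind = indicator_coding(level_means, reference="A")
eff = effect_coding(level_means)
for lvl in level_means:
    # Both schemes reproduce the level mean as the fitted value.
    print(lvl, fitted(*ind, lvl), fitted(*eff, lvl))
```

In this sketch the coefficients answer different questions (difference from a reference level versus difference from the overall mean), while the fitted value for every level is identical under both schemes.
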
Since we added them to Minitab Statistical Software, we've gotten great feedback from many people who have been using the General Linear Model and Regression tools. But in speaking with people as part of Minitab's Technical Support team, I've found many are noticing that there are two coding schemes available with each. We frequently get calls from people asking how the coding scheme you choose... Continue Reading
In my previous post, I wrote about the hypothesis testing ban in the Journal of Basic and Applied Social Psychology. I showed how P values and confidence intervals provide important information that descriptive statistics alone don’t provide. In this post, I'll cover the editors’ concerns about hypothesis testing and how to avoid the problems they describe. The editors describe hypothesis testing... Continue Reading
In previous posts, I discussed the results of a recycling project done by Six Sigma students at Rose-Hulman Institute of Technology last spring. (If you’re playing catch up, you can read Part I and Part II.) The students did an awesome job reducing the amount of recycling that was thrown into the normal trash cans across all of the institution’s academic buildings. At the end of the spring... Continue Reading
Banned! In February 2015, editor David Trafimow and associate editor Michael Marks of the Journal of Basic and Applied Social Psychology declared that the null hypothesis statistical testing procedure is invalid. They promptly banned P values, confidence intervals, and hypothesis testing from the journal. The journal now requires descriptive statistics and effect sizes. They also encourage large... Continue Reading
In this series of posts, I show how hypothesis tests and confidence intervals work by focusing on concepts and graphs rather than equations and numbers. Previously, I used graphs to show what statistical significance really means. In this post, I’ll explain both confidence intervals and confidence levels, and how they’re closely related to P values and significance levels. How to Correctly... Continue Reading
Imagine that you are watching a race and that you are located close to the finish line. When the first and fastest runners complete the race, the differences in times between them will probably be quite small. Now wait until the last runners arrive and consider their finishing times. For these slowest runners, the differences in completion times will be extremely large. This is due to the fact that... Continue Reading
Monte Carlo simulation has all kinds of useful manufacturing applications. And - in celebration of Pi Day - I thought it would be apropos to show how you can even use Monte Carlo simulation to estimate pi, which of course is the mathematical constant that represents the ratio of a circle’s circumference to its diameter. For our example, let’s start with a circle of radius 1 inscribed within a... Continue Reading
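
The teaser above is cut off, so what follows is a generic sketch of the standard Monte Carlo estimate of pi rather than the post's own procedure. It assumes the unit circle sits inside the square [-1, 1] x [-1, 1]; the function name `estimate_pi` and its parameters are hypothetical. Points are drawn uniformly in the square, and the fraction landing inside the circle approximates the area ratio pi/4.

```python
import random

def estimate_pi(n_points: int, seed: int = 0) -> float:
    """Estimate pi from the share of uniform random points that fall
    inside a circle of radius 1 centered in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # point lies within the circle
            inside += 1
    # circle area / square area = pi / 4, so scale the proportion by 4
    return 4.0 * inside / n_points

print(estimate_pi(200_000, seed=1))
```

The estimate tightens slowly: its standard error shrinks in proportion to one over the square root of the number of points, so gaining one more decimal place takes roughly 100 times as many points.
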
What do significance levels and P values mean in hypothesis tests? What is statistical significance anyway? In this post, I’ll continue to focus on concepts and graphs to help you gain a more intuitive understanding of how hypothesis tests work in statistics. To bring it to life, I’ll add the significance level and P value to the graph in my previous post in order to perform a graphical version of... Continue Reading
Our vacation planning has begun. My daughter has requested a trip to Disney World as her high school graduation present. For most people, trip planning might mean a simple phone call to the local travel agent or an even simpler do-it-yourself online booking. Not for me. As a statistician, a request like this means I’ve got a lot of data analysis ahead. So many travel questions require (in my... Continue Reading