Data Analysis

Blog posts and articles with tips for analyzing data for quality improvement methodologies, including Six Sigma and Lean.

As a Minitab trainer, one of the most common questions I get from training participants is "what should I do when my data isn’t normal?" A large number of statistical tests are based on the assumption of normality, so not having data that is normally distributed typically instills a lot of fear. Many practitioners suggest that if your data are not normal, you should do a nonparametric version of... Continue Reading
Many of the things you need to monitor can be measured in a concrete, objective way, such as an item's weight or length. But, many important characteristics are more subjective, such as the collaborative culture of the workplace, or an individual's political outlook. A survey is an excellent way to measure these kinds of characteristics. To better understand a characteristic, a researcher asks... Continue Reading
The 2016 presidential race is becoming more real. We’ve had several announcements with Ted Cruz, Rand Paul, Hillary Clinton, and Marco Rubio officially entering the race to be President. While the prospective Democratic candidates are down to one, or at most a few, the Republican field is extra-large this election cycle. The first order of business for a GOP candidate is to survive the nomination... Continue Reading
In 1898, Russian economist Ladislaus Bortkiewicz published his first statistics book entitled Das Gesetz der kleinen Zahlen, in which he included an example that eventually became famous for illustrating the Poisson distribution. Bortkiewicz researched the annual deaths by horse kicks in the Prussian Army from 1875-1894. Data was recorded from 14 different army corps, with one being the Guard... Continue Reading
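For readers who want to see the Poisson distribution in action before reading the full post, here is a minimal sketch of its probability mass function. The function name `poisson_pmf` and the rate of 0.7 events per corps per year are assumptions chosen for illustration; they are not figures quoted from the post.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of observing exactly k events when the mean count is `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Illustrative mean rate (an assumption, not a value taken from the post).
rate = 0.7

# Probability of 0, 1, 2, 3, or 4 events in one observation period.
for k in range(5):
    print(f"P(X = {k}) = {poisson_pmf(k, rate):.4f}")
```

With a small mean rate, most of the probability sits on zero or one event, which is why the Poisson distribution suits rare events like those in Bortkiewicz's example.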
The Cp and Cpk are well known capability indices commonly used to ensure that a process spread is as small as possible compared to the tolerance interval (Cp), or that it stays well within specifications (Cpk). Yet another type of capability index exists: the Cpm, which is much less known and used less frequently. The main difference between the Cpm and the other capability indices is that the... Continue Reading
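As a rough companion to the excerpt above, the three indices can be computed from their common textbook definitions. This is a sketch under stated assumptions: the function name `capability_indices` is hypothetical, and it uses the overall sample standard deviation for all three indices, whereas Minitab's capability analysis distinguishes within-subgroup from overall variation.

```python
import math
import statistics


def capability_indices(data, lsl, usl, target):
    """Return (Cp, Cpk, Cpm) using textbook formulas.

    Assumption: the overall sample standard deviation stands in for the
    process sigma in every index, which is a simplification.
    """
    mean = statistics.fmean(data)
    sigma = statistics.stdev(data)

    # Cp compares the tolerance width with the process spread.
    cp = (usl - lsl) / (6 * sigma)
    # Cpk uses the distance from the mean to the nearer specification limit.
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    # Cpm inflates the spread by the squared distance between mean and target.
    cpm = (usl - lsl) / (6 * math.sqrt(sigma**2 + (mean - target) ** 2))
    return cp, cpk, cpm


# Hypothetical measurements and specification limits, for illustration only.
measurements = [9.8, 10.0, 10.2, 9.9, 10.1]
cp, cpk, cpm = capability_indices(measurements, lsl=9.0, usl=11.0, target=10.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, Cpm = {cpm:.2f}")
```

In these formulas, Cpm can never exceed Cp, and the two coincide only when the process mean sits exactly on the target.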
The two previous posts in this series focused on manipulating data using Minitab’s calculator and the Data menu. In this third and final post, we continue to explore helpful features for working with text data and will focus on some new features in Minitab 17.2’s Editor menu. Using the Editor Menu  The Editor menu is unique in that the options displayed depend on what is currently active... Continue Reading
My previous post focused on manipulating text data using Minitab’s calculator. In this post we continue to explore some of the useful tools for working with text data, and here we’ll focus on Minitab 17.2’s Data menu. This is the second in a 3-part series, and in the final post we’ll look at the new features in Minitab 17.2’s Editor menu. Using the Data Menu When I think of the Data menu, I think... Continue Reading
With Minitab, it’s easy to create graphs and manage numeric, date/time and text data.  Now Minitab 17.2’s enhanced data manipulation features make it even easier to work with text data. This is the first of three posts in which I'm going to focus on various tools in Minitab that are useful when working with text data, including the Calculator, the Data menu, and the Editor menu. Using the Calculator Y... Continue Reading
In this series of posts, I show how hypothesis tests and confidence intervals work by focusing on concepts and graphs rather than equations and numbers.   Previously, I used graphs to show what statistical significance really means. In this post, I’ll explain both confidence intervals and confidence levels, and how they’re closely related to P values and significance levels. How to Correctly... Continue Reading
To choose the right statistical analysis, you need to know the distribution of your data. Suppose you want to assess the capability of your process. If you conduct an analysis that assumes the data follow a normal distribution when, in fact, the data are nonnormal, your results will be inaccurate. To avoid this costly error, you must determine the distribution of your data. So, how do you determine... Continue Reading
Imagine that you are watching a race and that you are located close to the finish line. When the first and fastest runners complete the race, the differences in times between them will probably be quite small. Now wait until the last runners arrive and consider their finishing times. For these slowest runners, the differences in completion times will be extremely large. This is due to the fact that... Continue Reading
I always knew I was different. Even as a kid. “Is that me? Way out there in left field?” I asked the doc. “Yes,” he nodded, as he looked at my chart. “I used brushing to identify you on the graph.” I wasn’t sure I liked getting brushed. It felt like my true identity was being detected and displayed in a window for all to see. The doctor must have sensed my discomfort. “It’s not uncommon—even for those... Continue Reading
What do significance levels and P values mean in hypothesis tests? What is statistical significance anyway? In this post, I’ll continue to focus on concepts and graphs to help you gain a more intuitive understanding of how hypothesis tests work in statistics. To bring it to life, I’ll add the significance level and P value to the graph in my previous post in order to perform a graphical version of... Continue Reading
Our vacation planning has begun. My daughter has requested a trip to Disney World as her high school graduation present. For most people, trip planning might mean a simple phone call to the local travel agent or an even simpler do-it-yourself online booking. Not for me. As a statistician, a request like this means I’ve got a lot of data analysis ahead. So many travel questions require (in my... Continue Reading
There are times when we are deep in a particular analysis and simply cannot seem to get past this dialog window, or that error message. Fortunately, the support team at Minitab is here to help. Here is a list of situations people have called us about when using Minitab, and how to solve them. If your situation isn't listed, please call Minitab Technical Support, and we will be happy to assist.... Continue Reading
Hypothesis testing is an essential procedure in statistics. A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. When we say that a finding is statistically significant, it’s thanks to a hypothesis test. How do these tests really work and what does statistical significance actually mean? In this series of... Continue Reading
Many things have shocked me since having my first baby back in August. I didn’t think it was possible to be so tired that it actually hurt, and I also didn’t think that changing 10+ diapers a day would actually be the norm (or that needing to perform 10+ outfit changes was even possible, let alone necessary). I also didn’t think that we’d fall in love so hard with the little guy. What a wonderful,... Continue Reading
As I’m sure you’ve heard by now, Kentucky is really good at basketball. They're the only team in the country without a loss, and they have a realistic shot at becoming the first team to win the championship with an undefeated record since the 1976 Indiana Hoosiers. Under any ranking system you want to use, Kentucky is clearly the #1 team in college basketball. Well, almost any ranking system. All... Continue Reading
In my previous post, I showed you how to set up data collection for a gage R&R analysis using the Assistant in Minitab 17. In this case, the goal of the gage R&R study is to test whether a new tool provides an effective metric for assessing resident supervision in a medical facility.   As noted in that post, I'm drawing on one of my favorite bloggers about health care quality, David Kashmer of the... Continue Reading
I left off last with a post outlining how the Six Sigma students at Rose-Hulman were working on a project to reduce the amount of recycling thrown in the normal trash cans in all of the academic buildings at the institution. Using the DMAIC methodology for completing improvement projects, they had already defined the problem at hand: how could the amount of recycling that’s thrown in the normal trash... Continue Reading