
Dividing a Data Set into Training and Validation Samples

Adam Ozimek had an interesting post April 15th on the Modeled Behavior blog at Forbes.com. He observed that one of the advantages of big data is how easy it is to get test data to validate a model that you built from sample data.

Ozimek notes that he is “for the most part a p-value checking, residual examining, data modeling culture economist,” but he’s correct to observe that if you can test your model on real data, then you should.

What I’ll describe is certainly not the only way to divide data in Minitab Statistical Software. Still, I think it’s pretty good if I do say so myself. Want...
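For readers who want to see the idea outside of Minitab, here is a minimal Python sketch of one common way to divide rows at random into a training sample and a validation sample. It is not the Minitab procedure this post goes on to describe. The function name `split_rows`, the 30% validation fraction, and the seed are all assumptions chosen for illustration.

```python
import random


def split_rows(rows, validation_fraction=0.3, seed=1):
    """Randomly assign rows to a training sample and a validation sample.

    validation_fraction and seed are illustrative defaults, not values
    taken from the post.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)          # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation


# Hypothetical data: 100 row identifiers standing in for worksheet rows.
rows = list(range(100))
training, validation = split_rows(rows)
print(len(training), len(validation))
```

You would fit the model on `training` only, then check how well it predicts the rows in `validation`, which the model has never seen.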

Selecting the Right Quality Improvement Project

I wrote a post a few years back on the difficulties that can ensue when you’re just trying to get started on your Lean Six Sigma or quality improvement initiative. It can become especially difficult when you have many potential projects staring at you, but you aren’t quite sure which one will give you the most bang for your buck.

A project prioritization matrix can be a good place to start when you need to choose which projects to focus on, as it can help you logically select optimal improvement projects against their weighted value, based on your company’s predefined metrics. The matrix can...
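The arithmetic behind a prioritization matrix is a weighted sum, which a short Python sketch can make explicit. Everything here is hypothetical: the criteria names, the weights, the project names, and the 1-to-9 ratings are invented for illustration and do not come from the post or from any Minitab template.

```python
# Hypothetical criteria weights; they are chosen to sum to 1.
weights = {"cost_savings": 0.5, "customer_impact": 0.3, "ease": 0.2}

# Hypothetical ratings of each candidate project against each criterion.
projects = {
    "Project A": {"cost_savings": 9, "customer_impact": 3, "ease": 5},
    "Project B": {"cost_savings": 5, "customer_impact": 9, "ease": 5},
    "Project C": {"cost_savings": 3, "customer_impact": 3, "ease": 9},
}


def weighted_score(ratings, weights):
    """Sum each rating multiplied by the weight of its criterion."""
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)


# Rank projects from highest to lowest weighted score.
ranked = sorted(
    projects,
    key=lambda name: weighted_score(projects[name], weights),
    reverse=True,
)

for name in ranked:
    print(name, round(weighted_score(projects[name], weights), 2))
```

Changing the weights changes the ranking, which is why agreeing on the metrics before scoring the projects matters.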

“Hello, How Can I Help You?”- A Look at Quality Improvement in Financial Services


It’s common to think that process improvement initiatives are meant to cater only to manufacturing processes, simply because manufacturing is where Lean and Six Sigma began. However, many other industries, in particular financial services and banking, also rely on data analysis and Lean Six Sigma tools to improve processes.

Rod Toro is a business process improvement manager at Edward Jones, and I recently got the chance to talk with him about a Lean Six Sigma project the service division at his company completed to improve customer satisfaction.

Edward Jones has been increasing the number of...

Hockey Penalties, Fans Booing, and Independent Trials

We’re in the thick of the Stanley Cup playoffs, which means hockey fans are doing what seems to be every sports fan's favorite hobby...complaining about the refs! While most complaints, such as “We’re not getting any of the close calls!” are subjective and hard to get data for, there's one question that we should be able to answer objectively with a statistical analysis: Are hockey penalties independent trials? That is, does the team that the next penalty will be called on depend on the team that any previous penalties were called on?

Think of flipping a coin. Even if it comes up heads 10 times...
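One way to frame the independence question is to tabulate consecutive penalties (which team was called last versus which team is called next) and compute a chi-square statistic for that 2x2 table. The Python sketch below shows the mechanics only. It is not necessarily the analysis used in the post, and the penalty sequence is made up: "H" and "A" are hypothetical labels for the home and away teams.

```python
from collections import Counter


def transition_table(sequence):
    """Count (previous team, next team) pairs for consecutive penalties."""
    return Counter(zip(sequence, sequence[1:]))


def chi_square_statistic(table, labels=("H", "A")):
    """Pearson chi-square statistic for a previous-by-next table.

    Assumes every row and column total is nonzero.
    """
    total = sum(table.values())
    row = {a: sum(table[(a, b)] for b in labels) for a in labels}
    col = {b: sum(table[(a, b)] for a in labels) for b in labels}
    statistic = 0.0
    for a in labels:
        for b in labels:
            expected = row[a] * col[b] / total  # count expected under independence
            statistic += (table[(a, b)] - expected) ** 2 / expected
    return statistic


# Hypothetical sequence in which calls strictly alternate between teams.
penalties = "HAHAHAHA"
table = transition_table(penalties)
statistic = chi_square_statistic(table)
print(dict(table), statistic)
```

A statistic near zero is what independent trials would tend to produce. A large value, as in this deliberately alternating sequence, suggests the next call depends on the previous one. Turning the statistic into a p-value requires a chi-square distribution with 1 degree of freedom, which is left to a statistics package.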

Creating a Custom Report using Minitab, part 2

Now that you’ve seen how to automatically import data and run analyses in my previous post, let’s create the Monthly Report!

I will be using a Microsoft Word Document (Office 2010) and adding bookmarks to act as placeholders for the Graphs, statistics, and boilerplate conclusions.

Let’s go through the steps to accomplish this:

  • Open up an existing report that you have previously created in Microsoft Word.
  • Highlight a section of the document where you would like to place the created Minitab graph or statistic.
  • Go to the Insert tab, click the Bookmark link, and type in the name of what you will be...

Creating a Custom Report using Minitab, part 1

As a member of Minitab’s Consulting and Custom Development Services team, I get to help companies across a variety of industries create many different types of reports for management. These reports often need to be generated weekly or monthly. I prefer to automate tasks like this whenever possible, so that new or updated reports can be created without much effort. A little investment up front can save a lot of time by eliminating the need to reinvent the wheel every time management wants a current report.

I’m going to tell you how to use Minitab Statistical Software to automatically generate a...

What I Learned From Treating Childbirth as Failure, Part II

A couple of years ago, I wrote a blog post titled "What I Learned From Treating Childbirth as Failure" that conveniently ended up getting published the day before my daughter was born. You should read it first, but to summarize, it demonstrates how we can predict the odds of an event happening during certain time intervals even when the original data is highly censored.

Since then, several people have asked (two in the comments alone) where I came up with the numbers I stated at the end:

  • When should a relative arrive on a 7-day stay to have the greatest chance of being there for the birth? (May...

How to Correctly Interpret P Values

The P value is used all over statistics, from t-tests to regression analysis. Everyone knows that you use P values to determine statistical significance in a hypothesis test. In fact, P values often determine what studies get published and what projects get funding.

Despite being so important, the P value is a slippery concept that people often interpret incorrectly. How do you interpret P values?

In this post, I'll help you to understand P values in a more intuitive way and to avoid a very common misinterpretation that can cost you money and credibility.

What Is the Null Hypothesis in Hypothesis...

A Different Look at the New Medicare Data

It’s been an exciting week to be interested in Medicare data. On April 9th, the American government opened up data from the Centers for Medicare and Medicaid Services (CMS) that show charges made to Medicare and payments received by over 880,000 entities. If you went to Bing on Monday, April 14, at about 12:30, chose to look at news stories, and typed Medicare money into the search box, here’s a sampling of what you got:

Medicare doctors: Who gets the big bucks & for what
The Medicare Data’s Pitfalls
Medicare Data Shines Light on Billions Paid to TX Doctors
Political Ties of Top Billers for...

Re-analyzing Wine Tastes with Minitab 17

In April 2012, I wrote a short paper on binary logistic regression to analyze wine tasting data. At that time, François Hollande was about to get elected as French president and in the U.S., Mitt Romney was winning the Republican primaries. That seems like a long time ago…

Now, in 2014, Minitab 17 Statistical Software has just been released. Had Minitab 17 been available in 2012, would I have conducted my analysis in a different way? Would the results still look similar? I decided to re-analyze my April 2012 data with Minitab 17 and assess the differences, if there are any.

There were no...

What Can Classical Chinese Poetry Teach Us About Graphical Analysis?

A famous classical Chinese poem from the Song dynasty describes the views of a mist-covered mountain called Lushan.

The poem was inscribed on the wall of a Buddhist monastery by Su Shi, a renowned poet, artist, and calligrapher of the 11th century.

Deceptively simple, the poem captures the illusory nature of human perception.
 

   Written on the Wall of West Forest Temple

                                      --Su Shi
 
  From the side, it's a mountain ridge.
  Looking up, it's a single peak.
  Far or near, high or low, it never looks the same.
  You can't know the true face of Lu Mountain
  When...

What if the NCAA tournament wasn’t single elimination?

Connecticut just defeated Kentucky to win the NCAA Men's Basketball Championship. The game had the highest combined seeding of any championship game in NCAA tournament history. This shows that while a single elimination tournament can be very entertaining, it doesn’t always determine who the “best” team is. In fact, despite winning the championship, Connecticut is still ranked 8th in the Pomeroy Ratings and 10th in the Sagarin Predictor Rankings. Though Connecticut played the best basketball the past 3 weeks, it would be folly to ignore the 30 games they played before that!

But although I’d...

ITEA Sneak-Peek: The Great Escape from Foam Defects

The 2014 ASQ World Conference on Quality and Improvement is coming up in early May in Dallas, and this year’s International Team Excellence Award Process (ITEA) will also come to a close at the conference, as winners from the finalist teams will be chosen for ASQ gold, silver, or bronze-level statuses.

What’s ITEA?

The annual ASQ ITEA process celebrates the accomplishments of quality improvement teams from a broad spectrum of industries from around the world. The ITEA is the only international team recognition process of its kind in the United States, and since 1985, more than 1,000 teams from...

Introducing the Bubble Plot

When you're evaluating a dataset, graphical analysis can be very important. While an analysis like a regression or ANOVA can be backed up by numbers, being able to visualize how your dataset is behaving can be even more convincing than a group of p-values—especially to those who aren’t trained in statistics.

For example, let’s look at a few variables we think may be correlated. In this specific example, we will take the Unemployment Rate and the Crime Rate for each state in the U.S. We have 3 columns of data in Minitab: C1, which contains the State Name; C2, which contains the Crime Rate; and...

Control Chart Tutorials and Examples

The other day I was talking with a friend about control charts, and I wanted to share an example one of my colleagues wrote on the Minitab Blog.  Looking back through the index for "control charts" reminded me just how much material we've published on this topic.

Whether you're just getting started with control charts, or you're an old hand at statistical process control, you'll find some valuable information and food for thought in our control-chart related posts. 

Different Types of Control Charts

One of the first things you learn in statistics is that when it comes to data, there's no...

Did Welch’s ANOVA Make Fisher's Classic One-Way ANOVA Obsolete?

One-way ANOVA can detect differences between the means of three or more groups. It’s such a classic statistical analysis that it’s hard to imagine it changing much.

However, a revolution has been under way for a while now. Fisher's classic one-way ANOVA, which is taught in Stats 101 courses everywhere, may well be obsolete thanks to Welch’s ANOVA.

In this post, I not only want to introduce you to Welch’s ANOVA, but also highlight some interesting research that we perform here at Minitab that guides the implementation of features in our statistical software.

One-Way ANOVA Assumptions

Like any...
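For readers curious about what Welch’s ANOVA actually computes, here is a minimal Python sketch of the standard Welch F statistic, which weights each group by its sample size divided by its sample variance instead of pooling the variances. It is an illustration of the textbook formula, not Minitab’s implementation, and the function name `welch_anova` and the example data are assumptions made up for this sketch.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    Each group needs at least two observations and a nonzero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n / s^2, so noisier groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    between = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Hypothetical data: equal means with unequal spread, then one group shifted.
same_means = [[1, 2, 3], [0, 2, 4], [-1, 2, 5]]
shifted = [[1, 2, 3], [0, 2, 4], [4, 7, 10]]

f_same, df1, df2_same = welch_anova(same_means)
f_shifted, _, df2_shifted = welch_anova(shifted)
print(f_same, f_shifted)
```

The p-value comes from an F distribution with `df1` and `df2` degrees of freedom, which a statistics package can supply. Note that `df2` is generally not a whole number.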

Analyze a DOE with the Assistant in Minitab 17

By now, you probably know that Minitab 17 includes Design of Experiments (DOE) in the Assistant. We already spent some time looking at 5 highlights when you create a screening experiment with the Assistant in Minitab 17.

But the Assistant can also help you make sense of the data you collect for your experiment. After you create a design with the Assistant, choose Assistant > DOE > Analyze and Interpret and you’re on your way. Exactly what you get depends on which type of design you’re analyzing, but there’s some really neat stuff to help you get the most out of your data. Here are 3...

Equivalence Testing for Quality Analysis (Part II): What Difference Does the Difference Make?

My previous post examined how an equivalence test can shift the burden of proof when you perform a hypothesis test of the means. This allows you to more rigorously test whether the process mean is equivalent to a target or to another mean.

Here’s another key difference: To perform the analysis, an equivalence test requires that you first define, upfront, the size of a practically important difference between the mean and the target, or between two means.

Truth be told, even when performing a standard hypothesis test, you should know the value of this difference. Because you can’t really evaluate...

Equivalence Testing for Quality Analysis (Part I): What are You Trying to Prove?

With more options come more decisions.

With equivalence testing added to Minitab 17, you now have more statistical tools to test a sample mean against a target value or another sample mean.

Equivalence testing is extensively used in the biomedical field. Pharmaceutical manufacturers often need to test whether the biological activity of a generic drug is equivalent to that of a brand name drug that has already been through the regulatory approval process.

But in the field of quality improvement, why might you want to use an equivalence test instead of a standard t-test?

Interpreting Hypothesis...

The Best European Football League: What the CTQ’s and Minitab Can Tell Us

by Laerte de Araujo Lima, guest blogger

In a previous post (How Data Analysis Can Help Us Predict This Year's Champions League), I shared how I used Minitab Statistical Software to predict the 2013-2014 season of the UEFA Champions league. This involved the regression analysis of main critical-to-quality (CTQ) factors, which I identified using the “voice of the customer” suggestions of some friends.

Since that post was published, my friends have stopped discussing the UEFA Champions league—they were convinced by the results I shared.

But now they’ve challenged me to use Six Sigma tools to...