Using Binary Logistic Regression to Investigate High Employee Turnover

Human resources might not be a business area where you’d typically expect to conduct a Six Sigma project. However, Jeff Parks, Lean Six Sigma master black belt, found the opportunity to apply Six Sigma to human resources while leading quality improvement efforts at a large manufacturer of aerospace engine parts.

The manufacturer was suffering from high employee attrition, or turnover, and struggled to understand why. With a DMAIC Six Sigma project, Parks set out to work with the HR department to investigate and reduce the high turnover rates.

In 2009, the manufacturer had normal attrition rates...

Giving Thanks for the Regression Menu

Juicy, butter-roasted turkey.

Steaming mashed potatoes.

Tangy cranberry relish.

Delicious candied sweet potatoes.

Creamy green bean casserole.

Sweet and airy corn bread.

Silken pumpkin pie.

The traditional Thanksgiving menu has so many mouth-watering dishes on the table, you don’t know where to start.

If you savor statistics as much as food, you might feel similarly as you gaze at all of the delicious analyses on Minitab’s Regression menu:

How can you decide which regression analysis to choose? In this post, I’ll give you some bite-sized samples of each regression dish to help you decide which one to...

Predicting the U.S. Presidential Election: Evaluating Two Models (Part Two)

Yesterday, I presented a model that uses Dow Jones data to predict the winner in Presidential elections that have an incumbent. Today, I test a model that uses S&P 500 data. (Here are the data for today's blog that you can use in Minitab Statistical Software.)

Model 2: The Three Month Change in the S&P 500

The second model is presented by Sam Stovall, Chief Equity Strategist at S&P Capital IQ, in his paper, “The Presidential Predictor: Stock Price Performances Have Typically Presaged Victors.” Unlike the Dow Jones study, this paper is vaguely written and presents unhelpful statistics. Also, the...

Predicting the U.S. Presidential Election: Evaluating Two Models (Part One)

You may have read about statistical models that claim to predict the outcome of the upcoming Presidential election. It’s easy to imagine that these models are complicated and contain many demographic, sociological, economic, and political factors. However, I was surprised to read in this article that two simple models supposedly generate accurate predictions.

Both of these models use stock market data. One model is based on the Dow Jones and the other on the S&P 500. Statistics are best when they are a hands-on experience, so while neither study included the data, I obtained both the stock...

Ditching Feathered Hair and Bell-Bottoms

Do you like flipping through old yearbooks to see how you and your friends have changed over the years? In honor of Minitab Statistical Software’s 40th anniversary, we dug through our own past to see how our look has changed since 1972. Check out how Minitab software packaging has evolved into what we know today:

The very beginning - 1972

The first version of Minitab was distributed in 5 boxes of punched computer cards.

1986

Minitab 5 introduced high-resolution graphics, in addition to new statistical functions.

1988

The Minitab Student Edition, a streamlined...

Using Minitab To Weed Out Bloopers

In my last blog, we looked at how a single data entry error can cruelly sabotage your statistical analysis.

And if that doesn't scare you silly, maybe this will.

The frequency of data entry errors can be as high as 27%, even when using the conservative "double-entry" method to record each data value twice.

So what can you do? Besides make offerings to appease Ate, the ancient Greek goddess of delusion, folly, and reckless errors?

First, some old-school advice. There’s no substitute for taking a deep breath, rolling up your sleeves, and double-checking every observation in your data.

But suppose...

Analyzing Titanic Survival Rates, Part II: Binary Logistic Regression

In honor of the 100th anniversary of the sinking of the Titanic, we recently posted a dataset on the passengers aboard the ship that included Class (coach or first), Gender (female or male), Age, and Status (survived or died). From Age, an additional column was created indicating Child (17 years or younger) or Adult (18 years or older).

In an earlier post, we showed how survival rates could be compared between levels of one variable (for example, females versus males) using Stat > Tables > Cross Tabulation and Chi-Square. But what if we wanted to take all factors into consideration to paint a...
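For readers who want to see what considering several factors at once looks like outside of Minitab, here is a minimal Python sketch of a binary logistic regression with two binary predictors. It is only an illustration: the 16 rows are made-up toy values, not the Titanic dataset, and the names `fit_logistic`, `predict`, `rows`, and `labels` are hypothetical. The fit uses plain gradient ascent on the log-likelihood, which is an assumption made for readability and not necessarily the estimation method Minitab uses.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit an intercept plus one coefficient per predictor by
    batch gradient ascent on the average log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            err = y - p  # observed outcome minus predicted probability
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def predict(weights, x):
    """Predicted probability of the event (here: survived) for one row."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))

# HYPOTHETICAL toy data (not the Titanic dataset).
# Each row is (first_class, female), coded 1 = yes, 0 = no; label 1 = survived.
rows = [(0, 0)] * 4 + [(1, 0)] * 4 + [(0, 1)] * 4 + [(1, 1)] * 4
labels = [0, 0, 0, 1,  0, 0, 1, 1,  0, 1, 1, 1,  1, 1, 1, 0]

weights = fit_logistic(rows, labels)
intercept, b_first_class, b_female = weights

# exp(coefficient) is the odds ratio for that predictor, holding the other fixed.
print("odds ratio, first class vs coach:", math.exp(b_first_class))
print("odds ratio, female vs male:", math.exp(b_female))
print("P(survived | first class, female):", predict(weights, (1, 1)))
print("P(survived | coach, male):", predict(weights, (0, 0)))
```

Each coefficient describes the effect of one factor while the other is held fixed, which is what separates this kind of model from a one-variable-at-a-time cross tabulation.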

The Odds of Finding a Four-Leaf Clover

St. Patrick’s Day is just around the corner, and maybe you’ve found yourself thinking about four-leaf clovers and trying to find one yourself. According to Irish tradition, those who find a four-leaf clover are destined for good luck, as each leaf in the clover symbolizes good omens for faith, hope, love, and luck for the finder.

A lesser-known fact about four-leaf clovers is that they aren’t the luckiest symbol after all. Irish legend indicates that those who find a five-leaf clover will actually have more luck and financial success than those who just find a four-leaf clover.

However, good...

More March Madness with Minitab and Nonlinear Regression

What, it’s still not March? Blasted February, why won't you just end already! Oh well, at least it gives us time for some more data analysis.

In my last post, I used Minitab’s Fitted Line Plot to create a regression model that predicted the probability of a home team winning a basketball game based on the difference in ranks between the two teams. This model had an r-squared value of 95.2%, which is great. But since it’s still February, let’s spend some time trying to improve on that number.
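If you're curious where an R-squared value like that comes from, here is a small Python sketch that fits a straight line by least squares and computes R-squared as 1 minus the ratio of residual to total sum of squares. Everything in it is an assumption for illustration: the data points are invented (they are not the basketball data), the names `fit_line` and `r_squared` are hypothetical, and a straight line is used even though a fitted line plot can also fit curved models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# HYPOTHETICAL data: a rank-difference score (x) and the observed
# proportion of home wins (y). Values are invented for illustration.
rank_diff = [-20, -10, 0, 10, 20]
home_win_rate = [0.30, 0.45, 0.62, 0.74, 0.90]

intercept, slope = fit_line(rank_diff, home_win_rate)
r2 = r_squared(rank_diff, home_win_rate, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.4f}, R-squared={r2:.1%}")
```

A higher R-squared means the fitted values sit closer to the observed ones, but it says nothing on its own about whether the model form is appropriate.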

Improving the Regression Model

My last model used the difference in ranks between two teams. This assumes...

Data Analysis and the Mystery of the Confounded Calcium

In my previous blog post, I showed how omitting a confounding predictor from a linear regression model obscured the significance of another predictor variable. Confounding variables can be insidious because you don’t always know about them, and you may have to deduce their existence. 

In that vein, this post is like a mystery story. I’ll set up the mystery and include the clues. You put on your Sherlock Holmes cap, use your knowledge of confounding variables, and see if you can come up with your own theories about how one or more confounding predictors are most likely involved. 

The Study

For...

Statistical Tools for Predicting Group Membership

Riddle: What two tools in Minitab can be used to perform the same analysis on your data? Well, there are probably a few pairs that can be mentioned, but I am going to focus on Discriminant Analysis and Binary Logistic Regression.

These tools can be used to predict group membership. If we look at exh_mvar.mtw, located in Minitab’s sample data folder, we have the perfect data set to use. Here is a snapshot of the first 30 or so observations:

[Image: snapshot of the first observations in exh_mvar.mtw]

Fifty fish from each place of origin (Alaska, Canada) were caught and growth ring diameters of scales were measured for the time when they lived in...