Blind Wine Part II: The Survey

In Blind Wine Part I, we introduced our experimental setup, which included some survey questions asked ahead of time of each participant. The four questions asked were:

  • On a scale of 1 to 10, how would you rate your knowledge of wine?
  • How much would you typically spend on a bottle of wine in a store?
  • How many different types of wine (merlot, riesling, cabernet, etc.) would you buy regularly (not as gifts)?
  • Out of the following 8 wines, which do you think you could correctly identify by taste?
      • Merlot
      • Cabernet Sauvignon
      • Pinot Noir
      • Malbec
      • Chardonnay
      • Pinot Grigio
      • Sauvignon Blanc
      • Riesling

Today, we'd like to...

What Is the Difference between Linear and Nonlinear Equations in Regression Analysis?

Previously, I’ve written about when to choose nonlinear regression and how to model curvature with both linear and nonlinear regression. Since then, I’ve received several comments expressing confusion about what differentiates nonlinear equations from linear equations. This confusion is understandable because both types can model curves.

So, if it’s not the ability to model a curve, what is the difference between a linear and nonlinear regression equation?
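To make the point that a linear regression can follow a curve concrete, here is a minimal sketch in plain Python. It is not Minitab's procedure: the data are made up, and the helper names `solve_linear_system` and `fit_least_squares` are hypothetical. Ordinary least squares is given a squared term as an extra predictor column and recovers the curve.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so each row operation applies to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(design, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up data lying exactly on the curve y = 1 + 2x + 3x^2 (an assumption).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]

# The squared term is just another predictor column for least squares.
design = [[1.0, xi, xi ** 2] for xi in x]
coefficients = fit_least_squares(design, y)
print(coefficients)  # intercept, x coefficient, x^2 coefficient
```

The fitted line is curved in x, yet it was estimated with the same least squares machinery used for a straight line.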

Linear Regression Equations

Linear regression requires a linear model. No surprise, right? But what does that really mean?

A model is linear...

How to Interpret a Regression Model with Low R-squared and Low P values

In regression analysis, you'd like your regression model to have significant variables and to produce a high R-squared value. This low P value / high R² combination indicates that changes in the predictors are related to changes in the response variable and that your model explains a lot of the response variability.

This combination seems to go together naturally. But what if your regression model has significant variables but explains little of the variability? It has low P values and a low R-squared.

At first glance, this combination doesn’t make sense. Are the significant predictors still...
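A small simulation can show that the two statistics measure different things. The sketch below is an illustration under stated assumptions, not the analysis from the post: the data are simulated with a fixed seed, the true slope and noise level are arbitrary choices, and the P value uses a normal approximation to the t distribution, which is reasonable only because the sample is large.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000

# Assumed setup: a real but modest trend (slope 1) buried in heavy noise (sd 10).
x = [rng.uniform(0, 10) for _ in range(n)]
y = [2.0 + 1.0 * xi + rng.gauss(0, 10) for xi in x]

x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Simple linear regression by least squares.
slope = sxy / sxx
intercept = y_mean - slope * x_mean

sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1 - sse / sst

# Standard error of the slope and its test statistic.
s = math.sqrt(sse / (n - 2))
slope_se = s / math.sqrt(sxx)
t_stat = slope / slope_se

# Two-sided P value, normal approximation to the t distribution (large n).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"R-squared = {r_squared:.3f}, slope = {slope:.3f}, P = {p_value:.2e}")
```

With this much noise the model explains only a small share of the variability, yet the slope is estimated precisely enough to be clearly distinguishable from zero.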

Multiple Regression Analysis and Response Optimization Examples using the Assistant in Minitab 17

In Minitab, the Assistant menu is your interactive guide to choosing the right tool, analyzing data correctly, and interpreting the results. If you’re feeling a bit rusty with choosing and using a particular analysis, the Assistant is your friend!

Previously, I’ve written about the new linear model features in Minitab 17. In this post, I’ll work through a multiple regression analysis example and optimize the response variable to highlight the new features in the Assistant.

Choose a Regression Analysis

As part of a solar energy test, researchers measured the total heat flux. They found that heat...

Chaos at the Kentucky Derby? Bet on It!

If betting wasn't allowed on horse racing, the Kentucky Derby would likely be a little-known event of interest only to a small group of horse racing enthusiasts. But like the Tour de France, the World Cup, and the Masters Tournament, even those with little or no knowledge of the sport in general seem drawn to the excitement over its premier event—the mint juleps, the hats...and of course, the betting.

As most of you probably already know, then, a big part of betting is the odds placed on a particular horse, so that a bet on the favorite to win the race would pay out significantly less than a...

Re-analyzing Wine Tastes with Minitab 17

In April 2012, I wrote a short paper on binary logistic regression to analyze wine tasting data. At that time, François Hollande was about to get elected as French president and in the U.S., Mitt Romney was winning the Republican primaries. That seems like a long time ago…

Now, in 2014, Minitab 17 Statistical Software has just been released. Had Minitab 17 been available in 2012, would I have conducted my analysis in a different way? Would the results still look similar? I decided to re-analyze my April 2012 data with Minitab 17 and assess the differences, if there are any.

There were no...

Why Is There No R-Squared for Nonlinear Regression?

Nonlinear regression is a very powerful analysis that can fit virtually any curve. However, it's not possible to calculate a valid R-squared for nonlinear regression. This topic gets complicated because, while Minitab statistical software doesn’t calculate R-squared for nonlinear regression, some other packages do.

So, what’s going on?

Minitab doesn't calculate R-squared for nonlinear models because the research literature shows that it is an invalid goodness-of-fit statistic for this type of model. There are bad consequences if you use it in this context.

Why Is It Impossible to Calculate a...

Unleash the Power of Linear Models with Minitab 17

We released Minitab 17 Statistical Software a couple of days ago. Certainly every new release of Minitab is a reason to celebrate. However, I am particularly excited about Minitab 17 from a data analyst’s perspective. 

If you read my blogs regularly, you’ll know that I’ve extensively used and written about linear models. Minitab 17 has a ton of new features that expand and enhance many types of linear models. I’m thrilled!

In this post, I want to share with my fellow analysts the new linear model features and the benefits that they provide.

New Linear Model Analyses in Minitab 17

We’ve added...

R-Squared: Sometimes, a Square is just a Square

If you regularly perform regression analysis, you know that R² is a statistic used to evaluate the fit of your model. You may even know the standard definition of R²: the percentage of variation in the response that is explained by the model.

Fair enough. With Minitab Statistical Software doing all the heavy lifting to calculate your R² values, that may be all you ever need to know.

But if you’re like me, you like to crack things open to see what’s inside. Understanding the essential nature of a statistic helps you demystify it and interpret it more accurately.
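For reference, the standard definition quoted above can be written out as a formula. The notation here is an assumption rather than the post's own: y_i are the observed responses, \hat{y}_i the fitted values, and \bar{y} the mean response.

```latex
R^2 = 100\% \times \left( 1 - \frac{SSE}{SST} \right),
\qquad
SSE = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2,
\qquad
SST = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2
```

SST measures the total variation in the response and SSE the part the model leaves unexplained, so the ratio SSE/SST is the unexplained share and one minus that ratio is the explained share.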

R-squared: Where Geometry Meets...

Regression Analysis: How to Interpret S, the Standard Error of the Regression

R-squared gets all of the attention when it comes to determining how well a linear model fits the data. However, I've stated previously that R-squared is overrated. Is there a different goodness-of-fit statistic that can be more helpful? You bet!

Today, I’ll highlight a sorely underappreciated regression statistic: S, or the standard error of the regression. S provides important information that R-squared does not.

What is the Standard Error of the Regression (S)?

S becomes smaller when the data points are closer to the line.
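The statement above can be checked with a minimal sketch. It assumes the usual formula for a simple regression, S = sqrt(SSE / (n - 2)), uses made-up data, and the function name `standard_error_of_regression` is hypothetical. Two data sets share the same underlying line, but one is scattered ten times further from it.

```python
import math


def standard_error_of_regression(x, y):
    """Fit y = b0 + b1*x by least squares and return S = sqrt(SSE / (n - 2))."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


# Made-up data around the line y = 2x, with alternating offsets above and below.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
direction = [1, -1, 1, -1, 1, -1]
y_tight = [2 * xi + 0.1 * d for xi, d in zip(x, direction)]  # close to the line
y_loose = [2 * xi + 1.0 * d for xi, d in zip(x, direction)]  # far from the line

s_tight = standard_error_of_regression(x, y_tight)
s_loose = standard_error_of_regression(x, y_loose)
print(f"S (tight) = {s_tight:.4f}, S (loose) = {s_loose:.4f}")
```

Because S is expressed in the units of the response, it can be read directly as a typical distance between the data points and the fitted line.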

In the regression output for Minitab statistical software, you can find...

A Statistical Look at How Turnovers Impacted the NFL Season

“Turnovers are like ex-wives. The more you have, the more they cost you.” – Dave Widell, former Dallas Cowboys lineman

It doesn’t take witty insight from a former NFL player to realize how big an impact turnovers can have in a football game. Every time an announcer talks about “Keys to the Game,” winning the turnover battle is one of them. And as Cowboys fans know all too well, an ill-timed interception can ruin not only your chances of winning that game, but it can ruin your entire season, too.

But hold on a minute. A few weeks ago, Andrew Luck and the Colts proved that you could still win a...

How High Should R-squared Be in Regression Analysis?

Just how high should R² be in regression analysis? I hear this question asked quite frequently.

Previously, I showed how to interpret R-squared (R²). I also showed how it can be a misleading statistic because a low R-squared isn’t necessarily bad and a high R-squared isn’t necessarily good.

Clearly, the answer for “how high should R-squared be” is . . . it depends.

In this post, I’ll help you answer this question more precisely. However, bear with me, because my premise is that if you’re asking this question, you’re probably asking the wrong question. I’ll show you which questions you should...

Regression Analysis Tutorial and Examples

I’ve written a number of blog posts about regression analysis and I think it’s helpful to collect them in this post to create a regression tutorial. I’ll supplement my own posts with some from my colleagues.

This tutorial covers many aspects of regression analysis including: choosing the type of regression analysis to use, specifying the model, interpreting the results, determining how well the model fits, making predictions, and checking the assumptions. At the end, I include examples of different types of regression analyses.

If you’re learning regression analysis right now, you might want to...

Fix Problems in Regression Analysis with Partial Least Squares

Face it, you love regression analysis as much as I do. Regression is one of the most satisfying analyses in Minitab: get some predictors that should have a relationship to a response, go through a model selection process, interpret fit statistics like adjusted R² and predicted R², and make predictions. Yes, regression really is quite wonderful.

Except when it’s not. Dark, seedy corners of the data world exist, lying in wait to make regression confusing or impossible. Good old ordinary least squares regression, to be specific.

For instance, sometimes you have a lot of detail in your data, but not...

See How Easily You Can Do a Box-Cox Transformation in Regression

For one reason or another, the response variable in a regression analysis might not satisfy one or more of the assumptions of ordinary least squares regression. The residuals might follow a skewed distribution or the residuals might curve as the predictions increase. A common solution when problems arise with the assumptions of ordinary least squares regression is to transform the response variable so that the data do meet the assumptions. Minitab makes the transformation simple by including the Box-Cox button. Try it for yourself and see how easy it is!
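For readers curious about what sits behind that button, here is a rough sketch of the Box-Cox idea in plain Python. It is not Minitab's implementation. It makes a simplifying assumption: lambda is chosen for the response alone (a mean-only model) by a grid search over the profile log-likelihood, whereas a regression setting would use the residuals of the fitted model. The function names and the data are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of positive data: (y^lam - 1) / lam, or ln(y) if lam is 0."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]


def profile_log_likelihood(y, lam):
    """Log-likelihood of lam (up to a constant), assuming the transformed
    values are normally distributed around a single mean."""
    n = len(y)
    z = box_cox(y, lam)
    z_mean = sum(z) / n
    variance = sum((v - z_mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(variance) + (lam - 1) * sum(math.log(v) for v in y)


def best_lambda(y, candidates):
    """Pick the candidate lambda with the highest profile log-likelihood."""
    return max(candidates, key=lambda lam: profile_log_likelihood(y, lam))


# Made-up right-skewed response: exponentials of evenly spaced values.
y = [math.exp(v / 2) for v in range(1, 11)]
candidates = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25

lam = best_lambda(y, candidates)
transformed = box_cox(y, lam)
print(f"chosen lambda = {lam}")
```

A lambda of 1 leaves the shape of the data unchanged, 0.5 corresponds to a square root, and 0 to the natural log, which is why a coarse grid of rounded values is often enough in practice.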

The government in Queensland,...

Correlation Is not Causation: Why Running the Football Doesn’t Cause You to Win Games in the NFL

I know we lost by 2 touchdowns, but if only you had given Peterson 3 more carries we would have won!

Last week, ESPN ran an article about why the running game still matters. They used statistics to show that the more you run the football in the NFL, the more likely you are to win the game. Specifically, if you have a running back who gets at least 20 carries, you win about 70% of the time. Statistics from different eras all had the same result: it appears that the more you run the football, the better your odds of winning the football game are.

If only it were that simple.

There is no doubt that...

Regression Analysis: Moving On with Minitab

by Matthew Barsalou, guest blogger

I recently moved, and right after finishing the less-than-joyous task of unpacking I decided to take a break and relax by playing with Minitab Statistical Software.

As a data source I used the many quotes I received from moving companies. I'd invited many companies to look around my previous home, and each then gave me an estimate of the price in euros as well as an estimate of the amount of goods that would need to be transported. The "amount of goods" estimate was given in boxes. I don’t know what size boxes were referred to, but all the...

Four Tips on How to Perform a Regression Analysis that Avoids Common Problems

In my previous post, I highlighted recent academic research that shows how the presentation style of regression results affects the number of interpretation mistakes. In this post, I present four tips that will help you avoid the more common mistakes of applied regression analysis that I identified in the research literature.

I’ll focus on applied regression analysis, which is used to make decisions rather than just to determine the statistical significance of the predictors. Applied regression analysis emphasizes both the ability to influence the outcome and the precision of the predictions.


Interpreting Halloween Statistics with Binary Logistic Regression

As Halloween is almost here, I'm ready to check out some Halloween statistics. You can have a lot of fun with Minitab on Halloween.

The National Retail Federation (NRF) released the results of their Halloween Consumer Spending Survey last month. The basics are easy to summarize:

Because we have Minitab, we can dig a little deeper into the data. The NRF gives some information about the proportion of respondents who participate and the proportion of participators who will celebrate with different activities. The proportions for participators are broken down by different age groups. There’s...

Using Prediction Intervals to Define Process Windows

Making parts that are truly interchangeable is a critical aspect of modern manufacturing. The same parts may be manufactured in different plants spread around the globe or by suppliers located far away. Parts need to be manufactured to specifications so that they are nearly identical and can be assembled easily into new products.

Interchangeability is increasingly important in the service industry as well. Because customers expect similar standards from a service company wherever it does business around the globe, best practices need to be deployed throughout a company and...