Regression Analysis

Blog posts and articles about regression analysis methods applied to Lean and Six Sigma projects.

By Lion "Ari" Ondiappan Arivazhagan, guest blogger. An alarming number of borewell accidents, especially involving little children, have occurred across India in the recent past. This is the second in a series of articles on borewell accidents in India. In the first installment of the series, I used the G-chart in Minitab Statistical Software to predict the probabilities of innocent children... Continue Reading
In part 1 of this post, I covered how Six Sigma students at Rose-Hulman Institute of Technology cleaned up and prepared project data for a regression analysis. Now we're ready to start our analysis. We’ll detail the steps in that process and what we can learn from our results. What Factors Are Important? We collected data about 11 factors we believe could be significant: Whether the date of... Continue Reading
By Peter Olejnik, guest blogger. Previous posts on the Minitab Blog have discussed the work of the Six Sigma students at Rose-Hulman Institute of Technology to reduce the quantities of recyclables that wind up in the trash. Led by Dr. Diane Evans, these students continue to make an important impact on their community. As with any Six Sigma process, the results of the work need to be evaluated. A... Continue Reading
If you wanted to figure out the probability that your favorite football team will win their next game, how would you do it?  My colleague Eduardo Santiago and I recently looked at this question, and in this post we'll share how we approached the solution. Let’s start by breaking down this problem: There are only two possible outcomes: your favorite team wins, or they lose. Ties are a possibility,... Continue Reading
The Minitab Fan section of the Minitab blog is your chance to share with our readers! We always love to hear how you are using Minitab products for quality improvement projects, Lean Six Sigma initiatives, research and data analysis, and more. If our software has helped you, please share your Minitab story, too! My LSS coach suggested that I regularly conduct data analysis to refresh my Minitab... Continue Reading
Recently, Minitab’s Joel Smith posted about his vacation and being pooped on twice by birds. Then guest blogger Matthew Barsalou wrote a wonderful follow-up on the chances of Joel being pooped on a third time. While I cannot comment on how Joel has handled this situation psychologically so far, I can say that if I had been pooped on twice in a short amount of time, I would be wary of our... Continue Reading
As someone who has collected and analyzed real data for a living, the idea of using simulated data for a Monte Carlo simulation sounds a bit odd. How can you improve a real product with simulated data? In this post, I’ll help you understand the methods behind Monte Carlo simulation and walk you through a simulation example using Devize. What is Devize, you ask? Devize is Minitab's exciting new,... Continue Reading
In my recent meetings with people from various companies in the service industries, I realized that one of the problems they face is that they were collecting large amounts of "qualitative" data: types of product, customer profiles, different subsidiaries, several customer requirements, etc. As I discussed in my previous post, one way to look at qualitative data is to use different types of... Continue Reading
Choosing the correct linear regression model can be difficult. After all, the world and how it works is complex. Trying to model it with only a sample doesn’t make it any easier. In this post, I'll review some common statistical methods for selecting models, complications you may face, and provide some practical advice for choosing the best regression model. It starts when a researcher wants to... Continue Reading
Last fall I had a birthday. It wasn’t one of those tougher birthdays where the number ends in a zero. Still, the birthday got me thinking. In response, I told myself, age is just a number. Then I did a mental double-take. Can a statistician say that? After all, numbers are how I understand the world and the way it works. Can age just be a number? After some musing, I concluded that age is just a... Continue Reading
"Data! Data! Data! I can't make bricks without clay."  — Sherlock Holmes, in Arthur Conan Doyle's The Adventure of the Copper Beeches Whether you're the world's greatest detective trying to crack a case or a person trying to solve a problem at work, you're going to need information. Facts. Data, as Sherlock Holmes says.  But not all data is created equal, especially if you plan to analyze as part of... Continue Reading
Stepwise regression and best subsets regression are both automatic tools that help you identify useful predictors during the exploratory stages of model building for linear regression. These two procedures use different methods and present you with different output. An obvious question arises. Does one procedure pick the true model more often than the other? I’ll tackle that question in this post. Fi... Continue Reading
Using a sample to estimate the properties of an entire population is common practice in statistics. For example, the mean from a random sample estimates that parameter for an entire population. In linear regression analysis, we’re used to the idea that the regression coefficients are estimates of the true parameters. However, it’s easy to forget that R-squared (R2) is also an estimate.... Continue Reading
You need to consider many factors when you’re buying a used car. Once you narrow your choice down to a particular car model, you can get a wealth of information about individual cars on the market through the Internet. How do you navigate through it all to find the best deal?  By analyzing the data you have available.   Let's look at how this works using the Assistant in Minitab 17. With the... Continue Reading
I’ve written about the importance of checking your residual plots when performing linear regression analysis. If you don’t satisfy the assumptions for an analysis, you might not be able to trust the results. One of the assumptions for regression analysis is that the residuals are normally distributed. Typically, you assess this assumption using the normal probability plot of the residuals. Are... Continue Reading
In my previous post, I described how I was asked to weigh in on the ethics of researchers (DeStefano et al. 2004) who reportedly discarded data and potentially set scientific knowledge back a decade. I assessed the study in question and found that no data was discarded and that the researchers used good statistical practices. In this post, I assess a study by Brian S. Hooker that was... Continue Reading
The other day I received a request from a friend to look into a new study in a peer-reviewed journal that found a link between MMR vaccinations and an increased risk of autism in African American boys. To draw this conclusion, the new study reanalyzed data that was discarded a decade ago by a previous study. My friend wanted to know, from a statistical perspective, was it unethical for the... Continue Reading
Previously, I showed why there is no R-squared for nonlinear regression. Anyone who uses nonlinear regression will also notice that there are no P values for the predictor variables. What’s going on? Just like there are good reasons not to calculate R-squared for nonlinear regression, there are also good reasons not to calculate P values for the coefficients. Why not—and what to use instead—are the... Continue Reading
I caught the end of Toy Story over the weekend, which is definitely one of my all-time favorite children’s movies. Now, unfortunately or fortunately, I can’t get Randy Newman's theme song, “You’ve Got a Friend in Me,” out of my head! It's also got me thinking about the nature of friendship, and how "best friends forever" are supposed to always be there when you need them. And, not to get too maudlin... Continue Reading
The current Ebola outbreak in Guinea, Liberia, and Sierra Leone is making headlines around the world, and rightfully so: it's a frightening disease, and last week the World Health Organization reported its spread is outpacing their response. Nearly 900 of the more than 1,600 people infected during this outbreak have died, including some leading medical professionals trying to stanch the... Continue Reading