Violations of the Assumptions for Linear Regression: Closing Arguments and Verdict

Minitab Blog Editor | 18 February, 2013

Topics: Regression Analysis

  Lionel Loosefit has been hauled to court for violating the assumptions of regression analysis. On the last day of the trial, the prosecution and defense present their closing arguments. And the fate of Mr. Loosefit is decided by judge and jury...

The Prosecution's Summary

Prosecutor: Ladies and gentlemen, we’ve presented a slew of evidence in this trial. You’ve seen, with your own eyes, every possible heinous violation of the assumptions for regression in the defendant’s model. Here’s what we’ve shown, in a nutshell:

[Figure: summary of the regression assumption violations, in a nutshell]

Prosecutor: We’ve carefully delineated each violation with specific graphic evidence on Days 1, 2, and 3 of the trial. The evidence is so overwhelming, you might have trouble keeping it straight. Luckily, there’s a quick, easy way to review the cumulative evidence and reach a verdict. When you run regression analysis in Minitab's statistical software, simply click Graphs and select Four in One.

Had the defendant taken this simple precaution, he’d have gotten this:

[Figure: Four in One residual plots for the defendant's model]
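For readers following the trial without Minitab at hand, the raw ingredients of those four residual plots can be sketched in a few lines. This is a minimal illustration, not Minitab's implementation: the data are hypothetical, the names `fit_line` and `residual_diagnostics` are made up for this example, and the plotting positions used for the normal scores are an assumption.

```python
from statistics import NormalDist, mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_diagnostics(x, y):
    """Return the values that feed the four residual plots."""
    b0, b1 = fit_line(x, y)
    fits = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fits)]
    n = len(residuals)
    # Normal scores for a probability plot (Blom plotting positions, assumed).
    ranked = sorted(range(n), key=lambda i: residuals[i])
    normal_scores = [0.0] * n
    for rank, i in enumerate(ranked, start=1):
        normal_scores[i] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
    return {
        "fits": fits,                    # x-axis of residuals versus fits
        "residuals": residuals,          # y-axis of the plots, histogram input
        "normal_scores": normal_scores,  # paired with residuals in the probability plot
        "order": list(range(1, n + 1)),  # x-axis of residuals versus order
    }

# Hypothetical data with obvious curvature: y is x squared.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 2 for xi in x]
diag = residual_diagnostics(x, y)
print([round(r, 2) for r in diag["residuals"]])
```

With curved data like this, the residuals come out positive at both ends and negative in the middle, which is the U-shaped pattern a residuals versus fits plot would expose.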

 

Prosecutor: Had he done so, Mr. Loosefit might have then tried to transform his data to remedy some of these problems. In fact, the General Regression command in Minitab comes with a built-in power transformation expressly for that purpose:

[Figure: Box-Cox power transformation option in General Regression]
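The idea behind a power transformation can be sketched outside Minitab as well. The example below is an assumption-laden illustration, not Minitab's algorithm: it tries a handful of candidate lambda values, applies a scaled Box-Cox transform to the response, and keeps the lambda whose straight-line fit leaves the smallest residual sum of squares. The function names and the data are hypothetical.

```python
from math import log
from statistics import geometric_mean, mean

def sse_of_line(x, y):
    """Residual sum of squares from a least-squares straight line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def box_cox(y, lam):
    """Scaled Box-Cox transform, so SSE is comparable across lambdas.

    Requires strictly positive y values.
    """
    g = geometric_mean(y)
    if lam == 0:
        return [g * log(yi) for yi in y]
    return [(yi ** lam - 1) / (lam * g ** (lam - 1)) for yi in y]

def best_lambda(x, y, candidates=(-2, -1, -0.5, 0, 0.5, 1, 2)):
    """Pick the candidate lambda with the smallest straight-line SSE."""
    return min(candidates, key=lambda lam: sse_of_line(x, box_cox(y, lam)))

# Hypothetical curved data: y is x squared, so a square root straightens it.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 2 for xi in x]
print(best_lambda(x, y))
```

Dividing by the geometric-mean term keeps the transformed responses on a comparable scale, which is what makes it fair to compare the sums of squares across different lambdas.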

Prosecutor: Of course, the defendant didn’t bother to try that, either. And he had many other opportunities to amend the error of his ways. For example, using Regression > Fitted Line Plot, he could have changed his model from linear to quadratic to better account for the curvature in his data.

[Figure: fitted line plots, linear vs. quadratic]
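The linear versus quadratic comparison can also be reproduced by hand, so to speak. The sketch below fits both models by solving the normal equations and compares their R-squared values. It is a rough illustration with made-up data and hypothetical helper names (`polyfit`, `r_squared`), not a description of how Fitted Line Plot works internally.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] via the normal equations."""
    m = degree + 1
    # Augmented matrix [X'X | X'y] built from sums of powers of x.
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)]
         + [sum(yi * xi ** r for xi, yi in zip(x, y))]
         for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]

def r_squared(x, y, coefs):
    """Proportion of variation in y explained by the fitted polynomial."""
    y_bar = sum(y) / len(y)
    fits = [sum(b * xi ** k for k, b in enumerate(coefs)) for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical data: roughly quadratic with a little noise.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 3.9, 9.3, 15.8, 25.1, 36.4, 48.7, 64.2]

r2_linear = r_squared(x, y, polyfit(x, y, 1))
r2_quadratic = r_squared(x, y, polyfit(x, y, 2))
print(round(r2_linear, 4), round(r2_quadratic, 4))
```

A higher R-squared for the quadratic model is only part of the story. The residual plots for the new model still need to be checked before the jury can rest.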

Prosecutor: What's more, he could have right-clicked a graph and chosen Brush to identify and investigate outliers in his data.

[Figure: brushing points on a graph to identify outliers]

Prosecutor: Of course, Mr. Loosefit didn’t bother to do those things either. Was he unsure how to interpret the plots? Does he suffer from post-traumatic statistics disorder (PTSD)? The defense might have you believe that. But it won’t hold water. Because Mr. Loosefit had only one X variable and one Y variable. So he could have easily run his regression analysis in the Minitab Assistant (Assistant > Regression) and obtained a clear, user-friendly diagnostic report showing the problems in his model:        

[Figure: Minitab Assistant diagnostic report]

Prosecutor: But he didn't do that either. In short, Minitab gave Lionel Loosefit all the chances in the world. Why did he not avail himself of any of these opportunities to take the high road? To follow the basic tenets of statistical decency?  

[Spectators shake heads sadly.]

Prosecutor: I’ll tell you why, ladies and gentlemen. Because Lionel Loosefit has absolutely no residual shame!!!!

[Courtroom erupts]

Judge: Order! Order in the court!

Prosecutor: And for that reason alone, you must find him guilty on all counts!

The Defense's Summary

Defense: The prosecution makes it sound so easy, doesn’t it? Just choose Graphs > Four in One, they say. Just transform the data, they say. But we all know real life doesn’t always work out quite so neatly. And as he grappled with the complex statistical requirements for a regression analysis, not realizing all the options that existed in Minitab to help him, working under deadlines, Lionel Loosefit did what any one of us would do. He reached for help in a bottle…

[Spectators shake heads sadly]

Defense: Effervescent brain salt. Lured by the promise of an instant cure for his statistics-induced brain troubles, Mr. Loosefit chugged the entire bottle. Unfortunately, he was completely unaware of the possible side-effects. Allow me to read those, starting on page 234, line 3487 [puts bottle insert under an electron microscope]:

"In susceptible individuals, effervescent brain salt may cause tennis elbow, pink eye, tapeworm, dry heaves, church laughter…

[2 hours later]

hammertoes, bubonic plague, sudden bouts of belly-dancing, foot-in-mouth disease, triple vision, and short-term paralysis of the index finger."

Did you catch that last one? Short-term paralysis of the index finger, ladies and gentlemen. Making the defendant temporarily unable to click Graphs > Four in One, or any of the other options to evaluate regression assumptions in Minitab.

Spectator [whispers]: Is that all they got for a defense? Salt on the brain?

Defense: No! We’ve got an even more compelling defense, for all you whispering spectators out there. Remember, as the prosecution itself demonstrated on Day 3, the date and time that Mr. Loosefit performed his regression analysis was duly recorded in the Minitab Session window:

[Figure: Minitab Session window showing the date and time of the analysis]

Think back, everyone. Almost exactly one year ago to the day. February 19, 2012. The Knicks vs the Mavericks. Madison Square Garden. Game time started at 1:00 PM ET.

Prosecutor: Objection, Your Honor. This is highly irrelevant.

Judge: Sustained. Counsel, get to the point.

Defense: Your Honor, like many of us, Mr. Loosefit has two monitors at his workstation. On one monitor he was performing a regression analysis using Minitab. On the other monitor, he was watching Jeremy Lin score 28 points and tally a career high of 14 assists and 5 steals.

Prosecutor: Surely you’re not suggesting…

Defense: Absolutely! The defendant is not guilty by reason of temporary Linsanity! Caught up in the delirium of the game, he forgot to display residual plots. It could have happened to any of us. Thankfully, the sudden bout of Linsanity ended as quickly as it started. Today, the defendant does not present the slightest [yawn] danger to statistical…[yawn] socie…..zzzzzzz….[snores]

Judge: It appears that the defense rests.

The Verdict

Judge [to jury]: Have you reached a verdict?

Jury: We have, Your Honor. We find the defendant guilty on all charges.

Judge: Mr. Loosefit, I’m not 95% confident that we can accurately predict your future responses. That’s why I want to make sure that you don’t have even one degree of freedom to estimate a model, ever again.

[Courtroom applause]

Judge: I hereby sentence you to 30 years of hard labor, calculating all the statistics for each regression analysis yourself…without using Minitab. You’ll be calculating coefficients and plotting residuals for the remainder of your days, Loosefit. And to make doubly sure you learn your lesson, I’m denying you access to the calculator in Calc > Calculator or even the one in Tools > Calculator. All your calculations must be done by hand, including long division!

Generation X spectator: “Long division”? What’s that?! 

Generation Y spectator: Sounds like some kind of sadistic medieval punishment.

Generation Z spectator: Didn’t they outlaw that at the Geneva Convention?

Judge: But I'm not without mercy. I won't make you search for every formula in a statistics textbook. You'll be allowed to use Help > Methods and Formulas > Statistics > Regression. You'll find all the formulas you need there.

Loosefit: No! NO!! Not long division! Okay, I confess. But it was my p-values that made me do it. I wanted to feel statistically significant!

Judge: Statistical significance doesn't mean jack diddly if your model assumptions aren't met. Bailiff, take Mr. Loosefit away.

Update: Since his sentencing, Lionel Loosefit has undergone a power transformation in prison. He now works for the benefit of the public good, making sure that others remember to use Minitab to check their assumptions whenever they perform a regression analysis.