# Linear or Nonlinear Regression? That Is the Question.

As you probably noticed, the field of statistics is a strange beast. Need more evidence? Linear regression can produce curved lines and nonlinear regression is not named for its curved lines.

So, when should you use Nonlinear Regression over one of our linear methods, such as Regression, Best Subsets, or Stepwise Regression?

Generally speaking, you should try linear regression first. It’s easier to use and easier to interpret. However, if you simply aren’t able to get a good fit with linear regression, then it might be time to try nonlinear regression.

Let’s look at a case where linear regression doesn’t work. Often the problem is that, while linear regression can model curves, it might not be able to model the specific curve that exists in your data. The graphs below illustrate this with a linear model that contains a cubed predictor.

The fitted line plot shows that the raw data follow a nice tight function and the R-squared is 98.5%, which looks pretty good. However, look closer and the regression line systematically over- and under-predicts the data at different points in the curve. When you check the residual plots (which you *always* do, right?), you see patterns in the Residuals versus Fits plot, rather than the randomness that you want to see. This indicates a bad fit, but it’s the best that linear regression can do.

Let’s try it again, but using nonlinear regression. It's important to note that because nonlinear regression allows a nearly infinite number of possible functions, it can be more difficult to set up. In this case, it required considerable effort to determine the function that provided the optimal fit for the specific curve present in these data. Since my main point is to explain when you want to use nonlinear regression instead of linear, we don't need to relate all of those details here. (Just like on a cooking show, on the blog we have the ability to jump from the raw ingredients to a great outcome in the graphs below without showing all of the work in between!)

What is the difference between linear and nonlinear regression equations?

The fitted line plot shows that the regression line follows the data almost exactly -- there are no systematic deviations. It’s impossible to calculate R-squared for nonlinear regression, but the S value (roughly speaking, the average absolute distance from the data points to the regression line) improves from 72.4 (linear) to just 13.7 for nonlinear regression. You want a lower S value because it means the data points are closer to the fit line. What's more, the Residuals versus Fits plot shows the randomness that you want to see. It’s a good fit!
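If you want to see what goes into S, here is a minimal sketch. It assumes the usual definition, the square root of the sum of squared errors divided by the residual degrees of freedom (observations minus estimated parameters). The function name and the numbers are made up for illustration; they are not the Mobility data.

```python
import math

def standard_error_of_regression(observed, fitted, n_params):
    """Return S = sqrt(SSE / (n - p)), the standard error of the regression."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return math.sqrt(sse / dof)

# Hypothetical values, not taken from the Mobility worksheet
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
s_value = standard_error_of_regression(observed, fitted, n_params=2)
print(round(s_value, 4))
```

Because S is in the units of the response, you can compare it directly between a linear and a nonlinear fit of the same data.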

Nonlinear regression can be a powerful alternative to linear regression but there are a few drawbacks. In addition to the aforementioned difficulty in setting up the analysis and the lack of R-squared, be aware that:

• The effect each predictor has on the response can be less intuitive to understand.

• P-values are impossible to calculate for the predictors.

• Confidence intervals may or may not be calculable.

If you're using Minitab now, you can play with this data yourself by going to **File -> Open Worksheet**, then click on the **Look in Minitab Sample Data folder** icon and choose **Mobility.MTW**. These data are the same that I’ve used in the Nonlinear Regression Help example, which contains a fuller interpretation of the Nonlinear Regression output.

If you'd like to try it, you can download the free 30-day trial of Minitab Statistical Software. If you're learning about regression, read my regression tutorial!

Name: Nabil Darwazeh • Monday, February 17, 2014

Why is it impossible to calculate R-squared for nonlinear regression, while Excel does calculate the R-squared?

Name: Jim Frost • Thursday, February 20, 2014

Hi Nabil,

That's a very timely question. In a couple of weeks I'll publish a blog post about this very topic. So, in the meantime, I'll provide a brief explanation.

For linear models, the sums of the squared errors always add up in a specific manner: SS Regression + SS Error = SS Total.

This seems quite logical. The variability that the regression model accounts for plus the error variability add up to equal the total variability. Further, R-squared equals SS Regression / SS Total, which mathematically must produce a value between 0 and 100%.

In nonlinear regression, SS Regression + SS Error does not equal SS Total! This completely invalidates R-squared for nonlinear models, which no longer has to be between 0 and 100%.
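Here is a rough sketch of that difference, using only the Python standard library. The data are hypothetical, the nonlinear model y = exp(b * x) is just an example I picked, and the grid search is a crude stand-in for a real nonlinear least squares routine.

```python
import math

def sums_of_squares(observed, fitted):
    """Return (SS Regression, SS Error, SS Total) around the mean of observed."""
    mean_y = sum(observed) / len(observed)
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)
    ss_err = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return ss_reg, ss_err, ss_tot

# Hypothetical data for illustration only
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.2, 1.5, 2.4, 3.1, 4.9]

# Linear model with an intercept: closed-form least squares for y = b0 + b1 * x
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
linear_fit = [b0 + b1 * xi for xi in x]

# Nonlinear model y = exp(b * x): approximate least squares by a fine grid over b
def sse_for(b):
    return sum((yi - math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))

best_b = min((i / 10000 for i in range(10001)), key=sse_for)
nonlinear_fit = [math.exp(best_b * xi) for xi in x]

for label, fit in (("linear", linear_fit), ("nonlinear", nonlinear_fit)):
    ss_reg, ss_err, ss_tot = sums_of_squares(y, fit)
    print(label, "SS Regression + SS Error - SS Total =", ss_reg + ss_err - ss_tot)
```

For the linear fit the gap is zero apart from rounding. For the nonlinear fit it is not, so SS Regression / SS Total stops behaving like a percentage.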

It's true that some software packages calculate R-squared for nonlinear regression. However, academic studies have shown that this approach is invalid. Using R-squared to evaluate nonlinear models will generally lead you astray. You don't want this! That's why Minitab doesn't offer R-squared for nonlinear regression.

Instead, compare S values, and go with the smaller values.

Again, check back in a couple of weeks for a complete post about this!

Thanks for reading and the great question!

Jim

Name: Shasha • Monday, March 17, 2014

Hi Jim,

So can I conclude that a regression model with high R-sq(adj) does not mean that the model is accurate? How do I determine if the regression model is reliable despite a low R-sq(adj)?

Regards,

Shasha

Name: Jim Frost • Friday, March 21, 2014

Hi Shasha,

A high adjusted R-squared (or even the regular R-squared) doesn't necessarily mean that the model is a good fit. You should always check the residual plots to be sure that the model is not biased. If the residual plots look good, then you can trust the goodness-of-fit measures, such as R-squared and adjusted R-squared.

You would have different interpretations of a low adjusted R-squared depending on how it compares to your regular R-squared.

If the regular is high and the adjusted is low, you probably have too many predictors in your model.

Read this post for more details about using adjusted R-squared:

http://blog.minitab.com/blog/adventures-in-statistics/multiple-regession-analysis-use-adjusted-r-squared-and-predicted-r-squared-to-include-the-correct-number-of-variables
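To make the "high regular, low adjusted" case concrete, here is a small sketch that assumes the standard adjusted R-squared formula. The function name and the numbers (15 observations, a 90% R-squared) are hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof

# Same hypothetical R-squared of 90% from 15 observations,
# once with 2 predictors and once with 10
few = adjusted_r_squared(0.90, n_obs=15, n_predictors=2)
many = adjusted_r_squared(0.90, n_obs=15, n_predictors=10)
print(round(few, 3), round(many, 3))
```

The regular R-squared is identical in both calls. Only the number of predictors changes, and the adjusted value drops as predictors are added without improving the fit.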

If both types of R-squared are low, it's not necessarily bad if you have significant predictors and your residual plots are good. However, it depends on what you want to do with your model.

Read this blog post for more details about this scenario:

http://blog.minitab.com/blog/adventures-in-statistics/how-high-should-r-squared-be-in-regression-analysis

Jim

Name: Daisy • Friday, June 6, 2014

Hi Jim,

Can you elaborate on the linearity analysis for an MSA study? Should I use the p-value to determine whether the linearity is good? If so, does p > 0.05 indicate good linearity?

Name: Chris • Monday, June 16, 2014

Hi Jim -

In your 2 examples of fitting the data for Mobility versus some f(density), what are the resulting 2 "best fit" equations (for the linear and non-linear regression examples)? Since the resulting equations aren't shown in their entirety, I'm having trouble understanding the difference between "linear" and "non-linear" regression in your example. Thanks.

Regards,

Chris

Name: Jim Frost • Tuesday, June 17, 2014

Hi Chris,

I'll tackle the linear versus nonlinear regression question first.

As you know, both linear and nonlinear regression can model curves. The fundamental difference between linear and nonlinear regression, and the basis for the analyses' names, is the set of acceptable functional forms of the model. Specifically, linear regression requires a model that is linear in the parameters, while nonlinear regression does not.

A linear regression function must be linear in the parameters, which constrains the equation to just one basic form. Parameters are linear when each term in the model is additive and contains only one parameter that multiplies the term:

Response = constant + parameter * predictor + ... + parameter * predictor

or y = b0 + b1X1 + b2X2 + ... + bkXk

However, a nonlinear equation can take many different forms. In fact, because there are an infinite number of possibilities, you must specify the expectation function Minitab uses to perform nonlinear regression.

So, while a linear model with polynomial terms (e.g. squared terms) produces a curve, it is still linear regression because the functional form is linear.

Y = Constant + b1 * X1 + b2 * X1 squared

It's still linear in the parameters even though the predictor variable has been squared.
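One way to see this is that ordinary least squares treats the squared predictor as just another column. The sketch below uses only the standard library and hypothetical, noise-free data generated from Y = 2 + 3 * X - 0.5 * X squared; the helper names are mine, not Minitab's.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution

def fit_linear_model(design_rows, response):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(design_rows[0])
    xtx = [[sum(row[i] * row[j] for row in design_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design_rows, response))
           for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from Y = 2 + 3 * X - 0.5 * X^2 (no noise)
x_values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2 + 3 * x - 0.5 * x ** 2 for x in x_values]

# Columns: constant, X, X squared. Each parameter multiplies exactly one column.
design = [[1.0, x, x ** 2] for x in x_values]
constant, b1, b2 = fit_linear_model(design, y_values)
print(round(constant, 6), round(b1, 6), round(b2, 6))
```

No iteration or starting values are needed here, because the parameters enter the model additively.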

Here's an example of a nonlinear function, the Michaelis-Menten equation. There are 2 parameters (thetas) and one predictor (X). Very different than the linear form!

y = theta1 * X1 / ( theta2 + X1 )
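Because theta2 sits in the denominator, there is no one-step solution like the one for a linear model. Fitting has to iterate from starting values. The sketch below is a bare-bones Gauss-Newton loop with step halving, written with the standard library only. It is not Minitab's algorithm, and the data are hypothetical values generated from theta1 = 2 and theta2 = 0.5.

```python
def michaelis_menten(x, theta1, theta2):
    return theta1 * x / (theta2 + x)

def sse(xs, ys, theta1, theta2):
    return sum((y - michaelis_menten(x, theta1, theta2)) ** 2
               for x, y in zip(xs, ys))

def fit_michaelis_menten(xs, ys, theta1, theta2, iterations=50):
    """Gauss-Newton with step halving, starting from the supplied guesses."""
    for _ in range(iterations):
        # Residuals and partial derivatives at the current parameter values
        resid = [y - michaelis_menten(x, theta1, theta2) for x, y in zip(xs, ys)]
        d1 = [x / (theta2 + x) for x in xs]                 # df / d theta1
        d2 = [-theta1 * x / (theta2 + x) ** 2 for x in xs]  # df / d theta2
        # 2x2 normal equations (J'J) * step = J'r, solved in closed form
        a11 = sum(u * u for u in d1)
        a12 = sum(u * v for u, v in zip(d1, d2))
        a22 = sum(v * v for v in d2)
        g1 = sum(u * r for u, r in zip(d1, resid))
        g2 = sum(v * r for v, r in zip(d2, resid))
        det = a11 * a22 - a12 * a12
        if det == 0:
            break
        step1 = (a22 * g1 - a12 * g2) / det
        step2 = (a11 * g2 - a12 * g1) / det
        # Halve the step until theta2 stays positive and the fit is no worse
        current = sse(xs, ys, theta1, theta2)
        scale = 1.0
        while scale > 1e-8:
            t1, t2 = theta1 + scale * step1, theta2 + scale * step2
            if t2 > 0 and sse(xs, ys, t1, t2) <= current:
                break
            scale /= 2
        else:
            return theta1, theta2  # no acceptable step found
        theta1, theta2 = t1, t2
    return theta1, theta2

# Hypothetical, noise-free data generated from theta1 = 2.0, theta2 = 0.5
x_data = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
y_data = [michaelis_menten(x, 2.0, 0.5) for x in x_data]

est1, est2 = fit_michaelis_menten(x_data, y_data, theta1=1.0, theta2=1.0)
print(round(est1, 4), round(est2, 4))
```

With real, noisy data, poor starting values can keep a loop like this from converging, which is part of why nonlinear regression takes more effort to set up.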

For the mobility example, the 2 equations are:

Linear:

Mobility = 1243 + 412.3 * Density Ln - 94.29 * Density Ln^2 - 32.90 * Density Ln^3

The predictor for this model is the natural log of density and it is also in the model in its squared and cubed forms. Despite being a natural log and having the higher-order terms, it's still a linear model because it fits the linear functional form of 1 parameter * 1 predictor for each term and the terms are additive.

Nonlinear:

Mobility = (1288.14 + 1491.08 * Density Ln + 583.238 * Density Ln^2 + 75.4167 * Density Ln^3) / (1 + 0.966295 * Density Ln + 0.397973 * Density Ln^2 + 0.0497273 * Density Ln^3)

Basically, it's one polynomial equation divided by another, which produces a curve that can't be fit by a linear function. Unfortunately, the graph chopped off the denominator!
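If it helps to see the two forms side by side, here they are as Python functions, using the coefficients exactly as rounded in this reply. The Density Ln values in the loop are arbitrary picks of mine and may fall outside the range of the actual data, so treat the printed numbers as a sketch rather than reproduced output.

```python
def linear_model(density_ln):
    """Cubic polynomial in Density Ln: a sum of parameter * term pieces."""
    return (1243 + 412.3 * density_ln - 94.29 * density_ln ** 2
            - 32.90 * density_ln ** 3)

def nonlinear_model(density_ln):
    """One cubic polynomial divided by another: parameters in the denominator."""
    numerator = (1288.14 + 1491.08 * density_ln + 583.238 * density_ln ** 2
                 + 75.4167 * density_ln ** 3)
    denominator = (1 + 0.966295 * density_ln + 0.397973 * density_ln ** 2
                   + 0.0497273 * density_ln ** 3)
    return numerator / denominator

# Arbitrary Density Ln values chosen for illustration
for value in (-3.0, -1.5, 0.0, 1.5):
    print(value, round(linear_model(value), 1), round(nonlinear_model(value), 1))
```

At Density Ln = 0 each function reduces to its constant term, which is a quick check that the equations were typed in correctly.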

I hope this helps!

Jim

Name: Alisa • Sunday, August 10, 2014

Hi,

I have one problem. I ran both linear and nonlinear regression analyses, and the coefficients for both are significant. How can I know which solution is better?

Name: Jim Frost • Tuesday, August 19, 2014

Hi Alisa,

Because you mention significant coefficients, I'm guessing that when you say nonlinear regression analysis, you actually mean linear regression using polynomials to fit a curve. In nonlinear regression, you can't calculate p-values to determine whether the coefficients are significant.

To help you distinguish the two, click the link in the post for "What is the difference between linear and nonlinear regression equations?"

So, I'll assume that you have a linear model with a polynomial term (such as a squared term) that is significant. There are a couple of things to check.

If you have just one predictor variable and include it in a squared or cubed form, use a fitted line plot to see whether or not the extra term(s) better follow the curve. Just fit the model with and without the polynomial term to see how it changes in the graph. In Minitab: Stat > Regression > Fitted Line Plot. Seeing is believing!

If you have more than one predictor, do the same as above with regular Regression but compare the residual plots. Look to see if the model without the polynomial term has a non-random pattern in it. If adding the polynomial removes the pattern, it's generally good to use the polynomial. However, you have to be sure that you're not overfitting the model. See below.

You can check the adjusted R-squared, and especially the predicted R-squared to be sure that you're not including too many terms. Including too many terms can improve the apparent fit, but it is actually fitting the random error in the data rather than the true relationships. I describe how to assess this here:

http://blog.minitab.com/blog/adventures-in-statistics/multiple-regession-analysis-use-adjusted-r-squared-and-predicted-r-squared-to-include-the-correct-number-of-variables
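For a feel of what predicted R-squared measures, here is a simplified sketch. It assumes the usual PRESS-based definition: leave out each observation in turn, refit, predict the held-out point, and compare the summed squared prediction errors to SS Total. To keep it short it handles only a one-predictor straight-line model, and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = b0 + b1 * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def r_squared(xs, ys):
    """Regular R-squared = 1 - SS Error / SS Total."""
    b0, b1 = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_err = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_err / ss_tot

def predicted_r_squared(xs, ys):
    """1 - PRESS / SS Total, where PRESS uses leave-one-out predictions."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict it
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (b0 + b1 * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot

# Hypothetical data for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 2.3, 2.8, 4.2, 4.9, 6.3]
regular = r_squared(xs, ys)
predicted = predicted_r_squared(xs, ys)
print(round(regular, 3), round(predicted, 3))
```

A predicted R-squared that falls well below the regular R-squared suggests the model is fitting noise in this particular sample.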

If the residual plots look good and you're not overfitting the model, you can then assess S, the standard error of the regression. This tells you how wrong the model is on average. Smaller values indicate a better fit. To read about this statistic, click the link in this post for "S value".

Finally, if you just want some examples of how to compare how well different curvilinear models fit a dataset, including comparing a nonlinear model to linear models, read this post:

http://blog.minitab.com/blog/adventures-in-statistics/curve-fitting-with-linear-and-nonlinear-regression

I hope this helps! Please don't hesitate to write again if you have further questions!

Jim