How Residuals Can Save You Thousands of Dollars on Your Next Car Purchase

Minitab Blog Editor 08 August, 2013

Purchasing a used car can be stressful because there are so many factors to consider. Web sites such as www.cars.com provide a wealth of information, but how do you navigate through it all to find the best deal?

Minitab to the rescue. Once you narrow your choice down to a particular car model, such as an Acura TSX, the data from www.cars.com can be copied and pasted into Minitab. After some data manipulation, you can use a regression analysis to develop an equation that calculates the expected list price of a vehicle based on variables such as year, mileage, whether or not the technology package is included, and whether or not a free Carfax report is included (which is possibly an indicator of how confident the seller is in the vehicle).

A Regression Model for Used Car Price

Let's apply this idea to an Acura TSX, using data for 986 cars downloaded from www.cars.com on 7/24/2013.  If you'd like to do this analysis yourself, download the data (and a free 30-day trial of our statistical software, if you don't already have it). 

After you've opened the data in Minitab, choose Stat > Regression > General Regression and fill out the dialog box like this:

general regression factors

Below is the regression model Minitab fits to this data.

regression analysis of used car prices
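
For readers curious about what a fit like this involves, here is a minimal Python sketch of ordinary least squares solved through the normal equations. It is not Minitab's implementation and it does not use the cars.com data: the function names `solve` and `fit_ols`, the seven toy rows, and the coefficients used to generate the toy prices are all hypothetical. Mileage is expressed in thousands of miles only to keep the arithmetic well scaled.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment each row with its right-hand side so row operations act on both.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, prices):
    """Return [intercept, b1, b2, ...] that minimize the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * p for d, p in zip(design, prices)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical rows: (age in years, thousands of miles, tech package 0/1, free Carfax 0/1).
rows = [
    (1, 12.0, 1, 1),
    (3, 40.0, 0, 1),
    (5, 75.0, 1, 0),
    (7, 110.0, 0, 0),
    (2, 30.0, 0, 0),
    (4, 50.0, 1, 1),
    (6, 95.0, 0, 1),
]

# Toy prices generated from a made-up linear rule, so the fit can be checked by eye.
TOY_COEFS = [30000.0, -1300.0, -60.0, 1000.0, 450.0]
prices = [TOY_COEFS[0] + sum(b * x for b, x in zip(TOY_COEFS[1:], r)) for r in rows]

coefficients = fit_ols(rows, prices)
names = ["intercept", "age_years", "thousand_miles", "tech", "free_carfax"]
for name, value in zip(names, coefficients):
    print(f"{name:>15}: {value:10.2f}")
```

Because the toy prices follow the made-up rule exactly, the fitted coefficients should match it up to rounding. Real listings are noisy, which is why the residuals matter.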

What Does Regression Analysis Tell Us About the Price of Used Cars?

Several interesting findings come from this regression analysis:

  • Every mile that is added to the car decreases the expected list price by approximately 6 cents.
     
  • Each year that is added to the car's age decreases the expected list price by approximately $1310.
     
  • The technology package adds, on average, approximately $1044 to the list price of the car.
     
  • Cars with a free Carfax report have a list price that is, on average, approximately $441 higher than those with a paid report. A Carfax report costs only $40, so the higher price most likely reflects a clean report (otherwise the seller probably wouldn’t provide it for free!).
     
  • An impressive 89.8% of the variation in car list price is explained by these predictors.
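
To make these coefficients concrete, the sketch below compares the expected list prices of two cars using only the approximate effects quoted above. It assumes the effects simply add up, as the per-mile and per-year readings imply. The intercept cancels when you take a difference, so it is not needed. The two listings and the function name `expected_price_gap` are hypothetical.

```python
# Approximate effects quoted in the findings above, in dollars.
PER_MILE = -0.06
PER_YEAR_OF_AGE = -1310.0
TECH_PACKAGE = 1044.0
FREE_CARFAX = 441.0


def expected_price_gap(car_a, car_b):
    """Expected list price of car_a minus that of car_b, in dollars.

    The model's intercept cancels in the difference, so only the
    per-predictor effects are required.
    """
    return (
        PER_MILE * (car_a["miles"] - car_b["miles"])
        + PER_YEAR_OF_AGE * (car_a["age_years"] - car_b["age_years"])
        + TECH_PACKAGE * (car_a["tech"] - car_b["tech"])
        + FREE_CARFAX * (car_a["free_carfax"] - car_b["free_carfax"])
    )


# Two hypothetical listings (not taken from the cars.com data).
newer = {"miles": 30_000, "age_years": 3, "tech": 1, "free_carfax": 1}
older = {"miles": 50_000, "age_years": 5, "tech": 0, "free_carfax": 1}

gap = expected_price_gap(newer, older)
print(f"Expected list price gap: ${gap:,.0f}")
```

Here the newer car has 20,000 fewer miles (about $1,200), is two years younger (about $2,620), and has the technology package (about $1,044), so the model would expect it to list for roughly $4,864 more.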

Those findings are interesting, but the main goal of this analysis is to find the car with the best value: the car whose actual list price falls furthest below its expected list price. The residuals from the regression analysis (actual list price minus expected list price) contain exactly this information.
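
As a rough illustration of the idea, the sketch below computes each residual as actual list price minus expected list price and sorts the listings so the most underpriced one comes first. The three listings, their field names, and the function `rank_by_residual` are hypothetical; in practice the `expected_price` values would be the fitted values from the regression.

```python
def rank_by_residual(listings):
    """Return listings sorted from most underpriced to most overpriced.

    residual = actual list price - expected list price, so the most
    negative residual marks the listing furthest below its expected price.
    """
    scored = [
        {**car, "residual": car["list_price"] - car["expected_price"]}
        for car in listings
    ]
    return sorted(scored, key=lambda car: car["residual"])


# Hypothetical listings; expected_price stands in for the model's fitted value.
listings = [
    {"id": "A", "list_price": 18_500, "expected_price": 18_900},
    {"id": "B", "list_price": 12_000, "expected_price": 20_100},
    {"id": "C", "list_price": 21_000, "expected_price": 19_750},
]

ranked = rank_by_residual(listings)
best = ranked[0]
print(f"Most underpriced: {best['id']} (residual {best['residual']:+,} dollars)")
```

A large negative residual is a prompt to look closer, not a guarantee of a bargain: the model only knows about the predictors it was given.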

Finding the Price Difference in a Residuals Plot

To get the residuals plot from this analysis, rerun the analysis (you can just hit Ctrl-E on your keyboard to bring up the last dialog box used, which should be the General Regression dialog shown above). Then click on Graphs, and check the box for "Normal plot of residuals" so it looks like this:

residuals plot

Press OK, run the analysis, and you'll get the plot shown below. 

This probability plot of the residuals indicates that three cars have an unusually large difference between the actual list price and the expected list price. They are underpriced by $7,500 to $10,000.

normal probability plot
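
For a sense of how such a plot is built, the sketch below pairs each sorted residual with the standard-normal quantile it would be expected to match if the residuals were normally distributed. The median-rank plotting position `(i - 0.3) / (n + 0.4)` is an assumption here, since several conventions exist and software packages differ. The residual values and the function name `normal_plot_points` are hypothetical.

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Return (residual, theoretical z-score) pairs for a normal probability plot."""
    ordered = sorted(residuals)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Median-rank plotting position (an assumed convention).
        p = (i - 0.3) / (n + 0.4)
        points.append((value, standard_normal.inv_cdf(p)))
    return points


# Hypothetical residuals in dollars (actual minus expected list price).
residuals = [-8100, -400, 1250, 300, -150, 700, -900]

for value, z in normal_plot_points(residuals):
    print(f"{value:>8,}  z = {z:+.2f}")
```

If the residuals were roughly normal, these pairs would fall close to a straight line. A point far to the left of that line at the low end, like the first value here, is a listing priced well under what the model expects.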

Unfortunately, two of the three underpriced cars have severe damage.

damaged cars

After removing the two damaged cars from the analysis, one car is clearly priced better than the other 983 cars. There's our best value. 

plot of car prices

This car appears to have no damage. The www.cars.com description is below.

Description of Car

In summary, getting the data from www.cars.com into Minitab in an analysis-ready format takes a little work, but the effort reveals the best-value cars and can save you thousands of dollars.