In Minitab Statistical Software, putting a regression line on a scatterplot is as easy as choosing a picture with a regression line on a scatterplot:
A neat trick is that you can also add calculated lines onto a scatterplot for comparison or other communication purposes. Here’s a demonstration.
The raw data from the United States Sentencing Commission for 2013, the most recent year on their website as of 2/16/2015, has 80,035 cases. Cut that data set down to cases where a specific, nonzero amount was recorded for a monetary loss and a specific amount was recorded for the total of fines, restitution, and cost of supervision and you get a data set with 9,619 cases. Here’s what the scatterplot with a regression line for that data set looks like:
If there’s a relationship between the cost and the loss, we might hypothesize that a fair solution would be for cost and loss to be approximately equal, Y = X. Here are the steps for drawing a new line on the scatterplot:
The single case where the loss was $5.9 billion and no restitution or fines were part of the sentence, as well as the other 5 cases where the loss exceeded $500 million, squish the main portion of the data considerably, so I edited the x-axis to extend only to $400 million.
The regression fit is well below the calculated line, which suggests that the costs tend to be less than the loss. However, the r-squared value for the regression line is only 3.3%. What the data really indicate is that, when every case is included, there's almost no linear relationship between the loss and the costs a criminal is asked to pay.
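For readers who want to see the arithmetic behind that comparison outside of Minitab, here is a minimal Python sketch of a simple least-squares fit and its r-squared value, checked against the calculated line Y = X. The `least_squares` helper and the (loss, cost) pairs are made up for illustration; they are not the Sentencing Commission data and this is not how Minitab computes its fit internally.

```python
def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) for a simple linear fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit, r-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical (loss, cost) pairs in dollars; NOT the real sentencing data.
loss = [1_000, 5_000, 20_000, 50_000, 100_000, 250_000]
cost = [1_000, 0, 20_000, 2_000, 100_000, 10_000]

slope, intercept, r_squared = least_squares(loss, cost)

# Compare the fitted line with the calculated line Y = X at the largest loss.
x_max = max(loss)
fitted_at_max = intercept + slope * x_max
equal_at_max = x_max  # on the line Y = X, cost equals loss

print(f"slope={slope:.3f}  r-squared={r_squared:.1%}")
print(f"fitted cost at max loss: {fitted_at_max:,.0f} vs Y = X: {equal_at_max:,}")
```

In this made-up sample, several cases sit exactly on Y = X, yet the fitted line falls well below it and r-squared stays low, which is the same pattern described above.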
Of course, we know that the regression line fitting all of the data is heavily influenced by the most extreme case where the loss was $5.9 billion and there was no cost. Actually, the cost and the loss are identical in about 34% of the cases in the data. If we consider only cases where the costs a criminal paid were nonzero and the loss was less than $500 million, the r-squared value increases to 73.3% and the regression line looks much closer to the line Y = X:
The United States Sentencing Commission recorded over 18,000 variables about the sentences that defendants received in 2013. Coming up with what’s fair is clearly a complicated matter.
You can add calculated lines to all kinds of graphs in Minitab. If you’re ready for more, see how you can use a calculated line to put a line in front of the bars on a histogram.
The image of the Fontaine de la Justice in Cudrefin, Switzerland, is by Roland Zumbuehl and is licensed under this Creative Commons License.