There’s a lot going on in the world, so you might not have noticed that the Organisation for Economic Co-operation and Development (OECD) released its new set of health statistics for member nations. On the OECD website, you can now download the free data series for 2014. (Be aware that “for 2014” means that the organization has a pretty good idea about what happened in 2012.)

Of course, there’s nothing more fun than sharpening your Minitab skills with real data. Each time the OECD releases its data, we hear about how much money each nation spends per person on health care compared to how long people live in that nation. (See, for example, the coverage from 2013, 2012, and 2011.) We also tend to be treated to graphics similar to the scatterplot below, typically with clever variations for the symbols:

In this display, the point on the far right appears to be an outlier, spending more per capita than other nations but not fitting the general trend of increasing life expectancy as spending increases. When you see an apparent outlier, you want to investigate.

## Investigating Outliers with Brushing

Brushing is a feature in Minitab that makes it easy to investigate outliers in graphs. For example, from the scatterplot in Minitab, you can fit simple regression models with and without the suspected outlier to see how great the influence is on the model. You can copy and paste the data at the end of the post if you want to follow along.

Here’s how to do a visual sensitivity analysis of data on a scatterplot with brushing. Start by making the graph:

1. Choose Graph > Scatterplot.
2. From the gallery of scatterplots, select Simple. Click OK.
3. In Y variables, enter 'Life Expectancy'.
4. In X variables, enter 'Per Capita Health Expenditures'. Click OK.

Turn on brushing mode and set ID variables to get specific information about the outlier. With the scatterplot window active, follow these steps:

1. Choose Editor > Brush.
2. Choose Editor > Set ID Variables.
3. In Variables, enter Country 'Life Expectancy' 'Per Capita Health Expenditures'. Click OK.
4. On the graph, click and drag to cover the unusual point.

In the brushing window, you can see that the unusual point is the United States, row 34 in the data set. You can also see the specific values of life expectancy and per capita health expenditures.
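If you want to double-check the brushed point outside Minitab, here is a minimal Python sketch of the same idea. It assumes the data from the end of this post has been typed in as a list of tuples. The `brush` helper and the 7,000-dollar cutoff are hypothetical stand-ins for dragging the brushing rectangle over the far right of the graph; they are not part of Minitab.

```python
# (per capita health expenditures in USD, life expectancy in years, country),
# copied from the data at the end of this post, in the same row order.
DATA = [
    (3997, 82.1, "Australia"), (4896, 81.0, "Austria"), (4419, 80.5, "Belgium"),
    (4602, 81.5, "Canada"), (1577, 78.9, "Chile"), (2077, 78.2, "Czech Republic"),
    (4698, 80.1, "Denmark"), (1447, 76.5, "Estonia"), (3559, 80.7, "Finland"),
    (4288, 82.1, "France"), (4811, 81.0, "Germany"), (2409, 80.7, "Greece"),
    (1803, 75.2, "Hungary"), (3536, 83.0, "Iceland"), (3890, 81.0, "Ireland"),
    (2304, 81.8, "Israel"), (3209, 82.3, "Italy"), (3649, 83.2, "Japan"),
    (2291, 81.3, "Korea"), (4578, 81.5, "Luxembourg"), (1048, 74.4, "Mexico"),
    (5099, 81.2, "Netherlands"), (3172, 81.5, "New Zealand"), (6140, 81.5, "Norway"),
    (1540, 76.9, "Poland"), (2457, 80.5, "Portugal"), (2105, 76.2, "Slovak Republic"),
    (2667, 80.2, "Slovenia"), (2998, 82.5, "Spain"), (4106, 81.8, "Sweden"),
    (6080, 82.8, "Switzerland"), (984, 74.6, "Turkey"), (3289, 81.0, "United Kingdom"),
    (8745, 78.7, "United States"),
]


def brush(rows, min_spend):
    """Return (row number, country, spending, life expectancy) for every
    point to the right of min_spend. Row numbers start at 1 so they line
    up with worksheet rows."""
    return [
        (row, country, spend, life)
        for row, (spend, life, country) in enumerate(rows, start=1)
        if spend > min_spend
    ]


# Hypothetical cutoff: "drag" over everything spending more than $7,000.
brushed = brush(DATA, 7000)
print(brushed)  # [(34, 'United States', 8745, 78.7)]
```

Only one country spends more than the cutoff, which matches what the brushing window reports.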

To do the regression with and without the outlier, use brushing to create an indicator variable:

1. With the brushing window still showing, choose Editor > Create Indicator Variable.
2. In Column, enter United States.
3. Select Update now. Click OK.
4. Choose Editor > Select.
5. Choose Editor > Select Item > Symbols.
6. Choose Editor > Edit Symbols.
7. Select the Groups tab.
8. In Categorical variables for grouping, enter 'United States'. Click OK.
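Conceptually, the indicator variable from these steps is just a column of 0s and 1s that flags the brushed rows. The Python sketch below illustrates that idea under the assumption that row 34 is the brushed point; the `indicator_for` helper is hypothetical and only mimics what the menu command produces.

```python
# Country names in worksheet order, copied from the data at the end of this post.
COUNTRIES = [
    "Australia", "Austria", "Belgium", "Canada", "Chile", "Czech Republic",
    "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary",
    "Iceland", "Ireland", "Israel", "Italy", "Japan", "Korea", "Luxembourg",
    "Mexico", "Netherlands", "New Zealand", "Norway", "Poland", "Portugal",
    "Slovak Republic", "Slovenia", "Spain", "Sweden", "Switzerland", "Turkey",
    "United Kingdom", "United States",
]


def indicator_for(countries, brushed_rows):
    """Return a 0/1 column: 1 for brushed rows (numbered from 1), 0 otherwise."""
    brushed = set(brushed_rows)
    return [1 if row in brushed else 0 for row in range(1, len(countries) + 1)]


# Assumption: the brushed point is row 34, as reported in the brushing window.
united_states = indicator_for(COUNTRIES, [34])

# Grouping by the indicator splits the data into "everyone else" and the outlier.
groups = {0: [], 1: []}
for country, flag in zip(COUNTRIES, united_states):
    groups[flag].append(country)

print(len(groups[0]), groups[1])  # 33 ['United States']
```

Grouping the graph by this column is what lets the regression fits treat the brushed point separately from the other 33 countries.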

With groups on the graph, you can do separate regression fits for the groups.

1. Choose Editor > Add > Regression Fit.
3. Check Apply same groups of current displays to regression fit. Click OK.
4. Choose Editor > Add > Regression Fit.
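To see the same sensitivity check in numbers, here is a rough Python sketch that fits a least squares line with and without the United States. It is an illustration under stated assumptions, not a reproduction of Minitab's output: it fits a straight line, which may not be the same model order as the curves on the graph, and the `fit_line` helper is a hypothetical name.

```python
# (per capita health expenditures in USD, life expectancy in years, country),
# copied from the data at the end of this post.
DATA = [
    (3997, 82.1, "Australia"), (4896, 81.0, "Austria"), (4419, 80.5, "Belgium"),
    (4602, 81.5, "Canada"), (1577, 78.9, "Chile"), (2077, 78.2, "Czech Republic"),
    (4698, 80.1, "Denmark"), (1447, 76.5, "Estonia"), (3559, 80.7, "Finland"),
    (4288, 82.1, "France"), (4811, 81.0, "Germany"), (2409, 80.7, "Greece"),
    (1803, 75.2, "Hungary"), (3536, 83.0, "Iceland"), (3890, 81.0, "Ireland"),
    (2304, 81.8, "Israel"), (3209, 82.3, "Italy"), (3649, 83.2, "Japan"),
    (2291, 81.3, "Korea"), (4578, 81.5, "Luxembourg"), (1048, 74.4, "Mexico"),
    (5099, 81.2, "Netherlands"), (3172, 81.5, "New Zealand"), (6140, 81.5, "Norway"),
    (1540, 76.9, "Poland"), (2457, 80.5, "Portugal"), (2105, 76.2, "Slovak Republic"),
    (2667, 80.2, "Slovenia"), (2998, 82.5, "Spain"), (4106, 81.8, "Sweden"),
    (6080, 82.8, "Switzerland"), (984, 74.6, "Turkey"), (3289, 81.0, "United Kingdom"),
    (8745, 78.7, "United States"),
]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x.

    points is a list of (x, y) pairs; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# One fit uses every country; the other drops the suspected outlier.
all_points = [(spend, life) for spend, life, _ in DATA]
without_us = [(spend, life) for spend, life, country in DATA
              if country != "United States"]

slope_all, intercept_all = fit_line(all_points)
slope_without, intercept_without = fit_line(without_us)

# Report slopes as years of life expectancy per extra $1,000 of spending.
print(f"All countries:         {1000 * slope_all:.2f} years per $1,000")
print(f"Without United States: {1000 * slope_without:.2f} years per $1,000")
```

With this data, the slope should come out noticeably smaller when the United States is included, which is the influence the two fits on the graph are meant to show.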

On the graph, the red curve shows that when you include the United States in the data, life expectancy increases more slowly with spending than when you exclude the United States. Bonus tip: Hover over each curve in Minitab to see the regression equation used to create the curve.

## Using Brushing in Other Types of Graphs

Brushing is a great tool in Minitab for investigating specific points on a graph, but it doesn’t just work on scatterplots. If you’re ready for more, you can see the complete list of graphs you can brush and how to make a graph that excludes brushed points. Prefer an example? Check out how Patrick Runkel uses brushing to study the relationships on a bubble plot!

Here’s the data for the per capita health care expenditures in United States dollars and the life expectancy for the total population at birth.

| Per Capita Health Expenditures | Life Expectancy | Country |
|---|---|---|
| 3997 | 82.1 | Australia |
| 4896 | 81.0 | Austria |
| 4419 | 80.5 | Belgium |
| 4602 | 81.5 | Canada |
| 1577 | 78.9 | Chile |
| 2077 | 78.2 | Czech Republic |
| 4698 | 80.1 | Denmark |
| 1447 | 76.5 | Estonia |
| 3559 | 80.7 | Finland |
| 4288 | 82.1 | France |
| 4811 | 81.0 | Germany |
| 2409 | 80.7 | Greece |
| 1803 | 75.2 | Hungary |
| 3536 | 83.0 | Iceland |
| 3890 | 81.0 | Ireland |
| 2304 | 81.8 | Israel |
| 3209 | 82.3 | Italy |
| 3649 | 83.2 | Japan |
| 2291 | 81.3 | Korea |
| 4578 | 81.5 | Luxembourg |
| 1048 | 74.4 | Mexico |
| 5099 | 81.2 | Netherlands |
| 3172 | 81.5 | New Zealand |
| 6140 | 81.5 | Norway |
| 1540 | 76.9 | Poland |
| 2457 | 80.5 | Portugal |
| 2105 | 76.2 | Slovak Republic |
| 2667 | 80.2 | Slovenia |
| 2998 | 82.5 | Spain |
| 4106 | 81.8 | Sweden |
| 6080 | 82.8 | Switzerland |
| 984 | 74.6 | Turkey |
| 3289 | 81.0 | United Kingdom |
| 8745 | 78.7 | United States |