What is an interaction? It’s when the effect of one factor depends on the level of another factor. Interactions are important when you’re performing ANOVA, DOE, or a regression analysis. Without them, your model may be missing an important term that helps explain variability in the response!
For example, let’s consider 3-point shooting in the NBA. We previously saw that the number of 3-point attempts per game has been steadily increasing in the NBA. And there is no better example of this than the Golden State Warriors, who shoot 35% of their shots from behind the arc (2nd in the NBA). Seeing as how the Warriors currently lead the NBA in points per 100 possessions (a better indicator of offense than points per game since it accounts for pace), could it be that shooting more 3-pointers increases the number of points you score? For every NBA team since 1981, I collected their season totals for points per 100 possessions (ORtg) and the percentage of field goal attempts from 3-point range (3PAr). For example, if your 3PAr is 0.30, then 30% of your field goal attempts are 3-pointers (and the other 70% are from 2). Here is a fitted line plot of the two variables.
At first glance, it doesn’t look like shooting a lot of your shots from 3 has any effect on a team’s offensive rating. However, we’re missing an important variable. The Golden State Warriors don’t score a lot of points just because they shoot a lot of 3-pointers. They score a lot because they shoot a lot of 3-pointers and they make a lot of them.
So now let’s include each team’s percentage of successful 3-pointers (3P%) in the model.
Both of our terms are now significant, but the R-squared value is only 4.53%. That means that our model explains only 4.53% of the variation in a team’s offensive rating. This is because we’re still leaving out an important term: the interaction! If your percentage of successful 3-pointers is low and you shoot a lot of 3-pointers, your offensive rating is going to be lower than if your percentage of successful 3-pointers is high and you shoot a lot of 3-pointers.
Let’s see what happens when we include the interaction term:
The interaction term is significant in the model, and our R-squared value has now increased to 20.27%!
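To make the effect of adding the interaction term concrete, here is a minimal sketch in Python. It does not use the NBA data from this post: the data are synthetic, generated so that the effect of one predictor depends on the other, and the names `par`, `pct`, `ortg`, and `r_squared` are illustrative assumptions. It fits three nested least-squares models and compares their R-squared values.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R-squared."""
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# SYNTHETIC data for illustration only (not the data set used in this post).
rng = np.random.default_rng(0)
n = 500
par = rng.uniform(0.05, 0.40, n)   # stand-in for 3PAr
pct = rng.uniform(0.28, 0.42, n)   # stand-in for 3P%
# The slope of `par` depends on `pct`: negative below 0.35, positive above it.
ortg = 105 + 300 * par * (pct - 0.35) + rng.normal(0, 2, n)

r2_single = r_squared(par.reshape(-1, 1), ortg)                     # 3PAr only
r2_main = r_squared(np.column_stack([par, pct]), ortg)              # both main effects
r2_inter = r_squared(np.column_stack([par, pct, par * pct]), ortg)  # plus interaction

print(f"3PAr only:        R-sq = {r2_single:.3f}")
print(f"3PAr + 3P%:       R-sq = {r2_main:.3f}")
print(f"with interaction: R-sq = {r2_inter:.3f}")
```

Because the models are nested, R-squared cannot decrease as terms are added. How much it rises when the product term enters indicates how much of the response the interaction explains.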
When an interaction term is significant in the model, you should not interpret the main effects of the variables on their own. Focus instead on the effect of the interaction. Minitab provides several tools to help you better understand this effect. The easiest to use is the line plot.
In this plot, the red line represents the highest value for percentage of successful 3-pointers (3P%) in the data, and the blue line represents the lowest. When you shoot significantly more 2-pointers than 3-pointers (the left side of the 3PAr axis) the offensive rating is similar for both the high and low settings of 3P%. But as you shoot fewer 2-pointers and more 3-pointers, offensive rating goes up for the high-success setting of 3-point shooting percentage, and drastically drops for the low-success setting.
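The two diverging lines follow directly from the form of a model with an interaction term. The sketch below uses hypothetical coefficients (the post does not report the fitted values, so `B0`, `B_PAR`, `B_PCT`, and `B_INT` are made up purely for illustration) to show how the slope with respect to 3PAr changes sign between a low and a high 3P% setting.

```python
# HYPOTHETICAL coefficients for illustration only; these are NOT the
# fitted values from the model discussed in this post.
B0, B_PAR, B_PCT, B_INT = 100.0, -70.0, 20.0, 200.0

def predicted_ortg(par, pct):
    """Predicted offensive rating from a model with an interaction term."""
    return B0 + B_PAR * par + B_PCT * pct + B_INT * par * pct

def slope_in_par(pct):
    """Change in predicted ORtg per unit increase in 3PAr at a fixed 3P%."""
    return B_PAR + B_INT * pct

# Compare a low and a high 3P% setting across the same range of 3PAr.
for pct in (0.30, 0.40):
    left = predicted_ortg(0.05, pct)    # few 3-point attempts
    right = predicted_ortg(0.40, pct)   # many 3-point attempts
    print(f"3P% = {pct:.2f}: slope = {slope_in_par(pct):+.1f}, "
          f"ORtg from {left:.1f} to {right:.1f}")
```

With these made-up numbers the low-3P% line falls and the high-3P% line rises as 3PAr increases, which is the same qualitative pattern as in the plot.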
Because 3P% is a continuous variable, we should be interested in seeing effects of the interaction for more than just the high and low setting. This can be accomplished using a contour plot.
Now we can see the full range of values for both 3P% and 3PAr. The colors represent different ranges for offensive rating. Dark green represents higher offensive ratings, while light green and blue represent lower offensive ratings.
We see that if your percentage of successful 3-pointers (3P%) is between approximately 33% and 38%, your 3PAr doesn’t have a large effect on your offensive rating. A 3P% above 38% means that you should shoot more 3-pointers, whereas a percentage below 33% means that you should shoot fewer 3-pointers.
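Why is there a band of 3P% values where 3PAr hardly matters? A short derivation helps. Assume the fitted model has the usual two-predictor form with a product term. The coefficient symbols below are generic placeholders, since the post does not list the fitted values:

$$
\widehat{\text{ORtg}} = \beta_0 + \beta_1\,\text{3PAr} + \beta_2\,\text{3P\%} + \beta_3\,(\text{3PAr} \times \text{3P\%})
$$

Holding 3P% fixed, the effect of taking more 3-pointers is the slope with respect to 3PAr:

$$
\frac{\partial\,\widehat{\text{ORtg}}}{\partial\,\text{3PAr}} = \beta_1 + \beta_3\,\text{3P\%}
$$

This slope is zero when $\text{3P\%} = -\beta_1 / \beta_3$. Under this assumed form, with $\beta_3 > 0$, the slope is small near that crossover value, positive above it, and negative below it. That matches the pattern in the contour plot, where the crossover appears to fall somewhere in the 33% to 38% range.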
Now that we understand how the interaction works, let’s use our results to look at some NBA teams. So far in this NBA season, only five teams fall outside the 3P% range of 33% to 38%. Two teams make more than 38% of their 3-pointers (Warriors and Spurs) and three teams make less than 33% (Heat, Timberwolves, and Lakers). So do the Warriors and Spurs correctly shoot a high percentage of their field goals from 3, and do the Heat, Timberwolves, and Lakers shoot a high percentage of their shots from 2?
The Warriors are good at shooting 3s, and they know it. They have the highest 3-point percentage in the NBA, and shoot the second-highest percentage of their field goals from 3 (the Rockets, who shoot the highest percentage of their field goals from 3, are not shown on the plot). On the other side, the Timberwolves are bad at shooting 3s, and they know it. They have the second-worst 3-point percentage and shoot the lowest percentage of their field goals from 3. The Heat also shoot poorly from 3, but they don’t take a lot as they rank 24th in the NBA in percentage of field goal attempts from 3-point range.
The interesting teams are the Spurs and the Lakers. The Spurs are second in the league, making 39.3% of their 3-pointers. However, only 22.4% of their field goal attempts are 3-pointers, which ranks 26th in the league. They could benefit from shooting a higher percentage of their shots from 3. And then there are the Lakers. Despite ranking dead last in 3-point percentage, they take 29% of their field goal attempts from 3. That’s good for 14th in the league. According to this analysis, the Lakers are taking too many 3-pointers.
Now, this model purposely leaves out other predictors that could affect offensive rating (like 2-point shooting percentage). So don’t assume that 3-point shooting is all that goes into offensive rating. But it does give us a simple example of how interactions work and how you can use them to look at a real-life process. Interactions can be an important part of any data model, so don’t neglect them!