Last time I touched on the subject of the greatest Super Bowl quarterback, I promised a multivariate analysis considering several different statistics. Let’s get right to a factor analysis.
Getting Ready for Factor Analysis
One purpose of factor analysis is to identify underlying factors that you can’t measure directly. These factors explain the variation of many different variables in fewer dimensions. Here are the variables we’re going to consider:
- Margin of victory
- Difference between Super Bowl winner’s passer rating and the playoff passer rating allowed by the opposing team—PR Diff (Winner – Allowed)
- Point spread—Spread
- Adjusted career rating of the losing quarterback—Adjusted Career PR Loser
- The difference between the winning and losing quarterback’s ratings—PR Difference (Winner – Loser)
- Winning quarterback’s rating—Passer Rating Winner
- Losing quarterback’s rating—Passer Rating Loser
Determining the number of factors
To begin the factor analysis, you usually determine the number of factors to use. The determination is similar to determining the number of principal components: you look for eigenvalues greater than 1, for the number of factors that explain 80% of the variation, and for factors that explain large amounts of variation relative to the other factors. A scree plot of the eigenvalues looks like this:
Two factors have eigenvalues greater than 1, and the third factor is close. The 3 factors explain about 80% of the variation in the data, so 3 factors seems like a reasonable number to explore.
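The arithmetic behind those two criteria can be checked with a minimal sketch. It uses the three eigenvalues reported in the Variance row of the unrotated Minitab output in this post, and it assumes the analysis was run on the correlation matrix, so that the total variance equals the number of variables (7). The variable names are illustrative only.

```python
# Eigenvalues of the three retained factors, copied from the "Variance" row
# of the unrotated loadings table in this post.
eigenvalues = [3.0410, 1.5952, 0.9359]

# Assumption: the variables were standardized (correlation matrix), so the
# total variance equals the number of variables.
n_variables = 7

# Proportion of total variance explained by each factor.
proportions = [ev / n_variables for ev in eigenvalues]

# Running total of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Number of factors that pass the "eigenvalue greater than 1" rule.
kaiser_count = sum(1 for ev in eigenvalues if ev > 1)

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"Factor{i}: eigenvalue={ev:.4f} proportion={p:.3f} cumulative={c:.3f}")
print("Factors with eigenvalue > 1:", kaiser_count)
```

Under that assumption the proportions reproduce the % Var row of the output (0.434, 0.228, 0.134), and the cumulative total comes to about 0.796.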
Once we determine the number of factors, we want to see if we can find a rotation that produces underlying factors that make sense. In general, rotation of the factors makes them load on fewer variables so that the factors are simpler. For example, the Minitab output from the varimax rotation shows the unrotated and rotated factor loadings:
Unrotated Factor Loadings and Communalities
Variable                          Factor1  Factor2  Factor3  Communality
Passer Rating Loser                -0.644    0.502    0.420        0.843
Passer Rating Winner                0.723    0.563    0.127        0.857
PR Difference (Winner - Loser)      0.953    0.027   -0.212        0.955
Adjusted Career PR Loser           -0.181    0.748   -0.400        0.752
Spread                             -0.536    0.189   -0.687        0.794
PR Diff (Winner - Allowed)          0.620    0.570    0.145        0.731
Margin of victory                   0.700   -0.324   -0.214        0.640
Variance                           3.0410   1.5952   0.9359       5.5721
% Var                               0.434    0.228    0.134        0.796

Rotated Factor Loadings and Communalities
Variable                          Factor1  Factor2  Factor3  Communality
Passer Rating Loser                 0.032   -0.912   -0.106        0.843
Passer Rating Winner                0.909    0.171    0.030        0.857
PR Difference (Winner - Loser)      0.598    0.767    0.096        0.955
Adjusted Career PR Loser            0.336   -0.256   -0.757        0.752
Spread                             -0.363   -0.085   -0.810        0.794
PR Diff (Winner - Allowed)          0.850    0.086    0.011        0.731
Margin of victory                   0.178    0.754    0.198        0.640
Variance                           2.1852   2.0973   1.2896       5.5721
% Var                               0.312    0.300    0.184        0.796
In this output, the unrotated first factor has 5 variables where the absolute value of the loading is 0.6 or higher. The rotated first factor has 2 variables with loadings of 0.6 or higher, so the rotated factor should be easier to interpret.
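To make that count concrete, here is a small sketch that works only from the loadings printed in the Minitab output. It counts the variables whose first-factor loading is at least 0.6 in absolute value, before and after rotation, and recomputes each communality as the sum of squared loadings across the three factors. The function names are my own, not anything from Minitab.

```python
# Loadings copied from the Minitab output, in the order the variables appear:
# Passer Rating Loser, Passer Rating Winner, PR Difference (Winner - Loser),
# Adjusted Career PR Loser, Spread, PR Diff (Winner - Allowed), Margin of victory.
unrotated = [
    (-0.644, 0.502, 0.420),
    (0.723, 0.563, 0.127),
    (0.953, 0.027, -0.212),
    (-0.181, 0.748, -0.400),
    (-0.536, 0.189, -0.687),
    (0.620, 0.570, 0.145),
    (0.700, -0.324, -0.214),
]
rotated = [
    (0.032, -0.912, -0.106),
    (0.909, 0.171, 0.030),
    (0.598, 0.767, 0.096),
    (0.336, -0.256, -0.757),
    (-0.363, -0.085, -0.810),
    (0.850, 0.086, 0.011),
    (0.178, 0.754, 0.198),
]


def count_high(loadings, factor_index, cutoff=0.6):
    """Count variables whose absolute loading on one factor meets the cutoff."""
    return sum(1 for row in loadings if abs(row[factor_index]) >= cutoff)


def communalities(loadings):
    """Communality of each variable: sum of its squared loadings."""
    return [sum(x * x for x in row) for row in loadings]


high_unrotated = count_high(unrotated, 0)
high_rotated = count_high(rotated, 0)
print("High loadings on Factor1, unrotated:", high_unrotated)
print("High loadings on Factor1, rotated:", high_rotated)

# The rotation redistributes the loadings without changing the communalities,
# apart from rounding in the three-decimal values printed above.
max_gap = max(
    abs(u - r) for u, r in zip(communalities(unrotated), communalities(rotated))
)
print(f"Largest communality difference: {max_gap:.4f}")
```

The communality check is a useful sanity test when you retype output by hand: if a rotated communality differs from the unrotated one by more than rounding error, a loading was probably copied incorrectly.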
We’re lucky, in this case, because the different rotation methods available in Minitab all produce factors that load on the same variables. When different methods agree, you feel more certain about the results.
Interpreting the factors
The first factor, which loads highly on the winning quarterback’s passer rating and the difference between that passer rating and what the opposing team allowed in the playoffs, looks like a measure of how well the winning quarterback played. Higher values of this factor indicate better performance.
The second factor is the most difficult to interpret because of the signs of the variables with high loadings. You get a higher value of the second factor from a higher margin of victory, a larger difference between the ratings of the winning and losing quarterbacks, and a lower passer rating for the losing quarterback. I would think that you would want the first two variables to be high, but that you would also want the third, the losing quarterback’s rating, to be high, and it pulls the factor in the opposite direction.
The variation in the data suggests that a team is much more likely to lose by a lot of points when its own quarterback plays poorly. In my first post about the best Super Bowl quarterback, I made the judgment that winning a competitive Super Bowl was more impressive than winning a noncompetitive one. Thus, I tend to think that lower values of the second factor, caused by high passer ratings for the opposing quarterback, small rating differences, and smaller margins of victory, are more impressive; but I’ll conduct the final comparisons both ways to see how it affects the conclusion.
The third factor loads on two variables: the point spread and the adjusted career passer rating of the losing quarterback. This factor is about the quality of the victory. The more positive the point spread, the more unexpected the victory was. Also, the better the opposing quarterback, the better the victory was. Because both of these loadings are negative, more negative values of the third factor indicate better performance.
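The interpretations above depend on which variables each rotated factor loads on and with what sign. As a rough sketch, the grouping can be reproduced from the rotated loadings in the Minitab output by keeping every loading of at least 0.6 in absolute value, the same cutoff used earlier in this post. The helper name `defining_variables` is hypothetical.

```python
# Rotated loadings copied from the Minitab output in this post.
names = [
    "Passer Rating Loser",
    "Passer Rating Winner",
    "PR Difference (Winner - Loser)",
    "Adjusted Career PR Loser",
    "Spread",
    "PR Diff (Winner - Allowed)",
    "Margin of victory",
]
rotated = [
    (0.032, -0.912, -0.106),
    (0.909, 0.171, 0.030),
    (0.598, 0.767, 0.096),
    (0.336, -0.256, -0.757),
    (-0.363, -0.085, -0.810),
    (0.850, 0.086, 0.011),
    (0.178, 0.754, 0.198),
]


def defining_variables(loadings, variable_names, cutoff=0.6):
    """Group variables by the factors they load on at or above the cutoff."""
    n_factors = len(loadings[0])
    groups = {}
    for j in range(n_factors):
        groups[f"Factor{j + 1}"] = [
            (name, row[j])
            for name, row in zip(variable_names, loadings)
            if abs(row[j]) >= cutoff
        ]
    return groups


groups = defining_variables(rotated, names)
for factor, members in groups.items():
    print(factor)
    for name, loading in members:
        sign = "+" if loading > 0 else "-"
        print(f"  {sign} {name} ({loading:.3f})")
```

With this cutoff, the first factor keeps the two winner-rating variables with positive signs, the second keeps the loser’s rating with a negative sign alongside the rating difference and margin of victory, and the third keeps the spread and the loser’s adjusted career rating, both negative. The 0.598 loading of the rating difference on the first factor falls just under the cutoff, which is a reminder that any fixed threshold is a judgment call.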
In addition to deciding what to do with the second factor, there are still other considerations in determining the best Super Bowl quarterback. For example, should we compare the candidate quarterbacks to the average performance or to the best performance? Should we look at the mean performance of the best quarterbacks or the median performance? With so many options available for the remaining analysis, we’ll have to wait for next time to review them all. For now, here are some initial impressions of the three factors.
In terms of a quarterback playing well, especially in light of the opposing team, Terry Bradshaw’s first victory over the Dallas Cowboys, in Super Bowl X, takes the prize among our candidate quarterbacks. A factor score of 1.62 is not quite as good as Jim Plunkett’s 1.77, but pretty good for a guy throwing against Hall of Fame cornerback Mel Renfro. Among the candidates, Bradshaw also has the second-place score for his second victory over the Cowboys in Super Bowl XIII.
We’ll explore the best of the second factor in more detail, but the extremes make quarterbacks look good in both directions. Among the candidates, Tom Brady has the minimum score from his victory over the Carolina Panthers in Super Bowl XXXVIII. Brady overcame an incredible effort by Jake Delhomme that resulted in a 113.6 passer rating, the highest rating by a losing quarterback in a Super Bowl. Brady’s effort is also the overall minimum for factor 2.
On the maximum side of factor 2 lies another candidate, Montana’s victory over the Broncos in Super Bowl XXIV. The 45-point victory is the only Super Bowl in our data set where the winning quarterback’s passer rating exceeded the losing quarterback’s by over 100 points.
With respect to the third factor, no victory was more unexpected than Brady’s overcoming of the Kurt Warner-led Rams in Super Bowl XXXVI. The 14-point underdog did enough that day to fend off the fourth-quarter charge of the Greatest Show on Turf in what was, at the time, the first Super Bowl to be decided by a score on the final play of the game.
So, will it come down to Bradshaw or Brady defying the odds, or Montana’s domination? We’ll evaluate all three factors next time!
Ready to try out your own factor analysis? Check out the overview in Perform a Factor Analysis on the Minitab Support Center.