In my last blog post, I tried to make a regression model to predict fantasy football scores for the upcoming NFL season. However, my R-Squared value (R-Sq) was only 61%. Now I'm going to break the data up by position to try to create a better model.
I used the same data set as before, which means each player had played three full seasons with the same team. I ran Minitab’s regression analysis on each position, with data from a player’s previous two seasons predicting how he would do in the current season. The R-Sq values of each model are below:
- Quarterback Model: R-Sq = 73%
- Wide Receiver Model: R-Sq = 38%
- Running Back Model: R-Sq = 29%
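For anyone following along without Minitab, the kind of fit behind each of these R-Sq values can be sketched in a few lines of Python. This is only an illustration, not the Minitab analysis itself: the function name `fit_two_predictors` is hypothetical, and the numbers in the example are made up rather than taken from the real data set. It fits a player's current-season average on his averages from the previous two seasons and reports R-Sq as 1 - SSE/SST.

```python
from statistics import mean


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2.

    Returns (b0, b1, b2, r_sq). Hypothetical helper for illustration only.
    """
    m1, m2, my = mean(x1), mean(x2), mean(y)

    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))

    # Solve the 2x2 normal equations for the two slopes.
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2

    # R-Sq = 1 - (error sum of squares) / (total sum of squares).
    sse = sum((c - (b0 + b1 * a + b2 * b)) ** 2 for a, b, c in zip(x1, x2, y))
    sst = sum((c - my) ** 2 for c in y)
    return b0, b1, b2, 1 - sse / sst


# Made-up fantasy averages for six imaginary players at one position.
prev_season = [18.2, 15.1, 20.4, 12.7, 16.9, 14.3]
two_seasons_ago = [17.0, 16.2, 18.8, 13.5, 15.0, 15.9]
current_season = [17.5, 14.8, 19.1, 13.0, 16.0, 14.9]

b0, b1, b2, r_sq = fit_two_predictors(prev_season, two_seasons_ago, current_season)
print(f"R-Sq = {r_sq:.1%}")
```

Running the same function once per position, on that position's players only, would give one R-Sq value per model, as in the list above.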
Let's look at the running backs and receivers first. The R-Sq values tell us that fantasy scores from the last two seasons are poor predictors of future scores. In those positions, elite players simply do not stay elite. Adrian Peterson is the only player to finish as a top 10 running back in each of the last 3 years, and Andre Johnson and Roddy White are the only players to finish as top 10 receivers in each of the last 3 years.
I tried to use other variables, such as targets and yards per catch for receivers, and carries and yards per rush for running backs. Unfortunately, none of those variables were statistically significant. It's simply hard to tell why Darren McFadden and Brandon Lloyd became so good last year by looking at data from previous seasons.
But we can make a decent model for quarterbacks! Let's look at the Minitab output.
The p-value for "Yr 2 Avg" is 0.444, which is greater than our chosen significance level of 0.05. This means that a quarterback's fantasy average from two years ago doesn't explain a significant amount of variation beyond what his fantasy average from the previous year explains. So we can drop this predictor from the model and use a quarterback's fantasy score from only the previous season.
Let's use this model to predict 2011 fantasy football scores for quarterbacks. Note that R-sq (pred) is a more appropriate metric than R-sq or R-sq (adj) for comparing the predictive ability of various models. Based on this metric, the one-predictor model appears superior to the two-predictor model. Now, 68% is not a high R-sq (pred) value, but we’re predicting fantasy football scores here, not airline safety. So let's go ahead and use Minitab's model to predict the top 20 fantasy quarterbacks.
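To make the R-sq (pred) idea concrete, here is a rough Python sketch of how it can be computed for a one-predictor model. It assumes the usual definition, 1 - PRESS/SST, where PRESS is the sum of squared leave-one-out prediction errors: each player is removed in turn, the line is refit on the remaining players, and the removed player's score is predicted. The function names `fit_line` and `predicted_r_sq` are hypothetical, and the inputs are assumed to be plain Python lists.

```python
from statistics import mean


def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def predicted_r_sq(x, y):
    """R-sq (pred) = 1 - PRESS / SST, using leave-one-out refits."""
    my = mean(y)
    sst = sum((b - my) ** 2 for b in y)

    press = 0.0
    for i in range(len(x)):
        # Refit the line without observation i...
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_rest, y_rest)
        # ...then see how badly that line misses the held-out point.
        press += (y[i] - (intercept + slope * x[i])) ** 2

    return 1 - press / sst
```

Because every prediction is made for a player the model has not seen, this number cannot be inflated by simply adding predictors, which is why it is the fairer way to compare the one-predictor and two-predictor models.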
NOTE: Data from last season are only from weeks 1 to 16. Games in week 17 were not counted. Fantasy scores were calculated using ESPN's standard scoring system.
Michael Vick is at the top of the list because he averaged almost 26 points per game last year. The model predicts that he won't do as well this year, but it's still enough to be the #1 fantasy quarterback. However, Aaron Rodgers has been more consistent the last 3 years, as he's finished as a top 3 QB each season. Vick has more upside, but Rodgers has more consistency. Which would you rather have?
The model rounds out the top 10 with Matt Cassel and Kyle Orton. This is much higher than many people expect them to finish. So if you're looking for a QB with some upside whom you can grab late in the draft, this is where you want to look.
And the biggest surprise is all the way down at spot #15. Matt Schaub's numbers fell last year, probably due to the emergence of Arian Foster. If Foster continues to perform at a high level, he'll continue to take fantasy points away from Schaub. Buyer beware!
Photograph by Greg Robbins licensed under Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic License.