College basketball season tips off today, and for the second straight season Kentucky is the #1 ranked preseason team in the AP poll. Last year Kentucky did not live up to that ranking in the regular season, going 24-10 and earning a lowly 8 seed in the NCAA tournament. But then, in the tournament, they overachieved and made a run all the way to the championship game...before losing to Connecticut.
In football, Florida State was the AP poll preseason #1 football team. While they are currently still undefeated, they aren't quite playing like the #1 team in the country. So this made me wonder, which preseason rankings are more accurate, football or basketball?
I gathered data from the last 10 seasons, and recorded the top 10 teams in the preseason AP poll for both football and basketball. Then I recorded the difference between their preseason ranking and their final ranking. Both sports had 10 teams that weren’t ranked or receiving votes in the final poll, so I gave all of those teams a final ranking of 40.
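For readers who want to reproduce the bookkeeping outside of Minitab, here is a minimal Python sketch of the difference calculation. It assumes the difference is taken as an absolute value (the article does not say either way), and the function name, the constant name, and the example rows are all hypothetical, not the data I collected.

```python
from typing import Optional

# Placeholder final rank for teams that were neither ranked nor receiving votes
# in the final poll (the value 40 comes from the article).
UNRANKED_RANK = 40


def rank_difference(preseason_rank: int, final_rank: Optional[int]) -> int:
    """Gap between preseason and final rank.

    Assumption: the gap is an absolute value, so finishing higher or lower
    than the preseason rank counts the same. Pass None for an unranked team.
    """
    if final_rank is None:
        final_rank = UNRANKED_RANK
    return abs(final_rank - preseason_rank)


# Hypothetical (preseason rank, final rank) pairs, for illustration only.
examples = [(1, 3), (4, None), (10, 7)]
differences = [rank_difference(p, f) for p, f in examples]
print(differences)  # [2, 36, 3]
```

Note how the unranked team dominates the other two values. That is why the choice of 40 matters and why the resulting data end up skewed.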
Creating a Histogram to Compare Two Distributions
Let’s start with a histogram to look at the distributions of the differences. (It's always a good idea to look at the distribution of your data when you're starting an analysis, whether you're looking at quality improvement data for work or sports data for yourself.)
You can create this graph in Minitab Statistical Software by selecting Graph > Histogram, choosing "With Groups" in the dialog box, and using the Basketball Difference and Football Difference columns as the graph variables:
The differences in the rankings appear to be pretty similar. Most of the data is towards the left side of this histogram, meaning for most cases the difference between the preseason and final ranking is pretty small.
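If you don't have Minitab handy, a rough text version of the same comparison can be sketched in Python. This is only an illustration: the bin width of 5, the function names, and the sample values are assumptions of mine, and the code expects non-negative whole-number differences.

```python
from collections import Counter


def bin_counts(differences, bin_width=5):
    """Count how many differences fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((d // bin_width) * bin_width for d in differences)
    return dict(sorted(counts.items()))


def print_histogram(label, differences, bin_width=5):
    """Print one row of '#' marks per non-empty bin."""
    print(label)
    for start, count in bin_counts(differences, bin_width).items():
        print(f"{start:>2}-{start + bin_width - 1:<2} {'#' * count}")


# Hypothetical differences, for illustration only (not the article's data).
basketball = [0, 1, 4, 5, 12, 36]
football = [2, 3, 3, 8, 30, 39]
print_histogram("Basketball Difference", basketball)
print_histogram("Football Difference", football)
```

Printing the two groups one after the other makes it easy to see whether both pile up in the lowest bins, which is the pattern described above.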
Conducting a Mann-Whitney Hypothesis Test on Two Medians
We can further investigate the data by performing a hypothesis test. Because the data is heavily skewed, I’ll use a Mann-Whitney test, which does not assume that the data is normally distributed. This compares the medians of two samples with similarly-shaped distributions, as opposed to a 2-sample t-test, which compares the means. The median is the middle value of the data. Half the observations are less than or equal to it, and half the observations are greater than or equal to it.
To perform this test in our statistical software, we select Stat > Nonparametrics > Mann-Whitney, then choose the appropriate columns for our first and second sample:
The basketball rankings have a smaller median difference than the football rankings. However, when we examine the p-value we see that this difference is not statistically significant. There is not enough evidence to conclude that one preseason poll is more accurate than the other.
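To make the mechanics of the test less of a black box, here is a small standard-library Python sketch of a two-sided Mann-Whitney test. It uses the normal approximation with a tie correction and no continuity correction, so its p-value may differ slightly from what Minitab reports. The function names and the sample data are hypothetical.

```python
import math
from statistics import median


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(sample1, sample2):
    """Return (U1, U2, two-sided p-value) using the normal approximation."""
    n1, n2 = len(sample1), len(sample2)
    pooled = list(sample1) + list(sample2)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    n = n1 + n2
    # Tie correction: each group of t tied values contributes t^3 - t.
    tie_sizes = {}
    for v in pooled:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in tie_sizes.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical, so there is nothing to compare
        return u1, u2, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, u2, p_value


# Hypothetical differences, for illustration only (not the article's data).
basketball = [0, 1, 2, 4, 5, 12, 36]
football = [1, 3, 3, 6, 8, 30, 39]
u1, u2, p = mann_whitney(basketball, football)
print(median(basketball), median(football), u1, u2, p)
```

A large p-value here would be read the same way as in the Minitab output: not enough evidence that one sample tends to have larger differences than the other.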
But what about the best teams? I grouped the data by preseason rank for the top 3 teams and looked at the median difference between preseason and final rank within each group.
The preseason AP basketball poll has a smaller difference for the #1 and #3 ranked teams. But the football poll is better for the #2 team, having an impressive median value of 1. Overall, both polls are relatively good, as neither has a median value greater than 6. And the differences are close enough that we can’t conclude that one is more accurate than the other.
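The grouping step can be sketched in a few lines of Python. The records below are hypothetical (preseason rank, difference) pairs, and the function name is my own; the point is only to show how a median per preseason rank is computed.

```python
from collections import defaultdict
from statistics import median


def median_difference_by_rank(records):
    """Group (preseason_rank, difference) pairs by rank and take each group's median."""
    groups = defaultdict(list)
    for preseason_rank, difference in records:
        groups[preseason_rank].append(difference)
    return {rank: median(diffs) for rank, diffs in sorted(groups.items())}


# Hypothetical records, for illustration only (not the article's data).
records = [
    (1, 0), (1, 4), (1, 2),
    (2, 1), (2, 1), (2, 7),
    (3, 5), (3, 3), (3, 10),
]
print(median_difference_by_rank(records))  # {1: 2, 2: 1, 3: 5}
```

With only 10 seasons there are just 10 values behind each median, so small gaps between the two polls at a given rank should not be over-interpreted.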
What Does It Mean for the Teams?
While the odds are against both Kentucky and Florida State finishing the season ranked #1 in their respective polls, previous seasons indicate that they’re still likely to finish as one of the top teams. This is better news for Kentucky, as being one of the top teams means they’ll easily make the NCAA basketball tournament and get a high seed. However, Florida State must finish as one of the top 4 teams, or else they’ll miss out on the football postseason completely.
So while we can’t conclude one poll is better than the other, teams at the top of the AP basketball poll are clearly much more likely to reach the postseason than teams at the top of the AP football poll.