# Predicting the 2017 NCAA Tournament

Predictions can be a tricky thing. Consider trying to predict the number rolled by 2 six-sided dice. We know that 7 is the most likely outcome. We know the exact probability each number has of being rolled. If we rolled the dice 100 times, we could calculate the expected value for the number of times each value would be rolled. However, even with all that information, we can't definitively predict the value of an individual roll. The process includes random variation that we can't predict. At best, all we can do is make an educated guess.
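To make the dice example concrete, here is a minimal sketch that enumerates all 36 outcomes, computes the exact probability of each total, and shows how many times each total would be expected in 100 rolls. The variable names are just illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two six-sided dice.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Exact probability of each total, and the expected count in 100 rolls.
probabilities = {total: Fraction(n, 36) for total, n in sorted(counts.items())}
expected_in_100 = {total: float(p * 100) for total, p in probabilities.items()}

for total, p in probabilities.items():
    print(f"{total:>2}: p = {float(p):.3f}, expected in 100 rolls = {expected_in_100[total]:.1f}")
```

Even the most likely total, 7, comes up only about one roll in six, which is why a single roll remains a guess.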

The same logic applies to trying to predict a basketball tournament. We can know who the best teams are and we can model the probability they have of advancing. But just like rolling dice, the process involves random variation that makes predicting an individual game hard. We can't predict when a team is going to catch fire and hit almost 60% of their 3-point shots. And we can't predict when a team that normally forces turnovers on 25% of its opponents' possessions is going to do so only 10% of the time. At best, all we can do is make an educated guess.

But hey, educated guessing is still better than completely guessing! Plus it's a lot of fun. So let's get started!

I’ll be using the Sagarin Predictor Ratings to determine the probability each team has of advancing in the NCAA tournament using a binary logistic model created with Minitab Statistical Software. You can find the details of how the probabilities are being calculated here.
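For readers curious how single-game probabilities turn into round-by-round advancement probabilities, here is a rough sketch of one common approach. It is not necessarily the exact calculation behind the tables in this post: the four team ratings, the `SLOPE` value, and the function names are all hypothetical, chosen only for illustration. The idea is that a team's chance of winning a given round is its chance of getting there, times its chance of beating each possible opponent, weighted by how likely that opponent is to be there.

```python
import math

# Hypothetical ratings for a four-team mini bracket (not real Sagarin values).
ratings = {"A": 92.0, "B": 85.0, "C": 88.0, "D": 84.0}
SLOPE = 0.17  # assumed logistic slope per rating point, for illustration only


def win_prob(team, opponent):
    """Probability that `team` beats `opponent` under the assumed logistic model."""
    diff = ratings[team] - ratings[opponent]
    return 1.0 / (1.0 + math.exp(-SLOPE * diff))


def advance(bracket):
    """Return a list of {team: P(team wins through that round)}, one dict per round.

    `bracket` lists teams in bracket order; its length must be a power of two.
    """
    # Before any games, each slot holds one team with probability 1.
    groups = [{team: 1.0} for team in bracket]
    rounds = []
    while len(groups) > 1:
        next_groups = []
        for left, right in zip(groups[0::2], groups[1::2]):
            merged = {}
            for side, other in ((left, right), (right, left)):
                for team, p_reach in side.items():
                    # Weight each possible opponent by its chance of being there.
                    p_win = sum(p_opp * win_prob(team, opp) for opp, p_opp in other.items())
                    merged[team] = p_reach * p_win
            next_groups.append(merged)
        groups = next_groups
        rounds.append({t: p for g in groups for t, p in g.items()})
    return rounds


for i, probs in enumerate(advance(["A", "B", "C", "D"]), start=1):
    print(f"Round {i}:", {t: round(p, 3) for t, p in probs.items()})
```

The same loop scales to a 16-team region by passing the teams in bracket order.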

Before we start, I’d also like to mention one other set of basketball ratings, called the Pomeroy Ratings. Both the Sagarin ratings and the Pomeroy ratings have proven to be pretty accurate in predicting college basketball games. But Ken Pomeroy always breaks down the tournament using his system. So instead of duplicating his numbers, I like to use the Sagarin predictor ratings. But I’ll be sure to mention places where the two systems disagree, and you can select the one you want to go with!

Alright, enough with the small talk. Let’s get to the statistics!

# East

The following table has the probabilities each team in the East Region has of advancing in each round (up to the Final Four).

| Team | 2nd Round | Sweet 16 | Elite 8 | Final 4 |
|---|---|---|---|---|
| (1) Villanova | 99% | 73% | 45% | 30% |
| (2) Duke | 97% | 72% | 43% | 20% |
| (5) Virginia | 89% | 51% | 24% | 14% |
| (4) Florida | 90% | 46% | 20% | 12% |
| (3) Baylor | 91% | 55% | 28% | 10% |
| (6) SMU | 79% | 39% | 18% | 6% |
| (8) Wisconsin | 76% | 24% | 10% | 5% |
| (10) Marquette | 50% | 14% | 5% | 1% |
| (7) South Carolina | 50% | 14% | 5% | 1% |
| (9) Virginia Tech | 24% | 3.3% | 1% | 0.2% |
| (11) USC/Providence | 21% | 5% | 1% | 0.1% |
| (12) UNC Wilmington | 11% | 5% | 0.5% | 0.1% |
| (13) East Tenn St | 10% | 1% | 0.1% | <0.1% |
| (14) New Mexico St | 9% | 1% | 0.2% | <0.1% |
| (15) Troy | 3% | 0.3% | <0.1% | <0.1% |
| (16) Mt. St. Marys/New Orleans | 1% | 0.1% | <0.1% | <0.1% |

Congratulations on the overall number 1 seed, Villanova. Now here comes the hardest region in the tournament! The East Region features the highest-rated 8 seed (Wisconsin at #17), 5 seed (Virginia at #7), and 4 seed (Florida at #10). Villanova is a great team (ranked #2 in Sagarin and Pomeroy), but their path is going to be very hard right out of the gate. If Wisconsin defeats Virginia Tech, they'd have a 30% chance of knocking off the Wildcats. And both Virginia and Florida would have a 40% chance of beating Villanova. And we haven't even mentioned the 2 seed yet!
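As a quick sanity check on the Wisconsin figure, the conditional probability can be recovered from the East table: divide Wisconsin's chance of reaching the Sweet 16 by its chance of winning its opening game. This is only a sketch using the rounded table values, and it assumes the second-round opponent is almost certainly Villanova (99% in the table).

```python
# Values copied from the East table above (rounded percentages).
wisconsin_round2 = 0.76   # P(Wisconsin wins its opening game)
wisconsin_sweet16 = 0.24  # P(Wisconsin reaches the Sweet 16)

# P(win second game | won first game) = P(reach Sweet 16) / P(reach 2nd round)
conditional = wisconsin_sweet16 / wisconsin_round2
print(f"P(Wisconsin wins round 2 | it won round 1) = {conditional:.1%}")
```

The result lands a little above 30%, which is consistent with the figure quoted above once rounding in the table is taken into account.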

That 2 seed would, of course, be Duke. They are ranked #8 in the Sagarin Predictor rankings. But despite being 6 spots lower than Villanova, you'll see that they have a similar probability of reaching the Sweet Sixteen and Elite Eight. That's because their path is much easier. Neither Marquette nor South Carolina is as good as Wisconsin, and Baylor is a pretty weak 3 seed. In fact, the Sagarin Ratings would have SMU as only a 1-point underdog to Baylor, and the Pomeroy Ratings would actually favor SMU! So don't pick Baylor to go too far in your bracket.

But getting back to Duke, their ranking might actually be a little low. Duke lost a handful of games this year when they were missing Grayson Allen and Coach Krzyzewski. Had Coach K been there the entire year and had Grayson Allen been healthy and, er, "well behaved", this team would probably have performed better in some of those losses. And since they're both back for the tournament, it stands to reason this team might actually be better than #8. But how much better remains to be seen.

If you're looking for early upsets, this probably isn't the region for you. Both UNC Wilmington and East Tennessee State are good teams, but they drew brutal opening-round opponents. The Pomeroy Ratings give those teams a better chance than shown here (more on that later), but even then the best probability is UNC Wilmington having a 22% chance of beating Virginia. It's possible (especially if Virginia falls into another of the offensive funks they've been prone to this year), but the upsets are more likely to come in the later rounds.

Speaking of upsets in later rounds, SMU is a team capable of making a run. I've already mentioned that they're capable of beating Baylor, but the Pomeroy Ratings actually would favor them against Duke too! (Although the same caveat I gave on Duke earlier would also apply to the Pomeroy Ratings.) Having SMU in your Sweet Sixteen or even Elite Eight isn't a poor choice if you want to pick some chaos.

So overall, your best bet is taking Villanova or Duke in this region, although Virginia and Florida both make for interesting dark horses. The problem is they will most likely have to play each other in the 2nd round, and that game is basically a tossup. So good luck picking which one to go with!

# West

| Team | 2nd Round | Sweet 16 | Elite 8 | Final 4 |
|---|---|---|---|---|
| (1) Gonzaga | 99% | 87% | 56% | 43% |
| (4) West Virginia | 95% | 70% | 33% | 24% |
| (3) Florida St | 93% | 67% | 38% | 12% |
| (2) Arizona | 96% | 54% | 29% | 8% |
| (7) Saint Mary's | 72% | 37% | 19% | 5% |
| (5) Notre Dame | 81% | 27% | 8% | 4% |
| (11) Xavier | 54% | 18% | 7% | 1% |
| (6) Maryland | 46% | 13% | 4% | 0.6% |
| (8) Northwestern | 50% | 7% | 2% | 0.5% |
| (9) Vanderbilt | 50% | 7% | 2% | 0.5% |
| (10) VCU | 28% | 9% | 3% | 0.4% |
| (12) Princeton | 19% | 2% | 0.2% | 0.1% |
| (13) Bucknell | 5% | 1% | 0.1% | <0.1% |
| (14) Florida Gulf Coast | 7% | 1% | 0.1% | <0.1% |
| (15) North Dakota | 4% | 0.3% | <0.1% | <0.1% |
| (16) South Dakota St | 1% | 0.2% | <0.1% | <0.1% |

Gonzaga has been to the NCAA tournament 19 times. Despite often being the lower-seeded team, they have a winning record in tournament games of 24-19 and have made 7 Sweet Sixteens and 2 Elite Eights. However, despite all that success, they have never been to a Final Four. Well, that could change this year. This is by far the best team Gonzaga has ever had. They're ranked #1 in both Sagarin and Pomeroy. They drew the weakest 2 seed in the tournament. And their Pomeroy Adjusted Efficiency Margin (the value that the ratings are based on) is the 4th best in the history of the Pomeroy Ratings. They trail only 2002 Duke, 2008 Kansas, and 2015 Kentucky. Of course, none of those teams won the tournament, because single-elimination tournaments can be like that. But make no mistake about it: this Gonzaga team is great.

Their stiffest competition actually comes from the 4 seed. West Virginia is ranked 4th in Sagarin and 5th in Pomeroy. The Mountaineers lost a lot of close games this year. And because close games are more a result of luck than ability, West Virginia is a much better team than their record indicates. Gonzaga and West Virginia will most likely meet in the Sweet Sixteen, and the winner of that game will be favored in their next game to reach the Final Four.

In the bottom half of the bracket, Arizona and Florida State are very weak 2 and 3 seeds, respectively. Florida State is 18th in Sagarin and 19th in Pomeroy. Arizona is 21st in Sagarin and 20th in Pomeroy. That leaves the door wide open for Saint Mary's to make a run. Sagarin would have the Gaels as slight underdogs to Florida State and Arizona, and Pomeroy would actually favor them in both games! If you want to root for the little guy, having Saint Mary's vs. Gonzaga in the Elite Eight wouldn't be a terrible pick. I mean, that's if you actually still count Gonzaga as a little guy anymore.

This region also has great potential for upsets in the first round. The Sagarin-based model favors 11-seeded Xavier over Maryland. And it gives Princeton a 19% chance of beating Notre Dame. That's not great, but on the bright side Pomeroy is much more optimistic, giving Princeton a 31% chance of winning. So I'll take this opportunity to illustrate the main difference between the Pomeroy Ratings and the Sagarin Ratings. It's the mid-majors. I divided all 68 teams in the tournament into teams from mid-major conferences and power conferences. Then I looked at the difference in their rankings in the two systems. Here are the results.

The average difference for teams in the power conferences is 0, and there isn't much variation. But mid-major teams are, on average, ranked 8.5 spots lower in Sagarin than in Pomeroy. So when a mid-major plays a power conference team, the Pomeroy ratings are going to give the mid-major a better chance of winning. In our Princeton/Notre Dame example, Pomeroy says Notre Dame should be favored by 5 points, whereas Sagarin has it at 8.5. To see who might be closer, I decided to see what the spread was in Vegas. And wouldn't you know it, they put it right in the middle at 7 points. So what should you do? Personally, since West Virginia will be a heavy favorite in the second round anyway, I say pick Princeton and root for the upset. Go Nerds!
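To get a feel for how much those point spreads matter, here is a rough consistency check. It assumes win probability is a logistic function of the point spread with no intercept, and it backs out the slope `k` from the single Sagarin data point quoted above (an 8.5-point spread paired with a 19% chance for Princeton). That functional form and the helper `favorite_prob` are assumptions for illustration, not the fitted Minitab model.

```python
import math

# From the text: Sagarin makes Notre Dame an 8.5-point favorite, and the
# Sagarin-based table gives Princeton a 19% chance (so Notre Dame 81%).
sagarin_spread = 8.5
notre_dame_prob = 0.81

# Assumption: win probability is logistic in the spread with no intercept,
# p = 1 / (1 + exp(-k * spread)). Solve for k from the single point above.
k = math.log(notre_dame_prob / (1 - notre_dame_prob)) / sagarin_spread


def favorite_prob(spread):
    """Favorite's win probability for a given point spread under the fitted slope."""
    return 1.0 / (1.0 + math.exp(-k * spread))


for label, spread in [("Pomeroy", 5.0), ("Vegas", 7.0), ("Sagarin", 8.5)]:
    print(f"{label:>8}: Notre Dame by {spread}, Princeton wins {1 - favorite_prob(spread):.0%}")
```

Under this assumption a 5-point spread gives Princeton roughly a 30% chance, close to the 31% Pomeroy reports, and the 7-point Vegas line falls in between the two systems.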

In this region, you probably want to pick either Gonzaga or West Virginia. Sure, Florida State and Arizona have shots too, but their probabilities are pretty low considering they're a 3 and 2 seed. Chances are other people in your pool are going to pick Florida State and Arizona at a rate higher than 12% and 8%, respectively. So going with Gonzaga or West Virginia (and maybe taking a chance with Saint Mary's in the Elite Eight) should give you an edge.

# Midwest

| Team | 2nd Round | Sweet 16 | Elite 8 | Final 4 |
|---|---|---|---|---|
| (1) Kansas | 97% | 76% | 46% | 26% |
| (2) Louisville | 98% | 65% | 41% | 23% |
| (3) Oregon | 95% | 65% | 32% | 15% |
| (4) Purdue | 89% | 51% | 26% | 14% |
| (5) Iowa St | 81% | 42% | 20% | 10% |
| (6) Creighton | 68% | 27% | 10% | 4% |
| (7) Michigan | 50% | 18% | 8% | 3% |
| (10) Oklahoma St | 50% | 18% | 8% | 3% |
| (8) Miami FL | 58% | 15% | 5% | 2% |
| (9) Michigan St | 58% | 15% | 5% | 2% |
| (11) Rhode Island | 32% | 8% | 2% | 0.4% |
| (12) Nevada | 19% | 5% | 1% | 0.2% |
| (13) Vermont | 11% | 2% | 0.3% | <0.1% |
| (14) Iona | 5% | 1% | <0.1% | <0.1% |
| (15) Jacksonville St | 2% | 0.1% | <0.1% | <0.1% |
| (16) NC Central/UC Davis | 3% | 0.2% | <0.1% | <0.1% |

Kansas is the weakest 1 seed in the tournament, ranked 9th in Sagarin and 10th in Pomeroy. And thus you'll see they have the lowest probability of reaching the Final Four of all the 1 seeds. Louisville is actually ranked ahead of Kansas, but has a slightly lower probability of reaching the Final Four due to a harder path. Both Michigan and Oklahoma State are capable of knocking off Louisville in the 2nd round. The problem with picking that upset is you have to decide whether to take Michigan or Oklahoma State, and their opening game is a coin flip!

The statistics have Oregon as the 3rd most likely team to win this region, but that comes with an asterisk. In their next to last game of the season, Oregon senior Chris Boucher tore his ACL. He was the 3rd leading scorer, 2nd leading rebounder, and leading shot blocker. The statistics don't know Oregon has to play the rest of the season without him, so their chances are lower than shown here. Be wary of picking Oregon to go too far in your bracket.

That leaves Purdue as a viable option if you're looking for a dark horse. They would only be a 1-point underdog to Kansas according to Sagarin, so that is definitely a winnable game. And Iowa State already won on the road against Kansas earlier this season, so expect the Jayhawks to have their hands full in the Sweet Sixteen. Of course, Iowa State plays a very good Nevada team in the opening round. So if you're planning on putting Purdue in the Sweet Sixteen, picking Nevada as an upset isn't a bad option.

Overall, this region is pretty open. Louisville would be favored in any potential matchup, but they will have a tough game right off the bat with either Michigan or Oklahoma State. So if you wanted to go crazy and pick something like Michigan, Oklahoma State, or even Creighton in the Final Four, this would be the region to do it. But most likely, it'll be Kansas or Louisville.

# South

| Team | 2nd Round | Sweet 16 | Elite 8 | Final 4 |
|---|---|---|---|---|
| (1) North Carolina | 99% | 86% | 69% | 44% |
| (2) Kentucky | 97% | 60% | 39% | 21% |
| (3) UCLA | 95% | 58% | 25% | 10% |
| (10) Wichita St | 77% | 35% | 21% | 10% |
| (4) Butler | 91% | 60% | 18% | 7% |
| (6) Cincinnati | 68% | 32% | 12% | 5% |
| (5) Minnesota | 65% | 28% | 6% | 2% |
| (8) Arkansas | 56% | 9% | 4% | 0.9% |
| (9) Seton Hall | 44% | 6% | 2% | 0.5% |
| (11) Kansas St/Wake Forest | 32% | 10% | 2% | 0.6% |
| (12) Middle Tenn St | 35% | 11% | 2% | 0.3% |
| (7) Dayton | 23% | 5% | 1% | 0.3% |
| (13) Winthrop | 9% | 2% | 0.1% | <0.1% |
| (14) Kent St | 5% | 0.5% | <0.1% | <0.1% |
| (15) Northern Kentucky | 3% | 0.1% | <0.1% | <0.1% |
| (16) Texas Southern | 1% | 0.1% | <0.1% | <0.1% |

This is the region that Villanova should have been given with the overall #1 seed. North Carolina has a cakewalk to the Elite Eight. Gonzaga is the only other team with a greater than 50% chance of reaching the Elite Eight (56%) and North Carolina's probability blows that away at 69%! Of course, once they get there they'll face a tough game. But who will it be?

Kentucky and UCLA are the two most likely teams to face North Carolina in the Elite Eight. But who is the next team after that? 10-seeded Wichita State! Yep, the Shockers are ranked #11 in Sagarin and #8 in Pomeroy. In fact, both systems would favor Wichita State over UCLA if those two teams played. And they'd only be a slight underdog to Kentucky. Three years ago an undefeated Wichita State team got a #1 seed, only to lose to a very under-seeded Kentucky team in the 2nd round. This year they get a chance for payback, as they could now be the under-seeded team that pulls the 2nd-round upset.

Every year there is a 12 seed that beats a 5 seed, and this region gives us our best chance. Sagarin gives Middle Tennessee State a 35% chance of beating Minnesota, but those are the lowest odds you'll find. Pomeroy gives Middle Tennessee State a 45% chance of winning. And in Vegas, the line is a pick 'em, meaning they think this game is a coin flip! That's an upset you should absolutely consider, and it's a no-brainer to pick if your pool gives you bonus points for upsets. In fact, if you get bonus points for upsets, go ahead and put Middle Tennessee State in the Sweet Sixteen. With North Carolina being such a heavy favorite to win their Sweet Sixteen game anyway, go ahead and try to maximize those bonus points!
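Here is a small sketch of why upset bonuses change the math. The scoring rule is hypothetical (1 point for a correct first-round pick, plus a bonus equal to the seed difference when the underdog wins), so swap in your own pool's rules. It only looks at first-round points and ignores what happens in later rounds.

```python
# Chance that 12-seed Middle Tennessee State beats 5-seed Minnesota, per the text.
upset_prob = {"Sagarin": 0.35, "Pomeroy": 0.45, "Vegas": 0.50}

BASE_POINTS = 1  # hypothetical: points for any correct first-round pick
UPSET_BONUS = 7  # hypothetical: bonus equal to the seed difference (12 - 5)


def expected_points(p_upset, bonus):
    """Expected first-round points for picking the underdog vs. the favorite."""
    pick_underdog = p_upset * (BASE_POINTS + bonus)
    pick_favorite = (1 - p_upset) * BASE_POINTS
    return pick_underdog, pick_favorite


for source, p in upset_prob.items():
    plain_dog, plain_fav = expected_points(p, 0)
    bonus_dog, bonus_fav = expected_points(p, UPSET_BONUS)
    print(f"{source}: no bonus -> underdog {plain_dog:.2f} vs favorite {plain_fav:.2f}; "
          f"with bonus -> underdog {bonus_dog:.2f} vs favorite {bonus_fav:.2f}")
```

Without a bonus, the favorite is the better pick at Sagarin's 35%. With a bonus of this size, the underdog's expected points are higher even at the most pessimistic estimate.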

North Carolina and Kentucky are both top 5 teams in Sagarin and Pomeroy, so choosing either to go to the Final Four is a good selection. However, North Carolina has the much easier path. And of course, look out for Wichita State. They definitely have the potential to "shock" the world and win this region.

# Final Four

| Team | Final Four | Semifinal | Champion |
|---|---|---|---|
| (1) Gonzaga | 43% | 29% | 18% |
| (1) North Carolina | 44% | 28% | 14% |
| (1) Villanova | 30% | 16% | 10% |
| (4) West Virginia | 24% | 16% | 9% |
| (2) Kentucky | 21% | 12% | 6% |
| (1) Kansas | 26% | 12% | 5% |
| (2) Louisville | 23% | 11% | 5% |
| (2) Duke | 20% | 9% | 5% |
| (5) Virginia | 14% | 6% | 3% |
| (4) Florida | 12% | 5% | 3% |

The top 5 ranked teams in the Sagarin ratings are Gonzaga, Villanova, North Carolina, West Virginia, and Kentucky (in that order). So it's no surprise that those teams have the top 5 probabilities of winning the entire tournament. The only difference in the order is North Carolina gets a bump over Villanova due to the fact that they have a much easier path. But it's really a wide open tournament, as the favorite has only an 18% chance of winning the title. That's a far cry from the 41% chance Kentucky had as the top team two years ago. So when you pick your champion, try to think about who the other people entering your pool will choose. If you don't think anybody in your pool actually believes in Gonzaga, then they are the clear choice for you. If you're entering a pool with hundreds of entries, West Virginia could be a good selection since there will be a ton of entries picking the higher seeds. Villanova could also be a good pick, if you think most people will avoid them since they won it last year. The choice is yours. So good luck, and remember, you're not just taking guesses.
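One rough way to put "think about who the others will choose" into numbers is to divide each team's title probability by the share of your pool that picks it. The title probabilities below come from the Final Four table; the pick shares are entirely made up for illustration, so replace them with your own read of your pool.

```python
# Title probabilities from the Final Four table above.
title_prob = {"Gonzaga": 0.18, "North Carolina": 0.14, "Villanova": 0.10, "West Virginia": 0.09}

# HYPOTHETICAL share of pool entries picking each team as champion.
# These are invented for illustration; substitute your own pool's tendencies.
pick_share = {"Gonzaga": 0.10, "North Carolina": 0.25, "Villanova": 0.15, "West Virginia": 0.03}

# A ratio above 1 means the team wins more often than the pool picks it.
leverage = {team: title_prob[team] / pick_share[team] for team in title_prob}

for team, ratio in sorted(leverage.items(), key=lambda item: item[1], reverse=True):
    print(f"{team:<15} title {title_prob[team]:.0%}, picked {pick_share[team]:.0%}, ratio {ratio:.2f}")
```

With these invented shares, an under-picked team like West Virginia comes out ahead of a popular favorite, which is the intuition behind picking against the crowd in a large pool.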

You're taking educated guesses!