Predicting the NCAA Tournament with Minitab

Finally! Not only is it March, but the NCAA tournament brackets are out! It's almost like being a kid opening a brand new toy on Christmas morning. Hey, I said almost!

Anyway, in case you missed it, over the past few weeks I’ve used Minitab to create a regression model that predicts the probability one basketball team has of beating another, then improved that model, and tested the model. But there is one little bit of housekeeping that has to be done before we break down the brackets.

A Model for Neutral Site Games

The previous model took into account which team was playing at home, but NCAA tournament games are played at neutral sites. So I’ve used the Pomeroy Rankings to collect the ranks and probabilities of 150 neutral-site games. Then I followed the exact same procedure as for our previous model to create a nonlinear regression equation that will work for neutral-site games.

Looks familiar, right? Except that in this model, a Difference in Statistics of 0 gives you a probability of 50% (the other model would give the home team a probability of about 64%). This makes sense because, on a neutral court, two teams with the same rank should have an equal chance of winning.
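The fitted equation itself isn't reproduced in this post, so here is only a rough sketch of the idea in Python, assuming a logistic curve. The function name, the slope value, and the home_edge parameter are all made up for illustration; they are not the coefficients Minitab produced.

```python
import math

# Assumed slope, chosen only for illustration (not the fitted Minitab value).
ASSUMED_SLOPE = 0.05

def win_probability(stat_diff, slope=ASSUMED_SLOPE, home_edge=0.0):
    """Chance that team 1 wins under an assumed logistic curve.

    stat_diff: the "Difference in Statistics", signed so that positive
               values favor team 1.
    home_edge: extra term for a home team; 0.0 means a neutral court.
    """
    return 1.0 / (1.0 + math.exp(-(home_edge + slope * stat_diff)))

# Neutral court, evenly matched teams: a coin flip.
print(win_probability(0))

# With this assumed form, a home edge of ln(0.64 / 0.36) reproduces the
# roughly 64% home-team figure at a difference of 0.
print(round(win_probability(0, home_edge=math.log(0.64 / 0.36)), 2))
```

Whatever the real coefficients are, the point is the same: with no home-court term, a difference of 0 lands exactly on 50%, and swapping the two teams just flips the probability around 50%.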

So now let’s get to the fun part! Below is the probability of each first-round game. Also, for each region I've included probabilities for some potential future-round games. Remember, all probabilities are based on the LRMC rankings because they've correctly predicted the most NCAA tournament games in the last 9 years. But you could use this model for any ranking system you want! Okay, without further ado, let's get started.

First Four

Team 1 | Team 2 | Favorite
Iona | BYU | Iona (52%)
California | South Florida | California (67%)
Miss Valley State | Western Kentucky | Miss Valley State (50%)
Lamar | Vermont | Lamar (56%)

There was lots of complaining about Iona getting in over Drexel. But in LRMC’s rankings, Iona was 31st while Drexel was 50th. Iona is better than you think. The problem is that BYU is no slouch, either. Whoever wins this game will be able to compete with Marquette. As for Cal, the model favors them not only over South Florida, but over Temple too. But more on that later.

South Region (Opening Round)

Team 1 | Team 2 | Favorite
Kentucky | Miss Valley State | Kentucky (98%)
Iowa State | Connecticut | Iowa State (55%)
Wichita State | VCU | Wichita State (79%)
Indiana | New Mexico State | Indiana (70%)
UNLV | (not listed) | UNLV (75%)
Baylor | South Dakota State | Baylor (62%)
Notre Dame | Xavier | Notre Dame (57%)
Duke | Lehigh | Duke (75%)

Things don't look good for the winner of the Mississippi Valley St-Western Kentucky game. Kentucky should easily beat either team. The Iowa St-Connecticut game is close to a coin toss, just as you would expect an 8-9 game to be.

VCU might be a trendy upset pick because of the run they made last year, but they're going to have to beat the odds, as they open with a very talented Wichita State team. Other than Kentucky, Wichita State has the best chance of getting out of the first round. Don't sleep on the Shockers.

That brings us to Indiana, whose numbers should be taken with a grain of salt. That's because senior guard Verdell Jones III tore his ACL in the Big Ten Tournament. The model can’t account for injuries, and how much his absence will affect Indiana is unknown. And New Mexico State already has a 30% chance of winning. If you’re looking for a “safe” upset pick, this might be it. Even if Indiana wins, Wichita State (your Sweet Sixteen team) will be favored to knock them off in the second round.

But if you’re looking to throw caution to the wind with your upset picks, the bottom half of the region could go nuts. South Dakota State has a 38% chance of knocking off Baylor. And if you’re looking for an even bigger upset, may I interest you in Lehigh and their 1 in 4 chance of beating Duke? It's been 11 years since a 15 seed beat a 2 seed, and Lehigh has by far the best shot of any 15 seed in the tournament.

South Region (Future Rounds)

Team 1 | Team 2 | Favorite
Kentucky | Iowa State | Kentucky (88%)
Wichita State | Indiana | Wichita State (58%)
Kentucky | Wichita State | Kentucky (70%)
Baylor | UNLV | Baylor (58%)
Duke | Notre Dame | Duke (74%)
Duke | Baylor | Duke (55%)
Kentucky | Duke | Kentucky (78%)
Duke | Wichita State | Wichita State (60%)
Kentucky | Baylor | Kentucky (81%)

Things set up nicely for Kentucky. The lowest probability of winning they have in any potential game is 70%. And it's actually Wichita State that has the best chance of knocking off Kentucky, not Baylor or Duke. The model favors Wichita State over Indiana, Duke, and Baylor, so they are definitely a dark horse Final-Four team. It’s too bad they’re in the same region as Kentucky, otherwise their chances would be even better.

One interesting note: Notre Dame's probability of beating Duke is only 1 percentage point better than Lehigh's! Duke is by far the weakest 2 seed in the field, but they lucked out by having the weakest 7 and 10 seeds in their region. And just to let you know, Lehigh would be favored over Xavier in a potential second-round game.

If Duke has to play Notre Dame and Baylor, their chance of reaching the Elite 8 is only 30%. If Baylor plays UNLV and Duke, their chance is only 16%! And if either of them gets that far, they'll be underdogs to Kentucky, Wichita State, and even Indiana!
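Those path numbers are just the single-game probabilities multiplied together. Here's a minimal Python sketch of that arithmetic, assuming each game is independent of the others and using the rounded percentages from the South tables. The function name is my own.

```python
from math import prod

def path_probability(win_probs):
    """Chance of winning every game in a sequence, assuming independence."""
    return prod(win_probs)

# Duke: Lehigh (75%), then Notre Dame (74%), then Baylor (55%).
duke_elite_8 = path_probability([0.75, 0.74, 0.55])

# Baylor: South Dakota State (62%), then UNLV (58%), then Duke.
# Duke is a 55% favorite over Baylor, so Baylor's chance there is 1 - 0.55.
baylor_elite_8 = path_probability([0.62, 0.58, 1 - 0.55])

print(f"Duke: {duke_elite_8:.1%}")
print(f"Baylor: {baylor_elite_8:.1%}")
```

Because the inputs here are already rounded to whole percentages, the results only land close to the 30% and 16% quoted above rather than matching them to the decimal.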

Overall, Kentucky is a big favorite to win this region. But if you want to pick somebody else to reach the Final Four, Wichita State may be your best bet.

West Region (Opening Round)

Team 1 | Team 2 | Favorite
Michigan State | LIU Brooklyn | Michigan State (94%)
Memphis | Saint Louis | Memphis (60%)
New Mexico | Long Beach State | New Mexico (56%)
Louisville | Davidson | Louisville (56%)
Murray State | (not listed) | Murray State (75%)
Marquette | Iona | Marquette (59%)
Florida | Virginia | Florida (59%)
Missouri | Norfolk State | Missouri (94%)

The 1 and 2 seeds look to roll in the first round. Don't look for any upsets there.

Things get very interesting after that. The 7 and 8 seeds actually have a better chance of winning than the 4 and 5 seeds.

Louisville has only a 56% chance of beating Davidson in the first round. And New Mexico is only a slight favorite over Long Beach State, too! Who to pick!?!?! Maybe this is where you use mascots to pick the winner. Let’s move on.

For the second straight region, we see the 6 seed has a better chance of winning than the 3 seed. I used Iona for this table, but the probability for Marquette-BYU is similar. Last year, VCU was criticized for their inclusion into the tournament and proved everybody wrong by getting to the Final Four. If Iona can get past BYU, they have a legitimate chance to also prove people wrong.

Speaking of going on tears, Murray State had quite the run this season. They should continue their great season into the second round.

West Region (Future Rounds)

Team 1 | Team 2 | Favorite
Michigan State | Memphis | Michigan State (62%)
Louisville | New Mexico | New Mexico (58%)
Louisville | Long Beach State | Long Beach State (52%)
Michigan State | New Mexico | Michigan State (67%)
Michigan State | Louisville | Michigan State (73%)
Marquette | Murray State | Marquette (57%)
Missouri | Florida | Missouri (67%)
Missouri | Marquette | Missouri (64%)
Michigan State | Missouri | Michigan State (57%)

Um, yeah...good luck picking the winner of this region. Michigan State won the Big Ten tournament, and is worse off for doing so. They got a 1 seed, but their path to the Final Four will be anything but easy. If Memphis gets past Saint Louis, the Spartans have a heck of an opponent waiting for them in the second round. Sure, Michigan State is favored, but a 62% chance of winning in the second round isn’t what a 1 seed wants to see. And by the way, if Memphis beats Michigan State, they would be favored against every other team except Missouri.

Louisville wasn't given any breaks by winning the Big East either. If they get past Davidson, they'll actually be underdogs to either New Mexico or Long Beach State.

If Marquette survives the BYU-Iona winner, they'll have their hands full with Murray State. If that game happens, will anybody outside the state of Wisconsin not be rooting for the Racers?

Missouri should breeze by Norfolk State, but will face a stiff test if they play Florida in the second round. Missouri is a talented team, but Florida is the best 7 seed in the field. A 2 seed goes down in the second round just about every year. It's not great, but the 33% chance Florida has is the highest of any 7 or 10 seed in the field.

So all 4 of the top seeds could be in for tight games in the second round, which could spell chaos for this region. Michigan State and Missouri definitely have the best chance of getting to the regional finals, but don't be surprised if one (or both) of them go down before that.

East Region (Opening Round)

Team 1 | Team 2 | Favorite
Syracuse | UNC Asheville | Syracuse (85%)
Kansas State | Southern Miss | Kansas State (67%)
Vanderbilt | Harvard | Vanderbilt (68%)
Wisconsin | Montana | Wisconsin (77%)
Cincinnati | Texas | Texas (56%)
Florida State | St. Bonaventure | Florida State (70%)
Gonzaga | West Virginia | Gonzaga (60%)
Ohio State | Loyola (MD) | Ohio State (94%)

Syracuse is the weakest 1 seed, as the LRMC rankings have them as the 8th best team in the country. Syracuse is favored against UNC Asheville, but the model gives UNC Asheville a 15% chance of winning. It’s not great, but it’s a lot better chance than the 16 seed usually has.

It's not often an 8-9 game has a heavy favorite, but that's what Kansas State is. Southern Miss has a high RPI, but is ranked 69th in the LRMC rankings. Meanwhile, Wisconsin and Vanderbilt look likely to advance in the first round.

The model hasn’t liked the 11 seeds so far, but that changes here. Texas is actually favored over 6th-seeded Cincinnati. Florida State is the lowest-ranked 3 seed in the field, but luckily gets a relatively easy opponent in St. Bonaventure.

And it's always nice to see that Gonzaga is favored. Why? Because who doesn't pick Gonzaga to win at least 1 game in their bracket?

East Region (Future Rounds)

Team 1 | Team 2 | Favorite
Syracuse | Kansas State | Syracuse (69%)
Wisconsin | Vanderbilt | Wisconsin (53%)
Syracuse | Wisconsin | Syracuse (57%)
Syracuse | Vanderbilt | Syracuse (60%)
Florida State | Texas | Florida State (57%)
Florida State | Cincinnati | Florida State (62%)
Ohio State | Gonzaga | Ohio State (80%)
Ohio State | Florida State | Ohio State (80%)
Syracuse | Ohio State | Ohio State (67%)

It's hard to think a team is better off by losing, but that’s exactly what happened in the Big Ten tournament title game Sunday. By being the 2 seed in the East Region, Ohio State actually has an easier path to the Final Four than Michigan State has as the 1 seed in the West. The Buckeyes have a 60% chance of getting to the regional finals, and that's assuming they have to play the highest-ranked team in each game!
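To see where that 60% comes from, here's a small Python sketch that tracks Ohio State's chance of still being alive after each game. It assumes the opponents are Loyola (MD), Gonzaga, and Florida State, uses the rounded probabilities from the East tables, and treats the games as independent. The variable names are just for illustration.

```python
# Ohio State's assumed path, using the favorite probabilities from the East tables.
ohio_state_path = [
    ("Loyola (MD)", 0.94),
    ("Gonzaga", 0.80),
    ("Florida State", 0.80),
]

still_alive = 1.0
running_chances = []
for opponent, win_prob in ohio_state_path:
    still_alive *= win_prob  # assumes each game is independent of the others
    running_chances.append(still_alive)
    print(f"After {opponent}: {still_alive:.1%}")
```

The last line printed is Ohio State's chance of reaching the regional final against this particular set of opponents.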

Syracuse is the team most likely to play Ohio State in the regional finals. But the Orange will have a tough game against either Wisconsin or Vanderbilt in the Sweet Sixteen.

Florida State has been a trendy pick because they just won the ACC tournament. But don't let a sample size of 3 games fool you. This is still a team that lost to Boston College (owners of a 9-22 record and an LRMC rank of 260). Like I said, they are the lowest-ranked 3 seed in the tournament. The odds of this team reaching the Final Four are very low.

Midwest Region (Opening Round)

Team 1 | Team 2 | Favorite
North Carolina | Lamar | North Carolina (88%)
Creighton | Alabama | Creighton (57%)
Temple | California | California (57%)
Michigan | Ohio | Michigan (61%)
San Diego State | NC State | NC State (54%)
Georgetown | Belmont | Belmont (57%)
Saint Mary’s | Purdue | Saint Mary’s (59%)
Kansas | Detroit | Kansas (93%)

Look at North Carolina’s probability in their first-round game. That's another 16 seed with a double-digit chance of winning. Also, if Vermont wins the play-in game, UNC wins 90% of the time, so let’s take the average and say it’s 89%. That means the probability of all the 1 seeds winning this year is .98*.94*.85*.89 = 70%. Put another way, the chance of at least one 16 seed winning is 30%. It has to happen sometime...right?
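If you want to check that arithmetic yourself, here's a quick Python sketch. It uses the rounded 1-seed probabilities from the tables and assumes the four games are independent. The variable names are mine.

```python
from math import prod

# North Carolina's opponent isn't known yet, so average the two cases.
unc_vs_lamar, unc_vs_vermont = 0.88, 0.90
unc_win_prob = (unc_vs_lamar + unc_vs_vermont) / 2

one_seed_win_probs = {
    "Kentucky": 0.98,
    "Michigan State": 0.94,
    "Syracuse": 0.85,
    "North Carolina": unc_win_prob,
}

# Chance that all four 1 seeds win, assuming the games are independent.
all_one_seeds_win = prod(one_seed_win_probs.values())

# Chance that at least one 16 seed pulls the upset.
at_least_one_upset = 1 - all_one_seeds_win

print(f"All 1 seeds win: {all_one_seeds_win:.1%}")
print(f"At least one 16 seed wins: {at_least_one_upset:.1%}")
```

Note that the 30% is the chance that at least one of the four 16 seeds wins, not the chance for any single one of them.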

Speaking of huge upsets, I’ve heard some analysts on ESPN say Detroit can upset Kansas. Well, they can upset Kansas, but it’s very unlikely. What is more likely is upsets almost everywhere else in the region. Especially...wait for it...Belmont!!!!!

The numbers love Belmont. LRMC currently has them ranked 9th. Now, is Belmont really the 9th best team in the country? Probably not, but we have already shown that the model works very well at predicting games. So I’m going for it in my bracket. Belmont over Georgetown!

Okay, now that that’s out of the way, let me touch on Michigan and Temple. It’s a shame Belmont isn’t the 13 seed in this bracket, because the model doesn’t like either of these teams. It favors California over Temple and gives Ohio a 39% shot of beating Michigan.

And we also have another 11 seed favored over a 6 seed. So in this region, the model likes 3 double-digit seeds to advance into the second round.

Midwest Region (Future Rounds)

Team 1 | Team 2 | Favorite
North Carolina | Creighton | North Carolina (70%)
Michigan | California | California (55%)
Michigan | Temple | Michigan (52%)
North Carolina | California | North Carolina (75%)
North Carolina | Michigan | North Carolina (78%)
NC State | Belmont | Belmont (76%)
Georgetown | NC State | Georgetown (71%)
Kansas | Saint Mary’s | Kansas (77%)
Kansas | Belmont | Kansas (64%)
Kansas | Georgetown | Kansas (71%)
North Carolina | Kansas | Kansas (56%)

What's that you say? Belmont favored in a potential second round game with NC State? Done and done! Belmont in the Sweet Sixteen!

North Carolina seems set up nicely for a run to the Elite 8. In fact, their toughest opponent on the way would be Creighton!

So not only does the model like California over Temple, it favors them over Michigan, too! But note that all of these probabilities are in the 50s. So really, a lot could happen in this part of the bracket.

On the bottom half of the bracket, Kansas is also set up nicely for a run to the Elite 8. If any region is going to match the 1 seed versus the 2 seed in the regional finals, this looks to be it.

Final Four and Championship Game

Team 1 | Team 2 | Favorite
Kentucky | Michigan State | Kentucky (65%)
Kentucky | Missouri | Kentucky (72%)
Wichita State | Michigan State | Michigan State (55%)
Wichita State | Missouri | Wichita State (52%)
Syracuse | North Carolina | North Carolina (56%)
Syracuse | Kansas | Kansas (62%)
Ohio State | North Carolina | Ohio State (61%)
Ohio State | Kansas | Ohio State (55%)
Kentucky | Ohio State | Kentucky (58%)
Michigan State | Ohio State | Ohio State (58%)
Kentucky | Kansas | Kentucky (62%)
Michigan State | Kansas | Kansas (54%)

We see that Kentucky would be favored in every matchup displayed in the table. Well, that's to be expected since they are ranked 1st in the LRMC rankings. Whoever comes out of the West Region is going to have their hands full if they meet Kentucky in the Final Four.

Ohio State is favored in every matchup except with Kentucky. Considering that they have the easiest region of any 2 seed, I like their chances of making it this far. And personally, I think an Ohio St-Kansas matchup in the semi-finals would be pretty awesome. Who wouldn't want to watch Jared Sullinger go toe-to-toe with Thomas Robinson?

My Bracket

And in the end, I present you with my bracket. I could have simply picked the team that was favored in every game, but what fun would that be? We all know that the favorite doesn't always win. So I've used the probabilities to make some educated guesses on the upsets. Also, I'm a sucker for upsets. But what upsets do you find most intriguing? Let us know!  And, most of all, enjoy the games!

Name: SportsFan • Tuesday, March 13, 2012

Would you be willing to share the head-to-head win percentages that you calculated for all the teams?

Name: Kevin • Wednesday, March 14, 2012

If I gave head-to-head win percentages for every possible matchup, that would just be too many games to try and list. However, are there any specific teams or matchups you're interested in? If you let me know, I can run them through the model and tell you what it says.

Name: Paul K • Wednesday, March 14, 2012

You just may have created a statistics spark in the eye of a daughter. Will be fun to watch where this goes (your picks and the daughter's interest in stats). Love this Minitab blog!