I have been compiling historical NCAA tournament data for years, and here are some numbers that might help you fill out your bracket.
1 vs 16
We have probably all seen the various first-round upset probabilities by seed. We all know that a 1-seed has never lost to a 16-seed. Will this be the year? Odds are no, but considering the general chaos of this season and the higher-than-normal amount of parity, maybe it IS a good year to make history. After all, we have two 1-seeds this year (Oregon and Virginia) with more losses outside of the RPI Top 50 than any 1-seed since at least 2011. Oregon’s worst loss is to UNLV, a team with an RPI of 147. If Southern beats Holy Cross, the Ducks will face a team with an RPI of 185. Even Kansas lost to Oklahoma State (RPI = 172). What’s Austin Peay’s RPI? 190. Gulp.
Other First Round Trends
The historical average number of first-round upsets since 1985 (the 64+ team era) is exactly 8.0, with a standard deviation of about 2.3. Upsets of 2-seeds happen on average about once every 4 years (0.23 per year), and there have already been 3 since 2012, so another one this year seems unlikely. But if you are curious, right now the Vegas lines for the Xavier and Oklahoma games are around -12, while the lines for MSU and Nova are around -18.
Upsets of 3-seeds and 4-seeds happen at fairly similar rates: about 0.65 and 0.81 per year, respectively. So, odds are we will see at least one. The Vegas lines also show some interesting clustering here, as Texas A&M, Kentucky, and Miami are all favored by around 14. Duke and Utah are favored by 9-10, Iowa State is favored by 8, and California and West Virginia (surprisingly) are only favored by 7.
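As a rough check on the "at least one" claim, here is a minimal Python sketch. It assumes (my simplification, not something the historical data proves) that every game is independent and that each of the four games on a seed line has the same upset probability, namely the per-year rate divided by 4. The function name is purely illustrative.

```python
def prob_at_least_one_upset(rates_per_year, games_per_seed_line=4):
    """Chance of at least one upset across several seed lines.

    Assumes every game is independent and that each game on a seed line
    has upset probability = (upsets per year) / (games per year).
    """
    prob_no_upset = 1.0
    for rate in rates_per_year:
        per_game = rate / games_per_seed_line
        prob_no_upset *= (1.0 - per_game) ** games_per_seed_line
    return 1.0 - prob_no_upset


# Historical rates quoted above: 0.65 upsets/year for 3-seeds, 0.81 for 4-seeds.
p = prob_at_least_one_upset([0.65, 0.81])
print(f"P(at least one 3- or 4-seed upset) = {p:.2f}")
```

Under those assumptions the chance of at least one 3- or 4-seed going down comes out to roughly 80%. Feeding in just the 2-seed rate, `prob_at_least_one_upset([0.23])`, gives a bit over 20%.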
Upsets on the 5- and 6-lines are obviously more common. Historically there are about 1.4 upsets per year on each line, so we can expect around 3 total between the two. Two of the 8 games have no spread yet, as two of the four 11-seeds are in the First Four, but the current spread for the Seton Hall – Gonzaga game is a pick’em, and Baylor is only a 5.5-point favorite over Yale while the other 5-seeds are at least 9-point favorites. Texas is also only a 4.5-point favorite over N. Iowa, and in general Vegas seems down on the Pac 12, which makes me wonder about Arizona’s chances against Wichita or Vandy.
Once you get to the 7- and 8-lines, we start dipping into coin-flip territory. There are over 1.5 upsets per year in the 7-10 games and basically 2 upsets per year in the 8-9 games. Interestingly, all four 9-seeds are currently favored in Vegas, and VCU is favored by 4.5 over Oregon State. The only 7- or 8-seed that seems like a good bet? Ironically, it’s Iowa as a 7.5-point favorite over Temple.
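Adding up the per-seed rates quoted above is a quick sanity check against the overall first-round average of 8.0. The sketch below uses the rounded figures from the text (treating "over 1.5" as 1.5 and "basically 2" as 2.0, and counting a 9-over-8 win as an upset), so the total is only approximate.

```python
# Average first-round upsets per year by matchup, using the rounded
# figures quoted in the text (approximations, not the raw data).
upsets_per_year = {
    "1 vs 16": 0.0,   # has never happened
    "2 vs 15": 0.23,
    "3 vs 14": 0.65,
    "4 vs 13": 0.81,
    "5 vs 12": 1.4,
    "6 vs 11": 1.4,
    "7 vs 10": 1.5,   # text says "over 1.5"
    "8 vs 9": 2.0,    # text says "basically 2"
}

total = sum(upsets_per_year.values())
print(f"Expected first-round upsets per year: {total:.2f}")

# One standard deviation either side of the historical mean of 8.0.
mean, sd = 8.0, 2.3
print(f"Typical range: {mean - sd:.1f} to {mean + sd:.1f} upsets")
```

The rounded rates sum to 7.99, which lines up with the 8.0 average, and one standard deviation either side gives a typical range of about 5.7 to 10.3 upsets.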
Beyond Round One
In the round of 32, the average number of upsets by seed is 4.7 with a standard deviation of 2.15. As a general rule, the more upsets in the 1st round, the fewer you see in the 2nd round, and vice versa. This makes sense: if weaker teams survive the 1st round, they typically get eliminated in the 2nd round. I refer to this as “the law of the conservation of upset.” Beyond this, I have some other historical data that I find interesting. A year or two ago, I realized that there is a surprisingly strong correlation between the probability of an upset and the difference between the seeds of the two teams. That correlation is shown here, where the size of each diamond reflects the frequency of that potential match-up.
Here are some general thoughts:
• 1-seeds are less likely to get upset than other seeds. Those data points are generally above the line for pretty much all combinations until you get to a 1-2 match-up, where it becomes a coin flip.
• 2-seeds do better than expected against 7-seeds, but much worse against 10-seeds. I don’t have a good explanation for this one.
• 3-seeds kind of suck. The 3-6 match-up is basically a coin flip and 3-seeds do worse than expected against both 1- and 2-seeds.
• 4-seeds seem to have more of an advantage over 5-seeds than you would expect, and they definitely do better than expected against 1-seeds.
Final Four
Based on all this, it is no surprise that 1-seeds are the most likely to make the Final Four, but they only make up 40% of the historical participants. 20% are 2-seeds, 12% are 3-seeds, 8% are 4-seeds, and 18% are something lower than that. Once teams make the Final Four, you can almost throw the seeds out the window, as the seed probabilities for the Championship Game and the eventual Champion mirror the probabilities of making the Final Four in the first place. In other words, the games are more coin-flip-like once you get to the last weekend. That being said, the 1-seeds still do outperform the other seeds (they make up 46% of the title game participants and 57% of all Champions). Finally, 4-seeds generally suck in the Final Four (2-10 all time in the national semis) and no team seeded lower than 8 has ever made a title game.
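To put those shares in more concrete terms, here is a small Python sketch that converts them into expected team counts per Final Four and works out the 4-seeds' semifinal win rate. The numbers are the ones quoted above; the quoted shares add up to 98% rather than 100%, presumably from rounding (that part is my guess).

```python
# Share of historical Final Four participants by seed, as quoted in the text.
final_four_share = {
    "1-seed": 0.40,
    "2-seed": 0.20,
    "3-seed": 0.12,
    "4-seed": 0.08,
    "5-seed or lower": 0.18,
}

# Four teams make each Final Four, so expected count = share * 4.
for seed, share in final_four_share.items():
    print(f"{seed}: {share * 4:.2f} teams per Final Four on average")

# 4-seeds are 2-10 in national semifinals.
wins, losses = 2, 10
four_seed_semi_win_rate = wins / (wins + losses)
print(f"4-seed semifinal win rate: {four_seed_semi_win_rate:.1%}")
```

So a typical Final Four has about 1.6 of the four 1-seeds in it, and a 4-seed that gets there has won its semifinal only about one time in six.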
Enjoy!