Dr. G&W's March Madness Analysis: Quantifying the Madness
It has been almost two full months since the 2021-2022 college basketball season came to a close with the Kansas Jayhawks’ come-from-behind victory over the North Carolina Tar Heels in the national championship game. But data does not have an offseason.
So far this spring, we have examined the NCAA Tournament résumés of the top men’s basketball college head coaches over the past 40 years. We have found that just based on the raw numbers, Michigan State head coach Tom Izzo is among the best. When it comes to performance compared to expectation, Coach Izzo is the best of all time.
We also explored and quantified the difficulty of both the draw and actual path of several notable teams over the past 20 tournaments. Interestingly, the data suggests that the 2022 Jayhawks had the easiest path to a national title of any team since 2002. Ironically, the 2021 Baylor Bears had the most difficult path in the same timeframe.
Finally, it is time to wrap up this series with a deep dive into the overall odds of the NCAA Tournament as well as the odds of picking a perfect bracket in an office pool. As we shall see, it is possible to quantify the Madness of March.
Overall Tournament Odds
Throughout this series, and in my annual NCAA Tournament preview, I have outlined a variety of tools that can be deployed in order to gain a deeper understanding of the way the tournament actually works. Almost all of them hinge on the use of Kenpom efficiency data to project point spreads and victory probabilities for any arbitrary tournament matchup.
Using these tools, it is possible to calculate the odds for any team to win any of the NCAA Tournaments back to 2002 when Kenpom began tracking this data. When all of this data is taken together, a big picture emerges as to the chances for any team to cut down the nets. Figure 1 below summarizes this data.
Figure 1: Odds for every NCAA Tournament team to win the National Title for the 2002-2022 seasons using both a linear scale (left) and a log scale (right)
As we can see, the best pre-tournament odds of any team in the last 20 years are just slightly better than 35 percent, which were the odds that Gonzaga had prior to the 2021 tournament. Other notable teams whose odds were greater than 25 percent include the 2002 Duke team, the 2015 Kentucky team, the 2008 Kansas team and the 2019 Virginia team.
Note that the differences in odds shown in Figure 1 for teams with similar pre-tournament Kenpom efficiencies are due entirely to differences in the tournament draws for each team. This topic was covered in detail in the previous installment of this series.
Of the seven total teams whose odds were greater than 25 percent entering the tournament, only two of those teams (Kansas in 2008 and Virginia in 2019) actually won the national title, which is right on the expected value of 2.13, based on the calculated odds. In other words, the #math checks out.
The bottom line is that winning the NCAA Tournament is hard. Even teams that finish the season with a Kenpom efficiency margin of +30.0 or greater average just one-in-five odds of cutting down the nets. A historically average No. 1 seed has odds of only 14 percent.
Lower-seeded teams have much worse odds. The right panel of Figure 1 shows the same data, but plotted on a log scale. Interestingly, the total span of championship odds between the best and worst teams of the last 20 years extends across 14 orders of magnitude.
For those scoring at home, the team with the estimated worst odds in the past 20 years was the 2005 No. 16 seed Alabama A&M team, which lost to No. 16 Oakland in the play-in game. My math gave the Bulldogs a one-in-97 trillion chance to win the national title.
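For readers curious about how an efficiency margin might become a victory probability, as described at the top of this section, here is a minimal sketch. The exact conversion is not spelled out in this piece, so everything here is an assumption for illustration: the function name, the 68 possessions per game, and the 11-point standard deviation on the final margin are placeholder choices, not Kenpom's values or my actual model.

```python
import math

def win_probability(margin_a, margin_b, possessions=68.0, sigma=11.0):
    """Estimate the chance that team A beats team B on a neutral floor.

    margin_a, margin_b: efficiency margins in points per 100 possessions.
    possessions, sigma: illustrative assumptions (game tempo and the
    standard deviation of the final score margin around the spread).
    """
    # Projected point spread: scale the efficiency gap to one game's possessions.
    spread = (margin_a - margin_b) * possessions / 100.0
    # Normal CDF evaluated at spread / sigma.
    return 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))

# A hypothetical +30 team against a hypothetical +5 team.
print(f"{win_probability(30.0, 5.0):.3f}")
```

Chaining probabilities like this through every possible path in a bracket is what produces a championship probability for each team.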
Perfect Bracket Odds
Over the years, many people have dreamed of winning their NCAA Tournament “office” bracket by somehow picking the results of all 63 games correctly (not counting the play-in round). Naturally, this has led many people to attempt to calculate those odds. The internet has a lot of articles that attempt this calculation. Most of them are wrong.
The simplest way to make this calculation is to assume that all 63 games are coin flips and each team has a 50 percent chance to win each game. If this were the case, the odds of picking the perfect bracket would be about one-in-9.2 quintillion, which is the number that is cited most frequently. But that figure is actually a form of upper bound: the real odds are considerably better.
The reason is that not all games are toss-ups. No. 1-seeded Kansas did not have a 50 percent chance to beat No. 16-seeded Texas Southern this spring. Kansas’ odds were closer to 97 percent. In other words, the coin that we use to make the calculation in the previous example is loaded. It would only be a “fair” coin in the extreme case.
As it turns out, the true odds to pick a perfect bracket are based on a certain weighted average (technically the geometric mean) of the odds for the favored team to win each tournament game. This weighted average is a function of the specific strengths of each team in any given tournament, which means that the odds of picking all of the games correctly vary from year to year.
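To see why the geometric mean is the right kind of average here, consider a minimal sketch. The seven probabilities below are hypothetical values for a tiny seven-game bracket, not data from any real tournament, and the helper name is my own.

```python
import math

def geometric_mean(probs):
    """Geometric mean, computed through logs so long products do not underflow."""
    return math.exp(sum(math.log(p) for p in probs) / len(probs))

# Hypothetical probability that each of seven picks is correct.
game_probs = [0.97, 0.55, 0.80, 0.70, 0.85, 0.60, 0.65]

# Chance of getting every game right: the product of the individual chances.
bracket_prob = math.prod(game_probs)

# The single "loaded coin" that gives the same result when flipped seven times.
coin = geometric_mean(game_probs)

print(f"bracket probability: {bracket_prob:.4f}")
print(f"coin weighting:      {coin:.4f}")
print(f"coin ** 7:           {coin ** len(game_probs):.4f}")
```

The first and last printed values agree, which is the whole point: one number, raised to the number of games, summarizes the entire bracket.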
The actual weighted average is around 58 percent (and not 50 percent) based on data for the past 20 tournaments. This value tells us that the real odds to pick a perfect bracket are closer to one-in-540 trillion. That is still a really large number, but a perfect bracket is about 17,000 times more likely than the value that most people reference.
It is also possible to calculate the lower bound for the odds to pick a perfect bracket. These odds occur in the scenario where the favored teams win all 63 games in the NCAA Tournament. Effectively, the tournament would proceed according to “chalk.”
In this scenario, the weighted average of the hypothetical coin is closer to 68 percent, on average. Based on this value, the lower bound for the odds to select the perfect bracket has averaged about one-in-49 billion over the past 20 tournaments. The real odds are about 11,000 times worse than this lower bound.
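The numbers in this section can be checked with a few lines of arithmetic. The sketch below converts between a "coin" weighting and one-in-N odds over 63 games; the function names are my own, and the inputs are the rounded figures quoted above, so the outputs are only approximate.

```python
import math

GAMES = 63  # games in the main bracket, not counting the play-in round

def one_in_n(coin, games=GAMES):
    """One-in-N odds of a perfect bracket if every pick hits with probability `coin`."""
    return 1.0 / coin ** games

def implied_coin(n, games=GAMES):
    """Per-game weighting (geometric mean) implied by one-in-n odds."""
    return math.exp(-math.log(n) / games)

coin_flip = one_in_n(0.5)  # 2**63, about 9.2 quintillion
actual = 540e12            # one-in-540 trillion
chalk = 49e9               # one-in-49 billion

print(f"coin-flip odds:      one-in-{coin_flip:.3e}")
print(f"actual coin weight:  {implied_coin(actual):.3f}")  # roughly 0.58
print(f"chalk coin weight:   {implied_coin(chalk):.3f}")   # roughly 0.68
print(f"coin flip vs actual: {coin_flip / actual:,.0f}x")  # about 17,000
print(f"actual vs chalk:     {actual / chalk:,.0f}x")      # about 11,000
```

Note how sensitive the result is to the coin weighting: moving from 58 percent to 68 percent per game changes the final odds by four orders of magnitude.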
Perfect Bracket Odds Over the Years
Now that we have established that the odds of a perfect bracket have clear bounds and differ from year to year, it is time to visualize what these odds have looked like over the years. Figure 2 provides this summary.
Figure 2: Actual odds of a perfect bracket compared to the "chalk" bracket where the favorite teams win each contest and the average odds resulting from a series of Monte Carlo simulations of each tournament.
As we can see from the orange bars, the “coin” weighted average is between 65 and 70 percent for the most likely, “chalk” brackets. This translates to odds between approximately one-in-1 billion and one-in-1 trillion. The best possible odds of a perfect bracket would have been in 2015 using a strategy of picking all of the Kenpom favorites to win all 63 games. In that scenario, the odds of being correct were one-in-4.3 billion.
When each tournament was then simulated, the odds dropped significantly, as shown by the striped green bars. Over the past 20 tournaments the geometric average of the odds for a perfect simulated bracket ranged from a high of one-in-60 trillion in 2015 to a low of one-in-10 quadrillion in 2006. Note that the “chalk” data and the simulated data are highly correlated.
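The comparison between the "chalk" bracket and the simulated brackets can be sketched on a toy scale. The code below is not my actual simulation: the eight-team field, the ratings, the 11-point standard deviation and all function names are hypothetical, chosen only to show the mechanics of scoring one bracket and then averaging over many simulated ones.

```python
import math
import random

def win_prob(rating_a, rating_b, sigma=11.0):
    """Normal-model chance that A beats B; sigma is an assumed value."""
    return 0.5 * (1.0 + math.erf((rating_a - rating_b) / (sigma * math.sqrt(2.0))))

def play_bracket(ratings, first_team_wins):
    """Play a single-elimination bracket and return the log-probability of the result.

    ratings: team ratings in bracket order (length must be a power of two).
    first_team_wins(p): decides each game, given the first team's win probability p.
    """
    alive = list(ratings)
    log_prob = 0.0
    while len(alive) > 1:
        survivors = []
        for a, b in zip(alive[0::2], alive[1::2]):
            p = win_prob(a, b)
            if first_team_wins(p):
                survivors.append(a)
                log_prob += math.log(p)
            else:
                survivors.append(b)
                log_prob += math.log(1.0 - p)
        alive = survivors
    return log_prob

# Hypothetical eight-team field, listed in bracket order.
ratings = [28.0, 2.0, 14.0, 12.0, 20.0, 8.0, 17.0, 10.0]
games = len(ratings) - 1

# "Chalk": the favorite wins every game.
chalk_log = play_bracket(ratings, lambda p: p >= 0.5)

# Monte Carlo: each game is decided by a random draw against its win probability.
rng = random.Random(2022)
sim_logs = [play_bracket(ratings, lambda p: rng.random() < p) for _ in range(20000)]
mean_sim_log = sum(sim_logs) / len(sim_logs)  # log of the geometric mean

for label, value in (("chalk", chalk_log), ("simulated", mean_sim_log)):
    print(f"{label}: one-in-{math.exp(-value):.1f}, "
          f"coin weight {math.exp(value / games):.3f}")
```

Averaging the log-probabilities and then exponentiating is what makes this a geometric average, which keeps a handful of extremely unlikely simulated brackets from dominating the result.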
It is interesting to note that the odds of selecting a perfect bracket were better in years such as 2015, 2019 and 2021. The odds were worse in years such as 2003 and 2006. In the previous piece in this series, I pointed out that the former set of years were ones where the bracket was particularly strong and the latter years were ones where the bracket was particularly weak.
As a general rule, a stronger bracket should result in fewer upsets and it will therefore be more predictable with better odds to pick the perfect bracket. While there is a correlation between the simulated odds and the actual odds of a perfect bracket, that correlation is quite weak. As the dotted green bars show, the actual odds of correctly picking the results of all 63 games have varied between one-in-3.2 trillion (in 2019) and one-in-350 quadrillion (in 2022).
A comparison of the simulation odds and the actual odds essentially provides a way to quantify the Madness of March. In the years when the actual odds are better than the average of the simulations (such as 2007, 2008 and 2019), the tournament tended to have fewer total upsets and a larger number of higher seeds advancing to the Final Four. For example, 2008 is the only year in history in which all four No. 1 seeds advanced to the Final Four.
The opposite is true for the years where the actual odds are significantly worse than the simulated odds. In those years there was an above-average amount of Madness due to a large number of upsets, the occurrence of major upsets (such as a No. 15 seed beating a No. 2 seed) or both. These years also tend to result in lower seeds advancing to the Final Four.
To highlight a few examples, in 2011 a No. 8 seed (Butler) and a No. 11 seed (VCU) made the Final Four. In 2018, No. 1 seed Virginia lost to No. 16 seed UMBC (University of Maryland, Baltimore County) and a No. 11 seed (Loyola Chicago) made the Final Four. In 2021, No. 2 seed Ohio State lost in the first round and No. 11 seed UCLA made the Final Four. In 2022, No. 15 seed Saint Peter’s made the Elite Eight and No. 8 North Carolina reached the final game.
When it comes to unlikely events in the NCAA Tournament, No. 1 seed Virginia’s loss in the first round to No. 16 seed UMBC is usually the event cited as being the most “Mad.” However, the statistics (based on the Vegas spread) suggest that this type of upset should occur in about one percent of all games between a No. 1 seed and a No. 16 seed. With four such games every year, we should expect to see a No. 1 seed go down about once every 25 years.
However, a No. 15 seed advancing to the regional finals (as Saint Peter’s did this year) has odds of roughly 0.18 percent, or one-in-550. With four No. 15 seeds in each field, this type of event should happen only about once every 140 tournaments. The math suggests that no one alive today is likely to witness such an improbable NCAA Tournament run again.
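The "once every N tournaments" figures follow from simple arithmetic, sketched below. The function name is my own, and the calculation assumes that each of the four teams on a given seed line has the same small, independent chance each year.

```python
def tournaments_between(prob_per_team, teams_per_tournament=4):
    """Expected number of tournaments between occurrences of a rare event."""
    expected_per_tournament = prob_per_team * teams_per_tournament
    return 1.0 / expected_per_tournament

# A No. 16 seed beating a No. 1 seed: about one percent per game, four games a year.
print(f"{tournaments_between(0.01):.1f}")

# A No. 15 seed reaching the Elite Eight: about one-in-550 per team, four teams a year.
print(f"{tournaments_between(1 / 550):.1f}")
```

The second value comes out a little under 140, which is where the rounded figure in the text comes from.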
That said, the magical run of the Saint Peter’s Peacocks is still far more likely than anyone ever predicting a perfect NCAA Tournament bracket.
With that, it is time to finally put a bow on the college basketball season. Until next time, enjoy, and Go Green.