As Selection Sunday approaches, amateur and “professional” bracketologists alike can’t stop talking about resumes. Which teams are “in” and which teams are “out” of the NCAA Tournament? How can we tell? How do we fairly compare a team from the Big Ten with a team from the Atlantic 10? Bracketologists discuss records, quality wins, and bad losses. But recently, various metrics have increasingly been used to judge whether a team does or does not deserve a ticket to the Big Dance.
When it comes to college basketball metrics, there are a lot to choose from, and from a team point of view there are generally two types: prediction-based and results-based.
Prediction-based metrics, such as Kenpom efficiency margins, are designed to predict how a team is likely to perform in the future, based on how they have performed in the past. Kenpom’s method measures how many points a team scores or gives up per possession, and then makes an adjustment based on the quality of the opponent.
In my opinion, the beauty and power of Kenpom’s method is that it is simple, easy to explain, and it correlates very well to point spreads, which are the most robust way to predict the outcome of a given contest. A table of Kenpom efficiency margins actually means something real.
For example, Michigan State’s adjusted efficiency margin is 14.22. What this means is that if MSU were to play an average Division I team (like Iowa State, with an efficiency margin of -0.08 and a rank of 175 out of 357 teams), the Spartans would be expected to outscore the Cyclones by about 14 or 15 points over 100 possessions on a neutral court.
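That back-of-the-envelope prediction can be sketched in a few lines of code. This is my own illustration of the arithmetic described above, not Kenpom’s actual code; the function name and tempo scaling are my assumptions, with adjusted efficiency margins expressed in points per 100 possessions.

```python
# Sketch (my own illustration, not Kenpom's code): predicted scoring margin
# from two adjusted efficiency margins, which are expressed in points per
# 100 possessions on a neutral court.

def predicted_margin(adj_em_a: float, adj_em_b: float, possessions: float = 100.0) -> float:
    """Expected margin for team A over team B across the given possessions."""
    return (adj_em_a - adj_em_b) * possessions / 100.0

# MSU (+14.22) vs. an average team like Iowa State (-0.08) over 100 possessions:
print(round(predicted_margin(14.22, -0.08, 100), 1))  # prints 14.3
```

At a typical tempo of around 70 possessions, the same calculation would shrink the predicted margin proportionally, which is why the article quotes a 100-possession figure.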
It has a clear, tangible meaning in the real world, and it makes predictions about the future that are accurate, on average. For me, this makes Kenpom efficiency margins the gold standard for prediction-based metrics in college basketball; there is little reason to look at any other predictive metric.
Results-based metrics, however, are designed to measure something slightly different. One of the main drawbacks of Kenpom is that it does not strictly consider winning or losing. MSU’s own Kevin Pauga explained the difference on Twitter using the following analogy:
From an efficiency point of view, moving from a margin of victory of -2 to +2 is just as good as moving from a margin of victory of +2 to +6. But, since the goal of a basketball game is to score more points than the opponent in 40 minutes, moving from -2 to +2 is obviously far more important than moving from +2 to +6.
When it comes down to selecting at-large teams for the NCAA Tournament, wins and losses do matter. So while Kenpom efficiencies generally do predict how a team will perform in the NCAA Tournament, they are not the best metric to use in selecting or seeding teams.
That said, creating a results-based metric with the same beauty and simplicity as Kenpom efficiency has proven difficult. For years, the NCAA Selection Committee used the “RPI” in an attempt to compare teams. The RPI had a simple and transparent formula, but it was based on a combination of the winning percentages of a given team’s opponents and of those opponents’ opponents.
The RPI sort-of worked, but when it comes down to it, mathematical manipulations of opponent winning percentages have no real meaning. Furthermore, the RPI seemed to give strange results most years. But it was easy to calculate from simple win and loss records, and despite its flaws, it was a tool used by the Selection Committee for decades.
Then, in 2018, the NCAA introduced a new, “improved” metric that they called the NET. The formula has not been released publicly to my knowledge, but it seems to have been “improved” by adding additional factors such as Kenpom-like efficiencies to the existing (and dubious) calculations involving opponents’ win percentages. The tweet below gives a brief overview of the NET for what it is worth:
Unfortunately, in my mind, what the NCAA has done is simply create a more complicated, less transparent, and equally mathematically dubious metric. Furthermore, there is already enough evidence to suggest that it has its problems. The most obvious case is that of Colgate University this year.
The Colgate Raiders are a Patriot League team who are currently sitting at 11-1 and are ranked No. 9 in the NET and No. 6 in the RPI. But a quick glance at their schedule is... underwhelming. The Raiders have played exactly three unique teams: Army, Holy Cross, and Boston University.
To their credit, Colgate swept Holy Cross and Boston and went 3-1 against Army. That said, no rational person believes that Colgate has the ninth most impressive resume in the country. If that is not what the NET is designed to measure, then I am not sure why we are even using it.
In other words, the only thing about the NET that I agree with is that it is properly named. It is full of holes and does not hold water.
So the question that remains is, is it possible to construct a better metric, preferably one with the simple transparency and quantitative strength that Kenpom efficiencies provide? Fortunately, I think that we can.
A “NEW” Way to Quantify Results
If we try to create a better metric, it is most important to think carefully about what we are trying to measure. A good metric should be a tool to compare teams, based on wins and losses, on a level playing field. It should be as simple as possible, and the answer that it gives should be quantitative.
In this context, by “quantitative” I mean that if the metric spits out a number “3” that number should mean something and it should be 50 percent better than “2.” Ideally, this metric should have a unit, such as “wins.”
The first step in creating such a metric is to figure out a way to normalize the schedules of different teams. The approach that I used was to exploit the concept of expected value. It is similar to the method that I used to calculate the strengths of schedule of Big Ten teams throughout this season, based on Kenpom efficiency. Kenpom uses a similar method to calculate strength of schedule.
In this case, I took the schedule of each team in question, and I calculated the odds that an average high-major team, with a fixed Kenpom efficiency margin, would win each game. I used an efficiency margin of 19.00 (approximately as good as a team like Rutgers) as a trial. The sum of the single-game win probabilities equals the number of wins that an average high-major team would be expected to accumulate against any given schedule.
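The expected-wins calculation above can be sketched as follows. The specific win-probability model here is my own assumption, not the article’s: it treats a single game’s scoring margin as normally distributed around the efficiency-margin difference, with an 11-point standard deviation as a rough rule of thumb. The sum of those per-game probabilities is the normalized expected-win total.

```python
# Illustration of the expected-wins idea (my own sketch, not the author's
# exact model). Win probability for a reference team with a fixed efficiency
# margin is modeled with a normal distribution on the game's scoring margin;
# the ~11-point standard deviation is an assumed rule of thumb.
from math import erf, sqrt

REF_EM = 19.00      # reference "average high-major" efficiency margin (from the article)
MARGIN_SD = 11.0    # assumed std. dev. of a single game's scoring margin

def win_prob(ref_em: float, opp_em: float, sd: float = MARGIN_SD) -> float:
    """P(reference team beats opponent) on a neutral court."""
    expected_margin = ref_em - opp_em
    # Normal CDF evaluated at the expected margin
    return 0.5 * (1.0 + erf(expected_margin / (sd * sqrt(2.0))))

def expected_wins(opponent_ems: list[float], ref_em: float = REF_EM) -> float:
    """Expected wins = sum of single-game win probabilities over a schedule."""
    return sum(win_prob(ref_em, em) for em in opponent_ems)

# A toy four-game schedule of opponents with these efficiency margins:
schedule = [25.0, 10.0, -5.0, -15.0]
print(round(expected_wins(schedule), 2))
```

Any reasonable margin-to-probability conversion (logistic, pythagorean, etc.) could be swapped in; the key idea is only that expected wins are additive across the schedule.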
For example, imagine a schedule of a mid-major team like Loyola-Chicago. Let’s now assume that if an average Power Five team were to play that schedule, it would be expected to win an average of 80 percent of the games, based on the Kenpom efficiency margin of each opponent on that schedule. Let’s assume that there are a total of 20 games. In this example, the average Power Five team would be expected to go 16-4 with Loyola’s schedule (as 80 percent times 20 equals 16).
But, let’s now think about a schedule of a high-major team like Michigan State. The same arbitrary, Rutgers-like Power Five team might only be expected to win 50 percent of the games on MSU’s schedule. If the schedule was also 20 games, then that Power Five team would only be expected to go 10-10 with that schedule.
Now that the schedules have been effectively “normalized,” it is possible to measure how each team did relative to the projected performance of the reference average Power Five team. In the previous example, if Loyola were to have won 17 games on their schedule, they would be “+1.” In other words, they would have won one more game than the reference team would have been expected to win.
If Michigan State were to go 12-8 on their schedule, the Spartans would be “+2” wins. In this situation, it would be reasonable to conclude that MSU’s results on its schedule were more impressive (by one win) than the results achieved by Loyola on its schedule, once the schedules were normalized.
If teams play the same number of games, it is fair to simply use the raw difference between actual wins and normalized expected wins as a metric. But it is fairer to divide this number by the total number of games on the schedule to get a marginal win percentage. In the example above, Loyola is +0.05 and MSU is +0.10. In this case, the unit is the percentage of wins relative to the expectation of an average Power Five team.
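The final metric described above is a one-line formula. Here is a minimal sketch using the worked Loyola and MSU numbers from the text (the function name is my own):

```python
# Sketch of the marginal win percentage described above: actual wins minus
# normalized expected wins, divided by games played.

def new_index(actual_wins: int, expected_wins: float, games: int) -> float:
    """Marginal win percentage vs. the reference team's expectation."""
    return (actual_wins - expected_wins) / games

# Loyola: 17 actual wins vs. 16 expected over 20 games -> +0.05
# MSU:    12 actual wins vs. 10 expected over 20 games -> +0.10
print(new_index(17, 16.0, 20))  # prints 0.05
print(new_index(12, 10.0, 20))  # prints 0.1
```

Dividing by games played is what makes teams with different schedule lengths directly comparable, which matters in a season (like 2021) when cancellations left schedules uneven.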
Note that while this metric uses Kenpom efficiencies to normalize the schedule difficulty, the actual Kenpom efficiency margins of the teams in question (Loyola and MSU, in this example) do not appear in the calculation at all. In this way, this new metric is complementary to, but does not overlap with, the information that Kenpom provides. The two metrics provide different information.
Since all good sports metrics have snappy acronyms, I have decided to name this one “PReNEW” for “Performance Relative to Normalized Expected Wins.” But I think that I will just call it the NEW Index for short. While I can’t say that I have exhaustively reviewed every results-based metric out there, I have not seen anything quite like this one. I think that it has significant potential value.
Applying the NEW metric to the 2021 Field
As an exercise, I decided to apply the NEW Index to the current NCAA field. I did not make the calculation for every Division I team, but I did make it for every team currently in the Top 100 of the NET as well as all teams in the six high-major conferences.
As an experiment, I decided to compare the results of this calculation to the predictions of the two most prominent online bracketologists: Joe Lunardi of ESPN and Jerry Palm of CBS. Table 1 below gives the NCAA Tournament resumes of the top 16 teams, according to ESPN.
Table 1: Top 16 projected teams in the NCAA Tournament based on ESPN, CBS, and the NEW Index
In the center of the table, I show the marginal percentage version of the NEW Index, as well as the NEW rank, which I use as a seed list. Next to the NEW rank is the current seed list posted by ESPN and CBS as of the morning of March 7. The current NET and Kenpom rankings are also given in the table.
In my opinion, the NEW Index does a remarkable job of projecting the top 16 teams of the field relative to these experts. The top seven teams in the NEW Index appear on the No. 1 and No. 2 seed lines of both ESPN and CBS’ current projections.

Furthermore, the NEW Index differs from the other two projections of the top 16 on only two teams: West Virginia and Texas, both of which project as No. 5 seeds by the NEW Index. In contrast, ESPN’s current top 16 has five teams outside of the current NET top 16 and four teams not in the Kenpom top 16.
As for the teams not on the first four lines, but who are considered safely in the NCAA Tournament, those teams are shown in Table 2, sorted based on the approximate s-curve in Joe Lunardi’s most recent bracket.

Table 2: Additional "safe" teams projected for the NCAA Tournament based on ESPN, CBS, and the NEW Index
Table 3: Teams currently on the NCAA Tournament bubble based on ESPN, CBS, and the NEW Index
The teams that are currently considered to be “on the bubble” are summarized below in Table 3. Note that if the ranking is shown in blue, the team is currently projected to be “in” the tournament. If the team’s rank is a shade of red, that team is currently projected to be “out.” I adjusted the shade of the color to highlight rankings that differ notably from the other two.
A significant advantage of the NEW Index is that it does not appear to have any “Colgate-like” anomalies. The Raiders currently rank No. 52 in the NEW Index. That said, there are a few surprises. For example, the NEW Index projects BYU as a No. 4 seed (instead of a No. 7 seed), Wichita State as a No. 6 seed (instead of a No. 10 or No. 12 seed), and Drake as a safe No. 7 seed instead of squarely on the bubble.
On the other side of the coin, there are a couple of teams that the NEW Index seems to not favor. Texas Tech and Oklahoma only project as a No. 11 and a barely-in-the-tournament No. 12 seed right now, instead of the cozy No. 5 to No. 7 seeds where ESPN and CBS place them. That said, both of those teams have over 40 percent of their wins against quad-four opponents, which is a notably higher share than most other tournament teams. Maybe the NEW Index is simply onto something...
Somewhat shockingly, the NEW Index’s predictions for the bubble teams match those of the expert prognosticators quite well. The NEW Index only disagrees on a handful of teams. The NEW metric currently has SMU and Syracuse in the Tournament instead of Boise State and Xavier.
As for Michigan State, the NEW Index is a bit more positive on the Spartans as well. It currently ranks the Spartans No. 38, somewhat comfortably away from the bubble as a No. 10 seed. ESPN currently lists MSU as the top team in the “last four in” group, while this morning CBS moved MSU out of the First Four and into the main bracket as a No. 11 seed.
If it were up to me, I would rely on three main tools in selecting teams for the NCAA Tournament: a predictive metric such as Kenpom, a results-based metric such as the NEW Index, and the good, old-fashioned eye test that considers record, quality wins, bad losses, and other intangibles such as injuries and (in 2021) the impact of COVID.
There simply is no single metric or mathematical formula that can tell us if a team is “in” or “out,” and frankly, that is a good thing. If there were, we wouldn’t need a committee at all. With this in mind, creating an increasingly complex metric, such as the NET, has little value, and it should be phased out. I believe that it is time for a NEW way of thinking.