Ranting on the RPI

College basketball starts this week! It seems like it’s been a longer offseason than normal to me. I thought it would never get here…

I guess that’s just how it works when your team finishes (tied for) last in conference one year, but you expect big things from the upcoming season. MU has a top 10 ranked recruiting class, including the phenom Henry Ellenson, who many think will be a top 10 pick in this year’s NBA draft. So I’m super excited, but…

I’m a little concerned that we may have screwed ourselves when it comes to selection/seeding for the NCAA tournament. Our non-conference schedule is set up in such a way that we play a handful of games against quality opponents (Belmont, Iowa, LSU, NC State/ASU, Wisconsin), but then face a slew of total crap. I mean the real bottom feeders in division I. In December/January, we play 7 games against teams not expected to finish within the top 300. Barring some type of unpredictable catastrophe, we will win all those games. Easily.

But here’s the problem. The NCAA selection committee uses the RPI as a ranking tool to guide them when selecting/seeding the tourney. It’s a simple (really, simplistic) formula for rating teams, based on win percentage (25%), opponent’s win percentage (50%), and opponent’s-opponent’s win percentage (25%). And this rating system is flawed. The MU non-conference SOS (strength of schedule) will look terrible, and if you only look at the rating itself, it will look like MU didn’t play a single decent team.

I’m far from the first person to complain about the RPI and using it for selection/seeding. But I think it’s likely to anger me even more than usual this year due to my team’s schedule. And I want to add my own view, which I think differs somewhat from the most common sentiments.

The biggest criticism I’ve heard about the RPI is that it rewards achievement rather than performance – meaning a win is a win no matter how the 2 teams performed during the game. And it turns out that performance is a much better predictor than end result. The KenPom ratings are probably the most well-known system that rates teams based on performance. Another common criticism is that it weights SOS (75%) too much in the formula.

But there’s another flaw with the RPI that I believe is even worse. Because the rating for a team is an aggregation of win/loss percentage over all games, the actual results of individual games don’t even matter. And every single game is given equal weight, no matter how uninteresting the result. So, imagine a scenario where a bubble team, with a current RPI of 50, plays a bottomfeeder, with an RPI of 325. Let’s say the bubble team wins by 35 points. Expected result, right? This result wouldn’t change any sane person’s opinion of the bubble team. But what happens? The bubble team’s RPI is almost guaranteed to drop by a significant amount. That makes no logical sense. I refuse to believe in the validity of a system that punishes teams for winning games they are expected to win.

Another consequence of this is that tournament teams are rewarded for scheduling a slew of mediocre competition, while punished for playing some really good teams and a few really bad teams. The really bad opponents will kill your SOS, enough that playing those really good teams won’t be enough to balance it out. And if you schedule a bunch of games against good teams, you’ll probably lose some. Much safer, much better, to avoid playing teams that might beat you, as long as you also avoid the really terrible teams with atrocious records. No one can tell me this system is good for the game.

So what could we do to fix this problem? KenPom (and others) use performance, which includes margin of victory. But another way to deal with this issue, even without using scoring margins, is to weight games based on information.

When you weight games based on information, you’ll ignore results that tell you nothing you didn’t already know. Example: if UNC is currently ranked 5th, and they beat the 250th team, that result tells you nothing. So it should have no impact on the ratings. If, however, UNC loses to the 250th team, that tells you a lot, since it’s a big surprise, and should have a major impact. If UNC beats the 25th best team, that does tell you something, as #25 will sometimes beat #5, so that game should be given a moderate weight. But #25 beating #5 would be given an even higher weight, since it’s a more surprising result.