A Closer Look at 3-pt Defense

Photo by Brian Georgeson

With the opening weekend of the Big Dance approaching and Marquette finally back where it belongs, I wanted to dig into a stat that will be very relevant to us in Greenville, SC: 3-point defense.

Our first-round matchup is against South Carolina, a team that sports one of the highest-ranked 3-point defenses in the nation. I’ve seen some fans on #mubb Twitter question whether this ranking is deserved, given the relatively poor 3-point offenses in the SEC. I’ve been playing around this week with the 2017 NCAA Division I men’s basketball game data provided by Kaggle, so I thought I’d write some code to do a brief analysis on the subject.

My method is as follows: For each game a team played, compute:

  • Opponent’s 3-pt % for that game
  • Opponent’s season 3-pt % after removing its games against the team of record (the team whose defense is being evaluated)

I can then compute basic stats on the differences between those paired quantities, and run a hypothesis test to determine how likely it is that any difference is due to chance.
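The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the code used to build the tables: the input layout (`games`, `season_totals`) and the function names are assumptions made for the example, and the sign convention (baseline minus game percentage, so positive means the opponent shot worse than usual) is inferred from the "positive = better than average" note in the column list.

```python
import math
from statistics import mean, stdev

def paired_differences(games, season_totals):
    """Per-game difference: opponent's baseline 3-pt % minus its 3-pt % in this game.

    games: list of (opponent, made, attempted) for one defending team (hypothetical layout).
    season_totals: {opponent: (made, attempted)} over that opponent's full season.
    """
    # Total threes each opponent took against this team (covers repeat matchups).
    vs_team = {}
    for opp, made, att in games:
        m, a = vs_team.get(opp, (0, 0))
        vs_team[opp] = (m + made, a + att)

    diffs = []
    for opp, made, att in games:
        if att == 0:
            continue  # no attempts, so the game percentage is undefined
        season_made, season_att = season_totals[opp]
        # Baseline: opponent's season 3-pt % with games against this team removed.
        baseline = (season_made - vs_team[opp][0]) / (season_att - vs_team[opp][1])
        diffs.append(baseline - made / att)
    return diffs

def t_stat(diffs):
    """One-sample t-statistic of the paired differences against a mean of zero."""
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Made-up numbers, purely for illustration.
games = [("A", 5, 20), ("B", 6, 18), ("A", 4, 16)]
season_totals = {"A": (209, 536), "B": (156, 418)}
diffs = paired_differences(games, season_totals)
print([round(d, 3) for d in diffs], round(t_stat(diffs), 3))
```

As a sanity check on the formula, plugging Morgan St’s table values (Diff 5.4%, Stdev 10.3%, 29 games) into mean / (stdev / sqrt(n)) gives about 2.82, in line with the T-stat column.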

In the tables farther down, I show the basic stats for the top 25 teams using 3 different rankings:

  • Team’s defensive 3-pt %
  • Average difference between opponent’s 3-pt % against team of record and rest of league
  • T-stat of that difference

The columns in each table are:

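  • Team
  • Games: games played
  • OppAvg: defensive 3-pt % (the 3-pt % the team allowed)
  • Diff: average difference in opponent’s 3-pt % (positive = opponents shot worse against this team than against everyone else, i.e. better-than-average defense)
  • Stdev: standard deviation of that difference
  • T-stat: t-statistic on the paired differences
  • P-value: bootstrap-resampled p-value on the paired differences*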

*The bootstrap-resampled p-value is a randomized estimate of the p-value, a method that is valuable when the sample size is fairly small. The p-value is the probability of seeing a difference at least as large as the one in the data if the team’s defense were simply average. Here I’m computing a one-tailed test of whether the team of record allows a lower-than-average 3-pt %. Lower p-values are stronger evidence that the team is better than average at 3-pt defense.
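For readers who haven’t seen a bootstrap test, here is one common way to do it, sketched in Python. The exact resampling scheme behind the tables isn’t spelled out in this post, so treat this as an assumption: it shifts the differences to have mean zero (the "simply average defense" case), resamples them with replacement, and counts how often the resampled mean is at least as large as the observed one. The function name, the resample count, and the sample data are all made up for the example.

```python
import random

def bootstrap_p_value(diffs, n_resamples=10000, seed=0):
    """One-tailed bootstrap p-value for 'mean difference > 0'."""
    rng = random.Random(seed)
    n = len(diffs)
    observed = sum(diffs) / n
    # Center the data so the null hypothesis (mean difference = 0) holds.
    centered = [d - observed for d in diffs]
    hits = 0
    for _ in range(n_resamples):
        sample = rng.choices(centered, k=n)  # draw n values with replacement
        if sum(sample) / n >= observed:
            hits += 1
    return hits / n_resamples

# Made-up per-game differences for illustration only.
example = [0.12, -0.05, 0.08, 0.02, 0.15, -0.10, 0.06, 0.09, -0.02, 0.11]
print(bootstrap_p_value(example))
```

Because the estimate is randomized, a different seed gives a slightly different p-value; more resamples tighten it.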

One other note: as I said, the data comes from Kaggle, and though I assume it’s clean, I can’t be sure there are no errors. I see slight discrepancies between my 3-pt defensive numbers and kenpom.com, so the data may not be perfect.

A couple of observations: South Carolina’s 3-pt defense does appear to be better than average, but probably not quite as good as advertised. They are ranked 6th in defensive %, 13th in opponent’s difference %, and 42nd in statistical significance. The lower ranking in significance comes from a higher game-to-game variance in defensive % (their 15.8% standard deviation is the largest in the tables below), and suggests some of their outperformance may be due to luck. But look who’s highly ranked in all 3 categories (#1 in difference): Duke, a potential second-round matchup for Marquette.

(Also, we suck at this aspect of the game. No surprise there.)

Here are the ranked data:

Table I – Ranked by lowest 3-pt defensive %

Rank Team Games OppAvg Diff Stdev T-stat P-value
1 Morgan St 29 28.0% 5.4% 10.3% 2.822 0.0009
2 Rhode Island 33 29.1% 5.8% 10.9% 3.053 0.0005
3 NC Central 30 29.2% 3.9% 12.0% 1.770 0.0382
4 New Mexico St 30 29.8% 4.2% 11.7% 1.975 0.0224
5 Arizona 34 29.9% 5.9% 9.0% 3.808 0.0002
6 South Carolina 31 29.9% 4.6% 15.8% 1.627 0.0496
7 Duke 35 29.9% 6.4% 12.0% 3.140 0.0008
8 Gonzaga 33 30.0% 6.0% 11.1% 3.122 0.0008
9 Alcorn St 29 30.1% 2.8% 11.2% 1.326 0.0933
10 Wichita St 33 30.1% 5.5% 11.6% 2.719 0.0028
11 Minnesota 33 30.3% 5.5% 11.9% 2.658 0.0050
12 Robert Morris 33 30.4% 3.9% 11.7% 1.928 0.0241
13 Nevada 34 30.5% 4.4% 10.6% 2.428 0.0072
14 Louisville 32 30.6% 6.2% 11.1% 3.144 0.0011
15 St Mary’s CA 32 30.7% 5.1% 9.4% 3.084 0.0011
16 Col Charleston 33 30.7% 3.8% 9.0% 2.444 0.0048
17 Florida 32 30.8% 4.1% 10.0% 2.319 0.0065
18 New Orleans 28 30.8% 4.2% 8.6% 2.591 0.0040
19 FL Gulf Coast 30 31.0% 4.9% 9.0% 2.999 0.0011
20 Villanova 34 31.1% 5.5% 7.8% 4.153 0.0000
21 Colorado St 32 31.1% 3.9% 11.4% 1.918 0.0327
22 Seattle 27 31.1% 2.9% 10.8% 1.395 0.0771
23 Illinois St 32 31.1% 4.7% 7.9% 3.368 0.0003
24 Winthrop 30 31.2% 4.6% 10.8% 2.323 0.0092
25 Furman 30 31.5% 4.0% 9.2% 2.377 0.0070
284 Marquette 31 37.0% -1.9% 11.9% -0.885 0.8154

Table II – Ranked by biggest opponent’s difference

Rank Team Games OppAvg Diff Stdev T-stat P-value
1 Duke 35 29.9% 6.4% 12.0% 3.140 0.0008
2 Louisville 32 30.6% 6.2% 11.1% 3.144 0.0011
3 Gonzaga 33 30.0% 6.0% 11.1% 3.122 0.0008
4 Arizona 34 29.9% 5.9% 9.0% 3.808 0.0002
5 Rhode Island 33 29.1% 5.8% 10.9% 3.053 0.0005
6 Villanova 34 31.1% 5.5% 7.8% 4.153 0.0000
7 Minnesota 33 30.3% 5.5% 11.9% 2.658 0.0050
8 Wichita St 33 30.1% 5.5% 11.6% 2.719 0.0028
9 Morgan St 29 28.0% 5.4% 10.3% 2.822 0.0009
10 St Mary’s CA 32 30.7% 5.1% 9.4% 3.084 0.0011
11 FL Gulf Coast 30 31.0% 4.9% 9.0% 2.999 0.0011
12 Illinois St 32 31.1% 4.7% 7.9% 3.368 0.0003
13 South Carolina 31 29.9% 4.6% 15.8% 1.627 0.0496
14 Virginia 32 31.6% 4.6% 12.2% 2.146 0.0169
15 Winthrop 30 31.2% 4.6% 10.8% 2.323 0.0092
16 Nevada 34 30.5% 4.4% 10.6% 2.428 0.0072
17 New Mexico St 30 29.8% 4.2% 11.7% 1.975 0.0224
18 New Orleans 28 30.8% 4.2% 8.6% 2.591 0.0040
19 BYU 33 32.2% 4.1% 11.2% 2.111 0.0182
20 Florida 32 30.8% 4.1% 10.0% 2.319 0.0065
21 Baylor 31 31.6% 4.0% 9.9% 2.251 0.0123
22 Furman 30 31.5% 4.0% 9.2% 2.377 0.0070
23 Robert Morris 33 30.4% 3.9% 11.7% 1.928 0.0241
24 NC Central 30 29.2% 3.9% 12.0% 1.770 0.0382
25 Colorado St 32 31.1% 3.9% 11.4% 1.918 0.0327
267 Marquette 31 37.0% -1.9% 11.9% -0.885 0.8154

Table III – Ranked by strongest statistical significance

Rank Team Games OppAvg Diff Stdev T-stat P-value
1 Villanova 34 31.1% 5.5% 7.8% 4.153 0.0000
2 Arizona 34 29.9% 5.9% 9.0% 3.808 0.0002
3 Illinois St 32 31.1% 4.7% 7.9% 3.368 0.0003
4 Louisville 32 30.6% 6.2% 11.1% 3.144 0.0011
5 Duke 35 29.9% 6.4% 12.0% 3.140 0.0008
6 Gonzaga 33 30.0% 6.0% 11.1% 3.122 0.0008
7 St Mary’s CA 32 30.7% 5.1% 9.4% 3.084 0.0011
8 Rhode Island 33 29.1% 5.8% 10.9% 3.053 0.0005
9 FL Gulf Coast 30 31.0% 4.9% 9.0% 2.999 0.0011
10 Morgan St 29 28.0% 5.4% 10.3% 2.822 0.0009
11 Wichita St 33 30.1% 5.5% 11.6% 2.719 0.0028
12 Minnesota 33 30.3% 5.5% 11.9% 2.658 0.0050
13 New Orleans 28 30.8% 4.2% 8.6% 2.591 0.0040
14 Col Charleston 33 30.7% 3.8% 9.0% 2.444 0.0048
15 Nevada 34 30.5% 4.4% 10.6% 2.428 0.0072
16 Furman 30 31.5% 4.0% 9.2% 2.377 0.0070
17 Winthrop 30 31.2% 4.6% 10.8% 2.323 0.0092
18 Florida 32 30.8% 4.1% 10.0% 2.319 0.0065
19 Wyoming 30 32.1% 3.0% 7.0% 2.310 0.0096
20 Baylor 31 31.6% 4.0% 9.9% 2.251 0.0123
21 St Peter’s 32 31.9% 3.2% 8.2% 2.221 0.0119
22 Virginia 32 31.6% 4.6% 12.2% 2.146 0.0169
23 BYU 33 32.2% 4.1% 11.2% 2.111 0.0182
24 Oregon 33 31.9% 3.7% 10.2% 2.067 0.0209
25 Texas 33 32.6% 3.8% 10.5% 2.052 0.0206
42 South Carolina 31 29.9% 4.6% 15.8% 1.627 0.0496
266 Marquette 31 37.0% -1.9% 11.9% -0.885 0.8154

Yet Another CBB Rating System

It’s here! March Madness, that time of year when we all try to predict how a bunch of 20-year-olds playing a game filled with randomness will perform. It’s fun! But we sure don’t bet real money on it, ’cause that would be illegal!

Actually guys, it’s not as much fun for me again this year, because, for the 3rd consecutive season, my team was not invited to keep playing. Someone please fix this. Sigh…

Anyway, this year I decided to build my own quantitative rating and prediction system to help me fill out my bracket. Here’s how my system works:

I define a numeric rating, R(i), for each team i. I also define the probability of team i defeating team j on a neutral court as

P(i->j) = R(i) / (R(i) + R(j)).

I then create a cost function, defined over all games played during the season, which is the sum over all games of

[P(i->j) – GP(i->j)]^2,

where GP(i->j) is an estimated game probability that depends only on the result of each individual game. The GP probability value for each game must be provided as an input to the algorithm.

Of course, coming up with a good value for GP(i->j) is tricky, since we only have one occurrence (one result) for each game. The simplest method for determining GP(i->j) is to assign 100% probability if team i defeated team j, and 0% otherwise. But this ignores home/away, so it’s not ideal.

For my system, I decided to build a model based on the home/away adjusted margin of victory for the game. If team i wins by a large margin, GP(i->j) will be close to 100%, but if it’s a close game, the value will be closer to 50%. (The home/away factor adjusts the margin in favor of the road team by 3.5 points.)
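To make the idea concrete, here is one way a margin-to-probability mapping could look in Python. The post only says that big margins map near 100%, close games map near 50%, and the road team gets 3.5 points; the logistic curve and the `SCALE` value below are assumptions picked for illustration, not the model actually used.

```python
import math

HOME_EDGE = 3.5  # points credited to the road team (from the post)
SCALE = 10.0     # hypothetical: how quickly a margin turns into near-certainty

def game_probability(margin, site):
    """Estimated GP(i->j) from a single result.

    margin: team i's points minus team j's points.
    site: 'home', 'away', or 'neutral', from team i's point of view.
    """
    if site == "home":
        adjusted = margin - HOME_EDGE  # a home win is worth a bit less
    elif site == "away":
        adjusted = margin + HOME_EDGE  # a road win is worth a bit more
    else:
        adjusted = margin
    # Logistic curve (an assumed shape): 0 adjusted margin gives 50%.
    return 1.0 / (1.0 + math.exp(-adjusted / SCALE))

print(game_probability(20, "away"), game_probability(2, "home"))
```

With this shape, a 2-point home win actually comes out slightly below 50%, since the adjusted margin is negative.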

To compute the final ratings for all teams, I initialize all teams with an equal rating (1.0) and perform an iterative optimization that minimizes the overall cost function with respect to the ratings. I use gradient descent as the optimization procedure.
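The optimization can be sketched as plain batch gradient descent on the squared-error cost. This is a minimal sketch under stated assumptions rather than the actual implementation: the learning rate, iteration count, the small positive floor on ratings, and the three-team toy schedule are all invented for the example.

```python
def fit_ratings(games, teams, learning_rate=0.5, n_iters=2000):
    """Minimize sum over games of [R(i)/(R(i)+R(j)) - GP(i->j)]^2.

    games: list of (team_i, team_j, gp), where gp is the input GP(i->j).
    """
    ratings = {t: 1.0 for t in teams}  # every team starts at 1.0
    for _ in range(n_iters):
        grad = {t: 0.0 for t in teams}
        for i, j, gp in games:
            total = ratings[i] + ratings[j]
            p = ratings[i] / total
            err = 2.0 * (p - gp)
            # dP/dR(i) = R(j)/total^2 and dP/dR(j) = -R(i)/total^2
            grad[i] += err * ratings[j] / total ** 2
            grad[j] -= err * ratings[i] / total ** 2
        for t in teams:
            # Step downhill; the floor keeps ratings positive (an added safeguard).
            ratings[t] = max(ratings[t] - learning_rate * grad[t], 1e-6)
    return ratings

# Toy schedule with made-up game probabilities.
toy_games = [("A", "B", 0.8), ("B", "C", 0.7), ("A", "C", 0.9)]
fitted = fit_ratings(toy_games, ["A", "B", "C"])
print({t: round(r, 3) for t, r in fitted.items()})
```

Only the ratios between ratings affect the cost, so the absolute scale of the fitted numbers depends on the starting point and step size.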

Below, I list my ratings for the top 100 teams in Division I. (My Marquette Golden Eagles just managed to sneak in at #99, woo!) Note that the value of a team’s rating carries no particular meaning by itself – it’s only useful when compared to the other team ratings.

Rank Team Rating
1 Kansas 8.779
2 Michigan St 8.317
3 North Carolina 8.174
4 Villanova 7.717
5 West Virginia 7.504
6 Virginia 7.149
7 Louisville 6.637
8 Oklahoma 6.602
9 Purdue 6.035
10 Kentucky 5.737
11 Duke 5.483
12 Arizona 5.342
13 Xavier 5.200
14 Miami FL 5.199
15 Oregon 5.151
16 Indiana 5.136
17 Iowa St 4.919
18 Texas A&M 4.802
19 Baylor 4.700
20 Maryland 4.601
21 SMU 4.514
22 Utah 4.400
23 Iowa 4.390
24 California 4.266
25 Vanderbilt 4.246
26 Wichita St 3.983
27 Gonzaga 3.842
28 Connecticut 3.827
29 Pittsburgh 3.552
30 Butler 3.543
31 Seton Hall 3.538
32 VA Commonwealth 3.531
33 Notre Dame 3.496
34 Texas 3.487
35 USC 3.465
36 Cincinnati 3.444
37 Florida 3.443
38 St Mary’s CA 3.221
39 Creighton 3.182
40 South Carolina 3.154
41 Kansas St 3.117
42 Michigan 3.107
43 Syracuse 3.057
44 Texas Tech 2.990
45 Colorado 2.969
46 Wisconsin 2.953
47 St Joseph’s PA 2.882
48 Washington 2.825
49 Florida St 2.791
50 Valparaiso 2.700
51 Dayton 2.682
52 Yale 2.671
53 SF Austin 2.620
54 Oregon St 2.617
55 Georgia Tech 2.589
56 Clemson 2.589
57 Providence 2.559
58 Northwestern 2.529
59 Ohio St 2.515
60 Georgia 2.467
61 Hawaii 2.400
62 San Diego St 2.388
63 BYU 2.353
64 Arkansas 2.330
65 UCLA 2.309
66 Tulsa 2.297
67 G Washington 2.284
68 Virginia Tech 2.209
69 Arizona St 2.182
70 Georgetown 2.167
71 Houston 2.114
72 Ark Little Rock 2.110
73 Rhode Island 2.089
74 Nebraska 2.072
75 UC Irvine 2.046
76 Mississippi 2.045
77 Stanford 2.029
78 LSU 2.023
79 Princeton 2.012
80 NC State 2.006
81 Memphis 2.003
82 Monmouth NJ 1.994
83 Evansville 1.949
84 Alabama 1.925
85 St Bonaventure 1.921
86 UNC Wilmington 1.896
87 Richmond 1.896
88 S Dakota St 1.887
89 Temple 1.884
90 Mississippi St 1.881
91 Stony Brook 1.806
92 Oklahoma St 1.794
93 Akron 1.787
94 Tennessee 1.764
95 Davidson 1.751
96 William & Mary 1.749
97 Santa Barbara 1.732
98 James Madison 1.731
99 Marquette 1.727
100 Iona 1.674
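Since a rating only means something relative to another rating, here is a short Python example that plugs a few values from the table into P(i->j) = R(i) / (R(i) + R(j)). The function name is just a label chosen for this example.

```python
def win_probability(r_i, r_j):
    """Neutral-court probability that team i beats team j."""
    return r_i / (r_i + r_j)

# Ratings copied from the table above.
ratings = {"Kansas": 8.779, "Michigan St": 8.317, "Marquette": 1.727}

for a, b in [("Kansas", "Michigan St"), ("Kansas", "Marquette")]:
    p = win_probability(ratings[a], ratings[b])
    print(f"{a} over {b}: {p:.1%}")
```

So #1 Kansas against #2 Michigan St comes out close to a coin flip at roughly 51%, while Kansas against #99 Marquette is roughly 84%.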