Sunday, May 29, 2011

Team Even Strength Shooting Talent

A while back, I received a comment relating to how my playoff probability model accounts for teams that are outliers with respect to shooting percentage, with the 2009-10 Washington Capitals offered as an example of such a team.

The answer is relatively straightforward: I merely regress each team to the league average based on the extent to which the team-to-team variation can be attributed to luck over the sample in question. As the variation in even strength shooting percentage at the team level is approximately 66% luck over the course of a regular season, each team's even strength shooting percentage is regressed two-thirds of the way to the mean in order to generate theoretical win probabilities.
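As a minimal sketch of that regression step: the function name and the 9.0% team are made up for illustration, while the 0.0812 league mean is the post-lockout figure given in the footnote.

```python
def regress_to_mean(observed, league_mean, luck_fraction):
    """Shrink an observed rate toward the league mean.

    luck_fraction is the share of team-to-team variation attributed to luck;
    the remaining share (1 - luck_fraction) is the weight kept on the
    observed value.
    """
    return league_mean + (1.0 - luck_fraction) * (observed - league_mean)


# Hypothetical team that shot 9.0% at even strength.
estimate = regress_to_mean(observed=0.090, league_mean=0.0812, luck_fraction=2 / 3)
print(round(estimate, 4))
```

Every team gets the same two-thirds pull toward the mean here, regardless of how many shots it took.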

Applying the above method to data from the 2010-11 regular season yields the following even strength shooting talent estimates for each of the league's 30 teams.*

This method, however, is actually a shortcut that relies on assumptions that are unlikely to be true in reality. For one, it assumes that all underlying talent distributions are normally distributed, which may or may not be the case. It's also insensitive to the fact that some teams take more shots than others over the course of a season. A more certain shooting talent estimate can be made with respect to a team that takes 2000 shots as compared to a team that takes 1500 shots, although the model fails to reflect that.
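To put a rough number on the shot-count point, here is a small sketch that treats each shot as an independent trial at the league-average rate (a simplifying assumption) and computes the binomial standard error of the observed percentage at the two shot totals mentioned above.

```python
import math


def binomial_se(rate, shots):
    """Standard error of an observed shooting percentage under a binomial model."""
    return math.sqrt(rate * (1.0 - rate) / shots)


league_rate = 0.081  # approximate league-average EV SH%
for shots in (1500, 2000):
    print(shots, round(binomial_se(league_rate, shots), 4))
```

The team with 2000 shots has the smaller standard error, so its observed percentage deserves somewhat less regression than the 1500-shot team's.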

The proper - albeit significantly more complicated and involved - approach would be to actually identify the nature of the underlying talent distribution and work one's way forward from there.

The first step is to look at the observed distribution in performance. I did this by randomly selecting half a season's worth of games for each team, looking at how it performed with respect to EV SH% over that sample, and repeating this 2000 times for every season from 2003-04 to 2010-11 (seven seasons, as 2004-05 was lost to the lockout). I elected to do this as it provided me with 420 000 data points (2000 resamples x 30 teams x 7 seasons), thereby allowing me to generate a smooth curve. By comparison, using the actual end-of-season frequencies would have provided me with a mere 210 data points.
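The resampling for a single team-season might look something like the sketch below. The game log is made up (roughly 23 EV shots per game, each scoring with probability 0.081); real game-by-game goal and shot counts would take its place, and the function name is only illustrative.

```python
import random


def half_season_sh_pct(games, rng):
    """Shooting percentage over a randomly chosen half of a team's games.

    games is a list of (goals, shots) tuples, one per game.
    """
    sample = rng.sample(games, len(games) // 2)
    goals = sum(g for g, _ in sample)
    shots = sum(s for _, s in sample)
    return goals / shots


rng = random.Random(0)

# Made-up 82-game log for one hypothetical team.
games = []
for _ in range(82):
    shots = rng.randint(15, 31)
    goals = sum(rng.random() < 0.081 for _ in range(shots))
    games.append((goals, shots))

# 2000 resamples for this team-season; each one is a single data point.
resamples = [half_season_sh_pct(games, rng) for _ in range(2000)]
print(len(resamples), round(min(resamples), 4), round(max(resamples), 4))
```

Repeating this for all 30 teams in each season and pooling the results gives the observed distribution.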

I came away with the following curve:

The distribution is slightly right-skewed and therefore not quite normal. This becomes meaningful at the tails - there were approximately 26 times more values 3 standard deviations above the mean than there were values 3 standard deviations below it. In other words, there are many more very good teams than very bad ones when it comes to even strength shooting performance.

The next step is finding a curve that best fits the observed data. This curve should have a mean of approximately 0.081, which was the observed league average shooting percentage. It should also have a standard deviation of approximately 0.0048, which is the skill standard deviation in relation to even strength shooting percentage at the team level. Finally, the curve should be slightly positively skewed.

The beta (263, 2977) curve, shown below, satisfies these criteria.
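One way to go from a target mean and standard deviation to beta parameters is the method of moments, sketched below. This is not necessarily how the curve was chosen; the inputs are the 0.0812 league mean from the footnote and the 0.0048 skill standard deviation, and the skewness formula is the standard one for a beta distribution.

```python
import math


def beta_from_moments(mean, sd):
    """Method-of-moments beta parameters for a given mean and standard deviation."""
    common = mean * (1.0 - mean) / sd**2 - 1.0
    return mean * common, (1.0 - mean) * common


def beta_skewness(a, b):
    """Skewness of a Beta(a, b) distribution."""
    return 2.0 * (b - a) * math.sqrt(a + b + 1.0) / ((a + b + 2.0) * math.sqrt(a * b))


a, b = beta_from_moments(mean=0.0812, sd=0.0048)
print(round(a, 1), round(b, 1), round(beta_skewness(a, b), 3))
```

Because the second parameter is much larger than the first, the resulting curve has a small positive skew, which is the third criterion.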

As a check on the correctness of the selection, I used a random number generator to assign each team an artificial EV shooting percentage based on the above curve. I then simulated a sufficiently large number of half seasons based on those artificial numbers and compared the results to the observed data. If the choice is correct, the simulated results should closely match those observed.
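A scaled-down sketch of this kind of check is below. Several things in it are assumptions: the beta parameters are moment-matched to a mean of 0.0812 and a standard deviation of 0.0048, each half season is taken to be 900 EV shots, and only 20 half seasons are simulated per team to keep the run short.

```python
import random
import statistics


def simulate_half_seasons(a, b, teams, shots, reps, rng):
    """Draw a talent for each team from Beta(a, b), then simulate half seasons.

    Each shot is an independent trial that scores with probability equal to
    the team's drawn talent. Returns one simulated shooting percentage per
    team per repetition.
    """
    results = []
    for _ in range(teams):
        talent = rng.betavariate(a, b)
        for _ in range(reps):
            goals = sum(rng.random() < talent for _ in range(shots))
            results.append(goals / shots)
    return results


rng = random.Random(1)
# Assumed inputs: moment-matched beta parameters, 900 shots per half season.
simulated = simulate_half_seasons(a=262.9, b=2974.3, teams=30, shots=900, reps=20, rng=rng)
print(len(simulated), round(statistics.mean(simulated), 4), round(statistics.stdev(simulated), 4))
```

The list of simulated percentages can then be binned and laid over the observed distribution.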

The simulated curve is only based on about 30 000 data points, so it's not as smooth as the observed distribution. That said, the fit is pretty good. The observed distribution appears to have a fatter right tail, and so it's possible that a different beta curve might provide a better match. But it's close enough.

The beta ability distribution can be used to estimate each team's underlying true-talent shooting percentage, based on its 2010-11 regular season results. How do these estimates compare to those produced by the simple regression approach discussed earlier?
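For a sense of how the two approaches relate, the sketch below uses the posterior mean of a beta prior combined with binomial shot data. That choice of estimator is an assumption, as is the hypothetical team (180 goals on 2000 EV shots). The prior is moment-matched to a mean of 0.0812 and a standard deviation of 0.0048, which makes it behave like a block of extra shots taken at the league-average rate.

```python
prior_mean = 0.0812  # post-lockout league-average EV SH%
prior_sd = 0.0048    # skill standard deviation at the team level

# Equivalent number of "prior shots" implied by a moment-matched beta prior.
prior_shots = prior_mean * (1.0 - prior_mean) / prior_sd**2 - 1.0


def posterior_mean(goals, shots):
    """Posterior mean under a beta prior and binomial likelihood."""
    return (prior_mean * prior_shots + goals) / (prior_shots + shots)


# Hypothetical team: 180 goals on 2000 EV shots (9.0%).
goals, shots = 180, 2000
beta_estimate = posterior_mean(goals, shots)
simple_estimate = prior_mean + (1.0 / 3.0) * (goals / shots - prior_mean)
print(round(prior_shots), round(beta_estimate, 4), round(simple_estimate, 4))
```

Unlike the flat two-thirds regression, the weight on the observed percentage here is shots / (prior_shots + shots), so it grows with the number of shots a team takes.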

The two approaches produce very similar results, with an average difference of only 0.0004. The latter approach is both more precise and more principled, but the former achieves substantially similar estimates with a fraction of the effort.

* The mean used was 0.0812, this being the league average EV SH% since the lockout, even though the observed shooting percentage in 2010-11 was a bit lower - a touch under 0.08.


Eric T from BSH said...

How is it possible for the rankings for estimated talent to be different from the rankings for observed performance? (Philadelphia was observed at #2 shooting percentage but estimated talent of #1.) Is that just a little noise in the simulation?

JLikens said...

It's because the Flyers took more shots than Dallas.

More shots increase the probability that the observed performance corresponds to underlying talent.

JLikens said...

Actually, a more careful check shows that there was an error of sorts.

When attempting to identify the best possible estimate, I used increments based on the number of EV shots taken by the team in question in 2010-11 (i.e. if a team took 1000 EV shots, the only numbers checked would have been 0.001 (1/1000), 0.002 (2/1000), 0.003 (3/1000), 0.004 (4/1000), etc).

Using increments of 0.0001 allows for more precision with respect to the best possible estimate.
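The effect of the grid spacing can be sketched as follows. The team (90 goals on 1000 EV shots) is hypothetical, the beta parameters are moment-matched to a mean of 0.0812 and a standard deviation of 0.0048, and "best estimate" is taken to mean the grid point with the highest posterior density, which is an assumption about the procedure.

```python
import math


def log_posterior(p, goals, shots, a, b):
    """Unnormalized log posterior: Beta(a, b) prior times binomial likelihood."""
    return (a - 1 + goals) * math.log(p) + (b - 1 + shots - goals) * math.log(1 - p)


def best_on_grid(goals, shots, a, b, step):
    """Return the candidate value with the highest posterior density."""
    n_points = round(1 / step)
    grid = [k * step for k in range(1, n_points)]  # excludes 0 and 1
    return max(grid, key=lambda p: log_posterior(p, goals, shots, a, b))


a, b = 262.9, 2974.3      # assumed, moment-matched prior
goals, shots = 90, 1000   # hypothetical team

coarse = best_on_grid(goals, shots, a, b, step=1 / shots)  # increments of 1/1000
fine = best_on_grid(goals, shots, a, b, step=0.0001)
print(round(coarse, 4), round(fine, 4))
```

With the coarse grid, two teams whose best estimates differ by less than the spacing can land on the same candidate or swap places, which is the kind of ranking error described above.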

If I do that, I get a talent estimate of 0.0830 for Philadelphia and 0.0833 for Dallas.

So you're right - Dallas should be #1 in terms of estimated EV shooting talent.

said...

It is quite impressive that you can predict with that kind of accuracy playoff matches.