Since the 2005-06 season, there’s been a lot of talk in the media about the amount of parity that currently exists in the NHL. While I’m inclined to agree with the conclusion, I have a feeling that people are simply looking at the (presumably diminished) spread in point totals and drawing their conclusions on that basis. That approach is misguided.
A narrower spread in point totals does not necessarily indicate greater parity. To measure parity, you first have to measure team strength, and point totals do not adequately measure team strength.
To be sure, point totals are correlated with team strength. Hockey would be a very strange game if this were not true. However, there are certain problems with point totals that preclude their use as a proxy for team quality.
For one, point totals are influenced by overtime and shootout success, and I would argue that overtime and shootout success have very little to do with how strong a team is. When I use the term ‘team quality’, I’m referring to how good a team is at actually playing hockey. And when I use the term ‘actually playing hockey’, I’m basically referring to how good a team is at winning in regulation. The distinction between regulation and extra-regulation results might seem arbitrary at first, but there's good reason for it. To begin with, overtime and shootout success have almost nothing to do with regulation success. Observe:
Moreover, extra-regulation results are not very repeatable across seasons, especially compared to regulation results.
The fact that extra-regulation results have virtually nothing to do with regulation results and have little to no repeatability suggests that they are largely the product of randomness. If something is largely random, then it cannot be thought of as an underlying ability. And if something cannot be thought of as an underlying ability, then it ought not to be part of a metric that ostensibly measures team strength. And yet, shootout and overtime success do have a sizable effect on point totals. Hence, my reluctance to use point totals as a metric for team strength.
However, the inadequacy of point totals goes much deeper than this. Even before the advent of 4-on-4 overtime and the shootout, points were not the best metric for team strength. The reason for this is that point totals only reflect wins and losses while completely ignoring the margin of victory. If there are two teams with similar point totals, one of them tending to win convincingly and lose narrowly, the other tending to win narrowly and lose convincingly, then the former team is, in almost all cases, the better team. The concept is an intuitive one. If you disagree with the assertion that a team’s goal differential conveys its ability better than its point total or place in the standings, then you’re probably at the wrong site.
Granted, goal differential per se, while better than points, is not the best available metric. Several corrections need to be made to it before that is true. Firstly, shootout and empty-net goals should be excluded from the totals, as they provide no useful information. Secondly, raw goal differential is problematic in that not all teams play identical schedules. Some teams, usually by virtue of playing in a stronger division or conference, are burdened with a more difficult schedule than average. If you thought that the 2005-06 Phoenix Coyotes and the 2005-06 Carolina Hurricanes had equally difficult schedules, then you would be mistaken. Thus, some attempt should be made to correct for schedule difficulty. Finally, it is not so much a team’s absolute goal differential that is important, but its ratio of goals for to goals against. A team that scores 200 goals and concedes 100 is better than one that scores 400 and gives up 300. Furthermore, simple goal differential is too sensitive to scoring context for it to provide any useful information on league parity, as it would lead to the spurious conclusion that there was less parity in higher scoring seasons. These last two problems are avoidable by using each team’s Pythagorean expectation instead: essentially, its theoretical winning percentage, determined through the following calculation:
(Adjusted goals for)^2 / [(adjusted goals for)^2 + (adjusted goals against)^2]
The resulting metric can be termed adjusted winning percentage (AW%).
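As a rough illustration of the calculation above, here is a minimal Python sketch. The function name is my own, and the inputs reuse the 200/100 and 400/300 example from earlier rather than real team totals. In practice the inputs would be the adjusted totals (shootout and empty-net goals removed, schedule difficulty corrected), and those adjustments are not shown here.

```python
def adjusted_winning_pct(goals_for, goals_against, exponent=2.0):
    """Pythagorean expectation: GF^k / (GF^k + GA^k), with k = 2 as in the formula above."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)

# Two hypothetical teams with the same +100 goal differential.
team_a = adjusted_winning_pct(200, 100)  # 40000 / 50000
team_b = adjusted_winning_pct(400, 300)  # 160000 / 250000
print(round(team_a, 3), round(team_b, 3))  # 0.8 0.64
```

Both teams are +100, yet the first comes out well ahead, which is the point about ratio mattering more than raw differential.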
AW% is important because it provides us with a suitable metric for assessing team strength. By computing the standard deviation in AW% across all teams in any particular season, we’re essentially measuring parity: the smaller the standard deviation, the more parity.
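The parity measure itself might be sketched as follows. The AW% values are invented for two made-up six-team leagues, not real NHL data, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample version is an assumption on my part.

```python
from statistics import pstdev

def parity_spread(aw_pcts):
    """Standard deviation of AW% across all teams in one season; smaller means more parity."""
    return pstdev(aw_pcts)

# Hypothetical AW% values for two made-up six-team leagues (not real NHL data).
tight_league = [0.48, 0.49, 0.50, 0.50, 0.51, 0.52]
wide_league = [0.35, 0.42, 0.50, 0.50, 0.58, 0.65]

print(parity_spread(tight_league) < parity_spread(wide_league))  # True
```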
What, then, does AW% tell us about the amount of parity in the NHL over the last ten years?
A few comments. Firstly, parity in the pre-lockout NHL was pretty invariant on a year to year basis (mean: 0.094, SD: 0.008). Only 1996-97 is anomalous, with all of the remaining values falling between 0.092 and 0.101. Secondly, there is clearly more parity (read: the standard deviation in AW% is smaller) in the post-lockout NHL relative to the pre-lockout NHL. The difference may not seem like much, but the 2005-06 and 2006-07 values each sit one SD below the pre-lockout mean. The value for 2007-08 is 4 SD(!) below the pre-lockout mean. That's a fairly significant difference.
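For anyone who wants to check the "how many SDs from the mean" arithmetic, here is a small sketch. The pre-lockout mean and SD are the figures quoted above. The two season values are back-calculated from the "one SD" and "4 SD" statements purely for illustration, so they are approximations and not the reported season figures.

```python
PRE_LOCKOUT_MEAN = 0.094  # from the text
PRE_LOCKOUT_SD = 0.008    # from the text

def sds_from_mean(value, mean=PRE_LOCKOUT_MEAN, sd=PRE_LOCKOUT_SD):
    """Number of standard deviations a season's AW% spread sits from the pre-lockout mean."""
    return (value - mean) / sd

# Illustrative values implied by the text (mean - 1 SD and mean - 4 SD), not reported data.
print(round(sds_from_mean(0.086), 1))  # -1.0
print(round(sds_from_mean(0.062), 1))  # -4.0
```

A negative result means a smaller spread than the pre-lockout norm, which is to say more parity.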
Parity in the new NHL seems to be more reality than fiction. Teams really are less separated in ability now compared to five or ten years ago. I find this interesting, as the shootout and three-point games seem to me like a ploy designed by the NHL to create the illusion of parity. However, the fact that the new NHL is characterized by genuine parity has in some sense obviated this purpose. That considered, perhaps the NHL should do away with three-point games and the shootout. I certainly wouldn't complain.