Saturday, June 25, 2011

How often does the Best Team Win?

This year’s Stanley Cup Final concluded with a somewhat surprising outcome. The Vancouver Canucks – who were widely regarded as the league’s best club – were defeated by the underdog Bruins.

To those who regard the NHL playoffs as a competition designed to determine the league’s best team, the result can mean only one thing – that the Bruins were the best team all along, and the Canucks mere pretenders.

A more reasonable explanation, however, is that shit happens over the course of a seven-game series and, because of that, the better team doesn’t always win. The Canucks were better than Boston during the regular season, and were likely better in the first three rounds of the playoffs as well. They were better than Boston last year and there’s a good chance that they’ll be better next year. They were probably the better team.

The Canucks may or may not have been the best team in the league, but if they were in fact better than Boston, then that means that a team other than the best team in the league won the cup. That raises an interesting question – how often does the best team in the league end up winning the cup?

(The answer, of course, will vary with the level of parity in the league. Because parity has varied from era to era, we’ll confine our answer to the post-lockout years.)

Unfortunately, the question cannot be answered directly, as it’s not possible to identify the league’s best team in any given season with any certitude. One can only speak in terms of probability and educated guesses.

It is, however, possible to arrive at an approximate answer through assigning artificial win probabilities to each team, simulating a large number of seasons, and looking at how often the team with the best win probability ends up winning the cup.

This exercise is made possible by the fact that the distribution of team ability – which we’ll define as true talent goal ratio – can be ascertained by examining the observed spread in goal ratio and identifying the curve that best reproduces that spread when run through the appropriate simulator.

In order to generate an observed distribution of results, I randomly selected 40 games from every team and looked at how each of them performed with respect to goal percentage (empty netters excluded) over that sample. This exercise was performed 2000 times for each of the six post-lockout seasons. The following curve resulted:

The likely ability distribution is the curve shown below – a normal distribution with a mean of 0.5 and standard deviation of 0.03.

If a large number of half-seasons are simulated through assigning artificial goal percentages based on the above ability distribution, the spread in simulated results closely matches the observed results displayed in the first graph.
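The half-season check described above can be sketched in a few lines of Python. The 40-game samples and the N(0.5, 0.03) ability curve are taken from the post; modelling the total goals in each game as a Poisson draw around the league average is my assumption, not necessarily the exact method used.

```python
import math
import random
import statistics

def poisson_draw(lam):
    """Knuth's method for a Poisson draw (the stdlib has no sampler)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1

def observed_goal_pct(ability, n_games=40, goals_per_game=5.49):
    """One team's observed goal percentage over a 40-game sample,
    given its true talent goal percentage."""
    gf = ga = 0
    for _ in range(n_games):
        for _ in range(poisson_draw(goals_per_game)):
            if random.random() < ability:
                gf += 1
            else:
                ga += 1
    return gf / (gf + ga) if gf + ga else 0.5

# Draw each team's true talent from N(0.5, 0.03), then observe 40 games;
# the spread of the results combines the talent spread and shooting luck.
results = [observed_goal_pct(random.gauss(0.5, 0.03)) for _ in range(2000)]
spread = statistics.stdev(results)
```

The observed spread comes out noticeably wider than the 0.03 talent spread, since 40-game luck is layered on top of ability.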

As the ability distribution can be used to generate results that closely parallel those observed in reality, it can also be used in order to answer the question posed earlier in the post – that is, the probability of the best team in the league winning the cup in any given season.

Here’s how the simulations were conducted:

  • For each simulated season, every team was assigned an artificial goal percentage based on the ability distribution produced above
  • The artificial goal percentages were, in turn, used to produce GF/game and GA/game values for each team
  • GF/game values were calculated by multiplying a team’s goal percentage by 5.49 (5.49 being the approximate average number of non-empty-net goals scored per game in the post-lockout era)
  • GA/game values were calculated by subtracting a team’s GF/game value from 5.49
  • All 1230 games from the 2010-11 regular season were then simulated, with a score being generated for each individual game
  • The probability of a team scoring ‘x’ number of goals in an individual game was determined through taking its GF/game value and adjusting it based on the GA/game value of the opponent
  • If each team scored an equal number of goals, each team was awarded one point and a random number generator was used to determine which of the two teams received the additional point
  • After all games were simulated, the division and conference standings were determined in accordance with NHL rules (that is, with the teams ranked by points, with the division winners being placed in the first three seeds in each conference)
  • If two teams were tied in points, the greater number of wins was used as a tiebreaker
  • If two teams had the same number of points and wins, then a random number generator was used as a second tiebreaker
  • The playoff matchups were then determined based on the regular season standings
  • Individual playoff games were not simulated; rather, each series was simulated as a whole based on the Pythagorean expectations (which were derived from the goal percentage values) of the involved teams
  • Home advantage for the higher seed was valued at +0.015
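The per-game step in the list above can be sketched as follows. The multiplicative opponent adjustment shown is one simple way to combine a team's GF/game with its opponent's GA/game; it's my assumption, not necessarily the post's exact formula.

```python
import math
import random

LEAGUE_AVG = 5.49 / 2  # average non-empty-net goals per team per game

def poisson_draw(lam):
    """Knuth's method for sampling a Poisson random variable."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1

def expected_goals(gf_per_game, opp_ga_per_game):
    # Scale a team's offence by how much its opponent's defence
    # differs from the league average (a simple assumed adjustment).
    return gf_per_game * opp_ga_per_game / LEAGUE_AVG

def simulate_game(team_a, team_b):
    """team_x = (gf_per_game, ga_per_game); returns (a_points, b_points)."""
    a_goals = poisson_draw(expected_goals(team_a[0], team_b[1]))
    b_goals = poisson_draw(expected_goals(team_b[0], team_a[1]))
    if a_goals > b_goals:
        return 2, 0
    if b_goals > a_goals:
        return 0, 2
    # Tie: one point each, with the extra point assigned at random,
    # as described in the bullet list above.
    return (2, 1) if random.random() < 0.5 else (1, 2)
```

Running `simulate_game` over a full 1230-game schedule, then applying the tiebreakers and seeding rules from the list, yields one simulated season.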

20 000 simulations were conducted in total. Here’s how the league’s best team – defined as the team with the best underlying goal percentage in each individual season – fared. We’ll start with the regular season results:

The above chart shows how the best team performed in four areas – division rank, conference rank, league rank in points, and league rank in goal differential. So, as an example, the best team ended up winning the Presidents’ Trophy – i.e. finishing with the most points – about 32% of the time.

The results are interesting. The best team does very well in general, but the range in outcomes is considerable. It wins its division a majority of the time yet still manages to finish dead last every now and then (about once every 200 seasons). It wins the conference almost half the time and finishes in the top four about 84% of the time. However, it still misses the playoffs a non-trivial percentage of the time (2.2%). The latter fact may not be too surprising – the 2010-11 Chicago Blackhawks were close to being the best team in the league but only made the playoffs by the slimmest of margins.

It wins the Presidents’ Trophy about a third of the time and does even better in terms of goal differential, posting the best mark in roughly 40% of the simulations. However, it occasionally finishes in the bottom half of the league in both categories (about 2% and 1% of the time, respectively).

The graph below shows the distribution in year end point totals for the best team. It averaged just over 107 points, with a high of 145 and a low of 73.

And the distribution in goal differential (mean = 57; max = 161; min = -34).

Finally, the chart showing the playoff outcomes for the best team, and therefore answering the question posed earlier.

It turns out that the best team wins the cup 22% of the time – about once every five seasons. This accords well with what we’ve observed since the lockout, with the 2007-08 Detroit Red Wings being the only cup winner that was also unambiguously the best team in the league. The 2009-10 Chicago Blackhawks were probably the best but it’s hard to say for sure. The 2008-09 Penguins were a good team but the Wings were probably better that year. Ditto for the 2006-07 Ducks. The 2010-11 Bruins were merely a good team, and I can’t even say that much for the 2005-06 Hurricanes, who may not have even been one of the ten best teams in the league during that season.


The exercise assumes that team ability is static. This is obviously untrue in reality, given injuries, roster turnover, and similar factors. Consequently, the best true talent team at one point in the season may not be the best team at a different point. Moreover, the spread in team talent at any given point in the season is likely to be somewhat broader than the ability curve used in the exercise.

Scores were generated for individual games through the use of Poisson probabilities, which take no account of score effects. Thus, the model slightly underestimates the incidence of tie games. For the same reason, it also overestimates the team-to-team spread in goal differential.
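For reference, the tie rate implied by independent Poisson scoring is easy to compute directly: it's the sum over k of the probability that both teams score exactly k goals. The sketch below uses the post's 5.49 goals-per-game figure split evenly between two average teams.

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """Poisson probability of exactly k goals at rate lam."""
    return exp(-lam) * lam**k / factorial(k)

def tie_probability(lam_a, lam_b, max_goals=30):
    """P(both teams score the same number of goals) under independent
    Poisson scoring -- the model's (understated) incidence of ties."""
    return sum(pois_pmf(k, lam_a) * pois_pmf(k, lam_b)
               for k in range(max_goals))

# Two league-average teams, each expected to score 5.49 / 2 goals:
p_tie = tie_probability(5.49 / 2, 5.49 / 2)
```

The result lands in the high teens as a percentage, somewhat below the regulation-tie rate actually observed in the NHL, which is the underestimate noted above.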

Tuesday, June 7, 2011

Predicting Playoff Success - Part Two

Rob Vollman raised an interesting question in the comments section of my last post, that being whether my findings precluded the possibility that some teams consistently perform better or worse in the playoffs.

The question can be answered by comparing each team's actual performance, as measured by winning percentage, with what would be expected based on regular season results. If the spread in [actual - expected] winning percentage is significantly greater than what would be expected by chance alone, then that suggests that some types of teams may consistently outperform or underperform in the playoffs relative to the regular season.

As with my last post, my sample consisted of all 1882 playoffs games played between 1988 and 2010. I've prepared a chart which shows, in the leftmost section, how each of the league's teams performed during that span. The middle section of the chart shows each team's expected wins and losses, based on single-game probabilities generated from regular season data. Finally, the rightmost section shows each team's winning percentage differential (defined as observed winning percentage minus expected winning percentage), as well as the probability of observing a differential at least that large by chance alone.

That last part may require some elaboration. All 1882 games were simulated 1000 times, based on the regular-season derived probability values. For each of the individual simulations, I determined each team's winning percentage and subtracted from it that team's expected winning percentage. The p value column simply indicates the proportion of simulations in which the absolute value of that number - that is, the team's [simulated winning percentage - expected winning percentage] - exceeded the absolute value of that team's [observed winning percentage - expected winning percentage].

A specific example may be illustrative. Anaheim had an observed winning percentage of 0.576, an expected winning percentage of 0.462, and therefore an [observed winning percentage - expected winning percentage] of 0.114. Anaheim's simulated winning percentage differed from its expected winning percentage by at least 0.114 in only 3.3% of the 1000 simulations. Hence Anaheim's p value of 0.033.
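The p value computation amounts to a simple Monte Carlo routine. Here's a sketch; the 100-game, 0.46-expectation example at the bottom is hypothetical, not Anaheim's actual game-level data.

```python
import random

def empirical_p_value(game_probs, observed_wins, n_sims=1000):
    """Simulate a team's games from its per-game win probabilities and
    return the share of simulations whose deviation from expectation is
    at least as large as the observed deviation (a two-sided p value)."""
    n = len(game_probs)
    expected_pct = sum(game_probs) / n
    observed_dev = abs(observed_wins / n - expected_pct)
    hits = 0
    for _ in range(n_sims):
        wins = sum(random.random() < p for p in game_probs)
        if abs(wins / n - expected_pct) >= observed_dev:
            hits += 1
    return hits / n_sims

# Hypothetical illustration: 100 games at a 0.46 expected winning
# percentage, with 58 observed wins (a +0.12 differential).
p = empirical_p_value([0.46] * 100, 58)
```

A large positive or negative differential yields a small p value; a team that performs exactly as expected yields a p value of 1.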

As can be seen, some teams outperformed their expected winning percentage, whereas others underachieved. Based on each team's [observed winning percentage - expected winning percentage], and the probability of each differential materializing by chance alone, Edmonton, Pittsburgh and Anaheim were the three most "clutch" teams, whereas the Islanders, Columbus and Atlanta were the biggest "chokers." But is the spread between the teams any different from what would be predicted from chance alone?

There are two ways in which this question can be answered. The first is to group the observed winning percentage differentials (observed minus expected winning percentage) into several categories, and calculate the number of values in each category as a percentage of the total sample (relative frequency). Following that, the same can be done with the simulated differentials. The two distributions can then be compared.

The second is to repeat the exact same exercise, but to use actual wins instead of winning percentage. I prefer this second method as the fact that some teams, such as Atlanta, Columbus and Quebec, played very few games has the potential to skew the results if winning percentage is used.

Here are the two graphs:

In the case of winning percentage, the observed spread is noticeably greater than the simulated spread. But the difference is not too large, there being something of a general correspondence. And in the case of wins, the two lines form almost a perfect match.

If I were to issue a conclusion, it would be that although some teams over or underperform in the playoffs relative to their regular season results, this appears to be mostly the product of normal statistical variation. There isn't much support for the idea that there exists an ability to perform in the playoffs that is independent and separate from the ability to perform during the regular season.

Saturday, June 4, 2011

Predicting Playoff Success

It's often said that the playoffs are a different ball game as compared to the regular season - that some teams are built for the playoffs whereas others are not.

The above statement can be evaluated by looking at how well regular season results predict playoff success. This can be done by assigning a theoretical win probability to every playoff team based on how it performed during the regular season, and determining the odds for each individual matchup on that basis. If the statement is true, the favorite - the team with the superior win probability as against its opponent - should win significantly less often than expected.

My sample consisted of all 1882 playoff games played between 1988 and 2010. Theoretical win probabilities were computed on the basis of regular season goal ratio, corrected for schedule difficulty. While goal ratio is imperfect in this respect, the data required to produce more precise estimates is simply not available for the majority of the seasons included in the sample. Thus, goal ratio is the best measure available.

Home advantage was valued at +0.056, this being the difference between the expected neutral ice winning percentage of home teams (0.505), and their observed winning percentage over the games included in the sample (0.561).
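The post doesn't spell out the exact formula for converting goal ratios into matchup odds. One common construction, which I've assumed here for illustration, is to turn each team's goal ratio into a Pythagorean winning percentage and then combine the two via the log5 method, adding the stated +0.056 home-ice adjustment.

```python
def pythagorean_pct(gf, ga, exponent=2.0):
    """Single-team expected winning percentage from goals for/against."""
    return gf**exponent / (gf**exponent + ga**exponent)

def matchup_probability(gf_a, ga_a, gf_b, ga_b,
                        home_edge=0.056, a_is_home=True):
    """Head-to-head win probability for team A: log5 combination of the
    two Pythagorean percentages (my assumption, not necessarily the
    post's exact method), plus the +0.056 home-ice adjustment."""
    pa = pythagorean_pct(gf_a, ga_a)
    pb = pythagorean_pct(gf_b, ga_b)
    neutral = pa * (1 - pb) / (pa * (1 - pb) + pb * (1 - pa))
    return neutral + home_edge if a_is_home else neutral - home_edge
```

Two perfectly even teams come out at 0.556 for the home side, consistent with the home-advantage figure above.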

After computing the odds for each individual game, I divided the data into eight categories. The category in which a game was placed depended upon the expected winning percentage of the favorite. The cutoffs for the eight categories were as follows:
  1. 0.50-0.52
  2. 0.52-0.54
  3. 0.54-0.56
  4. 0.56-0.58
  5. 0.58-0.60
  6. 0.60-0.625
  7. 0.625-0.675
  8. 0.675-1
The cutoffs were not gerrymandered so as to produce a particular result - I simply wanted each category to contain a relatively equal number of games. As there are many more games in which the favorite has a win probability between 0.50 and 0.60 than there are games in which the favorite has a win probability greater than 0.60, this necessitated making certain categories larger than others.

The results:

[The italicized 'n' column simply indicates the number of games contained within each category.]

As can be seen, using regular season data allows one to predict the results of groups of individual playoff games with surprising accuracy. On the whole, the favorite did slightly worse in reality than what the regular season results predicted - 0.573 versus 0.586. However, this is probably just a reflection of the fact that regular season goal ratio is the product of both skill and luck, and that the true talent goal ratio of the average team lies closer to the population average than does its observed goal ratio.

As for the individual categories, six of the eight show a reasonably close correspondence between expected and observed winning percentage, with the other two featuring notable discrepancies. While each gap appears significant, either could be the product of chance alone. The probability of a 0.51 team going 0.468 or worse over 263 games is 0.097. Likewise, the probability of a 0.713 team going 0.671 or worse over 204 games is 0.109.
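Those tail probabilities are straightforward binomial calculations, which can be verified with an exact binomial CDF:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(at most k successes in n independent trials of probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# A 0.51 team going 0.468 or worse over 263 games means at most
# floor(0.468 * 263) = 123 wins:
p_low = binom_cdf(123, 263, 0.51)  # roughly 0.10, in line with the post
```

The same function with the 0.713 favorite's numbers reproduces the second figure; neither gap clears a conventional significance threshold on its own.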

If I were to guess, I'd say that the discrepancy in the 0.675-1 category is a real effect. As discussed earlier, goal ratio tends to overvalue the favorite and underrate the underdog, and the greater the distance from the mean, the more likely this is to be true in individual cases.

Thursday, June 2, 2011

More on Team EV Shooting Ability

About a week ago, I put up a post on team even strength shooting percentage, which included a chart showing what the underlying talent distribution in that area probably looks like. I've reproduced the relevant curve below:

The curve isn't excessively narrow. The 97th percentile equates to a shooting percentage of 0.0902, meaning that, in an average season, the league's most talented EV shooting team would have an underlying shooting percentage at or around that mark. That's no trivial advantage - with neutral luck, such a team would be expected to score roughly 18 more even strength goals than a team with average EV shooting ability.
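The ~18-goal figure falls out of simple arithmetic. The shot total below is my assumption (roughly 2000 even strength shots over a season); the post doesn't state the number it used.

```python
# Back-of-envelope version of the ~18 extra goal figure.
league_avg_sh = 0.081   # league average EV shooting percentage
talented_sh = 0.0902    # ~97th percentile of the talent curve
ev_shots = 2000         # assumed EV shots per team per season

extra_goals = (talented_sh - league_avg_sh) * ev_shots  # about 18 goals
```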

The problem is that, given that goals in the NHL are somewhat of a statistical rarity, the regular season doesn't provide us with a sample that is sufficiently large so as to be able to identify each team's true talent level with reasonable accuracy.

This estimate uncertainty is well illustrated by comparing last year's Devils, who had a league worst 0.065 EV shooting percentage, with last year's Stars, who posted the league's best mark at 0.089. That seems like a fairly large gap - almost two and a half percentage points. Surely one would be able to conclude that the 2010-11 Stars possessed more EV shooting talent than the 2010-11 Devils?

In fact, there is a not-insignificant probability that the Devils were actually the better EV shooting team. This becomes immediately apparent upon viewing the ability distribution for each team and noting the overlap between the two curves.

There is an 11.6% chance that N.J was actually the more talented team last season in terms of EV shooting ability. In other words, there will be some seasons - of which 2010-11 is an example - that do not permit the conclusion that any single team has definitively more EV shooting talent than any other.
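The overlap probability for two (approximately normal) talent estimates reduces to a single normal CDF evaluation. The numbers below are illustrative stand-ins, not the exact posteriors behind the 11.6% figure: regressed talent estimates for the two clubs with roughly 0.005 of uncertainty each.

```python
from math import erf, sqrt

def p_first_team_better(mu1, sd1, mu2, sd2):
    """P(X1 > X2) for independent normal talent estimates:
    Phi((mu1 - mu2) / sqrt(sd1^2 + sd2^2))."""
    z = (mu1 - mu2) / sqrt(sd1**2 + sd2**2)
    return 0.5 * (1 + erf(z / sqrt(2)))

# Illustrative regressed estimates (assumed, not the post's exact values):
# N.J around 0.076, Dallas around 0.084, each with ~0.005 uncertainty.
p_nj_better = p_first_team_better(0.0757, 0.005, 0.0837, 0.005)
```

Even with Dallas nearly a full standard deviation ahead, the Devils come out on top a little over a tenth of the time.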

Monday, May 30, 2011

Stanley Cup Finals 1011 Playoff Probabilities and Predictions

It all comes down to this.

The best team in the West against the best team in the East.*

For the 6th straight year, the Western representative appears to be the stronger team. That's not really surprising - the West has had the better interconference record in every season since 1999-00, and often by a large margin.

I think Vancouver is clearly the better team here and that, if anything, the odds I've presented above understate their chances. That said, these two teams are close enough to one another that it should be a good series.

I'll take the Canucks to win in six games.

VAN in 6.

*As per my probability model. It's possible - perhaps even likely in Boston's case - that neither team is the best team in its respective conference. For what it's worth, I'd take a healthy Pittsburgh team over Boston all day every day. But I digress.

Sunday, May 29, 2011

Team Even Strength Shooting Talent

A while back, I received a comment relating to how my playoff probability model accounts for teams that are outliers with respect to shooting percentage, with the 2009-10 Washington Capitals offered as an example of such a team.

The answer is relatively straightforward: I merely regress each team to the league average based on the extent to which the team to team variation can be attributed to luck over the sample in question. As the variation in even strength shooting percentage at the team level is approximately 66% luck over the course of a regular season, each team's even strength shooting percentage is regressed two-thirds of the way to the mean in order to generate theoretical win probabilities.
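That regression step is a one-liner. The outlier value plugged in at the bottom is hypothetical, chosen only to show how far an extreme observed mark gets pulled back.

```python
def regressed_sh_pct(observed, league_avg=0.0812, luck_share=0.66):
    """Regress an observed EV shooting percentage toward the league
    average by the share of the variance attributable to luck."""
    return league_avg + (1 - luck_share) * (observed - league_avg)

# A hypothetical outlier observed at 0.098 keeps only a third of its
# distance from the mean, landing around 0.087.
estimate = regressed_sh_pct(0.098)
```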

The application of the above method to data from the 2010-11 regular season yields the following even strength shooting talent estimates for each of the league's 30 teams.*

This method, however, is actually a shortcut that relies on assumptions that are unlikely to be true in reality. For one, it assumes that all underlying talent distributions are normally distributed, which may or may not be the case. It's also insensitive to the fact that some teams take more shots than others over the course of a season. A more certain shooting talent estimate can be made with respect to a team that takes 2000 shots as compared to a team that takes 1500 shots, although the model fails to reflect that.

The proper - albeit significantly more complicated and involved - approach would be to actually identify the nature of the underlying talent distribution and work one's way forward from there.

The first step is to look at the observed distribution in performance. I did this through randomly selecting half a season's worth of games for each team and looking at how it performed with respect to EV SH% over that sample, and repeated this 2000 times for every season from 2003-04 to 2010-11. I elected to do this as it provided me with 420 000 data points, thereby allowing me to generate a smooth curve. By comparison, using the actual end-of-season frequencies would have provided me with a mere 210 data points.

I came away with the following curve:

The distribution is slightly right-skewed and therefore not quite normal. This becomes meaningful at the tails - there were approximately 26 times more values 3 standard deviations above the mean than there were values 3 standard deviations below it. In other words, there are many more very good teams than very bad ones when it comes to even strength shooting performance.

The next step is finding a curve that best fits the observed data. This curve should have a mean of approximately 0.081, which was the observed league average shooting percentage. It should also have a standard deviation of approximately 0.0048, which is the skill standard deviation in relation to even strength shooting percentage at the team level. Finally, the curve should be slightly positively skewed.

The beta (236, 2977) curve, shown below, satisfies these criteria.

As a check on the correctness of the selection, I used a random number generator to assign each team an artificial EV shooting percentage based on the above curve. I then simulated a sufficiently large number of half seasons based on those artificial numbers and compared the results to the observed data. If the choice is correct, the simulated results should closely match those observed.

The simulated curve is only based on about 30 000 data points, so it's not as smooth as the observed distribution. That said, the fit is pretty good. The observed distribution appears to have a fatter right tail, and so it's possible that a different beta curve might provide a better match. But it's close enough.
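The verification loop itself can be sketched as follows. The beta parameters are the ones given above; the ~1100 shots per half season is my assumption, standing in for each team's actual shot totals.

```python
import random
import statistics

def half_season_sh_pct(talent, shots=1100):
    """Observed EV shooting percentage over a half season's worth of
    shots (the ~1100 shot figure is an assumed stand-in)."""
    goals = sum(random.random() < talent for _ in range(shots))
    return goals / shots

# Draw true talents from the fitted beta(236, 2977) curve, then layer
# binomial shooting luck on top, as in the check described above.
simulated = [half_season_sh_pct(random.betavariate(236, 2977))
             for _ in range(2000)]
sim_mean = statistics.mean(simulated)
sim_sd = statistics.stdev(simulated)
```

Comparing the distribution of `simulated` against the observed curve is the fit test described above.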

The beta ability distribution can be used to estimate each team's true talent underlying shooting percentage, based on the 2010-11 regular season. How do these estimates compare to those produced by the simple regression approach discussed earlier?

The two approaches produce very similar results - the average difference amounting to only 0.0004. The beta approach is both more precise and more principled, but the simple regression achieves substantially similar estimates with a fraction of the effort.
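For the record, "working forward from the ability distribution" amounts to a standard beta-binomial (conjugate) update - my framing of what the post describes, not necessarily its exact computation. The shot and goal totals below are hypothetical.

```python
def beta_posterior_mean(goals, shots, a=236, b=2977):
    """Posterior mean talent estimate after observing `goals` on `shots`
    shots, treating the fitted beta curve as a prior (beta-binomial)."""
    return (a + goals) / (a + b + shots)

# A team that shoots 160-for-1800 at EV (~0.089 observed, hypothetical
# totals) gets pulled most of the way back toward the prior mean:
estimate = beta_posterior_mean(goals=160, shots=1800)
```

Unlike the flat two-thirds regression, this update automatically gives more weight to teams with larger shot samples - the sensitivity the simple method lacks.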

* The mean used was 0.0812, this being the league average EV SH% since the lockout, even though the observed shooting percentage in 2010-11 was a bit lower - a touch under 0.08.

Friday, May 13, 2011

3rd Round 1011 Playoff Probabilities and Predictions


Another extremely even matchup. The relevant facts, as I see them:
  • S.J probably has the better powerplay - they generate a ridiculous number of shots
  • VAN very likely has the better goaltender
  • VAN has home ice advantage
  • Both teams are about equally good at controlling the play at EV
  • VAN is missing a key forward
If those facts give one club a clear advantage, I can't see it. These are probably the two best teams in the league and this should be a great series. I'll take the Canucks in seven games.

Van in 7.


At first glance, Boston seems like the obvious pick here. But the (Patrice) Bergeron injury complicates things. The latest reports indicate that he has yet to resume skating since the incident, so it seems as though he might not play at all. That would be a huge loss, as he's probably their best forward, at least by my reckoning.

The issue is whether the Bergeron injury is enough to tip the balance in Tampa Bay's favor. I don't think that it is. Based on regular season play, I have the Bruins as a 61% favorite. While the Bergeron injury necessitates a downward adjustment of that figure, I don't think the loss is profound enough to render the Bruins underdogs. This is supported by the fact that the oddsmakers - who certainly take such things into account - still have Boston as about a 56% favorite.

BOS in 7.