Saturday, June 25, 2011

How often does the Best Team Win?

This year’s Stanley Cup Final concluded with a somewhat surprising outcome. The Vancouver Canucks – who were widely regarded as the league’s best club – were defeated by the underdog Bruins.

To those who regard the NHL playoffs as a competition designed to determine the league’s best team, the result can mean only one thing – that the Bruins were the best team all along, and the Canucks mere pretenders.

A more reasonable explanation, however, is that shit happens over the course of a seven-game series and, because of that, the better team doesn’t always win. The Canucks were better than Boston during the regular season, and were likely better in the first three rounds of the playoffs as well. They were better than Boston last year and there’s a good chance that they’ll be better next year. They were probably the better team.

The Canucks may or may not have been the best team in the league, but if they were in fact better than Boston, then that means that a team other than the best team in the league won the cup. That raises an interesting question – how often does the best team in the league end up winning the cup?

(The answer, of course, will vary as a function of the level of parity that exists in the league. Because the level of league parity has varied over time as a function of era, we’ll confine our answer to the post-lockout years).

Unfortunately, the question cannot be answered directly due to the fact that it’s not possible to identify the league’s best team in any given season with any certitude. One can only speak in terms of probability and educated guesses.

It is, however, possible to arrive at an approximate answer through assigning artificial win probabilities to each team, simulating a large number of seasons, and looking at how often the team with the best win probability ends up winning the cup.

This exercise is made possible by the fact that the distribution in team ability – which we’ll define as true talent goal ratio – can be ascertained through examining the observed spread in goal ratio and identifying the curve which best produces that spread when run through the appropriate simulator.

In order to generate an observed distribution of results, I randomly selected 40 games from every team and looked at how each of them performed with respect to goal percentage (empty netters excluded) over that sample. This exercise was performed 2000 times for each of the six post lockout seasons. The following curve resulted:

The likely ability distribution is the curve shown below – a normal distribution with a mean of 0.5 and standard deviation of 0.03.

If a large number of half-seasons are simulated through assigning artificial goal percentages based on the above ability distribution, the spread in simulated results closely matches the observed results displayed in the first graph.
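A rough sketch of that half-season check follows. It is not the simulator used for the post. It assumes 30 teams, a league-average opponent in every game, and independent Poisson scoring, and the function names are purely illustrative.

```python
import math
import random
import statistics

LEAGUE_GOALS = 5.49  # non-empty-net goals per game, both teams combined (from the post)

def poisson(lam, rng):
    """Draw one Poisson variate (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def half_season_goal_pcts(rng, n_teams=30, n_games=40, talent_sd=0.03):
    """Simulated goal percentage for each team over n_games."""
    pcts = []
    for _ in range(n_teams):
        # True-talent goal percentage drawn from normal(0.5, talent_sd).
        talent = rng.gauss(0.5, talent_sd)
        # Assumption: every game is played against a league-average opponent.
        gf = sum(poisson(talent * LEAGUE_GOALS, rng) for _ in range(n_games))
        ga = sum(poisson((1 - talent) * LEAGUE_GOALS, rng) for _ in range(n_games))
        pcts.append(gf / (gf + ga))
    return pcts

rng = random.Random(2011)
results = [pct for _ in range(100) for pct in half_season_goal_pcts(rng)]
spread = statistics.pstdev(results)
print(round(statistics.mean(results), 3), round(spread, 3))
```

The simulated spread comes out wider than the 0.03 talent spread, because 40 games of Poisson scoring add a good deal of noise on top of the underlying ability. That wider spread is the one to compare against the observed curve.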

As the ability distribution can be used to generate results that closely parallel those observed in reality, it can also be used in order to answer the question posed earlier in the post – that is, the probability of the best team in the league winning the cup in any given season.

Here’s how the simulations were conducted:

  • For each simulated season, every team was assigned an artificial goal percentage based on the ability distribution produced above
  • The artificial goal percentages were, in turn, used to produce GF/game and GA/game values for each team
  • GF/game values were calculated by multiplying a team’s goal percentage by 5.49 (5.49 being the approximate average number of non empty net goals scored per game in the post-lockout era)
  • GA/game values were calculated by subtracting a team’s GF/game value from 5.49
  • All 1230 games from the 2010-11 regular season were then simulated, with a score being generated for each individual game
  • The probability of a team scoring ‘x’ number of goals in an individual game was determined through taking its GF/game value and adjusting it based on the GA/game value of the opponent
  • If each team scored an equal number of goals, each team was awarded one point and a random number generator was used to determine which of the two teams received the additional point
  • After all games were simulated, the division and conference standings were determined in accordance with NHL rules (that is, with the teams ranked by points, with the division winners being placed in the first three seeds in each conference)
  • If two teams were tied in points, greater number of wins was used as a tiebreaker
  • If two teams had the same number of points and wins, then a random number generator was used as a second tiebreaker
  • The playoff matchups were then determined based on the regular season standings
  • Individual playoff games were not simulated; rather, each series was simulated as a whole based on the Pythagorean expectations (which were derived from the goal percentage values) of the involved teams
  • Home advantage for the higher seed was valued at +0.015
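
The game and series steps above can be sketched roughly as follows. Several details are assumptions on my part rather than the method actually used: the opponent adjustment is taken to be multiplicative around the league average, the Pythagorean exponent is set to 2, head-to-head odds use the log5 formula, and the +0.015 home edge is added to the series probability. All function names are illustrative.

```python
import math
import random

LEAGUE_GOALS = 5.49            # non-empty-net goals per game, both teams (from the post)
LEAGUE_AVG = LEAGUE_GOALS / 2  # average goals per team per game

def poisson(lam, rng):
    """Draw one Poisson variate (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def rates(goal_pct):
    """GF/game and GA/game implied by a goal percentage."""
    gf = goal_pct * LEAGUE_GOALS
    return gf, LEAGUE_GOALS - gf

def simulate_game(pct_a, pct_b, rng):
    """Return the standings points earned by teams A and B in one game."""
    gf_a, ga_a = rates(pct_a)
    gf_b, ga_b = rates(pct_b)
    # Assumption: scale each team's scoring rate by the opponent's GA/game
    # relative to the league average.
    goals_a = poisson(gf_a * ga_b / LEAGUE_AVG, rng)
    goals_b = poisson(gf_b * ga_a / LEAGUE_AVG, rng)
    if goals_a > goals_b:
        return 2, 0
    if goals_b > goals_a:
        return 0, 2
    # Tied: one point each, and a coin flip decides the extra point.
    return (2, 1) if rng.random() < 0.5 else (1, 2)

def pythagorean(goal_pct, exponent=2.0):
    """Pythagorean win expectation (exponent of 2 is an assumption)."""
    gf, ga = rates(goal_pct)
    return gf ** exponent / (gf ** exponent + ga ** exponent)

def series_win_prob(pct_high_seed, pct_low_seed, home_edge=0.015, best_of=7):
    """Probability that the higher seed wins the series."""
    a, b = pythagorean(pct_high_seed), pythagorean(pct_low_seed)
    p = a * (1 - b) / (a * (1 - b) + b * (1 - a))  # log5 single-game odds
    need = best_of // 2 + 1
    # Sum over the number of games lost (0..need-1) before the clinching win.
    series = sum(math.comb(need - 1 + k, k) * p ** need * (1 - p) ** k
                 for k in range(need))
    return series + home_edge

rng = random.Random(1)
print(simulate_game(0.53, 0.48, rng))
print(round(series_win_prob(0.56, 0.50), 3))
```

Two evenly matched teams come out at 0.515 for the higher seed under these assumptions, which is simply a coin flip plus the home edge.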

20,000 simulations were conducted in total. Here’s how the league’s best team – defined as the team with the best underlying goal percentage in each individual season – fared. We’ll start with the regular season results:

The above chart shows how the best team performed in four areas – division rank, conference rank, league rank in points, and league rank in goal differential. So, as an example, the best team ended up winning the Presidents’ Trophy – i.e. finishing with the most points – about 32% of the time.

The results are interesting. The best team does very well in general, but the range in outcomes is considerable. It wins its division a majority of the time yet still manages to finish dead last every now and then (about once every 200 seasons). It wins the conference almost half the time and finishes in the top four about 84% of the time. However, it still misses the playoffs a non-trivial percentage of the time (2.2%). The latter fact may not be too surprising – the 2010-11 Chicago Blackhawks were close to being the best team in the league but only made the playoffs by the slimmest of margins.

It wins the Presidents’ Trophy about a third of the time and does even better in terms of goal differential, posting the best mark in roughly 40% of the simulations. However, it occasionally finishes in the bottom half of the league in both categories (about 2% and 1% of the time, respectively).

The graph below shows the distribution in year end point totals for the best team. It averaged just over 107 points, with a high of 145 and a low of 73.

And the distribution in goal differential (mean = 57; max = 161; min = -34).

Finally, the chart showing the playoff outcomes for the best team, and therefore answering the question posed earlier.

It turns out that the best team wins the cup 22% of the time – about once every five seasons. This accords well with what we’ve observed since the lockout, with the 2007-08 Detroit Red Wings being the only cup winner that was also unambiguously the best team in the league. The 2009-10 Chicago Blackhawks were probably the best but it’s hard to say for sure. The 2008-09 Penguins were a good team but the Wings were probably better that year. Ditto for the 2006-07 Ducks. The 2010-11 Bruins were merely a good team, and I can’t even say that much for the 2005-06 Hurricanes, who may not have even been one of the ten best teams in the league during that season.


The exercise assumes that team ability is static. This is obviously untrue in reality, given injuries, roster turnover, and like variables. Consequently, the best true talent team at one point in the season may not be the best team at a different point. Moreover, the spread in team talent at any given point in the season is likely to be somewhat broader than the ability curve used in the exercise.

Scores were generated for individual games through the use of Poisson probabilities, an approach that does not take score effects into account. Thus, the model slightly underestimates the incidence of tie games. For the same reason, it also overestimates the team-to-team spread in goal differential.

Tuesday, June 7, 2011

Predicting Playoff Success - Part Two

Rob Vollman raised an interesting question in the comments section of my last post, that being whether my findings precluded the possibility that some teams consistently perform better or worse in the playoffs.

The question can be answered by comparing each team's actual performance, as measured by winning percentage, with what would be expected based on regular season results. If the spread in [actual - expected] winning percentage is significantly greater than what would be expected by chance alone, then that suggests that some types of teams may consistently outperform or underperform in the playoffs relative to the regular season.

As with my last post, my sample consisted of all 1882 playoff games played between 1988 and 2010. I've prepared a chart which shows, in the leftmost section, how each of the league's teams performed during that span. The middle section of the chart shows each team's expected wins and losses, based on single-game probabilities generated from regular season data. Finally, the rightmost section shows each team's winning percentage differential (defined as observed winning percentage minus expected winning percentage), as well as the probability of observing a differential at least that large by chance alone.

That last part may require some elaboration. All 1882 games were simulated 1000 times, based on the regular-season derived probability values. For each of the individual simulations, I determined each team's winning percentage and subtracted from it that team's expected winning percentage. The p value column simply indicates the proportion of simulations in which the absolute value of that number - that is, the team's [simulated winning percentage - expected winning percentage] - was at least as large as the absolute value of that team's [observed winning percentage - expected winning percentage].

A specific example may be illustrative. Anaheim had an observed winning percentage of 0.576, an expected winning percentage of 0.462, and therefore an [observed winning percentage - expected winning percentage] of 0.114. In only 33 of the 1000 simulations (a proportion of 0.033) did Anaheim's simulated winning percentage differ from its expected winning percentage by at least 0.114. Hence Anaheim's p value of 0.033.
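
For anyone who wants to try this at home, here is a minimal sketch of how a p value of this kind can be computed for a single team. It is a simplified stand-in rather than the code used for the chart: the function name and inputs are illustrative, and it assumes each game is an independent coin flip weighted by that game's win probability.

```python
import random

def monte_carlo_p_value(win_probs, observed_wins, n_sims=1000, seed=0):
    """Share of simulations whose |simulated - expected| winning percentage
    is at least as large as the observed |observed - expected|.

    win_probs: the team's win probability in each of its playoff games.
    """
    rng = random.Random(seed)
    n_games = len(win_probs)
    expected_pct = sum(win_probs) / n_games
    observed_diff = abs(observed_wins / n_games - expected_pct)
    hits = 0
    for _ in range(n_sims):
        # Each game is an independent weighted coin flip.
        sim_wins = sum(rng.random() < p for p in win_probs)
        if abs(sim_wins / n_games - expected_pct) >= observed_diff:
            hits += 1
    return hits / n_sims

# Hypothetical team (not real data): 100 games as a 0.48 favorite, 57 wins.
print(monte_carlo_p_value([0.48] * 100, observed_wins=57))
```

A team that lands exactly on its expected winning percentage gets a p value of 1, and the value shrinks as the observed record moves further from expectation in either direction.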

As can be seen, some teams outperformed their expected winning percentage, whereas others underachieved. Based on each team's [observed winning percentage - expected winning percentage], and the probability of each differential materializing by chance alone, Edmonton, Pittsburgh and Anaheim were the three most "clutch" teams, whereas the Islanders, Columbus and Atlanta were the biggest "chokers." But is the spread between the teams any different from what would be predicted from chance alone?

There are two ways in which this question can be answered. The first is to group the observed winning percentage differentials (expected versus actual winning percentage) into several categories, and calculate the number of values in each category as a percentage of the total sample (relative frequency). Following that, the same can be done with the simulated differentials. The two distributions can then be compared.

The second is to repeat the exact same exercise, but to use actual wins instead of winning percentage. I prefer this second method as the fact that some teams, such as Atlanta, Columbus and Quebec, played very few games has the potential to skew the results if winning percentage is used.

Here are the two graphs:

In the case of winning percentage, the actual spread is noticeably greater than the simulated spread. But the difference is not too large, there being something of a general correspondence. And in the case of wins, the two lines form almost a perfect match.

If I were to issue a conclusion, it would be that although some teams over or underperform in the playoffs relative to their regular season results, this appears to be mostly the product of normal statistical variation. There isn't much support for the idea that there exists an ability to perform in the playoffs that is independent and separate from the ability to perform during the regular season.

Saturday, June 4, 2011

Predicting Playoff Success

It's often said that the playoffs are a different ball game as compared to the regular season - that some teams are built for the playoffs whereas others are not.

The above statement can be evaluated by looking at how well regular season results predict playoff success. This can be done by assigning a theoretical win probability to every playoff team based on how it performed during the regular season, and determining the odds for each individual matchup on that basis. If the statement is true, the favorite - the team with the superior win probability as against its opponent - should win significantly less often than expected.

My sample consisted of all 1882 playoff games played between 1988 and 2010. Theoretical win probabilities were computed on the basis of regular season goal ratio, corrected for schedule difficulty. While goal ratio is imperfect in this respect, the data required to produce more precise estimates is simply not available for the majority of the seasons included in the sample. Thus, goal ratio is the best measure available.

Home advantage was valued at +0.056, this being the difference between the expected neutral ice winning percentage of home teams (0.505), and their observed winning percentage over the games included in the sample (0.561).

After computing the odds for each individual game, I divided the data into eight categories. The category in which a game was placed depended upon the expected winning percentage of the favorite. The cutoffs for the eight categories were as follows:
  1. 0.50-0.52
  2. 0.52-0.54
  3. 0.54-0.56
  4. 0.56-0.58
  5. 0.58-0.60
  6. 0.60-0.625
  7. 0.625-0.675
  8. 0.675-1
The cutoffs were not gerrymandered so as to produce a particular result - I simply wanted each category to contain a relatively equal number of games. As there are many more games in which the favorite has a win probability between 0.50 and 0.60 than there are games in which the favorite has a win probability greater than 0.60, this necessitated making certain categories larger than others.
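
The grouping itself is simple to reproduce. The sketch below uses the cutoffs listed above, with illustrative function names and made-up sample games. How a game sitting exactly on a boundary is handled is my assumption (it goes into the higher category).

```python
from bisect import bisect_right
from collections import defaultdict

# Upper edges of categories 1-7; anything above 0.675 falls in category 8.
CUTOFFS = [0.52, 0.54, 0.56, 0.58, 0.60, 0.625, 0.675]

def category(favorite_prob):
    """Category number (1-8) for the favorite's expected winning percentage."""
    return bisect_right(CUTOFFS, favorite_prob) + 1

def summarize(games):
    """games: iterable of (favorite_prob, favorite_won) pairs.

    Returns {category: (n, expected win pct, observed win pct)}.
    """
    totals = defaultdict(lambda: [0, 0.0, 0])  # n, summed probability, wins
    for prob, won in games:
        bucket = totals[category(prob)]
        bucket[0] += 1
        bucket[1] += prob
        bucket[2] += int(won)
    return {cat: (n, exp_sum / n, wins / n)
            for cat, (n, exp_sum, wins) in sorted(totals.items())}

# Made-up games for illustration only.
sample = [(0.51, True), (0.51, False), (0.61, True), (0.70, True)]
print(summarize(sample))
```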

The results:

[The italicized 'n' column simply indicates the number of games contained within each category.]

As can be seen, using regular season data allows one to predict the results of groups of individual playoff games with surprising accuracy. On the whole, the favorite did slightly worse in reality than what the regular season results predicted - 0.573 versus 0.586. However, this is probably just a reflection of the fact that regular season goal ratio is the product of both skill and luck, and that the true talent goal ratio of the average team lies closer to the population average than does its observed goal ratio.

As for the individual categories, six of the eight show a reasonably close correspondence between expected and observed winning percentage, with the other two featuring notable discrepancies. While each gap appears significant, either could be the product of chance alone. The probability of a 0.51 team going 0.468 or worse over 263 games is 0.097. Likewise, the probability of a 0.713 team going 0.671 or worse over 204 games is 0.109.
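
Those two probabilities can be approximated with a plain binomial tail. The sketch below assumes every game in a category carries the category's average win probability, and that the win totals can be recovered by rounding winning percentage times games. Both are simplifications, so the output should land close to the quoted figures rather than match them exactly.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# A 0.51 favorite going 0.468 or worse over 263 games.
low_cat = binom_cdf(round(0.468 * 263), 263, 0.51)
# A 0.713 favorite going 0.671 or worse over 204 games.
high_cat = binom_cdf(round(0.671 * 204), 204, 0.713)
print(round(low_cat, 3), round(high_cat, 3))
```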

If I were to guess, I'd say that the discrepancy in the 0.675-1 category is a real effect. As discussed earlier, goal ratio tends to overvalue the favorite and underrate the underdog, and the greater the distance from the mean, the more likely this is to be true in individual cases.

Thursday, June 2, 2011

More on Team EV Shooting Ability

About a week ago, I put up a post on team even strength shooting percentage, which included a chart showing what the underlying talent distribution in that area probably looks like. I've reproduced the relevant curve below:

The curve isn't excessively narrow. The 97th percentile equates to a shooting percentage of 0.0902, meaning that, in an average season, the league's most talented EV shooting team would have an underlying shooting percentage at or around that mark. That's no trivial advantage - with neutral luck, such a team would be expected to score roughly 18 more even strength goals than a team with average EV shooting ability.

The problem is that, given that goals in the NHL are somewhat of a statistical rarity, the regular season doesn't provide us with a sample that is sufficiently large so as to be able to identify each team's true talent level with reasonable accuracy.

This estimate uncertainty is well illustrated by comparing last year's Devils, who had a league worst 0.065 EV shooting percentage, with last year's Stars, who posted the league's best mark at 0.089. That seems like a fairly large gap - almost two and a half percentage points. Surely one would be able to conclude that the 2010-11 Stars possessed more EV shooting talent than the 2010-11 Devils?

In fact, there is a not-insignificant probability that the Devils were actually the better EV shooting team. This becomes immediately apparent upon viewing the ability distribution for each team and noting the overlap between the two curves.

There is an 11.6% chance that New Jersey was actually the more talented team last season in terms of EV shooting ability. In other words, there will be some seasons - of which 2010-11 is an example - that do not permit the conclusion that any single team has definitively more EV shooting talent than any other.
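
For the curious, here is one way an overlap probability like that can be computed once each team's ability curve is in hand. The sketch assumes both curves are roughly normal and independent, and the means and standard deviations in the example are placeholders chosen for illustration, not the values behind the 11.6% figure.

```python
from math import erf, sqrt

def prob_a_better(mean_a, sd_a, mean_b, sd_b):
    """P(talent_A > talent_B) for two independent normal ability curves."""
    # The difference of two independent normals is itself normal.
    z = (mean_a - mean_b) / sqrt(sd_a ** 2 + sd_b ** 2)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# Placeholder curves (hypothetical numbers): team A centered lower than team B.
print(round(prob_a_better(0.078, 0.0035, 0.084, 0.0035), 3))
```

The wider the two curves are relative to the gap between their centers, the closer that probability gets to a coin flip.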