Tuesday, July 28, 2009

Playing With The Lead and the Percentages

Depicted above is a graph showing the relationship between playing with the lead and PDO number at the team level for last season. The teams coded in black had an aggregate shot differential greater than 100 last season. Teams coded in red had a shot differential below -100. Teams coded in white had a shot differential between -100 and 100.
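The colour coding described above amounts to a simple three-way bucketing, which can be sketched as follows. The function name is hypothetical, and treating a differential of exactly 100 or -100 as "white" is an assumption, since the post does not say how the boundaries were handled.

```python
def shot_diff_bucket(shot_diff):
    """Return the graph colour for a team's season shot differential.

    Assumption: values of exactly +100 or -100 fall in the "white" group;
    the post does not specify how the boundaries were treated.
    """
    if shot_diff > 100:
        return "black"   # outshot opponents by more than 100
    if shot_diff < -100:
        return "red"     # outshot by opponents by more than 100
    return "white"       # roughly even shot differential
```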

Team PDO is defined as the sum of team shooting percentage and team save percentage. Unlike conventional PDO numbers, these figures are not solely for even strength play - special teams play is included. The same is true for the minutes played data. However, empty netters have been excluded in calculating each team's shooting and save percentage.
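As a minimal sketch of that definition, the calculation might look like the following. The function and parameter names are hypothetical, and the inputs are assumed to already have empty-net goals and shots removed, in line with the exclusion described above.

```python
def team_pdo(goals_for, shots_for, goals_against, shots_against):
    """Return team PDO: shooting percentage plus save percentage.

    Both percentages are expressed as fractions, so an average team
    lands near 1.000. Inputs are assumed to exclude empty-net goals
    and shots (an assumption about how the data was prepared).
    """
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("shot totals must be positive")
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return shooting_pct + save_pct
```

For example, a made-up team with 250 goals on 2500 shots and 225 goals against on 2500 shots against would have a shooting percentage of .100 and a save percentage of .910, for a PDO of 1.010.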

Playing with the lead is favorable to the percentages. The relationship is quite strong, too - the correlation between [Minutes played leading - Minutes played trailing] and Team PDO was 0.63 for last season. This is similar to the correlations observed in other seasons.
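For readers who want to run the same kind of check on their own data, here is a rough sketch of a Pearson correlation between each team's [Minutes played leading - Minutes played trailing] differential and its Team PDO. The post does not say which correlation measure was used, so Pearson is an assumption, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation between two equal-length sequences.

    xs would be each team's lead-minus-trail minutes differential and
    ys each team's PDO (one entry per team, in the same team order).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)
```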

Only 2007-08 is anomalous. And even then, the correlation is positive.

What's interesting, however, is how the relationship varies according to shot differential.

I've long been opposed to the idea that there exists a relationship between shot totals and the percentages at the team level. I've been particularly opposed to the idea that there is a relationship between goaltender save percentage and number of shots faced. Now, in fairness, there isn't much of a relationship between the two in general. Shown below is the correlation between Team PDO and team shot differential for every season since 2002-03.

Thus, only in 2007-08 was there anything of a relationship. The correlations for every other season are not significantly different from zero. On a related note, I did the same thing for shots against and goaltender save percentage in a previous post and obtained similar results.

Of course, the teams that get outshot over the course of a season tend to be the teams that are consistently playing from behind. (The correlation is approximately 0.5-0.6).

Once this fact is controlled for, a positive relationship between Team PDO and shot differential emerges. That is to say, teams with negative differentials tend to do much better in terms of the percentages than what would otherwise be predicted on the basis of their [Minutes played leading - Minutes played trailing] differential.

To illustrate this, I assigned each team an expected PDO number based upon its [Minutes played leading - Minutes played trailing] differential. I then determined the correlation between expected PDO and shot differential for each of the involved seasons.
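One way to reproduce this step is sketched below. It assumes the expected PDO comes from an ordinary least-squares line fitted to the lead-minus-trail differential. The post does not state how expected PDO was actually assigned, so the linear fit and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def pdo_residuals(lead_diffs, pdos):
    """Return actual PDO minus expected PDO for each team.

    Expected PDO is read off the fitted line (an assumed method);
    a positive residual means the team beat its expected PDO.
    """
    slope, intercept = fit_line(lead_diffs, pdos)
    return [pdo - (intercept + slope * ld) for ld, pdo in zip(lead_diffs, pdos)]
```

Correlating these residuals with each team's shot differential would then show whether outshot teams tend to beat their expected PDO.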

While the strength of the correlation varies from year to year, it's apparent that having a negative shot differential allows a team to outperform its expected PDO.

I suspect that this is true for the following reasons:

The team that plays with the lead will tend to have a higher scoring chance/shot ratio than a team that plays from behind. This is because a team that plays from behind is forced to take more chances in an attempt to tie the score.

However, a team that has a good shot differential will tend to get the better of the play regardless of whether it is leading or trailing. Likewise, a team with a poor shot differential will tend to get dominated territorially regardless of goal state.

To use a concrete example, if San Jose is playing Florida, and San Jose is winning, San Jose is still likely getting the better of the play. The puck will tend to spend much more time in Florida's end than in San Jose's. Therefore, while San Jose will surely still end up outchancing the Panthers, it is likely that the Panthers will end up with the better scoring chance/shots ratio on account of generating more of its shots through odd man rushes and the like (rather than, say, shots from the periphery of the offensive zone that are generated through periods of sustained pressure).

Anyway, I plan to analyze the data in more detail in the future. I think it might go a long way in accounting for some of the more anomalous teams over the past few years (2006-07 Predators, 2006-07 Sabres, 2007-08 Canadiens, and so forth). I also think that it might have some utility in terms of goaltender analysis.

One more thing: Intuitively, I would expect that the leading-trailing effect would be most pronounced at even strength.

As much as I would have liked to confine the data to even strength play only, that wasn't possible. Granted, the correlation between overall leading-trailing differential and leading-trailing differential at even strength is bound to be quite high.


Scott Reynolds said...

Excellent work. I think what you describe as a "higher scoring chance to shot ratio" is what people should be talking about when they mention shot quality. Unfortunately, right now it's mostly measured by shot distance. If we do indeed get a group of five together to look at the Northwest Division, we will be able to test a lot of this data, since there will be both good teams and poor teams represented.

The "playing to the score" effect looks to have a big impact here as well. It would be interesting to break the data down by period. I would expect the results to be even more extreme in the last ten minutes of a game, culminating at the end of the game when the trailing team pulls the goaltender and any shot on goal by the leading team, from any location on the ice, is a scoring chance/goal.

JLikens said...

Good point about breaking the data down by period. I'll have to look at doing that in the near future.

As for the scoring chance project, I think that's a great idea and I'd definitely be willing to participate.

I think there was a thread at IOF a while back in which it was said (possibly by me) that I could do the chances for Minnesota games, which works for me.

I get Center Ice every year so having access to the games wouldn't be an issue.