Thursday, July 30, 2009

Playing with the Lead and the Percentages: Part Two

The other day, I wrote about how, in any particular season, the sum of a team's shooting and save percentage is correlated with how much time that team spent playing with the lead, and how this relationship is, in turn, related to shot differential.

The purpose of this post is to elaborate upon that.

Firstly, the issue of causation. While it goes without saying that correlation does not imply causation, I think it's reasonable to assume that there is some sort of causal relationship here.

I think that the arrow of causation is bi-directional. For one, a team that is lucky or good with the percentages when the score is tied will, on average, tend to play with the lead more. In this sense, having a good team PDO number causes a team to play more with the lead.

On the other hand, however, I think that playing with the lead is, in and of itself, beneficial to shooting and save percentage. I'm basing this assumption on the fact that shot ratios are subject to the leading/trailing effect. I suspect that there's some sort of trade off involved whereby the trailing team's advantage in shot ratio is met with a corresponding disadvantage in the percentages.

Thus, good percentages lead to playing with the lead more, which in turn begets good percentages.

Secondly, my prediction is that playing with the lead accounts for the fact that the spread in even strength shooting percentage is somewhat larger than what would be predicted by chance alone.

Here is what has been demonstrated thus far:

The distribution of team EV S% when the score is tied is entirely random.

There are no 'real effects' with respect to EV S% when the score is tied. That is to say, it has no sustain.

Some of the variation in overall EV S% at the team level is non-random. That is to say, there is more variation than what would be predicted from chance alone.

This being the case, the logical implication is that the playing to score effect is one of the - perhaps the only - non-random contributors to EV S%.
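To make "more variation than what would be predicted from chance alone" concrete, here is a minimal sketch of the comparison. It assumes that every shot is an independent trial with the league-average chance of scoring, so that the spread expected from luck alone is the binomial standard deviation. The function names and all of the numbers are made up for illustration; they are not the data behind the claims above.

```python
import math

def binomial_sd(shooting_pct, shots):
    """Spread in observed shooting percentage expected from chance alone,
    assuming each shot scores independently with probability shooting_pct."""
    return math.sqrt(shooting_pct * (1.0 - shooting_pct) / shots)

def observed_sd(values):
    """Population standard deviation of a list of team shooting percentages."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Hypothetical inputs for illustration only, not real league data.
league_sh_pct = 0.08
shots_per_team = 1800
team_sh_pct = [0.071, 0.078, 0.080, 0.083, 0.092]

chance_sd = binomial_sd(league_sh_pct, shots_per_team)
actual_sd = observed_sd(team_sh_pct)
print(f"spread expected from chance: {chance_sd:.4f}")
print(f"observed spread:             {actual_sd:.4f}")
```

If the observed spread is about the same as the chance spread, there is no room for real effects; if it is clearly larger, some of the variation is non-random.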

As a preliminary test for this hypothesis, I looked at the relationship between [minutes played with the lead - minutes played trailing] and various even strength variables for the 2008-09 season. The results are contained below:

While the results are not unequivocally supportive, I think they tend to accord with my theory.

The teams that did better with the percentages when the score was tied at EV tended to play more with the lead overall - that's not unexpected. Moreover, and perhaps more importantly, teams that did better with the percentages at EV when the score wasn't tied tended to play more with the lead as well.

Of course, I'll refrain from saying anything with confidence until further analysis is performed.

Tuesday, July 28, 2009

Playing With The Lead and the Percentages

Depicted above is a graph showing the relationship between playing with the lead and PDO number at the team level for last season. The teams coded in black are teams that had an aggregate shot differential greater than 100 last season. Teams coded in red are teams that had a shot differential less than -100. Teams coded in white are teams that had a shot differential between 100 and -100.

Team PDO is defined as the sum of team shooting percentage and team save percentage. Unlike conventional PDO numbers, these figures are not solely for even strength play - special teams play is included. The same is true for the minutes played data. However, empty netters have been excluded in calculating each team's shooting and save percentage.
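For anyone who wants to reproduce the number, the definition above can be sketched as follows. The function and argument names are my own, and the inputs are assumed to already have empty net goals and shots removed.

```python
def team_pdo(goals_for, shots_for, goals_against, shots_against):
    """Team PDO as defined above: shooting percentage plus save percentage.

    All four counts are assumed to exclude empty netters and to cover
    all game situations (even strength and special teams).
    """
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return shooting_pct + save_pct

# Hypothetical counts for illustration only.
print(team_pdo(goals_for=230, shots_for=2500, goals_against=215, shots_against=2450))
```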

Playing with the lead is favorable to the percentages. The relationship is quite strong, too - the correlation between [Minutes played leading - Minutes played trailing] and Team PDO was 0.63 for last season. This is similar to the correlations observed in other seasons.

Only 2007-08 is anomalous. And even then, the correlation is positive.
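The correlations quoted here are ordinary Pearson correlations across the league's teams. A minimal sketch of the calculation is below; the function name is mine and the sample values are invented purely to show the call.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: [minutes leading - minutes trailing] and Team PDO.
lead_diff = [-600.0, -250.0, 0.0, 300.0, 700.0]
pdo = [0.985, 1.002, 0.996, 1.004, 1.015]
print(round(pearson_r(lead_diff, pdo), 2))
```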

What's interesting, however, is how the relationship varies according to shot differential.

I've long been opposed to the idea that there exists a relationship between shot totals and the percentages at the team level. I've been particularly opposed to the idea that there is a relationship between goaltender save percentage and number of shots faced. Now, in fairness, there isn't much of a relationship between the two in general. Shown below is the correlation between Team PDO and team shot differential for every season since 2002-03.

Thus, only in 2007-08 was there anything of a relationship. The correlations for every other season are not significantly different from zero. On a related note, I did the same thing for shots against and goaltender save percentage in a previous post and obtained similar results.

Of course, the teams that get outshot over the course of a season tend to be the teams that are consistently playing from behind. (The correlation is approximately 0.5-0.6).

Once this fact is controlled for, a negative relationship between Team PDO and shot differential emerges. That is to say, teams with negative differentials tend to do much better in terms of the percentages than what would otherwise be predicted on the basis of their [Minutes played leading - Minutes played trailing] differential.

To illustrate this, I assigned each team an expected PDO number based upon its [Minutes played leading - Minutes played trailing] differential. I then determined the correlation between expected PDO and shot differential for each of the involved seasons.
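One way this kind of control could be sketched is below. It rests on two assumptions that are not spelled out above: that expected PDO comes from a straight-line least squares fit of PDO on the leading-trailing differential, and that the quantity then compared against shot differential is each team's actual PDO minus its expected PDO. The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pdo_vs_expected(lead_diffs, pdos):
    """Actual PDO minus the PDO expected from leading-trailing differential."""
    slope, intercept = fit_line(lead_diffs, pdos)
    return [pdo - (intercept + slope * diff)
            for diff, pdo in zip(lead_diffs, pdos)]

# Hypothetical values for illustration only.
lead_diffs = [-600.0, -250.0, 0.0, 300.0, 700.0]
pdos = [0.985, 1.002, 0.996, 1.004, 1.015]
print(pdo_vs_expected(lead_diffs, pdos))
```

A team with a positive value here is doing better with the percentages than its time spent leading would suggest, and those values can then be correlated with shot differential.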

While the strength of the correlation varies from year to year, it's apparent that having a negative shot differential allows a team to outperform its expected PDO.

I suspect that this is true for the following reasons:

The team that plays with the lead will tend to have a higher scoring chance/shot ratio than a team that plays from behind. This is because a team that plays from behind is forced to take more chances in an attempt to tie the score.

However, a team that has a good shot differential will tend to get the better of the play regardless of whether it is leading or trailing. Likewise, a team with a poor shot differential will tend to get dominated territorially regardless of goal state.

To use a concrete example, if San Jose is playing Florida, and San Jose is winning, San Jose is still likely getting the better of the play. The puck will tend to spend much more time in Florida's end than in San Jose's. Therefore, while San Jose will surely still end up outchancing the Panthers, it is likely that the Panthers will end up with the better scoring chance/shots ratio on account of generating more of its shots through odd man rushes and the like (rather than, say, shots from the periphery of the offensive zone that are generated through periods of sustained pressure).

Anyway, I plan to analyze the data in more detail in the future. I think it might go a long way in accounting for some of the more anomalous teams over the past few years (2006-07 Predators, 2006-07 Sabres, 2007-08 Canadiens, and so forth). I also think that it might have some utility in terms of goaltender analysis.

One more thing: Intuitively, I would expect that the leading-trailing effect would be most pronounced at even strength.

As much as I would have liked to confine the data to even strength play only, that wasn't possible. Granted, the correlation between overall leading-trailing differential and leading-trailing differential at even strength is bound to be quite high.

Wednesday, July 1, 2009

Zone Shift

I've been doing a bit of work with the Zone Shift stat as of late.

For those unfamiliar, Zone Shift is a stat conceptualized by Vic Ferrari, who has from time to time discussed the metric at his blog.

For individual players, Zone Shift is calculated as follows:

[EV Shifts Started in the Defensive Zone - EV Shifts Started in the Offensive Zone] -
[EV Shifts Ended in the Defensive Zone - EV Shifts Ended in the Offensive Zone]
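As a quick sketch, the formula above translates directly into code. The function and argument names are mine, and the counts in the example are invented.

```python
def zone_shift(started_dz, started_oz, ended_dz, ended_oz):
    """Zone Shift from raw EV shift counts, following the formula above.

    A positive value means the player's shifts ended in the defensive zone
    less often, relative to the offensive zone, than they started there.
    """
    return (started_dz - started_oz) - (ended_dz - ended_oz)

# Hypothetical counts: 300 DZ starts, 200 OZ starts, 250 DZ ends, 240 OZ ends.
print(zone_shift(300, 200, 250, 240))  # 90
```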

What Zone Shift is essentially measuring, albeit somewhat crudely, is the ability of the player to move the puck in the right direction - a valuable, if underrated, asset to have as a player.

Having said that, in browsing through the data, I couldn't help but notice that the players with the best Zone Shift numbers tended to take a large proportion of defensive zone draws relative to their teammates.

In order to quantify the effect, I calculated each team's aggregate zone ratio - that is, EV Defensive Zone draws/EV Offensive Zone Draws - and multiplied that ratio by one hundred. This stat can be termed 'TEAM ZONE RATIO.' To give a concrete example, the Thrashers were destroyed territorially this year at EV and took roughly 1.34 EV Defensive Zone draws for each Offensive Zone draw, thus giving them a TEAM ZONE RATIO figure of approximately 134.

I then figured out the exact same stat for every player in the league who was on the ice for at least 50 EV faceoffs in each of the three zones (Defensive, Offensive, Neutral), counting only the EV faceoffs that the player was on the ice for when his shift BEGAN. We'll call this figure PLAYER ZONE RATIO STARTING.

I then subtracted the TEAM ZONE RATIO of that player's team from this figure. This stat can be called 'PLAYER ZONE DIFFERENTIAL.'

Again, to give a concrete example, Colby Armstrong took approximately 1.51 EV Defensive Zone draws for each EV Offensive Zone Draw, therefore giving him a PLAYER ZONE RATIO STARTING figure of around 151, and a PLAYER ZONE DIFFERENTIAL of 17 (151-134=17).

I then figured out each player's zone ratio for all shifts that ended with him on the ice. We'll term this PLAYER ZONE RATIO ENDING. Going back to Armstrong again, he ended 1.16 shifts in his own zone for every shift ended in the other team's end of the rink, therefore giving him a PLAYER ZONE RATIO ENDING number of 116.

Finally, I subtracted each player's ZONE RATIO ENDING number from his ZONE RATIO STARTING number in order to produce a ZONE SHIFT number. Armstrong's was around 35, which is pretty good - one of the best in the league, in fact.
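The chain of steps above can be sketched in a few lines, using the Armstrong and Thrashers figures quoted in the text. The function and variable names are my own shorthand for the stats described.

```python
def zone_ratio(defensive_zone_count, offensive_zone_count):
    """Defensive zone count per offensive zone count, multiplied by one hundred."""
    return 100.0 * defensive_zone_count / offensive_zone_count

# Figures quoted above, already expressed as ratios multiplied by one hundred.
team_zone_ratio = 134   # Thrashers
ratio_starting = 151    # Armstrong, PLAYER ZONE RATIO STARTING
ratio_ending = 116      # Armstrong, PLAYER ZONE RATIO ENDING

player_zone_differential = ratio_starting - team_zone_ratio
zone_shift = ratio_starting - ratio_ending

print(player_zone_differential)  # 17
print(zone_shift)                # 35
```

Note that this ratio-based ZONE SHIFT is on a different scale from the raw count formula given earlier, so the two numbers are not interchangeable.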

It appears that starting a high proportion of your EV faceoffs in your own zone relative to your team average - in other words, having a high PLAYER ZONE DIFFERENTIAL - is pretty favorable toward ZONE SHIFT. Among all players on the ice for at least 50 EV faceoffs in each zone, the correlation was 0.80. Moreover, each unit increase in PLAYER ZONE DIFFERENTIAL is worth approximately a 0.88 increase in ZONE SHIFT. In other words, the effect is considerable.

To further illustrate this, consider the top ten players in unadjusted ZONE SHIFT during the 2008-09 season: Schultz, Sauer, Veilleux, Smithson, (Ryan) Johnson, Zigomanis, Hall, (Zbynek) Michalek, McClement - all of these players took a much higher percentage of defensive zone draws than their teammates.

Long story short: It's easier to have a good Zone Shift number if you're starting more in your own end of the rink relative to your teammates, and if the metric is to be worth anything at all, this ought to be corrected for.

And I've attempted to do exactly that. Contained below is a listing of the league's best and worst players in ADJUSTED ZONE SHIFT - adjusted because the stat attempts to control for the above bias. I've also included the unadjusted ZONE SHIFT numbers.

This stat is, of course, imperfect, and further corrections are probably necessary, which is something I intend to look at in the near future. I just figured I'd throw this up in the interim.
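For readers who want to experiment, one simple form such a correction could take is sketched below. This is an assumption for illustration, not necessarily the calculation behind the ADJUSTED ZONE SHIFT numbers: it subtracts the part of ZONE SHIFT that the 0.88 slope reported above would attribute to PLAYER ZONE DIFFERENTIAL. The regression intercept isn't reported, so it is left out, which only moves every player by the same constant. The function name is hypothetical.

```python
def adjusted_zone_shift(zone_shift, player_zone_differential, slope=0.88):
    """ZONE SHIFT with the share explained by PLAYER ZONE DIFFERENTIAL removed.

    slope is the approximate increase in ZONE SHIFT per unit increase in
    PLAYER ZONE DIFFERENTIAL quoted in the post. The intercept is omitted
    because it is not reported (an assumption of this sketch).
    """
    return zone_shift - slope * player_zone_differential

# Armstrong's figures from the text: ZONE SHIFT 35, PLAYER ZONE DIFFERENTIAL 17.
print(adjusted_zone_shift(35, 17))
```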