Thursday, July 30, 2009

Playing with the Lead and the Percentages: Part Two

The other day, I wrote about how, in any particular season, the sum of a team's shooting and save percentage is correlated with how much time that team spent playing with the lead, and how this relationship is, in turn, related to shot differential.

The purpose of this post is to elaborate upon that.

Firstly, the issue of causation. While it goes without saying that correlation does not imply causation, I think it's reasonable to assume that there is some sort of causal relationship here.

I think that the arrow of causation is bi-directional. For one, a team that is lucky or good with the percentages when the score is tied will, on average, tend to play with the lead more. In this sense, having a good team PDO number causes a team to play more with the lead.

On the other hand, however, I think that playing with the lead is, in and of itself, beneficial to shooting and save percentage. I'm basing this assumption on the fact that shot ratios are subject to the leading/trailing effect. I suspect that there's some sort of trade-off involved whereby the leading team's disadvantage in shot ratio is met with a corresponding advantage in the percentages.

Thus, good percentages lead to playing with the lead more, which in turn begets good percentages.

Secondly, my prediction is that playing with the lead accounts for the fact that the spread in even strength shooting percentage is somewhat larger than what would be predicted by chance alone.

Here is what has been demonstrated thus far:

The distribution of team EV S% when the score is tied is entirely random.

There are no 'real effects' with respect to EV S% when the score is tied. That is to say, it has no sustain.

Some of the variation in overall EV S% at the team level is non-random. That is to say, there is more variation than what would be predicted from chance alone.

This being the case, the logical implication is that the playing-to-score effect is one of the non-random contributions to EV S%, and perhaps the only one.
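To make the "more variation than chance alone" comparison concrete, here is a minimal Python sketch. It assumes that "chance alone" means every team shares the same true EV S% (the pooled league rate), so that each team's observed S% behaves like a binomial draw. The function names and the goal and shot totals are made up for illustration; they are not the figures behind this post.

```python
import statistics


def chance_variance(goals, shots):
    """Variance in team S% expected from binomial luck alone.

    Assumes every team has the same true S% (the pooled league rate p),
    so a team with n shots has S% variance p * (1 - p) / n.
    """
    p = sum(goals) / sum(shots)
    return statistics.fmean([p * (1 - p) / n for n in shots])


def observed_variance(goals, shots):
    """Population variance of the observed team shooting percentages."""
    return statistics.pvariance([g / n for g, n in zip(goals, shots)])


# Hypothetical EV goal and shot totals for five teams (illustrative only).
goals = [150, 160, 140, 170, 130]
shots = [1900, 1950, 1850, 2000, 1800]

ratio = observed_variance(goals, shots) / chance_variance(goals, shots)
print(f"observed / chance variance: {ratio:.2f}")
```

A ratio close to 1 would be consistent with the spread being luck, while a ratio well above 1 would suggest some non-random contribution. Running the same comparison separately on score-tied shots and on all EV shots is one way to frame the argument above.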

As a preliminary test for this hypothesis, I looked at the relationship between [minutes played with the lead - minutes played trailing] and various even strength variables for the 2008-09 season. The results are contained below:


While the results are not unequivocally supportive, I think they tend to accord with my theory.

The teams that did better with the percentages when the score was tied at EV tended to play more with the lead overall, which is not unexpected. Moreover, and perhaps more importantly, teams that did better with the percentages at EV when the score wasn't tied tended to play more with the lead as well.

Of course, I'll refrain from saying anything with confidence until further analysis is performed.
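For anyone who wants to try the preliminary test on their own numbers, here is a rough Python sketch of the correlation between [minutes played with the lead - minutes played trailing] and PDO (EV shooting percentage plus EV save percentage). The team rows below are invented placeholders, and `pearson_r` is a plain hand-written Pearson correlation rather than the exact procedure used for the results above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical rows: (minutes leading, minutes trailing, EV S%, EV SV%).
teams = [
    (1500, 1100, 0.085, 0.925),
    (1300, 1300, 0.080, 0.918),
    (1100, 1500, 0.076, 0.912),
    (1400, 1200, 0.079, 0.922),
    (1200, 1350, 0.083, 0.915),
]

lead_diff = [lead - trail for lead, trail, _, _ in teams]
pdo = [sh_pct + sv_pct for _, _, sh_pct, sv_pct in teams]

print(f"r(lead - trail, PDO) = {pearson_r(lead_diff, pdo):.2f}")
```

Swapping `pdo` for score-tied or non-tied shooting and save percentages would give the other comparisons discussed above.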

13 comments:

  1. J,

    I don't completely understand what you're theorizing with these last two posts. Can you maybe state your assertions (even if you haven't corroborated them with full confidence yet) in a different way?

    It sounds like you're saying that teams who do better with percentages tend to have the lead more often. That seems obvious enough.

    But then it sounds like you're saying that when teams have the lead they tend to see their percentages go up and their shot differentials go down (relative to their numbers when the game is tied). Do I have that right?

  2. Sunny:

    "It sounds like you're saying that teams who do better with percentages tend to have the lead more often. That seems obvious enough."

    Well, the point of the first post was to show how that relationship varies with shot differential.

    On the whole, there isn't much of a relationship between shot differential and the percentages. However, once time spent playing with the lead is controlled for, a fairly strong relationship emerges.

    I suppose my point was that the relationship between shot totals and the percentages is more nuanced than I originally thought.

    In terms of the second post, I was basically positing the leading/trailing effect as the cause of the increased variance in overall EV S% and SV% relative to EV S% and SV% when the score is tied.

  3. What is PDO? Just so I can understand the analysis.

  4. Moneypuck:

    As originally defined:

    PDO = EV shooting percentage + EV Save Percentage

  5. Oh yeah, I read something about how teams' PDO usually revolves around 1. I don't see the usefulness in it though, because as I see it, that's an average, which by all means makes sense when you think about it, but it's not a great tool for team-by-team analysis.

  6. The point is, a club's PDO will regress to the mean (100) over the long-term. If a team is hovering well below or well above 100, you can bet the percentages are having a sizable effect on their results and they will be in line for a correction eventually.

  7. I saw PDO studies over at mc79, and while I obviously agree with the logic I still don't buy that it applies to all teams. If you have good goaltending and good scorers there's a good chance your PDO's mean of regression is over one.

    I see PDO as more of a general rule for league studies than team cases.

  8. Fair enough. As Tyler notes in the PDO post, it's regression "towards" the mean, not necessarily "to".

  9. Of course it's "towards" the mean, all shooters regress towards the NHL shooting % mean over a very long span (hundreds of games) and same with goalies.

    However, I bet Malkin won't be below the 10% mark before his career ends, because he's a great player, same reason I don't think great teams can be tagged as "lucky" by using PDO.

    If the team's shooting % is something ridiculous like 15%, I would say that yes, there would be definite regression, but I don't see how adding SH% + SV% is effective anyway. Yes, there's a reason why it's 1: Avg SH% + Avg SV% = 1.

    SH% and SV% are two separate entities though, which just happen to feed off each other. They analyze two separate things, and IMO should not be combined like that. You might as well just use the league average for both categories to determine an expected level of regression.

  10. "Of course it's "towards" the mean, all shooters regress towards the NHL shooting % mean over a very long span (hundreds of games) and same with goalies."

    That's actually not true.

  11. I meant to say their mean, not the NHL mean. I didn't proofread that long comment.

  12. I'm interested in what you're doing here, but I agree with Sunnymehta that it is a little obscure what two things are correlated.

    It appears to me that your data shows "teams that often play with a lead also tend to have a better than average SV% and ST%"...which is a bit underwhelming.

    The more interesting part to me is that the advantage accrued to 'frequent leaders' appears to rest most often on the defensive side of things.

    Now you don't present multivariate analysis, but I assume that you're using the exact same data for both ST% and SV% correlations--and it is the SV% correlations that are much stronger.

    If I'm reading your tables right the conclusion is that the 'frequent leaders', i.e. good teams, are slightly more accurate/lucky in terms of finishing chances, but MUCH better/luckier at defending shots against.

    Feel free to correct me if I have this wrong.

  13. Falconer:

    You're correct in that the correlations appear to be stronger for save percentage. Granted, it's only two seasons' worth of data so I'd be reluctant to draw any conclusions.

    In a previous post, I surmised that the leading/trailing effect is responsible for the fact that the observed variance in EV S% (and presumably EV SV%) is larger than what would be predicted on the basis of chance alone. The purpose of this post was to see if the data supported that conclusion.
