Comments on Objective NHL: The Percentages Revisited: 5-on-4 Shooting Percentage

Yeah, intuitively 08/09 does seem a bit anomalous,...

2009-05-01T13:56:00.000-07:00

Yeah, intuitively 08/09 does seem a bit anomalous, but it's still a huge factor.

IIRC, home teams tend to shoot a bit more (home fans should feel reassured that their relentless cries of Shooooooot! are being listened to, at least a bit). But their shooting% drops a commensurate amount, if that makes sense.

PPs are tricky. I think that rolling averages would reveal a lot. Meaning I think that when teams go through stretches of being selective with their shots, the shooting percentage goes up ... and when they go through stretches of just shooting it on net, the shooting percentage goes down.

Special teams in hockey are like baseball ... it seems simpler than 5v5 hockey but it isn't. Players have too much time to think, offense and defense become divorced, it gets tricky. And I sure as hell don't have the answers.
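For anyone who wants to try the rolling-average idea, here is a minimal sketch. It assumes you already have per-game 5-on-4 goals and shots for one team in schedule order; the function name, the window size and the six-game sample are made up for illustration.

```python
def rolling_shooting_pct(goals, shots, window=10):
    """Shooting percentage over a sliding window of games.

    goals and shots are per-game 5-on-4 totals in schedule order.
    Returns one value per full window (None if a window has no shots).
    """
    if len(goals) != len(shots):
        raise ValueError("goals and shots must have the same length")
    out = []
    for end in range(window, len(goals) + 1):
        g = sum(goals[end - window:end])
        s = sum(shots[end - window:end])
        out.append(g / s if s else None)
    return out


# Made-up numbers for six games, purely for illustration.
goals = [1, 0, 2, 0, 1, 0]
shots = [8, 6, 7, 10, 9, 5]
print(rolling_shooting_pct(goals, shots, window=3))
```

Pooling goals and shots inside each window (rather than averaging per-game percentages) keeps low-shot games from dominating the line.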

"It's tough to say with the actual distribution be...

2009-04-30T13:33:00.000-07:00

"It's tough to say with the actual distribution being discrete and with only 30 samples. Just off the top of my head, but I think you would need a larger sample (maybe 50 games randomly selected from each team's schedule over and over again) to get the observed distribution clear in form."

Yeah, that's a good point. The limited number of samples is definitely a problem.

And while using a normal curve with the same mean and standard deviation smoothes out the data, it presumes that the true variance is equivalent to the observed variance, which may or may not be the case.

Just looking at the 07-08 data from behindthenet, for example, it seems that the standard deviation in PP S% was lower than it was for this year. So perhaps 08-09 was a bit anomalous, leading me to overstate the 'team' contribution to PP S%.

I take it that you've looked at this type of thing before. What's your take on PP S%? I know that you mentioned in one of your posts that individual players were able to influence their on-ice PP S%.

"Then you could hypothesize a beta distribution as the distribution of ability, take a starting guess at the exponent multiplier at 300 or so to start, then parlay it through with Bernoulli trials, your new ability distribution through the binomial(ish) prior you created above. And see how well it works.

Then keep adjusting the ability distribution constants until you get a best fit, then abandon the beta entirely and just try to build a curve out of splines that works best.

That done, you would have a hell of a model.

Does that make sense? I may well be wrong, just spitballing."

Heh. I think that you've lost me here.

"BTW:

This url:
http://www.timeonice.com/gamePPinfo0708.php?gamenumber=20747
gets you the 5v4 PP info on a game by game basis.

If you wrote a calling script you could scrape off all 1230 games from 0708. Because some random selections of 41 games will have a much higher correlation to the other 41 games than others will, just by chance alone. I usually just use season splits as well, just to save time, but when I'm feeling more ambitious this is clearly the better way to go."

Thanks for the link.

Yeah, that would be the better method.

Unfortunately, my programming skills are, well, non-existent.

BTW: This url: http://www.timeonice.com/gamePPinf...

2009-04-29T20:50:00.000-07:00

BTW:

This url:
http://www.timeonice.com/gamePPinfo0708.php?gamenumber=20747
gets you the 5v4 PP info on a game by game basis.

If you wrote a calling script you could scrape off all 1230 games from 0708. Because some random selections of 41 games will have a much higher correlation to the other 41 games than others will, just by chance alone. I usually just use season splits as well, just to save time, but when I'm feeling more ambitious this is clearly the better way to go.
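A rough sketch of the "random selections of 41 games" idea, for anyone who wants to try it. It assumes the scraped data has already been boiled down to a dict mapping each team to a list of per-game (goals, shots) pairs; that layout, the function names, and the fake season used to exercise the code are all my own assumptions, and nothing here touches the timeonice page.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_half_r(team_games, rng):
    """One random 50/50 split of every team's games, then correlate S%."""
    first, second = [], []
    for games in team_games.values():
        games = games[:]          # copy so the caller's order is untouched
        rng.shuffle(games)
        half = len(games) // 2
        for part, bucket in ((games[:half], first), (games[half:], second)):
            goals = sum(g for g, _ in part)
            shots = sum(s for _, s in part)
            bucket.append(goals / shots)
    return pearson(first, second)


def mean_split_half_r(team_games, n_splits=200, seed=0):
    """Average the split-half correlation over many random splits."""
    rng = random.Random(seed)
    rs = [split_half_r(team_games, rng) for _ in range(n_splits)]
    return sum(rs) / len(rs)


def make_fake_season(n_teams=30, n_games=82, seed=1):
    """Synthetic stand-in for scraped data (NOT real NHL numbers)."""
    rng = random.Random(seed)
    season = {}
    for team in range(n_teams):
        skill = rng.uniform(0.09, 0.16)      # assumed true 5v4 S%
        games = []
        for _ in range(n_games):
            shots = rng.randint(2, 8)        # assumed 5v4 shots per game
            goals = sum(rng.random() < skill for _ in range(shots))
            games.append((goals, shots))
        season[team] = games
    return season


print(round(mean_split_half_r(make_fake_season()), 3))
```

Averaging over a couple hundred random splits is what takes the luck of any single split (such as first half versus second half) out of the estimate.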

Good stuff as always. It's tough to say with th...

2009-04-29T20:46:00.000-07:00

Good stuff as always.

It's tough to say with the actual distribution being discrete and with only 30 samples. Just off the top of my head, but I think you would need a larger sample (maybe 50 games randomly selected from each team's schedule over and over again) to get the observed distribution clear in form.

Then you could hypothesize a beta distribution as the distribution of ability, take a starting guess at the exponent multiplier at 300 or so to start, then parlay it through with Bernoulli trials, your new ability distribution through the binomial(ish) prior you created above. And see how well it works.

Then keep adjusting the ability distribution constants until you get a best fit, then abandon the beta entirely and just try to build a curve out of splines that works best.

That done, you would have a hell of a model.

Does that make sense? I may well be wrong, just spitballing.
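One possible reading of the beta step, as a sketch only. I am assuming the "exponent multiplier" of 300 means a + b = 300 for a Beta(a, b) ability distribution; the league mean of 12.5%, the 400 shots per team and the 0.02 "observed" spread are placeholder values, not figures from the post, and the spline step is left out.

```python
import random
import statistics


def simulated_sd(k, mean=0.125, shots=400, n_teams=30, n_seasons=50, seed=0):
    """Average spread of team S% when ability ~ Beta with a + b = k.

    Every input default is an assumption chosen for illustration.
    """
    rng = random.Random(seed)
    a, b = mean * k, (1 - mean) * k
    sds = []
    for _ in range(n_seasons):
        pcts = []
        for _ in range(n_teams):
            ability = rng.betavariate(a, b)   # draw a team's true S%
            # Each shot is one Bernoulli trial at that ability.
            goals = sum(rng.random() < ability for _ in range(shots))
            pcts.append(goals / shots)
        sds.append(statistics.pstdev(pcts))
    return statistics.mean(sds)


def best_k(observed_sd, candidates=(100, 200, 300, 500, 1000), **kwargs):
    """Pick the a + b value whose simulated spread is closest to observed."""
    return min(candidates,
               key=lambda k: abs(simulated_sd(k, **kwargs) - observed_sd))


# 0.02 is a placeholder, not a measured league value.
print(best_k(observed_sd=0.02))
```

A larger a + b means teams are bunched more tightly in true ability, so more of the observed spread is left to luck. Matching only the standard deviation is the crudest possible fit; comparing the whole simulated distribution to the observed one would be closer to what the comment describes.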

Sunny: "Wouldn't the fact that shooting percentag...

2009-04-29T01:18:00.000-07:00

Sunny:

"Wouldn't the fact that shooting percentages are a good bit higher on the PP than ES mean that there will be less variance?"

In the sense that a higher shooting percentage leads to higher frequency of goals?

Hmmm...

I'm not sure if that would be the case, theoretically speaking.

As it happens, there is actually more variance among NHL teams in PP S% than in EV S%, which can be accounted for by two reasons:

1. The fact that, over the course of an NHL season, there are many more shots taken at EV than on the power play. The greater the number of shots, the less randomness contributes to inter-team variation in shooting percentage (in an absolute sense).

2. The fact that the PP S% has a greater 'team-skill' component than EV S% -- that is, the contribution of random variation to PP S% is relatively weaker.

So, if the fact that PP S% is higher than EV S% does tend to reduce variance, this effect is counterbalanced by the above factors.
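To put rough numbers on the luck side of this, here is a small sketch that treats every shot as an independent Bernoulli trial, so the luck-only standard deviation of an observed shooting percentage is sqrt(p(1-p)/n). The rates and shot counts are round-number assumptions, not league data.

```python
def binomial_sd(p, shots):
    """SD of an observed shooting percentage from luck alone."""
    return (p * (1 - p) / shots) ** 0.5


# Assumed, round-number inputs (not taken from the post).
pp = binomial_sd(p=0.12, shots=400)     # one team's power play shots
ev = binomial_sd(p=0.08, shots=1800)    # one team's even strength shots
print(f"PP luck SD: {pp:.4f}")
print(f"EV luck SD: {ev:.4f}")

# Same shot count, different base rates: below 50%, the higher rate
# carries more absolute binomial variance, not less.
print(binomial_sd(0.12, 1000) > binomial_sd(0.08, 1000))
```

Under this model a higher shooting percentage only lowers the variance relative to the mean; in absolute terms it goes up, and the much smaller shot count on the power play pushes it up further.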

Dan/Mr.Ceraldi: "To start are you able to verify ...

2009-04-29T01:00:00.000-07:00

Dan/Mr.Ceraldi:

"To start, are you able to verify my claim that there is a low correlation between first half power play rate and second half power play rate?
Thanks, perhaps my research is flawed?

However, if this is true, what does it mean?

My claim is simply that if PP% is a repeatable skill, a team should be able to repeat their PP% from the first 1/2 to the second half. This seems reasonable to me... it's 1/2 a season!!!"

The problem with looking at the intra-season split half reliability for power play performance is that the relevant data simply isn't readily available.

On the other hand, NHL.com contains basic powerplay statistics for every team going back to the 1997-98 season. Thus, it's much easier to look at the interyear correlation.

I'll agree with you that, at least in theory, comparing first half performance to second half performance is the superior method.

The only season for which I have such data is 2008-09. The advanced power play statistics are from about the 48-game mark of the season. Obviously, this is a tad past the midway point of the season, but it still allows for a comparison to be made between team powerplay performance up to that point and subsequent powerplay performance over the remainder of the regular season. Here are the correlations:

METRIC ----------------- r
==============================
PP Goals For/60 -------- 0.822
Shots For/60 ----------- 0.851
PP Shooting PCT -------- 0.777

So, at least for 2008-09, powerplay performance in the first half of the season is strongly predictive of powerplay performance in the second half of the season.

"Thanks for your research!!
If my data is correct then this year is similar to 97-98.
What is your interpretation of a year like this?

I am not sure what the interpretation means for these two years? Why should these two years
differ?"

I'm not sure how to account for the lack of correlation between power play conversion rate in 1997-98 and power play conversion rate in 1998-99. It looks as though both Anaheim and St. Louis had two of the worst powerplays in the league in 1997-98, yet two of the strongest powerplays in the following season. That might have something to do with it.

JLikens, Wouldn't the fact that shooting percenta...

2009-04-28T13:14:00.000-07:00

JLikens,

Wouldn't the fact that shooting percentages are a good bit higher on the PP than ES mean that there will be less variance?

Ceraldi, You are way WAY off in what constitutes ...

2009-04-28T13:10:00.000-07:00

Ceraldi,

You are way WAY off in what constitutes a reasonable sample with regards to goal scoring rates. You're not alone - many people are simply unaware of just how much randomness there is in the NHL when it comes to scoring, and just how long it takes for noise to smooth out. One full season is really not even a big enough sample size when it comes to measuring certain statistics.

Half a season's worth of PP shots is like 200 shots. LOL. Even 2000 shots would still have a good amount of error with regards to save/shooting % (and therefore, number of goals for and goals against). For example, Chris from hockeynumbers once wrote "one thing people often forget about save percentages is the amount of error is significant, even for goaltenders who play a lot, for example a goaltender who faces 500 shots has approximately 2.5% error, or has a save percentage of 0.908 ± 0.025 or it has a 95% confidence interval of (.934, .882), Luongo with approximately 2500 shots is (.925, .903), which in terms of quality is a huge difference, in fact that range covers the top 20 goaltenders of 2005-2006."

This is why a lot of us around here use stats like Corsi or Expected Goals, because when dealing with small sample sizes they're often much more predictive of future production.
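The quoted error bars can be roughly reproduced with the usual normal approximation to a binomial proportion. This is only a sketch of that approximation, not necessarily how Chris computed his figures; the 0.914 used for the 2500-shot case is simply the midpoint of the quoted interval, and the 12% power play rate is an assumed value.

```python
def save_pct_interval(save_pct, shots, z=1.96):
    """Approximate 95% interval for a save (or shooting) percentage."""
    half_width = z * (save_pct * (1 - save_pct) / shots) ** 0.5
    return save_pct - half_width, save_pct + half_width


for pct, shots in [(0.908, 500), (0.914, 2500)]:
    low, high = save_pct_interval(pct, shots)
    print(f"{shots} shots: {low:.3f} to {high:.3f}")

# Same formula for half a season of power play shots,
# assuming 200 shots and a 12% shooting percentage.
low, high = save_pct_interval(0.12, 200)
print(f"200 PP shots at 12%: {low:.3f} to {high:.3f}")
```

The half-width shrinks with the square root of the shot count, so it takes four times as many shots to cut the error in half.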

Jlikens; thanks for your research!! If my data is...

2009-04-28T11:08:00.000-07:00

Jlikens;

Thanks for your research!!
If my data is correct then this year is similar to 97-98.
What is your interpretation of a year like this?

I am not sure what the interpretation means for these two years? Why should these two years
differ?

But I contend that 1/2 a season MUST be a large enough sample.
Year to year is problematic due to the increased change in rosters etc.

Jlikens; To start are you able to verify my claim...

2009-04-28T10:41:00.000-07:00

Jlikens;

To start, are you able to verify my claim that there is a low correlation between first half power play rate and second half power play rate?
Thanks, perhaps my research is flawed?

However, if this is true, what does it mean?

My claim is simply that if PP% is a repeatable skill, a team should be able to repeat their PP% from the first 1/2 to the second half. This seems reasonable to me... it's 1/2 a season!!!

Dan: I'm not sure which data you've looked at, bu...

2009-04-28T01:37:00.000-07:00

Dan:

I'm not sure which data you've looked at, but there is in fact a statistically significant correlation between team powerplay performance at Time 1 and team powerplay performance at Time 2.

I suspect that you've looked at an insufficiently large sample of games when analyzing the data.

Sunny is correct in that the relationship does take some time to materialize, and can therefore only be discerned if the sample of games is sufficiently large.

For example, here are the inter-year correlations in power play conversion rate at the team level. Powerplay conversion rate may not be a perfect measure of powerplay performance, but it's good enough.

YEAR1-YEAR2 ---- r
======================
0607-0708 ------ 0.407
0506-0607 ------ 0.320
0304-0506 ------ 0.455
0203-0304 ------ 0.315
0102-0203 ------ 0.496
0001-0102 ------ 0.474
9900-0001 ------ 0.376
9798-9899 ------ 0.025

I'm not sure if all of those values are statistically significant -- the last one certainly isn't. However, looking at the values as a whole, it's fairly clear that powerplay conversion rate in year n is positively correlated with powerplay conversion rate in year n+1.
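One quick way to check significance is the Fisher z approximation, sketched below. It assumes every correlation is computed over 30 teams, which overstates the sample for the oldest pairs, since the league had fewer than 30 teams before 2000-01. The function name is my own.

```python
import math
from statistics import NormalDist


def fisher_p_value(r, n):
    """Approximate two-sided p-value for H0: true correlation is zero."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


N_TEAMS = 30  # assumption; too high for the late-1990s season pairs

correlations = {
    "0607-0708": 0.407, "0506-0607": 0.32, "0304-0506": 0.455,
    "0203-0304": 0.315, "0102-0203": 0.496, "0001-0102": 0.474,
    "9900-0001": 0.376, "9798-9899": 0.025,
}

for seasons, r in correlations.items():
    p = fisher_p_value(r, N_TEAMS)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{seasons}  r={r:.3f}  p={p:.3f}  {verdict} at the 5% level")
```

With only about 30 data points per season pair, a correlation has to be fairly large before it clears the 5% bar on its own, which is one reason to look at the eight values together.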

"The key is to compare correlations between a team...

2009-04-27T23:14:00.000-07:00

"The key is to compare
correlations between a team's first 20 or 40 to their second 20 or 40
the stronger the correlation the more the stat is repeatable and thus the more the stat is likely determined by skill rather than luck."

I fail to see why this is "the key." Why is a sample of 20 to 40 games particularly significant? It seems like an arbitrary cutoff point to me. If a stat is particularly high in variance, it's still subject to the influence of skill; convergence might just take longer.

"a team's power play success
is not repeatable to any statistically significant measure."

Um, what? So you're saying that Philly, Detroit and Washington finishing 2nd, 3rd, and 4th last season and then 1st, 2nd, and 3rd this season in PP scoring is simply a total fluke?

What am I missing?
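To illustrate the convergence point, here is a small simulation sketch. Every team is given a fixed, genuinely different power play shooting percentage, and the only question is how strongly one block of games correlates with the next as the blocks get longer. The skill range (9% to 16%) and the 5 power play shots per game are assumptions picked for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_correlation(games_per_block, shots_per_game=5, n_teams=30,
                      n_sims=100, seed=0):
    """Mean block-to-block correlation when skill is real and fixed."""
    rng = random.Random(seed)
    shots = games_per_block * shots_per_game
    total = 0.0
    for _ in range(n_sims):
        # Assumed spread of true team shooting talent.
        skills = [rng.uniform(0.09, 0.16) for _ in range(n_teams)]
        first = [sum(rng.random() < p for _ in range(shots)) / shots
                 for p in skills]
        second = [sum(rng.random() < p for _ in range(shots)) / shots
                  for p in skills]
        total += pearson(first, second)
    return total / n_sims


for games in (20, 40, 160):
    print(games, "games per block:", round(split_correlation(games), 2))
```

Skill is identical in every run here; only the amount of noise changes with block length, so a weak correlation over 20-game blocks does not by itself show that skill is absent.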

Actually..my research shows the opposite to be tru...

2009-04-27T13:02:00.000-07:00

Actually, my research shows the opposite to be true. That is to say that a team's power play success
is not repeatable to any statistically significant measure, whereas even strength goal scoring is. Have you looked at shooting % over 20 or 40 game segments? The key is to compare
correlations between a team's first 20 or 40 to their second 20 or 40:
the stronger the correlation the more the stat is repeatable and thus the more the stat is likely determined by skill rather than luck. As I stated, my research shows 5 on 5 is repeatable whereas special teams are not.
Check advanced nfl stats.com and B. Burke's great site for more of the math behind this approach.

dan

Simply fascinating. To put it bluntly: I'm a fan!

2009-04-27T08:36:00.000-07:00

Simply fascinating. To put it bluntly: I'm a fan!