To what extent is team-to-team variation in even strength shooting percentage the product of random variation? I'm not sure what the answer is, but I suspect that the contribution is substantial. I've included several graphs below in order to illustrate this. The table below the first graph contains the data upon which each distribution is based.
In the first graph, the yellow line is the actual spread in EV (note: 5-on-5 only) shooting percentage among NHL teams at this point in the 2008-09 NHL season.
The X-axis shows the percentage 'categories'; the figure listed for each category is its midpoint value.
The Y-axis is the relative frequency of each individual percentage 'category'.
As an example, 6 teams in the NHL this year currently have an EV shooting percentage between 0.08 and 0.085. As there are 30 teams in the league, the relative frequency is 0.2 (as 6/30 = 0.2). The midpoint value for this category is 0.0825. Therefore, the relative frequency of the '0.0825' category is 0.2.
The pink line shows the predicted spread in EV shooting percentage if every team had the exact same underlying shooting percentage of ~0.085 (i.e. the league average 5-on-5 shooting percentage). This was determined as follows.
100 "seasons" were simulated.
For each "season", each team has an artificial shooting percentage.
This percentage is the number of goals that a team scores divided by its number of trials.
The number of trials is equivalent to the number of EV shots that the team has taken through this point in the season.
The probability of "scoring" in each individual trial is the same for every team at 0.085.
Therefore, any team-to-team variation will be the product of randomness.
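For anyone who wants to try this at home, the steps above can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions, not the exact procedure used for the graphs: the use of NumPy's binomial sampler, the function name `simulate_seasons`, and the placeholder teams `TEAM_B` and `TEAM_C` with their shot totals are all made up for illustration. Only Philadelphia's 984 shots and the 0.085 probability come from this post.

```python
import numpy as np

LEAGUE_EV_SH_PCT = 0.085  # league-average 5-on-5 shooting percentage (from the post)
N_SEASONS = 100           # number of simulated "seasons"

# EV shot totals per team. Only Philadelphia's 984 comes from the post;
# TEAM_B and TEAM_C are hypothetical placeholders.
ev_shots = {"PHI": 984, "TEAM_B": 1100, "TEAM_C": 950}


def simulate_seasons(ev_shots, p=LEAGUE_EV_SH_PCT, n_seasons=N_SEASONS, seed=0):
    """Return {team: array with one artificial shooting percentage per season}."""
    rng = np.random.default_rng(seed)
    simulated = {}
    for team, shots in ev_shots.items():
        # Each shot is one trial that "scores" with probability p,
        # so goals in a season follow a binomial distribution.
        goals = rng.binomial(n=shots, p=p, size=n_seasons)
        simulated[team] = goals / shots
    return simulated


simulated = simulate_seasons(ev_shots)
```

Since every team shares the same `p`, whatever spread shows up in `simulated` comes from chance and from differences in shot totals.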
A specific example will hopefully make this clear.
Philadelphia has taken 984 shots at EV at this point in the 2008-09 season. Therefore, Philadelphia has 984 trials. The probability of scoring in each individual trial for Philadelphia is the league average EV shooting percentage of ~0.085. In Philadelphia's first "season", they scored 107 times. As 107 / 984 = ~0.109, Philadelphia's EV shooting percentage for their first "season" is 0.109. I then did this for every team and repeated the process 100 times (i.e. simulated 100 seasons). Here's how the first 48 or so shaped up:
Even though the probability of a goal on any given "shot" is 0.085, the artificial shooting percentage will almost always differ from 0.085 because the number of trials is finite. As the sample size (number of trials) increases, any given team's artificial shooting percentage will more closely approximate 0.085. Teams that have taken more shots through this point in the 2008-09 season have more "trials", so the spread in their artificial shooting percentage will be lower. For example, the standard deviation for Detroit's 100 seasons is ~0.007. By comparison, the same value for Pittsburgh is ~0.009.
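The link between shot totals and spread can be cross-checked against the textbook formula for the standard deviation of a binomial proportion, sqrt(p(1 - p) / n). Here's a small sketch; the function name `shooting_pct_sd` is my own, and the only real inputs are the 0.085 probability and Philadelphia's 984 shots.

```python
import math


def shooting_pct_sd(shots, p=0.085):
    """Theoretical standard deviation of shooting percentage over `shots` trials."""
    return math.sqrt(p * (1 - p) / shots)


# Philadelphia's 984 EV shots (from the post).
phi_sd = shooting_pct_sd(984)
```

For Philadelphia this works out to a little under 0.009. Because the shot total sits under a square root, a team needs four times as many shots to cut its random spread in half.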
The same rules regarding the x and y axes that apply to the yellow (actual) distribution also apply to the pink (random) distribution. The relative frequency for the pink distribution is the proportional representation of each artificial shooting percentage category. As an example, as there were 100 "seasons" and 30 teams, the entire sample consisted of 3000 artificial shooting percentages. 601 artificial percentages fell between 0.08 and 0.085. The relative frequency for the '0.0825' category is therefore ~0.2, as 601/3000 = ~0.2.
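The binning behind both lines can be sketched as follows. This assumes 0.005-wide categories as in the examples above; the overall range of 0.05 to 0.12, the function name `relative_frequencies`, and the sample values are my own choices for illustration, not the real team data.

```python
import numpy as np


def relative_frequencies(sh_pcts, lo=0.05, hi=0.12, width=0.005):
    """Return (category midpoints, relative frequency of each category)."""
    n_bins = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_bins + 1)
    counts, _ = np.histogram(sh_pcts, bins=edges)
    midpoints = (edges[:-1] + edges[1:]) / 2
    # Divide by the full sample size, so values outside [lo, hi]
    # still count toward the denominator.
    return midpoints, counts / len(sh_pcts)


# Made-up sample: 6 of 30 "teams" land between 0.08 and 0.085.
sample = [0.082] * 6 + [0.091] * 24
mids, freqs = relative_frequencies(sample)
```

The same function works for the 30 actual percentages and for the 3000 artificial ones, which is what makes the two lines directly comparable.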
As many will note, the spread between the worst (NYI at 0.069) and best (BOS at 0.108) teams appears to be sizable, as is indicated by the breadth of the yellow distribution.
However, the pink distribution is itself fairly broad. In fact, it very closely resembles the yellow distribution. As would be anticipated, the yellow distribution is slightly broader than its counterpart, but the difference is not large. This suggests that much of the inter-team variation in EV shooting percentage is the result of randomness.
The second graph, shown above, contains a 'smoothed' version of the actual distribution, which is represented by the dark line. The average shooting percentage in the league is currently ~0.085, as has been mentioned. The standard deviation is currently ~0.01. The dark graph is simply a normal distribution (bell curve) with a mean of 0.085 and standard deviation of 0.01.
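To put the bell curve on the same footing as the relative frequencies, one option is to compute the share of a normal distribution that falls inside each 0.005-wide category. The sketch below does that with the standard library only. The helper names are my own, and I'm assuming the per-category share is the quantity of interest; the curve in the graph may have been drawn differently.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def normal_bin_frequency(lo, hi, mean=0.085, sd=0.01):
    """Share of a normal(mean, sd) distribution falling between lo and hi."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


# The '0.0825' category, i.e. 0.08 to 0.085.
freq_0825 = normal_bin_frequency(0.080, 0.085)
```

For the '0.0825' category this gives roughly 0.19, which is close to the 0.2 seen in the actual data.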
The light line is merely the pink distribution reproduced. Again, the two distributions are very similar to one another.
The fact that the actual distribution is somewhat broader than the expected distribution shows that teams do indeed differ in their underlying shooting percentage at EV. Nonetheless, this variation is only very slightly larger than what would be predicted by chance alone. The underlying differences appear to be minimal.
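One rough way to put a number on the underlying differences is to assume that the observed variance is the sum of the variance in true ability and the variance due to chance, then solve for the former. This is a back-of-the-envelope sketch, not something I did for the graphs: the function name `underlying_sd` is made up, and the result is very sensitive to rounding in the inputs, so treat it as illustrative only.

```python
import math


def underlying_sd(observed_sd, random_sd):
    """Estimate the spread in true shooting ability.

    Assumes observed variance = true variance + random variance.
    Returns 0.0 when chance alone already accounts for the observed spread.
    """
    true_var = observed_sd ** 2 - random_sd ** 2
    return math.sqrt(true_var) if true_var > 0 else 0.0


# Illustrative inputs: the league-wide spread of ~0.01 and a random
# spread of ~0.009 (the figure quoted for Pittsburgh's simulated seasons).
estimate = underlying_sd(0.01, 0.009)
```

Because both inputs are rounded to one significant figure, a small change in either one moves the estimate a lot, so this is best read as a sanity check on the graphs rather than a measurement.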
Vic Ferrari has done a lot of excellent, excellent work over at his site that is similar to this. Much of his work has examined the ability of individual players to influence shooting and save percentage while on the ice. His findings are comparable in that the vast majority of inter-individual variation seems to be due to random variation.
EDIT: I've included some supplementary data tables for the purposes of clarity.
I should mention that the data I used for this post was obtained at behindthenet -- an awesome site that I highly recommend. Without it, this post wouldn't have been possible.