Saturday, December 20, 2008

The First Goal

In hockey, scoring the first goal is important. Last season, every single team in the league had a better record in games where they scored first compared to games where they did not.

However, one has to wonder: is scoring the first goal as important as it's made out to be? Hockey media types love to harp on the importance of scoring first, invariably citing team A's record when managing to do so, or how team B's losing streak is explicable through its tendency to surrender the lead early in the game. Not only does this emphasis conflate cause and effect, but it's insufferably repetitive and trite. One would intuitively expect the team that scores first to have a higher probability of winning, and it's fairly obvious that such a relationship exists. In fact, I suspect that the probability of winning when scoring first is not significantly different from what would otherwise be expected on a mathematical basis.

Moreover, scoring in general is important, whether it be the first goal of the game or the last one. Any given goal is more or less significant and potentially determinative of the game's outcome. To single out the first goal from any other goal scored during the game just smacks of arbitrariness. Scoring first is probably more correlated with winning than, say, scoring second, but I'd be surprised if the difference were large, and shocked if it were large enough to warrant the special attention.

Thus, this post seeks to answer two questions:

1. When a team scores first, what percentage of the time does it win the game? Is this value any different from what probability theory predicts it to be?

2. How much more important is the first goal than the second goal?

The first question can be answered through application of the Poisson distribution. As Alan Ryder explains in this paper, goal scoring in hockey is essentially a Poisson process. Ryder has determined that a team's probability of winning by z goals in regulation at any particular point during the game can be found through application of the following formula in Microsoft Excel.

Pr(Win by z) = EXP(-(mt+vt)) * (mt/vt)^(z/2) * BESSELI(2*SQRT(mt*vt),ABS(z))

where:

m = that team's average goals for per game*
v = that team's average goals against per game*
t = the time remaining in regulation divided by 60
z = the margin of victory

* - Adjusted goals ought to be used here as they provide the best measure of a team's true ability to score and prevent goals.
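For readers without Excel, the formula can be sketched in Python. This is only an illustrative translation, not Ryder's own code: the function names `bessel_i` and `prob_margin` are made up for this example, and `bessel_i` stands in for Excel's BESSELI by summing the standard power series for the modified Bessel function of the first kind, which is adequate for the small arguments that arise with hockey scoring rates.

```python
import math


def bessel_i(order: int, x: float, terms: int = 40) -> float:
    """Modified Bessel function of the first kind, I_order(x).

    Computed from its power series; 40 terms is plenty for the small
    values of x (around 4) that hockey scoring rates produce.
    """
    half = x / 2.0
    return sum(
        half ** (2 * k + order) / (math.factorial(k) * math.factorial(k + order))
        for k in range(terms)
    )


def prob_margin(m: float, v: float, t: float, z: int) -> float:
    """Probability of outscoring the opponent by exactly z goals.

    m, v: goals for and against per game; t: fraction of regulation left;
    z: goal margin over that remaining time (negative means outscored).
    """
    mt, vt = m * t, v * t
    return (
        math.exp(-(mt + vt))
        * (mt / vt) ** (z / 2)
        * bessel_i(abs(z), 2 * math.sqrt(mt * vt))
    )
```

As a sanity check, the probabilities over all plausible margins (say z from -20 to 20) should add up to essentially 1 for any reasonable m, v and t.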

Through use of this formula, the theoretical probability of winning in regulation for a team that scores first can be determined.

In the 2007-08 season, there were 1222 games that had at least one goal scored in regulation. On average, the first goal was scored just prior to the 12 minute mark of the 1st period. Thus, our value for t is 0.803 (that is, roughly 48.2 of the 60 minutes of regulation remained).

The m and v values are, for the purposes of the formula, 2.639 and 2.545 respectively. These figures are adjusted to reflect the following:

1. That the team that scores first is, on average, slightly better than the average team.
2. That the team that gives up the first goal is, on average, slightly worse than the average team.
3. That the home team is more likely to score the first goal.

For determining the probability of winning, the z value ranges from 0 to a sufficiently high n (~10), as the team that scores first need only maintain the existing one-goal margin -- or increase it -- in order to win. For the probability of losing, z ranges from -2 down to a sufficiently low n (~-10), as the trailing team must outscore the opposition by 2 or more goals over the remainder of the game in order to win. For the probability of a tie, z is -1, as this will restore the original margin of zero.
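The bookkeeping over z can be cross-checked without the Bessel function at all. The sketch below assumes only what the formula itself assumes, namely that each team's remaining goals are independent Poisson counts with means m*t and v*t, and simply adds up the joint probabilities for every remaining scoreline. The names `poisson_pmf` and `outcome_probs`, and the 25-goal cutoff, are choices made for this illustration.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def outcome_probs(m, v, t, lead=1, max_goals=25):
    """Regulation (win, tie, loss) probabilities for the team already ahead by `lead`.

    Sums over every remaining scoreline up to max_goals per side; with
    about two expected goals each, the truncated tail is negligible.
    """
    win = tie = loss = 0.0
    for goals_for in range(max_goals + 1):
        p_for = poisson_pmf(goals_for, m * t)
        for goals_against in range(max_goals + 1):
            p = p_for * poisson_pmf(goals_against, v * t)
            final_margin = lead + goals_for - goals_against
            if final_margin > 0:      # remaining margin z >= 0
                win += p
            elif final_margin == 0:   # remaining margin z == -1
                tie += p
            else:                     # remaining margin z <= -2
                loss += p
    return win, tie, loss


win, tie, loss = outcome_probs(2.639, 2.545, 0.803)
print(f"win {win:.3f}  tie {tie:.3f}  loss {loss:.3f}")
print(f"win, regulation decisions only: {win / (win + loss):.3f}")
```

With the inputs given above, the win figure should land close to the theoretical 0.615 quoted in this post, and the regulation-only figure close to 0.744.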

Thus, all of the necessary input variables having been determined, the theoretical probability of winning when scoring first can be computed. Below is a comparison of the theoretical probability against the actual probability.



As can be seen, the actual and theoretical probabilities more or less mirror one another, save for the fact that probability theory predicts ties to occur less frequently than they actually do. This reflects the fact that teams do, to some degree, play to the score, particularly as the end of regulation nears. The upshot of there being more ties in reality is that the first goal is somewhat more valuable than the values expressed in the chart would otherwise indicate. For example, while the actual probability of winning is nominally lower than its theoretical counterpart (0.599 vs 0.615), if one examines only those games resolved in regulation, the actual probability of winning is slightly higher than the theoretical probability (0.764 vs 0.744). Nonetheless, the important part is that the theoretical and actual values are essentially equivalent to one another. If the actual probability of winning were substantially higher than the theoretical probability, the large amount of emphasis placed on scoring first might be justifiable. However, the fact that they are virtually the same means that the advantage conferred by scoring first is neither surprising nor contrary to expectation, thus making it unworthy of mention.
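The regulation-only figures are just the win probability divided by the combined probability of a win or a loss. As a rough illustration, the loss and tie shares can be backed out of the two pairs of numbers quoted above. The helper names below are invented for this sketch, and because the inputs are rounded to three decimals, the implied values are approximate.

```python
def regulation_only_win(p_win, p_loss):
    """Win probability among games decided in regulation (ties excluded)."""
    return p_win / (p_win + p_loss)


def implied_loss_and_tie(p_win, p_win_regulation_only):
    """Back out the loss and tie shares implied by the two quoted win figures."""
    p_loss = p_win * (1.0 / p_win_regulation_only - 1.0)
    p_tie = 1.0 - p_win - p_loss
    return p_loss, p_tie


# (overall win probability, win probability in regulation decisions) from the text
quoted = {"actual": (0.599, 0.764), "theoretical": (0.615, 0.744)}
implied = {label: implied_loss_and_tie(*pair) for label, pair in quoted.items()}

for label, (p_loss, p_tie) in implied.items():
    print(f"{label}: loss ~ {p_loss:.3f}, tie ~ {p_tie:.3f}")
```

The implied tie share comes out noticeably higher for the actual results than for the theoretical ones, which is the gap described above.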

What about the importance of the second goal vis-a-vis the first goal?



Scoring second is very nearly as highly correlated with winning as scoring first. And yet, it is the latter that -- rather unfairly -- receives all of the attention. In this sense, the emphasis that's placed on scoring first seems more than a little arbitrary. It's simply not very accurate to accord the first goal special status when, in actual fact, the vast majority of goals scored throughout the course of a hockey game are significant.

3 comments:

Anonymous said...

I am very curious about the hard data on whether the first team to score in an NHL game is more, or less likely to lose that game. Does anyone have this data?

Hostpph said...

I didn't know that you have more probability to win if you manage to score the first goal.

Anonymous said...

Why wouldn't you think that? The team that scores the second goal is more likely to win. The team that scores the third goal is more likely to win, etc., etc.