Advanced Football Analytics (formerly Advanced NFL Stats): August 2007

NFL Payroll and Wins

By Brian Burke

As anyone who isn't a Yankees fan can tell you, team salary has a strong effect on winning in baseball. The NFL is different, however. Its salary cap is a fairly effective means of equalizing team payroll. Yet teams can circumvent the rules to exceed the cap in various ways, at least until it catches up with them. There is also a salary minimum that teams are required to spend, because some teams choose not to spend up to the cap limit. For these reasons there can be wide variations among teams in terms of player salary.

In 2006, the average team salary was about $100 million. The standard deviation was about $13 million. Indianapolis was the team with the highest payroll, exceeding $131 million, while Oakland was the team with the lowest, paying their players only about $72 million.

For MLB, about $2-5 million buys your team a marginal win. What about the NFL? Is there a connection between wins and payroll?

To find out, I compared team payrolls and team wins for the 2001 through 2006 seasons. Because actual team payrolls depend on complex contract structures, I used cap charge as the definition of team salary. Data was obtained from the USA Today NFL salary database.

Each year the salary cap is adjusted as a function of league revenues, so I normalized each salary by year. That way, we can compare a 2006 salary to a 2003 salary even though the cap has grown greatly over the past few years.
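As a rough illustration of the per-year normalization, here is a minimal Python sketch. It assumes "normalized" means a z-score within each season (consistent with the Z-prefixed variables in the regressions below) and uses the population standard deviation, which is a guess. The team names and payroll figures are made up purely for illustration.

```python
from statistics import mean, pstdev

def normalize_by_year(payrolls):
    """Convert {year: {team: cap charge in $M}} to within-year z-scores."""
    normalized = {}
    for year, teams in payrolls.items():
        mu = mean(teams.values())
        sigma = pstdev(teams.values())  # assumption: population SD
        normalized[year] = {team: (pay - mu) / sigma
                            for team, pay in teams.items()}
    return normalized

# Hypothetical payrolls ($M); not real data.
payrolls = {
    2003: {"A": 70.0, "B": 75.0, "C": 80.0},
    2006: {"A": 90.0, "B": 100.0, "C": 110.0},
}
normalized = normalize_by_year(payrolls)

# Team C is the same distance above average in both seasons,
# even though the raw dollar figures grew with the cap.
print(normalized[2003]["C"], normalized[2006]["C"])
```

Once each season is on the same scale, a 2003 payroll and a 2006 payroll can sit in the same regression.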

I ran two regressions to test the importance of team salary. The first regression used the year-to-year change in team salary to estimate the year-to-year change in team wins. I wanted to see if teams that beef up by signing free-agents to large contracts were able to convert those dollars into wins.

VARIABLE	COEFFICIENT	STDERROR	T STAT	P-VALUE
Z DeltaSalary	0.24	0.28	0.88	0.38
r-squared	0.4%

Delta Salary (change in team salary) is not a statistically significant predictor of change in a team's win-loss record.

The second regression was a more straightforward comparison. It simply compared total team salary to total wins.

VARIABLE	COEFFICIENT	STDERROR	T STAT	P-VALUE
Z_Team_Salary	0.33	0.16	2.10	0.04
r-squared	2.3%

Team salary is significant in estimating team wins. The r-squared is quite low, however.

What does the low r-squared mean when the variable is significant? R-squared indicates the percentage of variance in the dependent variable (team wins) accounted for by the model (team salary). But keep in mind, there is a large amount of luck in team records. The r-squared of non-luck factors is probably close to 50%, so team salary could be far more important than indicated by the r-squared.

The thing that really matters is the coefficient of the significant variable--team salary. It estimates that for every standard deviation above average a team spends, it can expect 0.33 extra wins. That means that $13 million could buy a third of a win in 2006.

The Colts, who led the league in salary, were 2.4 standard deviations above average in payroll for 2006. This would buy them 2.4 * 0.33 = 0.79 wins. The Raiders were 2.1 standard deviations below average in payroll, which equates to 0.69 losses. Keep in mind how precious a single win is in the NFL, where the difference between 7-9 and 9-7 is enormous.
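The arithmetic above can be sketched in a few lines of Python. The 0.33 coefficient, the $100 million mean, and the $13 million standard deviation come from the post. The function names are just illustrative, and the rounded dollar figures give slightly different z-scores than the 2.4 and 2.1 quoted above.

```python
COEF_WINS_PER_SD = 0.33   # regression coefficient from the post
LEAGUE_MEAN = 100.0       # 2006 average payroll, $M (rounded in the post)
LEAGUE_SD = 13.0          # 2006 standard deviation, $M (rounded in the post)

def wins_from_z(z):
    """Expected extra wins for a payroll z standard deviations above average."""
    return COEF_WINS_PER_SD * z

def payroll_z(payroll_millions):
    """z-score of a 2006 payroll, using the rounded league figures."""
    return (payroll_millions - LEAGUE_MEAN) / LEAGUE_SD

colts = wins_from_z(2.4)      # about 0.79 extra wins
raiders = wins_from_z(-2.1)   # about -0.69, i.e. 0.69 extra losses

print(round(colts, 2), round(raiders, 2))
print(round(payroll_z(131), 2), round(payroll_z(72), 2))
```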

Football appears no different than any other professional sport. Salary can buy wins, but teams cannot sustain payroll above the cap for more than a couple of years before they need to "pay back" what they borrowed through deferred bonuses and other contract devices.

Next, I'll look at how well a team spreads the wealth among players--median salary. Does a team that has few stars but lots of depth win more often than a team with many stars and little depth?

published on 8/30/2007 with 7 comments

Field Goal Kickers

By Brian Burke

I recently took a look at special teams (ST) and its importance in winning. One of the most important, if not the most important, players on ST is the field goal kicker. No one else has such a direct and solitary impact on points scored. In this post, I'll look a little closer at the impact a kicker can make on the win-loss record of his team.

The relative importance of each dimension of the game, including ST, is estimated in a regression on regular season wins. Each variable is in terms of team efficiency (yds per attempt) and is standardized. The ST variables are relative to the league average for similar situations. For example, the FG/XP scores are relative to kick distance and the average success rate for each distance (1).

The results are summarized in the table below.

Variable	Std.Coeff.	P-Value
O PASS	1.26	0.00
D PASS	-0.81	0.00
O RUN	0.47	0.00
D RUN	-0.47	0.00
O INT	-0.63	0.00
D INT	0.60	0.00
O FUM	-0.33	0.07
D FFUM	0.36	0.02
PEN	-0.41	0.01
FG/XP	0.34	0.00
KICK	0.44	0.00
K RET	0.07	0.64
PUNT	0.23	0.12
P RET	0.27	0.10
r-squared	0.79

The results above can be interpreted as follows. Each coefficient indicates how many additional regular season wins a team can expect, on average, per standard deviation above average. For example, if a team is completely average in every facet of the game, it can expect to win 8 games. But if a team is average in every facet except offensive pass efficiency, in which it is 1 standard deviation above average, it can expect to win 8 + 1.26 = 9.26 wins.

The best kicker in the league (#1 out of 32) would typically be in the top 96th percentile, which is very close to two standard deviations above average. Therefore, the best FG kicker in the league would normally be worth 2 * 0.34 = 0.68 added wins in a season.
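To make the interpretation concrete, here is a small Python sketch that turns the standardized coefficients in the table into an expected win total. The 8-win baseline and the coefficients are from the post. The helper name and the idea of passing in a dictionary of z-scores are illustrative assumptions.

```python
BASELINE_WINS = 8.0  # a team that is average in every facet

# Standardized coefficients from the table above.
COEFS = {
    "O PASS": 1.26, "D PASS": -0.81, "O RUN": 0.47, "D RUN": -0.47,
    "O INT": -0.63, "D INT": 0.60, "O FUM": -0.33, "D FFUM": 0.36,
    "PEN": -0.41, "FG/XP": 0.34, "KICK": 0.44, "K RET": 0.07,
    "PUNT": 0.23, "P RET": 0.27,
}

def expected_wins(z_scores):
    """Expected wins given z-scores per variable; omitted variables are average (0)."""
    return BASELINE_WINS + sum(COEFS[name] * z for name, z in z_scores.items())

good_passing = expected_wins({"O PASS": 1.0})   # about 9.26 wins
best_kicker = expected_wins({"FG/XP": 2.0})     # about 8.68 wins

print(round(good_passing, 2), round(best_kicker, 2))
```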

It's hard to imagine many other positions, other than the starting QB and RB, that have such a large individual impact on a team's record.

published on 8/28/2007 with 5 comments

Importance of Special Teams

By Brian Burke

In earlier posts I claimed that special teams could be neglected in a prediction model of wins because big special teams plays tended to be chaotic--largely random and non-repeatable. I think I may be wrong.

I can definitely say now that special teams (ST) retrospectively explain a good deal of variance in team records. But I'm not so sure it can help predict future ST performance, and therefore future wins.

The biggest challenge in analyzing ST stats is that they are difficult to measure. For example, consider a team that is frequently punting from midfield or the opponent's 40 yard line. They would likely have poor net punt yardage compared to teams that punt from their own territory more often. All of football is dependent on the situation, but special teams are especially so.

The website Football Outsiders has a potential solution. They grade each play according to the situation and compare each team's performance against the league average in the same situation. They call this measure Value Over Average (VOA). They go a step further and factor in a correction for opponent strength which results in Defense-adjusted VOA (DVOA). To be honest, I'm not sold on D/VOA as the best measure of team performance, but it does take situation into consideration on a play-by-play basis, which is especially handy for special teams.

I was able to gather the D/VOA stats for special teams from the 2003-2006 regular seasons (n=128). I then added it into the team efficiency model which estimates team wins based on each team's efficiency stats of running, passing, turnovers, and penalties. I normalized each variable in the model, so that their regression coefficients could be directly compared to one another.

The baseline efficiency model, without ST data, results in an r-squared of 0.73. Including either ST DVOA or ST VOA in the model increases the r-squared to 0.77. The regression results are shown in the table below.

Variable	Std.Coeff.	P-Value
O PASS	1.27	0.00
D PASS	-0.80	0.00
O RUN	0.46	0.00
D RUN	-0.44	0.00
O INT	-0.53	0.00
D INT	0.57	0.00
O FUM	-0.40	0.02
D FFUM	0.44	0.01
PEN	-0.39	0.01
ST DVOA	0.68	0.00
r-squared	0.77

The results above can be interpreted as follows. Each coefficient indicates how many additional regular season wins a team can expect, on average, per standard deviation above average. I was surprised by how strong a variable ST DVOA turned out to be. Its standardized coefficient is 0.68, which is third only to offensive and defensive passing efficiency. If true, that would mean that the best ST squad in the league (about 2 standard deviations above the mean) is worth about 1.4 additional wins, on average.

I thought that a lot of that strength may be due to field goal kicking, which has the most direct impact on the score. So I reran the regression with each component of special teams broken out: FG/XP kicking, kick offs, kick returns, punts, and punt returns.

Variable	Std.Coeff.	P-Value
O PASS	1.26	0.00
D PASS	-0.81	0.00
O RUN	0.47	0.00
D RUN	-0.47	0.00
O INT	-0.63	0.00
D INT	0.60	0.00
O FUM	-0.33	0.07
D FFUM	0.36	0.02
PEN	-0.41	0.01
FG/XP	0.34	0.00
KICK	0.44	0.00
K RET	0.07	0.64
PUNT	0.23	0.12
P RET	0.27	0.10
r-squared	0.79

The standardized coefficients show that kick-off and kick coverage performance (KICK) is the most important (in winning) of all the components of ST. But kick return (K RET) is the only non-significant variable, which is puzzling. Why would kick-offs be so important if kick returns don't matter? This might indicate a shortcoming of the DVOA system.

Punt and punt return performance are marginally significant, but the signs of the coefficients make sense and they are relatively symmetric. So we can be confident they are relevant, but their true coefficients may not be precisely as indicated by this data set.

To simplify the results, I computed the relative strength of each of the standardized coefficients in terms of percent of the total strength of all variables.

Variable	Importance
O PASS	19%
D PASS	12%
O RUN	7%
D RUN	7%
O INT	9%
D INT	9%
O FUM	5%
D FFUM	5%
PEN	6%
FG/XP	5%
KICK	7%
K RET	1%
PUNT	3%
P RET	4%

So according to this analysis, special teams accounts for about 20% of the game in terms of winning and losing. In a way, this is disproportionately strong. ST plays comprise far fewer than 1 in every 5 plays on the field. GMs may want to take another look at how much they're paying their punters, kickers, and returners. Perhaps the league is noticing the importance of ST evidenced by the recent attention return specialists have received in the draft.
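The importance shares can be reproduced with a short Python sketch. The post does not spell out the exact method, so this assumes each share is the absolute value of a standardized coefficient divided by the sum of all absolute coefficients. Under that assumption the rounded results match the table.

```python
# Standardized coefficients from the component regression above.
COEFS = {
    "O PASS": 1.26, "D PASS": -0.81, "O RUN": 0.47, "D RUN": -0.47,
    "O INT": -0.63, "D INT": 0.60, "O FUM": -0.33, "D FFUM": 0.36,
    "PEN": -0.41, "FG/XP": 0.34, "KICK": 0.44, "K RET": 0.07,
    "PUNT": 0.23, "P RET": 0.27,
}
SPECIAL_TEAMS = ("FG/XP", "KICK", "K RET", "PUNT", "P RET")

# Assumed method: share of total absolute coefficient "strength".
total_strength = sum(abs(c) for c in COEFS.values())
importance = {name: abs(c) / total_strength for name, c in COEFS.items()}
st_share = sum(importance[name] for name in SPECIAL_TEAMS)

for name, share in importance.items():
    print(f"{name}\t{share:.0%}")
print(f"Special teams total\t{st_share:.0%}")
```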

ST is a considerable part of the game. It retrospectively helps explain why teams won. The question remains, however, if prior ST performance indicates future ST performance, and if it is predictive of future wins.

published on 8/26/2007 with 8 comments

4th Down and 5 on the 40

By Brian Burke

I'm watching the Jaguars-Packers preseason game. In the 1st quarter with the score 0-0, the Packers decided to go for it on 4th down. They had a 4th and 5 on the Jags' 40 yd line. Putting aside that this is just the preseason, is this a good idea?

After reading this study, I'd be inclined to say yes. Coaches tend to make decisions not to maximize points, but to minimize Monday morning criticism. Here is how I see a coach's decision matrix:

Make the safe but bad call = keep your job
Make a bold but correct call and succeed = keep your job
Make a bold but correct call and fail = lose your job

It's a no-brainer for most head coaches--punt. But what's safe for the coach's job may not be what's good for his team's chances to win.

[Have you ever noticed that when you are playing Madden or other football video games, you and others tend to go for it on 4th down far more often than in the real NFL? You'd probably think twice if it meant the unemployment line by January.]

Back to the Packers. This is a back-of-the-envelope analysis, so I will simplify the situation somewhat. Let's spell out the data we have and make some educated assumptions about the data we don't:

--I'll assume the average punt from the 40 ends up at about the 15 yd line. Sometimes it would end up a touchback, and other times it would be fair-caught or downed around the 10.
--The average play in the NFL is 5.0 yds. So the Packers have about a 50/50 shot at getting 5 yds and a 1st down. They'd end up at the 35.
--The average series success rate for the NFL in 2006 was 65%. NFL teams average a first down 65% of the time starting from a 1st and 10.
--Since the average play is 5 yds, the average successful series yields 15 yds.
--I am assuming no 'freak' plays (penalties, interceptions, fumbles).

Let's consider the punting option first. By punting from the 40 to the 15, the net difference is 25 yds. It takes 1.67 successful series on average to move 25 yards downfield (25/15 = 1.67). Since the series success rate is 65%, the probability of "1.67 successful series" is:

0.65 ^ 1.67 = 0.49

The number of successful series required to go 85 yds from a team's own 15 to the end zone is 85/15 = 5.67. The probability of 5.67 consecutive successful series is:

0.65 ^ 5.67 = 0.09

So by punting, the Packers would have effectively cut the Jags' ability to score in half. And if they go for it and fail, they've doubled the probability the Jags would score, a 0.18 probability.

But if the Packers get the 1st down, they're now at the 35 with a 1st and 10. They would need 2.33 more successful series to score a TD (35/15 = 2.33). The resulting probability is:

0.65 ^ 2.33 = 0.37

But remember, they need to get that 1st down first, which they have a 0.50 probability of obtaining. The resulting probability of a TD is:

0.50 * 0.37 = 0.18

The crucial probabilities the Packers should consider are:

Packers score a TD: 0.37 if GB makes the 1st down, 0.00 if they don't
Jaguars score a TD: 0.09 if GB punts, 0.18 if GB fails to get the 1st down

Ultimately, the Packers have a 50/50 shot at a 0.37 chance to score a TD, but only risk a net forfeit of a 0.09 chance to their opponent. It's a good call, considering only TDs.
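The touchdown probabilities above can be recomputed with a short Python sketch of the same back-of-the-envelope model. The 65% series success rate, 15 yards per successful series, and 50% conversion chance are the post's assumptions. The 60-yard figure assumes the Jaguars would take over at their own 40 after a failed attempt.

```python
SERIES_SUCCESS = 0.65     # chance of converting a 1st-and-10 series
YARDS_PER_SERIES = 15.0   # average yards per successful series
P_CONVERT_4TH = 0.50      # chance of gaining 5 yards on 4th down

def p_drive(yards_needed):
    """Probability of stringing together enough successful series to cover the distance."""
    return SERIES_SUCCESS ** (yards_needed / YARDS_PER_SERIES)

field_position_factor = p_drive(25)       # effect of the 25 net punt yards
p_jags_td_after_punt = p_drive(85)        # Jags start at their own 15
p_jags_td_after_fail = p_drive(60)        # assumed: Jags start at their own 40
p_gb_td_after_convert = p_drive(35)       # Packers 1st-and-10 at the 35
p_gb_td_go_for_it = P_CONVERT_4TH * p_gb_td_after_convert

print(round(field_position_factor, 2))    # 0.49
print(round(p_jags_td_after_punt, 2))     # 0.09
print(round(p_jags_td_after_fail, 2))     # 0.18
print(round(p_gb_td_after_convert, 2))    # 0.37
print(round(p_gb_td_go_for_it, 2))        # 0.18
```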

The field goal equation also favors the Packers. If they get to the 35 with a 1st down, they only need 3 plays to get 5 yds to the 30, a range well within Longwell's ability. This is likely close to a 50/50 proposition.

From their own 40, the Jaguars would need about 2 1st downs to get to the Packers' 30 and into field goal range. The chances of this are:

0.65 ^ 2 = 0.42

I realize this is a simplified analysis. I haven't considered the full effect of fumbles, interceptions, big pass interference calls, etc. But those considerations would have to be extremely strong, and unequal with respect to which team they favor, to overcome the balance of probabilities on the Packers' go for it side. I'd love to see more teams go for it on 4th down in the regular season.

published on 8/23/2007 with 3 comments

QB Wins Added (with Rushing)

By Brian Burke

In recent posts, I've calculated a QB's contribution to his team's wins based on his passing ability. In this post, I'll include his running ability as well.

The components of my passer rating are Air Yards, interception rate, sack yardage rate, and now rushing yards and fumbles. These components are weighted exactly as much as their relative importance to winning games. The weights are derived from a regression model using data from all teams since the 2002 expansion.

Although passing yards are more critical to winning than running yards, rushing yards by a QB tend to be from scrambles during pass plays. Only in the most extreme situations do coaches call for planned QB runs in running situations. For that reason, and for simplicity's sake, I'll use the weight for passing yards when calculating the effect of QB running yards.

I used fumbles instead of fumbles lost because fumble recovery is almost purely random. By putting the ball on the ground, a QB is really offering it up to anyone. Over the past five years, the rate of fumbles lost per fumble is 52%. I therefore used fumbles per possession divided by 2 for quarterbacks. Possessions are defined as pass attempts, sacks, and rushes.
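As a quick illustration of the fumble component, here is a minimal Python sketch. The definition of possessions and the divide-by-two adjustment are from the post. The function name and the sample stat line are hypothetical.

```python
def expected_fumbles_lost_rate(fumbles, pass_attempts, sacks, rushes):
    """Fumbles per possession, halved because roughly half of fumbles are lost."""
    possessions = pass_attempts + sacks + rushes
    return (fumbles / possessions) / 2

# Hypothetical QB season: 10 fumbles on 450 attempts, 30 sacks, 20 rushes.
rate = expected_fumbles_lost_rate(fumbles=10, pass_attempts=450, sacks=30, rushes=20)
print(rate)  # expected fumbles lost per possession
```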

The resulting table of QB Wins Added (per 16 games) is listed below. I also included the wins added from passing performance alone for comparison.

Rank	Player	+Wins	Pass Rank	+Pass Wins
1	Manning P	3.90	1	3.88
2	Romo	2.09	2	2.61
3	Huard	1.78	3	2.46
4	McNabb	1.76	9	1.21
5	Brees	1.75	5	2.10
6	Vick	1.72	20	-0.10
7	Palmer	1.61	4	2.14
8	Garcia	1.50	6	1.88
9	Rivers	1.41	7	1.84
10	Bulger	1.26	8	1.21
11	McNair	0.91	10	1.03
12	Hasselbeck	0.69	15	0.38
13	Leinart	0.58	11	0.89
14	Pennington	0.46	14	0.49
15	Young	0.43	24	-0.29
16	Garrard	0.36	23	-0.18
17	Brady	0.21	13	0.53
18	Grossman	0.01	16	0.33
19	Delhomme	-0.01	17	0.13
20	Kitna	-0.10	22	-0.13
21	Brunell	-0.17	19	-0.05
22	Manning E	-0.18	18	0.06
23	Roethlisberger	-0.26	25	-0.32
24	Losman	-0.36	21	-0.13
25	Leftwich	-0.47	26	-0.39
26	Warner	-0.51	12	0.80
27	Favre	-0.53	27	-0.45
28	Harrington	-0.60	28	-0.54
29	Plummer	-0.76	31	-0.64
30	Green	-0.77	29	-0.58
31	Smith	-0.91	32	-0.87
32	Johnson	-1.05	33	-0.94
33	Campbell	-1.16	36	-1.86
34	Frye	-1.31	35	-1.55
35	Carr	-1.45	34	-1.26
36	Cutler	-1.69	30	-0.62

The first thing I noticed was how high the running QBs climbed. Vick went from the 20th best QB to 6th best. Young went from 24th to 15th. Garrard climbed from 23rd to 16th. McNabb also climbed above Garcia.

I was frankly surprised by how high Vick ranks, even knowing how well he runs. We won't see him back in the league for a while, though.

When crunching the numbers, I noted how important fumbles are. Look at how Warner fell from 12th to 26th. His fumble rate was 5.1%, which means he fumbled almost 1 out of 20 times he held the ball. Right behind him was Jay Cutler. He fumbled 4.9% of the time, which caused a fall from 30th to dead last in the ranking. Warner and Cutler were in a class by themselves. The next slipperiest hands belonged to Huard with a 3.3% fumble rate, but he remains at 3rd overall.

Part of what helps Manning stand apart from the rest of the field is not his running ability, but the fact he only fumbled twice in 16 games. Remember, that's fumbles, not fumbles lost. Manning is special in a lot of ways that won't even show up in most conventional statistical analyses. As a Baltimore guy, I can't even say how much I dislike Manning and the Colts, but it's hard to deny how good he is.

published on 8/21/2007 with 1 comments

YAC Receiver Correlation

By Brian Burke

In the last post, I compared the year-to-year correlations of yards after catch (YAC) of quarterbacks and receivers. This follows a series of posts regarding an improved QB rating system.

I found that the correlation for receivers was significantly stronger than for quarterbacks. This result bolstered my theory that QBs do not contribute to the YAC of their receivers by virtue of their abilities.

Another author had found that between '05 and '06, the year-to-year correlation of QB YAC was 0.33 for all QBs, and was 0.41 for QBs who remained on the same team. I previously discussed why this data could not be used to conclude that QBs are consistent in producing YAC, and in fact, suggests the opposite conclusion.

My analysis of year-to-year receiver YAC was for all receivers, regardless of QB or team changes, in the '02-'06 seasons. I found that the correlation between year-pairs ranged from 0.41 to 0.57, and averaged 0.47.

Since I posted that, however, I realized it was an unfair comparison because I included receivers who may have only had a handful of catches in any given year. This would cause the random component of variance to dominate the calculations. So I repeated the analysis using only receivers with at least 20 catches in every year from '03 through '06. The resulting sample contained 65 receivers. The critical level of significance for n=65 is 0.25.

Year Pair	Receiver r
'03 - '04	0.71
'04 - '05	0.78
'05 - '06	0.69

Those are strong correlations by most conventional standards. And they are much stronger than the 0.33 correlation for QBs, especially considering the nonlinear nature of correlation coefficients (in terms of variance explained, r=0.8 is four times stronger than r=0.4).

But the correlations of QBs and receivers still cannot yet be directly compared. Here's why: Most QBs have several hundred completions in a season. But my receiver list contains players with as few as 20 receptions, which leaves a lot more room for random variance in the receiver variable than the QB variable. If the influences are independent, the variance of YAC can be written as:

var(YAC) = var(QB) + var(rec) + var(def) + var(random)

The smaller the number of repetitions, the larger the share of var(random) would be, and therefore the smaller the share of var(rec) would be. But when we look at r for quarterbacks, var(QB), the high number of repetitions means the var(random) will be much smaller as a share of the total variance. We'd therefore expect to see a stronger coefficient for the QBs. (This is why I said statisticians grate their teeth when a couple correlations are used to make a conclusion.)

In other words, the bar should be set higher for QB correlations because they have so many more repetitions in a season. If QBs and receivers had equal influences on YAC in each play, we'd expect to see a much higher correlation for QBs than for receivers. But we don't, we see the opposite.

Accordingly, a correlation as high as 0.70 for the receivers is stunningly strong, particularly when compared to the QB correlation of 0.33. Had the receivers tended to have the same number of receptions as QBs have completions, the total variance would not have so much random luck within it, and the correlation coefficient would be significantly higher than 0.70.
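The effect of repetitions on the correlation can be sketched with a simple model in Python. This assumes a player's average YAC in a season is a persistent skill component plus per-play noise that is independent across plays and seasons, so the expected year-to-year correlation is var(skill) / (var(skill) + var(noise) / n). The variance values below are hypothetical and chosen only to show the direction of the effect.

```python
def expected_year_to_year_r(var_skill, var_noise_per_play, n_plays):
    """Expected correlation between two seasons' averages under the simple model."""
    noise_in_average = var_noise_per_play / n_plays  # noise shrinks with repetitions
    return var_skill / (var_skill + noise_in_average)

# Hypothetical variances: identical skill spread and per-play noise for both groups.
VAR_SKILL = 1.0
VAR_NOISE = 25.0

r_receiver_20 = expected_year_to_year_r(VAR_SKILL, VAR_NOISE, n_plays=20)
r_qb_300 = expected_year_to_year_r(VAR_SKILL, VAR_NOISE, n_plays=300)

print(round(r_receiver_20, 2), round(r_qb_300, 2))
```

With equal underlying influence, the group with more repetitions shows the higher correlation, which is why a lower observed correlation for QBs is telling.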

[The sabermetricians in baseball have gone around and around on this issue for years. The year-to-year correlation methodology was used for decades to disprove the existence of "clutch" hitting. The theory was that if clutch performance was a skill possessed by batters, it should endure from year to year. The resulting correlations were very small, and the conclusion was there is no such thing as clutch. Not until recently did a researcher point out how much luck influences batting in the first place, and the low number of potential clutch plate appearances exacerbated the effect. So if clutch hitting existed, we'd see a small correlation anyway. (As it turns out clutch performance has indeed been disproved, but through other methods.)]

We can now take a step further. Previously we had evidence that YAC does not belong to the QB. Now we have evidence that the lion's share of it indeed belongs to the receiver.

published on 8/21/2007 with 2 comments

Who Gets Credit for YAC: Follow-Up

By Brian Burke

Previously, I made the case that YAC belongs to a receiver's abilities, and is not contributed to by a quarterback by virtue of his accuracy.

In a recent exchange with a well-informed fan of Brett Favre at the Football Outsiders site, Alex challenged my conclusion that YAC belongs to the receiver. Although throughout the discussion I was tweaking him about Favre's poor numbers, much of the discussion centered around McNabb vs. Garcia. Alex was suspicious because my rankings considered Garcia the better QB in '06 because Garcia's completions were deeper down-field, and McNabb's were shorter completions with lots of YAC. (Ultimately McNabb comes out on top after considering rushing yards and fumbles.)

Alex pointed out an article by Aaron Schatz of Football Outsiders, that implied that QBs control YAC rather than receivers. Mr. Schatz's research tends to be very heavy on data, but very weak on analysis. His article is based on a look at QB YAC between '05 and '06. He found a 0.33 correlation in year-to-year YAC for QBs as a whole, and a 0.41 correlation for QBs on the same team both years. The author concludes, "For the NFL, that's reasonably consistent."

Actually, for the NFL or for anything, that's reasonably meaningless. First, statisticians grate their teeth whenever a couple correlations by themselves are used to make a conclusion. Second, for a sample size of say, the top 32 QBs (n=32), the significance level for a Pearson correlation is 0.35. So the article's correlation for all QBs is probably not even significant.
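For readers who want to check significance thresholds like the one above, here is a minimal Python sketch. It assumes a two-tailed test at the 5% level and uses the Fisher z-transformation as an approximation, so results may differ slightly from an exact t-table lookup.

```python
from math import sqrt, tanh
from statistics import NormalDist

def critical_r(n, alpha=0.05):
    """Approximate smallest |r| that is significant for sample size n (two-tailed)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return tanh(z_crit / sqrt(n - 3))             # Fisher z approximation

print(round(critical_r(32), 2))   # threshold for n = 32
```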

In the article, the author goes on for several paragraphs explaining why QB after QB jumps from the bottom of the YAC rankings to the top or from the top to the middle, and so on. I think if he stood back from what he wrote, he'd see that his data should be leading him to the opposite conclusion. Here is a sample:

“Last year’s top quarterback in YAC was Jake Delhomme, and he’s fallen to the
middle of the pack this year…The rest of this year’s top five: Daunte Culpepper,
David Garrard, Mark Brunell, and Brett Favre. Brunell was third last year, but
Garrard was near the bottom of the YAC rankings last year…Garrard went from 43rd
to third, and Leftwich went from 33rd to eighth…Tom Brady was one of last year’s
leaders, but he’s middle of the pack this year …The bottom five: Garcia, Matt
Hasselbeck, Joey Harrington, Peyton Manning, Steve McNair. All of those guys
were middle of the pack in 2005 except Hasselbeck …There are a lot of other guys
who are near the bottom in YAC both years, though — they just aren’t bottom FIVE
this season.”

Also note that he attributes the two biggest surprises, Delhomme and Brady, to their receivers. If YAC really were QB-dependent, receivers should be (somewhat) interchangeable.

To be generous, let's assume his numbers are significant. The correlation of 0.41 for QBs who remained on the same team would be based on a smaller sample size. I don't have his data, but we'd need an n=22 sample size to make 0.41 significant. It's a safe assumption that there were indeed 22 or more QBs on the same team in both years, which would probably make the 0.41 number statistically significant. Even so, that undermines his case because QBs on the same team in both years are throwing to mostly the same receivers. This does not support the conclusion that YAC is consistent in QBs.

Remember that correlation coefficients are not linear. You square the correlation coefficient (r) to know the percentage of variance accounted for (r-squared). So an r of 0.41 means that only about 17% of the variance is accounted for by various combinations of both QB and receivers from '05 and '06. An r of 0.33 equates to 11% of the variance. How much could the QB alone account for? Very little if any.
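The squaring step is easy to verify in Python. The function name is illustrative; the r values are the ones quoted above.

```python
def variance_explained(r):
    """Share of variance accounted for by a correlation of r (r-squared)."""
    return r * r

same_team = variance_explained(0.41)   # 0.1681
all_qbs = variance_explained(0.33)     # 0.1089

print(f"{same_team:.4f} {all_qbs:.4f}")
```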

I repeated a similar analysis, but this time for receivers from year to year. I studied all receivers with receptions in each season from '02 to '06, for an n=148 (critical r for significance is 0.16). I calculated the correlation between consecutive years and between non-consecutive years (between '02 and '05 for example). This is regardless of who the QB or team was for each receiver, so these numbers should be compared to the 0.33 number calculated by the author. The highest correlation I calculated was 0.57, and the lowest was 0.41. The average correlation among all years was 0.47. [Note: these correlations are actually understated. Here is an updated analysis.]

So the stronger correlation is for the receiver by far. But keep in mind this isn't a competition between who gets credit, QB or receiver. The real question at hand is whether the QB should get credit. My original thesis was that measures of QB performance are better off not including YAC.

Think of it this way. We're parsing the total variance of YAC--how much belongs to the QB, and how much belongs to other things. Assuming each contribution is independent, we can conceptualize it this way:

var(YAC) = var(QB) + var(rec) + var(defense) + var(random)

In other words, the total variance in YAC is due to the QB's contribution, the receiver's contribution, what the opposing defenses did or didn't do on each play, plus variance from random error.

So far we can be fairly certain about two things. Var(QB) is very small or zero, and var(rec) is significantly greater than var(QB). The rest of the variance, and possibly the vast majority of it, is due to defense and randomness. But no matter how the rest of it is divided up, in the final analysis, it is very unlikely any significant amount belongs to the QB.

Also consider that measures of QB accuracy, namely completion percentage and interception rate, are not statistically significant in a regression of YAC. So while I may have been mistaken to say YAC belongs only to the receiver, I was correct in saying it does not belong to the QB. The more accurate conclusion is that YAC belongs somewhat to the receiver, a lot to the opposing defenses and random variation, and very little, if at all, to the QB.

[Further analysis, which shows that receivers really do own YAC, can be found here.]

published on 8/20/2007 with 1 comments

QB Passer Rating

By Brian Burke

Building on my previous efforts to devise a better passer rating, and on my analysis of Air Yards, I've created a more complete passer rating formula.

Most fans are familiar with the NFL passer rating, and all its flaws. The formula itself is almost too complex to type out, so I won't bother, but you can find it here. Its shortcomings are also almost too long to list. But here are a few I've mentioned before:

It is incomplete because it does not consider sacks.

It is redundant because it includes both completion percentage and yards per attempt. Because yards per attempt is strongly dependent on completion percentage, it is double counted in the formula.

It is arbitrary because the four components are not weighted in any meaningful way. The components of the formula are based on multipliers and constants selected to give the rating a nice scale rather than based on their importance to scoring or winning. They also use arbitrary maximums.

Lastly, it includes touchdown passes. They should not be included in a passer rating because they are the result of many other factors in addition to QB passing proficiency. Factors such as a great defense that produces turnovers in an opponent's territory, a solid running offense that sustains drives, or spectacular receivers who generate large amounts of YAC are strong contributors to passing TDs. Further, TD passes are the culmination of all the other attributes of the passer including accuracy, avoiding interceptions, and avoiding sacks.

This improved QB rating system is primarily based on Air Yards per Attempt. Air Yards is what I call the yardage gained by the pass through the air. It's essentially total passing yards minus YAC. I've learned that YAC is largely independent of passer ability and it should be credited to the receiver more than to the QB.

Sack yards per pass attempt are also included. Although partially dependent on pass protection, sacks are also dependent on a QB's pocket presence, ability to read open receivers, decision-making, and mobility. Pass attempts are defined as throws and sacks.

Interceptions are obviously crucial to a passer's performance as well. I include interception rate (INTs/Att) in the formula.

The components of the new passer rating are weighted according to how important they are in terms of team wins. The formula is based on a multivariate regression model of team wins. Using data from the past five NFL regular seasons, the regression model estimates team wins based on the efficiency stats of each team including passing, running, turnovers, and penalties. Regression models can hold all other factors equal, so by only adjusting the factors of interest (passing performance) we can calculate the effect on the estimate of season wins. Arbitrary weighting is not necessary.

1. Is not arbitrary. Each component is weighted in proportion to its relative importance to winning games. These weights are derived from a regression model using data from all teams since the 2002 expansion.

2. The result is stated in units of team wins over a 16-game regular season. The regression model allows the passer rating model's component weights to translate directly into how many additional wins a QB's performance would yield, on average, over 16 full games.

3. Is not redundant. The components do not double count passing stats.

4. Includes only the passing stats primarily controlled by the QB. Factors such as passing yards after catch are not included.

After some quick algebra to simplify the equation, the resulting formula for the new passer rating is:

QB Wins Added = [(Air Yds - Sack Yds) * 1.56 - INTs * 50.5 ] / Pass Attempts - 3

Basically, every additional yard per attempt of net passing efficiency (air yards minus sack yards) yields an additional 1.56 wins on average. That is, assuming an average running game and an average defense, a team whose passing efficiency is 1 yard/att above average will win 9.56 games (the 8 wins of an average team plus 1.56).

The average interception rate is about 0.03 interceptions per attempt. So if a QB throws 0.04 INTs/Att, he'll cost his team 0.01 * 50.5 = 0.5 wins, all other factors being equal.

I subtract 3 because 3.01 was the average score of a QB in 2006. Better-than-average QBs have positive wins added, and worse-than-average QBs have negative wins added.
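To make the arithmetic concrete, here is a minimal Python sketch of the formula above. The function name `qb_wins_added` and the sample season totals are illustrative assumptions, not figures from the data set; only the 1.56, 50.5, and 3 constants come from the formula.

```python
def qb_wins_added(air_yards, sack_yards, interceptions, throws, sacks):
    """Wins added over a 16-game season, following the formula in the post.

    Pass attempts are throws plus sacks, as defined earlier in the post.
    """
    attempts = throws + sacks
    if attempts <= 0:
        raise ValueError("need at least one pass attempt")
    weighted = (air_yards - sack_yards) * 1.56 - interceptions * 50.5
    return weighted / attempts - 3.0


# Hypothetical season: 480 throws, 20 sacks, 2000 air yards,
# 150 sack yards, 15 interceptions (made-up numbers for illustration).
base = qb_wins_added(2000, 150, 15, throws=480, sacks=20)

# One extra net yard per attempt (500 more air yards over 500 attempts).
more_yards = qb_wins_added(2500, 150, 15, throws=480, sacks=20)

# One extra interception per 100 attempts (5 more INTs over 500 attempts).
more_ints = qb_wins_added(2000, 150, 20, throws=480, sacks=20)

print(f"baseline wins added:        {base:.2f}")
print(f"effect of +1 yard/att:      {more_yards - base:+.2f}")
print(f"effect of +0.01 INTs/att:   {more_ints - base:+.3f}")
```

The two differences recover the weights described above: about 1.56 wins per yard of efficiency and about half a win lost per extra 0.01 interceptions per attempt.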

The table below ranks the QBs of 2006 in Wins Added:

Player	Cmp Pct	Yds	AY	YAC	YPA	AY/Cmp	YAC/Cmp	AY/Att	TAY/Att	Int%	SkYd Rate	YAC%	+Wins
Manning P	65	4397	2889	1508	7.9	8.0	4.2	5.2	4.9	1.6	0.15	34	3.88
Romo	65	2903	1850	1053	8.6	8.4	4.8	5.5	4.8	3.9	0.35	36	2.61
Huard	61	1878	1045	833	7.7	7.1	5.6	4.3	3.6	0.4	0.41	44	2.46
Palmer	62	4035	2503	1532	7.8	7.7	4.7	4.8	4.1	2.5	0.42	38	2.14
Brees	64	4418	2330	2088	8.0	6.5	5.9	4.2	3.9	2.0	0.18	47	2.10
Garcia	62	1309	709	600	7.0	6.1	5.2	3.8	3.4	1.1	0.21	46	1.88
Rivers	62	3388	1953	1435	7.4	6.9	5.1	4.2	3.7	2.0	0.30	42	1.84
Bulger	63	4301	2355	1946	7.3	6.4	5.3	4.0	3.1	1.4	0.57	45	1.21
McNabb	57	2647	1250	1397	8.4	6.9	7.8	4.0	3.3	1.9	0.42	53	1.21
McNair	63	3050	1722	1328	6.5	5.8	4.5	3.7	3.4	2.6	0.17	44	1.03
Leinart	57	2547	1553	994	6.8	7.3	4.6	4.1	3.5	3.2	0.40	39	0.89
Warner	64	1377	715	662	8.2	6.6	6.1	4.3	3.4	2.9	0.57	48	0.80
Brady	62	3529	1801	1728	6.8	5.6	5.4	3.5	3.0	2.3	0.32	49	0.53
Pennington	65	3352	1867	1485	6.9	6.0	4.7	3.8	3.3	3.3	0.33	44	0.49
Hasselbeck	57	2442	1629	813	6.6	7.8	3.9	4.4	3.5	4.0	0.57	33	0.38
Grossman	55	3193	1877	1316	6.7	7.2	5.0	3.9	3.5	4.2	0.28	41	0.33
Delhomme	61	2805	1445	1360	6.5	5.5	5.2	3.4	2.8	2.6	0.37	48	0.13
Manning E	58	3244	1862	1382	6.2	6.2	4.6	3.6	3.1	3.4	0.34	43	0.06
Brunell	62	1789	738	1051	6.9	4.6	6.5	2.8	2.4	1.5	0.34	59	-0.05
Vick	53	2474	1571	903	6.4	7.7	4.4	4.0	2.9	3.4	0.70	36	-0.10
Losman	62	3051	1705	1346	7.1	6.4	5.0	4.0	2.9	3.3	0.70	44	-0.13
Kitna	62	4208	2377	1831	7.1	6.4	4.9	4.0	3.0	3.7	0.59	44	-0.13
Garrard	60	1735	902	833	7.2	6.2	5.7	3.7	3.0	3.7	0.46	48	-0.18
Young	52	2199	1237	962	6.2	6.7	5.2	3.5	2.9	3.6	0.34	44	-0.29
Roethlisberger	60	3513	1974	1539	7.5	7.1	5.5	4.2	3.3	4.9	0.54	44	-0.32
Leftwich	59	1159	532	627	6.3	4.9	5.8	2.9	2.5	2.7	0.25	54	-0.39
Favre	56	3885	1764	2121	6.3	5.1	6.2	2.9	2.6	2.9	0.21	55	-0.45
Harrington	57	2236	1250	986	5.8	5.6	4.4	3.2	2.8	3.9	0.29	44	-0.54
Green	61	1342	781	561	6.8	6.5	4.6	3.9	2.9	4.3	0.57	42	-0.58
Cutler	59	1001	483	518	7.3	6.0	6.4	3.5	2.7	3.5	0.57	41	-0.62
Plummer	55	1994	1058	936	6.3	6.0	5.3	3.3	2.8	4.1	0.33	47	-0.64
Smith	58	2890	1405	1485	6.5	5.5	5.8	3.2	2.5	3.6	0.42	51	-0.87
Johnson	62	2750	1331	1419	6.3	4.9	5.3	3.0	2.4	3.4	0.43	52	-0.94
Carr	68	2767	1198	1569	6.3	4.0	5.2	2.7	2.0	2.7	0.50	57	-1.26
Frye	64	2454	1275	1179	6.3	5.1	4.7	3.3	2.3	4.3	0.60	48	-1.55
Campbell	53	1297	405	892	6.3	3.7	8.1	2.0	1.6	2.8	0.26	45	-1.86

AY = Air Yards
YAC = Yards after Catch
YPA = Yards per Attempt (sacks not included)
TAY/Att = 'True' Air Yards per Att (Includes sack data)
SkYd Rate = Sack Yards per Attempt (Attempts include sacks)
YAC% = Percentage of passing yards obtained from YAC
+Wins = Wins Added by QB over a 16-game season
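As a rough check on how the columns relate, the Python sketch below recomputes a few of them for the top three rows. It assumes that TAY/Att is (Air Yards - Sack Yards) per attempt and that Int% is interceptions per 100 attempts; the helper names are hypothetical. Because the table values are rounded, the recomputed +Wins should only land near the published figure, not match it exactly.

```python
# (Yds, YAC, TAY/Att, Int%, published +Wins), copied from the table above.
ROWS = {
    "Manning P": (4397, 1508, 4.9, 1.6, 3.88),
    "Romo": (2903, 1053, 4.8, 3.9, 2.61),
    "Huard": (1878, 833, 3.6, 0.4, 2.46),
}


def air_yards(yds, yac):
    """Air Yards: total passing yards minus yards after catch."""
    return yds - yac


def yac_pct(yds, yac):
    """Share of passing yards that came from YAC, as a whole percent."""
    return round(100 * yac / yds)


def approx_wins_added(tay_per_att, int_pct):
    """Approximate +Wins from the rounded per-attempt columns.

    Assumes tay_per_att already nets out sack yards and that int_pct
    is interceptions per 100 attempts.
    """
    return tay_per_att * 1.56 - (int_pct / 100) * 50.5 - 3.0


for name, (yds, yac, tay, int_pct, published) in ROWS.items():
    estimate = approx_wins_added(tay, int_pct)
    print(
        f"{name:10s} AY={air_yards(yds, yac)} YAC%={yac_pct(yds, yac)} "
        f"+Wins~{estimate:.2f} (published {published:.2f})"
    )
```

The small gaps between the estimates and the published +Wins are what one would expect from working with one-decimal inputs.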

Peyton Manning tops the list, which should be a surprise to no one. But Romo's numbers are very impressive. Manning beats him by virtue of his low sack and interception rates.

Campbell's numbers are bad news for Redskins fans. Not only is his completion percentage very low, he's also throwing very short passes. His average completion only goes 3.7 yds down field.

Notice how much better Huard fared than Green. They had the same team around them, but Huard distinguished himself thanks to his very low interception rate and his above average down field passing ability.

Perhaps Vick's passing ability will be missed in Atlanta after all. When he does hit his receiver, it's usually for a good chunk of yards.

Rivers appears to be genuinely talented and was not relying on the talent around him to pound out YAC. He was successfully throwing deep in 2006.

There are many more observations that can be made, but if you're like me, you look right away for where your favorite QB ranks, and then decide if you buy the formula. Next, I'll expand it further to include fumbles and rushing yards.

published on 8/19/2007 with 11 comments

2006 QB Air Yards

By Brian Burke

I've recently been studying YAC and its complement, Air Yards--the yards a pass travels through the air prior to the catch. So far we've learned that Air Yards are more critical than YAC in terms of generating high yards per attempt stats. We've also learned that YAC should be credited to the receiver only. No matter how accurate a passer is, he does not contribute to his receivers' YAC by hitting them in stride or by other means.

In this post I take a look at the top 30 QBs of 2006 in terms of Air Yards and YAC. Basically, I wanted to find out who the truly good QBs are, and who are the QBs who may appear good because they benefit disproportionately from receiver YAC.

The measures I use for each QB are YAC per Completion and Air Yards per Completion. These two stats tell us how each passer achieved his total passing yards. I also look at completion percentage, which is critical to both YAC and Air Yards--there can be neither without a completion. Applying completion percentage to Air Yards per Completion tells us Air Yards per Attempt.

Air Yards per Attempt captures the essence of passing. It tells us how well a QB throws down field and how accurate he is. It also filters out the effect of receiver YAC, yielding a truer measure of passing performance. Some quarterbacks are able to amass large amounts of yards by throwing short high-percentage passes to backs and receivers who generate big YAC yardage. This inflates the QB's stats above his true talent and achievement.
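The relationship between these measures is a single multiplication, sketched below in Python. The function name is hypothetical, and the sample inputs are the rounded Cmp% and AY/Cmp values for three QBs from the table in this post, so the products are only expected to agree with the published AY/Att to one decimal place.

```python
def air_yards_per_attempt(cmp_pct, ay_per_cmp):
    """Air Yards per Attempt = completion rate x Air Yards per Completion."""
    return cmp_pct * ay_per_cmp


# (Cmp%, AY/Cmp, published AY/Att) for three QBs from the 2006 table.
SAMPLES = {
    "Romo": (0.65, 8.4, 5.5),
    "Manning P": (0.65, 8.0, 5.2),
    "Carr": (0.68, 4.0, 2.7),
}

for name, (cmp_pct, ay_per_cmp, published) in SAMPLES.items():
    computed = air_yards_per_attempt(cmp_pct, ay_per_cmp)
    print(f"{name:10s} computed {computed:.1f}, published {published:.1f}")
```

Carr is the instructive case: the highest completion rate of the three, but so few air yards per completion that his per-attempt figure is about half of Romo's.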

The table below lists 2006's top 30 QBs in total passing yards. They are sorted in order of Air Yards (AY) per Attempt.

Player	Cmp%	Yds	YAC	AirYds	YPA	YAC/Cmp	AY/Cmp	AY/Att	Int/Att	TD/Att	%YAC
Romo	0.65	2903	1053	1850	8.6	4.8	8.4	5.5	0.039	0.056	0.36
Manning P	0.65	4397	1508	2889	7.9	4.2	8.0	5.2	0.016	0.056	0.34
Palmer	0.62	4035	1532	2503	7.8	4.7	7.7	4.8	0.025	0.054	0.38
Hasselbeck	0.57	2442	813	1629	6.6	3.9	7.8	4.4	0.040	0.049	0.33
Huard	0.61	1878	833	1045	7.7	5.6	7.1	4.3	0.004	0.045	0.44
Rivers	0.62	3388	1435	1953	7.4	5.1	6.9	4.2	0.020	0.048	0.42
Roethlisberger	0.60	3513	1539	1974	7.5	5.5	7.1	4.2	0.049	0.038	0.44
Brees	0.64	4418	2088	2330	8.0	5.9	6.5	4.2	0.020	0.047	0.47
Leinart	0.57	2547	994	1553	6.8	4.6	7.3	4.1	0.032	0.029	0.39
Vick	0.53	2474	903	1571	6.4	4.4	7.7	4.0	0.034	0.052	0.36
Bulger	0.63	4301	1946	2355	7.3	5.3	6.4	4.0	0.014	0.041	0.45
Kitna	0.62	4208	1831	2377	7.1	4.9	6.4	4.0	0.037	0.035	0.44
Losman	0.62	3051	1346	1705	7.1	5.0	6.4	4.0	0.033	0.044	0.44
McNabb	0.57	2647	1397	1250	8.4	7.8	6.9	4.0	0.019	0.057	0.53
Grossman	0.55	3193	1316	1877	6.7	5.0	7.2	3.9	0.042	0.048	0.41
Pennington	0.65	3352	1485	1867	6.9	4.7	6.0	3.8	0.033	0.035	0.44
Garrard	0.60	1735	833	902	7.2	5.7	6.2	3.7	0.037	0.041	0.48
McNair	0.63	3050	1328	1722	6.5	4.5	5.8	3.7	0.026	0.034	0.44
Manning E	0.58	3244	1382	1862	6.2	4.6	6.2	3.6	0.034	0.046	0.43
Brady	0.62	3529	1728	1801	6.8	5.4	5.6	3.5	0.023	0.047	0.49
Young	0.52	2199	962	1237	6.2	5.2	6.7	3.5	0.036	0.034	0.44
Delhomme	0.61	2805	1360	1445	6.5	5.2	5.5	3.4	0.026	0.039	0.48
Plummer	0.55	1994	936	1058	6.3	5.3	6.0	3.3	0.041	0.035	0.47
Frye	0.64	2454	1179	1275	6.3	4.7	5.1	3.3	0.043	0.026	0.48
Harrington	0.57	2236	986	1250	5.8	4.4	5.6	3.2	0.039	0.031	0.44
Smith	0.58	2890	1485	1405	6.5	5.8	5.5	3.2	0.036	0.036	0.51
Johnson	0.62	2750	1419	1331	6.3	5.3	4.9	3.0	0.034	0.021	0.52
Favre	0.56	3885	2121	1764	6.3	6.2	5.1	2.9	0.029	0.029	0.55
Brunell	0.62	1789	1051	738	6.9	6.5	4.6	2.8	0.015	0.031	0.59
Carr	0.68	2767	1569	1198	6.3	5.2	4.0	2.7	0.027	0.025	0.57

Take a look at the QBs who are living off of YAC--Favre, Brunell, and Carr. Favre is seriously past his prime. Not only is his completion percentage down at 56%, but he's throwing short passes. Brunell was only slightly better, but he has given way to Jason Campbell in Washington. David Carr was by far the king of the dink and dunk passing game. Despite his 68% completion rate, he threw it down field an average of only 4.0 yards on each completion.

The real down-field threats are Romo (5.5), Manning (5.2), and Palmer (4.8). After those three, there is a drop-off to a pack of QBs between 4.4 and 4.0 Air Yds/Att. Rivers looks like a solid talent.

Notice how far down the list Brady is. He was throwing a lot of short passes to Dillon and Maroney. Passes to RBs tend to have little or no (or negative) air yards and lots of YAC. He still had a solidly above-average TD per Att rate and a below-average Int per Att rate.

I could go on. There are any number of observations that could be made. It's still not a perfect measure of passing ability, because passing always depends on receivers' abilities to get open and catch the ball. But I believe it is a far better measure than crediting YAC to the QB.

published on 8/15/2007 with 7 comments
