Comments on Advanced Football Analytics (formerly Advanced NFL Stats): The Pay-Performance Linear Model

Brian, I don't think you'll ever acquire e...

2013-08-30T10:11:18.216-04:00

Brian, I don't think you'll ever acquire enough data. Performance norms, yes, but salary norms, no -- because QBs only negotiate 2 or 3 prime contracts in their careers.

Let me add something else: OLS is unbiased with no...

2013-08-21T02:35:57.394-04:00

Let me add something else: OLS is unbiased with non-normal data, but it may not be efficient - i.e., a weighted least squares procedure may produce estimates that are closer to the truth with less data. I'm not sure I ever learned when OLS is guaranteed to be efficient, but it seems likely to me that normality would be a consideration (in particular, I suspect that a weighted least squares procedure would generally be more efficient for highly-skewed data and I suspect that it would not be more efficient for normal data).

In academic finance, at least, if we have data that is highly skewed but becomes normalish when logged (firm size, for example), we log it. That might be for efficiency or it might be because we expect the assumptions behind our standard errors to hold up better for logged data than for the unlogged version. So I'm not saying that normality isn't a good feature for the data to have, I'm just saying it isn't required for OLS to be consistent and unbiased ("consistent" just means that if you get an arbitrarily large data set, the errors in your estimates become arbitrarily small).
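To illustrate the logging point above, here is a minimal sketch using simulated data. The "firm size" values are a hypothetical lognormal sample, not real data, and `sample_skewness` is a helper defined only for this example.

```python
import numpy as np

def sample_skewness(values):
    """Third standardized moment: roughly 0 for symmetric data."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Hypothetical, simulated "firm sizes": lognormal, so heavily right-skewed.
rng = np.random.default_rng(0)
firm_size = rng.lognormal(mean=10.0, sigma=1.5, size=10_000)

raw_skew = sample_skewness(firm_size)          # large and positive
log_skew = sample_skewness(np.log(firm_size))  # close to zero

print(f"skewness of raw values:    {raw_skew:.2f}")
print(f"skewness of logged values: {log_skew:.2f}")
```

The logged series is normal by construction here, so this only shows the mechanics of the transformation; real salary or firm-size data would not be expected to behave this cleanly.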

No normality assumptions are required for linear r...

2013-08-21T00:35:14.959-04:00

No normality assumptions are required for linear regression estimates to be unbiased. Actually, you don't need a lot of assumptions for linear regression estimates to be unbiased (though one of them is that the true underlying relation is linear, which you usually don't actually quite believe when you're running linear regressions).

The standard errors, now, those are a whole other ball of wax. Those require assumptions like normality (of at least the residual and maybe the X variables, and note that if the residual and the X-variables are joint-normal and the true relation is linear, then the y-variable is normal) and that the variance of the residual doesn't depend on the X-variable and that the cows are spherical (I should probably look this up in one of my econometrics books, but that's how I remember it).
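The unbiasedness claim above can be checked with a small simulation. This is a sketch under stated assumptions: the true relation is linear, the errors are independent of x with mean zero, and the errors are deliberately skewed (a shifted exponential). The function name and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def simulate_slopes(n_trials=2000, n_obs=50, true_slope=2.0, seed=0):
    """Fit OLS on many simulated samples and return the estimated slopes."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_trials)
    for i in range(n_trials):
        x = rng.uniform(0.0, 10.0, size=n_obs)
        # Skewed, non-normal errors with mean zero: exponential(1) minus its mean.
        errors = rng.exponential(1.0, size=n_obs) - 1.0
        y = 1.0 + true_slope * x + errors
        # np.polyfit with degree 1 returns [slope, intercept] from least squares.
        slope, _intercept = np.polyfit(x, y, 1)
        slopes[i] = slope
    return slopes

slopes = simulate_slopes()
print(f"average estimated slope: {slopes.mean():.3f} (true slope is 2.0)")
```

The average of the estimated slopes should land very close to the true slope even though the errors are far from normal. The simulation says nothing about whether the usual standard errors and t-tests are trustworthy in small samples, which is the separate question raised above.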

Regarding normality of the y-variable...It was alw...

2013-08-15T22:00:28.264-04:00

Regarding normality of the y-variable...It was always my understanding that OLS regression assumes normality. That's why we minimize the square of the error. The choice of squaring the error is not arbitrary and is derived directly from the equation that describes the Gaussian curve. The t-tests for significance and the goodness-of-fit calculations (r and r-squared, for example) assume normality. I'm not the authority here, so perhaps I'm mistaken.

Also, it is true that for unbiased regression estimators, the distribution of the errors must be normal.
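The link between squared error and the Gaussian curve mentioned above can be written out as a short derivation. The notation (a single predictor x_i, coefficients \beta_0 and \beta_1, error variance \sigma^2) is assumed here for illustration and does not come from the original post.

```latex
% Assumed model: linear relation with i.i.d. Gaussian errors.
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \quad i = 1, \dots, n

% Log-likelihood of the sample under that model.
\log L(\beta_0, \beta_1, \sigma^2)
  = -\frac{n}{2} \log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2

% Only the last term depends on the coefficients, so maximizing the
% likelihood over them is the same as minimizing the sum of squared errors.
\arg\max_{\beta_0, \beta_1} \log L
  = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2
```

This shows that least squares coincides with maximum likelihood when the errors are Gaussian. It does not by itself show that normality is needed for the least-squares estimates to be unbiased, which is the point disputed elsewhere in this thread.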

Will-Just trying to establish a baseline for the m...

2013-08-15T21:56:49.188-04:00

Will-Just trying to establish a baseline for the market. The analysis is for current year salary for current year performance. In the long run, and in aggregate among all the FA QBs, the relationship will hold.

Brian, you ignored Anon's second point, which ...

2013-08-15T15:28:09.862-04:00

Brian, you ignored Anon's second point, which is a very important one. Based on your original post (which I did read...and the comments) the question is "is AR's contract a bargain or not?" and the measure of "bargain" must be against the market value.

As Anon pointed out, salary is based on projected future performance, which is in turn based on the performance up to that point. If you are trying to tease out the mean performance valuation NFL GMs have in their heads when making QB contracts, and thus whether AR's contract is in or out of line with that expectation, then you can't regress career averages against each other; rather, you need to work with the information the GMs had at the time: performance up to the date the contract was signed. This performance estimate may or may not correspond to EPA (to the extent that it does, the GM is doing a good job maximizing the things that contribute to winning) and is probably strongly weighted to recent performance (leading to the concept of the contract-year). Future performance does somewhat correlate to past performance (there is such a thing as a good QB), but the salary paid in a given year has absolutely zero direct logical relation to the performance in that year, and average performance only weakly so (if the time period over which the average was taken includes a new contract and thus a chance for performance information in the first part of the interval to feed forward to the latter part).

The rest of your statistical methodology discussion is fine, but it really just serves to highlight the importance of getting your question right in the first place before hauling out the statistical toolbox to answer it.

Back to the original-original question of the cont...

2013-08-15T14:13:15.839-04:00

Back to the original-original question of the contract's value: could one compile total EPA/G for all of the different possible combinations of teams based on position and salary cap constraints, and compare the average EPA/G of those teams to the average EPA/G of all of the different possible combinations with Rodgers?

This would show how restrictive Rodgers's contract is to the remaining roster positions. Also it would account for the actual contract conditions in the NFL.

Phil Birnbaum is correct. It is the error term of ...

2013-08-14T23:01:43.667-04:00

Phil Birnbaum is correct. It is the error term of the Y variable that must be normally distributed, not the Y variable itself.

So Rodgers is a bargain, right? :)

2013-08-14T05:41:04.439-04:00

So Rodgers is a bargain, right? :)

Brian, are you sure the Y variable needs to be nor...

2013-08-13T20:06:12.288-04:00

Brian, are you sure the Y variable needs to be normal? I thought only the error term needed to be normal.

Anonymous-You completely misunderstand the post. I...

2013-08-13T18:18:36.668-04:00

Anonymous-You completely misunderstand the post. I see a diagonally oriented "cloud" with a very significant correlation. Please read the original post linked above. Besides, if it were a line, we wouldn't need a regression, so that makes no sense.

If teams were actually any good at digesting touchdowns, yards, etc. into a single number, they'd be real close to EPA. And in aggregate there would definitely be a very solid connection.

Here we are, comparing what teams *should* do and what they *do* do. That's right, I said do do.

"Does salary cause performance, or does perfo...

2013-08-13T13:50:07.374-04:00

"Does salary cause performance, or does performance cause salary? I think the answer is neither."

Then perhaps fitting a straight line to them doesn't really mean anything. In fact, at first glance, the data looks like a cloud not a line. The fact that we get such varying fits should lead one to conclude that these fits do not have great meaning.

I'd point out that we really don't do performance versus salary. It is expected future performance vs future salary. The future salary is known, it is the contract. The expected future performance is just a guess, which is often wrong (current results not indicative of future performance).

How do other measurements of performance (touchdowns, yards, postseason success, for instance) look versus salary? These are directly considered when estimating what salary a player gets, whereas EPA/G probably is not.

That's exactly what I always thought. Thanks ...

2013-08-12T20:22:44.330-04:00

That's exactly what I always thought. Thanks for confirming it for me, Brian.