Comments on Advanced Football Analytics (formerly Advanced NFL Stats): On Opponent Strength and Team Strength Correlation

I'm working on Week 8 Power Rankings for NFL t...

2015-10-31T15:35:38.404-04:00

I'm working on Week 8 Power Rankings for NFL teams, and after reading the comments above I have come to a rather 'quick' guess about "teams not being able to play each other." The answer, IMHO, is THE BYE WEEK. The problem is that it skews a team's power ratings and all other results. THAT would be the week they played themselves (in theory), so it becomes a NULL result. If you think about it, it probably turns out to be the same result, or a small deviation.

The BYE week really counts as TWO games where, say NE plays NE1 and NE1 plays NE. By making these games 0-0, in theory, there are 8 more games (each team plays itself twice). So, by making the BYE week, one would assume that would be the week that each team played itself.

So, a null result works for the Division, but I can see how it might not work with a 32-team, 8-Division format, because each week there are 3-4 games I call NCND (Non-Conference*Non-Division) games. These are the games that, IME, skew the numbers, because they usually fall After/Before (or BOTH) a Division game for each team. In some cases, the result is a 'sandwich' game: a game where, say, New England, after playing Miami, plays Dallas, then next week has another Division game. So, it's safe to say the stats from both teams in this game aren't TRUE stats. It's the game where a team ranked 28 beats a team ranked 10. Does this push up the worse team in League standings? NO, but it does show up in stats. To me, these NCND games should be taken OUT of the figures. Now versus Conference teams, it's different. AFC vs. AFC does have some implications for Playoffs, but the ONLY stat that is important (and it's minor) is W-L, unless it's late in the season and a team needs a W to get to Wild-Card status. I take the NCND games OUT. I get a better, more reliable, more accurate result doing it this way. It removes the 'Pre-Season' style games and stats that skew all my stats.

However, these games are A+ for Fantasy players because of the David vs. Goliath set-up, where the worst team can put up good numbers against a #2 team - not a 'target' game. But, they can be the most important game on the card of any Sunday for Fantasy! Find the best game of a lopsided match-up and use this to add some volume to your Fantasy score.

I'm not a Math guy, but I do know comparison is my 'tool' for working games, and how to write/customize the math can be difficult. Ex: Strength of Schedule (SOS) with League rank, O rank, D rank and whether the game is DIV or NC or NCND = a solid, usable number. Any help would be appreciated. I'm limited in Algebra, but do ok with Statistics and use stats for 'Comparison' (learned from Horse racing). Any Math Pros, I'd like link(s) to a problem Generator where I can customize my way of getting the numbers from mental idea - the stuff I know in my head - and onto paper.

Thanks for any input on my BYE week theory and any links to Math / Logarithm / Algorithm sites where a math novice can apply stats, create customized figures/results. Thx in advance. Too many rely on stats as concrete info, when they are simply historical results that in NO way determine the outcome of a game. They are useful to learn from, reverse-engineer for the sake of 'learning' in a different way. Many times, experience is the best tool. Creating custom-based, personal utility, and not from simple W-L record (being facetious), can result in solid, current, and RELEVANT rankings that reflect a team's real status. I use many different methods for certain games, even 'tiers' of yardage, points, TOP, TOs, and Fum-INT (with or without) TD, and Drive-times. I just need some direction on how to write this out. Thanks again. Please, any helpful links, info - much appreciated. GL all,

SC (if reply, please do so HERE, so I can find it)

Two problems: 1) SOS games should on average acco...

2012-02-01T22:08:39.618-05:00

Two problems:

1) SOS games should on average account for this; first place teams play two games each vs 2nd, 3rd, 4th, plus two SOS games against 1st.

2) There's no reason to believe that teams in a division would balance out to league average. Some divisions are strong or weak. There's also not a great year-over-year correlation in a team's ability: teams can rise or fall quickly, and often do.

I don't know if anyone is still reading this c...

2012-01-31T14:30:39.506-05:00

I don't know if anyone is still reading this comment thread, but I'll take a shot anyway.

How does the SOS change if you were to remove the games played by the team in question? In other words, if you remove 15 wins and 1 loss from GB's SOS, and do the same for every other team according to their record, does that help reduce (eliminate?) the SOS-WL pct. correlation?

I believe this is what they do for the RPI system in college basketball. I'm not trying to say that the RPI is great, because it isn't, but it does seem to make sense when calculating a team's SOS.
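A minimal Python sketch of that adjustment, for anyone who wants to try it. The three-team league and its results are made up purely for illustration, and `opponents_win_pct` is a hypothetical helper, not anything from the article or from an official RPI implementation. It averages each opponent's winning percentage, optionally leaving out that opponent's games against the team in question.

```python
# Hypothetical toy results as (winner, loser) pairs; team A goes 4-0.
games = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("A", "B"), ("A", "C"), ("C", "B"),
]

def opponents_win_pct(games, team, exclude_team=True):
    """Average opponent winning percentage over `team`'s schedule.

    With exclude_team=True, each opponent's record ignores its games
    against `team` (the RPI-style adjustment described above).
    """
    pcts = []
    for winner, loser in games:
        if team not in (winner, loser):
            continue
        opp = loser if winner == team else winner
        wins = sum(1 for w, l in games
                   if w == opp and not (exclude_team and l == team))
        losses = sum(1 for w, l in games
                     if l == opp and not (exclude_team and w == team))
        if wins + losses:
            pcts.append(wins / (wins + losses))
    return sum(pcts) / len(pcts)

raw_sos = opponents_win_pct(games, "A", exclude_team=False)
adjusted_sos = opponents_win_pct(games, "A", exclude_team=True)
print(raw_sos, adjusted_sos)
```

In this toy league the unbeaten team's raw SOS looks weak only because its own wins are counted as its opponents' losses; once those games are removed, its opponents come out at .500.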

James is saying that an average team playing that ...

2012-01-30T17:11:08.783-05:00

James is saying that an average team playing that schedule should be expected to win just 1 game.

You could also argue that the 1.89 GWP figure is too high. e.g. take Regular Season Win% and compute what an average team would do. Using the average method, which again I don't like to use as SOS, you get an average of *only* 1.5 wins.

In your last post, there's a problem with your math and understanding. The .122 you compute is not Wins per Game -- it is actually .122 GWP WINS BELOW AVERAGE. Thus, .378 is the same as 1.89/GP which when combined with the OPP GWP of .622 the sum is 1.

James - in that example you'd expect an averag...

2012-01-29T12:08:53.919-05:00

James - in that example you'd expect an average team to win 1.89 games (based on season end GWP), so approx a 2-3 record. Just because they are massive underdogs, they've still a chance of winning.

I suppose one problem is we haven't got a unit for strength of schedule. Wins per game seems a reasonable one. That schedule would give you 1.89 wins, vs 2.5 wins against an average schedule, which comes out to .122 wins per game. Average opponent GWP of those five games is 0.622, exactly 0.5 above the wins per game measure.
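The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch using the figures quoted in the thread (five games, average opponent GWP of 0.622), and it relies on the definition of GWP given elsewhere in these comments: an average team beats an opponent with GWP g with probability 1 - g. The variable names are mine.

```python
games = 5
avg_opp_gwp = 0.622  # average opponent GWP quoted in the thread

# Expected wins for an average team against this schedule.
expected_wins = games * (1 - avg_opp_gwp)

# Expected wins for an average team against an average (.500) schedule.
average_schedule_wins = games * 0.5

# Schedule difficulty expressed as wins per game below average.
wins_per_game_below_avg = (average_schedule_wins - expected_wins) / games

print(round(expected_wins, 2))            # 1.89
print(round(wins_per_game_below_avg, 3))  # 0.122
```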

If I'm massively missing the point then please explain it to me, but to me this seems to work fine.

@Mark M What's wrong with the average of oppo...

2012-01-27T21:16:11.572-05:00

@Mark M

What's wrong with the average of opponents generic win probability?

Lots of things. First, the averaging method is not robust enough. Second, this 4 team league example stinks because it is incomplete and there is way too much variance between the teams. There is no average team, and worse there are no similarly skilled teams. A team can always play against like opponents, and the NFL has all 32 teams play 2 of these every season! These matchups alone make it a 50-50 proposition REGARDLESS of the computed SOS or overall Win%.

GWP is the probability a team would win against an average team, so taking the straight average of that seems fair enough to me, because it answers your question.

You need to look at the schedules of each team individually and just averaging Win% doesn't capture that. For example, you probably didn't realize that the (#1) .75 team playing the (#2) .6 team is TWICE AS GOOD!!! That said, it has a 2/3 chance of winning!! In this bogus league there is no combination of MATCH-UPs where the probability of winning lies between 34% and 66%. Most unrealistic to say the least.
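The "twice as good" and 2/3 figures are consistent with an odds-ratio calculation (often called log5). The Python sketch below assumes that is the method behind them, since the comment does not say so; `matchup_win_prob` is a hypothetical name. A .75 team has 3-to-1 odds against an average opponent and a .6 team has 1.5-to-1, so the first team's odds are twice the second's.

```python
def matchup_win_prob(p_a, p_b):
    """Chance that team A beats team B, given each team's win
    probability against an average opponent (odds-ratio method)."""
    odds_a = p_a / (1 - p_a)
    odds_b = p_b / (1 - p_b)
    return odds_a / (odds_a + odds_b)

print(round(matchup_win_prob(0.75, 0.6), 3))  # 0.667
print(round(matchup_win_prob(0.5, 0.5), 3))   # 0.5
```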

Mark, I could see how an average could misrepresen...

2012-01-27T20:16:03.158-05:00

Mark, I could see how an average could misrepresent the spread of GWP. For instance, if a team had played the Steelers, Saints, Patriots, Packers, and Vikings the average GWP would be .6 and indicate a 3-2 record, but the average team would be heavy underdogs in 4 of the 5 games.

Pat Laffaye What's wrong with the average of ...

2012-01-27T13:25:31.514-05:00

Pat Laffaye

What's wrong with the average of opponents generic win probability? After all, SOS is simply trying to say 'if I was an average team, what would my win percentage be against these opponents'.

GWP is the probability a team would win against an average team, so taking the straight average of that seems fair enough to me, because it answers your question.

"It is difficult to imagine how this wouldnt ...

2012-01-26T21:33:22.064-05:00

"It is difficult to imagine how this wouldnt be seen" Jim do u have to be derisive 24 hrs a day?

When something actually is seen consistently, for a clear and simple reason, what is "derisive" about saying it is hard to imagine that it wouldn't be seen?

Your opinion may differ, you may be able to imagine that it wouldn't be consistently seen.

But why read a difference of opinion as a personal slight?

[I take it that before the arrival of blog comments you didn't spend a lot of time in the friendly and happy world of usenet discussion groups, where "every stranger was just a friend you hadn't met". ;-) ]

"It is difficult to imagine how this wouldnt ...

2012-01-26T17:54:46.764-05:00

"It is difficult to imagine how this wouldnt be seen" Jim do u have to be derisive 24 hrs a day? This is just one example but there r dozens to be found

In 4+ years of following this blog I find this art...

2012-01-26T15:12:56.411-05:00

In 4+ years of following this blog I find this article probably the most disappointing. This analysis is completely flawed, and just as terrible as FO's own analysis. Taking a simple average to compute SOS??? At least use an iterative method and try to better understand the nuances involved. Yes, there are many ways to compute SOS, but just averaging opponents Win% is one of the worst!

On Opponent Coaching Strength and OUr Coaching Str...

2012-01-26T14:34:12.649-05:00

On Opponent Coaching Strength and Our Coaching Strength Correlation...

Relatedly, I've often wondered, as a Pats fan, why other coaches seem so stupid.

It's because we never get to compete against Belichick.

Twice a year we play Chan Gailey, Dick Jauron, Mike Mularkey, Gregg Williams, Herm Edwards (!), the Mangenius, Dave Wannstedt (!!), Cam Cameron (!!!).

Brian, So the presumption of that analysis is tha...

2012-01-26T13:11:18.122-05:00

Brian,

So the presumption of that analysis is that the TWP within each division is .5. On average that will be true, and so we arrive at approximately the same number, but there is also a lot of variation in mean GWP between divisions.

That variation is very strong. Strong enough to make this correlation a really terrible way to judge yours or any other forecasting model. I just want to convince you to not head down that blind alley, because I enjoy your work.

Thanks for all that you do here. And GO GIANTS! :)

The average r that db22 reported of -.19 makes sen...

2012-01-26T10:06:21.691-05:00

The average r that db22 reported of -.19 makes sense. 6/16 (37.5%) of each team's schedule will have a "true" r of -1.0. Another 8 games (50%) should theoretically be uncorrelated at r=0. And 2 games (12.5%) should usually be positively (to some degree) correlated due to SoS scheduling.

I'm with Jim Glass on this one, it's extre...

2012-01-26T09:43:27.969-05:00

I'm with Jim Glass on this one, it's extremely obvious why, retrodictively (and THAT is KEY here) we would see the better teams having an easier schedule.

That is why, for me, GWP should be adjusted to represent the win probability a team would have were it playing an average team from the rest of the league not including itself, as this is far more meaningful.

In addition to the effect inherent in the method, ...

2012-01-26T06:04:45.344-05:00

In addition to the effect inherent in the method, there are also some special circumstances in this year's SoS games:

No. 1 seed schedules this year contained:
Indy, KC, and a lot of mediocrity in the NFC.
No. 3 seed schedules, OTOH, contained Houston, SF, and Oakland.

I dont understand how anyone who's seen their ...

2012-01-26T01:55:59.807-05:00

I don't understand how anyone who's seen their team rankings methodology takes FO seriously. It's pretty damn clear they don't understand the basics of statistics. Their whole staff is comprised of idiots like the 49ers fans of this year and the Falcons fans of last year that were constantly arguing that turnovers are predictive (even though there is literally no proof of this).

The question is why is there such a correlation ...

2012-01-26T01:53:24.494-05:00

The question is why is there such a correlation

Looking at this...

             W-L    Opp W-L    S-o-S
Best team    6-0     6-12      .333
2nd best     4-2     8-10      .444
3rd best     2-4    10-8       .556
Weakest      0-6    12-6       .667

... you are asking *why* is there a negative correlation between winning percentage and strength of schedule???
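For anyone who wants to reproduce the four-team numbers, here is a short Python sketch. It assumes a double round-robin in which the stronger team wins every game, which is what the records in the table imply; the variable names are mine.

```python
from itertools import permutations

# Ordered strongest to weakest; the stronger team is assumed to win every game.
teams = ["Best team", "2nd best", "3rd best", "Weakest"]
rank = {t: i for i, t in enumerate(teams)}
wins = {t: 0 for t in teams}
losses = {t: 0 for t in teams}

# permutations(teams, 2) yields every ordered pair, so each pairing is played twice.
for a, b in permutations(teams, 2):
    winner, loser = (a, b) if rank[a] < rank[b] else (b, a)
    wins[winner] += 1
    losses[loser] += 1

def sos(team):
    """Combined winning percentage of a team's three opponents."""
    opp_w = sum(wins[o] for o in teams if o != team)
    opp_l = sum(losses[o] for o in teams if o != team)
    return opp_w / (opp_w + opp_l)

for t in teams:
    print(f"{t:10s} {wins[t]}-{losses[t]}  S-o-S {sos(t):.3f}")
```

Every win a team adds to its own record is a loss added to its opponents' combined record, so in a closed league like this the two columns move in opposite directions by construction.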

and why is there such variation in the correlation.

What a mystery. To start with, one might ponder if there is any reason why the exact correlation *wouldn't* vary year-to-year. Remembering, among other things, Brian's observation that 42% of game results are random.

I've answered that question. Have you?

Um, yes.
~~~~~~~~~~~~~~~~~~

Wait - somebody thought there was something wrong with negative correlation? It's obvious that there *should* be negative correlation. I made this exact argument a week or two ago somewhere in the FO comments section.

Of course. You were right. It's obvious.

There are people at FO talking about using Python and Matlab to take apart and understand the complexities of this obviousness.

Which is something like using a 155mm howitzer to dissect a gnat. Worse, *needing* a howitzer to dissect a gnat. (Though a lot of it, I think, is some people will just take any excuse to have the fun of playing with a 155mm howitzer.)

another vote for the fact that it has been well kn...

2012-01-26T01:46:29.439-05:00

another vote for the fact that it has been well known.

for years, we fans have always said that the lions strength of schedule was high because they never get to play the lions.

(hopefully, that old joke is no longer valid).

Wait - somebody thought there was something wrong ...

2012-01-26T01:14:27.426-05:00

Wait - somebody thought there was something wrong with negative correlation? It's obvious that there should be negative correlation. I made this exact argument a week or two ago somewhere in the FO comments section.
For the most obvious example, start with a two-team league where one of the teams is stronger than the other. Proof by induction is left as an exercise for the reader. (Just think about what a minimal counter-example would look like.)

Well I'm going to make a mea culpa first. Miso...

2012-01-26T00:06:56.264-05:00

Well, I'm going to make a mea culpa first. I misordered a vector, so I was correlating the wrong two things. Fixed my code.

On 100000 runs:
R is at mean -.192
This should be lowered a little by the two conference games I didn't take into account.
R has standard deviation .278

The reason R can go positive is because of how teams are divided into divisions. For an extreme example consider that the NFC East has the four best teams, the NFC South the next four and so forth, and then consider we're in the year when the NFC East and the NFC South play. In that case, the 10 best teams are playing literally more than half their games against top ten teams. That gives a positive correlation of about .9; that's an extreme case - in a hundred thousand seasons the highest correlation I ever got was .71. But intuitively, we all know that there are seasons in which one division is really strong and another division godawful. Those circumstances will make the correlation more positive.
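A rough Python sketch of this kind of simulation follows. It is not the code behind the numbers above, and its assumptions are mine: 32 teams in 8 divisions, true strengths drawn uniformly between .3 and .7, a 14-game schedule (each division rival twice, plus one full division from each conference), fixed division pairings, and SOS taken as the average true strength of the opponents. It should not be expected to reproduce the figures quoted above, only the general behavior.

```python
import random
from statistics import fmean, pstdev

N_DIV, DIV_SIZE = 8, 4
# Assumed pairings for the season: same-conference and cross-conference partner divisions.
INTRA = {0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4, 6: 7, 7: 6}
INTER = {d: (d + 4) % N_DIV for d in range(N_DIV)}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def season_r(rng):
    """Correlation between team strength and schedule strength for one random season."""
    strength = [[rng.uniform(0.3, 0.7) for _ in range(DIV_SIZE)]
                for _ in range(N_DIV)]
    gwp, sos = [], []
    for d in range(N_DIV):
        for k in range(DIV_SIZE):
            rivals = [s for j, s in enumerate(strength[d]) if j != k]
            # 6 division games + 4 intra-conference + 4 inter-conference = 14 games.
            opponents = rivals * 2 + strength[INTRA[d]] + strength[INTER[d]]
            gwp.append(strength[d][k])
            sos.append(fmean(opponents))
    return pearson(gwp, sos)

def simulate(n_seasons=2000, seed=0):
    rng = random.Random(seed)
    rs = [season_r(rng) for _ in range(n_seasons)]
    return fmean(rs), pstdev(rs), max(rs)

mean_r, sd_r, max_r = simulate()
print(f"mean R = {mean_r:.3f}, sd of R = {sd_r:.3f}, max R = {max_r:.3f}")
```

Because no team appears in its own schedule, the average correlation comes out negative, while individual seasons can still go positive when strong divisions happen to be paired with each other.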

Jim,

"This season, by Pythagorean, the seven strongest teams had opponents with a 47% winning strength and the seven weakest teams had opponents with an average 53% winning strength."

I bet GWP correlates pretty well to Pythagorean. So what you are saying is that there is a negative correlation this year between GWP and schedule strength. We already knew that! The question is why is there such a correlation, and why is there such variation in the correlation. I've answered that question. Have you?

Hi Jim ... Do you have an issue with one of my ass...

2012-01-25T23:16:21.592-05:00

Hi Jim ... Do you have an issue with one of my assumptions?

No, I just don't see the need to use assumptions when one has facts. I looked at the actual Pythagorean numbers for all teams and noted what they say:

"This season, by Pythagorean, the seven strongest teams had opponents with a 47% winning strength and the seven weakest teams had opponents with an average 53% winning strength."

Looking at each team's game-by-game schedule one can see exactly where its numbers came from. (The Rams' opponents averaged 59% winning strength, the Packers' 45%, etc.)

If I didn't have the actual numbers I'd make assumptions to figure what they might reasonably be, but I have them.

So, I've looked into this a little because I d...

2012-01-25T23:01:04.051-05:00

So, I've looked into this a little because I do my own opponent adjusted EPA stats and here's what I found (maybe someone can help explain).

I adjust offenses based on defensive opponent strength and defenses based on offensive opponent strength. So, instead of just using the wins to determine strength of schedule, I wanted to look at it based on the strength of opponent's offenses and defenses.

Here are the correlations:
Adjusted Off EPA vs Opponent Def EPA: -0.021
Adjusted Def EPA vs Opponent Off EPA: -0.017

This is right around 0 as should be expected.

But,
Adjusted Off EPA vs Opponent OFF EPA: -0.53
Adjusted DEF EPA vs Opponent DEF EPA: -0.46

This falls more in line with what Brian was saying in terms of the nature of the schedule. But, it seems to me, that it might just be an outlier because I don't think these should be correlated, not really sure.

Would love to hear thoughts on it.

Pretty obvious once someone else has figured it ou...

2012-01-25T22:32:46.179-05:00

Pretty obvious once someone else has figured it out and posted a succinct explanation :)

Hi Jim, The reason it would be hard to see is bec...

2012-01-25T21:52:39.207-05:00

Hi Jim,

The reason it would be hard to see is because:

"If you make it a 16 team 15 game league, And the spread in GWP is even from .3 to .7, the spread in strength of schedule would be .027"

As in I calculated that using the pretty reasonable assumptions I gave. Do you have an issue with one of my assumptions? I did assume a certain distribution of GWP, but I could easily run my code on any arbitrary distribution. Though, I really don't see that this should make a difference.
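The .027 figure is easy to check in Python. This sketch assumes a single round-robin (each of the 16 teams plays the other 15 once) and defines strength of schedule as the plain average of the opponents' GWP; both are my reading of the quoted setup rather than the original MATLAB code.

```python
n_teams = 16

# GWP evenly spaced from .3 to .7 across the 16 teams.
gwp = [0.3 + 0.4 * i / (n_teams - 1) for i in range(n_teams)]
total = sum(gwp)

# Each team's schedule is everyone except itself, so its SOS is the
# league total minus its own GWP, averaged over 15 opponents.
sos = [(total - g) / (n_teams - 1) for g in gwp]

spread = max(sos) - min(sos)
print(round(spread, 3))  # 0.027
```

The .4 spread in GWP shrinks to .4/15 in schedule strength, because any two teams share 14 of their 15 opponents.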

Actually I just improved my code so that it accounts for a 14 game season. I neglect division champ games, because they're not easy to handle, and anyway should give a positive correlation.

After 10000 runs of such a season: The mean of R is .0012.
The standard deviation of R is .3657.

That standard deviation is pretty comparable to the spread of the year-by-year correlations from the FO post, which suggests that this correlation itself is fairly random.

If you would like my code (it's in MATLAB), I'd be happy to send it to you.