How to Lie with Statistics

Stochastic Football

 

Statistics make me cringe. Not the upstanding work done by Brian, but too much of the analysis done by too much of the greater sportswriting community. That word, “analysis,” doesn't help matters. Like many backwards things in Western Civilization, its modern usage derives from Aristotle. Real analysis, that's a marvel. What I encounter throughout the web would be better labeled statistical rhetoric: the use of statistics to forward a previously held opinion. There's this great quote by Wayne C. Booth about critical theory (specifically the idea of showing versus telling in fiction writing) and how it disseminated from scholarly critics down to commercial critics … well, I'll just share it:
“[T]he legitimate defense of the new soon froze into dogma. … [W]hen such rule-making descended further into the hands of unabashed commercial critics, it was simplified to the point of caricature.”
This is the progression: from new to accepted to—in some skeletal, bastardized form—the mother truckin law. Wanna be taken seriously? Gotta speak the language. For the modern sportswriter statistics are jargon, argot and shibboleth all in one. No wonder an old hat striving for relevance rushed to create an eponymous, um, effect? despite its obvious bogusness.

Brian attracted me to Advanced NFL Stats through his work exposing the phoniness of the so-called Curse of 370. His simple, clearly worded argument of why said curse was cooked up, and either indicative of blithe error or chicanery, challenged me to be careful and inquiring instead of gullible. So in honor of Mr. Burke and his fine and reputable site, I now intend the exact opposite. Let us together learn how to lie with statistics.

Why Standard Fantasy Football Rots and How To Fix It


My mother-in-law plays a dice game called Farkel. It's a game of simple math, but mostly a game of chance—more War than chess, more cleromancy than game theory. If I were stuck with her in a windowless room without my toothbrush but with a woman named Estelle, if we were stuck for all eternity, perhaps we could determine if her, ahem, extreme prudence or my more reckless style were superior. But as-is there's a lot of guessing, a lot of premature revelry and a lot of empty opining about strategy. If I may: the game sucks. It's an excuse for people of a certain age to get together and drink.

Fantasy football is a $70 billion market. I can't believe I just wrote that. That … that number was not planned.

… give me a second …

23 million people play fantasy football, and while I could not find an exact number, a not terribly scientific or thorough accounting of Yahoo public leagues versus Yahoo private leagues indicates most play with standard rules. And many customized leagues are close enough to standard rules to be comparable for my purposes.

Fantasy football played by standard rules is a rotten game, requiring little skill, that in its crappiness is a bad reason even to get together and drink. Here's why:

Bombast Revisited; A Partial List of My Cognitive Biases

Stochastic Football 

 

Bombast Revisited

 

It's said a joke explained is a joke ruined. So how about satire? I wrote a loudmouth, accusatory post. It was fun. My intention was to appropriate the tools of popular sports punditry: attention-grabbing headline, us vs. them framing, a hectoring tone, abundant self-assurance, straw men galore, unsourced data and an adamantine sense of moral authority. And here's what happened: the response was almost universally negative, and often outright caustic. But the post netted big traffic: the most since Brian's point/counterpoint on Aaron Rodgers' extension, and more than double any post written since May.

Here's a Thought

If it's OK for coaches to use the preseason to experiment with Peyton Manning in the pistol and risk Jay Cutler in the read-option, then it's OK to give your punt team a game off. Put away the 21 personnel I-formation and go with 11 personnel on all four downs. Go for it. Be aggressive. Test the limits. It's the preseason. There are three other preseason games and 40 practices for you to hash out roster spots 52 and 53 on your punt and field goal units.


Call for Writers 2013

It's time to report for training camp.

This season Advanced NFL Stats is planning to add a small number of additional contributors. I’m looking for smart, articulate thinkers to bring a fresh perspective to the world of NFL analytics. Previous analysis or blogging experience is preferred, but not required.

Those interested should email me directly (see the About - Contact/FAQ menu link) no later than August 25th. Include the words 'Call for writers' in the subject line, please. Your email should include a brief introduction and links or attachments of two or more examples of your analysis and writing. If you have no previous experience, you can send ‘demo’ drafts of the kind of analysis you’d like to do.

In particular, I’m looking for contributors to ‘own’ a regular weekly assignment. For example:

The Pay-Performance Linear Model

A couple months ago I posed an apparent paradox. Aaron Rodgers' new $21M/yr contract was either a solid bargain or a disastrous ripoff depending on how we analyze the data. By only flipping the x and y axes of a scatterplot, we can come to completely opposite conclusions about the value of a QB relative to what we'd expect for a given salary or for a given level of performance. Much of this post is derived from the many insightful comments in the original. Please take the time to read them, especially those from Peter, X, Phil and Steve.

When we regress salary on performance (adjusted salary cap hit on the vertical (y) axis and Expected Points Added per Game (EPA/G) on the horizontal (x) axis), Rodgers' deal looks insanely expensive by conventional standards. But when we regress performance on salary, his new contract looks like a bargain.

Which one is correct? That depends on several considerations. First, there are generally two types of analyses. The one I do most often is normative analysis--what should a team do? The second type is descriptive analysis--what do teams actually do? The right analytic tool can depend on which question we are trying to answer.

The reason that we saw two different results by swapping the axes is that Ordinary Least Squares (OLS) regression chooses a best-fit line by minimizing the sum of the squared errors between the estimate and the actual data of the y variable only. Errors in the x variable are not considered. Unless the two variables are perfectly correlated, OLS therefore produces an estimate that naturally has a shallow slope with respect to the x axis. Because of that shallowness the procedure is not symmetrical: when we swap axes, the new line is shallow with respect to the new x axis, so it is not the same line as the first.
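For anyone who wants to see the asymmetry for themselves, here is a minimal sketch in plain Python. The data are synthetic and made up purely for illustration (they are not the QB salary or EPA/G numbers from the original post), and the helper name `ols_slope` is my own. It fits the line both ways and then expresses the second fit in the first fit's axes so the two slopes can be compared directly.

```python
import random


def ols_slope(x, y):
    """Slope of the OLS line predicting y from x (minimizes squared y errors only)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


# Synthetic, hypothetical data: a noisy positive relationship between
# a "performance" number and a "salary" number. Not real NFL data.
rng = random.Random(0)
performance = [rng.gauss(0.0, 1.0) for _ in range(200)]
salary = [0.5 * p + rng.gauss(0.0, 1.0) for p in performance]

# Fit 1: salary (y) regressed on performance (x).
slope_salary_on_perf = ols_slope(performance, salary)

# Fit 2: performance (y) regressed on salary (x).
slope_perf_on_salary = ols_slope(salary, performance)

# Redraw fit 2 in fit 1's axes (salary vertical, performance horizontal).
implied_slope = 1.0 / slope_perf_on_salary

print("salary on performance slope:", round(slope_salary_on_perf, 3))
print("performance on salary, same axes:", round(implied_slope, 3))
```

The product of the two fitted slopes equals the squared correlation, which is below 1 whenever the data are noisy, so the second line always comes out steeper than the first when drawn on the same axes. A point far out on the x axis can therefore sit above one line and below the other, which is how the same contract can look like a ripoff in one fit and a bargain in the other.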

A HOF Game Preview Without Irony

Stochastic Football

 

Plans and Gambles Among Former Etruscan Pirates


By team efficiency, Miami finished on the low end of average in both offense and defense. Any other year, rookie quarterback Ryan Tannehill's performance would have been thought promising. He wasn't good, but typically rookie quarterbacks are not good, and he wasn't so bad as to seem unsalvageable. Rummaging through the last 13 years of data: Carson Palmer, Eli Manning, Jay Cutler, Joe Flacco and Matthew Stafford all performed comparably to or worse than Tannehill. Two things work against Tannehill. He was bad in a season when three other rookie quarterbacks were very good to excellent, but that's more a matter of perception. And he's old. At 25, he's but months younger than Stafford and Josh Freeman. He's not comparing brands of glucosamine chondroitin with Brandon Weeden, but he's not a baby face still growing into his body. As an athlete, Tannehill's arrived.