Tables vs. Graphs

I'm not sure about everyone else, but I've got a very visual brain. I'm one of those guys at work who can't have a conversation without going to the whiteboard, if only to organize my own thoughts. I don't think I'm alone, either. Many theorists believe one reason humans became such smart monkeys is that we co-opted the huge visual-spatial part of our brains to use for abstract thought.

The concept of time is one example. We think and talk of time, a concept virtually without its own terminology, in terms of space and motion: Time goes by... Our best days are ahead of us... I'm looking forward to next season. The blathering talking heads on CNBC can't go 20 seconds without convulsively saying the phrase "going forward" whenever referring to the future.

Abstract sports concepts like win probability are no exception. We would all call a 2-yard run on 4th and 1 a big play, even though it was anything but literally big. How would we characterize a 38-35 game? As a high-scoring game, of course. We are universally comfortable speaking about abstract concepts in terms of the metaphors of physical position, size and motion, and it's a window into how we think. That's why I'll take a graph over a table of numbers any day.

Tables have their advantages, but graphs are perfectly suited for comparing quantities and for detecting relationships between variables, two of the most common things we do around here. Here's a quick example. Consider team offensive Expected Points Added (EPA) for passing and rushing in 2010. I could present the information either as a table of numbers, like this:

Team Offense EPA

Team   Pass EPA   Run EPA
NE        143.7      60.8
HST       119.8      40.9
GB        163.0     -12.3
SD        159.9     -17.7
NO        157.2     -16.8
PHI        74.0      62.4
IND       146.0     -17.9
ATL       105.4       4.8
PIT       116.4      -8.7
TB         69.7      20.1
DAL        62.6       8.5
OAK        30.1      29.8
BLT        83.9     -25.1
JAX        15.3      43.2
NYG        68.6     -12.8
KC         52.8       0.8
TEN        66.1     -19.5
NYJ        29.6      10.1
DEN        77.2     -49.1
DET        42.2     -19.9
CIN        55.2     -53.3
WAS         9.4     -11.3
SF          6.5      -9.9
CLV        -1.4     -14.1
MIA        10.9     -33.7
CHI        -2.8     -21.9
SL         -5.9     -32.0
MIN       -38.7       0.2
BUF       -25.3     -30.5
SEA       -20.8     -39.8
ARZ      -103.5     -15.0
CAR      -110.9     -45.7
Avg        48.6      -7.0

Or I could present the same information in a scatter plot, like this:

Right away, you can see where each team stands in relation to the others. You can see the averages, illustrated with the green lines. For example, NE, NO, IND, and GB all had very successful offenses this past season, but what set NE apart was their success in the running game. This is far more intuitively apparent in the graph than in the table.

Further, you can see relationships within the data that you might never detect just by reading a table. You can immediately see that pass EPA is distributed much more widely than run EPA. You might also notice the correlation between rushing and passing success.
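For anyone who wants to check those two visual impressions against the numbers, here is a minimal sketch in Python. It is not part of how the graphs were made; it just takes the 2010 figures from the table above and computes the spread of each column and the pass/run correlation. The names `EPA_2010` and `pearson` are made up for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

# (team, pass EPA, run EPA), copied from the 2010 table in this post
EPA_2010 = [
    ("NE", 143.7, 60.8), ("HST", 119.8, 40.9), ("GB", 163.0, -12.3),
    ("SD", 159.9, -17.7), ("NO", 157.2, -16.8), ("PHI", 74.0, 62.4),
    ("IND", 146.0, -17.9), ("ATL", 105.4, 4.8), ("PIT", 116.4, -8.7),
    ("TB", 69.7, 20.1), ("DAL", 62.6, 8.5), ("OAK", 30.1, 29.8),
    ("BLT", 83.9, -25.1), ("JAX", 15.3, 43.2), ("NYG", 68.6, -12.8),
    ("KC", 52.8, 0.8), ("TEN", 66.1, -19.5), ("NYJ", 29.6, 10.1),
    ("DEN", 77.2, -49.1), ("DET", 42.2, -19.9), ("CIN", 55.2, -53.3),
    ("WAS", 9.4, -11.3), ("SF", 6.5, -9.9), ("CLV", -1.4, -14.1),
    ("MIA", 10.9, -33.7), ("CHI", -2.8, -21.9), ("SL", -5.9, -32.0),
    ("MIN", -38.7, 0.2), ("BUF", -25.3, -30.5), ("SEA", -20.8, -39.8),
    ("ARZ", -103.5, -15.0), ("CAR", -110.9, -45.7),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


pass_epa = [p for _, p, _ in EPA_2010]
run_epa = [r for _, _, r in EPA_2010]

# Sample standard deviation measures how widely each column is spread
pass_sd = stdev(pass_epa)
run_sd = stdev(run_epa)
r = pearson(pass_epa, run_epa)

print(f"Pass EPA: mean {mean(pass_epa):.1f}, std dev {pass_sd:.1f}")
print(f"Run EPA:  mean {mean(run_epa):.1f}, std dev {run_sd:.1f}")
print(f"Pass/run correlation: {r:.2f}")
```

The standard deviations put a number on how much wider the passing cloud is, and the correlation coefficient does the same for the upward tilt you can see in the scatter plot. The graph still gets you there faster.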

The interactive features make the graph even more compelling. To see the exact numbers or to reveal obscured team abbreviations, hover your cursor over any team's data point. To filter any number of divisions, click on the legend below the main plot.

This graph is no different from the ones in these two posts, which highlighted team balance and the correlation between a team's offensive and defensive performance. But the plan is to have graphs like this, updated weekly, available throughout next season. The hard work is done, so now I can feed any set of statistics into the meat-grinder and instantly produce a similar plot, whether it's WPA, EPA, SR, YPA, or simple yards. It could be team running/passing or team offense/defense. No longer will I have to paste the data into Excel, create the graph, save it, upload it, etc...
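To give a rough idea of what "feed any set of statistics in" could look like, here is a hypothetical Python sketch. It is not the actual pipeline behind these graphs (which, as noted in the comments, are drawn with XML/SWF Charts). The function name `scatter_spec` and the dictionary keys are assumptions made up for illustration; the three sample rows come from the 2010 table above.

```python
from statistics import mean


def scatter_spec(rows, x_stat, y_stat):
    """Pick any two stat columns and return what a scatter plot needs:
    labeled points plus the two averages used for the reference lines."""
    points = [(row["team"], row[x_stat], row[y_stat]) for row in rows]
    return {
        "x_label": x_stat,
        "y_label": y_stat,
        "points": points,
        "x_avg": mean([x for _, x, _ in points]),
        "y_avg": mean([y for _, _, y in points]),
    }


# Three rows from the 2010 table; a real feed would carry all 32 teams
rows = [
    {"team": "NE", "pass_epa": 143.7, "run_epa": 60.8},
    {"team": "GB", "pass_epa": 163.0, "run_epa": -12.3},
    {"team": "PHI", "pass_epa": 74.0, "run_epa": 62.4},
]

spec = scatter_spec(rows, "pass_epa", "run_epa")
for team, x, y in spec["points"]:
    print(f"{team}: ({x}, {y})")
print("averages:", round(spec["x_avg"], 1), round(spec["y_avg"], 1))
```

Because the stat names are just arguments, the same function would serve for WPA, SR, YPA, or offense vs. defense, as long as each row carries those columns. Note that the averages here cover only the three sample rows, not the league.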

Here are some other examples that illustrate how a simple graph can sometimes convey information in a much clearer way than a table of jumbled numbers and decimals. This next graph is team opponent-adjusted Success Rate from the 2002 season. Look at how the Buccaneers' defense stands out above the pack. Their Super Bowl opponent, the Raiders, had a scorching offense, but it was the Bucs' year.

The next graph is an example of a more conventional statistic, net passing yards per attempt (YPA) for the 2010 season. I've made defensive YPA appear negative to be consistent--better defenses are higher on the vertical axis. Aside from the snake-bitten (or Norv-bitten if you prefer) Chargers, you can see that the two Super Bowl contenders are the two furthest teams to the upper-right, signifying their strength in both offensive and defensive net pass efficiency.

There's no new analysis here, just a novel way of presenting the information. One of the projects I'll be working on this off-season is creating handy visualizations like these and others.

15 Responses to “Tables vs. Graphs”

1. Eric says:

are the axes properly labeled?

2. DSMok1 says:

What sort of software packages are you using for this work? What API for the flash stuff?

3. Anonymous says:

wrong graph for first one, you're graphing offense vs defense.

4. Brian Burke says:

Oops. Sorry wrong graph. You get the idea. The Flash stuff is 'XML/SWF Charts', same thing I use for the WP graphs.

http://www.maani.us/xml_charts/index.php

5. Brian Burke says:

Should be the correct graph now.

6. Larry says:

Something you might want to consider is using RApache. It is a nice method of putting all of your data in one place and allows a lot of ad-hoc analysis. It is also powered by R statistics package so there is a lot of horsepower to run statistical analysis.

7. Florida Danny says:

couldn't agree with you more, brian. here's another perfect example, though not sports-related (health care spending vs. per-capita GDP for OECD countries):

http://theincidentaleconomist.com/wordpress/sure-its-got-to-go-up-but-how-much/

i've never seen that basic data-driven idea about the US conveyed so powerfully and succinctly all at once. i don't really understand why politicians don't go all ross perot on us anymore. i bet if obama had put that graph up on the screen during the healthcare debate, he'd have had a lot more popular support at the time.

8. Dr Obvious says:

Florida Danny - You may find these two links interesting. Part 2 especially.

http://politicalcalculations.blogspot.com/2007/09/redefining-health-care-debate-part-1.html

http://politicalcalculations.blogspot.com/2009/07/redefining-health-care-debate-part-2.html

Brian - Please excuse the slightly off topic nature compared to your original post.

9. Ian Simcox says:

Florida Danny

Interesting link. I know we're getting off NFL here (hey, it's the offseason) but one interesting point to note about health spending is that if you plot the US's historic spend on those graphs, it's always above the European cloud of points i.e. for any given GDP/capita, the US spends more on healthcare than any other OECD country.

If you plot the same for Norway, one of those on the far right, you see that as GDP/capita grew, healthcare spend in Norway went pretty much straight through the non-US cloud. The implication seems to be that it's the US and not Norway that's the outlier in this dataset.

10. JMM says:

Well, to bring the discussion back to visual NFL stats, let me pose a definition of momentum. Momentum is the portion of a game graph where a peak to a trough covers more than 2 drives by each team. (see the SB graph from 3:10 of the second to 6:42 of the third where Pitt had "momentum" and the period onward when GB arguably had it.) The three spikes can, and should, be debated.

I am prefer graphs.

11. Doctorjorts says:

I can't believe I just read an entire post about the advantages that graphs have over charts. And we wonder why more NFL coaches aren't interested in statistics.

12. Anonymous says:

what the heck is JMM talking about

13. Anonymous says:

He is prefers graphs.

14. Florida Danny says:

dr. obvious-

thanks for the links. interesting stuff, and -- i agree -- a better analysis than the one cited in my link. i'm not an economist by any stretch, so take these critiques with a grain of salt, and enlighten me where necessary:

1) while i obviously agree with the idea that the US's orders-of-magnitude nation-level superiority in GDP (PPP) warrants a state-level analysis, i have reservations about the prudence of breaking down the US economy into 50 state economies. as illustrated by the charts on the link, it seems self-evident that the GDPs of the individual US states are highly interrelated, which is what we'd expect given the amount of interstate commerce that takes place among the states. it seems somewhat methodologically flawed (from my non-economist perspective) to treat the states as independent data points when interstate commerce, by definition, renders the state-specific economies as dependent on the other state economies.

2) although the statistical argument does hold sway with me, the write-up seems a tad influenced by an anti-universal-coverage agenda (and conspicuously-so).

15. Anonymous says:

I've been playing with a graphics library called Graph::Clicker, a Perl module. Not as fancy as the above but far better than gnuplot. Downloadable from CPAN.
