I'm gradually updating all the banners and titles, but you'll still see ANS references for a while as things progress. The old URL, advancednflstats.com, will redirect to the new address, hopefully as soon as the update spreads to all the DNS servers.
Thanks to everyone for sticking with AFA!
For those not familiar with lacrosse, imagine hockey played on a football field but, you know, with cleats instead of skates. And instead of a flat puck and flat sticks, there's a round ball and the sticks have a small netted pocket to carry said ball. And instead of 3 periods, which must be some sort of weird French-Canadian socialist metric system thing, there's an even 4 quarters of play in lacrosse, just like God intended. But pretty much everything else is the same as hockey--faceoffs, goaltending, penalties & power plays. Lacrosse players tend to have more teeth though.
Because players carry the ball in their sticks rather than push it around on ice, possession tends to be more permanent than in hockey. Lacrosse belongs to a class of sports I think of as "flow" sports. Soccer, hockey, lacrosse, field hockey, and to some degree basketball qualify. They are characterized by unbroken and continuous play, a ball loosely possessed by one team, and netted goals at either end of the field (or court). There are many variants of the basic team/ball/goal sport--for those of us old enough to remember the Goodwill Games of the 1980s, we have the dystopic sport of motoball burned into our brains. And for those of us (un)fortunate enough to attend the US Naval Academy (or the NY State penitentiary system) there's field ball. The interesting thing about these sports is that they can all be modeled the same way.
So with lacrosse season underway, I thought I'd take a detour from football work and make my contribution to lacrosse analytics. I built a parametric win probability model for lacrosse based on score, time, and possession. Here's how often a team can expect to win based on neutral possession--when there's a loose ball or immediately upon a faceoff following a previous score:
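For a rough sense of how a parametric model of this kind can work in the neutral-possession case, here's a minimal sketch. It is not the model described above. It assumes two evenly matched teams whose remaining goals each follow an independent Poisson distribution, it ignores possession entirely, and it treats a tie at the end of regulation as a coin flip. The function names and the scoring rate of 0.17 goals per team per minute are illustrative placeholders, not fitted values.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k goals when mu goals are expected."""
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    # Work in log space so large k does not overflow the factorial.
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def win_probability(lead, minutes_left, goals_per_team_per_min=0.17, max_goals=60):
    """Chance that a team currently ahead by `lead` goals wins the game.

    Assumption: both teams score at the same (hypothetical) rate, and a
    tie at the end of regulation is counted as half a win.
    """
    mu = goals_per_team_per_min * minutes_left  # expected remaining goals per team
    win = 0.0
    tie = 0.0
    for ours in range(max_goals + 1):
        p_ours = poisson_pmf(ours, mu)
        for theirs in range(max_goals + 1):
            p = p_ours * poisson_pmf(theirs, mu)
            final_margin = lead + ours - theirs
            if final_margin > 0:
                win += p
            elif final_margin == 0:
                tie += p
    return win + 0.5 * tie


# Example: up by 2 goals with a full quarter (15 minutes) to play.
print(round(win_probability(2, 15), 3))
```

A real model would fit the scoring rate to play-by-play data and add a possession term, since the team holding the ball is more likely to score next.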
The main responsibilities of the intern will include, but are not limited to, the following:
*Support ANS team with data extractions, analysis, reporting, and data presentation
*Suggest new ways to improve existing processes
*Document and automate data and analytic processes
*Currently enrolled in, or a recent graduate of, a quantitative degree program such as Operations Research, Mathematics, Statistics, Economics, Computer Science, Information Science, Management Information Systems, or similar
*Senior, recent graduate or post-grad level
*Demonstrated attentiveness to detail
*Experience in an analytical field
*Proficient in SPSS, SAS, MySQL, and/or R
*Proficient in Microsoft Office suite
The charts need some explanation. They plot how many timeouts a team has left during the second half based on time and score. Each facet represents a score difference. For example, the top-left plot is for when the team with the ball is down by 21 points. Each facet's horizontal axis represents game minutes remaining, from 30 to 0. The vertical axis is the average number of timeouts left. So as the half expires, teams obviously have fewer timeouts remaining.
The first chart shows the defense's number of timeouts left throughout the second half based on the offense's current lead. I realize that's a little confusing, but I always think of game state from the perspective of the offense. For example, the green facet titled "-7" is for a defense that's leading by 7. Notice that defenses that are ahead naturally use fewer timeouts than those that trail, as shown by comparison with the "7" facet in blue. (Click to enlarge.)
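The numbers behind charts like these boil down to a grouped average: bucket every second-half play by score difference and game minute, then average the timeouts remaining in each bucket. Here's a minimal sketch of that aggregation. The tuple layout, the function name, and the sample rows are all hypothetical, made up purely for illustration, and are not the actual data or code behind the charts.

```python
from collections import defaultdict


def average_timeouts(plays):
    """Average timeouts left for each (score_diff, whole minute remaining).

    `plays` is an iterable of (score_diff, minutes_left, timeouts_left)
    tuples, a hypothetical layout chosen for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of timeouts, play count]
    for score_diff, minutes_left, timeouts_left in plays:
        key = (score_diff, int(minutes_left))  # truncate to the whole minute
        totals[key][0] += timeouts_left
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up sample rows: offense down 7 early in the half, up 7 late.
sample = [(-7, 29.5, 3), (-7, 29.1, 2), (7, 2.0, 1)]
for (score_diff, minute), avg in sorted(average_timeouts(sample).items()):
    print(score_diff, minute, avg)
```

Each facet in the charts corresponds to one score difference, with the per-minute averages plotted along the horizontal axis.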
It's been almost 6 years since I introduced the win probability model. It's been useful, to say the least. But it's also been a prisoner of the decisions I made back in 2008, long before I realized just how much it could help analyze the game. Imagine a building that serves its purpose adequately, but came to be as the result of many unplanned additions and modifications. That's essentially the current WP model, an ungainly algorithm with layers upon layers of features added on top of the original fit. It works, but it's more complicated than it needs to be, which makes upkeep a big problem.
Despite last season's improvements, it's long past time for an overhaul. Adding the new overtime rules, team strength adjustments, and coin flip considerations were big steps forward, but ultimately they were just more additions to the house.
The problem is that I'm invested in an architecture that wasn't planned to be used as a decision analysis tool. It must have been in 2007 when I heard some TV announcer say that Brian Billick was 500-1 (or whatever) when the Ravens had a lead of 14 points or more. I immediately thought, isn't that due more to Chris McAlister than Brian Billick? And, by the way, what is the chance a team will win given a certain lead and time remaining? When can I relax when my home team is up by 10 points? 13 points? 17 points?
That was the only purpose behind the original model. It didn't need a lot of precision or features. But soon I realized that if it were improved sufficiently, it could be much more. So I added field position. And then I added better statistical smoothing. And then I added down and distance. Then I added more and more features, but they were always modifications and overlays to the underlying model, all the while being tied to decisions I made years ago when I just wanted to satisfy my curiosity.
So I'm creating an all-new model. Here's what it will include:
I was just added to the football analytics panel as a last-minute fill-in. The panel is at 10:40 AM Saturday. I'll be there from Friday afternoon through Saturday afternoon, and I'm looking forward to reconnecting with all the great folks I met last year and to making new connections.
I'll make myself available for interviews or sit-down discussions by request. Please send an email to email@example.com to coordinate. See you there!
"Analytics" has unfortunately become a trendy buzzword in sports. I've found that many people who are only vaguely familiar with analytics, including some team executives, media members, and fans, have the wrong idea about what analytics is. Some think it's a panacea that can optimize the solution to any problem. Some think it's just statistical trivia or scientific minutiae, like ESPN's Sports Science series (Dwight Howard's arm-span is as big as a 2-car garage!). Others think it's just Moneyball, a one-time talent arbitrage applicable to only one sport. So I thought I'd put my own thoughts down on what analytics is, at least as it applies to football.
I'm not the gatekeeper on what qualifies as analytics, and I'm not going to say what counts and what doesn't. But I think analytics comprises all or parts of four general processes: