Posts filed under: research
Two-Point Conversion in the KC-DEN game
NFL coaches typically adhere to what's known as the Vermeil Chart for making two-point decisions. The chart was created by Dick Vermeil when he was offensive coordinator for UCLA over 40 years ago. It's a simple chart that looks only at the score difference prior to the conversion attempt and does not consider time remaining, with one caveat: it applies only when the coach expects three or fewer (meaningful) possessions left in the game.
With just over 7 minutes to play, there could be at most three possessions left, especially considering that at least one of those possessions would need to be a KC scoring drive for any of this to matter. (In actuality, there were only two possessions left, one for each team.) Even the tried-and-true Vermeil chart says go for two when trailing by 5. But it's not the 1970s anymore and this isn't college ball, so let's apply the numbers and create a better way of analyzing go-for-two decisions.
With rare exceptions, I've resisted analyzing two-point conversion decisions with the Win Probability model because, as will become apparent, the analysis is particularly susceptible to noise. Now that we've got the new model, noise is extremely low, and I'm confident the model is more than up to the task.
First, let's walk through the possibilities for KC intuitively. If KC fails to score again or DEN gets a TD, none of this matters. Otherwise:
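The branch-by-branch comparison boils down to simple expected-WP arithmetic. A minimal sketch, with made-up win probabilities standing in for the model's actual outputs:

```python
# Illustrative sketch of the go-for-two comparison when trailing by 5 and
# scoring a late TD. The WP numbers below are invented placeholders, NOT
# the model's actual values; only the structure of the comparison matters.

P_2PT = 0.47  # assumed two-point conversion success rate

# Hypothetical WP for KC after the try, keyed by the resulting lead:
wp_by_lead = {1: 0.55, 2: 0.60, 3: 0.75}  # placeholder values

wp_kick = wp_by_lead[2]  # treat the XP as automatic for this sketch
wp_go = P_2PT * wp_by_lead[3] + (1 - P_2PT) * wp_by_lead[1]

decision = "go for two" if wp_go > wp_kick else "kick"
```

With these placeholder inputs, the value of being up 3 (forcing a FG to merely tie) outweighs the risk of being up only 1.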
Nick Foles and Interception Index Regression
With one week of the 2014 season in the books, Foles and McCown have already matched that combined total. While everyone should have expected both to regress from their remarkably turnover-free 2013 seasons, that does not tell us how far each should regress based on historical norms.
Simulating the Saints-Falcons Endgame
I previously examined intentional touchdown scenarios, but only considered situations when the offense was within 3 points. In this case NO needed a TD, which--needless to say--makes a big difference. Yet, because NO was on the 1, perhaps the go-ahead score was so likely that ATL would be better off down 3 with the ball than up 4 backed-up against their goal line.
This is a really, really hard analysis. There's a lot of what-ifs: What if NO scores on 1st down anyway? What if they don't score on 1st but on 2nd down? On 3rd down? On 4th down? Or what if they throw the ball? What if they stop the clock somehow, or commit a penalty? How likely is a turnover on each successive down? You can see that the situation quickly becomes an almost intractable problem without excessive assumptions.
That's where the WOPR comes in. The WOPR is the new game simulation model created this past off-season, designed and calibrated specifically for in-game analytics. It simulates a game from any starting point, play by play, yard by yard, and second by second. Play outcomes are randomly drawn from empirical distributions of actual plays that occurred in similar circumstances.
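To illustrate the general mechanics (not the WOPR's actual code or data), here's a toy play-by-play simulator that draws yardage gains from small, invented empirical samples:

```python
import random

# Toy sketch of how a play-by-play simulator works: outcomes are drawn at
# random from empirical distributions keyed by game state. The yardage
# samples below are tiny made-up lists, not real play data.

play_samples = {
    1: [0, 2, 3, 4, 5, 7, 12, -1, 0, 25],
    2: [0, 1, 3, 4, 6, 8, -2, 0, 15, 3],
    3: [0, 0, 2, 5, 6, 9, -1, 11, 0, 4],
}

def simulate_drive(yardline, rng):
    """Simulate one drive; return True if it reaches the end zone."""
    down, to_go = 1, 10
    while True:
        gain = rng.choice(play_samples[min(down, 3)])
        yardline += gain
        if yardline >= 100:
            return True          # touchdown
        if gain >= to_go:
            down, to_go = 1, 10  # first down, fresh set
        else:
            down, to_go = down + 1, to_go - gain
            if down > 4:
                return False     # turnover on downs

rng = random.Random(0)
td_rate = sum(simulate_drive(75, rng) for _ in range(5000)) / 5000
```

Repeat thousands of times from a given game state and the share of simulated wins is the win probability estimate.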
If you're not familiar with how simulation models work, you're probably wondering, "So what? Dude, I can put my Madden on auto-play and do the same thing. Who cares who wins a dumb make-believe game?"
Analyzing Replay Challenges
Most challenges are now replay assistant challenges--the automatic reviews for all scores and turnovers, plus particular plays inside two minutes of each half. Still, there are plenty of opportunities for coaches to challenge a call each week.
The cost of a challenge is two-fold. First, the coach (probably) loses one of his two challenges for the game. (He can recover one if he wins both challenges in a game.) Second, an unsuccessful challenge results in a charged timeout. The value of the first cost would be very hard to estimate, but thankfully the event that a coach runs out of challenges AND needs to use a third is exceptionally rare. I can't find even a single example since the automatic replay rules went into effect.
So I'm going to set that consideration aside for now. In the future, I may try to put a value on it, particularly if a coach had already used one challenge. But even then it would be very small and would diminish to zero as the game progresses toward its final 2 minutes. In any case, all the coaches challenges from this week were first challenges, and none represented the final team timeout, so we're in safe waters for now.
Every replay situation is unique. We can't statistically quantify the probability that a particular call will be overturned, but we can determine the breakeven probability of success for a challenge to be worthwhile in any situation. If a coach believes the chance of overturning the call is above the breakeven level, he should challenge. Below the breakeven level, he should hold onto his red flag.
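The breakeven itself is just algebra on three WP values. A sketch with hypothetical numbers (the WP model would supply the real inputs):

```python
# Hedged sketch of the breakeven calculation for a replay challenge.
# The WP inputs below are invented for illustration.

def breakeven_challenge_prob(wp_call_stands, wp_overturned, wp_failed):
    """Minimum success probability that makes a challenge worthwhile.

    wp_call_stands : WP if the coach does not challenge
    wp_overturned  : WP if the challenge succeeds
    wp_failed      : WP if it fails (call stands AND a timeout is charged)
    """
    return (wp_call_stands - wp_failed) / (wp_overturned - wp_failed)

# Example: losing a timeout costs 2% WP, an overturn is worth 10% WP.
p_star = breakeven_challenge_prob(0.40, 0.50, 0.38)  # -> 1/6, about 0.17
```

If the coach thinks his chance of winning the challenge beats p_star, he should throw the flag.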
Sneak Peek at WP 2.0
As a quick refresher, the WP model tells us the chance that a team will win a game in progress as a function of the game state: score, time, down, distance, etc. Although it's certainly interesting to have a good idea of how likely your favorite team is to win, the model's usefulness goes far beyond that.
WP is the ultimate measure of utility in football. As Herm once reminded us all, "You play to win the game! Hello!", and WP measures how close or far you are from that single-minded goal. Its elegance lies in its perfectly linear proportions: having a 40% chance of winning is exactly twice as good as having a 20% chance, and an 80% chance is twice as good as 40%. You get the idea.
That feature allows analysts to use the model as a decision support tool. Simply put, any decision can be assessed on the following basis: Do the thing that gives you the best chance of winning. That's hardly controversial. The tough part is figuring out what the relevant chances of winning are for the decision-maker's various options, and that's what the WP model does. Thankfully, once the model is created, only fifth grade arithmetic is required for some very practical applications of interest to team decision-makers and to fans alike.
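That fifth-grade arithmetic looks like this in practice; the probabilities and WP values below are invented purely for illustration:

```python
# Decision support with WP: weight each outcome's win probability by its
# chance of occurring and pick the option with the higher total.
# All numbers here are hypothetical.

def expected_wp(outcomes):
    """outcomes: list of (probability, win_probability) pairs."""
    return sum(p * wp for p, wp in outcomes)

# 4th-and-1 example: go for it vs. punt (made-up inputs)
wp_go   = expected_wp([(0.70, 0.55), (0.30, 0.30)])  # convert / fail
wp_punt = expected_wp([(1.00, 0.44)])

best = "go" if wp_go > wp_punt else "punt"
```

The hard part, as noted above, is producing the WP inputs; once the model supplies them, the comparison is trivial.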
Implications of a 33-Yard XP
Over the past five seasons, attempts from that distance are successful 91.5% of the time. That should put a bit of excitement and drama into XPs, especially late in close games, which is what the NFL wants. But it might also have another effect on the game.
Currently, two-point conversions are successful at just about half that rate, somewhere north of 45%. The actual rate is somewhat nebulous because of how fakes and aborted kick attempts that turn into two-point tries are counted.
It's likely the NFL chose the 15-yard line for a reason. The success rate for kicks from that distance is approximately twice the success rate for a two-point attempt, making the entire extra point process "risk-neutral." In other words, going for two gives teams half the chance at twice the points.
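The arithmetic behind the "risk-neutral" claim, using the rates cited above:

```python
# Expected-points check of the risk-neutral claim: 91.5% on the longer XP,
# a two-point rate somewhere north of 45% (0.475 assumed here).

XP_RATE  = 0.915
TWO_RATE = 0.475

ev_xp  = XP_RATE * 1   # 0.915 expected points per kick
ev_two = TWO_RATE * 2  # 0.950 expected points per two-point try
```

The expected values land within a few hundredths of a point of each other, so neither option dominates on average.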
Win Values for the NFL
In 2013 the combined 32 NFL teams chased 256 regular season wins and spent $3.92 billion on player salary along the way. In simple terms, that would make the value of a win about $15 million. Unfortunately, things aren't so simple. To estimate the true relationship between salary and winning, we need to focus on wins above replacement.
Think of replacement level as the "intercept" term or constant in a regression. As a simple example think of the relationship between Celsius and Fahrenheit. There is a perfectly linear relationship between the two scales. To convert from deg C to deg F, multiply the Celsius temperature by 9/5. That's the slope or coefficient of the relationship. But because the zero point on the Celsius scale is 32 on the Fahrenheit scale, we need to add 32 when converting. That's the intercept. 32 degrees F is like the replacement level temperature.
No matter how teams spend their available salary, they need 53 guys on their roster. At a bare minimum, they need to spend 53 * $min salary just to open the season. We can consider that amount analogous to the 32 degrees of Fahrenheit. For 2013, the minimum salaries ranged from $420k for rookies to $940k for 10-year veterans. To field a purely replacement-level squad, a franchise could enlist nothing but rookies. But to add a bit of realism, let's throw a good number of 1-, 2-, and 3-year veterans into the mix for a weighted-average minimum salary of $500k per year. The league-wide total of potential replacement salary comes to:
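Using the assumptions above (32 teams, 53-man rosters, a $500k weighted-average minimum), the arithmetic works out as follows:

```python
# League-wide replacement-level payroll, per the assumptions in the text.
TEAMS = 32
ROSTER = 53
AVG_MIN_SALARY = 500_000  # weighted-average minimum salary (assumed)

replacement_payroll = TEAMS * ROSTER * AVG_MIN_SALARY  # $848,000,000
```

Subtracting that $848M floor from the $3.92B actually spent is what isolates the salary that buys wins above replacement.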
Using Probabilistic Distributions to Quantify NFL Combine Performance
Jadeveon Clowney is thought of by many as a "once-in-a-decade" or even "once-in-a-generation" pass-rushing talent. Once the top-rated high school talent in the country, Clowney retained that distinction through 3 years in college football's most dominant conference. Super-talents like Clowney have traditionally been gambled on in the NFL draft with little idea of what future production is actually statistically anticipated. For all of the concerns over his work ethic, dedication, and professionalism, Clowney's athleticism and potential have never been called into question. But is his athleticism actually that rare? And is his talent worth gambling millions of dollars and the 1st overall pick on? This article aims to quantify exactly how rare Jadeveon Clowney's athleticism is in a historical sense.
Jadeveon Clowney set the NFL draft world on fire at this year's combine when he delivered one of the most talked-about combine performances in recent memory, primarily driven by his blistering 40-yard dash time of 4.53. Over the years, however, I recall players like Vernon Gholston, Mario Williams, and even Ziggy Ansah displaying mind-boggling athleticism in drills. But if a player displays unprecedented athleticism at the combine every year, who is really impressive enough to be deemed "once-in-a-decade"?
Probability Ranking allows me to identify the probability of encountering an athlete's measurables. For instance, I probability-ranked NFL combine 40-yard dash times for 341 defensive ends from 1999-2014 (Table 1 shows the top 50). In this case, Jadeveon Clowney's 40 time of 4.53 had a probability rank of 99.12, meaning his speed is in the 99th percentile of all DEs over this time span.
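A minimal sketch of the probability-ranking computation, using an invented sample of 40 times rather than the actual 341-player dataset:

```python
# Percentile of a 40-yard-dash time within a population of times.
# Lower times are better, so we count the share of times that are slower.
# The sample list is invented for illustration.

def probability_rank(time, population):
    """Percentile rank of `time`, where faster beats slower."""
    slower = sum(1 for t in population if t > time)
    return 100.0 * slower / len(population)

# Hypothetical DE 40 times; a 4.53 ranks at the top of this small sample.
times = [4.53, 4.60, 4.65, 4.70, 4.75, 4.80, 4.85, 4.90, 4.95, 5.00]
rank = probability_rank(4.53, times)  # -> 90.0 in this 10-player sample
```

With the full 341-player population, the same computation produces the 99.12 figure cited above.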
NFL Prospect Evaluation using Quantile Regression
Casan Scott continues his guest series on evaluating NFL prospects through Principal Component Analysis. By day, Casan is a PhD candidate researching aquatic eco-toxicology at Baylor University.
Extraordinary amounts of data go into evaluating an NFL prospect. The NFL combine, pro days, college statistics, game tape breakdown, and even personality tests can all play a role in predicting a player's future in the NFL. Jadeveon Clowney is arguably the most discussed prospect in the 2014 NFL draft not named Johnny Manziel. He is certainly an elite prospect and potentially the best in this year's draft, but he doesn't appear to be a "once-in-a-decade" physical specimen based exclusively on historical combine performances. From the research I've done, only Mario Williams and J.J. Watt can make such a claim. Super-talents like Clowney have traditionally been gambled on in the NFL draft with little idea of what future production is actually statistically anticipated. All prospects have a "ceiling" and a "floor," which represent the maximum and minimum potential that a prospect could realize, respectively. But what does this "potential" mean, and does it hold any importance for actually predicting a prospect's success in the NFL? In this article I will show how Quantile Regression, a technique used by quantitative ecologists, can clarify what Clowney's proverbial "ceiling" and "floor" may be in the NFL.
Athletes are a collection of numerous measured and unmeasured descriptor variables. Figure 1 shows a single predictor (40-yard dash time) vs. a prospect's career NFL sacks + tackles for loss (TFL) per game.
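For intuition, here's what quantile regression optimizes: the asymmetric "pinball" loss. The sketch below fits only an intercept (equivalent to a sample quantile) on invented data; a real analysis would regress production on combine measurables, e.g. with statsmodels' QuantReg:

```python
import numpy as np

# Quantile regression minimizes the asymmetric "pinball" loss rather than
# squared error. tau=0.9 traces a "ceiling", tau=0.1 a "floor". The data
# below is invented; only the mechanics are the point.

def pinball_loss(y, y_hat, tau):
    resid = y - y_hat
    return np.mean(np.where(resid >= 0, tau * resid, (tau - 1) * resid))

# Fit an intercept-only model for tau=0.9 by brute-force search:
y = np.array([0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 1.5, 2.0, 2.5, 3.0])
candidates = np.linspace(0, 3, 301)
ceiling = candidates[np.argmin([pinball_loss(y, c, 0.9) for c in candidates])]
```

The intercept that minimizes the tau=0.9 pinball loss is the sample's 90th-percentile production, which is exactly the "ceiling" interpretation.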
New Feature on the Draft Model
When I learned a little about object oriented programming, it all made sense. The software engineers were designing the interface for their own convenience, not for ease of use. It made sense from an efficiency standpoint...a programming efficiency standpoint. But from the perspective of the user, it wasn't so efficient. The least used feature was just as accessible as the most common feature, and all of them were hidden until you expanded the right portion of the tree.
Yesterday I realized I was doing the same thing with the draft model. From my point of view, it's easiest to think in terms of players and their probability to be selected at each pick number, because that's how the software that runs the model works. It goes down the list of prospects, player-by-player, looking at the probability he'll be selected pick#-by-pick#.
For the players, their agents, and fans of particular players, this is ideal. They want to know where and when they'll go. But the user is probably thinking of things from a team's perspective. Whether the user is a team personnel guy or a fan of a team, he'd rather see things from the perspective of a pick #. Right now, a Vikings fan (or exec) would have to click through a dozen or so of the top players to see who's likely to be available at pick #8. And if he were wondering who'd be available after a trade up or down, that's another few dozen clicks. Scroll, click. Scroll, click...
Bayesian Draft Analysis Tool
For details on how the model works, please refer to these write-ups:
- A full description of the purpose and capabilities of the model
- A discussion of the theoretical basis of Bayesian inference as applied to draft modeling
- More details on the specific methodology
If you want to jump straight to the results, here they are. But I recommend reading a little further for a brief description of what you'll find.
The interface consists of a list of prospects and two primary charts. Selecting a prospect displays the probabilities of when he'll likely be taken. You can filter the selection list by overall ranking or position.
The top chart plots the probabilities the selected prospect will be taken at each pick #. I think this chart is pretty cool because it illustrates the Bayesian inference process. You can actually see the model 'learn' as it refines its estimates with the addition of each new projection. Where there is a firm consensus among experts, the probability distribution is tall and narrow, indicating high confidence. When there is disagreement, the distribution is low and wide, indicating low confidence.
The lower chart is the bottom line. It's the take-away. It depicts the cumulative probability that the selected prospect will remain available at each pick #. For example, currently there's an 82% chance safety Ha Ha Clinton-Dix is available at the #8 pick but only a 26% chance he's available at #14. A team with an eye on a specific player could use this information in deciding whether to trade up or down, and in understanding how far they'd need to trade.
Hovering your cursor over one of the bars on the chart provides some additional context, including which team has that pick and that team's primary needs (according to nfl.com).
The box in the upper right gives you the player's vitals - school, position, height, weight. The expert projections used as inputs to the model are also listed. Currently those include Kiper (ESPN), McShay (Scouts, Inc.), Pat Kirwan (CBS Sports), Daniel Jeremiah (former team scout, NFL Network), and Bucky Brooks (NFL Network). Experts were selected for their reputation, historical accuracy, and independence--that is, they don't all parrot the same projections. Not every prospect has a projection from each expert.
Link to the tool.
Bayesian Draft Model: More Methodology
The new Bayesian draft model is nearly ready for prime time. Before I launch the full tool publicly, I need to finish describing how it works. Previously, I described its purpose and general approach. And my most recent post described the theoretical underpinnings of Bayesian inference as applied to draft projections. This post will provide more detail on the model's empirical basis.
To review, the purpose of the model is to provide support for decisions. Teams considering trades need the best estimates possible about the likelihood of specific players being available at each pick number. Knowing player availability also plays an important role in deciding which positions to focus on in each round. Plus, it's fun for fans who follow the draft to see which prospects will likely be available to their teams. Hopefully, this tool sits at the intersection of "things helpful to teams" and "things interesting to fans."
Since I went over the math in the previous post, I'll dig right into how the probability distributions that comprise the 'priors' and 'likelihoods' were derived.
I collected three sets of data from the last four drafts--best player rankings, expert draft projections (mock drafts), and actual draft selections. In a nutshell, to produce the prior distribution, I compared how close each player's consensus 'best-player' ranking was to his actual selection. And to produce the likelihood distributions I compared how close each player's actual selection was to the experts' mock projections.
Theoretical Explanation of the Bayesian Draft Model
First, some terminology. P(A) means the "probability of event A," as in the probability it rains in Seattle tomorrow. Event A is 'it rains in Seattle tomorrow'. Likewise, we can define P(B) as the probability that it rains in Seattle today.
P(A|B) means "the probability of event A given event B occurs," as in the probability that it rains in Seattle tomorrow given that it rained there today. This is known as a conditional probability.
The probability it rains in Seattle today and tomorrow can be calculated by P(A|B) * P(B), which should be fairly intuitive. I hope I haven't lost anyone.
It's also intuitive that "raining in Seattle today and tomorrow" is equivalent to "raining in Seattle tomorrow and today." There's no difference at all between those two things, and so there's no difference in their probabilities.
We can write out that equivalence, like this:
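In symbols:

```latex
P(A \mid B)\,P(B) \;=\; P(B \mid A)\,P(A)
\quad\Longrightarrow\quad
P(A \mid B) \;=\; \frac{P(B \mid A)\,P(A)}{P(B)}
```

The rearrangement on the right is Bayes' theorem, and it's what the draft model applies: a prior belief about event A, updated by how likely the observed evidence B would be under each possibility.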
Bayesian Draft Prediction Model
I've created a tool for predicting when players will come off the board. This isn't a simple average of projections. Instead, it's a complete model based on the concept of Bayesian inference. Bayesian models have an uncanny knack for accurate projections when done properly. I won't go into the details of how Bayesian inference works in this post; I'll save that for another article. This post is intended to illustrate the potential of this decision support tool.
Bayesian models begin with a 'prior' probability distribution, used as a reasonable first guess. Then that guess is refined as we add new information. It works the same way your brain does (hopefully). As more information is added, your prior belief is either confirmed or revised to some degree. The degree to which it is refined is a function of how reliable the new information is. This draft projection model works the same way.
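The prior-then-update mechanics can be sketched in a few lines; the distributions here are invented, not the model's actual priors or likelihoods:

```python
# Toy Bayesian update: a discrete prior over pick numbers, refined by one
# expert projection via Bayes' rule. All numbers are made up.

def normalize(d):
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

# Prior: rough first guess of where a prospect goes (picks 1-5)
prior = normalize({1: 1, 2: 2, 3: 4, 4: 2, 5: 1})

# Likelihood of an expert mocking him at pick 2, given each true slot
likelihood = {1: 0.30, 2: 0.40, 3: 0.20, 4: 0.07, 5: 0.03}

# Posterior: prior reweighted by the likelihood, then renormalized
posterior = normalize({k: prior[k] * likelihood[k] for k in prior})
```

Each additional expert projection is folded in the same way, with more reliable experts (sharper likelihoods) moving the estimate more.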
Draft Prospect Evaluation Using Principal Component Analysis
A guest post by W. Casan Scott, Baylor University.
As different as ecology and the NFL sound, they share quite similar problems. The environment is an infinitely complex system with many known and unknown variables. The NFL is a perpetually changing landscape with a revolving door of players and schemes. Predicting an athlete's performance pre-draft is complicated by a number of variables, including combine results, college production, intangibles, and how well that player fits a certain NFL scheme. Perhaps the techniques ecologists use to discern confounding trends in nature are suitable for challenges like the NFL draft. This article aims to introduce an eco-statistical tool, Principal Component Analysis (PCA), and its potential utility to advanced NFL analytics.
My Ph.D. research area is aquatic eco-toxicology, where I primarily model chemical exposure hazards to fish. So essentially, I use the best available data and methods to quantify how much danger a fish may be in, in a given habitat. Chemical exposures occur in infinitely complex mixtures across many different environments, and distinguishing trends from such dynamic situations is difficult.
Prospective draftees are actually similar (in theory) in that they are always a unique combination of their college team, inherent athleticism, history, intangibles, and even the current landscape in the NFL. The myriad of variables present in the environment and the NFL, both static and changing, make it difficult to separate the noise from actual, observable trends.
In environmental science, we sometimes use non-traditional methods to help us visualize what previously could not be observed. Likewise, Advanced NFL Analytics tries to answer questions that traditional methods cannot. The goal of this article is to educate others of the utility of eco-statistical tools, namely Principal Component Analysis (PCA), in assessing NFL draft prospects.
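For the curious, PCA itself is a short computation: standardize the variables, then rotate onto the directions of greatest variance. A bare-bones numpy sketch on randomly generated combine-style data (not real measurements):

```python
import numpy as np

# Minimal PCA via SVD on standardized data, the same operation an
# eco-statistics package performs under the hood. Data is random,
# purely to show the mechanics.

rng = np.random.default_rng(0)
# columns: 40-yd dash, vertical jump, bench reps (fake data, 50 athletes)
X = rng.normal(size=(50, 3))

# Standardize each variable, then extract principal components
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

scores = Z @ Vt.T                # athlete coordinates on the PCs
explained = S**2 / np.sum(S**2)  # variance share per component
```

Plotting the first two columns of `scores` is what collapses many correlated measurables into a 2-D picture where clusters of similar athletes become visible.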
Wondering About the Wonderlic: Does It Predict Quarterback Performance?
Lacrosse Analytics
For those not familiar with lacrosse, imagine hockey played on a football field but, you know, with cleats instead of skates. And instead of a flat puck and flat sticks, there's a round ball, and the sticks have a small netted pocket to carry said ball. And instead of 3 periods, which must be some sort of weird French-Canadian socialist metric system thing, there's an even 4 quarters of play in lacrosse, just like God intended. But pretty much everything else is the same as hockey--face-offs, goaltending, penalties & power plays. Lacrosse players tend to have more teeth, though.
Because players carry the ball in their sticks rather than push it around on ice, possession tends to be more permanent than hockey. Lacrosse belongs to a class of sports I think of as "flow" sports. Soccer, hockey, lacrosse, field hockey, and to some degree basketball qualify. They are characterized by unbroken and continuous play, a ball loosely possessed by one team, and netted goals at either end of the field (or court). There are many variants of the basic team/ball/goal sport--for those of us old enough to remember the Goodwill Games of the 1980s, we have the dystopic sport of motoball burned into our brains. And for those of us (un)fortunate enough to attend the US Naval Academy (or the NY State penitentiary system) there's field ball. The interesting thing about these sports is that they can all be modeled the same way.
So with lacrosse season underway, I thought I'd take a detour from football work and make my contribution to lacrosse analytics. I built a parametric win probability model for lacrosse based on score, time, and possession. Here's how often a team can expect to win based on neutral possession--when there's a loose ball or immediately upon a faceoff following a previous score:
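The parametric form I mean is something like the following logistic sketch; the coefficient is a placeholder, not the fitted model's value:

```python
import math

# Sketch of a parametric lacrosse WP model: win probability as a logistic
# function of lead and time remaining, at neutral possession. The
# coefficient k is invented for illustration.

def lacrosse_wp(lead, minutes_left, k=0.35):
    """P(win) at neutral possession; lead in goals, time in minutes."""
    if minutes_left <= 0:
        return 1.0 if lead > 0 else (0.5 if lead == 0 else 0.0)
    # a given lead matters more as time runs out
    x = k * lead / math.sqrt(minutes_left)
    return 1 / (1 + math.exp(-x))

wp_tied = lacrosse_wp(0, 20)  # 0.5 at neutral possession, tied game
wp_up2  = lacrosse_wp(2, 5)   # comfortably above 0.5
```

A fitted version would estimate the coefficients from game data and add a possession term, shifting the curve up or down from the neutral baseline.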
When Coaches Use Timeouts
The charts need some explanation. They plot how many timeouts a team has left during the second half based on time and score. Each facet represents a score difference. For example the top left plot is for when the team with the ball is down by 21 points. Each facet's horizontal axis represents game minutes remaining, from 30 to 0. The vertical axis is the average number of timeouts left. So as the half expires, teams obviously have fewer timeouts remaining.
The first chart shows the defense's number of timeouts left throughout the second half based on the offense's current lead. I realize that's a little confusing, but I always think of game state from the perspective of the offense. For example, the green facet titled "-7" is for a defense that's leading by 7. Notice that defenses that are ahead naturally use fewer timeouts than those that trail, as indicated by comparison to the "7" facet in blue. (Click to enlarge.)
What I'm Working On
It's been almost 6 years since I introduced the win probability model. It's been useful, to say the least. But it's also been a prisoner of the decisions I made back in 2008, long before I realized just how much it could help analyze the game. Imagine a building that serves its purpose adequately, but came to be as the result of many unplanned additions and modifications. That's essentially the current WP model, an ungainly algorithm with layers upon layers of features added on top of the original fit. It works, but it's more complicated than it needs to be, which makes upkeep a big problem.
Despite last season's improvements, it's long past time for an overhaul. Adding the new overtime rules, team strength adjustments, and coin flip considerations were big steps forward, but ultimately they were just more additions to the house.
The problem is that I'm invested in an architecture that was never planned to be used as a decision analysis tool. It must have been in 2007 when I recall some TV announcer saying that Brian Billick was 500-1 (or whatever) when the Ravens had a lead of 14 points or more. I immediately thought, isn't that due more to Chris McAlister than Brian Billick? And, by the way, what is the chance a team will win given a certain lead and time remaining? When can I relax when my home team is up by 10 points? 13 points? 17 points?
That was the only purpose behind the original model. It didn't need a lot of precision or features. But soon I realized that if it were improved sufficiently, it could be much more. So I added field position. And then I added better statistical smoothing. And then I added down and distance. Then I added more and more features, but they were always modifications and overlays to the underlying model, all the while being tied to decisions I made years ago when I just wanted to satisfy my curiosity.
So I'm creating an all new model. Here's what it will include:
Thomas Bayes Would Approve of Seattle's Defensive Tactics
Last week a WSJ article about the Seahawks' defensive backs claimed that they "obstruct and foul opposing receivers on practically every play." I took a deeper look into the numbers and found that as long as referees are reluctant to throw flags on the defense in pass coverage (as claimed in the article), holding the receiver is a very efficient defensive strategy despite the risk of being penalized.
The following is an analysis using the concepts of expected utility, expected cost, and Bayesian statistics.
The reason defensive holding is an optimal strategy comes down to one word. Economics. The referee's reluctance to call penalties on the defensive secondary is analogous to a market inefficiency. The variance in talent on NFL rosters, coaching staffs, and front offices between the best and worst teams in the league is probably very small. Successful teams win within a small margin. Seattle has found a way to exploit a relaxation in marginal constraints within the way the game is called that their competitors have not, and turned it into a competitive advantage.
If you think about committing a penalty in the same way as committing a crime, the expected utility is essentially the same. The expected utility (EU) of defensive holding is the value of the opponent's lost down due to the incomplete pass, minus the probability of being penalized times the cost of the penalty. In other words, EU is the benefit of an incomplete pass minus the cost of the penalty times the probability of getting caught.
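Put in code, with hypothetical magnitudes for the benefit and penalty cost (only the structure of the tradeoff comes from the argument above):

```python
# Expected utility of defensive holding, per the crime-economics framing.
# The benefit and cost values are invented placeholders.

def eu_holding(benefit_incompletion, p_flag, cost_penalty):
    """EU = benefit of forcing an incompletion minus expected penalty cost."""
    return benefit_incompletion - p_flag * cost_penalty

# If refs rarely throw the flag (say 5% of the time), holding pays even
# when the penalty itself is quite costly:
eu = eu_holding(benefit_incompletion=0.5, p_flag=0.05, cost_penalty=2.0)
```

As long as the flag rate stays low enough that the expected cost is smaller than the benefit, EU is positive and the strategy pays.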