Posts filed under: research
Two-Point Conversion in the KC-DEN game
NFL coaches typically adhere to what's known as the Vermeil Chart for making two-point decisions. The chart was created by Dick Vermeil when he was offensive coordinator for UCLA over 40 years ago. It's a simple chart that looks only at the score difference prior to the conversion attempt and does not consider time remaining, with one caveat: it applies only when the coach expects three or fewer (meaningful) possessions left in the game.
With just over 7 minutes to play, there could be at most three possessions left, especially considering that at least one of those possessions would need to be a KC scoring drive for any of this to matter. (In actuality, there were only two possessions left, one for each team.) Even the tried-and-true Vermeil chart says go for two when trailing by 5. But it's not the 1970s anymore and this isn't college ball, so let's apply the numbers and create a better way of analyzing go-for-two decisions.
With rare exceptions, I've resisted analyzing two-point conversion decisions with the Win Probability model because, as will become apparent, the analysis is particularly susceptible to noise. Now that we've got the new model, noise is extremely low, and I'm confident the model is more than up to the task.
First, let's walk through the possibilities for KC intuitively. If KC fails to score again or DEN gets a TD, none of this matters. Otherwise:
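The branch-by-branch comparison boils down to simple expected-WP arithmetic. A minimal sketch, with made-up win probabilities standing in for the model's actual outputs:

```python
# Illustrative sketch of the go-for-two comparison when trailing by 5 and
# scoring a late TD. The WP numbers below are invented placeholders, NOT
# the model's actual values; only the structure of the comparison matters.

P_2PT = 0.47  # assumed two-point conversion success rate

# Hypothetical WP for KC after the try, keyed by the resulting lead:
wp_by_lead = {1: 0.55, 2: 0.60, 3: 0.75}  # placeholder values

wp_kick = wp_by_lead[2]  # treat the XP as automatic for this sketch
wp_go = P_2PT * wp_by_lead[3] + (1 - P_2PT) * wp_by_lead[1]

decision = "go for two" if wp_go > wp_kick else "kick"
```

With these placeholder inputs, the value of being up 3 (forcing a FG to merely tie) outweighs the risk of being up only 1.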
Nick Foles and Interception Index Regression
With one week of the 2014 season in the books, Foles and McCown have already matched that combined total. While everyone should have expected both to regress from their remarkably turnover-free 2013 seasons, that does not tell us how far each should regress based on historical norms.
Simulating the Saints-Falcons Endgame
I previously examined intentional touchdown scenarios, but only considered situations when the offense was within 3 points. In this case NO needed a TD, which--needless to say--makes a big difference. Yet, because NO was on the 1, perhaps the go-ahead score was so likely that ATL would be better off down 3 with the ball than up 4 backed-up against their goal line.
This is a really, really hard analysis. There's a lot of what-ifs: What if NO scores on 1st down anyway? What if they don't score on 1st but on 2nd down? On 3rd down? On 4th down? Or what if they throw the ball? What if they stop the clock somehow, or commit a penalty? How likely is a turnover on each successive down? You can see that the situation quickly becomes an almost intractable problem without excessive assumptions.
That's where the WOPR comes in. The WOPR is the new game simulation model created this past off-season, designed and calibrated specifically for in-game analytics. It simulates a game from any starting point, play by play, yard by yard, and second by second. Play outcomes are randomly drawn from empirical distributions of actual plays that occurred in similar circumstances.
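To illustrate the general mechanics (not the WOPR's actual code or data), here's a toy play-by-play simulator that draws yardage gains from small, invented empirical samples:

```python
import random

# Toy sketch of how a play-by-play simulator works: outcomes are drawn at
# random from empirical distributions keyed by game state. The yardage
# samples below are tiny made-up lists, not real play data.

play_samples = {
    1: [0, 2, 3, 4, 5, 7, 12, -1, 0, 25],
    2: [0, 1, 3, 4, 6, 8, -2, 0, 15, 3],
    3: [0, 0, 2, 5, 6, 9, -1, 11, 0, 4],
}

def simulate_drive(yardline, rng):
    """Simulate one drive; return True if it reaches the end zone."""
    down, to_go = 1, 10
    while True:
        gain = rng.choice(play_samples[min(down, 3)])
        yardline += gain
        if yardline >= 100:
            return True          # touchdown
        if gain >= to_go:
            down, to_go = 1, 10  # first down, fresh set
        else:
            down, to_go = down + 1, to_go - gain
            if down > 4:
                return False     # turnover on downs

rng = random.Random(0)
td_rate = sum(simulate_drive(75, rng) for _ in range(5000)) / 5000
```

Repeat thousands of times from a given game state and the share of simulated wins is the win probability estimate.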
If you're not familiar with how simulation models work, you're probably wondering, "So what? Dude, I can put my Madden on auto-play and do the same thing. Who cares who wins a dumb make-believe game?"
Analyzing Replay Challenges
Most challenges are now replay assistant challenges--the automatic reviews for all scores and turnovers, plus particular plays inside two minutes of each half. Still, there are plenty of opportunities for coaches to challenge a call each week.
The cost of a challenge is two-fold. First, the coach (probably) loses one of his two challenges for the game. (He can recover one if he wins both challenges in a game.) Second, an unsuccessful challenge results in a charged timeout. The value of the first cost would be very hard to estimate, but thankfully the event that a coach runs out of challenges AND needs to use a third is exceptionally rare. I can't find even a single example since the automatic replay rules went into effect.
So I'm going to set that consideration aside for now. In the future, I may try to put a value on it, particularly if a coach had already used one challenge. But even then it would be very small and would diminish to zero as the game progresses toward its final 2 minutes. In any case, all the coaches challenges from this week were first challenges, and none represented the final team timeout, so we're in safe waters for now.
Every replay situation is unique. We can't statistically quantify the probability that a particular call will be overturned, but we can determine the breakeven probability of success for a challenge to be worthwhile in any situation. If a coach believes the chance of overturning the call is above the breakeven level, he should challenge. Below the breakeven level, he should hold onto his red flag.
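The breakeven itself is just algebra on three WP values. A sketch with hypothetical numbers (the WP model would supply the real inputs):

```python
# Hedged sketch of the breakeven calculation for a replay challenge.
# The WP inputs below are invented for illustration.

def breakeven_challenge_prob(wp_call_stands, wp_overturned, wp_failed):
    """Minimum success probability that makes a challenge worthwhile.

    wp_call_stands : WP if the coach does not challenge
    wp_overturned  : WP if the challenge succeeds
    wp_failed      : WP if it fails (call stands AND a timeout is charged)
    """
    return (wp_call_stands - wp_failed) / (wp_overturned - wp_failed)

# Example: losing a timeout costs 2% WP, an overturn is worth 10% WP.
p_star = breakeven_challenge_prob(0.40, 0.50, 0.38)  # -> 1/6, about 0.17
```

If the coach thinks his chance of winning the challenge beats p_star, he should throw the flag.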
Sneak Peek at WP 2.0
As a quick refresher, the WP model tells us the chance that a team will win a game in progress as a function of the game state: score, time, down, distance, etc. Although it's certainly interesting to have a good idea of how likely your favorite team is to win, the model's usefulness goes far beyond that.
WP is the ultimate measure of utility in football. As Herm once reminded us all, "You play to win the game! Hello!", and WP measures how close or far you are from that single-minded goal. Its elegance lies in its perfectly linear proportions: having a 40% chance of winning is exactly twice as good as having a 20% chance, and an 80% chance is twice as good as 40%. You get the idea.
That feature allows analysts to use the model as a decision support tool. Simply put, any decision can be assessed on the following basis: Do the thing that gives you the best chance of winning. That's hardly controversial. The tough part is figuring out what the relevant chances of winning are for the decision-maker's various options, and that's what the WP model does. Thankfully, once the model is created, only fifth grade arithmetic is required for some very practical applications of interest to team decision-makers and to fans alike.
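That fifth-grade arithmetic looks like this in practice; the probabilities and WP values below are invented purely for illustration:

```python
# Decision support with WP: weight each outcome's win probability by its
# chance of occurring and pick the option with the higher total.
# All numbers here are hypothetical.

def expected_wp(outcomes):
    """outcomes: list of (probability, win_probability) pairs."""
    return sum(p * wp for p, wp in outcomes)

# 4th-and-1 example: go for it vs. punt (made-up inputs)
wp_go   = expected_wp([(0.70, 0.55), (0.30, 0.30)])  # convert / fail
wp_punt = expected_wp([(1.00, 0.44)])

best = "go" if wp_go > wp_punt else "punt"
```

The hard part, as noted above, is producing the WP inputs; once the model supplies them, the comparison is trivial.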
Implications of a 33-Yard XP
Over the past five seasons, attempts from that distance are successful 91.5% of the time. That should put a bit of excitement and drama into XPs, especially late in close games, which is what the NFL wants. But it might also have another effect on the game.
Currently, two-point conversions are successful at just about half that rate, somewhere north of 45%. The actual rate is somewhat nebulous because of how fakes and aborted kick attempts that turn into two-point tries are counted.
It's likely the NFL chose the 15-yard line for a reason. The success rate for kicks from that distance is approximately twice the success rate for a two-point attempt, making the entire extra point process "risk-neutral." In other words, going for two gives teams half the chance at twice the points.
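The arithmetic behind the "risk-neutral" claim, using the rates cited above:

```python
# Expected-points check of the risk-neutral claim: 91.5% on the longer XP,
# a two-point rate somewhere north of 45% (0.475 assumed here).

XP_RATE  = 0.915
TWO_RATE = 0.475

ev_xp  = XP_RATE * 1   # 0.915 expected points per kick
ev_two = TWO_RATE * 2  # 0.950 expected points per two-point try
```

The expected values land within a few hundredths of a point of each other, so neither option dominates on average.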
Win Values for the NFL
In 2013 the combined 32 NFL teams chased 256 regular season wins and spent $3.92 billion on player salary along the way. In simple terms, that would make the value of a win about $15 million. Unfortunately, things aren't so simple. To estimate the true relationship between salary and winning, we need to focus on wins above replacement.
Think of replacement level as the "intercept" term or constant in a regression. As a simple example think of the relationship between Celsius and Fahrenheit. There is a perfectly linear relationship between the two scales. To convert from deg C to deg F, multiply the Celsius temperature by 9/5. That's the slope or coefficient of the relationship. But because the zero point on the Celsius scale is 32 on the Fahrenheit scale, we need to add 32 when converting. That's the intercept. 32 degrees F is like the replacement level temperature.
No matter how teams spend their available salary, they need 53 guys on their roster. At a bare minimum, they need to spend 53 * $min salary just to open the season. We can consider that amount analogous to the 32 degrees of Fahrenheit. For 2013, the minimum salaries ranged from $420k for rookies to $940k for 10-year veterans. To field a purely replacement-level squad, a franchise could enlist nothing but rookies. But to add a bit of realism, let's throw a good number of 1-, 2-, and 3-year veterans into the mix for a weighted-average minimum salary of $500k per year. The league-wide total of potential replacement salary comes to:
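Using the assumptions above (32 teams, 53-man rosters, a $500k weighted-average minimum), the arithmetic works out as follows:

```python
# League-wide replacement-level payroll, per the assumptions in the text.
TEAMS = 32
ROSTER = 53
AVG_MIN_SALARY = 500_000  # weighted-average minimum salary (assumed)

replacement_payroll = TEAMS * ROSTER * AVG_MIN_SALARY  # $848,000,000
```

Subtracting that $848M floor from the $3.92B actually spent is what isolates the salary that buys wins above replacement.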
Using Probabilistic Distributions to Quantify NFL Combine Performance
Jadeveon Clowney is thought of by many as a "once-in-a-decade" or even "once-in-a-generation" pass-rushing talent. Once the top-rated high school talent in the country, Clowney retained that distinction through 3 years in college football's most dominant conference. Super-talents like Clowney have traditionally been gambled on in the NFL draft with little idea of what future production is actually statistically anticipated. For all of the concerns over his work ethic, dedication, and professionalism, Clowney's athleticism and potential have never been called into question. But is his athleticism actually that rare? And is his talent worth gambling millions of dollars and the 1st overall pick on? This article aims to quantify exactly how rare Jadeveon Clowney's athleticism is in a historical sense.
Jadeveon Clowney set the NFL draft world on fire at this year's combine when he delivered one of the most talked-about combine performances in recent memory, primarily driven by his blistering 40-yard dash time of 4.53. Over the years, however, I recall players like Vernon Gholston, Mario Williams, and even Ziggy Ansah displaying mind-boggling athleticism in drills. But if a player displays unprecedented athleticism at the combine every year, who is really impressive enough to be deemed "once-in-a-decade"?
Probability Ranking allows me to identify the probability of encountering an athlete's measurables. For instance, I probability-ranked NFL combine 40-yard dash times for 341 defensive ends from 1999-2014 (Table 1 shows the top 50). In this case, Jadeveon Clowney's 40 time of 4.53 had a probability rank of 99.12, meaning his speed is in the 99th percentile of all DEs over this time span.
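A minimal sketch of the probability-ranking computation, using an invented sample of 40 times rather than the actual 341-player dataset:

```python
# Percentile of a 40-yard-dash time within a population of times.
# Lower times are better, so we count the share of times that are slower.
# The sample list is invented for illustration.

def probability_rank(time, population):
    """Percentile rank of `time`, where faster beats slower."""
    slower = sum(1 for t in population if t > time)
    return 100.0 * slower / len(population)

# Hypothetical DE 40 times; a 4.53 ranks at the top of this small sample.
times = [4.53, 4.60, 4.65, 4.70, 4.75, 4.80, 4.85, 4.90, 4.95, 5.00]
rank = probability_rank(4.53, times)  # -> 90.0 in this 10-player sample
```

With the full 341-player population, the same computation produces the 99.12 figure cited above.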
NFL Prospect Evaluation using Quantile Regression
Casan Scott continues his guest series on evaluating NFL prospects through Principal Component Analysis. By day, Casan is a PhD candidate researching aquatic eco-toxicology at Baylor University.
Extraordinary amounts of data go into evaluating an NFL prospect. The NFL combine, pro days, college statistics, game tape breakdown, and even personality tests can all play a role in predicting a player's future in the NFL. Jadeveon Clowney is arguably the most discussed prospect in the 2014 NFL draft not named Johnny Manziel. He is certainly an elite prospect and potentially the best in this year's draft, but he doesn't appear to be a "once-in-a-decade" physical specimen based exclusively on historical combine performances. From the research I've done, only Mario Williams and J.J. Watt can make such a claim. Super-talents like Clowney have traditionally been gambled on in the NFL draft with little idea of what future production is actually statistically anticipated. All prospects have a "ceiling" and a "floor," which represent the maximum and minimum potential that a prospect could realize, respectively. But what does this "potential" mean, and does it hold any importance for actually predicting a prospect's success in the NFL? In this article I will show how Quantile Regression, a technique used by quantitative ecologists, can clarify what Clowney's proverbial "ceiling" and "floor" may be in the NFL.
Athletes are a collection of numerous measured and unmeasured descriptor variables. Figure 1 shows a single predictor (40-yard dash time) vs. a prospect's career NFL sacks + tackles for loss (TFL) per game.
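For intuition, here's what quantile regression optimizes: the asymmetric "pinball" loss. The sketch below fits only an intercept (equivalent to a sample quantile) on invented data; a real analysis would regress production on combine measurables, e.g. with statsmodels' QuantReg:

```python
import numpy as np

# Quantile regression minimizes the asymmetric "pinball" loss rather than
# squared error. tau=0.9 traces a "ceiling", tau=0.1 a "floor". The data
# below is invented; only the mechanics are the point.

def pinball_loss(y, y_hat, tau):
    resid = y - y_hat
    return np.mean(np.where(resid >= 0, tau * resid, (tau - 1) * resid))

# Fit an intercept-only model for tau=0.9 by brute-force search:
y = np.array([0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 1.5, 2.0, 2.5, 3.0])
candidates = np.linspace(0, 3, 301)
ceiling = candidates[np.argmin([pinball_loss(y, c, 0.9) for c in candidates])]
```

The intercept that minimizes the tau=0.9 pinball loss is the sample's 90th-percentile production, which is exactly the "ceiling" interpretation.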
New Feature on the Draft Model
When I learned a little about object oriented programming, it all made sense. The software engineers were designing the interface for their own convenience, not for ease of use. It made sense from an efficiency standpoint...a programming efficiency standpoint. But from the perspective of the user, it wasn't so efficient. The least used feature was just as accessible as the most common feature, and all of them were hidden until you expanded the right portion of the tree.
Yesterday I realized I was doing the same thing with the draft model. From my point of view, it's easiest to think in terms of players and their probability to be selected at each pick number, because that's how the software that runs the model works. It goes down the list of prospects, player-by-player, looking at the probability he'll be selected pick#-by-pick#.
For the players, their agents, and fans of particular players, this is ideal. They want to know where and when they'll go. But the user is probably thinking of things from a team's perspective. Whether the user is a team personnel guy or a fan of a team, he'd rather see things from the perspective of a pick #. Right now, a Vikings fan (or exec) would have to click through a dozen or so of the top players to see who's likely to be available at pick #8. And if he were wondering who'd be available after a trade up or down, that's another few dozen clicks. Scroll, click. Scroll, click...
Bayesian Draft Analysis Tool
For details on how the model works, please refer to these write-ups:
- A full description of the purpose and capabilities of the model
- A discussion of the theoretical basis of Bayesian inference as applied to draft modeling
- More details on the specific methodology
If you want to jump straight to the results, here they are. But I recommend reading a little further for a brief description of what you'll find.
The interface consists of a list of prospects and two primary charts. Selecting a prospect displays the probabilities of when he'll likely be taken. You can filter the selection list by overall ranking or position.
The top chart plots the probabilities the selected prospect will be taken at each pick #. I think this chart is pretty cool because it illustrates the Bayesian inference process. You can actually see the model 'learn' as it refines its estimates with the addition of each new projection. Where there is a firm consensus among experts, the probability distribution is tall and narrow, indicating high confidence. When there is disagreement, the distribution is low and wide, indicating low confidence.
The lower chart is the bottom line. It's the take-away. It depicts the cumulative probability that the selected prospect will remain available at each pick #. For example, currently there's an 82% chance safety Ha Ha Clinton-Dix is available at the #8 pick but only a 26% chance he's available at #14. A team with an eye on a specific player could use this information in deciding whether to trade up or down, and in understanding how far they'd need to trade.
Hovering your cursor over one of the bars on the chart provides some additional context, including which team has that pick and that team's primary needs (according to nfl.com).
The box in the upper right gives you the player's vitals - school, position, height, weight. The expert projections used as inputs to the model are also listed. Currently those include Kiper (ESPN), McShay (Scouts, Inc.), Pat Kirwan (CBS Sports), Daniel Jeremiah (former team scout, NFL Network), and Bucky Brooks (NFL Network). Experts were selected for their reputation, historical accuracy, and independence--that is, they don't all parrot the same projections. Not every prospect has a projection from each expert.
Link to the tool.
Bayesian Draft Model: More Methodology
The new Bayesian draft model is nearly ready for prime time. Before I launch the full tool publicly, I need to finish describing how it works. Previously, I described its purpose and general approach. And my most recent post described the theoretical underpinnings of Bayesian inference as applied to draft projections. This post will provide more detail on the model's empirical basis.
To review, the purpose of the model is to provide support for decisions. Teams considering trades need the best estimates possible about the likelihood of specific players being available at each pick number. Knowing player availability also plays an important role in deciding which positions to focus on in each round. Plus, it's fun for fans who follow the draft to see which prospects will likely be available to their teams. Hopefully, this tool sits at the intersection of "things helpful to teams" and "things interesting to fans."
Since I went over the math in the previous post, I'll dig right into how the probability distributions that comprise the 'priors' and 'likelihoods' were derived.
I collected three sets of data from the last four drafts--best player rankings, expert draft projections (mock drafts), and actual draft selections. In a nutshell, to produce the prior distribution, I compared how close each player's consensus 'best-player' ranking was to his actual selection. And to produce the likelihood distributions I compared how close each player's actual selection was to the experts' mock projections.
Theoretical Explanation of the Bayesian Draft Model
First, some terminology. P(A) means the "probability of event A," as in the probability it rains in Seattle tomorrow. Event A is 'it rains in Seattle tomorrow'. Likewise, we can define P(B) as the probability that it rains in Seattle today.
P(A|B) means "the probability of event A given event B occurs," as in the probability that it rains in Seattle tomorrow given that it rained there today. This is known as a conditional probability.
The probability it rains in Seattle today and tomorrow can be calculated by P(A|B) * P(B), which should be fairly intuitive. I hope I haven't lost anyone.
It's also intuitive that "raining in Seattle today and tomorrow" is equivalent to "raining in Seattle tomorrow and today." There's no difference at all between those two things, and so there's no difference in their probabilities.
We can write out that equivalence, like this:
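In symbols:

```latex
P(A \mid B)\,P(B) \;=\; P(B \mid A)\,P(A)
\quad\Longrightarrow\quad
P(A \mid B) \;=\; \frac{P(B \mid A)\,P(A)}{P(B)}
```

The rearrangement on the right is Bayes' theorem, and it's what the draft model applies: a prior belief about event A, updated by how likely the observed evidence B would be under each possibility.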
Bayesian Draft Prediction Model
I've created a tool for predicting when players will come off the board. This isn't a simple average of projections. Instead, it's a complete model based on the concept of Bayesian inference. Bayesian models have an uncanny knack for accurate projections when done properly. I won't go into the details of how Bayesian inference works in this post; I'll save that for another article. This post is intended to illustrate the potential of this decision support tool.
Bayesian models begin with a 'prior' probability distribution, used as a reasonable first guess. Then that guess is refined as we add new information. It works the same way your brain does (hopefully). As more information is added, your prior belief is either confirmed or revised to some degree. The degree to which it is refined is a function of how reliable the new information is. This draft projection model works the same way.
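The prior-then-update mechanics can be sketched in a few lines; the distributions here are invented, not the model's actual priors or likelihoods:

```python
# Toy Bayesian update: a discrete prior over pick numbers, refined by one
# expert projection via Bayes' rule. All numbers are made up.

def normalize(d):
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

# Prior: rough first guess of where a prospect goes (picks 1-5)
prior = normalize({1: 1, 2: 2, 3: 4, 4: 2, 5: 1})

# Likelihood of an expert mocking him at pick 2, given each true slot
likelihood = {1: 0.30, 2: 0.40, 3: 0.20, 4: 0.07, 5: 0.03}

# Posterior: prior reweighted by the likelihood, then renormalized
posterior = normalize({k: prior[k] * likelihood[k] for k in prior})
```

Each additional expert projection is folded in the same way, with more reliable experts (sharper likelihoods) moving the estimate more.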
Draft Prospect Evaluation Using Principal Component Analysis
A guest post by W. Casan Scott, Baylor University.
As different as ecology and the NFL sound, they share quite similar problems. The environment is an infinitely complex system with many known and unknown variables. The NFL is a perpetually changing landscape with a revolving door of players and schemes. Predicting an athlete's performance pre-draft is complicated by a number of variables, including combine results, college production, intangibles, and how well that player fits a certain NFL scheme. Perhaps the techniques ecologists use to discern confounding trends in nature are suitable for challenges like the NFL draft. This article aims to introduce an eco-statistical tool, Principal Component Analysis (PCA), and its potential utility to advanced NFL analytics.
My Ph.D. research area is aquatic eco-toxicology, where I primarily model chemical exposure hazards to fish. So essentially, I use the best available data and methods to quantify how much danger a fish may be in, in a given habitat. Chemical exposures occur in infinitely complex mixtures across many different environments, and distinguishing trends from such dynamic situations is difficult.
Prospective draftees are actually similar (in theory) in that they are always a unique combination of their college team, inherent athleticism, history, intangibles, and even the current landscape in the NFL. The myriad of variables present in the environment and the NFL, both static and changing, make it difficult to separate the noise from actual, observable trends.
In environmental science, we sometimes use non-traditional methods to help us visualize what previously could not be observed. Likewise, Advanced NFL Analytics tries to answer questions that traditional methods cannot. The goal of this article is to educate others of the utility of eco-statistical tools, namely Principal Component Analysis (PCA), in assessing NFL draft prospects.
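For the curious, PCA itself is a short computation: standardize the variables, then rotate onto the directions of greatest variance. A bare-bones numpy sketch on randomly generated combine-style data (not real measurements):

```python
import numpy as np

# Minimal PCA via SVD on standardized data, the same operation an
# eco-statistics package performs under the hood. Data is random,
# purely to show the mechanics.

rng = np.random.default_rng(0)
# columns: 40-yd dash, vertical jump, bench reps (fake data, 50 athletes)
X = rng.normal(size=(50, 3))

# Standardize each variable, then extract principal components
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

scores = Z @ Vt.T                # athlete coordinates on the PCs
explained = S**2 / np.sum(S**2)  # variance share per component
```

Plotting the first two columns of `scores` is what collapses many correlated measurables into a 2-D picture where clusters of similar athletes become visible.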
Wondering About the Wonderlic: Does It Predict Quarterback Performance?
Lacrosse Analytics
For those not familiar with lacrosse, imagine hockey played on a football field but, you know, with cleats instead of skates. And instead of a flat puck and flat sticks, there's a round ball, and the sticks have a small netted pocket to carry said ball. And instead of 3 periods, which must be some sort of weird French-Canadian socialist metric system thing, there's an even 4 quarters of play in lacrosse, just like God intended. But pretty much everything else is the same as hockey--face-offs, goaltending, penalties & power plays. Lacrosse players tend to have more teeth, though.
Because players carry the ball in their sticks rather than push it around on ice, possession tends to be more permanent than hockey. Lacrosse belongs to a class of sports I think of as "flow" sports. Soccer, hockey, lacrosse, field hockey, and to some degree basketball qualify. They are characterized by unbroken and continuous play, a ball loosely possessed by one team, and netted goals at either end of the field (or court). There are many variants of the basic team/ball/goal sport--for those of us old enough to remember the Goodwill Games of the 1980s, we have the dystopic sport of motoball burned into our brains. And for those of us (un)fortunate enough to attend the US Naval Academy (or the NY State penitentiary system) there's field ball. The interesting thing about these sports is that they can all be modeled the same way.
So with lacrosse season underway, I thought I'd take a detour from football work and make my contribution to lacrosse analytics. I built a parametric win probability model for lacrosse based on score, time, and possession. Here's how often a team can expect to win based on neutral possession--when there's a loose ball or immediately upon a faceoff following a previous score:
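The parametric form I mean is something like the following logistic sketch; the coefficient is a placeholder, not the fitted model's value:

```python
import math

# Sketch of a parametric lacrosse WP model: win probability as a logistic
# function of lead and time remaining, at neutral possession. The
# coefficient k is invented for illustration.

def lacrosse_wp(lead, minutes_left, k=0.35):
    """P(win) at neutral possession; lead in goals, time in minutes."""
    if minutes_left <= 0:
        return 1.0 if lead > 0 else (0.5 if lead == 0 else 0.0)
    # a given lead matters more as time runs out
    x = k * lead / math.sqrt(minutes_left)
    return 1 / (1 + math.exp(-x))

wp_tied = lacrosse_wp(0, 20)  # 0.5 at neutral possession, tied game
wp_up2  = lacrosse_wp(2, 5)   # comfortably above 0.5
```

A fitted version would estimate the coefficients from game data and add a possession term, shifting the curve up or down from the neutral baseline.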
When Coaches Use Timeouts
The charts need some explanation. They plot how many timeouts a team has left during the second half based on time and score. Each facet represents a score difference. For example the top left plot is for when the team with the ball is down by 21 points. Each facet's horizontal axis represents game minutes remaining, from 30 to 0. The vertical axis is the average number of timeouts left. So as the half expires, teams obviously have fewer timeouts remaining.
The first chart shows the defense's number of timeouts left throughout the second half based on the offense's current lead. I realize that's a little confusing, but I always think of game state from the perspective of the offense. For example, the green facet titled "-7" is for a defense that's leading by 7. Notice that defenses that are ahead naturally use fewer timeouts than those that trail, as indicated by comparison to the "7" facet in blue. (Click to enlarge.)
What I'm Working On
It's been almost 6 years since I introduced the win probability model. It's been useful, to say the least. But it's also been a prisoner of the decisions I made back in 2008, long before I realized just how much it could help analyze the game. Imagine a building that serves its purpose adequately, but came to be as the result of many unplanned additions and modifications. That's essentially the current WP model, an ungainly algorithm with layers upon layers of features added on top of the original fit. It works, but it's more complicated than it needs to be, which makes upkeep a big problem.
Despite last season's improvements, it's long past time for an overhaul. Adding the new overtime rules, team strength adjustments, and coin flip considerations were big steps forward, but ultimately they were just more additions to the house.
The problem is that I'm invested in an architecture that was never planned to be used as a decision analysis tool. It must have been in 2007 when I recall some TV announcer saying that Brian Billick was 500-1 (or whatever) when the Ravens had a lead of 14 points or more. I immediately thought, isn't that due more to Chris McAlister than Brian Billick? And, by the way, what is the chance a team will win given a certain lead and time remaining? When can I relax when my home team is up by 10 points? 13 points? 17 points?
That was the only purpose behind the original model. It didn't need a lot of precision or features. But soon I realized that if it were improved sufficiently, it could be much more. So I added field position. And then I added better statistical smoothing. And then I added down and distance. Then I added more and more features, but they were always modifications and overlays to the underlying model, all the while being tied to decisions I made years ago when I just wanted to satisfy my curiosity.
So I'm creating an all new model. Here's what it will include:
Thomas Bayes Would Approve of Seattle's Defensive Tactics
Last week a WSJ article about the Seahawks' defensive backs claimed that they "obstruct and foul opposing receivers on practically every play." I took a deeper look into the numbers and found that as long as referees are reluctant to throw flags on the defense in pass coverage (as claimed in the article), holding the receiver is a very efficient defensive strategy despite the risk of being penalized.
The following is an analysis using the concepts of expected utility, expected cost, and Bayesian statistics.
The reason defensive holding is an optimal strategy comes down to one word. Economics. The referee's reluctance to call penalties on the defensive secondary is analogous to a market inefficiency. The variance in talent on NFL rosters, coaching staffs, and front offices between the best and worst teams in the league is probably very small. Successful teams win within a small margin. Seattle has found a way to exploit a relaxation in marginal constraints within the way the game is called that their competitors have not, and turned it into a competitive advantage.
If you think about committing a penalty in the same way as committing a crime, the expected utility is essentially the same. The expected utility (EU) of defensive holding is the value of the opponent's lost down due to the incomplete pass, minus the probability of being penalized times the cost of the penalty. In other words, EU is the benefit of an incomplete pass minus the cost of the penalty times the probability of getting caught.
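Put in code, with hypothetical magnitudes for the benefit and penalty cost (only the structure of the tradeoff comes from the argument above):

```python
# Expected utility of defensive holding, per the crime-economics framing.
# The benefit and cost values are invented placeholders.

def eu_holding(benefit_incompletion, p_flag, cost_penalty):
    """EU = benefit of forcing an incompletion minus expected penalty cost."""
    return benefit_incompletion - p_flag * cost_penalty

# If refs rarely throw the flag (say 5% of the time), holding pays even
# when the penalty itself is quite costly:
eu = eu_holding(benefit_incompletion=0.5, p_flag=0.05, cost_penalty=2.0)
```

As long as the flag rate stays low enough that the expected cost is smaller than the benefit, EU is positive and the strategy pays.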