Win Probability Model/Calculator Upgrades - Team Strength Adjustment & More

I just implemented several new features and significant upgrades to the Win Probability Calculator tool as well as the model behind it.

1. The biggest new feature is the capability to adjust the WP estimates based on relative team strength. This is accomplished by entering a pregame WP estimate (from the efficiency model or another source) or the game's point spread. The model has had this ability for a long time, but I didn't want to implement it until I had a sound way of doing so.

As the game goes on, the pregame WP estimate is revised using the baseline in-game WP estimate. The two probabilities are reconciled using the logit method. The trick is to understand how the pregame difference in team strength decays over the course of the game. At a certain point, it doesn't matter how much a team was favored if it's trailing by two or more scores late in the game. Team strength differential decays proportionally to the log of time as the game progresses according to a particular curve.
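To make the idea concrete, here is a minimal Python sketch of reconciling two probabilities on the logit (log-odds) scale with a pregame edge that fades over the game. It is an illustration under stated assumptions, not the calculator's actual code: the function names, the additive log-odds form, and the particular log-shaped weight in prior_weight are all hypothetical.

```python
import math

EPS = 1e-9  # keeps probabilities away from exactly 0 or 1 before taking logs


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prior_weight(seconds_remaining: float, game_seconds: float = 3600.0) -> float:
    """Hypothetical log-shaped decay: 1.0 at kickoff, 0.0 when time expires."""
    frac = min(max(seconds_remaining / game_seconds, 0.0), 1.0)
    return math.log1p(frac * (math.e - 1.0))


def adjusted_wp(baseline_wp: float, pregame_wp: float, seconds_remaining: float) -> float:
    """Add the (decayed) pregame log-odds to the baseline in-game log-odds."""
    w = prior_weight(seconds_remaining)
    return inv_logit(logit(baseline_wp) + w * logit(pregame_wp))
```

With this form, an even game state at kickoff (baseline 0.5) returns the pregame estimate unchanged, and with no time remaining the result is just the baseline WP.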

Pregame WP or spreads should be entered with respect to the current offense. For example, if the game's spread was -3 but the visitor has the ball in the scenario you are investigating, enter 3 for the spread. In what is a cool enough feature all by itself, the calculator automatically converts an entered spread into a pregame WP estimate.
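For readers who want to try a spread-to-WP conversion themselves, the sketch below uses a common approximation: treat the final margin as normally distributed around the spread. This is an assumption for illustration, not necessarily the conversion the calculator uses. The 13.5-point standard deviation, the function names, and the premise that the published spread is quoted for the home team are all hypothetical.

```python
import math


def spread_for_offense(home_spread: float, offense_is_home: bool) -> float:
    """Flip the sign of a home-team spread when the visitor has the ball."""
    return home_spread if offense_is_home else -home_spread


def spread_to_pregame_wp(spread: float, sigma: float = 13.5) -> float:
    """Normal approximation: P(win) = Phi(-spread / sigma).

    A negative spread means the team is favored. sigma is an assumed
    standard deviation of the final margin around the spread.
    """
    z = -spread / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Home team favored by 3, visitor has the ball: enter +3 for the offense.
offense_spread = spread_for_offense(-3.0, offense_is_home=False)
visitor_pregame_wp = spread_to_pregame_wp(offense_spread)
```

A pick'em game (spread of 0) maps to 0.5, and the two teams' estimates always sum to 1.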

For the record, the WPA stats for teams and players will continue to use the baseline unadjusted WP numbers. If we used the adjusted WP numbers, every team and player would have a zero WPA, assuming our pregame estimates were accurate. Put simply, using the adjusted WP numbers for WPA stats would defeat their very purpose and only be a measure of how good our pregame forecasts were.

2. The next most significant update is the ability to account for receiving the kickoff in the 2nd half. This can produce a swing of up to 0.04 WP in the first half of close games. The input asks whether the team with possession kicked off to start the first half. This consideration doesn't apply following the 2nd half kickoff, so for 2nd half scenarios you can just leave the input at the default, Don't Know.


3. The model has always been aware of game states in which it was likely that a 2-point conversion would be required. But in a quirk of my algorithm, it suddenly forgot when a team was in or very close to goal-to-go situations. In other words, it gave erroneously low estimates for a team trailing by 8 near the goal line by assuming they could only score 7 points. To be clear, it was always possible to use the model and do the math manually to get accurate WP estimates for 8-point deficits. But now, the model automatically accounts for late-game situations in which a 2-point conversion is likely for all field positions.
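As an illustration of the kind of manual math involved, the Python sketch below weights the outcomes of a touchdown drive when trailing by 8. It is a simplified, hypothetical example: the function name and every input value are made up for illustration, and it ignores time run off the clock.

```python
def wp_trailing_by_8(p_td: float, p_two_point: float,
                     wp_if_tied: float, wp_if_down_2: float,
                     wp_if_no_td: float) -> float:
    """Expected WP for a team trailing by 8 that needs a TD plus a 2-point try.

    p_td          chance the drive ends in a touchdown
    p_two_point   chance the 2-point conversion succeeds
    wp_if_tied    WP after TD and a successful conversion (game tied)
    wp_if_down_2  WP after TD and a failed conversion (trailing by 2)
    wp_if_no_td   WP if the drive does not score a touchdown
    """
    wp_after_td = p_two_point * wp_if_tied + (1.0 - p_two_point) * wp_if_down_2
    return p_td * wp_after_td + (1.0 - p_td) * wp_if_no_td


# Illustrative inputs only, not values from the model.
example_wp = wp_trailing_by_8(p_td=0.4, p_two_point=0.5,
                              wp_if_tied=0.5, wp_if_down_2=0.1,
                              wp_if_no_td=0.0)
```

The point of the structure is that a touchdown alone does not tie the game: the conversion attempt splits the post-touchdown state into two branches with very different WP.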

4. The model had a similar difficulty on 4th downs late in close games. It's hard enough calculating the probabilities of football games without considering the inconsistent criteria for when coaches will go for it. As we all know, a team's prospects can vary significantly based on whether they go for it or not. When I originally programmed the algorithm that runs the model, for simplicity I assumed FGs for all 4th downs inside the 36 and punts on all other 4th downs. And I left it that way for a reason. What seemed like a bug to everyone else was a feature to me as an analyst. This naive approach provided a baseline for calculating the "WP available" and "WP forfeited" in all situations.

This was one reason I created the Fourth Down Calculator. In many situations it's impossible to know what a coach would do, so it's best to know the WP of each plausible option. But I need to pick one option to display on the game graph.

Unfortunately, my original approach caused the game graphs and related player stats to drop to a near-zero WP for a team with a 4th down trailing by between 3 and 8 very late in a game. The unfortunate players who participated in the unsuccessful 3rd down were brutalized by the WPA numbers, and players who converted the 4th down were rewarded with career-making WPA. Now, the model looks ahead to measure the consequence of punting or kicking late in the game. If the resulting WP is very low and the WP for the conversion attempt is higher than either punting or kicking, it assumes the team will go for it. The Calculate WP function does this by calling itself recursively to induce the probabilities earlier in a drive. In other words, the effect of being in "4-down territory" or having a "4-down mindset" cascades backward earlier in a series or drive. For example, a WP on a 2nd down early in a "4-down mindset" drive will be adjusted to reflect the enhanced WP for eventually having the option to go for it on 4th down.
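The look-ahead and the backward cascade can be sketched with a toy recursion in Python. This is a simplified illustration, not the Calculate WP function itself: the ToySituation fields, the 0.05 threshold, and the single per-play conversion probability are all hypothetical, and real drives have far more states.

```python
from dataclasses import dataclass

LOW_WP_THRESHOLD = 0.05  # hypothetical cutoff for a "very low" WP


@dataclass(frozen=True)
class ToySituation:
    p_convert: float       # chance any one play gains the first down
    wp_converted: float    # WP once the first down is gained
    wp_failed_4th: float   # WP after failing on 4th down
    wp_kick: float         # WP if the team attempts a FG on 4th down
    wp_punt: float         # WP if the team punts on 4th down
    inside_fg_range: bool  # baseline rule: FG inside the 36, punt otherwise


def fourth_down_wp(s: ToySituation, look_ahead: bool) -> float:
    """WP on 4th down under the baseline rule or the look-ahead rule."""
    conventional = s.wp_kick if s.inside_fg_range else s.wp_punt
    if not look_ahead:
        return conventional
    go = s.p_convert * s.wp_converted + (1.0 - s.p_convert) * s.wp_failed_4th
    # Assume the team goes for it only when kicking or punting is hopeless
    # and the conversion attempt beats both alternatives.
    if conventional < LOW_WP_THRESHOLD and go > max(s.wp_kick, s.wp_punt):
        return go
    return conventional


def drive_wp(down: int, s: ToySituation, look_ahead: bool = True) -> float:
    """Recursive WP: convert now, or fall through to the next down."""
    if down == 4:
        return fourth_down_wp(s, look_ahead)
    return (s.p_convert * s.wp_converted
            + (1.0 - s.p_convert) * drive_wp(down + 1, s, look_ahead))


late_and_trailing = ToySituation(p_convert=0.4, wp_converted=0.30,
                                 wp_failed_4th=0.0, wp_kick=0.01,
                                 wp_punt=0.01, inside_fg_range=False)
old_second_down = drive_wp(2, late_and_trailing, look_ahead=False)
new_second_down = drive_wp(2, late_and_trailing, look_ahead=True)
```

In this toy case the 2nd down WP is higher with look-ahead than without, because the value of the eventual 4th down attempt flows back through the recursion to earlier downs.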

I still encourage the use of the Fourth Down Calculator. The change to the WP model to incorporate 4th down logic is a special case. The change can create a discontinuity in the resulting WP estimates. As a team transitions from a state in which it still has a small chance of winning by kicking a FG to a state in which that chance falls below the threshold, its WP can jump instantly.

I could code in some fuzzy logic and calculate WP for a dispersion of probability distributions for various options, but that's not the purpose of the tool. If we were using it for gambling, that would be the thing to do. But this is meant to be a decision support and analysis tool, not simply a forecasting model.

5. Lastly, the Fourth Down Calculator was revised to update legacy Expected Point values for a successful FG and TD. Values from the old kickoff rules were hard-coded into the calculator script, and have now been updated to fully reflect kickoffs from the 35-yard line. The effect of the old values was that a punt appeared to be a very slightly better option than going for it or attempting a FG. This only applies to the EP calculations on the calculator. WP was unaffected.


10 Responses to “Win Probability Model/Calculator Upgrades - Team Strength Adjustment & More”

  1. Chase Stuart says:

    Awesome news, Brian. Those are all great updates.

    And I applaud you for being open and honest with the flaws of the old system (not that we'd expect anything differently from you).

  2. Anonymous says:

    Thank you so much! I've been waiting for this kind of upgrade!

  3. Anonymous says:

    Does the new WP model take timeouts remaining into account? For a team trailing late in a game, each timeout remaining can act as an additional ~33 seconds of playing time (40 second play clock - 7 second play duration). For example, a team trailing and on defense facing 3rd down with 1:00 to play and no time outs, is in a similar game state to that same team with 1 time out remaining and ~27 seconds left to play.

  4. James says:

    Great news! I'm glad to know these improvements have been made!

    Re: "The unfortunate players who participated in the unsuccessful 3rd down were brutalized by the WPA numbers, and players who converted the 4th down were rewarded with career-making WPA."

    Does this mean you'll retroactively change the WPA for players from previous seasons?

  5. Brian Burke says:

    James-eventually. But then there's a similar problem when a team may transition from a conventional mindset into what the model considers a 4-down mindset. Whoever participates in the intervening play would be unfairly rewarded.

    Anon-Not yet. That's a whole other level of complexity. I can still do things 'manually' as you describe for particular cases/analysis.

  6. Andrew Foland says:

    "Team strength differential decays proportionally to the log of time as the game progresses according to a particular curve"

    What does this mean? The log of time (even scaled by 60 minutes) will diverge to infinity at one end or the other of the game (depending on whether it's the log of time or log of time remaining.)

    Presumably, as you add it as a logit, it decays to 0.5?

    Will the real-time game WPA charts be taking the game probabilities into account from now on? (Or might that be somehow a toggleable option?)

  7. Brian Burke says:

    Andrew-The shape of the curve is log-shaped. The strength of the effect decays to zero as the game goes on. I wasn't trying to be mathematically literal.

    I haven't done a real-time implementation yet. I envision having 2 versions of each graph. And the calculator outputs both unadjusted and adjusted WP.

  8. Jonathan says:

    Wow, you weren't joking when you said that was coming very soon. Are there any plans to do something similar for adjusted EP, either on the drive or overall?

  9. Unknown says:

    Brian -

    Awesome upgrade to the calculator! Thanks!

  10. Anonymous says:

    Impressive additions and explanations
