## Passing Predictability Part 2

Here's a tip: If you author a sports research website, never start an article as "Part 1" unless you already have a Part 2 ready to go. Way back at the end of March I began looking at how predictability affects passing success. I looked at "10 yards to go" situations--1st and 10, 2nd and 10, and 3rd and 10 plays. Passes are least predictable on 1st down and most predictable on 3rd and long. With 10 yards to go, passes are called roughly half the time on 1st and 2nd down, but on 3rd and 10 passes are called 91% of the time.

Since then, I've been tinkering with my NBA and NHL in-game win probability sites. This post will officially tie up the loose end that's been dangling since the NCAA Tournament.

In Part 1 of this article, we saw the average gain in each of these situations:

Yds Per Attempt by Down, 10 Yds To Go

| Type | 1st | 2nd | 3rd | Total |
|-------|-----|-----|-----|-------|
| Pass | 7.0 | 6.3 | 6.5 | 6.9 |
| Run | 4.2 | 4.4 | 6.9 | 4.3 |
| Total | 5.5 | 5.4 | 6.5 | 5.6 |

But there were several wrinkles in this analysis. First, there are interceptions to consider. Interceptions become much more probable on 3rd and 10.

Interception % by Down, 10 Yds To Go

| | 1st | 2nd | 3rd | Total |
|--------------|-----|-----|-----|-------|
| Int Rate (%) | 2.6 | 2.9 | 3.5 | 2.7 |

Combining Yards per Attempt and Interception rate yields Adjusted YPA, which is YPA with a -45 yd adjustment for every interception thrown.
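As a rough illustration of that adjustment, here is a minimal Python sketch. The function name `adjusted_ypa` and the dictionary layout are my own assumptions for the example, not the code behind the article. The inputs are the rounded pass YPA and interception percentages from the tables above, so the results may differ from the published Adj YPA figures by a tenth of a yard.

```python
# Yards charged per interception, as stated in the article.
INT_PENALTY_YDS = 45.0


def adjusted_ypa(ypa: float, int_rate_pct: float,
                 penalty: float = INT_PENALTY_YDS) -> float:
    """Return YPA minus the interception penalty.

    int_rate_pct is the interception rate in percent (2.6 means 2.6%).
    """
    return ypa - penalty * int_rate_pct / 100.0


# (pass YPA, interception %) by down, copied from the rounded tables above.
passing = {"1st": (7.0, 2.6), "2nd": (6.3, 2.9), "3rd": (6.5, 3.5)}

adj = {down: adjusted_ypa(ypa, pct) for down, (ypa, pct) in passing.items()}

for down, value in adj.items():
    print(f"{down} and 10: {value:.2f} adjusted yards per pass attempt")
```

Because the penalty is applied to a rate, a one-point rise in interception percentage costs 0.45 adjusted yards per attempt.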

Adj Yds Per Attempt by Down, 10 Yds To Go

| Type | 1st | 2nd | 3rd | Total |
|------|-----|-----|-----|-------|
| Pass | 5.9 | 5.0 | 4.9 | 5.6 |
| Run | 4.2 | 4.4 | 6.9 | 4.3 |

But there is still a problem. Adj YPA underestimates the drop-off in passing effectiveness from 1st to 3rd down, because on 3rd and 10 defenses will allow gains as long as they're not more than 9 yards. So the most we can say is that the reduction in passing effectiveness due to predictability is likely at least 1 full adjusted yard per attempt.

Except that there's still another problem. There's a bias in the data because poor passing teams will face more 2nd & 10 and 3rd & 10 situations. So the 2nd and 3rd down numbers are lower than would be representative of the league as a whole. In other words, poor passing teams 'get more votes' in the analysis.

The solution is to give each team (or each team-year, actually) equal votes. Instead of averaging the gains for all 2nd and 10s for the entire league as a whole, I first averaged them by team-year, then averaged them by team.
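To make the "equal votes" idea concrete, here is a small Python sketch comparing a pooled league average with the two-stage average described above. The play records are made-up numbers and the team labels `AAA` and `BBB` are hypothetical. The function names are assumptions for illustration, not the actual analysis code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical plays: (team, year, yards gained). Invented for illustration only.
plays = [
    ("AAA", 2006, 4.0), ("AAA", 2006, 6.0), ("AAA", 2007, 8.0),
    ("BBB", 2006, 2.0), ("BBB", 2006, 2.0), ("BBB", 2006, 2.0),
    ("BBB", 2006, 2.0), ("BBB", 2007, 4.0),
]


def pooled_average(plays):
    """Average over every play, so teams with more plays get more weight."""
    return mean(yards for _team, _year, yards in plays)


def equal_vote_average(plays):
    """Average by team-year first, then by team, then across teams."""
    by_team_year = defaultdict(list)
    for team, year, yards in plays:
        by_team_year[(team, year)].append(yards)

    by_team = defaultdict(list)
    for (team, _year), gains in by_team_year.items():
        by_team[team].append(mean(gains))

    return mean(mean(season_avgs) for season_avgs in by_team.values())


pooled = pooled_average(plays)
equal = equal_vote_average(plays)
print(f"pooled: {pooled:.2f}, equal votes: {equal:.2f}")
```

In this toy data the weaker team `BBB` supplies five of the eight plays, so the pooled figure is dragged toward its low gains, while the two-stage average gives each team the same weight.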

Broken Out by Team & Year, Re-Averaged

| Stat | 1st | 2nd | 3rd | Total |
|--------------|-----|-----|-----|-------|
| YPA | 7.0 | 6.3 | 6.5 | 6.8 |
| Int Rate (%) | 2.6 | 2.9 | 3.5 | 2.7 |
| Adj YPA | 5.8 | 5.0 | 4.9 | 5.6 |

As it turns out, the numbers are nearly (but not completely) identical using this method. The effect of the bias isn't very pronounced and we're left with the original estimate. Going from about 50% predictability to 90% predictability costs at least 1.0 Adj YPA.

Another notable observation is that although 1st & 10 and 2nd & 10 have about the same run/pass balance of roughly 50%, passes on 1st and 10 are considerably more effective: 5.8 vs 5.0 Adj YPA. I suspect the difference is largely due to play action.

### 5 Responses to “Passing Predictability Part 2”

1. Anonymous says:

Nice analysis. As a Lions fan, I think we bring any mean calculation down if it's a positive stat and up if it's a negative stat!

2. Anonymous says:

Could first down effectiveness be skewed by "garbage time" passes where a team is very behind and throwing on every down and the defense is playing prevent for half the field, giving up yards but not touchdowns?

3. Brian Burke says:

It could, except I limited the data to the 1st 3 quarters and when the score was within 10 points.

4. Bill Scofield says:

Fascinating stuff. I wonder if you've considered fumbles as part of the analysis. You deduct 45 yards for a turnover in the passing game in Adj. YPA, but where are turnovers in the running game accounted for here? Shouldn't they be? If the risk of an interception is factored in when looking at Adj. YPA, shouldn't the risk of a fumble in the running game be included in the analysis? If not, I think you're discounting the passing game premium too much.

I realize that data on fumbles by running backs, as opposed to fumbles by anyone (wide receiver, tight end, quarterback) would be tough to come by, but I think not accounting for that in your rushing data, while including a penalty in the passing game, skews the data a bit in favor of the run.

Again, great stuff!

5. Brian Burke says:

Bill, you're right. I took a shortcut here because fumbles occur at roughly equal rates for both run and pass plays. In reality it's a little more complex than I present it, though, because there is a difference in where fumbles tend to occur based on play type. I don't have a lot of confidence in my data on that right now. Good point.