So far the models I've been using have been purely additive. That is, they tally up the independent effects of running efficiency, passing efficiency, etc. But the models have not yet addressed the interaction effect of running and passing together. In other words, is there a bonus to having both a good passing game and running game simultaneously that is not otherwise captured by the respective linear coefficients of passing and running efficiency? Call it synergy or a dynamic, but in statistics it's called an interaction.
To test the theory of an interaction effect between passing and running efficiency I created two new variables that are simply the product of passing and running efficiency for offense and for defense.
OPASSRUNIXN = OPASS x ORUN, and
DPASSRUNIXN = DPASS x DRUN
In regression models, the coefficients of interaction variables are much easier to interpret when the variables are standardized, which means they are expressed as the number of standard deviations from the mean rather than in raw yards. For example, a team's very poor passing efficiency stat of 4.5 yds/att would be -2.2 when standardized, meaning it's 2.2 standard deviations below the league average. Using standardized variables, aka "Z-scores," also allows us to compare the relative weight of each regression coefficient directly. If one variable's coefficient is twice as large as another's, then it is related twice as strongly to the dependent variable.
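As a rough illustration of the standardizing step, here is a minimal Python sketch. The five teams' numbers are made up for the example, and the function name `zscores` is my own. It also assumes the interaction variable is the product of the two Z-scores; standardizing the raw product instead would be a different (also plausible) construction.

```python
from statistics import mean, pstdev

def zscores(values):
    """Express each value as standard deviations from the mean of the list."""
    m = mean(values)
    s = pstdev(values)  # population SD; using the sample SD (stdev) is also defensible
    return [(v - m) / s for v in values]

# Hypothetical efficiencies for five teams (illustrative only, not real data)
opass = [4.5, 5.8, 6.1, 6.4, 7.2]  # passing yards per attempt
orun = [3.6, 4.4, 3.9, 4.1, 4.5]   # rushing yards per carry

z_opass = zscores(opass)
z_orun = zscores(orun)

# Assumed construction: interaction = product of the standardized variables
z_ixn = [p * r for p, r in zip(z_opass, z_orun)]

for row in zip(z_opass, z_orun, z_ixn):
    print([round(x, 2) for x in row])
```

Note that a team below average in both passing and running gets a positive interaction value, just as a team above average in both does, because the product of two negatives is positive.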
Again, season wins is the dependent variable. The sample comprises all 32 NFL teams' stats from the 2002-2006 regular seasons. The independent variables include the familiar offensive and defensive running, passing, interception, and fumble efficiencies, plus penalties. This time, we also include the two new interaction variables.
The result of the regression is below:
R-squared is 0.75.
First, let's review what the coefficients mean. Since the variables are all standardized, we can see that "ZOTRUPASS" (standardized offensive true passing efficiency) has a coefficient of 1.2. This means for every standard deviation above league average, a team can expect 1.2 more wins than if it were average--holding all other variables constant.
We also see that ZORUNAVG (standardized offensive running efficiency) has a coefficient of about 0.4. That means for every standard deviation above average a team is in running efficiency, it can expect 0.4 more wins than if it were average--holding all other variables constant. And because these are coefficients of standardized variables, we can say that the relationship between passing efficiency and wins is about three times as strong as the relationship between running efficiency and wins (1.2 vs. 0.4).
The variables of particular interest now are the interaction variables (the bottom 2). ZORUNPASSIXN (standardized offensive run-pass interaction) is solidly significant and its coefficient is about -0.3. In comparison, the coefficient for running alone is about 0.4. This is evidence that there is something to the theory of an interaction between running and passing, at least on offense. But it is the opposite of what I expected. The stronger the interaction (i.e. the better the combination of running and passing), the fewer wins a team can expect. This indicates that although running well and passing well both help teams win, there is a limit to the linear relationships between running and winning, and between passing and winning.
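To make the negative coefficient concrete, here is a small Python sketch using the approximate coefficients quoted above (1.2, 0.4, and -0.3). It looks only at the three offensive terms, ignores every other variable in the model, and assumes the interaction variable is the plain product of the two Z-scores. The function names are mine.

```python
# Approximate coefficients quoted in the text (wins per standard deviation)
B_PASS = 1.2
B_RUN = 0.4
B_IXN = -0.3

def offense_wins_above_average(z_pass, z_run):
    """Wins above average from the three offensive terms only."""
    return B_PASS * z_pass + B_RUN * z_run + B_IXN * z_pass * z_run

def marginal_wins_per_sd_passing(z_run):
    """Extra wins from one more SD of passing, given the running Z-score."""
    return B_PASS + B_IXN * z_run

# Payoff of one extra SD of passing for poor, average, and good running teams
for z_run in (-1, 0, 1):
    print(z_run, round(marginal_wins_per_sd_passing(z_run), 2))

# A team one SD above average in both running and passing
print(round(offense_wins_above_average(1, 1), 2))
```

Under these assumptions, an extra standard deviation of passing efficiency is worth about 1.5 wins to a team that runs one standard deviation below average, but only about 0.9 wins to a team that runs one standard deviation above average. A team one standard deviation above average in both comes out about 1.3 wins ahead, less than the 1.6 the purely additive terms would suggest.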
ZDPASSRUNIXN (standardized defensive pass-run interaction) is not at all significant and the coefficient is near zero. Excluding other variables from the model still does not make defensive run-pass interaction significant. This is evidence that there is no synergistic effect on the defensive side of the ball. I'm surprised because it makes intuitive sense that if a team can stop the run, the opposing offense becomes very predictable.
A possible explanation for the offensive efficiency of bad teams lies in the flip side of "teams run because they win." If you are a bad team, and the other team is in prevent D, draw-type runs and short passes may be very "efficient" in terms of yards. From the defense's point of view, you are helping them run out the clock. These blowout yards don't count the same as 1st quarter yards in determining victory.
I think that's an excellent point. I wonder if there would be a different result if I repeated the analysis with 1st half stats only.
Unfortunately, data splits like that are hard to come by.
That may also have an impact on why some QBs tend to get a lot of their yards by YAC and not through the air.