Examining the correlation of turnover history coming into the game with what actually occurs
This is further research into Turnovers. If you haven't read the earlier articles, a good place to start is with Turnover Difference Revisited.
For years now we have been harping on and on about the importance of turnovers in the outcome of NFL games. We've pioneered the turnover difference theory of analysis, and examined turnovers as a whole, fumbles and interceptions independently, and even branched into such areas as red zone turnovers and combining turnovers with the GAP team rating scheme.
Well, today we will focus on how correlated turnover history is with what actually transpires in future games. To do this we will need to venture into somewhat more serious statistical analysis, but don't worry: the conclusions we draw should be easy enough to understand.
First up, we have to decide what kind of data set to use and what we will be trying to calculate. For this column we have decided to use a team's (and its opponent's) season-to-date turnover average coming into a game to try to predict the turnovers that will occur that week in the specific matchup. Since the predictiveness of season-to-date stats will vary considerably by the point in the season (e.g., in week two, with only one game played, we would not expect much accuracy from the stats), we have to make an arbitrary decision as to what timeframe to go with.
We picked week 8 through week 15...
Why, you may ask? Well, our thinking is that by week eight teams have accumulated quite a large amount of data, and by stopping short of the last couple of weeks of the season we exclude those games where teams may a) not be trying too hard, b) be throwing young blood onto the field to see how they do, and c) be generally worn down from the long, physical beating the season inflicts on its players.
So, basically, we'll look at weeks 8 to 15 for a ten-year sample (1991-2000).
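For readers who like to see the bookkeeping, here is a rough sketch of how the season-to-date averages and the week 8 to 15 window could be put together for one team. The function names, the dictionary layout, and the fumble counts are all made up for illustration; this is not the actual program or data behind the column.

```python
def season_to_date_averages(counts_by_week):
    """Map each week played to the per-game average of all EARLIER games.

    counts_by_week: dict of week number -> turnovers in that week's game.
    Bye weeks are simply absent. The first game has no history, so it is skipped.
    """
    averages = {}
    total = 0
    games = 0
    for week in sorted(counts_by_week):
        if games > 0:
            averages[week] = total / games
        total += counts_by_week[week]
        games += 1
    return averages


def sample_rows(counts_by_week, first_week=8, last_week=15):
    """Pair (average coming into the game, actual count) for weeks in the window."""
    averages = season_to_date_averages(counts_by_week)
    return [
        (averages[week], counts_by_week[week])
        for week in sorted(averages)
        if first_week <= week <= last_week
    ]


# Hypothetical fumbles lost for one team; week 6 is a bye. Not real NFL data.
fumbles_lost = {1: 2, 2: 0, 3: 1, 4: 1, 5: 0, 7: 2, 8: 1, 9: 0}
rows = sample_rows(fumbles_lost)
print(rows)
```

Each pair is one data point: the history coming in, and what actually happened. Weeks before 8 still count toward the averages; they just don't contribute data points of their own.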
The following table represents the least squares estimation of the relationship between turnover season-to-date history and the actual turnovers that occur in the next game for "fumbles lost."
Predicting Fumbles Lost
| Trying to predict | Using | Formula |
| --- | --- | --- |
| Home Fumbles Lost | Home Off. Fumbles Lost | .750 + .166X |
| Away Fumbles Lost | Away Off. Fumbles Lost | .789 + .110X |
| Home Fumbles Lost | Away Def. Fumbles Captured | .815 + .095X |
| Away Fumbles Lost | Home Def. Fumbles Captured | .810 + .085X |
| Home Fumbles Lost | Home Off. Fumbles Lost + Away Def. Fumbles Captured | .694 + .229X |
| Away Fumbles Lost | Away Off. Fumbles Lost + Home Def. Fumbles Captured | .731 + .174X |
ANALYSIS:
Got all that? Good.
Don't worry if the above seems nebulous; let's explain what it means. We're trying to see what formula you would want to use for predicting turnovers over a large sample of games with minimal inputted data. In each case, the answer is in the Formula column.
So for example, if we want to predict the number of home fumbles lost in an upcoming game using only the season-to-date average, the computer spits out a formula of .750 + .166X. X represents the variable we are using to predict with, so for the first row X would be "Home offensive fumbles lost" season-to-date per game average.
To apply this, then: if a team has a history of losing a fumble precisely 1 time per game, we would project fumbles lost for that team of .750 + (.166 * 1.0) = .916. On the other hand, a team that had lost only, say, two fumbles in eight games, for a 0.25 fumbles lost per game average, would predict out at .750 + (.166 * 0.25) = .792 fumbles lost.
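If you'd rather let the computer do the arithmetic, the first-row formula can be written as a tiny function. The function and variable names are ours, chosen for illustration; the intercept and X-multiplier come straight from the "Home Fumbles Lost" row of the table.

```python
def predict_home_fumbles_lost(season_to_date_avg):
    """Apply the first-row formula: .750 + .166X.

    X is the home team's season-to-date fumbles lost per game.
    """
    intercept = 0.750
    slope = 0.166
    return intercept + slope * season_to_date_avg


# A team that has lost exactly one fumble per game.
one_per_game = predict_home_fumbles_lost(1.0)

# A team that has lost two fumbles in eight games (0.25 per game).
two_in_eight = predict_home_fumbles_lost(2 / 8)

print(f"{one_per_game:.4f}")
print(f"{two_in_eight:.4f}")
```

Notice how little the projection moves: cutting the incoming average from 1.0 to 0.25 shaves only about an eighth of a fumble off the prediction.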
Whew! What's this all mean, you're asking, and why are you interrupting my valuable web surfing time to tell me this??
Well, what we are ultimately trying to get at is how predictive turnovers are of future events. The answer we find, for fumbles at least, is "not very." How do we know this? Well, it's the second part of the formula that really interests us. If prior history were perfectly correlated with the subsequent games, the formula would be 0 + 1X or, simpler still, X.
In other words, in perfect correlation X would have a multiplier of one, while if there is absolutely no predictive value it would have a multiplier of zero (if there was negative correlation it could be a negative number). So what it all means is that the closer the X-multiplier is to one, the more correlated the two things are, namely turnover history and actual turnovers.
For fumbles lost, then, offensive fumbles are slightly more predictive than defensive fumbles captured (compare the .166X for home offensive fumbles with the .095X for away defensive fumbles captured). Not surprisingly, combining offensive fumbles lost and defensive fumbles captured is the most accurate predictor of the ones we tested, although still not a very good one: the .229X for home fumbles lost was the highest X-multiplier we saw.
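To see the two extremes in action, here is a minimal sketch of an ordinary least squares line fit run on two toy data sets: one where the next game exactly repeats the history, and one where it has nothing to do with it. The helper name and all the numbers are invented for illustration, and this is the textbook method rather than the specific software used for the tables.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy season-to-date averages coming into four games (made-up numbers).
history = [0.5, 1.0, 1.5, 2.0]

# Case 1: what happens next exactly matches the history.
perfect = least_squares_line(history, [0.5, 1.0, 1.5, 2.0])

# Case 2: what happens next is unrelated to the history.
unrelated = least_squares_line(history, [1.0, 2.0, 2.0, 1.0])

print(perfect)
print(unrelated)
```

The first fit comes back as an intercept of zero and an X-multiplier of one; the second comes back with an X-multiplier of zero, so the best guess is simply the overall average no matter what the history says.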
Fair enough, and to relieve your eyes of the strain of squinting at all this text, let's put together a table for interceptions:
Predicting Interceptions
| Trying to predict | Using | Formula |
| --- | --- | --- |
| Home Interceptions | Home Off. Interceptions | 1.004 + .106X |
| Away Interceptions | Away Off. Interceptions | .877 + .279X |
| Home Interceptions | Away Def. Interceptions | .852 + .235X |
| Away Interceptions | Home Def. Interceptions | .998 + .174X |
| Home Interceptions | Home Off. Interceptions + Away Def. Interceptions | .747 + .325X |
| Away Interceptions | Away Off. Interceptions + Home Def. Interceptions | .713 + .420X |
So here too the stats are not hugely predictive, but at least they do better than what we saw with fumbles. The best X-multiplier we found was for predicting the number of interceptions thrown by the away team, which came in at .420X; put another way, only 42% of the stat coming in carries through to the prediction.
Who on earth looks at stuff like this, you may be thinking, and what practical purpose can it have? Well, if you are trying to create an ultimate statistical predictor, this kind of thing can be useful. We only say "can" because we are looking at such a narrow slice of the whole picture that it's not an ideal way to go about things. As an example, if we changed the data set (to fewer/more weeks, or a different number of years) the above formulas would almost certainly change to a significant extent. Moreover, turnovers could perhaps be driven by other variables not in our list above that would have a much higher correlation (if, for instance, teams with low rushing yards tend to throw more interceptions because they are passing more often).
What we can state with some confidence is that season-to-date turnover history doesn't appear to be highly correlated with future turnovers, something we have admittedly said in one form or another many times. The above, though, provides a stronger mathematical basis for this belief.
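For a side-by-side look, the twelve formulas from the two tables can be sorted by X-multiplier. The list layout and variable names below are ours, purely for illustration; the intercepts and multipliers are copied from the tables.

```python
# (trying to predict, using, intercept, X-multiplier), copied from the two tables.
formulas = [
    ("Home Fumbles Lost", "Home Off. Fumbles Lost", 0.750, 0.166),
    ("Away Fumbles Lost", "Away Off. Fumbles Lost", 0.789, 0.110),
    ("Home Fumbles Lost", "Away Def. Fumbles Captured", 0.815, 0.095),
    ("Away Fumbles Lost", "Home Def. Fumbles Captured", 0.810, 0.085),
    ("Home Fumbles Lost", "Home Off. Lost + Away Def. Captured", 0.694, 0.229),
    ("Away Fumbles Lost", "Away Off. Lost + Home Def. Captured", 0.731, 0.174),
    ("Home Interceptions", "Home Off. Interceptions", 1.004, 0.106),
    ("Away Interceptions", "Away Off. Interceptions", 0.877, 0.279),
    ("Home Interceptions", "Away Def. Interceptions", 0.852, 0.235),
    ("Away Interceptions", "Home Def. Interceptions", 0.998, 0.174),
    ("Home Interceptions", "Home Off. + Away Def. Interceptions", 0.747, 0.325),
    ("Away Interceptions", "Away Off. + Home Def. Interceptions", 0.713, 0.420),
]

# Highest X-multiplier first.
ranked = sorted(formulas, key=lambda row: row[3], reverse=True)

for target, predictor, intercept, slope in ranked:
    print(f"{slope:.3f}  {target} <- {predictor}")
```

The combined away interceptions formula lands at the top of the list and the away fumbles lost formula based on the home defense lands at the bottom.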
We'll have -- are you ready -- more to say on the subject of correlation in the future...you probably can't wait!
Copyright © 2002 by TwoMinuteWarning.com, All Rights Reserved