Statistical Adjustment and the Case of Marathon Records and Climate Change

There is often much confusion about what is meant by “statistical adjustment” when estimating and interpreting models that include more than one predictor variable. This issue commonly arises in the traditional multiple regression model, but it is ubiquitous across nearly every class of multivariate model, ranging from regression to factor analysis to complex structural equation models. Statistical adjustment goes by many names: an effect may be described as “controlled for”, as “unique to”, or as “above-and-beyond” the effects of one or more other predictors. Regardless of the term used, statistical adjustment attempts to equate factors in a model that were not controlled as part of the experimental design.

An interesting recent example was described in a New York Times article about the relation between climate change and world record marathon times. Nearly five million finish times were drawn from 900 marathons spanning a 20-year period, and a nonlinear relation was found between temperature and finish time. The fastest times were recorded when temperatures were in the 40s, and finish times systematically slowed with increasing temperature. This raises an interesting question about whether world records should be adjusted for race-time temperature. For example, once temperature is factored in, the current world record holder would be replaced by the third-place finisher on the record list. Similar arguments have been made in baseball about adjusting home run counts for elevation, because air resistance is markedly lower in Denver than in Boston.

Although this makes for an interesting argument among friends, the notion of statistical adjustment becomes much more serious in many applied research settings, particularly when studying issues such as end-of-grade achievement, substance use, psychopathology, or any of a myriad of outcomes in the behavioral and health sciences. Failure to account for potential confounding factors can lead to biased interpretations of causal effects, which in turn threaten both the internal and external validity of the study. Dan discusses and demonstrates the idea of statistical control in greater detail in our Office Hours video series on multiple regression.
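
To make the idea of an effect that is “controlled for” or “unique to” a predictor more concrete, here is a minimal sketch in Python using simulated data. Nothing in it comes from the marathon study: the variables x, y, and z, the sample size, and the coefficient values are all hypothetical and were chosen only for illustration. The confounder z influences both the predictor x and the outcome y, so a regression that ignores z overstates the effect of x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical simulated data (not from any real study):
# z is a confounder that influences both the predictor x and the outcome y.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.5 * x + 1.0 * z + rng.normal(size=n)  # true effect of x on y is 0.5


def ols(outcome, *predictors):
    """Return least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(outcome)), *predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef


def residualize(v, covariate):
    """Return the part of v that is not linearly predictable from covariate."""
    b = ols(v, covariate)
    return v - (b[0] + b[1] * covariate)


unadjusted = ols(y, x)     # y regressed on x alone
adjusted = ols(y, x, z)    # y regressed on x, adjusting for z

# The "unique" effect of x: regress the z-free part of y on the z-free part of x.
partial = ols(residualize(y, z), residualize(x, z))

print(f"slope for x, ignoring z:        {unadjusted[1]:.2f}")
print(f"slope for x, adjusting for z:   {adjusted[1]:.2f}")
print(f"slope from residualized x and y: {partial[1]:.2f}")
```

Under these assumed values, the unadjusted slope should land near 1.0 (roughly double the true effect), while the adjusted slope should land near the simulated value of 0.5. The slope from the residualized variables matches the adjusted slope, which is one way to read “above-and-beyond”: it is the relation between x and y that remains after the part of each that is linearly related to z has been removed.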