News and Updates
Our research group recently published a new paper that explores the many advantages of integrative data analysis (or IDA). IDA is an approach to combining and analyzing data across multiple, independent samples (a specific type of “data fusion”). Our recent paper explores several particularly salient advantages of IDA when applied to the study of high-risk child and adolescent behavior. First, pooling independent longitudinal samples with varying ranges of ages allows us to “stretch” or “accelerate” the passage of time by combining overlapping developmental cohorts. Second, studying substance use in children often leads to low cell counts due to the rare nature of the behavior, and pooling multiple samples can increase the number of cases reporting the rare outcomes. Finally, pooling samples enhances between-subject heterogeneity, thus strengthening our ability to generalize findings to a broader population of individuals. We demonstrate these advantages with a detailed worked example in which we generate score estimates for polysubstance use based on unique and shared substance use items drawn from three independent contributing studies. We hope others will find this paper useful as they explore the potential of IDA for their own research.
Curran, P.J., Cole, V.T., Giordano, M., Georgeson, A.R., Hussong, A.M., & Bauer, D.J. (in press). Advancing the study of adolescent substance use through the use of integrative data analysis. Evaluation and the Health Professions.
There is often much confusion about what is meant by “statistical adjustment” when estimating and interpreting models that include more than one predictor variable. This issue commonly arises in the traditional multiple regression model, but is ubiquitous across nearly any class of multivariate models ranging from regression to factor analysis to complex structural equation models. Statistical adjustment goes by many names including an effect that is “controlled for”, is “unique to”, or is “above-and-beyond” the effects of one or more other predictors. Regardless of term, statistical adjustment attempts to equate factors in a model that were not controlled as part of the experimental design. An interesting recent example was described in a New York Times article about the relation between climate change and world record marathon times. Nearly five million finish times were drawn from 900 marathons spanning a 20-year period, and a nonlinear relation was found between temperature and finish time. The fastest times were recorded when temperatures were in the 40s, and finish times systematically slowed with increasing temperature. This raises an interesting question about whether world records should be adjusted for race-time temperature. For example, the current world record holder would be replaced by the third place finisher on the record list when factoring in temperature. Similar arguments have been made in baseball about adjusting home run counts for elevation because the air resistance is markedly less in Denver than in Boston. Although this makes for an interesting argument among friends, the notion of statistical adjustment becomes much more serious in many applied research settings, particularly when studying issues such as end-of-grade achievement, substance use, psychopathology, or any of a myriad of outcomes in the behavioral and health sciences.
Failure to account for potential confounding factors can lead to biased interpretations of causal effects, which in turn threaten both the internal and external validity of the study. Dan discusses and demonstrates the idea of statistical control in greater detail in our Office Hours video series on multiple regression.
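To make the idea of an effect that is “unique to” a predictor concrete, here is a minimal simulation sketch. It is not taken from the Office Hours video; the variable names (`x`, `z`, `y`), the effect sizes, and the sample size are all hypothetical choices made for illustration. The adjusted effect is obtained by residualizing both the outcome and the predictor on the confounder, which yields the same coefficient as including the confounder in a multiple regression.

```python
import random


def slope_intercept(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Part of y that is not linearly predictable from x."""
    slope, intercept = slope_intercept(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def simulate(n=5000, seed=1):
    """Hypothetical data: z confounds the relation between x and y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # confounder
    x = [0.8 * zi + rng.gauss(0, 1) for zi in z]   # predictor depends on z
    # Assumed true unique effect of x on y is 0.5; z also affects y directly.
    y = [0.5 * xi + 1.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    return x, z, y


x, z, y = simulate()

# Unadjusted: regress y on x alone, ignoring the confounder.
unadjusted, _ = slope_intercept(x, y)

# Adjusted: remove z from both x and y, then regress residual on residual.
adjusted, _ = slope_intercept(residuals(z, x), residuals(z, y))

print(f"unadjusted slope: {unadjusted:.2f}")
print(f"adjusted slope:   {adjusted:.2f}")
```

Under these assumed values the unadjusted slope comes out well above the simulated unique effect of 0.5, while the adjusted slope lands close to it. That gap is the bias that arises when a confounder is left out of the model.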
We are pleased to announce our workshop schedule for this summer:
This year, we will be holding all workshops at the Chapel Hill-Carrboro Hampton Inn & Suites, where we have also reserved room blocks for the convenience of our participants. To learn more about what makes our workshops unique, please see our Training page.
It is critical for researchers in the behavioral, health, and social sciences to have a full understanding of the linear regression model. Not only is this model important in its own right, but it serves as the foundation for more advanced statistical models, such as the multilevel model, factor analysis, structural equation modeling, generalized linear models, and many other techniques. For those seeking a first exposure to linear regression or simply looking for a refresher, we’ve launched a new series of CBA Office Hours videos that starts with the basics of the simple one-predictor model and proceeds to more advanced topics. So far, we’ve posted five episodes:
- Episode 1: Introduction to Linear Regression
- Episode 2: Ordinary Least Squares Explained
- Episode 3: Testing the Model
- Episode 4: Inferences about Specific Parameters
- Episode 5: Multiple Regression
We intend to add more videos as time goes on, focusing on such topics as interpretation in the multiple regression model, the difference between hierarchical and simultaneous regression, how to incorporate categorical predictors, and how to test, probe, and plot interactions. To view all of the videos in this series in sequence, simply click the embedded video or go to our YouTube playlist on Linear Regression. You can also follow us on social media to be updated as new videos are added on this and other topics.
In a prior episode of Office Hours, Patrick discussed predicting growth by time-invariant covariates (TICs), predictors for which the numerical values are constant over time. In this episode, Patrick describes the inclusion of time-varying covariates (TVCs), predictors with numerical values that can differ across time. Examples of TVCs are numerous and include time-specific measures of depression, anxiety, substance use, marital status, onset of diagnosis, or dropout from treatment, among many others. When TICs are included in a growth model, the time-invariant predictors are used to directly predict the growth factors (e.g., intercept, slope). In contrast, when TVCs are included in a growth model, the effects of the time-varying predictors bypass the growth factors and directly influence the repeated measures. There are many ways that TVC influences can be included in the model, and models can be further extended to include both TICs and TVCs simultaneously. Patrick works through a hypothetical example and concludes with a summary of strengths and limitations of these models.
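The contrast between the two kinds of covariate can be written compactly. The following is a sketch in one common notation for a linear growth model, not necessarily the notation used in the episode; the symbols $w_i$ (a TIC), $z_{ti}$ (a TVC), and the coefficient labels are assumptions introduced here for illustration.

```latex
% Repeated measure y for person i at time t, with time scores \lambda_t.
% The TVC z_{ti} enters at the level of the repeated measures:
y_{ti} = \alpha_i + \lambda_t \beta_i + \gamma_t z_{ti} + \varepsilon_{ti}

% The TIC w_i enters at the level of the growth factors:
\alpha_i = \mu_{\alpha} + \gamma_{\alpha} w_i + \zeta_{\alpha i}
\beta_i  = \mu_{\beta}  + \gamma_{\beta}  w_i + \zeta_{\beta i}
```

Here $\gamma_{\alpha}$ and $\gamma_{\beta}$ capture how the time-invariant covariate shifts the intercept and slope, whereas $\gamma_t$ captures the time-specific effect of the time-varying covariate on the repeated measure over and above the underlying trajectory. Constraining $\gamma_t$ to be equal across time is one of several possible specifications.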
To see all episodes in this series, see our Growth Modeling playlist on YouTube.