William R. Nugent, PhD, University of Tennessee, Knoxville and Gretchen E. Ely, PhD, University of Kentucky.
Meta-analysis is rapidly becoming a fundamental tool for identifying and justifying evidence-based practices, programs, and interventions. A fundamental presumption in meta-analysis is that effect sizes based upon different measurement procedures are directly comparable. It is this presumed direct comparability that allows the meta-analyst to aggregate effect sizes from different studies in order to summarize and integrate research on the effectiveness of some practice, program, or intervention. Recent research has shown that effect sizes based upon different measurement procedures are directly comparable only under restrictive validity invariance conditions. However, no research to date has investigated how much variability is introduced into effect sizes when these validity invariance conditions are violated, or what effects violations of validity invariance have on the outcomes of a meta-analysis. This presentation reports the results of a simulation study of the effects of violations of validity invariance on both effect size variability and the outcomes of a meta-analysis. The simulation involved scores on the Beck Depression Inventory, the Center for Epidemiologic Studies Depression Scale (CESD), Hudson's Generalized Contentment Scale, and the depression subscale of the Multi-Problem Screening Inventory (MPSI). It also made use of scores from the suicidal ideation subscale of the MPSI. Scores from several research studies were aggregated for the simulation, for a total of more than 2,000 cases. The principal focus of the study was the effect of violations of validity invariance on the population standardized mean difference (SMD) and the population correlation effect sizes.
The results showed that SMD effect sizes based upon different measurement procedures, for a given between-population comparison, can differ substantially even when construct-level validity coefficients exceed .90. The results further showed that population correlation effect sizes based upon different measurement procedures, for a given relationship in a given population, may range from .19 to .71 as a function of the measurement procedures used. Finally, the results showed that, under violations of the validity invariance conditions, the correlation between population construct-level SMD effect sizes and some variable representing the characteristics of studies included in a meta-analysis may vary from -1 to +1, depending upon which measures are used in which studies. Similarly, under violations of the validity invariance conditions, the correlation between population construct-level correlation effect sizes and such a study-characteristic variable may vary from -1 to +1, depending upon which measures are used in which studies. These results pose a challenge to meta-analysis as a tool for identifying evidence-based practices: they raise the possibility that current meta-analyses have measure-dependent results and may therefore be providing erroneous conclusions about best practices.
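The measure dependence of SMD effect sizes can be illustrated with a minimal sketch. It is not the simulation reported here and does not reproduce its results. It assumes a deliberately simple classical test theory model in which an observed score is the construct score plus independent error, so that the observed population SMD equals the construct-level SMD multiplied by the measure's validity coefficient. The function names, the construct-level SMD of 0.8, and the validity values are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def observed_smd(construct_smd, validity):
    """Population SMD on an observed measure under the assumed model
    X = T + e, with e independent of T and of group membership.
    Under that assumption: observed SMD = construct SMD * validity."""
    return construct_smd * validity


def simulate_smd(construct_smd, validity, n=20000, seed=0):
    """Monte Carlo check of observed_smd under the same assumed model.
    The construct T has within-group SD 1; validity must be in (0, 1]."""
    rng = random.Random(seed)
    # Error SD chosen so the within-group corr(X, T) equals `validity`.
    error_sd = math.sqrt(1.0 / validity ** 2 - 1.0)

    def sample(group_mean):
        return [rng.gauss(group_mean, 1.0) + rng.gauss(0.0, error_sd)
                for _ in range(n)]

    treated = sample(construct_smd)
    control = sample(0.0)
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


if __name__ == "__main__":
    # Same construct-level difference, two hypothetical measures.
    for validity in (0.9, 0.6):
        print(validity,
              round(observed_smd(0.8, validity), 3),
              round(simulate_smd(0.8, validity), 3))
```

Under these assumptions, the same construct-level difference produces different SMDs depending only on which measure is used, which is the kind of measure dependence at issue. The conditions examined in the study are more general than this single-coefficient model.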