Quasi-Experimental Strategies When Randomization Fails: Propensity Score Matching and Sensitivity Analysis
Purpose: Randomized experiments have long been treated as a “gold standard” in program evaluation. However, evaluators of whole-school programs may find it difficult to randomly assign schools to intervention and control conditions. Schools are often reluctant to participate in studies that offer only a 50% chance of receiving the treatment. Furthermore, because permission to recruit a school must often be secured from its district, recruiting and randomly assigning sites to each condition can be a costly and time-consuming process involving dozens of potential sites.
The evaluation of the School Success Profile-Intervention Package (SSP-IP) illustrates a whole-school evaluation in which problems in the recruitment phase made the random assignment of schools to treatment impossible and the assumption of randomness untenable. Only two of the five districts selected for recruitment participated, and all 11 intervention schools came from these geographically proximal districts. Too few middle schools remained in these districts to supply the 33 control schools required by our power analysis, so the controls had to be selected from elsewhere. Consequently, the intervention schools shared a common trait that distinguished them from the controls: geographic proximity. This violates the stable unit treatment value assumption (SUTVA) that, conditional on covariates X, outcomes under treatment and nontreatment are unrelated to treatment assignment. We discuss quasi-experimental strategies, primarily propensity score matching (PSM), as a means to address such violations in quasi-experimental settings, including situations in which an experimental design failed to hold because of site recruitment problems.

Methods: PSM is a powerful strategy for addressing this evaluation challenge. Our objective was to choose the 33 comparison schools that best matched the 11 intervention schools. We used logistic regression to model treatment assignment on a set of variables suspected of being related to it; several different models were tested. We then used the propensity scores (the predicted probabilities from the logistic regression) and nearest-neighbor matching within a caliper to choose the 33 controls (a minimal sketch of this procedure appears below). This process approximates group randomization, making the resultant intervention and comparison groups more balanced on observables.

Results: The logistic regressions demonstrated that geography and other factors were related to treatment assignment. Several characteristics were significant predictors of treatment assignment, including aggregate measures of school performance and urbanicity. After matching, the two groups of schools no longer differed on observed covariates, and the data met the requirement of SUTVA.

Implications: When randomization fails, evaluators should consider corrective strategies in data analysis. PSM at the school level is one such robust method that is particularly useful in the evaluation of school-based interventions. A major limitation of PSM is that it cannot control for selection bias due to unobservables. To address this limitation, researchers should conduct sensitivity analyses to gauge the robustness of their findings to hidden bias (a sketch of such an analysis also appears below). A rigorous evaluation must employ multiple evaluation methods together with sensitivity analysis. This strategy is useful in schools and other cluster settings (e.g., communities, agencies) where random assignment of whole clusters can be politically difficult to undertake.
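For reference, the conditional-independence condition invoked above is usually written as strongly ignorable treatment assignment (Rosenbaum & Rubin, 1983); the notation below (Z for treatment, Y_1 and Y_0 for potential outcomes) is standard rather than taken from the abstract itself:

    (Y_1, Y_0) \perp Z \mid X, \qquad 0 < \Pr(Z = 1 \mid X) < 1

    e(X) = \Pr(Z = 1 \mid X), \qquad (Y_1, Y_0) \perp Z \mid e(X)

Rosenbaum and Rubin's key result is the second line: if the condition holds given X, it also holds given the scalar propensity score e(X), which is why matching on e(X) suffices to balance X.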
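To make the matching step concrete, here is a minimal sketch in Python, assuming a hypothetical school-level file schools.csv with illustrative column names (treated, pct_free_lunch, urbanicity, mean_test_score, enrollment); the abstract does not specify the actual covariates, software, or caliper width used:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Hypothetical school-level data: one row per school; column names
    # are illustrative, not the study's actual covariates.
    df = pd.read_csv("schools.csv")
    covariates = ["pct_free_lunch", "urbanicity", "mean_test_score", "enrollment"]

    # Step 1: model treatment assignment; the predicted probabilities
    # are the propensity scores.
    logit = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
    df["pscore"] = logit.predict_proba(df[covariates])[:, 1]

    # Step 2: greedy 3:1 nearest-neighbor matching without replacement,
    # within a caliper of 0.25 SD of the propensity score (a common rule
    # of thumb; the abstract does not report the caliper actually used).
    caliper = 0.25 * df["pscore"].std()
    treated = df[df["treated"] == 1]
    controls = df[df["treated"] == 0].copy()

    matched_ids = []
    for _, t in treated.iterrows():
        dist = (controls["pscore"] - t["pscore"]).abs()
        nearest = dist[dist <= caliper].nsmallest(3)
        matched_ids.extend(nearest.index.tolist())
        controls = controls.drop(nearest.index)  # match without replacement

    matched_controls = df.loc[matched_ids]

With 11 treated schools and three matches each, this yields the 33 comparison schools called for by the design. Greedy matching is order-dependent; optimal matching is a common alternative when close matches are scarce.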
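A standard way to verify the balance claim in the Results is to compare standardized mean differences before and after matching; this check continues the hypothetical variables from the sketch above:

    def std_diff(x_t, x_c):
        # Standardized mean difference: mean difference over the pooled SD.
        pooled_sd = ((x_t.var() + x_c.var()) / 2) ** 0.5
        return (x_t.mean() - x_c.mean()) / pooled_sd

    for cov in covariates:
        before = std_diff(df.loc[df["treated"] == 1, cov],
                          df.loc[df["treated"] == 0, cov])
        after = std_diff(treated[cov], matched_controls[cov])
        print(f"{cov}: |d| before = {abs(before):.2f}, after = {abs(after):.2f}")

Absolute standardized differences below roughly 0.1 after matching are conventionally taken as adequate balance.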
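The sensitivity analysis mentioned in the Implications can be illustrated with Rosenbaum bounds. For 1:1 matched pairs (a simplification of the 3:1 design above), the sign-test p-value can be bounded when an unobserved confounder multiplies the odds of treatment within a pair by at most Gamma; the pair counts below are invented for illustration:

    from scipy.stats import binom

    def sign_test_upper_bound(n_pos, n_discordant, gamma):
        # Rosenbaum bound for the sign test: under hidden bias of at most
        # `gamma`, the chance that a discordant pair favors the treated
        # unit is at most gamma / (1 + gamma), which bounds the one-sided
        # p-value from above.
        p_plus = gamma / (1.0 + gamma)
        return binom.sf(n_pos - 1, n_discordant, p_plus)  # P(X >= n_pos)

    # Hypothetical example: 30 discordant pairs, 22 favoring treatment.
    for gamma in (1.0, 1.5, 2.0, 3.0):
        print(f"Gamma = {gamma}: p <= {sign_test_upper_bound(22, 30, gamma):.3f}")

Gamma = 1 reproduces the ordinary sign test; if the upper bound remains below the significance level for Gamma well above 1, the finding is robust to moderate hidden bias.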