Research That Matters (January 17 - 20, 2008)
|Saturday, January 19, 2008: 2:00 PM-3:45 PM|
|Blue Room (Omni Shoreham)|
|[RD/M] Advanced Topics in Program Evaluation: Statistical Approaches|
|Speaker/Presenter:|Shenyang Guo, PhD, University of North Carolina at Chapel Hill|
When randomized clinical trials are infeasible, researchers must rely on statistical approaches to discern treatment effectiveness. Regression (covariance control) is only one such approach, and its estimator is not necessarily the best linear unbiased estimator (BLUE). This workshop focuses on several advanced topics in statistical modeling for program evaluation. Specifically, it presents results from a series of Monte Carlo studies designed to show the problems of regression under various conditions. It emphasizes the danger of blindly throwing control variables into a regression analysis without carefully examining the data-generating process and the mechanisms producing selection bias. The workshop aims to convey a simple message: in many situations regression produces biased results and should be replaced by more rigorous approaches such as matching or propensity score matching (PSM).
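The point about selection bias can be illustrated with a small Monte Carlo sketch (an illustration of the general idea only, not the workshop's actual simulation design; all names such as `TRUE_EFFECT` are invented here): when an unobserved variable drives both treatment selection and the outcome, regressing the outcome on the treatment indicator and the observed controls stays biased no matter how many replications are run.

```python
# Monte Carlo sketch of regression bias under "selection on unobservables".
# An unobserved confounder u affects both who gets treated and the outcome;
# the regression can only control for the observed covariate x.
import numpy as np

rng = np.random.default_rng(0)
TRUE_EFFECT = 2.0
n, reps = 1000, 200
estimates = []
for _ in range(reps):
    x = rng.normal(size=n)                               # observed covariate
    u = rng.normal(size=n)                               # unobserved confounder
    d = (x + u + rng.normal(size=n) > 0).astype(float)   # selection depends on u
    y = TRUE_EFFECT * d + x + u + rng.normal(size=n)     # u also shifts the outcome
    X = np.column_stack([np.ones(n), d, x])              # regression sees only d and x
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    estimates.append(beta[1])                            # estimated treatment effect

print(round(float(np.mean(estimates)), 2))  # noticeably above TRUE_EFFECT
```

Averaged over many replications the estimated effect does not converge to the true effect, which is the sense in which regression is "asymptotically biased" here: adding more data, or more observed controls, does not fix an omitted confounder.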
The workshop will focus on the following topics.
1. The fundamental assumption. In all program evaluations, evaluators must balance the data to meet the fundamental assumption that treatment assignment is independent of the potential outcomes under different conditions. Following Heckman and Robb (1985, 1986, 1988), a Monte Carlo study simulating two conditions (i.e., "selection on observables" and "selection on unobservables") shows that regression is more asymptotically biased and less robust than PSM.

2. The stable unit treatment value assumption (SUTVA; Rubin, 1986). SUTVA states that the potential outcomes for any unit do not vary with the treatments assigned to other units, and that there are no different versions of the treatment. SUTVA imposes exclusion restrictions on outcome differences. It underscores the importance of analyzing average treatment effects for the subpopulation of treated units, which is analogous to the efficacy subset analysis (ESA) found in the intervention-research literature (Lochman et al., 2006). Results of the Monte Carlo study show why regression is biased, but matching unbiased, in estimating efficacy subsets.

3. When the fundamental assumption is violated, evaluators face a number of choices for balancing the data, to name a few: regression, matching, subclassification or stratification, regression on the propensity score, matching on the propensity score, weighting and regression, blocking and regression, PSM with lowess, regression discontinuity designs, instrumental variables, Bayesian approaches, and more. Among these, matching (Abadie et al., 2004), PSM (Rosenbaum & Rubin, 1983), and PSM with lowess (Heckman et al., 1997) are the most popular and have been increasingly adopted by social and behavioral evaluators. Results of a Monte Carlo study show how these methods produce better results than regression.

4. Under certain conditions, matching and PSM will perform even better than a newer approach, group randomized clinical trials (GRCT; Raudenbush, 2006; Bloom et al., 2005).
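A minimal sketch of the propensity score matching idea referenced above (my own illustration on simulated data, not code from the workshop; the data-generating choices and the ATT comparison are assumptions made here). With selection on observables only, matching on the estimated propensity score recovers the treatment effect where the naive comparison of group means does not:

```python
# Propensity score matching sketch: selection depends only on the observed x,
# so balancing on P(d=1|x) removes the selection bias.
import numpy as np

rng = np.random.default_rng(1)
TRUE_EFFECT = 2.0
n = 4000
x = rng.normal(size=n)
d = (x + rng.normal(size=n) > 0).astype(float)   # treatment depends on observed x
y = TRUE_EFFECT * d + 3.0 * x + rng.normal(size=n)

# Naive difference in means confounds the effect with selection on x.
naive = y[d == 1].mean() - y[d == 0].mean()

# 1. Fit a logistic propensity model P(d=1|x) by Newton-Raphson.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (d - p)
    hess = X.T @ (X * (p * (1 - p))[:, None])
    beta += np.linalg.solve(hess, grad)
pscore = 1.0 / (1.0 + np.exp(-X @ beta))

# 2. Nearest-neighbor matching (with replacement) on the propensity score;
#    averaging y(treated) - y(matched control) over treated units gives the ATT.
treated = np.where(d == 1)[0]
control = np.where(d == 0)[0]
gaps = np.abs(pscore[control][None, :] - pscore[treated][:, None])
matches = control[gaps.argmin(axis=1)]
att = (y[treated] - y[matches]).mean()

print(round(naive, 2), round(att, 2))  # naive is far from TRUE_EFFECT; att is close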
The last simulation presents comparative data from two strategies for evaluating the Title IV-E Waiver Demonstration project, a non-experimental program in which selection bias is built in. Results show that PSM is more robust and less expensive than GRCT.
Conclusion: it is important to examine the conditions under which evaluation data are generated; when selection bias is present, matching or PSM is a better method than regression.