Session: Genetic Matching in R: Getting Unbiased Estimates of Treatment Effects From Experiments and Observational Studies (Society for Social Work and Research 15th Annual Conference: Emerging Horizons for Social Work Research)

63 Genetic Matching in R: Getting Unbiased Estimates of Treatment Effects From Experiments and Observational Studies

Schedule:
Friday, January 14, 2011: 10:00 AM-11:45 AM
Meeting Room 8 (Tampa Marriott Waterside Hotel & Marina)
Cluster: Research Design and Measurement
Speaker/Presenter:  Richard Smith, PhD, Assistant Professor, Wayne State University, Detroit, MI
Propensity score matching is a well-known method in social work research and a regular topic for methods workshops at SSWR. However, it is almost 30 years old (Rubin, 1980). What matching methods are current in the literature? Genetic Matching (GenMatch), according to its author, can find in one hour an optimal set of matches and weights that would take human researchers ten years to find using propensity score matching (Diamond & Sekhon, 2005).

The first part of this workshop will go over the basics of R, the world's leading free and open source statistical software. The second part will review the intellectual history of causal inference as it pertains to matching. The third part will introduce GenMatch, an application of genetic matching algorithms for the social sciences.

The R project is a GNU (GNU's Not Unix) implementation of the S system developed by Chambers et al. at Bell Laboratories for statistical and graphical analysis of data. GNU is the free operating system project sponsored by the Free Software Foundation. R is "copyleft" software released under the GNU General Public License (GPL), so it may be used, modified and redistributed at will provided all derivative products are also copyleft under the GPL. R is mostly used from the command line, but graphical user interfaces are available for different audiences.

According to the Neyman-Rubin potential outcomes framework, the fundamental problem of causal inference is that the researcher never observes the treated person under control conditions and never observes the control population under the treatment regime. That is why the randomized clinical trial is the gold standard for causal inference: random assignment makes the treatment and control groups comparable on average. In a randomized trial, the average treatment effect, or experimental benchmark, is estimated by the difference in mean outcome values between the treatment group and control group.
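
The difference-in-means benchmark can be illustrated with a small simulation. The sketch below is not workshop material: it is written in plain Python rather than R, the function name `simulate_randomized_trial` is hypothetical, and the data are simulated under the assumption of a constant treatment effect of 2.0. Each unit has two potential outcomes, but only the one matching its randomly assigned group is ever recorded.

```python
import random
import statistics


def simulate_randomized_trial(n=10_000, true_effect=2.0, seed=0):
    """Estimate the average treatment effect by a difference in means.

    Hypothetical illustration: outcomes are simulated, not real data.
    """
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)   # potential outcome under control
        y1 = y0 + true_effect      # potential outcome under treatment
        # Random assignment: only one potential outcome is observed per unit.
        if rng.random() < 0.5:
            treated_outcomes.append(y1)
        else:
            control_outcomes.append(y0)
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)


estimate = simulate_randomized_trial()
print(f"estimated average treatment effect: {estimate:.2f}")
```

Because assignment is independent of the potential outcomes, the estimate should land close to the assumed effect of 2.0, even though no unit is observed in both conditions.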

However, in many cases interventions cannot be subject to a randomized trial for budgetary, logistical or ethical reasons. That is why it is often necessary to perform an observational study, or quasi-experiment. Indeed, an observational study with a large random sample can generate estimates with better external validity than a trial, provided the causal path and selection bias are well understood.

Matching is necessary in an observational study to estimate the average treatment effect on the treated. The treatment and control groups must be balanced on all key observable variables that influence selection into treatment or confound the outcome. Propensity scores are typically estimated with a logistic regression of treatment on observed covariates. GenMatch is an improvement because it is non-parametric and affine invariant (preserves ratios of distances among covariates). It also generates a population of random candidate weights from which to select the optimal solution. To prevent data mining, GenMatch requires users to write their own function to discard bad weights based on a priori knowledge of the intervention and study population.
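
The idea of searching over covariate weights to improve balance can be sketched in a few lines. This is a toy illustration only, not GenMatch and not the R code used in the workshop: it is plain Python, it replaces the genetic algorithm with a simple random search, it uses a scaled weighted distance of its own, and every name in it (`make_data`, `match`, `imbalance`, `search_weights`) is hypothetical. The data are simulated so that selection into treatment depends on the first covariate and the assumed effect on the treated is 1.0.

```python
import math
import random
import statistics


def make_data(n=300, effect=1.0, seed=1):
    """Simulated observational data: selection into treatment depends on x[0]."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p_treat = 1 / (1 + math.exp(-x[0]))
        t = rng.random() < p_treat
        y = x[0] + x[1] + effect * t + rng.gauss(0, 0.1)
        rows.append((x, t, y))
    return rows


def match(treated, controls, weights, scales):
    """Pair each treated unit with its nearest control (with replacement)."""
    pairs = []
    for tx, _, ty in treated:
        def dist(c):
            return sum(w * ((a - b) / s) ** 2
                       for w, a, b, s in zip(weights, tx, c[0], scales))
        cx, _, cy = min(controls, key=dist)
        pairs.append((tx, ty, cx, cy))
    return pairs


def imbalance(pairs, scales):
    """Largest absolute standardized mean difference across covariates."""
    worst = 0.0
    for k, s in enumerate(scales):
        diff = (statistics.mean([p[0][k] for p in pairs])
                - statistics.mean([p[2][k] for p in pairs]))
        worst = max(worst, abs(diff) / s)
    return worst


def search_weights(rows, n_candidates=20, seed=2):
    """Random search over weights (a stand-in for a genetic algorithm)."""
    rng = random.Random(seed)
    treated = [r for r in rows if r[1]]
    controls = [r for r in rows if not r[1]]
    scales = [statistics.pstdev([r[0][k] for r in rows]) for k in range(2)]
    # The first candidate is the equal-weights baseline.
    candidates = [[1.0, 1.0]]
    candidates += [[rng.random(), rng.random()] for _ in range(n_candidates)]
    results = []
    for w in candidates:
        pairs = match(treated, controls, w, scales)
        results.append((imbalance(pairs, scales), w, pairs))
    equal_loss = results[0][0]
    best_loss, best_w, best_pairs = min(results, key=lambda r: r[0])
    # Effect on the treated: mean outcome gap within matched pairs.
    att = statistics.mean([ty - cy for _, ty, _, cy in best_pairs])
    return equal_loss, best_loss, best_w, att


equal_loss, best_loss, best_w, att = search_weights(make_data())
print(f"imbalance with equal weights: {equal_loss:.3f}")
print(f"imbalance with best weights:  {best_loss:.3f}")
print(f"estimated effect on the treated: {att:.2f}")
```

The balance measure never looks at the outcome, so weights are chosen on covariate balance alone and the effect estimate is computed only after the search ends. A real analysis would rely on the GenMatch implementation and its own balance criteria rather than this sketch.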

The workshop will be primarily a demonstration with software and handouts provided. Participants will practice writing sample code on their own laptops in small groups or on butcher paper. The workshop presenter has completed a full-year course in causal inference taught by one of the authors of GenMatch.
