Purpose: Social work research increasingly relies on survey data collected under complex sampling designs. For instance, researchers analyze the Panel Study of Income Dynamics (PSID) to evaluate welfare reform; the National Survey of Child and Adolescent Well-being (NSCAW) to evaluate the well-being of children receiving protective services and substitute care; and the Study of Asset and Health Dynamics Among the Oldest Old (AHEAD) to study the health and mental health of the elderly. The crucial features of these datasets are that the samples were drawn using a multi-stage stratified design, are unequally weighted, and are clustered.
In NSCAW, sample weights were created to account for differential selection probabilities within primary sampling units (PSUs) and to compensate for missing months in sampling frames, special situations that occurred at specific PSUs, and potential bias due to non-response. Clustering refers to the fact that PSUs are nested within sampling strata, subjects are nested within PSUs, and repeated measures over multiple time points are nested within subjects. Analyzing such data requires special treatment: controlling for clustering (e.g., the correlation of subjects within a PSU) and applying appropriate weights so that statistical inferences generalize to the entire population of unsubstantiated cases of maltreatment in the U.S.
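To make the roles of weights and clustering concrete, the following is a minimal sketch of a design-weighted mean with a Taylor-linearized standard error. It is not the NSCAW or SUDAAN procedure itself. It assumes PSUs are sampled with replacement within strata (no finite population correction) and at least two PSUs per stratum; the function name `weighted_mean_with_se` and the toy records are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(records):
    """Return (weighted mean, linearized standard error).

    records: iterable of (stratum, psu, weight, y) tuples.
    Assumes with-replacement sampling of PSUs within strata.
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_w

    # Linearized score of each subject, summed to the PSU level so that
    # within-PSU correlation is absorbed into the PSU totals.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for stratum, psus in psu_totals.items():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        z_bar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in psus.values())

    return mean, sqrt(variance)


# Made-up illustration data: (stratum, psu, weight, outcome)
toy_records = [
    (1, "A", 2.0, 1.0), (1, "A", 2.0, 3.0),
    (1, "B", 1.0, 5.0), (1, "B", 1.0, 7.0),
    (2, "C", 3.0, 2.0), (2, "C", 3.0, 4.0),
    (2, "D", 1.0, 6.0), (2, "D", 1.0, 8.0),
]
toy_mean, toy_se = weighted_mean_with_se(toy_records)
```

With equal weights, a single stratum, and one subject per PSU, this estimator reduces to the familiar standard error of a simple random sample mean, which is a useful sanity check on the design specification.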
Based on the presenters’ research experiences with NSCAW, this workshop will demonstrate the use of two software packages for statistical analysis with complex sampling: (1) SUDAAN, the most comprehensive program designed specifically for analyzing complex survey data; and (2) Mplus, the only package among existing software programs for structural equation modeling (i.e., AMOS, LISREL, EQS, and Mplus) that handles weights and corrects for clustering.
Contents: The workshop will focus on the following topics:
1. Introduction to key concepts in the analysis of complex sample designs, including the generalized estimating equation (GEE) method (Zeger & Liang, 1986), the Huber-White sandwich variance estimator based on Taylor expansion (Muthén & Satorra, 1995; Lin, 1994), weighting procedures in statistical inference, the various types of weights, and the choice of weights under different research settings;
2. Description of key specifications in SUDAAN (i.e., SORT, NEST, DESIGN, and WEIGHT), and running “Proc Descript” and “Proc Crosstab” to conduct descriptive analysis;
3. Running SUDAAN “Proc Regress” to conduct regression analysis with GEE for both cross-sectional and longitudinal data;
4. Running SUDAAN “Proc Survival” to conduct event history analysis with a marginal proportional hazards model (i.e., the LWA model); and
5. Running Mplus to conduct structural equation modeling that controls for complex sample effects and unequal weights.
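For orientation on the first topic, the sandwich variance estimator used with GEE can be written in its generic textbook form as follows. This is a sketch under simplifying assumptions, not the exact expression implemented in SUDAAN or Mplus: sampling weights and stratification are omitted, $c$ indexes clusters (e.g., PSUs), $\mu_c$ is the vector of fitted means for cluster $c$, $D_c = \partial \mu_c / \partial \beta$, and $V_c$ is the working covariance matrix.

```latex
\widehat{\operatorname{Var}}(\hat{\beta}) = A^{-1} B A^{-1},
\qquad
A = \sum_{c} D_c^{\top} V_c^{-1} D_c,
\qquad
B = \sum_{c} D_c^{\top} V_c^{-1} (y_c - \mu_c)(y_c - \mu_c)^{\top} V_c^{-1} D_c .
```

The outer factors $A^{-1}$ come from the assumed working model, while the middle factor $B$ is built from the observed within-cluster residuals, which is why the estimator remains consistent even when the working correlation is misspecified.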
Pedagogical Techniques: Teaching methods include lecture, PowerPoint presentation, and computer demonstration.
References:
Lin, D. Y. (1994). Cox regression analysis of multivariate failure time data: The marginal approach. Statistics in Medicine, 13, 2233-2247.
Muthén, B., & Satorra, A. (1995). Complex sample data in structural equation modeling. In P. V. Marsden (Ed.), Sociological Methodology (pp. 267-316). Washington, DC: American Sociological Association.
Zeger, S., & Liang, K. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42, 121-130.