Session: Caveats in Running Logistic Regression and Other Nonlinear Models (Society for Social Work and Research 28th Annual Conference - Recentering & Democratizing Knowledge: The Next 30 Years of Social Work Science)

All in-person and virtual presentations are in Eastern Standard Time (EST).

337 Caveats in Running Logistic Regression and Other Nonlinear Models

Sunday, January 14, 2024: 11:30 AM-1:00 PM
Congress, ML 4 (Marriott Marquis Washington DC)
Shenyang Guo, PhD, Washington University in Saint Louis
Yuanyuan Yang, MPA, Washington University in Saint Louis and Linyun Fu, MSW, University of Chicago
Background and Purpose: Logistic regression and other types of nonlinear models (e.g., ordered logistic regression, multinomial logit model, Poisson regression, negative binomial regression, and Cox regression) have been widely applied in social work research to address complicated research questions. Because these models are not "linear" per se (i.e., they are so-called "generalized linear models") and employ a maximum likelihood estimator, special measures must be taken to handle the nonlinear issues embedded in the data. Failure to address these issues leads to biased and inefficient findings. This workshop discusses three caveats underscored by eminent scholars but often ignored in empirical practice. Using social work examples, the workshop calls for exercising caution when running these types of nonlinear models.

Methods: We discuss three issues in this workshop. (1) Odds ratios versus predicted probabilities. Although using odds ratios to interpret logit and similar models is very common, the method is rarely sufficient for understanding the results of the model. Logistic regression is essentially a nonlinear model, and the relation between a predictor and the cumulative distribution function (i.e., the probability) is approximately linear only in the probability range of about 0.2 to 0.8. As such, the odds-ratio interpretation of results outside this range is misleading. "We strongly prefer methods of interpretation that are based on predicted probabilities" (Long & Freese, 2014, Regression Models for Categorical Dependent Variables Using Stata, p. 227). (2) Wald test versus likelihood ratio (LR) test. Tests of statistical significance based on the Wald test may not always produce the same findings as those based on the LR test; as such, researchers must employ both methods to draw conclusions about statistically significant predictors (Guo, 2013, Maximum Likelihood Estimator: the Untold Stories, Caveats, and Tips for Application). (3) Single- versus multiple-parameter tests. To address important research questions, a statistical test focusing on a single parameter is often insufficient, whereas a test involving multiple parameters based on so-called linear contrasts is highly recommended (Hosmer et al., 2008, Applied Survival Analysis: Regression Modeling of Time-to-Event Data).
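
As a small illustration of point (1), the sketch below applies one fixed odds ratio to several baseline probabilities and reports the resulting change in predicted probability. The odds ratio of 2.0 and the baseline probabilities are hypothetical values chosen for illustration, not estimates from any study discussed in the workshop, and the helper name `probability_change` is likewise an assumption.

```python
def probability_change(baseline_p, odds_ratio):
    """Change in predicted probability when the odds are multiplied by odds_ratio.

    baseline_p must lie strictly between 0 and 1.
    """
    baseline_odds = baseline_p / (1.0 - baseline_p)
    new_odds = baseline_odds * odds_ratio
    new_p = new_odds / (1.0 + new_odds)
    return new_p - baseline_p


# Hypothetical odds ratio; the same multiplicative effect on the odds
# is applied at every baseline probability below.
ODDS_RATIO = 2.0

for baseline in (0.02, 0.20, 0.50, 0.80, 0.98):
    delta = probability_change(baseline, ODDS_RATIO)
    print(f"baseline p = {baseline:.2f} -> change in p = {delta:+.4f}")
```

Under these assumptions, the same odds ratio corresponds to a sizable change in probability near the middle of the range and to a very small change near 0 or 1, which is one way to see why an odds ratio alone can be hard to interpret.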

Results: (1) A study evaluating the effectiveness of a social emotional learning (SEL) intervention program showed that results based on predicted probabilities overcame the limitations of odds ratios and clearly revealed that children in the SEL group had higher probabilities of "getting better" than children in the control group. (2) A study of the determinants of the timing of adopting nonpharmaceutical mitigation interventions to fight the COVID pandemic in the United States confirmed that minority and vulnerable populations suffered most severely from the pandemic, an important finding supported by both the Wald and LR tests. (3) A study testing research hypotheses about the adverse impacts of welfare reform on the hazard rates of reunification for children placed in foster care indicated that multi-parameter tests produced stronger results than single-parameter tests alone.
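
To show why agreement between the Wald and LR tests (caveat 2 in Methods) is worth checking, here is a minimal sketch using a toy one-parameter problem: testing whether a binomial proportion equals 0.5. This is a deliberately simplified stand-in, not the logistic regression models or the data from the studies above; the counts (7 successes in 9 trials) and all function names are hypothetical.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def wald_statistic(successes, n, p0):
    """Wald statistic for H0: p = p0, with the variance evaluated at the MLE."""
    p_hat = successes / n
    variance = p_hat * (1.0 - p_hat) / n
    return (p_hat - p0) ** 2 / variance


def lr_statistic(successes, n, p0):
    """Likelihood ratio statistic for H0: p = p0 (requires 0 < successes < n)."""
    p_hat = successes / n
    failures = n - successes
    loglik_mle = successes * math.log(p_hat) + failures * math.log(1.0 - p_hat)
    loglik_null = successes * math.log(p0) + failures * math.log(1.0 - p0)
    return 2.0 * (loglik_mle - loglik_null)


# Hypothetical small sample: 7 successes in 9 trials, null value p0 = 0.5.
wald = wald_statistic(7, 9, 0.5)
lr = lr_statistic(7, 9, 0.5)
print(f"Wald: stat = {wald:.3f}, p = {chi2_1df_pvalue(wald):.3f}")
print(f"LR:   stat = {lr:.3f}, p = {chi2_1df_pvalue(lr):.3f}")
```

With these hypothetical counts, the Wald p-value falls below 0.05 while the LR p-value does not, so the two tests would lead to different conclusions at the conventional 5% level. In small samples like this one, running both tests makes such disagreement visible.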

Conclusion and Implications: Whenever possible, researchers running nonlinear models should exercise caution and take remedial measures to ensure that findings are robust and that the statistical analysis is rigorous.
