Abstract: Enhancing Treatment Outcome Predictions in Substance Use Programs: A Comparative Study of a Gradient-Boosted Bootstrap Model and Logistic Regression (Society for Social Work and Research 30th Annual Conference Anniversary)

Schedule:

Friday, January 16, 2026

Marquis BR 10, ML 2 (Marriott Marquis Washington DC)

* noted as presenting author

Graham Zulu, MSW, Research Associate, University of Denver, Denver, CO

Background: Accurately predicting treatment completion is vital to improving engagement and outcomes in substance use disorder (SUD) programs. Traditional models, such as logistic regression, often struggle to capture complex behavioral patterns that affect recovery trajectories. Novel ensemble methods may improve predictive performance and support more targeted, equitable interventions.

Objective: This study evaluated whether a Gradient-Boosted Bootstrap Model (GBBM), a hybrid machine learning technique combining bootstrapping and gradient boosting, could outperform logistic regression in predicting treatment completion among adults in SUD programs.

Methods: We analyzed data from 1,158 adults receiving treatment for substance use. Predictors included baseline demographic and drug use variables; missing values were imputed with the median, and categorical variables were dummy-coded. Ten gradient-boosted models were each trained on a bootstrapped sample of the training data, and their predictions were averaged. Performance was assessed on a held-out test set using accuracy, precision, recall, F1-score, and ROC AUC. Analyses were conducted in Stata 18 with Python integration.
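The procedure described in the Methods can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the study dataset, and picks its own 75/25 split, default boosting hyperparameters, and 0.5 decision threshold, none of which are stated in the abstract. Medians are computed on the training split only, which is also an assumption. Dummy-coding is omitted because the synthetic features are already numeric.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic stand-in for the study data (assumption: 12 numeric predictors).
X, y = make_classification(n_samples=1158, n_features=12,
                           n_informative=6, random_state=0)

# Blank out about 5% of values so there is something to impute.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Hold out a test set (assumption: 25%, stratified on the outcome).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Median imputation, with medians taken from the training split.
medians = np.nanmedian(X_train, axis=0)
X_train = np.where(np.isnan(X_train), medians, X_train)
X_test = np.where(np.isnan(X_test), medians, X_test)

# Train ten gradient-boosted models, each on its own bootstrap sample.
N_MODELS = 10
models = []
for seed in range(N_MODELS):
    X_boot, y_boot = resample(X_train, y_train, replace=True,
                              random_state=seed)
    model = GradientBoostingClassifier(random_state=seed)
    models.append(model.fit(X_boot, y_boot))

# Average the predicted probabilities across the ten models.
proba = np.mean([m.predict_proba(X_test)[:, 1] for m in models], axis=0)
pred = (proba >= 0.5).astype(int)

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "precision": precision_score(y_test, pred),
    "recall": recall_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "roc_auc": roc_auc_score(y_test, proba),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The printed values describe the synthetic data only and say nothing about the study's results.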

Findings: GBBM outperformed logistic regression across most metrics: recall (0.875 vs. 0.836), precision (0.672 vs. 0.665), and F1-score (0.760 vs. 0.741), indicating an improved ability to identify individuals who completed treatment. The two models performed similarly in identifying non-completers. Permutation importance analysis identified the key predictors of treatment completion. The most influential were receiving disability income and having no co-occurring mental health disorder, which points to the roles of social stability and behavioral health complexity. Other influential factors included injection drug use, age at first use, and education level. Contextual variables such as unemployment, lack of income, and prescription drug sourcing also contributed meaningfully.
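As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values given for each model. The short sketch below does that arithmetic; the function and variable names are illustrative only.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) as given in the Findings.
reported = {
    "GBBM": (0.672, 0.875, 0.760),
    "Logistic regression": (0.665, 0.836, 0.741),
}

for name, (p, r, f1_reported) in reported.items():
    print(f"{name}: computed F1 = {f1(p, r):.3f}, "
          f"reported F1 = {f1_reported:.3f}")
```

For both models the recomputed F1 matches the reported value to three decimal places.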

Conclusions and Implications: Machine learning models like GBBM may enhance recovery-oriented systems of care by identifying individuals at higher risk of dropout and informing tailored retention strategies. These methods may help improve outcomes in substance use treatment, particularly in under-resourced settings, by guiding more equitable and data-driven allocation of services.