Abstract: Predicting Youth at High Risk of Aging out of Foster Care Using Machine Learning Methods (Society for Social Work and Research 26th Annual Conference - Social Work Science for Racial, Social, and Political Justice)

Schedule:

Sunday, January 16, 2022

Marquis BR Salon 12, ML 2 (Marriott Marquis Washington, DC)

* noted as presenting author

Eunhye Ahn, MSW, PhD Student, University of Southern California, Los Angeles, CA

Yolanda Gill, PhD, Professor, University of Southern California, Marina Del Rey, CA

Emily Putnam-Hornstein, PhD, John A. Tate Distinguished Professor for Children in Need, University of North Carolina at Chapel Hill, Chapel Hill, NC

Objective: Youth who age out of, or emancipate from, foster care without permanency by age 18 are at increased risk of experiencing difficulties during their transition to adulthood. Although general support programs have been implemented, it is unknown whether earlier, more targeted, and more proactive identification of youth at heightened risk of exiting care without permanency would improve both the services provided and the outcomes achieved. In the current study, we explored whether algorithmic approaches, using machine learning methods applied to historical child protection and welfare records, might help child protection agencies better identify and serve youth at high risk of not having established legal permanency by age 18. Importantly, we also describe how metrics of fairness and bias from the computer science literature relate to this specific use case.

Methods: For youth placed in foster care between ages 12 and 14, we assessed the risk of exiting care without permanency by age 18, 4 to 6 years before their exit, based on their history of child welfare service involvement. To develop predictive risk models, we used various machine learning algorithms and 28 years (1991–2018) of child welfare service records from California. Model performance was evaluated using F1 score, AUC, precision, and recall. Model fairness was assessed using calibration, predictive parity, and error rate balance.
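To make the evaluation terms above concrete, the following is a minimal sketch, not the authors' code, of how precision, recall, and F1 can be computed from binary predictions, and how predictive parity and error rate balance can be compared across groups. The function names, the toy labels, and the group codes "A" and "B" are all hypothetical. AUC and calibration are left out because they require predicted risk scores rather than binary flags.

```python
from collections import defaultdict


def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and their harmonic mean (F1)."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def group_fairness(y_true, y_pred, groups):
    """Per-group rates used by two of the fairness criteria.

    Predictive parity compares positive predictive value (ppv) across groups.
    Error rate balance compares false positive (fpr) and false negative (fnr)
    rates across groups.
    """
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    report = {}
    for g, (ts, ps) in by_group.items():
        tp, fp, fn, tn = confusion(ts, ps)
        report[g] = {
            "ppv": tp / (tp + fp) if tp + fp else None,
            "fpr": fp / (fp + tn) if fp + tn else None,
            "fnr": fn / (fn + tp) if fn + tp else None,
        }
    return report


# Hypothetical toy data: 1 = exited care without permanency, 0 = did not.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(precision_recall_f1(y_true, y_pred))
print(group_fairness(y_true, y_pred, groups))
```

In this toy data the two groups differ on every rate, so neither predictive parity nor error rate balance would hold; in practice the comparison would be made on held-out records with an agreed tolerance.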

Results: The gradient boosting decision tree and random forest models showed the best performance (F1 score = .54 to .55, precision = .62, recall = .49). Half of all youth who were observed to exit care without permanency were among the top 30% of youth the model identified as high risk, with a 39% error rate. Although error rates were imbalanced between Black and White youth, indicating a racial disparity, calibration and predictive parity were satisfied.
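As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded values in the abstract; the function and variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the abstract.
reported_precision = 0.62
reported_recall = 0.49

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.55
```

The result falls within the reported F1 range of .54 to .55.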

Discussion: Our findings illustrate how potential applications of predictive analytics, including those designed to achieve universal goals of permanency through more targeted allocation of resources, can be tested. Our results are promising in that even a simple predictive machine learning model built on limited information extracted from existing administrative data, such as the one developed here, could be useful for early identification of youth at risk. In addition, the algorithmic fairness analysis indicates that model performance varied by the race of the youth, which can be attributed to racial differences in the rate at which Black and White children exit care without permanency.