Abstract: Most Important Predictors of Father-Child Contact in the U.S. Child Welfare System: A Machine Learning Approach (Society for Social Work and Research 30th Annual Conference Anniversary)

Schedule:
Friday, January 16, 2026
Marquis BR 10, ML 2 (Marriott Marquis Washington DC)
* noted as presenting author
Joyce Lee, PhD, Assistant Professor, Ohio State University, Columbus, OH
Garrett Pace, PhD, Assistant Professor, University of Nevada, Las Vegas, NV
Keunhye Park, PhD, Assistant Professor, Michigan State University, MI
Hunmin Cha, MSW, PhD Student, Ohio State University, OH
Yujeong Chang, MSW, PhD Student, Ohio State University, OH
Amy Xu, MSW, PhD Student, Ohio State University, OH
Background and Purpose: Fathers are likely to have a positive impact on child welfare cases and child outcomes. However, little attention has been paid to comprehensively understanding multisystem-level factors that promote father involvement in the U.S. child welfare system. Novel methods such as machine learning can advance this research. Guided by the ecological systems framework for understanding father-child relationships (Volling & Cabrera, 2019), this exploratory study aimed to identify the most important predictors of father-child contact within the U.S. child welfare system. Specifically, the following research questions were asked: 1) What variables at the individual (father and child), family, neighborhood, and child welfare system levels are most predictive of father-child contact among families in the child welfare system? and 2) What is the directionality of bivariate correlations between the most important predictors and father-child contact?

Methods: Data were from the National Survey of Child and Adolescent Wellbeing-Third Cohort (NSCAW-III), a nationally representative dataset of children and families impacted by the child welfare system. The sample included cases with valid birth father-child contact data (N = 2,380). The outcome was father-child contact in the past 12 months. A total of 124 predictors spanning multiple system levels (individual [e.g., father, child, caseworker], family, neighborhood, and child welfare system) were included. Random forest, a machine learning approach, was used for the main analysis to rank predictors by importance. Bivariate correlations were then computed to explore the direction of the association between each top predictor and the outcome.
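As an illustration of the general technique only, the following minimal sketch ranks predictors with a random forest on synthetic data. It is not the authors' analysis: the abstract does not state the software, the importance measure, or the scaling that were used. The sketch assumes scikit-learn's impurity-based importances and assumes that scores are divided by the largest score so that the top predictor equals 1.00. The function name, feature names, and data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def scaled_importances(X, y, feature_names, n_trees=200, seed=0):
    """Fit a random forest and return (name, importance) pairs, highest first.

    Importances are impurity-based (scikit-learn's feature_importances_) and
    rescaled so the top predictor scores 1.0. Both choices are assumptions
    made for this sketch.
    """
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    raw = forest.feature_importances_
    scaled = raw / raw.max()
    order = np.argsort(scaled)[::-1]  # indices from most to least important
    return [(feature_names[i], float(scaled[i])) for i in order]


# Synthetic stand-in data: four binary indicators, where only the first one
# drives the (binary) outcome. None of this comes from NSCAW-III.
rng = np.random.default_rng(0)
n_cases = 500
X = rng.integers(0, 2, size=(n_cases, 4)).astype(float)
y = (X[:, 0] + 0.1 * rng.normal(size=n_cases) > 0.5).astype(int)
names = ["indicator_a", "indicator_b", "indicator_c", "indicator_d"]

ranking = scaled_importances(X, y, names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Importance scores of this kind indicate how much a variable contributes to prediction, not whether it raises or lowers the outcome, which is why a separate check of directionality is needed.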

Results: Fathers in the sample were socioeconomically disadvantaged. Most had a high school education or less (88%), were unemployed (67%), and were non-resident with their children (92%). They were racially and ethnically diverse (36% Black, 33% White, 25% Hispanic, 5% other). The mean age of the children was 6.55 years (SD = 5.91). Most families (70%) reported father-child contact in the past 12 months. Random forest results showed that among the top 20 predictors, fathers’ sociodemographic characteristics were particularly important. For example, missing data on fathers’ race/ethnicity (importance score = 1.00), fathers being White (importance score = 0.84), and fathers having an associate’s degree (importance score = 0.57) were some of the most important predictors of father-child contact. Additional key predictors included caseworkers’ ability to speak another language (importance score = 0.66), child welfare systems’ referral of families to health services (importance score = 0.61), child being a girl (importance score = 0.49), and families' access to food (importance score = 0.45). Top predictors generally showed weak negative correlations with father-child contact.
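The directionality check can be sketched in a similar way. The abstract does not name the correlation coefficient used; the sketch below assumes a Pearson correlation between a 0/1 predictor and the 0/1 contact outcome (equivalent to the phi coefficient for two binary variables). The helper names and the toy values are hypothetical and are not drawn from the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def direction(r):
    """Label the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Toy example: a binary predictor and binary father-child contact (1 = contact).
predictor = [1, 1, 1, 0, 0, 0, 0, 0]
contact = [0, 1, 0, 1, 1, 1, 0, 1]

r = pearson_r(predictor, contact)
print(f"r = {r:.2f}, direction = {direction(r)}")
```

In this toy example the predictor is associated with less contact, so the coefficient is negative.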

Conclusion and Implications: Factors at different system levels—individual, family, and child welfare system—interact in complex ways to predict father-child contact, suggesting the need for multisystemic interventions to connect fathers with their children. The prominence of certain predictors, including missing race/ethnicity data, also highlights underlying reporting challenges or measurement gaps in child welfare data. Applying machine learning to child welfare data has advantages over traditional methods, including the ability to handle large datasets and model complex, interactive, and non-linear relationships. These capabilities can help uncover hidden patterns that may better inform child welfare practice and policy.