Background and Purpose:
Child abuse and neglect (CAN) is a pervasive public health issue, with over 700,000 children affected annually in the U.S. Emergency departments (EDs) play a critical role in CAN identification, yet existing practices are often influenced by implicit biases, particularly against socio-economically disadvantaged and racially marginalized populations. Previous machine learning models leveraging electronic health records (EHRs) have not adequately addressed these disparities. This study aimed to develop and evaluate a machine learning-based model that incorporates clinical workflow data and the Area Deprivation Index (ADI) to detect suspected CAN in pediatric ED visits, with the goal of reducing socio-economic bias in reporting to child protective services (CPS).
Methods:
This retrospective case–control study analyzed EHR data from a single pediatric ED in the northeastern U.S., covering 33,961 patient visits in 2018. Suspected CAN cases (n = 74) were identified through natural language processing (NLP) of clinical notes, followed by expert validation. Data sources included structured clinical orders (labs, consults, medications) and unstructured clinical notes. The Area Deprivation Index, a community-level measure of socio-economic disadvantage, was used to stratify patients by socio-economic status. Predictive models were developed using LASSO (least absolute shrinkage and selection operator). Model performance was assessed using precision, recall, and F1 score. SHAP (SHapley Additive exPlanations) values were calculated to interpret key predictors.
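As a minimal illustration of the three evaluation metrics named above, the following Python sketch computes precision, recall, and F1 from binary labels. The function name `precision_recall_f1` and the toy labels are assumptions made for illustration only; they are not the study's code or data.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels.

    Label 1 stands for a visit flagged as suspected CAN, 0 for any other visit.
    Hypothetical helper for illustration; not the study's implementation.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels (invented): 4 true cases, 3 of them detected, 1 false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

With a rare outcome such as the 74 cases among 33,961 visits reported here, these metrics are more informative than overall accuracy, because a model that never flags a visit would still be correct for almost every visit.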
Results:
The integrated LASSO model (structured + unstructured data) achieved a precision of 0.83, recall of 0.80, and F1 score of 0.82. When stratified by ADI, the model achieved a precision of 0.80 for high-deprivation (lower socio-economic status) patients and 0.81 for low-deprivation patients. Key predictors of CPS reporting included consult orders, nurse-specific orders, oral medications, discharge orders, and ED length of stay. Feature impact differed between socio-economic strata: for children from high-deprivation areas, reporting was more strongly associated with a greater number of orders and longer ED stays, whereas low-deprivation patients were more often reported in the context of more acute presentations.
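The stratified comparison above can be sketched as a per-group precision calculation. The Python sketch below assumes each visit carries a deprivation stratum label; the function name `precision_by_stratum`, the stratum labels, and the records are hypothetical and do not reproduce the study's data or results.

```python
from collections import defaultdict


def precision_by_stratum(records):
    """Compute precision separately for each stratum.

    `records` is an iterable of (stratum, y_true, y_pred) tuples with binary
    labels. Strata with no positive predictions are omitted, since precision
    is undefined for them. Hypothetical helper for illustration only.
    """
    tp = defaultdict(int)  # true positives per stratum
    fp = defaultdict(int)  # false positives per stratum
    for stratum, truth, pred in records:
        if pred == 1:
            if truth == 1:
                tp[stratum] += 1
            else:
                fp[stratum] += 1
    strata = set(tp) | set(fp)
    return {s: tp[s] / (tp[s] + fp[s]) for s in strata}


# Invented toy records: (ADI stratum, true label, predicted label).
records = (
    [("high", 1, 1)] * 4 + [("high", 0, 1)] + [("high", 0, 0)] * 3
    + [("low", 1, 1)] * 3 + [("low", 0, 1)] + [("low", 1, 0)]
)

result = precision_by_stratum(records)
print(result)
```

Comparing such per-stratum values side by side is one simple way to check whether a model flags visits with similar reliability across socio-economic groups.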
Conclusions and Implications:
This study presents a novel, interpretable machine learning model for identifying suspected child abuse and neglect in pediatric ED settings, incorporating clinical workflow data and socio-economic context. By using ADI instead of race or ethnicity, the model offers a more equitable approach to risk prediction and reduces the risk of perpetuating systemic bias. The findings have significant implications for improving clinical decision-support tools and guiding the ethical implementation of AI in healthcare. Future research should validate these models across multiple healthcare settings and further explore integrating social determinants of health to enhance early identification and intervention strategies.