Methods: Two sources of data from California were linked to create the analytic dataset: vital birth records and CPS records. Birth records from 2001, 2006, 2007 and 2012 were probabilistically linked to CPS records, and children were then classified according to whether they were reported for alleged maltreatment before 5 years of age. Four models, Decision Trees (DTs), Generalized Additive Models (GAMs), Random Forests (RFs), and Deep Neural Networks (DNNs), were applied to determine whether they could outperform linear regression (LR). DTs and GAMs were selected for their interpretability: DTs can be rendered as flowcharts, and GAMs yield main effect plots and statistical interaction plots.
Results: Using cross-validation, we obtained average accuracies (with AUCs in parentheses) of 71.60% [CI: 70.83, 72.65] (0.7791 [0.77, 0.78]) for LR, 70.48% [67.63, 72.49] (0.7624 [0.76, 0.77]) for DTs, 73.39% [67.78, 76.05] (0.7781 [0.77, 0.79]) for GAMs, 72.73% [72.05, 73.87] (0.7890 [0.78, 0.79]) for RFs, and 76.29% [73.57, 79.91] (0.7867 [0.78, 0.79]) for DNNs. DTs and GAMs both yielded sensible interpretations, emphasizing mother’s education, health insurance type, and nonimmigrant status as important risk predictors. In DTs, these predictors were identified from the nodes with the highest Gini impurity; in GAMs, they were identified from the main effect and interaction plots.
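For readers less familiar with the quantities reported above, the following is a minimal illustrative sketch of how accuracy, AUC, and binary Gini impurity can be computed for a set of held-out predictions. It is not the study's code: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration, and the pairwise AUC computation is chosen for readability rather than speed.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the 0/1 label.

    The 0.5 threshold is an assumption for illustration only.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def gini_impurity(labels: Sequence[int]) -> float:
    """Binary Gini impurity of a node: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical held-out fold: 1 = reported, 0 = not reported.
    toy_labels = [0, 0, 1, 1, 0, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80, 0.60, 0.70]
    print("accuracy:", accuracy(toy_labels, toy_scores))
    print("AUC:", auc(toy_labels, toy_scores))
    print("Gini impurity:", gini_impurity(toy_labels))
```

In a cross-validation setting, these functions would be applied to each held-out fold and the per-fold values averaged. Accuracy depends on the chosen threshold while AUC does not, which is one reason the two metrics can rank models differently.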
Conclusions/Implications: DNNs appear to have the potential to significantly outperform all other models. Although the confidence interval for DNN accuracy is wide, the DNN intervals for both accuracy and AUC are generally higher than those of LR. Among the interpretable models, DTs underperformed LR, whereas GAMs performed similarly; thus, GAMs could potentially be used in place of LR for their interpretability.