Data & Methods Foster care agencies that receive funding from CFCIP are mandated to report case-level data on youth and on the independent living services they provide. These services are entered biannually into the National Youth in Transition Database (NYTD), each coded as one of fifteen service types. We use NYTD data from fiscal year 2018 on 80,714 youth who were eligible to receive Chafee services. To address our research question, we develop, evaluate, and then interrogate a machine learning method that predicts how many services each youth would receive.
Results The machine learning method we develop predicts service frequency for youth significantly better than the statistical methods favored in the literature. This implies that more traditional statistical methods may fail to accurately explain the factors associated with service allocation. Our approach suggests that the critical predictors of service allocation fall primarily into four categories: the number of services a youth received in the previous year, the youth's age, the youth's length of time in care (controlling for age and other factors), and the state in which the youth resides. Theory-informed exploration of the latter two factors suggests important lines for future inquiry. Finally, we find that the predictive models we build, if used to make decisions, would not be equitable: they would allocate significantly fewer services to Black youth. This finding underscores the dangers of applying predictive modeling directly in a decision-making context.
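As a rough illustration of the kind of equity check described above, one simple approach is to compare the mean number of services a model predicts for each group of youth. The sketch below is an assumption-laden example, not the analysis used in this study: the function names, group labels, and toy values are all hypothetical, and a fuller audit would also test for statistical significance and account for covariates.

```python
from collections import defaultdict
from statistics import mean


def mean_prediction_by_group(predictions, groups):
    """Return the mean predicted service count for each group label."""
    by_group = defaultdict(list)
    for pred, group in zip(predictions, groups):
        by_group[group].append(pred)
    return {group: mean(values) for group, values in by_group.items()}


def allocation_gap(predictions, groups, group_a, group_b):
    """Difference in mean predicted services: group_a minus group_b."""
    means = mean_prediction_by_group(predictions, groups)
    return means[group_a] - means[group_b]


# Toy, made-up values for illustration only; these are NOT NYTD data.
toy_predictions = [3.0, 5.0, 2.0, 4.0]
toy_groups = ["A", "B", "A", "B"]

gap = allocation_gap(toy_predictions, toy_groups, "A", "B")
print(gap)  # negative: group "A" is predicted fewer services than "B"
```

A negative gap for a group would indicate that a model used directly for allocation would assign that group fewer services on average, which is the pattern of concern reported here.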
Conclusions and Implications There is robust debate in the child welfare literature concerning the use of predictive analytics. On the one hand, we demonstrate that algorithms that maximize fit may decrease equity if used to allocate services. On the other hand, predictive analytics need not be applied only to make decisions. They can also be used to (re-)illuminate these inequalities in ways that highlight structural patterns and call for the linking of new theories to our problems. Our work shows the benefits of an approach that interweaves exploratory work using computational methods with extant theory. This approach can also help explore data for best-practice implications. For instance, relationships among multiple variables such as state, age, and service types, and the impact of these on an outcome such as homelessness, may help point us to models of policy-practice success that deserve further exploration. A practice theory lens is important for knowing where to start such investigations.