Abstract: A Comparison of Classification Tree and Logistic Regression Modeling to Predict Educational Involvement for Youth Aging out of Foster Care (Society for Social Work and Research 23rd Annual Conference - Ending Gender Based, Family and Community Violence)

Schedule:
Friday, January 18, 2019: 4:00 PM
Golden Gate 3, Lobby Level (Hilton San Francisco)
Kevin White, PhD, Assistant Professor, East Carolina University, Greenville, NC
Background/Purpose: Numerous studies have documented poor educational outcomes for older youth who age out of foster care in the U.S. For example, foster youth who age out are almost three times more likely to drop out of high school than their peers, and are significantly less likely to be enrolled in school at least five years after foster care exit. Much of the previous research on the educational engagement of older foster youth has been descriptive, and more rigorous studies are needed. Methods designed for data mining of large datasets, such as classification tree modeling, can be used to develop predictive models from administrative and survey data. This study used several classification tree models and logistic regression to explore data obtained from the National Youth in Transition Database (NYTD), a nationally representative survey of older youth in foster care.

Methods: NYTD survey data were obtained for two survey cohorts: youth who turned 17 in foster care during 2011 and 2014, respectively. Twenty-five predictor variables from wave 1 (age 17) that related to youth socio-demographics, services, and well-being were included in predictive models. The outcome was a dichotomous variable that indicated whether a youth was enrolled in and attending high school, GED, or postsecondary school at wave 2 (age 19). Listwise deletion was used for missing data. The final analysis dataset of 4000 youth was randomly divided into a training dataset (n = 1986) and a testing dataset (n = 2014) to test the successful prediction rates of models. Descriptive statistics of the overall sample were first examined. Next, eight models were estimated to examine the relationship between predictor variables and the outcome variable: four tree models, three conditional inference tree models, and a stepwise logistic regression model.
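
The train/test comparison described above can be sketched roughly as follows. This is an illustration only, not the study's code: the data are synthetic stand-ins for the NYTD variables, scikit-learn is an assumed tool, the function name `compare_models` and the tree depth are hypothetical, and plain (not stepwise) logistic regression is used. Conditional inference trees are omitted because scikit-learn does not provide them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(n_youth=4000, n_predictors=25, seed=0):
    """Fit a tree and a logistic regression; return test-set prediction rates."""
    rng = np.random.default_rng(seed)
    # Synthetic binary predictors standing in for the 25 wave 1 variables.
    X = rng.integers(0, 2, size=(n_youth, n_predictors))
    # Synthetic dichotomous outcome; the coefficients are arbitrary assumptions.
    logit = 0.2 - 0.6 * X[:, 0] - 0.4 * X[:, 1] + 0.5 * X[:, 2]
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

    # Random split matching the reported sizes: 1986 training, 2014 testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=2014, random_state=seed
    )

    models = {
        "classification_tree": DecisionTreeClassifier(max_depth=4, random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    # Prediction rate = share of testing cases classified correctly.
    rates = {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }
    return rates, len(y_train), len(y_test)


if __name__ == "__main__":
    rates, n_train, n_test = compare_models()
    print(n_train, n_test, rates)
```

Scoring each model on held-out cases rather than on the training data is what makes the prediction rates comparable across model types.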

Results: Descriptive results for the analysis sample indicated that about 94% and 54% of youth reported enrollment/attendance in school at ages 17 and 19, respectively. Successful prediction rates were similar across all eight models (58-61%). The classification tree model developed using the smallest complexity parameter (0.001) and pruned to eliminate the least important splits had the highest prediction success. Several youth variables at age 17 were related to educational enrollment/attendance at age 19, including previous incarceration, white race, previous substance abuse referral, and whether the youth had a connection with an adult. Results also suggested several local interactions.
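
As a rough illustration of growing a large tree and then pruning weak splits, the sketch below uses scikit-learn's cost-complexity pruning on synthetic data. The software, the function name `prune_demo`, and the data are assumptions, and scikit-learn's `ccp_alpha` may be scaled differently from the complexity parameter reported above, so the value 0.001 here is not numerically comparable.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def prune_demo(ccp_alpha=0.001, seed=0):
    """Compare an unpruned tree with a cost-complexity-pruned tree."""
    rng = np.random.default_rng(seed)
    # Synthetic binary predictors and outcome (arbitrary assumed effects).
    X = rng.integers(0, 2, size=(2000, 10))
    logit = 0.2 - 0.8 * X[:, 0] + 0.6 * X[:, 1]
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

    # Fully grown tree: no penalty on the number of leaves.
    full = DecisionTreeClassifier(random_state=seed).fit(X, y)
    # Pruned tree: subtrees whose impurity reduction does not justify
    # their added leaves at this penalty are collapsed.
    pruned = DecisionTreeClassifier(random_state=seed, ccp_alpha=ccp_alpha).fit(X, y)

    # Impurity-based importance of each predictor in the pruned tree.
    names = [f"x{i}" for i in range(X.shape[1])]
    importances = dict(zip(names, pruned.feature_importances_))
    return full.get_n_leaves(), pruned.get_n_leaves(), importances


if __name__ == "__main__":
    n_full, n_pruned, importances = prune_demo()
    print(n_full, n_pruned)
    print(sorted(importances.items(), key=lambda kv: -kv[1])[:3])
```

The predictors that survive pruning with the largest importances play the role of the age 17 variables highlighted in the results, and a split that appears only within one branch is one way a tree suggests a local interaction.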

Conclusions/Implications: This study has implications for research, as it shows the potential for using supervised data mining techniques, such as classification trees, to better understand the experiences of youth who age out of foster care. These methods may be useful for analyzing large datasets and for clarifying complicated interactions. This study also supports recent federal policy efforts to expand support for youth who age out of foster care without a permanent family. Finally, this study suggests several targets for potential intervention. Notably, youth who have a history of incarceration and substance abuse treatment face unique risks for poor educational engagement, while youth who have a supportive adult in their lives may experience better educational outcomes.