Abstract: Applying Machine Learning to Human Resources Data: Predicting Job Turnover Among Community Mental Health Center Employees (Society for Social Work and Research 28th Annual Conference - Recentering & Democratizing Knowledge: The Next 30 Years of Social Work Science)

Schedule:

Friday, January 12, 2024

Liberty Ballroom N, ML 4 (Marriott Marquis Washington DC)

* noted as presenting author

Sadaaki Fukui, PhD, Associate Professor, Indiana University, IN

Wei Wu, PhD, Associate Professor, Indiana University - Purdue University, Indianapolis, IN

Jaime Greenfield, MS, Vice President of Operations, Places for People

Michelle Salyers, PhD, Professor, Indiana University - Purdue University, Indianapolis, IN

Gary Morse, PhD, Consultant, Places for People, Inc, Saint Louis, MO

Jennifer Garabrant, BSW, Program Manager, Indiana University-Purdue University Indianapolis, IN

Emily Bass, BA, Graduate Research Assistant, Indiana University-Purdue University, Indianapolis, IN

Eric Kyere, PhD, Assistant Professor, Indiana University, IN

Nathaniel Dell, PhD, Vice President of Knowledge Translation and Impact, Places for People, Inc, Saint Louis, MO

Background and purpose: Employee turnover continues to be a critical problem among community mental health workers, with annual turnover rates ranging from 25% to 60%. High turnover rates are devastating for mental health care systems, affecting organizations, employees, and the quality of care. Human resources (HR) departments collect extensive employee data that can be useful for predicting turnover. Yet these data are not often used to address turnover, for reasons such as varying or complex data formats that are difficult to handle with traditional analytics. The goal of the current study was to predict community mental health center employees’ turnover by applying machine learning (ML) methods to HR data collected in routine organizational management, and to evaluate the feasibility of the ML approaches.

Methods: Historical HR data were obtained from two community mental health centers (urban and rural areas). The first center contained 654 employee records, dating back to 2011 (336 cases left and 318 cases stayed at the time of data extraction), and the second center contained 894 employee records, dating back to 2017 (487 cases left and 407 cases stayed). The extracted HR data included age, gender, race, education level, marital status, exempt status, job type, position type, wage, past work years, work hours, job training hours, and characteristics of clients served by the employees (e.g., age, gender, mental health diagnosis). Random forest and Lasso regression models were trained to predict each employee’s probability of turnover within the following 12 months. Missing data were imputed by the k-nearest neighbor method. Five-fold cross-validation was used to evaluate performance on the following measures: overall prediction accuracy, specificity, sensitivity, and area under the curve (AUC). Variable importance measures were also calculated to facilitate the selection of important turnover predictors.
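As an illustration of the four evaluation measures named above, the following minimal Python sketch computes accuracy, sensitivity, specificity, and AUC from predicted turnover probabilities. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration, with 1 meaning the employee left and 0 meaning the employee stayed.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a probability threshold.

    y_true: 1 = left, 0 = stayed. The 0.5 default threshold is an assumption.
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of leavers correctly flagged
        "specificity": tn / (tn + fp),  # share of stayers correctly cleared
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen leaver receives a
    higher score than a randomly chosen stayer (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present in y_true.")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical held-out fold: invented labels and predicted probabilities.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    probs = [0.9, 0.7, 0.4, 0.3, 0.6, 0.1, 0.8, 0.2]
    print(classification_metrics(labels, probs))
    print("AUC:", auc(labels, probs))
```

In a five-fold cross-validation, these measures would be computed on each held-out fold and then summarized across folds. The pairwise AUC shown here is quadratic in the number of records, which is acceptable for datasets of a few hundred employees.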

Results: The results suggested a good level of turnover predictive accuracy, particularly with the random forest model (e.g., AUC > .8 and prediction accuracy close to or above .8) for both centers. The study also found that the ML methods could identify several important predictors of turnover (e.g., past work years, wage, work hours, age, job position, job training hours, and marital status) using historical HR data. The HR data extraction processes for ML applications were also evaluated as feasible.

Conclusions and implications: HR data management systems hold large amounts of historical data on variables that the literature often cites as reliable turnover predictors; however, such data are not always used to predict employee turnover. As ML applications to HR data accumulate across organizations, some findings (e.g., predictors, predictive patterns, turnover mechanisms) may prove generalizable across organizations and contribute to broader policy and workforce development efforts, while others may be more organization-specific and help HR and leadership retain employees at their own organization. The current study provides new insights and avenues for a data-driven, evidence-based turnover prediction strategy using existing HR data that are often under-utilized. Implications for how social work organizational leaders can incorporate data-driven decision-making to support employee retention will be discussed.