Abstract: Ethical and Methodological Concerns with Machine Learning Predicting Cognitive Functioning Among Hispanics, Blacks, and Whites in Later Life: Implications for Equity Research and Big Data (Society for Social Work and Research 27th Annual Conference - Social Work Science and Complex Problems: Battling Inequities + Building Solutions)


Ethical and Methodological Concerns with Machine Learning Predicting Cognitive Functioning Among Hispanics, Blacks, and Whites in Later Life: Implications for Equity Research and Big Data

Friday, January 13, 2023
Camelback A, 2nd Level (Sheraton Phoenix Downtown)
* noted as presenting author
Ernest Gonzales, PhD, MSSW, Associate Professor, New York University, New York, NY
Forrest Bao, Assistant Professor, Iowa State University, IA
Yi Wang, PhD, Assistant Professor, University of Iowa, Iowa City, IA
Cliff Whetung, MSW, PhD Student, New York University, New York, NY
Background and Purpose: Cognitive impairment is a worldwide epidemic. In the United States, racial and ethnic minorities carry a heavier burden of cognitive impairment than Whites. This inequity is persistent and large. Yet nearly a third of all dementia cases can be prevented, and equity is within reach. Longitudinal and experimental studies have identified important predictors that bolster cognitive functioning and brain structure. Machine learning (ML), however, is a novel statistical method that has rarely been used to predict cognitive functioning in later life. While this method holds tremendous promise to interrogate and confirm existing theory, it also raises significant ethical and methodological concerns within the context of structural racism. The objective of this study is to compare and contrast traditional statistical approaches with machine learning when identifying risk and protective factors for cognitive health among older Blacks, Hispanics, and Whites.

Methods: We use 14 years of data from the Health and Retirement Study (2006-2020), a large representative sample of older adults in the United States funded by the National Institute on Aging and the U.S. Social Security Administration. The total sample consisted of 15,385 older adults, of whom 76% identified as White, 14% as Black, and 9% as Hispanic. Guided by minority stress theory, we examined risk and protective factors from a socio-environmental point of view: sociodemographic characteristics (age, gender, marital status), economic factors (education, income, assets), a multimorbidity-weighted health index (diabetes, hypertension, cancer, lung and heart diseases, stroke, psychiatric problems, arthritis, obesity), discrimination (major lifetime discrimination, everyday discrimination, and attribution), and perceived neighborhood conditions (social cohesion, physical order). Growth curve models and machine learning were used to examine the association between socio-environmental factors and cognitive functioning.
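
For readers unfamiliar with growth curve modeling, a generic two-level linear specification is sketched below. This is an illustrative assumption only: the abstract does not report the authors' actual model, and every symbol here (Y, t, X, the coefficients, and the error terms) is introduced for illustration rather than taken from the study.

```latex
% Level 1 (within person): cognitive score of person i at wave t
Y_{ti} = \pi_{0i} + \pi_{1i}\, t_{ti} + e_{ti}

% Level 2 (between persons): intercept and slope depend on a
% person-level covariate X_i (e.g., education or a discrimination measure)
\pi_{0i} = \beta_{00} + \beta_{01} X_i + r_{0i}
\pi_{1i} = \beta_{10} + \beta_{11} X_i + r_{1i}
```

In a specification of this kind, \beta_{01} would capture differences in baseline cognitive functioning associated with X_i, and \beta_{11} would capture differences in the rate of change over time; r_{0i} and r_{1i} are person-specific random effects.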

Results: Preliminary analyses show both similarities and differences between the two methods. For instance, large cognitive health inequities persisted across both statistical approaches, with Blacks and Hispanics having lower scores than Whites, and education consistently operated as a protective factor for cognitive functioning across race and ethnicity. However, there were also important differences across the two methods in how environmental factors (neighborhood characteristics), major lifetime discrimination, and everyday discrimination were related to cognitive functioning. Interactions between gender and race/ethnicity also showed divergent patterns with cognitive health across the two statistical approaches.

Conclusions and Implications: ML is often referred to as a statistical method for “lazy scientists.” Indeed, large datasets and ML enable scientists to investigate phenomena from an atheoretical, “black-box” perspective. Yet doing so raises ethical concerns about how to model and operationalize complex social phenomena. There are also important methodological tools inherent in ML and growth curve modeling that can adjust for episodic events that occur at the national level (e.g., the Great Recession) and can reduce bias caused by concept drift and changes in the environment. When used judiciously and carefully, ML has the potential to confirm and interrogate theory. Findings can not only spark scientific imagination but also have important implications for developing theory and a rigorous knowledge base to inform social policies and practices.