Abstract: Contextualized Poverty Targeting through Multimodal Spatial Data and Machine Learning in Congo (Society for Social Work and Research 29th Annual Conference)

42P Contextualized Poverty Targeting through Multimodal Spatial Data and Machine Learning in Congo

Schedule:
Thursday, January 16, 2025
Grand Ballroom C, Level 2 (Sheraton Grand Seattle)
Woojin Jung, PhD, Rutgers University
Andrew Kim, MSW, Graduate Research Assistant, Rutgers University, School of Social Work, New Brunswick, NJ
Jordan Steiner, PhD, Doctoral Candidate, Rutgers University
Background & Purpose: The urgency for policymakers to deliver rapid benefits to vulnerable populations has intensified due to multiple economic shocks. Traditional targeting methods in developing countries, such as proxy means testing (PMT) and community-based targeting (CBT), often result in significant exclusion errors, particularly in regions with scarce data and when applied to small geographic units. Recent studies in this field predominantly harness georeferenced national-scale data for geographic targeting, leaving the needs of areas without such data unmet. Addressing this gap, this research introduces a multimodal approach to constructing contextualized poverty metrics in data-deficient locales to improve the targeting of social transfers. Using the case of Congo-Brazzaville, we show that incorporating complementary spatial data and machine learning (ML) techniques can amplify the poverty-reducing impact of transfers.

Methods: We use administrative data from the Lisungi Social Safety Net Database to geolocate households and establish a household-level Multidimensional Poverty Index (MPI) as the ground-truth measure of poverty. We collate a range of intuitive image, text, and geographic features that include infrastructure data (from the Congolese government and OpenStreetMap), vegetation indices, nighttime lights from satellite images, Twitter counts and topic weights from BERT topic models, and internet connectivity data. These variables serve as predictors in various high-performing ML models (ensemble, Bayesian, neural networks, etc.) to predict MPI values. The effectiveness of these machine learning-based targeting (MLT) methods is assessed using mean squared error (MSE), targeting error rates (TER), and their simulated poverty-reducing effects (using P0=headcount; P1=gap; P2=severity).
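The evaluation measures named above can be made concrete with a minimal sketch. It assumes the standard Foster-Greer-Thorbecke (FGT) definitions for P0, P1, and P2, and it assumes TER is the share of households that are misclassified. The abstract does not spell out either definition, and the function names, poverty line, and toy numbers below are hypothetical, not taken from the study.

```python
def fgt(welfare, poverty_line, alpha):
    """Foster-Greer-Thorbecke index (assumed standard definition).

    alpha=0 gives the headcount ratio (P0), alpha=1 the poverty gap (P1),
    and alpha=2 the poverty severity (P2). Households at or above the
    poverty line contribute zero.
    """
    total = 0.0
    for y in welfare:
        if y < poverty_line:
            total += ((poverty_line - y) / poverty_line) ** alpha
    return total / len(welfare)


def targeting_error_rate(truly_poor, selected):
    """Share of households misclassified (assumed definition of TER):
    poor but not selected, or selected but not poor."""
    errors = sum(1 for p, s in zip(truly_poor, selected) if p != s)
    return errors / len(truly_poor)


# Hypothetical toy data: five households, poverty line of 1.0.
welfare = [0.2, 0.5, 0.8, 1.2, 1.5]
poverty_line = 1.0

p0 = fgt(welfare, poverty_line, 0)  # headcount
p1 = fgt(welfare, poverty_line, 1)  # gap
p2 = fgt(welfare, poverty_line, 2)  # severity

truly_poor = [y < poverty_line for y in welfare]
selected = [True, True, False, True, False]  # hypothetical targeting outcome
ter = targeting_error_rate(truly_poor, selected)
```

Under this reading, the simulated poverty-reducing effect of a targeting rule would be the drop in each index after a transfer is added to the welfare of the selected households.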

Results: Consistent with the literature on targeting in Sub-Saharan Africa, PMT alone shows mixed performance, and CBT alone performs poorly in reaching households with low levels of well-being. Leveraging family size information slightly improves the performance of PMT (TER=23.8%; P0=21.51; P1=3.00; P2=0.90) but not of CBT (TER=49.2%; P0=13.83; P1=2.08; P2=0.68). In contrast, adding a wide range of new features to PMT substantially improves prediction accuracy at the household level. Our data-augmented MLT outperforms status quo targeting mechanisms in Congo as well as global prediction models in the literature, which are too coarse. The best MLT by ML standards (neural network; R2=0.709; test MSE=0.009) did not identify deep poverty well (P0=22.43; P1=2.95; P2=0.89). Instead, the eXtreme Gradient Boosting (XGBoost) algorithm, using all spatial features except daytime imagery (R2=0.703; test MSE=0.010), resulted in the lowest targeting error (TER=7.4%) and the largest poverty reduction (P0=27.65; P1=3.21; P2=0.93), closely approximating hypothetical perfect targeting (P0=29.49; P1=3.27; P2=0.94).
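A toy illustration of why the model with the lowest MSE need not be the best at targeting: selection depends on how households are ranked around the eligibility cutoff, not on average prediction error. The numbers, the fixed-coverage rule (select the k households with the highest predicted MPI), and the helper names below are all hypothetical and are not drawn from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error between true and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def top_k(scores, k):
    """Indices of the k households with the highest scores
    (higher MPI is taken to mean poorer)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def exclusion_error(y_true, y_pred, k):
    """Share of the k truly poorest households missed when the k
    households with the highest predicted scores are selected."""
    return len(top_k(y_true, k) - top_k(y_pred, k)) / k


# Hypothetical true MPI values for four households; the two poorest are targeted.
true_mpi = [0.30, 0.40, 0.45, 0.80]
k = 2

# Model A: small errors, but it swaps two households near the cutoff.
pred_a = [0.30, 0.46, 0.44, 0.80]
# Model B: larger errors everywhere, but the ranking is preserved.
pred_b = [0.10, 0.20, 0.65, 1.00]

mse_a, mse_b = mse(true_mpi, pred_a), mse(true_mpi, pred_b)
excl_a = exclusion_error(true_mpi, pred_a, k)
excl_b = exclusion_error(true_mpi, pred_b, k)
```

In this constructed case, Model A has the lower MSE yet misses one of the two poorest households, while Model B has the higher MSE and misses none, which mirrors the pattern reported above in miniature.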

Conclusions & Implications: Our study finds that augmenting MLT models with multimodal spatial data can substantially improve micro-level poverty targeting compared to traditional methods. It points to the potential of building data infrastructure and adopting holistic evaluation metrics to promote more inclusive social welfare programs. The robust performance of our model, even in spatially homogeneous settings, suggests that AI/ML models can scale across large regions with greater spatial variation. Our focus on countries and populations marginalized in global development discourse promotes data justice.