The disproportionate allocation of unpaid care work between women and men is a significant concern in many South Asian countries, including India. The available literature on women's caregiving is methodologically limited, particularly in its predictive capacity, which matters for policymakers and analysts who need reliable estimates to inform their decisions. To address this gap, this study uses machine learning techniques to identify patterns in a large dataset with numerous covariates. Using Sen's (1980) capability framework, this study investigates two research questions: (1) whether machine learning methods are more effective than traditional statistical methods in predicting time spent on unpaid care work, and (2) which sociodemographic factors predict increased time spent on unpaid caregiving across different gender identities.
Methods
The present analysis uses data from the 2019 Indian Time-Use Survey. Using a stratified two-stage sampling design, the survey included a nationally representative sample of 272,117 individuals in rural areas and 173,182 individuals in urban areas. The study compares the predictive performance of ordinary least squares (OLS), lasso, ridge, and random forest models, and assesses the importance of sociodemographic factors using mean decrease in impurity (MDI) and permutation feature importance. The analysis proceeded in stages, beginning with OLS, lasso, and elastic net models to eliminate irrelevant variables. Because these linear models cannot capture nonlinear interactions, their R² was limited to 55%. To identify the features that best predict the duration of unpaid care work, random forest and permutation importance techniques were then applied to both the training and test datasets. Finally, the analysis was repeated using data from women only.
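The model comparison described above can be illustrated with a minimal sketch in scikit-learn. This is not the study's code: the data are synthetic rather than drawn from the Time-Use Survey, and the variable names (`female`, `married`, `employed`, `age`, `log_expenditure`), the simulated interaction effect, and all hyperparameters are assumptions chosen only to show how linear models and a random forest can be compared, and how MDI and permutation importance are obtained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): care time depends on an
# interaction between gender, marital status, and employment.
rng = np.random.default_rng(0)
n = 3000
feature_names = ["female", "married", "employed", "age", "log_expenditure"]
female = rng.integers(0, 2, n)
married = rng.integers(0, 2, n)
employed = rng.integers(0, 2, n)
age = rng.uniform(15, 60, n)
log_expenditure = rng.normal(9.0, 0.5, n)
X = np.column_stack([female, married, employed, age, log_expenditure]).astype(float)

# Hypothetical outcome: minutes per day of unpaid care work.
y = (30 + 180 * female * married
     - 60 * female * married * employed
     + rng.normal(0, 20, n))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Linear baselines (standardized inputs) versus a random forest.
models = {
    "ols": make_pipeline(StandardScaler(), LinearRegression()),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=1.0)),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "random_forest": RandomForestRegressor(
        n_estimators=200, min_samples_leaf=5, random_state=0
    ),
}

# Out-of-sample R^2 for each model.
r2 = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    r2[name] = model.score(X_test, y_test)

rf = models["random_forest"]

# Mean decrease in impurity (MDI), computed from the fitted trees.
mdi = dict(zip(feature_names, rf.feature_importances_))

# Permutation importance on held-out data: the drop in R^2 when a
# single feature's values are shuffled.
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
perm_rank = [feature_names[i] for i in np.argsort(perm.importances_mean)[::-1]]

for name, score in r2.items():
    print(f"{name:>14}: R^2 = {score:.3f}")
print("MDI:", {k: round(v, 3) for k, v in mdi.items()})
print("Permutation ranking:", perm_rank)
```

Because the simulated outcome is driven by an interaction term, the linear models leave part of the variance unexplained while the random forest recovers it, which mirrors the kind of gap the abstract reports. Computing permutation importance on the held-out split, rather than only on the training split, guards against ranking features that the forest has merely memorized.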
Results
The results show that machine learning methods outperform traditional regression methods, with random forest yielding an almost 9% improvement in predictive performance. The study finds that gender, employment status, marital status, age, and monthly household expenditure are significant predictors of time spent on unpaid care work. Marital and employment status remained significant predictors when the data were limited to women. Young, married, non-employed women are particularly vulnerable to increased time spent on unpaid care work, which compounds the risks and challenges associated with it.
Conclusions and implications
The study highlights two important implications for practice and policy. First, the findings identify target areas for reducing time spent on unpaid care work at the systems level in India. Second, the study emphasizes the need to eliminate gender disparities in unpaid caregiving through redistributive policies, programs, and care initiatives. The study recommends promoting the 'dual-earner care model' prevalent in Nordic countries and increasing female labor force participation in India through flexible working arrangements to reduce the gender disparity in time spent on unpaid care work. The findings reiterate the case for building a national care system that addresses the caregiving constraints impeding women's capabilities to benefit equally from new economic policies and reforms.