Methods: We evaluated the performance of random forests (RF) and logistic regression (LR) for classifying probable PTSD and probable depression. The data were partitioned into training (90%) and test (10%) datasets. The ROSE (Random Over-Sampling Examples) package in R was used to address class imbalance in the training dataset when classifying probable PTSD. Cases of probable depression made up 48.6% of the total sample, 48.1% of the training dataset, and 53.6% of the test dataset. Cases of probable PTSD made up 20.4% of the total sample, 54.6% of the training dataset (after ROSE resampling), and 21.4% of the test dataset. Models included demographic, hurricane-related trauma exposure, and migration-related cultural stress variables. We evaluated model performance using area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. For the RF models, we assessed the importance of each attribute using variable importance measures (mean decrease in accuracy). Partial dependence plots were created to visualize the marginal effect of each attribute on the predicted outcome.
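To make the evaluation metrics named above concrete, the following is a minimal Python sketch of how accuracy, sensitivity, specificity, and AUC can be computed from true labels and predicted probabilities. It is not the authors' R code: the function names (`classification_metrics`, `auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced for illustration only.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a given probability threshold.

    y_true  : list of 0/1 labels (1 = probable case)
    y_score : list of predicted probabilities for class 1
    threshold is an illustrative assumption, not a value from the study.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, y_score):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration; not drawn from the study sample.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    print(classification_metrics(y_true, y_score))
    print(auc(y_true, y_score))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report the two kinds of measure together.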
Results: For classifying depression, RF outperformed LR based on accuracy (RF: 0.75; LR: 0.71), AUC (RF: 0.83; LR: 0.73), and specificity (RF: 0.92; LR: 0.85). The attribute with the highest importance score was PTSD symptoms, followed by perceived discrimination, language-related stress, negative context of reception, age, sex, and hurricane trauma. For classifying PTSD, RF outperformed LR based on accuracy (RF: 0.71; LR: 0.68), AUC (RF: 0.79; LR: 0.71), and specificity (RF: 0.68; LR: 0.64). Depressive symptoms were the most important attribute for accurate classification of PTSD, followed by perceived discrimination, language-related stress, hurricane trauma, negative context of reception, age, and sex.
Conclusions and Implications: Using a theoretically informed, selective set of demographic, hurricane-related trauma exposure, and cultural stress-related variables, RF and LR showed good classification accuracy for both depression and PTSD in a sample of recent migrants with experiences of crisis migration and trauma exposure. Findings underscore the importance of delivering culturally and linguistically appropriate, trauma-informed clinical services for recent migrants. A more thorough classification model would also include biomarkers (e.g., of allostatic load) and family-, community-, or neighborhood-level attributes. Findings may not generalize to other groups who have experienced crisis-related migration. Use of assessments to identify pre-migration and post-migration stressors could inform clinical practice with migrants presenting with behavioral health-related difficulties.