Teacher-completed behavioral rating scales are among the most commonly used tools for assessing student social-emotional competence (SEC) in the context of school-based social-emotional learning (SEL). However, concerns have been raised as to whether teachers rate student behavior similarly across diverse subgroups. In other words, if a teacher rates two students with the exact same level of SEC differently because of their sociodemographic characteristics, the difference in rating scores would reflect the teacher’s bias rather than a true disparity in SEC. Testing measurement invariance across student subgroups is one way to quantitatively assess this type of measurement bias. Yet few studies have examined the measurement invariance of SEC assessment tools. This gap is particularly problematic because some studies have found that males, students of color, and low-SES students have less favorable SEC developmental trajectories without confirming whether the construct was measured comparably across subgroups. This study primarily aims to test the measurement invariance of a widely used teacher-completed behavioral rating scale across student gender, race/ethnicity, and SES; if the evidence supports strong measurement invariance, it additionally examines subgroup differences in SEC growth trajectories.
Data come from a district-wide, three-year SEL initiative (2011-12 to 2013-14) in a large urban district. The analysis sample consists of 5,452 students who were in grades K-2 in Year 1 (48% female; 64% Hispanic/Latinx, 16% non-Hispanic Black, 14% non-Hispanic White; 87% receiving free/reduced-price lunch). Student SEC was measured repeatedly using three alternate forms of the 8-item DESSA-Mini each year (Fall/Winter/Spring). Four levels of factorial invariance (configural, weak, strong, and strict) were tested using multi-group confirmatory factor analysis. Subgroup differences in SEC growth trajectories were examined using second-order latent growth modeling. A full information maximum likelihood estimator with robust standard errors (MLR in Mplus) was used to handle missingness and non-normality.
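For readers less familiar with the four invariance levels, they can be written as increasingly restrictive equality constraints on a multi-group common-factor model. The notation below is an illustrative assumption rather than the study's own specification: it assumes a single SEC factor, and the symbols (item vector y, intercepts τ, loadings Λ, residual covariance Θ, group index g) do not appear in the abstract.

```latex
\begin{align*}
\mathbf{y}_{ig} &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g\,\eta_{ig} + \boldsymbol{\varepsilon}_{ig},
  \qquad \operatorname{Cov}(\boldsymbol{\varepsilon}_{ig}) = \boldsymbol{\Theta}_g \\[4pt]
\text{configural:}\quad & \text{same factor pattern in every group } g
  \text{ (no equality constraints)} \\
\text{weak:}\quad & \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda} \\
\text{strong:}\quad & \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda},\quad
  \boldsymbol{\tau}_g = \boldsymbol{\tau} \\
\text{strict:}\quad & \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda},\quad
  \boldsymbol{\tau}_g = \boldsymbol{\tau},\quad
  \boldsymbol{\Theta}_g = \boldsymbol{\Theta}
\end{align*}
```

Under the strong level, group differences in observed item means can be attributed to differences in the latent factor mean, which is why that level is the usual prerequisite for comparing latent means and growth trajectories across groups.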
Results suggest that at least “strong” factorial invariance (i.e., equivalence of both factor loadings and intercepts) can be assumed across gender, race/ethnicity, and SES, as indicated by negligible changes in practical fit indices (∆CFI=.00, ∆TLI=.00, and ∆RMSEA=.00) between models assuming strong invariance and models assuming only configural or weak invariance. The strong factorial invariance model showed acceptable fit for each grouping variable (CFI≥.94, TLI≥.93, RMSEA≤.04). With strong factorial invariance assumed, male (vs. female), Black or Hispanic (vs. sample average), and low-SES (vs. middle-to-high-SES) students were found to have lower SEC not only at baseline but also consistently throughout the study years.
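The comparison of practical fit indices between nested invariance models can be sketched in a few lines of Python. This is a minimal illustration, not the study's procedure: the cutoffs (a CFI drop of at most .01 and an RMSEA rise of at most .015) are commonly cited rules of thumb that the abstract does not state, the names `FitIndices`, `fit_change`, and `invariance_supported` are hypothetical, and the numeric values are placeholders rather than study results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitIndices:
    """Practical fit indices reported for one fitted model."""
    cfi: float
    tli: float
    rmsea: float


# Assumed rule-of-thumb cutoffs; NOT taken from the study.
MAX_CFI_DROP = 0.01
MAX_RMSEA_RISE = 0.015


def fit_change(less_constrained: FitIndices, more_constrained: FitIndices) -> dict:
    """Change in each index when moving to the more constrained model."""
    return {
        "delta_cfi": more_constrained.cfi - less_constrained.cfi,
        "delta_tli": more_constrained.tli - less_constrained.tli,
        "delta_rmsea": more_constrained.rmsea - less_constrained.rmsea,
    }


def invariance_supported(less_constrained: FitIndices,
                         more_constrained: FitIndices) -> bool:
    """True if the added constraints do not appreciably worsen fit."""
    delta = fit_change(less_constrained, more_constrained)
    return (delta["delta_cfi"] >= -MAX_CFI_DROP
            and delta["delta_rmsea"] <= MAX_RMSEA_RISE)


# Placeholder values for illustration only (hypothetical, not study results).
weak_model = FitIndices(cfi=0.950, tli=0.940, rmsea=0.040)
strong_model = FitIndices(cfi=0.948, tli=0.941, rmsea=0.041)
misfitting_model = FitIndices(cfi=0.900, tli=0.890, rmsea=0.070)

print(invariance_supported(weak_model, strong_model))
print(invariance_supported(weak_model, misfitting_model))
```

Change in practical fit indices is often preferred over the chi-square difference test in samples of this size, because with thousands of students even trivial misfit tends to produce a statistically significant chi-square difference.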
This study provides evidence that student SEC is measured comparably across gender, race/ethnicity, and SES using a widely used teacher-completed behavioral rating scale. Although evidence of measurement invariance does not guarantee that a measure is bias-free, it is one of the first necessary steps toward meeting the field’s need for SEC measurement tools that can be used with diverse student populations. Given the observed disparities in student SEC growth trajectories, this study calls for more research on SEC disparities and their root causes using an invariant measure.