Methods: Scales were administered online as part of a larger child-welfare organizational health assessment in four states. Participation was voluntary; 1,192 surveys were completed. The sample was split into two random halves for exploratory (EFA; n=586) and confirmatory factor analyses (CFA; n=606) in Mplus 6.1. Because these scales had been previously vetted, we began with CFA to fit models as described in the literature. When a CFA model did not adequately fit the data, we conducted EFA to determine a factor structure that would fit the data, and then confirmed the new structure with CFA.
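The split-half design above can be sketched as follows. This is an illustrative Python reproduction of the sampling step only (the factor analyses themselves were run in Mplus 6.1); the seed and variable names are assumptions, not the authors'.

```python
import numpy as np

# Randomly partition the 1,192 completed surveys into an EFA half and a
# CFA half, as described in the Methods. Seed chosen arbitrarily.
rng = np.random.default_rng(0)

n_total = 1192
idx = rng.permutation(n_total)
efa_half = idx[:586]    # exploratory factor analysis half (n=586)
cfa_half = idx[586:]    # confirmatory factor analysis half (n=606)

print(len(efa_half), len(cfa_half))
```

Splitting once up front, rather than resampling per scale, keeps the confirmatory half untouched by any exploratory decisions.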
Results: Psychometric examination of the PQL and JS scales showed that they do not adequately represent the constructs they were intended to measure and cannot be used in structural equation models as intended. The resulting scale based on PQL items bears little resemblance to the original: of 30 items, four remained, representing Work Satisfaction (fit statistics: χ2(2)=1.366, p=0.51, CFI=1.000, RMSEA=0.071, SRMR=0.006, alpha=.852). For Job Satisfaction, 32 of 36 original items were retained and six of nine subscales confirmed; the remaining three subscales did not fit the data as intended. More importantly, the subscales cannot be combined into one higher-order construct representing job satisfaction. The scales failed to assess constructs adequately in at least four ways: at the item level, with vague or “double-barreled” statements; at the subscale level, with items that did not overlap enough in substance to represent a latent construct, or with too few items to represent each subscale construct; and at the broader construct level, with subscales that did not fit together to form one larger construct as described in the literature.
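The fit indices and reliability reported above were produced by Mplus. For reference, the standard formulas for the RMSEA point estimate (from a model chi-square) and Cronbach's alpha can be sketched as follows; this is an illustrative implementation, not the authors' code or the Mplus algorithm.

```python
import numpy as np

def rmsea(chi2, df, n):
    # Standard point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    # Values near or below .06 are conventionally taken as good fit.
    return np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cronbach_alpha(items):
    # items: (respondents, k) array of item scores for one scale.
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)
```

By the usual rule of thumb, alpha of about .70 or above is treated as acceptable reliability, which is why it is so often reported in place of a factor-analytic check.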
Conclusions and Implications: In this case, use of existing and widely used scales did not yield psychometrically sound measures. We provide a typology of the ways in which scales fail, using the PQL and JS scales as examples; notably, many other existing scales in our assessment tool failed similarly, and none performed as intended. Our recommendations to researchers building psychometrically sound assessment and survey tools are to: look for measures that have been examined in an EFA/CFA framework, rather than those whose reliability was simply confirmed via Cronbach's alpha; examine all item wording for applicability in your setting; examine groups of items for evidence of an underlying latent construct; and expect to conduct extensive psychometric analyses prior to structural equation modeling.
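The caution about relying on Cronbach's alpha alone can be illustrated directly: alpha can reach a conventionally "acceptable" level even when the items plainly measure two unrelated constructs. A minimal sketch using a hypothetical item correlation matrix (not our data):

```python
import numpy as np

def alpha_from_corr(R):
    # Standardized Cronbach's alpha from a k x k item correlation matrix.
    k = R.shape[0]
    return k / (k - 1) * (1.0 - np.trace(R) / R.sum())

# Two orthogonal 3-item clusters (r = .70 within, .00 between):
# unambiguously a two-factor structure, yet alpha looks "acceptable".
block = np.full((3, 3), 0.7)
np.fill_diagonal(block, 1.0)
R = np.block([[block, np.zeros((3, 3))],
              [np.zeros((3, 3)), block]])

print(round(alpha_from_corr(R), 2))  # 0.7
```

An EFA/CFA check on the same matrix would reveal the two factors immediately, which is the point of the first recommendation above.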