Abstract: Bots Are the New Fraud: A Post-Hoc Exploration of Statistical Methods to Identify Bot-Generated Responses in a Corrupt Data Set (Society for Social Work and Research 27th Annual Conference - Social Work Science and Complex Problems: Battling Inequities + Building Solutions)

43P Bots Are the New Fraud: A Post-Hoc Exploration of Statistical Methods to Identify Bot-Generated Responses in a Corrupt Data Set

Schedule:
Thursday, January 12, 2023
Phoenix C, 3rd Level (Sheraton Phoenix Downtown)
Kathryn Irish, MSW, Graduate Research Assistant, Michigan State University
Jessica Saba, MSW, Graduate Research Assistant, Michigan State University
Carrie Moylan, PhD, Associate Professor, Michigan State University, East Lansing, MI
Background and Purpose: Automated robots (“bots”) are notorious for seeking out and exploiting online, incentive-based research. Newer bots are remarkably adept at mimicking human behavior; they can bypass CAPTCHA and generate human-like response patterns, making them very difficult to detect. Recent literature proposes two methods of identifying bot-generated responses within datasets: (a) manually “flagging” suspicious responses and (b) statistical analyses, namely Mahalanobis distance (to detect multivariate outliers) and person-total correlation (to assess individual response consistency). At present, there is no institutional standard for identifying or accounting for more sophisticated bot activity in data collected online. This exploratory study is among the first translational efforts to identify bot-generated responses retrospectively within an actual corrupt data set, using a combination of manual flags and statistical analyses (Mahalanobis distance; person-total correlation). The study aims were to (a) evaluate the efficacy and feasibility of statistical methods for bot detection in translational practice, and (b) contribute an exploratory assessment of the data characteristics generated by Mahalanobis distance and person-total correlation within a real, bot-corrupted data set.
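
For readers unfamiliar with the two statistics named above, the sketch below shows how they are commonly computed from a matrix of numeric survey responses. This is a minimal illustration, not the study's code; the pandas/NumPy tooling and all variable names are assumptions.

import numpy as np
import pandas as pd

def mahalanobis_distance(responses: pd.DataFrame) -> pd.Series:
    # Distance of each case from the sample centroid, scaled by the inverse
    # covariance matrix; large values flag multivariate outliers.
    X = responses.to_numpy(dtype=float)
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))  # pseudo-inverse guards against a singular covariance matrix
    d_sq = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return pd.Series(np.sqrt(d_sq), index=responses.index, name="mahalanobis")

def person_total_correlation(responses: pd.DataFrame) -> pd.Series:
    # Correlation of each case's item responses with the item means,
    # an index of how consistent an individual respondent is with the sample.
    item_means = responses.mean(axis=0)
    return responses.apply(lambda row: row.corr(item_means), axis=1).rename("person_total_r")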

Methods: Cases (n=1,306) were reviewed and assigned flags. Descriptive characteristics and aggregate combinations of flag distributions were used to consolidate flags and sort cases into two groups: “likely bot” (n=402) and “likely human” (n=904). Mahalanobis distance was assessed among viable cases (n=1,132); raw scores underwent log transformation and were compared between groups with an independent samples t-test. Person-total correlation was assessed among viable cases (n=1,286). Data characteristics between groups were compared.
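
The comparison step described above can be sketched as follows, again under stated assumptions: "responses" and "is_likely_bot" are illustrative names standing in for the viable cases and the flag-derived group labels, and SciPy plus a common t-to-r conversion are assumed, since the abstract does not specify the software or effect-size formula used.

import numpy as np
from scipy import stats

# Continuing the sketch above: "responses" holds the viable cases and
# "is_likely_bot" is a boolean pandas Series built from the manual flag review
# (both are assumed inputs, not names from the study).
mahal = mahalanobis_distance(responses)
log_mahal = np.log(mahal)              # log transformation of the raw distance scores

human = log_mahal[~is_likely_bot]      # "likely human" group
bot = log_mahal[is_likely_bot]         # "likely bot" group

t_stat, p_value = stats.ttest_ind(human, bot)   # independent samples t-test

# One common conversion of t to an effect-size r (an assumption; not necessarily
# the formula used in the study).
df = len(human) + len(bot) - 2
effect_r = np.sqrt(t_stat**2 / (t_stat**2 + df))
print(f"t({df}) = {t_stat:.3f}, p = {p_value:.4f}, r = {effect_r:.2f}")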

Results: The independent samples t-test indicated that the Mahalanobis distance for the “likely human” group (n=799, M=1.26, SD=0.38) was significantly greater than for the “likely bot” group (n=333, M=1.12, SD=0.50), t(1130) = 4.084, p < 0.001, with a moderate effect size (r = 0.32). No significant difference between groups was detected by person-total correlation. The Mahalanobis distance findings suggest that “likely human” cases demonstrate more variability than cases deemed “likely bots.” This contrasts with modeled demonstrations of bot detection using Mahalanobis distance, which typically identify verifiable bots as multivariate outliers among human responses. Person-total correlation did not identify any group differences, unlike in modeled research. Problems with group-level distinctions and unequal group sizes likely influenced the analysis.

Conclusions and Implications: The greatest limitation of this study is that the actual proportion of human and bot cases is unknown; as such, the true accuracy of the presented methods cannot be definitively assessed. This circumstance is precisely what makes the study valuable: when real data are corrupted by bots, researchers have no means of knowing which cases are legitimate and which are corrupt, and, as demonstrated here, even the most concrete, mathematical means of identifying bot-corrupted responses may have significant limitations when translated to real-world circumstances. Data integrity is an ethical imperative. More resources, discussion, and cross-disciplinary research will be necessary to work toward an institutional standard for preventing, identifying, and accounting for sophisticated bot activity in online research. A long-term solution may involve a statistical mechanism that effectively accounts for what may become an inevitable level of bot influence within online datasets.