Abstract: Unraveling the Complexities of Detecting Implicit Anti-Asian and Pro-Asian Speech on Social Media: Challenges, Insights, and the Potential of Large Language Models (Society for Social Work and Research 29th Annual Conference)

Schedule:
Saturday, January 18, 2025
Redwood B, Level 2 (Sheraton Grand Seattle)
Doris Chang, PhD, Associate Professor, New York University, New York, NY
Nari Yoo, MA, PhD Candidate, New York University, New York, NY
Heran Mane, Data Scientist, University of Maryland at College Park, College Park, MD
Angela Zhao, MA Student, New York University, New York, NY
Sumie Okazaki, PhD, Professor, New York University, New York, NY
Thu Nguyen, PhD, Associate Professor, University of Maryland at College Park, College Park, MD
Background: The alarming rise in anti-Asian discrimination and violence, especially during the COVID-19 pandemic, has underscored the urgent need for robust tools to detect and classify hate speech and racial bias against Asian communities on social media platforms. Existing research has highlighted the mental health impact of such discrimination, with Asian Americans experiencing high levels of distress when exposed to it both online and in person. Social media has played a significant role in the spread of anti-Asian sentiment, facilitating the dissemination of hate speech and enabling the organization of hate groups. Meanwhile, large language models such as ChatGPT have made it possible to develop hate speech detection algorithms in low-resource settings. This study aims to enhance the detection and classification of anti-Asian hate speech and racial bias by identifying challenges in developing classifiers, refining annotation guidelines, and exploring the usability of ChatGPT as an annotator.

Methods: A comprehensive dataset of 55,844,310 tweets, 3,899,874 (7.17%) of which referenced Asians, was collected using Twitter's API for Academic Research for the period 2011 to 2021. The tweets were filtered to those written in English, originating from the US, and containing Asian race-related keywords identified from prior studies and an online database of racial slurs. A group of nine annotators who identify as Asian American reviewed and individually annotated 1,200 tweets for both anti-Asian (racist) and pro-Asian (solidarity) speech; annotations were then discussed as a group to reach consensus decisions. The team iteratively developed a codebook to standardize the annotation process. Finally, ChatGPT was used as an additional annotator, and inter-rater agreement between the human and LLM-based annotations was assessed.
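To make this pipeline concrete, the sketch below filters tweets with a keyword pattern, asks an LLM to label each tweet, and scores agreement against the human consensus with Cohen's kappa. It is a minimal illustration rather than the study's actual protocol: the keyword list, three-way label set, prompt wording, and model name are assumptions, and it relies on the openai and scikit-learn Python packages.

    import re

    from openai import OpenAI
    from sklearn.metrics import cohen_kappa_score

    # Illustrative keyword filter; the study drew its keywords from prior
    # studies and an online database of racial slurs (not reproduced here).
    ASIAN_KEYWORDS = re.compile(r"\b(asian|asians|chinese|china|wuhan)\b",
                                re.IGNORECASE)

    def references_asians(tweet_text: str) -> bool:
        """Keep only tweets containing at least one race-related keyword."""
        return bool(ASIAN_KEYWORDS.search(tweet_text))

    LABELS = ["anti-asian", "pro-asian", "neither"]  # assumed label set
    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def llm_annotate(tweet_text: str) -> str:
        """Ask the model for one label; fall back to 'neither' if unparsable."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumed model name
            temperature=0,          # deterministic labels for reproducibility
            messages=[
                {"role": "system",
                 "content": "Classify the tweet as exactly one of: "
                            + ", ".join(LABELS) + ". Reply with the label only."},
                {"role": "user", "content": tweet_text},
            ],
        )
        label = response.choices[0].message.content.strip().lower()
        return label if label in LABELS else "neither"

    def human_llm_agreement(human: list[str], llm: list[str]) -> float:
        """Chance-corrected agreement between consensus and LLM labels."""
        return cohen_kappa_score(human, llm, labels=LABELS)

Cohen's kappa is shown here as one common chance-corrected agreement statistic; the abstract does not specify which agreement measure the team reported.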

Results: Debriefing sessions identified several challenges faced by the annotators, including the subtle nature of racism and solidarity directed toward Asians and the importance of considering annotators' positionalities in the annotation process. The development of the codebook provided a more standardized approach to annotation, with clear guidelines and examples to assist in identifying relevant tweets targeting Asians. The comparison between human annotations and ChatGPT's performance showed promising results, with ChatGPT reaching 80% accuracy in detecting hate speech. However, ChatGPT's annotations were less accurate at capturing subtle forms of discrimination than a classifier fine-tuned from DeBERTa, which reached 85% accuracy.
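For reference, the supervised comparison point mentioned above is the kind of model sketched below: a DeBERTa encoder fine-tuned on the consensus-labeled tweets. This is a sketch under stated assumptions, not the study's configuration: the checkpoint name, label set, and hyperparameters are illustrative, and it assumes the Hugging Face transformers and datasets packages.

    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    CHECKPOINT = "microsoft/deberta-v3-base"  # assumed checkpoint
    LABELS = ["anti-asian", "pro-asian", "neither"]

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSequenceClassification.from_pretrained(
        CHECKPOINT, num_labels=len(LABELS))

    # Toy stand-in for the 1,200 consensus-annotated tweets.
    train = Dataset.from_dict({
        "text": ["example tweet one", "example tweet two"],
        "label": [0, 2],  # indices into LABELS
    }).map(lambda batch: tokenizer(batch["text"], truncation=True,
                                   padding="max_length", max_length=64),
           batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="deberta-asian-speech",
                               num_train_epochs=3,  # toy hyperparameters
                               per_device_train_batch_size=16),
        train_dataset=train,
    )
    trainer.train()

Fixed-length padding lets the default data collator batch the examples, and the Trainer drops the unused raw text column before the forward pass.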

Conclusions: This study highlights the complexities involved in detecting and classifying anti-Asian hate speech and racial bias on social media platforms. The challenges identified by the annotators emphasize the need for more sophisticated detection methods tailored to specific racial/ethnic minority groups and contexts. While ChatGPT shows promise as an automated annotator, further refinements are necessary to improve its ability to detect subtle forms of discrimination. The study's limitations include the subjectivity of human annotations, the lack of Southeast Asian representation on the annotation team, and the absence of a gold standard for hate speech detection. The findings contribute to the limited but growing body of research on combating anti-Asian hate speech and racial bias.