Machine learning can help by automatically scanning social media; however, emojis and hashtags with multiple meanings, together with the distinct linguistic style used among gang-affiliated youth, render most computational tools inadequate. To fill this gap, we developed a qualitative analytic process to contextualize the complex sentiments conveyed on social media. The goal is to use these qualitative features to train a machine learning system to detect and predict aggressive content in linguistically diverse social media posts.
Methods: We developed the Digital Urban Violence Analysis Approach (DUVAA), a six-step qualitative analytic method that identifies offline characteristics in social media text to elucidate deeper contextual meaning. The process includes: (1) identifying precipitating events; (2) examining biographical information about the author of the text; (3) analyzing the text; (4) identifying pertinent names, affiliations, and hashtags; (5) evaluating the tone of the text; and (6) identifying triggering events that shift conversations.
We applied DUVAA to a Twitter data set. We used Radian6, a social media tracker, to capture 10,000 tweets posted between January and April 2014 by Gakirah Barnes, a female Chicago gang member. We applied DUVAA to 800 of those tweets, posted during a two-week period following two homicides: the death of Gakirah’s close friend, then her own death two weeks later. Two research assistants developed a codebook of 26 codes after an initial analysis of 50 randomly selected tweets. The research team then applied the DUVAA method to elicit additional information to contextualize each tweet. Inter-annotator agreement was κ = 0.62, indicating moderate agreement.
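As a rough illustration of how an agreement statistic of this kind can be computed for two annotators, the following minimal sketch implements Cohen's kappa in Python. The text above does not state which kappa variant was used, so Cohen's kappa is an assumption here, and the function name and the example labels are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    Assumption: two annotators, one categorical label per item.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for six tweets, using the three categories named in
# the results (aggression, grief, other). Not data from the study.
annotator_a = ["aggression", "aggression", "grief", "grief", "other", "other"]
annotator_b = ["aggression", "aggression", "grief", "other", "other", "other"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 2))
```

Because kappa subtracts the agreement expected by chance from the observed agreement, it is lower than the raw percentage of matching labels whenever chance agreement is nonzero.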
Results: Observable patterns in the data from the DUVAA process suggested that our codes fit into three categories: aggression, grief, and other. We learned that aggressive and threatening communication is often preceded by posts reflecting grief over the loss of a loved one to gang violence. Through the DUVAA process we were able to better understand a complex social phenomenon while providing insight that can be applied to machine learning.
Conclusion and implications: Through comprehensive multimodal evaluation, we uncovered unique language and communication styles used among our population. Our findings suggest a new interdisciplinary collaboration between social work and computer science. Additionally, using qualitative processes to enhance machine learning and NLP techniques may help predict exchanges on Twitter that could escalate into firearm violence.