Leveraging posts’ and authors’ metadata to spot several forms of abusive comments on Twitter

Social media is frequently plagued by undesirable phenomena such as cyberbullying and abusive content in the form of hateful and racist posts. It is therefore crucial to study and propose better mechanisms for automatically identifying communication that promotes hate speech, hostility, and aggressiveness. Traditional approaches have focused only on exploiting the content and writing style of social media posts while ignoring information related to their context. Several recent works have reported interesting findings in this direction, but they lack an exhaustive analysis of contextual information as well as an evaluation of whether the same premise holds for detecting different types of abusive comments, e.g., offensive, hostile, and hateful ones. To this end, we extended seven Twitter benchmark datasets related to the detection of offensive, aggressive, hostile, and hateful communication. We evaluate our hypothesis using three different learning models, considering classical (Bag of Words), advanced (GloVe), and state-of-the-art (BERT) text representations. Experiments show statistically significant differences between the classification scores of all methods that combine text and metadata and those of the classical approach of using only the textual content of the messages, suggesting the importance of paying attention to context when spotting the different kinds of abusive comments on social networks.

Keywords: Hate speech, Metadata, Social media
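
To make the central idea concrete, the sketch below is a minimal illustration, not the paper's exact pipeline: it combines a Bag-of-Words representation of the tweet text with a few post- and author-level metadata features in a single classifier. All column names, metadata fields, and example values are hypothetical.

```python
# Minimal sketch: jointly using text content and contextual metadata
# for abusive-comment classification. Field names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy dataset: tweet text plus author/post metadata.
data = pd.DataFrame({
    "text": ["you are awful", "have a nice day",
             "I hate this group", "great match!"],
    "followers_count": [10, 5000, 3, 800],    # author metadata
    "account_age_days": [30, 2000, 5, 900],   # author metadata
    "num_mentions": [2, 0, 1, 0],             # post metadata
    "label": [1, 0, 1, 0],                    # 1 = abusive, 0 = not abusive
})

# Bag-of-Words features for the text, scaled numeric features for the
# metadata, concatenated into one feature matrix.
features = ColumnTransformer([
    ("bow", CountVectorizer(), "text"),
    ("meta", StandardScaler(),
     ["followers_count", "account_age_days", "num_mentions"]),
])

model = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

model.fit(data.drop(columns="label"), data["label"])
print(model.predict(data.drop(columns="label")))
```

The text-only baseline described in the abstract corresponds to dropping the "meta" branch of the ColumnTransformer; comparing the two configurations on the same splits is one simple way to test whether metadata adds signal.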