Comparative Analysis of Identifying Abusive Language on Twitter

Puspendu Biswas, Donavalli Haritha

Abstract

The context-dependent nature of online aggression makes it difficult to annotate large collections of data. Earlier datasets for abusive language detection have been too small to train deep learning models effectively. Recently, a larger and more reliable dataset, Hate and Abusive Speech on Twitter, has been released. However, this dataset has not yet been studied thoroughly enough to realize its full potential. In this paper, we present the first comparative analysis of various learning models on the Hate and Abusive Speech on Twitter dataset, and we examine the potential benefits of incorporating additional features and context data. Experimental results show that bidirectional GRU networks trained on word-level features and augmented with Latent Topic Clustering modules perform best, achieving an F1 score of 0.805.
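
For readers unfamiliar with the reported metric, the following minimal sketch shows how an F1 score for a single class is computed as the harmonic mean of precision and recall. The function name `f1_score`, the label strings, and the toy data are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the authors' evaluation or whatever averaging across classes they used.

```python
def f1_score(true_labels, predicted_labels, positive_label):
    """Return the F1 score for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives for the chosen class.
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined; report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only (not drawn from the dataset).
y_true = ["abusive", "normal", "abusive", "normal", "abusive"]
y_pred = ["abusive", "abusive", "abusive", "normal", "normal"]
score = f1_score(y_true, y_pred, "abusive")
```

In this toy case two of the three predicted "abusive" labels are correct and two of the three true "abusive" items are found, so precision and recall are both 2/3 and the F1 score is also 2/3.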
