
Original scientific paper

https://doi.org/10.17559/TV-20210708143535

Deep-Cov19-Hate: A Textual-Based Novel Approach for Automatic Detection of Hate Speech in Online Social Networks throughout COVID-19 with Shallow and Deep Learning Models

Cem Baydogan ; Department of Software Engineering, Faculty of Technology, Firat University, Elazig, 23119, Turkey
Bilal Alatas* (ORCID: orcid.org/0000-0002-3513-0329) ; Department of Software Engineering, Faculty of Technology, Firat University, Elazig, 23119, Turkey



page 149-156


Abstract

The growing use of online social media platforms has increased the volume of both accurate and inaccurate information shared by users, especially during COVID-19. The emergence of COVID-19 on the world agenda triggered a broadly negative reaction against East Asia (especially China) on online social media platforms. Social media users who spread degrading, racist, disrespectful, abusive, discriminatory, critical, harsh, and offensive posts accused Asian people of being responsible for the outbreak of COVID-19. For this reason, a Hate Speech Detection (HSD) system needed to be developed to prevent the spread of such posts about COVID-19. In this article, a text-based study of COVID-19-related hate speech (HS) shared in online social networks was carried out with Shallow Learning (SL) and Deep Learning (DL) methods. In the first step of the study, a typical Natural Language Processing (NLP) pipeline was applied to two collected datasets. This pipeline used techniques such as bag of words, term frequency, and document matrix to extract the features representing the datasets. Then, ten different SL and DL models were fine-tuned on the COVID-19-related HS datasets. Accuracy, precision, sensitivity, and F-score were calculated to compare the performance of the SL and DL algorithms on the HSD problem. The RNN, one of the proposed models, achieved the highest accuracy on the first and second datasets, with values of 78.7% and 90.3%, respectively. Given the promising results of all approaches applied to HSD, they are expected to be suitable for many other social media and network problems related to COVID-19.
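As a rough illustration of two steps named in the abstract, the sketch below builds a bag-of-words term-frequency matrix and computes accuracy, precision, sensitivity, and F-score for a binary labelling. It is not the authors' implementation: the function names (`tokenize`, `bag_of_words`, `binary_metrics`), the whitespace tokenizer, and the toy inputs in the usage note are assumptions made for illustration only.

```python
from collections import Counter

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation (assumed tokenizer)."""
    tokens = (raw.strip(PUNCT).lower() for raw in text.split())
    return [tok for tok in tokens if tok]


def bag_of_words(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {tok: j for j, tok in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, count in Counter(tokenize(doc)).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocab, matrix


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, sensitivity (recall), and F-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_score": f_score,
    }
```

For example, `bag_of_words(["Stay safe stay home", "Home is safe"])` yields one row of term counts per document over the shared vocabulary, and `binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])` scores a made-up set of predictions against made-up labels. The feature matrix would then be passed to whichever SL or DL classifier is being evaluated.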

Keywords

COVID-19; hate speech detection; shallow and deep learning models; social media analysis; social network problems

Hrčak ID:

269494

URI

https://hrcak.srce.hr/269494

Publication date:

15.2.2022.
