
How to Detect Online Hate towards Migrants and Refugees? Developing and Evaluating a Classifier of Racist and Xenophobic Hate Speech Using Shallow and Deep Learning

Authors
  • Carlos Arcila-Calderón

    (Facultad de Ciencias Sociales, Campus Unamuno, University of Salamanca, 37007 Salamanca, Spain)

  • Javier J. Amores

    (Facultad de Ciencias Sociales, Campus Unamuno, University of Salamanca, 37007 Salamanca, Spain)

  • Patricia Sánchez-Holgado

    (Facultad de Ciencias Sociales, Campus Unamuno, University of Salamanca, 37007 Salamanca, Spain)

  • Lazaros Vrysis

    (Multidisciplinary Media & Mediated Communication Research Group (M3C), Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece)

  • Nikolaos Vryzas

    (Multidisciplinary Media & Mediated Communication Research Group (M3C), Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece)

  • Martín Oller Alonso

    (Department of Social and Political Sciences, Università degli Studi di Milano, 20122 Milano, Italy)

Abstract

Online hate speech is a growing concern because social media allows it to spread rapidly, massively, and without control. Several researchers are therefore developing prototypes that detect cyberhate automatically and at scale. However, most of these prototypes detect hate only in English, and very few focus specifically on racism and xenophobia, the category of discrimination in which the most hate crimes are recorded each year. In addition, almost all researchers rely on already available datasets; ad hoc datasets manually generated by several trained coders are rarely used. This research aims to overcome those limitations by developing and evaluating classification models capable of detecting racist and/or xenophobic hate speech spread online, first in Spanish and later in Greek and Italian. Three distinct machine learning strategies are tested. First, various traditional shallow learning algorithms are used. Second, deep learning is used, specifically an ad hoc RNN model. Finally, a BERT-based model is developed that combines transformers and neural networks. The results confirm that the deep learning strategies perform better at detecting anti-immigration hate speech online. For this reason, the deep architectures were the ones further improved and tested for hate speech detection in Greek and Italian and in a multisource setting. The results advance the scientific literature in this field, since until now no online anti-immigration hate detectors had been tested in these languages using this type of deep architecture.
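
As a rough illustration of what a "traditional shallow learning" text classifier can look like, the sketch below trains a bag-of-words multinomial Naive Bayes model in plain Python. It is not the authors' pipeline: the abstract does not name the specific shallow algorithms, features, or data used, so the choice of Naive Bayes, the helper names (`tokenize`, `train_naive_bayes`, `predict`), and the four toy sentences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical minimal tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(texts, labels):
    # Count documents per class and word occurrences per class.
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    # Return the class with the highest log posterior, using
    # add-one (Laplace) smoothing; out-of-vocabulary tokens are skipped.
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue
            smoothed = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy examples (1 = hostile toward migrants, 0 = not hostile).
# They are placeholders, not samples from the study's annotated dataset.
texts = [
    "they should all be expelled from here",
    "send them all back now",
    "the new neighbours opened a bakery",
    "volunteers welcomed the families at the station",
]
labels = [1, 1, 0, 0]

model = train_naive_bayes(texts, labels)
```

Calling `predict(model, "send them all back")` classifies a new sentence with the trained model. A realistic system would need a large, manually coded corpus and held-out evaluation, and the deep strategies the abstract describes (RNN and BERT-based models) replace these word counts with learned representations.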

Suggested Citation

  • Carlos Arcila-Calderón & Javier J. Amores & Patricia Sánchez-Holgado & Lazaros Vrysis & Nikolaos Vryzas & Martín Oller Alonso, 2022. "How to Detect Online Hate towards Migrants and Refugees? Developing and Evaluating a Classifier of Racist and Xenophobic Hate Speech Using Shallow and Deep Learning," Sustainability, MDPI, vol. 14(20), pages 1-16, October.
  • Handle: RePEc:gam:jsusta:v:14:y:2022:i:20:p:13094-:d:940620

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/14/20/13094/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/14/20/13094/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paschalia (Lia) Spyridou & Constantinos Djouvas & Dimitra Milioni, 2022. "Modeling and Validating a News Recommender Algorithm in a Mainstream Medium-Sized News Organization: An Experimental Approach," Future Internet, MDPI, vol. 14(10), pages 1-21, September.
    2. Andreas Giannakoulopoulos & Minas Pergantis & Laida Limniati & Alexandros Kouretsis, 2022. "Investigating the Country of Origin and the Role of the .eu TLD in External Trade of European Union Member States," Future Internet, MDPI, vol. 14(6), pages 1-27, June.
    3. Charalampos A. Dimoulas & Andreas Veglis, 2023. "Theory and Applications of Web 3.0 in the Media Sector," Future Internet, MDPI, vol. 15(5), pages 1-10, April.
    4. Nikolaos Vryzas & Anastasia Katsaounidou & Lazaros Vrysis & Rigas Kotsakis & Charalampos Dimoulas, 2022. "A Prototype Web Application to Support Human-Centered Audiovisual Content Authentication and Crowdsourcing," Future Internet, MDPI, vol. 14(3), pages 1-17, February.
