Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i18p2941-d1482716.html

Pre-Trained Language Model Ensemble for Arabic Fake News Detection

Author

Listed:
  • Lama Al-Zahrani

    (Information Technology Department, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia)

  • Maha Al-Yahya

    (Information Technology Department, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia)

Abstract

Fake news detection (FND) remains a challenge because its sources are vast and varied, especially on social media platforms. Although academia and industry have made numerous attempts to develop fake news detection systems, research on Arabic content remains limited. This study investigates transformer-based language models for Arabic FND. Transformer-based models have shown promising performance in various natural language processing tasks, but they often struggle with tasks involving complex linguistic patterns and cultural contexts, resulting in unreliable performance and misclassification. To overcome these challenges, we investigated an ensemble of transformer-based models. We experimented with five Arabic transformer models: AraBERT, MARBERT, AraELECTRA, AraGPT2, and ARBERT. Various ensemble approaches, including a weighted-average ensemble, hard voting, and soft voting, were evaluated to determine the most effective techniques for improving prediction accuracy. The results demonstrate that ensemble models significantly boost baseline model performance. An important finding is that ensemble models achieved excellent performance on the Arabic Multisource Fake News Detection (AMFND) dataset, reaching an F1 score of 94% using weighted averages. Moreover, changing the number of models in the ensemble has only a slight effect on performance. These key findings contribute to the advancement of fake news detection in Arabic, offering valuable insights for both academia and industry.
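
The three ensemble strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example probabilities, and the weights are all hypothetical, and the abstract does not say how the weights were chosen (here they are simply assumed to reflect each model's validation quality). Each model is assumed to output a probability per class, with class 0 meaning "real" and class 1 meaning "fake".

```python
from collections import Counter


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def hard_vote(probs):
    """Each model votes for its most probable class; the majority class wins."""
    votes = [argmax(p) for p in probs]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probs):
    """Average the class probabilities over all models, then take the argmax."""
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return argmax(avg)


def weighted_average(probs, weights):
    """Like soft voting, but each model's probabilities are scaled by a weight."""
    n_classes = len(probs[0])
    total = sum(weights)
    avg = [
        sum(w * p[c] for w, p in zip(weights, probs)) / total
        for c in range(n_classes)
    ]
    return argmax(avg)


# Hypothetical outputs of five models for one article: [P(real), P(fake)].
example_probs = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.52, 0.48],
    [0.05, 0.95],
    [0.10, 0.90],
]
# Hypothetical weights, e.g. derived from validation scores (an assumption).
example_weights = [0.9, 0.9, 0.9, 0.1, 0.1]

print(hard_vote(example_probs))                          # majority of votes
print(soft_vote(example_probs))                          # mean probability
print(weighted_average(example_probs, example_weights))  # weighted mean
```

In this made-up example, three weakly confident models outvote two highly confident ones under hard voting, soft voting sides with the confident pair, and the weighted average follows whichever models carry the larger weights. This is why the three strategies can disagree on the same inputs.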

Suggested Citation

  • Lama Al-Zahrani & Maha Al-Yahya, 2024. "Pre-Trained Language Model Ensemble for Arabic Fake News Detection," Mathematics, MDPI, vol. 12(18), pages 1-17, September.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:18:p:2941-:d:1482716

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/18/2941/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/18/2941/
    Download Restriction: no

    References listed on IDEAS

    1. Alexandre Bovet & Hernán A. Makse, 2019. "Influence of fake news in Twitter during the 2016 US presidential election," Nature Communications, Nature, vol. 10(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ciprian-Octavian Truică & Elena-Simona Apostol, 2022. "MisRoBÆRTa: Transformers versus Misinformation," Mathematics, MDPI, vol. 10(4), pages 1-25, February.
    2. Uğur Baloğlu, 2021. "Trolls, Pressure and Agenda: The discursive fight on Twitter in Turkey," Media and Communication, Cogitatio Press, vol. 9(4), pages 39-51.
    3. Mujtaba Ali Isani, 2021. "Methodological Problems of Using Arabic-Language Twitter as a Gauge for Arab Attitudes Toward Politics and Society," Contemporary Review of the Middle East, vol. 8(1), pages 22-35, March.
    4. Nwaibeh, E.A. & Chikwendu, C.R., 2023. "A deterministic model of the spread of scam rumor and its numerical simulations," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 207(C), pages 111-129.
    5. Xipeng Liu & Xinmiao Li, 2024. "Unbiased evaluation of ranking algorithms applied to the Chinese green patents citation network," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 2999-3021, June.
    6. Peter D. Lunn & Cameron A. Belton & Ciarán Lavin & Féidhlim P. McGowan & Shane Timmons & Deirdre A. Robertson, 2020. "Using behavioral science to help fight the Coronavirus," Journal of Behavioral Public Administration, Center for Experimental and Behavioral Public Administration, vol. 3(1).
    7. James Flamino & Alessandro Galeazzi & Stuart Feldman & Michael W. Macy & Brendan Cross & Zhenkun Zhou & Matteo Serafino & Alexandre Bovet & Hernán A. Makse & Boleslaw K. Szymanski, 2023. "Political polarization of news media and influencers on Twitter in the 2016 and 2020 US presidential elections," Nature Human Behaviour, Nature, vol. 7(6), pages 904-916, June.
    8. Matthew Spradling & Jeremy Straub, 2022. "Evaluation of the Factors That Impact the Perception of Online Content Trustworthiness by Income, Political Affiliation and Online Usage Time," Future Internet, MDPI, vol. 14(11), pages 1-55, November.
    9. Lodh, Rishab & Dey, Oindrila, 2023. "“Fake news alert!”: A game of misinformation and news consumption behavior," MPRA Paper 118371, University Library of Munich, Germany.
    10. Xu, Shuqi & Mariani, Manuel Sebastian & Lü, Linyuan & Medo, Matúš, 2020. "Unbiased evaluation of ranking metrics reveals consistent performance in science and technology citation data," Journal of Informetrics, Elsevier, vol. 14(1).
    11. Lipić, Tomislav & Štajduhar, Andrija & Medvidović, Luka & Wild, Dorian & Korošak, Dean & Podobnik, Boris, 2022. "Stringency without efficiency is not adequate to combat pandemics," Chaos, Solitons & Fractals, Elsevier, vol. 160(C).
    12. Junhui Cai & Dan Yang & Ran Chen & Wu Zhu & Haipeng Shen & Linda Zhao, 2021. "Network regression and supervised centrality estimation," Papers 2111.12921, arXiv.org, revised Feb 2025.
    13. Yevgeniy Golovchenko, 2020. "Measuring the scope of pro-Kremlin disinformation on Twitter," Palgrave Communications, Palgrave Macmillan, vol. 7(1), pages 1-11, December.
    14. Mao, Yajun & Rong, Zhihai & Wu, Zhi-Xi, 2021. "Effect of collective influence on the evolution of cooperation in evolutionary prisoner’s dilemma games," Applied Mathematics and Computation, Elsevier, vol. 392(C).
    15. So-Min Cheong & Matthew Babcock, 2021. "Attention to misleading and contentious tweets in the case of Hurricane Harvey," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 105(3), pages 2883-2906, February.
    16. Hyehyun Hong & Hyun Jee Oh, 2020. "Utilizing Bots for Sustainable News Business: Understanding Users’ Perspectives of News Bots in the Age of Social Media," Sustainability, MDPI, vol. 12(16), pages 1-16, August.
    17. Tiago A. Schieber & Laura C. Carpi & Panos M. Pardalos & Cristina Masoller & Albert Díaz-Guilera & Martín G. Ravetti, 2023. "Diffusion capacity of single and interconnected networks," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    18. Ye, Yucheng & Xu, Shuqi & Mariani, Manuel Sebastian & Lü, Linyuan, 2022. "Forecasting countries' gross domestic product from patent data," Chaos, Solitons & Fractals, Elsevier, vol. 160(C).
    19. Raj, Chahat & Meel, Priyanka, 2022. "People lie, actions Don't! Modeling infodemic proliferation predictors among social media users," Technology in Society, Elsevier, vol. 68(C).
    20. Lisa Singh & Leticia Bode & Ceren Budak & Kornraphop Kawintiranon & Colton Padden & Emily Vraga, 2020. "Understanding high- and low-quality URL Sharing on COVID-19 Twitter streams," Journal of Computational Social Science, Springer, vol. 3(2), pages 343-366, November.
