IDEAS home Printed from https://ideas.repec.org/a/gam/jftint/v12y2020i5p87-d356350.html

Language-Independent Fake News Detection: English, Portuguese, and Spanish Mutual Features

Author

Listed:
  • Hugo Queiroz Abonizio

    (State University of Londrina (UEL), 86057-970 Londrina, Brazil
    These authors contributed equally to this work.)

  • Janaina Ignacio de Morais

    (State University of Londrina (UEL), 86057-970 Londrina, Brazil
    These authors contributed equally to this work.)

  • Gabriel Marques Tavares

    (Università degli Studi di Milano (UNIMI), 20122 Milan, Italy
    These authors contributed equally to this work.)

  • Sylvio Barbon Junior

    (State University of Londrina (UEL), 86057-970 Londrina, Brazil
    These authors contributed equally to this work.)

Abstract

Online Social Media (OSM) have been substantially transforming the process of spreading news, improving its speed, and reducing barriers toward reaching out to a broad audience. However, OSM are very limited in providing mechanisms to check the credibility of news propagated through their structure. The majority of studies on automatic fake news detection are restricted to English documents, with few works evaluating other languages, and none comparing language-independent characteristics. Moreover, the spreading of deceptive news tends to be a worldwide problem; therefore, this work evaluates textual features that are not tied to a specific language when describing textual data for detecting news. Corpora of news written in American English, Brazilian Portuguese, and Spanish were explored to study complexity, stylometric, and psychological text features. The extracted features support the detection of fake, legitimate, and satirical news. We compared four machine learning algorithms (k-Nearest Neighbors (k-NN), Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGB)) to induce the detection model. Results show our proposed language-independent features are successful in describing fake, satirical, and legitimate news across three different languages, with an average detection accuracy of 85.3% with RF.

Suggested Citation

  • Hugo Queiroz Abonizio & Janaina Ignacio de Morais & Gabriel Marques Tavares & Sylvio Barbon Junior, 2020. "Language-Independent Fake News Detection: English, Portuguese, and Spanish Mutual Features," Future Internet, MDPI, vol. 12(5), pages 1-18, May.
  • Handle: RePEc:gam:jftint:v:12:y:2020:i:5:p:87-:d:356350

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/12/5/87/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/12/5/87/
    Download Restriction: no

    References listed on IDEAS

    1. Lina Zhou & Judee K. Burgoon & Jay F. Nunamaker & Doug Twitchell, 2004. "Automating Linguistics-Based Cues for Detecting Deception in Text-Based Asynchronous Computer-Mediated Communications," Group Decision and Negotiation, Springer, vol. 13(1), pages 81-106, January.
    2. Hunt Allcott & Matthew Gentzkow, 2017. "Social Media and Fake News in the 2016 Election," NBER Working Papers 23089, National Bureau of Economic Research, Inc.
    3. Michael E. Tipping & Christopher M. Bishop, 1999. "Probabilistic Principal Component Analysis," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(3), pages 611-622.
    4. Hunt Allcott & Matthew Gentzkow, 2017. "Social Media and Fake News in the 2016 Election," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 211-236, Spring.
    5. Chengcheng Shao & Giovanni Luca Ciampaglia & Onur Varol & Kai-Cheng Yang & Alessandro Flammini & Filippo Menczer, 2018. "The spread of low-credibility content by social bots," Nature Communications, Nature, vol. 9(1), pages 1-9, December.

    Citations

    Cited by:

    1. Laura Studen & Victor Tiberius, 2020. "Social Media, Quo Vadis? Prospective Development and Implications," Future Internet, MDPI, vol. 12(9), pages 1-22, August.
    2. Lina Zhou & Jie Tao & Dongsong Zhang, 2023. "Does Fake News in Different Languages Tell the Same Story? An Analysis of Multi-level Thematic and Emotional Characteristics of News about COVID-19," Information Systems Frontiers, Springer, vol. 25(2), pages 493-512, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yevgeniy Golovchenko, 2020. "Measuring the scope of pro-Kremlin disinformation on Twitter," Palgrave Communications, Palgrave Macmillan, vol. 7(1), pages 1-11, December.
    2. Matilde Giaccherini & Joanna Kopinska & Gabriele Rovigatti, 2022. "Vax Populi: The Social Costs of Online Vaccine Skepticism," CESifo Working Paper Series 10184, CESifo.
    3. Ahmed Abouzeid & Ole-Christoffer Granmo & Christian Webersik & Morten Goodwin, 2021. "Learning Automata-based Misinformation Mitigation via Hawkes Processes," Information Systems Frontiers, Springer, vol. 23(5), pages 1169-1188, September.
    4. Ka Chung Ng & Ping Fan Ke & Mike K. P. So & Kar Yan Tam, 2023. "Augmenting fake content detection in online platforms: A domain adaptive transfer learning via adversarial training approach," Production and Operations Management, Production and Operations Management Society, vol. 32(7), pages 2101-2122, July.
    5. Divinus Oppong-Tawiah & Jane Webster, 2023. "Corporate Sustainability Communication as ‘Fake News’: Firms’ Greenwashing on Twitter," Sustainability, MDPI, vol. 15(8), pages 1-26, April.
    6. Julia Cagé & Nicolas Hervé & Marie-Luce Viaud, 2017. "The Production of Information in an Online World: Is Copy Right?," Working Papers hal-03393171, HAL.
    7. Leopoldo Fergusson & Carlos Molina, 2020. "Facebook Causes Protests," HiCN Working Papers 323, Households in Conflict Network.
    8. Tetsuro Kobayashi & Fumiaki Taka & Takahisa Suzuki, 2021. "Can “Googling” correct misbelief? Cognitive and affective consequences of online search," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-16, September.
    9. Dean Neu & Gregory D. Saxton & Abu S. Rahaman, 2022. "Social Accountability, Ethics, and the Occupy Wall Street Protests," Journal of Business Ethics, Springer, vol. 180(1), pages 17-31, September.
    10. Robbett, Andrea & Matthews, Peter Hans, 2018. "Partisan bias and expressive voting," Journal of Public Economics, Elsevier, vol. 157(C), pages 107-120.
    11. Henrik Skaug Sætra, 2021. "AI in Context and the Sustainable Development Goals: Factoring in the Unsustainability of the Sociotechnical System," Sustainability, MDPI, vol. 13(4), pages 1-19, February.
    12. Fathey Mohammed & Nabil Hasan Al-Kumaim & Ahmed Ibrahim Alzahrani & Yousef Fazea, 2023. "The Impact of Social Media Shared Health Content on Protective Behavior against COVID-19," IJERPH, MDPI, vol. 20(3), pages 1-16, January.
    13. Michele Cantarella & Nicolò Fraccaroli & Roberto Volpe, 2019. "Does fake news affect voting behaviour?," Department of Economics 0146, University of Modena and Reggio E., Faculty of Economics "Marco Biagi".
    14. Joël Cariolle & Yasmine Elkhateeb & Mathilde Maurel, 2022. "(Mis-)information technology: Internet use and perception of democracy in Africa," Documents de travail du Centre d'Economie de la Sorbonne 22010, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    15. Kerim Peren Arin & Juan A. Lacomba & Francisco Lagos & Deni Mazrekaj & Marcel Thum, 2021. "Misperceptions and Fake News during the Covid-19 Pandemic," CESifo Working Paper Series 9066, CESifo.
    16. Bartosz Wilczek, 2020. "Misinformation and herd behavior in media markets: A cross-national investigation of how tabloids’ attention to misinformation drives broadsheets’ attention to misinformation in political and business," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-22, November.
    17. Barrera, Oscar & Guriev, Sergei & Henry, Emeric & Zhuravskaya, Ekaterina, 2020. "Facts, alternative facts, and fact checking in times of post-truth politics," Journal of Public Economics, Elsevier, vol. 182(C).
    18. Sumeet Kumar & Binxuan Huang & Ramon Alfonso Villa Cox & Kathleen M. Carley, 2021. "An anatomical comparison of fake-news and trusted-news sharing pattern on Twitter," Computational and Mathematical Organization Theory, Springer, vol. 27(2), pages 109-133, June.
    19. Julia Cagé & Nicolas Hervé & Marie-Luce Viaud, 2020. "The Production of Information in an Online World," Review of Economic Studies, Oxford University Press, vol. 87(5), pages 2126-2164.
    20. Zazli Lily Wisker & Robert Neil McKie, 2021. "The effect of fake news on anger and negative word-of-mouth: moderating roles of religiosity and conservatism," Journal of Marketing Analytics, Palgrave Macmillan, vol. 9(2), pages 144-153, June.
