Printed from https://ideas.repec.org/p/nsb/bilten/27.html

News article analysis using Naive Bayes classifier

Author

Listed:
  • Ana Vujovic

    (National Bank of Serbia)

Abstract

The paper presents the Naive Bayes classifier (NBC), one of the standard models for classification problems, in the context of textual analysis. The model is examined first from a theoretical and then from a practical perspective. An empirical study was conducted to classify news articles by topic using the NBC. The results confirm that the NBC has high predictive power despite the simplifying assumptions on which it is based. These findings suggest potential for further application of the NBC in the thematic classification of texts, which may have significant implications for economic research.
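
As a rough illustration of the kind of model the abstract describes, the following minimal sketch implements a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the author's implementation: the function names `train_nb` and `predict_nb`, the whitespace tokenizer, the smoothing choice, and the four toy headlines are all assumptions made for this example. The "simplifying assumption" is visible in the scoring loop, where each word contributes to the class score independently of the others.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # word frequencies per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()     # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def predict_nb(model, text):
    """Return the class with the highest log posterior score for `text`."""
    # Words never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    scores = {}
    for label, n_class in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        # Log prior: share of training documents in this class.
        score = math.log(n_class / model["n_docs"])
        # Naive assumption: words are conditionally independent given the class,
        # so their smoothed log likelihoods simply add up.
        for t in tokens:
            score += math.log((counts[t] + alpha) / (total + alpha * vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy corpus, for illustration only.
docs = [
    "central bank raises interest rate",
    "inflation and interest rate outlook",
    "team wins football match",
    "football player scores goal",
]
labels = ["economy", "economy", "sports", "sports"]

model = train_nb(docs, labels)
print(predict_nb(model, "bank cuts interest rate"))
print(predict_nb(model, "football goal"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from driving that class's probability to zero.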

Suggested Citation

  • Ana Vujovic, 2025. "News article analysis using Naive Bayes classifier," Working Papers Bulletin 27, National Bank of Serbia.
  • Handle: RePEc:nsb:bilten:27

    Download full text from publisher

    File URL: https://www.nbs.rs/documents-eng/publikacije/wp_bulletin/wp_bulletin_03_25_5.pdf
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. David Bholat & Stephen Hansen & Pedro Santos & Cheryl Schonhardt-Bailey, 2015. "Text mining for central banks," Handbooks, Centre for Central Banking Studies, Bank of England, number 33, April.
    2. Scott R. Baker & Nicholas Bloom & Steven J. Davis, 2016. "Measuring Economic Policy Uncertainty," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 131(4), pages 1593-1636.
    3. Eleni Kalamara & Arthur Turrell & Chris Redl & George Kapetanios & Sujit Kapadia, 2022. "Making text count: Economic forecasting using newspaper text," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 896-919, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    2. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    3. Anesti, Nikoleta & Kalamara, Eleni & Kapetanios, George, 2021. "Forecasting UK GDP growth with large survey panels," Bank of England working papers 923, Bank of England.
    4. Fernandez, Raul & Palma Guizar, Brenda & Rho, Caterina, 2021. "A sentiment-based risk indicator for the Mexican financial sector," Latin American Journal of Central Banking (previously Monetaria), Elsevier, vol. 2(3).
    5. Luca Barbaglia & Sergio Consoli & Sebastiano Manzan, 2024. "Forecasting GDP in Europe with textual data," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(2), pages 338-355, March.
    6. Youngjoon Lee & Soohyon Kim & Ki Young Park, 2018. "Deciphering Monetary Policy Committee Minutes with Text Mining Approach: A Case of South Korea," Working papers 2018rwp-132, Yonsei University, Yonsei Economics Research Institute.
    7. Johannes Zahner, 2020. "Above, but close to two percent. Evidence on the ECB’s inflation target using text mining," MAGKS Papers on Economics 202046, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    8. Leonardo N. Ferreira, 2021. "Forecasting with VAR-teXt and DFM-teXt Models: exploring the predictive power of central bank communication," Working Papers Series 559, Central Bank of Brazil, Research Department.
    9. Massimo Ferrari Minesso & Laura Lebastard & Helena Mezo, 2023. "Text-Based Recession Probabilities," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(2), pages 415-438, June.
    10. Kohns, David & Bhattacharjee, Arnab, 2023. "Nowcasting growth using Google Trends data: A Bayesian Structural Time Series model," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1384-1412.
    11. Erik Andres-Escayola & Corinna Ghirelli & Luis Molina & Javier J. Pérez & Elena Vidal, 2022. "Using newspapers for textual indicators: which and how many?," Working Papers 2235, Banco de España.
    12. Stolbov, Mikhail & Shchepeleva, Maria & Karminsky, Alexander, 2022. "When central bank research meets Google search: A sentiment index of global financial stress," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 81(C).
    13. Aprigliano, Valentina & Emiliozzi, Simone & Guaitoli, Gabriele & Luciani, Andrea & Marcucci, Juri & Monteforte, Libero, 2023. "The power of text-based indicators in forecasting Italian economic activity," International Journal of Forecasting, Elsevier, vol. 39(2), pages 791-808.
    14. Philip ME Garboden, 2019. "Sources and Types of Big Data for Macroeconomic Forecasting," Working Papers 2019-3, University of Hawaii Economic Research Organization, University of Hawaii at Manoa.
    15. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2022. "News media versus FRED‐MD for macroeconomic forecasting," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(1), pages 63-81, January.
    16. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Papers No 6/2018, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    17. Nicolò Fraccaroli & Alessandro Giovannini & Jean-François Jamet & Eric Persson, 2023. "Central Banks in Parliaments: A Text Analysis of the Parliamentary Hearings of the Bank of England, the European Central Bank, and the Federal Reserve," International Journal of Central Banking, International Journal of Central Banking, vol. 19(2), pages 543-600, June.
    18. Saiz, Lorena & Ashwin, Julian & Kalamara, Eleni, 2021. "Nowcasting euro area GDP with news sentiment: a tale of two crises," Working Paper Series 2616, European Central Bank.
    19. Dooruj Rambaccussing & Craig Menzies & Andrzej Kwiatkowski, 2022. "Look who’s Talking: Individual Committee members’ impact on inflation expectations," Dundee Discussion Papers in Economics 305, Economic Studies, University of Dundee.
    20. Nyman, Rickard & Kapadia, Sujit & Tuckett, David, 2021. "News and narratives in financial systems: Exploiting big data for systemic risk assessment," Journal of Economic Dynamics and Control, Elsevier, vol. 127(C).

    More about this item

    Keywords

    Naive Bayes classifier; thematic classification; natural language processing

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • E37 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Forecasting and Simulation: Models and Applications
