Printed from https://ideas.repec.org/p/arx/papers/2412.20438.html

Integrating Natural Language Processing Techniques of Text Mining Into Financial System: Applications and Limitations

Author

Listed:
  • Denisa Millo
  • Blerina Vika
  • Nevila Baci

Abstract

The financial sector, a pivotal force in economic development, increasingly uses intelligent technologies such as natural language processing to enhance data processing and insight extraction. Through a review of research published between 2018 and 2023, this paper explores the use of text mining, as a set of natural language processing techniques, in various components of the financial system, including asset pricing, corporate finance, derivatives, risk management, and public finance, and its discussion section highlights the specific problems that need to be addressed. We find that most of the reviewed studies combine probabilistic models with vector-space models, and text data with numerical data. The most used information-processing technique is information classification, and the most used algorithms include long short-term memory (LSTM) and bidirectional encoder models. We also find that new, specialized algorithms are being developed and that the research focuses mainly on the asset pricing component of the financial system. The paper also proposes a path, from an engineering perspective, for researchers who need to analyze financial text. Text mining challenges such as data quality, context adaptation, and model interpretability need to be solved in order to integrate advanced natural language processing models and techniques into financial analysis and prediction.

Keywords: Financial System (FS), Natural Language Processing (NLP), Software and Text Engineering, Probabilistic, Vector-Space, Models, Techniques, TextData, Financial Analysis.
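The abstract notes that most reviewed studies pair a vector-space representation with a probabilistic model and that classification is the most common task. As a rough illustration only, and not the authors' method, the sketch below represents each headline as a bag-of-words count vector and classifies it with a multinomial Naive Bayes model using add-one smoothing. The function names, the toy headlines, and the labels are all hypothetical and invented for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(docs):
    """Fit counts from (text, label) pairs.

    Returns per-label document counts, per-label term counts, and the vocabulary.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    # Vector-space step: the document becomes a term-count vector over the vocabulary.
    vector = Counter(t for t in tokenize(text) if t in vocab)
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Probabilistic step: log prior plus smoothed log likelihood of each term.
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for term, count in vector.items():
            score += count * math.log((word_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration; not taken from the paper.
TOY_HEADLINES = [
    ("profits surge as revenue beats forecast", "positive"),
    ("shares rally after strong earnings", "positive"),
    ("company warns of losses and weak demand", "negative"),
    ("shares fall after profit warning", "negative"),
]

model = train(TOY_HEADLINES)
print(classify("strong revenue and earnings", model))  # prints: positive
```

A real study would use a far larger labeled corpus and a proper tokenizer; the point here is only how a count vector feeds a probabilistic classifier.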

Suggested Citation

  • Denisa Millo & Blerina Vika & Nevila Baci, 2024. "Integrating Natural Language Processing Techniques of Text Mining Into Financial System: Applications and Limitations," Papers 2412.20438, arXiv.org.
  • Handle: RePEc:arx:papers:2412.20438

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2412.20438
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mazzotta, Stefano, 2022. "Immigration narrative sentiment from TV news and the stock market," Journal of Behavioral and Experimental Finance, Elsevier, vol. 34(C).
    2. Kurowski, Łukasz & Smaga, Paweł, 2023. "Analysing financial stability reports as crisis predictors with the use of text-mining," The Journal of Economic Asymmetries, Elsevier, vol. 28(C).
    3. Yinheng Li & Shaofei Wang & Han Ding & Hang Chen, 2023. "Large Language Models in Finance: A Survey," Papers 2311.10723, arXiv.org, revised Jul 2024.
    4. Mazzotta, Stefano, 2024. "Immigration Narrative and Home Prices," Journal of Behavioral and Experimental Finance, Elsevier, vol. 43(C).
    5. Jose Carreno, 2020. "Housing Booms and the U.S. Productivity Puzzle," Working Papers 20-4, Center for Economic Studies, U.S. Census Bureau.
    6. Theresa Kuchler & Monika Piazzesi & Johannes Stroebel, 2022. "Housing Market Expectations," CESifo Working Paper Series 9665, CESifo.
    7. Alessia De Stefani, 2021. "House price history, biased expectations, and credit cycles: The role of housing investors," Real Estate Economics, American Real Estate and Urban Economics Association, vol. 49(4), pages 1238-1266, December.
    8. Wenlu Zhao & Guanghu Jin & Chenyue Huang & Jinji Zhang, 2023. "Attention and Sentiment of the Chinese Public toward a 3D Greening System Based on Sina Weibo," IJERPH, MDPI, vol. 20(5), pages 1-20, February.
    9. Luca Barbaglia & Sergio Consoli & Sebastiano Manzan, 2024. "Forecasting GDP in Europe with textual data," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(2), pages 338-355, March.
    10. Zhenyu Gao & Michael Sockin & Wei Xiong, 2020. "Learning about the Neighborhood," NBER Working Papers 26907, National Bureau of Economic Research, Inc.
    11. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    12. Chi-Young Choi & Alexander Chudik & Aaron Smallwood, 2024. "Time-varying Persistence of House Price Growth: The Role of Expectations and Credit Supply," Globalization Institute Working Papers 426, Federal Reserve Bank of Dallas.
    13. Kentaka Aruga & Md. Monirul Islam & Yoshihiro Zenno & Arifa Jannat, 2022. "Developing Novel Technique for Investigating Guidelines and Frameworks: A Text Mining Comparison between International and Japanese Green Bonds," JRFM, MDPI, vol. 15(9), pages 1-17, August.
    14. Hassnian Ali & Ahmet Faruk Aysan, 2023. "What will ChatGPT revolutionize in the financial industry?," Modern Finance, Modern Finance Institute, vol. 1(1), pages 116-129.
    15. Yuanying Chi & Mingjian Yan & Yuexia Pang & Hongbo Lei, 2022. "Financial Risk Assessment of Photovoltaic Industry Listed Companies Based on Text Mining," Sustainability, MDPI, vol. 14(19), pages 1-17, September.
    16. Maria Saveria Mavillonio, 2024. "Natural Language Processing Techniques for Long Financial Document," Discussion Papers 2024/317, Dipartimento di Economia e Management (DEM), University of Pisa, Pisa, Italy.
    17. Costas Milas & Theodore Panagiotidis & Theologos Dergiades, 2021. "Does It Matter Where You Search? Twitter versus Traditional News Media," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 53(7), pages 1757-1795, October.
    18. Bekiros, Stelios & Nilavongse, Rachatar & Uddin, Gazi Salah, 2020. "Expectation-driven house prices and debt defaults: The effectiveness of monetary and macroprudential policies," Journal of Financial Stability, Elsevier, vol. 49(C).
    19. José Francisco Lima & Fernanda Catarina Pereira & Arminda Manuela Gonçalves & Marco Costa, 2023. "Bootstrapping State-Space Models: Distribution-Free Estimation in View of Prediction and Forecasting," Forecasting, MDPI, vol. 6(1), pages 1-19, December.
    20. Francisco Amaral & Martin Dohmen & Sebastian Kohl & Moritz Schularick, 2021. "Superstar Returns," ECONtribute Discussion Papers Series 131, University of Bonn and University of Cologne, Germany.

    More about this item

    Keywords

    financial system (fs); natural language processing (nlp); software and text engineering; probabilistic; vector-space; models; techniques; textdata; financial analysis
