
Text Mining for Big Data Analysis in Financial Sector: A Literature Review

Author

Listed:
  • Mirjana Pejić Bach

    (Faculty of Economics & Business, University of Zagreb, 10000 Zagreb, Croatia)

  • Živko Krstić

    (Atomic Intelligence, 10000 Zagreb, Croatia)

  • Sanja Seljan

    (Faculty of Humanities and Social Sciences, Information and Communication Sciences, University of Zagreb, 10000 Zagreb, Croatia)

  • Lejla Turulja

    (School of Economics and Business, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina)

Abstract

Big data technologies have had a strong impact on many industries over the last decade, an impact that continues today and tends toward omnipresence. The financial sector, like most other sectors, has concentrated its operating activities mostly on the investigation of structured data. However, with the support of big data technologies, information stored in diverse sources of semi-structured and unstructured data can now be harvested. Recent research and practice indicate that such information can be of interest for the decision-making process. Questions about how and to what extent research on data mining in the financial sector has developed, and which tools are used for these purposes, remain largely unexplored. This study aims to answer three research questions: (i) What is the intellectual core of the field? (ii) Which techniques are used in the financial sector for text mining, especially in the era of the Internet, big data, and social media? (iii) Which data sources are most often used for text mining in the financial sector, and for which purposes? To answer these questions, a qualitative analysis of the literature is carried out using a systematic literature review together with citation and co-citation analysis.

Suggested Citation

  • Mirjana Pejić Bach & Živko Krstić & Sanja Seljan & Lejla Turulja, 2019. "Text Mining for Big Data Analysis in Financial Sector: A Literature Review," Sustainability, MDPI, vol. 11(5), pages 1-27, February.
  • Handle: RePEc:gam:jsusta:v:11:y:2019:i:5:p:1277-:d:209769

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/11/5/1277/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/11/5/1277/
    Download Restriction: no

    References listed on IDEAS

    1. Won Sang Lee & So Young Sohn, 2017. "Identifying Emerging Trends of Financial Business Method Patents," Sustainability, MDPI, vol. 9(9), pages 1-21, September.
    2. Gray, Glen L. & Debreceny, Roger S., 2014. "A taxonomy to guide research on the application of data mining to fraud detection in financial statement audits," International Journal of Accounting Information Systems, Elsevier, vol. 15(4), pages 357-380.
    3. Paul C. Tetlock, 2007. "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, American Finance Association, vol. 62(3), pages 1139-1168, June.
    4. Zekić-Sušac Marijana & Has Adela, 2015. "Data Mining as Support to Knowledge Management in Marketing," Business Systems Research, Sciendo, vol. 6(2), pages 18-30, September.
    5. Josephat Lotto, 2018. "Examination of the Status of Financial Inclusion and Its Determinants in Tanzania," Sustainability, MDPI, vol. 10(8), pages 1-15, August.
    6. David Moher & Alessandro Liberati & Jennifer Tetzlaff & Douglas G Altman & The PRISMA Group, 2009. "Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement," PLOS Medicine, Public Library of Science, vol. 6(7), pages 1-6, July.
    7. K. Coussement & D. van den Poel, 2008. "Integrating the voice of customers through call center emails into a decision support system for churn prediction," Post-Print hal-00788086, HAL.
    8. Charness, Gary & Gneezy, Uri, 2012. "Strong Evidence for Gender Differences in Risk Taking," Journal of Economic Behavior & Organization, Elsevier, vol. 83(1), pages 50-58.
    9. Eric Abrahamson & Lori Rosenkopf, 1997. "Social Network Effects on the Extent of Innovation Diffusion: A Computer Simulation," Organization Science, INFORMS, vol. 8(3), pages 289-309, June.

    Citations


    Cited by:

    1. Polak, Marija & Kolić Stanić, Matilda & Togonal, Marijana, 2022. "Artificial Intelligence in Communication with Music Fans: An Example from South Korea," Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference (2022), Hybrid Conference, Opatija, Croatia, in: Proceedings of the ENTRENOVA - ENTerprise REsearch InNOVAtion Conference, Hybrid Conference, Opatija, Croatia, 17-18 June 2022, pages 48-63, IRENET - Society for Advancing Innovation and Research in Economy, Zagreb.
    2. Osmud Rahman & Dingtao Hu & Benjamin C. M. Fung, 2023. "A Systematic Literature Review of Fashion, Sustainability, and Consumption Using a Mixed Methods Approach," Sustainability, MDPI, vol. 15(16), pages 1-37, August.
    3. Aaryan Gupta & Vinya Dengre & Hamza Abubakar Kheruwala & Manan Shah, 2020. "Comprehensive review of text-mining applications in finance," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 6(1), pages 1-25, December.
    4. Chunting Liu & Shanshan Wang & Guozhu Jia, 2020. "Exploring E-Commerce Big Data and Customer-Perceived Value: An Empirical Study on Chinese Online Customers," Sustainability, MDPI, vol. 12(20), pages 1-22, October.
    5. Gang-Hoon Seo & Munehiko Itoh, 2020. "Perceptions of Customers as Sustained Competitive Advantages of Global Marketing Airline Alliances: A Hybrid Text Mining Approach," Sustainability, MDPI, vol. 12(15), pages 1-15, August.
    6. Nguyen Bang & Nguyen Van-Ho & Ho Thanh, 2021. "Sentiment Analysis of Customer Feedback in Online Food Ordering Services," Business Systems Research, Sciendo, vol. 12(2), pages 46-59, December.
    7. Alexander Musaev & Andrey Makshanov & Dmitry Grigoriev, 2022. "Numerical Studies of Channel Management Strategies for Nonstationary Immersion Environments: EURUSD Case Study," Mathematics, MDPI, vol. 10(9), pages 1-20, April.
    8. Yuegang Song & Ruibing Wu, 2022. "The Impact of Financial Enterprises’ Excessive Financialization Risk Assessment for Risk Control based on Data Mining and Machine Learning," Computational Economics, Springer;Society for Computational Economics, vol. 60(4), pages 1245-1267, December.
    9. Vasja Roblek & Maja Meško & Mirjana Pejić Bach & Oshane Thorpe & Polona Šprajc, 2020. "The Interaction between Internet, Sustainable Development, and Emergence of Society 5.0," Data, MDPI, vol. 5(3), pages 1-27, September.
    10. Seo Gang-Hoon, 2020. "A Content Analysis of International Airline Alliances Mission Statements," Business Systems Research, Sciendo, vol. 11(1), pages 89-105, March.
    11. Rybinski, Krzysztof, 2021. "Ranking professional forecasters by the predictive power of their narratives," International Journal of Forecasting, Elsevier, vol. 37(1), pages 186-204.
    12. Fernando Zambrano Farias & María del Carmen Valls Martínez & Pedro Antonio Martín-Cervantes, 2021. "Explanatory Factors of Business Failure: Literature Review and Global Trends," Sustainability, MDPI, vol. 13(18), pages 1-26, September.
    13. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    14. Rybinski, Krzysztof, 2020. "The forecasting power of the multi-language narrative of sell-side research: A machine learning evaluation," Finance Research Letters, Elsevier, vol. 34(C).
    15. Ćurlin Tamara & Jaković Božidar & Miloloža Ivan, 2019. "Twitter usage in Tourism: Literature Review," Business Systems Research, Sciendo, vol. 10(1), pages 102-119, April.
    16. Seo Gang-Hoon & Itoh Munehiko & Li Zhonghui, 2021. "Strategic Communication and Competitive Advantage: Assessing CEO Letters of Global Airline Alliances," Foundations of Management, Sciendo, vol. 13(1), pages 57-72, January.
    17. Cristina Ortega-Rodríguez & Ana Licerán-Gutiérrez & Antonio Luis Moreno-Albarracín, 2020. "Transparency as a Key Element in Accountability in Non-Profit Organizations: A Systematic Literature Review," Sustainability, MDPI, vol. 12(14), pages 1-22, July.
    18. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    19. Carlos Arcila-Calderón & David Blanco-Herrero & Maximiliano Frías-Vázquez & Francisco Seoane-Pérez, 2021. "Refugees Welcome? Online Hate Speech and Sentiments in Twitter in Spain during the Reception of the Boat Aquarius," Sustainability, MDPI, vol. 13(5), pages 1-16, March.
    20. Senave, Elseline & Jans, Mieke J. & Srivastava, Rajendra P., 2023. "The application of text mining in accounting," International Journal of Accounting Information Systems, Elsevier, vol. 50(C).
    21. Vasja Roblek & Oshane Thorpe & Mirjana Pejic Bach & Andrej Jerman & Maja Meško, 2020. "The Fourth Industrial Revolution and the Sustainability Practices: A Comparative Automated Content Analysis Approach of Theory and Practice," Sustainability, MDPI, vol. 12(20), pages 1-27, October.
    22. Rongjiang Cai & Tao Lv & Xu Deng, 2021. "Evaluation of Environmental Information Disclosure of Listed Companies in China’s Heavy Pollution Industries: A Text Mining-Based Methodology," Sustainability, MDPI, vol. 13(10), pages 1-23, May.
    23. Vrdoljak Ivana, 2023. "Lifelong Education in Economics, Business and Management Research: Literature Review," Business Systems Research, Sciendo, vol. 14(1), pages 153-172, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jan Luca Pletzer & Romina Nikolova & Karina Karolina Kedzior & Sven Constantin Voelpel, 2015. "Does Gender Matter? Female Representation on Corporate Boards and Firm Financial Performance - A Meta-Analysis," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-20, June.
    2. Craja, Patricia & Kim, Alisa & Lessmann, Stefan, 2020. "Deep Learning application for fraud detection in financial statements," IRTG 1792 Discussion Papers 2020-007, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    3. İlkay Unay-Gailhard & Mark A. Brennen, 2022. "How digital communications contribute to shaping the career paths of youth: a review study focused on farming as a career option," Agriculture and Human Values, Springer;The Agriculture, Food, & Human Values Society (AFHVS), vol. 39(4), pages 1491-1508, December.
    4. Mahin Ghafari & Vali Baigi & Zahra Cheraghi & Amin Doosti-Irani, 2016. "The Prevalence of Asymptomatic Bacteriuria in Iranian Pregnant Women: A Systematic Review and Meta-Analysis," PLOS ONE, Public Library of Science, vol. 11(6), pages 1-10, June.
    5. Elizabeth T Cafiero-Fonseca & Andrew Stawasz & Sydney T Johnson & Reiko Sato & David E Bloom, 2017. "The full benefits of adult pneumococcal vaccination: A systematic review," PLOS ONE, Public Library of Science, vol. 12(10), pages 1-23, October.
    6. Santos Urbina & Sofía Villatoro & Jesús Salinas, 2021. "Self-Regulated Learning and Technology-Enhanced Learning Environments in Higher Education: A Scoping Review," Sustainability, MDPI, vol. 13(13), pages 1-12, June.
    7. Ertac, Seda & Gumren, Mert & Gurdal, Mehmet Y., 2020. "Demand for decision autonomy and the desire to avoid responsibility in risky environments: Experimental evidence," Journal of Economic Psychology, Elsevier, vol. 77(C).
    8. Fereshteh Mahmoudian & Johnny Jermias, 2022. "The influence of governance structure on the relationship between pay ratio and environmental and social performance," Business Strategy and the Environment, Wiley Blackwell, vol. 31(7), pages 2992-3013, November.
    9. Oded Berger-Tal & Alison L Greggor & Biljana Macura & Carrie Ann Adams & Arden Blumenthal & Amos Bouskila & Ulrika Candolin & Carolina Doran & Esteban Fernández-Juricic & Kiyoko M Gotanda & Catherine , 2019. "Systematic reviews and maps as tools for applying behavioral ecology to management and policy," Behavioral Ecology, International Society for Behavioral Ecology, vol. 30(1), pages 1-8.
    10. Nadine Desrochers & Adèle Paul‐Hus & Jen Pecoskie, 2017. "Five decades of gratitude: A meta‐synthesis of acknowledgments research," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(12), pages 2821-2833, December.
    11. Maryono, Maryono & Killoes, Aditya Marendra & Adhikari, Rajendra & Abdul Aziz, Ammar, 2024. "Agriculture development through multi-stakeholder partnerships in developing countries: A systematic literature review," Agricultural Systems, Elsevier, vol. 213(C).
    12. Alene Sze Jing Yong & Yi Heng Lim & Mark Wing Loong Cheong & Ednin Hamzah & Siew Li Teoh, 2022. "Willingness-to-pay for cancer treatment and outcome: a systematic review," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 23(6), pages 1037-1057, August.
    13. Xue-Ying Xu & Hong Kong & Rui-Xiang Song & Yu-Han Zhai & Xiao-Fei Wu & Wen-Si Ai & Hong-Bo Liu, 2014. "The Effectiveness of Noninvasive Biomarkers to Predict Hepatitis B-Related Significant Fibrosis and Cirrhosis: A Systematic Review and Meta-Analysis of Diagnostic Test Accuracy," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-16, June.
    14. Vicente Miñana-Signes & Manuel Monfort-Pañego & Javier Valiente, 2021. "Teaching Back Health in the School Setting: A Systematic Review of Randomized Controlled Trials," IJERPH, MDPI, vol. 18(3), pages 1-18, January.
    15. Müller, Karsten, 2020. "German forecasters' narratives: How informative are German business cycle forecast reports?," Working Papers 23, German Research Foundation's Priority Programme 1859 "Experience and Expectation. Historical Foundations of Economic Behaviour", Humboldt University Berlin.
    16. Agnieszka A. Tubis & Katarzyna Grzybowska, 2022. "In Search of Industry 4.0 and Logistics 4.0 in Small-Medium Enterprises—A State of the Art Review," Energies, MDPI, vol. 15(22), pages 1-26, November.
    17. Goedde-Menke, Michael & Langer, Thomas & Pfingsten, Andreas, 2014. "Impact of the financial crisis on bank run risk – Danger of the days after," Journal of Banking & Finance, Elsevier, vol. 40(C), pages 522-533.
    18. David E. Allen & Michael McAleer & Abhay K. Singh, 2019. "Daily market news sentiment and stock prices," Applied Economics, Taylor & Francis Journals, vol. 51(30), pages 3212-3235, June.
    19. Dilmaghani, Maryam, 2021. "A matter of time: Gender, time constraint, and risk taking among the chess elite," Economics Letters, Elsevier, vol. 208(C).
    20. Obsa Urgessa Ayana & Jima Degaga, 2022. "Effects of rural electrification on household welfare: a meta-regression analysis," International Review of Economics, Springer;Happiness Economics and Interpersonal Relations (HEIRS), vol. 69(2), pages 209-261, June.
