IDEAS home Printed from https://ideas.repec.org/a/eee/ijoais/v53y2024ics1467089524000150.html

Accounting fraud detection using contextual language learning

Author

Listed:
  • Bhattacharya, Indranil
  • Mickovic, Ana

Abstract

Accounting fraud is a widespread problem that causes significant damage in the economic market. Detecting and investigating fraudulent firms requires a large amount of time, money, and effort from corporate monitors and regulators. In this study, we explore how textual content from financial reports helps in detecting accounting fraud. Pre-trained contextual language learning models, such as BERT, have significantly advanced natural language processing in recent years. We fine-tune the BERT model on Management Discussion and Analysis (MD&A) sections of annual 10-K reports from the Securities and Exchange Commission (SEC) database. Our final model outperforms the textual benchmark model and the quantitative benchmark model from the previous literature by 15% and 12%, respectively. Further, when investigating the same number of firms, our model identifies five times more fraudulent firm-year observations than the textual benchmark and three times more than the quantitative benchmark. Optimizing this investigation process, so that more fraudulent observations are detected in an investigation sample of the same size, would be of great economic significance for regulators, investors, financial analysts, and auditors.
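
The comparison at a fixed investigation budget described in the abstract can be illustrated with a small sketch. The abstract does not specify the exact evaluation metric, so this is only an assumed reading: rank firm-year observations by each model's fraud score, investigate the top k, and count how many true fraud cases fall in that set. The function name `frauds_in_top_k`, the scores, and the labels are hypothetical toy values, not data or code from the paper.

```python
from typing import Sequence


def frauds_in_top_k(scores: Sequence[float], labels: Sequence[int], k: int) -> int:
    """Count true fraud cases among the k highest-scored observations."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Rank observations from most to least suspicious according to the model.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Labels are 0/1, so summing the top-k labels counts the captured frauds.
    return sum(label for _, label in ranked[:k])


# Hypothetical toy data: 1 = fraudulent firm-year, 0 = non-fraudulent.
labels = [1, 0, 0, 1, 0, 0, 1, 0]
model_a_scores = [0.9, 0.2, 0.1, 0.8, 0.3, 0.4, 0.7, 0.05]
model_b_scores = [0.3, 0.9, 0.8, 0.2, 0.7, 0.1, 0.75, 0.4]

k = 3  # same number of investigated observations for both models
captured_a = frauds_in_top_k(model_a_scores, labels, k)
captured_b = frauds_in_top_k(model_b_scores, labels, k)
print(f"Model A captures {captured_a} frauds in the top {k}; model B captures {captured_b}.")
```

In this toy setup, the ratio `captured_a / captured_b` plays the role of the "times more fraudulent observations" figure quoted in the abstract; with real data, the scores would come from the fitted classifiers rather than hand-written lists.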

Suggested Citation

  • Bhattacharya, Indranil & Mickovic, Ana, 2024. "Accounting fraud detection using contextual language learning," International Journal of Accounting Information Systems, Elsevier, vol. 53(C).
  • Handle: RePEc:eee:ijoais:v:53:y:2024:i:c:s1467089524000150
    DOI: 10.1016/j.accinf.2024.100682

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1467089524000150
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations



    Cited by:

    1. Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Li, Jing & Li, Nan & Xia, Tongshui & Guo, Jinjin, 2023. "Textual analysis and detection of financial fraud: Evidence from Chinese manufacturing firms," Economic Modelling, Elsevier, vol. 126(C).
    2. Nerissa C. Brown & Richard M. Crowley & W. Brooke Elliott, 2020. "What Are You Saying? Using topic to Detect Financial Misreporting," Journal of Accounting Research, Wiley Blackwell, vol. 58(1), pages 237-291, March.
    3. Rong Liu & Jujun Huang & Zhongju Zhang, 2023. "Tracking disclosure change trajectories for financial fraud detection," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 584-602, February.
    4. Senave, Elseline & Jans, Mieke J. & Srivastava, Rajendra P., 2023. "The application of text mining in accounting," International Journal of Accounting Information Systems, Elsevier, vol. 50(C).
    5. Xin Xu & Feng Xiong & Zhe An, 2023. "Using Machine Learning to Predict Corporate Fraud: Evidence Based on the GONE Framework," Journal of Business Ethics, Springer, vol. 186(1), pages 137-158, August.
    6. James P. Ryans, 2021. "Textual classification of SEC comment letters," Review of Accounting Studies, Springer, vol. 26(1), pages 37-80, March.
    7. Elias Zavitsanos & Dimitris Mavroeidis & Konstantinos Bougiatiotis & Eirini Spyropoulou & Lefteris Loukas & Georgios Paliouras, 2023. "Financial misstatement detection: a realistic evaluation," Papers 2305.17457, arXiv.org.
    8. Lukui Huang & Alan Abrahams & Peter Ractham, 2022. "Enhanced financial fraud detection using cost‐sensitive cascade forest with missing value imputation," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 29(3), pages 133-155, July.
    9. Yunchuan Sun & Xiaoping Zeng & Ying Xu & Hong Yue & Xipu Yu, 2024. "An intelligent detecting model for financial frauds in Chinese A‐share market," Economics and Politics, Wiley Blackwell, vol. 36(2), pages 1110-1136, July.
    10. Li, Guowen & Wang, Shuai & Feng, Yuyao, 2024. "Making differences work: Financial fraud detection based on multi-subject perceptions," Emerging Markets Review, Elsevier, vol. 60(C).
    11. Achakzai, Muhammad Atif Khan & Peng, Juan, 2023. "Detecting financial statement fraud using dynamic ensemble machine learning," International Review of Financial Analysis, Elsevier, vol. 89(C).
    12. Craja, Patricia & Kim, Alisa & Lessmann, Stefan, 2020. "Deep Learning application for fraud detection in financial statements," IRTG 1792 Discussion Papers 2020-007, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    13. Zvi Singer & Jing Zhang, 2022. "Do companies try to conceal financial misstatements through auditor shopping?," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 49(1-2), pages 140-180, January.
    14. Zhang, Chanyuan (Abigail) & Cho, Soohyun & Vasarhelyi, Miklos, 2022. "Explainable Artificial Intelligence (XAI) in auditing," International Journal of Accounting Information Systems, Elsevier, vol. 46(C).
    15. García, Diego & Hu, Xiaowen & Rohrer, Maximilian, 2023. "The colour of finance words," Journal of Financial Economics, Elsevier, vol. 147(3), pages 525-549.
    16. Dan Amiram & Zahn Bozanic & James D. Cox & Quentin Dupont & Jonathan M. Karpoff & Richard Sloan, 2018. "Financial reporting fraud and other forms of misconduct: a multidisciplinary review of the literature," Review of Accounting Studies, Springer, vol. 23(2), pages 732-783, June.
    17. Zhang, Yi & Liu, Tianxiang & Li, Weiping, 2024. "Corporate fraud detection based on linguistic readability vector: Application to financial companies in China," International Review of Financial Analysis, Elsevier, vol. 95(PB).
    18. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    19. Abdullah Albizri & Deniz Appelbaum & Nicholas Rizzotto, 2019. "Evaluation of financial statements fraud detection research: a multi-disciplinary analysis," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 16(4), pages 206-241, December.
    20. Achakzai, Muhammad Atif Khan & Juan, Peng, 2022. "Using machine learning Meta-Classifiers to detect financial frauds," Finance Research Letters, Elsevier, vol. 48(C).
