Printed from https://ideas.repec.org/p/arx/papers/2212.12051.html

Benchmarking Machine Learning Models to Predict Corporate Bankruptcy

Author

Listed:
  • Emmanuel Alanis
  • Sudheer Chava
  • Agam Shah

Abstract

Using a comprehensive sample of 2,585 bankruptcies from 1990 to 2019, we benchmark the performance of various machine learning models in predicting financial distress of publicly traded U.S. firms. We find that gradient boosted trees outperform other models in one-year-ahead forecasts. Variable permutation tests show that excess stock returns, idiosyncratic risk, and relative size are the more important variables for predictions. Textual features derived from corporate filings do not improve performance materially. In a credit competition model that accounts for the asymmetric cost of default misclassification, the survival random forest is able to capture large dollar profits.
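
The variable permutation tests mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the data are synthetic, the "model" is a hypothetical fixed scoring rule, and the use of AUC as the performance metric is an assumption. The idea shown is only the general one: shuffle one input column at a time and record how much predictive performance falls.

```python
import random

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in AUC when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    baseline = auc([predict(row) for row in X], y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - auc([predict(row) for row in X_perm], y)
        drops.append(total / n_repeats)
    return baseline, drops

# Synthetic toy data (not from the paper): feature 0 drives the label, feature 1 is noise.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
y = [1 if row[0] + 0.3 * rng.gauss(0, 1) < -1.0 else 0 for row in X]

# Hypothetical stand-in "model": a fixed score that uses only feature 0.
def predict(row):
    return -row[0]

baseline, drops = permutation_importance(predict, X, y)
print(f"baseline AUC: {baseline:.3f}")
for j, d in enumerate(drops):
    print(f"feature {j}: mean AUC drop {d:.3f}")
```

A feature the model relies on shows a large drop when shuffled, while a feature the model ignores shows none. In practice `predict` would be a fitted model evaluated on held-out data.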

Suggested Citation

  • Emmanuel Alanis & Sudheer Chava & Agam Shah, 2022. "Benchmarking Machine Learning Models to Predict Corporate Bankruptcy," Papers 2212.12051, arXiv.org.
  • Handle: RePEc:arx:papers:2212.12051

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2212.12051
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Sudheer Chava & Catalina Stefanescu & Stuart Turnbull, 2011. "Modeling the Loss Distribution," Management Science, INFORMS, vol. 57(7), pages 1267-1287, July.
    2. Merton, Robert C, 1974. "On the Pricing of Corporate Debt: The Risk Structure of Interest Rates," Journal of Finance, American Finance Association, vol. 29(2), pages 449-470, May.
    3. Sudheer Chava & Amiyatosh Purnanandam, 2010. "Is Default Risk Negatively Related to Stock Returns?," The Review of Financial Studies, Society for Financial Studies, vol. 23(6), pages 2523-2559, June.
    4. Edward I. Altman, 1968. "Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy," Journal of Finance, American Finance Association, vol. 23(4), pages 589-609, September.
    5. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.

    Citations

    Cited by:

    1. Agam Shah & Suvan Paturi & Sudheer Chava, 2023. "Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis," Papers 2305.07972, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Miao, Hong & Ramchander, Sanjay & Ryan, Patricia & Wang, Tianyang, 2018. "Default prediction models: The role of forward-looking measures of returns and volatility," Journal of Empirical Finance, Elsevier, vol. 46(C), pages 146-162.
    2. Ruey-Ching Hwang & Huimin Chung & Jiun-Yi Ku, 2013. "Predicting Recurrent Financial Distresses with Autocorrelation Structure: An Empirical Analysis from an Emerging Market," Journal of Financial Services Research, Springer;Western Finance Association, vol. 43(3), pages 321-341, June.
    3. Cathcart, Lara & Dufour, Alfonso & Rossi, Ludovico & Varotto, Simone, 2024. "Corporate bankruptcy and banking deregulation: The effect of financial leverage," Journal of Banking & Finance, Elsevier, vol. 166(C).
    4. Ferreira Filipe, Sara & Grammatikos, Theoharry & Michala, Dimitra, 2016. "Pricing default risk: The good, the bad, and the anomaly," Journal of Financial Stability, Elsevier, vol. 26(C), pages 190-213.
    5. Doshi, Hitesh & Patel, Saurin & Ramani, Srikanth & Sooy, Matthew, 2023. "Uncertain tone, asset volatility and credit default swap spreads," Journal of Contemporary Accounting and Economics, Elsevier, vol. 19(3).
    6. Sanjiv Das & Xin Huang & Soji Adeshina & Patrick Yang & Leonardo Bachega, 2023. "Credit Risk Modeling with Graph Machine Learning," INFORMS Journal on Data Science, INFORMS, vol. 2(2), pages 197-217, October.
    7. Andreou, Christoforos K. & Lambertides, Neophytos & Panayides, Photis M., 2021. "Distress risk anomaly and misvaluation," The British Accounting Review, Elsevier, vol. 53(5).
    8. Charitou, Andreas & Dionysiou, Dionysia & Lambertides, Neophytos & Trigeorgis, Lenos, 2013. "Alternative bankruptcy prediction models using option-pricing theory," Journal of Banking & Finance, Elsevier, vol. 37(7), pages 2329-2341.
    9. Jiang, Cuiqing & Lyu, Ximei & Yuan, Yufei & Wang, Zhao & Ding, Yong, 2022. "Mining semantic features in current reports for financial distress prediction: Empirical evidence from unlisted public firms in China," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1086-1099.
    10. de Groot, Wilma & Huij, Joop, 2018. "Are the Fama-French factors really compensation for distress risk?," Journal of International Money and Finance, Elsevier, vol. 86(C), pages 50-69.
    11. Nguyen, Ha, 2023. "An empirical application of Particle Markov Chain Monte Carlo to frailty correlated default models," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 103-121.
    12. Shiyan Yin & Kai Yao & Thanaset Chevapatrakul & Rong Huang, 2024. "Reduced disclosure and default risk: analysis of smaller reporting companies," Review of Quantitative Finance and Accounting, Springer, vol. 63(1), pages 355-395, July.
    13. Michael S. O'Doherty, 2012. "On the Conditional Risk and Performance of Financially Distressed Stocks," Management Science, INFORMS, vol. 58(8), pages 1502-1520, August.
    14. Stefano Filomeni & Udichibarna Bose & Anastasios Megaritis & Athanasios Triantafyllou, 2024. "Can market information outperform hard and soft information in predicting corporate defaults?," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 29(3), pages 3567-3592, July.
    15. Deniz Anginer & Çelim Yıldızhan, 2018. "Is There a Distress Risk Anomaly? Pricing of Systematic Default Risk in the Cross-section of Equity Returns [The risk-adjusted cost of financial distress]," Review of Finance, European Finance Association, vol. 22(2), pages 633-660.
    16. Assaf Eisdorfer & Amit Goyal & Alexei Zhdanov, 2018. "Distress Anomaly and Shareholder Risk: International Evidence," Financial Management, Financial Management Association International, vol. 47(3), pages 553-581, September.
    17. Howard Chan & Robert Faff & Paul Kofman, 2011. "Is default risk priced in Australian equity? Exploring the role of the business cycle," Australian Journal of Management, Australian School of Business, vol. 36(2), pages 217-246, August.
    18. Hwang, Ruey-Ching, 2012. "A varying-coefficient default model," International Journal of Forecasting, Elsevier, vol. 28(3), pages 675-688.
    19. George, Thomas J. & Hwang, Chuan-Yang, 2010. "A resolution of the distress risk and leverage puzzles in the cross section of stock returns," Journal of Financial Economics, Elsevier, vol. 96(1), pages 56-79, April.
    20. Krüger, Steffen & Oehme, Toni & Rösch, Daniel & Scheule, Harald, 2018. "A copula sample selection model for predicting multi-year LGDs and Lifetime Expected Losses," Journal of Empirical Finance, Elsevier, vol. 47(C), pages 246-262.
