Printed from https://ideas.repec.org/a/gam/jftint/v14y2022i8p244-d894540.html

Machine Learning for Bankruptcy Prediction in the American Stock Market: Dataset and Benchmarks

Author

Listed:
  • Gianfranco Lombardo

    (Department of Engineering and Architecture, University of Parma, 43124 Parma, Italy
    These authors contributed equally to this work.)

  • Mattia Pellegrino

    (Department of Engineering and Architecture, University of Parma, 43124 Parma, Italy
    These authors contributed equally to this work.)

  • George Adosoglou

    (Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611, USA
    These authors contributed equally to this work.)

  • Stefano Cagnoni

    (Department of Engineering and Architecture, University of Parma, 43124 Parma, Italy
    These authors contributed equally to this work.)

  • Panos M. Pardalos

    (Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611, USA
    These authors contributed equally to this work.)

  • Agostino Poggi

    (Department of Engineering and Architecture, University of Parma, 43124 Parma, Italy
    These authors contributed equally to this work.)

Abstract

Predicting corporate bankruptcy is one of the fundamental tasks in credit risk assessment. In particular, since the 2007/2008 financial crisis, it has become a priority for most financial institutions, practitioners, and academics. The recent advancements in machine learning (ML) enabled the development of several models for bankruptcy prediction. The most challenging aspect of this task is dealing with the class imbalance due to the rarity of bankruptcy events in the real economy. Furthermore, a fair comparison in the literature is difficult to make because bankruptcy datasets are not publicly available and because studies often restrict their datasets to specific economic sectors and markets and/or time periods. In this work, we investigated the design and the application of different ML models to two different tasks related to default events: (a) estimating survival probabilities over time; (b) default prediction using time-series accounting data with different lengths. The entire dataset used for the experiments has been made available to the scientific community for further research and benchmarking purposes. The dataset pertains to 8262 different public companies listed on the American stock market between 1999 and 2018. Finally, in light of the results obtained, we critically discuss the most interesting metrics as proposed benchmarks for future studies.

Suggested Citation

  • Gianfranco Lombardo & Mattia Pellegrino & George Adosoglou & Stefano Cagnoni & Panos M. Pardalos & Agostino Poggi, 2022. "Machine Learning for Bankruptcy Prediction in the American Stock Market: Dataset and Benchmarks," Future Internet, MDPI, vol. 14(8), pages 1-23, August.
  • Handle: RePEc:gam:jftint:v:14:y:2022:i:8:p:244-:d:894540

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/14/8/244/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/14/8/244/
    Download Restriction: no

    References listed on IDEAS

    1. John Y. Campbell & Jens Hilscher & Jan Szilagyi, 2008. "In Search of Distress Risk," Journal of Finance, American Finance Association, vol. 63(6), pages 2899-2939, December.
    2. Błażej Prusak, 2018. "Review of Research into Enterprise Bankruptcy Prediction in Selected Central and Eastern European Countries," International Journal of Financial Studies, MDPI, vol. 6(3), pages 1-28, June.
    3. Duan, Jin-Chuan & Sun, Jie & Wang, Tao, 2012. "Multiperiod corporate default prediction—A forward intensity approach," Journal of Econometrics, Elsevier, vol. 170(1), pages 191-209.
    4. Wanke, Peter & Barros, Carlos P. & Faria, João R., 2015. "Financial distress drivers in Brazilian banks: A dynamic slacks approach," European Journal of Operational Research, Elsevier, vol. 240(1), pages 258-268.
    5. Bose, Indranil & Pal, Raktim, 2006. "Predicting the survival or failure of click-and-mortar corporations: A knowledge discovery approach," European Journal of Operational Research, Elsevier, vol. 174(2), pages 959-982, October.
    6. Hyeongjun Kim & Hoon Cho & Doojin Ryu, 2020. "Corporate Default Predictions Using Machine Learning: Literature Review," Sustainability, MDPI, vol. 12(16), pages 1-11, August.
    7. Edward I. Altman, 1968. "Financial Ratios, Discriminant Analysis And The Prediction Of Corporate Bankruptcy," Journal of Finance, American Finance Association, vol. 23(4), pages 589-609, September.
    8. Edward I. Altman, 1968. "The Prediction Of Corporate Bankruptcy: A Discriminant Analysis," Journal of Finance, American Finance Association, vol. 23(1), pages 193-194, March.
    9. Ohlson, James A., 1980. "Financial Ratios And The Probabilistic Prediction Of Bankruptcy," Journal of Accounting Research, Wiley Blackwell, vol. 18(1), pages 109-131.
    10. A. Adam Ding & Shaonan Tian & Yan Yu & Hui Guo, 2012. "A Class of Discrete Transformation Survival Models With Application to Default Probability Prediction," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 990-1003, September.
    11. du Jardin, Philippe, 2016. "A two-stage classification technique for bankruptcy prediction," European Journal of Operational Research, Elsevier, vol. 254(1), pages 236-252.
    12. Mossman, Charles E, et al, 1998. "An Empirical Comparison of Bankruptcy Models," The Financial Review, Eastern Finance Association, vol. 33(2), pages 35-53, May.
    13. Mai, Feng & Tian, Shaonan & Lee, Chihoon & Ma, Ling, 2019. "Deep learning models for bankruptcy prediction using textual disclosures," European Journal of Operational Research, Elsevier, vol. 274(2), pages 743-758.
    14. George Adosoglou & Seonho Park & Gianfranco Lombardo & Stefano Cagnoni & Panos M. Pardalos & Guilherme Ferraz de Arruda, 2022. "Lazy Network: A Word Embedding-Based Temporal Financial Network to Avoid Economic Shocks in Asset Pricing Models," Complexity, Hindawi, vol. 2022, pages 1-12, April.
    15. Geng, Ruibin & Bose, Indranil & Chen, Xi, 2015. "Prediction of financial distress: An empirical study of listed Chinese companies using data mining," European Journal of Operational Research, Elsevier, vol. 241(1), pages 236-247.
    16. Beaver, William H., 1966. "Financial Ratios As Predictors Of Failure," Journal of Accounting Research, Wiley Blackwell, vol. 4, pages 71-111.
    17. Ligang Zhou & Kin Lai & Jerome Yen, 2014. "Bankruptcy prediction using SVM models with a new approach to combine features selection and parameter optimisation," International Journal of Systems Science, Taylor & Francis Journals, vol. 45(3), pages 241-253.
    18. Tian, Shaonan & Yu, Yan & Guo, Hui, 2015. "Variable selection and corporate bankruptcy forecasts," Journal of Banking & Finance, Elsevier, vol. 52(C), pages 89-100.

    Citations

    Cited by:

    1. Mattia Pellegrino & Gianfranco Lombardo & George Adosoglou & Stefano Cagnoni & Panos M. Pardalos & Agostino Poggi, 2024. "A Multi-Head LSTM Architecture for Bankruptcy Prediction with Time Series Accounting Data," Future Internet, MDPI, vol. 16(3), pages 1-20, February.
    2. Lorena Espina-Romero & José Gregorio Noroño Sánchez & Humberto Gutiérrez Hurtado & Helga Dworaczek Conde & Yessenia Solier Castro & Luz Emérita Cervera Cajo & Jose Rio Corredoira, 2023. "Which Industrial Sectors Are Affected by Artificial Intelligence? A Bibliometric Analysis of Trends and Perspectives," Sustainability, MDPI, vol. 15(16), pages 1-18, August.
    3. Ana Lorena Jiménez-Preciado & Francisco Venegas-Martínez & Abraham Ramírez-García, 2022. "Stock Portfolio Optimization with Competitive Advantages (MOAT): A Machine Learning Approach," Mathematics, MDPI, vol. 10(23), pages 1-16, November.
    4. Jomark Pablo Noriega & Luis Antonio Rivera & José Alfredo Herrera, 2023. "Machine Learning for Credit Risk Prediction: A Systematic Literature Review," Data, MDPI, vol. 8(11), pages 1-17, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mai, Feng & Tian, Shaonan & Lee, Chihoon & Ma, Ling, 2019. "Deep learning models for bankruptcy prediction using textual disclosures," European Journal of Operational Research, Elsevier, vol. 274(2), pages 743-758.
    2. Yi Cao & Xiaoquan Liu & Jia Zhai & Shan Hua, 2022. "A two‐stage Bayesian network model for corporate bankruptcy prediction," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 27(1), pages 455-472, January.
    3. Hyeongjun Kim & Hoon Cho & Doojin Ryu, 2020. "Corporate Default Predictions Using Machine Learning: Literature Review," Sustainability, MDPI, vol. 12(16), pages 1-11, August.
    4. Hyeongjun Kim & Hoon Cho & Doojin Ryu, 2022. "Corporate Bankruptcy Prediction Using Machine Learning Methodologies with a Focus on Sequential Data," Computational Economics, Springer;Society for Computational Economics, vol. 59(3), pages 1231-1249, March.
    5. Mattia Pellegrino & Gianfranco Lombardo & George Adosoglou & Stefano Cagnoni & Panos M. Pardalos & Agostino Poggi, 2024. "A Multi-Head LSTM Architecture for Bankruptcy Prediction with Time Series Accounting Data," Future Internet, MDPI, vol. 16(3), pages 1-20, February.
    6. Alberto Tron & Maurizio Dallocchio & Salvatore Ferri & Federico Colantoni, 2023. "Corporate governance and financial distress: lessons learned from an unconventional approach," Journal of Management & Governance, Springer;Accademia Italiana di Economia Aziendale (AIDEA), vol. 27(2), pages 425-456, June.
    7. Li, Chunyu & Lou, Chenxin & Luo, Dan & Xing, Kai, 2021. "Chinese corporate distress prediction using LASSO: The role of earnings management," International Review of Financial Analysis, Elsevier, vol. 76(C).
    8. Sigrist, Fabio & Leuenberger, Nicola, 2023. "Machine learning for corporate default risk: Multi-period prediction, frailty correlation, loan portfolios, and tail probabilities," European Journal of Operational Research, Elsevier, vol. 305(3), pages 1390-1406.
    9. Elena Gregova & Katarina Valaskova & Peter Adamko & Milos Tumpach & Jaroslav Jaros, 2020. "Predicting Financial Distress of Slovak Enterprises: Comparison of Selected Traditional and Learning Algorithms Methods," Sustainability, MDPI, vol. 12(10), pages 1-17, May.
    10. Bai, Qing & Tian, Shaonan, 2020. "Innovate or die: Corporate innovation and bankruptcy forecasts," Journal of Empirical Finance, Elsevier, vol. 59(C), pages 88-108.
    11. Dong, Manh Cuong & Tian, Shaonan & Chen, Cathy W.S., 2018. "Predicting failure risk using financial ratios: Quantile hazard model approach," The North American Journal of Economics and Finance, Elsevier, vol. 44(C), pages 204-220.
    12. Ben Jabeur, Sami & Serret, Vanessa, 2023. "Bankruptcy prediction using fuzzy convolutional neural networks," Research in International Business and Finance, Elsevier, vol. 64(C).
    13. Tian, Shaonan & Yu, Yan, 2017. "Financial ratios and bankruptcy predictions: An international evidence," International Review of Economics & Finance, Elsevier, vol. 51(C), pages 510-526.
    14. Zhou, Fanyin & Fu, Lijun & Li, Zhiyong & Xu, Jiawei, 2022. "The recurrence of financial distress: A survival analysis," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1100-1115.
    15. Serrano-Cinca, Carlos & Gutiérrez-Nieto, Begoña & Bernate-Valbuena, Martha, 2019. "The use of accounting anomalies indicators to predict business failure," European Management Journal, Elsevier, vol. 37(3), pages 353-375.
    16. Youssef Zizi & Mohamed Oudgou & Abdeslam El Moudden, 2020. "Determinants and Predictors of SMEs’ Financial Failure: A Logistic Regression Approach," Risks, MDPI, vol. 8(4), pages 1-21, October.
    17. Mohammad Mahdi Mousavi & Jamal Ouenniche & Kaoru Tone, 2023. "A dynamic performance evaluation of distress prediction models," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(4), pages 756-784, July.
    18. Zhang, Xuan & Ouyang, Ruolan & Liu, Ding & Xu, Liao, 2020. "Determinants of corporate default risk in China: The role of financial constraints," Economic Modelling, Elsevier, vol. 92(C), pages 87-98.
    19. Lenka Papíková & Mário Papík, 2022. "Effects of classification, feature selection, and resampling methods on bankruptcy prediction of small and medium‐sized enterprises," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 29(4), pages 254-281, October.
    20. Sumaira Ashraf & Elisabete G. S. Félix & Zélia Serrasqueiro, 2019. "Do Traditional Financial Distress Prediction Models Predict the Early Warning Signs of Financial Distress?," Journal of Risk and Financial Management, MDPI, vol. 12(2), pages 1-17, April.
