Predicting financial distress in high-dimensional imbalanced datasets: a multi-heterogeneous self-paced ensemble learning framework

Authors

  • Ruize Gao (Tsinghua University; Beijing Institute of Mathematical Sciences and Applications)
  • Shaoze Cui (Beijing Institute of Technology)
  • Yu Wang (Chongqing University)
  • Wei Xu (Jiangnan University)

Abstract

Financial distress prediction (FDP) is a critical area of study for researchers, industry stakeholders, and regulatory authorities. However, FDP tasks present several challenges, including high-dimensional datasets, class imbalance, and the complexity of parameter optimization. These issues often hinder a predictive model's ability to accurately identify companies at high risk of financial distress. To mitigate these challenges, we introduce FinMHSPE, a novel multi-heterogeneous self-paced ensemble (MHSPE) learning framework for FDP. The proposed model uses pairwise comparisons of data from multiple time frames, combined with the maximum relevance and minimum redundancy method, to select an optimal subset of features, which addresses the high-dimensionality issue. The framework also incorporates the MHSPE model to iteratively identify the most informative majority-class samples, which addresses the class imbalance issue. To optimize the model's parameters, we use the particle swarm optimization algorithm. The robustness of the proposed model is validated through extensive experiments on a financial dataset of Chinese listed companies. The empirical results show that FinMHSPE achieves an area under the curve (AUC) value of 0.9574, considerably surpassing the competing methods, including state-of-the-art approaches that rely on financial features as inputs. Our investigation also identifies several valuable features for enhancing FDP model performance, notably those associated with a company's information and growth potential.
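
The maximum relevance and minimum redundancy idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes absolute Pearson correlation as a stand-in for the mutual-information scores usually used, and the feature names (`debt_ratio`, `debt_ratio_lagged`, `revenue_growth`) and values are hypothetical toy data, not taken from the paper's dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_select(columns, target, k):
    """Greedy selection: relevance to the target minus mean redundancy
    with the features already selected.

    Assumption: |Pearson correlation| replaces mutual information here.
    columns maps a feature name to its list of values.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = list(columns)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(columns[name], columns[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 marks a distressed firm, 0 a healthy one.
target = [0, 0, 0, 1, 1, 1]
features = {
    "debt_ratio": [1, 2, 3, 7, 8, 9],
    "debt_ratio_lagged": [1, 2, 4, 7, 8, 9],  # nearly duplicates debt_ratio
    "revenue_growth": [5, 1, 3, 6, 4, 8],
}

print(mrmr_select(features, target, 2))  # ['debt_ratio', 'revenue_growth']
```

Ranking by relevance alone would pick both debt-ratio columns. The redundancy penalty is what makes the greedy step prefer the less correlated `revenue_growth` feature as its second choice.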

Suggested Citation

  • Ruize Gao & Shaoze Cui & Yu Wang & Wei Xu, 2025. "Predicting financial distress in high-dimensional imbalanced datasets: a multi-heterogeneous self-paced ensemble learning framework," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 11(1), pages 1-34, December.
  • Handle: RePEc:spr:fininn:v:11:y:2025:i:1:d:10.1186_s40854-024-00745-w
    DOI: 10.1186/s40854-024-00745-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1186/s40854-024-00745-w
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xavier Brédart & Eric Séverin & David Veganzones, 2021. "Human resources and corporate failure prediction modeling: Evidence from Belgium," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(7), pages 1325-1341, November.
    2. Michal Pavlicko & Marek Durica & Jaroslav Mazanec, 2021. "Ensemble Model of the Financial Distress Prediction in Visegrad Group Countries," Mathematics, MDPI, vol. 9(16), pages 1-26, August.
    3. Yinghua Song & Minzhe Jiang & Shixuan Li & Shengzhe Zhao, 2024. "Class‐imbalanced financial distress prediction with machine learning: Incorporating financial, management, textual, and social responsibility features into index system," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(3), pages 593-614, April.
    4. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, December.
    5. du Jardin, Philippe, 2012. "The influence of variable selection methods on the accuracy of bankruptcy prediction models," MPRA Paper 44383, University Library of Munich, Germany.
    6. Yue Qiu & Jiabei He & Zhensong Chen & Yinhong Yao & Yi Qu, 2024. "A novel semisupervised learning method with textual information for financial distress prediction," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(7), pages 2478-2494, November.
    7. Hyeongjun Kim & Hoon Cho & Doojin Ryu, 2022. "Corporate Bankruptcy Prediction Using Machine Learning Methodologies with a Focus on Sequential Data," Computational Economics, Springer;Society for Computational Economics, vol. 59(3), pages 1231-1249, March.
    8. du Jardin, Philippe, 2008. "Bankruptcy prediction and neural networks: The contribution of variable selection methods," MPRA Paper 44384, University Library of Munich, Germany.
    9. Şaban Çelik, 2013. "Micro Credit Risk Metrics: A Comprehensive Review," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 20(4), pages 233-272, October.
    10. du Jardin, Philippe & Séverin, Eric, 2012. "Forecasting financial failure using a Kohonen map: A comparative study to improve model stability over time," European Journal of Operational Research, Elsevier, vol. 221(2), pages 378-396.
    11. Oz, Ibrahim Onur & Simga-Mugan, Can, 2018. "Bankruptcy prediction models' generalizability: Evidence from emerging market economies," Advances in accounting, Elsevier, vol. 41(C), pages 114-125.
    12. Umair Bin Yousaf & Khalil Jebran & Irfan Ullah, 2024. "Corporate governance and financial distress: A review of the theoretical and empirical literature," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 29(2), pages 1627-1679, April.
    13. Serrano-Cinca, Carlos & Gutiérrez-Nieto, Begoña & Bernate-Valbuena, Martha, 2019. "The use of accounting anomalies indicators to predict business failure," European Management Journal, Elsevier, vol. 37(3), pages 353-375.
    14. Maurice Peat & Stewart Jones, 2012. "Using Neural Nets To Combine Information Sets In Corporate Bankruptcy Prediction," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 19(2), pages 90-101, April.
    15. Hyeongjun Kim & Hoon Cho & Doojin Ryu, 2020. "Corporate Default Predictions Using Machine Learning: Literature Review," Sustainability, MDPI, vol. 12(16), pages 1-11, August.
    16. Antonio Blanco-Oliver & Ana Irimia-Dieguez & María Oliver-Alfonso & Nicholas Wilson, 2015. "Systemic Sovereign Risk and Asset Prices: Evidence from the CDS Market, Stressed European Economies and Nonlinear Causality Tests," Czech Journal of Economics and Finance (Finance a uver), Charles University Prague, Faculty of Social Sciences, vol. 65(2), pages 144-166, April.
    17. Frank Ranganai Matenda & Mabutho Sibanda & Eriyoti Chikodza & Victor Gumbo, 2022. "Bankruptcy prediction for private firms in developing economies: a scoping review and guidance for future research," Management Review Quarterly, Springer, vol. 72(4), pages 927-966, December.
    18. Philippe Jardin & David Veganzones & Eric Séverin, 2019. "Forecasting Corporate Bankruptcy Using Accrual-Based Models," Computational Economics, Springer;Society for Computational Economics, vol. 54(1), pages 7-43, June.
    19. Kim, Soo Y. & Upneja, Arun, 2014. "Predicting restaurant financial distress using decision tree and AdaBoosted decision tree models," Economic Modelling, Elsevier, vol. 36(C), pages 354-362.
    20. Zhao, Qi & Xu, Weijun & Ji, Yucheng, 2023. "Predicting financial distress of Chinese listed companies using machine learning: To what extent does textual disclosure matter?," International Review of Financial Analysis, Elsevier, vol. 89(C).
