Printed from https://ideas.repec.org/p/ehl/lserod/100211.html

An innovative feature selection method for support vector machines and its test on the estimation of the credit risk of default

Author

Listed:
  • Sariev, Eduard
  • Germano, Guido

Abstract

Support vector machines (SVMs) have been used extensively for classification problems in many areas, such as gene, text and image recognition. However, SVMs have rarely been used to estimate the probability of default (PD) in credit risk. In this paper, we advocate the application of SVMs, rather than the popular logistic regression (LR) method, for the estimation of both corporate and retail PD. Our results indicate that most of the time SVMs outperform LR in terms of classification accuracy for the corporate and retail segments. We propose a new wrapper feature selection method based on maximizing the distance of the support vectors from the separating hyperplane and apply it to identify the main PD drivers. We used three datasets to test the PD estimation, containing (1) retail obligors from Germany, (2) corporate obligors from Eastern Europe, and (3) corporate obligors from Poland. Total assets, total liabilities, and sales are identified as frequent default drivers for the corporate datasets, whereas current account status and duration of the current account are frequent default drivers for the retail dataset.
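
The kind of comparison the abstract describes can be illustrated with a minimal sketch, assuming scikit-learn is available. It does not reproduce the authors' method or data: the dataset is synthetic, the kernel and hyperparameters are arbitrary choices, and the wrapper step uses scikit-learn's generic forward selection scored by cross-validated accuracy, which stands in for (and is not) the support-vector-distance criterion proposed in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an obligor dataset (hypothetical; not the paper's data).
# y = 1 marks a default; roughly 20% of the samples are defaults.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, n_redundant=0,
    weights=[0.8, 0.2], random_state=0,
)

# Both models standardize the features first; settings are illustrative only.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
}

# 5-fold cross-validated classification accuracy for each model.
accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Generic forward wrapper selection around the SVM: at each step, add the
# feature that most improves cross-validated accuracy (assumed criterion).
selector = SequentialFeatureSelector(
    models["SVM"], n_features_to_select=3, direction="forward",
    scoring="accuracy", cv=5,
)
selector.fit(X, y)
selected = selector.get_support(indices=True).tolist()

print("cross-validated accuracy:", accuracy)
print("selected feature indices:", selected)
```

On real obligor data, the selected indices would map back to named drivers such as total assets or current account status. Which model scores higher depends on the data and settings, so the sketch makes no claim about the outcome.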

Suggested Citation

  • Sariev, Eduard & Germano, Guido, 2018. "An innovative feature selection method for support vector machines and its test on the estimation of the credit risk of default," LSE Research Online Documents on Economics 100211, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:100211

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/100211/
    File Function: Open access version.
    Download Restriction: no

    References listed on IDEAS

    1. Kobi Abayomi & Andrew Gelman & Marc Levy, 2008. "Diagnostics for multivariate imputations," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 57(3), pages 273-291, June.
    2. Gavalas, Dimitris, 2015. "How do banks perform under Basel III? Tracing lending rates and loan quantity," Journal of Economics and Business, Elsevier, vol. 81(C), pages 21-37.
    3. Shiyi Chen & W. K. Hardle & R. A. Moro, 2011. "Modeling default risk with support vector machines," Quantitative Finance, Taylor & Francis Journals, vol. 11(1), pages 135-154.
    4. Tian, Shaonan & Yu, Yan & Guo, Hui, 2015. "Variable selection and corporate bankruptcy forecasts," Journal of Banking & Finance, Elsevier, vol. 52(C), pages 89-100.

    Citations

    Cited by:

    1. Eduard Sariev & Guido Germano, 2020. "Bayesian regularized artificial neural networks for the estimation of the probability of default," Quantitative Finance, Taylor & Francis Journals, vol. 20(2), pages 311-328, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eduard Sariev & Guido Germano, 2020. "Bayesian regularized artificial neural networks for the estimation of the probability of default," Quantitative Finance, Taylor & Francis Journals, vol. 20(2), pages 311-328, February.
    2. Li, Chunyu & Lou, Chenxin & Luo, Dan & Xing, Kai, 2021. "Chinese corporate distress prediction using LASSO: The role of earnings management," International Review of Financial Analysis, Elsevier, vol. 76(C).
    3. Giovanni Bonaccolto & Massimiliano Caporin & Sandra Paterlini, 2018. "Asset allocation strategies based on penalized quantile regression," Computational Management Science, Springer, vol. 15(1), pages 1-32, January.
    4. Modina, Michele & Pietrovito, Filomena & Gallucci, Carmen & Formisano, Vincenzo, 2023. "Predicting SMEs’ default risk: Evidence from bank-firm relationship data," The Quarterly Review of Economics and Finance, Elsevier, vol. 89(C), pages 254-268.
    5. Davidescu Adriana AnaMaria & Agafiței Marina-Diana & Strat Vasile Alecsandru & Dima Alina Mihaela, 2024. "Mapping the Landscape: A Bibliometric Analysis of Rating Agencies in the Era of Artificial Intelligence and Machine Learning," Proceedings of the International Conference on Business Excellence, Sciendo, vol. 18(1), pages 67-85.
    6. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    7. Robert Stewart & Murshed Chowdhury & Vaalmikki Arjoon, 2021. "Bank stability and economic growth: trade-offs or opportunities?," Empirical Economics, Springer, vol. 61(2), pages 827-853, August.
    8. repec:hum:wpaper:sfb649dp2013-028 is not listed on IDEAS
    9. Serrano-Cinca, Carlos & Gutiérrez-Nieto, Begoña & Bernate-Valbuena, Martha, 2019. "The use of accounting anomalies indicators to predict business failure," European Management Journal, Elsevier, vol. 37(3), pages 353-375.
    10. Gerko Vink & Laurence E. Frank & Jeroen Pannekoek & Stef Buuren, 2014. "Predictive mean matching imputation of semicontinuous variables," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 68(1), pages 61-90, February.
    11. Youssef Zizi & Mohamed Oudgou & Abdeslam El Moudden, 2020. "Determinants and Predictors of SMEs’ Financial Failure: A Logistic Regression Approach," Risks, MDPI, vol. 8(4), pages 1-21, October.
    12. Wilms, Ines & Rombouts, Jeroen & Croux, Christophe, 2021. "Multivariate volatility forecasts for stock market indices," International Journal of Forecasting, Elsevier, vol. 37(2), pages 484-499.
    13. Azizpour, S & Giesecke, K. & Schwenkler, G., 2018. "Exploring the sources of default clustering," Journal of Financial Economics, Elsevier, vol. 129(1), pages 154-183.
    14. Ángel Beade & Manuel Rodríguez & José Santos, 2024. "Multiperiod Bankruptcy Prediction Models with Interpretable Single Models," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1357-1390, September.
    15. Martin, Eisele & Zhu, Junyi, 2013. "Multiple imputation in a complex household survey - the German Panel on Household Finances (PHF): challenges and solutions," MPRA Paper 57666, University Library of Munich, Germany.
    16. Härdle, Wolfgang Karl & Huang, Li-shan, 2013. "Analysis of deviance in generalized partial linear models," SFB 649 Discussion Papers 2013-028, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    17. repec:hum:wpaper:sfb649dp2013-037 is not listed on IDEAS
    18. O’Sullivan, Conall & Papavassiliou, Vassilios G. & Wafula, Ronald Wekesa & Boubaker, Sabri, 2024. "New insights into liquidity resiliency," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 90(C).
    19. Stewart, Robert & Chowdhury, Murshed & Arjoon, Vaalmikki, 2021. "Interdependencies between regulatory capital, credit extension and economic growth," Journal of Economics and Business, Elsevier, vol. 117(C).
    20. Sigrist, Fabio & Leuenberger, Nicola, 2023. "Machine learning for corporate default risk: Multi-period prediction, frailty correlation, loan portfolios, and tail probabilities," European Journal of Operational Research, Elsevier, vol. 305(3), pages 1390-1406.
    21. Shiyi Chen & Wolfgang K. Härdle & Kiho Jeong, 2010. "Forecasting volatility with support vector machine-based GARCH model," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 29(4), pages 406-433.
    22. Edward I. Altman & Marco Balzano & Alessandro Giannozzi & Stjepan Srhoj, 2023. "Revisiting SME default predictors: The Omega Score," Journal of Small Business Management, Taylor & Francis Journals, vol. 61(6), pages 2383-2417, November.

    More about this item

    Keywords

    default risk; logistic regression; support vector machines;

    JEL classification:

    • C10 - Mathematical and Quantitative Methods - Econometric and Statistical Methods and Methodology: General - General
    • C13 - Mathematical and Quantitative Methods - Econometric and Statistical Methods and Methodology: General - Estimation: General
