Printed from https://ideas.repec.org/a/eee/soceps/v90y2023ics0038012123002586.html

Machine learning and credit risk: Empirical evidence from small- and mid-sized businesses

Author

Listed:
  • Bitetto, Alessandro
  • Cerchiello, Paola
  • Filomeni, Stefano
  • Tanda, Alessandra
  • Tarantino, Barbara

Abstract

In this paper, we compare two approaches to estimating the credit risk of small- and mid-sized businesses (SMBs): a classic parametric approach, which fits an ordered probit model, and a non-parametric approach, which calibrates a machine learning historical random forest (HRF) model. The models are applied to a unique and proprietary dataset comprising granular firm-level quarterly data collected from a European investment bank and an international insurance company on a sample of 464 Italian SMBs over the period 2015–2017. Results show that the HRF approach outperforms the traditional ordered probit model, highlighting how advanced estimation methodologies based on machine learning can be successfully applied to predict SMB credit risk, i.e. in settings with high information asymmetry. Moreover, by using Shapley values, we assess the relevance of each variable in predicting SMB credit risk.
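
The abstract's last step, attributing a prediction to individual variables with Shapley values, can be illustrated with a minimal sketch. It computes exact Shapley values by enumerating every feature ordering, which is feasible only for a handful of features. The scoring function, the feature names (`leverage`, `liquidity`, `late_invoices`) and the numbers are hypothetical and are not taken from the paper, and this brute-force method is not necessarily the one the authors used.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature that is "absent" from a coalition keeps its baseline value.
    Every ordering of the features is enumerated, so the cost grows as n!.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_score = predict(current)
        for i in order:
            current[i] = x[i]  # switch feature i from baseline to its actual value
            score = predict(current)
            phi[i] += score - previous_score  # marginal contribution of feature i
            previous_score = score
    return [total / factorial(n) for total in phi]


def toy_risk_score(features):
    """Hypothetical risk score (assumption, not the paper's model)."""
    leverage, liquidity, late_invoices = features
    return (0.5 * leverage - 0.3 * liquidity + 0.2 * late_invoices
            + 0.1 * leverage * late_invoices)


firm = [2.0, 1.0, 3.0]       # hypothetical firm to explain
reference = [1.0, 1.0, 0.0]  # hypothetical reference firm

phi = shapley_values(toy_risk_score, firm, reference)
for name, value in zip(["leverage", "liquidity", "late_invoices"], phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the firm's score and the reference score, and a feature whose value equals the reference (here `liquidity`) receives zero. These two properties are what make Shapley values usable for ranking variable relevance.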

Suggested Citation

  • Bitetto, Alessandro & Cerchiello, Paola & Filomeni, Stefano & Tanda, Alessandra & Tarantino, Barbara, 2023. "Machine learning and credit risk: Empirical evidence from small- and mid-sized businesses," Socio-Economic Planning Sciences, Elsevier, vol. 90(C).
  • Handle: RePEc:eee:soceps:v:90:y:2023:i:c:s0038012123002586
    DOI: 10.1016/j.seps.2023.101746

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0038012123002586
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alessandro Bitetto & Paola Cerchiello & Stefano Filomeni & Alessandra Tanda & Barbara Tarantino, 2021. "Machine Learning and Credit Risk: Empirical Evidence from SMEs," DEM Working Papers Series 201, University of Pavia, Department of Economics and Management.
    2. Alessandro Bitetto & Paola Cerchiello & Stefano Filomeni & Alessandra Tanda & Barbara Tarantino, 2024. "Can we trust machine learning to predict the credit risk of small businesses?," Review of Quantitative Finance and Accounting, Springer, vol. 63(3), pages 925-954, October.
    3. Mizen, Paul & Tsoukas, Serafeim, 2012. "Forecasting US bond default ratings allowing for previous and initial state dependence in an ordered probit model," International Journal of Forecasting, Elsevier, vol. 28(1), pages 273-287.
    4. Modina, Michele & Pietrovito, Filomena & Gallucci, Carmen & Formisano, Vincenzo, 2023. "Predicting SMEs’ default risk: Evidence from bank-firm relationship data," The Quarterly Review of Economics and Finance, Elsevier, vol. 89(C), pages 254-268.
    5. Stefano Filomeni & Udichibarna Bose & Anastasios Megaritis & Athanasios Triantafyllou, 2024. "Can market information outperform hard and soft information in predicting corporate defaults?," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 29(3), pages 3567-3592, July.
    6. Traferri, Alejandra, 2009. "Correcting the bias in the estimation of a dynamic ordered probit with fixed effects of self-assessed health status," UC3M Working papers. Economics we094021, Universidad Carlos III de Madrid. Departamento de Economía.
    7. Lionel WILNER, 2019. "The Dynamics of Individual Happiness," Working Papers 2019-18, Center for Research in Economics and Statistics.
    8. Mussida, Chiara & Sciulli, Dario, 2023. "The evolution of income distribution and disability in Europe," Structural Change and Economic Dynamics, Elsevier, vol. 66(C), pages 29-38.
    9. Patricia Cubí‐Mollá & Mireia Jofre‐Bonet & Victoria Serra‐Sastre, 2017. "Adaptation to health states: Sick yet better off?," Health Economics, John Wiley & Sons, Ltd., vol. 26(12), pages 1826-1843, December.
    10. Hernández-Quevedo, Cristina & Jones, Andrew M. & Rice, Nigel, 2008. "Persistence in health limitations: A European comparative analysis," Journal of Health Economics, Elsevier, vol. 27(6), pages 1472-1488, December.
    11. Prieto Suarez, Joaquin, 2021. "Poverty traps and affluence shields: modelling the persistence of income position in Chile," LSE Research Online Documents on Economics 110719, London School of Economics and Political Science, LSE Library.
    12. Alexander Mosthaf & Thorsten Schank & Claus Schnabel, 2014. "Low-wage employment versus unemployment: Which one provides better prospects for women?," IZA Journal of European Labor Studies, Springer;Forschungsinstitut zur Zukunft der Arbeit GmbH (IZA), vol. 3(1), pages 1-17, December.
    13. Soyoon Weon & David W. Rothwell, 2020. "Dynamics of Asset Poverty in South Korea," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 150(2), pages 639-657, July.
    14. Franco Peracchi & Claudio Rossetti, 2022. "A nonlinear dynamic factor model of health and medical treatment," Health Economics, John Wiley & Sons, Ltd., vol. 31(6), pages 1046-1066, June.
    15. Davillas, Apostolos & de Oliveira, Victor Hugo & Jones, Andrew M., 2023. "Is inconsistent reporting of self-assessed health persistent and systematic? Evidence from the UKHLS," Economics & Human Biology, Elsevier, vol. 49(C).
    16. Florian Heiss, 2006. "Nonlinear State-Space Models for Microeconometric Panel Data," Computing in Economics and Finance 2006 285, Society for Computational Economics.
    17. Beine, Michel & Lodigiani, Elisabetta & Vermeulen, Robert, 2012. "Remittances and financial openness," Regional Science and Urban Economics, Elsevier, vol. 42(5), pages 844-857.
    18. Nabanita Datta Gupta & Nicolai Kristensen, 2008. "Work environment satisfaction and employee health: panel evidence from Denmark, France and Spain, 1994–2001," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 9(1), pages 51-61, February.
    19. Emmanouil Mentzakis & Paul McNamee & Mandy Ryan, 2009. "Who cares and how much: exploring the determinants of co-residential informal care," Review of Economics of the Household, Springer, vol. 7(3), pages 283-303, September.
    20. Georgios Marios Chrysanthou, 2021. "A Multiple Cohort Study of the Gender Gradient of Life Satisfaction during Adolescence: Longitudinal Evidence from Great Britain," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 83(6), pages 1341-1376, December.

    More about this item

    Keywords

    Credit rating; SMB; Historical random forest; Machine learning; Relationship banking; Invoice lending;

    JEL classification:

    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • D82 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Asymmetric and Private Information; Mechanism Design
    • D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • G22 - Financial Economics - - Financial Institutions and Services - - - Insurance; Insurance Companies; Actuarial Studies
