Author
Listed:
- Vlad TEODORESCU
(Bucharest University of Economic Studies, Bucharest, Romania)
- Catalina-Ioana TOADER
(Bucharest University of Economic Studies, Bucharest, Romania)
Abstract
This article studies the optimisation and relative performance of three classes of machine learning models (logistic regression with regularisation, Random Forest, and XGBoost) for estimating the probability of bankruptcy, using financial data on listed companies in Taiwan. The dataset covers the period from 1999 to 2009, contains 95 financial ratios from 7 categories, has 6,819 observations, and has a bankruptcy rate of approximately 3.2%. We chose it because it is publicly available, of high quality, and of moderate size, which permits rapid training of machine learning models. As a result, we were able to run experiments on multiple model configurations and to compare our results with those obtained by other researchers. K-fold cross-validation can be used to split the data into training and test sets. We investigate the validity of its use, especially in the context of XGBoost with early stopping based on the test fold. We also determine the sensitivity of predictive performance to the value of k and to the specific folds created. We use AUROC as the performance measure and show that Random Forest models significantly outperform logistic models with regularisation, while XGBoost models perform moderately better than Random Forest. For each type of model, we study hyperparameter tuning and demonstrate that this process has a significant effect on predictive performance. For the first two types of model, we perform a full grid search. For XGBoost models, we use a guided (sequential) grid search methodology. Furthermore, we study and propose a criterion for hyperparameter tuning based on average performance instead of maximum performance, highlighting the relatively large effect on predictive performance of the stochastic component employed by these machine learning algorithms during training.
Our research also indicates that, for some hyperparameters, tuning can shape predictive performance. Finally, we assess the importance of the variables in forecasting the likelihood of bankruptcy, as indicated by the three classes of models.
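The evaluation ideas named in the abstract (AUROC, k-fold splitting, and tuning on average rather than maximum performance) can be illustrated with a minimal standard-library Python sketch. It is not the authors' code: the function names `auroc`, `stratified_folds` and `select_config` are hypothetical, stratifying the folds by class is an assumption made here because of the low bankruptcy rate, and the AUROC values in the usage example are invented for illustration, not results from the paper.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed):
    """Split indices into k folds with a similar class ratio in each fold.

    Assumption: stratification is our choice for a rare positive class;
    the abstract only says that k-fold cross-validation is examined.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin
    return folds


def select_config(results, criterion=mean):
    """Pick the configuration whose repeated-run scores maximise `criterion`.

    `results` maps a configuration name to AUROC values obtained from
    several training runs that differ only in the random seed.
    """
    return max(results, key=lambda name: criterion(results[name]))


# Invented scores: config_b has one lucky run but is worse on average.
runs = {
    "config_a": [0.93, 0.94, 0.93, 0.94],
    "config_b": [0.90, 0.96, 0.89, 0.91],
}
best_by_mean = select_config(runs, mean)
best_by_max = select_config(runs, max)
```

With these invented numbers, selecting on the mean picks `config_a`, while selecting on the maximum picks `config_b`. The example shows how a single favourable seed can dominate a maximum-based criterion.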
Suggested Citation
Vlad TEODORESCU & Catalina-Ioana TOADER, 2024.
"Using Machine Learning to Model Bankruptcy Risk in Listed Companies,"
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ECONOMICS AND SOCIAL SCIENCES, Bucharest University of Economic Studies, Romania, vol. 6(1), pages 610-619, August.
Handle:
RePEc:rom:conase:v:6:y:2024:i:1:p:610-619
More about this item
Keywords
bankruptcy risk;
probability of bankruptcy;
machine learning;
xgboost;
random forest
JEL classification:
- C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
- C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
- D81 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Criteria for Decision-Making under Risk and Uncertainty
- G2 - Financial Economics - - Financial Institutions and Services
- G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill