Printed from https://ideas.repec.org/a/spr/infsem/v19y2021i1d10.1007_s10257-021-00511-w.html

A decision tree classifier for credit assessment problems in big data environments

Author

Listed:
  • Ching-Chin Chern

    (National Taiwan University)

  • Weng-U Lei

    (National Taiwan University)

  • Kwei-Long Huang

    (National Taiwan University)

  • Shu-Yi Chen

    (Ming Chuan University)

Abstract

Financial institutions have long sought to reduce the risk of consumer loans by improving their credit assessment methods. As new information and network technologies enable massive data collection from many different sources, credit assessment has become a challenge in the big data environment. Complicated processing is required to deal with vast, messy data sources and ever-changing loan regulations. This study proposes a decision tree credit assessment approach (DTCAA) to solve the credit assessment problem in a big data environment. Decision tree models offer good interpretability and easily understood rules, with competitive performance. In addition, DTCAA features various data consolidation methods to eliminate some of the noise in raw data and facilitate the construction of the decision tree. Using a large-volume data set from one of the biggest car collateral loan companies in Taiwan, this study verifies the efficiency and validity of DTCAA. The results indicate that DTCAA is competitive in various situations and across multiple factors, supporting its applicability to credit assessment practice.
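
The abstract's point that decision trees yield easily understood rules can be illustrated with a minimal sketch of a single tree split. This is not the DTCAA method, whose details are not given here; it is a generic Gini-impurity split on hypothetical toy data, and the feature name `loan_to_value`, the values, and the helper functions are all assumptions made for illustration.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini) of the best 'value <= threshold' split."""
    best = (float("nan"), float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


def default_rate(labels: List[int]) -> float:
    """Share of defaults (label 1) in a group."""
    return sum(labels) / len(labels)


# Hypothetical toy data, not from the study: loan-to-value ratio per loan
# and a default flag (1 = defaulted, 0 = repaid).
loan_to_value = [0.40, 0.55, 0.60, 0.70, 0.85, 0.90, 0.95]
defaulted = [0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(loan_to_value, defaulted)
left = [y for x, y in zip(loan_to_value, defaulted) if x <= threshold]
right = [y for x, y in zip(loan_to_value, defaulted) if x > threshold]

# The split reads directly as a rule a loan officer could check by hand.
print(f"IF loan_to_value <= {threshold:.3f}: default rate {default_rate(left):.0%}")
print(f"ELSE: default rate {default_rate(right):.0%}")
```

A full tree repeats this search recursively on each side of the split, so every leaf corresponds to a chain of such conditions.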

Suggested Citation

  • Ching-Chin Chern & Weng-U Lei & Kwei-Long Huang & Shu-Yi Chen, 2021. "A decision tree classifier for credit assessment problems in big data environments," Information Systems and e-Business Management, Springer, vol. 19(1), pages 363-386, March.
  • Handle: RePEc:spr:infsem:v:19:y:2021:i:1:d:10.1007_s10257-021-00511-w
    DOI: 10.1007/s10257-021-00511-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10257-021-00511-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

