
A decision tree classifier for credit assessment problems in big data environments

Author

Listed:
  • Ching-Chin Chern

    (National Taiwan University)

  • Weng-U Lei

    (National Taiwan University)

  • Kwei-Long Huang

    (National Taiwan University)

  • Shu-Yi Chen

    (Ming Chuan University)

Abstract

Financial institutions have long sought to reduce the risk of consumer loans by improving their credit assessment methods. As new information and network technologies enable massive data collection from many different sources, credit assessment has become a challenge in the big data environment. Complicated processing is required to deal with vast, messy data sources and ever-changing loan regulations. This study proposes a decision tree credit assessment approach (DTCAA) to solve the credit assessment problem in a big data environment. Decision tree models offer good interpretability and easily understood rules, with competitive performance. In addition, DTCAA features various data consolidation methods to eliminate some of the noise in raw data and facilitate the construction of the decision tree. Using a large-volume data set from one of the biggest car collateral loan companies in Taiwan, this study verifies the efficiency and validity of DTCAA. The results indicate that DTCAA is competitive in various situations and across multiple factors, which supports its applicability to credit assessment practice.
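To illustrate what "easily understood rules" can look like, here is a minimal sketch of a generic Gini-based decision tree in plain Python. It is not the authors' DTCAA, whose consolidation steps and tree construction are not described in the abstract. The feature names (`ltv_pct`, `months_employed`), the eight applicant rows, and the depth limit are all made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: [loan-to-value in percent, months employed].
ROWS = [[50, 60], [60, 48], [55, 6], [70, 5], [90, 40], [85, 4], [95, 8], [92, 3]]
LABELS = ["good", "good", "good", "good", "good", "default", "default", "default"]
NAMES = ["ltv_pct", "months_employed"]  # assumed feature names


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature, threshold, weighted Gini) of the best binary split, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=2):
    """Grow a tree; a leaf is a class label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has two distinct values
        return majority
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def rules(tree, names, conds=()):
    """Flatten the tree into one readable IF/THEN rule per leaf."""
    if not isinstance(tree, tuple):
        return [("IF " + " AND ".join(conds) if conds else "ALWAYS") + f" THEN {tree}"]
    f, t, left, right = tree
    return (rules(left, names, conds + (f"{names[f]} <= {t}",)) +
            rules(right, names, conds + (f"{names[f]} > {t}",)))


tree = build_tree(ROWS, LABELS)
for rule in rules(tree, NAMES):
    print(rule)
```

On this toy data the tree reduces to three rules, each of which a loan officer could read directly. A real credit data set would also need the kind of noise reduction the abstract attributes to data consolidation, which this sketch omits.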

Suggested Citation

  • Ching-Chin Chern & Weng-U Lei & Kwei-Long Huang & Shu-Yi Chen, 2021. "A decision tree classifier for credit assessment problems in big data environments," Information Systems and e-Business Management, Springer, vol. 19(1), pages 363-386, March.
  • Handle: RePEc:spr:infsem:v:19:y:2021:i:1:d:10.1007_s10257-021-00511-w
    DOI: 10.1007/s10257-021-00511-w



