Printed from https://ideas.repec.org/a/eee/joreco/v27y2015icp11-23.html

A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring

Author

Listed:
  • Koutanaei, Fatemeh Nemati
  • Sajedi, Hedieh
  • Khanbabaei, Mohammad

Abstract

Data mining techniques have numerous applications in the credit scoring of bank customers. One of the most popular data mining techniques is classification. Previous research has demonstrated that feature selection (FS) algorithms and ensemble classifiers can improve banks' performance in credit scoring problems. In this domain, the main issue is the simultaneous, hybrid use of several FS and ensemble learning classification algorithms, with attention to their parameter settings, in order to achieve higher performance. The present paper therefore develops a hybrid data mining model of feature selection and ensemble learning classification algorithms in three stages. The first stage deals with data gathering and pre-processing. In the second stage, four FS algorithms are employed: principal component analysis (PCA), genetic algorithm (GA), information gain ratio, and the relief attribute evaluation function. Here, the parameter settings of the FS methods are chosen according to the classification accuracy obtained with the support vector machine (SVM) classification algorithm. Once the appropriate model has been chosen for each set of selected features, these features are applied to the base and ensemble classification algorithms. In this stage, the best FS algorithm, together with its parameter settings, is identified for the modeling stage of the proposed model. In the third stage, the classification algorithms are applied to the dataset prepared by each FS algorithm. The results show that, in the second stage, PCA is the best FS algorithm. In the third stage, the classification results show that the artificial neural network (ANN) with adaptive boosting (AdaBoost) has higher classification accuracy. Ultimately, the paper verifies and proposes the hybrid model as an effective and robust model for credit scoring.
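As an illustration of one of the four FS methods named in the abstract, the following is a minimal sketch of ranking features by information gain ratio. It is not the authors' implementation: the abstract does not say which tool or settings they used. The sketch assumes every feature is discrete, and the applicant rows, feature names, and function names (`entropy`, `gain_ratio`, `rank_features`) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no information; avoid dividing by zero.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(rows, labels, names):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = [(name, gain_ratio([row[i] for row in rows], labels))
              for i, name in enumerate(names)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy applicants: (has_job, owns_home, region); labels are made up.
rows = [
    ("yes", "yes", "north"),
    ("yes", "no", "south"),
    ("no", "yes", "north"),
    ("no", "no", "south"),
    ("yes", "yes", "south"),
    ("no", "no", "north"),
]
labels = ["good", "good", "bad", "bad", "good", "bad"]
ranking = rank_features(rows, labels, ["has_job", "owns_home", "region"])
```

In this toy data `has_job` separates the two classes perfectly, so it ranks first. In a pipeline like the one the abstract describes, the number of top-ranked features to keep would be the parameter tuned against classifier accuracy; that tuning step is not shown here.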

Suggested Citation

  • Koutanaei, Fatemeh Nemati & Sajedi, Hedieh & Khanbabaei, Mohammad, 2015. "A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring," Journal of Retailing and Consumer Services, Elsevier, vol. 27(C), pages 11-23.
  • Handle: RePEc:eee:joreco:v:27:y:2015:i:c:p:11-23
    DOI: 10.1016/j.jretconser.2015.07.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0969698915300060
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jretconser.2015.07.003?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Thomas, Lyn C., 2000. "A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers," International Journal of Forecasting, Elsevier, vol. 16(2), pages 149-172.
    2. Hu, Yu-Chiang & Ansell, Jake, 2007. "Measuring retail company performance using credit scoring techniques," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1595-1606, December.
    3. Yu, Lean & Wang, Shouyang & Lai, Kin Keung, 2009. "An intelligent-agent-based fuzzy group decision making model for financial multicriteria decision support: The case of credit scoring," European Journal of Operational Research, Elsevier, vol. 195(3), pages 942-959, June.
    4. Paleologo, Giuseppe & Elisseeff, André & Antonini, Gianluca, 2010. "Subagging for credit scoring models," European Journal of Operational Research, Elsevier, vol. 201(2), pages 490-499, March.
    5. Gray, J. Brian & Fan, Guangzhe, 2008. "Classification tree analysis using TARGET," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1362-1372, January.
    6. Setiono, Rudy & Baesens, Bart & Mues, Christophe, 2009. "A note on knowledge discovery using neural networks and its application to credit card screening," European Journal of Operational Research, Elsevier, vol. 192(1), pages 326-332, January.

    Citations

    Cited by:

    1. Yitong Guo & Jie Mei & Zhiting Pan & Haonan Liu & Weiwei Li, 2022. "Adaptively Promoting Diversity in a Novel Ensemble Method for Imbalanced Credit-Risk Evaluation," Mathematics, MDPI, vol. 10(11), pages 1-21, May.
    2. Omar H. Fares & Irfan Butt & Seung Hwan Mark Lee, 2023. "Utilization of artificial intelligence in the banking sector: a systematic literature review," Journal of Financial Services Marketing, Palgrave Macmillan, vol. 28(4), pages 835-852, December.
    3. Djonata Schiessl & Helison Bertoli Alves Dias & José Carlos Korelo, 2022. "Artificial intelligence in marketing: a network analysis and future agenda," Journal of Marketing Analytics, Palgrave Macmillan, vol. 10(3), pages 207-218, September.
    4. Ching-Chin Chern & Weng-U Lei & Kwei-Long Huang & Shu-Yi Chen, 2021. "A decision tree classifier for credit assessment problems in big data environments," Information Systems and e-Business Management, Springer, vol. 19(1), pages 363-386, March.
    5. Barboza, Flavio & Altman, Edward, 2024. "Predicting financial distress in Latin American companies: A comparative analysis of logistic regression and random forest models," The North American Journal of Economics and Finance, Elsevier, vol. 72(C).
    6. Trivedi, Shrawan Kumar, 2020. "A study on credit scoring modeling with different feature selection and machine learning approaches," Technology in Society, Elsevier, vol. 63(C).
    7. Vahid Baradaran & Maryam Keshavarz, 2015. "An integrated approach of system dynamics simulation and fuzzy inference system for retailers’ credit scoring," Economic Research-Ekonomska Istraživanja, Taylor & Francis Journals, vol. 28(1), pages 959-980, January.
    8. Zieba, Maciej & Härdle, Wolfgang Karl, 2016. "Beta-boosted ensemble for big credit scoring data," SFB 649 Discussion Papers 2016-052, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    9. Li, Jiawen & Meng, Lu & Zhang, Zelin & Yang, Kejia, 2023. "Low-frequency, high-impact: Discovering important rare events from UGC," Journal of Retailing and Consumer Services, Elsevier, vol. 70(C).
    10. Fallahpour, Saeid & Lakvan, Eisa Norouzian & Zadeh, Mohammad Hendijani, 2017. "Using an ensemble classifier based on sequential floating forward selection for financial distress prediction problem," Journal of Retailing and Consumer Services, Elsevier, vol. 34(C), pages 159-167.
    11. Paulo Vitor Campos Souza & Luiz Carlos Bambirra Torres, 2021. "Extreme Wavelet Fast Learning Machine for Evaluation of the Default Profile on Financial Transactions," Computational Economics, Springer;Society for Computational Economics, vol. 57(4), pages 1263-1285, April.
    12. repec:hum:wpaper:sfb649dp2016-052 is not listed on IDEAS
    13. Liao, Shu-Hsien & Yang, Ling-Ling, 2020. "Mobile payment and online to offline retail business models," Journal of Retailing and Consumer Services, Elsevier, vol. 57(C).
    14. Hazar Altinbas & Goktug Cenk Akkaya, 2017. "Improving the performance of statistical learning methods with a combined meta-heuristic for consumer credit risk assessment," Risk Management, Palgrave Macmillan, vol. 19(4), pages 255-280, November.
    15. Weidong Guo & Zach Zhizhong Zhou, 2022. "A comparative study of combining tree‐based feature selection methods and classifiers in personal loan default prediction," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(6), pages 1248-1313, September.
    16. Ahmad Hammami & Mohammad Hendijani Zadeh, 2022. "Predicting earnings management through machine learning ensemble classifiers," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(8), pages 1639-1660, December.
    17. Saba Moradi & Farimah Mokhatab Rafiei, 2019. "A dynamic credit risk assessment model with data mining techniques: evidence from Iranian banks," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 5(1), pages 1-27, December.
    18. Lkhagvadorj Munkhdalai & Tsendsuren Munkhdalai & Oyun-Erdene Namsrai & Jong Yun Lee & Keun Ho Ryu, 2019. "An Empirical Comparison of Machine-Learning Methods on Bank Client Credit Assessments," Sustainability, MDPI, vol. 11(3), pages 1-23, January.
    19. Widodo, Erwin & Rochmadhan, Oryza Akbar & Lukmandono & Januardi, 2022. "Modeling Bayesian inspection game for non-performing loan problems," Operations Research Perspectives, Elsevier, vol. 9(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    2. Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
    3. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    4. Alexandra Horobet & Stefania Cristina Curea & Alexandra Smedoiu Popoviciu & Cosmin-Alin Botoroga & Lucian Belascu & Dan Gabriel Dumitrescu, 2021. "Solvency Risk and Corporate Performance: A Case Study on European Retailers," JRFM, MDPI, vol. 14(11), pages 1-34, November.
    5. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    6. K. W. De Bock & D. Van Den Poel, 2012. "Reconciling Performance and Interpretability in Customer Churn Prediction using Ensemble Learning based on Generalized Additive Models," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/805, Ghent University, Faculty of Economics and Business Administration.
    7. Rais Ahmad Itoo & A. Selvarasu & José António Filipe, 2015. "Loan Products and Credit Scoring by Commercial Banks (India)," International Journal of Finance, Insurance and Risk Management, International Journal of Finance, Insurance and Risk Management, vol. 5(1), pages 851-851.
    8. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    9. Raffaele Manini & Oriol Amat, 2018. "Credit scoring for the supermarket and retailing industry: analysis and application proposal," Economics Working Papers 1614, Department of Economics and Business, Universitat Pompeu Fabra.
    10. Brad S. Trinkle & Amelia A. Baldwin, 2016. "Research Opportunities for Neural Networks: The Case for Credit," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(3), pages 240-254, July.
    11. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    12. Akkoç, Soner, 2012. "An empirical comparison of conventional techniques, neural networks and the three stage hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of Turkish cred," European Journal of Operational Research, Elsevier, vol. 222(1), pages 168-178.
    13. Qifeng Qiao & Peter A. Beling, 2016. "Decision analytics and machine learning in economic and financial systems," Environment Systems and Decisions, Springer, vol. 36(2), pages 109-113, June.
    14. Hong Wang & Qingsong Xu & Lifeng Zhou, 2015. "Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-20, February.
    15. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    16. Dinh, K. & Kleimeier, S., 2006. "Credit scoring for Vietnam's retail banking market : implementation and implications for transactional versus relationship lending," Research Memorandum 012, Maastricht University, Maastricht Research School of Economics of Technology and Organization (METEOR).
    17. Lucian Belascu & Alexandra Horobet & Georgiana Vrinceanu & Consuela Popescu, 2021. "Performance Dissimilarities in European Union Manufacturing: The Effect of Ownership and Technological Intensity," Sustainability, MDPI, vol. 13(18), pages 1-19, September.
    18. Elena Stanghellini, 2003. "Monitoring the Behaviour of Credit Card Holders with Graphical Chain Models," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 30(9‐10), pages 1423-1435, December.
    19. Thomas Wainwright, 2011. "Elite Knowledges: Framing Risk and the Geographies of Credit," Environment and Planning A, vol. 43(3), pages 650-665, March.
    20. B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
