IDEAS home Printed from https://ideas.repec.org/a/gam/jrisks/v9y2021i3p54-d519306.html

Regularization of Autoencoders for Bank Client Profiling Based on Financial Transactions

Author

Listed:
  • Andrey Filchenkov

    (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., 197101 St. Petersburg, Russia; these authors contributed equally to this work.)

  • Natalia Khanzhina

    (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., 197101 St. Petersburg, Russia; these authors contributed equally to this work.)

  • Arina Tsai

    (Computer Technologies Department, Formerly ITMO University, 49 Kronverksky Pr., 197101 St. Petersburg, Russia)

  • Ivan Smetannikov

    (Machine Learning Lab, ITMO University, 49 Kronverksky Pr., 197101 St. Petersburg, Russia)

Abstract

Credit scoring, the task of predicting whether a client is worth giving a loan, is one of the most essential and popular problems in banking. Predictive models for this task are built on the assumption that the client’s future behavior depends on their profile before the loan approval. However, the circumstances that change a client’s behavior may be beyond their control and cannot be predicted from their profile. Such clients may be considered “noisy”, as their eventual membership in the defaulter class results from random factors rather than from predictable rules. Excluding such clients from the dataset may help in building more accurate predictive models. In this paper, we report preliminary results on testing the hypothesis that a client can become a defaulter in two scenarios: intentionally and unintentionally. We verify our hypothesis by applying data-driven regularized classification, using an autoencoder, to client profiles. To model intention as a hidden variable, we propose a specially designed regularizer for the autoencoder. The regularizer aims to obtain a representation of defaulters in which intentional defaulters form a cluster and unintentional defaulters appear as outliers. Our model detected the outliers, which were then excluded from the dataset. This improved the credit scoring model and confirmed our hypothesis.
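
The pipeline outlined in the abstract (regularized autoencoder, outlier detection among defaulters, exclusion from the training set) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not state the form of the regularizer, so the sketch swaps in a simple compactness penalty that pulls defaulters' latent codes toward their centroid, uses a linear autoencoder, and flags outliers with a distance quantile. The names fit_autoencoder and flag_unintentional, and the parameters lam and quantile, are hypothetical.

```python
import numpy as np


def fit_autoencoder(X, y, latent_dim=2, lam=0.5, lr=0.01, epochs=300, seed=0):
    """Train a linear autoencoder with an assumed compactness regularizer.

    Loss = mean_i ||x_i W_enc W_dec - x_i||^2
           + lam * mean_{i: y_i = 1} ||z_i - c||^2,
    where z_i = x_i W_enc and c is the centroid of the defaulters' codes.
    Assumes y contains at least one defaulter (label 1).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))
    defaulters = y == 1
    m = int(defaulters.sum())
    for _ in range(epochs):
        Z = X @ W_enc                      # latent codes
        R = Z @ W_dec - X                  # reconstruction residual
        grad_dec = 2.0 * Z.T @ R / n       # gradient w.r.t. decoder weights
        grad_Z = 2.0 * R @ W_dec.T / n     # gradient w.r.t. latent codes
        # Regularizer gradient: pull defaulters' codes toward their centroid.
        centroid = Z[defaulters].mean(axis=0)
        grad_Z[defaulters] += 2.0 * lam * (Z[defaulters] - centroid) / m
        W_enc -= lr * (X.T @ grad_Z)
        W_dec -= lr * grad_dec
    return W_enc, W_dec


def flag_unintentional(X, y, W_enc, quantile=0.9):
    """Flag defaulters whose code lies far from the defaulter centroid.

    The quantile threshold is an illustrative choice, not taken from the paper.
    """
    Z = X @ W_enc
    defaulters = y == 1
    centroid = Z[defaulters].mean(axis=0)
    dist = np.linalg.norm(Z - centroid, axis=1)
    threshold = np.quantile(dist[defaulters], quantile)
    return defaulters & (dist > threshold)
```

Under these assumptions, a downstream scoring model would be trained on X[~flagged] and y[~flagged], where flagged is the mask returned by flag_unintentional. Whether this removal helps is an empirical question; the sketch only shows the mechanics.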

Suggested Citation

  • Andrey Filchenkov & Natalia Khanzhina & Arina Tsai & Ivan Smetannikov, 2021. "Regularization of Autoencoders for Bank Client Profiling Based on Financial Transactions," Risks, MDPI, vol. 9(3), pages 1-16, March.
  • Handle: RePEc:gam:jrisks:v:9:y:2021:i:3:p:54-:d:519306

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/9/3/54/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/9/3/54/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    2. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    3. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    4. Maria Rocha Sousa & João Gama & Elísio Brandão, 2013. "Introducing time-changing economics into credit scoring," FEP Working Papers 513, Universidade do Porto, Faculdade de Economia do Porto.
    5. Yiheng Li & Weidong Chen, 2020. "A Comparative Performance Assessment of Ensemble Learning for Credit Scoring," Mathematics, MDPI, vol. 8(10), pages 1-19, October.
    6. Dawei Cheng & Zhibin Niu & Yi Tu & Liqing Zhang, 2017. "Prediction defaults for networked-guarantee loans," Papers 1702.04642, arXiv.org, revised Jun 2020.
    7. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    8. Rasa Kanapickiene & Renatas Spicas, 2019. "Credit Risk Assessment Model for Small and Micro-Enterprises: The Case of Lithuania," Risks, MDPI, vol. 7(2), pages 1-23, June.
    9. B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
    10. Ha-Thu Nguyen, 2015. "How is credit scoring used to predict default in China?," EconomiX Working Papers 2015-1, University of Paris Nanterre, EconomiX.
    11. Teply, Petr & Polena, Michal, 2020. "Best classification algorithms in peer-to-peer lending," The North American Journal of Economics and Finance, Elsevier, vol. 51(C).
    12. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    13. Badreddine Benyacoub & Souad ElBernoussi & Abdelhak Zoglat & Mohamed Ouzineb, 2022. "Credit Scoring Model Based on HMM/Baum-Welch Method," Computational Economics, Springer;Society for Computational Economics, vol. 59(3), pages 1135-1154, March.
    14. Agustin Pérez-Martín & Agustin Pérez-Torregrosa & Alejandro Rabasa & Marta Vaca, 2020. "Feature Selection to Optimize Credit Banking Risk Evaluation Decisions for the Example of Home Equity Loans," Mathematics, MDPI, vol. 8(11), pages 1-16, November.
    15. Juan Laborda & Seyong Ryoo, 2021. "Feature Selection in a Credit Scoring Model," Mathematics, MDPI, vol. 9(7), pages 1-22, March.
    16. Evžen Kocenda & Martin Vojtek, 2011. "Default Predictors in Retail Credit Scoring: Evidence from Czech Banking Data," Emerging Markets Finance and Trade, Taylor & Francis Journals, vol. 47(6), pages 80-98, November.
    17. Martin Leo & Suneel Sharma & K. Maddulety, 2019. "Machine Learning in Banking Risk Management: A Literature Review," Risks, MDPI, vol. 7(1), pages 1-22, March.
    18. Azam, Rehan & Muhammad, Danish & Syed Akbar, Suleman, 2012. "The significance of socioeconomic factors on personal loan decision a study of consumer banking local private banks in Pakistan," MPRA Paper 42322, University Library of Munich, Germany.
    19. Mestiri, Sami & Farhat, Abdejelil, 2018. "Credit Risk Prediction based on Bayesian estimation of logistic regression model with random effects," MPRA Paper 119960, University Library of Munich, Germany.
    20. Joseph L. Breeden, 2024. "An Age–Period–Cohort Framework for Profit and Profit Volatility Modeling," Mathematics, MDPI, vol. 12(10), pages 1-23, May.
