IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i18p2907-d1480491.html

Incorporating Digital Footprints into Credit-Scoring Models through Model Averaging

Author

Listed:
  • Linhui Wang

    (School of Management, Xiamen University, Xiamen 361005, China
    Data Mining Research Center, Xiamen University, Xiamen 361005, China)

  • Jianping Zhu

    (School of Management, Xiamen University, Xiamen 361005, China
    Data Mining Research Center, Xiamen University, Xiamen 361005, China)

  • Chenlu Zheng

    (Public Administration Department, Fujian Police College, Fuzhou 350007, China)

  • Zhiyuan Zhang

    (Artificial Intelligence and Model Development Center, Technology Development Department, Xiamen International Bank, Xiamen 361001, China)

Abstract

Digital footprints provide crucial insights into individuals’ behaviors and preferences, and their role in credit scoring is becoming increasingly significant. It is therefore important to combine digital footprint data with traditional data for personal credit scoring. This paper proposes a novel credit-scoring model. First, lasso-logistic regression is used to select key variables that significantly impact the prediction results. Then, digital footprint variables are grouped based on business understanding, and candidate models are constructed from various combinations of these groups. Finally, the optimal weights are selected by minimizing the Kullback–Leibler loss, and the final prediction model is constructed from the weighted candidate models. Empirical analysis validates the advantages and feasibility of the proposed method in variable selection, coefficient estimation, and predictive accuracy. Furthermore, the model-averaging method yields a weight for each candidate model, which offers managerial insight into which variable combinations are beneficial for credit scoring.
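
The last two steps of the abstract (candidate models built from combinations of variable groups, then weights chosen by minimizing the Kullback–Leibler loss) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each candidate model has already produced default probabilities on a common sample, that the averaged prediction is a weighted mean of those probabilities, and that the weights are non-negative and sum to one. The group names, the helper names (candidate_group_sets, kl_loss, fit_weights), the toy data, and the exponentiated-gradient optimizer are all hypothetical choices made for illustration. For 0/1 labels, the KL divergence from the observed label to the predicted probability reduces to the cross-entropy, which is what kl_loss computes.

```python
import math
from itertools import combinations


def candidate_group_sets(groups):
    """Enumerate every non-empty combination of variable-group names.

    Assumption: the abstract only says "various combinations"; using all
    of them is an illustrative choice.
    """
    return [c for r in range(1, len(groups) + 1) for c in combinations(groups, r)]


def kl_loss(weights, probs, y, eps=1e-12):
    """Mean KL loss of the weighted-average probability against 0/1 labels."""
    total = 0.0
    for i, yi in enumerate(y):
        p = sum(w * pm[i] for w, pm in zip(weights, probs))
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def fit_weights(probs, y, steps=500, lr=0.5, eps=1e-12):
    """Minimize kl_loss over the simplex with exponentiated-gradient steps.

    The multiplicative update followed by renormalization keeps every
    weight non-negative and the weights summing to one.
    """
    m, n = len(probs), len(y)
    w = [1.0 / m] * m  # start from equal weights
    for _ in range(steps):
        grad = [0.0] * m
        for i, yi in enumerate(y):
            p = sum(wk * pm[i] for wk, pm in zip(w, probs))
            p = min(max(p, eps), 1.0 - eps)
            g = (p - yi) / (p * (1.0 - p))  # derivative of the loss w.r.t. p
            for k in range(m):
                grad[k] += g * probs[k][i] / n
        exps = [-lr * gk for gk in grad]
        top = max(exps)  # subtract the max exponent for numerical stability
        w = [wk * math.exp(e - top) for wk, e in zip(w, exps)]
        s = sum(w)
        w = [wk / s for wk in w]
    return w


# Hypothetical digital-footprint groups; each combination would define one candidate model.
groups = ["device", "app_usage", "shopping"]
candidates = candidate_group_sets(groups)

# Toy labels and predicted default probabilities from three candidate models.
y = [1, 0, 1, 0, 1, 0, 0, 1]
probs = [
    [0.9, 0.2, 0.8, 0.1, 0.7, 0.3, 0.2, 0.9],  # informative model
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative model
    [0.4, 0.6, 0.5, 0.5, 0.6, 0.4, 0.5, 0.5],  # weak model
]
weights = fit_weights(probs, y)
print(len(candidates), [round(w, 3) for w in weights])
```

In this toy setting the fitted weights concentrate on the informative candidate, which mirrors the abstract's point that the weights indicate which variable combinations are useful. In practice the weights would normally be fitted on held-out or cross-validated predictions rather than on the data used to train the candidate models.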

Suggested Citation

  • Linhui Wang & Jianping Zhu & Chenlu Zheng & Zhiyuan Zhang, 2024. "Incorporating Digital Footprints into Credit-Scoring Models through Model Averaging," Mathematics, MDPI, vol. 12(18), pages 1-15, September.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:18:p:2907-:d:1480491

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/18/2907/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/18/2907/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    2. Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
    3. Tigges, Maximilian & Mestwerdt, Sönke & Tschirner, Sebastian & Mauer, René, 2024. "Who gets the money? A qualitative analysis of fintech lending and credit scoring through the adoption of AI and alternative data," Technological Forecasting and Social Change, Elsevier, vol. 205(C).
    4. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    5. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    6. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    7. Akkoç, Soner, 2012. "An empirical comparison of conventional techniques, neural networks and the three stage hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of Turkish cred," European Journal of Operational Research, Elsevier, vol. 222(1), pages 168-178.
    8. Maria Rocha Sousa & João Gama & Elísio Brandão, 2013. "Introducing time-changing economics into credit scoring," FEP Working Papers 513, Universidade do Porto, Faculdade de Economia do Porto.
    9. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
    10. Thomas Wainwright, 2011. "Elite Knowledges: Framing Risk and the Geographies of Credit," Environment and Planning A, vol. 43(3), pages 650-665, March.
    11. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    12. Doruk Şen & Cem Çağrı Dönmez & Umman Mahir Yıldırım, 2020. "A Hybrid Bi-level Metaheuristic for Credit Scoring," Information Systems Frontiers, Springer, vol. 22(5), pages 1009-1019, October.
    13. Rais Ahmad Itoo & A. Selvarasu & José António Filipe, 2015. "Loan Products and Credit Scoring by Commercial Banks (India)," International Journal of Finance, Insurance and Risk Management, International Journal of Finance, Insurance and Risk Management, vol. 5(1), pages 851-851.
    14. Crook, Jonathan N. & Edelman, David B. & Thomas, Lyn C., 2007. "Recent developments in consumer credit risk assessment," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1447-1465, December.
    15. Ahmed Almustfa Hussin Adam Khatir & Marco Bee, 2022. "Machine Learning Models and Data-Balancing Techniques for Credit Scoring: What Is the Best Combination?," Risks, MDPI, vol. 10(9), pages 1-22, August.
    16. G Verstraeten & D Van den Poel, 2005. "The impact of sample bias on consumer credit scoring performance and profitability," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 56(8), pages 981-992, August.
    17. Neuberg Richard & Hannah Lauren, 2017. "Loan pricing under estimation risk," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 69-87, June.
    18. L C Thomas, 2010. "Consumer finance: challenges for operational research," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(1), pages 41-52, January.
    19. Doruk Şen & Cem Çağrı Dönmez & Umman Mahir Yıldırım, 0. "A Hybrid Bi-level Metaheuristic for Credit Scoring," Information Systems Frontiers, Springer, vol. 0, pages 1-11.
    20. José Willer Prado & Valderí Castro Alcântara & Francisval Melo Carvalho & Kelly Carvalho Vieira & Luiz Kennedy Cruz Machado & Dany Flávio Tonelli, 2016. "Multivariate analysis of credit risk and bankruptcy research data: a bibliometric study involving different knowledge fields (1968–2014)," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(3), pages 1007-1029, March.
