Printed from https://ideas.repec.org/a/wly/jforec/v39y2020i2p260-280.html

Predicting loan default in peer‐to‐peer lending using narrative data

Author

Listed:
  • Yufei Xia
  • Lingyun He
  • Yinguo Li
  • Nana Liu
  • Yanlin Ding

Abstract

Peer‐to‐peer (P2P) lending is facing severe information asymmetry problems and depends highly on the internal credit scoring system. This paper provides a novel credit scoring model, which forecasts the probability of default for each applicant and guides the lenders' decision‐making in P2P lending. The proposal is expected to improve the existing credit scoring models in P2P lending from two aspects, namely the classifier and the usage of narrative data. We utilize an advanced gradient boosting decision tree technique (i.e., CatBoost) to predict default loans. Moreover, a soft information extraction technique based on keyword clustering is developed to compensate for the insufficient hard credit data. Validated on three real‐world datasets, the experimental results demonstrate that variables extracted from narrative data are powerful features, and the utilization of narrative data significantly improves the predictability relative to solely using hard information. The results of sensitivity analysis reveal that CatBoost outperforms the industry benchmark under different cluster numbers of extracted soft information; meanwhile a small number of clusters (e.g., three) is preferred for consideration of model performance, computational cost, and comprehensibility. We finally facilitate a discussion on practical implication and explanatory considerations.
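
The abstract does not spell out how keywords are clustered or how the resulting variables are built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each keyword has a numeric vector, groups keywords with a plain k-means, and counts how many keywords from each cluster appear in a loan description. The keyword list, the 2-D vectors, and the function names `kmeans`, `cluster_keywords`, and `soft_features` are all hypothetical.

```python
# Illustrative sketch only: the keywords, vectors, and helper names below are
# assumptions made for this example and do not come from the paper.
from math import dist

# Hypothetical keyword vectors (in practice these might be word embeddings).
# The first three entries seed the clusters, so they are deliberately distinct.
KEYWORD_VECTORS = {
    "debt": (0.0, 0.1),
    "wedding": (5.0, 5.1),
    "business": (10.0, 0.0),
    "consolidate": (0.1, 0.0),
    "vacation": (5.1, 5.0),
    "inventory": (10.1, 0.2),
}


def kmeans(points, k, iterations=10):
    """Plain k-means, seeded with the first k points so the run is repeatable."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def cluster_keywords(keyword_vectors, k):
    """Map every keyword to a cluster index in range(k)."""
    words = list(keyword_vectors)
    labels = kmeans([keyword_vectors[w] for w in words], k)
    return dict(zip(words, labels))


def soft_features(description, keyword_to_cluster, k):
    """Count, per cluster, the keywords that occur in one loan description."""
    counts = [0] * k
    for token in description.lower().split():
        cluster = keyword_to_cluster.get(token.strip(".,;:!?"))
        if cluster is not None:
            counts[cluster] += 1
    return counts


keyword_to_cluster = cluster_keywords(KEYWORD_VECTORS, k=3)
features = soft_features(
    "I want to consolidate my debt before the wedding.", keyword_to_cluster, k=3
)
print(features)
```

In a pipeline of the kind the abstract describes, per-cluster counts like these would be appended to the hard credit variables before training a gradient boosting classifier such as CatBoost. That step is left out here so the sketch runs with the standard library alone. The first-k seeding is chosen only for repeatability; a real run would more likely use random restarts or a library implementation.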

Suggested Citation

  • Yufei Xia & Lingyun He & Yinguo Li & Nana Liu & Yanlin Ding, 2020. "Predicting loan default in peer‐to‐peer lending using narrative data," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(2), pages 260-280, March.
  • Handle: RePEc:wly:jforec:v:39:y:2020:i:2:p:260-280
    DOI: 10.1002/for.2625

    Download full text from publisher

    File URL: https://doi.org/10.1002/for.2625
    Download Restriction: no

    File URL: https://libkey.io/10.1002/for.2625?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Guo, Yanhong & Zhou, Wenjun & Luo, Chunyu & Liu, Chuanren & Xiong, Hui, 2016. "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, Elsevier, vol. 249(2), pages 417-426.
    2. Mild, Andreas & Waitz, Martin & Wöckl, Jürgen, 2015. "How low can you go? — Overcoming the inability of lenders to set proper interest rates on unsecured peer-to-peer lending markets," Journal of Business Research, Elsevier, vol. 68(6), pages 1291-1305.
    3. Freedman, Seth & Jin, Ginger Zhe, 2017. "The information value of online social networks: Lessons from peer-to-peer lending," International Journal of Industrial Organization, Elsevier, vol. 51(C), pages 185-222.
    4. Dorfleitner, Gregor & Priberny, Christopher & Schuster, Stephanie & Stoiber, Johannes & Weber, Martina & de Castro, Ivan & Kammler, Julia, 2016. "Description-text related soft information in peer-to-peer lending – Evidence from two leading European platforms," Journal of Banking & Finance, Elsevier, vol. 64(C), pages 169-187.
    5. Jeremy C. Short & David J. Ketchen Jr. & Aaron F. McKenny & Thomas H. Allison & R. Duane Ireland, 2017. "Research on Crowdfunding: Reviewing the (Very Recent) Past and Celebrating the Present," Entrepreneurship Theory and Practice, vol. 41(2), pages 149-160, March.
    6. Cuiqing Jiang & Zhao Wang & Ruiya Wang & Yong Ding, 2018. "Loan default prediction by combining soft information extracted from descriptive text in online peer-to-peer lending," Annals of Operations Research, Springer, vol. 266(1), pages 511-529, July.
    7. Chen, Xiao & Huang, Bihong & Ye, Dezhu, 2018. "The role of punctuation in P2P lending: Evidence from China," Economic Modelling, Elsevier, vol. 68(C), pages 634-643.
    8. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
    9. Mingfeng Lin & Nagpurnanand R. Prabhala & Siva Viswanathan, 2013. "Judging Borrowers by the Company They Keep: Friendship Networks and Information Asymmetry in Online Peer-to-Peer Lending," Management Science, INFORMS, vol. 59(1), pages 17-35, August.
    10. José María Liberti & Mitchell A Petersen, 2019. "Information: Hard and Soft," The Review of Corporate Finance Studies, Society for Financial Studies, vol. 8(1), pages 1-41.

    Citations

    Cited by:

    1. Modina, Michele & Pietrovito, Filomena & Gallucci, Carmen & Formisano, Vincenzo, 2023. "Predicting SMEs’ default risk: Evidence from bank-firm relationship data," The Quarterly Review of Economics and Finance, Elsevier, vol. 89(C), pages 254-268.
    2. Hyunwoo Woo & So Young Sohn, 2022. "A credit scoring model based on the Myers–Briggs type indicator in online peer-to-peer lending," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-19, December.
    3. Ligang Zhou & Chao Ma, 2023. "A Comparison of Different Rules on Loans Evaluation in Peer-to-Peer Lending by Gradient Boosting Models Under Moving Windows with Two Timestamps," Computational Economics, Springer;Society for Computational Economics, vol. 62(4), pages 1481-1504, December.
    4. Zhao, Shuping & Xu, Kai & Wang, Zhao & Liang, Changyong & Lu, Wenxing & Chen, Bo, 2022. "Financial distress prediction by combining sentiment tone features," Economic Modelling, Elsevier, vol. 106(C).
    5. Xia, Yufei & Zhao, Junhao & He, Lingyun & Li, Yinguo & Yang, Xiaoli, 2021. "Forecasting loss given default for peer-to-peer loans via heterogeneous stacking ensemble approach," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1590-1613.
    6. Stefano Filomeni & Udichibarna Bose & Anastasios Megaritis & Athanasios Triantafyllou, 2024. "Can market information outperform hard and soft information in predicting corporate defaults?," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 29(3), pages 3567-3592, July.
    7. Tian, Geran & Wang, Xiaowen & Wu, Weixing, 2021. "Borrow low, lend high: Credit arbitrage by sophisticated investors," Pacific-Basin Finance Journal, Elsevier, vol. 67(C).
    8. Mario Sanz-Guerrero & Javier Arroyo, 2024. "Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending," Papers 2401.16458, arXiv.org, revised Aug 2024.
    9. Jabeur, Sami Ben & Gharib, Cheima & Mefteh-Wali, Salma & Arfi, Wissal Ben, 2021. "CatBoost model and artificial intelligence techniques for corporate failure prediction," Technological Forecasting and Social Change, Elsevier, vol. 166(C).

