Printed from https://ideas.repec.org/p/arx/papers/2311.06292.html

Towards a data-driven debt collection strategy based on an advanced machine learning framework

Author

Listed:
  • Abel Sancarlos
  • Edgar Bahilo
  • Pablo Mozo
  • Lukas Norman
  • Obaid Ur Rehma
  • Mihails Anufrijevs

Abstract

The European debt purchase market, measured by the total book value of purchased debt, approached 25bn euros in 2020 and was growing at double-digit rates. This illustrates how large the debt collection and debt purchase industry has become and the impact it has on the financial sector. To ensure an adequate return during the debt collection process, however, a good estimate of the propensity to pay and/or the expected cash flow is crucial. These estimates can be used, for instance, to design different strategies during amicable collection that maximize quality standards and revenues, and to prioritize the cases in which a legal process is necessary when debtors are unreachable for an amicable negotiation. This work offers a solution for these estimations. Specifically, a new machine learning modelling pipeline is presented and shown to outperform current strategies employed in the sector. The solution contains a pre-processing pipeline and a model selector based on the best model calibration. Performance is validated with real historical data from the debt industry.
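
As a rough illustration of what a calibration-based model selector could look like, the sketch below scores candidate propensity-to-pay models with the Brier score and keeps the lowest-scoring one. This is not the authors' pipeline: the abstract does not say which calibration criterion is used, so the Brier score (a strictly proper scoring rule that reflects calibration together with sharpness) is an assumption, and the function names, model names, and data are all hypothetical.

```python
from typing import Dict, Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def select_best_calibrated(
    y_true: Sequence[int], candidates: Dict[str, Sequence[float]]
) -> str:
    """Return the name of the candidate with the lowest Brier score."""
    scores = {name: brier_score(y_true, probs) for name, probs in candidates.items()}
    return min(scores, key=scores.get)


# Hypothetical validation outcomes: 1 = debtor paid, 0 = debtor did not pay.
paid = [1, 0, 0, 1, 0]

# Hypothetical predicted propensities to pay from two made-up candidate models.
candidates = {
    "model_a": [0.9, 0.2, 0.1, 0.7, 0.3],
    "model_b": [0.6, 0.5, 0.5, 0.6, 0.5],
}

best = select_best_calibrated(paid, candidates)
print(best)
```

In practice the scores would be computed on held-out historical cases, and a reliability diagram or a dedicated calibration metric could replace the Brier score if calibration alone is the selection criterion.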

Suggested Citation

  • Abel Sancarlos & Edgar Bahilo & Pablo Mozo & Lukas Norman & Obaid Ur Rehma & Mihails Anufrijevs, 2023. "Towards a data-driven debt collection strategy based on an advanced machine learning framework," Papers 2311.06292, arXiv.org.
  • Handle: RePEc:arx:papers:2311.06292

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2311.06292
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Allan H. Murphy & Robert L. Winkler, 1977. "Reliability of Subjective Probability Forecasts of Precipitation and Temperature," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 26(1), pages 41-47, March.
    2. Gneiting, Tilmann & Raftery, Adrian E., 2007. "Strictly Proper Scoring Rules, Prediction, and Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 359-378, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dimitriadis, Timo & Gneiting, Tilmann & Jordan, Alexander I. & Vogel, Peter, 2024. "Evaluating probabilistic classifiers: The triptych," International Journal of Forecasting, Elsevier, vol. 40(3), pages 1101-1122.
    2. Victor Richmond R. Jose & Robert L. Winkler, 2009. "Evaluating Quantile Assessments," Operations Research, INFORMS, vol. 57(5), pages 1287-1297, October.
    3. Azar, Pablo D. & Micali, Silvio, 2018. "Computational principal agent problems," Theoretical Economics, Econometric Society, vol. 13(2), May.
    4. Angelica Gianfreda & Francesco Ravazzolo & Luca Rossini, 2023. "Large Time‐Varying Volatility Models for Hourly Electricity Prices," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 85(3), pages 545-573, June.
    5. Tobias Fissler & Yannick Hoga, 2024. "How to Compare Copula Forecasts?," Papers 2410.04165, arXiv.org.
    6. Davide Pettenuzzo & Francesco Ravazzolo, 2016. "Optimal Portfolio Choice Under Decision‐Based Model Combinations," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 31(7), pages 1312-1332, November.
    7. Rubio, F.J. & Steel, M.F.J., 2011. "Inference for grouped data with a truncated skew-Laplace distribution," Computational Statistics & Data Analysis, Elsevier, vol. 55(12), pages 3218-3231, December.
    8. Hwang, Eunju, 2022. "Prediction intervals of the COVID-19 cases by HAR models with growth rates and vaccination rates in top eight affected countries: Bootstrap improvement," Chaos, Solitons & Fractals, Elsevier, vol. 155(C).
    9. R de Fondeville & A C Davison, 2018. "High-dimensional peaks-over-threshold inference," Biometrika, Biometrika Trust, vol. 105(3), pages 575-592.
    10. Armantier, Olivier & Treich, Nicolas, 2013. "Eliciting beliefs: Proper scoring rules, incentives, stakes and hedging," European Economic Review, Elsevier, vol. 62(C), pages 17-40.
    11. Domenico Piccolo & Rosaria Simone, 2019. "The class of cub models: statistical foundations, inferential issues and empirical evidence," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(3), pages 389-435, September.
    12. Finn Lindgren, 2015. "Comments on: Comparing and selecting spatial predictors using local criteria," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(1), pages 35-44, March.
    13. Chuliá, Helena & Garrón, Ignacio & Uribe, Jorge M., 2024. "Daily growth at risk: Financial or real drivers? The answer is not always the same," International Journal of Forecasting, Elsevier, vol. 40(2), pages 762-776.
    14. Laura Liu & Hyungsik Roger Moon & Frank Schorfheide, 2023. "Forecasting with a panel Tobit model," Quantitative Economics, Econometric Society, vol. 14(1), pages 117-159, January.
    15. Warne, Anders, 2023. "DSGE model forecasting: rational expectations vs. adaptive learning," Working Paper Series 2768, European Central Bank.
    16. James Mitchell & Aubrey Poon & Dan Zhu, 2024. "Constructing density forecasts from quantile regressions: Multimodality in macrofinancial dynamics," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(5), pages 790-812, August.
    17. Rafael Frongillo, 2022. "Quantum Information Elicitation," Papers 2203.07469, arXiv.org.
    18. Karimi, Majid & Zaerpour, Nima, 2022. "Put your money where your forecast is: Supply chain collaborative forecasting with cost-function-based prediction markets," European Journal of Operational Research, Elsevier, vol. 300(3), pages 1035-1049.
    19. Peysakhovich, Alexander & Plagborg-Møller, Mikkel, 2012. "A note on proper scoring rules and risk aversion," Economics Letters, Elsevier, vol. 117(1), pages 357-361.
    20. Ranadeep Daw & Christopher K. Wikle, 2023. "REDS: Random ensemble deep spatial prediction," Environmetrics, John Wiley & Sons, Ltd., vol. 34(1), February.

