IDEAS home Printed from https://ideas.repec.org/a/gam/jrisks/v9y2021i11p204-d675912.html

Using Model Performance to Assess the Representativeness of Data for Model Development and Calibration in Financial Institutions

Author

Listed:
  • Chamay Kruger

    (Centre for Business Mathematics and Informatics, North-West University, Potchefstroom 2531, South Africa)

  • Willem Daniel Schutte

    (Centre for Business Mathematics and Informatics, North-West University, Potchefstroom 2531, South Africa;
    National Institute for Theoretical and Computational Sciences (NITheCS), Pretoria 0001, South Africa)

  • Tanja Verster

    (Centre for Business Mathematics and Informatics, North-West University, Potchefstroom 2531, South Africa)

Abstract

This paper proposes a methodology that uses model performance as a metric to assess the representativeness of external or pooled data when banks use such data in regulatory model development and calibration. There is currently no formal methodology to assess representativeness. The paper reviews existing regulatory literature on the requirements for assessing representativeness and emphasises that both qualitative and quantitative aspects need to be considered. We present a novel methodology, apply it to two case studies, and compare it with the Multivariate Prediction Accuracy Index. The first case study investigates whether a pooled data source from Global Credit Data (GCD) is representative when considering the enrichment of internal data with pooled data in the development of a regulatory loss given default (LGD) model. The second case study differs from the first by illustrating which other countries in the pooled data set could be representative when enriching internal data during the development of an LGD model. Using these case studies as examples, our proposed methodology provides users with a generalised framework to identify subsets of the external data that are representative of their country’s or bank’s data, making the results general and universally applicable.
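
The abstract does not spell out the steps of the methodology, so the following is only a minimal sketch of the general idea of using model performance as a representativeness check, not the authors' procedure. It assumes a plain least-squares model, mean squared error as the performance measure, and synthetic data; the function names (`fit_linear`, `mse`, `performance_gap`) and all numbers are hypothetical. A model fitted on internal data alone is compared with one fitted on internal data enriched with external data, and both are scored on held-out internal data.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    """Mean squared error of the fitted linear model on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))

def performance_gap(X_train, y_train, X_test, y_test, X_ext, y_ext):
    """Held-out internal MSE of the enriched model minus that of the
    internal-only model. A gap near zero (or negative) suggests the external
    data behaves like the internal data; a large positive gap suggests not."""
    base = fit_linear(X_train, y_train)
    enriched = fit_linear(np.vstack([X_train, X_ext]),
                          np.concatenate([y_train, y_ext]))
    return mse(enriched, X_test, y_test) - mse(base, X_test, y_test)

# Synthetic illustration only (assumed relationships, not real LGD data).
rng = np.random.default_rng(0)

def simulate(n, intercept, slope):
    X = rng.uniform(0.0, 1.0, size=(n, 1))
    y = intercept + slope * X[:, 0] + rng.normal(0.0, 0.1, size=n)
    return X, y

X_train, y_train = simulate(200, 0.3, 0.5)     # internal development sample
X_test, y_test = simulate(200, 0.3, 0.5)       # internal hold-out sample
X_sim, y_sim = simulate(1000, 0.3, 0.5)        # external, same relationship
X_dis, y_dis = simulate(1000, 0.8, -0.5)       # external, different relationship

gap_similar = performance_gap(X_train, y_train, X_test, y_test, X_sim, y_sim)
gap_dissimilar = performance_gap(X_train, y_train, X_test, y_test, X_dis, y_dis)
print(f"gap with similar external data:    {gap_similar:.4f}")
print(f"gap with dissimilar external data: {gap_dissimilar:.4f}")
```

In this toy setting the enriched model degrades on the internal hold-out sample only when the external data follows a different relationship. Applying the same comparison to subsets of a pooled data set (for example, one country at a time) would be one way to rank candidate subsets, again as an illustration rather than the paper's method.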

Suggested Citation

  • Chamay Kruger & Willem Daniel Schutte & Tanja Verster, 2021. "Using Model Performance to Assess the Representativeness of Data for Model Development and Calibration in Financial Institutions," Risks, MDPI, vol. 9(11), pages 1-26, November.
  • Handle: RePEc:gam:jrisks:v:9:y:2021:i:11:p:204-:d:675912

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/9/11/204/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/9/11/204/
    Download Restriction: no

    References listed on IDEAS

    1. Ross Taplin & Clive Hunt, 2019. "The Population Accuracy Index: A New Measure of Population Stability for Model Monitoring," Risks, MDPI, vol. 7(2), pages 1-11, May.
    2. Cortés, Lina M. & Mora-Valencia, Andrés & Perote, Javier, 2017. "Measuring firm size distribution with semi-nonparametric densities," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 485(C), pages 35-47.
    3. Francis X. Diebold, 2015. "Comparing Predictive Accuracy, Twenty Years Later: A Personal Perspective on the Use and Abuse of Diebold-Mariano Tests," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(1), pages 1-1, January.
    4. Loterman, Gert & Brown, Iain & Martens, David & Mues, Christophe & Baesens, Bart, 2012. "Benchmarking regression algorithms for loss given default modeling," International Journal of Forecasting, Elsevier, vol. 28(1), pages 161-170.
    5. Zhang, Yongli & Yang, Yuhong, 2015. "Cross-validation for selecting a model selection procedure," Journal of Econometrics, Elsevier, vol. 187(1), pages 95-112.

    Citations



    Cited by:

    1. Ross Taplin, 2023. "Investigating Causes of Model Instability: Properties of the Prediction Accuracy Index," Risks, MDPI, vol. 11(6), pages 1-15, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. João C. Claudio & Katja Heinisch & Oliver Holtemöller, 2020. "Nowcasting East German GDP growth: a MIDAS approach," Empirical Economics, Springer, vol. 58(1), pages 29-54, January.
    2. Florian Ziel & Rick Steinert & Sven Husmann, 2015. "Forecasting day ahead electricity spot prices: The impact of the EXAA to other European electricity markets," Papers 1501.00818, arXiv.org, revised Dec 2015.
    3. Chen, Xiaowei & Wang, Gang & Zhang, Xiangting, 2019. "Modeling recovery rate for leveraged loans," Economic Modelling, Elsevier, vol. 81(C), pages 231-241.
    4. Christophe Hurlin & Jérémy Leymarie & Antoine Patin, 2018. "Loss functions for LGD model comparison," Working Papers halshs-01516147, HAL.
    5. Seitz, Franz & Baumann, Ursel & Albuquerque, Bruno, 2015. "The information content of money and credit for US activity," Working Paper Series 1803, European Central Bank.
    6. Baris Soybilgen & Ege Yazgan, 2017. "An evaluation of inflation expectations in Turkey," Central Bank Review, Research and Monetary Policy Department, Central Bank of the Republic of Turkey, vol. 17(1), pages 1-31–38.
    7. Barbara Rossi, 2013. "Exchange Rate Predictability," Journal of Economic Literature, American Economic Association, vol. 51(4), pages 1063-1119, December.
    8. Wei, Jie & Chen, Hui, 2020. "Determining the number of factors in approximate factor models by twice K-fold cross validation," Economics Letters, Elsevier, vol. 191(C).
    9. Warne, Anders, 2023. "DSGE model forecasting: rational expectations vs. adaptive learning," Working Paper Series 2768, European Central Bank.
    10. Christopher Kath & Florian Ziel, 2018. "The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts," Papers 1811.08604, arXiv.org.
    11. Sophie van Huellen & Duo Qin, 2019. "Compulsory Schooling and Returns to Education: A Re-Examination," Econometrics, MDPI, vol. 7(3), pages 1-20, September.
    12. Joseph P. Byrne & Dimitris Korobilis & Pinho J. Ribeiro, 2018. "On The Sources Of Uncertainty In Exchange Rate Predictability," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 59(1), pages 329-357, February.
    13. Salvatore D. Tomarchio & Antonio Punzo, 2019. "Modelling the loss given default distribution via a family of zero‐and‐one inflated mixture models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 182(4), pages 1247-1266, October.
    14. Catherine L. Kling & Raymond W. Arritt & Gray Calhoun & David A. Keiser, 2016. "Research Needs and Challenges in the FEW System: Coupling Economic Models with Agronomic, Hydrologic, and Bioenergy Models for Sustainable Food, Energy, and Water Systems," Center for Agricultural and Rural Development (CARD) Publications 16-wp563, Center for Agricultural and Rural Development (CARD) at Iowa State University.
    15. Bellotti, Anthony & Brigo, Damiano & Gambetti, Paolo & Vrins, Frédéric, 2021. "Forecasting recovery rates on non-performing loans with machine learning," International Journal of Forecasting, Elsevier, vol. 37(1), pages 428-444.
    16. Karen Poghosyan & Ruben Poghosyan, 2021. "On the Applicability of Dynamic Factor Models for Forecasting Real GDP Growth in Armenia," Czech Journal of Economics and Finance (Finance a uver), Charles University Prague, Faculty of Social Sciences, vol. 71(1), pages 52-79, June.
    17. Thamayanthi Chellathurai, 2017. "Probability Density Of Recovery Rate Given Default Of A Firm’S Debt And Its Constituent Tranches," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 20(04), pages 1-34, June.
    18. Doumpos, Michalis & Zopounidis, Constantin & Gounopoulos, Dimitrios & Platanakis, Emmanouil & Zhang, Wenke, 2023. "Operational research and artificial intelligence methods in banking," European Journal of Operational Research, Elsevier, vol. 306(1), pages 1-16.
    19. Hwang, Ruey-Ching & Chu, Chih-Kang & Yu, Kaizhi, 2020. "Predicting LGD distributions with mixed continuous and discrete ordinal outcomes," International Journal of Forecasting, Elsevier, vol. 36(3), pages 1003-1022.
    20. Tong, Edward N.C. & Mues, Christophe & Thomas, Lyn, 2013. "A zero-adjusted gamma model for mortgage loan loss given default," International Journal of Forecasting, Elsevier, vol. 29(4), pages 548-562.
