Printed from https://ideas.repec.org/p/bdi/opques/qef_631_21.html

Application of classification algorithms for the assessment of confirmation to quality remarks

Author

Listed:
  • Fabio Zambuto

    (Bank of Italy)

  • Simona Arcuti

    (Bank of Italy)

  • Roberto Sabatini

    (Bank of Italy)

  • Daniele Zambuto

Abstract

In the context of the data quality management of supervisory banking data, the Bank of Italy receives a significant number of data reports at various intervals from Italian banks. If any anomalies are found, a quality remark is sent back, questioning the data submitted. This process can lead to the bank in question confirming or revising the data it previously transmitted. We propose an innovative methodology, based on text mining and machine learning techniques, for the automatic processing of the data confirmations received from banks. A classification model is employed to predict whether these confirmations should be accepted or rejected based on the reasons provided by the reporting banks, the characteristics of the validation quality checks, and reporting behaviour across the banking system. The model was trained on past cases already labelled by data managers and its performance was assessed against a set of cross-checked cases that were used as gold standard. The empirical findings show that the methodology predicts the correct decisions on recurrent data confirmations and that the performance of the proposed model is comparable to that of data managers currently engaged in data analysis.
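The evaluation described in the abstract, in which the model's predictions and the data managers' decisions are both scored against cross-checked gold-standard cases, can be illustrated with a minimal sketch. The toy labels and the helper names (`accuracy`, `compare_to_gold`) are hypothetical and are not taken from the paper, which may use different metrics and data.

```python
from typing import Dict, List

ACCEPT, REJECT = "accept", "reject"


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Share of cases where the decision matches the gold-standard label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one case is required")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


def compare_to_gold(gold: List[str], decisions: Dict[str, List[str]]) -> Dict[str, float]:
    """Score every decision source (model, data manager, ...) against the gold labels."""
    return {name: accuracy(labels, gold) for name, labels in decisions.items()}


# Hypothetical toy data: five confirmations with cross-checked gold labels.
gold = [ACCEPT, REJECT, ACCEPT, ACCEPT, REJECT]
decisions = {
    "model": [ACCEPT, REJECT, ACCEPT, REJECT, REJECT],
    "data_manager": [ACCEPT, REJECT, REJECT, ACCEPT, REJECT],
}

scores = compare_to_gold(gold, decisions)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup both sources agree with the gold standard on four of five cases, which is the kind of like-for-like comparison the abstract refers to when it calls the model's performance comparable to that of data managers.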

Suggested Citation

  • Fabio Zambuto & Simona Arcuti & Roberto Sabatini & Daniele Zambuto, 2021. "Application of classification algorithms for the assessment of confirmation to quality remarks," Questioni di Economia e Finanza (Occasional Papers) 631, Bank of Italy, Economic Research and International Relations Area.
  • Handle: RePEc:bdi:opques:qef_631_21

    Download full text from publisher

    File URL: https://www.bancaditalia.it/pubblicazioni/qef/2021-0631/QEF_631_21.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Chakraborty, Chiranjit & Joseph, Andreas, 2017. "Machine learning at central banks," Bank of England working papers 674, Bank of England.
    2. Tobias Cagala, 2017. "Improving data quality and closing data gaps with machine learning," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Data needs and Statistics compilation for macroprudential analysis, volume 46, Bank for International Settlements.
    3. Fabio Zambuto, 2021. "Quality checks on granular banking data: an experimental approach based on machine learning," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Micro data for the macro world, volume 53, Bank for International Settlements.
    4. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2021. "Learning from revisions: a tool for detecting potential errors in banks' balance sheet statistical reporting," Questioni di Economia e Finanza (Occasional Papers) 611, Bank of Italy, Economic Research and International Relations Area.

    Citations



    Cited by:

    1. Vittoria La Serra & Emiliano Svezia, 2024. "A supervised record linkage approach for anomaly detection in insurance assets granular data," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(5), pages 4181-4205, October.
    2. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2022. "Learning from revisions: an algorithm to detect errors in banks’ balance sheet statistical reporting," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(6), pages 4025-4059, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2022. "Learning from revisions: an algorithm to detect errors in banks’ balance sheet statistical reporting," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(6), pages 4025-4059, December.
    2. Francesco Cusano & Giuseppe Marinelli & Stefano Piermattei, 2021. "Learning from revisions: a tool for detecting potential errors in banks' balance sheet statistical reporting," Questioni di Economia e Finanza (Occasional Papers) 611, Bank of Italy, Economic Research and International Relations Area.
    3. Vittoria La Serra & Emiliano Svezia, 2024. "A supervised record linkage approach for anomaly detection in insurance assets granular data," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(5), pages 4181-4205, October.
    4. Davide Nicola Continanza & Andrea del Monaco & Marco di Lucido & Daniele Figoli & Pasquale Maddaloni & Filippo Quarta & Giuseppe Turturiello, 2023. "Stacking machine learning models for anomaly detection: comparing AnaCredit to other banking data sets," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Data science in central banking: applications and tools, volume 59, Bank for International Settlements.
    5. Fabio Zambuto, 2021. "Quality checks on granular banking data: an experimental approach based on machine learning," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Micro data for the macro world, volume 53, Bank for International Settlements.
    6. repec:zbw:bofitp:2019_008 is not listed on IDEAS
    7. Joseph, Andreas & Vasios, Michalis, 2022. "OTC Microstructure in a period of stress: A Multi-layered network approach," Journal of Banking & Finance, Elsevier, vol. 138(C).
    8. Funke, Michael & Tsang, Andrew, 2019. "The direction and intensity of China's monetary policy conduct: A dynamic factor modelling approach," BOFIT Discussion Papers 8/2019, Bank of Finland Institute for Emerging Economies (BOFIT).
    9. Tsang, Andrew, 2021. "Uncovering Heterogeneous Regional Impacts of Chinese Monetary Policy," MPRA Paper 110703, University Library of Munich, Germany.
    10. James Chapman & Ajit Desai, 2021. "Using Payments Data to Nowcast Macroeconomic Variables During the Onset of COVID-19," Staff Working Papers 21-2, Bank of Canada.
    11. Martin Baumgaertner & Johannes Zahner, 2021. "Whatever it takes to understand a central banker - Embedding their words using neural networks," MAGKS Papers on Economics 202130, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    12. Szafranek, Karol, 2019. "Bagged neural networks for forecasting Polish (low) inflation," International Journal of Forecasting, Elsevier, vol. 35(3), pages 1042-1059.
    13. Barkan, Oren & Benchimol, Jonathan & Caspi, Itamar & Cohen, Eliya & Hammer, Allon & Koenigstein, Noam, 2023. "Forecasting CPI inflation components with Hierarchical Recurrent Neural Networks," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1145-1162.
    14. Andrea Carboni & Alessandro Moro, 2018. "Imputation techniques for the nationality of foreign shareholders in Italian firms," IFC Bulletins chapters, in: Bank for International Settlements (ed.), External sector statistics: current issues and new challenges, volume 48, Bank for International Settlements.
    15. Ivan Baybuza, 2018. "Inflation Forecasting Using Machine Learning Methods," Russian Journal of Money and Finance, Bank of Russia, vol. 77(4), pages 42-59, December.
    16. Joseph, Andreas, 2019. "Parametric inference with universal function approximators," Bank of England working papers 784, Bank of England, revised 22 Jul 2020.
    17. Long Ren & Shaojie Cong & Xinlong Xue & Daqing Gong, 2024. "Credit rating prediction with supply chain information: a machine learning perspective," Annals of Operations Research, Springer, vol. 342(1), pages 657-686, November.
    18. Denis Shibitov & Mariam Mamedli, 2021. "Forecasting Russian Cpi With Data Vintages And Machine Learning Techniques," Bank of Russia Working Paper Series wps70, Bank of Russia.
    19. Jesús Fernández-Villaverde & Galo Nuño & Jesse Perla, 2024. "Taming the Curse of Dimensionality: Quantitative Economics with Deep Learning," NBER Working Papers 33117, National Bureau of Economic Research, Inc.
    20. James T. E. Chapman & Ajit Desai, 2023. "Macroeconomic Predictions Using Payments Data and Machine Learning," Forecasting, MDPI, vol. 5(4), pages 1-32, November.
    21. Defina, Ryan, 2021. "Machine Learning Methods: Potential for Deposit Insurance," MPRA Paper 110712, University Library of Munich, Germany.

    More about this item

    Keywords

    supervisory banking data; data quality management; machine learning; text mining; latent Dirichlet allocation; gradient boosting

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages

