Printed from https://ideas.repec.org/p/arx/papers/2207.06273.html

Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions

Author

Listed:
  • José Pombal
  • André F. Cruz
  • João Bravo
  • Pedro Saleiro
  • Mário A. T. Figueiredo
  • Pedro Bizarro

Abstract

In recent years, machine learning algorithms have become ubiquitous in a multitude of high-stakes decision-making applications. The unparalleled ability of machine learning algorithms to learn patterns from data also enables them to incorporate the biases embedded within it. A biased model can then make decisions that disproportionately harm certain groups in society -- limiting their access to financial services, for example. Awareness of this problem has given rise to the field of Fair ML, which focuses on studying, measuring, and mitigating unfairness in algorithmic prediction with respect to a set of protected groups (e.g., race or gender). However, the underlying causes of algorithmic unfairness remain elusive, with researchers divided between blaming either the ML algorithms or the data they are trained on. In this work, we maintain that algorithmic unfairness stems from interactions between models and biases in the data, rather than from isolated contributions of either. To this end, we propose a taxonomy to characterize data bias and we study a set of hypotheses regarding the fairness-accuracy trade-offs that fairness-blind ML algorithms exhibit under different data bias settings. On our real-world account-opening fraud use case, we find that each setting entails specific trade-offs, affecting fairness in expected value and variance -- the latter often going unnoticed. Moreover, we show how algorithms compare differently in terms of accuracy and fairness, depending on the biases affecting the data. Finally, we note that under specific data bias conditions, simple pre-processing interventions can successfully balance group-wise error rates, while the same techniques fail in more complex settings.
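To make the notion of "group-wise error rates" concrete, here is a minimal Python sketch that computes a false positive rate per protected group and a simple balance ratio. The abstract does not say which fairness metric the authors use, so the choice of false positive rate and of a min/max ratio is an assumption made for illustration only. The function names and the toy data are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def groupwise_fpr(y_true, y_pred, groups):
    """Return the false positive rate per group: FP / (FP + TN).

    Only label-negative rows (y_true == 0) enter the denominator.
    Groups with no label-negative rows are omitted from the result.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for label, pred, group in zip(y_true, y_pred, groups):
        if label == 0:
            negatives[group] += 1
            if pred == 1:
                false_pos[group] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}


def fpr_ratio(rates):
    """Ratio of the smallest to the largest group FPR (1.0 means balanced)."""
    highest = max(rates.values())
    lowest = min(rates.values())
    return 1.0 if highest == 0 else lowest / highest


# Hypothetical toy data: 1 = flagged as fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
groups = ["A"] * 5 + ["B"] * 5

rates = groupwise_fpr(y_true, y_pred, groups)
ratio = fpr_ratio(rates)
print(rates, ratio)
```

In this toy example, legitimate applicants in group B are wrongly flagged twice as often as those in group A. Repeating the computation over several training runs or data samples would give both the mean and the spread of the ratio, which is one way to look at fairness in expected value and in variance.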

Suggested Citation

  • José Pombal & André F. Cruz & João Bravo & Pedro Saleiro & Mário A. T. Figueiredo & Pedro Bizarro, 2022. "Understanding Unfairness in Fraud Detection through Model and Data Bias Interactions," Papers 2207.06273, arXiv.org.
  • Handle: RePEc:arx:papers:2207.06273

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2207.06273
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jie Shi & Arno P. J. M. Siebes & Siamak Mehrkanoon, 2023. "TransCORALNet: A Two-Stream Transformer CORAL Networks for Supply Chain Credit Assessment Cold Start," Papers 2311.18749, arXiv.org.
    2. Silvana M. Pesenti & Pietro Millossovich & Andreas Tsanakas, 2023. "Differential Quantile-Based Sensitivity in Discontinuous Models," Papers 2310.06151, arXiv.org, revised Oct 2024.
    3. Topuz, Kazim & Urban, Timothy L. & Yildirim, Mehmet B., 2024. "A Markovian score model for evaluating provider performance for continuity of care—An explainable analytics approach," European Journal of Operational Research, Elsevier, vol. 317(2), pages 341-351.
    4. Anna Langenberg & Shih-Chi Ma & Tatiana Ermakova & Benjamin Fabian, 2023. "Formal Group Fairness and Accuracy in Automated Decision Making," Mathematics, MDPI, vol. 11(8), pages 1-25, April.
    5. Henry Penikas, 2023. "Unaccounted model risk for Basel IRB models deemed acceptable by conventional validation criteria," Risk Management, Palgrave Macmillan, vol. 25(4), pages 1-25, December.
    6. Lu, Xuefei & Calabrese, Raffaella, 2023. "The Cohort Shapley value to measure fairness in financing small and medium enterprises in the UK," Finance Research Letters, Elsevier, vol. 58(PC).
    7. Li, Zhe & Liang, Shuguang & Pan, Xianyou & Pang, Meng, 2024. "Credit risk prediction based on loan profit: Evidence from Chinese SMEs," Research in International Business and Finance, Elsevier, vol. 67(PA).
    8. Piccialli, Veronica & Romero Morales, Dolores & Salvatore, Cecilia, 2024. "Supervised feature compression based on counterfactual analysis," European Journal of Operational Research, Elsevier, vol. 317(2), pages 273-285.
    9. Zha, Yong & Wang, Yuting & Li, Quan & Yao, Wenying, 2022. "Credit offering strategy and dynamic pricing in the presence of consumer strategic behavior," European Journal of Operational Research, Elsevier, vol. 303(2), pages 753-766.
    10. Dimitrios Nikolaidis & Michalis Doumpos, 2022. "Credit Scoring with Drift Adaptation Using Local Regions of Competence," SN Operations Research Forum, Springer, vol. 3(4), pages 1-28, December.
    11. Sullivan Hué, 2022. "GAM(L)A: An econometric model for interpretable machine learning," French Stata Users' Group Meetings 2022 19, Stata Users Group.
    12. De Bock, Koen W. & Coussement, Kristof & Caigny, Arno De & Słowiński, Roman & Baesens, Bart & Boute, Robert N. & Choi, Tsan-Ming & Delen, Dursun & Kraus, Mathias & Lessmann, Stefan & Maldonado, Sebast, 2024. "Explainable AI for Operational Research: A defining framework, methods, applications, and a research agenda," European Journal of Operational Research, Elsevier, vol. 317(2), pages 249-272.
    13. Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
    14. Emmanuel Flachaire & Gilles Hacheme & Sullivan Hué & Sébastien Laurent, 2022. "GAM(L)A: An econometric model for interpretable Machine Learning," Papers 2203.11691, arXiv.org.
    15. Lukas Jurgensmeier & Bernd Skiera, 2023. "Measuring Self-Preferencing on Digital Platforms," Papers 2303.14947, arXiv.org, revised Feb 2024.
    16. Bertoletti, Lucía & Borraz, Fernando & Sanroman, Graciela, 2024. "Consumer Debt and Poverty: the Default Risk Gap," GLO Discussion Paper Series 1439, Global Labor Organization (GLO).
    17. Erel, Isil & Liebersohn, Jack, 2022. "Can FinTech reduce disparities in access to finance? Evidence from the Paycheck Protection Program," Journal of Financial Economics, Elsevier, vol. 146(1), pages 90-118.
    18. Agarwal, Shivam & Muckley, Cal B. & Neelakantan, Parvati, 2023. "Countering racial discrimination in algorithmic lending: A case for model-agnostic interpretation methods," Economics Letters, Elsevier, vol. 226(C).
    19. W. Scott Langford & Harrison W. Thomas & Maryann P. Feldman, 2024. "Banking for the Other Half: The Factors That Explain Banking Desert Formation," Economic Development Quarterly, , vol. 38(2), pages 71-81, May.
    20. Hasan, Iftekhar & Li, Xiang & Takalo, Tuomas, 2023. "Technological innovation and the bank lending channel of monetary policy transmission," BOFIT Discussion Papers 9/2023, Bank of Finland Institute for Emerging Economies (BOFIT).

