
Fair learning with bagging

Authors

  • Jean-David Fermanian
  • Dominique Guégan

Abstract

The central question of this paper is how to enhance supervised learning algorithms with a fairness requirement, ensuring that no sensitive input "unfairly" influences the outcome of the learning algorithm. To attain this objective, we proceed in three steps. First, after introducing several notions of fairness in a uniform approach, we introduce a more general notion through a conditional fairness definition that encompasses most of the well-known fairness definitions. Second, we use an ensemble of binary and continuous classifiers to obtain an optimal solution for a fair predictive outcome, using a related post-processing procedure without any transformation of the data or of the training algorithms. Finally, we introduce several tests to verify the fairness of the predictions. Some empirical results are provided to illustrate our approach.
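
As a rough illustration of the kind of check the abstract alludes to, the sketch below averages the outputs of a few binary classifiers and then measures one well-known fairness criterion, demographic parity, on the resulting decisions. It is a minimal sketch under stated assumptions and not the authors' procedure: the function names `bagged_scores` and `demographic_parity_gap`, the 0.5 threshold, and the toy data are all hypothetical, and the paper's conditional fairness notion and its statistical tests are more general than this single gap.

```python
from statistics import mean


def bagged_scores(member_predictions):
    """Average the 0/1 predictions of several classifiers, one score per individual."""
    # zip(*...) turns the per-classifier rows into per-individual columns.
    return [mean(column) for column in zip(*member_predictions)]


def demographic_parity_gap(decisions, sensitive):
    """Absolute difference in positive-decision rates between groups 0 and 1."""
    rate = {}
    for group in (0, 1):
        group_decisions = [d for d, s in zip(decisions, sensitive) if s == group]
        rate[group] = mean(group_decisions)
    return abs(rate[1] - rate[0])


# Hypothetical toy data: 3 classifiers (rows) scoring 6 individuals (columns).
member_predictions = [
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
]
# Hypothetical binary sensitive attribute, one value per individual.
sensitive = [0, 0, 0, 1, 1, 1]

scores = bagged_scores(member_predictions)
# Assumed decision rule: accept when more than half of the classifiers say 1.
decisions = [1 if score > 0.5 else 0 for score in scores]
gap = demographic_parity_gap(decisions, sensitive)
print(f"decisions = {decisions}, demographic parity gap = {gap:.3f}")
```

A gap of zero would mean both groups receive positive decisions at the same rate. A post-processing approach in the spirit of the abstract would adjust only how the ensemble outputs are combined, leaving the data and the trained classifiers untouched.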

Suggested Citation

  • Jean-David Fermanian & Dominique Guégan, 2021. "Fair learning with bagging," Documents de travail du Centre d'Economie de la Sorbonne 21034, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
  • Handle: RePEc:mse:cesdoc:21034

    Download full text from publisher

    File URL: http://mse.univ-paris1.fr/pub/mse/CES2021/21034.pdf
    Download Restriction: no

    File URL: https://halshs.archives-ouvertes.fr/halshs-03500906
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jean-David Fermanian & Dominique Guegan, 2021. "Fair learning with bagging," Post-Print halshs-03500906, HAL.
    2. Jean-David Fermanian & Dominique Guegan, 2021. "Fair learning with bagging," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03500906, HAL.
    3. Martin L. Hazelton & Tilman M. Davies, 2022. "Pointwise comparison of two multivariate density functions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(4), pages 1791-1810, December.
    4. Masayuki Hirukawa & Mari Sakudo, 2016. "Testing Symmetry of Unknown Densities via Smoothing with the Generalized Gamma Kernels," Econometrics, MDPI, vol. 4(2), pages 1-27, June.
    5. Marcelo Fernandes & Eduardo Mendes & Olivier Scaillet, 2015. "Testing for symmetry and conditional symmetry using asymmetric kernels," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(4), pages 649-671, August.
    6. Leopold Simar & Valentin Zelenyuk, 2006. "On Testing Equality of Distributions of Technical Efficiency Scores," Econometric Reviews, Taylor & Francis Journals, vol. 25(4), pages 497-522.
    7. Pavia, Jose M., 2015. "Testing Goodness-of-Fit with the Kernel Density Estimator: GoFKernel," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 66(c01).
    8. Lucio Sarno, 2003. "Nonlinear Exchange Rate Models: A Selective Overview," Rivista di Politica Economica, SIPI Spa, vol. 93(4), pages 3-46, July-Augu.
    9. Polanski, Arnold & Stoja, Evarist, 2012. "Efficient evaluation of multidimensional time-varying density forecasts, with applications to risk management," International Journal of Forecasting, Elsevier, vol. 28(2), pages 343-352.
    10. Luca Bagnato & Lucio De Capitani & Antonio Punzo, 2014. "Testing Serial Independence via Density-Based Measures of Divergence," Methodology and Computing in Applied Probability, Springer, vol. 16(3), pages 627-641, September.
    11. Priyanga Dilini Talagala & Rob J Hyndman & Kate Smith-Miles & Sevvandi Kandanaarachchi & Mario A Munoz, 2018. "Anomaly detection in streaming nonstationary temporal data," Monash Econometrics and Business Statistics Working Papers 4/18, Monash University, Department of Econometrics and Business Statistics.
    12. Gong, Li-Hua & Xiang, Ling-Zhi & Liu, Si-Hang & Zhou, Nan-Run, 2022. "Born machine model based on matrix product state quantum circuit," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 593(C).
    13. Dominique Guégan, 2020. "A Note on the Interpretability of Machine Learning Algorithms," Documents de travail du Centre d'Economie de la Sorbonne 20012, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    14. Cees Diks & Valentyn Panchenko, 2005. "Nonparametric Tests for Serial Independence Based on Quadratic Forms," Tinbergen Institute Discussion Papers 05-076/1, Tinbergen Institute.
    15. Dominique Guegan, 2020. "A Note on the Interpretability of Machine Learning Algorithms," Post-Print halshs-02900929, HAL.
    16. Juan Carlos Pardo-Fernández & María Dolores Jiménez-Gamero & Anouar El Ghouch, 2015. "A Non-parametric ANOVA-type Test for Regression Curves Based on Characteristic Functions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(1), pages 197-213, March.
    17. Pablo Martínez-Camblor & Jacobo Uña-Álvarez, 2013. "Studying the bandwidth in k-sample smooth tests," Computational Statistics, Springer, vol. 28(2), pages 875-892, April.
    18. Martínez-Camblor, Pablo & de Uña-Álvarez, Jacobo, 2009. "Non-parametric k-sample tests: Density functions vs distribution functions," Computational Statistics & Data Analysis, Elsevier, vol. 53(9), pages 3344-3357, July.
    19. Henze, N. & Klar, B. & Zhu, L. X., 2005. "Checking the adequacy of the multivariate semiparametric location shift model," Journal of Multivariate Analysis, Elsevier, vol. 93(2), pages 238-256, April.
    20. Leucht, Anne & Neumann, Michael H., 2013. "Dependent wild bootstrap for degenerate U- and V-statistics," Journal of Multivariate Analysis, Elsevier, vol. 117(C), pages 257-280.

    More about this item

    Keywords

    fairness; nonparametric regression; classification; accuracy

    JEL classification:

    • C10 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - General
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
