Robust regression against heavy heterogeneous contamination

Author

Listed:
  • Takayuki Kawashima

    (Tokyo Institute of Technology/RIKEN)

  • Hironori Fujisawa

    (The Institute of Statistical Mathematics/RIKEN)

Abstract

The γ-divergence is well known for its strong robustness against heavy contamination. By virtue of this property, many applications of the γ-divergence have been proposed. There are two types of γ-divergence for the regression problem, which handle the base measures differently. In this study, the two γ-divergences are compared, and a large difference is found between them under heterogeneous contamination, where the outlier ratio depends on the explanatory variable. One γ-divergence retains strong robustness even under heterogeneous contamination. The other does not in general; however, it does under homogeneous contamination, where the outlier ratio does not depend on the explanatory variable, or when the parametric model of the response variable belongs to a location-scale family in which the scale does not depend on the explanatory variables. Hung et al. (Biometrics 74(1):145–154, 2018) discussed strong robustness in a logistic regression model under the additional assumption that the tuning parameter γ is sufficiently large. The results obtained in this study hold for any parametric model without such an additional assumption.
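
As a rough illustration of the robustness the abstract describes, the Python sketch below fits a straight line to simulated data whose outlier ratio grows with the explanatory variable (heterogeneous contamination). It is not the authors' code. The Gaussian model, the reweighting updates (each point weighted in proportion to f(y|x)^γ), the median-based starting values, the simulated data, and all function names are assumptions made for this example. Because the Gaussian scale here does not depend on the explanatory variable, this is the setting in which the abstract states that both types of γ-divergence are robust, so the sketch does not distinguish between them.

```python
import math
import random
import statistics


def weighted_line_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def gamma_line_fit(xs, ys, gamma=0.5, n_iter=200):
    """Reweighting scheme for the assumed model y | x ~ N(a + b*x, s2)."""
    # Robust start (an assumption of this sketch): a flat line at the
    # median of y, with the scale taken from the median absolute deviation.
    a, b = statistics.median(ys), 0.0
    mad = statistics.median(abs(y - a) for y in ys)
    s2 = (1.4826 * mad) ** 2
    for _ in range(n_iter):
        res = [y - a - b * x for x, y in zip(xs, ys)]
        # Each weight is proportional to f(y_i | x_i)^gamma, so points far
        # from the current line receive weights close to zero.
        ws = [math.exp(-gamma * r * r / (2.0 * s2)) for r in res]
        a, b = weighted_line_fit(xs, ys, ws)
        res = [y - a - b * x for x, y in zip(xs, ys)]
        # The factor (1 + gamma) offsets the shrinkage caused by the weights.
        s2 = (1.0 + gamma) * sum(w * r * r for w, r in zip(ws, res)) / sum(ws)
    return a, b, math.sqrt(s2)


def simulate(n=400, seed=0):
    """True line y = 1 + 2x; the outlier ratio 0.4*x grows with x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.random()
        if rng.random() < 0.4 * x:
            y = 20.0 + rng.gauss(0.0, 0.5)  # outlier far above the line
        else:
            y = 1.0 + 2.0 * x + rng.gauss(0.0, 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys


xs, ys = simulate()
a_ols, b_ols = weighted_line_fit(xs, ys, [1.0] * len(xs))  # unweighted fit
a_gam, b_gam, s_gam = gamma_line_fit(xs, ys)
print(f"OLS:   a={a_ols:.2f}, b={b_ols:.2f}")
print(f"gamma: a={a_gam:.2f}, b={b_gam:.2f}, s={s_gam:.2f}")
```

In this simulated setting the ordinary least-squares slope is pulled strongly toward the outliers, while the reweighted fit should stay close to the true values (a = 1, b = 2, s = 0.5). The criterion is non-convex, so the result depends on the starting values; the median-based start is one simple choice, not a recommendation from the paper.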

Suggested Citation

  • Takayuki Kawashima & Hironori Fujisawa, 2023. "Robust regression against heavy heterogeneous contamination," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 86(4), pages 421-442, May.
  • Handle: RePEc:spr:metrik:v:86:y:2023:i:4:d:10.1007_s00184-022-00874-1
    DOI: 10.1007/s00184-022-00874-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-022-00874-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00184-022-00874-1?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Hung Hung & Zhi-Yu Jou & Su-Yun Huang, 2018. "Robust mislabel logistic regression without modeling mislabel probabilities," Biometrics, The International Biometric Society, vol. 74(1), pages 145-154, March.
    2. Dankmar Böhning & Bruce Lindsay, 1988. "Monotonicity of quadratic-approximation algorithms," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 40(4), pages 641-663, December.
    3. Marco Riani & Anthony C. Atkinson & Aldo Corbellini & Domenico Perrotta, 2020. "Robust regression with density power divergence: theory, comparisons, and data analysis," LSE Research Online Documents on Economics 103931, London School of Economics and Political Science, LSE Library.
    4. Hironori Fujisawa & Shinto Eguchi, 2008. "Robust parameter estimation with a small bias against heavy contamination," Journal of Multivariate Analysis, Elsevier, vol. 99(9), pages 2053-2081, October.
    5. Takafumi Kanamori & Hironori Fujisawa, 2015. "Robust estimation under heavy contamination using unnormalized models," Biometrika, Biometrika Trust, vol. 102(3), pages 559-572.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dries Cornilly & Lise Tubex & Stefan Van Aelst & Tim Verdonck, 2024. "Robust and sparse logistic regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 18(3), pages 663-679, September.
    2. Hirose, Kei & Fujisawa, Hironori & Sese, Jun, 2017. "Robust sparse Gaussian graphical modeling," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 172-190.
    3. Akifumi Okuno, 2024. "Minimizing robust density power-based divergences for general parametric density models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 76(5), pages 851-875, October.
    4. Hung Hung & Zhi-Yu Jou & Su-Yun Huang, 2018. "Robust mislabel logistic regression without modeling mislabel probabilities," Biometrics, The International Biometric Society, vol. 74(1), pages 145-154, March.
    5. Mingyang Ren & Sanguo Zhang & Qingzhao Zhang, 2021. "Robust high-dimensional regression for data with anomalous responses," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(4), pages 703-736, August.
    6. Arun Kumar Kuchibhotla & Somabha Mukherjee & Ayanendranath Basu, 2019. "Statistical inference based on bridge divergences," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(3), pages 627-656, June.
    7. Wang, Fa, 2017. "Maximum likelihood estimation and inference for high dimensional nonlinear factor models with application to factor-augmented regressions," MPRA Paper 93484, University Library of Munich, Germany, revised 19 May 2019.
    8. de Leeuw, Jan, 2006. "Principal component analysis of binary data by iterated singular value decomposition," Computational Statistics & Data Analysis, Elsevier, vol. 50(1), pages 21-39, January.
    9. Roussille, Nina & Scuderi, Benjamin, 2023. "Bidding for Talent: A Test of Conduct in a High-Wage Labor Market," IZA Discussion Papers 16352, Institute of Labor Economics (IZA).
    10. Kenneth Lange & Hua Zhou, 2022. "A Legacy of EM Algorithms," International Statistical Review, International Statistical Institute, vol. 90(S1), pages 52-66, December.
    11. Torti, Francesca & Corbellini, Aldo & Atkinson, Anthony C., 2021. "fsdaSAS: a package for robust regression for very large datasets including the batch forward search," LSE Research Online Documents on Economics 109895, London School of Economics and Political Science, LSE Library.
    12. Miron, Julien & Poilane, Benjamin & Cantoni, Eva, 2022. "Robust polytomous logistic regression," Computational Statistics & Data Analysis, Elsevier, vol. 176(C).
    13. Gayen, Atin & Kumar, M. Ashok, 2021. "Projection theorems and estimating equations for power-law models," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    14. Wang, Fa, 2022. "Maximum likelihood estimation and inference for high dimensional generalized factor models with application to factor-augmented regressions," Journal of Econometrics, Elsevier, vol. 229(1), pages 180-200.
    15. Utkarsh J. Dang & Michael P.B. Gallaugher & Ryan P. Browne & Paul D. McNicholas, 2023. "Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 145-167, April.
    16. Liu, Yan, 2017. "Robust parameter estimation for stationary processes by an exotic disparity from prediction problem," Statistics & Probability Letters, Elsevier, vol. 129(C), pages 120-130.
    17. A. Philip Dawid & Monica Musio & Laura Ventura, 2016. "Minimum Scoring Rule Inference," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 123-138, March.
    18. Abhijit Mandal & Beste Hamiye Beyaztas & Soutir Bandyopadhyay, 2023. "Robust density power divergence estimates for panel data models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(5), pages 773-798, October.
    19. Tian, Guo-Liang & Tang, Man-Lai & Liu, Chunling, 2012. "Accelerating the quadratic lower-bound algorithm via optimizing the shrinkage parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 255-265.
    20. Tian, Guo-Liang & Tang, Man-Lai & Fang, Hong-Bin & Tan, Ming, 2008. "Efficient methods for estimating constrained parameters with applications to regularized (lasso) logistic regression," Computational Statistics & Data Analysis, Elsevier, vol. 52(7), pages 3528-3542, March.
