
Robust multivariate estimation based on statistical depth filters

Author

Listed:
  • Giovanni Saraceno

    (Università degli studi di Trento)

  • Claudio Agostinelli

    (Università degli studi di Trento)

Abstract

In classical contamination models, such as the gross-error model (the Huber and Tukey contamination model, or case-wise contamination), observations are the units to be identified as outliers or not. This model is very useful when the number of variables is moderately small. Alqallaf et al. (Ann Stat 37(1):311–331, 2009) showed the limits of this approach for a larger number of variables and introduced the independent contamination model (cell-wise contamination), in which the cells are the units to be identified as outliers or not. One approach to dealing with both types of contamination at the same time is to filter out the contaminated cells from the data set and then apply a robust procedure able to handle case-wise outliers and missing values. Here, we develop a general framework to build filters in any dimension based on statistical data depth functions. We show that previous approaches, e.g., Agostinelli et al. (TEST 24(3):441–461, 2015b) and Leung et al. (Comput Stat Data Anal 111:59–76, 2017), are special cases. We illustrate our method by using the half-space depth.
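
To make the filtering idea concrete, the following is a minimal Python sketch and not the authors' procedure. It computes the empirical univariate half-space (Tukey) depth of every cell within its own column and replaces cells whose depth falls below a cutoff with NaN, so that a robust estimator able to handle missing values could be applied afterwards. The names `halfspace_depth_1d` and `depth_filter` and the cutoff `min_depth` are assumptions introduced for illustration; the paper's own construction and its rule for choosing the filtered region are not reproduced here. Note that this crude rule always flags the most extreme cells of each column, whether or not they are contaminated.

```python
import numpy as np


def halfspace_depth_1d(values, points):
    """Empirical univariate half-space depth of `points` w.r.t. `values`.

    depth(x) = min(#{v <= x}, #{v >= x}) / n
    """
    values = np.sort(np.asarray(values, dtype=float))
    points = np.asarray(points, dtype=float)
    n = values.size
    # Number of sample values less than or equal to each point.
    below_or_equal = np.searchsorted(values, points, side="right")
    # Number of sample values greater than or equal to each point.
    above_or_equal = n - np.searchsorted(values, points, side="left")
    return np.minimum(below_or_equal, above_or_equal) / n


def depth_filter(data, min_depth=0.05):
    """Return a copy of `data` with low-depth cells set to NaN.

    Each column is treated separately (a univariate filter).
    `min_depth` is a hypothetical cutoff chosen for illustration only.
    """
    data = np.asarray(data, dtype=float)
    filtered = data.copy()
    for j in range(data.shape[1]):
        column = data[:, j]
        depth = halfspace_depth_1d(column, column)
        filtered[depth < min_depth, j] = np.nan
    return filtered


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=(200, 3))
    sample[0, 1] = 50.0  # plant one cell-wise outlier
    cleaned = depth_filter(sample, min_depth=0.01)
    print("flagged cells per column:", np.isnan(cleaned).sum(axis=0))
```

The filtered matrix keeps its shape, with flagged cells marked as missing, which is the form expected by estimators designed for incomplete data such as those in the references listed below.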

Suggested Citation

  • Giovanni Saraceno & Claudio Agostinelli, 2021. "Robust multivariate estimation based on statistical depth filters," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(4), pages 935-959, December.
  • Handle: RePEc:spr:testjl:v:30:y:2021:i:4:d:10.1007_s11749-021-00757-z
    DOI: 10.1007/s11749-021-00757-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-021-00757-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11749-021-00757-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Mike Danilov & Víctor J. Yohai & Ruben H. Zamar, 2012. "Robust Estimation of Multivariate Location and Scatter in the Presence of Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1178-1186, September.
    2. Tatjana Lange & Karl Mosler & Pavlo Mozharovskyi, 2014. "Fast nonparametric classification based on data depth," Statistical Papers, Springer, vol. 55(1), pages 49-69, February.
    3. Rainer Dyckerhoff & Pavlo Mozharovskyi, 2016. "Exact computation of the halfspace depth," Computational Statistics & Data Analysis, Elsevier, vol. 98(C), pages 19-30.
    4. J.A. Cuesta-Albertos & A. Nieto-Reyes, 2008. "The random Tukey depth," Computational Statistics & Data Analysis, Elsevier, vol. 52(11), pages 4979-4988, July.
    5. Andy Leung & Victor Yohai & Ruben Zamar, 2017. "Multivariate location and scatter matrix estimation under cellwise and casewise contamination," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 59-76.
    6. Claudio Agostinelli & Andy Leung & Victor Yohai & Ruben Zamar, 2015. "Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(3), pages 441-461, September.

    Citations



    Cited by:

    1. Alicia Nieto-Reyes & Rafael Duque & Giacomo Francisci, 2021. "A Method to Automate the Prediction of Student Academic Performance from Early Stages of the Course," Mathematics, MDPI, vol. 9(21), pages 1-14, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wei Shao & Yijun Zuo, 2020. "Computing the halfspace depth with multiple try algorithm and simulated annealing algorithm," Computational Statistics, Springer, vol. 35(1), pages 203-226, March.
    2. Nikola Štefelová & Andreas Alfons & Javier Palarea-Albaladejo & Peter Filzmoser & Karel Hron, 2021. "Robust regression with compositional covariates including cellwise outliers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(4), pages 869-909, December.
    3. Xiaohui Liu & Shihua Luo & Yijun Zuo, 2020. "Some results on the computing of Tukey’s halfspace median," Statistical Papers, Springer, vol. 61(1), pages 303-316, February.
    4. Dyckerhoff, Rainer & Mozharovskyi, Pavlo, 2016. "Exact computation of the halfspace depth," Computational Statistics & Data Analysis, Elsevier, vol. 98(C), pages 19-30.
    5. Christophe Croux & Viktoria Öllerer, 2015. "Comments on: Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(3), pages 462-466, September.
    6. Zhang, Xu & Tian, Yahui & Guan, Guoyu & Gel, Yulia R., 2021. "Depth-based classification for relational data with multiple attributes," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    7. Dyckerhoff, Rainer & Mozharovskyi, Pavlo & Nagy, Stanislav, 2021. "Approximate computation of projection depths," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    8. Stephane Heritier & Maria-Pia Victoria-Feser, 2018. "Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(4), pages 595-602, December.
    9. Alicia Nieto-Reyes & Rafael Duque & Giacomo Francisci, 2021. "A Method to Automate the Prediction of Student Academic Performance from Early Stages of the Course," Mathematics, MDPI, vol. 9(21), pages 1-14, October.
    10. Mia Hubert & Peter Rousseeuw & Pieter Segaert, 2017. "Multivariate and functional classification using depth and distance," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 445-466, September.
    11. Olusola Samuel Makinde, 2019. "Classification rules based on distribution functions of functional depth," Statistical Papers, Springer, vol. 60(3), pages 629-640, June.
    12. Stefan Aelst & Ruben H. Zamar, 2019. "Comments on: Data science, big data and statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 360-362, June.
    13. Leung, Andy & Yohai, Victor & Zamar, Ruben, 2017. "Multivariate location and scatter matrix estimation under cellwise and casewise contamination," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 59-76.
    14. Henry Velasco & Henry Laniado & Mauricio Toro & Víctor Leiva & Yuhlong Lio, 2020. "Robust Three-Step Regression Based on Comedian and Its Performance in Cell-Wise and Case-Wise Outliers," Mathematics, MDPI, vol. 8(8), pages 1-18, August.
    15. Tian, Yahui & Gel, Yulia R., 2019. "Fusing data depth with complex networks: Community detection with prior information," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 99-116.
    16. Leung, Andy & Zhang, Hongyang & Zamar, Ruben, 2016. "Robust regression estimation and inference in the presence of cellwise and casewise contamination," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 1-11.
    17. Nieto-Reyes, Alicia & Battey, Heather, 2021. "A topologically valid construction of depth for functional data," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    18. Vencalek, Ondrej & Pokotylo, Oleksii, 2018. "Depth-weighted Bayes classification," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 1-12.
    19. Miguel Flores & Salvador Naya & Rubén Fernández-Casal & Sonia Zaragoza & Paula Raña & Javier Tarrío-Saavedra, 2020. "Constructing a Control Chart Using Functional Data," Mathematics, MDPI, vol. 8(1), pages 1-26, January.
    20. Alba M. Franco-Pereira & Rosa E. Lillo, 2020. "Rank tests for functional data based on the epigraph, the hypograph and associated graphical representations," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 651-676, September.
