Printed from https://ideas.repec.org/a/eee/csdana/v54y2010i4p843-857.html

RelaxMCD: Smooth optimisation for the Minimum Covariance Determinant estimator

Author

Listed:
  • Schyns, M.
  • Haesbroeck, G.
  • Critchley, F.

Abstract

The Minimum Covariance Determinant (MCD) estimator is a highly robust procedure for estimating the centre and shape of a high dimensional data set. It consists of determining a subsample of h points out of n which minimises the generalised variance. By definition, the computation of this estimator gives rise to a combinatorial optimisation problem, for which several approximate algorithms have been developed. Some of these approximations are quite powerful, but they do not take advantage of any smoothness in the objective function. Recently, in a general framework, an approach transforming any discrete and high dimensional combinatorial problem of this type into a continuous and low-dimensional one has been developed and a general algorithm to solve the transformed problem has been designed. The idea is to build on that general algorithm in order to take into account particular features of the MCD methodology. More specifically, two main goals are considered: (a) adaptation of the algorithm to the specific MCD target function and (b) comparison of this 'tuned' algorithm with the usual competitors for computing MCD. The adaptation focuses on the design of 'clever' starting points in order to systematically investigate the search domain. Accordingly, a new and surprisingly efficient procedure based on a suitably equivariant modification of the well-known k-means algorithm is constructed. The adapted algorithm, called RelaxMCD, is then compared by means of simulations with FASTMCD and the Feasible Subset Algorithm, both benchmark algorithms for computing MCD. As a by-product, it is shown that RelaxMCD is a general technique encompassing the two others, yielding insight into their overall good performance.
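
To make the combinatorial nature of the problem concrete, the following is a minimal sketch of the MCD definition given in the abstract: enumerate every subsample of h points out of n and keep the one with the smallest generalised variance (the determinant of the sample covariance matrix). It is an exhaustive search for illustration only, not RelaxMCD, FASTMCD or the Feasible Subset Algorithm; the function names and the toy data are assumptions introduced here.

```python
from itertools import combinations

import numpy as np


def generalised_variance(points):
    """Determinant of the sample covariance matrix of the rows of `points`."""
    cov = np.cov(points, rowvar=False)
    return float(np.linalg.det(np.atleast_2d(cov)))


def brute_force_mcd(data, h):
    """Exhaustively search all h-subsets of the rows of `data`.

    Returns the indices of the subset with the smallest generalised
    variance, together with its mean (centre) and covariance (shape).
    Only feasible for very small n: the number of subsets is C(n, h).
    """
    best_subset, best_value = None, np.inf
    for subset in combinations(range(len(data)), h):
        value = generalised_variance(data[list(subset)])
        if value < best_value:
            best_subset, best_value = subset, value
    rows = data[list(best_subset)]
    return best_subset, rows.mean(axis=0), np.cov(rows, rowvar=False)


# Hypothetical toy data: 8 "clean" points plus 2 points shifted far away.
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 2))
shifted = rng.normal(loc=10.0, size=(2, 2))
data = np.vstack([clean, shifted])

subset, centre, scatter = brute_force_mcd(data, h=6)
print("selected rows:", subset)
print("centre estimate:", centre)
```

With n = 10 and h = 6 the loop visits only 210 subsets, and the shifted points would typically be left out of the selected subset. The count C(n, h) grows so quickly that exhaustive search is impractical for realistic sample sizes, which is why approximate algorithms such as those compared in the paper are needed.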

Suggested Citation

  • Schyns, M. & Haesbroeck, G. & Critchley, F., 2010. "RelaxMCD: Smooth optimisation for the Minimum Covariance Determinant estimator," Computational Statistics & Data Analysis, Elsevier, vol. 54(4), pages 843-857, April.
  • Handle: RePEc:eee:csdana:v:54:y:2010:i:4:p:843-857

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00408-3
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Hawkins, Douglas M., 1994. "The feasible solution algorithm for the minimum covariance determinant estimator in multivariate data," Computational Statistics & Data Analysis, Elsevier, vol. 17(2), pages 197-210, February.
    2. Todorov, Valentin, 1992. "Computing the minimum covariance determinant estimator (MCD) by simulated annealing," Computational Statistics & Data Analysis, Elsevier, vol. 14(4), pages 515-525, November.
    3. Garcia-Escudero, L.A. & Gordaliza, A., 2007. "The importance of the scales in heterogeneous robust clustering," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4403-4412, May.
    4. Hawkins, Douglas M. & Olive, David J., 1999. "Improved feasible solution algorithms for high breakdown estimation," Computational Statistics & Data Analysis, Elsevier, vol. 30(1), pages 1-11, March.

    Citations

    Cited by:

    1. Tri-Dzung Nguyen & Roy Welsch, 2010. "Outlier detection and robust covariance estimation using mathematical programming," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(4), pages 301-334, December.
    2. Trendafilov, Nickolay T., 2010. "Stepwise estimation of common principal components," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3446-3457, December.
    3. Luis García-Escudero & Alfonso Gordaliza & Carlos Matrán & Agustín Mayo-Iscar, 2010. "A review of robust clustering methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 89-109, September.
    4. Stephane Heritier & Maria-Pia Victoria-Feser, 2018. "Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(4), pages 595-602, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Todorov, Valentin & Filzmoser, Peter, 2009. "An Object-Oriented Framework for Robust Multivariate Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i03).
    2. J. L. Alfaro & J. Fco. Ortega, 2009. "A comparison of robust alternatives to Hotelling's T2 control chart," Journal of Applied Statistics, Taylor & Francis Journals, vol. 36(12), pages 1385-1396.
    3. Nunkesser, Robin & Morell, Oliver, 2010. "An evolutionary algorithm for robust regression," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3242-3248, December.
    4. Vanessa Berenguer-Rico & Søren Johansen & Bent Nielsen, 2019. "Models where the Least Trimmed Squares and Least Median of Squares estimators are maximum likelihood," CREATES Research Papers 2019-15, Department of Economics and Business Economics, Aarhus University.
    5. Selin Ahipaşaoğlu, 2015. "Fast algorithms for the minimum volume estimator," Journal of Global Optimization, Springer, vol. 62(2), pages 351-370, June.
    6. Croux, Christophe & Haesbroeck, Gentiane, 1997. "An easy way to increase the finite-sample efficiency of the resampled minimum volume ellipsoid estimator," Computational Statistics & Data Analysis, Elsevier, vol. 25(2), pages 125-141, July.
    7. Nunkesser, Robin & Morell, Oliver, 2008. "Evolutionary algorithms for robust methods," Technical Reports 2008,29, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    8. Hawkins, Douglas M. & Olive, David J., 1999. "Improved feasible solution algorithms for high breakdown estimation," Computational Statistics & Data Analysis, Elsevier, vol. 30(1), pages 1-11, March.
    9. Olive, David J., 2004. "A resistant estimator of multivariate location and dispersion," Computational Statistics & Data Analysis, Elsevier, vol. 46(1), pages 93-102, May.
    10. Nguyen, T.D. & Welsch, R., 2010. "Outlier detection and least trimmed squares approximation using semi-definite programming," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3212-3226, December.
    11. Hawkins, Douglas M., 1995. "Convergence of the feasible solution algorithm for least median of squares regression," Computational Statistics & Data Analysis, Elsevier, vol. 19(5), pages 519-538, May.
    12. C. Ruwet & L. García-Escudero & A. Gordaliza & A. Mayo-Iscar, 2012. "The influence function of the TCLUST robust clustering procedure," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 6(2), pages 107-130, July.
    13. Luis García-Escudero & Alfonso Gordaliza & Carlos Matrán & Agustín Mayo-Iscar, 2010. "A review of robust clustering methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 89-109, September.
    14. Woodruff, David L. & Reiners, Torsten, 2004. "Experiments with, and on, algorithms for maximum likelihood clustering," Computational Statistics & Data Analysis, Elsevier, vol. 47(2), pages 237-253, September.
    15. L. Pitsoulis & G. Zioutas, 2010. "A fast algorithm for robust regression with penalised trimmed squares," Computational Statistics, Springer, vol. 25(4), pages 663-689, December.
    16. Pokojovy, Michael & Jobe, J. Marcus, 2022. "A robust deterministic affine-equivariant algorithm for multivariate location and scatter," Computational Statistics & Data Analysis, Elsevier, vol. 172(C).
    17. L. A. García-Escudero & A. Gordaliza & C. Matrán & A. Mayo-Iscar, 2018. "Comments on “The power of monitoring: how to make the most of a contaminated multivariate sample”," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(4), pages 605-608, December.
    18. Winker, Peter & Gilli, Manfred, 2004. "Applications of optimization heuristics to estimation and modelling problems," Computational Statistics & Data Analysis, Elsevier, vol. 47(2), pages 211-223, September.
    19. Ranganai, Edmore, 2016. "Quality of fit measurement in regression quantiles: An elemental set method approach," Statistics & Probability Letters, Elsevier, vol. 111(C), pages 18-25.
    20. Lind, John C. & Wiens, Douglas P. & Yohai, Victor J., 2013. "Robust minimum information loss estimation," Computational Statistics & Data Analysis, Elsevier, vol. 65(C), pages 98-112.
