
Robust covariance matrix estimation and identification of unusual data points: New tools

Authors
  • Garciga, Christian
  • Verbrugge, Randal

Abstract

Most consistent estimators are prone to total breakdown in the presence of a handful of unusual data points (UDPs), which compromises inference. Robust estimation is a (seldom-used) solution, but the methods commonly used in applied research have severe drawbacks. In this paper, building upon methods that are relatively unknown outside of the robust statistics literature, we provide an enhanced tool for robust estimates of mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and useful for large data sets. We also provide a new robust cluster method, which is an input to our broader method but is also useful for standalone UDP detection or cluster analysis. We provide a comparative study of numerous methods that is not available in the current literature. Testing indicates that our method performs on par with, and often better than, two of the best currently available methods. We also demonstrate that the issues we discuss are not merely hypothetical by applying our tools to real-world data and by re-examining two prominent economic studies. Our methods reveal that their central results are driven by a set of unusual points.
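To make the breakdown problem concrete, here is a minimal sketch that is not the authors' method. It assumes NumPy, simulated two-dimensional data, and a toy concentration-step estimator (the hypothetical helper `concentration_estimate`) in the general spirit of MCD-type procedures. The contamination fraction (20%, larger than a "handful"), the subset fraction `h_frac`, and the 97.5% chi-square cutoff are all illustrative assumptions chosen so that masking is easy to see.

```python
import numpy as np

def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance of each row of X from `center`."""
    diff = X - center
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

def concentration_estimate(X, h_frac=0.75, n_steps=20):
    """Toy concentration-step location/scatter estimate (illustrative only).

    Hypothetical helper: not detMCD, RMVN, or the paper's estimator.
    The final rescaling below is hard-coded for two-dimensional data.
    """
    n, p = X.shape
    h = int(h_frac * n)
    # Robust start: coordinatewise median and a MAD-based diagonal scatter.
    center = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - center), axis=0)
    cov = np.diag(mad ** 2)
    for _ in range(n_steps):
        d2 = mahalanobis_sq(X, center, cov)
        subset = np.argsort(d2)[:h]  # keep the h most central points
        center = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
    # Crude rescaling: match the median squared distance to the
    # chi-square(2) median, whose closed form is -2 * log(0.5).
    d2 = mahalanobis_sq(X, center, cov)
    cov = cov * np.median(d2) / (-2.0 * np.log(0.5))
    return center, cov

# Simulated data (assumption): 160 clean points plus 40 planted unusual points.
rng = np.random.default_rng(0)
n_clean, n_out = 160, 40
clean = rng.standard_normal((n_clean, 2))
planted = np.array([10.0, 10.0]) + 0.1 * rng.standard_normal((n_out, 2))
X = np.vstack([clean, planted])
is_planted = np.arange(len(X)) >= n_clean

# Chi-square(2) 97.5% quantile via the closed form -2 * log(1 - q).
cutoff = -2.0 * np.log(1.0 - 0.975)

# Classical estimates: sample mean and sample covariance of all points.
classical_d2 = mahalanobis_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))

# Robust estimates from the toy concentration steps.
robust_center, robust_cov = concentration_estimate(X)
robust_d2 = mahalanobis_sq(X, robust_center, robust_cov)

classical_flags = classical_d2 > cutoff
robust_flags = robust_d2 > cutoff
print("planted points flagged (classical):", classical_flags[is_planted].sum())
print("planted points flagged (robust):   ", robust_flags[is_planted].sum())
```

In this setup the planted cluster inflates the sample covariance enough that the classical distances should miss it entirely (masking), while the distances based on the concentration-step estimate should flag every planted point. A production analysis would more plausibly rely on an established implementation of a high-breakdown estimator than on this sketch.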

Suggested Citation

  • Garciga, Christian & Verbrugge, Randal, 2021. "Robust covariance matrix estimation and identification of unusual data points: New tools," Research in Economics, Elsevier, vol. 75(2), pages 176-202.
  • Handle: RePEc:eee:reecon:v:75:y:2021:i:2:p:176-202
    DOI: 10.1016/j.rie.2021.03.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1090944321000053
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Card, David & Krueger, Alan B, 1994. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania," American Economic Review, American Economic Association, vol. 84(4), pages 772-793, September.
    2. Arnaud Costinot & Lindsay Oldenski & James Rauch, 2011. "Adaptation and the Boundary of Multinational Firms," The Review of Economics and Statistics, MIT Press, vol. 93(1), pages 298-308, February.
    3. Wang N. & Raftery A.E., 2002. "Nearest-Neighbor Variance Estimation (NNVE): Robust Covariance Estimation via Nearest-Neighbor Cleaning," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 994-1019, December.
    4. Alan B. Krueger & David Card, 2000. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Reply," American Economic Review, American Economic Association, vol. 90(5), pages 1397-1420, December.
    5. Michael F. Bryan & Stephen G. Cecchetti, 1994. "Measuring Core Inflation," NBER Chapters, in: Monetary Policy, pages 195-219, National Bureau of Economic Research, Inc.
    6. William Wascher & David Neumark, 2000. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Comment," American Economic Review, American Economic Association, vol. 90(5), pages 1362-1396, December.
    7. Zaman, Asad & Rousseeuw, Peter J. & Orhan, Mehmet, 2001. "Econometric applications of high-breakdown robust regression techniques," Economics Letters, Elsevier, vol. 71(1), pages 1-8, April.
    8. Pietro Coretto & Christian Hennig, 2016. "Robust Improper Maximum Likelihood: Tuning, Computation, and a Comparison With Other Methods for Robust Gaussian Clustering," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1648-1659, October.
    9. Rama Cont & Romain Deguest & Giacomo Scandolo, 2010. "Robustness and sensitivity analysis of risk measurement procedures," Quantitative Finance, Taylor & Francis Journals, vol. 10(6), pages 593-606.
    10. Salibian-Barrera, Matias & Van Aelst, Stefan & Willems, Gert, 2006. "Principal Components Analysis Based on Multivariate MM Estimators With Fast and Robust Bootstrap," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1198-1211, September.
    11. Knez, Peter J & Ready, Mark J, 1997. "On the Robustness of Size and Book-to-Market in Cross-Sectional Regressions," Journal of Finance, American Finance Association, vol. 52(4), pages 1355-1382, September.
    12. Fama, Eugene F & French, Kenneth R, 1992. "The Cross-Section of Expected Stock Returns," Journal of Finance, American Finance Association, vol. 47(2), pages 427-465, June.
    13. Torti, Francesca & Perrotta, Domenico & Atkinson, Anthony C. & Riani, Marco, 2012. "Benchmark testing of algorithms for very robust regression: FS, LMS and LTS," Computational Statistics & Data Analysis, Elsevier, vol. 56(8), pages 2501-2512.
    14. Muler, Nora & Yohai, Víctor J., 2013. "Robust estimation for vector autoregressive models," Computational Statistics & Data Analysis, Elsevier, vol. 65(C), pages 68-79.
    15. Rama Cont & Romain Deguest & Giacomo Scandolo, 2010. "Robustness and sensitivity analysis of risk measurement procedures," Post-Print hal-00413729, HAL.
    16. Billor, Nedret & Hadi, Ali S. & Velleman, Paul F., 2000. "BACON: blocked adaptive computationally efficient outlier nominators," Computational Statistics & Data Analysis, Elsevier, vol. 34(3), pages 279-298, September.
    17. Hawkins D. M. & Olive D. J., 2002. "Inconsistency of Resampling Algorithms for High-Breakdown Regression Estimators and a New Algorithm," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 136-159, March.
    18. Atif Mian & Amir Sufi, 2014. "What Explains the 2007–2009 Drop in Employment?," Econometrica, Econometric Society, vol. 82, pages 2197-2223, November.
    19. Fama, Eugene F. & French, Kenneth R., 1993. "Common risk factors in the returns on stocks and bonds," Journal of Financial Economics, Elsevier, vol. 33(1), pages 3-56, February.
    20. Daniel R. Carroll & Randal J. Verbrugge, 2019. "Behavior of a New Median PCE Measure: A Tale of Tails," Economic Commentary, Federal Reserve Bank of Cleveland, vol. 2019(10), July.
    21. Croux, Christophe & Haesbroeck, Gentiane, 1999. "Influence Function and Efficiency of the Minimum Covariance Determinant Scatter Matrix Estimator," Journal of Multivariate Analysis, Elsevier, vol. 71(2), pages 161-190, November.
    22. Cerioli, Andrea, 2010. "Multivariate Outlier Detection With High-Breakdown Estimators," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 147-156.

    Citations



    Cited by:

    1. Verbrugge, Randal & Zaman, Saeed, 2023. "The hard road to a soft landing: Evidence from a (modestly) nonlinear structural model," Energy Economics, Elsevier, vol. 123(C).
    2. Brenton R. Clarke & Andrew Grose, 2023. "A further study comparing forward search multivariate outlier methods including ATLA with an application to clustering," Statistical Papers, Springer, vol. 64(2), pages 395-420, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christian Garciga & Randal J. Verbrugge, 2020. "A New Tool for Robust Estimation and Identification of Unusual Data Points," Working Papers 20-08, Federal Reserve Bank of Cleveland.
    2. Baishuai Zuo & Chuancun Yin & Jing Yao, 2023. "Multivariate range Value-at-Risk and covariance risk measures for elliptical and log-elliptical distributions," Papers 2305.09097, arXiv.org.
    3. Derk Bienen, 2002. "Mindestlohnreformen in Südamerika – ökonomische Rechtfertigung und praktische Umsetzung," Ibero America Institute for Econ. Research (IAI) Discussion Papers 090, Ibero-America Institute for Economic Research.
    4. Mark B. Stewart, 2004. "The Impact of the Introduction of the U.K. Minimum Wage on the Employment Probabilities of Low-Wage Workers," Journal of the European Economic Association, MIT Press, vol. 2(1), pages 67-97, March.
    5. Lemos Sara, 2005. "Political Variables as Instruments for the Minimum Wage," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 4(1), pages 1-31, December.
    6. Shirley, Peter, 2018. "The response of commuting patterns to cross-border policy differentials: Evidence from the American Community Survey," Regional Science and Urban Economics, Elsevier, vol. 73(C), pages 1-16.
    7. Goto, Hideaki, 2008. "Labor Market Competitiveness and Poverty," Working Papers 51159, Cornell University, Department of Applied Economics and Management.
    8. Anderson, James H. & Korsun, Georges & Murrell, Peter, 2003. "Glamour and value in the land of Chingis Khan," Journal of Comparative Economics, Elsevier, vol. 31(1), pages 34-57, March.
    9. Thomas W. Downs & Robert W. Ingram, 2000. "Beta, Size, Risk, And Return," Journal of Financial Research, Southern Finance Association;Southwestern Finance Association, vol. 23(3), pages 245-260, September.
    10. William E. Even & David A. Macpherson, 2014. "The Effect of the Tipped Minimum Wage on Employees in the U.S. Restaurant Industry," Southern Economic Journal, John Wiley & Sons, vol. 80(3), pages 633-655, January.
    11. Lemos, Sara, 2004. "A Menu of Minimum Wage Variables for Evaluating Wages and Employment Effects: Evidence from Brazil," IZA Discussion Papers 1069, Institute of Labor Economics (IZA).
    12. Pinoli, Sara, 2008. "Rational Expectations and the Puzzling No-Effect of the Minimum Wage," MPRA Paper 11405, University Library of Munich, Germany.
    13. Thilini V. Mahanama & Abootaleb Shirvani & Svetlozar Rachev, 2023. "The Financial Market of Indices of Socioeconomic Wellbeing," Papers 2303.05654, arXiv.org.
    14. Denis Fougère & Erwan Gautier & Hervé Le Bihan, 2010. "Restaurant Prices and the Minimum Wage," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 42(7), pages 1199-1234, October.
    15. Anton Astakhov & Tomas Havranek & Jiri Novak, 2019. "Firm Size And Stock Returns: A Quantitative Survey," Journal of Economic Surveys, Wiley Blackwell, vol. 33(5), pages 1463-1492, December.
    16. Philipp Berge & Hanna Frings, 2020. "High-impact minimum wages and heterogeneous regions," Empirical Economics, Springer, vol. 59(2), pages 701-729, August.
    17. Joseph Marchand, 2017. "Thinking about Minimum Wage Increases in Alberta: Theoretically, Empirically, and Regionally," C.D. Howe Institute Commentary, C.D. Howe Institute, issue 491, pages 1-20, September.
    18. Ang, Andrew & Chen, Joseph, 2007. "CAPM over the long run: 1926-2001," Journal of Empirical Finance, Elsevier, vol. 14(1), pages 1-40, January.
    19. Daniele Bondonio, 2019. "Does the Running Variable Matter? A Second Look at Discontinuity Designs for Evaluating Regional Economic Development and Business Incentive Policies," Economics Working Paper from Condorcet Center for political Economy at CREM-CNRS 2019-02-ccr, Condorcet Center for political Economy.
    20. Thiess Büttner & Alexander Ebertz & Jens Ruhose, 2009. "Der Mindestlohn und die räumliche Lohnstruktur in Deutschland," ifo Schnelldienst, ifo Institute - Leibniz Institute for Economic Research at the University of Munich, vol. 62(05), pages 20-26, March.

    More about this item

    Keywords

    Outlier identification; Fragility; Robust estimation; detMCD; RMVN;

    JEL classification:

    • C31 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models; Quantile Regressions; Social Interaction Models
    • C38 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Classification Methods; Cluster Analysis; Principal Components; Factor Analysis
    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C87 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Econometric Software
