Printed from https://ideas.repec.org/a/spr/metrik/v85y2022i6d10.1007_s00184-021-00848-9.html

Robust covariance estimation for distributed principal component analysis

Author

Listed:
  • Kangqiang Li

    (Zhejiang University)

  • Han Bao

    (Zhejiang University)

  • Lixin Zhang

    (Zhejiang University)

Abstract

Fan et al. (Ann Stat 47(6):3009–3031, 2019) constructed a distributed principal component analysis (PCA) algorithm that significantly reduces the communication cost between multiple servers. However, their algorithm's guarantees hold only for sub-Gaussian data. Motivated by this limitation, this paper extends their distributed PCA algorithm to heavy-tailed data by using the robust covariance matrix estimators of Minsker (Ann Stat 46(6A):2871–2903, 2018) and Ke et al. (Stat Sci 34(3):454–471, 2019). The theoretical results show that when the sampling distribution has symmetric innovation with a bounded fourth moment, or is asymmetric with a finite sixth moment, the statistical error rate of the final estimator produced by the robust algorithm is similar to the rate obtained under sub-Gaussian tails. Extensive numerical experiments support the theoretical analysis and indicate that our algorithm is robust to heavy-tailed data and outliers.
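
As a rough illustration of the idea described in the abstract, the Python sketch below combines a truncated covariance estimate on each server with the averaging of local projection matrices on a central server. It is a minimal sketch under stated assumptions, not the estimator or algorithm analysed in the paper: the data are assumed to be centred, the norm-based truncation rule is a simplified stand-in for the robust estimators of Minsker and Ke et al., and the function names and the threshold `tau` are hypothetical.

```python
import numpy as np

def truncated_covariance(X, tau):
    """Norm-truncated covariance estimate (illustrative assumption).

    Rows of X are assumed to be zero-mean observations. Each outer
    product x x^T is rescaled so that its trace is min(||x||^2, tau).
    """
    sq_norms = np.sum(X ** 2, axis=1)
    # Weight is 1 when ||x||^2 <= tau, otherwise tau / ||x||^2.
    weights = np.minimum(1.0, tau / np.maximum(sq_norms, 1e-12))
    return (X * weights[:, None]).T @ X / X.shape[0]

def top_k_eigvecs(S, k):
    """Return k orthonormal eigenvectors for the largest eigenvalues of S."""
    _, vecs = np.linalg.eigh(S)  # eigenvalues come back in ascending order
    return vecs[:, -k:]

def distributed_pca(local_datasets, k, tau):
    """Average the local rank-k projection matrices, then re-extract k directions."""
    d = local_datasets[0].shape[1]
    avg_proj = np.zeros((d, d))
    for X in local_datasets:
        # Each server only needs to send its d-by-k eigenvector matrix.
        V = top_k_eigvecs(truncated_covariance(X, tau), k)
        avg_proj += V @ V.T
    avg_proj /= len(local_datasets)
    return top_k_eigvecs(avg_proj, k)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four hypothetical servers holding heavy-tailed (Student t) samples.
    servers = [rng.standard_t(df=5, size=(200, 6)) for _ in range(4)]
    V_hat = distributed_pca(servers, k=2, tau=50.0)
    print(V_hat.shape)
```

In this sketch a very large `tau` recovers the ordinary sample covariance, while a smaller `tau` limits the influence of observations with large norm. How the threshold should be chosen is not something the sketch addresses.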

Suggested Citation

  • Kangqiang Li & Han Bao & Lixin Zhang, 2022. "Robust covariance estimation for distributed principal component analysis," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(6), pages 707-732, August.
  • Handle: RePEc:spr:metrik:v:85:y:2022:i:6:d:10.1007_s00184-021-00848-9
    DOI: 10.1007/s00184-021-00848-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-021-00848-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00184-021-00848-9?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Fang Han & Han Liu, 2018. "ECA: High-Dimensional Elliptical Component Analysis in Non-Gaussian Distributions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 252-268, January.
    2. Marco Avella-Medina & Heather S Battey & Jianqing Fan & Quefeng Li, 2018. "Robust estimation of high-dimensional covariance and precision matrices," Biometrika, Biometrika Trust, vol. 105(2), pages 271-284.
    3. Fan, Jianqing & Fan, Yingying & Lv, Jinchi, 2008. "High dimensional covariance matrix estimation using a factor model," Journal of Econometrics, Elsevier, vol. 147(1), pages 186-197, November.

    Citations


    Cited by:

    1. Kenji Katayama & Kei Kawaguchi & Yuta Egawa & Zhenhua Pan, 2022. "Local Charge Carrier Dynamics for Photocatalytic Materials Using Pattern-Illumination Time-Resolved Phase Microscopy," Energies, MDPI, vol. 15(24), pages 1-13, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yang, Shuquan & Ling, Nengxiang, 2023. "Robust projected principal component analysis for large-dimensional semiparametric factor modeling," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
    2. Xin Wang & Lingchen Kong & Liqun Wang & Zhaoqilin Yang, 2023. "High-Dimensional Covariance Estimation via Constrained Lq-Type Regularization," Mathematics, MDPI, vol. 11(4), pages 1-20, February.
    3. Liusha Yang & Romain Couillet & Matthew R. McKay, 2015. "A Robust Statistics Approach to Minimum Variance Portfolio Optimization," Papers 1503.08013, arXiv.org.
    4. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    5. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    6. David Neděla & Sergio Ortobelli & Tomáš Tichý, 2024. "Mean–variance vs trend–risk portfolio selection," Review of Managerial Science, Springer, vol. 18(7), pages 2047-2078, July.
    7. Seyoung Park & Eun Ryung Lee & Sungchul Lee & Geonwoo Kim, 2019. "Dantzig Type Optimization Method with Applications to Portfolio Selection," Sustainability, MDPI, vol. 11(11), pages 1-32, June.
    8. Francesco Lautizi, 2015. "Large Scale Covariance Estimates for Portfolio Selection," CEIS Research Paper 353, Tor Vergata University, CEIS, revised 07 Aug 2015.
    9. Christian M. Hafner & Oliver Linton & Haihan Tang, 2016. "Estimation of a multiplicative covariance structure in the large dimensional case," CeMMAP working papers 52/16, Institute for Fiscal Studies.
    10. Nikolaus Hautsch & Lada M. Kyj & Peter Malec, 2015. "Do High‐Frequency Data Improve High‐Dimensional Portfolio Allocations?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 30(2), pages 263-290, March.
    11. Hafner, Christian M. & Linton, Oliver B. & Tang, Haihan, 2020. "Estimation of a multiplicative correlation structure in the large dimensional case," Journal of Econometrics, Elsevier, vol. 217(2), pages 431-470.
    12. Hafner, Christian & Linton, Oliver B. & Tang, Haihan, 2016. "Estimation of a Multiplicative Covariance Structure in the Large Dimensional Case," LIDAM Discussion Papers CORE 2016044, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    13. Viet Anh Nguyen & Daniel Kuhn & Peyman Mohajerin Esfahani, 2018. "Distributionally Robust Inverse Covariance Estimation: The Wasserstein Shrinkage Estimator," Papers 1805.07194, arXiv.org.
    14. Pesaran, M. Hashem & Yamagata, Takashi, 2012. "Testing CAPM with a Large Number of Assets," IZA Discussion Papers 6469, Institute of Labor Economics (IZA).
    15. Clifford Lam & Phoenix Feng & Charlie Hu, 2017. "Nonlinear shrinkage estimation of large integrated covariance matrices," Biometrika, Biometrika Trust, vol. 104(2), pages 481-488.
    16. Jin-Chuan Duan & Weimin Miao, 2016. "Default Correlations and Large-Portfolio Credit Analysis," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 536-546, October.
    17. Tae-Hwy Lee & Ekaterina Seregina, 2024. "Optimal Portfolio Using Factor Graphical Lasso," Journal of Financial Econometrics, Oxford University Press, vol. 22(3), pages 670-695.
    18. Francisco Peñaranda & Enrique Sentana, 2024. "Portfolio management with big data," Working Papers wp2024_2411, CEMFI.
    19. Tatsuya Kubokawa & Muni S. Srivastava, 2013. "Optimal Ridge-type Estimators of Covariance Matrix in High Dimension," CIRJE F-Series CIRJE-F-906, CIRJE, Faculty of Economics, University of Tokyo.
    20. Hautsch, Nikolaus & Voigt, Stefan, 2019. "Large-scale portfolio allocation under transaction costs and model uncertainty," Journal of Econometrics, Elsevier, vol. 212(1), pages 221-240.
