IDEAS home Printed from https://ideas.repec.org/a/spr/metrik/v85y2022i6d10.1007_s00184-021-00848-9.html

Robust covariance estimation for distributed principal component analysis

Author

Listed:
  • Kangqiang Li

    (Zhejiang University)

  • Han Bao

    (Zhejiang University)

  • Lixin Zhang

    (Zhejiang University)

Abstract

Fan et al. (Ann Stat 47(6):3009–3031, 2019) constructed a distributed principal component analysis (PCA) algorithm that significantly reduces the communication cost between multiple servers. However, the theoretical guarantee for their algorithm holds only for sub-Gaussian data. Motivated by this limitation, this paper extends their distributed PCA algorithm to heavy-tailed data by using the robust covariance matrix estimators of Minsker (Ann Stat 46(6A):2871–2903, 2018) and Ke et al. (Stat Sci 34(3):454–471, 2019). The theoretical results show that when the sampling distribution has symmetric innovation with a bounded fourth moment, or is asymmetric with a finite sixth moment, the statistical error rate of the final estimator produced by the robust algorithm is similar to the rate obtained under sub-Gaussian tails. Extensive numerical trials support the theoretical analysis and indicate that the algorithm is robust to heavy-tailed data and outliers.
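The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual one-round scheme for distributed PCA: each server computes the top-k eigenvectors of a local covariance estimate, and a central server averages the resulting projection matrices and takes the top-k eigenvectors of the average. The local estimator here is a simple norm-truncated covariance built from disjoint pairwise differences, used as a stand-in in the spirit of the robust estimators cited above. The function names and the truncation level `tau` are hypothetical, and in practice `tau` would need tuning.

```python
import numpy as np

def truncated_covariance(X, tau):
    """Illustrative norm-truncated covariance estimate (a stand-in, not the paper's estimator)."""
    n = X.shape[0] - X.shape[0] % 2            # use an even number of rows
    Y = (X[0:n:2] - X[1:n:2]) / np.sqrt(2.0)   # pairwise differences cancel the unknown mean
    sq = np.sum(Y ** 2, axis=1)                # squared norm of each difference
    # Weight is 1 when the squared norm is at most tau, and tau / sq otherwise,
    # so only large-norm (potentially outlying) terms are shrunk.
    w = np.minimum(sq, tau) / np.maximum(sq, 1e-12)
    return (Y * w[:, None]).T @ Y / Y.shape[0]

def top_k_eigvecs(S, k):
    """Return the k leading eigenvectors of a symmetric matrix as columns."""
    _, vecs = np.linalg.eigh(S)                # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]

def distributed_pca(shards, k, tau):
    """One-round aggregation: average local projection matrices, then re-extract k directions."""
    d = shards[0].shape[1]
    P = np.zeros((d, d))
    for X in shards:                           # each shard plays the role of one server
        V = top_k_eigvecs(truncated_covariance(X, tau), k)
        P += V @ V.T                           # only the d-by-k matrix V would be communicated
    return top_k_eigvecs(P / len(shards), k)
```

Each element of `shards` is an array of shape (n, d) held by one server, and the function returns a d-by-k matrix with orthonormal columns. Averaging projection matrices rather than eigenvectors avoids the sign and rotation ambiguity of the local solutions.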

Suggested Citation

  • Kangqiang Li & Han Bao & Lixin Zhang, 2022. "Robust covariance estimation for distributed principal component analysis," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(6), pages 707-732, August.
  • Handle: RePEc:spr:metrik:v:85:y:2022:i:6:d:10.1007_s00184-021-00848-9
    DOI: 10.1007/s00184-021-00848-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-021-00848-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations



    Cited by:

    1. Kenji Katayama & Kei Kawaguchi & Yuta Egawa & Zhenhua Pan, 2022. "Local Charge Carrier Dynamics for Photocatalytic Materials Using Pattern-Illumination Time-Resolved Phase Microscopy," Energies, MDPI, vol. 15(24), pages 1-13, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yang, Shuquan & Ling, Nengxiang, 2023. "Robust projected principal component analysis for large-dimensional semiparametric factor modeling," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
    2. Xin Wang & Lingchen Kong & Liqun Wang & Zhaoqilin Yang, 2023. "High-Dimensional Covariance Estimation via Constrained L q -Type Regularization," Mathematics, MDPI, vol. 11(4), pages 1-20, February.
    3. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    4. Tae-Hwy Lee & Ekaterina Seregina, 2020. "Optimal Portfolio Using Factor Graphical Lasso," Working Papers 202025, University of California at Riverside, Department of Economics.
    5. Tae-Hwy Lee & Ekaterina Seregina, 2020. "Learning from Forecast Errors: A New Approach to Forecast Combination," Working Papers 202024, University of California at Riverside, Department of Economics.
    6. Zeyu Wu & Cheng Wang & Weidong Liu, 2023. "A unified precision matrix estimation framework via sparse column-wise inverse operator under weak sparsity," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(4), pages 619-648, August.
    7. Pun, Chi Seng & Wong, Hoi Ying, 2019. "A linear programming model for selection of sparse high-dimensional multiperiod portfolios," European Journal of Operational Research, Elsevier, vol. 273(2), pages 754-771.
    8. Bai, Jushan & Liao, Yuan, 2016. "Efficient estimation of approximate factor models via penalized maximum likelihood," Journal of Econometrics, Elsevier, vol. 191(1), pages 1-18.
    9. Jingying Yang, 2024. "Element Aggregation for Estimation of High-Dimensional Covariance Matrices," Mathematics, MDPI, vol. 12(7), pages 1-16, March.
    10. Qiu, Yumou & Chen, Songxi, 2012. "Test for Bandedness of High Dimensional Covariance Matrices with Bandwidth Estimation," MPRA Paper 46242, University Library of Munich, Germany.
    11. Gianluca De Nard & Olivier Ledoit & Michael Wolf, 2018. "Factor models for portfolio selection in large dimensions: the good, the better and the ugly," ECON - Working Papers 290, Department of Economics - University of Zurich, revised Dec 2018.
    12. Horváth, Lajos & Rice, Gregory, 2019. "Asymptotics for empirical eigenvalue processes in high-dimensional linear factor models," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 138-165.
    13. Sung Hoon Choi & Donggyu Kim, 2022. "Large Volatility Matrix Analysis Using Global and National Factor Models," Papers 2208.12323, arXiv.org, revised Dec 2022.
    14. Chen, Jia & Li, Degui & Linton, Oliver, 2019. "A new semiparametric estimation approach for large dynamic covariance matrices with multiple conditioning variables," Journal of Econometrics, Elsevier, vol. 212(1), pages 155-176.
    15. Choi, Sung Hoon & Kim, Donggyu, 2023. "Large volatility matrix analysis using global and national factor models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1917-1933.
    16. Lam, Clifford, 2020. "High-dimensional covariance matrix estimation," LSE Research Online Documents on Economics 101667, London School of Economics and Political Science, LSE Library.
    17. Dong Hwan Oh & Andrew J. Patton, 2017. "Modeling Dependence in High Dimensions With Factor Copulas," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 35(1), pages 139-154, January.
    18. Wei Lan & Ronghua Luo & Chih-Ling Tsai & Hansheng Wang & Yunhong Yang, 2015. "Testing the Diagonality of a Large Covariance Matrix in a Regression Setting," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(1), pages 76-86, January.
    19. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    20. Gautam Sabnis & Debdeep Pati & Anirban Bhattacharya, 2019. "Compressed Covariance Estimation with Automated Dimension Learning," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 466-481, December.
