Covariance matrix estimation for left-censored data

Author

Listed:
  • Pesonen, Maiju
  • Pesonen, Henri
  • Nevalainen, Jaakko

Abstract

Multivariate methods often rely on a sample covariance matrix. Conventional covariance matrix estimators require complete data vectors on all subjects, an assumption that frequently cannot be met. For example, in many fields of the life sciences that use modern measuring technology, such as mass spectrometry, left-censored values caused by denoising the data are a commonplace phenomenon. Left-censored values are low-level concentrations that are considered too imprecise to be reported as a single number but are known to lie somewhere between zero and the laboratory’s lower limit of detection. Maximum likelihood-based covariance matrix estimators are considered that accommodate left-censored values without substituting them with a constant or ignoring them completely. The presented estimators use all the available information efficiently and, in simulation studies, produce less biased estimates than commonly used competing estimators. Because the genuine maximum likelihood estimate can be computed quickly only in low dimensions, it is suggested to estimate the covariance matrix element-wise and then adjust the resulting matrix to achieve positive semi-definiteness. It is shown that the new approach decreases computation times substantially while still producing accurate estimates. Finally, as an example, a left-censored data set of toxic chemicals is explored.
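The abstract's final step, adjusting an element-wise estimate so that it becomes positive semi-definite, can be illustrated with a minimal sketch. The abstract does not say which adjustment the authors use; the version below assumes eigenvalue clipping, one common choice, and the function name `clip_to_psd` and the example matrix are hypothetical.

```python
import numpy as np


def clip_to_psd(cov, floor=0.0):
    """Adjust a symmetric matrix so that all eigenvalues are >= floor.

    Assumption: eigenvalue clipping stands in for the adjustment step
    mentioned in the abstract; the paper may use a different method.
    With floor=0 the result is the nearest positive semi-definite
    matrix in the Frobenius norm.
    """
    cov = np.asarray(cov, dtype=float)
    # Symmetrize first: element-wise estimates can be slightly asymmetric.
    sym = (cov + cov.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Raise every eigenvalue below the floor up to the floor.
    clipped = np.clip(eigvals, floor, None)
    # Rebuild V * diag(clipped) * V^T.
    adjusted = (eigvecs * clipped) @ eigvecs.T
    # Remove round-off asymmetry introduced by the reconstruction.
    return (adjusted + adjusted.T) / 2.0


# Hypothetical pairwise estimates: each 2x2 block is a valid correlation
# matrix, but the three correlations are jointly impossible, so the
# assembled 3x3 matrix has a negative eigenvalue.
pairwise = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])

adjusted = clip_to_psd(pairwise)
print("smallest eigenvalue before:", np.linalg.eigvalsh(pairwise).min())
print("smallest eigenvalue after: ", np.linalg.eigvalsh(adjusted).min())
```

Clipping also changes the diagonal, so the adjusted variances differ from the element-wise ones; whether that is acceptable depends on the application.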

Suggested Citation

  • Pesonen, Maiju & Pesonen, Henri & Nevalainen, Jaakko, 2015. "Covariance matrix estimation for left-censored data," Computational Statistics & Data Analysis, Elsevier, vol. 92(C), pages 13-25.
  • Handle: RePEc:eee:csdana:v:92:y:2015:i:c:p:13-25
    DOI: 10.1016/j.csda.2015.06.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947315001437
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2015.06.005?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Robert H. Lyles & Jovonne K. Williams & Rutt Chuachoowong, 2001. "Correlating Two Viral Load Assays with Known Detection Limits," Biometrics, The International Biometric Society, vol. 57(4), pages 1238-1244, December.
    2. N. Locantore & J. Marron & D. Simpson & N. Tripoli & J. Zhang & K. Cohen & Graciela Boente & Ricardo Fraiman & Babette Brumback & Christophe Croux & Jianqing Fan & Alois Kneip & John Marden & Daniel P, 1999. "Robust principal component analysis for functional data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 8(1), pages 1-73, June.
    3. Jianhua Z. Huang & Naiping Liu & Mohsen Pourahmadi & Linxu Liu, 2006. "Covariance matrix selection and estimation via penalised normal likelihood," Biometrika, Biometrika Trust, vol. 93(1), pages 85-98, March.
    4. Marden, John I., 1999. "Some robust estimates of principal components," Statistics & Probability Letters, Elsevier, vol. 43(4), pages 349-359, July.
    5. Schäfer Juliane & Strimmer Korbinian, 2005. "A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 4(1), pages 1-32, November.
    6. Haiying Chen & Sara A. Quandt & Joseph G. Grzywacz & Thomas A. Arcury, 2013. "A Bayesian multiple imputation method for handling longitudinal pesticide data with values below the limit of detection," Environmetrics, John Wiley & Sons, Ltd., vol. 24(2), pages 132-142, March.

    Citations


    Cited by:

    1. Jing Ma, 2021. "Joint Microbial and Metabolomic Network Estimation with the Censored Gaussian Graphical Model," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 351-372, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guangxing Wang & Sisheng Liu & Fang Han & Chong‐Zhi Di, 2023. "Robust functional principal component analysis via a functional pairwise spatial sign operator," Biometrics, The International Biometric Society, vol. 79(2), pages 1239-1253, June.
    2. Lam, Clifford, 2020. "High-dimensional covariance matrix estimation," LSE Research Online Documents on Economics 101667, London School of Economics and Political Science, LSE Library.
    3. Gautam Sabnis & Debdeep Pati & Anirban Bhattacharya, 2019. "Compressed Covariance Estimation with Automated Dimension Learning," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 466-481, December.
    4. C. Croux & C. Dehon & A. Yadine, 2010. "The k-step spatial sign covariance matrix," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 137-150, September.
    5. Villers Fanny & Schaeffer Brigitte & Bertin Caroline & Huet Sylvie, 2008. "Assessing the Validity Domains of Graphical Gaussian Models in Order to Infer Relationships among Components of Complex Biological Systems," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(2), pages 1-37, September.
    6. Bailey, Natalia & Pesaran, M. Hashem & Smith, L. Vanessa, 2019. "A multiple testing approach to the regularisation of large sample correlation matrices," Journal of Econometrics, Elsevier, vol. 208(2), pages 507-534.
    7. Seija Sirkiä & Sara Taskinen & Hannu Oja & David Tyler, 2009. "Tests and estimates of shape based on spatial signs and ranks," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 21(2), pages 155-176.
    8. Debruyne, Michiel & Hubert, Mia & Van Horebeek, Johan, 2010. "Detecting influential observations in Kernel PCA," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3007-3019, December.
    9. Taskinen, Sara & Koch, Inge & Oja, Hannu, 2012. "Robustifying principal component analysis with spatial sign vectors," Statistics & Probability Letters, Elsevier, vol. 82(4), pages 765-774.
    10. Dürre, Alexander & Vogel, Daniel & Tyler, David E., 2014. "The spatial sign covariance matrix with unknown location," Journal of Multivariate Analysis, Elsevier, vol. 130(C), pages 107-117.
    11. Dürre, Alexander & Tyler, David E. & Vogel, Daniel, 2016. "On the eigenvalues of the spatial sign covariance matrix in more than two dimensions," Statistics & Probability Letters, Elsevier, vol. 111(C), pages 80-85.
    12. Fisher, Thomas J. & Sun, Xiaoqian, 2011. "Improved Stein-type shrinkage estimators for the high-dimensional multivariate normal covariance matrix," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1909-1918, May.
    13. Marc Vidal & Mattia Rosso & Ana M. Aguilera, 2021. "Bi-Smoothed Functional Independent Component Analysis for EEG Artifact Removal," Mathematics, MDPI, vol. 9(11), pages 1-17, May.
    14. Michiel Debruyne & Tim Verdonck, 2010. "Robust kernel principal component analysis and classification," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 151-167, September.
    15. Majumdar, Subhabrata & Chatterjee, Snigdhansu, 2022. "On weighted multivariate sign functions," Journal of Multivariate Analysis, Elsevier, vol. 191(C).
    16. Raymaekers, Jakob & Rousseeuw, Peter, 2019. "A generalized spatial sign covariance matrix," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 94-111.
    17. Dürre, Alexander & Vogel, Daniel & Fried, Roland, 2015. "Spatial sign correlation," Journal of Multivariate Analysis, Elsevier, vol. 135(C), pages 89-105.
    18. Dürre, Alexander & Vogel, Daniel, 2016. "Asymptotics of the two-stage spatial sign correlation," Journal of Multivariate Analysis, Elsevier, vol. 144(C), pages 54-67.
    19. J. L. Scealy & Patrice de Caritat & Eric C. Grunsky & Michail T. Tsagris & A. H. Welsh, 2015. "Robust Principal Component Analysis for Power Transformed Compositional Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 136-148, March.
    20. Hannart, Alexis & Naveau, Philippe, 2014. "Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 149-162.
