Printed from https://ideas.repec.org/a/eee/csdana/v138y2019icp201-221.html

Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior

Author

Listed:
  • Sinha, Shyamalendu
  • Hart, Jeffrey D.

Abstract

A framework is provided for estimating the mean and variance of a high-dimensional normal density. The main setting considered is a fixed number of vectors following a high-dimensional normal distribution with unknown mean and diagonal covariance matrix. The diagonal covariance matrix can be known or unknown. If the covariance matrix is unknown, the sample size can be as small as 2. The proposed estimator is based on the idea that the unobserved mean/variance pairs across dimensions are drawn from an unknown bivariate distribution, which is modeled as a mixture of normal-inverse gammas. The mixture of normal-inverse gamma distributions provides advantages over more traditional empirical Bayes methods, which are based on a normal–normal model. When fitting a mixture model, the algorithm is essentially clustering the unobserved mean and variance pairs into different groups, with each group having a different normal-inverse gamma distribution. The proposed estimator of each mean is the posterior mean of shrinkage estimates, each of which shrinks a sample mean towards a different component of the mixture distribution. The proposed estimator of variance has an analogous interpretation in terms of sample variances and components of the mixture distribution. If the diagonal covariance matrix is known, then the sample size can be as small as 1, and the pairs of known variances and unknown means across dimensions are treated as random observations coming from a flexible mixture of normal-inverse gamma distributions.
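
The shrinkage idea described in the abstract can be illustrated with a minimal sketch for a single dimension. It assumes the mixture weights and the normal-inverse gamma hyperparameters are already known, so it covers only the final posterior-mean step and is not the authors' fitting procedure. The function name, the argument names, and the parameterization (sigma^2 ~ InvGamma(alpha, beta), mu | sigma^2 ~ N(m, sigma^2 / lam)) are assumptions made for illustration. Each component yields its own conjugate shrinkage estimate, and the estimates are averaged using posterior component probabilities.

```python
import math

def nig_mixture_posterior_means(xbar, ss, n, weights, m, lam, alpha, beta):
    """Posterior means of (mu, sigma^2) for one dimension.

    xbar, ss, n : sample mean, sum of squared deviations, sample size.
    weights, m, lam, alpha, beta : per-component mixture weights and NIG
    hyperparameters, assumed known here (hypothetical inputs).
    """
    log_w, mean_k, var_k = [], [], []
    for pk, mk, lk, ak, bk in zip(weights, m, lam, alpha, beta):
        # Conjugate normal-inverse gamma update for this component.
        lam_n = lk + n
        m_n = (lk * mk + n * xbar) / lam_n  # sample mean shrunk towards mk
        a_n = ak + n / 2.0
        b_n = bk + 0.5 * ss + 0.5 * lk * n * (xbar - mk) ** 2 / lam_n
        # Log marginal likelihood of the data under this component.
        log_ml = (-0.5 * n * math.log(2.0 * math.pi)
                  + 0.5 * (math.log(lk) - math.log(lam_n))
                  + ak * math.log(bk) - a_n * math.log(b_n)
                  + math.lgamma(a_n) - math.lgamma(ak))
        log_w.append(math.log(pk) + log_ml)
        mean_k.append(m_n)
        var_k.append(b_n / (a_n - 1.0))  # inverse gamma mean, needs a_n > 1
    # Normalize the component probabilities in a numerically stable way.
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    w = [v / total for v in w]
    mu_hat = sum(wi * mi for wi, mi in zip(w, mean_k))
    var_hat = sum(wi * vi for wi, vi in zip(w, var_k))
    return mu_hat, var_hat, w

# Two observations (4.8, 5.4) in one dimension: xbar = 5.1, ss = 0.18.
mu_hat, var_hat, w = nig_mixture_posterior_means(
    xbar=5.1, ss=0.18, n=2,
    weights=[0.5, 0.5], m=[0.0, 5.0], lam=[1.0, 1.0],
    alpha=[3.0, 3.0], beta=[2.0, 2.0])
print(mu_hat, var_hat, w)
```

With n = 2 the updated shape parameter is alpha + 1, so the posterior mean of the variance exists for any positive alpha, which is in line with the abstract's remark that the sample size can be as small as 2 when the variances are unknown.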

Suggested Citation

  • Sinha, Shyamalendu & Hart, Jeffrey D., 2019. "Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior," Computational Statistics & Data Analysis, Elsevier, vol. 138(C), pages 201-221.
  • Handle: RePEc:eee:csdana:v:138:y:2019:i:c:p:201-221
    DOI: 10.1016/j.csda.2019.04.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947319300908
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Matthew Stephens, 2000. "Dealing with label switching in mixture models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(4), pages 795-809.
    2. Xianchao Xie & S. C. Kou & Lawrence D. Brown, 2012. "SURE Estimates for a Heteroscedastic Hierarchical Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1465-1479, December.
    3. repec:dau:papers:123456789/4648 is not listed on IDEAS
    4. Asaf Weinstein & Zhuang Ma & Lawrence D. Brown & Cun-Hui Zhang, 2018. "Group-Linear Empirical Bayes Estimates for a Heteroscedastic Normal Mean," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 698-710, April.
    5. Bing-Yi Jing & Zhouping Li & Guangming Pan & Wang Zhou, 2016. "On SURE-Type Double Shrinkage Estimation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1696-1704, October.

    Citations



    Cited by:

    1. Lang Zhao & Yuan Zeng & Zhidong Wang & Yizheng Li & Dong Peng & Yao Wang & Xueying Wang, 2023. "Robust Optimal Scheduling of Integrated Energy Systems Considering the Uncertainty of Power Supply and Load in the Power Market," Energies, MDPI, vol. 16(14), pages 1-14, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jochmans, Koen & Weidner, Martin, 2024. "Inference On A Distribution From Noisy Draws," Econometric Theory, Cambridge University Press, vol. 40(1), pages 60-97, February.
    2. Jiafeng Chen, 2022. "Empirical Bayes When Estimation Precision Predicts Parameters," Papers 2212.14444, arXiv.org, revised Apr 2024.
    3. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    4. Jeong Eun Lee & Christian Robert, 2013. "Importance Sampling Schemes for Evidence Approximation in Mixture Models," Working Papers 2013-42, Center for Research in Economics and Statistics.
    5. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2012. "The directional identification problem in Bayesian factor analysis: An ex-post approach," Kiel Working Papers 1799, Kiel Institute for the World Economy (IfW Kiel).
    6. Sun-Joo Cho & Allan S. Cohen, 2010. "A Multilevel Mixture IRT Model With an Application to DIF," Journal of Educational and Behavioral Statistics, vol. 35(3), pages 336-370, June.
    7. Brian Hartley, 2020. "Corridor stability of the Kaleckian growth model: a Markov-switching approach," Working Papers 2013, New School for Social Research, Department of Economics, revised Nov 2020.
    8. Papastamoulis, Panagiotis, 2018. "Overfitting Bayesian mixtures of factor analyzers with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 220-234.
    9. Simen Alexander Linge Johnsen & Jörg Bollmann, 2020. "Coccolith mass and morphology of different Emiliania huxleyi morphotypes: A critical examination using Canary Islands material," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-29, March.
    10. Nichole E. Carlson & Timothy D. Johnson & Morton B. Brown, 2009. "A Bayesian Approach to Modeling Associations Between Pulsatile Hormones," Biometrics, The International Biometric Society, vol. 65(2), pages 650-659, June.
    11. Montanari, Angela & Viroli, Cinzia, 2011. "Maximum likelihood estimation of mixtures of factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2712-2723, September.
    12. Stéphane Bonhomme & Koen Jochmans & Jean-Marc Robin, 2016. "Non-parametric estimation of finite mixtures from repeated measurements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 211-229, January.
    13. Xue, Jiacheng & Yao, Weixin, 2022. "Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions," Econometrics and Statistics, Elsevier, vol. 22(C), pages 159-171.
    14. Liqun Wang & James Fu, 2007. "A practical sampling approach for a Bayesian mixture model with unknown number of components," Statistical Papers, Springer, vol. 48(4), pages 631-653, October.
    15. Royce Anders & William Batchelder, 2015. "Cultural Consensus Theory for the Ordinal Data Case," Psychometrika, Springer;The Psychometric Society, vol. 80(1), pages 151-181, March.
    16. Lu, Xiaosun & Huang, Yangxin & Zhu, Yiliang, 2016. "Finite mixture of nonlinear mixed-effects joint models in the presence of missing and mismeasured covariate, with application to AIDS studies," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 119-130.
    17. Yuan Fang & Dimitris Karlis & Sanjeena Subedi, 2022. "Infinite Mixtures of Multivariate Normal-Inverse Gaussian Distributions for Clustering of Skewed Data," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 510-552, November.
    18. De la Cruz-Mesia, Rolando & Quintana, Fernando A. & Marshall, Guillermo, 2008. "Model-based clustering for longitudinal data," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1441-1457, January.
    19. Kim, Jin Gyo & Menzefricke, Ulrich & Feinberg, Fred M., 2004. "Assessing Heterogeneity in Discrete Choice Models Using a Dirichlet Process Prior," Review of Marketing Science, De Gruyter, vol. 2(1), pages 1-41, January.
    20. Terrance Savitsky & Daniel McCaffrey, 2014. "Bayesian Hierarchical Multivariate Formulation with Factor Analysis for Nested Ordinal Data," Psychometrika, Springer;The Psychometric Society, vol. 79(2), pages 275-302, April.
