Powered embarrassing parallel MCMC sampling in Bayesian inference, a weighted average intuition

Author

Listed:
  • Li, Song
  • Tso, Geoffrey K.F.
  • Long, Lufan

Abstract

Although Markov chain Monte Carlo (MCMC) is very popular for parameter inference, reducing its computational burden is crucial because of limits on processors and memory and disk bottlenecks, especially when handling big data. In recent years, researchers have developed parallel MCMC algorithms in which the full data are partitioned into subdatasets and samples are drawn from the subdatasets independently on different machines, without communication. The extant literature treats all machines as identical. However, given the heterogeneity of the data assigned to different machines and the random nature of MCMC, this assumption of “identical machines” is questionable. Here we propose a Powered Embarrassing Parallel MCMC (PEPMCMC) algorithm, in which the full-data posterior density is the product of the sub-posterior densities (the posterior densities of the different subdatasets), each raised to a constrained power. This is proven to be equivalent to a weighted averaging procedure. The powers are determined by a maximum likelihood criterion, which amounts to finding a maximum likelihood point within the convex hull of the estimates from the different machines. We prove the asymptotic exactness of the algorithm and apply it to several cases to verify its strength in comparison with the non-parallel and unpowered parallel algorithms. Furthermore, we investigate the connection between normal kernel density estimation and parametric density estimation under certain conditions.
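
The "weighted average" reading of powered sub-posteriors can be illustrated in the simplest setting. The sketch below is not the authors' PEPMCMC algorithm: it assumes one-dimensional Gaussian sub-posteriors with known means and variances, and it takes the powers as given rather than choosing them by the maximum likelihood criterion described in the abstract. The function name `powered_gaussian_product` and all numeric values are hypothetical. Under those assumptions, raising a Gaussian density to a power w multiplies its precision by w, so the product of powered densities is again Gaussian, with a mean that is a weighted average of the sub-posterior means.

```python
def powered_gaussian_product(means, variances, powers):
    """Combine 1-D Gaussian sub-posteriors N(means[i], variances[i]),
    each raised to powers[i], into a single Gaussian (mean, variance).

    Illustrative assumption: sub-posteriors are exactly Gaussian.
    """
    if not (len(means) == len(variances) == len(powers)):
        raise ValueError("means, variances and powers must have equal length")
    # Raising N(m, v) to the power w scales its precision 1/v by w.
    precisions = [w / v for w, v in zip(powers, variances)]
    total_precision = sum(precisions)
    if total_precision <= 0:
        raise ValueError("combined precision must be positive")
    # The combined mean is a precision-weighted average of the sub-posterior means.
    mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return mean, 1.0 / total_precision


# Hypothetical sub-posterior summaries from three machines.
means = [0.9, 1.2, 1.0]
variances = [0.04, 0.09, 0.05]

# All powers equal to 1 corresponds to the unpowered product of sub-posteriors.
unpowered = powered_gaussian_product(means, variances, [1.0, 1.0, 1.0])
# Unequal powers (arbitrary values here) shift weight between machines.
powered = powered_gaussian_product(means, variances, [1.5, 0.5, 1.0])
print("unpowered:", unpowered)
print("powered:  ", powered)
```

With non-negative powers, the combined mean always lies between the smallest and largest sub-posterior means, which is the one-dimensional analogue of the convex hull mentioned in the abstract.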

Suggested Citation

  • Li, Song & Tso, Geoffrey K.F. & Long, Lufan, 2017. "Powered embarrassing parallel MCMC sampling in Bayesian inference, a weighted average intuition," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 11-20.
  • Handle: RePEc:eee:csdana:v:115:y:2017:i:c:p:11-20
    DOI: 10.1016/j.csda.2017.05.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947317301007
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Mahani, Alireza S. & Sharabiani, Mansour T.A., 2015. "SIMD parallel MCMC sampling with applications for big-data Bayesian analytics," Computational Statistics & Data Analysis, Elsevier, vol. 88(C), pages 75-99.
    2. White, Gentry & Porter, Michael D., 2014. "GPU accelerated MCMC for modeling terrorist activity," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 643-651.

    Citations

    Cited by:

    1. Marissa Renardy & Tau-Mu Yi & Dongbin Xiu & Ching-Shan Chou, 2018. "Parameter uncertainty quantification using surrogate models applied to a spatial model of yeast mating polarization," PLOS Computational Biology, Public Library of Science, vol. 14(5), pages 1-26, May.
    2. Tsionas, Mike G., 2019. "Multi-objective optimization using statistical models," European Journal of Operational Research, Elsevier, vol. 276(1), pages 364-378.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Platzer & Thomas Reutterer, 2016. "Ticking Away the Moments: Timing Regularity Helps to Better Predict Customer Activity," Marketing Science, INFORMS, vol. 35(5), pages 779-799, September.
    2. Abpeykar, Shadi & Ghatee, Mehdi & Zare, Hadi, 2019. "Ensemble decision forest of RBF networks via hybrid feature clustering approach for high-dimensional data classification," Computational Statistics & Data Analysis, Elsevier, vol. 131(C), pages 12-36.
    3. Federico Palacios-González & Rosa M. García-Fernández, 2020. "A faster algorithm to estimate multiresolution densities," Computational Statistics, Springer, vol. 35(3), pages 1207-1230, September.
