
Polygenic Modeling with Bayesian Sparse Linear Mixed Models

Author

Listed:
  • Xiang Zhou
  • Peter Carbonetto
  • Matthew Stephens

Abstract

Both linear mixed models (LMMs) and sparse regression models are widely used in genetics applications, including, recently, polygenic modeling in genome-wide association studies. These two approaches make very different assumptions, so are expected to perform well in different situations. However, in practice, for a given dataset one typically does not know which assumptions will be more accurate. Motivated by this, we consider a hybrid of the two, which we refer to as a “Bayesian sparse linear mixed model” (BSLMM) that includes both these models as special cases. We address several key computational and statistical issues that arise when applying BSLMM, including appropriate prior specification for the hyper-parameters and a novel Markov chain Monte Carlo algorithm for posterior inference. We apply BSLMM and compare it with other methods for two polygenic modeling applications: estimating the proportion of variance in phenotypes explained (PVE) by available genotypes, and phenotype (or breeding value) prediction. For PVE estimation, we demonstrate that BSLMM combines the advantages of both standard LMMs and sparse regression modeling. For phenotype prediction it considerably outperforms either of the other two methods, as well as several other large-scale regression methods previously suggested for this problem. Software implementing our method is freely available from http://stephenslab.uchicago.edu/software.html.

Author Summary

The goal of polygenic modeling is to better understand the relationship between genetic variation and variation in observed characteristics, including variation in quantitative traits (e.g. cholesterol level in humans, milk production in cattle) and disease susceptibility. Improvements in polygenic modeling will help improve our understanding of this relationship and could ultimately lead to, for example, changes in clinical practice in humans or better breeding/mating strategies in agricultural programs. Polygenic models present important challenges, both at the modeling/statistical level (what modeling assumptions produce the best results) and at the computational level (how should these models be effectively fit to data). We develop novel approaches to help tackle both these challenges, and we demonstrate the gains in accuracy that result in both simulated and real data examples.
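
As a rough illustration of the PVE quantity named in the abstract, the sketch below simulates a phenotype whose genetic value mixes many small effects (the setting that suits an LMM) with a few large ones (the setting that suits sparse regression), then reports the realized ratio Var(genetic value) / Var(phenotype). It does not implement BSLMM or its MCMC sampler. The function names, the uniform 0/1/2 genotype coding, and all default parameter values are assumptions made for this example and do not come from the paper.

```python
import random


def variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def simulate_pve(n=500, p=200, n_large=5, sd_small=0.05, sd_large=0.5,
                 sd_noise=1.0, seed=0):
    """Simulate a phenotype and return the realized PVE.

    All names and defaults here are hypothetical illustration choices.
    """
    rng = random.Random(seed)
    # Assumed genotype matrix: n individuals x p markers, coded 0/1/2
    # and drawn uniformly (real allele frequencies would differ).
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(p)] for _ in range(n)]
    # Every marker gets a small effect (polygenic background) ...
    effects = [rng.gauss(0.0, sd_small) for _ in range(p)]
    # ... and a random few get an additional large effect (sparse part).
    for j in rng.sample(range(p), n_large):
        effects[j] += rng.gauss(0.0, sd_large)
    # Genetic value of each individual: sum of genotype * effect.
    genetic = [sum(x * b for x, b in zip(row, effects)) for row in genotypes]
    # Phenotype: genetic value plus independent Gaussian noise.
    phenotype = [g + rng.gauss(0.0, sd_noise) for g in genetic]
    return variance(genetic) / variance(phenotype)


if __name__ == "__main__":
    print(f"realized PVE: {simulate_pve():.3f}")
```

With `sd_noise=0.0` the phenotype equals the genetic value, so the ratio is 1. Increasing `sd_noise` lowers it. An estimator such as the one described in the abstract has to recover this ratio without observing the effects directly.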

Suggested Citation

  • Xiang Zhou & Peter Carbonetto & Matthew Stephens, 2013. "Polygenic Modeling with Bayesian Sparse Linear Mixed Models," PLOS Genetics, Public Library of Science, vol. 9(2), pages 1-14, February.
  • Handle: RePEc:plo:pgen00:1003264
    DOI: 10.1371/journal.pgen.1003264

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003264
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1003264&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1003264?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. Dominic Holland & Oleksandr Frei & Rahul Desikan & Chun-Chieh Fan & Alexey A Shadrin & Olav B Smeland & V S Sundar & Paul Thompson & Ole A Andreassen & Anders M Dale, 2020. "Beyond SNP heritability: Polygenicity and discoverability of phenotypes estimated with a univariate Gaussian mixture model," PLOS Genetics, Public Library of Science, vol. 16(5), pages 1-30, May.
    2. Heather E Wheeler & Kaanan P Shah & Jonathon Brenner & Tzintzuni Garcia & Keston Aquino-Michaels & GTEx Consortium & Nancy J Cox & Dan L Nicolae & Hae Kyung Im, 2016. "Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues," PLOS Genetics, Public Library of Science, vol. 12(11), pages 1-23, November.
    3. Lulu Shang & Wei Zhao & Yi Zhe Wang & Zheng Li & Jerome J. Choi & Minjung Kho & Thomas H. Mosley & Sharon L. R. Kardia & Jennifer A. Smith & Xiang Zhou, 2023. "meQTL mapping in the GENOA study reveals genetic determinants of DNA methylation in African Americans," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    4. Tavis Abrahamsen & James P. Hobert, 2019. "Fast Monte Carlo Markov chains for Bayesian shrinkage models with random effects," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 61-80.
    5. Yanyi Song & Xiang Zhou & Min Zhang & Wei Zhao & Yongmei Liu & Sharon L. R. Kardia & Ana V. Diez Roux & Belinda L. Needham & Jennifer A. Smith & Bhramar Mukherjee, 2020. "Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies," Biometrics, The International Biometric Society, vol. 76(3), pages 700-710, September.
    6. Yiming Hu & Qiongshi Lu & Ryan Powles & Xinwei Yao & Can Yang & Fang Fang & Xinran Xu & Hongyu Zhao, 2017. "Leveraging functional annotations in genetic risk prediction for human complex diseases," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-16, June.
    7. Niloy Biswas & Anirban Bhattacharya & Pierre E. Jacob & James E. Johndrow, 2022. "Coupling‐based convergence assessment of some Gibbs samplers for high‐dimensional Bayesian regression with shrinkage priors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 973-996, July.
    8. Saikat Banerjee & Lingyao Zeng & Heribert Schunkert & Johannes Söding, 2018. "Bayesian multiple logistic regression for case-control GWAS," PLOS Genetics, Public Library of Science, vol. 14(12), pages 1-27, December.
    9. Hui Li & Rahul Mazumder & Xihong Lin, 2023. "Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    10. Brieuc Lehmann & Maxine Mackintosh & Gil McVean & Chris Holmes, 2023. "Optimal strategies for learning multi-ancestry polygenic scores vary across traits," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    11. Yiming Hu & Qiongshi Lu & Wei Liu & Yuhua Zhang & Mo Li & Hongyu Zhao, 2017. "Joint modeling of genetically correlated diseases and functional annotations increases accuracy of polygenic risk prediction," PLOS Genetics, Public Library of Science, vol. 13(6), pages 1-22, June.
    12. Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
    13. Carla Márquez-Luna & Steven Gazal & Po-Ru Loh & Samuel S. Kim & Nicholas Furlotte & Adam Auton & Alkes L. Price, 2021. "Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    14. Christopher McMahan & James Baurley & William Bridges & Chase Joyner & Muhamad Fitra Kacamarga & Robert Lund & Carissa Pardamean & Bens Pardamean, 2017. "A Bayesian hierarchical model for identifying significant polygenic effects while controlling for confounding and repeated measures," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(5-6), pages 407-419, December.
    15. Cristina C. Bastias & Aurélien Estarague & Denis Vile & Elza Gaignon & Cheng-Ruei Lee & Moises Exposito-Alonso & Cyrille Violle & François Vasseur, 2024. "Ecological trade-offs drive phenotypic and genetic differentiation of Arabidopsis thaliana in Europe," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
