
Joint modeling of genetically correlated diseases and functional annotations increases accuracy of polygenic risk prediction

Author

Listed:
  • Yiming Hu
  • Qiongshi Lu
  • Wei Liu
  • Yuhua Zhang
  • Mo Li
  • Hongyu Zhao

Abstract

Accurate prediction of disease risk based on genetic factors is an important goal in human genetics research and precision medicine. Advanced prediction models will lead to more effective disease prevention and treatment strategies. Despite the identification of thousands of disease-associated genetic variants through genome-wide association studies (GWAS) in the past decade, accuracy of genetic risk prediction remains moderate for most diseases, largely because of the challenges in both identifying all the functionally relevant variants and accurately estimating their effect sizes. In this work, we introduce PleioPred, a principled framework that leverages pleiotropy and functional annotations in genetic risk prediction for complex diseases. PleioPred uses GWAS summary statistics as its input, and jointly models multiple genetically correlated diseases and a variety of external information, including linkage disequilibrium and diverse functional annotations, to increase the accuracy of risk prediction. Through comprehensive simulations and real data analyses on Crohn’s disease, celiac disease and type-II diabetes, we demonstrate that our approach can substantially increase the accuracy of polygenic risk prediction and risk population stratification. For example, PleioPred separates type-II diabetes patients with early and late onset ages significantly better, illustrating its potential clinical application. Furthermore, we show that the increment in prediction accuracy is significantly correlated with the genetic correlation between the predicted and jointly modeled diseases.

Author summary: Genetic risk prediction plays a significant role in precision medicine. Accurate prediction models could have great impact on disease prevention and treatment strategies. However, prediction accuracies for most complex diseases remain moderate, mainly due to the challenges in identifying and quantifying the effects of genetic variants from millions of markers, limited access to individual-level genotype data, and lack of efficient computational methods. Up to now, most methods have focused on predicting disease risk using data from a single trait. With the discovery of genetic correlations among many complex diseases, incorporating data from genetically correlated diseases has the potential to increase prediction accuracy. Current statistical methods are not able to fully exploit the richness of these kinds of data to take into account the shared genetic architecture. To make use of commonly available GWAS summary statistics, we propose a novel method to address these challenges by jointly modeling genetically correlated diseases and integrating genomic functional annotations. We demonstrate the substantial improvement in accuracy in both extensive simulation studies and real data analysis of Crohn’s disease, celiac disease and type-II diabetes. Furthermore, we show that the increment in prediction accuracy is significantly correlated with the genetic correlation between the predicted and jointly modeled diseases.
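For orientation, a polygenic risk score is commonly computed as a weighted sum of an individual's allele dosages, with per-variant effect sizes as the weights. The sketch below shows only that generic scoring step in Python; it is not PleioPred, which estimates the effect sizes themselves by jointly modeling correlated diseases and annotations. The function name, the effect sizes, and the genotypes are all hypothetical values chosen for illustration.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Return the weighted sum of allele dosages (0, 1 or 2 per variant)
    and per-variant effect sizes, i.e. a generic polygenic risk score."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical effect sizes for three variants (illustrative values only;
# in practice these would come from a method that processes GWAS summary
# statistics).
effect_sizes = [0.5, -0.25, 0.125]

# Hypothetical genotypes: number of effect alleles carried at each variant.
individuals = {
    "person_a": [2, 0, 1],
    "person_b": [0, 2, 0],
}

# Score every individual with the same set of weights.
scores = {
    name: polygenic_risk_score(genotype, effect_sizes)
    for name, genotype in individuals.items()
}

for name, score in scores.items():
    print(name, score)
```

Individuals with higher scores carry more risk-increasing alleles under the assumed weights; the quality of the prediction depends entirely on how well the effect sizes are estimated, which is the part the paper addresses.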

Suggested Citation

  • Yiming Hu & Qiongshi Lu & Wei Liu & Yuhua Zhang & Mo Li & Hongyu Zhao, 2017. "Joint modeling of genetically correlated diseases and functional annotations increases accuracy of polygenic risk prediction," PLOS Genetics, Public Library of Science, vol. 13(6), pages 1-22, June.
  • Handle: RePEc:plo:pgen00:1006836
    DOI: 10.1371/journal.pgen.1006836

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1006836
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1006836&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Xiang Zhou & Peter Carbonetto & Matthew Stephens, 2013. "Polygenic Modeling with Bayesian Sparse Linear Mixed Models," PLOS Genetics, Public Library of Science, vol. 9(2), pages 1-14, February.

    Citations



    Cited by:

    1. Jiacheng Miao & Hanmin Guo & Gefei Song & Zijie Zhao & Lin Hou & Qiongshi Lu, 2023. "Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    2. Shuang Song & Wei Jiang & Lin Hou & Hongyu Zhao, 2020. "Leveraging effect size distributions to improve polygenic risk scores derived from summary statistics of genome-wide association studies," PLOS Computational Biology, Public Library of Science, vol. 16(2), pages 1-18, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dominic Holland & Oleksandr Frei & Rahul Desikan & Chun-Chieh Fan & Alexey A Shadrin & Olav B Smeland & V S Sundar & Paul Thompson & Ole A Andreassen & Anders M Dale, 2020. "Beyond SNP heritability: Polygenicity and discoverability of phenotypes estimated with a univariate Gaussian mixture model," PLOS Genetics, Public Library of Science, vol. 16(5), pages 1-30, May.
    2. Yiming Hu & Qiongshi Lu & Ryan Powles & Xinwei Yao & Can Yang & Fang Fang & Xinran Xu & Hongyu Zhao, 2017. "Leveraging functional annotations in genetic risk prediction for human complex diseases," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-16, June.
    3. Carla Márquez-Luna & Steven Gazal & Po-Ru Loh & Samuel S. Kim & Nicholas Furlotte & Adam Auton & Alkes L. Price, 2021. "Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    4. Yanyi Song & Xiang Zhou & Min Zhang & Wei Zhao & Yongmei Liu & Sharon L. R. Kardia & Ana V. Diez Roux & Belinda L. Needham & Jennifer A. Smith & Bhramar Mukherjee, 2020. "Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies," Biometrics, The International Biometric Society, vol. 76(3), pages 700-710, September.
    5. Hui Li & Rahul Mazumder & Xihong Lin, 2023. "Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    6. McMahan Christopher & Baurley James & Bridges William & Joyner Chase & Kacamarga Muhamad Fitra & Lund Robert & Pardamean Carissa & Pardamean Bens, 2017. "A Bayesian hierarchical model for identifying significant polygenic effects while controlling for confounding and repeated measures," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 16(5-6), pages 407-419, December.
    7. Gao Wang & Abhishek Sarkar & Peter Carbonetto & Matthew Stephens, 2020. "A simple new approach to variable selection in regression, with application to genetic fine mapping," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1273-1300, December.
    8. Heather E Wheeler & Kaanan P Shah & Jonathon Brenner & Tzintzuni Garcia & Keston Aquino-Michaels & GTEx Consortium & Nancy J Cox & Dan L Nicolae & Hae Kyung Im, 2016. "Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues," PLOS Genetics, Public Library of Science, vol. 12(11), pages 1-23, November.
    9. Lulu Shang & Wei Zhao & Yi Zhe Wang & Zheng Li & Jerome J. Choi & Minjung Kho & Thomas H. Mosley & Sharon L. R. Kardia & Jennifer A. Smith & Xiang Zhou, 2023. "meQTL mapping in the GENOA study reveals genetic determinants of DNA methylation in African Americans," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    10. Abrahamsen, Tavis & Hobert, James P., 2019. "Fast Monte Carlo Markov chains for Bayesian shrinkage models with random effects," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 61-80.
    11. Cristina C. Bastias & Aurélien Estarague & Denis Vile & Elza Gaignon & Cheng-Ruei Lee & Moises Exposito-Alonso & Cyrille Violle & François Vasseur, 2024. "Ecological trade-offs drive phenotypic and genetic differentiation of Arabidopsis thaliana in Europe," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    12. Niloy Biswas & Anirban Bhattacharya & Pierre E. Jacob & James E. Johndrow, 2022. "Coupling‐based convergence assessment of some Gibbs samplers for high‐dimensional Bayesian regression with shrinkage priors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 973-996, July.
    13. Saikat Banerjee & Lingyao Zeng & Heribert Schunkert & Johannes Söding, 2018. "Bayesian multiple logistic regression for case-control GWAS," PLOS Genetics, Public Library of Science, vol. 14(12), pages 1-27, December.
    14. Brieuc Lehmann & Maxine Mackintosh & Gil McVean & Chris Holmes, 2023. "Optimal strategies for learning multi-ancestry polygenic scores vary across traits," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
