Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor

Authors

  • Gustavo de los Campos
  • Ana I Vazquez
  • Rohan Fernando
  • Yann C Klimentidis
  • Daniel Sorensen

Abstract

Despite important advances from Genome-Wide Association Studies (GWAS), for most complex human traits and diseases a sizable proportion of genetic variance remains unexplained and prediction accuracy (PA) is usually low. Evidence suggests that PA can be improved using Whole-Genome Regression (WGR) models, where phenotypes are regressed on hundreds of thousands of variants simultaneously. The Genomic Best Linear Unbiased Prediction (G-BLUP, a ridge-regression type method) is a commonly used WGR method and has shown good predictive performance when applied to plant and animal breeding populations. However, breeding and human populations differ greatly in a number of factors that can affect the predictive performance of G-BLUP. Using theory, simulations, and real data analysis, we study the performance of G-BLUP when applied to data from related and unrelated human subjects. Under perfect linkage disequilibrium (LD) between markers and QTL, the prediction R-squared (R²) of G-BLUP reaches trait heritability asymptotically. However, under imperfect LD between markers and QTL, prediction R² based on G-BLUP has a much lower upper bound. We show that the minimum decrease in prediction accuracy caused by imperfect LD between markers and QTL is given by (1−b)², where b is the regression of marker-derived genomic relationships on those realized at causal loci. For pairs of related individuals, due to within-family disequilibrium, the patterns of realized genomic similarity are similar across the genome; therefore b is close to one, inducing a small decrease in R². However, with distantly related individuals, b reaches very low values, imposing a very low upper bound on prediction R². Our simulations suggest that for the analysis of data from unrelated individuals, the asymptotic upper bound on R² may be of the order of 20% of the trait heritability. We show how PA can be enhanced with the use of variable selection or differential shrinkage of estimates of marker effects.

Author Summary

Despite great advances in genotyping technologies, the ability to predict complex traits and diseases remains limited. Increasing evidence suggests that many of these traits may be affected by a large number of small-effect genes that are difficult to detect in single-variant association studies. Whole-Genome Regression (WGR) methods can be used to confront this challenge and have exhibited good predictive power when applied to animal and plant breeding populations. WGR is receiving increased attention in the field of human genetics. However, human and breeding populations differ greatly in factors that can affect the performance of WGRs. Using theory, simulation and real data analysis, we study the predictive performance of the Genomic Best Linear Unbiased Predictor (G-BLUP), one of the most commonly used WGR methods. We derive upper bounds for the prediction accuracy of G-BLUP under perfect and imperfect LD between markers and genotypes at causal loci, and validate these upper bounds using simulation and real data analysis. Imperfect LD between markers and causal loci can impose a very low upper bound on the prediction accuracy of G-BLUP, especially when data involve unrelated individuals. In this context, we propose and evaluate avenues for improving the predictive performance of G-BLUP.
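The quantity b described in the abstract (the regression of marker-derived genomic relationships on those realized at causal loci) can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' code: it assumes genomic relationships are computed as average cross-products of standardized allele counts (a common convention, not confirmed by this page), it simulates independent loci in unrelated individuals, and all function names and sample sizes are hypothetical. It contrasts two toy extremes: markers identical to the causal loci, and markers independent of them.

```python
import random


def standardize(column):
    """Center and scale one variant's allele counts to mean 0, variance 1."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    if var == 0:
        return [0.0] * n  # a monomorphic variant carries no information
    sd = var ** 0.5
    return [(x - mean) / sd for x in column]


def pairwise_relationships(genotypes):
    """Off-diagonal genomic relationships for all pairs of individuals.

    genotypes[i][j] is the allele count (0, 1 or 2) of individual i at
    variant j. ASSUMPTION: relationships are average cross-products of
    standardized genotypes (G = ZZ'/p).
    """
    n, p = len(genotypes), len(genotypes[0])
    cols = [standardize([row[j] for row in genotypes]) for j in range(p)]
    return [sum(c[i] * c[k] for c in cols) / p
            for i in range(n) for k in range(i + 1, n)]


def regression_slope(y, x):
    """Ordinary least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_genotypes(rng, n, p):
    """Hypothetical simulator: n unrelated individuals, p independent loci."""
    freqs = [rng.uniform(0.1, 0.9) for _ in range(p)]
    return [[(rng.random() < f) + (rng.random() < f) for f in freqs]
            for _ in range(n)]


rng = random.Random(0)
n = 60
causal = simulate_genotypes(rng, n, 50)          # genotypes at causal loci
markers_no_ld = simulate_genotypes(rng, n, 200)  # markers independent of them

rel_causal = pairwise_relationships(causal)

# Extreme 1: markers are the causal loci themselves (perfect LD).
b_perfect = regression_slope(pairwise_relationships(causal), rel_causal)
# Extreme 2: markers carry no information about the causal loci.
b_none = regression_slope(pairwise_relationships(markers_no_ld), rel_causal)

for label, b in (("perfect LD", b_perfect), ("no LD", b_none)):
    print(f"{label}: b = {b:.3f}, (1 - b)^2 = {(1 - b) ** 2:.3f}")
```

In the first case b equals one and the (1−b)² term vanishes; in the second, b is near zero and the term is close to one. Real data from unrelated individuals fall between these extremes, which is the regime the abstract describes as imposing a low upper bound on prediction R².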

Suggested Citation

  • Gustavo de los Campos & Ana I Vazquez & Rohan Fernando & Yann C Klimentidis & Daniel Sorensen, 2013. "Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor," PLOS Genetics, Public Library of Science, vol. 9(7), pages 1-15, July.
  • Handle: RePEc:plo:pgen00:1003608
    DOI: 10.1371/journal.pgen.1003608

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003608
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1003608&type=printable
    Download Restriction: no

    Citations

    Cited by:

    1. Daniel Gianola & Kent A Weigel & Nicole Krämer & Alessandra Stella & Chris-Carolin Schön, 2014. "Enhancing Genome-Enabled Prediction by Bagging Genomic BLUP," PLOS ONE, Public Library of Science, vol. 9(4), pages 1-18, April.
    2. Gustavo de los Campos & Daniel Sorensen & Daniel Gianola, 2015. "Genomic Heritability: What Is It?," PLOS Genetics, Public Library of Science, vol. 11(5), pages 1-21, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruixue Fan & Shaw-Hwa Lo, 2013. "A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-14, December.
    2. Dongjun Chung & Can Yang & Cong Li & Joel Gelernter & Hongyu Zhao, 2014. "GPA: A Statistical Approach to Prioritizing GWAS Results by Integrating Pleiotropy and Annotation," PLOS Genetics, Public Library of Science, vol. 10(11), pages 1-14, November.
    3. von Stumm, Sophie & Kandaswamy, Radhika & Maxwell, Jessye, 2023. "Gene-environment interplay in early life cognitive development," Intelligence, Elsevier, vol. 98(C).
    4. Diana Chang & Alon Keinan, 2012. "Predicting Signatures of “Synthetic Associations” and “Natural Associations” from Empirical Patterns of Human Genetic Variation," PLOS Computational Biology, Public Library of Science, vol. 8(7), pages 1-9, July.
    5. Yunpeng Wang & Arne B Gjuvsland & Jon Olav Vik & Nicolas P Smith & Peter J Hunter & Stig W Omholt, 2012. "Parameters in Dynamic Models of Complex Traits are Containers of Missing Heritability," PLOS Computational Biology, Public Library of Science, vol. 8(4), pages 1-9, April.
    6. Noah Zaitlen & Peter Kraft & Nick Patterson & Bogdan Pasaniuc & Gaurav Bhatia & Samuela Pollack & Alkes L Price, 2013. "Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits," PLOS Genetics, Public Library of Science, vol. 9(5), pages 1-11, May.
    7. Yoshitaka Nagamine & Ricardo Pong-Wong & Pau Navarro & Veronique Vitart & Caroline Hayward & Igor Rudan & Harry Campbell & James Wilson & Sarah Wild & Andrew A Hicks & Peter P Pramstaller & Nicholas H, 2012. "Localising Loci underlying Complex Trait Variation Using Regional Genomic Relationship Mapping," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-12, October.
    8. Lucas Alvizi & Diogo Nani & Luciano Abreu Brito & Gerson Shigeru Kobayashi & Maria Rita Passos-Bueno & Roberto Mayor, 2023. "Neural crest E-cadherin loss drives cleft lip/palate by epigenetic modulation via pro-inflammatory gene–environment interaction," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    9. Young Lee & Suyeon Park & Sanghoon Moon & Juyoung Lee & Robert C. Elston & Woojoo Lee & Sungho Won, 2014. "On the Analysis of a Repeated Measure Design in Genome-Wide Association Analysis," IJERPH, MDPI, vol. 11(12), pages 1-21, November.
    10. C Ryan King & Paul J Rathouz & Dan L Nicolae, 2010. "An Evolutionary Framework for Association Testing in Resequencing Studies," PLOS Genetics, Public Library of Science, vol. 6(11), pages 1-11, November.
    11. Chuong B Do & David A Hinds & Uta Francke & Nicholas Eriksson, 2012. "Comparison of Family History and SNPs for Predicting Risk of Complex Disease," PLOS Genetics, Public Library of Science, vol. 8(10), pages 1-16, October.
    12. Kevin R Thornton & Andrew J Foran & Anthony D Long, 2013. "Properties and Modeling of GWAS when Complex Disease Risk Is Due to Non-Complementing, Deleterious Mutations in Genes of Large Effect," PLOS Genetics, Public Library of Science, vol. 9(2), pages 1-14, February.
    13. Michiel Vanneste & Hanne Hoskens & Seppe Goovaerts & Harold Matthews & Jay Devine & Jose D. Aponte & Joanne Cole & Mark Shriver & Mary L. Marazita & Seth M. Weinberg & Susan Walsh & Stephen Richmond &, 2024. "Syndrome-informed phenotyping identifies a polygenic background for achondroplasia-like facial variation in the general population," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    14. Ilias Georgakopoulos-Soares & Chengyu Deng & Vikram Agarwal & Candace S. Y. Chan & Jingjing Zhao & Fumitaka Inoue & Nadav Ahituv, 2023. "Transcription factor binding site orientation and order are major drivers of gene regulatory activity," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    15. Katherine Carbeck & Peter Arcese & Irby Lovette & Christin Pruett & Kevin Winker & Jennifer Walsh, 2023. "Candidate genes under selection in song sparrows co-vary with climate and body mass in support of Bergmann’s Rule," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    16. Iuliana Ionita-Laza & Joseph D Buxbaum & Nan M Laird & Christoph Lange, 2011. "A New Testing Strategy to Identify Rare Variants with Either Risk or Protective Effect on Disease," PLOS Genetics, Public Library of Science, vol. 7(2), pages 1-6, February.
    17. Wang, Fan & Puentes, Esteban & Behrman, Jere R. & Cunha, Flávio, 2024. "You are what your parents expect: Height and local reference points," Journal of Econometrics, Elsevier, vol. 243(1).
    18. Aida Bianco & Eusebio Chiefari & Carmelo G A Nobile & Daniela Foti & Maria Pavia & Antonio Brunetti, 2015. "The Association between HMGA1 rs146052672 Variant and Type 2 Diabetes: A Transethnic Meta-Analysis," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-15, August.
    19. Zhongshang Yuan & Hong Liu & Xiaoshuai Zhang & Fangyu Li & Jinghua Zhao & Furen Zhang & Fuzhong Xue, 2013. "From Interaction to Co-Association —A Fisher r-To-z Transformation-Based Simple Statistic for Real World Genome-Wide Association Study," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-8, July.
    20. Yumei Yang & Qishan Wang & Qiang Chen & Rongrong Liao & Xiangzhe Zhang & Hongjie Yang & Youmin Zheng & Zhiwu Zhang & Yuchun Pan, 2014. "A New Genotype Imputation Method with Tolerance to High Missing Rate and Rare Variants," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-7, June.
