
GPA: A Statistical Approach to Prioritizing GWAS Results by Integrating Pleiotropy and Annotation

Author

Listed:
  • Dongjun Chung
  • Can Yang
  • Cong Li
  • Joel Gelernter
  • Hongyu Zhao

Abstract

Results from Genome-Wide Association Studies (GWAS) have shown that complex diseases are often affected by many genetic variants with small or moderate effects. Identifying these risk variants remains a very challenging problem. There is a need to develop more powerful statistical methods that leverage available information to improve upon traditional approaches, which focus on a single GWAS dataset without incorporating additional data. In this paper, we propose a novel statistical approach, GPA (Genetic analysis incorporating Pleiotropy and Annotation), to increase statistical power to identify risk variants through joint analysis of multiple GWAS datasets and annotation information, because: (1) accumulating evidence suggests that different complex diseases share common risk bases, i.e., pleiotropy; and (2) functionally annotated variants have been consistently demonstrated to be enriched among GWAS hits. GPA can integrate multiple GWAS datasets and functional annotations to seek association signals, and it can also perform hypothesis tests for the presence of pleiotropy and for enrichment of functional annotation. Statistical inference of the model parameters and SNP ranking is achieved through an EM algorithm that can handle genome-wide markers efficiently. When we applied GPA to jointly analyze five psychiatric disorders with annotation information, not only did GPA identify many weak signals missed by the traditional single-phenotype analysis, but it also revealed relationships in the genetic architecture of these disorders. Using our hypothesis testing framework, statistically significant pleiotropic effects were detected among these psychiatric disorders, and the markers annotated in the central nervous system genes and eQTLs from the Genotype-Tissue Expression (GTEx) database were significantly enriched. We also applied GPA to a bladder cancer GWAS dataset with the ENCODE DNase-seq data from 125 cell lines. GPA was able to detect cell lines that are biologically more relevant to bladder cancer. The R implementation of GPA is currently available at http://dongjunchung.github.io/GPA/.

Author Summary: In the past 10 years, many genome-wide association studies (GWAS) have been conducted to identify the genetic bases of complex human traits. As of January 2014, more than 12,000 single-nucleotide polymorphisms (SNPs) have been reported to be significantly associated with at least one complex trait/disease. On one hand, about 85% of identified risk variants are located in non-coding regions, which motivates a systematic understanding of the function of non-coding variants in regulatory elements in the human genome. On the other hand, complex diseases are often affected by many genetic variants with small or moderate effects. To address these issues, we propose a statistical approach, GPA, to integrate information from multiple GWAS datasets and functional annotation. Notably, our approach only requires marker-wise p-values as input, making it especially useful when only summary statistics, instead of the full genotype and phenotype data, are available. We applied GPA to analyze GWAS datasets of five psychiatric disorders and bladder cancer, where the central nervous system genes, eQTLs from the Genotype-Tissue Expression (GTEx), and the ENCODE DNase-seq data from 125 cell lines were used as functional annotation. The analysis results suggest that GPA is an effective method for integrative data analysis in the post-GWAS era.
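
To make the idea of EM-based inference from marker-wise p-values concrete, here is a minimal Python sketch for a single GWAS. It is not the GPA implementation and does not handle pleiotropy or annotation. It assumes, purely for illustration, a two-group mixture in which null p-values are Uniform(0, 1) and non-null p-values follow Beta(alpha, 1) with 0 < alpha < 1; the abstract does not state the model, and the function names (simulate_p_values, fit_two_group_em, posterior_non_null) are hypothetical.

```python
import math
import random


def simulate_p_values(n, pi1, alpha, seed=0):
    """Draw n p-values from the assumed two-group mixture (illustration only)."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], so never exactly 0
        if rng.random() < pi1:
            # Inverse-CDF draw from Beta(alpha, 1): CDF is p**alpha.
            p_values.append(u ** (1.0 / alpha))
        else:
            p_values.append(u)  # null marker: Uniform(0, 1)
    return p_values


def fit_two_group_em(p_values, pi1=0.1, alpha=0.5, n_iter=200):
    """EM for the mixture (1 - pi1) * Uniform(0, 1) + pi1 * Beta(alpha, 1)."""
    log_p = [math.log(max(p, 1e-300)) for p in p_values]
    n = len(log_p)
    for _ in range(n_iter):
        # E-step: posterior probability that each marker is non-null.
        post = []
        for lp in log_p:
            f1 = pi1 * alpha * math.exp((alpha - 1.0) * lp)
            post.append(f1 / (f1 + (1.0 - pi1)))
        # M-step: closed-form updates for pi1 and alpha.
        total = sum(post)
        pi1 = total / n
        weighted_log = sum(z * lp for z, lp in zip(post, log_p))
        if weighted_log < 0.0:
            # Weighted MLE of alpha, clamped to (0, 1) (assumed constraint).
            alpha = min(max(-total / weighted_log, 1e-3), 0.999)
    return pi1, alpha


def posterior_non_null(p, pi1, alpha):
    """Posterior probability of association for one p-value; used for ranking."""
    f1 = pi1 * alpha * max(p, 1e-300) ** (alpha - 1.0)
    return f1 / (f1 + (1.0 - pi1))
```

For example, calling fit_two_group_em(simulate_p_values(10000, 0.2, 0.2, seed=1)) should return estimates close to the simulated values, and markers can then be ranked by posterior_non_null. Because alpha is kept below 1, that posterior decreases as the p-value grows, so within a single study the ranking matches the p-value ranking; any gain in power would have to come from additional studies or annotation, which this sketch leaves out.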

Suggested Citation

  • Dongjun Chung & Can Yang & Cong Li & Joel Gelernter & Hongyu Zhao, 2014. "GPA: A Statistical Approach to Prioritizing GWAS Results by Integrating Pleiotropy and Annotation," PLOS Genetics, Public Library of Science, vol. 10(11), pages 1-14, November.
  • Handle: RePEc:plo:pgen00:1004787
    DOI: 10.1371/journal.pgen.1004787

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004787
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1004787&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1004787?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gustavo de los Campos & Ana I Vazquez & Rohan Fernando & Yann C Klimentidis & Daniel Sorensen, 2013. "Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor," PLOS Genetics, Public Library of Science, vol. 9(7), pages 1-15, July.
    2. Ruixue Fan & Shaw-Hwa Lo, 2013. "A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-14, December.
    3. von Stumm, Sophie & Kandaswamy, Radhika & Maxwell, Jessye, 2023. "Gene-environment interplay in early life cognitive development," Intelligence, Elsevier, vol. 98(C).
    4. Diana Chang & Alon Keinan, 2012. "Predicting Signatures of “Synthetic Associations” and “Natural Associations” from Empirical Patterns of Human Genetic Variation," PLOS Computational Biology, Public Library of Science, vol. 8(7), pages 1-9, July.
    5. Noah Zaitlen & Peter Kraft & Nick Patterson & Bogdan Pasaniuc & Gaurav Bhatia & Samuela Pollack & Alkes L Price, 2013. "Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits," PLOS Genetics, Public Library of Science, vol. 9(5), pages 1-11, May.
    6. Yoshitaka Nagamine & Ricardo Pong-Wong & Pau Navarro & Veronique Vitart & Caroline Hayward & Igor Rudan & Harry Campbell & James Wilson & Sarah Wild & Andrew A Hicks & Peter P Pramstaller & Nicholas H, 2012. "Localising Loci underlying Complex Trait Variation Using Regional Genomic Relationship Mapping," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-12, October.
    7. Lucas Alvizi & Diogo Nani & Luciano Abreu Brito & Gerson Shigeru Kobayashi & Maria Rita Passos-Bueno & Roberto Mayor, 2023. "Neural crest E-cadherin loss drives cleft lip/palate by epigenetic modulation via pro-inflammatory gene–environment interaction," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    8. Young Lee & Suyeon Park & Sanghoon Moon & Juyoung Lee & Robert C. Elston & Woojoo Lee & Sungho Won, 2014. "On the Analysis of a Repeated Measure Design in Genome-Wide Association Analysis," IJERPH, MDPI, vol. 11(12), pages 1-21, November.
    9. C Ryan King & Paul J Rathouz & Dan L Nicolae, 2010. "An Evolutionary Framework for Association Testing in Resequencing Studies," PLOS Genetics, Public Library of Science, vol. 6(11), pages 1-11, November.
    10. Chuong B Do & David A Hinds & Uta Francke & Nicholas Eriksson, 2012. "Comparison of Family History and SNPs for Predicting Risk of Complex Disease," PLOS Genetics, Public Library of Science, vol. 8(10), pages 1-16, October.
    11. Kevin R Thornton & Andrew J Foran & Anthony D Long, 2013. "Properties and Modeling of GWAS when Complex Disease Risk Is Due to Non-Complementing, Deleterious Mutations in Genes of Large Effect," PLOS Genetics, Public Library of Science, vol. 9(2), pages 1-14, February.
    12. Michiel Vanneste & Hanne Hoskens & Seppe Goovaerts & Harold Matthews & Jay Devine & Jose D. Aponte & Joanne Cole & Mark Shriver & Mary L. Marazita & Seth M. Weinberg & Susan Walsh & Stephen Richmond &, 2024. "Syndrome-informed phenotyping identifies a polygenic background for achondroplasia-like facial variation in the general population," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    13. Ilias Georgakopoulos-Soares & Chengyu Deng & Vikram Agarwal & Candace S. Y. Chan & Jingjing Zhao & Fumitaka Inoue & Nadav Ahituv, 2023. "Transcription factor binding site orientation and order are major drivers of gene regulatory activity," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    14. Katherine Carbeck & Peter Arcese & Irby Lovette & Christin Pruett & Kevin Winker & Jennifer Walsh, 2023. "Candidate genes under selection in song sparrows co-vary with climate and body mass in support of Bergmann’s Rule," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    15. Iuliana Ionita-Laza & Joseph D Buxbaum & Nan M Laird & Christoph Lange, 2011. "A New Testing Strategy to Identify Rare Variants with Either Risk or Protective Effect on Disease," PLOS Genetics, Public Library of Science, vol. 7(2), pages 1-6, February.
    16. Wang, Fan & Puentes, Esteban & Behrman, Jere R. & Cunha, Flávio, 2024. "You are what your parents expect: Height and local reference points," Journal of Econometrics, Elsevier, vol. 243(1).
    17. Aida Bianco & Eusebio Chiefari & Carmelo G A Nobile & Daniela Foti & Maria Pavia & Antonio Brunetti, 2015. "The Association between HMGA1 rs146052672 Variant and Type 2 Diabetes: A Transethnic Meta-Analysis," PLOS ONE, Public Library of Science, vol. 10(8), pages 1-15, August.
    18. Zhongshang Yuan & Hong Liu & Xiaoshuai Zhang & Fangyu Li & Jinghua Zhao & Furen Zhang & Fuzhong Xue, 2013. "From Interaction to Co-Association —A Fisher r-To-z Transformation-Based Simple Statistic for Real World Genome-Wide Association Study," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-8, July.
    19. Yumei Yang & Qishan Wang & Qiang Chen & Rongrong Liao & Xiangzhe Zhang & Hongjie Yang & Youmin Zheng & Zhiwu Zhang & Yuchun Pan, 2014. "A New Genotype Imputation Method with Tolerance to High Missing Rate and Rare Variants," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-7, June.
    20. Chung-Feng Kao & Jia-Rou Liu & Hung Hung & Po-Hsiu Kuo, 2015. "A Robust GWSS Method to Simultaneously Detect Rare and Common Variants for Complex Disease," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-14, April.
