
The Power of Gene-Based Rare Variant Methods to Detect Disease-Associated Variation and Test Hypotheses About Complex Disease

Author

Listed:
  • Loukas Moutsianas
  • Vineeta Agarwala
  • Christian Fuchsberger
  • Jason Flannick
  • Manuel A Rivas
  • Kyle J Gaulton
  • Patrick K Albers
  • GoT2D Consortium
  • Gil McVean
  • Michael Boehnke
  • David Altshuler
  • Mark I McCarthy

Abstract

Genome and exome sequencing in large cohorts enables characterization of the role of rare variation in complex diseases. Success in this endeavor, however, requires investigators to test a diverse array of genetic hypotheses which differ in the number, frequency and effect sizes of underlying causal variants. In this study, we evaluated the power of gene-based association methods to interrogate such hypotheses, and examined the implications for study design. We developed a flexible simulation approach, using 1000 Genomes data, to (a) generate sequence variation at human genes in up to 10K case-control samples, and (b) quantify the statistical power of a panel of widely used gene-based association tests under a variety of allelic architectures, locus effect sizes, and significance thresholds. For loci explaining ~1% of phenotypic variance underlying a common dichotomous trait, we find that all methods have low absolute power to achieve exome-wide significance (~5-20% power at α = 2.5×10⁻⁶) in 3K individuals; even in 10K samples, power is modest (~60%). The combined application of multiple methods increases sensitivity, but does so at the expense of a higher false positive rate. MiST, SKAT-O, and KBAC have the highest individual mean power across simulated datasets, but we observe wide architecture-dependent variability in the individual loci detected by each test, suggesting that inferences about disease architecture from analysis of sequencing studies can differ depending on which methods are used. Our results imply that tens of thousands of individuals, extensive functional annotation, or highly targeted hypothesis testing will be required to confidently detect or exclude rare variant signals at complex disease loci.

Author Summary

Re-sequencing technologies allow for a more complete interrogation of the role of human variation in complex disease. The inadequate power of single variant methods to assess the role of less common variation has led to the development of numerous statistical methods for testing aggregate groups of variants for association with disease. Such endeavors pose substantial analytical challenges, however, due to the diverse array of genetic hypotheses that need to be considered. In this work, we systematically quantify and compare the performance of a panel of commonly used gene-based association methods under a range of allelic architectures, significance thresholds, locus effect sizes, sample sizes, and filters for neutral variation. We find that MiST, SKAT-O, and KBAC have the highest mean power across simulated datasets. Across all methods, however, the power to detect even loci of relatively large effect is very low at exome-wide significance thresholds for sample sizes comparable with those of ongoing sequencing studies; as such, the absence of signal in studies of a few thousand individuals does not exclude a role for rare variation in complex traits. Finally, we directly compare the results reported by different gene-based methods in order to identify their comparative advantages and disadvantages under distinct locus architectures. Our findings have implications for meaningful interpretation of both positive and negative findings in ongoing and future sequencing studies.
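
The idea of estimating power by simulation at an exome-wide threshold can be illustrated with a toy example. The sketch below is not the authors' simulation framework, which generates sequence variation from 1000 Genomes data and evaluates tests such as MiST, SKAT-O, and KBAC. It assumes a much simpler setup: each individual is reduced to a carrier/non-carrier indicator for a gene, and a two-proportion z-test stands in for a collapsing burden test. The function names, carrier frequencies, and replicate counts are all hypothetical. The threshold 2.5×10⁻⁶ is treated here as a Bonferroni correction of 0.05 over roughly 20,000 genes, which is the conventional reading of that value.

```python
import math
import random


def exome_wide_alpha(family_alpha=0.05, n_genes=20_000):
    """Bonferroni-corrected per-gene threshold (assumes ~20,000 genes tested)."""
    return family_alpha / n_genes


def carrier_count(n, carrier_prob, rng):
    """Draw the number of rare-variant carriers among n individuals."""
    return sum(rng.random() < carrier_prob for _ in range(n))


def burden_p_value(case_carriers, control_carriers, n_cases, n_controls):
    """Two-sided p-value from a pooled two-proportion z-test on carrier counts.

    This is a stand-in for a collapsing burden test, not any of the
    methods evaluated in the paper.
    """
    pooled = (case_carriers + control_carriers) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    if se == 0:
        return 1.0  # no carriers (or all carriers): no evidence either way
    z = (case_carriers / n_cases - control_carriers / n_controls) / se
    return math.erfc(abs(z) / math.sqrt(2))


def estimate_power(n_cases, n_controls, p_case, p_control, alpha,
                   n_reps=200, seed=1):
    """Fraction of simulated studies in which the test reaches significance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        a = carrier_count(n_cases, p_case, rng)
        b = carrier_count(n_controls, p_control, rng)
        if burden_p_value(a, b, n_cases, n_controls) < alpha:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    alpha = exome_wide_alpha()
    # Hypothetical carrier frequencies: 3% in cases, 2% in controls.
    for n_total in (3_000, 10_000):
        half = n_total // 2
        power = estimate_power(half, half, 0.03, 0.02, alpha)
        print(f"N={n_total}: estimated power {power:.2f}")
```

Running the script prints an estimated power for balanced case-control samples of 3K and 10K individuals under the assumed carrier frequencies. The numbers depend entirely on those assumptions and should not be compared with the figures reported in the abstract.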

Suggested Citation

  • Loukas Moutsianas & Vineeta Agarwala & Christian Fuchsberger & Jason Flannick & Manuel A Rivas & Kyle J Gaulton & Patrick K Albers & GoT2D Consortium & Gil McVean & Michael Boehnke & David Altshuler &, 2015. "The Power of Gene-Based Rare Variant Methods to Detect Disease-Associated Variation and Test Hypotheses About Complex Disease," PLOS Genetics, Public Library of Science, vol. 11(4), pages 1-24, April.
  • Handle: RePEc:plo:pgen00:1005165
    DOI: 10.1371/journal.pgen.1005165

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005165
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1005165&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Benjamin M Neale & Manuel A Rivas & Benjamin F Voight & David Altshuler & Bernie Devlin & Marju Orho-Melander & Sekar Kathiresan & Shaun M Purcell & Kathryn Roeder & Mark J Daly, 2011. "Testing for an Unusual Distribution of Rare Variants," PLOS Genetics, Public Library of Science, vol. 7(3), pages 1-8, March.
    2. Wenqing Fu & Timothy D. O’Connor & Goo Jun & Hyun Min Kang & Goncalo Abecasis & Suzanne M. Leal & Stacey Gabriel & Mark J. Rieder & David Altshuler & Jay Shendure & Deborah A. Nickerson & Michael J. B, 2013. "Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants," Nature, Nature, vol. 493(7431), pages 216-220, January.
    3. Dajiang J Liu & Suzanne M Leal, 2010. "A Novel Adaptive Method for the Analysis of Next-Generation Sequencing Data to Detect Complex Trait Associations with Rare Variants Due to Gene Main Effects and Interactions," PLOS Genetics, Public Library of Science, vol. 6(10), pages 1-14, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chung-Feng Kao & Jia-Rou Liu & Hung Hung & Po-Hsiu Kuo, 2015. "A Robust GWSS Method to Simultaneously Detect Rare and Common Variants for Complex Disease," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-14, April.
    2. Elodie Persyn & Richard Redon & Lise Bellanger & Christian Dina, 2018. "The impact of a fine-scale population stratification on rare variant association test results," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-17, December.
    3. Wan-Yu Lin & Xiang-Yang Lou & Guimin Gao & Nianjun Liu, 2014. "Rare Variant Association Testing by Adaptive Combination of P-values," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-7, January.
    4. Silviu-Alin Bacanu & Matthew R Nelson & John C Whittaker, 2012. "Comparison of Statistical Tests for Association between Rare Variants and Binary Traits," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-7, August.
    5. ChangJiang Xu & Martin Ladouceur & Zari Dastani & J Brent Richards & Antonio Ciampi & Celia M T Greenwood, 2012. "Multiple Regression Methods Show Great Potential for Rare Variant Association Tests," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-10, August.
    6. Ruixue Fan & Shaw-Hwa Lo, 2013. "A Robust Model-free Approach for Rare Variants Association Studies Incorporating Gene-Gene and Gene-Environmental Interactions," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-14, December.
    7. Mathurin Dorel & Bertram Klinger & Tommaso Mari & Joern Toedling & Eric Blanc & Clemens Messerschmidt & Michal Nadler-Holly & Matthias Ziehm & Anja Sieber & Falk Hertwig & Dieter Beule & Angelika Egge, 2021. "Neuroblastoma signalling models unveil combination therapies targeting feedback-mediated resistance," PLOS Computational Biology, Public Library of Science, vol. 17(11), pages 1-26, November.
    8. Wenjing Qi & Andrew S Allen & Yi-Ju Li, 2019. "Family-based association tests for rare variants with censored traits," PLOS ONE, Public Library of Science, vol. 14(1), pages 1-17, January.
    9. María Soler Artigas & Louise V Wain & Nick Shrine & Tricia M McKeever & UK BiLEVE & Ian Sayers & Ian P Hall & Martin D Tobin, 2017. "Targeted Sequencing of Lung Function Loci in Chronic Obstructive Pulmonary Disease Cases and Controls," PLOS ONE, Public Library of Science, vol. 12(1), pages 1-17, January.
    10. Abhishek Niroula & Mauno Vihinen, 2019. "How good are pathogenicity predictors in detecting benign variants?," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-17, February.
    11. Jaleal S Sanjak & Anthony D Long & Kevin R Thornton, 2017. "A Model of Compound Heterozygous, Loss-of-Function Alleles Is Broadly Consistent with Observations from Complex-Disease GWAS Datasets," PLOS Genetics, Public Library of Science, vol. 13(1), pages 1-30, January.
    12. Zheng Xu & Song Yan & Cong Wu & Qing Duan & Sixia Chen & Yun Li, 2023. "Next-Generation Sequencing Data-Based Association Testing of a Group of Genetic Markers for Complex Responses Using a Generalized Linear Model Framework," Mathematics, MDPI, vol. 11(11), pages 1-28, June.
    13. Ruth Greenblatt & Peter Bacchetti & Ross Boylan & Kord Kober & Gayle Springer & Kathryn Anastos & Michael Busch & Mardge Cohen & Seble Kassaye & Deborah Gustafson & Bradley Aouizerat & on behalf of th, 2019. "Genetic and clinical predictors of CD4 lymphocyte recovery during suppressive antiretroviral therapy: Whole exome sequencing and antiretroviral therapy response phenotypes," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-25, August.
    14. Thomas Beery & K. Ingemar Jönsson & Johan Elmberg, 2015. "From Environmental Connectedness to Sustainable Futures: Topophilia and Human Affiliation with Nature," Sustainability, MDPI, vol. 7(7), pages 1-18, July.
    15. Nanye Long & Samuel P Dickson & Jessica M Maia & Hee Shin Kim & Qianqian Zhu & Andrew S Allen, 2013. "Leveraging Prior Information to Detect Causal Variants via Multi-Variant Regression," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-11, June.
    16. Xinge Jessie Jeng & Zhongyin John Daye & Wenbin Lu & Jung-Ying Tzeng, 2016. "Rare Variants Association Analysis in Large-Scale Sequencing Studies at the Single Locus Level," PLOS Computational Biology, Public Library of Science, vol. 12(6), pages 1-23, June.
    17. Weihua Guan & Chun Li, 2014. "Design of DNA Pooling to Allow Incorporation of Covariates in Rare Variants Analysis," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-16, December.
    18. Wan-Yu Lin, 2014. "Adaptive Combination of P-Values for Family-Based Association Testing with Sequence Data," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-16, December.
    19. Melfi, Andrew & Viswanath, Divakar, 2018. "The Wright–Fisher site frequency spectrum as a perturbation of the coalescent’s," Theoretical Population Biology, Elsevier, vol. 124(C), pages 81-92.
    20. Cameron Palmer & Itsik Pe’er, 2016. "Bias Characterization in Probabilistic Genotype Data and Improved Signal Detection with Multiple Imputation," PLOS Genetics, Public Library of Science, vol. 12(6), pages 1-17, June.
