
Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics

Author

Listed:
  • David Lamparter
  • Daniel Marbach
  • Rico Rueedi
  • Zoltán Kutalik
  • Sven Bergmann

Abstract

Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which not only offers a significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any pathway enrichment approach based on binary membership. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.

Author Summary: Genome-wide association studies (GWAS) typically generate lists of trait- or disease-associated SNPs. Yet such output sheds little light on the underlying molecular mechanisms, and tools are needed to extract biological insight from the results at the SNP level. Pathway analysis tools integrate signals from multiple SNPs at various positions in the genome in order to map associated genomic regions to well-established pathways, i.e., sets of genes known to act in concert. The nature of GWAS association results requires specifically tailored methods for this task. Here, we present Pascal (Pathway scoring algorithm), a tool that allows gene- and pathway-level analysis of GWAS association results without the need to access the original genotypic data. Pascal was designed to be fast and accurate, and to have high power to detect relevant pathways. We extensively tested our approach on a large collection of real GWAS association results and saw better discovery of confirmed pathways than with other popular methods. We believe that these results, together with the ease of use of our publicly available software, will allow Pascal to become a useful addition to the toolbox of the GWAS community.
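
To make the two levels of scoring described in the abstract concrete, the following is a minimal Python sketch, not Pascal's implementation. It converts two-sided SNP p-values to 1-df chi-squared statistics, takes their sum and maximum for a gene, and combines gene p-values with the classic (unmodified) Fisher method. It assumes independent SNPs and independent genes, whereas real GWAS data are correlated through linkage disequilibrium, and the modification of the Fisher method used by Pascal is not reproduced here. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_from_pvalue(p):
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-squared statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile, so z <= 0
    return z * z


def sum_and_max(snp_pvalues):
    """Return (sum, max) of the per-SNP chi-squared statistics of one gene."""
    stats = [chi2_from_pvalue(p) for p in snp_pvalues]
    return sum(stats), max(stats)


def chi2_sf_even_df(x, df):
    """Closed-form survival function of a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combined_pvalue(gene_pvalues):
    """Classic Fisher method: -2 * sum(log p) is chi-squared with 2k df
    under the null, ASSUMING the k p-values are independent."""
    stat = -2.0 * sum(math.log(p) for p in gene_pvalues)
    return chi2_sf_even_df(stat, 2 * len(gene_pvalues))


if __name__ == "__main__":
    gene_sum, gene_max = sum_and_max([0.05, 0.5, 0.2])
    print(gene_sum, gene_max)
    print(fisher_combined_pvalue([0.05, 0.05]))
```

Because the combined statistic uses every gene p-value in the pathway rather than a list of "significant" genes, no significance threshold has to be chosen, which is the advantage over binary enrichment tests that the abstract points to.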

Suggested Citation

  • David Lamparter & Daniel Marbach & Rico Rueedi & Zoltán Kutalik & Sven Bergmann, 2016. "Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics," PLOS Computational Biology, Public Library of Science, vol. 12(1), pages 1-20, January.
  • Handle: RePEc:plo:pcbi00:1004714
    DOI: 10.1371/journal.pcbi.1004714

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004714
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004714&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Duchesne, Pierre & Lafaye De Micheaux, Pierre, 2010. "Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods," Computational Statistics & Data Analysis, Elsevier, vol. 54(4), pages 858-862, April.
    2. R. W. Farebrother, 1984. "The Distribution of a Positive Linear Combination of χ² Random Variables," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 33(3), pages 332-339, November.
    3. Ayellet V Segrè & DIAGRAM Consortium & MAGIC investigators & Leif Groop & Vamsi K Mootha & Mark J Daly & David Altshuler, 2010. "Common Inherited Variation in Mitochondrial Genes Is Not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits," PLOS Genetics, Public Library of Science, vol. 6(8), pages 1-19, August.
    4. Tune H. Pers & Juha M. Karjalainen & Yingleong Chan & Harm-Jan Westra & Andrew R. Wood & Jian Yang & Julian C. Lui & Sailaja Vedantam & Stefan Gustafsson & Tonu Esko & Tim Frayling & Elizabeth K. Spel, 2015. "Biological interpretation of genome-wide association studies using predicted gene functions," Nature Communications, Nature, vol. 6(1), pages 1-9, May.

    Citations



    Cited by:

    1. Akshat Singhal & Song Cao & Christopher Churas & Dexter Pratt & Santo Fortunato & Fan Zheng & Trey Ideker, 2020. "Multiscale community detection in Cytoscape," PLOS Computational Biology, Public Library of Science, vol. 16(10), pages 1-10, October.
    2. Sofía Ortín Vela & Michael J. Beyeler & Olga Trofimova & Ilaria Iuliani & Jose D. Vargas Quiros & Victor A. Vries & Ilenia Meloni & Adham Elwakil & Florence Hoogewoud & Bart Liefers & David Presby & W, 2024. "Phenotypic and genetic characteristics of retinal vascular parameters and their association with diseases," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    3. Olga A Vsevolozhskaya & Min Shi & Fengjiao Hu & Dmitri V Zaykin, 2020. "DOT: Gene-set analysis by combining decorrelated association statistics," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-25, April.
    4. Tapati Basak & Kazuhisa Nagashima & Satoshi Kajimoto & Takahisa Kawaguchi & Yasuharu Tabara & Fumihiko Matsuda & Ryo Yamada, 2020. "A Geometry-Based Multiple Testing Correction for Contingency Tables by Truncated Normal Distribution," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 63-77, April.
    5. Francesca Mateo & Zhengcheng He & Lin Mei & Gorka Ruiz de Garibay & Carmen Herranz & Nadia García & Amanda Lorentzian & Alexandra Baiges & Eline Blommaert & Antonio Gómez & Oriol Mirallas & Anna Garri, 2022. "Modification of BRCA1-associated breast cancer risk by HMMR overexpression," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    6. Go Sato & Yuya Shirai & Shinichi Namba & Ryuya Edahiro & Kyuto Sonehara & Tsuyoshi Hata & Mamoru Uemura & Koichi Matsuda & Yuichiro Doki & Hidetoshi Eguchi & Yukinori Okada, 2023. "Pan-cancer and cross-population genome-wide association studies dissect shared genetic backgrounds underlying carcinogenesis," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    7. Winn-Nuñez, Emily T. & Griffin, Maryclare & Crawford, Lorin, 2024. "A simple approach for local and global variable importance in nonlinear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lina Cai & Tomas Gonzales & Eleanor Wheeler & Nicola D. Kerrison & Felix R. Day & Claudia Langenberg & John R. B. Perry & Soren Brage & Nicholas J. Wareham, 2023. "Causal associations between cardiorespiratory fitness and type 2 diabetes," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    2. Chen, Tong & Lumley, Thomas, 2019. "Numerical evaluation of methods approximating the distribution of a large quadratic form in normal variables," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 75-81.
    3. Brittany L. Mitchell & Jake R. Saklatvala & Nick Dand & Fiona A. Hagenbeek & Xin Li & Josine L. Min & Laurent Thomas & Meike Bartels & Jouke Hottenga & Michelle K. Lupton & Dorret I. Boomsma & Xianjun, 2022. "Genome-wide association meta-analysis identifies 29 new acne susceptibility loci," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    4. Junjiao Feng & Liang Zhang & Chunhui Chen & Jintao Sheng & Zhifang Ye & Kanyin Feng & Jing Liu & Ying Cai & Bi Zhu & Zhaoxia Yu & Chuansheng Chen & Qi Dong & Gui Xue, 2022. "A cognitive neurogenetic approach to uncovering the structure of executive functions," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    5. Ghiglietti, Andrea & Paganoni, Anna Maria, 2017. "Exact tests for the means of Gaussian stochastic processes," Statistics & Probability Letters, Elsevier, vol. 131(C), pages 102-107.
    6. Aysu Okbay & Jonathan P. Beauchamp & Mark Alan Fontana & James J. Lee & Tune H. Pers & Cornelius A. Rietveld & Patrick Turley & Guo-Bo Chen & Valur Emilsson & S. Fleur W. Meddens & Sven Oskarsson & Jo, 2016. "Genome-wide association study identifies 74 loci associated with educational attainment," Nature, Nature, vol. 533(7604), pages 539-542, May.
    7. Michael G. Levin & Noah L. Tsao & Pankhuri Singhal & Chang Liu & Ha My T. Vy & Ishan Paranjpe & Joshua D. Backman & Tiffany R. Bellomo & William P. Bone & Kiran J. Biddinger & Qin Hui & Ozan Dikilitas, 2022. "Genome-wide association and multi-trait analyses characterize the common genetic architecture of heart failure," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    8. Xiaofeng Zhu & Yihe Yang & Noah Lorincz-Comi & Gen Li & Amy R. Bentley & Paul S. de Vries & Michael Brown & Alanna C. Morrison & Charles N. Rotimi & W. James Gauderman & Dabeeru C. Rao & Hugues Aschar, 2024. "An approach to identify gene-environment interactions and reveal new biological insight in complex traits," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    9. Catherine M. Francis & Matthias E. Futschik & Jian Huang & Wenjia Bai & Muralidharan Sargurupremraj & Alexander Teumer & Monique M. B. Breteler & Enrico Petretto & Amanda S. R. Ho & Philippe Amouyel &, 2022. "Genome-wide associations of aortic distensibility suggest causality for aortic aneurysms and brain white matter hyperintensities," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    10. Yu-Han H Hsu & Claire Churchhouse & Tune H Pers & Josep M Mercader & Andres Metspalu & Krista Fischer & Kristen Fortney & Eric K Morgen & Clicerio Gonzalez & Maria E Gonzalez & Tonu Esko & Joel N Hirs, 2019. "PAIRUP-MS: Pathway analysis and imputation to relate unknowns in profiles from mass spectrometry-based metabolite data," PLOS Computational Biology, Public Library of Science, vol. 15(1), pages 1-26, January.
    11. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    12. Pötscher, Benedikt M. & Preinerstorfer, David, 2021. "Valid Heteroskedasticity Robust Testing," MPRA Paper 117855, University Library of Munich, Germany, revised Jul 2023.
    13. G. I. Rivas-Martínez & M. D. Jiménez-Gamero & J. L. Moreno-Rebollo, 2019. "A two-sample test for the error distribution in nonparametric regression based on the characteristic function," Statistical Papers, Springer, vol. 60(4), pages 1369-1395, August.
    14. Benjamin Lehne & Cathryn M Lewis & Thomas Schlitt, 2011. "From SNPs to Genes: Disease Association at the Gene Level," PLOS ONE, Public Library of Science, vol. 6(6), pages 1-10, June.
    15. Andrew D. Grotzinger & Travis T. Mallard & Zhaowen Liu & Jakob Seidlitz & Tian Ge & Jordan W. Smoller, 2023. "Multivariate genomic architecture of cortical thickness and surface area at multiple levels of analysis," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    16. Andrew D. Grotzinger & Javier de la Fuente & Gail Davies & Michel G. Nivard & Elliot M. Tucker-Drob, 2022. "Transcriptome-wide and stratified genomic structural equation modeling identify neurobiological pathways shared across diverse cognitive traits," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    17. Anton Rask Lundborg & Rajen D. Shah & Jonas Peters, 2022. "Conditional independence testing in Hilbert spaces with applications to functional data analysis," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1821-1850, November.
    18. van Aert, Robbie Cornelis Maria, 2018. "Dissertation R.C.M. van Aert," MetaArXiv eqhjd, Center for Open Science.
    19. Lee, James J. & McGue, Matt & Iacono, William G. & Michael, Andrew M. & Chabris, Christopher F., 2019. "The causal influence of brain size on human intelligence: Evidence from within-family phenotypic associations and GWAS modeling," Intelligence, Elsevier, vol. 75(C), pages 48-58.
    20. Saredo Said & Raha Pazoki & Ville Karhunen & Urmo Võsa & Symen Ligthart & Barbara Bodinier & Fotios Koskeridis & Paul Welsh & Behrooz Z. Alizadeh & Daniel I. Chasman & Naveed Sattar & Marc Chadeau-Hya, 2022. "Genetic analysis of over half a million people characterises C-reactive protein loci," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
