
A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study

Author

Listed:
  • Xinyuan Dong
  • Yu-Ru Su
  • Richard Barfield
  • Stephanie A Bien
  • Qianchuan He
  • Tabitha A Harrison
  • Jeroen R Huyghe
  • Temitope O Keku
  • Noralane M Lindor
  • Clemens Schafmayer
  • Andrew T Chan
  • Stephen B Gruber
  • Mark A Jenkins
  • Charles Kooperberg
  • Ulrike Peters
  • Li Hsu

Abstract

Genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants associated with various phenotypes, but together they explain only a fraction of heritability, suggesting many variants have yet to be discovered. Recently it has been recognized that incorporating functional information of genetic variants can improve power for identifying novel loci. For example, S-PrediXcan and TWAS tested the association of predicted gene expression with phenotypes based on GWAS summary statistics by leveraging the information on genetic regulation of gene expression and found many novel loci. However, as genetic variants may have effects on more than one gene and through different mechanisms, these methods likely only capture part of the total effects of these variants. In this paper, we propose a summary statistics-based mixed effects score test (sMiST) that tests for the total effect of both the effect of the mediator by imputing genetically predicted gene expression, like S-PrediXcan and TWAS, and the direct effects of individual variants. It allows for multiple functional annotations and multiple genetically predicted mediators. It can also perform conditional association analysis while adjusting for other genetic variants (e.g., known loci for the phenotype). Extensive simulation and real data analyses demonstrate that sMiST yields p-values that agree well with those obtained from individual level data but with substantively improved computational speed. Importantly, a broad application of sMiST to GWAS is possible, as only summary statistics of genetic variant associations are required. We apply sMiST to a large-scale GWAS of colorectal cancer using summary statistics from ∼120,000 study participants and gene expression data from the Genotype-Tissue Expression (GTEx) project. We identify several novel and secondary independent genetic loci.

Author summary: We developed summary statistics-based mixed effects score test statistics (sMiST) for testing the association of multiple genetically predicted mediators simultaneously and direct association of individual variants independent of mediators by using a random effects model. Extensive simulation and real data analyses demonstrate that sMiST recovers the results of MiST that is based on individual level data, but is computationally much faster. We applied our approach to a genome-wide association study of colorectal cancer and gene expression and identified several novel and secondary genetic loci.
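
Illustrative sketch (not taken from the paper): the abstract describes testing a genetically predicted mediator using only GWAS summary statistics, alongside a separate test for direct variant effects. The Python below shows the general idea under simplifying assumptions: genotypes are standardized, so the z-score of a weighted sum of variants is w'z / sqrt(w'Rw), where R is the LD correlation matrix; and two independent p-values are combined with Fisher's method, which for two tests has the closed form p1*p2*(1 - ln(p1*p2)). The function names are hypothetical, and the abstract does not state which combination rule or variance-component test sMiST itself uses.

```python
import math


def predicted_expression_z(z, weights, ld):
    """z-score for a weighted sum of variants from summary statistics.

    z       : per-variant GWAS z-scores (standardized genotypes assumed)
    weights : expression-prediction weights, one per variant (assumed given)
    ld      : LD correlation matrix as a list of lists
    """
    m = len(weights)
    numerator = sum(w * zi for w, zi in zip(weights, z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    if variance <= 0:
        raise ValueError("w'Rw must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z_score):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z_score) / math.sqrt(2.0))


def fisher_combine(p1, p2):
    """Fisher's method for two independent p-values (chi-square, 4 df)."""
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    product = p1 * p2
    return product * (1.0 - math.log(product))


# Toy usage: two variants in partial LD, equal weights.
z_mediator = predicted_expression_z([2.0, 1.5], [0.6, 0.4],
                                    [[1.0, 0.3], [0.3, 1.0]])
p_mediator = two_sided_p(z_mediator)
p_direct = 0.20  # placeholder for a separately computed direct-effect p-value
p_total = fisher_combine(p_mediator, p_direct)
```

The direct-effect p-value is left as a placeholder because a variance-component test needs the distribution of a quadratic form, which is beyond this short sketch.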

Suggested Citation

  • Xinyuan Dong & Yu-Ru Su & Richard Barfield & Stephanie A Bien & Qianchuan He & Tabitha A Harrison & Jeroen R Huyghe & Temitope O Keku & Noralane M Lindor & Clemens Schafmayer & Andrew T Chan & Stephen, 2020. "A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study," PLOS Genetics, Public Library of Science, vol. 16(8), pages 1-21, August.
  • Handle: RePEc:plo:pgen00:1008947
    DOI: 10.1371/journal.pgen.1008947

    Download full text from publisher

    File URL: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008947
    Download Restriction: no

    File URL: https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1008947&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pgen.1008947?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Claudia Giambartolomei & Damjan Vukcevic & Eric E Schadt & Lude Franke & Aroon D Hingorani & Chris Wallace & Vincent Plagnol, 2014. "Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics," PLOS Genetics, Public Library of Science, vol. 10(5), pages 1-15, May.
    2. Xiaoquan Wen & Roger Pique-Regi & Francesca Luca, 2017. "Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization," PLOS Genetics, Public Library of Science, vol. 13(3), pages 1-25, March.
    3. Alvaro N. Barbeira & Scott P. Dickinson & Rodrigo Bonazzola & Jiamao Zheng & Heather E. Wheeler & Jason M. Torres & Eric S. Torstenson & Kaanan P. Shah & Tzintzuni Garcia & Todd L. Edwards & Eli A. St, 2018. "Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics," Nature Communications, Nature, vol. 9(1), pages 1-20, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Naim Panjwani & Fan Wang & Scott Mastromatteo & Allen Bao & Cheng Wang & Gengming He & Jiafen Gong & Johanna M Rommens & Lei Sun & Lisa J Strug, 2020. "LocusFocus: Web-based colocalization for the annotation and functional follow-up of GWAS," PLOS Computational Biology, Public Library of Science, vol. 16(10), pages 1-8, October.
    2. Anqi Zhu & Nana Matoba & Emma P Wilson & Amanda L Tapia & Yun Li & Joseph G Ibrahim & Jason L Stein & Michael I Love, 2021. "MRLocus: Identifying causal genes mediating a trait through Bayesian estimation of allelic heterogeneity," PLOS Genetics, Public Library of Science, vol. 17(4), pages 1-24, April.
    3. Angela Andaleon & Lauren S Mogil & Heather E Wheeler, 2019. "Genetically regulated gene expression underlies lipid traits in Hispanic cohorts," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-21, August.
    4. Xena Marie Mapel & Naveen Kumar Kadri & Alexander S. Leonard & Qiongyu He & Audald Lloret-Villas & Meenu Bhati & Maya Hiltpold & Hubert Pausch, 2024. "Molecular quantitative trait loci in reproductive tissues impact male fertility in cattle," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    5. William J. Young & Jeffrey Haessler & Jan-Walter Benjamins & Linda Repetto & Jie Yao & Aaron Isaacs & Andrew R. Harper & Julia Ramirez & Sophie Garnier & Stefan Duijvenboden & Antoine R. Baldassari & , 2023. "Genetic architecture of spatial electrical biomarkers for cardiac arrhythmia and relationship with cardiovascular disease," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    6. Yaohua Yang & Yaxin Chen & Shuai Xu & Xingyi Guo & Guochong Jia & Jie Ping & Xiang Shu & Tianying Zhao & Fangcheng Yuan & Gang Wang & Yufang Xie & Hang Ci & Hongmo Liu & Yawen Qi & Yongjun Liu & Dan L, 2024. "Integrating muti-omics data to identify tissue-specific DNA methylation biomarkers for cancer risk," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    7. Benjamin J. Schmiedel & Job Rocha & Cristian Gonzalez-Colin & Sourya Bhattacharyya & Ariel Madrigal & Christian H. Ottensmeier & Ferhat Ay & Vivek Chandra & Pandurangan Vijayanand, 2021. "COVID-19 genetic risk variants are associated with expression of multiple genes in diverse immune cell types," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    8. Yanyu Xiao & Jingjing Wang & Jiaqi Li & Peijing Zhang & Jingyu Li & Yincong Zhou & Qing Zhou & Ming Chen & Xin Sheng & Zhihong Liu & Xiaoping Han & Guoji Guo, 2023. "An analytical framework for decoding cell type-specific genetic variation of gene regulation," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    9. Sébastien Thériault & Zhonglin Li & Erik Abner & Jian’an Luan & Hasanga D. Manikpurage & Ursula Houessou & Pardis Zamani & Mewen Briend & Dominique K. Boudreau & Nathalie Gaudreault & Lily Frenette & , 2024. "Integrative genomic analyses identify candidate causal genes for calcific aortic valve stenosis involving tissue-specific regulation," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    10. Jayshree Advani & Puja A. Mehta & Andrew R. Hamel & Sudeep Mehrotra & Christina Kiel & Tobias Strunz & Ximena Corso-Díaz & Madeline Kwicklis & Freekje Asten & Rinki Ratnapriya & Emily Y. Chew & Dena G, 2024. "QTL mapping of human retina DNA methylation identifies 87 gene-epigenome interactions in age-related macular degeneration," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    11. Yangqing Deng & Wei Pan, 2020. "A powerful and versatile colocalization test," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-18, April.
    12. Jacob Joseph & Chang Liu & Qin Hui & Krishna Aragam & Zeyuan Wang & Brian Charest & Jennifer E. Huffman & Jacob M. Keaton & Todd L. Edwards & Serkalem Demissie & Luc Djousse & Juan P. Casas & J. Micha, 2022. "Genetic architecture of heart failure with preserved versus reduced ejection fraction," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    13. Natalie DeForest & Yuqi Wang & Zhiyi Zhu & Jacqueline S. Dron & Ryan Koesterer & Pradeep Natarajan & Jason Flannick & Tiffany Amariuta & Gina M. Peloso & Amit R. Majithia, 2024. "Genome-wide discovery and integrative genomic characterization of insulin resistance loci using serum triglycerides to HDL-cholesterol ratio as a proxy," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    14. Julia Schröder & Vitalia Schüller & Andrea May & Christian Gerges & Mario Anders & Jessica Becker & Timo Hess & Nicole Kreuser & René Thieme & Kerstin U Ludwig & Tania Noder & Marino Venerito & Lothar, 2019. "Identification of loci of functional relevance to Barrett’s esophagus and esophageal adenocarcinoma: Cross-referencing of expression quantitative trait loci data from disease-relevant tissues with gen," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-12, December.
    15. Lili Liu & Atlas Khan & Elena Sanchez-Rodriguez & Francesca Zanoni & Yifu Li & Nicholas Steers & Olivia Balderes & Junying Zhang & Priya Krithivasan & Robert A. LeDesma & Clara Fischman & Scott J. Heb, 2022. "Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    16. Sylvia Hartmann & Summaira Yasmeen & Benjamin M. Jacobs & Spiros Denaxas & Munir Pirmohamed & Eric R. Gamazon & Mark J. Caulfield & Harry Hemingway & Maik Pietzner & Claudia Langenberg, 2023. "ADRA2A and IRX1 are putative risk genes for Raynaud’s phenomenon," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    17. Brittany L. Mitchell & Jake R. Saklatvala & Nick Dand & Fiona A. Hagenbeek & Xin Li & Josine L. Min & Laurent Thomas & Meike Bartels & Jouke Hottenga & Michelle K. Lupton & Dorret I. Boomsma & Xianjun, 2022. "Genome-wide association meta-analysis identifies 29 new acne susceptibility loci," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    18. Zichen Zhang & Ye Eun Bae & Jonathan R. Bradley & Lang Wu & Chong Wu, 2022. "SUMMIT: An integrative approach for better transcriptomic data imputation improves causal gene identification," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    19. Pietro Demela & Nicola Pirastu & Blagoje Soskic, 2023. "Cross-disorder genetic analysis of immune diseases reveals distinct gene associations that converge on common pathways," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    20. Zhaotong Lin & Wei Pan, 2024. "A robust cis-Mendelian randomization method with application to drug target discovery," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
