Printed from https://ideas.repec.org/a/plo/pcbi00/1007819.html

DOT: Gene-set analysis by combining decorrelated association statistics

Author

Listed:
  • Olga A Vsevolozhskaya
  • Min Shi
  • Fengjiao Hu
  • Dmitri V Zaykin

Abstract

Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to accessing and sharing genotype-phenotype datasets, including the cost of data management and difficulties in consolidating records across research groups. These issues make methods based on SNP-level summary statistics particularly appealing. The most common form of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic approach can be substantially improved by decorrelating scores prior to their addition, resulting in remarkable power gains in the situations most commonly encountered in practice, namely, under heterogeneity of effect sizes and diversity in pairwise LD. In these situations, the power of the traditional test, based on the added squared scores, quickly reaches a ceiling as the number of variants increases. Thus, the traditional approach does not benefit from information potentially contained in any additional SNPs, while our decorrelation by orthogonal transformation (DOT) method yields a steady gain in power. We present theoretical and computational analyses of both approaches and reveal the causes behind the sometimes dramatic differences in their power. We showcase DOT by analyzing breast cancer and cleft lip data, in which our method strengthened levels of previously reported associations and implied the possibility of multiple new alleles that jointly confer disease risk.

Author summary: Joint analysis of association between the outcome and a group of SNPs within a genetic region is increasingly recognized to complement single-SNP analysis and shed light on the underlying molecular mechanisms. However, the correlation among GWAS association results calls for specifically tailored statistical methods. Here we propose the DOT (Decorrelation by Orthogonal Transformation) method, which can efficiently combine evidence of association over different SNPs and genes within a pathway without access to the original genotypic data. DOT is fast, does not rely on a permutation algorithm, and is often dramatically more powerful than other popular methods, such as VEGAS and the recently proposed ACAT. We believe that DOT will become a useful addition to the toolbox of summary-statistic-based methods for the GWAS community.
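
The core idea in the abstract, decorrelating SNP-level scores before summing their squares, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the scores are Z-statistics that are jointly normal with the LD correlation matrix as their covariance under the null, and it assumes the symmetric inverse square root of that matrix as the decorrelating transformation. The function name `dot_statistic` and the toy numbers are hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def dot_statistic(z, ld):
    """Decorrelate Z-scores with the symmetric inverse square root of `ld`.

    Assumption (not taken from the paper's code): under the null hypothesis
    z ~ N(0, ld), so the decorrelated scores are independent N(0, 1) and the
    sum of their squares is chi-square with len(z) degrees of freedom.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    # Eigendecomposition of the symmetric LD matrix: ld = U diag(w) U^T
    eigvals, eigvecs = np.linalg.eigh(ld)
    if eigvals.min() <= 1e-10:
        raise ValueError("LD matrix must be positive definite for this sketch")
    # Symmetric inverse square root: H = U diag(w^-1/2) U^T, so H ld H^T = I
    h = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    x = h @ z                    # decorrelated scores
    stat = float(x @ x)          # sum of squared decorrelated scores
    p_value = float(chi2.sf(stat, df=z.size))
    return x, stat, p_value


# Toy example with made-up Z-scores and a made-up LD matrix for three SNPs
ld = np.array([[1.0, 0.6, 0.2],
               [0.6, 1.0, 0.4],
               [0.2, 0.4, 1.0]])
z = np.array([2.1, -0.5, 1.8])
x, stat, p = dot_statistic(z, ld)
print(f"decorrelated scores: {x}, statistic: {stat:.3f}, p-value: {p:.4f}")
```

Under these assumptions the null distribution is a plain chi-square, whereas the traditional sum of squared correlated scores follows a weighted sum of chi-squares whose weights are the eigenvalues of the LD matrix. A near-singular LD matrix would need regularization or pruning, which this sketch only guards against with an error.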

Suggested Citation

  • Olga A Vsevolozhskaya & Min Shi & Fengjiao Hu & Dmitri V Zaykin, 2020. "DOT: Gene-set analysis by combining decorrelated association statistics," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-25, April.
  • Handle: RePEc:plo:pcbi00:1007819
    DOI: 10.1371/journal.pcbi.1007819

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007819
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007819&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1007819?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. David Lamparter & Daniel Marbach & Rico Rueedi & Zoltán Kutalik & Sven Bergmann, 2016. "Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics," PLOS Computational Biology, Public Library of Science, vol. 12(1), pages 1-20, January.
    2. Zhang Tian-Xiao & Beaty Terri H. & Ruczinski Ingo, 2012. "Candidate Pathway Based Analysis for Cleft Lip with or without Cleft Palate," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(2), pages 1-21, January.
    3. Huan Liu & Elizabeth J. Leslie & Jenna C. Carlson & Terri H. Beaty & Mary L. Marazita & Andrew C. Lidral & Robert A. Cornell, 2017. "Identification of common non-coding variants at 1p22 that are functional for non-syndromic orofacial clefting," Nature Communications, Nature, vol. 8(1), pages 1-13, April.
    4. Hou, Chia-Ding, 2005. "A simple approximation for the distribution of the weighted combination of non-independent or independent probabilities," Statistics & Probability Letters, Elsevier, vol. 73(2), pages 179-187, June.
    5. Christiaan A de Leeuw & Joris M Mooij & Tom Heskes & Danielle Posthuma, 2015. "MAGMA: Generalized Gene-Set Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 11(4), pages 1-19, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Winn-Nuñez, Emily T. & Griffin, Maryclare & Crawford, Lorin, 2024. "A simple approach for local and global variable importance in nonlinear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    2. Shahram Bahrami & Kaja Nordengen & Jaroslav Rokicki & Alexey A. Shadrin & Zillur Rahman & Olav B. Smeland & Piotr P. Jaholkowski & Nadine Parker & Pravesh Parekh & Kevin S. O’Connell & Torbjørn Elvsås, 2024. "The genetic landscape of basal ganglia and implications for common brain disorders," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    3. Elena V. Feofanova & Michael R. Brown & Taryn Alkis & Astrid M. Manuel & Xihao Li & Usman A. Tahir & Zilin Li & Kevin M. Mendez & Rachel S. Kelly & Qibin Qi & Han Chen & Martin G. Larson & Rozenn N. L, 2023. "Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    4. Yash Patel & Jean Shin & Eeva Sliz & Ariana Tang & Aniket Mishra & Rui Xia & Edith Hofer & Hema Sekhar Reddy Rajula & Ruiqi Wang & Frauke Beyer & Katrin Horn & Max Riedl & Jing Yu & Henry Völzke & Rob, 2024. "Genetic risk factors underlying white matter hyperintensities and cortical atrophy," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    5. Xiaofeng Zhu & Yihe Yang & Noah Lorincz-Comi & Gen Li & Amy R. Bentley & Paul S. de Vries & Michael Brown & Alanna C. Morrison & Charles N. Rotimi & W. James Gauderman & Dabeeru C. Rao & Hugues Aschar, 2024. "An approach to identify gene-environment interactions and reveal new biological insight in complex traits," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    6. Sophie A. Riesmeijer & Zoha Kamali & Michael Ng & Dmitriy Drichel & Bram Piersma & Kerstin Becker & Thomas B. Layton & Jagdeep Nanchahal & Michael Nothnagel & Ahmad Vaez & Hans Christian Hennies & Pau, 2024. "A genome-wide association meta-analysis implicates Hedgehog and Notch signaling in Dupuytren’s disease," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Zhiqiang Sha & Dick Schijven & Amaia Carrion-Castillo & Marc Joliot & Bernard Mazoyer & Simon E. Fisher & Fabrice Crivello & Clyde Francks, 2021. "The genetic architecture of structural left–right asymmetry of the human brain," Nature Human Behaviour, Nature, vol. 5(9), pages 1226-1239, September.
    8. Tapati Basak & Kazuhisa Nagashima & Satoshi Kajimoto & Takahisa Kawaguchi & Yasuharu Tabara & Fumihiko Matsuda & Ryo Yamada, 2020. "A Geometry-Based Multiple Testing Correction for Contingency Tables by Truncated Normal Distribution," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 63-77, April.
    9. Shufen Zheng & Philip S. Tsao & Cuiping Pan, 2024. "Abdominal aortic aneurysm and cardiometabolic traits share strong genetic susceptibility to lipid metabolism and inflammation," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    10. Catherine M. Francis & Matthias E. Futschik & Jian Huang & Wenjia Bai & Muralidharan Sargurupremraj & Alexander Teumer & Monique M. B. Breteler & Enrico Petretto & Amanda S. R. Ho & Philippe Amouyel &, 2022. "Genome-wide associations of aortic distensibility suggest causality for aortic aneurysms and brain white matter hyperintensities," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    11. Palwende Romuald Boua & Jean-Tristan Brandenburg & Ananyo Choudhury & Hermann Sorgho & Engelbert A. Nonterah & Godfred Agongo & Gershim Asiki & Lisa Micklesfield & Solomon Choma & Francesc Xavier Góme, 2022. "Genetic associations with carotid intima-media thickness link to atherosclerosis with sex-specific effects in sub-Saharan Africans," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    12. Yunfeng Huang & Dora Bodnar & Chia-Yen Chen & Gabriela Sanchez-Andrade & Mark Sanderson & Jun Shi & Katherine G. Meilleur & Matthew E. Hurles & Sebastian S. Gerety & Ellen A. Tsai & Heiko Runz, 2023. "Rare genetic variants impact muscle strength," Nature Communications, Nature, vol. 14(1), pages 1-8, December.
    13. Richard Burns & William J. Young & Nay Aung & Luis R. Lopes & Perry M. Elliott & Petros Syrris & Roberto Barriales-Villa & Catrin Sohrabi & Steffen E. Petersen & Julia Ramírez & Alistair Young & Patri, 2024. "Genetic basis of right and left ventricular heart shape," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    14. Hung-Lin Chen & Hsiu-Yin Chiang & David Ray Chang & Chi-Fung Cheng & Charles C. N. Wang & Tzu-Pin Lu & Chien-Yueh Lee & Amrita Chattopadhyay & Yu-Ting Lin & Che-Chen Lin & Pei-Tzu Yu & Chien-Fong Huan, 2024. "Discovery and prioritization of genetic determinants of kidney function in 297,355 individuals from Taiwan and Japan," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    15. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    16. Chachrit Khunsriraksakul & Qinmengge Li & Havell Markus & Matthew T. Patrick & Renan Sauteraud & Daniel McGuire & Xingyan Wang & Chen Wang & Lida Wang & Siyuan Chen & Ganesh Shenoy & Bingshan Li & Xue, 2023. "Multi-ancestry and multi-trait genome-wide association meta-analyses inform clinical risk prediction for systemic lupus erythematosus," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    17. Yu Huang & Denis Plotnikov & Huan Wang & Danli Shi & Cong Li & Xueli Zhang & Xiayin Zhang & Shulin Tang & Xianwen Shang & Yijun Hu & Honghua Yu & Hongyang Zhang & Jeremy A. Guggenheim & Mingguang He, 2024. "GWAS-by-subtraction reveals an IOP-independent component of primary open angle glaucoma," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    18. Abolfazl Doostparast Torshizi & Dongnhu T. Truong & Liping Hou & Bart Smets & Christopher D. Whelan & Shuwei Li, 2024. "Proteogenomic network analysis reveals dysregulated mechanisms and potential mediators in Parkinson’s disease," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    19. Wei Xu & Ines Mesa-Eguiagaray & David M. Morris & Chengjia Wang & Calum D. Gray & Samuel Sjöström & Giorgos Papanastasiou & Sammy Badr & Julien Paccou & Xue Li & Paul R. H. J. Timmers & Maria Timofeev, 2025. "Deep learning and genome-wide association meta-analyses of bone marrow adiposity in the UK Biobank," Nature Communications, Nature, vol. 16(1), pages 1-19, December.
    20. Charley Xia & Sarah J. Pickett & David C. M. Liewald & Alexander Weiss & Gavin Hudson & W. David Hill, 2023. "The contributions of mitochondrial and nuclear mitochondrial genetic variation to neuroticism," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
