
DOT: Gene-set analysis by combining decorrelated association statistics

Author

Listed:
  • Olga A Vsevolozhskaya
  • Min Shi
  • Fengjiao Hu
  • Dmitri V Zaykin

Abstract

Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data pose new challenges for accessing and sharing genotype-phenotype datasets, including the cost of data management and difficulties in consolidating records across research groups. These issues make methods based on SNP-level summary statistics particularly appealing. The most common way of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic approach can be substantially improved by decorrelating scores prior to their addition, resulting in remarkable power gains in the situations most commonly encountered in practice, namely, under heterogeneity of effect sizes and diversity among pairwise LD values. In these situations, the power of the traditional test, based on the added squared scores, quickly reaches a ceiling as the number of variants increases. Thus, the traditional approach does not benefit from information potentially contained in any additional SNPs, while our decorrelation by orthogonal transformation (DOT) method yields a steady gain in power. We present theoretical and computational analyses of both approaches and reveal the causes of the sometimes dramatic differences in their power. We showcase DOT by analyzing breast cancer and cleft lip data, in which our method strengthened levels of previously reported associations and implied the possibility of multiple new alleles that jointly confer disease risk.

Author summary: Joint analysis of association between the outcome and a group of SNPs within a genetic region is increasingly recognized to complement single-SNP analysis and shed light on the underlying molecular mechanisms. However, the correlation among GWAS association results calls for specifically tailored statistical methods. Here we propose the DOT (Decorrelation by Orthogonal Transformation) method, which can efficiently combine evidence of association over different SNPs and genes within a pathway without access to the original genotypic data. DOT is fast, does not rely on a permutation algorithm, and is often dramatically more powerful than other popular methods, such as VEGAS and the recently proposed ACAT. We believe that DOT will become a useful addition to the toolbox of summary-statistic-based methods for the GWAS community.
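
The contrast drawn in the abstract, adding squared scores directly versus decorrelating them first, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the inputs are a vector of SNP-level z-scores and the matching LD correlation matrix, and it assumes that decorrelation uses the symmetric inverse square root of that matrix obtained from an eigendecomposition. The function names, the `eig_tol` cutoff for near-zero eigenvalues, and the chi-square reference distribution are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2


def sum_of_squares_statistic(z):
    """Traditional statistic: the plain sum of squared z-scores.

    Under LD its null distribution is a weighted sum of chi-squares, with
    weights given by the eigenvalues of the LD matrix (not evaluated here).
    """
    z = np.asarray(z, dtype=float)
    return float(z @ z)


def dot_statistic(z, ld, eig_tol=1e-8):
    """Decorrelate z-scores, then add their squares.

    Returns (statistic, degrees of freedom, p-value). Hypothetical helper,
    written for illustration only.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    # Eigendecomposition of the symmetric LD matrix: ld = Q diag(lam) Q^T
    lam, q = np.linalg.eigh(ld)
    # Assumption: discard near-zero eigenvalues for numerical stability
    keep = lam > eig_tol
    lam, q = lam[keep], q[:, keep]
    # x = Q diag(lam^-1/2) Q^T z has identity covariance under the null
    x = q @ ((q.T @ z) / np.sqrt(lam))
    stat = float(x @ x)
    df = int(keep.sum())
    # Sum of squares of independent standard normals: chi-square with df d.f.
    return stat, df, float(chi2.sf(stat, df))


# Made-up z-scores and LD matrix, for illustration only
z_scores = [2.0, -1.0, 0.5]
ld_matrix = [[1.0, 0.5, 0.2],
             [0.5, 1.0, 0.3],
             [0.2, 0.3, 1.0]]
stat, df, p_value = dot_statistic(z_scores, ld_matrix)
print(f"sum of squares = {sum_of_squares_statistic(z_scores):.3f}")
print(f"decorrelated   = {stat:.3f} on {df} d.f., p = {p_value:.4f}")
```

When the LD matrix has full rank, the decorrelated sum equals the quadratic form z' Σ⁻¹ z, so its null distribution is a chi-square with as many degrees of freedom as there are SNPs and no permutation step is needed. In practice the LD matrix would be estimated from a reference panel, which this sketch does not cover.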

Suggested Citation

  • Olga A Vsevolozhskaya & Min Shi & Fengjiao Hu & Dmitri V Zaykin, 2020. "DOT: Gene-set analysis by combining decorrelated association statistics," PLOS Computational Biology, Public Library of Science, vol. 16(4), pages 1-25, April.
  • Handle: RePEc:plo:pcbi00:1007819
    DOI: 10.1371/journal.pcbi.1007819

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007819
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007819&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. David Lamparter & Daniel Marbach & Rico Rueedi & Zoltán Kutalik & Sven Bergmann, 2016. "Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics," PLOS Computational Biology, Public Library of Science, vol. 12(1), pages 1-20, January.
    2. Tian-Xiao Zhang & Terri H. Beaty & Ingo Ruczinski, 2012. "Candidate Pathway Based Analysis for Cleft Lip with or without Cleft Palate," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(2), pages 1-21, January.
    3. Huan Liu & Elizabeth J. Leslie & Jenna C. Carlson & Terri H. Beaty & Mary L. Marazita & Andrew C. Lidral & Robert A. Cornell, 2017. "Identification of common non-coding variants at 1p22 that are functional for non-syndromic orofacial clefting," Nature Communications, Nature, vol. 8(1), pages 1-13, April.
    4. Chia-Ding Hou, 2005. "A simple approximation for the distribution of the weighted combination of non-independent or independent probabilities," Statistics & Probability Letters, Elsevier, vol. 73(2), pages 179-187, June.
    5. Christiaan A de Leeuw & Joris M Mooij & Tom Heskes & Danielle Posthuma, 2015. "MAGMA: Generalized Gene-Set Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 11(4), pages 1-19, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Emily T. Winn-Nuñez & Maryclare Griffin & Lorin Crawford, 2024. "A simple approach for local and global variable importance in nonlinear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    2. Jialiang S. Wang & Tushar Kamath & Courtney M. Mazur & Fatemeh Mirzamohammadi & Daniel Rotter & Hironori Hojo & Christian D. Castro & Nicha Tokavanich & Rushi Patel & Nicolas Govea & Tetsuya Enishi & , 2021. "Control of osteocyte dendrite formation by Sp7 and its target gene osteocrin," Nature Communications, Nature, vol. 12(1), pages 1-20, December.
    3. Elena V. Feofanova & Michael R. Brown & Taryn Alkis & Astrid M. Manuel & Xihao Li & Usman A. Tahir & Zilin Li & Kevin M. Mendez & Rachel S. Kelly & Qibin Qi & Han Chen & Martin G. Larson & Rozenn N. L, 2023. "Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    4. Dominik Aschenbrenner & Isar Nassiri & Suresh Venkateswaran & Sumeet Pandey & Matthew Page & Lauren Drowley & Martin Armstrong & Subra Kugathasan & Benjamin Fairfax & Holm H. Uhlig, 2024. "An isoform quantitative trait locus in SBNO2 links genetic susceptibility to Crohn’s disease with defective antimicrobial activity," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Eva-Maria Stauffer & Richard A. I. Bethlehem & Lena Dorfschmidt & Hyejung Won & Varun Warrier & Edward T. Bullmore, 2023. "The genetic relationships between brain structure and schizophrenia," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    6. Akshat Singhal & Song Cao & Christopher Churas & Dexter Pratt & Santo Fortunato & Fan Zheng & Trey Ideker, 2020. "Multiscale community detection in Cytoscape," PLOS Computational Biology, Public Library of Science, vol. 16(10), pages 1-10, October.
    7. Xiaofeng Zhu & Yihe Yang & Noah Lorincz-Comi & Gen Li & Amy R. Bentley & Paul S. de Vries & Michael Brown & Alanna C. Morrison & Charles N. Rotimi & W. James Gauderman & Dabeeru C. Rao & Hugues Aschar, 2024. "An approach to identify gene-environment interactions and reveal new biological insight in complex traits," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    8. Sophie A. Riesmeijer & Zoha Kamali & Michael Ng & Dmitriy Drichel & Bram Piersma & Kerstin Becker & Thomas B. Layton & Jagdeep Nanchahal & Michael Nothnagel & Ahmad Vaez & Hans Christian Hennies & Pau, 2024. "A genome-wide association meta-analysis implicates Hedgehog and Notch signaling in Dupuytren’s disease," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    9. Zhiqiang Sha & Dick Schijven & Amaia Carrion-Castillo & Marc Joliot & Bernard Mazoyer & Simon E. Fisher & Fabrice Crivello & Clyde Francks, 2021. "The genetic architecture of structural left–right asymmetry of the human brain," Nature Human Behaviour, Nature, vol. 5(9), pages 1226-1239, September.
    10. Tapati Basak & Kazuhisa Nagashima & Satoshi Kajimoto & Takahisa Kawaguchi & Yasuharu Tabara & Fumihiko Matsuda & Ryo Yamada, 2020. "A Geometry-Based Multiple Testing Correction for Contingency Tables by Truncated Normal Distribution," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 63-77, April.
    11. Qiao Fan & Hengtong Li & Xiaomeng Wang & Yih-Chung Tham & Kelvin Yi Chong Teo & Masayuki Yasuda & Weng Khong Lim & Yuet Ping Kwan & Jing Xian Teo & Ching-Jou Chen & Li Jia Chen & Jeeyun Ahn & Sonia Da, 2023. "Contribution of common and rare variants to Asian neovascular age-related macular degeneration subtypes," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    12. Shufen Zheng & Philip S. Tsao & Cuiping Pan, 2024. "Abdominal aortic aneurysm and cardiometabolic traits share strong genetic susceptibility to lipid metabolism and inflammation," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    13. Catherine M. Francis & Matthias E. Futschik & Jian Huang & Wenjia Bai & Muralidharan Sargurupremraj & Alexander Teumer & Monique M. B. Breteler & Enrico Petretto & Amanda S. R. Ho & Philippe Amouyel &, 2022. "Genome-wide associations of aortic distensibility suggest causality for aortic aneurysms and brain white matter hyperintensities," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    14. Palwende Romuald Boua & Jean-Tristan Brandenburg & Ananyo Choudhury & Hermann Sorgho & Engelbert A. Nonterah & Godfred Agongo & Gershim Asiki & Lisa Micklesfield & Solomon Choma & Francesc Xavier Góme, 2022. "Genetic associations with carotid intima-media thickness link to atherosclerosis with sex-specific effects in sub-Saharan Africans," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    15. Sarah Grosche & Ingo Marenholz & Jorge Esparza-Gordillo & Aleix Arnau-Soler & Erola Pairo-Castineira & Franz Rüschendorf & Tarunveer S. Ahluwalia & Catarina Almqvist & Andreas Arnold & Hansjörg Baurec, 2021. "Rare variant analysis in eczema identifies exonic variants in DUSP1, NOTCH4 and SLC9A4," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    16. Thomas Wimmer & Jerome Geyer-Klingeberg & Marie Hütter & Florian Schmid & Andreas Rathgeber, 2021. "The impact of speculation on commodity prices: A Meta-Granger analysis," Journal of Commodity Markets, Elsevier, vol. 22(C).
    17. Yunfeng Huang & Dora Bodnar & Chia-Yen Chen & Gabriela Sanchez-Andrade & Mark Sanderson & Jun Shi & Katherine G. Meilleur & Matthew E. Hurles & Sebastian S. Gerety & Ellen A. Tsai & Heiko Runz, 2023. "Rare genetic variants impact muscle strength," Nature Communications, Nature, vol. 14(1), pages 1-8, December.
    18. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    19. Joel T. Rämö & Tuomo Kiiskinen & Richard Seist & Kristi Krebs & Masahiro Kanai & Juha Karjalainen & Mitja Kurki & Eija Hämäläinen & Paavo Häppölä & Aki S. Havulinna & Heidi Hautakangas & Reedik Mägi &, 2023. "Genome-wide screen of otosclerosis in population biobanks: 27 loci and shared associations with skeletal structure," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    20. Chachrit Khunsriraksakul & Qinmengge Li & Havell Markus & Matthew T. Patrick & Renan Sauteraud & Daniel McGuire & Xingyan Wang & Chen Wang & Lida Wang & Siyuan Chen & Ganesh Shenoy & Bingshan Li & Xue, 2023. "Multi-ancestry and multi-trait genome-wide association meta-analyses inform clinical risk prediction for systemic lupus erythematosus," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
