IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-42975-z.html

Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases

Authors
  • Florin Ratajczak

    (Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich)

  • Mitchell Joblin

    (Amazon)

  • Marcel Hildebrandt

    (Siemens Technology, Siemens AG)

  • Martin Ringsquandl

    (Siemens Technology, Siemens AG)

  • Pascal Falter-Braun

    (Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich
    Ludwig-Maximilians-Universität München)

  • Matthias Heinig

    (Helmholtz Munich
    Technical University of Munich
    Munich Heart Association, Partner Site Munich)

Abstract

Understanding phenotype-to-genotype relationships is a grand challenge of 21st-century biology with translational implications. The recently proposed “omnigenic” model postulates that the effects of genetic variation on traits are mediated by core genes and core proteins whose activities mechanistically influence the phenotype, whereas peripheral genes encode a regulatory network that indirectly affects phenotypes via core gene products. Here, we develop a positive-unlabeled graph representation learning ensemble approach based on nested cross-validation to predict core-like genes for diverse diseases, using Mendelian disorder genes for training. Employing mouse knockout phenotypes for external validation, we demonstrate that core-like genes display several key properties of core genes: mouse knockouts of genes corresponding to our most confident predictions give rise to relevant mouse phenotypes at rates on par with the Mendelian disorder genes, and all candidates exhibit core gene properties such as transcriptional deregulation in disease and loss-of-function intolerance. Moreover, as predicted for core genes, our candidates are enriched for drug targets and druggable proteins. In contrast to Mendelian disorder genes, the new core-like genes are enriched for druggable yet untargeted gene products, which are therefore attractive targets for drug development. Interpretation of the underlying deep learning model suggests plausible explanations for our core gene predictions in the form of molecular mechanisms and physical interactions. Our results demonstrate the potential of graph representation learning for the interpretation of biological complexity and pave the way for studying core gene properties and future drug development.
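
The general shape of a positive-unlabeled ensemble built on nested cross-validation can be sketched in a few lines of Python. This is an illustration only and not the Speos implementation: the abstract describes graph representation learning, whereas the sketch swaps in a toy threshold on a single made-up feature. The gene names, feature values, fold counts and the helper names `k_folds`, `fit_threshold` and `pu_ensemble` are all assumptions introduced for the example.

```python
from collections import Counter
from statistics import mean


def k_folds(items, k):
    """Split a list into k round-robin folds."""
    return [items[i::k] for i in range(k)]


def fit_threshold(pos_values, unl_values):
    """Toy model (assumption): midpoint between positive and unlabeled means."""
    return (mean(pos_values) + mean(unl_values)) / 2


def pu_ensemble(features, positives, n_outer=3, n_inner=2):
    """Train one toy model per (outer, inner) fold pair.

    Returns a Counter of how many models call each unlabeled gene positive,
    plus each model's recall on the held-out positives of its outer fold.
    """
    unlabeled = sorted(g for g in features if g not in positives)
    # Positive-unlabeled simplification: unlabeled genes stand in for negatives.
    unl_values = [features[g] for g in unlabeled]
    votes, recalls = Counter(), []
    outer = k_folds(sorted(positives), n_outer)
    for i, held_out in enumerate(outer):
        rest = [g for j, fold in enumerate(outer) if j != i for g in fold]
        for val in k_folds(rest, n_inner):
            # The inner validation fold is left out of training for this model.
            train = [g for g in rest if g not in val]
            thr = fit_threshold([features[g] for g in train], unl_values)
            votes.update(g for g in unlabeled if features[g] > thr)
            recalls.append(sum(features[g] > thr for g in held_out) / len(held_out))
    return votes, recalls


# Hypothetical data: six known positives and four unlabeled genes.
features = {"A": 0.9, "B": 0.8, "C": 0.85, "D": 0.95, "E": 0.7, "F": 0.75,
            "u1": 0.88, "u2": 0.1, "u3": 0.2, "u4": 0.15}
positives = {"A", "B", "C", "D", "E", "F"}

votes, recalls = pu_ensemble(features, positives)
# Genes ranked by consensus: the number of ensemble members voting for them.
ranked = votes.most_common()
```

In this sketch a candidate's confidence is simply the number of models that vote for it, so a gene supported by every ensemble member ranks highest. How Speos itself aggregates its models is not stated in the abstract.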

Suggested Citation

  • Florin Ratajczak & Mitchell Joblin & Marcel Hildebrandt & Martin Ringsquandl & Pascal Falter-Braun & Matthias Heinig, 2023. "Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42975-z
    DOI: 10.1038/s41467-023-42975-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-42975-z
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Juan J Cáceres & Alberto Paccanaro, 2019. "Disease gene prediction for molecularly uncharacterized diseases," PLOS Computational Biology, Public Library of Science, vol. 15(7), pages 1-14, July.
    2. T M Murali & Matthew D Dyer & David Badger & Brett M Tyler & Michael G Katze, 2011. "Network-Based Prediction and Analysis of HIV Dependency Factors," PLOS Computational Biology, Public Library of Science, vol. 7(9), pages 1-15, September.
    3. David H Drewry & Carrow I Wells & David M Andrews & Richard Angell & Hassan Al-Ali & Alison D Axtman & Stephen J Capuzzi & Jonathan M Elkins & Peter Ettmayer & Mathias Frederiksen & Opher Gileadi & Na, 2017. "Progress towards a public chemogenomic set for protein kinases and a call for contributions," PLOS ONE, Public Library of Science, vol. 12(8), pages 1-20, August.
    4. Li-Chen Hung & Pei-Tseng Kung & Chi-Hsuan Lung & Ming-Hsui Tsai & Shih-An Liu & Li-Ting Chiu & Kuang-Hua Huang & Wen-Chen Tsai, 2020. "Assessment of the Risk of Oral Cancer Incidence in A High-Risk Population and Establishment of A Predictive Model for Oral Cancer Incidence Using A Population-Based Cohort in Taiwan," IJERPH, MDPI, vol. 17(2), pages 1-15, January.
    5. Jianhua Li & Xiaoyan Lin & Yueyang Teng & Shouliang Qi & Dayu Xiao & Jianying Zhang & Yan Kang, 2016. "A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization," PLOS ONE, Public Library of Science, vol. 11(7), pages 1-18, July.
    6. Gold, E. Richard, 2021. "The fall of the innovation empire and its possible rise through open science," Research Policy, Elsevier, vol. 50(5).
    7. Patrick Bryant & Gabriele Pozzati & Wensi Zhu & Aditi Shenoy & Petras Kundrotas & Arne Elofsson, 2022. "Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    8. Le Ou-Yang & Dao-Qing Dai & Xiao-Fei Zhang, 2013. "Protein Complex Detection via Weighted Ensemble Clustering Based on Bayesian Nonnegative Matrix Factorization," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-18, May.
    9. Cheng, Yuanyuan, 2023. "A method of 3R to evaluate the correlation and predictive value of variables," OSF Preprints c79tu, Center for Open Science.
    10. Hao Li & Zebei Han & Yu Sun & Fu Wang & Pengzhen Hu & Yuang Gao & Xuemei Bai & Shiyu Peng & Chao Ren & Xiang Xu & Zeyu Liu & Hebing Chen & Yang Yang & Xiaochen Bo, 2024. "CGMega: explainable graph neural network framework with attention mechanisms for cancer gene module dissection," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    11. Morten Dybdahl Krebs & Gonçalo Espregueira Themudo & Michael Eriksen Benros & Ole Mors & Anders D. Børglum & David Hougaard & Preben Bo Mortensen & Merete Nordentoft & Michael J. Gandal & Chun Chieh F, 2021. "Associations between patterns in comorbid diagnostic trajectories of individuals with schizophrenia and etiological factors," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    12. Hongchun Lin & Hui Peng & Yuxiang Sun & Meijun Si & Jiao Wu & Yanlin Wang & Sandhya S. Thomas & Zheng Sun & Zhaoyong Hu, 2023. "Reprogramming of cis-regulatory networks during skeletal muscle atrophy in male mice," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    13. Cui, Ying & Cai, Meng & Stanley, H. Eugene, 2018. "Discovering disease-associated genes in weighted protein–protein interaction networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 496(C), pages 53-61.
    14. Fan Chen & Aria L. Byrd & Jinpeng Liu & Robert M. Flight & Tanner J. DuCote & Kassandra J. Naughton & Xiulong Song & Abigail R. Edgin & Alexsandr Lukyanchuk & Danielle T. Dixon & Christian M. Gosser &, 2023. "Polycomb deficiency drives a FOXP2-high aggressive state targetable by epigenetic inhibitors," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    15. Mengyun Yang & Huimin Luo & Yaohang Li & Fang-Xiang Wu & Jianxin Wang, 2019. "Overlap matrix completion for predicting drug-associated indications," PLOS Computational Biology, Public Library of Science, vol. 15(12), pages 1-21, December.
    16. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    17. Carles Foguet & Yu Xu & Scott C. Ritchie & Samuel A. Lambert & Elodie Persyn & Artika P. Nath & Emma E. Davenport & David J. Roberts & Dirk S. Paul & Emanuele Angelantonio & John Danesh & Adam S. Butt, 2022. "Genetically personalised organ-specific metabolic models in health and disease," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    18. Mengge Liu & Lu Wang & Yujie Zhang & Haoyang Dong & Caihong Wang & Yayuan Chen & Qian Qian & Nannan Zhang & Shaoying Wang & Guoshu Zhao & Zhihui Zhang & Minghuan Lei & Sijia Wang & Qiyu Zhao & Feng Li, 2024. "Investigating the shared genetic architecture between depression and subcortical volumes," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    19. Yue Yuan & Qiang Huo & Ziru Zhang & Qun Wang & Juanxia Wang & Shuaikang Chang & Peng Cai & Karen M. Song & David W. Galbraith & Weixiao Zhang & Long Huang & Rentao Song & Zeyang Ma, 2024. "Decoding the gene regulatory network of endosperm differentiation in maize," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    20. Nilesh Kumar & M. Shahid Mukhtar, 2024. "Viral Targets in the Human Interactome with Comprehensive Centrality Analysis: SARS-CoV-2, a Case Study," Data, MDPI, vol. 9(8), pages 1-12, August.
