IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-42975-z.html

Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases

Author

Listed:
  • Florin Ratajczak

    (Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich)

  • Mitchell Joblin

    (Amazon)

  • Marcel Hildebrandt

    (Siemens Technology, Siemens AG)

  • Martin Ringsquandl

    (Siemens Technology, Siemens AG)

  • Pascal Falter-Braun

    (Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich
    Ludwig-Maximilians-Universität München)

  • Matthias Heinig

    (Helmholtz Munich
    Technical University of Munich
    Munich Heart Association, Partner Site Munich)

Abstract

Understanding phenotype-to-genotype relationships is a grand challenge of 21st-century biology with translational implications. The recently proposed “omnigenic” model postulates that the effects of genetic variation on traits are mediated by core genes and core proteins whose activities mechanistically influence the phenotype, whereas peripheral genes encode a regulatory network that indirectly affects phenotypes via core gene products. Here, we develop a positive-unlabeled graph representation learning ensemble approach based on nested cross-validation to predict core-like genes for diverse diseases, using Mendelian disorder genes for training. Employing mouse knockout phenotypes for external validation, we demonstrate that core-like genes display several key properties of core genes: mouse knockouts of genes corresponding to our most confident predictions give rise to relevant mouse phenotypes at rates on par with the Mendelian disorder genes, and all candidates exhibit core gene properties such as transcriptional deregulation in disease and loss-of-function intolerance. Moreover, as predicted for core genes, our candidates are enriched for drug targets and druggable proteins. In contrast to Mendelian disorder genes, the new core-like genes are enriched for druggable yet untargeted gene products, which are therefore attractive targets for drug development. Interpretation of the underlying deep learning model suggests plausible explanations for our core gene predictions in the form of molecular mechanisms and physical interactions. Our results demonstrate the potential of graph representation learning for the interpretation of biological complexity and pave the way for studying core gene properties and future drug development.
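
The combination of positive-unlabeled learning, an ensemble, and nested cross-validation described in the abstract can be illustrated with a toy sketch. This is not the Speos implementation: the abstract does not say how models are trained or how their predictions are combined, so the threshold "model", the vote-counting consensus, and all names below (nested_folds, fit_threshold, consensus_scores, the gene labels and scores) are assumptions made purely for illustration.

```python
from statistics import mean


def nested_folds(positives, n_outer, n_inner):
    """Yield (outer, inner, train, val, test) splits of the labeled positives."""
    items = sorted(positives)
    for o in range(n_outer):
        test = items[o::n_outer]            # held out for the outer fold
        rest = [g for g in items if g not in test]
        for i in range(n_inner):
            val = rest[i::n_inner]          # held out for the inner fold
            train = [g for g in rest if g not in val]
            yield o, i, train, val, test


def fit_threshold(train_pos, unlabeled, score):
    """Toy stand-in for a model: midpoint between the mean score of the
    training positives and the mean score of the unlabeled genes, which
    are provisionally treated as negatives (a common PU simplification)."""
    pos_mean = mean(score[g] for g in train_pos)
    unl_mean = mean(score[g] for g in unlabeled)
    return (pos_mean + unl_mean) / 2


def consensus_scores(positives, unlabeled, score, n_outer=2, n_inner=2):
    """Count, for every unlabeled gene, how many ensemble members call it positive."""
    votes = {g: 0 for g in unlabeled}
    for _, _, train, _val, _test in nested_folds(positives, n_outer, n_inner):
        # A real pipeline would use _val for model selection and _test for
        # evaluation; the toy model here needs neither.
        threshold = fit_threshold(train, unlabeled, score)
        for g in unlabeled:
            if score[g] > threshold:
                votes[g] += 1
    return votes


# Hypothetical one-dimensional gene scores (made up for this example).
score = {"P1": 0.90, "P2": 0.80, "P3": 0.85, "P4": 0.95,
         "U1": 0.88, "U2": 0.10, "U3": 0.20}
positives = ["P1", "P2", "P3", "P4"]   # known (e.g. Mendelian) genes
unlabeled = ["U1", "U2", "U3"]         # genes without a label

result = consensus_scores(positives, unlabeled, score)
n_models = len(list(nested_folds(positives, 2, 2)))
```

In this toy setup, an unlabeled gene selected by every ensemble member would correspond to a "most confident" candidate, while genes selected by no member are left unlabeled rather than declared negative.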

Suggested Citation

  • Florin Ratajczak & Mitchell Joblin & Marcel Hildebrandt & Martin Ringsquandl & Pascal Falter-Braun & Matthias Heinig, 2023. "Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42975-z
    DOI: 10.1038/s41467-023-42975-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-42975-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Juan J Cáceres & Alberto Paccanaro, 2019. "Disease gene prediction for molecularly uncharacterized diseases," PLOS Computational Biology, Public Library of Science, vol. 15(7), pages 1-14, July.
    2. Ke Hu & Ju Xiang & Yun-Xia Yu & Liang Tang & Qin Xiang & Jian-Ming Li & Yong-Hong Tang & Yong-Jun Chen & Yan Zhang, 2020. "Significance-based multi-scale method for network community detection and its application in disease-gene prediction," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-24, March.
    3. T M Murali & Matthew D Dyer & David Badger & Brett M Tyler & Michael G Katze, 2011. "Network-Based Prediction and Analysis of HIV Dependency Factors," PLOS Computational Biology, Public Library of Science, vol. 7(9), pages 1-15, September.
    4. David H Drewry & Carrow I Wells & David M Andrews & Richard Angell & Hassan Al-Ali & Alison D Axtman & Stephen J Capuzzi & Jonathan M Elkins & Peter Ettmayer & Mathias Frederiksen & Opher Gileadi & Na, 2017. "Progress towards a public chemogenomic set for protein kinases and a call for contributions," PLOS ONE, Public Library of Science, vol. 12(8), pages 1-20, August.
    5. Deborah Chasman & Brandi Gancarz & Linhui Hao & Michael Ferris & Paul Ahlquist & Mark Craven, 2014. "Inferring Host Gene Subnetworks Involved in Viral Replication," PLOS Computational Biology, Public Library of Science, vol. 10(5), pages 1-22, May.
    6. Yesheng Fu & Lei Li & Xin Zhang & Zhikang Deng & Ying Wu & Wenzhe Chen & Yuchen Liu & Shan He & Jian Wang & Yuping Xie & Zhiwei Tu & Yadi Lyu & Yange Wei & Shujie Wang & Chun-Ping Cui & Cui Hua Liu & , 2024. "Systematic HOIP interactome profiling reveals critical roles of linear ubiquitination in tissue homeostasis," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    7. Patrick Bryant & Gabriele Pozzati & Arne Elofsson, 2022. "Improved prediction of protein-protein interactions using AlphaFold2," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    8. Xing Chen & Jun Yin & Jia Qu & Li Huang, 2018. "MDHGI: Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-24, August.
    9. Hong-Wen Tang & Kerstin Spirohn & Yanhui Hu & Tong Hao & István A. Kovács & Yue Gao & Richard Binari & Donghui Yang-Zhou & Kenneth H. Wan & Joel S. Bader & Dawit Balcha & Wenting Bian & Benjamin W. Bo, 2023. "Next-generation large-scale binary protein interaction network for Drosophila melanogaster," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    10. Li-Chen Hung & Pei-Tseng Kung & Chi-Hsuan Lung & Ming-Hsui Tsai & Shih-An Liu & Li-Ting Chiu & Kuang-Hua Huang & Wen-Chen Tsai, 2020. "Assessment of the Risk of Oral Cancer Incidence in A High-Risk Population and Establishment of A Predictive Model for Oral Cancer Incidence Using A Population-Based Cohort in Taiwan," IJERPH, MDPI, vol. 17(2), pages 1-15, January.
    11. Mijeong Kim & Yu Jin Jang & Muyoung Lee & Qingqing Guo & Albert J. Son & Nikita A. Kakkad & Abigail B. Roland & Bum-Kyu Lee & Jonghwan Kim, 2024. "The transcriptional regulatory network modulating human trophoblast stem cells to extravillous trophoblast differentiation," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    12. Chen Jia & Ramon Grima, 2024. "Holimap: an accurate and efficient method for solving stochastic gene network dynamics," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    13. Jianhua Li & Xiaoyan Lin & Yueyang Teng & Shouliang Qi & Dayu Xiao & Jianying Zhang & Yan Kang, 2016. "A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization," PLOS ONE, Public Library of Science, vol. 11(7), pages 1-18, July.
    14. Gold, E. Richard, 2021. "The fall of the innovation empire and its possible rise through open science," Research Policy, Elsevier, vol. 50(5).
    15. Gergo Gogl & Boglarka Zambo & Camille Kostmann & Alexandra Cousido-Siah & Bastien Morlet & Fabien Durbesson & Luc Negroni & Pascal Eberling & Pau Jané & Yves Nominé & Andras Zeke & Søren Østergaard & , 2022. "Quantitative fragmentomics allow affinity mapping of interactomes," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    16. Patrick Bryant & Gabriele Pozzati & Wensi Zhu & Aditi Shenoy & Petras Kundrotas & Arne Elofsson, 2022. "Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    17. Le Ou-Yang & Dao-Qing Dai & Xiao-Fei Zhang, 2013. "Protein Complex Detection via Weighted Ensemble Clustering Based on Bayesian Nonnegative Matrix Factorization," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-18, May.
    18. Cheng, Yuanyuan, 2023. "A method of 3R to evaluate the correlation and predictive value of variables," OSF Preprints c79tu, Center for Open Science.
    19. Hao Li & Zebei Han & Yu Sun & Fu Wang & Pengzhen Hu & Yuang Gao & Xuemei Bai & Shiyu Peng & Chao Ren & Xiang Xu & Zeyu Liu & Hebing Chen & Yang Yang & Xiaochen Bo, 2024. "CGMega: explainable graph neural network framework with attention mechanisms for cancer gene module dissection," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    20. Morten Dybdahl Krebs & Gonçalo Espregueira Themudo & Michael Eriksen Benros & Ole Mors & Anders D. Børglum & David Hougaard & Preben Bo Mortensen & Merete Nordentoft & Michael J. Gandal & Chun Chieh F, 2021. "Associations between patterns in comorbid diagnostic trajectories of individuals with schizophrenia and etiological factors," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
