Printed from https://ideas.repec.org/a/plo/pcbi00/1007078.html

Disease gene prediction for molecularly uncharacterized diseases

Author

Listed:
  • Juan J Cáceres
  • Alberto Paccanaro

Abstract

Network medicine approaches have been largely successful at increasing our knowledge of molecularly characterized diseases. Given a set of disease genes associated with a disease, neighbourhood-based methods and random walkers exploit the interactome allowing the prediction of further genes for that disease. In general, however, diseases with no known molecular basis constitute a challenge. Here we present a novel network approach to prioritize gene-disease associations that is able to also predict genes for diseases with no known molecular basis. Our method, which we have called Cardigan (ChARting DIsease Gene AssociatioNs), uses semi-supervised learning and exploits a measure of similarity between disease phenotypes. We evaluated its performance at predicting genes for both molecularly characterized and uncharacterized diseases in OMIM, using both weighted and binary interactomes, and compared it with state-of-the-art methods. Our tests, which use datasets collected at different points in time to replicate the dynamics of the disease gene discovery process, prove that Cardigan is able to accurately predict disease genes for molecularly uncharacterized diseases. Additionally, standard leave-one-out cross validation tests show how our approach outperforms state-of-the-art methods at predicting genes for molecularly characterized diseases by 14%-65%. Cardigan can also be used for disease module prediction, where it outperforms state-of-the-art methods by 87%-299%.

Author summary

The elucidation of the genetic causes of diseases is central to understanding the mechanisms of action of a pathology and the development of treatments. Disease gene prediction methods streamline the discovery of the molecular basis for a disease by prioritizing genes for experimental validation. Although some methods use disease phenotype to aid the prioritization, the great majority use outdated static matrices which limits their disease coverage. Our approach uses an updatable disease phenotype similarity, and employs a non-linear transformation to define a prior probability distribution over the genes that mimics the distribution of disease genes in the interactome. Subsequently, a semi-supervised learning method establishes a prioritization ordering for all genes in the interactome, even for diseases with no known molecular basis. Our method can be used not only to obtain a better prioritization for disease-gene associations, but also for retrieving disease modules.
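The general idea described above (a phenotype-derived prior over genes, followed by network-based prioritization) can be illustrated with a minimal sketch. This is not the Cardigan implementation. It substitutes a plain similarity-weighted sum for the paper's non-linear transformation and a generic random walk with restart for its semi-supervised learner. The function names `phenotype_prior` and `propagate`, the toy four-gene interactome, the disease labels, and the similarity value are all hypothetical and chosen only for illustration.

```python
def phenotype_prior(similarity, known_genes, n_genes):
    """Prior over genes for a query disease, built from phenotypically similar diseases.

    Assumption: a simple similarity-weighted sum, normalized to sum to 1.
    similarity:  {disease_id: similarity to the query disease}
    known_genes: {disease_id: iterable of gene indices known for that disease}
    """
    prior = [0.0] * n_genes
    for disease, sim in similarity.items():
        for gene in known_genes.get(disease, ()):
            prior[gene] += sim
    total = sum(prior)
    if total == 0:
        # No information at all: fall back to a uniform prior.
        return [1.0 / n_genes] * n_genes
    return [p / total for p in prior]


def propagate(adj, prior, alpha=0.5, tol=1e-12, max_iter=10_000):
    """Random walk with restart: f <- alpha * W f + (1 - alpha) * prior.

    adj is a symmetric (undirected) adjacency matrix given as nested lists;
    W is adj with each column divided by the degree of that column's node.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    f = list(prior)
    for _ in range(max_iter):
        new = []
        for i in range(n):
            inflow = sum(adj[i][j] * f[j] / deg[j] for j in range(n) if adj[i][j])
            new.append(alpha * inflow + (1 - alpha) * prior[i])
        if max(abs(a - b) for a, b in zip(new, f)) < tol:
            return new
        f = new
    return f


# Toy interactome: a path of four genes, 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# The query disease has no known genes; a similar disease "d1" is linked to gene 0.
similarity_to_query = {"d1": 0.9}
known_genes = {"d1": [0]}

prior = phenotype_prior(similarity_to_query, known_genes, n_genes=4)
scores = propagate(adj, prior)
ranking = sorted(range(len(scores)), key=lambda g: -scores[g])
```

In this toy case the query disease contributes no seed genes of its own, yet every gene in the network still receives a score, because the prior is inherited from the phenotypically similar disease. Genes closer to gene 0 rank higher.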

Suggested Citation

  • Juan J Cáceres & Alberto Paccanaro, 2019. "Disease gene prediction for molecularly uncharacterized diseases," PLOS Computational Biology, Public Library of Science, vol. 15(7), pages 1-14, July.
  • Handle: RePEc:plo:pcbi00:1007078
    DOI: 10.1371/journal.pcbi.1007078

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007078
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1007078&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ke Hu & Ju Xiang & Yun-Xia Yu & Liang Tang & Qin Xiang & Jian-Ming Li & Yong-Hong Tang & Yong-Jun Chen & Yan Zhang, 2020. "Significance-based multi-scale method for network community detection and its application in disease-gene prediction," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-24, March.
    2. Abby Hill & Scott Gleim & Florian Kiefer & Frederic Sigoillot & Joseph Loureiro & Jeremy Jenkins & Melody K Morris, 2019. "Benchmarking network algorithms for contextualizing genes of interest," PLOS Computational Biology, Public Library of Science, vol. 15(12), pages 1-14, December.
    3. Florin Ratajczak & Mitchell Joblin & Marcel Hildebrandt & Martin Ringsquandl & Pascal Falter-Braun & Matthias Heinig, 2023. "Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    4. T M Murali & Matthew D Dyer & David Badger & Brett M Tyler & Michael G Katze, 2011. "Network-Based Prediction and Analysis of HIV Dependency Factors," PLOS Computational Biology, Public Library of Science, vol. 7(9), pages 1-15, September.
    5. Deborah Chasman & Brandi Gancarz & Linhui Hao & Michael Ferris & Paul Ahlquist & Mark Craven, 2014. "Inferring Host Gene Subnetworks Involved in Viral Replication," PLOS Computational Biology, Public Library of Science, vol. 10(5), pages 1-22, May.
    6. Xing Chen & Jun Yin & Jia Qu & Li Huang, 2018. "MDHGI: Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-24, August.
    7. Li-Chen Hung & Pei-Tseng Kung & Chi-Hsuan Lung & Ming-Hsui Tsai & Shih-An Liu & Li-Ting Chiu & Kuang-Hua Huang & Wen-Chen Tsai, 2020. "Assessment of the Risk of Oral Cancer Incidence in A High-Risk Population and Establishment of A Predictive Model for Oral Cancer Incidence Using A Population-Based Cohort in Taiwan," IJERPH, MDPI, vol. 17(2), pages 1-15, January.
    8. Peter Marx & Peter Antal & Bence Bolgar & Gyorgy Bagdy & Bill Deakin & Gabriella Juhasz, 2017. "Comorbidities in the diseasome are more apparent than real: What Bayesian filtering reveals about the comorbidities of depression," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-23, June.
    9. Jianhua Li & Xiaoyan Lin & Yueyang Teng & Shouliang Qi & Dayu Xiao & Jianying Zhang & Yan Kang, 2016. "A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization," PLOS ONE, Public Library of Science, vol. 11(7), pages 1-18, July.
    10. Sepideh Sadegh & James Skelton & Elisa Anastasi & Andreas Maier & Klaudia Adamowicz & Anna Möller & Nils M. Kriege & Jaanika Kronberg & Toomas Haller & Tim Kacprowski & Anil Wipat & Jan Baumbach & Dav, 2023. "Lacking mechanistic disease definitions and corresponding association data hamper progress in network medicine and beyond," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    11. Le Ou-Yang & Dao-Qing Dai & Xiao-Fei Zhang, 2013. "Protein Complex Detection via Weighted Ensemble Clustering Based on Bayesian Nonnegative Matrix Factorization," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-18, May.
    12. Elisa Salviato & Vera Djordjilović & Monica Chiogna & Chiara Romualdi, 2019. "SourceSet: A graphical model approach to identify primary genes in perturbed biological pathways," PLOS Computational Biology, Public Library of Science, vol. 15(10), pages 1-28, October.
    13. Cui, Ying & Cai, Meng & Stanley, H. Eugene, 2018. "Discovering disease-associated genes in weighted protein–protein interaction networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 496(C), pages 53-61.
    14. Mengyun Yang & Huimin Luo & Yaohang Li & Fang-Xiang Wu & Jianxin Wang, 2019. "Overlap matrix completion for predicting drug-associated indications," PLOS Computational Biology, Public Library of Science, vol. 15(12), pages 1-21, December.
    15. Daniel E Carlin & Barry Demchak & Dexter Pratt & Eric Sage & Trey Ideker, 2017. "Network propagation in the cytoscape cyberinfrastructure," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-9, October.
    16. Sepideh Sadegh & James Skelton & Elisa Anastasi & Judith Bernett & David B. Blumenthal & Gihanna Galindez & Marisol Salgado-Albarrán & Olga Lazareva & Keith Flanagan & Simon Cockell & Cristian Nogales, 2021. "Network medicine for disease module identification and drug repurposing with the NeDRex platform," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    17. Zhen Shen & You-Hua Zhang & Kyungsook Han & Asoke K. Nandi & Barry Honig & De-Shuang Huang, 2017. "miRNA-Disease Association Prediction with Collaborative Matrix Factorization," Complexity, Hindawi, vol. 2017, pages 1-9, September.
    18. U Martin Singh-Blom & Nagarajan Natarajan & Ambuj Tewari & John O Woods & Inderjit S Dhillon & Edward M Marcotte, 2013. "Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-17, May.
    19. MaoQiang Xie & YingJie Xu & YaoGong Zhang & TaeHyun Hwang & Rui Kuang, 2015. "Network-based Phenome-Genome Association Prediction by Bi-Random Walk," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-18, May.
    20. Joana P Gonçalves & Alexandre P Francisco & Yves Moreau & Sara C Madeira, 2012. "Interactogeneous: Disease Gene Prioritization Using Heterogeneous Networks and Full Topology Scores," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-13, November.
