
Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks

Author

Listed:
  • Yang Li
  • Chengxin Zhang
  • Eric W Bell
  • Wei Zheng
  • Xiaogen Zhou
  • Dong-Jun Yu
  • Yang Zhang

Abstract

The topology of protein folds can be specified by inter-residue contact-maps, and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural networks. Compared to previous approaches, the major advantage of TripletRes is its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from whole-genome and metagenome databases, thereby minimizing information loss during contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from the CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that simply re-training the TripletRes model with more proteins can lead to further improvement, with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel, efficient approach to extending the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structures of proteins that lack homologous templates in the PDB library.

Author summary: Ab initio protein folding has been a major unsolved problem in computational biology for more than half a century. Recent community-wide Critical Assessment of Structure Prediction (CASP) experiments have witnessed exciting progress in ab initio structure prediction, driven mainly by improvements in contact-map prediction, since predicted contacts can be used as constraints to guide ab initio folding simulations. In this work, we proposed a new open-source deep-learning architecture, TripletRes, built on residual convolutional neural networks for high-accuracy contact prediction. The large-scale benchmark and blind test results demonstrate that the proposed method performs competitively with other top approaches in predicting the medium- and long-range contact-maps that are critical for guiding protein folding simulations. Detailed data analyses showed that the major advantage of TripletRes lies in its unique protocol for fusing multiple evolutionary feature matrices, which are extracted directly from whole-genome and metagenome databases and therefore minimize information loss during contact model training.
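
The top-L and top-L/5 long-range precision figures quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function name `top_k_long_range_precision` and its parameters are hypothetical. The sketch assumes the common CASP convention that a pair is long-range when the two residues are at least 24 positions apart in sequence, which this page does not state, and it takes the native contact map as a ready-made boolean matrix.

```python
import numpy as np

def top_k_long_range_precision(pred, true_contacts, divisor=1, min_sep=24):
    """Fraction of true contacts among the top L/divisor long-range predictions.

    pred          : (L, L) array of predicted contact scores (higher = more likely)
    true_contacts : (L, L) boolean array of native contacts
    divisor       : 1 for top-L, 5 for top-L/5, and so on
    min_sep       : minimum sequence separation |i - j| (assumed convention: 24)
    """
    pred = np.asarray(pred, dtype=float)
    true_contacts = np.asarray(true_contacts, dtype=bool)
    L = pred.shape[0]

    # Residue pairs (i, j) with j - i >= min_sep, taken from the upper triangle.
    i, j = np.triu_indices(L, k=min_sep)
    if i.size == 0:
        return float("nan")  # protein too short to have long-range pairs

    # Number of predictions to score, capped by the number of eligible pairs.
    k = min(max(1, L // divisor), i.size)

    # Indices of the k highest-scoring long-range pairs.
    order = np.argsort(-pred[i, j], kind="stable")[:k]
    return float(true_contacts[i[order], j[order]].mean())
```

Calling the function with `divisor=5` corresponds to the top-L/5 setting; averaging the per-protein values over a benchmark set gives a set-level precision of the kind reported above.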

Suggested Citation

  • Yang Li & Chengxin Zhang & Eric W Bell & Wei Zheng & Xiaogen Zhou & Dong-Jun Yu & Yang Zhang, 2021. "Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks," PLOS Computational Biology, Public Library of Science, vol. 17(3), pages 1-19, March.
  • Handle: RePEc:plo:pcbi00:1008865
    DOI: 10.1371/journal.pcbi.1008865

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008865
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1008865&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Sheng Wang & Siqi Sun & Zhen Li & Renyu Zhang & Jinbo Xu, 2017. "Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model," PLOS Computational Biology, Public Library of Science, vol. 13(1), pages 1-34, January.
    2. Joe G. Greener & Shaun M. Kandathil & David T. Jones, 2019. "Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints," Nature Communications, Nature, vol. 10(1), pages 1-13, December.
    3. Sean R Eddy, 2011. "Accelerated Profile HMM Searches," PLOS Computational Biology, Public Library of Science, vol. 7(10), pages 1-16, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rahmatullah Roche & Sutanu Bhattacharya & Debswapna Bhattacharya, 2021. "Hybridized distance- and contact-based hierarchical structure modeling for folding soluble and membrane proteins," PLOS Computational Biology, Public Library of Science, vol. 17(2), pages 1-31, February.
    2. Peicong Lin & Yumeng Yan & Huanyu Tao & Sheng-You Huang, 2023. "Deep transfer learning for inter-chain contact predictions of transmembrane protein complexes," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. Ngaam J Cheung & Wookyung Yu, 2018. "De novo protein structure prediction using ultra-fast molecular dynamics simulation," PLOS ONE, Public Library of Science, vol. 13(11), pages 1-17, November.
    4. Ezequiel A Galpern & María I Freiberger & Diego U Ferreiro, 2020. "Large Ankyrin repeat proteins are formed with similar and energetically favorable units," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-16, June.
    5. Shuangxi Ji & Tuğçe Oruç & Liam Mead & Muhammad Fayyaz Rehman & Christopher Morton Thomas & Sam Butterworth & Peter James Winn, 2019. "DeepCDpred: Inter-residue distance and contact prediction for improved prediction of protein structure," PLOS ONE, Public Library of Science, vol. 14(1), pages 1-15, January.
    6. Juan A Morales-Cordovilla & Victoria Sanchez & Martin Ratajczak, 2018. "Protein alignment based on higher order conditional random fields for template-based modeling," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-14, June.
    7. Amit A Upadhyay & Aaron D Fleetwood & Ogun Adebali & Robert D Finn & Igor B Zhulin, 2016. "Cache Domains That are Homologous to, but Different from PAS Domains Comprise the Largest Superfamily of Extracellular Sensors in Prokaryotes," PLOS Computational Biology, Public Library of Science, vol. 12(4), pages 1-21, April.
    8. Samantha Petti & Sean R Eddy, 2022. "Constructing benchmark test sets for biological sequence analysis using independent set algorithms," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-14, March.
    9. David Lee & Sayoni Das & Natalie L Dawson & Dragana Dobrijevic & John Ward & Christine Orengo, 2016. "Novel Computational Protocols for Functionally Classifying and Characterising Serine Beta-Lactamases," PLOS Computational Biology, Public Library of Science, vol. 12(6), pages 1-33, June.
    10. Dowan Kim & Myunghee Jung & In Jin Ha & Min Young Lee & Seok-Geun Lee & Younhee Shin & Sathiyamoorthy Subramaniyam & Jaehyeon Oh, 2018. "Transcriptional Profiles of Secondary Metabolite Biosynthesis Genes and Cytochromes in the Leaves of Four Papaver Species," Data, MDPI, vol. 3(4), pages 1-15, November.
    11. Dong-Hyun Kim & Hyun-Sik Yun & Young-Saeng Kim & Jong-Guk Kim, 2021. "Pollutant-Removing Biofilter Strains Associated with High Ammonia and Hydrogen Sulfide Removal Rate in a Livestock Wastewater Treatment Facility," Sustainability, MDPI, vol. 13(13), pages 1-16, June.
    12. Cuncong Zhong & Anna Edlund & Youngik Yang & Jeffrey S McLean & Shibu Yooseph, 2016. "Metagenome and Metatranscriptome Analyses Using Protein Family Profiles," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-22, July.
    13. Jaume Bonet & Sarah Wehrle & Karen Schriever & Che Yang & Anne Billet & Fabian Sesterhenn & Andreas Scheck & Freyr Sverrisson & Barbora Veselkova & Sabrina Vollers & Roxanne Lourman & Mélanie Villard, 2018. "Rosetta FunFolDes – A general framework for the computational design of functional proteins," PLOS Computational Biology, Public Library of Science, vol. 14(11), pages 1-30, November.
    14. Andrew J McGehee & Sutanu Bhattacharya & Rahmatullah Roche & Debswapna Bhattacharya, 2020. "PolyFold: An interactive visual simulator for distance-based protein folding," PLOS ONE, Public Library of Science, vol. 15(12), pages 1-11, December.
    15. Lei Wang & Jiangguo Zhang & Dali Wang & Chen Song, 2022. "Membrane contact probability: An essential and predictive character for the structural and functional studies of membrane proteins," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-27, March.
    16. Zhiye Guo & Jian Liu & Jeffrey Skolnick & Jianlin Cheng, 2022. "Prediction of inter-chain distance maps of protein complexes with 2D attention-based deep neural networks," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    17. Claudio Mirabello & Björn Wallner, 2019. "rawMSA: End-to-end Deep Learning using raw Multiple Sequence Alignments," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-15, August.
    18. Rui Fa & Domenico Cozzetto & Cen Wan & David T Jones, 2018. "Predicting human protein function with multi-task deep neural networks," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-16, June.
    19. Damiano Piovesan & Andras Hatos & Giovanni Minervini & Federica Quaglia & Alexander Miguel Monzon & Silvio C E Tosatto, 2020. "Assessing predictors for new post translational modification sites: A case study on hydroxylation," PLOS Computational Biology, Public Library of Science, vol. 16(6), pages 1-15, June.
    20. Balázs Szalkai & Ildikó Scheer & Kinga Nagy & Beáta G Vértessy & Vince Grolmusz, 2014. "The Metagenomic Telescope," PLOS ONE, Public Library of Science, vol. 9(7), pages 1-9, July.
