Printed from https://ideas.repec.org/a/nat/natcom/v12y2021i1d10.1038_s41467-021-23303-9.html

Structure-based protein function prediction using graph convolutional networks

Authors

  • Vladimir Gligorijević

    (Flatiron Institute)

  • P. Douglas Renfrew

    (Flatiron Institute)

  • Tomasz Kosciolek

    (University of California San Diego
    Jagiellonian University)

  • Julia Koehler Leman

    (Flatiron Institute)

  • Daniel Berenberg

    (Flatiron Institute
    New York University)

  • Tommi Vatanen

    (Broad Institute of MIT and Harvard
    University of Auckland)

  • Chris Chandler

    (Flatiron Institute)

  • Bryn C. Taylor

    (University of California San Diego)

  • Ian M. Fisk

    (Flatiron Institute, Simons Foundation)

  • Hera Vlamakis

    (Broad Institute of MIT and Harvard)

  • Ramnik J. Xavier

    (Broad Institute of MIT and Harvard
    Massachusetts General Hospital and Harvard Medical School
    MIT)

  • Rob Knight

    (University of California San Diego)

  • Kyunghyun Cho

    (New York University
    CIFAR Azrieli Global Scholar)

  • Richard Bonneau

    (Flatiron Institute
    New York University)

Abstract

The rapid increase in the number of proteins in sequence databases and the diversity of their functions challenge computational approaches for automated function prediction. Here, we introduce DeepFRI, a Graph Convolutional Network for predicting protein functions by leveraging sequence features extracted from a protein language model and protein structures. It outperforms current leading methods and sequence-based Convolutional Neural Networks and scales to the size of current sequence repositories. Augmenting the training set of experimental structures with homology models allows us to significantly expand the number of predictable functions. DeepFRI has significant de-noising capability, with only a minor drop in performance when experimental structures are replaced by protein models. Class activation mapping allows function predictions at an unprecedented resolution, allowing site-specific annotations at the residue-level in an automated manner. We show the utility and high performance of our method by annotating structures from the PDB and SWISS-MODEL, making several new confident function predictions. DeepFRI is available as a webserver at https://beta.deepfri.flatironinstitute.org/ .
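
As a rough illustration of the idea in the abstract (a graph convolution applied to a residue contact graph, with per-residue sequence features as node inputs), the following is a minimal sketch in plain Python. It is not the DeepFRI implementation. The 10 Å cutoff, the toy coordinates, the two-dimensional features standing in for language-model embeddings, the identity weight matrix, and all function names are assumptions made only for this example.

```python
import math


def contact_map(coords, cutoff=10.0):
    """Binary contact map: 1.0 where two residues lie within `cutoff` angstroms.

    The diagonal is 1.0 (distance zero), which acts as a self-loop.
    The cutoff value is an assumption for this sketch.
    """
    n = len(coords)
    return [[1.0 if math.dist(coords[i], coords[j]) <= cutoff else 0.0
             for j in range(n)] for i in range(n)]


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 A D^-1/2 of the adjacency matrix."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def pool(node_features):
    """Sum residue-level features into one protein-level vector."""
    return [sum(col) for col in zip(*node_features)]


# Toy input: three residues, the first two in contact, the third isolated.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical embeddings
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity, for readability

adj = contact_map(coords)
hidden = gcn_layer(normalize_adjacency(adj), features, weights)
protein_vector = pool(hidden)
print(protein_vector)
```

In this sketch each residue's output mixes its own features with those of its spatial neighbours, and the pooled vector is what a classifier over function labels would consume. A trained model would learn the weights and stack several such layers.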

Suggested Citation

  • Vladimir Gligorijević & P. Douglas Renfrew & Tomasz Kosciolek & Julia Koehler Leman & Daniel Berenberg & Tommi Vatanen & Chris Chandler & Bryn C. Taylor & Ian M. Fisk & Hera Vlamakis & Ramnik J. Xavier & Rob Knight & Kyunghyun Cho & Richard Bonneau, 2021. "Structure-based protein function prediction using graph convolutional networks," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-23303-9
    DOI: 10.1038/s41467-021-23303-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-021-23303-9
    File Function: Abstract
    Download Restriction: no


    Citations


    Cited by:

    1. Ziqi Gao & Chenran Jiang & Jiawen Zhang & Xiaosen Jiang & Lanqing Li & Peilin Zhao & Huanming Yang & Yong Huang & Jia Li, 2023. "Hierarchical graph learning for protein–protein interaction," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    2. Yaan J. Jang & Qi-Qi Qin & Si-Yu Huang & Arun T. John Peter & Xue-Ming Ding & Benoît Kornmann, 2024. "Accurate prediction of protein function using statistics-informed graph networks," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    3. Marco Malatesta & Emanuele Fornasier & Martino Luigi Salvo & Angela Tramonti & Erika Zangelmi & Alessio Peracchi & Andrea Secchi & Eugenia Polverini & Gabriele Giachin & Roberto Battistutta & Roberto , 2024. "One substrate many enzymes virtual screening uncovers missing genes of carnitine biosynthesis in human and mouse," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    4. Shunshi Kohyama & Béla P. Frohn & Leon Babl & Petra Schwille, 2024. "Machine learning-aided design and screening of an emergent protein function in synthetic cells," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    5. Samuel Miravet-Verde & Rocco Mazzolini & Carolina Segura-Morales & Alicia Broto & Maria Lluch-Senar & Luis Serrano, 2024. "ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    6. Julia Koehler Leman & Pawel Szczerbiak & P. Douglas Renfrew & Vladimir Gligorijevic & Daniel Berenberg & Tommi Vatanen & Bryn C. Taylor & Chris Chandler & Stefan Janssen & Andras Pataki & Nick Carrier, 2023. "Sequence-structure-function relationships in the microbial protein universe," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    7. Stefanie Duller & Simone Vrbancic & Łukasz Szydłowski & Alexander Mahnert & Marcus Blohs & Michael Predl & Christina Kumpitsch & Verena Zrim & Christoph Högenauer & Tomasz Kosciolek & Ruth A. Schmitz, 2024. "Targeted isolation of Methanobrevibacter strains from fecal samples expands the cultivated human archaeome," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
