
Direct generation of protein conformational ensembles via machine learning

Authors

  • Giacomo Janson (Michigan State University)
  • Gilberto Valdes-Garcia (Michigan State University)
  • Lim Heo (Michigan State University)
  • Michael Feig (Michigan State University)

Abstract

Dynamics and conformational sampling are essential for linking protein structure to biological function. While protein dynamics are challenging to probe experimentally, computer simulations are widely used to describe them, but at significant computational cost that continues to limit the systems that can be studied. Here, we demonstrate that machine learning models can be trained with simulation data to directly generate physically realistic conformational ensembles of proteins without the need for any sampling and at negligible computational cost. As a proof of principle, we train a generative adversarial network based on a transformer architecture with self-attention on coarse-grained simulations of intrinsically disordered peptides. The resulting model, idpGAN, can predict sequence-dependent coarse-grained ensembles for sequences that are not present in the training set, demonstrating that transferability can be achieved beyond the limited training data. We also retrain idpGAN on atomistic simulation data to show that the approach can be extended in principle to higher-resolution conformational ensemble generation.
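
To make the idea of "direct generation" concrete, the following is a minimal illustrative sketch, not the authors' idpGAN implementation. It assumes a toy generator that takes a one-hot encoded amino acid sequence plus random latent noise, applies a single self-attention layer, and outputs one 3D coordinate per residue. The class name `ToyGenerator`, the layer sizes, and the random untrained weights are all hypothetical; the adversarial training step, the discriminator, and positional encodings are omitted. The point is only that each conformation comes from one forward pass rather than from iterative simulation.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues


def one_hot(sequence):
    """Encode a sequence as an (L, 20) one-hot matrix."""
    idx = [AMINO_ACIDS.index(a) for a in sequence]
    return np.eye(len(AMINO_ACIDS))[idx]


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyGenerator:
    """Hypothetical generator: (sequence, latent noise) -> coordinates.

    Weights are random and untrained; this is an assumption-laden sketch.
    """

    def __init__(self, latent_dim=8, model_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = len(AMINO_ACIDS) + latent_dim
        self.latent_dim = latent_dim
        self.w_in = rng.normal(0.0, 0.1, (in_dim, model_dim))
        self.w_q = rng.normal(0.0, 0.1, (model_dim, model_dim))
        self.w_k = rng.normal(0.0, 0.1, (model_dim, model_dim))
        self.w_v = rng.normal(0.0, 0.1, (model_dim, model_dim))
        self.w_out = rng.normal(0.0, 0.1, (model_dim, 3))

    def generate(self, sequence, n_conformations, rng):
        seq = one_hot(sequence)                       # (L, 20)
        length = seq.shape[0]
        # Fresh latent noise per conformation and per residue.
        z = rng.normal(size=(n_conformations, length, self.latent_dim))
        seq_b = np.broadcast_to(seq, (n_conformations,) + seq.shape)
        x = np.concatenate([seq_b, z], axis=-1)       # (n, L, 20 + latent)
        h = np.tanh(x @ self.w_in)                    # (n, L, d)
        # Single-head scaled dot-product self-attention over residues.
        q, k, v = h @ self.w_q, h @ self.w_k, h @ self.w_v
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
        h = h + softmax(scores, axis=-1) @ v          # residual connection
        return h @ self.w_out                         # (n, L, 3)
```

Calling `ToyGenerator().generate("GSGS", 100, np.random.default_rng(0))` returns an array of shape `(100, 4, 3)`: one hundred conformations of a four-residue peptide, each produced independently from its own latent vector. In a real model the weights would be learned from simulation data so that the distribution of outputs matches the simulated ensemble.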

Suggested Citation

  • Giacomo Janson & Gilberto Valdes-Garcia & Lim Heo & Michael Feig, 2023. "Direct generation of protein conformational ensembles via machine learning," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-36443-x
    DOI: 10.1038/s41467-023-36443-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-36443-x
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shams Mehdi & Pratyush Tiwary, 2024. "Thermodynamics-inspired explanations of artificial intelligence," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    2. Konstantin Avchaciov & Marina P. Antoch & Ekaterina L. Andrianova & Andrei E. Tarkhov & Leonid I. Menshikov & Olga Burmistrova & Andrei V. Gudkov & Peter O. Fedichev, 2022. "Unsupervised learning of aging principles from longitudinal data," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Corneel Casert & Isaac Tamblyn & Stephen Whitelam, 2024. "Learning stochastic dynamics and predicting emergent behavior using transformers," Nature Communications, Nature, vol. 15(1), pages 1-7, December.
    4. Benjamin D Lee & Anthony Gitter & Casey S Greene & Sebastian Raschka & Finlay Maguire & Alexander J Titus & Michael D Kessler & Alexandra J Lee & Marc G Chevrette & Paul Allen Stewart & Thiago Britto-, 2022. "Ten quick tips for deep learning in biology," PLOS Computational Biology, Public Library of Science, vol. 18(3), pages 1-20, March.
    5. Joshua S. North & Christopher K. Wikle & Erin M. Schliep, 2023. "A Review of Data‐Driven Discovery for Dynamic Systems," International Statistical Review, International Statistical Institute, vol. 91(3), pages 464-492, December.
    6. Janni Harju & Muriel C. F. Teeseling & Chase P. Broedersz, 2024. "Loop-extruders alter bacterial chromosome topology to direct entropic forces for segregation," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    7. Christian Hentrich & Mateusz Putyrski & Hanh Hanuschka & Waldemar Preis & Sarah-Jane Kellmann & Melissa Wich & Manuel Cavada & Sarah Hanselka & Victor S. Lelyveld & Francisco Ylera, 2024. "Engineered reversible inhibition of SpyCatcher reactivity enables rapid generation of bispecific antibodies," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    8. Tom Dixon & Derek MacPherson & Barmak Mostofian & Taras Dauzhenka & Samuel Lotz & Dwight McGee & Sharon Shechter & Utsab R. Shrestha & Rafal Wiewiora & Zachary A. McDargh & Fen Pei & Rajat Pal & João , 2022. "Predicting the structural basis of targeted protein degradation by integrating molecular dynamics simulations with structural mass spectrometry," Nature Communications, Nature, vol. 13(1), pages 1-24, December.
    9. Joseph G. Beton & Thomas Mulvaney & Tristan Cragnolini & Maya Topf, 2024. "Cryo-EM structure and B-factor refinement with ensemble representation," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    10. Andreas Mardt & Tim Hempel & Cecilia Clementi & Frank Noé, 2022. "Deep learning to decompose macromolecules into independent Markovian domains," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    11. Cheng Shen & Yuqing Zhang & Wenwen Cui & Yimeng Zhao & Danqi Sheng & Xinyu Teng & Miaoqing Shao & Muneyoshi Ichikawa & Jin Wang & Motoyuki Hattori, 2023. "Structural insights into the allosteric inhibition of P2X4 receptors," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    12. Re’em Moskovitz & Tossapol Pholcharee & Sophia M. DonVito & Bora Guloglu & Edward Lowe & Franziska Mohring & Robert W. Moon & Matthew K. Higgins, 2023. "Structural basis for DARC binding in reticulocyte invasion by Plasmodium vivax," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    13. David P. McDonogh & Julian D. Gale & Paolo Raiteri & Denis Gebauer, 2024. "Redefined ion association constants have consequences for calcium phosphate nucleation and biomineralization," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    14. Timothy R Lezon & Ivet Bahar, 2010. "Using Entropy Maximization to Understand the Determinants of Structural Dynamics beyond Native Contact Topology," PLOS Computational Biology, Public Library of Science, vol. 6(6), pages 1-12, June.
    15. F. P. Panei & P. Gkeka & M. Bonomi, 2024. "Identifying small-molecules binding sites in RNA conformational ensembles with SHAMAN," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    16. Zsolt Fazekas & Dóra K. Menyhárd & András Perczel, 2024. "LoCoHD: a metric for comparing local environments of proteins," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    17. Jaewoon Jung & Cheng Tan & Yuji Sugita, 2024. "GENESIS CGDYN: large-scale coarse-grained MD simulation with dynamic load balancing for heterogeneous biomolecular systems," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    18. Trayder Thomas & Benoît Roux, 2021. "Tyrosine kinases: complex molecular systems challenging computational methodologies," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 94(10), pages 1-13, October.
    19. Amy Rice & Sourav Haldar & Eric Wang & Paul S. Blank & Sergey A. Akimov & Timur R. Galimzyanov & Richard W. Pastor & Joshua Zimmerberg, 2022. "Planar aggregation of the influenza viral fusion peptide alters membrane structure and hydration, promoting poration," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    20. Santiago Esteban-Martín & Robert Bryn Fenwick & Jörgen Ådén & Benjamin Cossins & Carlos W Bertoncini & Victor Guallar & Magnus Wolf-Watz & Xavier Salvatella, 2014. "Correlated Inter-Domain Motions in Adenylate Kinase," PLOS Computational Biology, Public Library of Science, vol. 10(7), pages 1-7, July.
