IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-34937-8.html

Sampling of structure and sequence space of small protein folds

Author

Listed:
  • Thomas W. Linsky

    (University of Washington)

  • Kyle Noble

    (University of Georgia)

  • Autumn R. Tobin

    (University of Georgia)

  • Rachel Crow

    (University of Washington)

  • Lauren Carter

    (University of Washington)

  • Jeffrey L. Urbauer

    (University of Georgia)

  • David Baker

    (University of Washington)

  • Eva-Maria Strauch

    (University of Georgia)

Abstract

Nature only samples a small fraction of the sequence space that can fold into stable proteins. Furthermore, small structural variations in a single fold, sometimes only a few amino acids, can define a protein’s molecular function. Hence, to design proteins with novel functionalities, such as molecular recognition, methods to control and sample shape diversity are necessary. To explore this space, we developed and experimentally validated a computational platform that can design a wide variety of small protein folds while sampling shape diversity. We designed and evaluated stability of about 30,000 de novo protein designs of eight different folds. Among these designs, about 6,200 stable proteins were identified, including some predicted to have a first-of-its-kind minimalized thioredoxin fold. Obtained data revealed protein folding rules for structural features such as helix-connecting loops. Beyond serving as a resource for protein engineering, this massive and diverse dataset also provides training data for machine learning. We developed an accurate classifier to predict the stability of our designed proteins. The methods and the wide range of protein shapes provide a basis for designing new protein functions without compromising stability.

Suggested Citation

  • Thomas W. Linsky & Kyle Noble & Autumn R. Tobin & Rachel Crow & Lauren Carter & Jeffrey L. Urbauer & David Baker & Eva-Maria Strauch, 2022. "Sampling of structure and sequence space of small protein folds," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-34937-8
    DOI: 10.1038/s41467-022-34937-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-34937-8
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tamuka M. Chidyausiku & Soraia R. Mendes & Jason C. Klima & Marta Nadal & Ulrich Eckhard & Jorge Roel-Touris & Scott Houliston & Tibisay Guevara & Hugh K. Haddox & Adam Moyer & Cheryl H. Arrowsmith & , 2022. "De novo design of immunoglobulin-like domains," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    2. Fabian Sesterhenn & Che Yang & Jaume Bonet & Johannes T. Cramer & Xiaolin Wen & Yimeng Wang & Chi I. Chiang & Luciano Andres Abriata & Iga Kucharska & Giacomo Castoro & Sabrina S. Vollers & Marie Gall, 2020. "De novo protein design enables the precise induction of RSV-neutralizing antibodies," Post-Print hal-02677103, HAL.
    3. Jorge Roel-Touris & Marta Nadal & Enrique Marcos, 2023. "Single-chain dimers from de novo immunoglobulins as robust scaffolds for multiple binding loops," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    4. Betz, Ulrich A.K. & Arora, Loukik & Assal, Reem A. & Azevedo, Hatylas & Baldwin, Jeremy & Becker, Michael S. & Bostock, Stefan & Cheng, Vinton & Egle, Tobias & Ferrari, Nicola & Schneider-Futschik, El, 2023. "Game changers in science and technology - now and beyond," Technological Forecasting and Social Change, Elsevier, vol. 193(C).
    5. Nathaniel R. Bennett & Brian Coventry & Inna Goreshnik & Buwei Huang & Aza Allen & Dionne Vafeados & Ying Po Peng & Justas Dauparas & Minkyung Baek & Lance Stewart & Frank DiMaio & Steven Munck & Savv, 2023. "Improving de novo protein binder design with deep learning," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    6. Julia Skokowa & Birte Hernandez Alvarez & Murray Coles & Malte Ritter & Masoud Nasri & Jérémy Haaf & Narges Aghaallaei & Yun Xu & Perihan Mir & Ann-Christin Krahl & Katherine W. Rogers & Kateryna Maks, 2022. "A topological refactoring design strategy yields highly stable granulopoietic proteins," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    7. Jaume Bonet & Sarah Wehrle & Karen Schriever & Che Yang & Anne Billet & Fabian Sesterhenn & Andreas Scheck & Freyr Sverrisson & Barbora Veselkova & Sabrina Vollers & Roxanne Lourman & Mélanie Villard , 2018. "Rosetta FunFolDes – A general framework for the computational design of functional proteins," PLOS Computational Biology, Public Library of Science, vol. 14(11), pages 1-30, November.
    8. Fatima A. Davila-Hernandez & Biao Jin & Harley Pyles & Shuai Zhang & Zheming Wang & Timothy F. Huddy & Asim K. Bera & Alex Kang & Chun-Long Chen & James J. Yoreo & David Baker, 2023. "Directing polymorph specific calcium carbonate formation with de novo protein templates," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    9. Evgenii Lobzaev & Giovanni Stracquadanio, 2024. "Dirichlet latent modelling enables effective learning and sampling of the functional protein design space," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    10. Sasha B. Ebrahimi & Devleena Samanta, 2023. "Engineering protein-based therapeutics through structural and chemical design," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    11. Daniel Ellis & Julia Lederhofer & Oliver J. Acton & Yaroslav Tsybovsky & Sally Kephart & Christina Yap & Rebecca A. Gillespie & Adrian Creanga & Audrey Olshefsky & Tyler Stephens & Deleah Pettie & Mic, 2022. "Structure-based design of stabilized recombinant influenza neuraminidase tetramers," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    12. Marc Corrales & Pol Cuscó & Dinara R Usmanova & Heng-Chang Chen & Natalya S Bogatyreva & Guillaume J Filion & Dmitry N Ivankov, 2015. "Machine Learning: How Much Does It Tell about Protein Folding Rates?," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-12, November.
    13. Alexander M Sevy & Tim M Jacobs & James E Crowe Jr. & Jens Meiler, 2015. "Design of Protein Multi-specificity Using an Independent Sequence Search Reduces the Barrier to Low Energy Sequences," PLOS Computational Biology, Public Library of Science, vol. 11(7), pages 1-23, July.
    14. Namrata Anand & Raphael Eguchi & Irimpan I. Mathews & Carla P. Perez & Alexander Derry & Russ B. Altman & Po-Ssu Huang, 2022. "Protein sequence design with a learned potential," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    15. Jonathan Yaacov Weinstein & Carlos Martí-Gómez & Rosalie Lipsh-Sokolik & Shlomo Yakir Hoch & Demian Liebermann & Reinat Nevo & Haim Weissman & Ekaterina Petrovich-Kopitman & David Margulies & Dmitry I, 2023. "Designed active-site library reveals thousands of functional GFP variants," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    16. Edward P. Harvey & Jung-Eun Shin & Meredith A. Skiba & Genevieve R. Nemeth & Joseph D. Hurley & Alon Wellner & Ada Y. Shaw & Victor G. Miranda & Joseph K. Min & Chang C. Liu & Debora S. Marks & Andrew, 2022. "An in silico method to assess antibody fragment polyreactivity," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    17. Buwei Huang & Brian Coventry & Marta T. Borowska & Dimitrios C. Arhontoulis & Marc Exposit & Mohamad Abedi & Kevin M. Jude & Samer F. Halabiya & Aza Allen & Cami Cordray & Inna Goreshnik & Maggie Ahlr, 2024. "De novo design of miniprotein antagonists of cytokine storm inducers," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    18. Jared Adolf-Bryfogle & Oleks Kalyuzhniy & Michael Kubitz & Brian D Weitzner & Xiaozhen Hu & Yumiko Adachi & William R Schief & Roland L Dunbrack Jr., 2018. "RosettaAntibodyDesign (RAbD): A general framework for computational antibody design," PLOS Computational Biology, Public Library of Science, vol. 14(4), pages 1-38, April.
    19. Vishruth Mullapudi & Jaime Vaquer-Alicea & Vaibhav Bommareddy & Anthony R. Vega & Bryan D. Ryder & Charles L. White & Marc. I. Diamond & Lukasz A. Joachimiak, 2023. "Network of hotspot interactions cluster tau amyloid folds," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    20. Zsolt Fazekas & Dóra K. Menyhárd & András Perczel, 2024. "LoCoHD: a metric for comparing local environments of proteins," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
