Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-56751-8.html

Integrating protein language models and automatic biofoundry for enhanced protein evolution

Author

Listed:
  • Qiang Zhang

    (Zhejiang University)

  • Wanyi Chen

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Ming Qin

    (Zhejiang University)

  • Yuhao Wang

    (Zhejiang University)

  • Zhongji Pu

    (Xianghu Laboratory)

  • Keyan Ding

    (ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Yuyue Liu

    (ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Qunfeng Zhang

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Dongfang Li

    (ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Xinjia Li

    (Xianghu Laboratory)

  • Yu Zhao

    (Shenzhen)

  • Jianhua Yao

    (Shenzhen)

  • Lei Huang

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Jianping Wu

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Lirong Yang

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Huajun Chen

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

  • Haoran Yu

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Centre)

Abstract

Traditional protein engineering methods such as directed evolution are effective but often slow and labor-intensive. Advances in machine learning and automated biofoundries present new opportunities for optimizing these processes. This study devises a protein language model-enabled automatic evolution platform, a closed-loop system for automated protein engineering within the Design-Build-Test-Learn cycle. To initiate the cycle, the protein language model ESM-2 makes zero-shot predictions of 96 variants. The biofoundry constructs and evaluates these variants and feeds the results back to a multi-layer perceptron to train a fitness predictor, which then predicts a second round of 96 variants with improved fitness. With a tRNA synthetase as the model enzyme, four rounds of evolution carried out within 10 days yield mutants with enzyme activity improved by up to 2.4-fold. Our system significantly enhances the speed and accuracy of protein evolution, driving faster advancements in protein engineering for industrial applications.
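
The closed-loop cycle described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation, and every component is an assumption made for illustration: `zero_shot_score` is a random placeholder standing in for a protein language model such as ESM-2, `make_toy_assay` is a hypothetical additive fitness landscape standing in for the biofoundry's Build and Test steps, and `fit_additive_model` is a simple per-mutation average swapped in for the multi-layer perceptron. Only the batch size of 96 and the four rounds come from the abstract.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BATCH_SIZE = 96  # variants per round, as stated in the abstract
ROUNDS = 4       # rounds of evolution, as stated in the abstract


def make_toy_assay(wild_type, seed=0):
    """Hypothetical Build/Test stand-in: a hidden additive fitness landscape."""
    rng = random.Random(seed)
    effects = {(i, aa): rng.gauss(0.0, 1.0)
               for i in range(len(wild_type)) for aa in AMINO_ACIDS}

    def assay(variant):
        # Fitness is the sum of hidden effects of all mutations vs. wild type.
        return sum(effects[(i, aa)]
                   for i, aa in enumerate(variant) if aa != wild_type[i])
    return assay


def single_mutants(parent):
    """Design step: enumerate every single-point mutant of `parent`."""
    return [parent[:i] + aa + parent[i + 1:]
            for i in range(len(parent)) for aa in AMINO_ACIDS if aa != parent[i]]


def zero_shot_score(variant, rng):
    """Placeholder for a language-model score; random here (assumption)."""
    return rng.random()


def fit_additive_model(measured, wild_type):
    """Learn step stand-in: mean measured fitness per observed mutation."""
    sums, counts = {}, {}
    for variant, fitness in measured.items():
        for i, aa in enumerate(variant):
            if aa != wild_type[i]:
                sums[(i, aa)] = sums.get((i, aa), 0.0) + fitness
                counts[(i, aa)] = counts.get((i, aa), 0) + 1
    return {mutation: sums[mutation] / counts[mutation] for mutation in sums}


def predict(model, variant, wild_type, default=0.0):
    """Predicted fitness: sum of learned effects; unseen mutations get `default`."""
    return sum(model.get((i, aa), default)
               for i, aa in enumerate(variant) if aa != wild_type[i])


def run_dbtl(wild_type, assay, rounds=ROUNDS, batch=BATCH_SIZE, seed=0):
    """Run the toy loop; return all measurements and the best fitness per round."""
    rng = random.Random(seed)
    measured = {wild_type: assay(wild_type)}  # baseline measurement
    best_per_round = []
    for round_index in range(rounds):
        parent = max(measured, key=measured.get)  # best variant so far
        candidates = [v for v in single_mutants(parent) if v not in measured]
        if round_index == 0:
            # No training data yet, so rank by the zero-shot placeholder.
            scores = {v: zero_shot_score(v, rng) for v in candidates}
        else:
            model = fit_additive_model(measured, wild_type)
            scores = {v: predict(model, v, wild_type) for v in candidates}
        chosen = sorted(candidates, key=scores.get, reverse=True)[:batch]
        for variant in chosen:
            measured[variant] = assay(variant)  # "Build" and "Test"
        best_per_round.append(max(measured.values()))
    return measured, best_per_round


if __name__ == "__main__":
    wt = "ACDEFGHIKLMNPQRSTVWYACDEF"  # arbitrary 25-residue toy sequence
    _, best = run_dbtl(wt, make_toy_assay(wt, seed=1))
    print("best fitness after each round:", best)
```

Because each round only adds measurements, the best observed fitness cannot decrease from one round to the next. How quickly it improves depends entirely on the toy landscape and the stand-in predictor, so the output says nothing about the performance reported in the paper.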

Suggested Citation

  • Qiang Zhang & Wanyi Chen & Ming Qin & Yuhao Wang & Zhongji Pu & Keyan Ding & Yuyue Liu & Qunfeng Zhang & Dongfang Li & Xinjia Li & Yu Zhao & Jianhua Yao & Lei Huang & Jianping Wu & Lirong Yang & Huaju, 2025. "Integrating protein language models and automatic biofoundry for enhanced protein evolution," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-56751-8
    DOI: 10.1038/s41467-025-56751-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-56751-8
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Antje Krüger & Andrew M. Watkins & Roger Wellington-Oguri & Jonathan Romano & Camila Kofman & Alysse DeFoe & Yejun Kim & Jeff Anderson-Lee & Eli Fisker & Jill Townley & Anne E. d’Aquino & Rhiju Das & , 2023. "Community science designed ribosomes with beneficial phenotypes," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    2. Hongxia Zhao & Wenlong Ding & Jia Zang & Yang Yang & Chao Liu & Linzhen Hu & Yulin Chen & Guanglong Liu & Yu Fang & Ying Yuan & Shixian Lin, 2021. "Directed-evolution of translation system for efficient unnatural amino acids incorporation and generalizable synthetic auxotroph construction," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    3. Agnese I. Curatolo & Ofer Kimchi & Carl P. Goodrich & Ryan K. Krueger & Michael P. Brenner, 2023. "A computational toolbox for the assembly yield of complex and heterogeneous structures," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    4. Biao Ruan & Yanan He & Yingwei Chen & Eun Jung Choi & Yihong Chen & Dana Motabar & Tsega Solomon & Richard Simmerman & Thomas Kauffman & D. Travis Gallagher & John Orban & Philip N. Bryan, 2023. "Design and characterization of a protein fold switching network," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    5. Haoran Huang & Tao Yan & Chang Liu & Yuxiang Lu & Zhigang Wu & Xingchu Wang & Jie Wang, 2024. "Genetically encoded Nδ-vinyl histidine for the evolution of enzyme catalytic center," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    6. Noelia Ferruz & Steffen Schmidt & Birte Höcker, 2022. "ProtGPT2 is a deep unsupervised language model for protein design," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    7. Anton Kocheturov & Panos M. Pardalos & Athanasia Karakitsiou, 2019. "Massive datasets and machine learning for computational biomedicine: trends and challenges," Annals of Operations Research, Springer, vol. 276(1), pages 5-34, May.
    8. Joongoo Lee & Jaime N. Coronado & Namjin Cho & Jongdoo Lim & Brandon M. Hosford & Sangwon Seo & Do Soon Kim & Camila Kofman & Jeffrey S. Moore & Andrew D. Ellington & Eric V. Anslyn & Michael C. Jewet, 2022. "Ribosome-mediated biosynthesis of pyridazinone oligomers in vitro," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    9. Baran Hashemi & Nikolai Hartmann & Sahand Sharifzadeh & James Kahn & Thomas Kuhr, 2024. "Ultra-high-granularity detector simulation with intra-event aware generative adversarial network and self-supervised relational reasoning," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    10. Evgenii Lobzaev & Michael A. Herrera & Martyna Kasprzyk & Giovanni Stracquadanio, 2024. "Protein engineering using variational free energy approximation," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    11. Fatima A. Davila-Hernandez & Biao Jin & Harley Pyles & Shuai Zhang & Zheming Wang & Timothy F. Huddy & Asim K. Bera & Alex Kang & Chun-Long Chen & James J. Yoreo & David Baker, 2023. "Directing polymorph specific calcium carbonate formation with de novo protein templates," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    12. Smrithi Krishnan R & Kalyanashis Jana & Amina H. Shaji & Karthika S. Nair & Anjali Devi Das & Devika Vikraman & Harsha Bajaj & Ulrich Kleinekathöfer & Kozhinjampara R. Mahendran, 2022. "Assembly of transmembrane pores from mirror-image peptides," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    13. Evgenii Lobzaev & Giovanni Stracquadanio, 2024. "Dirichlet latent modelling enables effective learning and sampling of the functional protein design space," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    14. Sasha B. Ebrahimi & Devleena Samanta, 2023. "Engineering protein-based therapeutics through structural and chemical design," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    15. Amir Pandi & David Adam & Amir Zare & Van Tuan Trinh & Stefan L. Schaefer & Marie Burt & Björn Klabunde & Elizaveta Bobkova & Manish Kushwaha & Yeganeh Foroughijabbari & Peter Braun & Christoph Spahn , 2023. "Cell-free biosynthesis combined with deep learning accelerates de novo-development of antimicrobial peptides," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    16. Thomas W. Linsky & Kyle Noble & Autumn R. Tobin & Rachel Crow & Lauren Carter & Jeffrey L. Urbauer & David Baker & Eva-Maria Strauch, 2022. "Sampling of structure and sequence space of small protein folds," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    17. Huan Sun & Haiyang Jia & Olivia Kendall & Jovan Dragelj & Vladimir Kubyshkin & Tobias Baumann & Maria-Andrea Mroginski & Petra Schwille & Nediljko Budisa, 2022. "Halogenation of tyrosine perturbs large-scale protein self-organization," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    18. Xiaoting Shan & Ying Cai & Binyu Zhu & Lingli Zhou & Xujie Sun & Xiaoxuan Xu & Qi Yin & Dangge Wang & Yaping Li, 2024. "Rational strategies for improving the efficiency of design and discovery of nanomedicines," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
    19. Jordan Yang & Nandita Naik & Jagdish Suresh Patel & Christopher S Wylie & Wenze Gu & Jessie Huang & F Marty Ytreberg & Mandar T Naik & Daniel M Weinreich & Brenda M Rubenstein, 2020. "Predicting the viability of beta-lactamase: How folding and binding free energies correlate with beta-lactamase fitness," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-26, May.
    20. Siwei Li & Jingjing An & Yaqiu Li & Xiagu Zhu & Dongdong Zhao & Lixian Wang & Yonghui Sun & Yuanzhao Yang & Changhao Bi & Xueli Zhang & Meng Wang, 2022. "Automated high-throughput genome editing platform with an AI learning in situ prediction model," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
