An automatic end-to-end chemical synthesis development platform powered by large language models

Author

Listed:
  • Yixiang Ruan

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Chenyin Lu

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Ning Xu

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Yuchen He

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Yixin Chen

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jian Zhang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jun Xuan

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jianzhang Pan

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Qun Fang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Hanyu Gao

    (The Hong Kong University of Science and Technology)

  • Xiaodong Shen

    (Suzhou Novartis Technical Development Co. Ltd.)

  • Ning Ye

    (Rezubio Pharmaceuticals Co. Ltd.)

  • Qiang Zhang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Yiming Mo

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

Abstract

The rapid emergence of large language model (LLM) technology presents promising opportunities to facilitate the development of synthetic reactions. In this work, we leveraged the power of GPT-4 to build an LLM-based reaction development framework (LLM-RDF) to handle the fundamental tasks involved in chemical synthesis development. LLM-RDF comprises six specialized LLM-based agents (Literature Scouter, Experiment Designer, Hardware Executor, Spectrum Analyzer, Separation Instructor, and Result Interpreter), which are pre-prompted to accomplish their designated tasks. A web application with LLM-RDF as the backend was built to allow chemist users to interact with automated experimental platforms and analyze results via natural language, thus eliminating the need for coding skills and ensuring accessibility for all chemists. We demonstrated the capabilities of LLM-RDF in guiding the end-to-end synthesis development process for the copper/TEMPO-catalyzed aerobic oxidation of alcohols to aldehydes, including literature search and information extraction, substrate scope and condition screening, reaction kinetics study, reaction condition optimization, reaction scale-up, and product purification. Furthermore, LLM-RDF's broader applicability and versatility were validated on various synthesis tasks of three distinct reactions (SNAr reaction, photoredox C-C cross-coupling reaction, and heterogeneous photoelectrochemical reaction).

Suggested Citation

  • Yixiang Ruan & Chenyin Lu & Ning Xu & Yuchen He & Yixin Chen & Jian Zhang & Jun Xuan & Jianzhang Pan & Qun Fang & Hanyu Gao & Xiaodong Shen & Ning Ye & Qiang Zhang & Yiming Mo, 2024. "An automatic end-to-end chemical synthesis development platform powered by large language models," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-54457-x
    DOI: 10.1038/s41467-024-54457-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-54457-x
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dosis, Anastasios & Muthoo, Abhinay, 2019. "Experimentation in Dynamic R&D Competition," CRETA Online Discussion Paper Series 52, Centre for Research in Economic Theory and its Applications CRETA.
    2. Yusuke Oh & Koji Takahashi, 2020. "R&D and Innovation: Evidence from Patent Data," Bank of Japan Working Paper Series 20-E-7, Bank of Japan.
    3. Gamba, Simona & Magazzini, Laura & Pertile, Paolo, 2021. "R&D and market size: Who benefits from orphan drug legislation?," Journal of Health Economics, Elsevier, vol. 80(C).
    4. Branstetter, Lee & Chatterjee, Chirantan & Higgins, Matthew J., 2022. "Generic competition and the incentives for early-stage pharmaceutical innovation," Research Policy, Elsevier, vol. 51(10).
    5. Unsal, Omer & Houston, Reza, 2024. "R&D grants and medical innovation," Journal of Economics and Business, Elsevier, vol. 128(C).
    6. Abe C. Dunn & Lasanthi Fernando & Eli Liebman, 2024. "How Much Are Medical Innovations Worth? A Detailed Analysis Using Cost-Effectiveness Studies," BEA Papers 0132, Bureau of Economic Analysis.
    7. Alfred B. Ordman, 2022. "When Will the FDA Do What Is in People’s Best Interests?," American Journal of Economics and Sociology, Wiley Blackwell, vol. 81(4), pages 721-751, September.
    8. Edouard Debonneuil & Anne Eyraud-Loisel & Frédéric Planchet, 2018. "Can Pension Funds Partially Manage Longevity Risk by Investing in a Longevity Megafund?," Risks, MDPI, vol. 6(3), pages 1-27, July.
    9. Billette de Villemeur, Etienne & Versaevel, Bruno, 2019. "One lab, two firms, many possibilities: On R&D outsourcing in the biopharmaceutical industry," Journal of Health Economics, Elsevier, vol. 65(C), pages 260-283.
    10. Fabian Gaessler & Stefan Wagner, 2022. "Patents, Data Exclusivity, and the Development of New Drugs," The Review of Economics and Statistics, MIT Press, vol. 104(3), pages 571-586, May.
    11. Gregor Dorfleitner & Felix Rößle, 2018. "The financial performance of the health care industry: a global, regional and industry specific empirical investigation," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 19(4), pages 585-594, May.
    12. Farasat A. S. Bokhari & Franco Mariuzzo & Anna Rita Bennato, 2021. "Innovation and growth in the UK pharmaceuticals: the case of product and marketing introductions," Small Business Economics, Springer, vol. 57(1), pages 603-634, June.
    13. Stig Johan Wiklund, 2019. "A modelling framework for improved design and decision-making in drug development," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-22, August.
    14. Yin, Nina, 2023. "Pharmaceuticals, incremental innovation and market exclusivity," International Journal of Industrial Organization, Elsevier, vol. 87(C).
    15. Rathi, Sawan & Majumdar, Adrija & Chatterjee, Chirantan, 2024. "Did the COVID-19 pandemic propel usage of AI in pharmaceutical innovation? New evidence from patenting data," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
    16. Heyoung Yang & Hyuck Jai Lee, 2018. "Long-Term Collaboration Network Based on ClinicalTrials.gov Database in the Pharmaceutical Industry," Sustainability, MDPI, vol. 10(2), pages 1-14, January.
    17. Ralph Siebert & Zhili Tian, 2020. "Dynamic Mergers Effects on R&D Investments and Drug Development across Research Phases in the Pharmaceutical Industry," CESifo Working Paper Series 8303, CESifo.
    18. Stacy Sneeringer & Matt Clancy, 2020. "Incentivizing New Veterinary Pharmaceutical Products to Combat Antibiotic Resistance," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 42(4), pages 653-673, December.
    19. Gemma Turon & Jason Hlozek & John G. Woodland & Ankur Kumar & Kelly Chibale & Miquel Duran-Frigola, 2023. "First fully-automated AI/ML virtual screening cascade implemented at a drug discovery centre in Africa," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    20. Adrian Towse & Jimena Ferraro & Jorge Mestre-Ferrandiz, 2017. "Incentives for New Drugs to Tackle Anti-Microbial Resistance," Briefing 001842, Office of Health Economics.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.