
An automatic end-to-end chemical synthesis development platform powered by large language models

Author

Listed:
  • Yixiang Ruan

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Chenyin Lu

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Ning Xu

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Yuchen He

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Yixin Chen

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jian Zhang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jun Xuan

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center)

  • Jianzhang Pan

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Qun Fang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Hanyu Gao

    (The Hong Kong University of Science and Technology)

  • Xiaodong Shen

    (Suzhou Novartis Technical Development Co. Ltd.)

  • Ning Ye

    (Rezubio Pharmaceuticals Co. Ltd.)

  • Qiang Zhang

    (ZJU-Hangzhou Global Scientific and Technological Innovation Center
    Zhejiang University)

  • Yiming Mo

    (Zhejiang University
    ZJU-Hangzhou Global Scientific and Technological Innovation Center)

Abstract

The rapid emergence of large language model (LLM) technology presents promising opportunities to facilitate the development of synthetic reactions. In this work, we leveraged the power of GPT-4 to build an LLM-based reaction development framework (LLM-RDF) to handle fundamental tasks involved throughout chemical synthesis development. LLM-RDF comprises six specialized LLM-based agents (Literature Scouter, Experiment Designer, Hardware Executor, Spectrum Analyzer, Separation Instructor, and Result Interpreter), which are pre-prompted to accomplish their designated tasks. A web application with LLM-RDF as the backend was built to allow chemist users to interact with automated experimental platforms and analyze results via natural language, thus eliminating the need for coding skills and ensuring accessibility for all chemists. We demonstrated the capabilities of LLM-RDF in guiding the end-to-end synthesis development process for the copper/TEMPO-catalyzed aerobic oxidation of alcohols to aldehydes, including literature search and information extraction, substrate scope and condition screening, reaction kinetics study, reaction condition optimization, reaction scale-up, and product purification. Furthermore, LLM-RDF's broader applicability and versatility were validated on various synthesis tasks of three distinct reactions (SNAr reaction, photoredox C-C cross-coupling reaction, and heterogeneous photoelectrochemical reaction).
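
The agent layout described in the abstract can be illustrated with a minimal sketch. Only the six agent names are taken from the abstract; the prompt wording, the `Agent` class, `build_agents`, and the `echo_llm` stand-in are hypothetical and do not reflect the authors' implementation or any real LLM API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Agent names are from the abstract; the prompt texts are invented for illustration.
AGENT_PROMPTS: Dict[str, str] = {
    "Literature Scouter": "Search the literature and extract reaction information.",
    "Experiment Designer": "Design screening, kinetics, and optimization experiments.",
    "Hardware Executor": "Translate an experiment plan into instrument instructions.",
    "Spectrum Analyzer": "Analyze spectra from the reaction samples.",
    "Separation Instructor": "Suggest a product purification procedure.",
    "Result Interpreter": "Interpret experimental results for the chemist.",
}


@dataclass
class Agent:
    """One pre-prompted agent wrapping a generic text-in, text-out model."""

    name: str
    pre_prompt: str
    llm: Callable[[str], str]  # hypothetical backend, not a real library call

    def run(self, request: str) -> str:
        # The fixed pre-prompt is prepended to every natural-language request.
        return self.llm(f"{self.pre_prompt}\n\nUser request: {request}")


def build_agents(llm: Callable[[str], str]) -> Dict[str, Agent]:
    """Create one agent per role, all sharing the same model backend."""
    return {name: Agent(name, prompt, llm) for name, prompt in AGENT_PROMPTS.items()}


def echo_llm(prompt: str) -> str:
    """Stand-in for a model call: returns the first line of the prompt."""
    return prompt.splitlines()[0]


agents = build_agents(echo_llm)
reply = agents["Experiment Designer"].run("Screen bases for the alcohol oxidation.")
print(reply)
```

In this sketch a request is routed by agent name; replacing `echo_llm` with an actual model client is left as an assumption outside the scope of the abstract.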

Suggested Citation

  • Yixiang Ruan & Chenyin Lu & Ning Xu & Yuchen He & Yixin Chen & Jian Zhang & Jun Xuan & Jianzhang Pan & Qun Fang & Hanyu Gao & Xiaodong Shen & Ning Ye & Qiang Zhang & Yiming Mo, 2024. "An automatic end-to-end chemical synthesis development platform powered by large language models," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-54457-x
    DOI: 10.1038/s41467-024-54457-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-54457-x
    File Function: Abstract
    Download Restriction: no



