
Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks

Author

Listed:
  • Yu Wang

    (Shandong University)

  • Chao Pang

    (Shandong University)

  • Yuzhe Wang

    (Shandong University)

  • Junru Jin

    (Shandong University)

  • Jingjie Zhang

    (Shandong University)

  • Xiangxiang Zeng

    (College of Computer Science and Electronic Engineering, Hunan University)

  • Ran Su

    (Tianjin University)

  • Quan Zou

    (University of Electronic Science and Technology of China)

  • Leyi Wei

    (Shandong University; College of Computer Science and Electronic Engineering, Hunan University)

Abstract

Automating retrosynthesis with artificial intelligence expedites organic chemistry research in digital laboratories. However, most existing deep-learning approaches are hard to explain, like a “black box” with few insights. Here, we propose RetroExplainer, formulizing the retrosynthesis task into a molecular assembly process, containing several retrosynthetic actions guided by deep learning. To guarantee a robust performance of our model, we propose three units: a multi-sense and multi-scale Graph Transformer, structure-aware contrastive learning, and dynamic adaptive multi-task learning. The results on 12 large-scale benchmark datasets demonstrate the effectiveness of RetroExplainer, which outperforms the state-of-the-art single-step retrosynthesis approaches. In addition, the molecular assembly process renders our model with good interpretability, allowing for transparent decision-making and quantitative attribution. When extended to multi-step retrosynthesis planning, RetroExplainer has identified 101 pathways, in which 86.9% of the single reactions correspond to those already reported in the literature. As a result, RetroExplainer is expected to offer valuable insights for reliable, high-throughput, and high-quality organic synthesis in drug development.

Suggested Citation

  • Yu Wang & Chao Pang & Yuzhe Wang & Junru Jin & Jingjie Zhang & Xiangxiang Zeng & Ran Su & Quan Zou & Leyi Wei, 2023. "Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-41698-5
    DOI: 10.1038/s41467-023-41698-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-41698-5
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Marwin H. S. Segler & Mike Preuss & Mark P. Waller, 2018. "Planning chemical syntheses with deep neural networks and symbolic AI," Nature, Nature, vol. 555(7698), pages 604-610, March.
    2. Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    3. Dávid Péter Kovács & William McCorkindale & Alpha A. Lee, 2021. "Quantitative interpretation explains machine learning models for chemical reaction prediction and uncovers bias," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    4. Barbara Mikulak-Klucznik & Patrycja Gołębiowska & Alison A. Bayly & Oskar Popik & Tomasz Klucznik & Sara Szymkuć & Ewa P. Gajewska & Piotr Dittwald & Olga Staszewska-Krajewska & Wiktor Beker & Tomasz , 2020. "Computational planning of the synthesis of complex natural products," Nature, Nature, vol. 588(7836), pages 83-88, December.
    5. Igor V. Tetko & Pavel Karpov & Ruud Deursen & Guillaume Godin, 2020. "State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis," Nature Communications, Nature, vol. 11(1), pages 1-11, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Weihe Zhong & Ziduo Yang & Calvin Yu-Chian Chen, 2023. "Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    2. Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    3. Wenhao Gao & Priyanka Raghavan & Connor W. Coley, 2022. "Autonomous platforms for data-driven organic synthesis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    4. Itai Levin & Mengjie Liu & Christopher A. Voigt & Connor W. Coley, 2022. "Merging enzymatic and synthetic chemistry with computational synthesis planning," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    5. Naudé, Wim, 2020. "Artificial Intelligence against COVID-19: An Early Review," IZA Discussion Papers 13110, Institute of Labor Economics (IZA).
    6. Lei Fang & Junren Li & Ming Zhao & Li Tan & Jian-Guang Lou, 2023. "Single-step retrosynthesis prediction by leveraging commonly preserved substructures," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    7. Mochen Liao & Kai Lan & Yuan Yao, 2022. "Sustainability implications of artificial intelligence in the chemical industry: A conceptual framework," Journal of Industrial Ecology, Yale University, vol. 26(1), pages 164-182, February.
    8. Shingo Harada & Hiroki Takenaka & Tsubasa Ito & Haruki Kanda & Tetsuhiro Nemoto, 2024. "Valence-isomer selective cycloaddition reaction of cycloheptatrienes-norcaradienes," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    9. Leng, Lijian & Li, Tanghao & Zhan, Hao & Rizwan, Muhammad & Zhang, Weijin & Peng, Haoyi & Yang, Zequn & Li, Hailong, 2023. "Machine learning-aided prediction of nitrogen heterocycles in bio-oil from the pyrolysis of biomass," Energy, Elsevier, vol. 278(PB).
    10. Marcel Rolf Pfeifer, 2021. "Development of a Smart Manufacturing Execution System Architecture for SMEs: A Czech Case Study," Sustainability, MDPI, vol. 13(18), pages 1-23, September.
    11. Yasuhiro Yoshikai & Tadahaya Mizuno & Shumpei Nemoto & Hiroyuki Kusuhara, 2024. "Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    12. M. Saqlain & S. Ali & J. Y. Lee, 2023. "A Monte-Carlo tree search algorithm for the flexible job-shop scheduling in manufacturing systems," Flexible Services and Manufacturing Journal, Springer, vol. 35(2), pages 548-571, June.
    13. Zhao, Jingyuan & Feng, Xuning & Wang, Junbin & Lian, Yubo & Ouyang, Minggao & Burke, Andrew F., 2023. "Battery fault diagnosis and failure prognosis for electric vehicles using spatio-temporal transformer networks," Applied Energy, Elsevier, vol. 352(C).
    14. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    15. Jinho Chang & Jong Chul Ye, 2024. "Bidirectional generation of structure and properties through a single molecular foundation model," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    16. Hang Xiao & Rong Li & Xiaoyang Shi & Yan Chen & Liangliang Zhu & Xi Chen & Lei Wang, 2023. "An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    17. Nathan J. Szymanski & Pragnay Nevatia & Christopher J. Bartel & Yan Zeng & Gerbrand Ceder, 2023. "Autonomous and dynamic precursor selection for solid-state materials synthesis," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    18. Yuanyuan Jiang & Zongwei Yang & Jiali Guo & Hongzhen Li & Yijing Liu & Yanzhi Guo & Menglong Li & Xuemei Pu, 2021. "Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    19. Shuangjia Zheng & Tao Zeng & Chengtao Li & Binghong Chen & Connor W. Coley & Yuedong Yang & Ruibo Wu, 2022. "Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    20. de Mars, Patrick & O’Sullivan, Aidan, 2021. "Applying reinforcement learning and tree search to the unit commitment problem," Applied Energy, Elsevier, vol. 302(C).

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-41698-5. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing. General contact details of provider: http://www.nature.com.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.