IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-38851-5.html

Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing

Author

Listed:
  • Weihe Zhong

    (Shenzhen Campus of Sun Yat-sen University)

  • Ziduo Yang

    (Shenzhen Campus of Sun Yat-sen University)

  • Calvin Yu-Chian Chen

    (Shenzhen Campus of Sun Yat-sen University
    China Medical University Hospital
    Asia University)

Abstract

Retrosynthesis planning, the process of identifying a set of available reactions to synthesize the target molecules, remains a major challenge in organic synthesis. Recently, computer-aided synthesis planning has gained renewed interest, and various retrosynthesis prediction algorithms based on deep learning have been proposed. However, most existing methods are limited in the applicability and interpretability of their predictions, and predictive accuracy still needs to improve to reach a more practical level. In this work, inspired by the arrow-pushing formalism in chemical reaction mechanisms, we present an end-to-end architecture for retrosynthesis prediction called Graph2Edits. Specifically, Graph2Edits uses a graph neural network to predict edits to the product graph in an auto-regressive manner, and sequentially generates transformation intermediates and final reactants according to the predicted edit sequence. This strategy combines the two-stage process of semi-template-based methods into one-pot learning, improving applicability to some complicated reactions and making the predictions more interpretable. Evaluated on the standard benchmark dataset USPTO-50k, our model achieves state-of-the-art performance for semi-template-based retrosynthesis, with a top-1 accuracy of 55.1%.
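
To make the edit-sequence idea in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. The names `MolGraph`, `Edit`, `apply_edit`, `generate`, and `fragments`, the four edit kinds, and the scripted policy standing in for the graph neural network are all assumptions made for illustration. The toy example disconnects the amide bond of N-methylacetamide (hydrogens implicit) and attaches a chlorine leaving group, which yields an acetyl chloride fragment and a methylamine fragment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Bond = FrozenSet[int]  # unordered pair of atom indices


@dataclass
class MolGraph:
    """Toy molecular graph (hypothetical type, hydrogens implicit)."""
    atoms: Dict[int, str]                                   # index -> element
    bonds: Dict[Bond, int] = field(default_factory=dict)    # pair -> bond order

    def copy(self) -> "MolGraph":
        return MolGraph(dict(self.atoms), dict(self.bonds))


@dataclass(frozen=True)
class Edit:
    """One graph edit; the edit kinds here are illustrative assumptions."""
    kind: str                       # delete_bond | change_bond | attach_group | terminate
    atoms: Tuple[int, ...] = ()
    value: Optional[object] = None  # new bond order or attached element


def apply_edit(graph: MolGraph, edit: Edit) -> MolGraph:
    """Return a new graph with a single edit applied."""
    g = graph.copy()
    if edit.kind == "delete_bond":
        del g.bonds[frozenset(edit.atoms)]
    elif edit.kind == "change_bond":
        g.bonds[frozenset(edit.atoms)] = int(edit.value)
    elif edit.kind == "attach_group":
        new_idx = max(g.atoms) + 1
        g.atoms[new_idx] = str(edit.value)
        g.bonds[frozenset((edit.atoms[0], new_idx))] = 1
    else:
        raise ValueError(f"unknown edit kind: {edit.kind}")
    return g


def generate(product: MolGraph,
             predict_edit: Callable[[MolGraph, int], Edit],
             max_steps: int = 10) -> List[MolGraph]:
    """Auto-regressive loop: predict an edit from the current graph, apply it,
    and repeat until the policy emits 'terminate' or max_steps is reached."""
    states = [product]
    for _ in range(max_steps):
        edit = predict_edit(states[-1], len(states) - 1)
        if edit.kind == "terminate":
            break
        states.append(apply_edit(states[-1], edit))
    return states  # product, intermediates, final reactant graph


def fragments(graph: MolGraph) -> List[List[int]]:
    """Connected components, i.e. the separate reactant molecules."""
    seen, parts = set(), []
    for start in graph.atoms:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            a = stack.pop()
            if a in comp:
                continue
            comp.add(a)
            for bond in graph.bonds:
                if a in bond:
                    stack.extend(bond - {a})
        seen |= comp
        parts.append(sorted(comp))
    return parts


# N-methylacetamide: C0-C1(=O2)-N3-C4
product = MolGraph(
    atoms={0: "C", 1: "C", 2: "O", 3: "N", 4: "C"},
    bonds={frozenset((0, 1)): 1, frozenset((1, 2)): 2,
           frozenset((1, 3)): 1, frozenset((3, 4)): 1},
)

# Scripted stand-in for the learned model (assumption, not a trained network).
SCRIPT = [Edit("delete_bond", (1, 3)), Edit("attach_group", (1,), "Cl"), Edit("terminate")]


def scripted_policy(graph: MolGraph, step: int) -> Edit:
    return SCRIPT[step]


states = generate(product, scripted_policy)
reactant_fragments = fragments(states[-1])
print(len(states) - 1, "edits applied;", len(reactant_fragments), "reactant fragments")
```

In a learned system the scripted policy would be replaced by a model that scores candidate edits on the current graph. Because every intermediate graph is kept in `states`, each prediction can be inspected step by step, which is the interpretability benefit the abstract describes.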

Suggested Citation

  • Weihe Zhong & Ziduo Yang & Calvin Yu-Chian Chen, 2023. "Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-38851-5
    DOI: 10.1038/s41467-023-38851-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-38851-5
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-023-38851-5?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Marwin H. S. Segler & Mike Preuss & Mark P. Waller, 2018. "Planning chemical syntheses with deep neural networks and symbolic AI," Nature, Nature, vol. 555(7698), pages 604-610, March.
    2. Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    3. Dávid Péter Kovács & William McCorkindale & Alpha A. Lee, 2021. "Quantitative interpretation explains machine learning models for chemical reaction prediction and uncovers bias," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    4. Igor V. Tetko & Pavel Karpov & Ruud Deursen & Guillaume Godin, 2020. "State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis," Nature Communications, Nature, vol. 11(1), pages 1-11, December.
    5. Barbara Mikulak-Klucznik & Patrycja Gołębiowska & Alison A. Bayly & Oskar Popik & Tomasz Klucznik & Sara Szymkuć & Ewa P. Gajewska & Piotr Dittwald & Olga Staszewska-Krajewska & Wiktor Beker & Tomasz , 2020. "Computational planning of the synthesis of complex natural products," Nature, Nature, vol. 588(7836), pages 83-88, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu Wang & Chao Pang & Yuzhe Wang & Junru Jin & Jingjie Zhang & Xiangxiang Zeng & Ran Su & Quan Zou & Leyi Wei, 2023. "Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    3. Wenhao Gao & Priyanka Raghavan & Connor W. Coley, 2022. "Autonomous platforms for data-driven organic synthesis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    4. Itai Levin & Mengjie Liu & Christopher A. Voigt & Connor W. Coley, 2022. "Merging enzymatic and synthetic chemistry with computational synthesis planning," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    5. Naudé, Wim, 2020. "Artificial Intelligence against COVID-19: An Early Review," IZA Discussion Papers 13110, Institute of Labor Economics (IZA).
    6. Shingo Harada & Hiroki Takenaka & Tsubasa Ito & Haruki Kanda & Tetsuhiro Nemoto, 2024. "Valence-isomer selective cycloaddition reaction of cycloheptatrienes-norcaradienes," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    7. Leng, Lijian & Li, Tanghao & Zhan, Hao & Rizwan, Muhammad & Zhang, Weijin & Peng, Haoyi & Yang, Zequn & Li, Hailong, 2023. "Machine learning-aided prediction of nitrogen heterocycles in bio-oil from the pyrolysis of biomass," Energy, Elsevier, vol. 278(PB).
    8. Yasuhiro Yoshikai & Tadahaya Mizuno & Shumpei Nemoto & Hiroyuki Kusuhara, 2024. "Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    9. M. Saqlain & S. Ali & J. Y. Lee, 2023. "A Monte-Carlo tree search algorithm for the flexible job-shop scheduling in manufacturing systems," Flexible Services and Manufacturing Journal, Springer, vol. 35(2), pages 548-571, June.
    10. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    11. Jinho Chang & Jong Chul Ye, 2024. "Bidirectional generation of structure and properties through a single molecular foundation model," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    12. Lei Fang & Junren Li & Ming Zhao & Li Tan & Jian-Guang Lou, 2023. "Single-step retrosynthesis prediction by leveraging commonly preserved substructures," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    13. Mochen Liao & Kai Lan & Yuan Yao, 2022. "Sustainability implications of artificial intelligence in the chemical industry: A conceptual framework," Journal of Industrial Ecology, Yale University, vol. 26(1), pages 164-182, February.
    14. Marcel Rolf Pfeifer, 2021. "Development of a Smart Manufacturing Execution System Architecture for SMEs: A Czech Case Study," Sustainability, MDPI, vol. 13(18), pages 1-23, September.
    15. Zhao, Jingyuan & Feng, Xuning & Wang, Junbin & Lian, Yubo & Ouyang, Minggao & Burke, Andrew F., 2023. "Battery fault diagnosis and failure prognosis for electric vehicles using spatio-temporal transformer networks," Applied Energy, Elsevier, vol. 352(C).
    16. Hang Xiao & Rong Li & Xiaoyang Shi & Yan Chen & Liangliang Zhu & Xi Chen & Lei Wang, 2023. "An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    17. Nathan J. Szymanski & Pragnay Nevatia & Christopher J. Bartel & Yan Zeng & Gerbrand Ceder, 2023. "Autonomous and dynamic precursor selection for solid-state materials synthesis," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    18. Yuanyuan Jiang & Zongwei Yang & Jiali Guo & Hongzhen Li & Yijing Liu & Yanzhi Guo & Menglong Li & Xuemei Pu, 2021. "Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    19. Shuangjia Zheng & Tao Zeng & Chengtao Li & Binghong Chen & Connor W. Coley & Yuedong Yang & Ruibo Wu, 2022. "Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    20. de Mars, Patrick & O’Sullivan, Aidan, 2021. "Applying reinforcement learning and tree search to the unit commitment problem," Applied Energy, Elsevier, vol. 302(C).

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.