
Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments

Authors

Listed:
  • Umit V. Ucak (Kangwon National University)
  • Islambek Ashyrmamatov (Kangwon National University)
  • Junsu Ko (Arontier co.)
  • Juyong Lee (Kangwon National University; Arontier co.)

Abstract

Designing efficient synthetic routes for a target molecule remains a major challenge in organic synthesis. Atom environments are ideal, stand-alone, chemically meaningful building blocks that provide a high-resolution molecular representation. Our approach mimics chemical reasoning and predicts reactant candidates by learning the changes in atom environments associated with a chemical reaction. Through careful inspection of reactant candidates, we demonstrate that atom environments are promising descriptors for studying reaction route prediction and discovery. Here, we present a new single-step retrosynthesis prediction method, RetroTRAE, which is free from all SMILES-based translation issues. It yields a top-1 accuracy of 58.3% on the USPTO test dataset, and the top-1 accuracy reaches 61.6% when highly similar analogs are included, outperforming other state-of-the-art neural machine translation-based methods. Our methodology introduces a novel scheme for using fragmental and topological descriptors as natural inputs for retrosynthetic prediction tasks.
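
The two accuracy figures in the abstract differ only in how a correct prediction is counted: an exact match, or an exact match or a highly similar analog. The following is a minimal sketch of such a metric and not the authors' evaluation code. It assumes each molecule is represented as a set of atom-environment tokens; the token strings, the helper names `tanimoto` and `top_k_accuracy`, and the 0.6 similarity threshold are all hypothetical choices made for illustration.

```python
from typing import FrozenSet, Sequence

# Assumption: a molecule is modelled as a set of atom-environment tokens.
AtomEnvSet = FrozenSet[str]


def tanimoto(a: AtomEnvSet, b: AtomEnvSet) -> float:
    """Tanimoto (Jaccard) similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def top_k_accuracy(
    predictions: Sequence[Sequence[AtomEnvSet]],
    targets: Sequence[AtomEnvSet],
    k: int = 1,
    min_similarity: float = 1.0,
) -> float:
    """Fraction of targets matched by any of the first k ranked candidates.

    min_similarity=1.0 demands an exact set match; a lower value (an
    illustrative threshold, not one taken from the paper) also accepts
    close analogs.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = 0
    for candidates, target in zip(predictions, targets):
        if any(tanimoto(c, target) >= min_similarity for c in candidates[:k]):
            hits += 1
    return hits / len(targets)


# Toy data with made-up tokens: two targets, ranked candidates for each.
targets = [frozenset({"a", "b", "c"}), frozenset({"d", "e"})]
predictions = [
    [frozenset({"a", "b", "c"})],
    [frozenset({"d", "e", "f"}), frozenset({"d", "e"})],
]

exact_top1 = top_k_accuracy(predictions, targets, k=1)
analog_top1 = top_k_accuracy(predictions, targets, k=1, min_similarity=0.6)
exact_top2 = top_k_accuracy(predictions, targets, k=2)
```

In the toy data the first-ranked candidate for the second target is a close analog (Tanimoto 2/3) rather than an exact match, so relaxing the threshold raises the top-1 score, mirroring the gap between the two figures reported above.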

Suggested Citation

  • Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-28857-w
    DOI: 10.1038/s41467-022-28857-w

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-28857-w
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Dávid Péter Kovács & William McCorkindale & Alpha A. Lee, 2021. "Quantitative interpretation explains machine learning models for chemical reaction prediction and uncovers bias," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    2. Barbara Mikulak-Klucznik & Patrycja Gołębiowska & Alison A. Bayly & Oskar Popik & Tomasz Klucznik & Sara Szymkuć & Ewa P. Gajewska & Piotr Dittwald & Olga Staszewska-Krajewska & Wiktor Beker & Tomasz , 2020. "Computational planning of the synthesis of complex natural products," Nature, Nature, vol. 588(7836), pages 83-88, December.
    3. Giorgio Pesciullesi & Philippe Schwaller & Teodoro Laino & Jean-Louis Reymond, 2020. "Transfer learning enables the molecular transformer to predict regio- and stereoselective reactions on carbohydrates," Nature Communications, Nature, vol. 11(1), pages 1-8, December.
    4. Igor V. Tetko & Pavel Karpov & Ruud Deursen & Guillaume Godin, 2020. "State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis," Nature Communications, Nature, vol. 11(1), pages 1-11, December.
    5. Marwin H. S. Segler & Mike Preuss & Mark P. Waller, 2018. "Planning chemical syntheses with deep neural networks and symbolic AI," Nature, Nature, vol. 555(7698), pages 604-610, March.

    Citations


    Cited by:

    1. Yu Shee & Haote Li & Pengpeng Zhang & Andrea M. Nikolic & Wenxin Lu & H. Ray Kelly & Vidhyadhar Manee & Sanil Sreekumar & Frederic G. Buono & Jinhua J. Song & Timothy R. Newhouse & Victor S. Batista, 2024. "Site-specific template generative approach for retrosynthetic planning," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    2. Yu Wang & Chao Pang & Yuzhe Wang & Junru Jin & Jingjie Zhang & Xiangxiang Zeng & Ran Su & Quan Zou & Leyi Wei, 2023. "Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. Weihe Zhong & Ziduo Yang & Calvin Yu-Chian Chen, 2023. "Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing," Nature Communications, Nature, vol. 14(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu Wang & Chao Pang & Yuzhe Wang & Junru Jin & Jingjie Zhang & Xiangxiang Zeng & Ran Su & Quan Zou & Leyi Wei, 2023. "Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Weihe Zhong & Ziduo Yang & Calvin Yu-Chian Chen, 2023. "Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    3. Yuqiang Han & Xiaoyang Xu & Chang-Yu Hsieh & Keyan Ding & Hongxia Xu & Renjun Xu & Tingjun Hou & Qiang Zhang & Huajun Chen, 2024. "Retrosynthesis prediction with an iterative string editing model," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    4. Wenhao Gao & Priyanka Raghavan & Connor W. Coley, 2022. "Autonomous platforms for data-driven organic synthesis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    5. Itai Levin & Mengjie Liu & Christopher A. Voigt & Connor W. Coley, 2022. "Merging enzymatic and synthetic chemistry with computational synthesis planning," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    6. Yu Shee & Haote Li & Pengpeng Zhang & Andrea M. Nikolic & Wenxin Lu & H. Ray Kelly & Vidhyadhar Manee & Sanil Sreekumar & Frederic G. Buono & Jinhua J. Song & Timothy R. Newhouse & Victor S. Batista, 2024. "Site-specific template generative approach for retrosynthetic planning," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    7. Shuangjia Zheng & Tao Zeng & Chengtao Li & Binghong Chen & Connor W. Coley & Yuedong Yang & Ruibo Wu, 2022. "Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    8. Jia-Min Lu & Hui-Feng Wang & Qi-Hang Guo & Jian-Wei Wang & Tong-Tong Li & Ke-Xin Chen & Meng-Ting Zhang & Jian-Bo Chen & Qian-Nuan Shi & Yi Huang & Shao-Wen Shi & Guang-Yong Chen & Jian-Zhang Pan & Zh, 2024. "Roboticized AI-assisted microfluidic photocatalytic synthesis and screening up to 10,000 reactions per day," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    9. Naudé, Wim, 2020. "Artificial Intelligence against COVID-19: An Early Review," IZA Discussion Papers 13110, Institute of Labor Economics (IZA).
    10. Shu-Wen Li & Li-Cheng Xu & Cheng Zhang & Shuo-Qing Zhang & Xin Hong, 2023. "Reaction performance prediction with an extrapolative and interpretable graph model based on chemical knowledge," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    11. Lei Fang & Junren Li & Ming Zhao & Li Tan & Jian-Guang Lou, 2023. "Single-step retrosynthesis prediction by leveraging commonly preserved substructures," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    12. Mochen Liao & Kai Lan & Yuan Yao, 2022. "Sustainability implications of artificial intelligence in the chemical industry: A conceptual framework," Journal of Industrial Ecology, Yale University, vol. 26(1), pages 164-182, February.
    13. Shingo Harada & Hiroki Takenaka & Tsubasa Ito & Haruki Kanda & Tetsuhiro Nemoto, 2024. "Valence-isomer selective cycloaddition reaction of cycloheptatrienes-norcaradienes," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    14. Leng, Lijian & Li, Tanghao & Zhan, Hao & Rizwan, Muhammad & Zhang, Weijin & Peng, Haoyi & Yang, Zequn & Li, Hailong, 2023. "Machine learning-aided prediction of nitrogen heterocycles in bio-oil from the pyrolysis of biomass," Energy, Elsevier, vol. 278(PB).
    15. Daniel Probst & Matteo Manica & Yves Gaetan Nana Teukam & Alessandro Castrogiovanni & Federico Paratore & Teodoro Laino, 2022. "Biocatalysed synthesis planning using data-driven learning," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    16. Debesh Mishra & Biswajit Mohapatra & Abhaya Sanatan Satpathy & Kamalakanta Muduli & Binayak Mishra & Swagatika Mishra & Upma Paliwal, 2024. "The pandemic COVID-19 and associated challenges with implementation of artificial intelligence (AI) in Indian agriculture," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 15(6), pages 2715-2729, June.
    17. Min Li & Yang Zhou & Zexing Wen & Qian Ni & Ziqin Zhou & Yiling Liu & Qiang Zhou & Zongchao Jia & Bin Guo & Yuanhong Ma & Bo Chen & Zhi-Min Zhang & Jian-bo Wang, 2024. "An efficient C-glycoside production platform enabled by rationally tuning the chemoselectivity of glycosyltransferases," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    18. Marcel Rolf Pfeifer, 2021. "Development of a Smart Manufacturing Execution System Architecture for SMEs: A Czech Case Study," Sustainability, MDPI, vol. 13(18), pages 1-23, September.
    19. Yasuhiro Yoshikai & Tadahaya Mizuno & Shumpei Nemoto & Hiroyuki Kusuhara, 2024. "Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    20. M. Saqlain & S. Ali & J. Y. Lee, 2023. "A Monte-Carlo tree search algorithm for the flexible job-shop scheduling in manufacturing systems," Flexible Services and Manufacturing Journal, Springer, vol. 35(2), pages 548-571, June.
