IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-024-55374-9.html

A data-driven group retrosynthesis planning model inspired by neurosymbolic programming

Author

Listed:
  • Xuefeng Zhang

    (Peking University)

  • Haowei Lin

    (Peking University)

  • Muhan Zhang

    (Peking University)

  • Yuan Zhou

    (Tsinghua University
    Beijing Institute of Mathematical Sciences and Applications
    Tsinghua University)

  • Jianzhu Ma

    (Tsinghua University
    Tsinghua University)

Abstract

Deep generative models have garnered significant attention for their efficiency in drug discovery, yet the synthesis of proposed molecules remains a challenge. Retrosynthetic planning, a part of computer-assisted synthesis planning, addresses this challenge by recursively decomposing molecules using symbolic rules and machine-trained scoring functions. However, current methods often treat each molecule independently, missing the opportunity to reuse shared synthesis patterns and repeated pathways that may carry over from known synthesis routes to newly emerging, similar molecules, a notable challenge with AI-generated small molecules. Our investigation reveals reusable synthesis patterns that augment the reaction template library, resulting in progressively decreasing marginal inference time as the algorithm processes more molecules. Nevertheless, expanding the library enlarges the search space, necessitating investigation into methods for effective prediction of reactions in retrosynthesis search. Inspired by human learning, our algorithm, akin to neurosymbolic programming, builds upon commonly used multi-step concepts such as cascade and complementary reactions and can evolve from practical experience, enhancing the prediction model for fundamental and compositional reaction templates. The evolutionary process involves wake, abstraction, and dreaming phases, alternately extending the reaction template library and refining models for more efficient retrosynthesis. Our algorithm outperforms existing methods, discovers chemistry patterns, and significantly reduces inference time in retrosynthetic planning for a group of similar molecules, showcasing its potential in validating results from generative models.
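
The wake, abstraction, and dreaming cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: molecules are plain strings, a "reaction template" is a hypothetical suffix-stripping rule, the scoring model is a usage counter, and every name here (`strip_suffix`, `compose`, `plan`, `evolve`) is an assumption made for illustration. It only shows how adding a composite template learned from earlier routes may reduce the number of template evaluations needed for a similar new molecule.

```python
from collections import Counter
from typing import Callable, Optional

# A template maps a product to its precursors, or None if it does not apply.
Template = Callable[[str], Optional[list]]

def strip_suffix(suffix: str) -> Template:
    """Toy 'reaction': split off a trailing fragment (hypothetical rule)."""
    def apply(mol: str) -> Optional[list]:
        if mol.endswith(suffix) and len(mol) > len(suffix):
            return [mol[: -len(suffix)], suffix]
        return None
    return apply

def compose(first: Template, second: Template) -> Template:
    """Composite template: apply `first`, then `second` to its first precursor."""
    def apply(mol: str) -> Optional[list]:
        step1 = first(mol)
        if step1 is None:
            return None
        step2 = second(step1[0])
        if step2 is None:
            return None
        return step2 + step1[1:]
    return apply

def plan(mol, templates, stock, scores, depth=6):
    """Depth-first retrosynthesis. Returns (route, template evaluations)."""
    if mol in stock:
        return [], 0
    if depth == 0:
        return None, 0
    calls = 0
    # Try templates in order of decreasing score (ties keep insertion order).
    for name in sorted(templates, key=lambda n: -scores[n]):
        calls += 1
        precursors = templates[name](mol)
        if precursors is None:
            continue
        route = [name]
        for p in precursors:
            sub, c = plan(p, templates, stock, scores, depth - 1)
            calls += c
            if sub is None:
                break
            route += sub
        else:
            return route, calls
    return None, calls

def evolve(molecules, templates, stock, rounds=2):
    scores = Counter()
    for _ in range(rounds):
        # Wake: plan a route for every molecule with the current library.
        routes = [plan(m, templates, stock, scores)[0] for m in molecules]
        routes = [r for r in routes if r]
        # Abstraction: promote the most frequent adjacent template pair
        # to a composite template (assumes linear routes, as in this toy).
        pairs = Counter(p for r in routes for p in zip(r, r[1:]))
        new_name = None
        if pairs:
            (a, b), _ = pairs.most_common(1)[0]
            new_name = f"{a}+{b}"
            templates.setdefault(new_name, compose(templates[a], templates[b]))
        # Dreaming (toy stand-in): refit template scores from usage counts,
        # giving the new composite an optimistic score so it is tried first.
        scores = Counter(n for r in routes for n in r)
        if new_name:
            scores[new_name] = max(scores.values()) + 1
    return templates, scores

base = {"cut_b": strip_suffix("b"), "cut_c": strip_suffix("c")}
stock = {"A", "X", "Y", "b", "c"}
route0, calls0 = plan("Ybc", base, stock, Counter())
lib, scores = evolve(["Abc", "Xbc"], dict(base), stock)
route1, calls1 = plan("Ybc", lib, stock, scores)
print(route0, calls0)  # ['cut_c', 'cut_b'] 3
print(route1, calls1)  # ['cut_c+cut_b'] 1
```

In this toy setting the unseen molecule `Ybc` is solved with one template evaluation after the library has evolved on `Abc` and `Xbc`, versus three before, which mirrors in miniature the decreasing marginal inference time the abstract reports for groups of similar molecules.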

Suggested Citation

  • Xuefeng Zhang & Haowei Lin & Muhan Zhang & Yuan Zhou & Jianzhu Ma, 2025. "A data-driven group retrosynthesis planning model inspired by neurosymbolic programming," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-024-55374-9
    DOI: 10.1038/s41467-024-55374-9

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-55374-9
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuqiang Han & Xiaoyang Xu & Chang-Yu Hsieh & Keyan Ding & Hongxia Xu & Renjun Xu & Tingjun Hou & Qiang Zhang & Huajun Chen, 2024. "Retrosynthesis prediction with an iterative string editing model," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Jia-Min Lu & Hui-Feng Wang & Qi-Hang Guo & Jian-Wei Wang & Tong-Tong Li & Ke-Xin Chen & Meng-Ting Zhang & Jian-Bo Chen & Qian-Nuan Shi & Yi Huang & Shao-Wen Shi & Guang-Yong Chen & Jian-Zhang Pan & Zh, 2024. "Roboticized AI-assisted microfluidic photocatalytic synthesis and screening up to 10,000 reactions per day," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    3. Naudé, Wim, 2020. "Artificial Intelligence against COVID-19: An Early Review," IZA Discussion Papers 13110, Institute of Labor Economics (IZA).
    4. Mingyang Wang & Shuai Li & Jike Wang & Odin Zhang & Hongyan Du & Dejun Jiang & Zhenxing Wu & Yafeng Deng & Yu Kang & Peichen Pan & Dan Li & Xiaorui Wang & Xiaojun Yao & Tingjun Hou & Chang-Yu Hsieh, 2024. "ClickGen: Directed exploration of synthesizable chemical space via modular reactions and reinforcement learning," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    5. Mochen Liao & Kai Lan & Yuan Yao, 2022. "Sustainability implications of artificial intelligence in the chemical industry: A conceptual framework," Journal of Industrial Ecology, Yale University, vol. 26(1), pages 164-182, February.
    6. Shingo Harada & Hiroki Takenaka & Tsubasa Ito & Haruki Kanda & Tetsuhiro Nemoto, 2024. "Valence-isomer selective cycloaddition reaction of cycloheptatrienes-norcaradienes," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    7. Wenhao Gao & Priyanka Raghavan & Connor W. Coley, 2022. "Autonomous platforms for data-driven organic synthesis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    8. Debesh Mishra & Biswajit Mohapatra & Abhaya Sanatan Satpathy & Kamalakanta Muduli & Binayak Mishra & Swagatika Mishra & Upma Paliwal, 2024. "The pandemic COVID-19 and associated challenges with implementation of artificial intelligence (AI) in Indian agriculture," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 15(6), pages 2715-2729, June.
    9. Itai Levin & Mengjie Liu & Christopher A. Voigt & Connor W. Coley, 2022. "Merging enzymatic and synthetic chemistry with computational synthesis planning," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    10. Peng-Cheng Zhao & Xue-Xin Wei & Qiong Wang & Qi-Hao Wang & Jia-Ning Li & Jie Shang & Cheng Lu & Jian-Yu Shi, 2025. "Single-step retrosynthesis prediction via multitask graph representation learning," Nature Communications, Nature, vol. 16(1), pages 1-19, December.
    11. Marcel Rolf Pfeifer, 2021. "Development of a Smart Manufacturing Execution System Architecture for SMEs: A Czech Case Study," Sustainability, MDPI, vol. 13(18), pages 1-23, September.
    12. Daniel Cruz & Sonia Żółtowska & Oleksandr Savateev & Markus Antonietti & Paolo Giusto, 2025. "Carbon nitride caught in the act of artificial photosynthesis," Nature Communications, Nature, vol. 16(1), pages 1-7, December.
    13. M. Saqlain & S. Ali & J. Y. Lee, 2023. "A Monte-Carlo tree search algorithm for the flexible job-shop scheduling in manufacturing systems," Flexible Services and Manufacturing Journal, Springer, vol. 35(2), pages 548-571, June.
    14. Zhao, Jingyuan & Feng, Xuning & Wang, Junbin & Lian, Yubo & Ouyang, Minggao & Burke, Andrew F., 2023. "Battery fault diagnosis and failure prognosis for electric vehicles using spatio-temporal transformer networks," Applied Energy, Elsevier, vol. 352(C).
    15. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    16. Hang Xiao & Rong Li & Xiaoyang Shi & Yan Chen & Liangliang Zhu & Xi Chen & Lei Wang, 2023. "An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    17. Nathan J. Szymanski & Pragnay Nevatia & Christopher J. Bartel & Yan Zeng & Gerbrand Ceder, 2023. "Autonomous and dynamic precursor selection for solid-state materials synthesis," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    18. Umit V. Ucak & Islambek Ashyrmamatov & Junsu Ko & Juyong Lee, 2022. "Retrosynthetic reaction pathway prediction through neural machine translation of atomic environments," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    19. Yuanyuan Jiang & Zongwei Yang & Jiali Guo & Hongzhen Li & Yijing Liu & Yanzhi Guo & Menglong Li & Xuemei Pu, 2021. "Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    20. Yu Shee & Haote Li & Pengpeng Zhang & Andrea M. Nikolic & Wenxin Lu & H. Ray Kelly & Vidhyadhar Manee & Sanil Sreekumar & Frederic G. Buono & Jinhua J. Song & Timothy R. Newhouse & Victor S. Batista, 2024. "Site-specific template generative approach for retrosynthetic planning," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
