
Leveraging molecular structure and bioactivity with chemical language models for de novo drug design

Authors
  • Michael Moret

    (ETH Zurich, Department of Chemistry and Applied Biosciences)

  • Irene Pachon Angona

    (ETH Zurich, Department of Chemistry and Applied Biosciences)

  • Leandro Cotos

    (ETH Zurich, Department of Chemistry and Applied Biosciences)

  • Shen Yan

    (University of Zurich, University Children’s Hospital, Children’s Research Center, Pediatric Molecular Neuro-Oncology Research)

  • Kenneth Atz

    (ETH Zurich, Department of Chemistry and Applied Biosciences)

  • Cyrill Brunner

    (ETH Zurich, Department of Chemistry and Applied Biosciences)

  • Martin Baumgartner

    (University of Zurich, University Children’s Hospital, Children’s Research Center, Pediatric Molecular Neuro-Oncology Research)

  • Francesca Grisoni

    (ETH Zurich, Department of Chemistry and Applied Biosciences
    Eindhoven University of Technology, Institute for Complex Molecular Systems and Eindhoven Artificial Intelligence Systems Institute, Department of Biomedical Engineering
    Center for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht)

  • Gisbert Schneider

    (ETH Zurich, Department of Chemistry and Applied Biosciences
    ETH Singapore SEC Ltd, 1 CREATE Way, #06-01 CREATE Tower)

Abstract

Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method’s scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model’s ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.
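
The generate-then-filter workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `generate_candidates` and `predict_activity` are hypothetical stand-ins for the generative CLM and the CLM-based bioactivity classifier, the molecule strings (written as SMILES, assumed here for illustration) and scores are made-up placeholders, and the 0.5 threshold is an arbitrary assumption.

```python
from typing import Callable, Iterable, List, Tuple


def generate_candidates() -> List[str]:
    """Hypothetical stand-in for sampling molecules from a generative CLM."""
    # Placeholder SMILES strings, not designs from the paper.
    return ["c1ccccc1", "CCO", "CC(=O)O", "c1ccncc1", "CCO"]


def predict_activity(smiles: str) -> float:
    """Hypothetical stand-in for a bioactivity classifier (score in [0, 1])."""
    # Toy rule for illustration only: strings with an aromatic ring score higher.
    return 0.9 if "c1" in smiles else 0.2


def refine_library(
    candidates: Iterable[str],
    scorer: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> List[Tuple[str, float]]:
    """Deduplicate candidates, score them, keep those at or above the
    threshold, and rank the survivors by descending score."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    scored = [(smiles, scorer(smiles)) for smiles in unique]
    kept = [(smiles, score) for smiles, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


ranked = refine_library(generate_candidates(), predict_activity)
for smiles, score in ranked:
    print(f"{smiles}\t{score:.2f}")
```

In a real setting both stand-ins would be trained models; the sketch only shows how a generated virtual library might be narrowed to top-ranked candidates before any experimental testing.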

Suggested Citation

  • Michael Moret & Irene Pachon Angona & Leandro Cotos & Shen Yan & Kenneth Atz & Cyrill Brunner & Martin Baumgartner & Francesca Grisoni & Gisbert Schneider, 2023. "Leveraging molecular structure and bioactivity with chemical language models for de novo drug design," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-022-35692-6
    DOI: 10.1038/s41467-022-35692-6

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-35692-6
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Megan M. Kaneda & Karen S. Messer & Natacha Ralainirina & Hongying Li & Christopher J. Leem & Sara Gorjestani & Gyunghwi Woo & Abraham V. Nguyen & Camila C. Figueiredo & Philippe Foubert & Michael C. , 2016. "PI3Kγ is a molecular switch that controls immune suppression," Nature, Nature, vol. 539(7629), pages 437-442, November.

    Citations

    Cited by:

    1. Laura Isigkeit & Tim Hörmann & Espen Schallmayer & Katharina Scholz & Felix F. Lillich & Johanna H. M. Ehrler & Benedikt Hufnagel & Jasmin Büchner & Julian A. Marschner & Jörg Pabel & Ewgenij Proschak, 2024. "Automated design of multi-target ligands by generative deep learning," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    2. Kenneth Atz & Leandro Cotos & Clemens Isert & Maria Håkansson & Dorota Focht & Mattis Hilleke & David F. Nippa & Michael Iff & Jann Ledergerber & Carl C. G. Schiebroek & Valentina Romeo & Jan A. Hiss , 2024. "Prospective de novo drug design with deep interactome learning," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    3. Alessio Fallani & Leonardo Medrano Sandonas & Alexandre Tkatchenko, 2024. "Inverse mapping of quantum properties to structures for chemical space of small organic molecules," Nature Communications, Nature, vol. 15(1), pages 1-14, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael C. Schmid & Sang Won Kang & Hui Chen & Marc Paradise & Anghesom Ghebremedhin & Megan M. Kaneda & Shao-Ming Chin & Anh Do & D. Martin Watterson & Judith A. Varner, 2022. "PI3Kγ stimulates a high molecular weight form of myosin light chain kinase to promote myeloid cell adhesion and tumor inflammation," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    2. Erik Nutma & Nurun Fancy & Maria Weinert & Stergios Tsartsalis & Manuel C. Marzin & Robert C. J. Muirhead & Irene Falk & Marjolein Breur & Joy Bruin & David Hollaus & Robin Pieterman & Jasper Anink & , 2023. "Translocator protein is a marker of activated microglia in rodent models but not human neurodegenerative diseases," Nature Communications, Nature, vol. 14(1), pages 1-25, December.
    3. Mingming Zhao & Xiaohui Cheng & Pingwen Shao & Yao Dong & Yongjie Wu & Lin Xiao & Zhiying Cui & Xuedi Sun & Chuancheng Gao & Jiangning Chen & Zhen Huang & Junfeng Zhang, 2024. "Bacterial protoplast-derived nanovesicles carrying CRISPR-Cas9 tools re-educate tumor-associated macrophages for enhanced cancer immunotherapy," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
