Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-37572-z.html

Improving the generalizability of protein-ligand binding predictions with AI-Bind

Author

Listed:
  • Ayan Chatterjee

    (Northeastern University)

  • Robin Walters

    (Northeastern University)

  • Zohair Shafi

    (Northeastern University)

  • Omair Shafi Ahmed

    (Northeastern University)

  • Michael Sebek

    (Northeastern University)

  • Deisy Gysi

    (Northeastern University
    Brigham and Women’s Hospital, Harvard Medical School)

  • Rose Yu

    (University of California)

  • Tina Eliassi-Rad

    (Northeastern University
    Santa Fe Institute)

  • Albert-László Barabási

    (Northeastern University
    Central European University)

  • Giulia Menichetti

    (Northeastern University
    Brigham and Women’s Hospital, Harvard Medical School)

Abstract

Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
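
The shortcut described in the abstract can be made concrete with a toy sketch. This is not the AI-Bind pipeline and not code from the paper; the function names (degree_ratios, topology_only_score), the averaging rule, the 0.5 fallback, and the tiny dataset are all hypothetical, chosen only to show how a predictor can score protein-ligand pairs from the annotation network alone, without looking at any sequence or chemical structure.

```python
from collections import defaultdict


def degree_ratios(annotations):
    """Map each node to the fraction of its training annotations that are positive.

    `annotations` is an iterable of (protein, ligand, binds) triples.
    Proteins and ligands are kept apart with a "p"/"l" tag.
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for protein, ligand, binds in annotations:
        for node in (("p", protein), ("l", ligand)):
            totals[node] += 1
            positives[node] += int(binds)
    return {node: positives[node] / totals[node] for node in totals}


def topology_only_score(ratios, protein, ligand, default=0.5):
    """Score a pair from node statistics alone (hypothetical averaging rule).

    No node features are used, so an unseen protein or ligand
    falls back to the uninformative `default`.
    """
    protein_ratio = ratios.get(("p", protein), default)
    ligand_ratio = ratios.get(("l", ligand), default)
    return (protein_ratio + ligand_ratio) / 2


# Toy training set: P1 binds everything it was tested on, P2 binds nothing.
train = [
    ("P1", "L1", True), ("P1", "L2", True), ("P1", "L3", True),
    ("P2", "L1", False), ("P2", "L2", False), ("P2", "L3", False),
]
ratios = degree_ratios(train)

seen_binder = topology_only_score(ratios, "P1", "L1")      # 0.75
seen_non_binder = topology_only_score(ratios, "P2", "L1")  # 0.25
novel_pair = topology_only_score(ratios, "P_new", "L_new")  # 0.5
```

On pairs built from nodes seen in training, this feature-free scorer separates binders from non-binders. On a pair of never-before-seen nodes it has nothing to go on and returns the default, which is the kind of generalization failure the abstract attributes to learning network topology rather than node features.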

Suggested Citation

  • Ayan Chatterjee & Robin Walters & Zohair Shafi & Omair Shafi Ahmed & Michael Sebek & Deisy Gysi & Rose Yu & Tina Eliassi-Rad & Albert-László Barabási & Giulia Menichetti, 2023. "Improving the generalizability of protein-ligand binding predictions with AI-Bind," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-37572-z
    DOI: 10.1038/s41467-023-37572-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-37572-z
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Taha Y. Taha & Irene P. Chen & Jennifer M. Hayashi & Takako Tabata & Keith Walcott & Gabriella R. Kimmerly & Abdullah M. Syed & Alison Ciling & Rahul K. Suryawanshi & Hannah S. Martin & Bryan H. Bach , 2023. "Rapid assembly of SARS-CoV-2 genomes reveals attenuation of the Omicron BA.1 variant through NSP6," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    2. David Gomez-Zepeda & Danielle Arnold-Schild & Julian Beyrle & Arthur Declercq & Ralf Gabriels & Elena Kumm & Annica Preikschat & Mateusz Krzysztof Łącki & Aurélie Hirschler & Jeewan Babu Rijal & Chris, 2024. "Thunder-DDA-PASEF enables high-coverage immunopeptidomics and is boosted by MS2Rescore with MS2PIP timsTOF fragmentation prediction model," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    3. Christine E. Peters & Ursula Schulze-Gahmen & Manon Eckhardt & Gwendolyn M. Jang & Jiewei Xu & Ernst H. Pulido & Conner Bardine & Charles S. Craik & Melanie Ott & Or Gozani & Kliment A. Verba & Ruth H, 2022. "Structure-function analysis of enterovirus protease 2A in complex with its essential host factor SETD3," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    4. Paul Beroza & James J. Crawford & Oleg Ganichkin & Leo Gendelev & Seth F. Harris & Raphael Klein & Anh Miu & Stefan Steinbacher & Franca-Maria Klingler & Christian Lemmen, 2022. "Chemical space docking enables large-scale structure-based virtual screening to discover ROCK1 kinase inhibitors," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    5. Scotland E. Farley & Jennifer E. Kyle & Hans C. Leier & Lisa M. Bramer & Jules B. Weinstein & Timothy A. Bates & Joon-Yong Lee & Thomas O. Metz & Carsten Schultz & Fikadu G. Tafesse, 2022. "A global lipid map reveals host dependency factors conserved across SARS-CoV-2 variants," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    6. Andrea M. Chiariello & Alex Abraham & Simona Bianco & Andrea Esposito & Andrea Fontana & Francesca Vercellone & Mattia Conte & Mario Nicodemi, 2024. "Multiscale modelling of chromatin 4D organization in SARS-CoV-2 infected cells," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    7. Gabriela Dias Noske & Yun Song & Rafaela Sachetto Fernandes & Rod Chalk & Haitem Elmassoudi & Lizbé Koekemoer & C. David Owen & Tarick J. El-Baba & Carol V. Robinson & Glaucius Oliva & Andre Schutzer , 2023. "An in-solution snapshot of SARS-COV-2 main protease maturation process and inhibition," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    8. Haofeng Wang & Qi Yang & Xiaoce Liu & Zili Xu & Maolin Shao & Dongxu Li & Yinkai Duan & Jielin Tang & Xianqiang Yu & Yumin Zhang & Aihua Hao & Yajie Wang & Jie Chen & Chenghao Zhu & Luke Guddat & Hong, 2023. "Structure-based discovery of dual pathway inhibitors for SARS-CoV-2 entry," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    9. Sara Sunshine & Andreas S. Puschnik & Joseph M. Replogle & Matthew T. Laurie & Jamin Liu & Beth Shoshana Zha & James K. Nuñez & Janie R. Byrum & Aidan H. McMorrow & Matthew B. Frieman & Juliane Winkle, 2023. "Systematic functional interrogation of SARS-CoV-2 host factors using Perturb-seq," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    10. Xiaopan Gao & Huabin Tian & Kaixiang Zhu & Qing Li & Wei Hao & Linyue Wang & Bo Qin & Hongyu Deng & Sheng Cui, 2022. "Structural basis for Sarbecovirus ORF6 mediated blockage of nucleocytoplasmic transport," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    11. Thomas Kruse & Caroline Benz & Dimitriya H. Garvanska & Richard Lindqvist & Filip Mihalic & Fabian Coscia & Raviteja Inturi & Ahmed Sayadi & Leandro Simonetti & Emma Nilsson & Muhammad Ali & Johanna K, 2021. "Large scale discovery of coronavirus-host factor protein interaction motifs reveals SARS-CoV-2 specific mechanisms and vulnerabilities," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    12. Kun Wang & Chia-Wei Lee & Xuewu Sui & Siyoung Kim & Shuhui Wang & Aidan B. Higgs & Aaron J. Baublis & Gregory A. Voth & Maofu Liao & Tobias C. Walther & Robert V. Farese, 2023. "The structure of phosphatidylinositol remodeling MBOAT7 reveals its catalytic mechanism and enables inhibitor identification," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    13. Filip Mihalič & Leandro Simonetti & Girolamo Giudice & Marie Rubin Sander & Richard Lindqvist & Marie Berit Akpiroro Peters & Caroline Benz & Eszter Kassa & Dilip Badgujar & Raviteja Inturi & Muhammad, 2023. "Large-scale phage-based screening reveals extensive pan-viral mimicry of host short linear motifs," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    14. Lifan Chen & Zisheng Fan & Jie Chang & Ruirui Yang & Hui Hou & Hao Guo & Yinghui Zhang & Tianbiao Yang & Chenmao Zhou & Qibang Sui & Zhengyang Chen & Chen Zheng & Xinyue Hao & Keke Zhang & Rongrong Cu, 2023. "Sequence-based drug design as a concept in computational drug design," Nature Communications, Nature, vol. 14(1), pages 1-21, December.
    15. Hanbaek Lyu & Yacoub H. Kureh & Joshua Vendrow & Mason A. Porter, 2024. "Learning low-rank latent mesoscale structures in networks," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    16. Ma’ayan Israeli & Yaara Finkel & Yfat Yahalom-Ronen & Nir Paran & Theodor Chitlaru & Ofir Israeli & Inbar Cohen-Gihon & Moshe Aftalion & Reut Falach & Shahar Rotem & Uri Elia & Ital Nemet & Limor Klik, 2022. "Genome-wide CRISPR screens identify GATA6 as a proviral host factor for SARS-CoV-2 via modulation of ACE2," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    17. Ramiz Salama & Fadi Al-Turjman, 2023. "Sustainable Energy Production in Smart Cities," Sustainability, MDPI, vol. 15(22), pages 1-25, November.
    18. Charulata Jindal & Sandeep Kumar & Sunil Sharma & Yuk Ming Choi & Jimmy T. Efird, 2020. "The Prevention and Management of COVID-19: Seeking a Practical and Timely Solution," IJERPH, MDPI, vol. 17(11), pages 1-11, June.
    19. Kelsey M. Haas & Michael J. McGregor & Mehdi Bouhaddou & Benjamin J. Polacco & Eun-Young Kim & Thong T. Nguyen & Billy W. Newton & Matthew Urbanowski & Heejin Kim & Michael A. P. Williams & Veronica V, 2023. "Proteomic and genetic analyses of influenza A viruses identify pan-viral host targets," Nature Communications, Nature, vol. 14(1), pages 1-27, December.
    20. Matthias M. Zimmer & Anuja Kibe & Ulfert Rand & Lukas Pekarek & Liqing Ye & Stefan Buck & Redmond P. Smyth & Luka Cicin-Sain & Neva Caliskan, 2021. "The short isoform of the host antiviral protein ZAP acts as an inhibitor of SARS-CoV-2 programmed ribosomal frameshifting," Nature Communications, Nature, vol. 12(1), pages 1-15, December.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.