
Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque

Authors

  • Adrià Fernández-Torras

    (The Barcelona Institute of Science and Technology)

  • Miquel Duran-Frigola

    (The Barcelona Institute of Science and Technology;
    Ersilia Open Source Initiative)

  • Martino Bertoni

    (The Barcelona Institute of Science and Technology)

  • Martina Locatelli

    (The Barcelona Institute of Science and Technology)

  • Patrick Aloy

    (The Barcelona Institute of Science and Technology;
    Institució Catalana de Recerca i Estudis Avançats (ICREA))

Abstract

Biomedical data is accumulating at a fast pace, and integrating it into a unified framework, so that multiple views of a given biological event can be considered simultaneously, is a major challenge. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 types of biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., ‘drug treats disease’, ‘gene interacts with gene’). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.

Suggested Citation

  • Adrià Fernández-Torras & Miquel Duran-Frigola & Martino Bertoni & Martina Locatelli & Patrick Aloy, 2022. "Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-33026-0
    DOI: 10.1038/s41467-022-33026-0

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-33026-0
    File Function: Abstract
    Download Restriction: no

    References listed on IDEAS

    1. Camilo Ruiz & Marinka Zitnik & Jure Leskovec, 2021. "Identification of disease treatment mechanisms through the multiscale interactome," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    2. Jordi Barretina & Giordano Caponigro & Nicolas Stransky & Kavitha Venkatesan & Adam A. Margolin & Sungjoon Kim & Christopher J. Wilson & Joseph Lehár & Gregory V. Kryukov & Dmitriy Sonkin & Anupama Red, 2012. "Addendum: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity," Nature, Nature, vol. 492(7428), pages 290-290, December.
    3. Erdogan Taskesen & Marcel J T Reinders, 2016. "2D Representation of Transcriptomes by t-SNE Exposes Relatedness between Human Tissues," PLOS ONE, Public Library of Science, vol. 11(2), pages 1-6, February.
    4. Laura Cantini & Pooya Zakeri & Celine Hernandez & Aurelien Naldi & Denis Thieffry & Elisabeth Remy & Anaïs Baudot, 2021. "Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    5. Yue Qin & Edward L. Huttlin & Casper F. Winsnes & Maya L. Gosztyla & Ludivine Wacheul & Marcus R. Kelly & Steven M. Blue & Fan Zheng & Michael Chen & Leah V. Schaffer & Katherine Licon & Anna Bäckströ, 2021. "A multi-scale map of cell structure fusing protein images and interactions," Nature, Nature, vol. 600(7889), pages 536-542, December.
    6. Mahmoud Ghandi & Franklin W. Huang & Judit Jané-Valbuena & Gregory V. Kryukov & Christopher C. Lo & E. Robert McDonald & Jordi Barretina & Ellen T. Gelfand & Craig M. Bielski & Haoxin Li & Kevin Hu & , 2019. "Next-generation characterization of the Cancer Cell Line Encyclopedia," Nature, Nature, vol. 569(7757), pages 503-508, May.
    7. Jordi Barretina & Giordano Caponigro & Nicolas Stransky & Kavitha Venkatesan & Adam A. Margolin & Sungjoon Kim & Christopher J. Wilson & Joseph Lehár & Gregory V. Kryukov & Dmitriy Sonkin & Anupama Re, 2012. "The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity," Nature, Nature, vol. 483(7391), pages 603-607, March.
    8. Benjamin Haibe-Kains & Nehme El-Hachem & Nicolai Juul Birkbak & Andrew C. Jin & Andrew H. Beck & Hugo J. W. L. Aerts & John Quackenbush, 2013. "Inconsistency in large pharmacogenomic studies," Nature, Nature, vol. 504(7480), pages 389-393, December.
    9. Anna C. Belkina & Christopher O. Ciccolella & Rina Anno & Richard Halpert & Josef Spidlen & Jennifer E. Snyder-Cappione, 2019. "Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets," Nature Communications, Nature, vol. 10(1), pages 1-12, December.
    10. Katja Luck & Dae-Kyum Kim & Luke Lambourne & Kerstin Spirohn & Bridget E. Begg & Wenting Bian & Ruth Brignall & Tiziana Cafarelli & Francisco J. Campos-Laborie & Benoit Charloteaux & Dongsic Choi & At, 2020. "A reference map of the human binary protein interactome," Nature, Nature, vol. 580(7803), pages 402-408, April.

    Citations

    Cited by:

    1. Arnau Comajuncosa-Creus & Guillem Jorba & Xavier Barril & Patrick Aloy, 2024. "Comprehensive detection and characterization of human druggable pockets through binding site descriptors," Nature Communications, Nature, vol. 15(1), pages 1-20, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Caitlin E. Mills & Kartik Subramanian & Marc Hafner & Mario Niepel & Luca Gerosa & Mirra Chung & Chiara Victor & Benjamin Gaudio & Clarence Yapp & Ajit J. Nirmal & Nicholas Clark & Peter K. Sorger, 2022. "Multiplexed and reproducible high content screening of live and fixed cells using Dye Drop," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    2. Yanli Liu & Zhong Wu & Jin Zhou & Dinesh K. A. Ramadurai & Katelyn L. Mortenson & Estrella Aguilera-Jimenez & Yifei Yan & Xiaojun Yang & Alison M. Taylor & Katherine E. Varley & Jason Gertz & Peter S., 2021. "A predominant enhancer co-amplified with the SOX2 oncogene is necessary and sufficient for its expression in squamous cancer," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    3. Xiao-Song Wang & Sanghoon Lee & Han Zhang & Gong Tang & Yue Wang, 2022. "An integral genomic signature approach for tailored cancer therapy using genome-wide sequencing data," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    4. George Rosenberger & Wenxue Li & Mikko Turunen & Jing He & Prem S. Subramaniam & Sergey Pampou & Aaron T. Griffin & Charles Karan & Patrick Kerwin & Diana Murray & Barry Honig & Yansheng Liu & Andrea , 2024. "Network-based elucidation of colon cancer drug resistance mechanisms by phosphoproteomic time-series analysis," Nature Communications, Nature, vol. 15(1), pages 1-27, December.
    5. Jurica Levatić & Marina Salvadores & Francisco Fuster-Tormo & Fran Supek, 2022. "Mutational signatures are markers of drug sensitivity of cancer cells," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    6. Katelyn L. Mortenson & Courtney Dawes & Emily R. Wilson & Nathan E. Patchen & Hailey E. Johnson & Jason Gertz & Swneke D. Bailey & Yang Liu & Katherine E. Varley & Xiaoyang Zhang, 2024. "3D genomic analysis reveals novel enhancer-hijacking caused by complex structural alterations that drive oncogene overexpression," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    7. Min Wu & Tingting Wang & Nan Ji & Ting Lu & Ran Yuan & Lingxiang Wu & Junxia Zhang & Mengyuan Li & Penghui Cao & Jiarui Zhao & Guanzhang Li & Jianyu Li & Yu Li & Yujie Tang & Zhengliang Gao & Xiuxing , 2024. "Multi-omics and pharmacological characterization of patient-derived glioma cell lines," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    8. Kelsy C. Cotto & Yang-Yang Feng & Avinash Ramu & Megan Richters & Sharon L. Freshour & Zachary L. Skidmore & Huiming Xia & Joshua F. McMichael & Jason Kunisaki & Katie M. Campbell & Timothy Hung-Po Ch, 2023. "Integrated analysis of genomic and transcriptomic data for the discovery of splice-associated variants in cancer," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    9. Noha A. M. Shendy & Melissa Bikowitz & Logan H. Sigua & Yang Zhang & Audrey Mercier & Yousef Khashana & Stephanie Nance & Qi Liu & Ian M. Delahunty & Sarah Robinson & Vanshita Goel & Matthew G. Rees &, 2024. "Group 3 medulloblastoma transcriptional networks collapse under domain specific EP300/CBP inhibition," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    10. Han Jin & Cheng Zhang & Martin Zwahlen & Kalle Feilitzen & Max Karlsson & Mengnan Shi & Meng Yuan & Xiya Song & Xiangyu Li & Hong Yang & Hasan Turkez & Linn Fagerberg & Mathias Uhlén & Adil Mardinoglu, 2023. "Systematic transcriptional analysis of human cell lines for gene expression landscape and tumor representation," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    11. Junyi Chen & Xiaoying Wang & Anjun Ma & Qi-En Wang & Bingqiang Liu & Lang Li & Dong Xu & Qin Ma, 2022. "Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    12. Omar Alhalabi & Jianfeng Chen & Yuxue Zhang & Yang Lu & Qi Wang & Sumankalai Ramachandran & Rebecca Slack Tidwell & Guangchun Han & Xinmiao Yan & Jieru Meng & Ruiping Wang & Anh G. Hoang & Wei-Lien Wa, 2022. "MTAP deficiency creates an exploitable target for antifolate therapy in 9p21-loss cancers," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    13. Yan Li & Chen Xu & Bing Wang & Fujiang Xu & Fahan Ma & Yuanyuan Qu & Dongxian Jiang & Kai Li & Jinwen Feng & Sha Tian & Xiaohui Wu & Yunzhi Wang & Yang Liu & Zhaoyu Qin & Yalan Liu & Jing Qin & Qi Son, 2022. "Proteomic characterization of gastric cancer response to chemotherapy and targeted therapy reveals potential therapeutic strategies," Nature Communications, Nature, vol. 13(1), pages 1-26, December.
    14. Aina Maria Mas & Enrique Goñi & Igor Ruiz de los Mozos & Aida Arcas & Luisa Statello & Jovanna González & Lorea Blázquez & Wei Ting Chelsea Lee & Dipika Gupta & Álvaro Sejas & Shoko Hoshina & Alexandr, 2023. "ORC1 binds to cis-transcribed RNAs for efficient activation of replication origins," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    15. Nicolae Sapoval & Amirali Aghazadeh & Michael G. Nute & Dinler A. Antunes & Advait Balaji & Richard Baraniuk & C. J. Barberan & Ruth Dannenfelser & Chen Dun & Mohammadamin Edrisi & R. A. Leo Elworth &, 2022. "Current progress and open challenges for applying deep learning across the biosciences," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    16. G. Gambardella & G. Viscido & B. Tumaini & A. Isacchi & R. Bosotti & D. di Bernardo, 2022. "A single-cell analysis of breast cancer cell lines to study tumour heterogeneity and drug response," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    17. Seungyeul Yoo & Abhilasha Sinha & Dawei Yang & Nasser K. Altorki & Radhika Tandon & Wenhui Wang & Deebly Chavez & Eunjee Lee & Ayushi S. Patel & Takashi Sato & Ranran Kong & Bisen Ding & Eric E. Schad, 2022. "Integrative network analysis of early-stage lung adenocarcinoma identifies aurora kinase inhibition as interceptor of invasion and progression," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    18. Shi, Chengchun & Xu, Tianlin & Bergsma, Wicher & Li, Lexin, 2021. "Double generative adversarial networks for conditional independence testing," LSE Research Online Documents on Economics 112550, London School of Economics and Political Science, LSE Library.
    19. Alon Stern & Mariam Fokra & Boris Sarvin & Ahmad Abed Alrahem & Won Dong Lee & Elina Aizenshtein & Nikita Sarvin & Tomer Shlomi, 2023. "Inferring mitochondrial and cytosolic metabolism by coupling isotope tracing and deconvolution," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    20. Sandor Spisak & David Chen & Pornlada Likasitwatanakul & Paul Doan & Zhixin Li & Pratyusha Bala & Laura Vizkeleti & Viktoria Tisza & Pushpamali Silva & Marios Giannakis & Brian Wolpin & Jun Qi & Nilay, 2024. "Identifying regulators of aberrant stem cell and differentiation activity in colorectal cancer using a dual endogenous reporter system," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
