
HiDDEN: a machine learning method for detection of disease-relevant populations in case-control single-cell transcriptomics data

Authors

  • Aleksandrina Goeva (Broad Institute of Massachusetts Institute of Technology and Harvard)
  • Michael-John Dolan (Broad Institute of Massachusetts Institute of Technology and Harvard)
  • Judy Luu (Broad Institute of Massachusetts Institute of Technology and Harvard)
  • Eric Garcia (Broad Institute of Massachusetts Institute of Technology and Harvard)
  • Rebecca Boiarsky (Broad Institute of Massachusetts Institute of Technology and Harvard; Massachusetts Institute of Technology)
  • Rajat M. Gupta (Broad Institute of Massachusetts Institute of Technology and Harvard; Harvard Medical School)
  • Evan Macosko (Broad Institute of Massachusetts Institute of Technology and Harvard; Department of Psychiatry)

Abstract

In case-control single-cell RNA-seq studies, sample-level labels are transferred onto individual cells, so every case cell is labeled as affected even though only a small fraction may actually be perturbed. Using simulations, we demonstrate that the standard approach to single-cell analysis fails to isolate the subset of affected case cells and their markers when the affected subset is small or the perturbation is mild. To address this fundamental limitation, we introduce HiDDEN, a computational method that refines the case-control labels to accurately reflect the perturbation status of each cell. In simulated ground-truth datasets of cell type mixtures, HiDDEN recovers biological signals missed by the standard analysis workflow. When applied to a dataset of human multiple myeloma precursor conditions, HiDDEN recapitulates the expert manual annotation and discovers malignancy in early-stage samples missed in the original analysis. When applied to a mouse model of demyelination, HiDDEN identifies an endothelial subpopulation playing a role in early-stage blood-brain barrier dysfunction. We anticipate that HiDDEN will find wide use in contexts that require the detection of subtle transcriptional changes in cell types across conditions.
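
The idea of refining sample-level labels into per-cell labels can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify HiDDEN's model, so the choices here (a logistic regression trained on the case/control labels, followed by a two-cluster split of the case cells' scores) are assumptions made for illustration, as are all function names, the two synthetic features, and the simulated proportions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Predicted probability that a cell comes from a case sample."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.1, n_iter=300):
    """Logistic regression fitted by plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def two_means_1d(values, n_iter=50):
    """1-D k-means with k=2; returns 1 for members of the higher cluster."""
    lo, hi = min(values), max(values)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not low or not high:
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    mid = (lo + hi) / 2.0
    return [1 if v > mid else 0 for v in values]

def simulate_cells(n, shift):
    """Synthetic cells with two features (stand-ins for reduced expression)."""
    return [[random.gauss(shift, 1.0), random.gauss(shift, 1.0)] for _ in range(n)]

random.seed(0)
control = simulate_cells(200, 0.0)          # control sample
case_unaffected = simulate_cells(160, 0.0)  # case cells that look like controls
case_affected = simulate_cells(40, 6.0)     # the truly perturbed minority

X = control + case_unaffected + case_affected
sample_label = [0] * 200 + [1] * 200        # labels inherited from the sample
truth = [0] * 360 + [1] * 40                # hidden per-cell perturbation status

# Step 1: a continuous score from a classifier trained on the coarse labels.
w, b = fit_logistic(X, sample_label)
scores = [predict(w, b, x) for x in X]

# Step 2: split the case cells' scores into two groups; only the
# high-scoring group keeps the "affected" label.
refined = [0] * 200 + two_means_1d(scores[200:])

accuracy = sum(r == t for r, t in zip(refined, truth)) / len(truth)
recall = sum(r for r, t in zip(refined, truth) if t == 1) / sum(truth)
print(f"refined-label accuracy={accuracy:.2f}, affected-cell recall={recall:.2f}")
```

In this toy setting the inherited labels mark all 200 case cells as affected, whereas the refined labels are meant to single out only the shifted minority. The large shift makes the split easy; with a milder perturbation the two score groups would overlap and the refinement would be correspondingly less reliable.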

Suggested Citation

  • Aleksandrina Goeva & Michael-John Dolan & Judy Luu & Eric Garcia & Rebecca Boiarsky & Rajat M. Gupta & Evan Macosko, 2024. "HiDDEN: a machine learning method for detection of disease-relevant populations in case-control single-cell transcriptomics data," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-53666-8
    DOI: 10.1038/s41467-024-53666-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-53666-8
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Junyi Chen & Xiaoying Wang & Anjun Ma & Qi-En Wang & Bingqiang Liu & Lang Li & Dong Xu & Qin Ma, 2022. "Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    2. Franziska Haderk & Yu-Ting Chou & Lauren Cech & Celia Fernández-Méndez & Johnny Yu & Victor Olivas & Ismail M. Meraz & Dora Barbosa Rabago & D. Lucas Kerr & Carlos Gomez & David V. Allegakoen & Juan G, 2024. "Focal adhesion kinase-YAP signaling axis drives drug-tolerant persister cells and residual disease in lung cancer," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    3. Michael S. Balzer & Tomohito Doke & Ya-Wen Yang & Daniel L. Aldridge & Hailong Hu & Hung Mai & Dhanunjay Mukhi & Ziyuan Ma & Rojesh Shrestha & Matthew B. Palmer & Christopher A. Hunter & Katalin Suszt, 2022. "Single-cell analysis highlights differences in druggable pathways underlying adaptive or fibrotic kidney regeneration," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    4. Travis S. Johnson & Parvathi Sudha & Enze Liu & Nathan Becker & Sylvia Robertson & Patrick Blaney & Gareth Morgan & Vivek S. Chopra & Cedric Santos & Michael Nixon & Kun Huang & Attaya Suvannasankha &, 2024. "1q amplification and PHF19 expressing high-risk cells are associated with relapsed/refractory multiple myeloma," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    5. Sarah Figarol & Célia Delahaye & Rémi Gence & Aurélia Doussine & Juan Pablo Cerapio & Mathylda Brachais & Claudine Tardy & Nicolas Béry & Raghda Asslan & Jacques Colinge & Jean-Philippe Villemin & Ant, 2024. "Farnesyltransferase inhibition overcomes oncogene-addicted non-small cell lung cancer adaptive resistance to targeted therapies," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    6. Cristiana Spinelli & Lata Adnani & Brian Meehan & Laura Montermini & Sidong Huang & Minjun Kim & Tamiko Nishimura & Sidney E. Croul & Ichiro Nakano & Yasser Riazalhosseini & Janusz Rak, 2024. "Mesenchymal glioma stem cells trigger vasectasia—distinct neovascularization process stimulated by extracellular vesicles carrying EGFR," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
