
Evaluating batch correction methods for image-based cell profiling

Author

Listed:
  • John Arevalo

    (Broad Institute of MIT and Harvard)

  • Ellen Su

    (Broad Institute of MIT and Harvard)

  • Jessica D. Ewald

    (Broad Institute of MIT and Harvard)

  • Robert Dijk

    (Broad Institute of MIT and Harvard)

  • Anne E. Carpenter

    (Broad Institute of MIT and Harvard)

  • Shantanu Singh

    (Broad Institute of MIT and Harvard)

Abstract

High-throughput image-based profiling platforms are powerful technologies capable of collecting data from billions of cells exposed to thousands of perturbations in a time- and cost-effective manner. Therefore, image-based profiling data has been increasingly used for diverse biological applications, such as predicting drug mechanism of action or gene function. However, batch effects severely limit community-wide efforts to integrate and interpret image-based profiling data collected across different laboratories and equipment. To address this problem, we benchmark ten high-performing single-cell RNA sequencing (scRNA-seq) batch correction techniques, representing diverse approaches, using a newly released Cell Painting dataset, JUMP. We focus on five scenarios with varying complexity, ranging from batches prepared in a single lab over time to batches imaged using different microscopes in multiple labs. We find that Harmony and Seurat RPCA are noteworthy, consistently ranking among the top three methods for all tested scenarios while maintaining computational efficiency. Our proposed framework, benchmark, and metrics can be used to assess new batch correction methods in the future. This work paves the way for improvements that enable the community to make the best use of public Cell Painting data for scientific discovery.
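As a rough illustration of what a batch effect and its correction look like, the sketch below subtracts each batch's per-feature mean from the profiles in that batch. This is a simple baseline chosen for illustration only; it is not Harmony, Seurat RPCA, or any other method the article benchmarks. The function name `center_per_batch`, the batch labels, and the toy feature values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_per_batch(profiles, batches):
    """Subtract each batch's per-feature mean from the profiles in that batch.

    profiles: list of equal-length feature vectors (one per well or cell).
    batches:  list of batch labels, aligned with profiles.
    """
    # Group the feature vectors by batch label.
    by_batch = defaultdict(list)
    for row, batch in zip(profiles, batches):
        by_batch[batch].append(row)

    # Compute the mean of every feature column within each batch.
    batch_means = {
        batch: [mean(column) for column in zip(*rows)]
        for batch, rows in by_batch.items()
    }

    # Remove the batch-specific offset from every profile.
    return [
        [value - offset for value, offset in zip(row, batch_means[batch])]
        for row, batch in zip(profiles, batches)
    ]


# Toy data (hypothetical): two labs measure the same two conditions,
# but the second lab's readings are shifted by a constant offset of +10.
profiles = [[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]]
batches = ["lab_a", "lab_a", "lab_b", "lab_b"]

corrected = center_per_batch(profiles, batches)
```

After centering, the matching conditions from the two hypothetical labs coincide, because the constant offset between them has been removed. Real batch effects are rarely a pure constant shift, which is one reason more elaborate methods are worth benchmarking.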

Suggested Citation

  • John Arevalo & Ellen Su & Jessica D. Ewald & Robert Dijk & Anne E. Carpenter & Shantanu Singh, 2024. "Evaluating batch correction methods for image-based cell profiling," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-50613-5
    DOI: 10.1038/s41467-024-50613-5

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-50613-5
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yichuan Cao & Xiamiao Zhao & Songming Tang & Qun Jiang & Sijie Li & Siyu Li & Shengquan Chen, 2024. "scButterfly: a versatile single-cell cross-modality translation method via dual-aligned variational autoencoders," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    2. Yingxin Lin & Yue Cao & Elijah Willie & Ellis Patrick & Jean Y. H. Yang, 2023. "Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    3. Xin Tang & Jiawei Zhang & Yichun He & Xinhe Zhang & Zuwan Lin & Sebastian Partarrieu & Emma Bou Hanna & Zhaolin Ren & Hao Shen & Yuhong Yang & Xiao Wang & Na Li & Jie Ding & Jia Liu, 2023. "Explainable multi-task learning for multi-modality biological data analysis," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    4. Duy Pham & Xiao Tan & Brad Balderson & Jun Xu & Laura F. Grice & Sohye Yoon & Emily F. Willis & Minh Tran & Pui Yeng Lam & Arti Raghubar & Priyakshi Kalita-de Croft & Sunil Lakhani & Jana Vukovic & Ma, 2023. "Robust mapping of spatiotemporal trajectories and cell–cell interactions in healthy and diseased tissues," Nature Communications, Nature, vol. 14(1), pages 1-25, December.
    5. Yasa Baig & Helena R. Ma & Helen Xu & Lingchong You, 2023. "Autoencoder neural networks enable low dimensional structure analyses of microbial growth dynamics," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    6. Yun-Tsan Chang & Pacôme Prompsy & Susanne Kimeswenger & Yi-Chien Tsai & Desislava Ignatova & Olesya Pavlova & Christoph Iselin & Lars E. French & Mitchell P. Levesque & François Kuonen & Malgorzata Bo, 2024. "MHC-I upregulation safeguards neoplastic T cells in the skin against NK cell-mediated eradication in mycosis fungoides," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    7. Kai Cao & Qiyu Gong & Yiguang Hong & Lin Wan, 2022. "A unified computational framework for single-cell data integration with optimal transport," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    8. Caitlin E. Carey & Rebecca Shafee & Robbee Wedow & Amanda Elliott & Duncan S. Palmer & John Compitello & Masahiro Kanai & Liam Abbott & Patrick Schultz & Konrad J. Karczewski & Samuel C. Bryant & Caro, 2024. "Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation," Nature Human Behaviour, Nature, vol. 8(8), pages 1599-1615, August.
    9. Jules Samaran & Gabriel Peyré & Laura Cantini, 2024. "scConfluence: single-cell diagonal integration with regularized Inverse Optimal Transport on weakly connected features," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    10. Lisa Laux & Marie F A Cutiongco & Nikolaj Gadegaard & Bjørn Sand Jensen, 2020. "Interactive machine learning for fast and robust cell profiling," PLOS ONE, Public Library of Science, vol. 15(9), pages 1-16, September.
    11. Zhuohan Yu & Yanchi Su & Yifu Lu & Yuning Yang & Fuzhou Wang & Shixiong Zhang & Yi Chang & Ka-Chun Wong & Xiangtao Li, 2023. "Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    12. Yang Xu & Rachel Patton McCord, 2022. "Diagonal integration of multimodal single-cell data: potential pitfalls and paths forward," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    13. Ajita Shree & Musale Krushna Pavan & Hamim Zafar, 2023. "scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    14. Greta Simionato & Konrad Hinkelmann & Revaz Chachanidze & Paola Bianchi & Elisa Fermo & Richard van Wijk & Marc Leonetti & Christian Wagner & Lars Kaestner & Stephan Quint, 2021. "Red blood cell phenotyping from 3D confocal images using artificial neural networks," PLOS Computational Biology, Public Library of Science, vol. 17(5), pages 1-17, May.
    15. Xiaokang Yu & Xinyi Xu & Jingxiao Zhang & Xiangjie Li, 2023. "Batch alignment of single-cell transcriptomics data using deep metric learning," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    16. Adityanarayanan Radhakrishnan & Sam F. Friedman & Shaan Khurshid & Kenney Ng & Puneet Batra & Steven A. Lubitz & Anthony A. Philippakis & Caroline Uhler, 2023. "Cross-modal autoencoder framework learns holistic representations of cardiovascular state," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    17. Qihuang Zhang & Shunzhou Jiang & Amelia Schroeder & Jian Hu & Kejie Li & Baohong Zhang & David Dai & Edward B. Lee & Rui Xiao & Mingyao Li, 2023. "Leveraging spatial transcriptomics data to recover cell locations in single-cell RNA-seq with CeLEry," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
