Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-29356-8.html

Normalizing and denoising protein expression data from droplet-based single cell profiling

Author

Listed:
  • Matthew P. Mulè

    (National Institutes of Health (NIH)
    University of Cambridge)

  • Andrew J. Martins

    (National Institutes of Health (NIH))

  • John S. Tsang

    (National Institutes of Health (NIH))

Abstract

Multimodal single-cell profiling methods that measure protein expression with oligo-conjugated antibodies hold promise for comprehensive dissection of cellular heterogeneity, yet the resulting protein counts carry substantial technical noise that can mask biological variation. Here we integrate experiments and computational analyses to reveal two major noise sources and develop a method called “dsb” (denoised and scaled by background) to normalize and denoise droplet-based protein expression data. We discover that protein-specific noise originates from unbound antibodies encapsulated during droplet generation; this noise can thus be accurately estimated and corrected using protein levels in empty droplets. We also find that isotype control antibodies and the background protein population average in each cell are significantly correlated across single cells; we therefore use their shared variance to correct for cell-to-cell technical noise. We validate these findings by analyzing the performance of dsb in eight independent datasets spanning multiple technologies, including CITE-seq, ASAP-seq, and TEA-seq. Compared to existing normalization methods, our approach improves downstream analyses by better unmasking biologically meaningful cell populations. Our method is available as an open-source R package that interfaces easily with existing single cell software platforms such as Seurat, Bioconductor, and Scanpy, and can be accessed at https://cran.r-project.org/package=dsb.
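
The two corrections described in the abstract can be illustrated with a minimal NumPy sketch. This is not the dsb package or its API; the function name `dsb_sketch`, the pseudocount of 10, and the below-median estimate of each cell's background mean are assumptions made for illustration only, since the abstract does not say how that background average is estimated. Step 1 rescales each protein by its mean and standard deviation in empty droplets. Step 2 takes the first principal component of the isotype controls plus the per-cell background mean as the shared technical component and regresses it out of every protein.

```python
import numpy as np


def dsb_sketch(cells, empty, isotype_idx, pseudocount=10.0):
    """Illustrative two-step normalization (hypothetical helper, not the dsb API).

    cells       : (n_cells, n_proteins) raw counts from cell-containing droplets
    empty       : (n_empty, n_proteins) raw counts from empty droplets
    isotype_idx : column indices of the isotype control antibodies
    pseudocount : assumed constant added before the log transform
    """
    cells_log = np.log(np.asarray(cells, dtype=float) + pseudocount)
    empty_log = np.log(np.asarray(empty, dtype=float) + pseudocount)

    # Step 1: protein-specific ambient correction. Each protein is centered
    # and scaled by its mean and SD across the empty droplets.
    mu = empty_log.mean(axis=0)
    sd = empty_log.std(axis=0, ddof=1)
    z = (cells_log - mu) / sd

    # Step 2a: per-cell background mean. ASSUMPTION: the mean of the values
    # at or below each cell's median stands in for the "background protein
    # population average"; the abstract does not specify the estimator.
    med = np.median(z, axis=1, keepdims=True)
    bg_mean = np.nanmean(np.where(z <= med, z, np.nan), axis=1)

    # Step 2b: shared variance of isotype controls and background mean,
    # summarized as the first principal component of the centered matrix.
    noise = np.column_stack([bg_mean, z[:, isotype_idx]])
    noise_centered = noise - noise.mean(axis=0)
    u, s, _ = np.linalg.svd(noise_centered, full_matrices=False)
    tech = u[:, 0] * s[0]  # per-cell technical component, zero mean

    # Step 2c: regress the technical component out of every protein
    # (ordinary least squares slope per protein, intercept retained).
    beta = tech @ (z - z.mean(axis=0)) / (tech @ tech)
    denoised = z - np.outer(tech, beta)
    return denoised, tech


# Usage on synthetic counts (12 proteins, the last two acting as isotype controls).
rng = np.random.default_rng(0)
cells = rng.poisson(50, size=(200, 12))
empty = rng.poisson(5, size=(500, 12))
denoised, tech = dsb_sketch(cells, empty, isotype_idx=[10, 11])
```

After step 2 the returned values are, by construction, uncorrelated with the estimated technical component, which is the sense in which cell-to-cell technical noise is removed in this sketch.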

Suggested Citation

  • Matthew P. Mulè & Andrew J. Martins & John S. Tsang, 2022. "Normalizing and denoising protein expression data from droplet-based single cell profiling," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-29356-8
    DOI: 10.1038/s41467-022-29356-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-29356-8
    File Function: Abstract
    Download Restriction: no



