Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-49448-x.html

A unified model-based framework for doublet or multiplet detection in single-cell multiomics data

Author

Listed:
  • Haoran Hu

    (University of Pittsburgh)

  • Xinjun Wang

    (Memorial Sloan Kettering Cancer Center)

  • Site Feng

    (University of Pittsburgh
    Tsinghua University)

  • Zhongli Xu

    (Tsinghua University
    University of Pittsburgh)

  • Jing Liu

    (University of Pittsburgh)

  • Elisa Heidrich-O’Hare

    (University of Pittsburgh)

  • Yanshuo Chen

    (University of Maryland
    Center of Bioinformatics and Computational Biology)

  • Molin Yue

    (University of Pittsburgh)

  • Lang Zeng

    (University of Pittsburgh)

  • Ziqi Rong

    (University of Michigan)

  • Tianmeng Chen

    (University of Pittsburgh)

  • Timothy Billiar

    (University of Pittsburgh)

  • Ying Ding

    (University of Pittsburgh)

  • Heng Huang

    (University of Maryland
    Center of Bioinformatics and Computational Biology)

  • Richard H. Duerr

    (University of Pittsburgh)

  • Wei Chen

    (University of Pittsburgh)

Abstract

Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data—a task at which the benchmarked single-omics methods proved inadequate.
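To make the count-based intuition behind multiplet detection concrete, here is a minimal sketch. It is not the compound Poisson model proposed in the paper: it assumes a much simpler two-component Poisson mixture in which a singlet droplet has an expected feature count `singlet_rate` and a doublet has twice that. The function names, the rate, and the prior are hypothetical and chosen only for illustration.

```python
import math


def log_poisson_pmf(count: int, rate: float) -> float:
    """Log of the Poisson probability mass function at `count`."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def multiplet_posterior(count: int, singlet_rate: float,
                        prior_multiplet: float = 0.1) -> float:
    """Posterior probability that a droplet holds two cells.

    Toy assumption (not from the paper): counts are Poisson with mean
    `singlet_rate` for one cell and 2 * `singlet_rate` for two cells.
    """
    log_singlet = (math.log(1.0 - prior_multiplet)
                   + log_poisson_pmf(count, singlet_rate))
    log_doublet = (math.log(prior_multiplet)
                   + log_poisson_pmf(count, 2.0 * singlet_rate))
    # Normalise with log-sum-exp to avoid underflow for large counts.
    top = max(log_singlet, log_doublet)
    log_total = top + math.log(math.exp(log_singlet - top)
                               + math.exp(log_doublet - top))
    return math.exp(log_doublet - log_total)


if __name__ == "__main__":
    for observed in (100, 140, 200):
        print(observed, multiplet_posterior(observed, singlet_rate=100.0))
```

Under these assumptions, a droplet whose count sits near the singlet mean gets a posterior close to zero, and one near twice that mean gets a posterior close to one. A multiomics method would have to combine this kind of evidence across modalities, which this single-modality sketch does not attempt.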

Suggested Citation

  • Haoran Hu & Xinjun Wang & Site Feng & Zhongli Xu & Jing Liu & Elisa Heidrich-O’Hare & Yanshuo Chen & Molin Yue & Lang Zeng & Ziqi Rong & Tianmeng Chen & Timothy Billiar & Ying Ding & Heng Huang & Richard H. Duerr & Wei Chen, 2024. "A unified model-based framework for doublet or multiplet detection in single-cell multiomics data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-49448-x
    DOI: 10.1038/s41467-024-49448-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-49448-x
    File Function: Abstract
    Download Restriction: no
