A benchmark study of simulation methods for single-cell RNA sequencing data

Authors

Listed:
  • Yue Cao (The University of Sydney)
  • Pengyi Yang (The University of Sydney; Children’s Medical Research Institute)
  • Jean Yee Hwa Yang (The University of Sydney)

Abstract

Single-cell RNA-seq (scRNA-seq) data simulation is critical for evaluating computational methods for analysing scRNA-seq data especially when ground truth is experimentally unattainable. The reliability of evaluation depends on the ability of simulation methods to capture properties of experimental data. However, while many scRNA-seq data simulation methods have been proposed, a systematic evaluation of these methods is lacking. We develop a comprehensive evaluation framework, SimBench, including a kernel density estimation measure to benchmark 12 simulation methods through 35 scRNA-seq experimental datasets. We evaluate the simulation methods on a panel of data properties, ability to maintain biological signals, scalability and applicability. Our benchmark uncovers performance differences among the methods and highlights the varying difficulties in simulating data characteristics. Furthermore, we identify several limitations including maintaining heterogeneity of distribution. These results, together with the framework and datasets made publicly available as R packages, will guide simulation methods selection and their future development.
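
The abstract mentions a kernel density estimation measure for comparing simulated and experimental data. As a rough illustration of the general idea only, and not the SimBench statistic itself, the sketch below compares one data property (for example, per-gene mean expression) between a real and a simulated sample by integrating the squared difference of two Gaussian kernel density estimates. The function names, the Silverman-style bandwidth rule, and the toy data are all assumptions made for this example; it also assumes each sample has nonzero variance.

```python
import math
import random


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth (an assumption; other choices are possible)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian KDE of `sample` at x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def kde_discrepancy(real, simulated, grid_size=200):
    """Integrated squared difference between the two KDEs (trapezoid rule).

    Smaller values mean the simulated property distribution is closer
    to the real one. This is a hypothetical score, not SimBench's.
    """
    h_real = silverman_bandwidth(real)
    h_sim = silverman_bandwidth(simulated)
    f = gaussian_kde(real, h_real)
    g = gaussian_kde(simulated, h_sim)
    pad = 3 * max(h_real, h_sim)
    lo = min(min(real), min(simulated)) - pad
    hi = max(max(real), max(simulated)) + pad
    step = (hi - lo) / (grid_size - 1)
    sq = [(f(lo + i * step) - g(lo + i * step)) ** 2 for i in range(grid_size)]
    return step * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


# Toy data standing in for one property measured on real and simulated data.
rng = random.Random(0)
real = [rng.gauss(2.0, 1.0) for _ in range(300)]
close_sim = [rng.gauss(2.0, 1.0) for _ in range(300)]    # same distribution
shifted_sim = [rng.gauss(4.0, 1.0) for _ in range(300)]  # shifted mean

d_close = kde_discrepancy(real, close_sim)
d_shifted = kde_discrepancy(real, shifted_sim)
print(d_close, d_shifted)
```

In a benchmark of this kind, a score like this would be computed per data property and per dataset, and simulation methods would then be ranked by how small the discrepancies are. The published framework should be consulted for the measure actually used.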

Suggested Citation

  • Yue Cao & Pengyi Yang & Jean Yee Hwa Yang, 2021. "A benchmark study of simulation methods for single-cell RNA sequencing data," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:12:y:2021:i:1:d:10.1038_s41467-021-27130-w
    DOI: 10.1038/s41467-021-27130-w

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-021-27130-w
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-021-27130-w?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Robrecht Cannoodt & Wouter Saelens & Louise Deconinck & Yvan Saeys, 2021. "Spearheading future omics analyses using dyngen, a multi-modal simulator of single cells," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    2. Mohamed Marouf & Pierre Machart & Vikas Bansal & Christoph Kilian & Daniel S. Magruder & Christian F. Krebs & Stefan Bonn, 2020. "Realistic in silico generation and augmentation of single-cell RNA-seq data using generative adversarial networks," Nature Communications, Nature, vol. 11(1), pages 1-12, December.
    3. Beate Vieth & Swati Parekh & Christoph Ziegenhain & Wolfgang Enard & Ines Hellmann, 2019. "A systematic evaluation of single cell RNA-seq analysis pipelines," Nature Communications, Nature, vol. 10(1), pages 1-11, December.
    4. Davide Risso & Fanny Perraudeau & Svetlana Gribkova & Sandrine Dudoit & Jean-Philippe Vert, 2018. "A general and flexible method for signal extraction from single-cell RNA-seq data," Nature Communications, Nature, vol. 9(1), pages 1-17, December.
    5. Xiuwei Zhang & Chenling Xu & Nir Yosef, 2019. "Simulating multiple faceted variability in single cell RNA sequencing," Nature Communications, Nature, vol. 10(1), pages 1-16, December.

    Citations

    Cited by:

    1. Jolene S. Ranek & Wayne Stallaert & J. Justin Milner & Margaret Redick & Samuel C. Wolff & Adriana S. Beltran & Natalie Stanley & Jeremy E. Purvis, 2024. "DELVE: feature selection for preserving biological trajectories in single-cell data," Nature Communications, Nature, vol. 15(1), pages 1-26, December.
    2. Yuting Feng & Shuyi Wang & Xiaoye Liu & Yiming Han & Hongwei Xu & Xiaocen Duan & Wenyue Xie & Zhuoling Tian & Zuoying Yuan & Zhuo Wan & Liang Xu & Siying Qin & Kangmin He & Jianyong Huang, 2023. "Geometric constraint-triggered collagen expression mediates bacterial-host adhesion," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    3. Xiaohang Fu & Yingxin Lin & David M. Lin & Daniel Mechtersheimer & Chuhan Wang & Farhan Ameen & Shila Ghazanfar & Ellis Patrick & Jinman Kim & Jean Y. H. Yang, 2024. "BIDCell: Biologically-informed self-supervised learning for segmentation of subcellular spatial transcriptomics data," Nature Communications, Nature, vol. 15(1), pages 1-17, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ziqi Zhang & Xinye Zhao & Mehak Bindra & Peng Qiu & Xiuwei Zhang, 2024. "scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Lingfei Wang, 2021. "Single-cell normalization and association testing unifying CRISPR screen and gene co-expression analyses with Normalisr," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    3. Yazdan Zinati & Abdulrahman Takiddeen & Amin Emad, 2024. "GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    4. Ziqi Zhang & Haoran Sun & Ragunathan Mariappan & Xi Chen & Xinyu Chen & Mika S. Jain & Mirjana Efremova & Sarah A. Teichmann & Vaibhav Rajan & Xiuwei Zhang, 2023. "scMoMaT jointly performs single cell mosaic integration and multi-modal bio-marker detection," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Xin Tang & Jiawei Zhang & Yichun He & Xinhe Zhang & Zuwan Lin & Sebastian Partarrieu & Emma Bou Hanna & Zhaolin Ren & Hao Shen & Yuhong Yang & Xiao Wang & Na Li & Jie Ding & Jia Liu, 2023. "Explainable multi-task learning for multi-modality biological data analysis," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    6. Lulu Shang & Xiang Zhou, 2022. "Spatially aware dimension reduction for spatial transcriptomics," Nature Communications, Nature, vol. 13(1), pages 1-22, December.
    7. Xiaotian Wu & Hao Wu & Zhijin Wu, 2021. "Penalized Latent Dirichlet Allocation Model in Single-Cell RNA Sequencing," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(3), pages 543-562, December.
    8. Qi Liu & Charles A Herring & Quanhu Sheng & Jie Ping & Alan J Simmons & Bob Chen & Amrita Banerjee & Wei Li & Guoqiang Gu & Robert J Coffey & Yu Shyr & Ken S Lau, 2018. "Quantitative assessment of cell population diversity in single-cell landscapes," PLOS Biology, Public Library of Science, vol. 16(10), pages 1-29, October.
    9. Michael Greenacre & Patrick J. F Groenen & Trevor Hastie & Alfonso Iodice d’Enza & Angelos Markos & Elena Tuzhilina, 2023. "Principal component analysis," Economics Working Papers 1856, Department of Economics and Business, Universitat Pompeu Fabra.
    10. Xiang Lin & Tian Tian & Zhi Wei & Hakon Hakonarson, 2022. "Clustering of single-cell multi-omics data with a multimodal deep learning method," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    11. Jingyang Qian & Hudong Bao & Xin Shao & Yin Fang & Jie Liao & Zhuo Chen & Chengyu Li & Wenbo Guo & Yining Hu & Anyao Li & Yue Yao & Xiaohui Fan & Yiyu Cheng, 2024. "Simulating multiple variability in spatially resolved transcriptomics with scCube," Nature Communications, Nature, vol. 15(1), pages 1-21, December.
    12. Angeles Arzalluz-Luque & Pedro Salguero & Sonia Tarazona & Ana Conesa, 2022. "acorde unravels functionally interpretable networks of isoform co-usage from single cell data," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    13. Jolene S. Ranek & Wayne Stallaert & J. Justin Milner & Margaret Redick & Samuel C. Wolff & Adriana S. Beltran & Natalie Stanley & Jeremy E. Purvis, 2024. "DELVE: feature selection for preserving biological trajectories in single-cell data," Nature Communications, Nature, vol. 15(1), pages 1-26, December.
    14. Reiichi Sugihara & Yuki Kato & Tomoya Mori & Yukio Kawahara, 2022. "Alignment of single-cell trajectory trees with CAPITAL," Nature Communications, Nature, vol. 13(1), pages 1-11, December.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.