
Non-linear archetypal analysis of single-cell RNA-seq data by deep autoencoders

Author

Listed:
  • Yuge Wang
  • Hongyu Zhao

Abstract

Advances in single-cell RNA sequencing (scRNA-seq) have led to successes in discovering novel cell types and understanding cellular heterogeneity among complex cell populations through cluster analysis. However, cluster analysis cannot reveal a continuous spectrum of states or the underlying gene expression programs (GEPs) shared across cell types. We introduce scAAnet, an autoencoder for single-cell non-linear archetypal analysis, to identify GEPs and infer the relative activity of each GEP across cells. We use a count distribution-based loss term to account for the sparsity and overdispersion of the raw count data and add an archetypal constraint to the loss function of scAAnet. We first show through simulations that scAAnet outperforms existing methods for archetypal analysis across different metrics. We then demonstrate the ability of scAAnet to extract biologically meaningful GEPs using publicly available scRNA-seq datasets, including a pancreatic islet dataset, a lung idiopathic pulmonary fibrosis dataset and a prefrontal cortex dataset.

Author summary: Single-cell RNA sequencing (scRNA-seq) techniques enable the profiling of gene expression at the single-cell level, and thus make it possible to uncover the cellular heterogeneity in a complex cell population composed of multiple cell types. Due to the complexity of biological systems, different cell types may share underlying gene expression programs (GEPs) at different levels. However, such shared patterns are difficult to study with traditional cluster analysis. Based on the assumption that the expression profile of each cell results from a non-linear combination of multiple GEPs, we develop scAAnet, a deep learning model for non-linear archetypal decomposition of scRNA-seq data. We demonstrate that scAAnet both achieves better decomposition performance on simulated data and identifies biologically meaningful GEPs that are either cell-type-specific or disease-enriched in three real scRNA-seq datasets. To help interpret results from scAAnet, we also provide downstream analysis tools for the identification of program-specific marker genes. We expect that scAAnet can be applied to explore GEPs shared across cells when scRNA-seq is used to study a complex disease or biological system.
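
The two ingredients named in the abstract, an archetypal constraint on per-cell program usage and a count distribution-based loss, can be illustrated with a toy sketch. This is not the scAAnet implementation: it assumes a negative binomial as the count distribution (the abstract does not name one), replaces the non-linear decoder with a plain linear mixture of programs, and uses hypothetical names (`softmax`, `mix_programs`, `nb_nll`) and made-up numbers throughout.

```python
import math

def softmax(scores):
    """Map unconstrained scores to non-negative weights that sum to 1
    (one simple way to keep program usage on the simplex)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mix_programs(usage, programs):
    """Convex combination of program profiles.
    `programs` holds one list of per-gene mean expression values per program.
    Assumption: linear mixing stands in for a non-linear decoder."""
    n_genes = len(programs[0])
    return [sum(u * p[g] for u, p in zip(usage, programs)) for g in range(n_genes)]

def nb_nll(counts, means, dispersion):
    """Negative log-likelihood of integer counts under a negative binomial
    with the given means and one shared inverse-dispersion parameter."""
    r = dispersion
    nll = 0.0
    for x, mu in zip(counts, means):
        mu = max(mu, 1e-8)  # guard against log(0)
        log_p = (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
                 + r * math.log(r / (r + mu))
                 + x * math.log(mu / (r + mu)))
        nll -= log_p
    return nll

# Two hypothetical programs over three genes, and one cell's raw counts.
programs = [[10.0, 1.0, 0.5],
            [0.5, 1.0, 8.0]]
counts = [9, 1, 1]

usage_a = softmax([2.0, 0.0])  # cell mostly uses program 0
usage_b = softmax([0.0, 2.0])  # cell mostly uses program 1

nll_a = nb_nll(counts, mix_programs(usage_a, programs), dispersion=2.0)
nll_b = nb_nll(counts, mix_programs(usage_b, programs), dispersion=2.0)
print(usage_a, nll_a, nll_b)
```

Because the cell's counts resemble program 0, the usage vector weighted toward that program yields the lower loss. In a trained model the usage weights and program profiles would be learned by minimizing such a loss over all cells rather than fixed by hand.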

Suggested Citation

  • Yuge Wang & Hongyu Zhao, 2022. "Non-linear archetypal analysis of single-cell RNA-seq data by deep autoencoders," PLOS Computational Biology, Public Library of Science, vol. 18(4), pages 1-31, April.
  • Handle: RePEc:plo:pcbi00:1010025
    DOI: 10.1371/journal.pcbi.1010025



