A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications

Author

Listed:
  • He Peng
  • Xiangxiang Zeng
  • Yadi Zhou
  • Defu Zhang
  • Ruth Nussinov
  • Feixiong Cheng

Abstract

Recent advances in next-generation sequencing and computational technologies have enabled routine analysis of large-scale single-cell ribonucleic acid sequencing (scRNA-seq) data. However, scRNA-seq technologies suffer from several technical challenges, including low mean expression levels in most genes and higher frequencies of missing data than bulk population sequencing technologies. Identifying functional gene sets and their regulatory networks that link specific cell types to human diseases and therapeutics from scRNA-seq profiles is a daunting task. In this study, we developed a Component Overlapping Attribute Clustering (COAC) algorithm to perform localized (cell subpopulation) gene co-expression network analysis from large-scale scRNA-seq profiles. Gene subnetworks that represent specific gene co-expression patterns are inferred from the components of a decomposed matrix of scRNA-seq profiles. We showed that single-cell gene subnetworks identified by COAC from multiple time points within cell phases can be used for cell type identification with high accuracy (83%). In addition, COAC-inferred subnetworks from melanoma patients’ scRNA-seq profiles are highly correlated with survival rates from The Cancer Genome Atlas (TCGA). Moreover, the localized gene subnetworks identified by COAC from individual patients’ scRNA-seq data can be used as pharmacogenomics biomarkers to predict drug responses (area under the receiver operating characteristic curve ranging from 0.728 to 0.783) in cancer cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database. In summary, COAC offers a powerful tool to identify potential network-based diagnostic and pharmacogenomics biomarkers from large-scale scRNA-seq profiles.
COAC is freely available at https://github.com/ChengF-Lab/COAC.

Author summary

Single-cell RNA sequencing (scRNA-seq) can reveal complex and rare cell populations, uncover gene regulatory relationships, track the trajectories of distinct cell lineages in development, and identify cell-cell variabilities in human diseases and therapeutics. Although experimental methods for scRNA-seq are increasingly accessible, computational approaches to infer gene regulatory networks from raw data remain limited. From a single-cell perspective, the stochastic features of a single cell must be properly embedded into gene regulatory networks. However, technical noise (e.g., low mean expression levels and missing data) is difficult to identify, and cell-cell variabilities remain poorly understood. In this study, we introduced a network-based approach, termed Component Overlapping Attribute Clustering (COAC), to infer novel gene-gene subnetworks in individual components (subsets of whole components) representing multiple cell types and phases of scRNA-seq data. We showed that COAC can reduce batch effects and identify specific cell types in two large-scale human scRNA-seq datasets. Importantly, we demonstrated that gene subnetworks identified by COAC from scRNA-seq profiles are highly correlated with patients' survival and drug responses in cancer, offering a novel computational tool for advancing precision medicine.
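
The general idea of inferring a gene subnetwork from one component of a decomposed expression matrix can be illustrated with a minimal sketch. This is not the COAC implementation (see the repository above for that). It assumes a plain principal component obtained by power iteration, a Pearson correlation cutoff for edges, and a made-up toy matrix; the function names, `top_k`, and `min_corr` are all hypothetical choices made for illustration only.

```python
from itertools import combinations
from math import sqrt


def center(column):
    """Subtract the mean so each gene has zero average expression."""
    mean = sum(column) / len(column)
    return [value - mean for value in column]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leading_component(centered, iterations=100):
    """Power iteration on the gene-gene scatter matrix.

    Returns the gene loadings of the first principal component.
    Assumes at least one gene varies across cells.
    """
    n_genes = len(centered)
    scatter = [[dot(centered[i], centered[j]) for j in range(n_genes)]
               for i in range(n_genes)]
    loadings = [1.0] * n_genes
    for _ in range(iterations):
        loadings = [dot(row, loadings) for row in scatter]
        norm = sqrt(dot(loadings, loadings))
        loadings = [value / norm for value in loadings]
    return loadings


def component_subnetwork(expression, genes, top_k=3, min_corr=0.8):
    """Link the top-loading genes of one component by Pearson correlation.

    expression: list of cells, each a list of per-gene expression values.
    top_k and min_corr are illustrative thresholds, not values from the paper.
    """
    # Transpose cells x genes into one centered column per gene.
    centered = [center(list(col)) for col in zip(*expression)]
    loadings = leading_component(centered)
    top = sorted(range(len(genes)), key=lambda g: abs(loadings[g]),
                 reverse=True)[:top_k]
    edges = []
    for i, j in combinations(sorted(top), 2):
        denom = sqrt(dot(centered[i], centered[i]) * dot(centered[j], centered[j]))
        corr = dot(centered[i], centered[j]) / denom if denom else 0.0
        if corr >= min_corr:
            edges.append((genes[i], genes[j]))
    return edges


# Toy data (hypothetical): 6 cells x 5 genes; G0-G2 rise and fall together.
toy = [
    [5, 4, 6, 1, 0],
    [6, 5, 7, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [5, 5, 6, 1, 1],
    [0, 0, 1, 0, 1],
]
genes = ["G0", "G1", "G2", "G3", "G4"]
print(component_subnetwork(toy, genes))
```

On this toy matrix the three co-varying genes dominate the first component and end up fully connected, while the two low-variance genes are left out. Real scRNA-seq data would call for a library decomposition routine, normalization, and handling of missing values, none of which this sketch attempts.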

Suggested Citation

  • He Peng & Xiangxiang Zeng & Yadi Zhou & Defu Zhang & Ruth Nussinov & Feixiong Cheng, 2019. "A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications," PLOS Computational Biology, Public Library of Science, vol. 15(2), pages 1-17, February.
  • Handle: RePEc:plo:pcbi00:1006772
    DOI: 10.1371/journal.pcbi.1006772

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006772
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1006772&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Csaba Both & Nima Dehmamy & Rose Yu & Albert-László Barabási, 2023. "Accelerating network layouts using graph neural networks," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    2. Sepideh Sadegh & James Skelton & Elisa Anastasi & Andreas Maier & Klaudia Adamowicz & Anna Möller & Nils M. Kriege & Jaanika Kronberg & Toomas Haller & Tim Kacprowski & Anil Wipat & Jan Baumbach & Dav, 2023. "Lacking mechanistic disease definitions and corresponding association data hamper progress in network medicine and beyond," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. Chengxi Zang & Hao Zhang & Jie Xu & Hansi Zhang & Sajjad Fouladvand & Shreyas Havaldar & Feixiong Cheng & Kun Chen & Yong Chen & Benjamin S. Glicksberg & Jin Chen & Jiang Bian & Fei Wang, 2023. "High-throughput target trial emulation for Alzheimer’s disease drug repurposing with real-world data," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
