Abstract
With the rapid advance of single-cell technologies, an increasing number of single-cell datasets are being generated, and computational tools for aligning these datasets, which make subsequent integration or meta-analysis possible, have become critical. Typically, single-cell datasets from different technologies cannot be directly combined or concatenated because of innate differences in the data, such as the number of measured parameters and their distributions. Even datasets generated by the same technology are often affected by batch effects. A computational approach for aligning different datasets, and hence identifying related clusters, will be useful for data integration and interpretation in large-scale single-cell experiments. Our proposed algorithm, JSOM, a variation of the self-organizing map, aligns two related datasets that contain similar clusters by constructing two maps (low-dimensional discretized representations of the datasets) that jointly evolve according to both datasets. Here we applied the JSOM algorithm to flow cytometry, mass cytometry, and single-cell RNA sequencing datasets. The resulting JSOM maps not only align the related clusters in the two datasets but also preserve the topology of the datasets, so that the maps can be used for further analysis, such as clustering.
Author summary: Biological datasets are now being generated at an unprecedented rate, as many data acquisition technologies, especially single-cell technologies, have been developed over the years. With an increasing number of datasets available for larger-scale studies, robust computational tools that can align datasets are needed for data integration and interpretation. We present a new algorithm that aligns two biological datasets and demonstrate that it works with data generated by different data acquisition technologies. Our proposed algorithm produces low-dimensional representations of two datasets and aligns them in a way that preserves the topology of the respective datasets. Such aligned maps facilitate further analysis, such as clustering. The algorithm showed promising results when applied to different combinations of datasets: flow cytometry to flow cytometry, flow cytometry to mass cytometry, and two different single-cell RNA sequencing technologies. Therefore, our newly developed algorithm could potentially lead to discoveries that were previously difficult to obtain.
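To make the idea of two jointly evolving maps more concrete, the following is a minimal sketch in plain Python. It uses the standard self-organizing map update (find the best-matching unit, then pull nearby grid nodes toward the sample). The coupling between the two maps is an assumption made purely for illustration and is not the JSOM update rule from the paper: each sample also nudges the other map around the same grid position, with a weaker rate. The sketch further assumes that both datasets share the same features, which the abstract does not require. All function names and parameters (`train_joint`, `coupling`, and so on) are hypothetical.

```python
import math
import random


def make_map(rows, cols, dim, rng):
    """Build a rows x cols grid; each node holds a random weight vector in [0, 1)."""
    return {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}


def best_matching_unit(som, x):
    """Return the grid coordinate of the node whose weights are closest to x."""
    return min(som, key=lambda n: sum((w - v) ** 2 for w, v in zip(som[n], x)))


def pull_toward(som, center, x, rate, sigma):
    """Standard SOM update: move every node toward x, scaled by a Gaussian
    of its grid distance from `center`."""
    for (r, c), weights in som.items():
        grid_d2 = (r - center[0]) ** 2 + (c - center[1]) ** 2
        h = rate * math.exp(-grid_d2 / (2.0 * sigma ** 2))
        for i, v in enumerate(x):
            weights[i] += h * (v - weights[i])


def train_joint(data_a, data_b, rows=3, cols=3, epochs=30, coupling=0.2, seed=0):
    """Train two maps together. Assumes data_a and data_b share the same features."""
    rng = random.Random(seed)
    dim = len(data_a[0])
    som_a = make_map(rows, cols, dim, rng)
    som_b = make_map(rows, cols, dim, rng)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        rate = 0.01 + 0.5 * decay                       # learning rate shrinks over time
        sigma = 0.5 + 0.5 * max(rows, cols) * decay     # neighborhood shrinks over time
        samples = [(x, som_a, som_b) for x in data_a]
        samples += [(x, som_b, som_a) for x in data_b]
        rng.shuffle(samples)
        for x, own, other in samples:
            bmu = best_matching_unit(own, x)
            pull_toward(own, bmu, x, rate, sigma)
            # ASSUMED coupling (illustrative only, not the published JSOM rule):
            # nudge the other map around the same grid position, more weakly.
            pull_toward(other, bmu, x, coupling * rate, sigma)
    return som_a, som_b


if __name__ == "__main__":
    # Two toy datasets with two clusters each; dataset B is slightly shifted.
    data_a = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.90], [0.88, 0.92]]
    data_b = [[0.20, 0.10], [0.22, 0.12], [0.95, 0.80], [0.90, 0.85]]
    map_a, map_b = train_joint(data_a, data_b)
    for x in data_a:
        print("A", x, "->", best_matching_unit(map_a, x))
    for x in data_b:
        print("B", x, "->", best_matching_unit(map_b, x))
```

In this sketch, alignment would be read off by grid coordinate: samples from the two datasets that land on the same or neighboring nodes are candidates for related clusters. Whether a coupling of this kind aligns real cytometry or sequencing data well is not something the sketch establishes.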
Suggested Citation
Hong Seo Lim & Peng Qiu, 2021.
"JSOM: Jointly-evolving self-organizing maps for alignment of biological datasets and identification of related clusters,"
PLOS Computational Biology, Public Library of Science, vol. 17(3), pages 1-16, March.
Handle: RePEc:plo:pcbi00:1008804
DOI: 10.1371/journal.pcbi.1008804