1
Huang X, Liu R, Yang S, Chen X, Li H. scAnnoX: an R package integrating multiple public tools for single-cell annotation. PeerJ 2024; 12:e17184. [PMID: 38560451 PMCID: PMC10981883 DOI: 10.7717/peerj.17184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 03/11/2024] [Indexed: 04/04/2024] [Open Access]
Abstract
Background Single-cell annotation plays a crucial role in the analysis of single-cell genomics data. Despite the existence of numerous single-cell annotation algorithms, a comprehensive tool for integrating and comparing these algorithms is still lacking. Methods This study investigated a range of widely adopted single-cell annotation algorithms. Ten single-cell annotation algorithms were selected based on the classification of either reference dataset-dependent or marker gene-dependent approaches. These algorithms included SingleR, Seurat, sciBet, scmap, CHETAH, scSorter, sc.type, cellID, scCATCH, and SCINA. Building upon these algorithms, we developed an R package named scAnnoX for the integration and comparative analysis of single-cell annotation algorithms. Results The scAnnoX software package provides a cohesive framework for annotating cells in scRNA-seq data, enabling researchers to perform comparative analyses of cell type annotations in scRNA-seq datasets more efficiently. The integrated environment of scAnnoX streamlines the testing, evaluation, and comparison of the various algorithms. Among the ten annotation tools evaluated, SingleR, Seurat, sciBet, and scSorter emerged as top-performing algorithms in terms of prediction accuracy, with SingleR and sciBet demonstrating particularly strong performance, offering guidance for users. The scAnnoX package is available at https://github.com/XQ-hub/scAnnoX.
Affiliation(s)
- Xiaoqian Huang
  - School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan Province, China
- Ruiqi Liu
  - School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan Province, China
- Shiwei Yang
  - School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan Province, China
- Xiaozhou Chen
  - School of Mathematics and Computer Science, Yunnan Minzu University, Kunming, Yunnan Province, China
- Huamei Li
  - Department of Hepatobiliary Surgery, the Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, Jiangsu Province, China
2
Wan G, Maliga Z, Yan B, Vallius T, Shi Y, Khattab S, Chang C, Nirmal AJ, Yu KH, Liu D, Lian CG, DeSimone MS, Sorger PK, Semenov YR. SpatialCells: Automated Profiling of Tumor Microenvironments with Spatially Resolved Multiplexed Single-Cell Data. bioRxiv 2023:2023.11.10.566378. [PMID: 38014067 PMCID: PMC10680639 DOI: 10.1101/2023.11.10.566378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Background Cancer is a complex cellular ecosystem where malignant cells coexist and interact with immune, stromal, and other cells within the tumor microenvironment. Recent technological advancements in spatially resolved multiplexed imaging at single-cell resolution have led to the generation of large-scale and high-dimensional datasets from biological specimens. This underscores the necessity for automated methodologies that can effectively characterize the molecular, cellular, and spatial properties of tumor microenvironments for various malignancies. Results This study introduces SpatialCells, an open-source software package designed for region-based exploratory analysis and comprehensive characterization of tumor microenvironments using multiplexed single-cell data. Conclusions SpatialCells efficiently streamlines the automated extraction of features from multiplexed single-cell data and can process samples containing millions of cells. Thus, SpatialCells facilitates subsequent association analyses and machine learning predictions, making it an essential tool in advancing our understanding of tumor growth, invasion, and metastasis.
Affiliation(s)
- Guihong Wan
  - Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
- Zoltan Maliga
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
- Boshen Yan
  - Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tuulia Vallius
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
  - Ludwig Center for Cancer Research at Harvard, Harvard Medical School, Boston, MA
- Yingxiao Shi
  - Department of Medicine, Dana-Farber Cancer Institute, Boston, MA, USA
- Sara Khattab
  - Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Crystal Chang
  - Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Ajit J. Nirmal
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
  - Department of Dermatology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Kun-Hsing Yu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- David Liu
  - Department of Medicine, Dana-Farber Cancer Institute, Boston, MA, USA
- Christine G. Lian
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Mia S. DeSimone
  - Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Peter K. Sorger
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
- Yevgeniy R. Semenov
  - Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA
3
Danciu DP, Hooli J, Martin-Villalba A, Marciniak-Czochra A. Mathematics of neural stem cells: Linking data and processes. Cells Dev 2023; 174:203849. [PMID: 37179018 DOI: 10.1016/j.cdev.2023.203849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 04/29/2023] [Accepted: 05/05/2023] [Indexed: 05/15/2023]
Abstract
Adult stem cells are described as a discrete population of cells that stand at the top of a hierarchy of progressively differentiating cells. Through their unique ability to self-renew and differentiate, they regulate the number of end-differentiated cells that contribute to tissue physiology. The question of how discrete, continuous, or reversible the transitions through these hierarchies are and the precise parameters that determine the ultimate performance of stem cells in adulthood are the subject of intense research. In this review, we explain how mathematical modelling has improved the mechanistic understanding of stem cell dynamics in the adult brain. We also discuss how single-cell sequencing has influenced the understanding of cell states or cell types. Finally, we discuss how the combination of single-cell sequencing technologies and mathematical modelling provides a unique opportunity to answer some burning questions in the field of stem cell biology.
Affiliation(s)
- Diana-Patricia Danciu
  - Heidelberg University, Institute of Mathematics (IMA), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
  - Interdisciplinary Center for Scientific Computing (IWR), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
- Jooa Hooli
  - Heidelberg University, Institute of Mathematics (IMA), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
  - Interdisciplinary Center for Scientific Computing (IWR), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
  - Heidelberg University, Faculty of Biosciences, Im Neuenheimer Feld 234, 69120 Heidelberg, Germany
  - German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Ana Martin-Villalba
  - German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Anna Marciniak-Czochra
  - Heidelberg University, Institute of Mathematics (IMA), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
  - Interdisciplinary Center for Scientific Computing (IWR), Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
4
Huang P, Cai M, Lu X, McKennan C, Wang J. Accurate estimation of rare cell type fractions from tissue omics data via hierarchical deconvolution. bioRxiv 2023:2023.03.15.532820. [PMID: 36993280 PMCID: PMC10055056 DOI: 10.1101/2023.03.15.532820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Bulk transcriptomics in tissue samples reflects the average expression levels across different cell types and is highly influenced by cellular fractions. As such, it is critical to estimate cellular fractions to both deconfound differential expression analyses and infer cell type-specific differential expression. Since experimentally counting cells is infeasible in most tissues and studies, in silico cellular deconvolution methods have been developed as an alternative. However, existing methods are designed for tissues consisting of clearly distinguishable cell types and have difficulties estimating highly correlated or rare cell types. To address this challenge, we propose Hierarchical Deconvolution (HiDecon) that uses single-cell RNA sequencing references and a hierarchical cell type tree, which models the similarities among cell types and cell differentiation relationships, to estimate cellular fractions in bulk data. By coordinating cell fractions across layers of the hierarchical tree, cellular fraction information is passed up and down the tree, which helps correct estimation biases by pooling information across related cell types. The flexible hierarchical tree structure also enables estimating rare cell fractions by splitting the tree to higher resolutions. Through simulations and real data applications with the ground truth of measured cellular fractions, we demonstrate that HiDecon significantly outperforms existing methods and accurately estimates cellular fractions.
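As a rough illustration of the reference-based deconvolution idea described in this abstract, the sketch below estimates non-negative cell type fractions from a bulk profile and then sums them up a two-layer cell type tree. It is not the HiDecon algorithm: the signature matrix, the cell type names, the tree, and the helper functions `estimate_fractions` and `aggregate` are all hypothetical, and plain projected gradient descent stands in for whatever estimator the authors use.

```python
import numpy as np

# Hypothetical single-cell reference: rows are genes, columns are cell types.
cell_types = ["T cell", "B cell", "monocyte"]
signature = np.array([
    [10.0, 1.0, 1.0],
    [8.0, 2.0, 1.0],
    [1.0, 9.0, 2.0],
    [2.0, 10.0, 1.0],
    [1.0, 2.0, 9.0],
    [1.0, 1.0, 8.0],
])

# Hypothetical two-layer tree: coarse lineage -> fine cell types.
tree = {"lymphoid": ["T cell", "B cell"], "myeloid": ["monocyte"]}


def estimate_fractions(signature, bulk, n_iter=2000):
    """Minimise 0.5 * ||signature @ p - bulk||^2 subject to p >= 0."""
    n_types = signature.shape[1]
    p = np.full(n_types, 1.0 / n_types)
    # Step size 1/L, where L is the squared spectral norm of the signature.
    step = 1.0 / np.linalg.norm(signature, 2) ** 2
    for _ in range(n_iter):
        grad = signature.T @ (signature @ p - bulk)
        p = np.maximum(p - step * grad, 0.0)  # project onto p >= 0
    return p / p.sum()  # rescale so the fractions sum to one


def aggregate(fractions, cell_types, tree):
    """Sum fine-level fractions into their parent nodes."""
    by_type = dict(zip(cell_types, fractions))
    return {parent: sum(by_type[c] for c in children)
            for parent, children in tree.items()}


# Noise-free simulated bulk sample with known fractions.
true_fractions = np.array([0.6, 0.3, 0.1])
bulk = signature @ true_fractions

estimated = estimate_fractions(signature, bulk)
coarse = aggregate(estimated, cell_types, tree)
print(dict(zip(cell_types, estimated.round(3))))
print(coarse)
```

The aggregation step only shows what consistency between tree layers means (a parent fraction equals the sum of its children); the abstract describes passing information both up and down the tree, which this sketch does not attempt.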
Affiliation(s)
- Penghui Huang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Manqi Cai
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Xinghua Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Chris McKennan
  - Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA
- Jiebiao Wang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
5
Dobrzyński M, Jacques MA, Pertz O. Mining of Single-Cell Signaling Time-Series for Dynamic Phenotypes with Clustering. Methods Mol Biol 2022; 2488:183-206. [PMID: 35347690 DOI: 10.1007/978-1-0716-2277-3_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Fluorescent live cell time-lapse microscopy is steadily improving our understanding of the relationship between cell signaling and fate. However, the large volumes of time-series data generated in these experiments and the heterogeneous nature of signaling responses due to cell-cell variability hinder the exploration of such datasets. Population averages describe the dynamics insufficiently, yet finding prototypic dynamic patterns that relate to different cell fates is difficult when mining thousands of time series. Here we demonstrate a protocol in which we identify such dynamic phenotypes in a population of PC-12 cells that respond to a range of sustained growth factor perturbations. We use Time-Course Inspector, a free R/Shiny web application, to explore and cluster single-cell time series.
Affiliation(s)
- Olivier Pertz
  - Institute of Cell Biology, University of Bern, Bern, Switzerland
6
Glauche I, Marr C. Mechanistic models of blood cell fate decisions in the era of single-cell data. Curr Opin Syst Biol 2021; 28. [PMID: 34950807 PMCID: PMC8660645 DOI: 10.1016/j.coisb.2021.100355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Billions of functionally distinct blood cells emerge from a pool of hematopoietic stem cells in our bodies every day. This progressive differentiation process is hierarchically structured and remarkably robust. We provide an introductory review of mathematical approaches addressing the functional aspects of how lineage choice is potentially implemented at the molecular level. Starting from studies on the mutual repression of key transcription factors, we illustrate how those simple concepts have been challenged in recent years and subsequently extended. In particular, the analysis of omics data at the single-cell level with computational tools provides descriptive insights at an unprecedented level, while their embedding into a consistent mechanistic and mathematical framework is still incomplete.
Affiliation(s)
- Ingmar Glauche
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Carsten Marr
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
7
Zinovyev A. Adaptation through the lens of single-cell multi-omics data: Comment on "Dynamic and thermodynamic models of adaptation" by A.N. Gorban et al. Phys Life Rev 2021; 38:132-134. [PMID: 34088607 DOI: 10.1016/j.plrev.2021.05.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/20/2021] [Accepted: 05/19/2021] [Indexed: 10/21/2022]
Affiliation(s)
- Andrei Zinovyev
  - Institut Curie, PSL Research University, F-75005 Paris, France
  - INSERM, U900, F-75005 Paris, France
  - CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - Laboratory of advanced methods for high-dimensional data analysis, Lobachevsky University, 603000 Nizhny Novgorod, Russia
8
Li Y, Luo P, Lu Y, Wu FX. Identifying cell types from single-cell data based on similarities and dissimilarities between cells. BMC Bioinformatics 2021; 22:255. [PMID: 34006217 PMCID: PMC8132444 DOI: 10.1186/s12859-020-03873-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 12/15/2022] [Open Access]
Abstract
Background With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods, such as spectral clustering, can do well in the identification of cell types, they only consider the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms in identifying clusters, so special methods need to be developed for high-dimensional sparse categorical data. Results Inspired by the observation that cells of the same type have similar gene expression patterns, whereas cells of different types have dissimilar gene expression patterns, we improve the existing spectral clustering method for clustering single-cell data so that it is based on both similarities and dissimilarities between cells. The method first measures the similarity and dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and finally uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.
Conclusions In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
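The three steps named in this abstract (measure similarity and dissimilarity, fuse them into one matrix, embed with an eigendecomposition and run K-means) can be sketched as follows. The abstract does not give the fusion rule, so everything specific here is an assumption made for illustration: a Gaussian-kernel similarity, a scaled-distance dissimilarity, the fusion `similarity - alpha * dissimilarity`, a signed graph Laplacian for the embedding, and the function names themselves.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows (cells)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def fused_affinity(x, alpha=0.5):
    """Assumed fusion rule: similarity minus a weighted dissimilarity."""
    dist = pairwise_distances(x)
    sigma = np.median(dist[dist > 0])            # kernel width heuristic
    similarity = np.exp(-dist ** 2 / (2 * sigma ** 2))
    dissimilarity = dist / dist.max()            # scaled to [0, 1]
    fused = similarity - alpha * dissimilarity   # entries may be negative
    np.fill_diagonal(fused, 0.0)
    return fused


def spectral_embedding(affinity, k):
    """Eigenvectors of the signed Laplacian with the k smallest eigenvalues."""
    degree = np.abs(affinity).sum(axis=1)
    laplacian = np.diag(degree) - affinity
    _, vectors = np.linalg.eigh(laplacian)       # eigenvalues in ascending order
    return vectors[:, :k]


def kmeans(points, k, n_iter=50):
    """Plain K-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Toy expression matrix: two well-separated groups of 10 cells, 5 genes each.
rng = np.random.default_rng(0)
cells = np.vstack([
    rng.normal(0.0, 0.3, size=(10, 5)),
    rng.normal(5.0, 0.3, size=(10, 5)),
])

labels = kmeans(spectral_embedding(fused_affinity(cells), k=2), k=2)
print(labels)
```

Real scRNA-seq data are sparse and high-dimensional, so the similarity and dissimilarity measures matter far more than in this toy example; the sketch only shows where a dissimilarity term enters an otherwise standard spectral clustering pipeline.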
Affiliation(s)
- Yuanyuan Li
  - School of Mathematics and Physics, Wuhan Institute of Technology, No.206, Guanggu 1st road, Wuhan, 430205, Hubei, China
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Ping Luo
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Yi Lu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
  - Department of Computer Science, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, S7N 5A9, Canada
9
Loos C, Hasenauer J. Robust calibration of hierarchical population models for heterogeneous cell populations. J Theor Biol 2020; 488:110118. [PMID: 31866394 DOI: 10.1016/j.jtbi.2019.110118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/29/2019] [Revised: 12/09/2019] [Accepted: 12/13/2019] [Indexed: 11/20/2022]
Abstract
Cellular heterogeneity is known to have important effects on signal processing and cellular decision making. To understand these processes, multiple classes of mathematical models have been introduced. The hierarchical population model constitutes a novel class that allows for the mechanistic description of heterogeneity and explicitly takes into account subpopulation structures. However, this model requires a parametric distribution assumption for the cell population and, so far, only the normal distribution has been employed. Here, we incorporate alternative distribution assumptions into the model, assess their robustness against outliers, and evaluate their influence on the performance of model calibration in a simulation study and a real-world application example. We found that alternative distributions provide reliable parameter estimates even in the presence of outliers, and can in fact improve the convergence of model calibration.
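Why the distribution assumption matters for robustness can be seen in a toy calibration problem. The sketch below fits a single location parameter to measurements containing two outliers, once under a normal likelihood and once under a Laplace likelihood. It is not the hierarchical population model: the data are invented, the Laplace distribution is chosen here only for illustration (the abstract does not name the alternatives tested), and the grid search is a stand-in for a real optimiser.

```python
import numpy as np

# Invented single-cell measurements: eight values near 10 plus two outliers.
measurements = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 50.0, 60.0])

# Candidate values for the location parameter (grid search keeps this simple).
candidates = np.linspace(0.0, 70.0, 7001)


def normal_nll(mu, data):
    """Negative log-likelihood under a unit-scale normal, up to a constant."""
    return 0.5 * np.sum((data - mu) ** 2)


def laplace_nll(mu, data):
    """Negative log-likelihood under a unit-scale Laplace, up to a constant."""
    return np.sum(np.abs(data - mu))


def calibrate(nll, data, grid):
    """Return the grid value that minimises the negative log-likelihood."""
    losses = np.array([nll(mu, data) for mu in grid])
    return grid[np.argmin(losses)]


mu_normal = calibrate(normal_nll, measurements, candidates)
mu_laplace = calibrate(laplace_nll, measurements, candidates)
print(f"normal estimate:  {mu_normal:.2f}")
print(f"laplace estimate: {mu_laplace:.2f}")
```

The normal likelihood penalises squared deviations, so the two outliers pull its estimate toward the sample mean, far from the bulk of the data; the heavier-tailed Laplace likelihood penalises absolute deviations and stays close to the median.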