Reference Citation Analysis

For: Matsumoto H, Kiryu H. SCOUP: a probabilistic model based on the Ornstein-Uhlenbeck process to analyze single-cell expression data during differentiation. BMC Bioinformatics 2016;17:232. [PMID: 27277014 PMCID: PMC4898467 DOI: 10.1186/s12859-016-1109-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 03/19/2016] [Accepted: 06/02/2016] [Indexed: 01/10/2023] [Open Access]

Number	Cited by Other Article(s)
1	CVGAE: A Self-Supervised Generative Method for Gene Regulatory Network Inference Using Single-Cell RNA Sequencing Data. Interdiscip Sci 2024:10.1007/s12539-024-00633-y. [PMID: 38778003 DOI: 10.1007/s12539-024-00633-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 04/07/2024] [Accepted: 04/09/2024] [Indexed: 05/25/2024]
	Abstract: Gene regulatory network (GRN) inference based on single-cell RNA sequencing data (scRNAseq) plays a crucial role in understanding the regulatory mechanisms between genes. Various computational methods have been employed for GRN inference, but their performance in terms of network accuracy and model generalization is not satisfactory, and their poor performance is caused by high-dimensional data and network sparsity. In this paper, we propose a self-supervised method for gene regulatory network inference using single-cell RNA sequencing data (CVGAE). CVGAE uses a graph neural network for inductive representation learning, which merges gene expression data and observed topology into a low-dimensional vector space. The well-trained vectors are used to calculate the mathematical distance between genes and, further, to predict interactions between genes. In the overall framework, FastICA is implemented to relieve the computational complexity caused by high-dimensional data, and CVGAE adopts multi-stacked GraphSAGE layers as an encoder and an improved decoder to overcome network sparsity. CVGAE is evaluated on several single-cell datasets containing four related ground-truth networks, and the results show that CVGAE achieves better performance than comparative methods. To validate learning and generalization capabilities, CVGAE is applied in a few-shot environment by changing the ratio of the training set to the test set. Under few-shot conditions, CVGAE obtains comparable or superior performance.
	Key Words: Gene regulatory network inference; Graph neural networks; Representation learning; Single-cell RNA sequencing
	Grants: 22A0101, the Scientific Research Fund of Hunan Provincial Education Department
2	Decoding the principle of cell-fate determination for its reverse control. NPJ Syst Biol Appl 2024;10:47. [PMID: 38710700 DOI: 10.1038/s41540-024-00372-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Accepted: 04/16/2024] [Indexed: 05/08/2024] [Open Access]
	Abstract: Understanding and manipulating cell fate determination is pivotal in biology. Cell fate is determined by intricate and nonlinear interactions among molecules, making mathematical model-based quantitative analysis indispensable for its elucidation. Nevertheless, obtaining the essential dynamic experimental data for model development has been a significant obstacle. However, recent advancements in large-scale omics data technology are providing the necessary foundation for developing such models. Based on accumulated experimental evidence, we can postulate that cell fate is governed by a limited number of core regulatory circuits. Following this concept, we present a conceptual control framework that leverages single-cell RNA-seq data for dynamic molecular regulatory network modeling, aiming to identify and manipulate core regulatory circuits and their master regulators to drive desired cellular state transitions. We illustrate the proposed framework by applying it to the reversion of lung cancer cell states, although it is more broadly applicable to understanding and controlling a wide range of cell-fate determination processes.
	MeSH Headings: Humans; Gene Regulatory Networks/genetics; Single-Cell Analysis/methods; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Cell Differentiation/genetics; Models, Biological; Computational Biology/methods
	Grants: 2023R1A2C3002619 and 2021M3A9I4024447, National Research Foundation of Korea (NRF)
3	New perspectives on biology, disease progression, and therapy response of head and neck cancer gained from single cell RNA sequencing and spatial transcriptomics. Oncol Res 2023;32:1-17. [PMID: 38188682 PMCID: PMC10767240 DOI: 10.32604/or.2023.044774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 10/12/2023] [Indexed: 01/09/2024] [Open Access]
	Abstract: Head and neck squamous cell carcinoma (HNSCC) is one of the most frequent cancers worldwide. The main risk factors are consumption of tobacco products and alcohol, as well as infection with human papilloma virus. Approved therapeutic options comprise surgery, radiation, chemotherapy, targeted therapy through epidermal growth factor receptor inhibition, and immunotherapy, but outcome has remained unsatisfactory due to recurrence rates of ~50% and the frequent occurrence of second primaries. The availability of the human genome sequence at the beginning of the millennium heralded the omics era, in which rapid technological progress has advanced our knowledge of the molecular biology of malignant diseases, including HNSCC, at an unprecedented pace. Initially, microarray-based methods, followed by approaches based on next-generation sequencing, were applied to study the genetics, epigenetics, and gene expression patterns of bulk tumors. More recently, the advent of single-cell RNA sequencing (scRNAseq) and spatial transcriptomics methods has facilitated the investigation of the heterogeneity between and within different cell populations in the tumor microenvironment (e.g., cancer cells, fibroblasts, immune cells, endothelial cells), led to the discovery of novel cell types, and advanced the discovery of cell-cell communication within tumors. This review provides an overview of scRNAseq, spatial transcriptomics, and the associated bioinformatics methods, and summarizes how their application has promoted our understanding of the emergence, composition, progression, and therapy responsiveness of, and intercellular signaling within, HNSCC.
	Key Words: Gene expression; Head and neck squamous cell carcinoma; Immunotherapy; Omics; Tumor microenvironment
	MeSH Headings: Humans; Squamous Cell Carcinoma of Head and Neck/genetics; Squamous Cell Carcinoma of Head and Neck/therapy; Endothelial Cells; Head and Neck Neoplasms/genetics; Head and Neck Neoplasms/therapy; Gene Expression Profiling; Computational Biology; Disease Progression; Sequence Analysis, RNA; Tumor Microenvironment/genetics
4	GCSTI: A Single-Cell Pseudotemporal Trajectory Inference Method Based on Graph Compression. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:2945-2958. [PMID: 37037234 DOI: 10.1109/tcbb.2023.3266109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
	Abstract: Single-cell pseudotemporal trajectory inference is an important way to explore the process of developmental changes within a cell. Due to the uneven rate of cell growth, changes in gene expression depend less on the time of data collection and more on a cell's "internal clock". To overcome the challenges of gene analysis and replicate biological developmental processes, several strategies have been put forth. However, due to the size of single-cell datasets, locating relevant signposts usually necessitates clustering analysis or a sizable amount of a priori information. To this end, we propose a novel single-cell pseudotemporal trajectory inference technique, the GCSTI method, which is based on graph compression, does not rely on a priori knowledge or clustering procedures, and can handle the trajectory inference problem for a large network in a stable and efficient manner. Additionally, we simultaneously improve the pseudotime defining method currently employed in this study in order to obtain more trustworthy and beneficial outcomes for trajectory inference. Finally, we validate the efficacy and stability of the GCSTI method using datasets from human skeletal muscle myogenic cells and four simulated datasets.
	MeSH Headings: Humans; Gene Expression Profiling/methods; Single-Cell Analysis/methods
5	scTIGER: A Deep-Learning Method for Inferring Gene Regulatory Networks from Case versus Control scRNA-seq Datasets. Int J Mol Sci 2023;24:13339. [PMID: 37686146 PMCID: PMC10488287 DOI: 10.3390/ijms241713339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 08/06/2023] [Accepted: 08/23/2023] [Indexed: 09/10/2023] [Open Access]
	Abstract: Inferring gene regulatory networks (GRNs) from single-cell RNA-seq (scRNA-seq) data is an important computational question to find regulatory mechanisms involved in fundamental cellular processes. Although many computational methods have been designed to predict GRNs from scRNA-seq data, they usually have high false positive rates and none infer GRNs by directly using the paired datasets of case-versus-control experiments. Here we present a novel deep-learning-based method, named scTIGER, for GRN detection by using the co-differential relationships of gene expression profiles in paired scRNA-seq datasets. scTIGER employs cell-type-based pseudotiming, an attention-based convolutional neural network method and permutation-based significance testing for inferring GRNs among gene modules. As state-of-the-art applications, we first applied scTIGER to scRNA-seq datasets of prostate cancer cells, and successfully identified the dynamic regulatory networks of AR, ERG, PTEN and ATF3 for same-cell type between prostatic cancerous and normal conditions, and two-cell types within the prostatic cancerous environment. We then applied scTIGER to scRNA-seq data from neurons with and without fear memory and detected specific regulatory networks for BDNF, CREB1 and MAPK4. Additionally, scTIGER demonstrates robustness against high levels of dropout noise in scRNA-seq data.
	Key Words: deep learning; gene co-differential expression network; gene regulatory network; memory formation; prostate cancer; scRNA-seq
	MeSH Headings: Male; Humans; Deep Learning; Gene Regulatory Networks; Single-Cell Gene Expression Analysis; Fear; Mitogen-Activated Protein Kinases; Prostatic Neoplasms
	Grants: DBI-2239350, National Science Foundation; C2204, W. W. Smith Charitable Trust; 61572358, National Natural Science Foundation of China; 19JCZDJC35100, Natural Science Foundation of Tianjin City
6	From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023;19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/12/2023] [Open Access]
	Abstract: Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
	MeSH Headings: Gene Regulatory Networks/genetics; Transcriptome/genetics; Gene Expression Profiling; Algorithms; Computational Biology/methods
	Grants: Chair of Bioinformatics in Oncology of CRCT
7	Context-dependent gene regulatory network reveals regulation dynamics and cell trajectories using unspliced transcripts. Brief Bioinform 2023;24:6991202. [PMID: 36653899 DOI: 10.1093/bib/bbac633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 12/06/2022] [Accepted: 12/29/2022] [Indexed: 01/20/2023] [Open Access]
	Abstract: Gene regulatory networks govern complex gene expression programs in various biological phenomena, including embryonic development, cell fate decisions and oncogenesis. Single-cell techniques are increasingly being used to study gene expression, providing higher resolution than traditional approaches. However, inferring a comprehensive gene regulatory network across different cell types remains a challenge. Here, we propose to construct context-dependent gene regulatory networks (CDGRNs) from single-cell RNA sequencing data utilizing both spliced and unspliced transcript expression levels. A gene regulatory network is decomposed into subnetworks corresponding to different transcriptomic contexts. Each subnetwork comprises the consensus active regulation pairs of transcription factors and their target genes shared by a group of cells, inferred by a Gaussian mixture model. We find that the union of gene regulation pairs in all contexts is sufficient to reconstruct differentiation trajectories. Functions specific to the cell cycle, cell differentiation or tissue-specific functions are enriched throughout the developmental process in each context. Surprisingly, we also observe that the network entropy of CDGRNs decreases along differentiation trajectories, indicating directionality in differentiation. Overall, CDGRN allows us to establish the connection between gene regulation at the molecular level and cell differentiation at the macroscopic level.
	Key Words: Gaussian mixture model; gene regulatory network; single-cell RNA sequencing; trajectory analysis
	MeSH Headings: Gene Regulatory Networks; Cell Differentiation/genetics; Embryonic Development; Transcription Factors/genetics; Transcription Factors/metabolism; Gene Expression Profiling
	Grants: MOST 109-2221-E-002-161-MY3, Ministry of Science and Technology
8	Single-cell and single-nuclei RNA sequencing as powerful tools to decipher cellular heterogeneity and dysregulation in neurodegenerative diseases. Front Cell Dev Biol 2022;10:884748. [PMID: 36353512 PMCID: PMC9637968 DOI: 10.3389/fcell.2022.884748] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/27/2022] [Accepted: 10/06/2022] [Indexed: 08/10/2023] [Open Access]
	Abstract: Neurodegenerative diseases affect millions of people worldwide and there are currently no cures. Two types of common neurodegenerative diseases are Alzheimer's (AD) and Parkinson's disease (PD). Single-cell and single-nuclei RNA sequencing (scRNA-seq and snRNA-seq) have become powerful tools to elucidate the inherent complexity and dynamics of the central nervous system at cellular resolution. This technology has allowed the identification of cell types and states, providing new insights into cellular susceptibilities and molecular mechanisms underlying neurodegenerative conditions. Exciting research using high throughput scRNA-seq and snRNA-seq technologies to study AD and PD is emerging. Herein we review the recent progress in understanding these neurodegenerative diseases using these state-of-the-art technologies. We discuss the fundamental principles and implications of single-cell sequencing of the human brain. Moreover, we review some examples of the computational and analytical tools required to interpret the extensive amount of data generated from these assays. We conclude by highlighting challenges and limitations in the application of these technologies in the study of AD and PD.
	Key Words: Alzheimer’s disease; Parkinson’s disease; cellular heterogeneity; cellular vulnerability; single-cell sequencing; single-nuclei sequencing
	Grants: R01 NS088353, NINDS NIH HHS; R21 NS113068, NINDS NIH HHS
9	A probabilistic Boolean model on hair follicle cell fate regulation by TGF-β. Biophys J 2022;121:2638-2652. [PMID: 35714600 DOI: 10.1016/j.bpj.2022.05.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/03/2021] [Revised: 05/20/2022] [Accepted: 05/23/2022] [Indexed: 11/24/2022] [Open Access]
	Abstract: Hair follicles (HFs) are mini skin organs that undergo cyclic growth. Various signals regulate HF cell fate decisions jointly. Recent experimental results suggest that transforming growth factor beta (TGF-β) exhibits a dual role in HF cell fate regulation that can be either anti- or pro-apoptosis. To understand the underlying mechanisms of HF cell fate control, we develop a novel probabilistic Boolean network (pBN) model on the HF epithelial cell gene regulation dynamics. First, the model is derived from literature, then refined using single-cell RNA sequencing data. Using the model, we both explore the mechanisms underlying HF cell fate decisions and make predictions that could potentially guide future experiments: 1) we propose that a threshold-like switch in the TGF-β strength may necessitate the dual roles of TGF-β in either activating apoptosis or cell proliferation, in cooperation with Bmp and tumor necrosis factor (TNF) and at different stages of a follicle growth cycle; 2) our model shows concordance with the high-activator-low-inhibitor theory of anagen initiation; 3) we predict that TNF may be more effective in catagen initiation than TGF-β, and they may cooperate in a two-step fashion; 4) finally, predictions of gene knockout and overexpression reveal the roles in HF cell fate regulations of each gene. Attractor and motif analysis from the associated Boolean networks reveal the relations between the topological structure of the gene regulation network and the cell fate regulation mechanism. A discrete spatial model equipped with the pBN illustrates how TGF-β and TNF cooperate in initiating and driving the apoptosis wave during catagen.
10	Spatially and Temporally Distributed Complexity-A Refreshed Framework for the Study of GRN Evolution. Cells 2022;11:cells11111790. [PMID: 35681485 PMCID: PMC9179533 DOI: 10.3390/cells11111790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 05/24/2022] [Accepted: 05/28/2022] [Indexed: 11/16/2022] [Open Access]
	Abstract: Irrespective of the heuristic value of interpretations of developmental processes in terms of gene regulatory networks (GRNs), larger-angle views often suffer from: (i) an inadequate understanding of the relationship between genotype and phenotype; (ii) a predominantly zoocentric vision; and (iii) overconfidence in a putatively hierarchical organization of animal body plans. Here, we constructively criticize these assumptions. First, developmental biology is pervaded by adultocentrism, but development is not necessarily egg to adult. Second, during development, many unicells undergo transcriptomic profile transitions that are comparable to those recorded in pluricellular organisms; thus, their study should not be neglected from the GRN perspective. Third, the putatively hierarchical nature of the animal body is mirrored in the GRN logic, but in relating genotype to phenotype, independent assessments of the dynamics of the regulatory machinery and the animal’s architecture are required, better served by a combinatorial than by a hierarchical approach. The trade-offs between spatial and temporal aspects of regulation, as well as their evolutionary consequences, are also discussed. Multicellularity may derive from a unicell’s sequential phenotypes turned into different but coexisting, spatially arranged cell types. In turn, polyphenism may have been a crucial mechanism involved in the origin of complex life cycles.
	Key Words: adultocentrism; development; hierarchy; multicellular organisms; phenotypic plasticity; polymorphism; polyphenism; unicells
	MeSH Headings: Animals; Gene Regulatory Networks; Genotype; Phenotype
11	Inference of Molecular Regulatory Systems Using Statistical Path-Consistency Algorithm. Entropy 2022;24:e24050693. [PMID: 35626576 PMCID: PMC9142129 DOI: 10.3390/e24050693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 05/12/2022] [Accepted: 05/12/2022] [Indexed: 11/16/2022]
	Abstract: One of the key challenges in systems biology and molecular sciences is how to infer regulatory relationships between genes and proteins using high-throughput omics datasets. Although a wide range of methods have been designed to reverse engineer the regulatory networks, recent studies show that the inferred network may depend on the variable order in the dataset. In this work, we develop a new algorithm, called the statistical path-consistency algorithm (SPCA), to solve the problem of the dependence of variable order. This method generates a number of different variable orders using random samples, and then infers a network by using the path-consistent algorithm based on each variable order. We propose measures to determine the edge weights using the corresponding edge weights in the inferred networks, and choose the edges with the largest weights as the putative regulations between genes or proteins. The developed method is rigorously assessed by the six benchmark networks in DREAM challenges, the mitogen-activated protein (MAP) kinase pathway, and a cancer-specific gene regulatory network. The inferred networks are compared with those obtained by using two up-to-date inference methods. The accuracy of the inferred networks shows that the developed method is effective for discovering molecular regulatory systems.
12	Inferring transcription factor regulatory networks from single-cell ATAC-seq data based on graph neural networks. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00469-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023]
13	Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022;17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] [Open Access]
	Abstract: Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. The review and the accompanied software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
	Key Words: Alzheimer’s disease; Brain cell types; Clustering analysis; Gene networks; Single cell ATAC-sequencing; Single cell RNA-sequencing; Single cell sequencing; Spatial transcriptomics; Trajectory analysis
	MeSH Headings: Alzheimer Disease/genetics; Alzheimer Disease/metabolism; Animals; Computational Biology; DNA Copy Number Variations; Data Analysis; Mice; Single-Cell Analysis/methods
	Grants: R01 DE029322, NIDCR NIH HHS; RF1 AG057440, NIA NIH HHS; R01 AG063819, NIA NIH HHS; U01 AG052411, NIA NIH HHS; RF1 AG054014, NIA NIH HHS; R01 AG057907, NIA NIH HHS; RF1 AG074010, NIA NIH HHS; P30 AG072947, NIA NIH HHS; R01 AG062355, NIA NIH HHS; R01 DA051191, NIDA NIH HHS; S10 OD026880, NIH HHS; S10 OD030463, NIH HHS; U01 AG046170, NIA NIH HHS; IK2 BX003804, BLRD VA; R21 AI149013, NIAID NIH HHS; R01 AG068293, NIA NIH HHS; R01 AG068030, NIA NIH HHS; National Institute on Aging
14	Network inference with Granger causality ensembles on single-cell transcriptomics. Cell Rep 2022;38:110333. [PMID: 35139376 DOI: 10.1016/j.celrep.2022.110333] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 02/05/2019] [Revised: 02/19/2021] [Accepted: 01/12/2022] [Indexed: 12/20/2022] [Open Access]
	Abstract: Cellular gene expression changes throughout a dynamic biological process, such as differentiation. Pseudotimes estimate cells' progress along a dynamic process based on their individual gene expression states. Ordering the expression data by pseudotime provides information about the underlying regulator-gene interactions. Because the pseudotime distribution is not uniform, many standard mathematical methods are inapplicable for analyzing the ordered gene expression states. Here we present single-cell inference of networks using Granger ensembles (SINGE), an algorithm for gene regulatory network inference from ordered single-cell gene expression data. SINGE uses kernel-based Granger causality regression to smooth irregular pseudotimes and missing expression values. It aggregates predictions from an ensemble of regression analyses to compile a ranked list of candidate interactions between transcriptional regulators and target genes. In two mouse embryonic stem cell differentiation datasets, SINGE outperforms other contemporary algorithms. However, a more detailed examination reveals caveats about poor performance for individual regulators and uninformative pseudotimes.
	Key Words: mouse embryonic stem cells; network evaluation; pseudotime; time series analysis; transcriptional regulation
	MeSH Headings: Algorithms; Animals; Cell Differentiation/physiology; Computational Biology/methods; Gene Expression Profiling/methods; Gene Regulatory Networks/physiology; Mice; Software; Transcriptome/physiology
	Grants: P50 DE026787, NIDCR NIH HHS; U01 HL099773, NHLBI NIH HHS; UH3 TR000506, NCATS NIH HHS
15	Dictionary learning allows model-free pseudotime estimation of transcriptomic data. BMC Genomics 2022;23:56. [PMID: 35033004 PMCID: PMC8760643 DOI: 10.1186/s12864-021-08276-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2020] [Accepted: 12/22/2021] [Indexed: 11/10/2022] Open Abstract Background Pseudotime estimation from dynamic single-cell transcriptomic data enables characterisation and understanding of the underlying processes, for example developmental processes. Various pseudotime estimation methods have been proposed during the last years. Typically, these methods start with a dimension reduction step because the low-dimensional representation is usually easier to analyse. Approaches such as PCA, ICA or t-SNE belong to the most widely used methods for dimension reduction in pseudotime estimation methods. However, these methods usually make assumptions on the derived dimensions, which can result in important dataset properties being missed. In this paper, we suggest a new dictionary learning based approach, dynDLT, for dimension reduction and pseudotime estimation of dynamic transcriptomic data. Dictionary learning is a matrix factorisation approach that does not restrict the dependence of the derived dimensions. To evaluate the performance, we conduct a large simulation study and analyse 8 real-world datasets. Results The simulation studies reveal that firstly, dynDLT preserves the simulated patterns in low-dimension and the pseudotimes can be derived from the low-dimensional representation. Secondly, the results show that dynDLT is suitable for the detection of genes exhibiting the simulated dynamic patterns, thereby facilitating the interpretation of the compressed representation and thus the dynamic processes. 
For the real-world data analysis, we select datasets with samples that are taken at different time points throughout an experiment. The pseudotimes found by dynDLT have high correlations with the experimental times. We compare the results to other approaches used in pseudotime estimation, or those that are method-wise closely connected to dictionary learning: ICA, NMF, PCA, t-SNE, and UMAP. DynDLT has the best overall performance for the simulated and real-world datasets. Conclusions We introduce dynDLT, a method that is suitable for pseudotime estimation. Its main advantages are: (1) It presents a model-free approach, meaning that it does not restrict the dependence of the derived dimensions; (2) Genes that are relevant in the detected dynamic processes can be identified from the dictionary matrix; (3) By a restriction of the dictionary entries to positive values, the dictionary atoms are highly interpretable. Supplementary Information The online version contains supplementary material available at (10.1186/s12864-021-08276-9). Key Words: Biomarker; Branching; Dictionary learning; Dimension reduction; Dynamic; Pseudotime estimation; RNA-seq; Single-cell; Time course; Trajectory inference.
16	A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021;22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022] Open Abstract Gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which pushed the limit of transcriptomic profiling to the individual cell level, opens up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data in single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. 
The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in the future development. Key Words: RNA sequencing; gene regulatory network; scRNA-seq; simulation studies; single-cell data. MeSH Headings: Algorithms; Computational Biology/methods; Gene Expression Profiling/methods; Gene Regulatory Networks; Humans; Models, Genetic; Reproducibility of Results; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Software. Grants: 80NSSC19M0170 Intramural NASA; P20 GM103440 NIGMS NIH HHS; National Aeronautics and Space Administration.
17	DTFLOW: Inference and Visualization of Single-cell Pseudotime Trajectory Using Diffusion Propagation. GENOMICS, PROTEOMICS & BIOINFORMATICS 2021;19:306-318. [PMID: 33662626 PMCID: PMC8602766 DOI: 10.1016/j.gpb.2020.08.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/13/2019] [Revised: 05/26/2020] [Accepted: 10/29/2020] [Indexed: 12/13/2022] Abstract One of the major challenges in single-cell data analysis is the determination of cellular developmental trajectories using single-cell data. Although substantial studies have been conducted in recent years, more effective methods are still strongly needed to infer the developmental processes accurately. This work devises a new method, named DTFLOW, for determining the pseudo-temporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined by using four single-cell datasets. We compare the efficiency of DTFLOW with the published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness properties for constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW. 
Key Words: Bhattacharyya kernel; Manifold learning; Pseudotime trajectory; Single-cell heterogeneity. MeSH Headings: Algorithms; Cluster Analysis; Computer Simulation; Single-Cell Analysis/methods; Software.
18	CytoTree: an R/Bioconductor package for analysis and visualization of flow and mass cytometry data. BMC Bioinformatics 2021;22:138. [PMID: 33752602 PMCID: PMC7983272 DOI: 10.1186/s12859-021-04054-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2020] [Accepted: 02/26/2021] [Indexed: 01/20/2023] Open Abstract Background The rapidly increasing dimensionality and throughput of flow and mass cytometry data necessitate new bioinformatics tools for analysis and interpretation, and the recently emerging single-cell-based algorithms provide a powerful strategy to meet this challenge. Results Here, we present CytoTree, an R/Bioconductor package designed to analyze and interpret multidimensional flow and mass cytometry data. CytoTree provides multiple computational functionalities that integrate most of the commonly used techniques in unsupervised clustering and dimensionality reduction and, more importantly, support the construction of a tree-shaped trajectory based on the minimum spanning tree algorithm. A graph-based algorithm is also implemented to estimate the pseudotime and infer intermediate-state cells. We apply CytoTree to several examples of mass cytometry and time-course flow cytometry data on heterogeneity-based cytology and differentiation/reprogramming experiments to illustrate the practical utility achieved in a fast and convenient manner. Conclusions CytoTree represents a versatile tool for analyzing multidimensional flow and mass cytometry data and to producing heuristic results for trajectory construction and pseudotime estimation in an integrated workflow. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04054-2. 
Key Words: Flow cytometry; Mass cytometry; Pseudotime; Single-cell; Tree.
19	Latent representation learning in biology and translational medicine. PATTERNS (NEW YORK, N.Y.) 2021;2:100198. [PMID: 33748792 PMCID: PMC7961186 DOI: 10.1016/j.patter.2021.100198] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/08/2023] Abstract Current data generation capabilities in the life sciences render scientists in an apparently contradicting situation. While it is possible to simultaneously measure an ever-increasing number of systems parameters, the resulting data are becoming increasingly difficult to interpret. Latent variable modeling allows for such interpretation by learning non-measurable hidden variables from observations. This review gives an overview over the different formal approaches to latent variable modeling, as well as applications at different scales of biological systems, such as molecular structures, intra- and intercellular regulatory up to physiological networks. The focus is on demonstrating how these approaches have enabled interpretable representations and ultimately insights in each of these domains. We anticipate that a wider dissemination of latent variable modeling in the life sciences will enable a more effective and productive interpretation of studies based on heterogeneous and high-dimensional data modalities. Key Words: biology; latent representation learning; latent variable modeling; life science.
20	Inference of gene regulatory networks using pseudo-time series data. Bioinformatics 2021;37:2423-2431. [PMID: 33576787 DOI: 10.1093/bioinformatics/btab099] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Revised: 01/18/2021] [Accepted: 02/10/2021] [Indexed: 11/12/2022] Open Abstract MOTIVATION Inferring gene regulatory networks (GRNs) from high-throughput data is an important and challenging problem in systems biology. Although numerous GRN methods have been developed, most have focused on the verification of the specific data set. However, it is difficult to establish directed topological networks that are both suitable for time-series and non-time-series datasets due to the complexity and diversity of biological networks. RESULTS Here, we proposed a novel method, GNIPLR (Gene networks inference based on projection and lagged regression) to infer GRNs from time-series or non-time-series gene expression data. GNIPLR projected gene data twice using the LASSO projection (LSP) algorithm and the linear projection (LP) approximation to produce a linear and monotonous pseudo-time series, and then determined the direction of regulation in combination with lagged regression analyses. The proposed algorithm was validated using simulated and real biological data. Moreover, we also applied the GNIPLR algorithm to the liver hepatocellular carcinoma (LIHC) and bladder urothelial carcinoma (BLCA) cancer expression datasets. These analyses revealed significantly higher accuracy and AUC values than other popular methods. AVAILABILITY The GNIPLR tool is freely available at https://github.com/zyllluck/GNIPLR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
21	Inference of Networks from Large Datasets. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11345-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
22	Single-cell transcriptomics uncover distinct innate and adaptive cell subsets during tissue homeostasis and regeneration. J Leukoc Biol 2020;108:1593-1602. [PMID: 33070367 DOI: 10.1002/jlb.6mr0720-131r] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2020] [Revised: 07/30/2020] [Accepted: 08/10/2020] [Indexed: 02/06/2023] Open Abstract Recently, immune cell-mediated tissue repair and regeneration has been an emerging paradigm of regenerative medicine. Immune cells form an essential part of the wound as induction of inflammation is a necessary step to elicit tissue healing. Rapid progress in transcriptomic analyses by high-throughput next-generation sequencing has been developed to study gene regulatory network and establish molecular signatures of immune cells that could potentially predict their functional roles in tissue repair and regeneration. However, the identification of cellular heterogeneity especially on the rare cell subsets has been limited in transcriptomic analyses of bulk cell populations. Therefore, genome-wide, single-cell RNA sequencing (scRNA-Seq) has offered an unprecedented approach to unravel cellular diversity and to study novel immune cell populations involved in tissue repair and regeneration through unsupervised sampling of individual cells without the need to rely on prior knowledge about cell-specific markers. The analysis of gene expression patterns at a single-cell resolution also holds promises to uncover the mechanisms and therefore the development of therapeutic strategy promoting immunoregenerative medicine. 
In this review, we will discuss how scRNA-Seq facilitates the characterization of immune cells, including macrophages, innate lymphoid cells and T and B lymphocytes, discovery of immune cell heterogeneity, identification of novel subsets, and tracking of developmental trajectories of distinct immune cells during tissue homeostasis, repair, and regeneration. Key Words: immune cells; single cell RNA-Seq; tissue regeneration; tissue repair.
23	Immunology in the Era of Single-Cell Technologies. Annu Rev Immunol 2020;38:727-757. [PMID: 32075461 DOI: 10.1146/annurev-immunol-090419-020340] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Abstract Immune cells are characterized by diversity, specificity, plasticity, and adaptability-properties that enable them to contribute to homeostasis and respond specifically and dynamically to the many threats encountered by the body. Single-cell technologies, including the assessment of transcriptomics, genomics, and proteomics at the level of individual cells, are ideally suited to studying these properties of immune cells. In this review we discuss the benefits of adopting single-cell approaches in studying underappreciated qualities of immune cells and highlight examples where these technologies have been critical to advancing our understanding of the immune system in health and disease. Key Words: heterogeneity; immune repertoire; immunology; single-cell; spatial imaging; trajectory inference.
24	Cell lineage inference from SNP and scRNA-Seq data. Nucleic Acids Res 2019;47:e56. [PMID: 30820578 PMCID: PMC6547431 DOI: 10.1093/nar/gkz146] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2018] [Revised: 02/13/2019] [Accepted: 02/20/2019] [Indexed: 12/15/2022] Open Abstract Several recent studies focus on the inference of developmental and response trajectories from single cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However, both approaches suffer from drawbacks that limit their use. Here, we develop a method to detect significant, cell type specific, sequence mutations from scRNA-Seq data. We show that only a few mutations are enough for reconstructing good branching models. Integrating these mutations with expression data further improves the accuracy of the reconstructed models. As we show, the majority of mutations we identify are likely RNA editing events indicating that such information can be used to distinguish cell types.
25	A sparse differential clustering algorithm for tracing cell type changes via single-cell RNA-sequencing data. Nucleic Acids Res 2019;46:e14. [PMID: 29140455 PMCID: PMC5815159 DOI: 10.1093/nar/gkx1113] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2017] [Accepted: 10/24/2017] [Indexed: 12/15/2022] Open Abstract Cell types in cell populations change as the condition changes: some cell types die out, new cell types may emerge and surviving cell types evolve to adapt to the new condition. Using single-cell RNA-sequencing data that measure the gene expression of cells before and after the condition change, we propose an algorithm, SparseDC, which identifies cell types, traces their changes across conditions and identifies genes which are marker genes for these changes. By solving a unified optimization problem, SparseDC completes all three tasks simultaneously. SparseDC is highly computationally efficient and demonstrates its accuracy on both simulated and real data.
26	WASABI: a dynamic iterative framework for gene regulatory network inference. BMC Bioinformatics 2019;20:220. [PMID: 31046682 PMCID: PMC6498543 DOI: 10.1186/s12859-019-2798-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2019] [Accepted: 04/09/2019] [Indexed: 12/14/2022] Open Abstract BACKGROUND Inference of gene regulatory networks from gene expression data has been a long-standing and notoriously difficult task in systems biology. Recently, single-cell transcriptomic data have been massively used for gene regulatory network inference, with both successes and limitations. RESULTS In the present work we propose an iterative algorithm called WASABI, dedicated to inferring a causal dynamical network from time-stamped single-cell data, which tackles some of the limitations associated with current approaches. We first introduce the concept of waves, which posits that the information provided by an external stimulus will affect genes one-by-one through a cascade, like waves spreading through a network. This concept allows us to infer the network one gene at a time, after genes have been ordered regarding their time of regulation. We then demonstrate the ability of WASABI to correctly infer small networks, which have been simulated in silico using a mechanistic model consisting of coupled piecewise-deterministic Markov processes for the proper description of gene expression at the single-cell level. We finally apply WASABI on in vitro generated data on an avian model of erythroid differentiation. The structure of the resulting gene regulatory network sheds a new light on the molecular mechanisms controlling this process. In particular, we find no evidence for hub genes and a much more distributed network structure than expected. 
Interestingly, we find that a majority of genes are under the direct control of the differentiation-inducing stimulus. CONCLUSIONS Together, these results demonstrate WASABI versatility and ability to tackle some general gene regulatory networks inference issues. It is our hope that WASABI will prove useful in helping biologists to fully exploit the power of time-stamped single-cell data. Key Words: Erythropoiesis; Gene network inference; High parallel computing; Multiscale modelling; Proteomic; Single-cell transcriptomics; T2EC. MeSH Headings: Algorithms; Animals; Cell Differentiation/genetics; Computer Simulation; Erythroid Cells/metabolism; Gene Expression Profiling; Gene Regulatory Networks; Markov Chains; Single-Cell Analysis; Systems Biology/methods. Grants: ANR-IABI-3096 Agence Nationale de la Recherche; ANR-17-CE12-0031 Agence Nationale de la Recherche; CIFRE 2015/0436 Association Nationale de la Recherche Technique.
27	SCOUT: A new algorithm for the inference of pseudo-time trajectory using single-cell data. Comput Biol Chem 2019;80:111-120. [PMID: 30947069 DOI: 10.1016/j.compbiolchem.2019.03.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2019] [Accepted: 03/23/2019] [Indexed: 11/21/2022] Abstract Single cell technology is a powerful tool to reveal intercellular heterogeneity and discover cellular developmental processes. When analyzing the complexity of cellular dynamics and variability, it is important to construct a pseudo-time trajectory using single-cell expression data to reflect the process of cellular development. Although a number of computational and statistical methods have been developed recently for single-cell analysis, more effective and efficient methods are still strongly needed. In this work we propose a new method named SCOUT for the inference of single-cell pseudo-time ordering with bifurcation trajectories. We first propose to use the fixed-radius near neighbors algorithms based on cell densities to find landmarks to represent the cell states, and employ the minimum spanning tree (MST) to determine the developmental branches. We then propose to use the projection of Apollonian circle or a weighted distance to determine the pseudo-time trajectories of single cells. The proposed algorithm is applied to one synthetic and two realistic single-cell datasets (including single-branching and multi-branching trajectories) and the cellular developmental dynamics is recovered successfully. Compared with other popular methods, numerical results show that our proposed method is able to generate more robust and accurate pseudo-time trajectories. The code of the method is implemented in Python and available at https://github.com/statway/SCOUT. 
Key Words: Cell heterogeneity; Pseudo-time trajectory; Single-cell transcriptomics.
28	A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019;37:547-554. [DOI: 10.1038/s41587-019-0071-9] [Citation(s) in RCA: 666] [Impact Index Per Article: 133.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2018] [Accepted: 02/13/2019] [Indexed: 11/09/2022]
29	The Human Cell Atlas: Technical approaches and challenges. Brief Funct Genomics 2018;17:283-294. [PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open Abstract The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered. Key Words: human cell atlas; single cell rna sequencing; bioinformatics. MeSH Headings: Databases, Factual; Gene Expression Profiling; Humans; Sequence Analysis, RNA; Single-Cell Analysis. Grants: Wellcome Trust; 206194 Wellcome Trust.
30	Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 2018;17:246-254. [PMID: 29342231 PMCID: PMC6063279 DOI: 10.1093/bfgp/elx046] [Citation(s) in RCA: 132] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open Abstract Single-cell techniques are advancing rapidly and are yielding unprecedented insight into cellular heterogeneity. Mapping the gene regulatory networks (GRNs) underlying cell states provides attractive opportunities to mechanistically understand this heterogeneity. In this review, we discuss recently emerging methods to map GRNs from single-cell transcriptomics data, tackling the challenge of increased noise levels and data sparsity compared with bulk data, alongside increasing data volumes. Next, we discuss how new techniques for single-cell epigenomics, such as single-cell ATAC-seq and single-cell DNA methylation profiling, can be used to decipher gene regulatory programmes. We finally look forward to the application of single-cell multi-omics and perturbation techniques that will likely play important roles for GRN inference in the future. Key Words: single-cell transcriptomics; single-cell epigenomics; gene regulatory networks. MeSH Headings: Epigenomics/methods; Gene Expression Profiling/methods; Gene Regulatory Networks; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods.
31	SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 2018;33:2314-2321. [PMID: 28379368 PMCID: PMC5860123 DOI: 10.1093/bioinformatics/btx194] [Citation(s) in RCA: 204] [Impact Index Per Article: 34.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2016] [Accepted: 04/02/2017] [Indexed: 01/17/2023] Open Abstract Motivation The analysis of RNA-Seq data from individual differentiating cells enables us to reconstruct the differentiation process and the degree of differentiation (in pseudo-time) of each cell. Such analyses can reveal detailed expression dynamics and functional relationships for differentiation. To further elucidate differentiation processes, more insight into gene regulatory networks is required. The pseudo-time can be regarded as time information and, therefore, single-cell RNA-Seq data are time-course data with high time resolution. Although time-course data are useful for inferring networks, conventional inference algorithms for such data suffer from high time complexity when the number of samples and genes is large. Therefore, a novel algorithm is necessary to infer networks from single-cell RNA-Seq during differentiation. Results In this study, we developed the novel and efficient algorithm SCODE to infer regulatory networks, based on ordinary differential equations. We applied SCODE to three single-cell RNA-Seq datasets and confirmed that SCODE can reconstruct observed expression dynamics. We evaluated SCODE by comparing its inferred networks with use of a DNaseI-footprint based network. The performance of SCODE was best for two of the datasets and nearly best for the remaining dataset. We also compared the runtimes and showed that the runtimes for SCODE are significantly shorter than for alternatives. 
Thus, our algorithm provides a promising approach for further single-cell differentiation analyses. Availability and Implementation The R source code of SCODE is available at https://github.com/hmatsu1226/SCODE Supplementary information Supplementary data are available at Bioinformatics online.
32	Single-Cell Computational Strategies for Lineage Reconstruction in Tissue Systems. Cell Mol Gastroenterol Hepatol 2018;5:539-548. [PMID: 29713661 PMCID: PMC5924749 DOI: 10.1016/j.jcmgh.2018.01.023] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/02/2018] [Accepted: 01/31/2018] [Indexed: 12/21/2022] Abstract Function at the organ level manifests itself from a heterogeneous collection of cell types. Cellular heterogeneity emerges from developmental processes by which multipotent progenitor cells make fate decisions and transition to specific cell types through intermediate cell states. Although genetic experimental strategies such as lineage tracing have provided insights into cell lineages, recent developments in single-cell technologies have greatly increased our ability to interrogate distinct cell types, as well as transitional cell states in tissue systems. From single-cell data that describe these intermediate cell states, computational tools have been developed to reconstruct cell-state transition trajectories that model cell developmental processes. These algorithms, although powerful, are still in their infancy, and attention must be paid to their strengths and weaknesses when they are used. Here, we review some of these tools, also referred to as pseudotemporal ordering algorithms, and their associated assumptions and caveats. We hope to provide a rational and generalizable workflow for single-cell trajectory analysis that is intuitive for experimental biologists. 
Key Words: Cell State Transition; Differentiation; MST, minimum spanning tree; PCA, principal component analysis; Pseudotime; Single-Cell Analysis; Stem Cells; Trajectory; scRNA-seq, single-cell RNA-sequencing; t-SNE, t-distributed stochastic neighbor embedding.
33	Constructing cell lineages from single-cell transcriptomes. Mol Aspects Med 2017;59:95-113. [PMID: 29107741 DOI: 10.1016/j.mam.2017.10.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2017] [Revised: 10/23/2017] [Accepted: 10/25/2017] [Indexed: 12/25/2022] Abstract Advances in single-cell RNA-sequencing have helped reveal the previously underappreciated level of cellular heterogeneity present during cellular differentiation. A static snapshot of single-cell transcriptomes provides a good representation of the various stages of differentiation as differentiation is rarely synchronized between cells. Data from numerous single-cell analyses has suggested that cellular differentiation and development can be conceptualized as continuous processes. Consequently, computational algorithms have been developed to infer lineage relationships between cell types and construct developmental trajectories along which cells are re-ordered such that similarity between successive cell pairs is maximized. Here, we compare and contrast the existing computational methods, and illustrate how they may be applied to build mouse myeloid progenitor lineages from massively parallel RNA single-cell sequencing data. Key Words: Algorithm; Differentiation; Lineage mapping; Progenitor; RNA-Sequencing; Single-cell analysis. MeSH Headings: Algorithms; Animals; Cell Differentiation/genetics; Cell Differentiation/physiology; Cell Lineage; Gene Expression Profiling/methods; Humans; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Transcriptome/genetics; Transcriptome/physiology.
34	Learning regulatory models for cell development from single cell transcriptomic data. Curr Opin Syst Biol 2017. [DOI: 10.1016/j.coisb.2017.07.013] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 12/19/2022]
35	Understanding development and stem cells using single cell-based analyses of gene expression. Development 2017;144:17-32. [PMID: 28049689 DOI: 10.1242/dev.133058] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Indexed: 12/15/2022] Abstract: In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells. Key Words: Computational biology; Gene regulatory networks; Pseudotime; RNA-Seq; Single cell; Stem cells
36	Order Under Uncertainty: Robust Differential Expression Analysis Using Probabilistic Models for Pseudotime Inference. PLoS Comput Biol 2016;12:e1005212. [PMID: 27870852 PMCID: PMC5117567 DOI: 10.1371/journal.pcbi.1005212] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 04/21/2016] [Accepted: 10/13/2016] [Indexed: 11/18/2022] Open Abstract: Single cell gene expression profiling can be used to quantify transcriptional dynamics in temporal processes, such as cell differentiation, using computational methods to label each cell with a ‘pseudotime’ where true time series experimentation is too difficult to perform. However, owing to the high variability in gene expression between individual cells, there is an inherent uncertainty in the precise temporal ordering of the cells. Pre-existing methods for pseudotime estimation have predominantly given point estimates precluding a rigorous analysis of the implications of uncertainty. We use probabilistic modelling techniques to quantify pseudotime uncertainty and propagate this into downstream differential expression analysis. We demonstrate that reliance on a point estimate of pseudotime can lead to inflated false discovery rates and that probabilistic approaches provide greater robustness and measures of the temporal resolution that can be obtained from pseudotime inference. Understanding the “cellular programming” that controls fundamental, dynamic biological processes is important for determining normal cellular function and potential perturbations that might give rise to physiological disorders. Ideally, investigations would employ time series experiments to periodically measure the properties of each cell. This would allow us to understand the sequence of gene (in)activations that constitute the program being followed.
In practice, such experiments can be difficult to perform as cellular activity may be asynchronous, with each cell occupying a different phase of the process of interest. Furthermore, the unbiased measurement of all transcripts or proteins requires the cells to be captured and lysed, precluding the continued monitoring of that cell. In the absence of the ability to conduct true time series experiments, pseudotime algorithms exploit the asynchronous cellular nature of these systems to mathematically assign a “pseudotime” to each cell based on its molecular profile, allowing the cells to be aligned and the sequence of gene activation events retrospectively inferred. Existing approaches predominantly use deterministic methods that ignore the statistical uncertainties associated with the problem. This paper demonstrates that this statistical uncertainty limits the temporal resolution that can be extracted from static snapshots of cell expression profiles and can also detrimentally affect downstream analysis. Grants: Wellcome Trust; MC_PC_14131, Medical Research Council; MR/L001411/1, Medical Research Council; MR/M00919X/1, Medical Research Council; Li Ka Shing Foundation
37	Computational methods for trajectory inference from single-cell transcriptomics. Eur J Immunol 2016;46:2496-2506. [DOI: 10.1002/eji.201646347] [Citation(s) in RCA: 112] [Impact Index Per Article: 14.0] [Received: 07/02/2016] [Revised: 08/30/2016] [Accepted: 09/26/2016] [Indexed: 12/22/2022]