1
Peng D, Cahan P. OneSC: a computational platform for recapitulating cell state transitions. Bioinformatics 2024; 40:btae703. [PMID: 39570626 PMCID: PMC11630913 DOI: 10.1093/bioinformatics/btae703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 11/13/2024] [Accepted: 11/19/2024] [Indexed: 11/22/2024] Open
Abstract
MOTIVATION: Computational modeling of cell state transitions has been of great interest in developmental biology, cancer biology, and cell fate engineering because it enables perturbation experiments to be performed in silico more rapidly and cheaply than in a lab. Recent advances in single-cell RNA-sequencing (scRNA-seq) allow the capture of high-resolution snapshots of cell states as they transition along temporal trajectories. Using these high-throughput datasets, we can train computational models to generate in silico "synthetic" cells that faithfully mimic the temporal trajectories. RESULTS: Here we present OneSC, a platform that can simulate cell state transitions using systems of stochastic differential equations governed by a regulatory network of core transcription factors (TFs). Unlike many current network inference methods, OneSC prioritizes generating Boolean networks that produce faithful cell state transitions and terminal cell states that mimic real biological systems. Applying OneSC to real data, we inferred a core TF network using a mouse myeloid progenitor scRNA-seq dataset and showed that the dynamical simulations of that network generate synthetic single-cell expression profiles that faithfully recapitulate the four myeloid differentiation trajectories going into differentiated cell states (erythrocytes, megakaryocytes, granulocytes, and monocytes). Finally, through in silico perturbations of the mouse myeloid progenitor core network, we showed that OneSC can accurately predict cell fate decision biases of TF perturbations that closely match previous experimental observations. AVAILABILITY AND IMPLEMENTATION: OneSC is implemented as a Python package on GitHub (https://github.com/CahanLab/oneSC) and on Zenodo (https://zenodo.org/records/14052421).
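To make the idea of stochastic differential equations driven by a Boolean-style regulatory network more concrete, the following is a minimal sketch and not OneSC's actual implementation or API. It assumes a hypothetical two-gene network in which each gene represses the other, approximates the Boolean NOT with a Hill function, and integrates the system with the Euler-Maruyama scheme. The function names and all parameter values (`k`, `n`, `decay`, `sigma`, `dt`) are illustrative assumptions.

```python
import math
import random


def hill_repression(x, k=0.5, n=4):
    """Soft Boolean NOT: close to 1 when x << k, close to 0 when x >> k."""
    return 1.0 / (1.0 + (x / k) ** n)


def simulate_toggle(x0, y0, steps=2000, dt=0.01, decay=1.0, sigma=0.05, seed=0):
    """Euler-Maruyama integration of a hypothetical mutual-repression circuit.

    dX = (NOT(Y) - decay * X) dt + sigma dW
    dY = (NOT(X) - decay * Y) dt + sigma dW
    Returns the list of (x, y) states, including the initial one.
    """
    rng = random.Random(seed)
    noise_scale = sigma * math.sqrt(dt)  # standard deviation of each Wiener increment
    x, y = x0, y0
    trajectory = [(x, y)]
    for _ in range(steps):
        dx = (hill_repression(y) - decay * x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        dy = (hill_repression(x) - decay * y) * dt + noise_scale * rng.gauss(0.0, 1.0)
        # Expression levels cannot be negative, so clip at zero.
        x = max(x + dx, 0.0)
        y = max(y + dy, 0.0)
        trajectory.append((x, y))
    return trajectory
```

Calling `simulate_toggle(0.9, 0.1)` starts the synthetic cell near the "X on, Y off" state, and the trajectory stays in that basin; swapping the initial values gives the opposite terminal state. Each run with a different `seed` plays the role of one synthetic cell.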
Affiliation(s)
- Da Peng
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, United States
- Patrick Cahan
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, United States
- Institute for Cell Engineering, Johns Hopkins University, Baltimore, MD 21205, United States
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD 21205, United States
2
Karamveer, Uzun Y. Approaches for Benchmarking Single-Cell Gene Regulatory Network Methods. Bioinform Biol Insights 2024; 18:11779322241287120. [PMID: 39502448 PMCID: PMC11536393 DOI: 10.1177/11779322241287120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 09/10/2024] [Indexed: 11/08/2024] Open
Abstract
Gene regulatory networks are powerful tools for modeling genetic interactions that control the expression of genes driving cell differentiation, and single-cell sequencing offers a unique opportunity to build these networks with high-resolution genomic data. There are many proposed computational methods to build these networks using single-cell data, and different approaches are used to benchmark these methods. However, a comprehensive discussion specifically focusing on benchmarking approaches is missing. In this article, we lay out the GRN terminology, present an overview of common gold-standard studies and data sets, and define the performance metrics for benchmarking network construction methodologies. We also point out the advantages and limitations of different benchmarking approaches, suggest alternative ground truth data sets that can be used for benchmarking, and specify additional considerations in this context.
Affiliation(s)
- Karamveer
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Penn State Cancer Institute, The Pennsylvania State University College of Medicine, Hershey, PA, USA
3
Zhu L, Kang X, Li C, Zheng J. TMELand: An End-to-End Pipeline for Quantification and Visualization of Waddington's Epigenetic Landscape Based on Gene Regulatory Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1604-1612. [PMID: 37310837 DOI: 10.1109/tcbb.2023.3285395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Waddington's epigenetic landscape is a framework depicting the processes of cell differentiation and reprogramming under the control of a gene regulatory network (GRN). Traditional model-driven methods for landscape quantification focus on Boolean network or differential equation-based models of the GRN, which require sophisticated prior knowledge, and this hampers their practical application. To resolve this problem, we combine data-driven methods for inferring GRNs from gene expression data with a model-driven approach to landscape mapping. Specifically, we build an end-to-end pipeline to link data-driven and model-driven methods and develop a software tool named TMELand for GRN inference, visualizing Waddington's epigenetic landscape, and calculating state transition paths between attractors to uncover the intrinsic mechanism of cellular transition dynamics. By integrating GRN inference from real transcriptomic data with landscape modeling, TMELand can facilitate studies of computational systems biology, such as predicting cellular states and visualizing the dynamical trends of cell fate determination and transition dynamics from single-cell transcriptomic data.
4
Peng D, Cahan P. OneSC: A computational platform for recapitulating cell state transitions. bioRxiv: the preprint server for biology 2024:2024.05.31.596831. [PMID: 38895453 PMCID: PMC11185539 DOI: 10.1101/2024.05.31.596831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Computational modelling of cell state transitions has been of great interest in developmental biology, cancer biology and cell fate engineering because it enables perturbation experiments to be performed in silico more rapidly and cheaply than in a wet lab. Recent advances in single-cell RNA sequencing (scRNA-seq) allow the capture of high-resolution snapshots of cell states as they transition along temporal trajectories. Using these high-throughput datasets, we can train computational models to generate in silico 'synthetic' cells that faithfully mimic the temporal trajectories. Here we present OneSC, a platform that can simulate synthetic cells across developmental trajectories using systems of stochastic differential equations governed by a regulatory network of core transcription factors (TFs). Unlike current network inference methods, OneSC prioritizes generating Boolean networks that produce faithful cell state transitions and steady cell states that mimic real biological systems. Applying OneSC to real data, we inferred a core TF network using a mouse myeloid progenitor scRNA-seq dataset and showed that the dynamical simulations of that network generate synthetic single-cell expression profiles that faithfully recapitulate the four myeloid differentiation trajectories going into differentiated cell states (erythrocytes, megakaryocytes, granulocytes and monocytes). Finally, through in silico perturbations of the mouse myeloid progenitor core network, we showed that OneSC can accurately predict cell fate decision biases of TF perturbations that closely match previous experimental observations.
Affiliation(s)
- Da Peng
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, 21205, USA
- Patrick Cahan
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, 21205, USA
- Institute for Cell Engineering, Johns Hopkins University, Baltimore, Maryland, 21205, USA
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, Maryland, 21205, USA
5
Chen F, Li C. Inferring structural and dynamical properties of gene networks from data with deep learning. NAR Genom Bioinform 2022; 4:lqac068. [PMID: 36110897 PMCID: PMC9469930 DOI: 10.1093/nargab/lqac068] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/18/2022] [Revised: 07/22/2022] [Accepted: 08/24/2022] [Indexed: 11/29/2022] Open
Abstract
The reconstruction of gene regulatory networks (GRNs) from data is vital in systems biology. Although different approaches have been proposed to infer causality from data, some challenges remain, such as how to accurately infer the direction and type of interactions, how to deal with complex networks involving multiple feedback loops, and how to infer causality between variables from real-world data, especially single-cell data. Here, we tackle these problems with deep neural networks (DNNs). The underlying regulatory network for different systems (gene regulation, ecology, diseases, development) can be successfully reconstructed from trained DNN models. We show that the DNN is superior to existing approaches including Boolean network, Random Forest and partial cross mapping for network inference. Further, by interrogating the ensemble DNN model trained on single-cell data from a dynamical systems perspective, we are able to unravel complex cell fate dynamics during preimplantation development. We also propose a data-driven approach to quantify the energy landscape for gene regulatory systems, by combining the DNN with the partial self-consistent mean field approximation (PSCA) approach. We anticipate that the proposed method can be applied to other fields to decipher the underlying dynamical mechanisms of systems from data.
Affiliation(s)
- Feng Chen
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Shanghai Center for Mathematical Sciences, Fudan University, Shanghai 200433, China
- Chunhe Li
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Shanghai Center for Mathematical Sciences, Fudan University, Shanghai 200433, China
- School of Mathematical Sciences, Fudan University, Shanghai 200433, China
6
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Won-min Song
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Peng Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Yonejung Yoon
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Lap Ho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Miranda E. Orr
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
7
Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol 2022; 23:31. [PMID: 35063006 PMCID: PMC8783472 DOI: 10.1186/s13059-022-02601-5] [Citation(s) in RCA: 172] [Impact Index Per Article: 57.3] [Received: 09/27/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022] Open
Abstract
Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
Affiliation(s)
- Ruochen Jiang
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
- Tianyi Sun
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
- Dongyuan Song
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, 90095-7246, CA, USA
- Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA.
- Department of Human Genetics, University of California, Los Angeles, 90095-7088, CA, USA.
- Department of Computational Medicine, University of California, Los Angeles, 90095-1766, CA, USA.
- Department of Biostatistics, University of California, Los Angeles, 90095-1772, CA, USA.
8
Shrivastava H, Zhang X, Song L, Aluru S. GRNUlar: A Deep Learning Framework for Recovering Single-Cell Gene Regulatory Networks. J Comput Biol 2022; 29:27-44. [PMID: 35050715 DOI: 10.1089/cmb.2021.0437] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/20/2022] Open
Abstract
We propose GRNUlar, a novel deep learning framework for supervised learning of gene regulatory networks (GRNs) from single-cell RNA-Sequencing (scRNA-Seq) data. Our framework incorporates two intertwined models. First, we leverage the expressive ability of neural networks to capture complex dependencies between transcription factors and the corresponding genes they regulate, by developing a multitask learning framework. Second, to capture sparsity of GRNs observed in the real world, we design an unrolled algorithm technique for our framework. Our deep architecture requires supervision for training, for which we repurpose existing synthetic data simulators that generate scRNA-Seq data guided by an underlying GRN. Experimental results demonstrate that GRNUlar outperforms state-of-the-art methods on both synthetic and real data sets. Our study also demonstrates the novel and successful use of expression data simulators for supervised learning of GRN inference.
Affiliation(s)
- Harsh Shrivastava
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
- Xiuwei Zhang
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
- Le Song
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
- Srinivas Aluru
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
9
Single-cell network biology for resolving cellular heterogeneity in human diseases. Exp Mol Med 2020; 52:1798-1808. [PMID: 33244151 PMCID: PMC8080824 DOI: 10.1038/s12276-020-00528-0] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 07/30/2020] [Revised: 08/26/2020] [Accepted: 08/31/2020] [Indexed: 01/10/2023] Open
Abstract
Understanding cellular heterogeneity is the holy grail of biology and medicine. Cells harboring identical genomes show a wide variety of behaviors in multicellular organisms. Genetic circuits underlying cell-type identities will facilitate the understanding of the regulatory programs for differentiation and maintenance of distinct cellular states. Such a cell-type-specific gene network can be inferred from coregulatory patterns across individual cells. Conventional methods of transcriptome profiling using tissue samples provide only average signals of diverse cell types. Therefore, reconstructing gene regulatory networks for a particular cell type is not feasible with tissue-based transcriptome data. Recently, single-cell omics technology has emerged and enabled the capture of the transcriptomic landscape of every individual cell. Although single-cell gene expression studies have already opened up new avenues, network biology using single-cell transcriptome data will further accelerate our understanding of cellular heterogeneity. In this review, we provide an overview of single-cell network biology and summarize recent progress in method development for network inference from single-cell RNA sequencing (scRNA-seq) data. Then, we describe how cell-type-specific gene networks can be utilized to study regulatory programs specific to disease-associated cell types and cellular states. Moreover, with scRNA data, modeling personal or patient-specific gene networks is feasible. Therefore, we also introduce potential applications of single-cell network biology for precision medicine. We envision a rapid paradigm shift toward single-cell network analysis for systems biology in the near future. Gene regulatory networks reconstructed from single-cell RNA sequencing datasets are allowing researchers to better understand the molecular circuits and cell states that contribute to complex human disease. 
Junha Cha and Insuk Lee from Yonsei University in Seoul, South Korea, review the concept of ‘single-cell network biology’, which involves using computational algorithms on genetic expression data from thousands of cells to infer functional interactions in various biological contexts. This systems biology approach to analyzing the profiles of messenger RNA in single cells is helping researchers discover new signaling pathways that could serve as disease biomarkers or therapeutic targets. In the future, patient-specific models of personal gene networks could explain why certain genetic variants affect disease risk. This research could also eventually lead to new types of individualized medical treatments.
10
Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning. Interdiscip Sci 2020; 12:117-130. [PMID: 32086753 DOI: 10.1007/s12539-019-00357-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/03/2019] [Revised: 12/23/2019] [Accepted: 12/26/2019] [Indexed: 12/22/2022]
Abstract
Clustering of single-cell RNA sequencing (scRNA-seq) data enables discovering cell subtypes, which is helpful for understanding and analyzing the processes of diseases. Determining the weight of edges is an essential component in graph-based clustering methods. While several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) without considering the structure information of graph. Here, to improve the clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structure information of graph and module detection. In SSNN-Louvain, based on the distance between a node and its shared nearest neighbors, the weight of edge is defined by introducing the ratio of the number of the shared nearest neighbors to that of nearest neighbors, thus integrating structure information of the graph. Then, a modified Louvain community detection algorithm is proposed and applied to identify modules in the graph. Essentially, each community represents a subtype of cells. It is worth mentioning that our proposed method integrates the advantages of both SNN graph and community detection without the need for tuning any additional parameter other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it to five existing methods on 16 real datasets, including nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat and PhenoGraph. The experimental results show that our approach achieves the best average performance in these datasets.
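The abstract describes an edge weight built from the ratio of shared nearest neighbors to nearest neighbors. The exact SSNN-Louvain formula is not given here, so the sketch below is only a simplified stand-in: it assumes the weight of an edge is the number of neighbors two cells share divided by `k`, and it ignores the distance term the abstract mentions. The function names `knn_sets` and `snn_ratio_weights` are hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbours (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append({j for _, j in ranked[:k]})
    return neighbours


def snn_ratio_weights(points, k):
    """Weight each pair (i, j) by |kNN(i) & kNN(j)| / k; pairs sharing nothing get no edge."""
    nn = knn_sets(points, k)
    weights = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = nn[i] & nn[j]
            if shared:
                weights[(i, j)] = len(shared) / k
    return weights
```

On two well-separated groups of cells, this produces edges only within each group, which is the structure a community detection step such as Louvain would then partition.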
11
Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020; 17:147-154. [PMID: 31907445 PMCID: PMC7098173 DOI: 10.1038/s41592-019-0690-6] [Citation(s) in RCA: 381] [Impact Index Per Article: 76.2] [Received: 06/04/2019] [Accepted: 11/22/2019] [Indexed: 01/10/2023]
Abstract
We present a systematic evaluation of state-of-the-art algorithms for inferring gene regulatory networks from single-cell transcriptional data. As the ground truth for assessing accuracy, we use synthetic networks with predictable trajectories, literature-curated Boolean models and diverse transcriptional regulatory networks. We develop a strategy to simulate single-cell transcriptional data from synthetic and Boolean networks that avoids pitfalls of previously used methods. Furthermore, we collect networks from multiple experimental single-cell RNA-seq datasets. We develop an evaluation framework called BEELINE. We find that the area under the precision-recall curve and early precision of the algorithms are moderate. The methods are better in recovering interactions in synthetic networks than Boolean models. The algorithms with the best early precision values for Boolean models also perform well on experimental datasets. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, we present recommendations to end users. BEELINE will aid the development of gene regulatory network inference algorithms.
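The two metrics named in this abstract can be illustrated with a small sketch. This is not BEELINE's code: it assumes early precision means the fraction of true edges among the top-k ranked predictions, with k equal to the number of edges in the ground-truth network, and it uses average precision as a common step-wise approximation of the area under the precision-recall curve. The function names and the toy edge list are hypothetical.

```python
def early_precision(ranked_edges, true_edges):
    """Fraction of true edges among the top-k predictions, with k = len(true_edges)."""
    k = len(true_edges)
    top_k = ranked_edges[:k]
    return sum(1 for edge in top_k if edge in true_edges) / k


def average_precision(ranked_edges, true_edges):
    """Mean of the precision values at each rank where a true edge is recovered."""
    hits = 0
    total = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank
    return total / len(true_edges)


# Toy example: edges are (regulator, target) pairs, ranked by predicted confidence.
true_edges = {("A", "B"), ("B", "C")}
ranked_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
```

For this toy ranking, one of the top two predictions is correct, so `early_precision(ranked_edges, true_edges)` is 0.5, while the average precision is (1/1 + 2/3) / 2.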
Affiliation(s)
- Aditya Pratapa
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Amogh P Jalihal
- Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
- Jeffrey N Law
- Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
- Aditya Bharadwaj
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA.
12
Chen X, Li M, Zheng R, Wu FX, Wang J. D3GRN: a data driven dynamic network construction method to infer gene regulatory networks. BMC Genomics 2019; 20:929. [PMID: 31881937 PMCID: PMC6933629 DOI: 10.1186/s12864-019-6298-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND To infer gene regulatory networks (GRNs) from gene-expression data is still a fundamental and challenging problem in systems biology. Several existing algorithms formulate GRNs inference as a regression problem and obtain the network with an ensemble strategy. Recent studies on data driven dynamic network construction provide us a new perspective to solve the regression problem. RESULTS In this study, we propose a data driven dynamic network construction method to infer gene regulatory network (D3GRN), which transforms the regulatory relationship of each target gene into functional decomposition problem and solves each sub problem by using the Algorithm for Revealing Network Interactions (ARNI). To remedy the limitation of ARNI in constructing networks solely from the unit level, a bootstrapping and area based scoring method is taken to infer the final network. On DREAM4 and DREAM5 benchmark datasets, D3GRN performs competitively with the state-of-the-art algorithms in terms of AUPR. CONCLUSIONS We have proposed a novel data driven dynamic network construction method by combining ARNI with bootstrapping and area based scoring strategy. The proposed method performs well on the benchmark datasets, contributing as a competitive method to infer gene regulatory networks in a new perspective.
Affiliation(s)
- Xiang Chen
- School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China.
- Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu
- Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, China
| |
Collapse
|
13
|
Bonnaffoux A, Herbach U, Richard A, Guillemin A, Gonin-Giraud S, Gros PA, Gandrillon O. WASABI: a dynamic iterative framework for gene regulatory network inference. BMC Bioinformatics 2019; 20:220. [PMID: 31046682 PMCID: PMC6498543 DOI: 10.1186/s12859-019-2798-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/15/2019] [Accepted: 04/09/2019] [Indexed: 12/14/2022]
Abstract
BACKGROUND Inference of gene regulatory networks from gene expression data has been a long-standing and notoriously difficult task in systems biology. Recently, single-cell transcriptomic data have been massively used for gene regulatory network inference, with both successes and limitations. RESULTS In the present work we propose an iterative algorithm called WASABI, dedicated to inferring a causal dynamical network from time-stamped single-cell data, which tackles some of the limitations associated with current approaches. We first introduce the concept of waves, which posits that the information provided by an external stimulus will affect genes one-by-one through a cascade, like waves spreading through a network. This concept allows us to infer the network one gene at a time, after genes have been ordered regarding their time of regulation. We then demonstrate the ability of WASABI to correctly infer small networks, which have been simulated in silico using a mechanistic model consisting of coupled piecewise-deterministic Markov processes for the proper description of gene expression at the single-cell level. We finally apply WASABI on in vitro generated data on an avian model of erythroid differentiation. The structure of the resulting gene regulatory network sheds a new light on the molecular mechanisms controlling this process. In particular, we find no evidence for hub genes and a much more distributed network structure than expected. Interestingly, we find that a majority of genes are under the direct control of the differentiation-inducing stimulus. CONCLUSIONS Together, these results demonstrate WASABI versatility and ability to tackle some general gene regulatory networks inference issues. It is our hope that WASABI will prove useful in helping biologists to fully exploit the power of time-stamped single-cell data.
Affiliation(s)
- Arnaud Bonnaffoux
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
  - Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
  - Cosmotech, Lyon, France
- Ulysse Herbach
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
  - Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
  - Univ Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5208, Institut Camille Jordan, Villeurbanne, France
- Angélique Richard
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Anissa Guillemin
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Sandrine Gonin-Giraud
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Olivier Gandrillon
  - University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
  - Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
|
14
|
Abstract
Background: The recently developed single-cell RNA sequencing (scRNA-seq) has attracted a great amount of attention due to its capability to interrogate expression of individual cells, which is superior to traditional bulk cell sequencing that can only measure mean gene expression of a population of cells. scRNA-seq has been successfully applied in finding new cell subtypes. New computational challenges exist in the analysis of scRNA-seq data.
Objective: We provide an overview of the features of different similarity calculation and clustering methods, in order to facilitate users to select methods that are suitable for their scRNA-seq. We would also like to show that feature selection methods are important to improve clustering performance.
Results: We first described similarity measurement methods, followed by reviewing some new clustering methods, as well as their algorithmic details. This analysis revealed several new questions, including how to automatically estimate the number of clustering categories, how to discover novel subpopulation, and how to search for new marker genes by using feature selection methods.
Conclusion: Without prior knowledge about the number of cell types, clustering or semisupervised learning methods are important tools for exploratory analysis of scRNA-seq data.
Affiliation(s)
- Xiaoshu Zhu
  - School of Computer Science and Engineering, Central South University, 410083, Changsha, Hunan, China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, 410083, Changsha, Hunan, China
- Lilu Guo
  - School of Computer Science and Engineering, Yulin Normal University, 537000, Yulin, Guangxi, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, 410083, Changsha, Hunan, China
|
15
|
Zhu X, Li HD, Xu Y, Guo L, Wu FX, Duan G, Wang J. A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data. Genes (Basel) 2019; 10:E98. [PMID: 30700040 PMCID: PMC6409843 DOI: 10.3390/genes10020098] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/23/2018] [Revised: 01/24/2019] [Accepted: 01/25/2019] [Indexed: 02/01/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) has recently brought new insight into cell differentiation processes and functional variation in cell subtypes from homogeneous cell populations. A lack of prior knowledge makes unsupervised machine learning methods, such as clustering, suitable for analyzing scRNA-seq. However, there are several limitations to overcome, including high dimensionality, clustering result instability, and parameter adjustment complexity. In this study, we propose a method by combining structure entropy and k nearest neighbor to identify cell subpopulations in scRNA-seq data. In contrast to existing clustering methods for identifying cell subtypes, minimized structure entropy results in natural communities without specifying the number of clusters. To investigate the performance of our model, we applied it to eight scRNA-seq datasets and compared our method with three existing methods (nonnegative matrix factorization, single-cell interpretation via multikernel learning, and structural entropy minimization principle). The experimental results showed that our approach achieves, on average, better performance in these datasets compared to the benchmark methods.
Affiliation(s)
- Xiaoshu Zhu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, Guangxi 537000, China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yunpei Xu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Lilu Guo
  - School of Computer Science and Engineering, Yulin Normal University, Yulin, Guangxi 537000, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Guihua Duan
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
|
16
|
Lesage R, Kerkhofs J, Geris L. Computational Modeling and Reverse Engineering to Reveal Dominant Regulatory Interactions Controlling Osteochondral Differentiation: Potential for Regenerative Medicine. Front Bioeng Biotechnol 2018; 6:165. [PMID: 30483498 PMCID: PMC6243751 DOI: 10.3389/fbioe.2018.00165] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/01/2018] [Accepted: 10/22/2018] [Indexed: 01/11/2023]
Abstract
The specialization of cartilage cells, or chondrogenic differentiation, is an intricate and meticulously regulated process that plays a vital role in both bone formation and cartilage regeneration. Understanding the molecular regulation of this process might help to identify key regulatory factors that can serve as potential therapeutic targets, or that might improve the development of qualitative and robust skeletal tissue engineering approaches. However, each gene involved in this process is influenced by a myriad of feedback mechanisms that keep its expression in a desirable range, making the prediction of what will happen if one of these genes defaults or is targeted with drugs, challenging. Computer modeling provides a tool to simulate this intricate interplay from a network perspective. This paper aims to give an overview of the current methodologies employed to analyze cell differentiation in the context of skeletal tissue engineering in general and osteochondral differentiation in particular. In network modeling, a network can either be derived from mechanisms and pathways that have been reported in the literature (knowledge-based approach) or it can be inferred directly from the data (data-driven approach). Combinatory approaches allow further optimization of the network. Once a network is established, several modeling technologies are available to interpret dynamically the relationships that have been put forward in the network graph (implication of the activation or inhibition of certain pathways on the evolution of the system over time) and to simulate the possible outcomes of the established network such as a given cell state. This review provides for each of the aforementioned steps (building, optimizing, and modeling the network) a brief theoretical perspective, followed by a concise overview of published works, focusing solely on applications related to cell fate decisions, cartilage differentiation and growth plate biology. 
Particular attention is paid to an in-house developed example of gene regulatory network modeling of growth plate chondrocyte differentiation as all the aforementioned steps can be illustrated. In summary, this paper discusses and explores a series of tools that form a first step toward a rigorous and systems-level modeling of osteochondral differentiation in the context of regenerative medicine.
Affiliation(s)
- Raphaelle Lesage
  - Prometheus, Division of Skeletal Tissue Engineering Leuven, KU Leuven, Leuven, Belgium
  - Biomechanics Section, KU Leuven, Leuven, Belgium
- Johan Kerkhofs
  - Prometheus, Division of Skeletal Tissue Engineering Leuven, KU Leuven, Leuven, Belgium
  - Biomechanics Section, KU Leuven, Leuven, Belgium
- Liesbet Geris
  - Prometheus, Division of Skeletal Tissue Engineering Leuven, KU Leuven, Leuven, Belgium
  - Biomechanics Section, KU Leuven, Leuven, Belgium
  - Biomechanics Research Unit, GIGA in silico Medicine, University of Liège, Liège, Belgium
|
17
|
Zheng R, Li M, Chen X, Wu FX, Pan Y, Wang J. BiXGBoost: a scalable, flexible boosting-based method for reconstructing gene regulatory networks. Bioinformatics 2018; 35:1893-1900. [DOI: 10.1093/bioinformatics/bty908] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 10/27/2018] [Revised: 10/28/2018] [Accepted: 11/04/2018] [Indexed: 12/11/2022]
Affiliation(s)
- Ruiqing Zheng
  - School of Information Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Information Science and Engineering, Central South University, Changsha, China
- Xiang Chen
  - School of Information Science and Engineering, Central South University, Changsha, China
- Fang-Xiang Wu
  - School of Information Science and Engineering, Central South University, Changsha, China
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
- Yi Pan
  - School of Information Science and Engineering, Central South University, Changsha, China
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Jianxin Wang
  - School of Information Science and Engineering, Central South University, Changsha, China
|
18
|
Hon CC, Shin JW, Carninci P, Stubbington MJT. The Human Cell Atlas: Technical approaches and challenges. Brief Funct Genomics 2018; 17:283-294. [PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 12/23/2022]
Abstract
The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.
Affiliation(s)
- Chung-Chau Hon
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
- Jay W Shin
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
- Piero Carninci
  - RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
|
19
|
Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z, Aerts S. Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 2018; 17:246-254. [PMID: 29342231 PMCID: PMC6063279 DOI: 10.1093/bfgp/elx046] [Citation(s) in RCA: 143] [Impact Index Per Article: 20.4] [Indexed: 12/24/2022]
Abstract
Single-cell techniques are advancing rapidly and are yielding unprecedented insight into cellular heterogeneity. Mapping the gene regulatory networks (GRNs) underlying cell states provides attractive opportunities to mechanistically understand this heterogeneity. In this review, we discuss recently emerging methods to map GRNs from single-cell transcriptomics data, tackling the challenge of increased noise levels and data sparsity compared with bulk data, alongside increasing data volumes. Next, we discuss how new techniques for single-cell epigenomics, such as single-cell ATAC-seq and single-cell DNA methylation profiling, can be used to decipher gene regulatory programmes. We finally look forward to the application of single-cell multi-omics and perturbation techniques that will likely play important roles for GRN inference in the future.
Affiliation(s)
- Mark W E J Fiers
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- Liesbeth Minnoye
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
  - KU Leuven, Department of Human Genetics, Leuven, Belgium
- Sara Aibar
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
  - KU Leuven, Department of Human Genetics, Leuven, Belgium
- Carmen Bravo González-Blas
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
  - KU Leuven, Department of Human Genetics, Leuven, Belgium
- Zeynep Kalender Atak
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
  - KU Leuven, Department of Human Genetics, Leuven, Belgium
- Stein Aerts
  - VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
  - KU Leuven, Department of Human Genetics, Leuven, Belgium
|
20
|
Woodhouse S, Piterman N, Wintersteiger CM, Göttgens B, Fisher J. SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data. BMC Syst Biol 2018; 12:59. [PMID: 29801503 PMCID: PMC5970485 DOI: 10.1186/s12918-018-0581-y] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 02/18/2018] [Accepted: 04/10/2018] [Indexed: 11/25/2022]
Abstract
Background Reconstruction of executable mechanistic models from single-cell gene expression data represents a powerful approach to understanding developmental and disease processes. New ambitious efforts like the Human Cell Atlas will soon lead to an explosion of data with potential for uncovering and understanding the regulatory networks which underlie the behaviour of all human cells. In order to take advantage of this data, however, there is a need for general-purpose, user-friendly and efficient computational tools that can be readily used by biologists who do not have specialist computer science knowledge. Results The Single Cell Network Synthesis toolkit (SCNS) is a general-purpose computational tool for the reconstruction and analysis of executable models from single-cell gene expression data. Through a graphical user interface, SCNS takes single-cell qPCR or RNA-sequencing data taken across a time course, and searches for logical rules that drive transitions from early cell states towards late cell states. Because the resulting reconstructed models are executable, they can be used to make predictions about the effect of specific gene perturbations on the generation of specific lineages. Conclusions SCNS should be of broad interest to the growing number of researchers working in single-cell genomics and will help further facilitate the generation of valuable mechanistic insights into developmental, homeostatic and disease processes.
Affiliation(s)
- Steven Woodhouse
  - Department of Hematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
  - Microsoft Research Cambridge, 21 Station Road, Cambridge, CB1 2FB, UK
- Nir Piterman
  - Department of Informatics, University of Leicester, University Road, Leicester, LE1 7RH, UK
- Berthold Göttgens
  - Department of Hematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Jasmin Fisher
  - Microsoft Research Cambridge, 21 Station Road, Cambridge, CB1 2FB, UK
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
|
21
|
Matsumoto H, Kiryu H, Furusawa C, Ko MSH, Ko SBH, Gouda N, Hayashi T, Nikaido I. SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 2018; 33:2314-2321. [PMID: 28379368 PMCID: PMC5860123 DOI: 10.1093/bioinformatics/btx194] [Citation(s) in RCA: 251] [Impact Index Per Article: 35.9] [Received: 11/18/2016] [Accepted: 04/02/2017] [Indexed: 01/17/2023]
Abstract
Motivation The analysis of RNA-Seq data from individual differentiating cells enables us to reconstruct the differentiation process and the degree of differentiation (in pseudo-time) of each cell. Such analyses can reveal detailed expression dynamics and functional relationships for differentiation. To further elucidate differentiation processes, more insight into gene regulatory networks is required. The pseudo-time can be regarded as time information and, therefore, single-cell RNA-Seq data are time-course data with high time resolution. Although time-course data are useful for inferring networks, conventional inference algorithms for such data suffer from high time complexity when the number of samples and genes is large. Therefore, a novel algorithm is necessary to infer networks from single-cell RNA-Seq during differentiation. Results In this study, we developed the novel and efficient algorithm SCODE to infer regulatory networks, based on ordinary differential equations. We applied SCODE to three single-cell RNA-Seq datasets and confirmed that SCODE can reconstruct observed expression dynamics. We evaluated SCODE by comparing its inferred networks with use of a DNaseI-footprint based network. The performance of SCODE was best for two of the datasets and nearly best for the remaining dataset. We also compared the runtimes and showed that the runtimes for SCODE are significantly shorter than for alternatives. Thus, our algorithm provides a promising approach for further single-cell differentiation analyses. Availability and Implementation The R source code of SCODE is available at https://github.com/hmatsu1226/SCODE Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hirotaka Matsumoto
  - Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
- Hisanori Kiryu
  - Department of Computational Biology and Medical Sciences, Faculty of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
- Chikara Furusawa
  - Quantitative Biology Center (QBiC), RIKEN, Suita, Osaka 565-0874, Japan
  - Universal Biology Institute, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Minoru S H Ko
  - Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
- Shigeru B H Ko
  - Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
- Norio Gouda
  - Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
- Tetsutaro Hayashi
  - Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
- Itoshi Nikaido
  - Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
|
22
|
Chen J, Rénia L, Ginhoux F. Constructing cell lineages from single-cell transcriptomes. Mol Aspects Med 2017; 59:95-113. [PMID: 29107741 DOI: 10.1016/j.mam.2017.10.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/01/2017] [Revised: 10/23/2017] [Accepted: 10/25/2017] [Indexed: 12/25/2022]
Abstract
Advances in single-cell RNA-sequencing have helped reveal the previously underappreciated level of cellular heterogeneity present during cellular differentiation. A static snapshot of single-cell transcriptomes provides a good representation of the various stages of differentiation as differentiation is rarely synchronized between cells. Data from numerous single-cell analyses has suggested that cellular differentiation and development can be conceptualized as continuous processes. Consequently, computational algorithms have been developed to infer lineage relationships between cell types and construct developmental trajectories along which cells are re-ordered such that similarity between successive cell pairs is maximized. Here, we compare and contrast the existing computational methods, and illustrate how they may be applied to build mouse myeloid progenitor lineages from massively parallel RNA single-cell sequencing data.
Affiliation(s)
- Jinmiao Chen
  - Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos Building, Level 4, Singapore 138648, Singapore
- Laurent Rénia
  - Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos Building, Level 4, Singapore 138648, Singapore
- Florent Ginhoux
  - Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos Building, Level 4, Singapore 138648, Singapore
|
23
|
|
24
|
Kumar P, Tan Y, Cahan P. Understanding development and stem cells using single cell-based analyses of gene expression. Development 2017; 144:17-32. [PMID: 28049689 DOI: 10.1242/dev.133058] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Indexed: 12/15/2022]
Abstract
In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells.
Affiliation(s)
- Pavithra Kumar
  - Department of Biomedical Engineering, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Yuqi Tan
  - Department of Biomedical Engineering, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Patrick Cahan
  - Department of Biomedical Engineering, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
|
25
|
Papili Gao N, Ud-Dean SMM, Gandrillon O, Gunawan R. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics 2017; 34:258-266. [PMID: 28968704 PMCID: PMC5860204 DOI: 10.1093/bioinformatics/btx575] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Received: 12/06/2016] [Revised: 06/12/2017] [Accepted: 09/13/2017] [Indexed: 11/13/2022]
Abstract
Motivation Single cell transcriptional profiling opens up a new avenue in studying the functional role of cell-to-cell variability in physiological processes. The analysis of single cell expression profiles creates new challenges due to the distributive nature of the data and the stochastic dynamics of gene transcription process. The reconstruction of gene regulatory networks (GRNs) using single cell transcriptional profiles is particularly challenging, especially when directed gene-gene relationships are desired. Results We developed SINCERITIES (SINgle CEll Regularized Inference using TIme-stamped Expression profileS) for the inference of GRNs from single cell transcriptional profiles. We focused on time-stamped cross-sectional expression data, commonly generated from transcriptional profiling of single cells collected at multiple time points after cell stimulation. SINCERITIES recovers directed regulatory relationships among genes by employing regularized linear regression (ridge regression), using temporal changes in the distributions of gene expressions. Meanwhile, the modes of the gene regulations (activation and repression) come from partial correlation analyses between pairs of genes. We demonstrated the efficacy of SINCERITIES in inferring GRNs using in silico time-stamped single cell expression data and single cell transcriptional profiles of THP-1 monocytic human leukemia cells. The case studies showed that SINCERITIES could provide accurate GRN predictions, significantly better than other GRN inference algorithms such as TSNI, GENIE3 and JUMP3. Moreover, SINCERITIES has a low computational complexity and is amenable to problems of extremely large dimensionality. Finally, an application of SINCERITIES to single cell expression data of T2EC chicken erythrocytes pointed to BATF as a candidate novel regulator of erythroid development. 
Availability and implementation MATLAB and R version of SINCERITIES are freely available from the following websites: http://www.cabsel.ethz.ch/tools/sincerities.html and https://github.com/CABSEL/SINCERITIES. The single cell THP-1 and T2EC transcriptional profiles are available from the original publications (Kouno et al., 2013; Richard et al., 2016). The in silico single cell data are available on SINCERITIES websites. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nan Papili Gao
  - Institute for Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- S M Minhaz Ud-Dean
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA
- Olivier Gandrillon
  - Laboratory of Biology and Modelling of the Cell, Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR, INSERM Lyon, France
  - Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Rhône-Alpes, France
- Rudiyanto Gunawan
  - Institute for Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
26
|
Hamey FK, Nestorowa S, Kinston SJ, Kent DG, Wilson NK, Göttgens B. Reconstructing blood stem cell regulatory network models from single-cell molecular profiles. Proc Natl Acad Sci U S A 2017; 114:5822-5829. [PMID: 28584094 PMCID: PMC5468644 DOI: 10.1073/pnas.1610609114] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Indexed: 01/18/2023]
Abstract
Adult blood contains a mixture of mature cell types, each with specialized functions. Single hematopoietic stem cells (HSCs) have been functionally shown to generate all mature cell types for the lifetime of the organism. Differentiation of HSCs toward alternative lineages must be balanced at the population level by the fate decisions made by individual cells. Transcription factors play a key role in regulating these decisions and operate within organized regulatory programs that can be modeled as transcriptional regulatory networks. As dysregulation of single HSC fate decisions is linked to fatal malignancies such as leukemia, it is important to understand how these decisions are controlled on a cell-by-cell basis. Here we developed and applied a network inference method, exploiting the ability to infer dynamic information from single-cell snapshot expression data based on expression profiles of 48 genes in 2,167 blood stem and progenitor cells. This approach allowed us to infer transcriptional regulatory network models that recapitulated differentiation of HSCs into progenitor cell types, focusing on trajectories toward megakaryocyte-erythrocyte progenitors and lymphoid-primed multipotent progenitors. By comparing these two models, we identified and subsequently experimentally validated a difference in the regulation of nuclear factor, erythroid 2 (Nfe2) and core-binding factor, runt domain, alpha subunit 2, translocated to, 3 homolog (Cbfa2t3h) by the transcription factor Gata2. Our approach confirms known aspects of hematopoiesis, provides hypotheses about regulation of HSC differentiation, and is widely applicable to other hierarchical biological systems to uncover regulatory relationships.
Affiliation(s)
- Fiona K Hamey
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
- Sonia Nestorowa
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
- Sarah J Kinston
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
- David G Kent
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
- Nicola K Wilson
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
- Berthold Göttgens
- Department of Haematology, Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge Institute for Medical Research, Cambridge CB2 0XY, United Kingdom
27
Chasman D, Roy S. Inference of cell type specific regulatory networks on mammalian lineages. Curr Opin Syst Biol 2017; 2:130-139. [PMID: 29082337 DOI: 10.1016/j.coisb.2017.04.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]
Abstract
Transcriptional regulatory networks are at the core of establishing cell type specific gene expression programs. In mammalian systems, such regulatory networks are determined by multiple levels of regulation, including by transcription factors, chromatin environment, and three-dimensional organization of the genome. Recent efforts to measure diverse regulatory genomic datasets across multiple cell types and tissues offer unprecedented opportunities to examine the context-specificity and dynamics of regulatory networks at a greater resolution and scale than before. In parallel, numerous computational approaches to analyze these data have emerged that serve as important tools for understanding mammalian cell type specific regulation. In this article, we review recent computational approaches to predict the expression and sequence-based regulators of a gene's expression level and examine long-range gene regulation. We highlight promising approaches, insights gained, and open challenges that need to be overcome to build a comprehensive picture of cell type specific transcriptional regulatory networks.
Affiliation(s)
- Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792
28
Cannoodt R, Saelens W, Saeys Y. Computational methods for trajectory inference from single-cell transcriptomics. Eur J Immunol 2016; 46:2496-2506. [DOI: 10.1002/eji.201646347] [Citation(s) in RCA: 112] [Impact Index Per Article: 12.4] [Received: 07/02/2016] [Revised: 08/30/2016] [Accepted: 09/26/2016] [Indexed: 12/22/2022]
Affiliation(s)
- Robrecht Cannoodt
- Data Mining and Modelling for Biomedicine group, VIB Inflammation Research Center, Ghent, Belgium
- Department of Internal Medicine, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Wouter Saelens
- Data Mining and Modelling for Biomedicine group, VIB Inflammation Research Center, Ghent, Belgium
- Department of Internal Medicine, Ghent University, Ghent, Belgium
- Yvan Saeys
- Data Mining and Modelling for Biomedicine group, VIB Inflammation Research Center, Ghent, Belgium
- Department of Internal Medicine, Ghent University, Ghent, Belgium
29
Lim CY, Wang H, Woodhouse S, Piterman N, Wernisch L, Fisher J, Göttgens B. BTR: training asynchronous Boolean models using single-cell expression data. BMC Bioinformatics 2016; 17:355. [PMID: 27600248 PMCID: PMC5012073 DOI: 10.1186/s12859-016-1235-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 04/21/2016] [Accepted: 09/01/2016] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Rapid technological innovation for the generation of single-cell genomics data presents new challenges and opportunities for bioinformatics analysis. One such area lies in the development of new ways to train gene regulatory networks. Single-cell expression profiling techniques allow the expression states of hundreds of cells to be profiled, but these expression states are typically noisier due to the presence of technical artefacts such as drop-outs. While many algorithms exist to infer a gene regulatory network, very few of them are able to harness the extra expression states present in single-cell expression data without being adversely affected by the substantial technical noise present. RESULTS Here we introduce BTR, an algorithm for training asynchronous Boolean models with single-cell expression data using a novel Boolean state space scoring function. BTR is capable of refining existing Boolean models and reconstructing new Boolean models by improving the match between model prediction and expression data. We demonstrate that the Boolean scoring function performed favourably against the BIC scoring function for Bayesian networks. In addition, we show that BTR outperforms many other network inference algorithms in both bulk and single-cell synthetic expression data. Lastly, we introduce two case studies, in which we use BTR to improve published Boolean models in order to generate potentially new biological insights. CONCLUSIONS BTR provides a novel way to refine or reconstruct Boolean models using single-cell expression data. Boolean models are particularly useful for network reconstruction using single-cell data because they are more robust to the effect of drop-outs. In addition, because BTR does not assume any relationship in the expression states among cells, it is useful for reconstructing a gene regulatory network with as few assumptions as possible. Given the simplicity of Boolean models and the rapid adoption of single-cell genomics by biologists, BTR has the potential to make an impact across many fields of biomedical research. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s12859-016-1235-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Chee Yee Lim
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK
- Huange Wang
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK
- Steven Woodhouse
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK
- Nir Piterman
- Department of Computer Science, University of Leicester, Leicester, UK
- Jasmin Fisher
- Microsoft Research Cambridge, Cambridge, UK; Department of Biochemistry, University of Cambridge, Cambridge, UK
- Berthold Göttgens
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK
30
Espinosa Angarica V, del Sol A. Modeling heterogeneity in the pluripotent state: A promising strategy for improving the efficiency and fidelity of stem cell differentiation. Bioessays 2016; 38:758-68. [PMID: 27321053 PMCID: PMC5094535 DOI: 10.1002/bies.201600103] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023]
Abstract
Pluripotency can be considered a functional characteristic of pluripotent stem cells (PSCs) populations and their niches, rather than a property of individual cells. In this view, individual cells within the population independently adopt a variety of different expression states, maintained by different signaling, transcriptional, and epigenetics regulatory networks. In this review, we propose that generation of integrative network models from single cell data will be essential for getting a better understanding of the regulation of self-renewal and differentiation. In particular, we suggest that the identification of network stability determinants in these integrative models will provide important insights into the mechanisms mediating the transduction of signals from the niche, and how these signals can trigger differentiation. In this regard, the differential use of these stability determinants in subpopulation-specific regulatory networks would mediate differentiation into different cell fates. We suggest that this approach could offer a promising avenue for the development of novel strategies for increasing the efficiency and fidelity of differentiation, which could have a strong impact on regenerative medicine.
Affiliation(s)
- Vladimir Espinosa Angarica
- Luxembourg Center for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, Belvaux, Luxembourg
- Antonio del Sol
- Luxembourg Center for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, Belvaux, Luxembourg
31
Chasman D, Fotuhi Siahpirani A, Roy S. Network-based approaches for analysis of complex biological systems. Curr Opin Biotechnol 2016; 39:157-166. [PMID: 27115495 DOI: 10.1016/j.copbio.2016.04.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 10/24/2015] [Revised: 04/04/2016] [Accepted: 04/05/2016] [Indexed: 12/22/2022]
Abstract
Cells function and respond to changes in their environment by the coordinated activity of their molecular components, including mRNAs, proteins and metabolites. At the heart of proper cellular function are molecular networks connecting these components to process extra-cellular environmental signals and drive dynamic, context-specific cellular responses. Network-based computational approaches aim to systematically integrate measurements from high-throughput experiments to gain a global understanding of cellular function under changing environmental conditions. We provide an overview of recent methodological developments toward solving two major computational problems within this field in the past two years (2013-2015): network reconstruction and network-based interpretation. Looking forward, we envision development of methods that can predict phenotypes with high accuracy as well as provide biologically plausible mechanistic hypotheses.
Affiliation(s)
- Deborah Chasman
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States
- Alireza Fotuhi Siahpirani
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States; Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, United States; Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, United States
32
Gene Regulatory Network Inference Using Time-Stamped Cross-Sectional Single Cell Expression Data. IFAC-PapersOnLine 2016. [DOI: 10.1016/j.ifacol.2016.12.117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
33
Semrau S, van Oudenaarden A. Studying Lineage Decision-Making In Vitro: Emerging Concepts and Novel Tools. Annu Rev Cell Dev Biol 2015; 31:317-45. [DOI: 10.1146/annurev-cellbio-100814-125300] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/09/2022]
Affiliation(s)
- Alexander van Oudenaarden
- Hubrecht Institute, 3584 CT Utrecht, The Netherlands
- University Medical Center Utrecht, Cancer Genomics Netherlands, 3584 CG Utrecht, The Netherlands
34
Yalcin D, Hakguder ZM, Otu HH. Bioinformatics approaches to single-cell analysis in developmental biology. Mol Hum Reprod 2015; 22:182-92. [PMID: 26358759 DOI: 10.1093/molehr/gav050] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/12/2015] [Accepted: 09/04/2015] [Indexed: 12/17/2022] Open
Abstract
Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology.
Affiliation(s)
- Dicle Yalcin
- Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0511, USA
- Zeynep M Hakguder
- Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0511, USA
- Hasan H Otu
- Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0511, USA