1
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Science China Life Sciences 2025; 68:5-102. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
  - National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
  - Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Fangqing Zhao
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China
- Jing-Dong J Han
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
- Qi Liu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
  - National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
  - Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China
- Chen Li
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Chenfei Wang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Tieliu Shi
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
  - Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
  - Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China
2
Fang M, Gorin G, Pachter L. Trajectory inference from single-cell genomics data with a process time model. PLoS Comput Biol 2025; 21:e1012752. [PMID: 39836699 PMCID: PMC11760028 DOI: 10.1371/journal.pcbi.1012752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 01/24/2025] [Accepted: 12/25/2024] [Indexed: 01/23/2025] Open
Abstract
Single-cell transcriptomics experiments provide gene expression snapshots of heterogeneous cell populations across cell states. These snapshots have been used to infer trajectories and dynamic information even without intensive, time-series data by ordering cells according to gene expression similarity. However, while single-cell snapshots sometimes offer valuable insights into dynamic processes, current methods for ordering cells are limited by descriptive notions of "pseudotime" that lack intrinsic physical meaning. Instead of pseudotime, we propose inference of "process time" via a principled modeling approach to formulating trajectories and inferring latent variables corresponding to timing of cells subject to a biophysical process. Our implementation of this approach, called Chronocell, provides a biophysical formulation of trajectories built on cell state transitions. The Chronocell model is identifiable, making parameter inference meaningful. Furthermore, Chronocell can interpolate between trajectory inference, when cell states lie on a continuum, and clustering, when cells cluster into discrete states. By using a variety of datasets ranging from cluster-like to continuous, we show that Chronocell enables us to assess the suitability of datasets and reveals distinct cellular distributions along process time that are consistent with biological process times. We also compare our parameter estimates of degradation rates to those derived from metabolic labeling datasets, thereby showcasing the biophysical utility of Chronocell. Nevertheless, based on performance characterization on simulations, we find that process time inference can be challenging, highlighting the importance of dataset quality and careful model assessment.
Affiliation(s)
- Meichen Fang
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Gennady Gorin
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, United States of America
3
Guo Y, Xiao Z. Constructing the dynamic transcriptional regulatory networks to identify phenotype-specific transcription regulators. Brief Bioinform 2024; 25:bbae542. [PMID: 39451156 PMCID: PMC11503644 DOI: 10.1093/bib/bbae542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 09/25/2024] [Accepted: 10/10/2024] [Indexed: 10/26/2024] Open
Abstract
The transcriptional regulatory network (TRN) is a graph framework that helps understand the complex transcriptional regulation mechanisms in the transcription process. Identifying the phenotype-specific transcription regulators is vital to reveal the functional roles of transcription elements in associating the specific phenotypes. Although many methods have been developed towards detecting the phenotype-specific transcription elements based on the static TRN in the past decade, most of them are not satisfactory for elucidating the phenotype-related functional roles of transcription regulators in multiple levels, as the dynamic characteristics of transcription regulators are usually ignored in static models. In this study, we introduce a novel framework called DTGN to identify the phenotype-specific transcription factors (TFs) and pathways by constructing dynamic TRNs. We first design a graph autoencoder model to integrate the phenotype-oriented time-series gene expression data and static TRN to learn the temporal representations of genes. Then, based on the learned temporal representations of genes, we develop a statistical method to construct a series of dynamic TRNs associated with the development of specific phenotypes. Finally, we identify the phenotype-specific TFs and pathways from the constructed dynamic TRNs. Results from multiple phenotypic datasets show that the proposed DTGN framework outperforms most existing methods in identifying phenotype-specific TFs and pathways. Our framework offers a new approach to exploring the functional roles of transcription regulators that associate with specific phenotypes in a dynamic model.
Affiliation(s)
- Yang Guo
  - School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Zhiqiang Xiao
  - School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
4
Bonham-Carter B, Schiebinger G. Cellular proliferation biases clonal lineage tracing and trajectory inference. Bioinformatics 2024; 40:btae483. [PMID: 39102821 PMCID: PMC11316616 DOI: 10.1093/bioinformatics/btae483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 03/28/2024] [Accepted: 07/30/2024] [Indexed: 08/07/2024] Open
Abstract
MOTIVATION: Lineage tracing and trajectory inference from single-cell RNA-sequencing data hold tremendous potential for uncovering the genetic programs driving development and disease. Single cell datasets are thought to provide an unbiased view on the diverse cellular architecture of tissues. Sampling bias, however, can skew single cell datasets away from the cellular composition they are meant to represent.
RESULTS: We demonstrate a novel form of sampling bias, caused by a statistical phenomenon related to repeated sampling from a growing, heterogeneous population. Relative growth rates of cells influence the probability that they will be sampled in clones observed across multiple time points. We support our probabilistic derivations with a simulation study and an analysis of a real time-course of T-cell development. We find that this bias can impact fate probability predictions, and we explore how to develop trajectory inference methods which are robust to this bias.
AVAILABILITY AND IMPLEMENTATION: Source code for the simulated datasets and to create the figures in this manuscript is freely available in Python at https://github.com/rbonhamcarter/simulate-clones. A Python implementation of the extension of the LineageOT method is freely available at https://github.com/rbonhamcarter/LineageOT/tree/multi-time-clones.
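The growth-rate effect described in this abstract can be illustrated with a deliberately simplified calculation. This is a sketch under assumptions that are not taken from the paper: clones grow deterministically and exponentially, each cell is captured independently with a fixed probability, and sampling at one time point does not deplete the clone. The function names and all parameter values are hypothetical.

```python
import math


def prob_clone_observed(n_cells: float, capture_prob: float) -> float:
    """Probability that at least one cell of a clone is captured."""
    return 1.0 - (1.0 - capture_prob) ** n_cells


def prob_seen_at_both_times(n0, growth_rate, t1, t2, capture_prob):
    """Probability that a clone is sampled at both t1 and t2.

    Assumes deterministic exponential growth and independent sampling
    at each time point (illustrative assumptions only).
    """
    size_t1 = n0 * math.exp(growth_rate * t1)
    size_t2 = n0 * math.exp(growth_rate * t2)
    return (prob_clone_observed(size_t1, capture_prob)
            * prob_clone_observed(size_t2, capture_prob))


# Two hypothetical clones that differ only in growth rate.
slow = prob_seen_at_both_times(n0=1, growth_rate=0.2, t1=2.0, t2=4.0, capture_prob=0.05)
fast = prob_seen_at_both_times(n0=1, growth_rate=0.8, t1=2.0, t2=4.0, capture_prob=0.05)
print(f"slow clone: {slow:.4f}, fast clone: {fast:.4f}")
```

Under these assumptions the faster-growing clone is more likely to appear at both time points, so multi-time clones over-represent proliferative cells.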
Affiliation(s)
- Becca Bonham-Carter
  - Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Geoffrey Schiebinger
  - Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
5
Hong Y, Li H, Long C, Liang P, Zhou J, Zuo Y. An increment of diversity method for cell state trajectory inference of time-series scRNA-seq data. Fundamental Research 2024; 4:770-776. [PMID: 39156571 PMCID: PMC11330101 DOI: 10.1016/j.fmre.2024.01.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 08/29/2023] [Accepted: 01/03/2024] [Indexed: 08/20/2024] Open
Abstract
With the increasing emergence of time-series single-cell RNA sequencing (scRNA-seq) data, inferring developmental trajectories by connecting transcriptomically similar cell states (i.e., cell types or clusters) has become a major challenge. Most existing computational methods are designed for individual cells and do not take into account the available time series information. We present IDTI, based on the Increment of Diversity for Trajectory Inference, which combines time series information and the minimum increment of diversity method to infer the cell state trajectory of time-series scRNA-seq data. We apply IDTI to simulated data and three real, diverse tissue development datasets, and compare it with six other commonly used trajectory inference methods in terms of topology similarity and branching accuracy. The results show that the IDTI method accurately constructs the cell state trajectory without requiring starting cells. In the performance test, we further demonstrate that IDTI has the advantages of high accuracy and strong robustness.
Affiliation(s)
- Chunshen Long
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010020, China
- Pengfei Liang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010020, China
- Jian Zhou
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010020, China
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010020, China
6
Zhang K, Zhu J, Kong D, Zhang Z. Modeling single cell trajectory using forward-backward stochastic differential equations. PLoS Comput Biol 2024; 20:e1012015. [PMID: 38620017 PMCID: PMC11018287 DOI: 10.1371/journal.pcbi.1012015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 03/22/2024] [Indexed: 04/17/2024] Open
Abstract
Recent advances in single-cell sequencing technology have provided opportunities for mathematical modeling of dynamic developmental processes at the single-cell level, such as inferring developmental trajectories. Optimal transport has emerged as a promising theoretical framework for this task by computing pairings between cells from different time points. However, optimal transport methods have limitations in capturing nonlinear trajectories, as they are static and can only infer linear paths between endpoints. In contrast, stochastic differential equations (SDEs) offer a dynamic and flexible approach that can model non-linear trajectories, including the shape of the path. Nevertheless, existing SDE methods often rely on numerical approximations that can lead to inaccurate inferences, deviating from true trajectories. To address this challenge, we propose a novel approach combining forward-backward stochastic differential equations (FBSDE) with a refined approximation procedure. Our FBSDE model integrates the forward and backward movements of two SDEs in time, aiming to capture the underlying dynamics of single-cell developmental trajectories. Through comprehensive benchmarking on multiple scRNA-seq datasets, we demonstrate the superior performance of FBSDE compared to other methods, highlighting its efficacy in accurately inferring developmental trajectories.
Affiliation(s)
- Kevin Zhang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Junhao Zhu
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Dehan Kong
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Zhaolei Zhang
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
7
Pan X, Zhang X. Studying temporal dynamics of single cells: expression, lineage and regulatory networks. Biophys Rev 2024; 16:57-67. [PMID: 38495440 PMCID: PMC10937865 DOI: 10.1007/s12551-023-01090-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/17/2023] [Accepted: 06/27/2023] [Indexed: 03/19/2024] Open
Abstract
Learning how multicellular organs are developed from single cells to different cell types is a fundamental problem in biology. With the high-throughput scRNA-seq technology, computational methods have been developed to reveal the temporal dynamics of single cells from transcriptomic data, from phenomena on cell trajectories to the underlying mechanism that formed the trajectory. There are several distinct families of computational methods including Trajectory Inference (TI), Lineage Tracing (LT), and Gene Regulatory Network (GRN) Inference which are involved in such studies. This review summarizes these computational approaches which use scRNA-seq data to study cell differentiation and cell fate specification as well as the advantages and limitations of different methods. We further discuss how GRNs can potentially affect cell fate decisions and trajectory structures.
Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-023-01090-5.
Affiliation(s)
- Xinhai Pan
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA
- Xiuwei Zhang
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA
8
Zheng Y, Schupp JC, Adams T, Clair G, Justet A, Ahangari F, Yan X, Hansen P, Carlon M, Cortesi E, Vermant M, Vos R, De Sadeleer LJ, Rosas IO, Pineda R, Sembrat J, Königshoff M, McDonough JE, Vanaudenaerde BM, Wuyts WA, Kaminski N, Ding J. Unagi: Deep Generative Model for Deciphering Cellular Dynamics and In-Silico Drug Discovery in Complex Diseases. Research Square 2023:rs.3.rs-3676579. [PMID: 38196613 PMCID: PMC10775382 DOI: 10.21203/rs.3.rs-3676579/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
Human diseases are characterized by intricate cellular dynamics. Single-cell sequencing provides critical insights, yet a persistent gap remains in computational tools for detailed disease progression analysis and targeted in-silico drug interventions. Here, we introduce UNAGI, a deep generative neural network tailored to analyze time-series single-cell transcriptomic data. This tool captures the complex cellular dynamics underlying disease progression, enhancing drug perturbation modeling and discovery. When applied to a dataset from patients with Idiopathic Pulmonary Fibrosis (IPF), UNAGI learns disease-informed cell embeddings that sharpen our understanding of disease progression, leading to the identification of potential therapeutic drug candidates. Validation via proteomics reveals the accuracy of UNAGI's cellular dynamics analyses, and the use of the Fibrotic Cocktail treated human Precision-cut Lung Slices confirms UNAGI's predictions that Nifedipine, an antihypertensive drug, may have antifibrotic effects on human tissues. UNAGI's versatility extends to other diseases, including a COVID dataset, demonstrating adaptability and confirming its broader applicability in decoding complex cellular dynamics beyond IPF, amplifying its utility in the quest for therapeutic solutions across diverse pathological landscapes.
Affiliation(s)
- Yumin Zheng
  - Quantitative Life Sciences, Faculty of Medicine & Health Sciences, McGill University, Montreal, QC, Canada
  - Meakins-Christie Laboratories, Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Jonas C. Schupp
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Taylor Adams
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Geremy Clair
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Aurelien Justet
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Farida Ahangari
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Xiting Yan
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Paul Hansen
  - Meakins-Christie Laboratories, Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Marianne Carlon
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Emanuela Cortesi
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Marie Vermant
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Robin Vos
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Laurens J. De Sadeleer
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Ivan O Rosas
  - Division of Pulmonary, Critical Care and Sleep Medicine, Baylor College of Medicine, Houston, TX, USA
- Ricardo Pineda
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- John Sembrat
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Melanie Königshoff
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- John E. McDonough
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Bart M. Vanaudenaerde
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Wim A. Wuyts
  - Laboratory of Respiratory Diseases and Thoracic Surgery (BREATHE), Department of Chronic Diseases and Metabolism, KU Leuven, Belgium
- Naftali Kaminski
  - Pulmonary, Critical Care and Sleep Medicine, Yale University, School of Medicine, New Haven, CT, United States
- Jun Ding
  - Quantitative Life Sciences, Faculty of Medicine & Health Sciences, McGill University, Montreal, QC, Canada
  - Meakins-Christie Laboratories, Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Mila - Quebec AI Institute, Montreal, QC, Canada
9
Mao S, Liu J, Zhao W, Zhou X. LVPT: Lazy Velocity Pseudotime Inference Method. Biomolecules 2023; 13:1242. [PMID: 37627306 PMCID: PMC10452358 DOI: 10.3390/biom13081242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 08/09/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
The emergence of RNA velocity has enriched our understanding of the dynamic transcriptional landscape within individual cells. In light of this breakthrough, we embarked on integrating RNA velocity with cellular pseudotime inference, aiming to improve the prediction of cell orders along biological trajectories beyond existing methods. Here, we developed LVPT, a novel method for pseudotime and trajectory inference. LVPT introduces a lazy probability to indicate the probability that the cell stays in the original state and calculates the transition matrix based on RNA velocity to provide the probability and direction of cell differentiation. LVPT shows better or comparable performance in pseudotime inference compared with other existing methods on both simulated datasets with different structures and real datasets. The validation results were consistent with prior knowledge, indicating that LVPT is an accurate and efficient method for pseudotime inference.
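The idea of a lazy probability combined with a velocity-based transition matrix can be sketched as a standard lazy random walk. This is only an illustration, not the LVPT implementation: the abstract does not give the exact formula, so the mixing rule `lazy_prob * I + (1 - lazy_prob) * P`, the function names, and the toy affinity values below are all assumptions.

```python
def row_normalize(weights):
    """Turn non-negative affinities into a row-stochastic matrix.

    Assumes every row has a positive sum.
    """
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total for w in row])
    return normalized


def lazy_transition_matrix(weights, lazy_prob):
    """Mix a transition matrix with the identity (a lazy random walk).

    With probability lazy_prob a cell stays in its current state;
    otherwise it moves according to the normalized affinities.
    """
    base = row_normalize(weights)
    n = len(base)
    return [
        [lazy_prob * (1.0 if i == j else 0.0) + (1.0 - lazy_prob) * base[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical velocity-derived affinities between three cells.
velocity_weights = [
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 3.0],
    [0.5, 0.5, 0.0],
]
P = lazy_transition_matrix(velocity_weights, lazy_prob=0.3)
for row in P:
    print([round(p, 3) for p in row])
```

Each row of the result still sums to one, so it remains a valid transition matrix while guaranteeing a minimum probability of staying in place.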
Affiliation(s)
- Shuainan Mao
  - The Department of Biotherapy and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
  - Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Jiajia Liu
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Weiling Zhao
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77054, USA
10
Abstract
Dimensionality reduction is standard practice for filtering noise and identifying relevant features in large-scale data analyses. In biology, single-cell genomics studies typically begin with reduction to 2 or 3 dimensions to produce "all-in-one" visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative exploratory analysis. However, there is little theoretical support for this practice, and we show that extreme dimension reduction, from hundreds or thousands of dimensions to 2, inevitably induces significant distortion of high-dimensional datasets. We therefore examine the practical implications of low-dimensional embedding of single-cell data and find that extensive distortions and inconsistent practices make such embeddings counter-productive for exploratory, biological analyses. In lieu of this, we discuss alternative approaches for conducting targeted embedding and feature exploration to enable hypothesis-driven biological discovery.
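The claim that reduction to two dimensions distorts distances can be illustrated with a toy experiment. This sketch is not the authors' analysis and makes simplifying assumptions: the data are synthetic Gaussian points in 50 dimensions, and the "embedding" simply keeps the first two coordinates (an orthogonal projection) rather than using t-SNE, UMAP or PCA. All names and sizes are hypothetical.

```python
import math
import random


def pairwise_distance_ratios(points, kept_dims=2):
    """Ratio of projected to original Euclidean distance for every pair."""
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            full = math.dist(points[i], points[j])
            low = math.dist(points[i][:kept_dims], points[j][:kept_dims])
            ratios.append(low / full)
    return ratios


rng = random.Random(0)  # fixed seed so the run is reproducible
points = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(40)]
ratios = pairwise_distance_ratios(points)
print(f"min ratio: {min(ratios):.3f}, max ratio: {max(ratios):.3f}")
```

A distortion-free embedding would give a ratio of 1 for every pair. Here the ratios never exceed 1 and vary widely between pairs, so relative distances are not preserved.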
Collapse
Affiliation(s)
- Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, United States of America
11
Zeng Q, Mousa M, Nadukkandy AS, Franssens L, Alnaqbi H, Alshamsi FY, Safar HA, Carmeliet P. Understanding tumour endothelial cell heterogeneity and function from single-cell omics. Nat Rev Cancer 2023:10.1038/s41568-023-00591-5. [PMID: 37349410 DOI: 10.1038/s41568-023-00591-5] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Accepted: 05/22/2023] [Indexed: 06/24/2023]
Abstract
Anti-angiogenic therapies (AATs) are used to treat different types of cancers. However, their success is limited owing to insufficient efficacy and resistance. Recently, single-cell omics studies of tumour endothelial cells (TECs) have provided new mechanistic insight. Here, we overview the heterogeneity of human TECs of all tumour types studied to date, at the single-cell level. Notably, most human tumour types contain varying numbers but only a small population of angiogenic TECs, the presumed targets of AATs, possibly contributing to the limited efficacy of and resistance to AATs. In general, TECs are heterogeneous within and across all tumour types, but comparing TEC phenotypes across tumours is currently challenging, owing to the lack of a uniform nomenclature for endothelial cells and consistent single-cell analysis protocols, urgently raising the need for a more consistent approach. Nonetheless, across most tumour types, universal TEC markers (ACKR1, PLVAP and IGFBP3) can be identified. Besides angiogenesis, biological processes such as immunomodulation and extracellular matrix organization are among the most commonly predicted enriched signatures of TECs across different tumour types. Although angiogenesis and extracellular matrix targets have been considered for AAT (without the hoped success), the immunomodulatory properties of TECs have not been fully considered as a novel anticancer therapeutic approach. Therefore, we also discuss progress, limitations, solutions and novel targets for AAT development.
Affiliation(s)
- Qun Zeng
- Laboratory of Angiogenesis and Vascular Metabolism, Department of Oncology, KU Leuven and Center for Cancer Biology, VIB, Leuven, Belgium
- Mira Mousa
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Aisha Shigna Nadukkandy
- Laboratory of Angiogenesis and Vascular Metabolism, Department of Oncology, KU Leuven and Center for Cancer Biology, VIB, Leuven, Belgium
- Laboratory of Angiogenesis and Vascular Heterogeneity, Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Lies Franssens
- Laboratory of Angiogenesis and Vascular Metabolism, Department of Oncology, KU Leuven and Center for Cancer Biology, VIB, Leuven, Belgium
- Halima Alnaqbi
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Fatima Yousif Alshamsi
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Habiba Al Safar
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Peter Carmeliet
- Laboratory of Angiogenesis and Vascular Metabolism, Department of Oncology, KU Leuven and Center for Cancer Biology, VIB, Leuven, Belgium
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Laboratory of Angiogenesis and Vascular Heterogeneity, Department of Biomedicine, Aarhus University, Aarhus, Denmark
12
Lan T, Hutvagner G, Zhang X, Liu T, Wong L, Li J. Density-based detection of cell transition states to construct disparate and bifurcating trajectories. Nucleic Acids Res 2022; 50:e122. [PMID: 36124665 PMCID: PMC9757071 DOI: 10.1093/nar/gkac785] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/06/2022] [Revised: 08/22/2022] [Accepted: 09/01/2022] [Indexed: 12/24/2022] Open
Abstract
Tree- and linear-shaped cell differentiation trajectories have been widely observed in developmental biology and can also be inferred through computational methods from single-cell RNA-sequencing datasets. However, trajectories with complicated topologies such as loops, disparate lineages and bifurcating hierarchies remain difficult to infer accurately. Here, we introduce a density-based trajectory inference method capable of constructing diverse shapes of topological patterns including the most intriguing bifurcations. The novelty of our method lies in one step that exploits overlapping probability distributions to identify transition states of cells for determining connectability between cell clusters, and another step that infers a stable trajectory through a base-topology guided iterative fitting. Our method precisely reconstructed various benchmark reference trajectories. As a case study to demonstrate practical usefulness, our method was tested on single-cell RNA sequencing profiles of blood cells of SARS-CoV-2-infected patients. We not only rediscovered the linear trajectory bridging the transition from IgM plasmablast cells to developing neutrophils, but also found a previously undiscovered lineage which can be rigorously supported by differentially expressed gene analysis.
Affiliation(s)
- Tian Lan
- Data Science Institute and School of Computer Science, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Gyorgy Hutvagner
- School of Biomedical Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Xuan Zhang
- Data Science Institute and School of Computer Science, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Tao Liu
- Children’s Cancer Institute Australia for Medical Research, Randwick, NSW 2031, Australia
- Limsoon Wong
- School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore
- Jinyan Li
- Data Science Institute and School of Computer Science, University of Technology Sydney, Ultimo, NSW 2007, Australia
13
Li D, Velazquez JJ, Ding J, Hislop J, Ebrahimkhani MR, Bar-Joseph Z. TraSig: inferring cell-cell interactions from pseudotime ordering of scRNA-Seq data. Genome Biol 2022; 23:73. [PMID: 35255944 PMCID: PMC8900372 DOI: 10.1186/s13059-022-02629-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/20/2021] [Accepted: 02/09/2022] [Indexed: 02/08/2023] Open
Abstract
A major advantage of single cell RNA-sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells. Here we present TraSig, a computational method for improving the inference of cell-cell interactions in scRNA-Seq studies that utilizes the dynamic information to identify significant ligand-receptor pairs with similar trajectories, which in turn are used to score interacting cell clusters. We applied TraSig to several scRNA-Seq datasets and obtained unique predictions that improve upon those identified by prior methods. Functional experiments validate the ability of TraSig to identify novel signaling interactions that impact vascular development in liver organoids. Software: https://github.com/doraadong/TraSig
Affiliation(s)
- Dongshunyi Li
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, 15213, PA, USA
- Jeremy J Velazquez
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, 15261, PA, USA
- Jun Ding
- Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, Montreal, H4A 3J1, Quebec, Canada
- Joshua Hislop
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, 15261, PA, USA
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, 15261, PA, USA
- Mo R Ebrahimkhani
- Department of Pathology, School of Medicine, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Pittsburgh Liver Research Center, University of Pittsburgh, Pittsburgh, 15261, PA, USA
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, 15261, PA, USA
- McGowan Institute for Regenerative Medicine, University of Pittsburgh, Pittsburgh, 15219, PA, USA
- Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, 15213, PA, USA
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, 15213, PA, USA
14
Ding J, Sharon N, Bar-Joseph Z. Temporal modelling using single-cell transcriptomics. Nat Rev Genet 2022; 23:355-368. [PMID: 35102309 DOI: 10.1038/s41576-021-00444-7] [Citation(s) in RCA: 84] [Impact Index Per Article: 28.0] [Accepted: 12/14/2021] [Indexed: 12/16/2022]
Abstract
Methods for profiling genes at the single-cell level have revolutionized our ability to study several biological processes and systems including development, differentiation, response programmes and disease progression. In many of these studies, cells are profiled over time in order to infer dynamic changes in cell states and types, sets of expressed genes, active pathways and key regulators. However, time-series single-cell RNA sequencing (scRNA-seq) also raises several new analysis and modelling issues. These issues range from determining when and how deep to profile cells, linking cells within and between time points, learning continuous trajectories, and integrating bulk and single-cell data for reconstructing models of dynamic networks. In this Review, we discuss several approaches for the analysis and modelling of time-series scRNA-seq, highlighting their steps, key assumptions, and the types of data and biological questions they are most appropriate for.
15
Jiang Q, Zhang S, Wan L. Dynamic inference of cell developmental complex energy landscape from time series single-cell transcriptomic data. PLoS Comput Biol 2022; 18:e1009821. [PMID: 35073331 PMCID: PMC8812873 DOI: 10.1371/journal.pcbi.1009821] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/22/2021] [Revised: 02/03/2022] [Accepted: 01/10/2022] [Indexed: 12/27/2022] Open
Abstract
Time series single-cell RNA sequencing (scRNA-seq) data are emerging. However, dynamic inference of an evolving cell population from time series scRNA-seq data is challenging owing to the stochasticity and nonlinearity of the underlying biological processes. This calls for the development of mathematical models and methods capable of reconstructing cellular dynamic transition processes and uncovering the nonlinear cell-cell interactions. In this study, we present GraphFP, a model and dynamic inference framework based on a nonlinear Fokker-Planck equation on graph, with the aim of reconstructing the cell state-transition complex potential energy landscape from time series single-cell transcriptomic data. The free energy of our model explicitly takes into account the cell-cell interactions in a nonlinear quadratic term. We then recast the model inference problem in the form of a dynamic optimal transport framework and solve it efficiently with the adjoint method of optimal control. We evaluated GraphFP on the time series scRNA-seq data set of embryonic murine cerebral cortex development. We illustrated that it 1) reconstructs cell state potential energy, which is a measure of cellular differentiation potency, 2) faithfully charts the probability flows between paired cell states over the dynamic processes of cell differentiation, and 3) accurately quantifies the stochastic dynamics of cell type frequencies on probability simplex in continuous time. We also illustrated that GraphFP is robust in terms of cluster labelling with different resolutions, as well as parameter choices. Meanwhile, GraphFP provides a model-based approach to delineate the cell-cell interactions that drive cell differentiation. GraphFP software is available at https://github.com/QiJiang-QJ/GraphFP.
Dynamic inference of cell development processes from time series scRNA-seq data is a major challenge. Here, we present GraphFP, a coherent computational framework that simultaneously reconstructs the cell state-transition complex potential energy landscape and infers cell-cell interactions from time series single-cell transcriptomic data. Based on the mathematical framework of nonlinear Fokker-Planck equation on graph, GraphFP models the stochastic dynamics of the cell state/type frequencies on probability simplex in continuous time, where the free energy with a nonlinear quadratic interaction term is employed to characterize cell-cell interactions. We formulate the model inference problem in the form of a dynamic optimal transport framework and solve it efficiently with the celebrated adjoint method. GraphFP allows for 1) reconstructing cell state potential energy, which is a measure of cellular differentiation potency, 2) charting the probability flows between paired cell states over dynamic processes, 3) quantifying the stochastic dynamics of cell type frequencies on probability simplex in continuous time, and 4) delineating cell-cell interactions that drive cell differentiation. We show how GraphFP can be used to faithfully reveal and accurately quantify the cell development processes using the embryonic murine cerebral cortex development time series scRNA-seq dataset.
Affiliation(s)
- Qi Jiang
- NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Shuo Zhang
- NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Lin Wan
- NCMIS, LSC, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
16
Ding J, Lugo-Martinez J, Yuan Y, Huang J, Hume AJ, Suder EL, Mühlberger E, Kotton DN, Bar-Joseph Z. Reconstructed signaling and regulatory networks identify potential drugs for SARS-CoV-2 infection. bioRxiv 2021:2020.06.01.127589. [PMID: 33083801 PMCID: PMC7574259 DOI: 10.1101/2020.06.01.127589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
Several molecular datasets have been recently compiled to characterize the activity of SARS-CoV-2 within human cells. Here we extend computational methods to integrate several different types of sequence, functional and interaction data to reconstruct networks and pathways activated by the virus in host cells. We identify key proteins in these networks and further intersect them with genes differentially expressed at conditions that are known to impact viral activity. Several of the top ranked genes do not directly interact with virus proteins. We experimentally tested treatments for a number of the predicted targets. We show that blocking one of the predicted indirect targets significantly reduces viral loads in stem cell-derived alveolar epithelial type II cells (iAT2s).
Affiliation(s)
- Jun Ding
- Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, Montreal, Quebec, H4A 3J1, Canada
- Jose Lugo-Martinez
- Department of Computer Science, University of Puerto Rico, San Juan, Puerto Rico, 00925, USA
- Ye Yuan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Jessie Huang
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Adam J. Hume
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Ellen L. Suder
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Elke Mühlberger
- National Emerging Infectious Diseases Laboratory (NEIDL), Boston University, Boston, MA 02118, USA
- Department of Microbiology, Boston University School of Medicine, Boston, MA 02118, USA
- Darrell N. Kotton
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, 15213, USA
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, 15213, USA
17
Zhang S, Afanassiev A, Greenstreet L, Matsumoto T, Schiebinger G. Optimal transport analysis reveals trajectories in steady-state systems. PLoS Comput Biol 2021; 17:e1009466. [PMID: 34860824 PMCID: PMC8691649 DOI: 10.1371/journal.pcbi.1009466] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/09/2021] [Revised: 12/21/2021] [Accepted: 09/20/2021] [Indexed: 01/25/2023] Open
Abstract
Understanding how cells change their identity and behaviour in living systems is an important question in many fields of biology. The problem of inferring cell trajectories from single-cell measurements has been a major topic in the single-cell analysis community, with different methods developed for equilibrium and non-equilibrium systems (e.g. haematopoiesis vs. embryonic development). We show that optimal transport analysis, a technique originally designed for analysing time-courses, may also be applied to infer cellular trajectories from a single snapshot of a population in equilibrium. Therefore, optimal transport provides a unified approach to inferring trajectories that is applicable to both stationary and non-stationary systems. Our method, StationaryOT, is mathematically motivated in a natural way from the hypothesis of a Waddington's epigenetic landscape. We implement StationaryOT as a software package and demonstrate its efficacy in applications to simulated data as well as single-cell data from Arabidopsis thaliana root development.
Affiliation(s)
- Stephen Zhang
- Department of Mathematics, University of British Columbia, Vancouver, Canada
- Anton Afanassiev
- Department of Mathematics, University of British Columbia, Vancouver, Canada
- Laura Greenstreet
- Department of Mathematics, University of British Columbia, Vancouver, Canada
- Tetsuya Matsumoto
- Department of Mathematics, University of British Columbia, Vancouver, Canada
18
Ding J, Alavi A, Ebrahimkhani MR, Bar-Joseph Z. Computational tools for analyzing single-cell data in pluripotent cell differentiation studies. Cell Rep Methods 2021; 1:100087. [PMID: 35474899 PMCID: PMC9017169 DOI: 10.1016/j.crmeth.2021.100087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 04/18/2023]
Abstract
Single-cell technologies are revolutionizing the ability of researchers to infer the causes and results of biological processes. Although several studies of pluripotent cell differentiation have recently utilized single-cell sequencing data, other aspects related to the optimization of differentiation protocols, their validation, robustness, and usage are still not taking full advantage of single-cell technologies. In this review, we focus on computational approaches for the analysis of single-cell omics and imaging data and discuss their use to address many of the major challenges involved in the development, validation, and use of cells obtained from pluripotent cell differentiation.
Affiliation(s)
- Jun Ding
- Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, 1001 Decarie Boulevard, Montreal QC H4A 3J1, Canada
- Amir Alavi
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
- Mo R. Ebrahimkhani
- Department of Pathology, School of Medicine, University of Pittsburgh, 3550 Terrace Street, Pittsburgh, PA 15261, USA
- Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
19
Shen S, Sun Y, Matsumoto M, Shim WJ, Sinniah E, Wilson SB, Werner T, Wu Z, Bradford ST, Hudson J, Little MH, Powell J, Nguyen Q, Palpant NJ. Integrating single-cell genomics pipelines to discover mechanisms of stem cell differentiation. Trends Mol Med 2021; 27:1135-1158. [PMID: 34657800 DOI: 10.1016/j.molmed.2021.09.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/07/2021] [Revised: 09/19/2021] [Accepted: 09/22/2021] [Indexed: 12/12/2022]
Abstract
Pluripotent stem cells underpin a growing sector that leverages their differentiation potential for research, industry, and clinical applications. This review evaluates the landscape of methods in single-cell transcriptomics that are enabling accelerated discovery in stem cell science. We focus on strategies for scaling stem cell differentiation through multiplexed single-cell analyses, for evaluating molecular regulation of cell differentiation using new analysis algorithms, and methods for integration and projection analysis to classify and benchmark stem cell derivatives against in vivo cell types. By discussing the available methods, comparing their strengths, and illustrating strategies for developing integrated analysis pipelines, we provide user considerations to inform their implementation and interpretation.
Affiliation(s)
- Sophie Shen
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Yuliangzi Sun
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Maika Matsumoto
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Woo Jun Shim
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Enakshi Sinniah
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Sean B Wilson
- Murdoch Children's Research Institute, Melbourne, Australia
- Tessa Werner
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Zhixuan Wu
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- James Hudson
- QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Melissa H Little
- Murdoch Children's Research Institute, Melbourne, Australia; Department of Pediatrics, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia
- Joseph Powell
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, Australia; UNSW Cellular Genomics Futures Institute, UNSW, Sydney, Australia
- Quan Nguyen
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Nathan J Palpant
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
20
Zhao C, Xiu W, Hua Y, Zhang N, Zhang Y. CStreet: a computed Cell State trajectory inference method for time-series single-cell RNA sequencing data. Bioinformatics 2021; 37:3774-3780. [PMID: 34196686 DOI: 10.1093/bioinformatics/btab488] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/27/2020] [Revised: 06/24/2021] [Accepted: 06/30/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: The increasing amount of time-series single-cell RNA sequencing (scRNA-seq) data raises the key issue of connecting cell states (i.e., cell clusters or cell types) to obtain the continuous temporal dynamics of transcription, which can highlight the unified biological mechanisms involved in cell state transitions. However, most existing trajectory methods are specifically designed for individual cells, so they can hardly meet the needs of accurately inferring the trajectory topology of the cell state, which usually contains cells assigned to different branches. RESULTS: Here, we present CStreet, a computed Cell State trajectory inference method for time-series scRNA-seq data. It uses time-series information to construct the k-nearest neighbors connections between cells within each time point and between adjacent time points. Then, CStreet estimates the connection probabilities of the cell states and visualizes the trajectory, which may include multiple starting points and paths, using a force-directed graph. By comparing the performance of CStreet with that of six commonly used cell state trajectory reconstruction methods on simulated data and real data, we demonstrate the high accuracy and high tolerance of CStreet. AVAILABILITY AND IMPLEMENTATION: CStreet is written in Python and freely available on the web at https://github.com/TongjiZhanglab/CStreet and https://doi.org/10.5281/zenodo.4483205. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
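The neighbor-graph step described in this abstract can be sketched as follows. This is a minimal illustration and not the CStreet implementation: the function name `knn_edges`, the tuple layout of `cells`, the use of plain Euclidean distance, and the choice to link each time point only forward to the next one are all assumptions made here. The later steps (connection probabilities between cell states and the force-directed layout) are not shown.

```python
import math
from collections import defaultdict


def knn_edges(cells, k=1):
    """Link each cell to its k nearest cells in the same time point and to
    its k nearest cells in the next time point.

    cells: list of (cell_id, time_point, expression_vector) tuples
           (a hypothetical layout chosen for this sketch).
    Returns a set of directed (source_id, target_id) edges.
    """
    by_time = defaultdict(list)
    for cell in cells:
        by_time[cell[1]].append(cell)
    times = sorted(by_time)

    edges = set()
    for i, t in enumerate(times):
        # Candidate pools: the same time point, plus the next one if it exists.
        pools = [by_time[t]]
        if i + 1 < len(times):
            pools.append(by_time[times[i + 1]])
        for cell_id, _, vector in by_time[t]:
            for pool in pools:
                others = [c for c in pool if c[0] != cell_id]
                # Euclidean distance is an assumption, not CStreet's metric.
                others.sort(key=lambda c: math.dist(vector, c[2]))
                edges.update((cell_id, c[0]) for c in others[:k])
    return edges


# Toy data: five cells on a line, sampled at three time points.
cells = [
    ("a", 0, (0.0, 0.0)),
    ("b", 0, (0.1, 0.0)),
    ("c", 1, (1.0, 0.0)),
    ("d", 1, (1.1, 0.0)),
    ("e", 2, (2.0, 0.0)),
]
edges = knn_edges(cells, k=1)
print(sorted(edges))
```

Restricting candidates to the same or the adjacent time point is what keeps a cell at time 0 from being linked directly to a cell at time 2, even when the two happen to be close in expression space.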
Affiliation(s)
- Chengchen Zhao
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Science and Technology, Tongji University, Shanghai, 200092, China
- Wenchao Xiu
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Science and Technology, Tongji University, Shanghai, 200092, China
- Yuwei Hua
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Science and Technology, Tongji University, Shanghai, 200092, China
- Naiqian Zhang
- School of Mathematics and Statistics, Shandong University at Weihai, Weihai, 264209, China
- Yong Zhang
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Science and Technology, Tongji University, Shanghai, 200092, China
21
Cahan P, Cacchiarelli D, Dunn SJ, Hemberg M, de Sousa Lopes SMC, Morris SA, Rackham OJL, Del Sol A, Wells CA. Computational Stem Cell Biology: Open Questions and Guiding Principles. Cell Stem Cell 2021; 28:20-32. [PMID: 33417869 PMCID: PMC7799393 DOI: 10.1016/j.stem.2020.12.012] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/13/2022]
Abstract
Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more to join this exciting area.
Affiliation(s)
- Patrick Cahan
- Institute for Cell Engineering, Department of Biomedical Engineering, Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
- Davide Cacchiarelli
- Telethon Institute of Genetics and Medicine (TIGEM), Armenise/Harvard Laboratory of Integrative Genomics, Pozzuoli, Italy; Department of Translational Medicine, University of Naples "Federico II," Naples, Italy
- Sara-Jane Dunn
- DeepMind, 14-18 Handyside Street, London N1C 4DN, UK; Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Puddicombe Way, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Samantha A Morris
- Department of Developmental Biology, Department of Genetics, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Owen J L Rackham
- Centre for Computational Biology and The Program for Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, Singapore, Singapore
- Antonio Del Sol
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, Belvaux 4366, Luxembourg; CIC bioGUNE, Bizkaia Technology Park, 801 Building, 48160 Derio, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao 48013, Spain
- Christine A Wells
- Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, VIC 3010, Australia
22
Tran TN, Bader GD. Tempora: Cell trajectory inference using time-series single-cell RNA sequencing data. PLoS Comput Biol 2020; 16:e1008205. [PMID: 32903255 PMCID: PMC7505465 DOI: 10.1371/journal.pcbi.1008205] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 12/04/2019] [Revised: 09/21/2020] [Accepted: 07/29/2020] [Indexed: 12/21/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) can map cell types, states and transitions during dynamic biological processes such as tissue development and regeneration. Many trajectory inference methods have been developed to order cells by their progression through a dynamic process. However, when time series data is available, most of these methods do not consider the available time information when ordering cells and are instead designed to work only on a single scRNA-seq data snapshot. We present Tempora, a novel cell trajectory inference method that orders cells using time information from time-series scRNA-seq data. In performance comparison tests, Tempora inferred known developmental lineages from three diverse tissue development time series data sets, beating state of the art methods in accuracy and speed. Tempora works at the level of cell clusters (types) and uses biological pathway information to help identify cell type relationships. This approach increases gene expression signal from single cells, processing speed, and interpretability of the inferred trajectory. Our results demonstrate the utility of a combination of time and pathway information to supervise trajectory inference for scRNA-seq based analysis.
Single-cell RNA sequencing (scRNA-seq) enables an unparalleled ability to map the heterogeneity of dynamic multicellular processes, such as tissue development, tumor growth, wound response and repair, and inflammation. Multiple methods have been developed to order cells along a pseudotime axis that represents a trajectory through such processes using the concept that cells that are closely related in a lineage will have similar transcriptomes. However, time series experiments provide another useful information source to order cells, from earlier to later time point. By introducing a novel use of biological pathway prior information, our Tempora algorithm improves the accuracy and speed of cell trajectory inference from time-series scRNA-seq data as measured by reconstructing known developmental trajectories from three diverse data sets. By analyzing scRNA-seq data at the cluster (cell type) level instead of at the single-cell level and by using known pathway information, Tempora amplifies gene expression signals from one cell using similar cells in a cluster and similar genes within a pathway. This approach also reduces computational time and resources needed to analyze large data sets because it works with a relatively small number of clusters instead of a potentially large number of cells. Finally, it eases interpretation, via operating on a relatively small number of clusters which usually represent known cell types, as well as by identifying time-dependent pathways. Tempora is useful for finding novel insights in dynamic processes.
Affiliation(s)
- Thinh N. Tran
- Department of Molecular Genetics, University of Toronto, Ontario, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Ontario, Canada
| | - Gary D. Bader
- Department of Molecular Genetics, University of Toronto, Ontario, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Ontario, Canada
- * E-mail:
| |
|
23
|
Zafar H, Lin C, Bar-Joseph Z. Single-cell lineage tracing by integrating CRISPR-Cas9 mutations with transcriptomic data. Nat Commun 2020; 11:3055. [PMID: 32546686 PMCID: PMC7298005 DOI: 10.1038/s41467-020-16821-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 08/28/2019] [Accepted: 05/25/2020] [Indexed: 02/07/2023] Open
Abstract
Recent studies combine two novel technologies, single-cell RNA-sequencing and CRISPR-Cas9 barcode editing, to elucidate developmental lineages at the whole-organism level. While these studies provided several insights, they face several computational challenges. First, lineages are reconstructed based on noisy and often saturated random mutation data. Additionally, due to the randomness of the mutations, lineages from multiple experiments cannot be combined to reconstruct a species-invariant lineage tree. To address these issues, we developed a statistical method, LinTIMaT, which reconstructs cell lineages using a maximum-likelihood framework by integrating mutation and expression data. Our analysis shows that expression data helps resolve the ambiguities arising when lineages are inferred based on mutations alone, while also enabling the integration of different individual lineages for the reconstruction of an invariant lineage tree. LinTIMaT lineages have better cell type coherence, improve the functional significance of gene sets and provide new insights on progenitors and differentiation pathways.
Affiliation(s)
- Hamim Zafar
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
- Department of Biological Sciences and Bioengineering, Indian Institute of Technology Kanpur, Kanpur, India
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Chieh Lin
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
| |
|
24
|
|
25
|
Hurley K, Ding J, Villacorta-Martin C, Herriges MJ, Jacob A, Vedaie M, Alysandratos KD, Sun YL, Lin C, Werder RB, Huang J, Wilson AA, Mithal A, Mostoslavsky G, Oglesby I, Caballero IS, Guttentag SH, Ahangari F, Kaminski N, Rodriguez-Fraticelli A, Camargo F, Bar-Joseph Z, Kotton DN. Reconstructed Single-Cell Fate Trajectories Define Lineage Plasticity Windows during Differentiation of Human PSC-Derived Distal Lung Progenitors. Cell Stem Cell 2020; 26:593-608.e8. [PMID: 32004478 PMCID: PMC7469703 DOI: 10.1016/j.stem.2019.12.009] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 05/26/2019] [Revised: 11/04/2019] [Accepted: 12/19/2019] [Indexed: 12/17/2022]
Abstract
Alveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life but are difficult to isolate from patients. Here, we engineer AEC2s from human pluripotent stem cells (PSCs) in vitro and use time-series single-cell RNA sequencing with lentiviral barcoding to profile the kinetics of their differentiation in comparison to primary fetal and adult AEC2 benchmarks. We observe bifurcating cell-fate trajectories as primordial lung progenitors differentiate in vitro, with some progeny reaching their AEC2 fate target, while others diverge to alternative non-lung endodermal fates. We develop a Continuous State Hidden Markov model to identify the timing and type of signals, such as overexuberant Wnt responses, that induce some early multipotent NKX2-1+ progenitors to lose lung fate. Finally, we find that this initial developmental plasticity is regulatable and subsides over time, ultimately resulting in PSC-derived AEC2s that exhibit a stable phenotype and nearly limitless self-renewal capacity.
Affiliation(s)
- Killian Hurley
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA; Department of Medicine, Royal College of Surgeons in Ireland, Education and Research Centre, Beaumont Hospital, Dublin, Ireland; Tissue Engineering Research Group, Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Jun Ding
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Carlos Villacorta-Martin
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
| | - Michael J Herriges
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Anjali Jacob
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Marall Vedaie
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Konstantinos D Alysandratos
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Yuliang L Sun
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Chieh Lin
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15217, USA
| | - Rhiannon B Werder
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Jessie Huang
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Andrew A Wilson
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
| | - Aditya Mithal
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
| | - Gustavo Mostoslavsky
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
| | - Irene Oglesby
- Department of Medicine, Royal College of Surgeons in Ireland, Education and Research Centre, Beaumont Hospital, Dublin, Ireland; Tissue Engineering Research Group, Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Ignacio S Caballero
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
| | - Susan H Guttentag
- Department of Pediatrics, Monroe Carell Jr. Children's Hospital, Vanderbilt University, Nashville, TN 37232, USA
| | - Farida Ahangari
- Pulmonary, Critical Care and Sleep Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Naftali Kaminski
- Pulmonary, Critical Care and Sleep Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
| | | | - Fernando Camargo
- Stem Cell Program, Boston Children's Hospital, Boston, MA 02115, USA; Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA 02138, USA; Harvard Stem Cell Institute, Boston, MA 02115, USA
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15217, USA.
| | - Darrell N Kotton
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA; The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA.
| |
|
26
|
Lin C, Ding J, Bar-Joseph Z. Inferring TF activation order in time series scRNA-Seq studies. PLoS Comput Biol 2020; 16:e1007644. [PMID: 32069291 PMCID: PMC7048296 DOI: 10.1371/journal.pcbi.1007644] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/24/2019] [Revised: 02/28/2020] [Accepted: 01/09/2020] [Indexed: 12/11/2022] Open
Abstract
Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned time of activation agrees with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction.
An important attribute of time series single cell RNA-Seq (scRNA-Seq) data is the ability to infer continuous trajectories of genes based on orderings of the cells. While several methods have been developed for ordering cells and inferring such trajectories, to date it was not possible to use these to infer the temporal activity of several key TFs. These TFs are only post-transcriptionally regulated, and so their expression does not provide complete information on their activity. To address this, we developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which assigns continuous activation time to TFs based on both their expression and the expression of their targets. Applying our method to several time series scRNA-Seq datasets, we show that it correctly identifies the key regulators for the processes being studied. We analyze the temporal assignments for these TFs and show that they provide new insights about combinatorial regulation and the ordering of TF activation. We used several complementary sources to validate some of these predictions and discuss a number of other novel suggestions based on the method. As we show, the method is able to scale to large and noisy datasets and so is appropriate for several studies utilizing time series scRNA-Seq data.
Affiliation(s)
- Chieh Lin
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Jun Ding
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Ziv Bar-Joseph
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- * E-mail:
| |
|
27
|
Li J, Wang GZ. Application of Computational Biology to Decode Brain Transcriptomes. GENOMICS PROTEOMICS & BIOINFORMATICS 2019; 17:367-380. [PMID: 31655213 PMCID: PMC6943780 DOI: 10.1016/j.gpb.2019.03.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/11/2018] [Revised: 02/21/2019] [Accepted: 03/15/2019] [Indexed: 01/03/2023]
Abstract
The rapid development of high-throughput sequencing technologies has generated massive valuable brain transcriptome atlases, providing great opportunities for systematically investigating gene expression characteristics across various brain regions throughout a series of developmental stages. Recent studies have revealed that the transcriptional architecture is the key to interpreting the molecular mechanisms of brain complexity. However, our knowledge of brain transcriptional characteristics remains very limited. With the immense efforts to generate high-quality brain transcriptome atlases, new computational approaches to analyze these high-dimensional multivariate data are greatly needed. In this review, we summarize some public resources for brain transcriptome atlases and discuss the general computational pipelines that are commonly used in this field, which would aid in making new discoveries in brain development and disorders.
Affiliation(s)
- Jie Li
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Guang-Zhong Wang
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China.
| |
|