1. Sadria M, Swaroop V. Discovering governing equations of biological systems through representation learning and sparse model discovery. NAR Genom Bioinform 2025; 7:lqaf048. [PMID: 40290314; PMCID: PMC12034105; DOI: 10.1093/nargab/lqaf048] [Received: 10/08/2024] [Revised: 03/19/2025] [Accepted: 04/11/2025] [Indexed: 04/30/2025]
Abstract
Understanding the governing rules of complex biological systems remains a significant challenge due to the nonlinear, high-dimensional nature of biological data. In this study, we present CLERA, a novel end-to-end computational framework designed to uncover parsimonious dynamical models and identify active gene programs from single-cell RNA sequencing data. By integrating a supervised autoencoder architecture with Sparse Identification of Nonlinear Dynamics, CLERA leverages prior knowledge to simultaneously extract a related low-dimensional representation and uncover the underlying dynamical systems that drive the processes. Through the analysis of both synthetic and biological data, CLERA demonstrates robust performance in reconstructing gene expression dynamics, identifying key regulatory genes, and capturing temporal patterns across distinct cell types. CLERA's ability to generate dynamic interaction networks, combined with network rewiring using Personalized PageRank to highlight central genes and active gene programs, offers new insights into the complex regulatory mechanisms underlying cellular processes.
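The sparse model discovery step named in this abstract (Sparse Identification of Nonlinear Dynamics) can be illustrated with a minimal sketch of sequentially thresholded least squares. This is not CLERA's implementation: the function name `stlsq`, the candidate library `[1, x, y, x*y]`, the threshold, and the toy system are all assumptions chosen for illustration, and the autoencoder part of the framework is omitted.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (hypothetical helper).

    Finds a sparse coefficient matrix xi such that theta @ xi ~= dxdt.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold      # coefficients to prune
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):      # refit each state variable
            keep = ~small[:, k]
            if keep.any():
                xi[keep, k] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, k], rcond=None
                )[0]
    return xi

# Toy system (an assumption, not from the paper): dx/dt = -2x, dy/dt = x - y
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
dX = np.column_stack([-2.0 * X[:, 0], X[:, 0] - X[:, 1]])

# Candidate library of terms: [1, x, y, x*y]
theta = np.column_stack(
    [np.ones(len(X)), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]]
)
xi = stlsq(theta, dX)
print(xi)  # rows follow the library order, columns are dx/dt and dy/dt
```

In a latent-space setting, `X` would be the low-dimensional coordinates produced by the encoder rather than raw gene expression, and the derivatives would have to be estimated rather than known exactly.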
Affiliation(s)
- Mehrshad Sadria
  - Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Vasu Swaroop
  - Department of Computer Science Information Systems, BITS-Pilani, Pilani Campus, Pilani 333031, India

2. Wang P, Liu W, Wang J, Liu Y, Li P, Xu P, Cui W, Zhang R, Long Q, Hu Z, Fang C, Dong J, Zhang C, Chen Y, Wang C, Liu G, Xie H, Zhang Y, Xiao M, Chen S, Jiang H, Chen Y, Yang G, Zhang S, Meng Z, Wang X, Feng G, Li X, Zhou Y. scCompass: An Integrated Multi-Species scRNA-seq Database for AI-Ready. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2025:e2500870. [PMID: 40317650; DOI: 10.1002/advs.202500870] [Received: 01/14/2025] [Revised: 03/29/2025] [Indexed: 05/07/2025]
Abstract
Emerging single-cell sequencing technology has generated large amounts of data, allowing analysis of cellular dynamics and gene regulation at single-cell resolution. Advances in artificial intelligence enhance life sciences research by delivering critical insights and optimizing data analysis processes. However, inconsistent data processing quality and standards remain a major challenge. Here we propose scCompass, a comprehensive resource designed to build a large-scale, multi-species, model-friendly single-cell data collection. By applying standardized data pre-processing, scCompass integrates and curates transcriptomic data from nearly 105 million single cells across 13 species. Using this extensive dataset, we identify stable expression genes (SEGs) and organ-specific expression genes (OSGs) in humans and mice. Scalable datasets that can be easily adapted for AI model training are provided, along with pretrained checkpoints of state-of-the-art single-cell foundation models. In summary, scCompass is a highly efficient and scalable AI-ready database that, combined with user-friendly data sharing, visualization, and online analysis, greatly simplifies data access and exploitation for researchers in single-cell biology (http://www.bdbe.cn/kun).
Affiliation(s)
- Pengfei Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Wenhao Liu
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
  - College of Life Science, Northeast Agricultural University, Harbin, 150030, China
- Jiajia Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Yana Liu
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
- Pengjiang Li
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Ping Xu
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Wentao Cui
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Ran Zhang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Qingqing Long
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Zhilong Hu
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Chen Fang
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Jingxi Dong
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
- Chunyang Zhang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Yan Chen
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Chengrui Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Guole Liu
  - State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Hanyu Xie
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Yiyang Zhang
  - CEMS, NCMIS, HCMS, MDIS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Meng Xiao
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
- Shubai Chen
  - Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Haiping Jiang
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Yiqiang Chen
  - Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Ge Yang
  - State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Shihua Zhang
  - CEMS, NCMIS, HCMS, MDIS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Zhen Meng
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Xuezhi Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Guihai Feng
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Xin Li
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
  - Institute for Stem Cell and Regenerative Medicine, Chinese Academy of Sciences, Beijing, 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China
- Yuanchun Zhou
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
  - University of Chinese Academy of Sciences, Beijing, 100864, China

3. Zhao W, Larschan E, Sandstede B, Singh R. Optimal transport reveals dynamic gene regulatory networks via gene velocity estimation. PLoS Comput Biol 2025; 21:e1012476. [PMID: 40341271; PMCID: PMC12118989; DOI: 10.1371/journal.pcbi.1012476] [Received: 09/10/2024] [Revised: 05/28/2025] [Accepted: 04/10/2025] [Indexed: 05/10/2025]
Abstract
Inferring gene regulatory networks from gene expression data is an important and challenging problem in the biology community. We propose OTVelo, a methodology that takes time-stamped single-cell gene expression data as input and predicts gene regulation across two time points. It is known that the rate of change of gene expression, which we will refer to as gene velocity, provides crucial information that enhances such inference; however, this information is not always available due to the limitations in sequencing depth. Our algorithm overcomes this limitation by estimating gene velocities using optimal transport. We then infer gene regulation using time-lagged correlation and Granger causality via regularized linear regression. Instead of providing an aggregated network across all time points, our method uncovers the underlying dynamical mechanism across time points. We validate our algorithm on 13 simulated datasets with both synthetic and curated networks and demonstrate its efficacy on 9 experimental data sets.
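One generic way to turn an optimal transport coupling between two time points into per-cell velocity estimates is sketched below. It is an illustration under stated assumptions, not OTVelo's published algorithm: it uses plain entropic (Sinkhorn) optimal transport with uniform cell weights, a max-normalized squared Euclidean cost, a barycentric projection as the predicted future state, and a unit time step `dt`. All names and the random toy data are hypothetical.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iter=500):
    """Entropic optimal transport between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on source cells
    b = np.full(m, 1.0 / m)          # uniform mass on target cells
    K = np.exp(-cost / eps)          # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):          # alternating marginal scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy data (assumption): 30 cells at t0, 40 cells at t1, 5 genes each
rng = np.random.default_rng(1)
x_t0 = rng.normal(size=(30, 5))
x_t1 = rng.normal(loc=0.5, size=(40, 5))

# Squared Euclidean cost, scaled to [0, 1] to keep exp() well behaved
cost = ((x_t0[:, None, :] - x_t1[None, :, :]) ** 2).sum(axis=-1)
cost /= cost.max()

plan = sinkhorn(cost)

# Barycentric projection: expected expression of each t0 cell's descendants
expected_future = (plan @ x_t1) / plan.sum(axis=1, keepdims=True)
dt = 1.0                              # assumed spacing between time points
velocity = (expected_future - x_t0) / dt
print(velocity.shape)
```

The resulting `velocity` matrix (cells by genes) is the kind of quantity that could then feed a time-lagged correlation or a regularized regression between regulators and targets.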
Affiliation(s)
- Wenjun Zhao
  - Division of Applied Mathematics, Brown University, Providence, Rhode Island, United States of America
  - Department of Mathematics, University of British Columbia, Vancouver, Canada
- Erica Larschan
  - Department of Molecular Biology, Cell Biology and Biochemistry, Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Björn Sandstede
  - Division of Applied Mathematics, Brown University, Providence, Rhode Island, United States of America
- Ritambhara Singh
  - Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America

4. Aivazidis A, Memi F, Kleshchevnikov V, Er S, Clarke B, Stegle O, Bayraktar OA. Cell2fate infers RNA velocity modules to improve cell fate prediction. Nat Methods 2025; 22:698-707. [PMID: 40032996; PMCID: PMC11978503; DOI: 10.1038/s41592-025-02608-3] [Received: 08/02/2023] [Accepted: 01/23/2025] [Indexed: 03/05/2025]
Abstract
RNA velocity exploits the temporal information contained in spliced and unspliced RNA counts to infer transcriptional dynamics. Existing velocity models often rely on coarse biophysical simplifications or numerical approximations to solve the underlying ordinary differential equations (ODEs), which can compromise accuracy in challenging settings, such as complex or weak transcription rate changes across cellular trajectories. Here we present cell2fate, a formulation of RNA velocity based on a linearization of the velocity ODE, which allows solving a biophysically more accurate model in a fully Bayesian fashion. As a result, cell2fate decomposes the RNA velocity solutions into modules, providing a biophysical connection between RNA velocity and statistical dimensionality reduction. We comprehensively benchmark cell2fate in real-world settings, demonstrating enhanced interpretability and power to reconstruct complex dynamics and weak dynamical signals in rare and mature cell types. Finally, we apply cell2fate to the developing human brain, where we spatially map RNA velocity modules onto the tissue architecture, connecting the spatial organization of tissues with temporal dynamics of transcription.
Affiliation(s)
- Fani Memi
  - Wellcome Sanger Institute, Cambridge, UK
- Sezgin Er
  - International School of Medicine, Istanbul Medipol University, Istanbul, Turkey
- Brian Clarke
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Oliver Stegle
  - Wellcome Sanger Institute, Cambridge, UK
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany

5. Yang L, Yu XX, Wang X, Jin CT, Xu CR. The expression order determines the pioneer functions of NGN3 and NEUROD1 in pancreatic endocrine differentiation. Science Advances 2025; 11:eadt4770. [PMID: 40138419; PMCID: PMC11939047; DOI: 10.1126/sciadv.adt4770] [Received: 09/27/2024] [Accepted: 02/20/2025] [Indexed: 03/29/2025]
Abstract
Pioneer transcription factors (TFs) initiate chromatin remodeling, which is crucial for gene regulation and cell differentiation. In this study, we investigated how the sequential expression of neurogenin 3 (NGN3) and NEUROD1 affects their pioneering functions during pancreatic endocrine differentiation. Using a genetically engineered mouse model, we mapped NGN3-binding sites, confirming the pivotal role of this molecule in regulating chromatin accessibility. The pioneering function of NGN3 involves dose tolerance, and low doses are sufficient. Although NEUROD1 generally acts as a conventional TF, it can assume a pioneering role in the absence of NGN3. The sequential expression of NeuroD1 and Ngn3 predominantly drives α cell generation, which may explain the inefficient β cell induction observed in vitro. Our findings demonstrate that pioneer activity is dynamically shaped by temporal TF expression and inter-TF interactions, providing insights into transcriptional regulation and its implications for disease mechanisms and therapeutic targeting and enhancing in vitro differentiation strategies.
Affiliation(s)
- Liu Yang
  - State Key Laboratory of Female Fertility Promotion, Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Xin-Xin Yu
  - State Key Laboratory of Female Fertility Promotion, Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Xin Wang
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - School of Life Sciences, Peking University, Beijing 100871, China
- Chen-Tao Jin
  - State Key Laboratory of Female Fertility Promotion, Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Cheng-Ran Xu
  - State Key Laboratory of Female Fertility Promotion, Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China

6. Heidenreich AC, Bacigalupo L, Rossotti M, Rodríguez-Seguí SA. Identification of mouse and human embryonic pancreatic cells with adult Procr+ progenitor transcriptomic and epigenomic characteristics. Front Endocrinol (Lausanne) 2025; 16:1543960. [PMID: 40017694; PMCID: PMC11864936; DOI: 10.3389/fendo.2025.1543960] [Received: 12/12/2024] [Accepted: 01/21/2025] [Indexed: 03/01/2025]
Abstract
Background: The quest to find a progenitor cell in the adult pancreas has driven research in the field for decades. Many potential progenitor cell sources have been reported, but their existence remains a matter of debate, mainly because of reproducibility issues. The existence of adult Procr+ progenitor cells in mouse islets has been recently reported. These were shown to comprise ~1% of islet cells, to lack expression of Neurog3 and endocrine hormones, and to be capable of differentiating into all endocrine cell types. However, these findings had limited impact, as further evidence supporting the existence and function of Procr+ progenitors has not emerged.
Methods and findings: We report here an unbiased comparison across mouse and human pancreatic samples, including adult islets and embryonic tissue, to track the existence of Procr+ progenitors originally described based on their global gene expression signature. We could not find Procr+ progenitors in other mouse or human adult pancreatic islet samples. Unexpectedly, our results revealed a transcriptionally close mesothelial cell population in the mouse and human embryonic pancreas. These Procr-like mesothelial cells of the embryonic pancreas share the salient transcriptional and epigenomic features of previously reported Procr+ progenitors found in adult pancreatic islets. Notably, we report here that the Procr-like transcriptional signature is gradually established in mesothelial cells during mouse pancreas development from E12.5 to E17.5, peaking at E17.5. Further supporting a developmentally relevant role in the human pancreas, we additionally report that a transcriptionally similar population differentiates spontaneously from human pluripotent stem cells cultured in vitro along the pancreatic lineage.
Conclusions: Our results show that, although the previously reported Procr+ progenitor cell population could not be found in other adult pancreatic islet samples, a mesothelial cell population with a closely related transcriptional signature is present in both the mouse and human embryonic pancreas. Several lines of evidence presented in this work support a developmentally relevant function for these Procr-like mesothelial cells.
Affiliation(s)
- Ana C. Heidenreich
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
  - Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Lucas Bacigalupo
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
  - Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Martina Rossotti
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
  - Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Santiago A. Rodríguez-Seguí
  - Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
  - Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina

7. Klein D, Palla G, Lange M, Klein M, Piran Z, Gander M, Meng-Papaxanthos L, Sterr M, Saber L, Jing C, Bastidas-Ponce A, Cota P, Tarquis-Medina M, Parikh S, Gold I, Lickert H, Bakhti M, Nitzan M, Cuturi M, Theis FJ. Mapping cells through time and space with moscot. Nature 2025; 638:1065-1075. [PMID: 39843746; PMCID: PMC11864987; DOI: 10.1038/s41586-024-08453-2] [Received: 05/17/2023] [Accepted: 11/25/2024] [Indexed: 01/24/2025]
Abstract
Single-cell genomic technologies enable the multimodal profiling of millions of cells across temporal and spatial dimensions. However, experimental limitations hinder the comprehensive measurement of cells under native temporal dynamics and in their native spatial tissue niche. Optimal transport has emerged as a powerful tool to address these constraints and has facilitated the recovery of the original cellular context [1-4]. Yet, most optimal transport applications are unable to incorporate multimodal information or scale to single-cell atlases. Here we introduce multi-omics single-cell optimal transport (moscot), a scalable framework for optimal transport in single-cell genomics that supports multimodality across all applications. We demonstrate the capability of moscot to efficiently reconstruct developmental trajectories of 1.7 million cells from mouse embryos across 20 time points. To illustrate the capability of moscot in space, we enrich spatial transcriptomic datasets by mapping multimodal information from single-cell profiles in a mouse liver sample and align multiple coronal sections of the mouse brain. We present moscot.spatiotemporal, an approach that leverages gene-expression data across both spatial and temporal dimensions to uncover the spatiotemporal dynamics of mouse embryogenesis. We also resolve endocrine-lineage relationships of delta and epsilon cells in a previously unpublished mouse, time-resolved pancreas development dataset using paired measurements of gene expression and chromatin accessibility. Our findings are confirmed through experimental validation of NEUROD2 as a regulator of epsilon progenitor cells in a model of human induced pluripotent stem cell islet cell differentiation. Moscot is available as open-source software, accompanied by extensive documentation.
Affiliation(s)
- Dominik Klein
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
- Giovanni Palla
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Marius Lange
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Zoe Piran
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Manuel Gander
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Michael Sterr
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Lama Saber
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Changying Jing
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - Munich Medical Research School (MMRS), Ludwig Maximilian University (LMU), Munich, Germany
- Aimée Bastidas-Ponce
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Perla Cota
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Marta Tarquis-Medina
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Shrey Parikh
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Ilan Gold
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
- Heiko Lickert
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
  - School of Medicine, Technical University of Munich, Munich, Germany
- Mostafa Bakhti
  - Institute of Diabetes and Regeneration Research, Helmholtz Center, Munich, Germany
  - German Center for Diabetes Research, Neuherberg, Germany
- Mor Nitzan
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
  - Racah Institute of Physics, The Hebrew University of Jerusalem, Jerusalem, Israel
  - Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Center, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany

8. Wei J, Zhang B, Wang Q, Zhou T, Tian T, Chen L. Diffusive topology preserving manifold distances for single-cell data analysis. Proc Natl Acad Sci U S A 2025; 122:e2404860121. [PMID: 39854240; PMCID: PMC11789025; DOI: 10.1073/pnas.2404860121] [Received: 03/07/2024] [Accepted: 11/25/2024] [Indexed: 01/26/2025]
Abstract
Manifold learning techniques have emerged as crucial tools for uncovering latent patterns in high-dimensional single-cell data. However, most existing dimensionality reduction methods primarily rely on 2D visualization, which can distort true data relationships and fail to extract reliable biological information. Here, we present DTNE (diffusive topology neighbor embedding), a dimensionality reduction framework that faithfully approximates manifold distance to enhance cellular relationships and dynamics. DTNE constructs a manifold distance matrix using a modified personalized PageRank algorithm, thereby preserving topological structure while enabling diverse single-cell analyses. This approach facilitates distribution-based cellular relationship analysis, pseudotime inference, and clustering within a unified framework. Extensive benchmarking against mainstream algorithms on diverse datasets demonstrates DTNE's superior performance in maintaining geodesic distances and revealing significant biological patterns. Our results establish DTNE as a powerful tool for high-dimensional data analysis in uncovering meaningful biological insights.
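The abstract states that DTNE builds its distance matrix from a modified personalized PageRank; the modification itself is not described here. As a point of reference only, the standard (unmodified) personalized PageRank on a small graph can be sketched with power iteration. The path graph, the damping factor `alpha`, and the function name are assumptions for illustration.

```python
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.85, tol=1e-10, max_iter=1000):
    """Standard personalized PageRank by power iteration.

    adj  : square adjacency matrix; every node is assumed to have an edge
    seed : index of the node the random walk restarts from
    """
    adj = np.asarray(adj, dtype=float)
    # Row-normalize so each row holds outgoing transition probabilities
    transition = adj / adj.sum(axis=1, keepdims=True)
    restart = np.zeros(len(adj))
    restart[seed] = 1.0
    p = restart.copy()
    for _ in range(max_iter):
        p_next = alpha * (transition.T @ p) + (1.0 - alpha) * restart
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy neighbor graph (assumption): five cells connected in a chain 0-1-2-3-4
adj = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = personalized_pagerank(adj, seed=0)
print(scores)  # diffusion mass reaching each cell when restarting at cell 0
```

Running this once per seed cell yields a cell-by-cell matrix of diffusion scores, which is the kind of object a manifold distance could be derived from; how DTNE converts such scores into distances is not specified in the abstract.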
Affiliation(s)
- Jiangyong Wei
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Bin Zhang
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Qiu Wang
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
- Tianshou Zhou
  - School of Mathematics and Statistics, Sun Yat-sen University, 510275 Guangzhou, China
- Tianhai Tian
  - School of Mathematics, Monash University, Melbourne, VIC 3800, Australia
- Luonan Chen
  - Guangdong Institute of Intelligence Science and Technology, 519031 Hengqin, Zhuhai, Guangdong, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 310024 Hangzhou, China
  - Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China

9. Huang YA, Li YC, You ZH, Hu L, Hu PW, Wang L, Peng Y, Huang ZA. Consensus representation of multiple cell-cell graphs from gene signaling pathways for cell type annotation. BMC Biol 2025; 23:23. [PMID: 39849579; PMCID: PMC11756145; DOI: 10.1186/s12915-025-02128-8] [Received: 08/07/2024] [Accepted: 01/13/2025] [Indexed: 01/25/2025]
Abstract
Background: Recent advancements in single-cell RNA sequencing have greatly expanded our knowledge of the heterogeneous nature of tissues. However, robust and accurate cell type annotation continues to be a major challenge, hindered by issues such as marker specificity, batch effects, and a lack of comprehensive spatial and interaction data. Traditional annotation methods often fail to adequately address the complexity of cellular interactions and gene regulatory networks.
Results: We proposed scMCGraph, a comprehensive computational framework that integrates gene expression with pathway activity to accurately annotate cell types within diverse scRNA-seq datasets. Initially, our model constructs multiple pathway-specific views using various pathway databases, which reflect both gene expression and pathway activities. These pathway-specific views are then integrated into a consensus graph. The consensus graph is subsequently utilized to reconstruct the multiple pathway views. Our model demonstrated exceptional robustness and accuracy across various analyses, including cross-platform, cross-time, cross-sample, and clinical dataset evaluations.
Conclusions: scMCGraph represents a significant advance in cell type annotation. The experiments have demonstrated that introducing pathway information significantly improves the learning of cell-cell graphs, with their resulting consensus graph enhancing the predictive performance of cell type prediction. Different pathway databases provide complementary data, and an increase in the number of pathways can also boost model performance. Extensive testing shows that in various cross-dataset application scenarios, scMCGraph consistently exhibits both accuracy and robustness.
Affiliation(s)
- Yu-An Huang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, 710000, China.
- Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen, 518063, China.
- Yue-Chao Li
- School of Computer Science, Northwestern Polytechnical University, Xi'an, 710000, China
- Zhu-Hong You
- School of Electronic Information, Xijing University, Xi'an, 710000, China.
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- Peng-Wei Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- Lei Wang
- Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences, Nanning, 530001, China
- Yuzhong Peng
- Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Nanning Normal University, Nanning, 530001, China
- Zhi-An Huang
- Research Office, City University of Hong Kong (Dongguan), Dongguan, 523000, China.
10
Sumanaweera D, Suo C, Cujba AM, Muraro D, Dann E, Polanski K, Steemers AS, Lee W, Oliver AJ, Park JE, Meyer KB, Dumitrascu B, Teichmann SA. Gene-level alignment of single-cell trajectories. Nat Methods 2025; 22:68-81. [PMID: 39300283 PMCID: PMC11725504 DOI: 10.1038/s41592-024-02378-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 07/12/2024] [Indexed: 09/22/2024]
Abstract
Single-cell data analysis can infer dynamic changes in cell populations, for example across time, space or in response to perturbation, thus deriving pseudotime trajectories. Current approaches comparing trajectories often use dynamic programming but are limited by assumptions such as the existence of a definitive match. Here we describe Genes2Genes, a Bayesian information-theoretic dynamic programming framework for aligning single-cell trajectories. It is able to capture sequential matches and mismatches of individual genes between a reference and query trajectory, highlighting distinct clusters of alignment patterns. Across both real world and simulated datasets, it accurately inferred alignments and demonstrated its utility in disease cell-state trajectory analysis. In a proof-of-concept application, Genes2Genes revealed that T cells differentiated in vitro match an immature in vivo state while lacking expression of genes associated with TNF signaling. This demonstrates that precise trajectory alignment can pinpoint divergence from the in vivo system, thus guiding the optimization of in vitro culture conditions.
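To make "dynamic programming with matches and mismatches" concrete, here is a minimal sketch of a global alignment between one gene's expression along a reference and a query trajectory. It assumes a plain absolute-difference match cost and a fixed gap penalty, which stand in for the Bayesian information-theoretic scoring of Genes2Genes. The function name `align` and the operation codes are hypothetical.

```python
def align(ref, query, gap=1.0):
    """Global DP alignment of two expression series.

    Returns (total_cost, ops), where ops uses 'M' for a matched pair,
    'D' for a reference point left unmatched, and 'I' for a query point
    left unmatched. Costs are a simplification, not the Genes2Genes score.
    """
    n, m = len(ref), len(query)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * gap, "D"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * gap, "I"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + abs(ref[i - 1] - query[j - 1]), "M"),
                (cost[i - 1][j] + gap, "D"),
                (cost[i][j - 1] + gap, "I"),
            ]
            # min() on tuples breaks cost ties by the operation letter.
            cost[i][j], back[i][j] = min(options)
    # Trace back from the bottom-right corner to recover the operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        op = back[i][j]
        ops.append(op)
        if op == "M":
            i, j = i - 1, j - 1
        elif op == "D":
            i -= 1
        else:
            j -= 1
    return cost[n][m], "".join(reversed(ops))


identical = align([0, 1, 2], [0, 1, 2])
skipped = align([0, 1, 2], [0, 2], gap=0.5)
```

In the second call the query has no counterpart for the middle reference point, so the cheapest path leaves that point unmatched rather than forcing a poor match.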
Affiliation(s)
- Dinithi Sumanaweera
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Theory of Condensed Matter, Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, UK
- Chenqu Suo
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Paediatrics, Cambridge University Hospitals; Hills Road, Cambridge, UK
- Ana-Maria Cujba
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Daniele Muraro
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Emma Dann
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Krzysztof Polanski
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Alexander S Steemers
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Princess Máxima Center for Pediatric Oncology, Utrecht, Netherlands
- Woochan Lee
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Biomedical Sciences, Seoul National University, Seoul, Korea
- Amanda J Oliver
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Jong-Eun Park
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
- Kerstin B Meyer
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK
- Bianca Dumitrascu
- Department of Statistics, Columbia University, New York, NY, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Sarah A Teichmann
- Wellcome Sanger Institute; Wellcome Genome Campus, Hinxton, Cambridge, UK.
- Cambridge Stem Cell Institute, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, UK.
- Department of Medicine, University of Cambridge, Cambridge, UK.
- Co-director of CIFAR Macmillan Research Program, Toronto, Ontario, Canada.
11
Wang Y, Dede M, Mohanty V, Dou J, Li Z, Chen K. A statistical approach for systematic identification of transition cells from scRNA-seq data. Cell Reports Methods 2024; 4:100913. [PMID: 39644902 PMCID: PMC11704623 DOI: 10.1016/j.crmeth.2024.100913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 09/01/2024] [Accepted: 11/13/2024] [Indexed: 12/09/2024]
Abstract
Decoding cellular state transitions is crucial for understanding complex biological processes in development and disease. While recent advancements in single-cell RNA sequencing (scRNA-seq) offer insights into cellular trajectories, existing tools primarily study expressional rather than regulatory state shifts. We present CellTran, a statistical approach utilizing paired-gene expression correlations to detect transition cells from scRNA-seq data without explicitly resolving gene regulatory networks. Applying our approach to various contexts, including tissue regeneration, embryonic development, preinvasive lesions, and humoral responses post-vaccination, reveals transition cells and their distinct gene expression profiles. Our study sheds light on the underlying molecular mechanisms driving cellular state transitions, enhancing our ability to identify therapeutic targets for disease interventions.
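The distinction between an expressional and a regulatory shift can be shown with a paired-gene correlation. The sketch below computes the Pearson correlation of one gene pair in two hypothetical groups of cells whose per-gene mean expression is identical but whose pairwise relationship is reversed. It does not reproduce CellTran's statistical procedure, and the expression values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression of gene A and gene B in two groups of four cells.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b_group1 = [2.0, 4.0, 6.0, 8.0]  # B rises with A
gene_b_group2 = [8.0, 6.0, 4.0, 2.0]  # same values, B falls as A rises

r_group1 = pearson(gene_a, gene_b_group1)
r_group2 = pearson(gene_a, gene_b_group2)
```

Both groups have the same mean expression of gene B, so a test on expression levels alone would see no difference, while the sign of the pairwise correlation flips.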
Affiliation(s)
- Yuanxin Wang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Merve Dede
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Vakul Mohanty
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jinzhuang Dou
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
12
Lederer AR, Leonardi M, Talamanca L, Bobrovskiy DM, Herrera A, Droin C, Khven I, Carvalho HJF, Valente A, Dominguez Mantes A, Mulet Arabí P, Pinello L, Naef F, La Manno G. Statistical inference with a manifold-constrained RNA velocity model uncovers cell cycle speed modulations. Nat Methods 2024; 21:2271-2286. [PMID: 39482463 PMCID: PMC11621032 DOI: 10.1038/s41592-024-02471-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2023] [Accepted: 09/15/2024] [Indexed: 11/03/2024]
Abstract
Across biological systems, cells undergo coordinated changes in gene expression, resulting in transcriptome dynamics that unfold within a low-dimensional manifold. While low-dimensional dynamics can be extracted using RNA velocity, these algorithms can be fragile and rely on heuristics lacking statistical control. Moreover, the estimated vector field is not dynamically consistent with the traversed gene expression manifold. To address these challenges, we introduce a Bayesian model of RNA velocity that couples velocity field and manifold estimation in a reformulated, unified framework, identifying the parameters of an explicit dynamical system. Focusing on the cell cycle, we implement VeloCycle to study gene regulation dynamics on one-dimensional periodic manifolds and validate its ability to infer cell cycle periods using live imaging. We also apply VeloCycle to reveal speed differences in regionally defined progenitors and Perturb-seq gene knockdowns. Overall, VeloCycle expands the single-cell RNA sequencing analysis toolkit with a modular and statistically consistent RNA velocity inference framework.
Affiliation(s)
- Alex R Lederer
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Maxine Leonardi
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Lorenzo Talamanca
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Daniil M Bobrovskiy
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Antonio Herrera
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Colas Droin
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Irina Khven
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Hugo J F Carvalho
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Alessandro Valente
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Albert Dominguez Mantes
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Laboratory of Bioimage Analysis and Computational Microscopy, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Pau Mulet Arabí
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Luca Pinello
- Molecular Pathology Unit, Massachusetts General Research Institute, Charlestown, MA, USA
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Felix Naef
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- Gioele La Manno
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
13
Chen X, Ma Y, Shi Y, Fu Y, Nan M, Ren Q, Gao J. Population-Level Cell Trajectory Inference Based on Gaussian Distributions. Biomolecules 2024; 14:1396. [PMID: 39595573 PMCID: PMC11592043 DOI: 10.3390/biom14111396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2024] [Revised: 10/29/2024] [Accepted: 10/30/2024] [Indexed: 11/28/2024] Open
Abstract
In the past decade, inferring developmental trajectories from single-cell data has become a significant challenge in bioinformatics. RNA velocity, with its incorporation of directional dynamics, has significantly advanced the study of single-cell trajectories. However, as single-cell RNA sequencing technology evolves, it generates complex, high-dimensional data with high noise levels. Existing trajectory inference methods, which overlook cell distribution characteristics, may perform inadequately under such conditions. To address this, we introduce CPvGTI, a Gaussian distribution-based trajectory inference method. CPvGTI utilizes a Gaussian mixture model, optimized by the Expectation-Maximization algorithm, to construct new cell populations in the original data space. By integrating RNA velocity, CPvGTI employs Gaussian Process Regression to analyze the differentiation trajectories of these cell populations. We assess CPvGTI's performance against several state-of-the-art methods using four structurally diverse simulated datasets and four real datasets. The simulation studies indicate that CPvGTI excels in pseudo-time prediction and structural reconstruction compared to existing methods. Furthermore, the discovery of new branch trajectories in human forebrain and mouse hematopoiesis datasets confirms CPvGTI's superior performance.
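A Gaussian mixture model fitted by Expectation-Maximization, as named in the abstract, alternates between assigning each point a soft membership in every component and re-estimating each component from those memberships. The sketch below is a one-dimensional toy version in plain Python. CPvGTI operates on high-dimensional expression data, so the function `em_gmm_1d`, its initialisation, the variance floor, and the sample values are all assumptions made for illustration.

```python
import math


def em_gmm_1d(xs, k=2, iters=50, var_floor=1e-6):
    """Fit a k-component (k >= 2) 1D Gaussian mixture with EM.

    Returns (weights, means, variances). Means start evenly spaced
    between the smallest and largest observation.
    """
    lo, hi = min(xs), max(xs)
    means = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    variances = [((hi - lo) / k) ** 2 or 1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc
            variances[c] = max(var_floor, var)  # guard against collapse
            weights[c] = nc / len(xs)
    return weights, means, variances


# Two well-separated hypothetical populations along one expression axis.
weights, means, variances = em_gmm_1d([0.9, 1.0, 1.1, 4.9, 5.0, 5.1])
```

With these inputs the two component means settle near the two cluster centres and each component takes about half of the total weight.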
Affiliation(s)
- Jie Gao
- School of Science, Jiangnan University, Wuxi 214122, China; (X.C.); (Y.M.); (Y.S.); (Y.F.); (M.N.); (Q.R.)
14
Wang W, Wang Y, Lyu R, Grün D. Scalable identification of lineage-specific gene regulatory networks from metacells with NetID. Genome Biol 2024; 25:275. [PMID: 39425176 PMCID: PMC11488259 DOI: 10.1186/s13059-024-03418-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 10/08/2024] [Indexed: 10/21/2024] Open
Abstract
The identification of gene regulatory networks (GRNs) is crucial for understanding cellular differentiation. Single-cell RNA sequencing data encode gene-level covariations at high resolution, yet data sparsity and high dimensionality hamper accurate and scalable GRN reconstruction. To overcome these challenges, we introduce NetID leveraging homogenous metacells while avoiding spurious gene-gene correlations. Benchmarking demonstrates superior performance of NetID compared to imputation-based methods. By incorporating cell fate probability information, NetID facilitates the prediction of lineage-specific GRNs and recovers known network motifs governing bone marrow hematopoiesis, making it a powerful toolkit for deciphering gene regulatory control of cellular differentiation from large-scale single-cell transcriptome data.
Affiliation(s)
- Weixu Wang
- Human Phenome Institute, Fudan University, Shanghai, China
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Yichen Wang
- Cancer, Ageing and Somatic Mutation, Wellcome Sanger Institute, Hinxton, UK
- Ruiqi Lyu
- School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
- Dominic Grün
- Würzburg Institute of Systems Immunology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany.
- CAIDAS - Center for Artificial Intelligence and Data Science, Würzburg, Germany.
15
Davidson RK, Wu W, Kanojia S, George RM, Huter K, Sandoval K, Osmulski M, Casey N, Spaeth JM. The SWI/SNF chromatin remodelling complex regulates pancreatic endocrine cell expansion and differentiation in mice in vivo. Diabetologia 2024; 67:2275-2288. [PMID: 38958700 PMCID: PMC11912225 DOI: 10.1007/s00125-024-06211-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 05/16/2024] [Indexed: 07/04/2024]
Abstract
AIMS/HYPOTHESIS: Strategies to augment functional beta cell mass include directed differentiation of stem cells towards a beta cell fate, which requires extensive knowledge of transcriptional programs governing endocrine progenitor cell differentiation in vivo. We aimed to study the contributions of the Brahma-related gene-1 (BRG1) and Brahma (BRM) ATPase subunits of the SWI/SNF chromatin remodelling complex to endocrine cell development. METHODS: We generated mice with endocrine progenitor-specific Neurog3-Cre BRG1 removal in the presence of heterozygous (Brg1Δendo;Brm+/-) or homozygous (double knockout: DKOΔendo) BRM deficiency. Whole-body metabolic phenotyping, islet function characterisation, islet quantitative PCR and histological characterisation were performed on animals and tissues postnatally. To test the mechanistic actions of SWI/SNF in controlling gene expression during endocrine cell development, single-cell RNA-seq was performed on flow-sorted endocrine-committed cells from embryonic day 15.5 control and mutant embryos. RESULTS: Brg1Δendo;Brm+/- mice exhibit severe glucose intolerance, hyperglycaemia and hypoinsulinaemia, resulting, in part, from reduced islet number; diminished alpha, beta and delta cell mass; compromised islet insulin secretion; and altered islet gene expression programs, including reductions in MAFA and urocortin 3 (UCN3). DKOΔendo mice were not recovered at weaning; however, postnatal day 6 DKOΔendo mice were severely hyperglycaemic with reduced serum insulin levels and beta cell area. Single-cell RNA-seq of embryonic day 15.5 lineage-labelled cells revealed endocrine progenitor, alpha and beta cell populations from SWI/SNF mutants have reduced expression of Mafa, Gcg, Ins1 and Ins2, suggesting limited differentiation capacity. Reduced Neurog3 transcripts were discovered in DKOΔendo endocrine progenitor clusters, and the proliferative capacity of neurogenin 3 (NEUROG3)+ cells was reduced in Brg1Δendo;Brm+/- and DKOΔendo mutants. CONCLUSIONS/INTERPRETATION: Loss of BRG1 from developing endocrine progenitor cells has a severe postnatal impact on glucose homeostasis, and loss of both subunits impedes animal survival, with both groups exhibiting alterations in hormone transcripts embryonically. Taken together, these data highlight the critical role SWI/SNF plays in governing gene expression programs essential for endocrine cell development and expansion. DATA AVAILABILITY: Raw and processed data for scRNA-seq have been deposited into the NCBI Gene Expression Omnibus (GEO) database under the accession number GSE248369.
Grants
- DK127129 Division of Diabetes, Endocrinology, and Metabolic Diseases
- DK106846 Division of Diabetes, Endocrinology, and Metabolic Diseases
- R03 DK127129 NIDDK NIH HHS
- F32 DK104426 NIDDK NIH HHS
- DK097512 Division of Diabetes, Endocrinology, and Metabolic Diseases
- P30 CA082709 NCI NIH HHS
- DK129287 Division of Diabetes, Endocrinology, and Metabolic Diseases
- P30 DK097512 NIDDK NIH HHS
- R01 DK129287 NIDDK NIH HHS
- DK097771 Division of Diabetes, Endocrinology, and Metabolic Diseases
- F31 DK128918 NIDDK NIH HHS
- DK115633 Division of Diabetes, Endocrinology, and Metabolic Diseases
- K01 DK115633 NIDDK NIH HHS
- U24 DK097771 NIDDK NIH HHS
- DK128918 Division of Diabetes, Endocrinology, and Metabolic Diseases
- CA082709 Division of Cancer Prevention, National Cancer Institute
Affiliation(s)
- Rebecca K Davidson
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Wenting Wu
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Sukrati Kanojia
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Rajani M George
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Kayla Huter
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Kassandra Sandoval
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Meredith Osmulski
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Nolan Casey
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA
- Jason M Spaeth
- Center for Diabetes & Metabolic Diseases, Indiana University School of Medicine, Indianapolis, IN, USA.
- Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, USA.
- Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, USA.
- Department of Biochemistry & Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, USA.
16
Mizukoshi C, Kojima Y, Nomura S, Hayashi S, Abe K, Shimamura T. DeepKINET: a deep generative model for estimating single-cell RNA splicing and degradation rates. Genome Biol 2024; 25:229. [PMID: 39237934 PMCID: PMC11378460 DOI: 10.1186/s13059-024-03367-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 08/04/2024] [Indexed: 09/07/2024] Open
Abstract
Messenger RNA splicing and degradation are critical for gene expression regulation, the abnormality of which leads to diseases. Previous methods for estimating kinetic rates have limitations, assuming uniform rates across cells. DeepKINET is a deep generative model that estimates splicing and degradation rates at single-cell resolution from scRNA-seq data. DeepKINET outperforms existing methods on simulated and metabolic labeling datasets. Applied to forebrain and breast cancer data, it identifies RNA-binding proteins responsible for kinetic rate diversity. DeepKINET also analyzes the effects of splicing factor mutations on target genes in erythroid lineage cells. DeepKINET effectively reveals cellular heterogeneity in post-transcriptional regulation.
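Why cell-specific rates matter can be seen from the kinetic equation commonly used in RNA velocity analysis, where the spliced abundance s of a gene changes as ds/dt = beta * u - gamma * s, with u the unspliced abundance, beta the splicing rate, and gamma the degradation rate. The abstract does not state this equation, so using it here is an assumption, and the counts and rates below are invented. The sketch shows two cells with identical counts but different splicing rates.

```python
def spliced_velocity(u, s, beta, gamma):
    """Rate of change of spliced mRNA: ds/dt = beta * u - gamma * s."""
    return beta * u - gamma * s


# Two hypothetical cells with the same unspliced (u) and spliced (s) counts
# for one gene, but different splicing rates.
cells = [
    {"u": 2.0, "s": 4.0, "beta": 1.0, "gamma": 0.5},
    {"u": 2.0, "s": 4.0, "beta": 0.5, "gamma": 0.5},
]
velocities = [spliced_velocity(**cell) for cell in cells]
```

The first cell sits at steady state while the second is losing spliced transcript, a difference that a single rate shared across all cells would hide.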
Affiliation(s)
- Chikara Mizukoshi
- Division of Systems Biology, Graduate School of Medicine, Nagoya University, Aichi, Japan.
- Nagoya University Hospital, Aichi, Japan.
- Yasuhiro Kojima
- Laboratory of Computational Life Science, National Cancer Center Research Institute, Tokyo, Japan.
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan.
- Satoshi Nomura
- Division of Systems Biology, Graduate School of Medicine, Nagoya University, Aichi, Japan
- Shuto Hayashi
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Ko Abe
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Teppei Shimamura
- Division of Systems Biology, Graduate School of Medicine, Nagoya University, Aichi, Japan.
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan.
17
Sadria M, Bury TM. FateNet: an integration of dynamical systems and deep learning for cell fate prediction. Bioinformatics (Oxford, England) 2024; 40:btae525. [PMID: 39177093 PMCID: PMC11399232 DOI: 10.1093/bioinformatics/btae525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 06/28/2024] [Accepted: 08/21/2024] [Indexed: 08/24/2024]
Abstract
MOTIVATION: Understanding cellular decision-making, particularly its timing and impact on the biological system such as tissue health and function, is a fundamental challenge in biology and medicine. Existing methods for inferring fate decisions and cellular state dynamics from single-cell RNA sequencing data lack precision regarding decision points and broader tissue implications. Addressing this gap, we present FateNet, a computational approach integrating dynamical systems theory and deep learning to probe the cell decision-making process using scRNA-seq data. RESULTS: By leveraging information about normal forms and scaling behavior near bifurcations common to many dynamical systems, FateNet predicts cell decision occurrence with higher accuracy than conventional methods and offers qualitative insights into the new state of the biological system. Also, through in-silico perturbation experiments, FateNet identifies key genes and pathways governing the differentiation process in hematopoiesis. Validated using different scRNA-seq data, FateNet emerges as a user-friendly and valuable tool for predicting critical points in biological processes, providing insights into complex trajectories. AVAILABILITY AND IMPLEMENTATION: github.com/ThomasMBury/fatenet.
Affiliation(s)
- Mehrshad Sadria
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Thomas M Bury
- Department of Physiology, McGill University, Montreal, QC H3G 1Y6, Canada
18
Xu B, Braun R. Variational inference of single cell time series. bioRxiv: the preprint server for biology 2024:2024.08.29.610389. [PMID: 39257806 PMCID: PMC11384007 DOI: 10.1101/2024.08.29.610389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Time course single-cell RNA sequencing (scRNA-seq) enables researchers to probe genome-wide expression dynamics at the single cell scale. However, when gene expression is affected jointly by time and cellular identity, analyzing such data - including conducting cell type annotation and modeling cell type-dependent dynamics - becomes challenging. To address this problem, we propose SNOW (SiNgle cell flOW map), a deep learning algorithm to deconvolve single cell time series data into time-dependent and time-independent contributions. SNOW has a number of advantages. First, it enables cell type annotation based on the time-independent dimensions. Second, it yields a probabilistic model that can be used to discriminate between biological temporal variation and batch effects contaminating individual timepoints, and provides an approach to mitigate batch effects. Finally, it is capable of projecting cells forward and backward in time, yielding time series at the individual cell level. This enables gene expression dynamics to be studied without the need for clustering or pseudobulking, which can be error prone and result in information loss. We describe our probabilistic framework in detail and demonstrate SNOW using data from three distinct time course scRNA-seq studies. Our results show that SNOW is able to construct biologically meaningful latent spaces, remove batch effects, and generate realistic time-series at the single-cell level. By way of example, we illustrate how the latter may be used to enhance the detection of cell type-specific circadian gene expression rhythms, and may be readily extended to other time-series analyses.
Affiliation(s)
- Bingxian Xu
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- NSF-Simons National Institute for Theory and Mathematics in Biology, Chicago, IL 60611, USA
- Rosemary Braun
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- NSF-Simons National Institute for Theory and Mathematics in Biology, Chicago, IL 60611, USA
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208, USA
- Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208, USA
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL 60208, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
19
Cao X, Huang YA, You ZH, Shang X, Hu L, Hu PW, Huang ZA. scPriorGraph: constructing biosemantic cell-cell graphs with prior gene set selection for cell type identification from scRNA-seq data. Genome Biol 2024; 25:207. [PMID: 39103856 DOI: 10.1186/s13059-024-03357-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 07/29/2024] [Indexed: 08/07/2024] Open
Abstract
Cell type identification is an indispensable analytical step in single-cell data analyses. To address the high noise stemming from gene expression data, existing computational methods often overlook the biologically meaningful relationships between genes, opting to reduce all genes to a unified data space. We assume that such relationships can aid in characterizing cell type features and improving cell type recognition accuracy. To this end, we introduce scPriorGraph, a dual-channel graph neural network that integrates multi-level gene biosemantics. Experimental results demonstrate that scPriorGraph effectively aggregates feature values of similar cells using high-quality graphs, achieving state-of-the-art performance in cell type identification.
Affiliation(s)
- Xiyue Cao
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Yu-An Huang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China.
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China.
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
| | - Peng-Wei Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
| | - Zhi-An Huang
- Research Office, City University of Hong Kong (Dongguan), Dongguan, 523000, China
|
20
|
Zeng Z, Ma Y, Hu L, Tan B, Liu P, Wang Y, Xing C, Xiong Y, Du H. OmicVerse: a framework for bridging and deepening insights across bulk and single-cell sequencing. Nat Commun 2024; 15:5983. [PMID: 39013860 PMCID: PMC11252408 DOI: 10.1038/s41467-024-50194-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 06/28/2024] [Indexed: 07/18/2024] Open
Abstract
Single-cell sequencing is frequently affected by "omission" due to limitations in sequencing throughput, yet bulk RNA-seq may contain these ostensibly "omitted" cells. Here, we introduce the single cell trajectory blending from Bulk RNA-seq (BulkTrajBlend) algorithm, a component of the OmicVerse suite that leverages a Beta-Variational AutoEncoder for data deconvolution and graph neural networks for the discovery of overlapping communities. This approach effectively interpolates and restores the continuity of "omitted" cells within single-cell RNA sequencing datasets. Furthermore, OmicVerse provides an extensive toolkit for both bulk and single cell RNA-seq analysis, offering seamless access to diverse methodologies, streamlining computational processes, fostering exquisite data visualization, and facilitating the extraction of significant biological insights to advance scientific research.
Affiliation(s)
- Zehua Zeng
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China.
- Daxing Research Institute, University of Science and Technology Beijing, Beijing, China.
| | - Yuqing Ma
- Center of Precision Medicine and Healthcare, Tsinghua-Berkeley Shenzhen Institute, Shenzhen, Guangdong Province, China
- Institute of Biopharmaceutics and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong Province, China
| | - Lei Hu
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
| | - Bowen Tan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing, China
| | - Peng Liu
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China
| | - Yixuan Wang
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China
| | - Cencan Xing
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China.
- Daxing Research Institute, University of Science and Technology Beijing, Beijing, China.
| | - Yuanyan Xiong
- Key Laboratory of Gene Engineering of the Ministry of Education, Institute of Healthy Aging Research, School of Life Sciences, Sun-Yat-Sen University, Guangzhou, Guangdong, China.
| | - Hongwu Du
- School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing, China.
- Daxing Research Institute, University of Science and Technology Beijing, Beijing, China.
|
21
|
Sadria M, Layton A, Goyal S, Bader GD. Fatecode enables cell fate regulator prediction using classification-supervised autoencoder perturbation. CELL REPORTS METHODS 2024; 4:100819. [PMID: 38986613 PMCID: PMC11294839 DOI: 10.1016/j.crmeth.2024.100819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 11/20/2023] [Accepted: 06/18/2024] [Indexed: 07/12/2024]
Abstract
Cell reprogramming, which guides the conversion between cell states, is a promising technology for tissue repair and regeneration, with the ultimate goal of accelerating recovery from diseases or injuries. To accomplish this, regulators must be identified and manipulated to control cell fate. We propose Fatecode, a computational method that predicts cell fate regulators based only on single-cell RNA sequencing (scRNA-seq) data. Fatecode learns a latent representation of the scRNA-seq data using a deep learning-based classification-supervised autoencoder and then performs in silico perturbation experiments on the latent representation to predict genes that, when perturbed, would alter the original cell type distribution to increase or decrease the population size of a cell type of interest. We assessed Fatecode's performance using simulations from a mechanistic gene-regulatory network model and scRNA-seq data mapping blood and brain development of different organisms. Our results suggest that Fatecode can detect known cell fate regulators from single-cell transcriptomics datasets.
Affiliation(s)
- Mehrshad Sadria
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada.
| | - Anita Layton
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada; Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada; Department of Biology, University of Waterloo, Waterloo, ON, Canada; School of Pharmacy, University of Waterloo, Waterloo, ON, Canada
| | - Sidhartha Goyal
- Department of Physics, University of Toronto, Toronto, ON, Canada
| | - Gary D Bader
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Canadian Institute for Advanced Research (CIFAR), Toronto, ON, Canada
|
22
|
Otto DJ, Jordan C, Dury B, Dien C, Setty M. Quantifying cell-state densities in single-cell phenotypic landscapes using Mellon. Nat Methods 2024; 21:1185-1195. [PMID: 38890426 DOI: 10.1038/s41592-024-02302-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 05/08/2024] [Indexed: 06/20/2024]
Abstract
Cell-state density characterizes the distribution of cells along phenotypic landscapes and is crucial for unraveling the mechanisms that drive diverse biological processes. Here, we present Mellon, an algorithm for estimation of cell-state densities from high-dimensional representations of single-cell data. We demonstrate Mellon's efficacy by dissecting the density landscape of differentiating systems, revealing a consistent pattern of high-density regions corresponding to major cell types intertwined with low-density, rare transitory states. We present evidence implicating enhancer priming and the activation of master regulators in emergence of these transitory states. Mellon offers the flexibility to perform temporal interpolation of time-series data, providing a detailed view of cell-state dynamics during developmental processes. Mellon facilitates density estimation across various single-cell data modalities, scaling linearly with the number of cells. Our work underscores the importance of cell-state density in understanding the differentiation processes, and the potential of Mellon to provide insights into mechanisms guiding biological trajectories.
Affiliation(s)
- Dominik J Otto
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Cailin Jordan
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
| | - Brennan Dury
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Christine Dien
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Manu Setty
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
- Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA.
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA.
|
23
|
Moriel N, Memet E, Nitzan M. Optimal sequencing budget allocation for trajectory reconstruction of single cells. Bioinformatics 2024; 40:i446-i452. [PMID: 38940162 PMCID: PMC11211845 DOI: 10.1093/bioinformatics/btae258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
BACKGROUND: Charting cellular trajectories over gene expression is key to understanding dynamic cellular processes and their underlying mechanisms. While advances in single-cell RNA-sequencing technologies and computational methods have pushed forward the recovery of such trajectories, trajectory inference remains a challenge due to the noisy, sparse, and high-dimensional nature of single-cell data. This challenge can be alleviated by increasing either the number of cells sampled along the trajectory (breadth) or the sequencing depth, i.e. the number of reads captured per cell (depth). Generally, these two factors are coupled due to an inherent breadth-depth tradeoff that arises when the sequencing budget is constrained due to financial or technical limitations.
RESULTS: Here we study the optimal allocation of a fixed sequencing budget to optimize the recovery of trajectory attributes. Empirical results reveal that reconstruction accuracy of internal cell structure in expression space scales with the logarithm of either the breadth or depth of sequencing. We additionally observe a power law relationship between the optimal number of sampled cells and the corresponding sequencing budget. For linear trajectories, non-monotonicity in trajectory reconstruction across the breadth-depth tradeoff can impact downstream inference, such as expression pattern analysis along the trajectory. We demonstrate these results for five single-cell RNA-sequencing datasets encompassing differentiation of embryonic stem cells, pancreatic beta cells, hepatoblast and multipotent hematopoietic cells, as well as induced reprogramming of embryonic fibroblasts into neurons. By addressing the challenges of single-cell data, our study offers insights into maximizing the efficiency of cellular trajectory analysis through strategic allocation of sequencing resources.
Affiliation(s)
- Noa Moriel
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
| | - Edvin Memet
- Department of Physics, Harvard University, Cambridge, MA 02138, United States
| | - Mor Nitzan
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Racah Institute of Physics, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
|
24
|
Gao CF, Vaikuntanathan S, Riesenfeld SJ. Dissection and integration of bursty transcriptional dynamics for complex systems. Proc Natl Acad Sci U S A 2024; 121:e2306901121. [PMID: 38669186 PMCID: PMC11067469 DOI: 10.1073/pnas.2306901121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 03/06/2024] [Indexed: 04/28/2024] Open
Abstract
RNA velocity estimation is a potentially powerful tool to reveal the directionality of transcriptional changes in single-cell RNA-sequencing data, but it lacks accuracy, absent advanced metabolic labeling techniques. We developed an approach, TopicVelo, that disentangles simultaneous, yet distinct, dynamics by using a probabilistic topic model, a highly interpretable form of latent space factorization, to infer cells and genes associated with individual processes, thereby capturing cellular pluripotency or multifaceted functionality. Focusing on process-associated cells and genes enables accurate estimation of process-specific velocities via a master equation for a transcriptional burst model accounting for intrinsic stochasticity. The method obtains a global transition matrix by leveraging cell topic weights to integrate process-specific signals. In challenging systems, this method accurately recovers complex transitions and terminal states, while our use of first-passage time analysis provides insights into transient transitions. These results expand the limits of RNA velocity, empowering future studies of cell fate and functional responses.
Affiliation(s)
- Cheng Frank Gao
- Department of Chemistry, University of Chicago, Chicago, IL 60637
| | - Suriyanarayanan Vaikuntanathan
- Department of Chemistry, University of Chicago, Chicago, IL 60637
- Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
| | - Samantha J. Riesenfeld
- Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637
- Department of Medicine, University of Chicago, Chicago, IL 60637
- Committee on Immunology, Biological Sciences Division, University of Chicago, Chicago, IL 60637
|
25
|
Rosebrock D, Vingron M, Arndt PF. Modeling gene expression cascades during cell state transitions. iScience 2024; 27:109386. [PMID: 38500834 PMCID: PMC10946328 DOI: 10.1016/j.isci.2024.109386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 12/14/2023] [Accepted: 02/27/2024] [Indexed: 03/20/2024] Open
Abstract
During cellular processes such as differentiation or response to external stimuli, cells exhibit dynamic changes in their gene expression profiles. Single-cell RNA sequencing (scRNA-seq) can be used to investigate these dynamic changes. To this end, cells are typically ordered along a pseudotemporal trajectory which recapitulates the progression of cells as they transition from one cell state to another. We infer transcriptional dynamics by modeling the gene expression profiles in pseudotemporally ordered cells using a Bayesian inference approach. This enables ordering genes along transcriptional cascades, estimating differences in the timing of gene expression dynamics, and deducing regulatory gene interactions. Here, we apply this approach to scRNA-seq datasets derived from mouse embryonic forebrain and pancreas samples. This analysis demonstrates the utility of the method to derive the ordering of gene dynamics and regulatory relationships critical for proper cellular differentiation and maturation across a variety of developmental contexts.
Affiliation(s)
- Daniel Rosebrock
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Martin Vingron
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Peter F. Arndt
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
|
26
|
Ko KD, Sartorelli V. A deep learning adversarial autoencoder with dynamic batching displays high performance in denoising and ordering scRNA-seq data. iScience 2024; 27:109027. [PMID: 38361616 PMCID: PMC10867661 DOI: 10.1016/j.isci.2024.109027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 11/20/2023] [Accepted: 01/22/2024] [Indexed: 02/17/2024] Open
Abstract
By providing high-resolution of cell-to-cell variation in gene expression, single-cell RNA sequencing (scRNA-seq) offers insights into cell heterogeneity, differentiating dynamics, and disease mechanisms. However, challenges such as low capture rates and dropout events can introduce noise in data analysis. Here, we propose a deep neural generative framework, the dynamic batching adversarial autoencoder (DB-AAE), which excels at denoising scRNA-seq datasets. DB-AAE directly captures optimal features from input data and enhances feature preservation, including cell type-specific gene expression patterns. Comprehensive evaluation on simulated and real datasets demonstrates that DB-AAE outperforms other methods in denoising accuracy and biological signal preservation. It also improves the accuracy of other algorithms in establishing pseudo-time inference. This study highlights DB-AAE's effectiveness and potential as a valuable tool for enhancing the quality and reliability of downstream analyses in scRNA-seq research.
Affiliation(s)
- Kyung Dae Ko
- Laboratory of Muscle Stem Cells & Gene Regulation, NIAMS, NIH, Bethesda, MD, USA
| | - Vittorio Sartorelli
- Laboratory of Muscle Stem Cells & Gene Regulation, NIAMS, NIH, Bethesda, MD, USA
|
27
|
Li J, Pan X, Yuan Y, Shen HB. TFvelo: gene regulation inspired RNA velocity estimation. Nat Commun 2024; 15:1387. [PMID: 38360714 PMCID: PMC11258302 DOI: 10.1038/s41467-024-45661-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 01/30/2024] [Indexed: 02/17/2024] Open
Abstract
RNA velocity is closely related with cell fate and is an important indicator for the prediction of cell states with elegant physical explanation derived from single-cell RNA-seq data. Most existing RNA velocity models aim to extract dynamics from the phase delay between unspliced and spliced mRNA for each individual gene. However, unspliced/spliced mRNA abundance may not provide sufficient signal for dynamic modeling, leading to poor fit in phase portraits. Motivated by the idea that RNA velocity could be driven by the transcriptional regulation, we propose TFvelo, which expands RNA velocity concept to various single-cell datasets without relying on splicing information, by introducing gene regulatory information. Our experiments on synthetic data and multiple scRNA-Seq datasets show that TFvelo can accurately fit genes dynamics on phase portraits, and effectively infer cell pseudo-time and trajectory from RNA abundance data. TFvelo opens a robust and accurate avenue for modeling RNA velocity for single cell data.
Affiliation(s)
- Jiachen Li
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
| | - Ye Yuan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
|
28
|
Lederer AR, Leonardi M, Talamanca L, Herrera A, Droin C, Khven I, Carvalho HJF, Valente A, Mantes AD, Arabí PM, Pinello L, Naef F, Manno GL. Statistical inference with a manifold-constrained RNA velocity model uncovers cell cycle speed modulations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.18.576093. [PMID: 38328127 PMCID: PMC10849531 DOI: 10.1101/2024.01.18.576093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Across a range of biological processes, cells undergo coordinated changes in gene expression, resulting in transcriptome dynamics that unfold within a low-dimensional manifold. Single-cell RNA-sequencing (scRNA-seq) only measures temporal snapshots of gene expression. However, information on the underlying low-dimensional dynamics can be extracted using RNA velocity, which models unspliced and spliced RNA abundances to estimate the rate of change of gene expression. Available RNA velocity algorithms can be fragile and rely on heuristics that lack statistical control. Moreover, the estimated vector field is not dynamically consistent with the traversed gene expression manifold. Here, we develop a generative model of RNA velocity and a Bayesian inference approach that solves these problems. Our model couples velocity field and manifold estimation in a reformulated, unified framework, so as to coherently identify the parameters of an autonomous dynamical system. Focusing on the cell cycle, we implemented VeloCycle to study gene regulation dynamics on one-dimensional periodic manifolds and validated using live-imaging its ability to infer actual cell cycle periods. We benchmarked RNA velocity inference with sensitivity analyses and demonstrated one- and multiple-sample testing. We also conducted Markov chain Monte Carlo inference on the model, uncovering key relationships between gene-specific kinetics and our gene-independent velocity estimate. Finally, we applied VeloCycle to in vivo samples and in vitro genome-wide Perturb-seq, revealing regionally-defined proliferation modes in neural progenitors and the effect of gene knockdowns on cell cycle speed. Ultimately, VeloCycle expands the scRNA-seq analysis toolkit with a modular and statistically rigorous RNA velocity inference framework.
|
29
|
Cui H, Maan H, Vladoiu MC, Zhang J, Taylor MD, Wang B. DeepVelo: deep learning extends RNA velocity to multi-lineage systems with cell-specific kinetics. Genome Biol 2024; 25:27. [PMID: 38243313 PMCID: PMC10799431 DOI: 10.1186/s13059-023-03148-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/26/2023] [Accepted: 12/18/2023] [Indexed: 01/21/2024] Open
Abstract
Existing RNA velocity estimation methods strongly rely on predefined dynamics and cell-agnostic constant transcriptional kinetic rates, assumptions often violated in complex and heterogeneous single-cell RNA sequencing (scRNA-seq) data. Using a graph convolution network, DeepVelo overcomes these limitations by generalizing RNA velocity to cell populations containing time-dependent kinetics and multiple lineages. DeepVelo infers time-varying cellular rates of transcription, splicing, and degradation, recovers each cell's stage in the differentiation process, and detects functionally relevant driver genes regulating these processes. Application to various developmental and pathogenic processes demonstrates DeepVelo's capacity to study complex differentiation and lineage decision events in heterogeneous scRNA-seq data.
Affiliation(s)
- Haotian Cui
- Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
| | - Hassaan Maan
- Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Maria C Vladoiu
- Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Jiao Zhang
- The Arthur and Sonia Labatt Brain Tumor Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
- Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Michael D Taylor
- The Arthur and Sonia Labatt Brain Tumor Research Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
- Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada
- Baylor College of Medicine, Houston, TX, USA
- Texas Children's Hospital, Houston, TX, USA
| | - Bo Wang
- Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada.
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
- Vector Institute, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada.
|
30
|
Li S, Zhang P, Chen W, Ye L, Brannan KW, Le NT, Abe JI, Cooke JP, Wang G. A relay velocity model infers cell-dependent RNA velocity. Nat Biotechnol 2024; 42:99-108. [PMID: 37012448 PMCID: PMC10545816 DOI: 10.1038/s41587-023-01728-5] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 08/01/2022] [Accepted: 02/28/2023] [Indexed: 04/05/2023]
Abstract
RNA velocity provides an approach for inferring cellular state transitions from single-cell RNA sequencing (scRNA-seq) data. Conventional RNA velocity models infer universal kinetics from all cells in an scRNA-seq experiment, resulting in unpredictable performance in experiments with multi-stage and/or multi-lineage transition of cell states where the assumption of the same kinetic rates for all cells no longer holds. Here we present cellDancer, a scalable deep neural network that locally infers velocity for each cell from its neighbors and then relays a series of local velocities to provide single-cell resolution inference of velocity kinetics. In the simulation benchmark, cellDancer shows robust performance in multiple kinetic regimes, high dropout ratio datasets and sparse datasets. We show that cellDancer overcomes the limitations of existing RNA velocity models in modeling erythroid maturation and hippocampus development. Moreover, cellDancer provides cell-specific predictions of transcription, splicing and degradation rates, which we identify as potential indicators of cell fate in the mouse pancreas.
Affiliation(s)
- Shengyu Li
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX, USA
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, Cornell University, New York, NY, USA
| | - Pengzhi Zhang
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX, USA
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, Cornell University, New York, NY, USA
| | - Weiqing Chen
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Department of Physiology, Biophysics & Systems Biology, Weill Cornell Graduate School of Medical Science, Weill Cornell Medicine, Cornell University, Ithaca, NY, USA
| | - Lingqun Ye
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX, USA
| | - Kristopher W Brannan
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX, USA
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, Cornell University, New York, NY, USA
| | - Nhat-Tu Le
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, Cornell University, New York, NY, USA
| | - Jun-Ichi Abe
- Department of Cardiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - John P Cooke
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA
| | - Guangyu Wang
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA.
- Center for Cardiovascular Regeneration, Houston Methodist Research Institute, Houston, TX, USA.
- Center for RNA Therapeutics, Houston Methodist Research Institute, Houston, TX, USA.
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, Cornell University, New York, NY, USA.
|
31
|
Gayoso A, Weiler P, Lotfollahi M, Klein D, Hong J, Streets A, Theis FJ, Yosef N. Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells. Nat Methods 2024; 21:50-59. [PMID: 37735568 PMCID: PMC10776389 DOI: 10.1038/s41592-023-01994-w] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 08/12/2022] [Accepted: 08/08/2023] [Indexed: 09/23/2023]
Abstract
RNA velocity has been rapidly adopted to guide interpretation of transcriptional dynamics in snapshot single-cell data; however, current approaches for estimating RNA velocity lack effective strategies for quantifying uncertainty and determining the overall applicability to the system of interest. Here, we present veloVI (velocity variational inference), a deep generative modeling framework for estimating RNA velocity. veloVI learns a gene-specific dynamical model of RNA metabolism and provides a transcriptome-wide quantification of velocity uncertainty. We show that veloVI compares favorably to previous approaches with respect to goodness of fit, consistency across transcriptionally similar cells and stability across preprocessing pipelines for quantifying RNA abundance. Further, we demonstrate that veloVI's posterior velocity uncertainty can be used to assess whether velocity analysis is appropriate for a given dataset. Finally, we highlight veloVI as a flexible framework for modeling transcriptional dynamics by adapting the underlying dynamical model to use time-dependent transcription rates.
Affiliation(s)
- Adam Gayoso
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Philipp Weiler
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Mohammad Lotfollahi
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Wellcome Sanger Institute, Cambridge, UK
| | - Dominik Klein
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Justin Hong
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Computer Science, Columbia University, New York, NY, USA
| | - Aaron Streets
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Department of Mathematics, Technical University of Munich, Munich, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
| | - Nir Yosef
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA.
| |
|
32
|
Leary JR, Bacher R. Interpretable trajectory inference with single-cell Linear Adaptive Negative-binomial Expression (scLANE) testing. bioRxiv 2023:2023.12.19.572477. [PMID: 38187622 PMCID: PMC10769309 DOI: 10.1101/2023.12.19.572477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The rapid proliferation of trajectory inference methods for single-cell RNA-seq data has allowed researchers to investigate complex biological processes by examining underlying gene expression dynamics. After estimating a latent cell ordering, statistical models are used to determine which genes exhibit changes in expression that are significantly associated with progression through the biological trajectory. While a few techniques for performing trajectory differential expression exist, most rely on the flexibility of generalized additive models in order to account for the inherent nonlinearity of changes in gene expression. As such, the results can be difficult to interpret, and biological conclusions often rest on subjective visual inspections of the most dynamic genes. To address this challenge, we propose scLANE testing, which is built around an interpretable generalized linear model and handles nonlinearity with basis splines chosen empirically for each gene. In addition, extensions to estimating equations and mixed models allow for reliable trajectory testing under complex experimental designs. After validating the accuracy of scLANE under several different simulation scenarios, we apply it to a set of diverse biological datasets and display its ability to provide novel biological information when used downstream of both pseudotime and RNA velocity estimation methods.
Affiliation(s)
- Jack R. Leary
- Department of Biostatistics, College of Public Health and Health Professions, University of Florida, Gainesville, FL 32610, USA
| | - Rhonda Bacher
- Department of Biostatistics, College of Public Health and Health Professions, University of Florida, Gainesville, FL 32610, USA
| |
|
33
|
Zheng R, Xu Z, Zeng Y, Wang E, Li M. SPIDE: A single cell potency inference method based on the local cell-specific network entropy. Methods 2023; 220:90-97. [PMID: 37952704 DOI: 10.1016/j.ymeth.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Revised: 10/25/2023] [Accepted: 11/06/2023] [Indexed: 11/14/2023] Open
Abstract
For a given single-cell RNA-seq dataset, it is critical to pinpoint key cellular stages and quantify cells' differentiation potency along a differentiation pathway in a time-course manner. Several methods based on the entropy of gene functions or of the PPI network have been proposed to solve this problem. Nevertheless, these methods still suffer from inaccurate interactions and noise originating from the scRNA-seq profile. In this study, we propose a cell potency inference method based on cell-specific network entropy, called SPIDE. SPIDE introduces a local weighted cell-specific network for each cell to maintain cell heterogeneity and calculates the entropy by incorporating gene expression with network structure. We compared three cell entropy estimation models on eight scRNA-seq datasets. The results show that SPIDE's estimates are consistent with the real cell differentiation potency on most datasets. Moreover, SPIDE accurately recovers the continuous changes of potency during cell differentiation and significantly correlates with the stemness of tumor cells in colorectal cancer. To conclude, our study provides a universal and accurate framework for cell entropy estimation, which deepens our understanding of cell differentiation, the development of diseases and other related biological research.
Affiliation(s)
- Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Ziwei Xu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Yanping Zeng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Edwin Wang
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary T2N 4N1, Alberta, Canada
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
| |
|
34
|
Zheng SC, Stein-O’Brien G, Boukas L, Goff LA, Hansen KD. Pumping the brakes on RNA velocity by understanding and interpreting RNA velocity estimates. Genome Biol 2023; 24:246. [PMID: 37885016 PMCID: PMC10601342 DOI: 10.1186/s13059-023-03065-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 10/31/2022] [Accepted: 09/19/2023] [Indexed: 10/28/2023] Open
Abstract
BACKGROUND: RNA velocity analysis of single cells offers the potential to predict temporal dynamics from gene expression. In many systems, RNA velocity has been observed to produce a vector field that qualitatively reflects known features of the system. However, the limitations of RNA velocity estimates are still not well understood. RESULTS: We analyze the impact of different steps in the RNA velocity workflow on direction and speed. We consider both high-dimensional velocity estimates and low-dimensional velocity vector fields mapped onto an embedding. We conclude the transition probability method for mapping velocity estimates onto an embedding is effectively interpolating in the embedding space. Our findings reveal a significant dependence of the RNA velocity workflow on smoothing via the k-nearest-neighbors (k-NN) graph of the observed data. This reliance results in considerable estimation errors for both direction and speed in both high- and low-dimensional settings when the k-NN graph fails to accurately represent the true data structure; this is an unknown feature of real data. RNA velocity performs poorly at estimating speed in both low- and high-dimensional spaces, except in very low noise settings. We introduce a novel quality measure that can identify when RNA velocity should not be used. CONCLUSIONS: Our findings emphasize the importance of choices in the RNA velocity workflow and highlight critical limitations of data analysis. We advise against over-interpreting expression dynamics using RNA velocity, particularly in terms of speed. Finally, we emphasize that the use of RNA velocity in assessing the correctness of a low-dimensional embedding is circular.
Affiliation(s)
- Shijie C. Zheng
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA
| | - Genevieve Stein-O’Brien
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
- Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD USA
- Kavli Neurodiscovery Institute, Johns Hopkins University, Baltimore, MD USA
- Quantitative Sciences Division, Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD USA
| | - Leandros Boukas
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
| | - Loyal A. Goff
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
- Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD USA
- Kavli Neurodiscovery Institute, Johns Hopkins University, Baltimore, MD USA
| | - Kasper D. Hansen
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD USA
| |
|
35
|
Cota P, Saber L, Taskin D, Jing C, Bastidas-Ponce A, Vanheusden M, Shahryari A, Sterr M, Burtscher I, Bakhti M, Lickert H. NEUROD2 function is dispensable for human pancreatic β cell specification. Front Endocrinol (Lausanne) 2023; 14:1286590. [PMID: 37955006 PMCID: PMC10634430 DOI: 10.3389/fendo.2023.1286590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 10/09/2023] [Indexed: 11/14/2023] Open
Abstract
Introduction: The molecular programs regulating human pancreatic endocrine cell induction and fate allocation are not well deciphered. Here, we investigated the spatiotemporal expression pattern and the function of the neurogenic differentiation factor 2 (NEUROD2) during human endocrinogenesis. Methods: Using CRISPR-Cas9 gene editing, we generated a reporter knock-in transcription factor (TF) knock-out human induced pluripotent stem cell (iPSC) line in which the open reading frame of both NEUROD2 alleles is replaced by a nuclear histone 2B-Venus reporter (NEUROD2nVenus/nVenus). Results: We identified a transient expression of NEUROD2 mRNA and its nuclear Venus reporter activity at the stage of human endocrine progenitor formation in an iPSC differentiation model. This expression profile is similar to what was previously reported in mice, uncovering an evolutionarily conserved gene expression pattern of NEUROD2 during endocrinogenesis. In vitro differentiation of the generated homozygous NEUROD2nVenus/nVenus iPSC line towards human endocrine lineages uncovered no significant impact of the loss of NEUROD2 on endocrine cell induction. Moreover, analysis of endocrine cell specification revealed no striking changes in the generation of insulin-producing β cells and glucagon-secreting α cells in the absence of NEUROD2. Discussion: Overall, our results suggest that NEUROD2 is expendable for human β cell formation in vitro.
Affiliation(s)
- Perla Cota
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- School of Medicine, Technical University of Munich (TUM), Munich, Germany
| | - Lama Saber
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- School of Medicine, Technical University of Munich (TUM), Munich, Germany
| | - Damla Taskin
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
| | - Changying Jing
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Munich Medical Research School (MMRS), Ludwig Maximilian University (LMU), Munich, Germany
| | - Aimée Bastidas-Ponce
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Matthew Vanheusden
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
| | - Alireza Shahryari
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
| | - Michael Sterr
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Ingo Burtscher
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Mostafa Bakhti
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Heiko Lickert
- Institute of Diabetes and Regeneration Research, Helmholtz Munich, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- School of Medicine, Technical University of Munich (TUM), Munich, Germany
| |
|
36
|
Ma Z, Zhang X, Zhong W, Yi H, Chen X, Zhao Y, Ma Y, Song E, Xu T. Deciphering early human pancreas development at the single-cell level. Nat Commun 2023; 14:5354. [PMID: 37660175 PMCID: PMC10475098 DOI: 10.1038/s41467-023-40893-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 11/13/2022] [Accepted: 08/15/2023] [Indexed: 09/04/2023] Open
Abstract
Understanding pancreas development can provide clues for better treatments of pancreatic diseases. However, the molecular heterogeneity and developmental trajectory of the early human pancreas are poorly explored. Here, we performed large-scale single-cell RNA sequencing and single-cell assay for transposase accessible chromatin sequencing of human embryonic pancreas tissue obtained from first-trimester embryos. We unraveled the molecular heterogeneity, developmental trajectories and regulatory networks of the major cell types. The results reveal that dorsal pancreatic multipotent cells in humans exhibit different gene expression patterns than ventral multipotent cells. Pancreato-biliary progenitors that generate ventral multipotent cells in humans were identified. Notch and MAPK signals from mesenchymal cells regulate the differentiation of multipotent cells into trunk and duct cells. Notably, we identified endocrine progenitor subclusters with different differentiation potentials. Although the developmental trajectories are largely conserved between humans and mice, some distinct gene expression patterns have also been identified. Overall, we provide a comprehensive landscape of early human pancreas development to understand its lineage transitions and molecular complexity.
Affiliation(s)
- Zhuo Ma
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Xiaofei Zhang
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Hainan Provincial Key Laboratory for Human Reproductive Medicine and Genetic Research, Key Laboratory of Reproductive Health Diseases Research and Translation (Hainan Medical University), Ministry of Education, The First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 570102, China
| | - Wen Zhong
- Science for Life Laboratory, Department of Biomedical and Clinical Sciences (BKV), Linköping University, Linköping, 581 83, Sweden
- Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
| | - Hongyan Yi
- Hainan Provincial Key Laboratory for Human Reproductive Medicine and Genetic Research, Key Laboratory of Reproductive Health Diseases Research and Translation (Hainan Medical University), Ministry of Education, The First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 570102, China
| | - Xiaowei Chen
- Center for High Throughput Sequencing, Core Facility for Protein Research, Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yinsuo Zhao
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanlin Ma
- Hainan Provincial Key Laboratory for Human Reproductive Medicine and Genetic Research, Key Laboratory of Reproductive Health Diseases Research and Translation (Hainan Medical University), Ministry of Education, The First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 570102, China.
| | - Eli Song
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
- Guangzhou Laboratory, Guangzhou, 510005, China.
- Central Hospital Affiliated to Shandong First Medical University, Jinan, 250013, China.
- Medical Science and Technology Innovation Center, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250062, China.
| |
|
37
|
Hrovatin K, Bastidas-Ponce A, Bakhti M, Zappia L, Büttner M, Salinno C, Sterr M, Böttcher A, Migliorini A, Lickert H, Theis FJ. Delineating mouse β-cell identity during lifetime and in diabetes with a single cell atlas. Nat Metab 2023; 5:1615-1637. [PMID: 37697055 PMCID: PMC10513934 DOI: 10.1038/s42255-023-00876-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 12/21/2022] [Accepted: 07/26/2023] [Indexed: 09/13/2023]
Abstract
Although multiple pancreatic islet single-cell RNA-sequencing (scRNA-seq) datasets have been generated, a consensus on pancreatic cell states in development, homeostasis and diabetes as well as the value of preclinical animal models is missing. Here, we present an scRNA-seq cross-condition mouse islet atlas (MIA), a curated resource for interactive exploration and computational querying. We integrate over 300,000 cells from nine scRNA-seq datasets consisting of 56 samples, varying in age, sex and diabetes models, including an autoimmune type 1 diabetes model (NOD), a glucotoxicity/lipotoxicity type 2 diabetes model (db/db) and a chemical streptozotocin β-cell ablation model. The β-cell landscape of MIA reveals new cell states during disease progression and cross-publication differences between previously suggested marker genes. We show that β-cells in the streptozotocin model transcriptionally correlate with those in human type 2 diabetes and mouse db/db models, but are less similar to human type 1 diabetes and mouse NOD β-cells. We also report pathways that are shared between β-cells in immature, aged and diabetes models. MIA enables a comprehensive analysis of β-cell responses to different stressors, providing a roadmap for the understanding of β-cell plasticity, compensation and demise.
Affiliation(s)
- Karin Hrovatin
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Aimée Bastidas-Ponce
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Medical Faculty, Technical University of Munich, Munich, Germany
| | - Mostafa Bakhti
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Luke Zappia
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Maren Büttner
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Genomics and Immunoregulation, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany
- Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Bonn, Germany
| | - Ciro Salinno
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Medical Faculty, Technical University of Munich, Munich, Germany
| | - Michael Sterr
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Anika Böttcher
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Adriana Migliorini
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- McEwen Stem Cell Institute, University Health Network (UHN), Toronto, Ontario, Canada
| | - Heiko Lickert
- Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Neuherberg, Germany.
- German Center for Diabetes Research (DZD), Neuherberg, Germany.
- Medical Faculty, Technical University of Munich, Munich, Germany.
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Department of Mathematics, Technical University of Munich, Garching, Germany.
| |
|
38
|
Tixi W, Maldonado M, Chang YT, Chiu A, Yeung W, Parveen N, Nelson MS, Hart R, Wang S, Hsu WJ, Fueger P, Kopp JL, Huising MO, Dhawan S, Shih HP. Coordination between ECM and cell-cell adhesion regulates the development of islet aggregation, architecture, and functional maturation. eLife 2023; 12:e90006. [PMID: 37610090 PMCID: PMC10482429 DOI: 10.7554/elife.90006] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/07/2023] [Accepted: 07/12/2023] [Indexed: 08/24/2023] Open
Abstract
Pancreatic islets are three-dimensional cell aggregates consisting of unique cellular composition, cell-to-cell contacts, and interactions with blood vessels. Cell aggregation is essential for islet endocrine function; however, it remains unclear how developing islets establish aggregation. By combining genetic animal models, imaging tools, and gene expression profiling, we demonstrate that islet aggregation is regulated by extracellular matrix signaling and cell-cell adhesion. Islet endocrine cell-specific inactivation of extracellular matrix receptor integrin β1 disrupted blood vessel interactions but promoted cell-cell adhesion and the formation of larger islets. In contrast, ablation of cell-cell adhesion molecule α-catenin promoted blood vessel interactions yet compromised islet clustering. Simultaneous removal of integrin β1 and α-catenin disrupts islet aggregation and the endocrine cell maturation process, demonstrating that establishment of islet aggregates is essential for functional maturation. Our study provides new insights into understanding the fundamental self-organizing mechanism for islet aggregation, architecture, and functional maturation.
Affiliation(s)
- Wilma Tixi
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Maricela Maldonado
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
- Department of Biomedical Engineering, College of Engineering, California State University, Long Beach, Long Beach, United States
| | - Ya-Ting Chang
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Amy Chiu
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Wilson Yeung
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Nazia Parveen
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Michael S Nelson
- Light Microscopy Core, Beckman Research Institute, City of Hope, Duarte, United States
| | - Ryan Hart
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, Davis, United States
| | - Shihao Wang
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
| | - Wu Jih Hsu
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
| | - Patrick Fueger
- Department of Molecular & Cellular Endocrinology, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Janel L Kopp
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
| | - Mark O Huising
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, Davis, United States
- Department of Physiology and Membrane Biology, School of Medicine, University of California, Davis, Davis, United States
| | - Sangeeta Dhawan
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| | - Hung Ping Shih
- Department of Translational Research and Cellular Therapeutics, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, United States
| |
|
39
|
Jiménez S, Schreiber V, Mercier R, Gradwohl G, Molina N. Characterization of cell-fate decision landscapes by estimating transcription factor dynamics. Cell Rep Methods 2023; 3:100512. [PMID: 37533652 PMCID: PMC10391345 DOI: 10.1016/j.crmeth.2023.100512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/06/2022] [Revised: 03/23/2023] [Accepted: 06/01/2023] [Indexed: 08/04/2023]
Abstract
Time-specific modulation of gene expression during differentiation by transcription factors promotes cell diversity. However, estimating their dynamic regulatory activity at the single-cell level and in a high-throughput manner remains challenging. We present FateCompass, an integrative approach that utilizes single-cell transcriptomics data to identify lineage-specific transcription factors throughout differentiation. By combining a probabilistic framework with RNA velocities or differentiation potential, we estimate transition probabilities, while a linear model of gene regulation is employed to compute transcription factor activities. Considering dynamic changes and correlations of expression and activities, FateCompass identifies lineage-specific regulators. Our validation using in silico data and application to pancreatic endocrine cell differentiation datasets highlight both known and potentially novel lineage-specific regulators. Notably, we uncovered undescribed transcription factors of an enterochromaffin-like population during in vitro differentiation toward β-like cells. FateCompass provides a valuable framework for hypothesis generation, advancing our understanding of the gene regulatory networks driving cell-fate decisions.
Affiliation(s)
- Sara Jiménez
- Université de Strasbourg, Strasbourg, France
- CNRS, UMR 7104, 67400 Illkirch, France
- INSERM, UMR-S 1258, 67400 Illkirch, France
- IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400 Illkirch, France
| | - Valérie Schreiber
- Université de Strasbourg, Strasbourg, France
- CNRS, UMR 7104, 67400 Illkirch, France
- INSERM, UMR-S 1258, 67400 Illkirch, France
- IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400 Illkirch, France
| | - Reuben Mercier
- Université de Strasbourg, Strasbourg, France
- CNRS, UMR 7104, 67400 Illkirch, France
- INSERM, UMR-S 1258, 67400 Illkirch, France
- IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400 Illkirch, France
| | - Gérard Gradwohl
- Université de Strasbourg, Strasbourg, France
- CNRS, UMR 7104, 67400 Illkirch, France
- INSERM, UMR-S 1258, 67400 Illkirch, France
- IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400 Illkirch, France
| | - Nacho Molina
- Université de Strasbourg, Strasbourg, France
- CNRS, UMR 7104, 67400 Illkirch, France
- INSERM, UMR-S 1258, 67400 Illkirch, France
- IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400 Illkirch, France
| |
|
40
|
Otto D, Jordan C, Dury B, Dien C, Setty M. Quantifying Cell-State Densities in Single-Cell Phenotypic Landscapes using Mellon. bioRxiv 2023:2023.07.09.548272. [PMID: 37502954 PMCID: PMC10369887 DOI: 10.1101/2023.07.09.548272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Cell-state density characterizes the distribution of cells along phenotypic landscapes and is crucial for unraveling the mechanisms that drive cellular differentiation, regeneration, and disease. Here, we present Mellon, a novel computational algorithm for high-resolution estimation of cell-state densities from single-cell data. We demonstrate Mellon's efficacy by dissecting the density landscape of various differentiating systems, revealing a consistent pattern of high-density regions corresponding to major cell types intertwined with low-density, rare transitory states. Utilizing hematopoietic stem cell fate specification to B-cells as a case study, we present evidence implicating enhancer priming and the activation of master regulators in the emergence of these transitory states. Mellon offers the flexibility to perform temporal interpolation of time-series data, providing a detailed view of cell-state dynamics during the inherently continuous developmental processes. Scalable and adaptable, Mellon facilitates density estimation across various single-cell data modalities, scaling linearly with the number of cells. Our work underscores the importance of cell-state density in understanding the differentiation processes, and the potential of Mellon to provide new insights into the regulatory mechanisms guiding cellular fate decisions.
Affiliation(s)
- Dominik Otto
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
| | - Cailin Jordan
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
- Molecular and Cellular Biology Program, University of Washington, Seattle WA
| | - Brennan Dury
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
| | - Christine Dien
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
| | - Manu Setty
- Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle WA
- Computational Biology Program, Public Health Sciences Division, Seattle WA
- Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle WA
|
41
|
Li Q. scTour: a deep learning architecture for robust inference and accurate prediction of cellular dynamics. Genome Biol 2023; 24:149. [PMID: 37353848 PMCID: PMC10290357 DOI: 10.1186/s13059-023-02988-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/27/2022] [Accepted: 06/13/2023] [Indexed: 06/25/2023] Open
Abstract
Despite the continued efforts, a batch-insensitive tool that can both infer and predict the developmental dynamics using single-cell genomics is lacking. Here, I present scTour, a novel deep learning architecture to perform robust inference and accurate prediction of cellular dynamics with minimal influence from batch effects. For inference, scTour simultaneously estimates the developmental pseudotime, delineates the vector field, and maps the transcriptomic latent space under a single, integrated framework. For prediction, scTour precisely reconstructs the underlying dynamics of unseen cellular states or a new independent dataset. scTour's functionalities are demonstrated in a variety of biological processes from 19 datasets.
Affiliation(s)
- Qian Li
- Department of Pathology, University of Cambridge, Cambridge, UK.
|
42
|
Bohuslavova R, Fabriciova V, Lebrón-Mora L, Malfatti J, Smolik O, Valihrach L, Benesova S, Zucha D, Berkova Z, Saudek F, Evans SM, Pavlinkova G. ISL1 controls pancreatic alpha cell fate and beta cell maturation. Cell Biosci 2023; 13:53. [PMID: 36899442 PMCID: PMC9999528 DOI: 10.1186/s13578-023-01003-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/10/2022] [Accepted: 03/01/2023] [Indexed: 03/12/2023] Open
Abstract
BACKGROUND Glucose homeostasis is dependent on functional pancreatic α and β cells. The mechanisms underlying the generation and maturation of these endocrine cells remain unclear. RESULTS We unravel the molecular mode of action of ISL1 in controlling α cell fate and the formation of functional β cells in the pancreas. By combining transgenic mouse models, transcriptomic and epigenomic profiling, we uncover that elimination of Isl1 results in a diabetic phenotype with a complete loss of α cells, disrupted pancreatic islet architecture, downregulation of key β-cell regulators and maturation markers of β cells, and an enrichment in an intermediate endocrine progenitor transcriptomic profile. CONCLUSIONS Mechanistically, apart from the altered transcriptome of pancreatic endocrine cells, Isl1 elimination results in altered silencing H3K27me3 histone modifications in the promoter regions of genes that are essential for endocrine cell differentiation. Our results thus show that ISL1 transcriptionally and epigenetically controls α cell fate competence, and β cell maturation, suggesting that ISL1 is a critical component for generating functional α and β cells.
Affiliation(s)
- Romana Bohuslavova
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia.
| | - Valeria Fabriciova
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Laura Lebrón-Mora
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Jessica Malfatti
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Ondrej Smolik
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Lukas Valihrach
- Laboratory of Gene Expression, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Sarka Benesova
- Laboratory of Gene Expression, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Daniel Zucha
- Laboratory of Gene Expression, Institute of Biotechnology CAS, 25250, Vestec, Czechia
| | - Zuzana Berkova
- Laboratory of Pancreatic Islets, Institute for Clinical and Experimental Medicine, 14021, Prague, Czechia
| | - Frantisek Saudek
- Laboratory of Pancreatic Islets, Institute for Clinical and Experimental Medicine, 14021, Prague, Czechia
| | - Sylvia M Evans
- Department of Pharmacology; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, USA
| | - Gabriela Pavlinkova
- Laboratory of Molecular Pathogenetics, Institute of Biotechnology CAS, 25250, Vestec, Czechia.
|
43
|
Lotfollahi M, Rybakov S, Hrovatin K, Hediyeh-Zadeh S, Talavera-López C, Misharin AV, Theis FJ. Biologically informed deep learning to query gene programs in single-cell atlases. Nat Cell Biol 2023; 25:337-350. [PMID: 36732632 PMCID: PMC9928587 DOI: 10.1038/s41556-022-01072-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/02/2022] [Accepted: 12/08/2022] [Indexed: 02/04/2023]
Abstract
The increasing availability of large-scale single-cell atlases has enabled the detailed description of cell states. In parallel, advances in deep learning allow rapid analysis of newly generated query datasets by mapping them into reference atlases. However, existing data transformations learned to map query data are not easily explainable using biologically known concepts such as genes or pathways. Here we propose expiMap, a biologically informed deep-learning architecture that enables single-cell reference mapping. ExpiMap learns to map cells into biologically understandable components representing known 'gene programs'. The activity of each cell for a gene program is learned while simultaneously refining them and learning de novo programs. We show that expiMap compares favourably to existing methods while bringing an additional layer of interpretability to integrative single-cell analysis. Furthermore, we demonstrate its applicability to analyse single-cell perturbation responses in different tissues and species and resolve responses of patients who have coronavirus disease 2019 to different treatments across cell types.
Affiliation(s)
- Mohammad Lotfollahi
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Wellcome Sanger Institute, Cambridge, UK
| | - Sergei Rybakov
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Karin Hrovatin
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Soroor Hediyeh-Zadeh
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Bioinformatics Division, WEHI, Melbourne, Victoria, Australia
| | - Carlos Talavera-López
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Division of Infectious Diseases and Tropical Medicine, Ludwig-Maximilian-Universität Klinikum, Munich, Germany
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Wellcome Sanger Institute, Cambridge, UK.
- Department of Mathematics, Technical University of Munich, Munich, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
|
44
|
Chen J, Xu H, Tao W, Chen Z, Zhao Y, Han JDJ. Transformer for one stop interpretable cell type annotation. Nat Commun 2023; 14:223. [PMID: 36641532 PMCID: PMC9840170 DOI: 10.1038/s41467-023-35923-4] [Citation(s) in RCA: 74] [Impact Index Per Article: 37.0] [Received: 07/29/2022] [Accepted: 01/09/2023] [Indexed: 01/15/2023] Open
Abstract
Consistent annotation transfer from reference dataset to query dataset is fundamental to the development and reproducibility of single-cell research. Compared with traditional annotation methods, deep learning based methods are faster and more automated. A series of useful single cell analysis tools based on autoencoder architecture have been developed but these struggle to strike a balance between depth and interpretability. Here, we present TOSICA, a multi-head self-attention deep learning model based on Transformer that enables interpretable cell type annotation using biologically understandable entities, such as pathways or regulons. We show that TOSICA achieves fast and accurate one-stop annotation and batch-insensitive integration while providing biologically interpretable insights for understanding cellular behavior during development and disease progressions. We demonstrate TOSICA's advantages by applying it to scRNA-seq data of tumor-infiltrating immune cells, and CD14+ monocytes in COVID-19 to reveal rare cell types, heterogeneity and dynamic trajectories associated with disease progression and severity.
Affiliation(s)
- Jiawei Chen
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
| | - Hao Xu
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
| | - Wanyu Tao
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
| | - Zhaoxiong Chen
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
| | - Yuxuan Zhao
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
| | - Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
|
45
|
Sun ED, Ma R, Zou J. Dynamic visualization of high-dimensional data. NATURE COMPUTATIONAL SCIENCE 2023; 3:86-100. [PMID: 38177955 DOI: 10.1038/s43588-022-00380-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/02/2022] [Accepted: 11/16/2022] [Indexed: 01/06/2024]
Abstract
Dimensionality reduction (DR) is commonly used to project high-dimensional data into lower dimensions for visualization, which could then generate new insights and hypotheses. However, DR algorithms introduce distortions in the visualization and cannot faithfully represent all relations in the data. Thus, there is a need for methods to assess the reliability of DR visualizations. Here we present DynamicViz, a framework for generating dynamic visualizations that capture the sensitivity of DR visualizations to perturbations in the data resulting from bootstrap sampling. DynamicViz can be applied to all commonly used DR methods. We show the utility of dynamic visualizations in diagnosing common interpretative pitfalls of static visualizations and extending existing single-cell analyses. We introduce the variance score to quantify the dynamic variability of observations in these visualizations. The variance score characterizes natural variability in the data and can be used to optimize DR algorithm implementations.
Affiliation(s)
- Eric D Sun
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Rong Ma
- Department of Statistics, Stanford University, Stanford, CA, USA
| | - James Zou
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
|
46
|
Weiler P, Van den Berge K, Street K, Tiberi S. A Guide to Trajectory Inference and RNA Velocity. Methods Mol Biol 2022; 2584:269-292. [PMID: 36495456 DOI: 10.1007/978-1-0716-2756-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Technological developments have led to an explosion of high-throughput single-cell data, which are revealing unprecedented perspectives on cell identity. Recently, significant attention has focused on investigating, from single-cell RNA-sequencing (scRNA-seq) data, cellular dynamic processes, such as cell differentiation, cell cycle and cell (de)activation. In particular, trajectory inference methods, by ordering cells along a trajectory, allow estimating a differentiation tree of cells. While trajectory inference tools typically work with gene expression levels, common scRNA-seq protocols allow the identification and quantification of unspliced pre-mRNAs and mature spliced mRNAs for each gene. By exploiting the abundance of unspliced and spliced mRNA, one can infer the RNA velocity of individual cells, i.e., the time derivative of the gene expression state of cells. Whereas traditional trajectory inference methods reconstruct cellular dynamics given a population of cells of varying maturity, RNA velocity relies on a dynamical model describing splicing dynamics. Here, we initially discuss conceptual and theoretical aspects of both approaches, then illustrate how they can be combined together, and finally present an example use case on real data.
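As a rough illustration of the "time derivative" idea in this abstract, the sketch below uses the widely cited steady-state splicing model, ds/dt = beta * u - gamma * s, for a single gene. It is not taken from the chapter itself: the function names fit_gamma and velocity, the toy counts, and the choice beta = 1 are all assumptions made for the example, and published tools use more elaborate fitting procedures.

```python
# Illustrative sketch only (not the chapter's code): steady-state RNA velocity
# for one gene, assuming ds/dt = beta * u - gamma * s with beta fixed to 1.

def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for u ~ gamma * s."""
    num = sum(u * s for u, s in zip(unspliced, spliced))
    den = sum(s * s for s in spliced)
    if den == 0:
        raise ValueError("spliced counts are all zero; gamma is undefined")
    return num / den

def velocity(unspliced, spliced, gamma, beta=1.0):
    """Per-cell velocity: positive means the gene is being induced."""
    return [beta * u - gamma * s for u, s in zip(unspliced, spliced)]

# Hypothetical toy counts: four cells lying exactly on the line u = 0.5 * s.
spliced = [1.0, 2.0, 3.0, 4.0]
unspliced = [0.5, 1.0, 1.5, 2.0]

gamma = fit_gamma(unspliced, spliced)
steady_state_v = velocity(unspliced, spliced, gamma)

# A cell with more unspliced mRNA than the steady-state line predicts.
induced_v = velocity([3.0], [2.0], gamma)
```

Cells on the fitted line get zero velocity, while a cell carrying excess unspliced mRNA gets a positive value, which is the sense in which the unspliced-to-spliced ratio hints at the direction of change.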
Affiliation(s)
- Philipp Weiler
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Koen Van den Berge
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Department of Statistics, University of California, Berkeley, CA, USA
| | - Kelly Street
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Simone Tiberi
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
- Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland.
|
47
|
Chen Z, King WC, Hwang A, Gerstein M, Zhang J. DeepVelo: Single-cell transcriptomic deep velocity field learning with neural ordinary differential equations. SCIENCE ADVANCES 2022; 8:eabq3745. [PMID: 36449617 PMCID: PMC9710871 DOI: 10.1126/sciadv.abq3745] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Indexed: 05/14/2023]
Abstract
Recent advances in single-cell sequencing technologies have provided unprecedented opportunities to measure the gene expression profile and RNA velocity of individual cells. However, modeling transcriptional dynamics is computationally challenging because of the high-dimensional, sparse nature of the single-cell gene expression measurements and the nonlinear regulatory relationships. Here, we present DeepVelo, a neural network-based ordinary differential equation that can model complex transcriptome dynamics by describing continuous-time gene expression changes within individual cells. We apply DeepVelo to public datasets from different sequencing platforms to (i) formulate transcriptome dynamics on different time scales, (ii) measure the instability of cell states, and (iii) identify developmental driver genes via perturbation analysis. Benchmarking against the state-of-the-art methods shows that DeepVelo can learn a more accurate representation of the velocity field. Furthermore, our perturbation studies reveal that single-cell dynamical systems could exhibit chaotic properties. In summary, DeepVelo allows data-driven discoveries of differential equations that delineate single-cell transcriptome dynamics.
Affiliation(s)
- Zhanlin Chen
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
| | - William C. King
- Healthcare and Life Sciences, Microsoft, Redmond, WA 98052, USA
| | - Aheyon Hwang
- Mathematical, Computational, and Systems Biology, University of California, Irvine, Irvine, CA 92697, USA
| | - Mark Gerstein
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Corresponding author. (M.G.); (J.Z.)
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, USA
- Corresponding author. (M.G.); (J.Z.)
|
48
|
Zeng Z, Zhao S, Peng Y, Hu X, Yin Z. Cascade Forest-Based Model for Prediction of RNA Velocity. MOLECULES (BASEL, SWITZERLAND) 2022; 27:molecules27227873. [PMID: 36431973 PMCID: PMC9698518 DOI: 10.3390/molecules27227873] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2022] [Revised: 11/08/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022]
Abstract
In recent years, single-cell RNA sequencing technology (scRNA-seq) has developed rapidly and has been widely used in biological and medical research, such as in expression heterogeneity and transcriptome dynamics of single cells. The investigation of RNA velocity is a new topic in the study of cellular dynamics using single-cell RNA sequencing data. It can recover directional dynamic information from single-cell transcriptomics by linking measurements to the underlying dynamics of gene expression. Predicting the RNA velocity vector of each cell based on its gene expression data and formulating RNA velocity prediction as a classification problem is a new research direction. In this paper, we develop a cascade forest model to predict RNA velocity. Compared with other popular ensemble classifiers, such as XGBoost, RandomForest, LightGBM, NGBoost, and TabNet, it performs better in predicting RNA velocity. This paper provides guidance for researchers in selecting and applying appropriate classification tools in their analytical work and suggests some possible directions for future improvement of classification tools.
|
49
|
Bocci F, Zhou P, Nie Q. spliceJAC: transition genes and state-specific gene regulation from single-cell transcriptome data. Mol Syst Biol 2022; 18:e11176. [PMID: 36321549 PMCID: PMC9627675 DOI: 10.15252/msb.202211176] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 06/08/2022] [Revised: 10/07/2022] [Accepted: 10/10/2022] [Indexed: 11/25/2022] Open
Abstract
Extracting dynamical information from single-cell transcriptomics is a novel task with the promise to advance our understanding of cell state transition and interactions between genes. Yet, theory-oriented, bottom-up approaches that consider differences among cell states are largely lacking. Here, we present spliceJAC, a method to quantify the multivariate mRNA splicing from single-cell RNA sequencing (scRNA-seq). spliceJAC utilizes the unspliced and spliced mRNA count matrices to construct cell state-specific gene-gene regulatory interactions and applies stability analysis to predict putative driver genes critical to the transitions between cell states. By applying spliceJAC to biological systems including pancreas endothelium development and epithelial-mesenchymal transition (EMT) in A549 lung cancer cells, we predict genes that serve specific signaling roles in different cell states, recover important differentially expressed genes in agreement with pre-existing analysis, and predict new transition genes that are either exclusive or shared between different cell state transitions.
Affiliation(s)
- Federico Bocci
- Department of Mathematics, University of California, Irvine, CA, USA
- NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA, USA
| | - Peijie Zhou
- Department of Mathematics, University of California, Irvine, CA, USA
| | - Qing Nie
- Department of Mathematics, University of California, Irvine, CA, USA
- NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA, USA
- Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
|
50
|
Marot-Lassauzaie V, Bouman BJ, Donaghy FD, Demerdash Y, Essers MAG, Haghverdi L. Towards reliable quantification of cell state velocities. PLoS Comput Biol 2022; 18:e1010031. [PMID: 36170235 PMCID: PMC9550177 DOI: 10.1371/journal.pcbi.1010031] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/17/2022] [Revised: 10/10/2022] [Accepted: 08/26/2022] [Indexed: 11/25/2022] Open
Abstract
A few years ago, it was proposed to use the simultaneous quantification of unspliced and spliced messenger RNA (mRNA) to add a temporal dimension to high-throughput snapshots of single cell RNA sequencing data. This concept can yield additional insight into the transcriptional dynamics of the biological systems under study. However, current methods for inferring cell state velocities from such data (known as RNA velocities) are afflicted by several theoretical and computational problems, hindering realistic and reliable velocity estimation. We discuss these issues and propose new solutions for addressing some of the current challenges in consistency of data processing, velocity inference and visualisation. We translate our computational conclusions into two velocity analysis tools: one detailed method κ-velo and one heuristic method eco-velo, each of which uses a different set of assumptions about the data.
Affiliation(s)
- Valérie Marot-Lassauzaie
- Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
| | - Brigitte Joanne Bouman
- Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
- Humboldt Universität zu Berlin, Institute for Biology, Berlin, Germany
| | - Fearghal Declan Donaghy
- Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
| | - Yasmin Demerdash
- Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
- Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
| | - Marieke Alida Gertruda Essers
- Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
- DKFZ-ZMBH Alliance, Heidelberg, Germany
| | - Laleh Haghverdi
- Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
|