1
Ramirez A, Orcutt-Jahns BT, Pascoe S, Abraham A, Remigio B, Thomas N, Meyer AS. Integrative, high-resolution analysis of single-cell gene expression across experimental conditions with PARAFAC2-RISE. Cell Syst 2025:101294. [PMID: 40378843 DOI: 10.1016/j.cels.2025.101294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Revised: 02/20/2025] [Accepted: 04/22/2025] [Indexed: 05/19/2025]
Abstract
Effective exploration and analysis tools are vital for the extraction of insights from single-cell data. However, current techniques for modeling single-cell studies performed across experimental conditions (e.g., samples) require restrictive assumptions or do not adequately deconvolute condition-to-condition variation from cell-to-cell variation. Here, we report that reduction and insight in single-cell exploration (RISE), an adaptation of the tensor decomposition method PARAFAC2, enables the dimensionality reduction and analysis of single-cell data across conditions. We demonstrate the benefits of RISE across distinct examples of single-cell RNA-sequencing experiments of peripheral immune cells: pharmacologic drug perturbations and systemic lupus erythematosus patient samples. RISE enables associations of gene variation patterns with patients or perturbations while connecting each coordinated change to single cells without requiring cell-type annotations. The theoretical grounding of RISE suggests a unified framework for many single-cell data modeling tasks while providing an intuitive dimensionality reduction approach for multi-sample single-cell studies across biological contexts. A record of this paper's transparent peer review process is included in the supplemental information.
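For orientation, the tensor model that RISE adapts can be written in the standard PARAFAC2 form below. This is the textbook formulation, not an equation taken from the abstract; the symbols ($X_k$, $P_k$, $B$, $a_k$, $C$, $R$, $n_k$, $G$) are notation assumed here, and the paper's own formulation may differ in detail.

```latex
% X_k : cells-by-genes matrix for condition k (n_k cells, G genes)
% P_k : condition-specific projection with orthonormal columns
% B   : R x R matrix shared across conditions
% a_k : length-R vector of component weights for condition k
% C   : G x R gene factor matrix shared across conditions
\[
  X_k \approx P_k \, B \, \operatorname{diag}(a_k) \, C^{\top},
  \qquad P_k^{\top} P_k = I_R,
  \qquad k = 1, \dots, K .
\]
```

Because $P_k$ has $n_k$ rows, each condition may contain a different number of cells, while the condition weights $a_k$ and gene factors $C$ are shared across conditions. This is one way to read the abstract's claim that gene patterns can be associated with conditions and traced back to single cells.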
Affiliation(s)
- Andrew Ramirez
  - Department of Bioengineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Brian T Orcutt-Jahns
  - Department of Bioengineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Sean Pascoe
  - Department of Bioengineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Armaan Abraham
  - Department of Bioengineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Breanna Remigio
  - Computational and Systems Biology, UCLA, Los Angeles, CA 90095, USA
- Nathaniel Thomas
  - Department of Computer Science, UCLA, Los Angeles, CA 90095, USA
- Aaron S Meyer
  - Department of Bioengineering, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA 90095, USA; Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA 90095, USA.
2
Liu Y, Li C, Shen LC, Yan H, Wei G, Gasser RB, Hu X, Song J, Yu DJ. scRCA: A Siamese network-based pipeline for annotating cell types using noisy single-cell RNA-seq reference data. Comput Biol Med 2025; 190:110068. [PMID: 40158457 DOI: 10.1016/j.compbiomed.2025.110068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 03/19/2025] [Accepted: 03/20/2025] [Indexed: 04/02/2025]
Abstract
Accurate cell type annotation is critical for single-cell RNA sequencing (scRNA-seq) data analysis, providing insight into tissue-specific cell heterogeneity and enabling the tracking of cell state transitions. Cell type annotation is usually conducted by comparative analysis with known data (i.e., a reference), which is presumed to contain an accurate representation of cell types. However, this assumption is often problematic, as factors such as human errors in wet-lab experiments and methodological limitations can introduce annotation errors in the reference dataset. As current pipelines for single-cell transcriptomic analysis do not adequately consider this challenge, there is a major demand for a computational pipeline that achieves high-quality cell type annotation using reference datasets containing inherent errors (referred to as "noise" in this study). Here, we built a Siamese network-based pipeline, termed scRCA, to accurately annotate cell types based on noisy reference data. To help users evaluate the reliability of scRCA annotations, an interpreter was also developed to explore the factors underlying the model's predictions. Our experiments demonstrate that, across 14 datasets, scRCA outperformed other widely adopted reference-based methods for cell type annotation. Using an independent dataset of four multiple myeloma patients, we further illustrated that scRCA can distinguish cancerous cells based on gene expression levels and identify genes closely associated with multiple myeloma through scRCA's interpretable module, providing significant information for subsequent clinical treatments. With these advancements, we anticipate that scRCA will serve as a practical reference-based approach for accurate cell type annotation.
Affiliation(s)
- Yan Liu
  - Department of Computer Science, Yangzhou University, Yangzhou, 225100, China
- Chen Li
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, 3800, Australia
- Long-Chen Shen
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing, 210094, China
- He Yan
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing, 210094, China
- Guo Wei
  - School of Life Sciences, Nanjing University, Nanjing, 210023, China
- Robin B Gasser
  - Monash Data Futures Institute, Monash University, Melbourne, Victoria, 3800, Australia
- Xiaohua Hu
  - Information Department, The First Affiliated Hospital of Naval Military Medical University, Changhai Road 168, Shanghai, 200433, China
- Jiangning Song
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, Victoria, 3800, Australia.
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, 200 Xiaolingwei, Nanjing, 210094, China.
3
Li X, Liu Y, Zhou F, Guo W, Chen G, Tao J, Huang J, Qiu J, Chen H, Ren B, You L, Shi Y, Yang G, Zhang T, Gu J, Zhao Y. Decipher the single-cell level responses to chemotherapy in pancreatic ductal adenocarcinoma by a cross-time context graph model. Cancer Lett 2025; 626:217751. [PMID: 40294840 DOI: 10.1016/j.canlet.2025.217751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2025] [Revised: 04/24/2025] [Accepted: 04/26/2025] [Indexed: 04/30/2025]
Abstract
Gemcitabine is commonly used to treat pancreatic ductal adenocarcinoma (PDAC), one of the most lethal cancer types. However, drug resistance is a critical challenge for improving PDAC chemotherapy. Here, we applied single-cell RNA sequencing (scRNA-seq) to PDAC patient-derived xenograft (PDX) models to study the complex cellular responses related to gemcitabine resistance. To reconstruct dynamic tumor cell responses from these static scRNA-seq snapshots, we proposed scConGraph, a scalable bi-layer graph model that can efficiently integrate cross-time context information. Based on scConGraph, we observed that stemness and endoplasmic reticulum stress contribute to intrinsic resistance. As for acquired resistance, cancer cells may resist or evade gemcitabine treatment by activating the cell cycle, entering quiescence, or inducing epithelial-mesenchymal transition. Notably, GDF15 exhibited recurrent and significant upregulation among acquired-resistance cell subpopulations. Experimental validation confirmed that inhibiting GDF15 sensitizes tumor cells to gemcitabine, suggesting a potential target for gemcitabine-induced chemoresistance.
Affiliation(s)
- Xinqi Li
  - MOE Key Lab of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation & Institute for Precision Medicine, Tsinghua University, Beijing, China
- Yueze Liu
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Feihan Zhou
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Wenbo Guo
  - MOE Key Lab of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation & Institute for Precision Medicine, Tsinghua University, Beijing, China
- Guangyu Chen
  - Department of Breast Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Jinxin Tao
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jingmin Huang
  - MOE Key Lab of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation & Institute for Precision Medicine, Tsinghua University, Beijing, China
- Jiangdong Qiu
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Hao Chen
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Bo Ren
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Lei You
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yanan Shi
  - Biomedical Engineering Facility of National Infrastructures for Translational Medicine, Peking Union Medical College Hospital, Beijing, China
- Gang Yang
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
- Taiping Zhang
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; Clinical Immunology Center, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
- Jin Gu
  - MOE Key Lab of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation & Institute for Precision Medicine, Tsinghua University, Beijing, China.
- Yupei Zhao
  - Department of General Surgery, State Key Laboratory of Complex Severe, Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
4
Sukys A, Grima R. Cell-cycle dependence of bursty gene expression: insights from fitting mechanistic models to single-cell RNA-seq data. Nucleic Acids Res 2025; 53:gkaf295. [PMID: 40240003 PMCID: PMC12000877 DOI: 10.1093/nar/gkaf295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/22/2025] [Accepted: 03/28/2025] [Indexed: 04/18/2025] Open
Abstract
Bursty gene expression is characterized by two intuitive parameters, burst frequency and burst size, the cell-cycle dependence of which has not been extensively profiled at the transcriptome level. In this study, we estimate the burst parameters per allele in the G1 and G2/M cell-cycle phases for thousands of mouse genes by fitting mechanistic models of gene expression to messenger RNA count data, obtained by sequencing of single cells whose cell-cycle position has been inferred using a deep-learning method. We find that upon DNA replication, the median burst frequency approximately halves, while the burst size remains mostly unchanged. Genome-wide distributions of the burst parameter ratios between the G2/M and G1 phases are broad, indicating substantial heterogeneity in transcriptional regulation. We also observe a significant negative correlation between the burst frequency and size ratios, suggesting that regulatory processes do not independently control the burst parameters. We show that to accurately estimate the burst parameter ratios, mechanistic models must explicitly account for gene copy number variation and extrinsic noise due to the coupling of transcription to cell age across the cell cycle, but corrections for technical noise due to imperfect capture of RNA molecules in sequencing experiments are less critical.
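As a rough illustration of the two burst parameters, the sketch below uses a simple method-of-moments estimate. It is not the authors' fitting procedure. It assumes the common bursty limit in which mRNA counts are negative binomial, so that the mean equals burst frequency times burst size and the Fano factor equals one plus the burst size. The function name `burst_params_moments` and the simulated values are hypothetical, and the sketch ignores the gene copy number, extrinsic noise, and capture-efficiency corrections that the abstract discusses.

```python
import numpy as np


def burst_params_moments(counts):
    """Moment-based burst estimates for one gene (hypothetical helper).

    Assumes counts follow a negative binomial with mean = f * b and
    Fano factor (variance / mean) = 1 + b, where f is the burst frequency
    in units of the mRNA degradation rate and b is the mean burst size.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    var = counts.var(ddof=1)  # unbiased sample variance
    if mean <= 0 or var <= mean:
        raise ValueError("counts are not overdispersed; no bursty fit")
    burst_size = var / mean - 1.0
    burst_freq = mean / burst_size
    return burst_freq, burst_size


# Simulated example: true f = 2.0, true b = 5.0 (values chosen for illustration).
rng = np.random.default_rng(0)
sim = rng.negative_binomial(2.0, 1.0 / (1.0 + 5.0), size=200_000)
f_hat, b_hat = burst_params_moments(sim)
print(f"burst frequency ~ {f_hat:.2f}, burst size ~ {b_hat:.2f}")
```

Applying the same function to G1 and G2/M counts and dividing the estimates would give naive phase ratios. As the abstract notes, those ratios are only reliable once gene copy number and cell-age effects are modeled explicitly.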
Affiliation(s)
- Augustinas Sukys
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
  - The Alan Turing Institute, London NW1 2DB, United Kingdom
  - School of BioSciences, University of Melbourne, Parkville, Victoria 3052, Australia
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, United Kingdom
5
Ramirez A, Orcutt-Jahns BT, Pascoe S, Abraham A, Remigio B, Thomas N, Meyer AS. Integrative, high-resolution analysis of single cell gene expression across experimental conditions with PARAFAC2-RISE. bioRxiv 2025:2024.07.29.605698. [PMID: 39131377 PMCID: PMC11312543 DOI: 10.1101/2024.07.29.605698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Effective and scalable exploration and analysis tools are vital for the extraction of insights from large-scale single-cell data. However, current techniques for modeling single-cell studies performed across experimental conditions (e.g., samples, perturbations, or patients) require restrictive assumptions, lack flexibility, or do not adequately deconvolute condition-to-condition variation from cell-to-cell variation. Here, we report that Reduction and Insight in Single-cell Exploration (RISE), an adaptation of the tensor decomposition method PARAFAC2, enables the dimensionality reduction and analysis of single-cell data across conditions. We demonstrate the benefits of RISE across two distinct examples of single-cell RNA-sequencing experiments of peripheral immune cells: pharmacologic drug perturbations and systemic lupus erythematosus (SLE) patient samples. RISE enables straightforward associations of gene variation patterns with specific patients or perturbations, while connecting each coordinated change to single cells without requiring cell type annotations. The theoretical grounding of RISE suggests a unified framework for many single-cell data modeling tasks. Thus, RISE provides an intuitive universal dimensionality reduction approach for multi-sample single-cell studies across diverse biological contexts.
Affiliation(s)
- Andrew Ramirez
  - Department of Bioengineering, University of California, Los Angeles (UCLA), CA, USA
- Sean Pascoe
  - Department of Bioengineering, University of California, Los Angeles (UCLA), CA, USA
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL, USA
- Armaan Abraham
  - Department of Bioengineering, University of California, Los Angeles (UCLA), CA, USA
- Aaron S. Meyer
  - Department of Bioengineering, University of California, Los Angeles (UCLA), CA, USA
  - Jonsson Comprehensive Cancer Center, UCLA, CA, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, CA, USA
6
Guo W, Li X, Wang D, Yan N, Hu Q, Yang F, Zhang X, Yao J, Gu J. scStateDynamics: deciphering the drug-responsive tumor cell state dynamics by modeling single-cell level expression changes. Genome Biol 2024; 25:297. [PMID: 39574111 PMCID: PMC11583649 DOI: 10.1186/s13059-024-03436-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 11/15/2024] [Indexed: 11/24/2024] Open
Abstract
Understanding tumor cell heterogeneity and plasticity is crucial for overcoming drug resistance. Single-cell technologies enable analyzing cell states at a given condition, but catenating static cell snapshots to characterize dynamic drug responses remains challenging. Here, we propose scStateDynamics, an algorithm to infer tumor cell state dynamics and identify common drug effects by modeling single-cell level gene expression changes. Its reliability is validated on both simulated and lineage tracing data. Application to real tumor drug treatment datasets identifies more subtle cell subclusters with different drug responses beyond static transcriptome similarity and disentangles drug action mechanisms from the cell-level expression changes.
Affiliation(s)
- Wenbo Guo
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China
- Xinqi Li
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China
- Dongfang Wang
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Nan Yan
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China
- Qifan Hu
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China
- Fan Yang
  - AI Lab, Shenzhen, Tencent, China
- Xuegong Zhang
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China
  - Center for Synthetic and Systems Biology, School of Life Sciences and School of Medicine, Tsinghua University, Beijing, China
- Jin Gu
  - MOE Key Lab of Bioinformatics, Department of Automation, BNRIST Bioinformatics Division, Tsinghua University, Beijing, China.
7
Chari T, Gorin G, Pachter L. Biophysically interpretable inference of cell types from multimodal sequencing data. Nat Comput Sci 2024; 4:677-689. [PMID: 39317762 DOI: 10.1038/s43588-024-00689-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2023] [Accepted: 08/08/2024] [Indexed: 09/26/2024]
Abstract
Multimodal, single-cell genomics technologies enable simultaneous measurement of multiple facets of DNA and RNA processing in the cell. This creates opportunities for transcriptome-wide, mechanistic studies of cellular processing in heterogeneous cell populations, such as regulation of cell fate by transcriptional stochasticity or tumor proliferation through aberrant splicing dynamics. However, current methods for determining cell types or 'clusters' in multimodal data often rely on ad hoc approaches to balance or integrate measurements, and assumptions ignoring inherent properties of the data. To enable interpretable and consistent cell cluster determination, we present meK-means (mechanistic K-means) which integrates modalities through a unifying model of transcription to learn underlying, shared biophysical states. With meK-means we can cluster cells with nascent and mature mRNA measurements, utilizing the causal, physical relationships between these modalities. This identifies shared transcription dynamics across cells, which induce the observed molecule counts, and provides an alternative definition for 'clusters' through the governing parameters of cellular processes.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA.
8
Sadria M, Layton A, Goyal S, Bader GD. Fatecode enables cell fate regulator prediction using classification-supervised autoencoder perturbation. Cell Rep Methods 2024; 4:100819. [PMID: 38986613 PMCID: PMC11294839 DOI: 10.1016/j.crmeth.2024.100819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 11/20/2023] [Accepted: 06/18/2024] [Indexed: 07/12/2024]
Abstract
Cell reprogramming, which guides the conversion between cell states, is a promising technology for tissue repair and regeneration, with the ultimate goal of accelerating recovery from diseases or injuries. To accomplish this, regulators must be identified and manipulated to control cell fate. We propose Fatecode, a computational method that predicts cell fate regulators based only on single-cell RNA sequencing (scRNA-seq) data. Fatecode learns a latent representation of the scRNA-seq data using a deep learning-based classification-supervised autoencoder and then performs in silico perturbation experiments on the latent representation to predict genes that, when perturbed, would alter the original cell type distribution to increase or decrease the population size of a cell type of interest. We assessed Fatecode's performance using simulations from a mechanistic gene-regulatory network model and scRNA-seq data mapping blood and brain development of different organisms. Our results suggest that Fatecode can detect known cell fate regulators from single-cell transcriptomics datasets.
Affiliation(s)
- Mehrshad Sadria
  - Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada.
- Anita Layton
  - Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada; Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada; Department of Biology, University of Waterloo, Waterloo, ON, Canada; School of Pharmacy, University of Waterloo, Waterloo, ON, Canada
- Sidhartha Goyal
  - Department of Physics, University of Toronto, Toronto, ON, Canada
- Gary D Bader
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Canadian Institute for Advanced Research (CIFAR), Toronto, ON, Canada
9
Chari T, Gorin G, Pachter L. Stochastic Modeling of Biophysical Responses to Perturbation. bioRxiv 2024:2024.07.04.602131. [PMID: 39005347 PMCID: PMC11245117 DOI: 10.1101/2024.07.04.602131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Recent advances in high-throughput, multi-condition experiments allow for genome-wide investigation of how perturbations affect transcription and translation in the cell across multiple biological entities or modalities, from chromatin and mRNA information to protein production and spatial morphology. This presents an unprecedented opportunity to unravel how the processes of DNA and RNA regulation direct cell fate determination and disease response. Most methods designed for analyzing large-scale perturbation data focus on the observational outcomes, e.g., expression; however, many potential transcriptional mechanisms, such as transcriptional bursting or splicing dynamics, can underlie these complex and noisy observations. In this analysis, we demonstrate how a stochastic biophysical modeling approach to interpreting high-throughput perturbation data enables deeper investigation of the 'how' behind such molecular measurements. Our approach takes advantage of modalities already present in data produced with current technologies, such as nascent and mature mRNA measurements, to illuminate transcriptional dynamics induced by perturbation, predict kinetic behaviors in new perturbation settings, and uncover novel populations of cells with distinct kinetic responses to perturbation.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
10
Jiang J, Chen S, Tsou T, McGinnis CS, Khazaei T, Zhu Q, Park JH, Strazhnik IM, Vielmetter J, Gong Y, Hanna J, Chow ED, Sivak DA, Gartner ZJ, Thomson M. D-SPIN constructs gene regulatory network models from multiplexed scRNA-seq data revealing organizing principles of cellular perturbation response. bioRxiv 2024:2023.04.19.537364. [PMID: 37131803 PMCID: PMC10153191 DOI: 10.1101/2023.04.19.537364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Gene regulatory networks within cells modulate the expression of the genome in response to signals and changing environmental conditions. Reconstructions of gene regulatory networks can reveal the information processing and control principles used by cells to maintain homeostasis and execute cell-state transitions. Here, we introduce a computational framework, D-SPIN, that generates quantitative models of gene regulatory networks from single-cell mRNA-seq datasets collected across thousands of distinct perturbation conditions. D-SPIN models the cell as a collection of interacting gene-expression programs, and constructs a probabilistic model to infer regulatory interactions between gene-expression programs and external perturbations. Using large Perturb-seq and drug-response datasets, we demonstrate that D-SPIN models reveal the organization of cellular pathways, sub-functions of macromolecular complexes, and the logic of cellular regulation of transcription, translation, metabolism, and protein degradation in response to gene knockdown perturbations. D-SPIN can also be applied to dissect drug response mechanisms in heterogeneous cell populations, elucidating how combinations of immunomodulatory drugs can induce novel cell states through additive recruitment of gene expression programs. D-SPIN provides a computational framework for constructing interpretable models of gene-regulatory networks to reveal principles of cellular information processing and physiological control.
Affiliation(s)
- Jialong Jiang
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
- Sisi Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
  - Beckman Single-Cell Profiling and Engineering Center, California Institute of Technology, Pasadena, CA, 91125, USA
  - Apertura Gene Therapy, 345 Park Ave South, New York, NY 10010
- Tiffany Tsou
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
  - Beckman Single-Cell Profiling and Engineering Center, California Institute of Technology, Pasadena, CA, 91125, USA
- Christopher S. McGinnis
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94143, USA
- Tahmineh Khazaei
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
- Qin Zhu
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94143, USA
- Jong H. Park
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
  - Beckman Single-Cell Profiling and Engineering Center, California Institute of Technology, Pasadena, CA, 91125, USA
- Inna-Marie Strazhnik
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
- Jost Vielmetter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
- Yingying Gong
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
- John Hanna
  - Department of Pathology, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, 02115, USA
- Eric D. Chow
  - Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, 94143, USA
  - Center for Advanced Technology, University of California San Francisco, San Francisco, CA, 94143, USA
- David A. Sivak
  - Department of Physics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Zev J. Gartner
  - Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, 94143, USA
  - Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, 94115, USA
  - Chan Zuckerberg BioHub, University of California San Francisco, San Francisco, CA, 94143, USA
  - Center for Cellular Construction, University of California San Francisco, San Francisco, CA, 94143, USA
- Matt Thomson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, 91125, USA
  - Beckman Single-Cell Profiling and Engineering Center, California Institute of Technology, Pasadena, CA, 91125, USA
11
Bunne C, Stark SG, Gut G, Del Castillo JS, Levesque M, Lehmann KV, Pelkmans L, Krause A, Rätsch G. Learning single-cell perturbation responses using neural optimal transport. Nat Methods 2023; 20:1759-1768. [PMID: 37770709 PMCID: PMC10630137 DOI: 10.1038/s41592-023-01969-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/28/2022] [Accepted: 06/23/2023] [Indexed: 09/30/2023]
Abstract
Understanding and predicting molecular responses in single cells upon chemical, genetic or mechanical perturbations is a core question in biology. Obtaining single-cell measurements typically requires the cells to be destroyed. This makes learning heterogeneous perturbation responses challenging as we only observe unpaired distributions of perturbed or non-perturbed cells. Here we leverage the theory of optimal transport and the recent advent of input convex neural architectures to present CellOT, a framework for learning the response of individual cells to a given perturbation by mapping these unpaired distributions. CellOT outperforms current methods at predicting single-cell drug responses, as profiled by scRNA-seq and a multiplexed protein-imaging technology. Further, we illustrate that CellOT generalizes well on unseen settings by (1) predicting the scRNA-seq responses of holdout patients with lupus exposed to interferon-β and patients with glioblastoma to panobinostat; (2) inferring lipopolysaccharide responses across different species; and (3) modeling the hematopoietic developmental trajectories of different subpopulations.
Affiliation(s)
- Charlotte Bunne
  - Department of Computer Science, ETH Zurich, Zürich, Switzerland
  - AI Center, ETH Zurich, Zürich, Switzerland
- Stefan G Stark
  - Department of Computer Science, ETH Zurich, Zürich, Switzerland
  - AI Center, ETH Zurich, Zürich, Switzerland
  - Medical Informatics Unit, University of Zurich Hospital, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland
- Gabriele Gut
  - Department of Molecular Life Sciences, University of Zurich, Zürich, Switzerland
- Mitch Levesque
  - Department of Dermatology, University of Zurich Hospital, University of Zurich, Zürich, Switzerland
- Kjong-Van Lehmann
  - Department of Computer Science, ETH Zurich, Zürich, Switzerland
  - Cancer Research Center Cologne-Essen, Site: Center Integrated Oncology Aachen, Aachen, Germany
- Lucas Pelkmans
  - Department of Molecular Life Sciences, University of Zurich, Zürich, Switzerland
- Andreas Krause
  - Department of Computer Science, ETH Zurich, Zürich, Switzerland
  - AI Center, ETH Zurich, Zürich, Switzerland
- Gunnar Rätsch
  - Department of Computer Science, ETH Zurich, Zürich, Switzerland
  - AI Center, ETH Zurich, Zürich, Switzerland
  - Medical Informatics Unit, University of Zurich Hospital, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland
  - Department of Biology, ETH Zurich, Zürich, Switzerland
12
Pool AH, Poldsam H, Chen S, Thomson M, Oka Y. Recovery of missing single-cell RNA-sequencing data with optimized transcriptomic references. Nat Methods 2023; 20:1506-1515. [PMID: 37697162 DOI: 10.1038/s41592-023-02003-w] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 04/20/2022] [Accepted: 08/15/2023] [Indexed: 09/13/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) is an indispensable tool for characterizing cellular diversity and generating hypotheses throughout biology. Droplet-based scRNA-seq datasets often lack expression data for genes that can be detected with other methods. Here we show that the observed sensitivity deficits stem from three sources: (1) poor annotation of 3' gene ends; (2) issues with intronic read incorporation; and (3) gene overlap-derived read loss. We show that missing gene expression data can be recovered by optimizing the reference transcriptome for scRNA-seq through recovering false intergenic reads, implementing a hybrid pre-mRNA mapping strategy and resolving gene overlaps. We demonstrate, with a diverse collection of mouse and human tissue data, that reference optimization can substantially improve cellular profiling resolution and reveal missing cell types and marker genes. Our findings argue that transcriptomic references need to be optimized for scRNA-seq analysis and warrant a reanalysis of previously published datasets and cell atlases.
Affiliation(s)
- Allan-Hermann Pool
  - Department of Neuroscience, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Peter O'Donnell Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Anesthesiology and Pain Management, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Helen Poldsam
  - Department of Neuroscience, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia
- Sisi Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Matt Thomson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Yuki Oka
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
13
Chari T, Gorin G, Pachter L. Biophysically Interpretable Inference of Cell Types from Multimodal Sequencing Data. bioRxiv 2023:2023.09.17.558131. [PMID: 37745403 PMCID: PMC10516047 DOI: 10.1101/2023.09.17.558131] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 09/26/2023]
Abstract
Multimodal, single-cell genomics technologies enable simultaneous capture of multiple facets of DNA and RNA processing in the cell. This creates opportunities for transcriptome-wide, mechanistic studies of cellular processing in heterogeneous cell types, with applications ranging from inferring kinetic differences between cells, to the role of stochasticity in driving heterogeneity. However, current methods for determining cell types or 'clusters' present in multimodal data often rely on ad hoc or independent treatment of modalities, and assumptions ignoring inherent properties of the count data. To enable interpretable and consistent cell cluster determination from multimodal data, we present meK-Means (mechanistic K-Means) which integrates modalities and learns underlying, shared biophysical states through a unifying model of transcription. In particular, we demonstrate how meK-Means can be used to cluster cells from unspliced and spliced mRNA count modalities. By utilizing the causal, physical relationships underlying these modalities, we identify shared transcriptional kinetics across cells, which induce the observed gene expression profiles, and provide an alternative definition for 'clusters' through the governing parameters of cellular processes.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Gennady Gorin
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
14
Abstract
Dimensionality reduction is standard practice for filtering noise and identifying relevant features in large-scale data analyses. In biology, single-cell genomics studies typically begin with reduction to 2 or 3 dimensions to produce "all-in-one" visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative exploratory analysis. However, there is little theoretical support for this practice, and we show that extreme dimension reduction, from hundreds or thousands of dimensions to 2, inevitably induces significant distortion of high-dimensional datasets. We therefore examine the practical implications of low-dimensional embedding of single-cell data and find that extensive distortions and inconsistent practices make such embeddings counter-productive for exploratory, biological analyses. In lieu of this, we discuss alternative approaches for conducting targeted embedding and feature exploration to enable hypothesis-driven biological discovery.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, United States of America
15
Sun L, Wang G, Zhang Z. SimCH: simulation of single-cell RNA sequencing data by modeling cellular heterogeneity at gene expression level. Brief Bioinform 2023; 24:6961608. [PMID: 36575569 DOI: 10.1093/bib/bbac590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 11/08/2022] [Accepted: 12/02/2022] [Indexed: 12/29/2022]
Abstract
Single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) has been a powerful technology for transcriptome analysis. However, the systematic validation of diverse computational tools used in scRNA-seq analysis remains challenging. Here, we propose a novel simulation tool, termed as Simulation of Cellular Heterogeneity (SimCH), for the flexible and comprehensive assessment of scRNA-seq computational methods. The Gaussian Copula framework is recruited to retain gene coexpression of experimental data shown to be associated with cellular heterogeneity. The synthetic count matrices generated by suitable SimCH modes closely match experimental data originating from either homogeneous or heterogeneous cell populations and either unique molecular identifier (UMI)-based or non-UMI-based techniques. We demonstrate how SimCH can benchmark several types of computational methods, including cell clustering, discovery of differentially expressed genes, trajectory inference, batch correction and imputation. Moreover, we show how SimCH can be used to conduct power evaluation of cell clustering methods. Given these merits, we believe that SimCH can accelerate single-cell research.
Affiliation(s)
- Lei Sun
  - School of Information Engineering, Yangzhou University, Yangzhou, P.R. China
  - School of Artificial Intelligence, Yangzhou University, Yangzhou, P.R. China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing, P.R. China
- Gongming Wang
  - School of Information Engineering, Yangzhou University, Yangzhou, P.R. China
  - School of Artificial Intelligence, Yangzhou University, Yangzhou, P.R. China
  - China Unicom Software Research Institute Jinan Branch, Jinan, P.R. China
- Zhihua Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing, P.R. China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
16
Bing X, Bunea F, Strimas-Mackey S, Wegkamp M. Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations. Ann Stat 2022. [DOI: 10.1214/22-aos2229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Xin Bing
  - Department of Statistical Sciences, University of Toronto
- Marten Wegkamp
  - Departments of Mathematics, and of Statistics and Data Science, Cornell University
17
Gandhi M, Bakhai V, Trivedi J, Mishra A, De Andrés F, LLerena A, Sharma R, Nair S. Current perspectives on interethnic variability in multiple myeloma: Single cell technology, population pharmacogenetics and molecular signal transduction. Transl Oncol 2022; 25:101532. [PMID: 36103755 PMCID: PMC9478452 DOI: 10.1016/j.tranon.2022.101532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 08/31/2022] [Accepted: 09/05/2022] [Indexed: 11/15/2022]
Abstract
This review discusses the emerging single cell technologies and applications in Multiple myeloma (MM), population pharmacogenetics of MM, resistance to chemotherapy, genetic determinants of drug-induced toxicity, molecular signal transduction. The role(s) of epigenetics and noncoding RNAs including microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) that influence the risk and severity of MM are also discussed. It is understood that ethnic component acts as a driver of variable response to chemotherapy in different sub-populations globally. This review augments our understanding of genetic variability in ‘myelomagenesis’ and drug-induced toxicity, myeloma microenvironment at the molecular and cellular level, and developing precision medicine strategies to combat this malignancy. The emerging single cell technologies hold great promise for enhancing our understanding of MM tumor heterogeneity and clonal diversity.
Multiple myeloma (MM) is an aggressive cancer characterised by malignancy of the plasma cells and a rising global incidence. The gold standard for optimum response is aggressive chemotherapy followed by autologous stem cell transplantation (ASCT). However, majority of the patients are above 60 years and this presents the clinician with complications such as ineligibility for ASCT, frailty, drug-induced toxicity and differential/partial response to treatment. The latter is partly driven by heterogenous genotypes of the disease in different subpopulations. In this review, we discuss emerging single cell technologies and applications in MM, population pharmacogenetics of MM, resistance to chemotherapy, genetic determinants of drug-induced toxicity, molecular signal transduction, as well as the role(s) played by epigenetics and noncoding RNAs including microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) that influence the risk and severity of the disease. Taken together, our discussions further our understanding of genetic variability in ‘myelomagenesis’ and drug-induced toxicity, augment our understanding of the myeloma microenvironment at the molecular and cellular level and provide a basis for developing precision medicine strategies to combat this malignancy.
Affiliation(s)
- Manav Gandhi
  - Burnett School of Biomedical Sciences, College of Medicine, University of Central Florida, 6900 Lake Nona Blvd., Orlando, FL 32827, USA
- Viral Bakhai
  - Shobhaben Pratapbhai Patel School of Pharmacy & Technology Management, SVKM's NMIMS University, V. L. Mehta Road, Vile Parle (West), Mumbai 400056, India
- Jash Trivedi
  - University of Mumbai, Santa Cruz, Mumbai 400055, India
- Adarsh Mishra
  - Shobhaben Pratapbhai Patel School of Pharmacy & Technology Management, SVKM's NMIMS University, V. L. Mehta Road, Vile Parle (West), Mumbai 400056, India
- Fernando De Andrés
  - INUBE Extremadura Biosanitary Research Institute, Badajoz, Spain
  - Faculty of Medicine, University of Extremadura, Badajoz, Spain
  - CICAB Clinical Research Center, Pharmacogenetics and Personalized Medicine Unit, Badajoz University Hospital, Extremadura Health Service, Badajoz, Spain
- Adrián LLerena
  - INUBE Extremadura Biosanitary Research Institute, Badajoz, Spain
  - Faculty of Medicine, University of Extremadura, Badajoz, Spain
  - CICAB Clinical Research Center, Pharmacogenetics and Personalized Medicine Unit, Badajoz University Hospital, Extremadura Health Service, Badajoz, Spain
- Rohit Sharma
  - Department of Rasa Shastra and Bhaishajya Kalpana, Faculty of Ayurveda, Institute of Medical Sciences, Banaras Hindu University, Varanasi, Uttar Pradesh 221005, India
- Sujit Nair
  - University of Mumbai, Santa Cruz, Mumbai 400055, India
18
Sohail A, Jiang X, Wahid A, Wang H, Cao C, Xiao H. Free-flow zone electrophoresis facilitated proteomics analysis of heterogeneous subpopulations in H1299 lung cancer cells. Anal Chim Acta 2022; 1227:340306. [DOI: 10.1016/j.aca.2022.340306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Revised: 07/30/2022] [Accepted: 08/21/2022] [Indexed: 11/01/2022]
19
Chen X, Chen S, Thomson M. Minimal gene set discovery in single-cell mRNA-seq datasets with ActiveSVM. Nat Comput Sci 2022; 2:387-398. [PMID: 38177588 PMCID: PMC10766518 DOI: 10.1038/s43588-022-00263-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/23/2021] [Accepted: 05/17/2022] [Indexed: 01/06/2024]
Abstract
Sequencing costs currently prohibit the application of single-cell mRNA-seq to many biological and clinical analyses. Targeted single-cell mRNA-sequencing reduces sequencing costs by profiling reduced gene sets that capture biological information with a minimal number of genes. Here we introduce an active learning method that identifies minimal but highly informative gene sets that enable the identification of cell types, physiological states and genetic perturbations in single-cell data using a small number of genes. Our active feature selection procedure generates minimal gene sets from single-cell data by employing an active support vector machine (ActiveSVM) classifier. We demonstrate that ActiveSVM feature selection identifies gene sets that enable ~90% cell-type classification accuracy across, for example, cell atlas and disease-characterization datasets. The discovery of small but highly informative gene sets should enable reductions in the number of measurements necessary for application of single-cell mRNA-seq to clinical tests, therapeutic discovery and genetic screens.
Affiliation(s)
- Xiaoqiao Chen
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, USA
- Sisi Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA
  - Beckman Institute Single-cell Profiling and Engineering Center, Pasadena, California, USA
- Matt Thomson
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA
  - Beckman Institute Single-cell Profiling and Engineering Center, Pasadena, California, USA
20
Chari T, Weissbourd B, Gehring J, Ferraioli A, Leclère L, Herl M, Gao F, Chevalier S, Copley RR, Houliston E, Anderson DJ, Pachter L. Whole-animal multiplexed single-cell RNA-seq reveals transcriptional shifts across Clytia medusa cell types. Sci Adv 2021; 7:eabh1683. [PMID: 34826233 PMCID: PMC8626072 DOI: 10.1126/sciadv.abh1683] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 02/20/2021] [Accepted: 10/06/2021] [Indexed: 05/12/2023]
Abstract
We present an organism-wide, transcriptomic cell atlas of the hydrozoan medusa Clytia hemisphaerica and describe how its component cell types respond to perturbation. Using multiplexed single-cell RNA sequencing, in which individual animals were indexed and pooled from control and perturbation conditions into a single sequencing run, we avoid artifacts from batch effects and are able to discern shifts in cell state in response to organismal perturbations. This work serves as a foundation for future studies of development, function, and regeneration in a genetically tractable jellyfish species. Moreover, we introduce a powerful workflow for high-resolution, whole-animal, multiplexed single-cell genomics that is readily adaptable to other traditional or nontraditional model organisms.
Affiliation(s)
- Tara Chari
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Brandon Weissbourd
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Tianqiao and Chrissy Chen Institute for Neuroscience, Pasadena, CA 91125, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
- Jase Gehring
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Anna Ferraioli
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Lucas Leclère
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Makenna Herl
  - University of New Hampshire School of Law, Concord, NH 03301, USA
- Fan Gao
  - Caltech Bioinformatics Resource Center, California Institute of Technology, Pasadena, CA 91125, USA
- Sandra Chevalier
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Richard R. Copley
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- Evelyn Houliston
  - Sorbonne Université, CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-mer (LBDV), 06230, France
- David J. Anderson
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Tianqiao and Chrissy Chen Institute for Neuroscience, Pasadena, CA 91125, USA
  - Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
21
Sozen B, Jorgensen V, Weatherbee BAT, Chen S, Zhu M, Zernicka-Goetz M. Reconstructing aspects of human embryogenesis with pluripotent stem cells. Nat Commun 2021; 12:5550. [PMID: 34548496 PMCID: PMC8455697 DOI: 10.1038/s41467-021-25853-4] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Received: 05/27/2021] [Accepted: 08/24/2021] [Indexed: 02/01/2023]
Abstract
Understanding human development is of fundamental biological and clinical importance. Despite its significance, mechanisms behind human embryogenesis remain largely unknown. Here, we attempt to model human early embryo development with expanded pluripotent stem cells (EPSCs) in 3-dimensions. We define a protocol that allows us to generate self-organizing cystic structures from human EPSCs that display some hallmarks of human early embryogenesis. These structures mimic polarization and cavitation characteristic of pre-implantation development leading to blastocyst morphology formation and the transition to post-implantation-like organization upon extended culture. Single-cell RNA sequencing of these structures reveals subsets of cells bearing some resemblance to epiblast, hypoblast and trophectoderm lineages. Nevertheless, significant divergences from natural blastocysts persist in some key markers, and signalling pathways point towards ways in which morphology and transcriptional-level cell identities may diverge in stem cell models of the embryo. Thus, this stem cell platform provides insights into the design of stem cell models of embryogenesis.
Affiliation(s)
- Berna Sozen
  - Plasticity and Self-Organization Group, Division of Biology and Biological Engineering, Caltech, Pasadena, CA, 91125, USA
  - Department of Genetics, Yale School of Medicine, Yale University, New Haven, CT, 06520, USA
- Victoria Jorgensen
  - Plasticity and Self-Organization Group, Division of Biology and Biological Engineering, Caltech, Pasadena, CA, 91125, USA
- Bailey A T Weatherbee
  - Mammalian Development and Stem Cell Group, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK
- Sisi Chen
  - Plasticity and Self-Organization Group, Division of Biology and Biological Engineering, Caltech, Pasadena, CA, 91125, USA
- Meng Zhu
  - Mammalian Development and Stem Cell Group, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK
  - Blavatnik Institute, Harvard Medical School, Department of Genetics, Boston, MA, 02115, USA
- Magdalena Zernicka-Goetz
  - Plasticity and Self-Organization Group, Division of Biology and Biological Engineering, Caltech, Pasadena, CA, 91125, USA
  - Mammalian Development and Stem Cell Group, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK
22
Ji Y, Lotfollahi M, Wolf FA, Theis FJ. Machine learning for perturbational single-cell omics. Cell Syst 2021; 12:522-537. [PMID: 34139164 DOI: 10.1016/j.cels.2021.05.016] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 01/13/2021] [Revised: 05/04/2021] [Accepted: 05/19/2021] [Indexed: 12/18/2022]
Abstract
Cell biology is fundamentally limited in its ability to collect complete data on cellular phenotypes and the wide range of responses to perturbation. Areas such as computer vision and speech recognition have addressed this problem of characterizing unseen or unlabeled conditions with the combined advances of big data, deep learning, and computing resources in the past 5 years. Similarly, recent advances in machine learning approaches enabled by single-cell data start to address prediction tasks in perturbation response modeling. We first define objectives in learning perturbation response in single-cell omics; survey existing approaches, resources, and datasets (https://github.com/theislab/sc-pert); and discuss how a perturbation atlas can enable deep learning models to construct an informative perturbation latent space. We then examine future avenues toward more powerful and explainable modeling using deep neural networks, which enable the integration of disparate information sources and an understanding of heterogeneous, complex, and unseen systems.
Affiliation(s)
- Yuge Ji
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
- Mohammad Lotfollahi
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- F Alexander Wolf
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Cellarity, Cambridge, MA, USA
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
  - Cellarity, Cambridge, MA, USA