1
Ranek JS, Stallaert W, Milner JJ, Redick M, Wolff SC, Beltran AS, Stanley N, Purvis JE. DELVE: feature selection for preserving biological trajectories in single-cell data. Nat Commun 2024; 15:2765. [PMID: 38553455 PMCID: PMC10980758 DOI: 10.1038/s41467-024-46773-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 03/07/2024] [Indexed: 04/02/2024] Open
Abstract
Single-cell technologies can measure the expression of thousands of molecular features in individual cells undergoing dynamic biological processes. While examining cells along a computationally-ordered pseudotime trajectory can reveal how changes in gene or protein expression impact cell fate, identifying such dynamic features is challenging due to the inherent noise in single-cell data. Here, we present DELVE, an unsupervised feature selection method for identifying a representative subset of molecular features that robustly recapitulate cellular trajectories. In contrast to previous work, DELVE uses a bottom-up approach to mitigate the effects of confounding sources of variation, and instead models cell states from dynamic gene or protein modules based on core regulatory complexes. Using simulations, single-cell RNA sequencing, and iterative immunofluorescence imaging data in the context of cell cycle and cellular differentiation, we demonstrate how DELVE selects features that better define cell-types and cell-type transitions. DELVE is available as an open-source Python package: https://github.com/jranek/delve.
Affiliation(s)
- Jolene S Ranek
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Wayne Stallaert
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
- J Justin Milner
- Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
- Margaret Redick
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Samuel C Wolff
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Adriana S Beltran
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Human Pluripotent Cell Core, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
- Natalie Stanley
- Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Jeremy E Purvis
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
2
Guo X, Ning J, Chen Y, Liu G, Zhao L, Fan Y, Sun S. Recent advances in differential expression analysis for single-cell RNA-seq and spatially resolved transcriptomic studies. Brief Funct Genomics 2024; 23:95-109. [PMID: 37022699 DOI: 10.1093/bfgp/elad011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 12/09/2022] [Accepted: 03/10/2023] [Indexed: 04/07/2023] Open
Abstract
Differential expression (DE) analysis is a necessary step in the analysis of single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data. Unlike traditional bulk RNA-seq, DE analysis for scRNA-seq or SRT data has unique characteristics that may contribute to the difficulty of detecting DE genes. However, the plethora of DE tools that work with various assumptions makes it difficult to choose an appropriate one. Furthermore, a comprehensive review on detecting DE genes for scRNA-seq data or SRT data from multi-condition, multi-sample experimental designs is lacking. To bridge such a gap, here, we first focus on the challenges of DE detection, then highlight potential opportunities that facilitate further progress in scRNA-seq or SRT analysis, and finally provide insights and guidance in selecting appropriate DE tools or developing new computational DE methods.
Affiliation(s)
- Xiya Guo
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Jin Ning
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yuanze Chen
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Guoliang Liu
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Liyan Zhao
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yue Fan
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Shiquan Sun
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
3
Lim SY, Rizos H. Single-cell RNA sequencing in melanoma: what have we learned so far? EBioMedicine 2024; 100:104969. [PMID: 38241976 PMCID: PMC10831183 DOI: 10.1016/j.ebiom.2024.104969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 12/18/2023] [Accepted: 01/03/2024] [Indexed: 01/21/2024] Open
Abstract
Over the past decade, there have been remarkable improvements in the treatment and survival rates of melanoma patients. Treatment resistance remains a persistent challenge, however, and is partly attributable to intratumoural heterogeneity. Melanoma cells can transition through a series of phenotypic and transcriptional cell states that vary in invasiveness and treatment responsiveness. The diverse stromal and immune contexture of the tumour microenvironment also contributes to intratumoural heterogeneity and disparities in treatment response in melanoma patients. Recent advances in single-cell sequencing technologies have enabled a more detailed understanding of melanoma heterogeneity and the underlying transcriptional programs that regulate melanoma cell diversity and behaviour. In this review, we examine the concept of intratumoural heterogeneity and the challenges it poses to achieving long-lasting treatment responses. We focus on the significance of next generation single-cell sequencing in advancing our understanding of melanoma diversity and the unique insights gained from single-cell studies.
Affiliation(s)
- Su Yin Lim
- Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Australia; Melanoma Institute Australia, Sydney, Australia.
- Helen Rizos
- Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Australia; Melanoma Institute Australia, Sydney, Australia
4
Ghazanfar S, Guibentif C, Marioni JC. Stabilized mosaic single-cell data integration using unshared features. Nat Biotechnol 2024; 42:284-292. [PMID: 37231260 PMCID: PMC10869270 DOI: 10.1038/s41587-023-01766-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/10/2022] [Accepted: 03/28/2023] [Indexed: 05/27/2023]
Abstract
Currently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.
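The topology step described in this abstract can be illustrated with a toy sketch. The code below is not StabMap's implementation or API; the dataset names, feature sets, and the helper functions `mosaic_topology` and `shortest_path` are hypothetical, chosen only to show how datasets that share no features can still be linked to a reference through intermediate datasets ("multi-hop").

```python
from collections import deque
from itertools import combinations


def mosaic_topology(features):
    """Link two datasets whenever they share at least one feature (toy rule)."""
    graph = {name: set() for name in features}
    for a, b in combinations(features, 2):
        if features[a] & features[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the datasets visited from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]] - seen):
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


# Hypothetical datasets: "protein" and "rna" share no features directly.
datasets = {
    "rna": {"g1", "g2", "g3"},
    "multiome": {"g3", "p1", "p2"},
    "protein": {"p2", "p3"},
}
topology = mosaic_topology(datasets)
path = shortest_path(topology, "protein", "rna")
print(path)
```

In this toy case the only route from `protein` to the `rna` reference passes through `multiome`, which is the kind of multi-hop link the abstract refers to. The actual projection of cells along such a path is not shown here.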
Affiliation(s)
- Shila Ghazanfar
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
- School of Mathematics and Statistics, The University of Sydney, Camperdown, New South Wales, Australia.
- Charles Perkins Centre, The University of Sydney, Camperdown, New South Wales, Australia.
- Carolina Guibentif
- Sahlgrenska Center for Cancer Research, Inst. Biomedicine, Dept. Microbiology and Immunology, University of Gothenburg, Gothenburg, Sweden
- John C Marioni
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
5
Tian J, Lei J, Roeder K. From local to global gene co-expression estimation using single-cell RNA-seq data. Biometrics 2024; 80:ujae001. [PMID: 38465983 PMCID: PMC10926266 DOI: 10.1093/biomtc/ujae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 10/01/2023] [Accepted: 01/15/2024] [Indexed: 03/12/2024]
Abstract
In genomics studies, the investigation of gene relationships often brings important biological insights. Currently, large heterogeneous datasets impose new challenges for statisticians because gene relationships are often local. They change from one sample point to another, may only exist in a subset of the sample, and can be nonlinear or even nonmonotone. Most previous dependence measures do not specifically target local dependence relationships, and the ones that do are computationally costly. In this paper, we explore a state-of-the-art network estimation technique that characterizes gene relationships at the single-cell level, under the name of cell-specific gene networks. We first show that averaging the cell-specific gene relationship over a population gives a novel univariate dependence measure, the averaged Local Density Gap (aLDG), that accumulates local dependence and can detect any nonlinear, nonmonotone relationship. Together with a consistent nonparametric estimator, we establish its robustness on both the population and empirical levels. Then, we show that averaging the cell-specific gene relationship over mini-batches determined by some external structure information (e.g., a spatial or temporal factor) better highlights meaningful local structure change points. We explore the application of aLDG and its mini-batch variant in many scenarios, including pairwise gene relationship estimation, bifurcating point detection in cell trajectory, and spatial transcriptomics structure visualization. Both simulations and real data analysis show that aLDG outperforms existing measures.
Affiliation(s)
- Jinjin Tian
- Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
- Jing Lei
- Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
- Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
6
Cao Y, Tran A, Kim H, Robertson N, Lin Y, Torkel M, Yang P, Patrick E, Ghazanfar S, Yang J. Thinking process templates for constructing data stories with SCDNEY. F1000Res 2023; 12:261. [PMID: 38434622 PMCID: PMC10905113 DOI: 10.12688/f1000research.130623.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/08/2023] [Indexed: 03/05/2024] Open
Abstract
Background: Globally, scientists now have the ability to generate a vast amount of high-throughput biomedical data that carry critical information for important clinical and public health applications. This data revolution in biology is now creating a plethora of new single-cell datasets. Concurrently, there have been significant methodological advances in single-cell research. Integrating these two resources to create tailor-made, efficient, and purpose-specific data analysis approaches can assist in accelerating scientific discovery. Methods: We developed a series of living workshops for building data stories, using Single-cell data integrative analysis (scdney). scdney is a wrapper package with a collection of single-cell analysis R packages incorporating data integration, cell type annotation, higher order testing and more. Results: Here, we illustrate two specific workshops. The first workshop examines how to characterise the identity and/or state of cells and the relationship between them, known as phenotyping. The second workshop focuses on extracting higher-order features from cells to predict disease progression. Conclusions: Through these workshops, we not only showcase current solutions, but also highlight critical thinking points. In particular, we highlight the Thinking Process Template that provides a structured framework for the decision-making process behind such single-cell analyses. Furthermore, our workshop will incorporate dynamic contributions from the community in a collaborative learning approach, thus the term 'living'.
Affiliation(s)
- Yue Cao
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Andy Tran
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Hani Kim
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Children's Medical Research Institute, The University of Sydney, Westmead, NSW, 2145, Australia
- Nick Robertson
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Yingxin Lin
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Marni Torkel
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Pengyi Yang
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Children's Medical Research Institute, The University of Sydney, Westmead, NSW, 2145, Australia
- Ellis Patrick
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Shila Ghazanfar
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Jean Yang
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
7
Walsh LA, Quail DF. Decoding the tumor microenvironment with spatial technologies. Nat Immunol 2023; 24:1982-1993. [PMID: 38012408 DOI: 10.1038/s41590-023-01678-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/18/2023] [Accepted: 10/10/2023] [Indexed: 11/29/2023]
Abstract
Visualization of the cellular heterogeneity and spatial architecture of the tumor microenvironment (TME) is becoming increasingly important to understand mechanisms of disease progression and therapeutic response. This is particularly relevant in the era of cancer immunotherapy, in which the contexture of immune cell positioning within the tumor landscape has been proven to affect efficacy. Although single-cell technologies have mostly replaced conventional approaches to analyze specific cellular subsets within tumors, those that integrate a spatial dimension are now on the rise. In this Review, we assess the strengths and limitations of emerging spatial technologies with a focus on their applications in tumor immunology, as well as forthcoming opportunities for artificial intelligence (AI) and the value of integrating multiomics datasets to achieve a holistic picture of the TME.
Affiliation(s)
- Logan A Walsh
- Rosalind and Morris Goodman Cancer Institute, McGill University, Montreal, Quebec, Canada.
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec, Canada.
- Daniela F Quail
- Rosalind and Morris Goodman Cancer Institute, McGill University, Montreal, Quebec, Canada.
- Department of Physiology, Faculty of Medicine, McGill University, Montreal, Quebec, Canada.
- Department of Medicine, Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada.
8
Zhang C, Dong K, Aihara K, Chen L, Zhang S. STAMarker: determining spatial domain-specific variable genes with saliency maps in deep learning. Nucleic Acids Res 2023; 51:e103. [PMID: 37811885 PMCID: PMC10639070 DOI: 10.1093/nar/gkad801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 08/26/2023] [Accepted: 09/19/2023] [Indexed: 10/10/2023] Open
Abstract
Spatial transcriptomics characterizes gene expression profiles while retaining the information of the spatial context, providing an unprecedented opportunity to understand cellular systems. One of the essential tasks in such data analysis is to determine spatially variable genes (SVGs), which demonstrate spatial expression patterns. Existing methods only consider genes individually and fail to model the inter-dependence of genes. To this end, we present an analytic tool, STAMarker, for robustly determining spatial domain-specific SVGs with saliency maps in deep learning. STAMarker is a three-stage ensemble framework consisting of graph-attention autoencoders, multilayer perceptron (MLP) classifiers, and saliency map computation by the backpropagated gradient. We illustrate the effectiveness of STAMarker and compare it with several commonly used competing methods on various spatial transcriptomic data generated by different platforms. STAMarker considers all genes at once and is more robust when the dataset is very sparse. STAMarker could identify spatial domain-specific SVGs for characterizing spatial domains and enable in-depth analysis of the region of interest in the tissue section.
Affiliation(s)
- Chihao Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kangning Dong
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kazuyuki Aihara
- International Research Center for Neurointelligence, The University of Tokyo Institutes for Advanced Study, The University of Tokyo, Tokyo 113-0033, Japan
- Luonan Chen
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai, Guangdong 519031, China
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
9
Alexandrov T, Saez‐Rodriguez J, Saka SK. Enablers and challenges of spatial omics, a melting pot of technologies. Mol Syst Biol 2023; 19:e10571. [PMID: 37842805 PMCID: PMC10632737 DOI: 10.15252/msb.202110571] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Revised: 07/31/2023] [Accepted: 08/03/2023] [Indexed: 10/17/2023] Open
Abstract
Spatial omics has emerged as a rapidly growing and fruitful field with hundreds of publications presenting novel methods for obtaining spatially resolved information for any omics data type on spatial scales ranging from subcellular to organismal. From a technology development perspective, spatial omics is a highly interdisciplinary field that integrates imaging and omics, spatial and molecular analyses, sequencing and mass spectrometry, and image analysis and bioinformatics. The emergence of this field has not only opened a window into spatial biology, but also created multiple novel opportunities, questions, and challenges for method developers. Here, we provide the perspective of technology developers on what makes the spatial omics field unique. After providing a brief overview of the state of the art, we discuss technological enablers and challenges and present our vision about the future applications and impact of this melting pot.
Affiliation(s)
- Theodore Alexandrov
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- BioInnovation Institute, Copenhagen, Denmark
- Julio Saez‐Rodriguez
- Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Medicine and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Sinem K Saka
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
10
Shireman JM, Cheng L, Goel A, Garcia DM, Partha S, Quiñones-Hinojosa A, Kendziorski C, Dey M. Spatial transcriptomics in glioblastoma: is knowing the right zip code the key to the next therapeutic breakthrough? Front Oncol 2023; 13:1266397. [PMID: 37916170 PMCID: PMC10618006 DOI: 10.3389/fonc.2023.1266397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 09/27/2023] [Indexed: 11/03/2023] Open
Abstract
Spatial transcriptomics, the technology of visualizing the cellular gene expression landscape in a cell's native tissue location, has emerged as a powerful tool that allows us to address scientific questions that were elusive just a few years ago. This technological advance is a decisive jump in the technological evolution that is revolutionizing studies of tissue structure and function in health and disease through the introduction of an entirely new dimension of data: spatial context. Perhaps the organ within the body that relies most on spatial organization is the brain. The central nervous system's complex microenvironmental and spatial architecture is tightly regulated during development, is maintained in health, and is detrimental when disturbed by pathologies. This inherent spatial complexity of the central nervous system makes it an exciting organ to study using spatial transcriptomics for pathologies primarily affecting the brain, of which glioblastoma is one of the worst. Glioblastoma is a hyper-aggressive, incurable neoplasm and has been hypothesized not only to integrate into the spatial architecture of the surrounding brain, but also to possess an architecture of its own that might be actively remodeling the surrounding brain. In this review, we will examine the current landscape of spatial transcriptomics in glioblastoma, outline novel findings emerging from the rising use of spatial transcriptomics, and discuss future directions and ultimate clinical/translational avenues.
Affiliation(s)
- Jack M. Shireman
- Department of Neurosurgery, University of Wisconsin School of Medicine and Public Health, University of Wisconsin-Madison (UW) Carbone Cancer Center, Madison, WI, United States
- Lingxin Cheng
- Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States
- Amiti Goel
- Department of Neurosurgery, University of Wisconsin School of Medicine and Public Health, University of Wisconsin-Madison (UW) Carbone Cancer Center, Madison, WI, United States
- Diogo Moniz Garcia
- Department of Neurosurgery, Mayo Clinic, Jacksonville, FL, United States
- Sanil Partha
- Department of Neurosurgery, University of Wisconsin School of Medicine and Public Health, University of Wisconsin-Madison (UW) Carbone Cancer Center, Madison, WI, United States
- Christina Kendziorski
- Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States
- Mahua Dey
- Department of Neurosurgery, University of Wisconsin School of Medicine and Public Health, University of Wisconsin-Madison (UW) Carbone Cancer Center, Madison, WI, United States
11
Yuan Z, Yao J. Harnessing computational spatial omics to explore the spatial biology intricacies. Semin Cancer Biol 2023; 95:25-41. [PMID: 37400044 DOI: 10.1016/j.semcancer.2023.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 05/09/2023] [Accepted: 06/19/2023] [Indexed: 07/05/2023]
Abstract
Spatially resolved transcriptomics (SRT) has unlocked new dimensions in our understanding of intricate tissue architectures. However, this rapidly expanding field produces a wealth of diverse and voluminous data, necessitating the evolution of sophisticated computational strategies to unravel inherent patterns. Two distinct methodologies, gene spatial pattern recognition (GSPR) and tissue spatial pattern recognition (TSPR), have emerged as vital tools in this process. GSPR methodologies are designed to identify and classify genes exhibiting noteworthy spatial patterns, while TSPR strategies aim to understand intercellular interactions and recognize tissue domains with molecular and spatial coherence. In this review, we provide a comprehensive exploration of SRT, highlighting crucial data modalities and resources that are instrumental for the development of methods and biological insights. We address the complexities and challenges posed by the use of heterogeneous data in developing GSPR and TSPR methodologies and propose an optimal workflow for both. We delve into the latest advancements in GSPR and TSPR, examining their interrelationships. Lastly, we peer into the future, envisaging the potential directions and perspectives in this dynamic field.
Affiliation(s)
- Zhiyuan Yuan
  - Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China

12
Luo J, Deng M, Zhang X, Sun X. ESICCC as a systematic computational framework for evaluation, selection, and integration of cell-cell communication inference methods. Genome Res 2023; 33:1788-1805. [PMID: 37827697 PMCID: PMC10691505 DOI: 10.1101/gr.278001.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2023] [Accepted: 09/21/2023] [Indexed: 10/14/2023]
Abstract
Cell-cell communication (CCC) is critical for determining cell fates and functions in multicellular organisms. With the advent of single-cell RNA-sequencing (scRNA-seq) and spatial transcriptomics (ST), an increasing number of CCC inference methods have been developed. Nevertheless, a thorough comparison of their performances is yet to be conducted. To fill this gap, we developed a systematic benchmark framework called ESICCC to evaluate 18 ligand-receptor (LR) inference methods and five ligand/receptor-target inference methods using a total of 116 data sets, including 15 ST data sets, 15 sets of cell line perturbation data, two sets of cell type-specific expression/proteomics data, and 84 sets of sampled or unsampled scRNA-seq data. We evaluated and compared the agreement, accuracy, robustness, and usability of these methods. Regarding accuracy evaluation, RNAMagnet, CellChat, and scSeqComm emerge as the three best-performing methods for intercellular ligand-receptor inference based on scRNA-seq data, whereas stMLnet and HoloNet are the best methods for predicting ligand/receptor-target regulation using ST data. To facilitate practical applications, we provide a decision-tree-style guideline for users to easily choose the best tools for their specific research concerns in CCC inference, and develop an ensemble pipeline, CCCbank, that enables versatile combinations of methods and databases. Moreover, our comparative results also uncover several critical influential factors for CCC inference, such as prior interaction information, ligand-receptor scoring algorithm, intracellular signaling complexity, and spatial relationship, which may be considered in future studies to advance the development of new methodologies.
Affiliation(s)
- Jiaxin Luo
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
- Minghua Deng
  - School of Mathematical Sciences, Peking University, Beijing, 100871, China
- Xuegong Zhang
  - Bioinformatics Division of BNRIST and Department of Automation, MOE Key Lab of Bioinformatics, Tsinghua University, Beijing, 100084, China
- Xiaoqiang Sun
  - School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China

13
Velten B, Stegle O. Principles and challenges of modeling temporal and spatial omics data. Nat Methods 2023; 20:1462-1474. [PMID: 37710019 DOI: 10.1038/s41592-023-01992-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Accepted: 07/31/2023] [Indexed: 09/16/2023]
Abstract
Studies with temporal or spatial resolution are crucial to understand the molecular dynamics and spatial dependencies underlying a biological process or system. With advances in high-throughput omic technologies, time- and space-resolved molecular measurements at scale are increasingly accessible, providing new opportunities to study the role of timing or structure in a wide range of biological questions. At the same time, analyses of the data being generated in the context of spatiotemporal studies entail new challenges that need to be considered, including the need to account for temporal and spatial dependencies and compare them across different scales, biological samples or conditions. In this Review, we provide an overview of common principles and challenges in the analysis of temporal and spatial omics data. We discuss statistical concepts to model temporal and spatial dependencies and highlight opportunities for adapting existing analysis methods to data with temporal and spatial dimensions.
Affiliation(s)
- Britta Velten
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge, UK
  - Centre for Organismal Studies (COS) and Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
- Oliver Stegle
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge, UK
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany

14
Peters Couto BZ, Robertson N, Patrick E, Ghazanfar S. MoleculeExperiment enables consistent infrastructure for molecule-resolved spatial omics data in Bioconductor. Bioinformatics 2023; 39:btad550. [PMID: 37698995 PMCID: PMC10504467 DOI: 10.1093/bioinformatics/btad550] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/02/2023] [Revised: 07/18/2023] [Accepted: 09/10/2023] [Indexed: 09/14/2023] Open
Abstract
MOTIVATION: Imaging-based spatial transcriptomics (ST) technologies have achieved subcellular resolution, enabling detection of individual molecules in their native tissue context. Data associated with these technologies promise unprecedented opportunity toward understanding cellular and subcellular biology. However, in R/Bioconductor, there is a scarcity of existing computational infrastructure to represent such data, and particularly to summarize and transform it for existing widely adopted computational tools in single-cell transcriptomics analysis, including SingleCellExperiment and SpatialExperiment (SPE) classes. With the emergence of several commercial offerings of imaging-based ST, there is a pressing need to develop consistent data structure standards for these technologies at the individual molecule-level.
RESULTS: To this end, we have developed MoleculeExperiment, an R/Bioconductor package, which (i) stores molecule and cell segmentation boundary information at the molecule-level, (ii) standardizes this molecule-level information across different imaging-based ST technologies, including 10× Genomics' Xenium, and (iii) streamlines transition from a MoleculeExperiment object to a SpatialExperiment object. Overall, MoleculeExperiment is generally applicable as a data infrastructure class for consistent analysis of molecule-resolved spatial omics data.
AVAILABILITY AND IMPLEMENTATION: The MoleculeExperiment package is publicly available on Bioconductor at https://bioconductor.org/packages/release/bioc/html/MoleculeExperiment.html. Source code is available on GitHub at: https://github.com/SydneyBioX/MoleculeExperiment. The vignette for MoleculeExperiment can be found at https://bioconductor.org/packages/release/bioc/html/MoleculeExperiment.html.
Affiliation(s)
- Bárbara Zita Peters Couto
  - School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
- Nicholas Robertson
  - School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Ellis Patrick
  - School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
  - Centre for Cancer Research, The Westmead Institute for Medical Research, The University of Sydney, Westmead, NSW 2145, Australia
- Shila Ghazanfar
  - School of Mathematics and Statistics, The University of Sydney, Camperdown, NSW 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, NSW 2006, Australia
  - Sydney Precision Data Science Centre, The University of Sydney, Camperdown, NSW 2006, Australia

15
Li Z, Wang T, Liu P, Huang Y. SpatialDM for rapid identification of spatially co-expressed ligand-receptor and revealing cell-cell communication patterns. Nat Commun 2023; 14:3995. [PMID: 37414760 PMCID: PMC10325966 DOI: 10.1038/s41467-023-39608-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/28/2022] [Accepted: 06/21/2023] [Indexed: 07/08/2023] Open
Abstract
Cell-cell communication is a key aspect of dissecting the complex cellular microenvironment. Existing single-cell and spatial transcriptomics-based methods primarily focus on identifying cell-type pairs for a specific interaction, while less attention has been paid to the prioritisation of interaction features or the identification of interaction spots in the spatial context. Here, we introduce SpatialDM, a statistical model and toolbox leveraging a bivariant Moran's statistic to detect spatially co-expressed ligand and receptor pairs, their local interacting spots (single-spot resolution), and communication patterns. By deriving an analytical null distribution, this method is scalable to millions of spots and shows accurate and robust performance in various simulations. On multiple datasets including melanoma, Ventricular-Subventricular Zone, and intestine, SpatialDM reveals promising communication patterns and identifies differential interactions between conditions, hence enabling the discovery of context-specific cell cooperation and signalling.
Affiliation(s)
- Zhuoxuan Li
  - School of Biomedical Sciences, University of Hong Kong, Hong Kong SAR, China
- Tianjie Wang
  - Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR, China
- Pentao Liu
  - School of Biomedical Sciences, University of Hong Kong, Hong Kong SAR, China
  - Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China
- Yuanhua Huang
  - School of Biomedical Sciences, University of Hong Kong, Hong Kong SAR, China
  - Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR, China
  - Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China

16
Ranek JS, Stallaert W, Milner J, Stanley N, Purvis JE. Feature selection for preserving biological trajectories in single-cell data. bioRxiv 2023:2023.05.09.540043. [PMID: 37214963 PMCID: PMC10197710 DOI: 10.1101/2023.05.09.540043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Single-cell technologies can readily measure the expression of thousands of molecular features from individual cells undergoing dynamic biological processes, such as cellular differentiation, immune response, and disease progression. While examining cells along a computationally ordered pseudotime offers the potential to study how subtle changes in gene or protein expression impact cell fate decision-making, identifying characteristic features that drive continuous biological processes remains difficult to detect from unenriched and noisy single-cell data. Given that all profiled sources of feature variation contribute to the cell-to-cell distances that define an inferred cellular trajectory, including confounding sources of biological variation (e.g. cell cycle or metabolic state) or noisy and irrelevant features (e.g. measurements with low signal-to-noise ratio) can mask the underlying trajectory of study and hinder inference. Here, we present DELVE (dynamic selection of locally covarying features), an unsupervised feature selection method for identifying a representative subset of dynamically-expressed molecular features that recapitulates cellular trajectories. In contrast to previous work, DELVE uses a bottom-up approach to mitigate the effect of unwanted sources of variation confounding inference, and instead models cell states from dynamic feature modules that constitute core regulatory complexes. Using simulations, single-cell RNA sequencing data, and iterative immunofluorescence imaging data in the context of the cell cycle and cellular differentiation, we demonstrate that DELVE selects features that more accurately characterize cell populations and improve the recovery of cell type transitions. This feature selection framework provides an alternative approach for improving trajectory inference and uncovering co-variation amongst features along a biological trajectory. 
DELVE is implemented as an open-source python package and is publicly available at: https://github.com/jranek/delve.
Affiliation(s)
- Jolene S. Ranek
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Wayne Stallaert
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
- Justin Milner
  - Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Natalie Stanley
  - Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jeremy E. Purvis
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

17
Jansma A. Higher-Order Interactions and Their Duals Reveal Synergy and Logical Dependence beyond Shannon-Information. Entropy (Basel) 2023; 25:e25040648. [PMID: 37190436 PMCID: PMC10137660 DOI: 10.3390/e25040648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 05/17/2023]
Abstract
Information-theoretic quantities reveal dependencies among variables in the structure of joint, marginal, and conditional entropies while leaving certain fundamentally different systems indistinguishable. Furthermore, there is no consensus on the correct higher-order generalisation of mutual information (MI). In this manuscript, we show that a recently proposed model-free definition of higher-order interactions among binary variables (MFIs), like mutual information, is a Möbius inversion on a Boolean algebra, except of surprisal instead of entropy. This provides an information-theoretic interpretation to the MFIs, and by extension to Ising interactions. We study the objects dual to mutual information and the MFIs on the order-reversed lattices. We find that dual MI is related to the previously studied differential mutual information, while dual interactions are interactions with respect to a different background state. Unlike (dual) mutual information, interactions and their duals uniquely identify all six 2-input logic gates, the dy- and triadic distributions, and different causal dynamics that are identical in terms of their Shannon information content.
Affiliation(s)
- Abel Jansma
  - MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh EH8 9YL, UK
  - Higgs Centre for Theoretical Physics, School of Physics & Astronomy, University of Edinburgh, Edinburgh EH8 9YL, UK
  - Biomedical AI Lab, School of Informatics, University of Edinburgh, Edinburgh EH8 9YL, UK

18
Zhu J, Shang L, Zhou X. SRTsim: spatial pattern preserving simulations for spatially resolved transcriptomics. Genome Biol 2023; 24:39. [PMID: 36869394 PMCID: PMC9983268 DOI: 10.1186/s13059-023-02879-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/25/2022] [Accepted: 02/16/2023] [Indexed: 03/05/2023] Open
Abstract
Spatially resolved transcriptomics (SRT)-specific computational methods are often developed, tested, validated, and evaluated in silico using simulated data. Unfortunately, existing simulated SRT data are often poorly documented, hard to reproduce, or unrealistic. Single-cell simulators are not directly applicable for SRT simulation as they cannot incorporate spatial information. We present SRTsim, an SRT-specific simulator for scalable, reproducible, and realistic SRT simulations. SRTsim not only maintains various expression characteristics of SRT data but also preserves spatial patterns. We illustrate the benefits of SRTsim in benchmarking methods for spatial clustering, spatial expression pattern detection, and cell-cell communication identification.
Affiliation(s)
- Jiaqiang Zhu
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Lulu Shang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA

19
Yue L, Liu F, Hu J, Yang P, Wang Y, Dong J, Shu W, Huang X, Wang S. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput Struct Biotechnol J 2023; 21:940-955. [PMID: 38213887 PMCID: PMC10781722 DOI: 10.1016/j.csbj.2023.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 01/13/2023] [Accepted: 01/14/2023] [Indexed: 01/18/2023] Open
Abstract
Advances in transcriptomic technologies have deepened our understanding of the cellular gene expression programs of multicellular organisms and provided a theoretical basis for disease diagnosis and therapy. However, both bulk and single-cell RNA sequencing approaches lose the spatial context of cells within the tissue microenvironment, and the development of spatial transcriptomics has made overall bias-free access to both transcriptional information and spatial information possible. Here, we elaborate on the development of spatial transcriptomic technologies to help researchers select the best-suited technology for their goals and integrate the vast amounts of data to facilitate data accessibility and availability. Then, we marshal various computational approaches to analyze spatial transcriptomic data for various purposes and describe spatial multimodal omics and its potential for application in tumor tissue. Finally, we provide a detailed discussion and outlook of the spatial transcriptomic technologies, data resources and analysis approaches to guide current and future research on spatial transcriptomics.
Affiliation(s)
- Liangchen Yue
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Feng Liu
  - College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
- Jiongsong Hu
  - University of South China, Hengyang, Hunan 421001, China
- Pin Yang
  - Anhui Medical University, Hefei 230022, Anhui, China
- Yuxiang Wang
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Junguo Dong
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Wenjie Shu
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Xingxu Huang
  - Zhejiang Provincial Key Laboratory of Pancreatic Disease, the First Affiliated Hospital, and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310029, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shengqi Wang
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China

20
Bernstein MN, Ni Z, Prasad A, Brown J, Mohanty C, Stewart R, Newton MA, Kendziorski C. SpatialCorr identifies gene sets with spatially varying correlation structure. Cell Rep Methods 2022; 2:100369. [PMID: 36590683 PMCID: PMC9795364 DOI: 10.1016/j.crmeth.2022.100369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Revised: 09/26/2022] [Accepted: 11/21/2022] [Indexed: 12/15/2022]
Abstract
Recent advances in spatially resolved transcriptomics technologies enable both the measurement of genome-wide gene expression profiles and their mapping to spatial locations within a tissue. A first step in spatial transcriptomics data analysis is identifying genes with expression that varies spatially, and robust statistical methods exist to address this challenge. While useful, these methods do not detect spatial changes in the coordinated expression within a group of genes. To this end, we present SpatialCorr, a method for identifying sets of genes with spatially varying correlation structure. Given a collection of gene sets pre-defined by a user, SpatialCorr tests for spatially induced differences in the correlation of each gene set within tissue regions, as well as between and among regions. An application to cutaneous squamous cell carcinoma demonstrates the power of the approach for revealing biological insights not identified using existing methods.
Affiliation(s)
- Zijian Ni
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Aman Prasad
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI 53715, USA
- Jared Brown
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Chitrasen Mohanty
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Ron Stewart
  - Morgridge Institute for Research, Madison, WI 53715, USA
- Michael A. Newton
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA

21
Li Z, Zhou X. BASS: multi-scale and multi-sample analysis enables accurate cell type clustering and spatial domain detection in spatial transcriptomic studies. Genome Biol 2022; 23:168. [PMID: 35927760 PMCID: PMC9351148 DOI: 10.1186/s13059-022-02734-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/31/2022] [Accepted: 07/21/2022] [Indexed: 02/08/2023] Open
Abstract
Spatial transcriptomic studies are reaching single-cell spatial resolution, with data often collected from multiple tissue sections. Here, we present a computational method, BASS, that enables multi-scale and multi-sample analysis for single-cell resolution spatial transcriptomics. BASS performs cell type clustering at the single-cell scale and spatial domain detection at the tissue regional scale, with the two tasks carried out simultaneously within a Bayesian hierarchical modeling framework. We illustrate the benefits of BASS through comprehensive simulations and applications to three datasets. The substantial power gain brought by BASS allows us to reveal accurate transcriptomic and cellular landscape in both cortex and hypothalamus.
Affiliation(s)
- Zheng Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA

22
Song Q, Wang J, Bar-Joseph Z. scSTEM: clustering pseudotime ordered single-cell data. Genome Biol 2022; 23:150. [PMID: 35799304 PMCID: PMC9264648 DOI: 10.1186/s13059-022-02716-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/11/2021] [Accepted: 06/21/2022] [Indexed: 11/25/2022] Open
Abstract
We develop scSTEM, single-cell STEM, a method for clustering dynamic profiles of genes in trajectories inferred from pseudotime ordering of single-cell RNA-seq (scRNA-seq) data. scSTEM uses one of several metrics to summarize the expression of genes and assigns a p-value to clusters enabling the identification of significant profiles and comparison of profiles across different paths. Application of scSTEM to several scRNA-seq datasets demonstrates its usefulness and ability to improve downstream analysis of biological processes. scSTEM is available at https://github.com/alexQiSong/scSTEM.
Affiliation(s)
- Qi Song
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Jingtao Wang
  - Department of Medicine, Division of Experimental Medicine, McGill University, Montreal, QC, Canada
- Ziv Bar-Joseph
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA

23
Li R, Yang X. De novo reconstruction of cell interaction landscapes from single-cell spatial transcriptome data with DeepLinc. Genome Biol 2022; 23:124. [PMID: 35659722 PMCID: PMC9164488 DOI: 10.1186/s13059-022-02692-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/07/2021] [Accepted: 05/20/2022] [Indexed: 11/29/2022] Open
Abstract
Based on a deep generative model of variational graph autoencoder (VGAE), we develop a new method, DeepLinc (deep learning framework for Landscapes of Interacting Cells), for the de novo reconstruction of cell interaction networks from single-cell spatial transcriptomic data. DeepLinc demonstrates high efficiency in learning from imperfect and incomplete spatial transcriptome data, filtering false interactions, and imputing missing distal and proximal interactions. The latent representations learned by DeepLinc are also used for inferring the signature genes contributing to the cell interaction landscapes, and for reclustering the cells based on the spatially coded cell heterogeneity in complex tissues at single-cell resolution.
Affiliation(s)
- Runze Li
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Xuerui Yang
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China

24
Tanevski J, Flores ROR, Gabor A, Schapiro D, Saez-Rodriguez J. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol 2022; 23:97. [PMID: 35422018 PMCID: PMC9011939 DOI: 10.1186/s13059-022-02663-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 10/29/2021] [Accepted: 04/01/2022] [Indexed: 12/12/2022] Open
Abstract
The advancement of highly multiplexed spatial technologies requires scalable methods that can leverage spatial information. We present MISTy, a flexible, scalable, and explainable machine learning framework for extracting relationships from any spatial omics data, from dozens to thousands of measured markers. MISTy builds multiple views focusing on different spatial or functional contexts to dissect different effects. We evaluated MISTy on in silico and breast cancer datasets measured by imaging mass cytometry and spatial transcriptomics. We estimated structural and functional interactions coming from different spatial contexts in breast cancer and demonstrated how to relate MISTy’s results to clinical features.

25
Zeng Z, Li Y, Li Y, Luo Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol 2022; 23:83. [PMID: 35337374 PMCID: PMC8951701 DOI: 10.1186/s13059-022-02653-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 22.5] [Received: 10/16/2021] [Accepted: 03/15/2022] [Indexed: 01/28/2023] Open
Abstract
The recent advancement in spatial transcriptomics technology has enabled multiplexed profiling of cellular transcriptomes and spatial locations. As the capacity and efficiency of the experimental technologies continue to improve, there is an emerging need for the development of analytical approaches. Furthermore, with the continuous evolution of sequencing protocols, the underlying assumptions of current analytical methods need to be re-evaluated and adjusted to harness the increasing data complexity. To motivate and aid future model development, we herein review the recent development of statistical and machine learning methods in spatial transcriptomics, summarize useful resources, and highlight the challenges and opportunities ahead.
Affiliation(s)
- Zexian Zeng
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
- Department of Data Sciences, Dana Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
| | - Yawei Li
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
| | - Yiming Li
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
| | - Yuan Luo
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA.
- Northwestern University Clinical and Translational Sciences Institute, Chicago, IL, 60611, USA.
- Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, IL, 60611, USA.
- Center for Health Information Partnerships, Northwestern University, Chicago, IL, 60611, USA.
|
26
|
Erickson AG, Kameneva P, Adameyko I. The transcriptional portraits of the neural crest at the individual cell level. Semin Cell Dev Biol 2022; 138:68-80. [PMID: 35260294 PMCID: PMC9441473 DOI: 10.1016/j.semcdb.2022.02.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/08/2021] [Revised: 02/04/2022] [Accepted: 02/21/2022] [Indexed: 01/15/2023]
Abstract
Since the discovery of this cell population by His in 1868, the neural crest has been under intense study for its important role during vertebrate development. Much has been learned about the function and regulation of neural crest cell differentiation, and as a result, the neural crest has become a key model system for stem cell biology in general. The experiments performed in embryology, genetics, and cell biology over the last 150 years in the neural crest field have given rise to several big questions that have been debated intensely for many years: "How does positional information impact developmental potential? Are neural crest cells individually multipotent or a mixed population of committed progenitors? What are the key factors that regulate the acquisition of stem cell identity, and how does a stem cell decide to differentiate towards one cell fate versus another?" Recently, a marriage between single-cell multi-omics, statistical modeling, and developmental biology has shed a substantial amount of light on these questions and has paved a clear path for future researchers in the field.
Affiliation(s)
- Alek G Erickson
- Department of Physiology and Pharmacology, Karolinska Institutet, 17165 Stockholm, Sweden
| | - Polina Kameneva
- Department of Neuroimmunology, Center for Brain Research, Medical University Vienna, 1090 Vienna, Austria
| | - Igor Adameyko
- Department of Physiology and Pharmacology, Karolinska Institutet, 17165 Stockholm, Sweden; Department of Neuroimmunology, Center for Brain Research, Medical University Vienna, 1090 Vienna, Austria.
|
27
|
Lohoff T, Ghazanfar S, Missarova A, Koulena N, Pierson N, Griffiths JA, Bardot ES, Eng CHL, Tyser RCV, Argelaguet R, Guibentif C, Srinivas S, Briscoe J, Simons BD, Hadjantonakis AK, Göttgens B, Reik W, Nichols J, Cai L, Marioni JC. Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesis. Nat Biotechnol 2022; 40:74-85. [PMID: 34489600 PMCID: PMC8763645 DOI: 10.1038/s41587-021-01006-2] [Citation(s) in RCA: 108] [Impact Index Per Article: 54.0] [Received: 02/03/2021] [Accepted: 07/07/2021] [Indexed: 02/07/2023]
Abstract
Molecular profiling of single cells has advanced our knowledge of the molecular basis of development. However, current approaches mostly rely on dissociating cells from tissues, thereby losing the crucial spatial context of regulatory processes. Here, we apply an image-based single-cell transcriptomics method, sequential fluorescence in situ hybridization (seqFISH), to detect mRNAs for 387 target genes in tissue sections of mouse embryos at the 8-12 somite stage. By integrating spatial context and multiplexed transcriptional measurements with two single-cell transcriptome atlases, we characterize cell types across the embryo and demonstrate that spatially resolved expression of genes not profiled by seqFISH can be imputed. We use this high-resolution spatial map to characterize fundamental steps in the patterning of the midbrain-hindbrain boundary (MHB) and the developing gut tube. We uncover axes of cell differentiation that are not apparent from single-cell RNA-sequencing (scRNA-seq) data, such as early dorsal-ventral separation of esophageal and tracheal progenitor populations in the gut tube. Our method provides an approach for studying cell fate decisions in complex tissues and development.
Affiliation(s)
- T Lohoff
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Epigenetics Programme, Babraham Institute, Cambridge, UK
| | - S Ghazanfar
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - A Missarova
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - N Koulena
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - N Pierson
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - J A Griffiths
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Genomics Plc, Cambridge, UK
| | - E S Bardot
- Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - C-H L Eng
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - R C V Tyser
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
| | - R Argelaguet
- Epigenetics Programme, Babraham Institute, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - C Guibentif
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Department of Haematology, University of Cambridge, Cambridge, UK
- Sahlgrenska Center for Cancer Research, Department of Microbiology and Immunology, University of Gothenburg, Gothenburg, Sweden
| | - S Srinivas
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
| | - J Briscoe
- The Francis Crick Institute, London, UK
| | - B D Simons
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- The Wellcome/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
- Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, UK
| | - A-K Hadjantonakis
- Developmental Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - B Göttgens
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - W Reik
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK.
- Epigenetics Programme, Babraham Institute, Cambridge, UK.
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK.
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
| | - J Nichols
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK.
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK.
| | - L Cai
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA.
| | - J C Marioni
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
|
28
|
Constructing local cell-specific networks from single-cell data. Proc Natl Acad Sci U S A 2021; 118:2113178118. [PMID: 34903665 PMCID: PMC8713783 DOI: 10.1073/pnas.2113178118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 11/09/2021] [Indexed: 11/18/2022] Open
Abstract
Understanding gene regulatory networks is a topic of great interest because it can provide insights into cellular development, and identify factors that differ between normal and abnormal cells and phenotypes. Single-cell RNA sequencing provides a unique opportunity to gain understanding at the cellular level, but the technical features of the data create severe challenges when constructing gene networks. We develop a method that successfully skirts these challenges to estimate a cell-specific network for each single cell and cell type. Application of our algorithm to two brain cell samples furthers our understanding of autism spectrum disorder by examining the evolution of gene networks in fetal brain cells and comparing the networks of cells sampled from case and control subjects. Gene coexpression networks yield critical insights into biological processes, and single-cell RNA sequencing provides an opportunity to target inquiries at the cellular level. However, due to the sparsity and heterogeneity of transcript counts, it is challenging to construct accurate gene networks. We develop an approach, locCSN, that estimates cell-specific networks (CSNs) for each cell, preserving information about cellular heterogeneity that is lost with other approaches. LocCSN is based on a nonparametric investigation of the joint distribution of gene expression; hence it can readily detect nonlinear correlations, and it is more robust to distributional challenges. Although individual CSNs are estimated with considerable noise, average CSNs provide stable estimates of networks, which reveal gene communities better than traditional measures. Additionally, we propose downstream analysis methods using CSNs to utilize more fully the information contained within them. Repeated estimates of gene networks facilitate testing for differences in network structure between cell groups. 
Notably, with this approach, we can identify differential network genes, which typically do not differ in gene expression, but do differ in terms of the coexpression networks. These genes might help explain the etiology of disease. Finally, to further our understanding of autism spectrum disorder, we examine the evolution of gene networks in fetal brain cells and compare the CSNs of cells sampled from case and control subjects to reveal intriguing patterns in gene coexpression.
|
29
|
Si-C is a method for inferring super-resolution intact genome structure from single-cell Hi-C data. Nat Commun 2021; 12:4369. [PMID: 34272403 PMCID: PMC8285481 DOI: 10.1038/s41467-021-24662-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/10/2020] [Accepted: 06/25/2021] [Indexed: 12/21/2022] Open
Abstract
There is a strong demand for methods that can efficiently reconstruct valid super-resolution intact genome 3D structures from sparse and noisy single-cell Hi-C data. Here, we develop Single-Cell Chromosome Conformation Calculator (Si-C) within the Bayesian theory framework and apply this approach to reconstruct intact genome 3D structures from single-cell Hi-C data of eight G1-phase haploid mouse ES cells. The inferred 100-kb and 10-kb structures consistently reproduce the known conserved features of chromatin organization revealed by independent imaging experiments. The analysis of the 10-kb resolution 3D structures reveals cell-to-cell varying domain structures in individual cells and hyperfine structures in domains, such as loops. An average of 0.2 contact reads per divided bin is sufficient for Si-C to obtain reliable structures. The valid super-resolution structures constructed by Si-C demonstrate the potential for visualizing and investigating interactions between all chromatin loci at the genome scale in individual cells. Constructing valid super-resolution intact genome 3D structures from single-cell Hi-C data is essential in investigating chromosome folding. Here the authors develop a method that makes it possible to visualize and investigate chromosome folding in individual cells at the genome scale.
|
30
|
Zhu J, Sun S, Zhou X. SPARK-X: non-parametric modeling enables scalable and robust detection of spatial expression patterns for large spatial transcriptomic studies. Genome Biol 2021; 22:184. [PMID: 34154649 PMCID: PMC8218388 DOI: 10.1186/s13059-021-02404-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 10/22/2020] [Accepted: 06/07/2021] [Indexed: 01/01/2023] Open
Abstract
Spatial transcriptomic studies are becoming increasingly common and large, posing important statistical and computational challenges for many analytic tasks. Here, we present SPARK-X, a non-parametric method for rapid and effective detection of spatially expressed genes in large spatial transcriptomic studies. SPARK-X not only produces effective type I error control and high power but also brings orders of magnitude computational savings. We apply SPARK-X to analyze three large datasets, one of which is only analyzable by SPARK-X. In these data, SPARK-X identifies many spatially expressed genes including those that are spatially expressed within the same cell type, revealing new biological insights.
Affiliation(s)
- Jiaqiang Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Shiquan Sun
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Department of Epidemiology and Biostatistics, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061, P.R. China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
|
31
|
Liang S, Mohanty V, Dou J, Miao Q, Huang Y, Müftüoğlu M, Ding L, Peng W, Chen K. Single-cell manifold-preserving feature selection for detecting rare cell populations. Nat Comput Sci 2021; 1:374-384. [PMID: 36969355 PMCID: PMC10035340 DOI: 10.1038/s43588-021-00070-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/09/2020] [Accepted: 04/19/2021] [Indexed: 01/04/2023]
Abstract
A key challenge in studying organisms and diseases is to detect rare molecular programs and rare cell populations (RCPs) that drive development, differentiation, and transformation. Molecular features such as genes and proteins defining RCPs are often unknown and difficult to detect from unenriched single-cell data, using conventional dimensionality reduction and clustering-based approaches. Here, we propose an unsupervised approach, SCMER (Single-Cell Manifold presERving feature selection), which selects a compact set of molecular features with definitive meanings that preserve the manifold of the data. We applied SCMER in the context of hematopoiesis, lymphogenesis, tumorigenesis, and drug resistance and response. We found that SCMER can identify non-redundant features that sensitively delineate both common cell lineages and rare cellular states. SCMER can be used for discovering molecular features in a high dimensional dataset, designing targeted, cost-effective assays for clinical applications, and facilitating multi-modality integration.
Affiliation(s)
- Shaoheng Liang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
- Department of Computer Science, Rice University, Houston, Texas, 77005, USA
| | - Vakul Mohanty
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Jinzhuang Dou
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Qi Miao
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
- Department of Biostatistics & Data Science, School of Public Health, The University of Texas Health Science Center at Houston (UTHealth), Houston, Texas, 77030, USA
| | - Yuefan Huang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
- Department of Biostatistics & Data Science, School of Public Health, The University of Texas Health Science Center at Houston (UTHealth), Houston, Texas, 77030, USA
| | - Muharrem Müftüoğlu
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
| | - Li Ding
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108
| | - Weiyi Peng
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, 77024
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, USA
|
32
|
Bach K, Pensa S, Zarocsinceva M, Kania K, Stockis J, Pinaud S, Lazarus KA, Shehata M, Simões BM, Greenhalgh AR, Howell SJ, Clarke RB, Caldas C, Halim TYF, Marioni JC, Khaled WT. Time-resolved single-cell analysis of Brca1 associated mammary tumourigenesis reveals aberrant differentiation of luminal progenitors. Nat Commun 2021; 12:1502. [PMID: 33686070 PMCID: PMC7940427 DOI: 10.1038/s41467-021-21783-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 11/12/2020] [Accepted: 02/11/2021] [Indexed: 12/13/2022] Open
Abstract
It is unclear how genetic aberrations impact the state of nascent tumour cells and their microenvironment. BRCA1 driven triple negative breast cancer (TNBC) has been shown to arise from luminal progenitors yet little is known about how BRCA1 loss-of-function (LOF) and concomitant mutations affect the luminal progenitor cell state. Here we demonstrate how time-resolved single-cell profiling of genetically engineered mouse models before tumour formation can address this challenge. We found that perturbing Brca1/p53 in luminal progenitors induces aberrant alveolar differentiation pre-malignancy accompanied by pro-tumourigenic changes in the immune compartment. Unlike alveolar differentiation during gestation, this process is cell autonomous and characterised by the dysregulation of transcription factors driving alveologenesis. Based on our data we propose a model where Brca1/p53 LOF inadvertently promotes a differentiation program hardwired in luminal progenitors, highlighting the deterministic role of the cell-of-origin and offering a potential explanation for the tissue specificity of BRCA1 tumours.
Affiliation(s)
- Karsten Bach
- University of Cambridge, Department of Pharmacology, Cambridge, UK
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK
| | - Sara Pensa
- University of Cambridge, Department of Pharmacology, Cambridge, UK
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK
| | - Marija Zarocsinceva
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK
- Wellcome-MRC Cambridge Stem Cell Institute, Cambridge, UK
| | - Katarzyna Kania
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
| | - Julie Stockis
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
| | - Silvain Pinaud
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
| | - Kyren A Lazarus
- University of Cambridge, Department of Pharmacology, Cambridge, UK
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK
| | - Mona Shehata
- Medical Research Council Cancer Unit, University of Cambridge, Cambridge, UK
| | - Bruno M Simões
- Manchester Breast Centre, Oglesby Cancer Research Building, University of Manchester, Manchester, UK
| | - Alice R Greenhalgh
- Manchester Breast Centre, Oglesby Cancer Research Building, University of Manchester, Manchester, UK
| | - Sacha J Howell
- Manchester Breast Centre, Oglesby Cancer Research Building, University of Manchester, Manchester, UK
- Department of Medical Oncology, Christie NHS Foundation Trust, Manchester, UK
| | - Robert B Clarke
- Manchester Breast Centre, Oglesby Cancer Research Building, University of Manchester, Manchester, UK
| | - Carlos Caldas
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK
| | - Timotheus Y F Halim
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
| | - John C Marioni
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK.
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
- European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, UK.
| | - Walid T Khaled
- University of Cambridge, Department of Pharmacology, Cambridge, UK.
- Cancer Research UK, Cambridge Cancer Centre, Cambridge, UK.
- Wellcome-MRC Cambridge Stem Cell Institute, Cambridge, UK.
|
33
|
Seweryn MT, Pietrzak M, Ma Q. Application of information theoretical approaches to assess diversity and similarity in single-cell transcriptomics. Comput Struct Biotechnol J 2020; 18:1830-1837. [PMID: 32728406 PMCID: PMC7371753 DOI: 10.1016/j.csbj.2020.05.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/11/2019] [Revised: 04/24/2020] [Accepted: 05/06/2020] [Indexed: 02/09/2023] Open
Abstract
Single-cell transcriptomics offers a powerful way to reveal the heterogeneity of individual cells. To date, many information theoretical approaches have been proposed to assess diversity and similarity, and to characterize the latent heterogeneity in transcriptome data. Diversity implies gene expression variations and can facilitate the identification of signature genes, while similarity unravels co-expression patterns for cell type clustering. In this review, we summarize 16 measures of information theory used for evaluating diversity and similarity in single-cell transcriptomic data, provide references, and shed light on selected theoretical properties for cases where proper measurements need to be selected. We further provide an R package assembling the discussed approaches to support researchers' own single-cell transcriptome studies. Finally, we discuss prospective applications of diversity and similarity measures for depicting heterogeneity in single-cell multi-omics data.
Affiliation(s)
- Michal T. Seweryn
- Center for Medical Genomics, Jagiellonian University, Cracow, Poland
| | - Maciej Pietrzak
- Department of Biomedical Informatics, The Ohio State University, Columbus OH, United States
| | - Qin Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus OH, United States
|