1
Das Adhikari S, Yang J, Wang J, Cui Y. Recent advances in spatially variable gene detection in spatial transcriptomics. Comput Struct Biotechnol J 2024; 23:883-891. [PMID: 38370977 PMCID: PMC10869304 DOI: 10.1016/j.csbj.2024.01.016] [Received: 11/27/2023] [Revised: 01/22/2024] [Accepted: 01/22/2024] [Indexed: 02/20/2024]
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Jiaxin Yang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jianrong Wang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
2
Ospina OE, Soupir AC, Manjarres-Betancur R, Gonzalez-Calderon G, Yu X, Fridley BL. Differential gene expression analysis of spatial transcriptomic experiments using spatial mixed models. Sci Rep 2024; 14:10967. [PMID: 38744956 PMCID: PMC11094014 DOI: 10.1038/s41598-024-61758-0] [Received: 02/01/2024] [Accepted: 05/09/2024] [Indexed: 05/16/2024]
Abstract
Spatial transcriptomics (ST) assays represent a revolution in how the architecture of tissues is studied by allowing for the exploration of cells in their spatial context. A common element in the analysis is delineating tissue domains or "niches" followed by detecting differentially expressed genes to infer the biological identity of the tissue domains or cell types. However, many studies approach differential expression analysis by using statistical approaches often applied in the analysis of non-spatial scRNA data (e.g., two-sample t-tests, Wilcoxon's rank sum test), hence neglecting the spatial dependency observed in ST data. In this study, we show that applying linear mixed models with spatial correlation structures using spatial random effects effectively accounts for the spatial autocorrelation and reduces inflation of type-I error rate observed in non-spatial based differential expression testing. We also show that spatial linear models with an exponential correlation structure provide a better fit to the ST data as compared to non-spatial models, particularly for spatially resolved technologies that quantify expression at finer scales (i.e., single-cell resolution).
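The idea of an exponential spatial correlation structure can be illustrated with a small sketch. This is not the authors' implementation: the function names `exponential_cov` and `gls_fit`, the parameter values, and the use of plain generalized least squares with a known covariance are all assumptions made for illustration. In a real analysis the variance, range, and nugget parameters would be estimated from the data rather than fixed.

```python
import numpy as np

def exponential_cov(coords, sigma2=1.0, range_=1.0, nugget=0.1):
    """Exponential spatial covariance: sigma2 * exp(-d / range_) plus a nugget.

    All parameter values are illustrative assumptions.
    """
    pts = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between spots or cells.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / range_) + nugget * np.eye(len(pts))

def gls_fit(y, X, V):
    """Generalized least squares estimate of beta under Cov(y) = V."""
    Vi_X = np.linalg.solve(V, X)
    Vi_y = np.linalg.solve(V, y)
    cov_beta = np.linalg.inv(X.T @ Vi_X)
    beta = cov_beta @ (X.T @ Vi_y)
    return beta, cov_beta

# Toy example: six locations, the last three belong to a second tissue domain.
coords = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]], dtype=float)
X = np.column_stack([np.ones(6), [0, 0, 0, 1, 1, 1]])  # intercept + domain indicator
y = np.array([1.0, 1.2, 0.9, 3.1, 2.8, 3.0])           # made-up expression values
V = exponential_cov(coords)
beta, cov_beta = gls_fit(y, X, V)
# The standard error of the domain effect reflects the assumed spatial correlation.
se_domain = np.sqrt(cov_beta[1, 1])
```

Because nearby locations are treated as correlated, the standard error of the domain effect generally differs from what a test assuming independent observations would report.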
Affiliation(s)
- Oscar E Ospina
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Alex C Soupir
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Xiaoqing Yu
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Brooke L Fridley
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
  - Biostatistics and Epidemiology Core, Division of Health Services & Outcomes Research, Children's Mercy, Kansas City, MO, USA
3
Chakrabarti A, Ni Y, Mallick BK. Joint Bayesian estimation of cell dependence and gene associations in spatially resolved transcriptomic data. Sci Rep 2024; 14:9516. [PMID: 38664448 PMCID: PMC11045727 DOI: 10.1038/s41598-024-60002-z] [Received: 12/21/2023] [Accepted: 04/17/2024] [Indexed: 04/28/2024]
Abstract
Recent technologies, such as spatial transcriptomics, enable the measurement of gene expression at the single-cell level along with the spatial locations of these cells in the tissue. Spatial clustering of the cells provides valuable insights into the understanding of the functional organization of the tissue. However, most such clustering methods involve some dimension reduction that leads to a loss of the inherent dependency structure among genes at any spatial location in the tissue. This destroys valuable insights into gene co-expression patterns apart from possibly impacting spatial clustering performance. In spatial transcriptomics, the matrix-variate gene expression data, along with spatial coordinates of the single cells, provides information on both gene expression dependencies and cell spatial dependencies through its row and column covariances. In this work, we propose a joint Bayesian approach to simultaneously estimate these gene and spatial cell correlations. These estimates provide data summaries for downstream analyses. We illustrate our method with simulations and analysis of several real spatial transcriptomic datasets. Our work elucidates gene co-expression networks as well as clear spatial clustering patterns of the cells. Furthermore, our analysis reveals that downstream spatial-differential analysis may aid in the discovery of unknown cell types from known marker genes.
Affiliation(s)
- Arhit Chakrabarti
  - Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
- Yang Ni
  - Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
- Bani K Mallick
  - Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
4
Yang J, Jiang X, Jin KW, Shin S, Li Q. Bayesian hidden mark interaction model for detecting spatially variable genes in imaging-based spatially resolved transcriptomics data. Front Genet 2024; 15:1356709. [PMID: 38725485 PMCID: PMC11079231 DOI: 10.3389/fgene.2024.1356709] [Received: 12/16/2023] [Accepted: 04/08/2024] [Indexed: 05/12/2024]
Abstract
Recent technology breakthroughs in spatially resolved transcriptomics (SRT) have enabled the comprehensive molecular characterization of cells whilst preserving their spatial and gene expression contexts. One of the fundamental questions in analyzing SRT data is the identification of spatially variable genes whose expressions display spatially correlated patterns. Existing approaches are built upon either the Gaussian process-based model, which relies on ad hoc kernels, or the energy-based Ising model, which requires gene expression to be measured on a lattice grid. To overcome these potential limitations, we developed a generalized energy-based framework to model gene expression measured from imaging-based SRT platforms, accommodating the irregular spatial distribution of measured cells. Our Bayesian model applies a zero-inflated negative binomial mixture model to dichotomize the raw count data, reducing noise. Additionally, we incorporate a geostatistical mark interaction model with a generalized energy function, where the interaction parameter is used to identify the spatial pattern. Auxiliary variable MCMC algorithms were employed to sample from the posterior distribution with an intractable normalizing constant. We demonstrated the strength of our method on both simulated and real data. Our simulation study showed that our method captured various spatial patterns with high accuracy; moreover, analysis of a seqFISH dataset and a STARmap dataset established that our proposed method is able to identify genes with novel and strong spatial patterns.
Affiliation(s)
- Jie Yang
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, United States
- Xi Jiang
  - Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, United States
- Kevin Wang Jin
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Sunyoung Shin
  - Department of Mathematics, Pohang University of Science and Technology, Pohang, Republic of Korea
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, United States
5
Guo X, Ning J, Chen Y, Liu G, Zhao L, Fan Y, Sun S. Recent advances in differential expression analysis for single-cell RNA-seq and spatially resolved transcriptomic studies. Brief Funct Genomics 2024; 23:95-109. [PMID: 37022699 DOI: 10.1093/bfgp/elad011] [Received: 07/08/2022] [Revised: 12/09/2022] [Accepted: 03/10/2023] [Indexed: 04/07/2023]
Abstract
Differential expression (DE) analysis is a necessary step in the analysis of single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data. Unlike traditional bulk RNA-seq, DE analysis for scRNA-seq or SRT data has unique characteristics that may contribute to the difficulty of detecting DE genes. However, the plethora of DE tools that work with various assumptions makes it difficult to choose an appropriate one. Furthermore, a comprehensive review on detecting DE genes for scRNA-seq data or SRT data from multi-condition, multi-sample experimental designs is lacking. To bridge such a gap, here, we first focus on the challenges of DE detection, then highlight potential opportunities that facilitate further progress in scRNA-seq or SRT analysis, and finally provide insights and guidance in selecting appropriate DE tools or developing new computational DE methods.
Affiliation(s)
- Xiya Guo
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Jin Ning
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yuanze Chen
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Guoliang Liu
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Liyan Zhao
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yue Fan
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Shiquan Sun
  - School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
  - Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
6
Li R, Chen X, Yang X. Navigating the landscapes of spatial transcriptomics: How computational methods guide the way. Wiley Interdiscip Rev RNA 2024; 15:e1839. [PMID: 38527900 DOI: 10.1002/wrna.1839] [Received: 11/23/2023] [Revised: 02/24/2024] [Accepted: 03/04/2024] [Indexed: 03/27/2024]
Abstract
Spatially resolved transcriptomics has been dramatically transforming biological and medical research in various fields. It enables transcriptome profiling at single-cell, multi-cellular, or sub-cellular resolution, while retaining the information of geometric localizations of cells in complex tissues. The coupling of cell spatial information and its molecular characteristics generates a novel multi-modal high-throughput data source, which poses new challenges for the development of analytical methods for data-mining. Spatial transcriptomic data are often highly complex, noisy, and biased, presenting a series of difficulties, many unresolved, for data analysis and generation of biological insights. In addition, to keep pace with the ever-evolving spatial transcriptomic experimental technologies, the existing analytical theories and tools need to be updated and reformed accordingly. In this review, we provide an overview and discussion of the current computational approaches for mining of spatial transcriptomics data. Future directions and perspectives of methodology design are proposed to stimulate further discussions and advances in new analytical models and algorithms. This article is categorized under: RNA Methods > RNA Analyses in Cells; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Export and Localization > RNA Localization.
Affiliation(s)
- Runze Li
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Xu Chen
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Xuerui Yang
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
7
Mason K, Sathe A, Hess PR, Rong J, Wu CY, Furth E, Susztak K, Levinsohn J, Ji HP, Zhang N. Niche-DE: niche-differential gene expression analysis in spatial transcriptomics data identifies context-dependent cell-cell interactions. Genome Biol 2024; 25:14. [PMID: 38217002 PMCID: PMC10785550 DOI: 10.1186/s13059-023-03159-6] [Received: 02/13/2023] [Accepted: 12/22/2023] [Indexed: 01/14/2024]
Abstract
Existing methods for analysis of spatial transcriptomic data focus on delineating the global gene expression variations of cell types across the tissue, rather than local gene expression changes driven by cell-cell interactions. We propose a new statistical procedure called niche-differential expression (niche-DE) analysis that identifies cell-type-specific niche-associated genes, which are differentially expressed within a specific cell type in the context of specific spatial niches. We further develop niche-LR, a method to reveal ligand-receptor signaling mechanisms that underlie niche-differential gene expression patterns. Niche-DE and niche-LR are applicable to low-resolution spot-based spatial transcriptomics data and data that is single-cell or subcellular in resolution.
Affiliation(s)
- Kaishu Mason
  - Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, USA
- Anuja Sathe
  - Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Paul R Hess
  - Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, USA
- Jiazhen Rong
  - Genomics and Computational Biology Graduate Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Chi-Yun Wu
  - The Gladstone Institute, San Francisco, USA
- Emma Furth
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Katalin Susztak
  - Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Jonathan Levinsohn
  - Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Hanlee P Ji
  - Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Nancy Zhang
  - Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, USA
8
Yang J, Jiang X, Jin KW, Shin S, Li Q. Bayesian Hidden Mark Interaction Model for Detecting Spatially Variable Genes in Imaging-Based Spatially Resolved Transcriptomics Data. bioRxiv 2023:2023.12.17.572071. [PMID: 38168368 PMCID: PMC10760150 DOI: 10.1101/2023.12.17.572071] [Indexed: 01/05/2024]
Abstract
Recent technology breakthroughs in spatially resolved transcriptomics (SRT) have enabled the comprehensive molecular characterization of cells whilst preserving their spatial and gene expression contexts. One of the fundamental questions in analyzing SRT data is the identification of spatially variable genes whose expressions display spatially correlated patterns. Existing approaches are built upon either the Gaussian process-based model, which relies on ad hoc kernels, or the energy-based Ising model, which requires gene expression to be measured on a lattice grid. To overcome these potential limitations, we developed a generalized energy-based framework to model gene expression measured from imaging-based SRT platforms, accommodating the irregular spatial distribution of measured cells. Our Bayesian model applies a zero-inflated negative binomial mixture model to dichotomize the raw count data, reducing noise. Additionally, we incorporate a geostatistical mark interaction model with a generalized energy function, where the interaction parameter is used to identify the spatial pattern. Auxiliary variable MCMC algorithms were employed to sample from the posterior distribution with an intractable normalizing constant. We demonstrated the strength of our method on both simulated and real data. Our simulation study showed that our method captured various spatial patterns with high accuracy; moreover, analysis of a seqFISH dataset and a STARmap dataset established that our proposed method is able to identify genes with novel and strong spatial patterns.
Affiliation(s)
- Jie Yang
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, U.S.A
- Xi Jiang
  - Department of Statistics and Data Science, Southern Methodist University, Dallas, Texas, U.S.A
- Kevin W. Jin
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, U.S.A
- Sunyoung Shin
  - Department of Mathematics, Pohang University of Science and Technology, Pohang, South Korea
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, U.S.A
9
Jiang X, Dong L, Wang S, Wen Z, Chen M, Xu L, Xiao G, Li Q. Reconstructing Spatial Transcriptomics at the Single-cell Resolution with BayesDeep. bioRxiv 2023:2023.12.07.570715. [PMID: 38106214 PMCID: PMC10723442 DOI: 10.1101/2023.12.07.570715] [Indexed: 12/19/2023]
Abstract
Spatially resolved transcriptomics (SRT) techniques have revolutionized the characterization of molecular profiles while preserving spatial and morphological context. However, most next-generation sequencing-based SRT techniques are limited to measuring gene expression in a confined array of spots, capturing only a fraction of the spatial domain. Typically, these spots encompass gene expression from a few to hundreds of cells, underscoring a critical need for more detailed, single-cell resolution SRT data to enhance our understanding of biological functions within the tissue context. Addressing this challenge, we introduce BayesDeep, a novel Bayesian hierarchical model that leverages cellular morphological data from histology images, commonly paired with SRT data, to reconstruct SRT data at the single-cell resolution. BayesDeep effectively models count data from SRT studies via a negative binomial regression model. This model incorporates explanatory variables such as cell types and nuclei-shape information for each cell extracted from the paired histology image. A feature selection scheme is integrated to examine the association between the morphological and molecular profiles, thereby improving the model robustness. We applied BayesDeep to two real SRT datasets, successfully demonstrating its capability to reconstruct SRT data at the single-cell resolution. This advancement not only yields new biological insights but also significantly enhances various downstream analyses, such as pseudotime and cell-cell communication.
Affiliation(s)
- Xi Jiang
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
  - Department of Statistics and Data Science, Southern Methodist University, Dallas, Texas, U.S.A
- Lei Dong
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Shidan Wang
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Zhuoyu Wen
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Mingyi Chen
  - Department of Pathology, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Lin Xu
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
  - Department of Pediatrics, Division of Hematology/Oncology, University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, U.S.A
10
Li Z, Patel ZM, Song D, Yan G, Li JJ, Pinello L. Benchmarking computational methods to identify spatially variable genes and peaks. bioRxiv 2023:2023.12.02.569717. [PMID: 38076922 PMCID: PMC10705556 DOI: 10.1101/2023.12.02.569717] [Indexed: 12/17/2023]
Abstract
Spatially resolved transcriptomics offers unprecedented insight by enabling the profiling of gene expression within the intact spatial context of cells, effectively adding a new and essential dimension to data interpretation. To efficiently detect spatial structure of interest, an essential step in analyzing such data involves identifying spatially variable genes. Despite researchers having developed several computational methods to accomplish this task, the lack of a comprehensive benchmark evaluating their performance remains a considerable gap in the field. Here, we present a systematic evaluation of 14 methods using 60 simulated datasets generated by four different simulation strategies, 12 real-world transcriptomics, and three spatial ATAC-seq datasets. We find that spatialDE2 consistently outperforms the other benchmarked methods, and Moran's I achieves competitive performance in different experimental settings. Moreover, our results reveal that more specialized algorithms are needed to identify spatially variable peaks.
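For readers unfamiliar with Moran's I, the statistic mentioned above, a minimal sketch follows. It is not the benchmark's code: the function name `morans_i`, the binary k-nearest-neighbour weight matrix, and the default `k=4` are illustrative assumptions, and other weighting schemes are equally valid. The statistic is I = (n / W) * (sum over i, j of w_ij z_i z_j) / (sum over i of z_i^2), where z is the mean-centred expression vector and W is the sum of all weights.

```python
import numpy as np

def morans_i(values, coords, k=4):
    """Moran's I for one gene, using binary k-nearest-neighbour weights.

    The weighting scheme and k are illustrative assumptions.
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(coords, dtype=float)
    n = x.size
    # Pairwise Euclidean distances; a location is never its own neighbour.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # w[i, j] = 1 if j is among the k nearest neighbours of i.
    w = np.zeros((n, n))
    nearest = np.argsort(d, axis=1)[:, :k]
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    z = x - x.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Toy 10 x 10 grid: one gene follows a left-to-right gradient,
# the other carries the same values in shuffled order.
xx, yy = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xx.ravel(), yy.ravel()])
gradient = coords[:, 0].astype(float)
shuffled = np.random.default_rng(0).permutation(gradient)
i_gradient = morans_i(gradient, coords)
i_shuffled = morans_i(shuffled, coords)
```

A spatially structured gene yields a value well above zero, while the shuffled gene should give a value close to zero. Significance would normally be assessed with a permutation or analytical test, which this sketch omits.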
Affiliation(s)
- Zhijian Li
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Zain M. Patel
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Dongyuan Song
  - Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA, USA
- Guanao Yan
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
- Jingyi Jessica Li
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
- Luca Pinello
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
11
Adhikari SD, Yang J, Wang J, Cui Y. A selective review of recent developments in spatially variable gene detection for spatial transcriptomics. arXiv 2023:arXiv:2311.13801v1. [PMID: 38045476 PMCID: PMC10690303] [Indexed: 12/05/2023]
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jiaxin Yang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jianrong Wang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
12
Zhou R, Yang G, Zhang Y, Wang Y. Spatial transcriptomics in development and disease. Mol Biomed 2023; 4:32. [PMID: 37806992 PMCID: PMC10560656 DOI: 10.1186/s43556-023-00144-0] [Received: 06/21/2023] [Accepted: 08/29/2023] [Indexed: 10/10/2023]
Abstract
The proper functioning of diverse biological systems depends on the spatial organization of their cells, a critical factor for biological processes like shaping intricate tissue functions and precisely determining cell fate. Nonetheless, conventional bulk or single-cell RNA sequencing methods were incapable of simultaneously capturing both gene expression profiles and the spatial locations of cells. Hence, a multitude of spatially resolved technologies have emerged, offering a novel dimension for investigating regional gene expression, spatial domains, and interactions between cells. Spatial transcriptomics (ST) is a method that maps gene expression in tissue while preserving spatial information. It can reveal cellular heterogeneity, spatial organization and functional interactions in complex biological systems. ST can also complement and integrate with other omics methods to provide a more comprehensive and holistic view of biological systems at multiple levels of resolution. Since the advent of ST, new methods offering higher throughput and resolution have become available, holding significant potential to expedite fresh insights into comprehending biological complexity. Consequently, a rapid increase in associated research has occurred, using these technologies to unravel the spatial complexity during developmental processes or disease conditions. In this review, we summarize the recent advancement of ST in historical, technical, and application contexts. We compare different types of ST methods based on their principles and workflows, and present the bioinformatics tools for analyzing and integrating ST data with other modalities. We also highlight the applications of ST in various domains of biomedical research, especially development and diseases. Finally, we discuss the current limitations and challenges in the field, and propose the future directions of ST.
Affiliation(s)
- Ran Zhou
  - Department of Neurosurgery, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Gaoxia Yang
  - Department of Neurosurgery, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
  - National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China
- Yan Zhang
  - National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, 610041, Sichuan, China
- Yuan Wang
  - Department of Neurosurgery, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
13
Yuan Z, Yao J. Harnessing computational spatial omics to explore the spatial biology intricacies. Semin Cancer Biol 2023; 95:25-41. [PMID: 37400044 DOI: 10.1016/j.semcancer.2023.06.006] [Received: 12/02/2022] [Revised: 05/09/2023] [Accepted: 06/19/2023] [Indexed: 07/05/2023]
Abstract
Spatially resolved transcriptomics (SRT) has unlocked new dimensions in our understanding of intricate tissue architectures. However, this rapidly expanding field produces a wealth of diverse and voluminous data, necessitating the evolution of sophisticated computational strategies to unravel inherent patterns. Two distinct methodologies, gene spatial pattern recognition (GSPR) and tissue spatial pattern recognition (TSPR), have emerged as vital tools in this process. GSPR methodologies are designed to identify and classify genes exhibiting noteworthy spatial patterns, while TSPR strategies aim to understand intercellular interactions and recognize tissue domains with molecular and spatial coherence. In this review, we provide a comprehensive exploration of SRT, highlighting crucial data modalities and resources that are instrumental for the development of methods and biological insights. We address the complexities and challenges posed by the use of heterogeneous data in developing GSPR and TSPR methodologies and propose an optimal workflow for both. We delve into the latest advancements in GSPR and TSPR, examining their interrelationships. Lastly, we peer into the future, envisaging the potential directions and perspectives in this dynamic field.
Affiliation(s)
- Zhiyuan Yuan
  - Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
|
14
|
Seal S, Bitler BG, Ghosh D. SMASH: Scalable Method for Analyzing Spatial Heterogeneity of genes in spatial transcriptomics data. PLoS Genet 2023; 19:e1010983. [PMID: 37862362 PMCID: PMC10619839 DOI: 10.1371/journal.pgen.1010983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 11/01/2023] [Accepted: 09/19/2023] [Indexed: 10/22/2023] Open
Abstract
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can be crucial to the biological understanding of both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms uncovering interesting biological insights.
Affiliation(s)
- Souvik Seal
  - Department of Public Health Sciences, School of Medicine, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Benjamin G. Bitler
  - Department of Obstetrics and Gynecology, School of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
- Debashis Ghosh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
|
15
|
Weber LM, Saha A, Datta A, Hansen KD, Hicks SC. nnSVG for the scalable identification of spatially variable genes using nearest-neighbor Gaussian processes. Nat Commun 2023; 14:4059. [PMID: 37429865 DOI: 10.1038/s41467-023-39748-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 06/15/2022] [Accepted: 06/23/2023] [Indexed: 07/12/2023] Open
Abstract
Feature selection to identify spatially variable genes or other biologically informative genes is a key step during analyses of spatially-resolved transcriptomics data. Here, we propose nnSVG, a scalable approach to identify spatially variable genes based on nearest-neighbor Gaussian processes. Our method (i) identifies genes that vary in expression continuously across the entire tissue or within a priori defined spatial domains, (ii) uses gene-specific estimates of length scale parameters within the Gaussian process models, and (iii) scales linearly with the number of spatial locations. We demonstrate the performance of our method using experimental data from several technological platforms and simulations. A software implementation is available at https://bioconductor.org/packages/nnSVG .
Affiliation(s)
- Lukas M Weber
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Arkajyoti Saha
  - Department of Statistics, University of Washington, Seattle, WA, USA
- Abhirup Datta
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Kasper D Hansen
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Stephanie C Hicks
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
|
16
|
Jiang X, Luo D, Fernández E, Yang J, Li H, Jin KW, Zhan Y, Yao B, Bedi S, Xiao G, Zhan X, Li Q, Xie Y. Spatial Transcriptomics Arena (STAr): an Integrated Platform for Spatial Transcriptomics Methodology Research. bioRxiv 2023:2023.03.10.532127. [PMID: 36945650 PMCID: PMC10028992 DOI: 10.1101/2023.03.10.532127] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/13/2023]
Abstract
The emerging field of spatially resolved transcriptomics (SRT) has revolutionized biomedical research. SRT quantifies expression levels at different spatial locations, providing a new and powerful tool to interrogate novel biological insights. An essential question in the analysis of SRT data is to identify spatially variable (SV) genes; the expression levels of such genes have spatial variation across different tissues. SV genes usually play an important role in underlying biological mechanisms and tissue heterogeneity. Currently, several computational methods have been developed to detect such genes; however, there is a lack of unbiased assessment of these approaches to guide researchers in selecting the appropriate methods for their specific biomedical applications. In addition, it is difficult for researchers to implement different existing methods for either biological study or methodology development. Furthermore, currently available public SRT datasets are scattered across different websites and preprocessed in different ways, posing additional obstacles for quantitative researchers developing computational methods for SRT data analysis. To address these challenges, we designed Spatial Transcriptomics Arena (STAr), an open platform comprising 193 curated datasets from seven technologies, seven statistical methods, and analysis results. This resource allows users to retrieve high-quality datasets, apply or develop spatial gene detection methods, as well as browse and compare spatial gene analysis results. It also enables researchers to comprehensively evaluate SRT methodology research in both simulated and real datasets. Altogether, STAr is an integrated research resource intended to promote reproducible research and accelerate rigorous methodology development, which can eventually lead to an improved understanding of biological processes and diseases. STAr can be accessed at https://lce.biohpc.swmed.edu/star/ .
|
17
|
Zhu J, Shang L, Zhou X. SRTsim: spatial pattern preserving simulations for spatially resolved transcriptomics. Genome Biol 2023; 24:39. [PMID: 36869394 PMCID: PMC9983268 DOI: 10.1186/s13059-023-02879-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/25/2022] [Accepted: 02/16/2023] [Indexed: 03/05/2023] Open
Abstract
Spatially resolved transcriptomics (SRT)-specific computational methods are often developed, tested, validated, and evaluated in silico using simulated data. Unfortunately, existing simulated SRT data are often poorly documented, hard to reproduce, or unrealistic. Single-cell simulators are not directly applicable for SRT simulation as they cannot incorporate spatial information. We present SRTsim, an SRT-specific simulator for scalable, reproducible, and realistic SRT simulations. SRTsim not only maintains various expression characteristics of SRT data but also preserves spatial patterns. We illustrate the benefits of SRTsim in benchmarking methods for spatial clustering, spatial expression pattern detection, and cell-cell communication identification.
Affiliation(s)
- Jiaqiang Zhu
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Lulu Shang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
|
18
|
Jeon H, Xie J, Jeon Y, Jung KJ, Gupta A, Chang W, Chung D. Statistical Power Analysis for Designing Bulk, Single-Cell, and Spatial Transcriptomics Experiments: Review, Tutorial, and Perspectives. Biomolecules 2023; 13:biom13020221. [PMID: 36830591 PMCID: PMC9952882 DOI: 10.3390/biom13020221] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/26/2022] [Revised: 01/20/2023] [Accepted: 01/21/2023] [Indexed: 01/26/2023] Open
Abstract
Gene expression profiling technologies have been used in various applications such as cancer biology. The development of gene expression profiling has expanded the scope of target discovery in transcriptomic studies, and each technology produces data with distinct characteristics. In order to guarantee biologically meaningful findings using transcriptomic experiments, it is important to consider various experimental factors in a systematic way through statistical power analysis. In this paper, we review and discuss the power analysis for three types of gene expression profiling technologies from a practical standpoint, including bulk RNA-seq, single-cell RNA-seq, and high-throughput spatial transcriptomics. Specifically, we describe the existing power analysis tools for each research objective for each of the bulk RNA-seq and scRNA-seq experiments, along with recommendations. On the other hand, since there are no power analysis tools for high-throughput spatial transcriptomics at this point, we instead investigate the factors that can influence power analysis.
Affiliation(s)
- Hyeongseon Jeon
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
- Juan Xie
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
- Yeseul Jeon
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics and Data Science, Yonsei University, Seoul 03722, Republic of Korea
  - Department of Applied Statistics, Yonsei University, Seoul 03722, Republic of Korea
- Kyeong Joo Jung
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Arkobrato Gupta
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
- Won Chang
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH 45221, USA
- Dongjun Chung
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
|
19
|
Yue L, Liu F, Hu J, Yang P, Wang Y, Dong J, Shu W, Huang X, Wang S. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput Struct Biotechnol J 2023; 21:940-955. [PMID: 38213887 PMCID: PMC10781722 DOI: 10.1016/j.csbj.2023.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 01/13/2023] [Accepted: 01/14/2023] [Indexed: 01/18/2023] Open
Abstract
Advances in transcriptomic technologies have deepened our understanding of the cellular gene expression programs of multicellular organisms and provided a theoretical basis for disease diagnosis and therapy. However, both bulk and single-cell RNA sequencing approaches lose the spatial context of cells within the tissue microenvironment, and the development of spatial transcriptomics has made overall bias-free access to both transcriptional information and spatial information possible. Here, we elaborate development of spatial transcriptomic technologies to help researchers select the best-suited technology for their goals and integrate the vast amounts of data to facilitate data accessibility and availability. Then, we marshal various computational approaches to analyze spatial transcriptomic data for various purposes and describe the spatial multimodal omics and its potential for application in tumor tissue. Finally, we provide a detailed discussion and outlook of the spatial transcriptomic technologies, data resources and analysis approaches to guide current and future research on spatial transcriptomics.
Affiliation(s)
- Liangchen Yue
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Feng Liu
  - College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
- Jiongsong Hu
  - University of South China, Hengyang, Hunan 421001, China
- Pin Yang
  - Anhui Medical University, Hefei 230022, Anhui, China
- Yuxiang Wang
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Junguo Dong
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Wenjie Shu
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Xingxu Huang
  - Zhejiang Provincial Key Laboratory of Pancreatic Disease, the First Affiliated Hospital, and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310029, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shengqi Wang
  - Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
|
20
|
Ospina O, Soupir A, Fridley BL. A Primer on Preprocessing, Visualization, Clustering, and Phenotyping of Barcode-Based Spatial Transcriptomics Data. Methods Mol Biol 2023; 2629:115-140. [PMID: 36929076 DOI: 10.1007/978-1-0716-2986-4_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Recent developments in spatially resolved transcriptomics (ST) have resulted in a large number of studies characterizing the architecture of tissues, the spatial distribution of cell types, and their interactions. Furthermore, ST promises to enable the discovery of more accurate drug targets while also providing a better understanding of the etiology and evolution of complex diseases. The analysis of ST brings similar challenges as seen in other gene expression assays such as scRNA-seq; however, there is the additional spatial information that warrants the development of suitable algorithms for the quality control, preprocessing, visualization, and other discovery-enabling approaches (e.g., clustering, cell phenotyping). In this chapter, we review some of the existing algorithms to perform these analytical tasks and highlight some of the unmet analytical challenges in the analysis of ST data. Given the diversity of available ST technologies, we focus this chapter on the analysis of barcode-based RNA quantitation techniques.
Affiliation(s)
- Oscar Ospina
  - Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, FL, USA
- Alex Soupir
  - Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, FL, USA
- Brooke L Fridley
  - Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA
|
21
|
Bernstein MN, Ni Z, Prasad A, Brown J, Mohanty C, Stewart R, Newton MA, Kendziorski C. SpatialCorr identifies gene sets with spatially varying correlation structure. Cell Rep Methods 2022; 2:100369. [PMID: 36590683 PMCID: PMC9795364 DOI: 10.1016/j.crmeth.2022.100369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Revised: 09/26/2022] [Accepted: 11/21/2022] [Indexed: 12/15/2022]
Abstract
Recent advances in spatially resolved transcriptomics technologies enable both the measurement of genome-wide gene expression profiles and their mapping to spatial locations within a tissue. A first step in spatial transcriptomics data analysis is identifying genes with expression that varies spatially, and robust statistical methods exist to address this challenge. While useful, these methods do not detect spatial changes in the coordinated expression within a group of genes. To this end, we present SpatialCorr, a method for identifying sets of genes with spatially varying correlation structure. Given a collection of gene sets pre-defined by a user, SpatialCorr tests for spatially induced differences in the correlation of each gene set within tissue regions, as well as between and among regions. An application to cutaneous squamous cell carcinoma demonstrates the power of the approach for revealing biological insights not identified using existing methods.
Affiliation(s)
- Zijian Ni
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Aman Prasad
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI 53715, USA
- Jared Brown
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Chitrasen Mohanty
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Ron Stewart
  - Morgridge Institute for Research, Madison, WI 53715, USA
- Michael A. Newton
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
|
22
|
Ma C, Chitra U, Zhang S, Raphael BJ. Belayer: Modeling discrete and continuous spatial variation in gene expression from spatially resolved transcriptomics. Cell Syst 2022; 13:786-797.e13. [PMID: 36265465 PMCID: PMC9814896 DOI: 10.1016/j.cels.2022.09.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/12/2022] [Revised: 07/13/2022] [Accepted: 09/06/2022] [Indexed: 01/26/2023]
Abstract
Spatially resolved transcriptomics (SRT) technologies measure gene expression at known locations in a tissue slice, enabling the identification of spatially varying genes or cell types. Current approaches for these tasks assume either that gene expression varies continuously across a tissue or that a tissue contains a small number of regions with distinct cellular composition. We propose a model for SRT data from layered tissues that includes both continuous and discrete spatial variation in expression and an algorithm, Belayer, to learn the parameters of this model. Belayer models gene expression as a piecewise linear function of the relative depth of a tissue layer with possible discontinuities at layer boundaries. We use conformal maps to model relative depth and derive a dynamic programming algorithm to infer layer boundaries and gene expression functions. Belayer accurately identifies tissue layers and biologically meaningful spatially varying genes in SRT data from the brain and skin.
Affiliation(s)
- Cong Ma
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Uthsav Chitra
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Shirley Zhang
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
|
23
|
Jiang X, Xiao G, Li Q. A Bayesian modified Ising model for identifying spatially variable genes from spatial transcriptomics data. Stat Med 2022; 41:4647-4665. [DOI: 10.1002/sim.9530] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Revised: 05/13/2022] [Accepted: 07/01/2022] [Indexed: 12/13/2022]
Affiliation(s)
- Xi Jiang
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA
  - Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Guanghua Xiao
  - Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, USA
|
24
|
Spatial components of molecular tissue biology. Nat Biotechnol 2022; 40:308-318. [PMID: 35132261 DOI: 10.1038/s41587-021-01182-1] [Citation(s) in RCA: 92] [Impact Index Per Article: 46.0] [Received: 12/17/2020] [Accepted: 12/03/2021] [Indexed: 02/06/2023]
Abstract
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.
|
25
|
Abstract
Spatial documentation is exponentially increasing given the availability of Big Data in the Internet of Things, enabled by device miniaturization and data storage capacity. Bayesian spatial statistics is a useful statistical tool to determine the dependence structure and hidden patterns in space through prior knowledge and data likelihood. However, this class of modeling is not yet well explored when compared to adopting classification and regression in machine-learning models, in which the assumption of the spatiotemporal independence of the data is often made, that is an inexistent or very weak dependence. Thus, this systematic review aims to address the main models presented in the literature over the past 20 years, identifying the gaps and research opportunities. Elements such as random fields, spatial domains, prior specification, the covariance function, and numerical approximations are discussed. This work explores the two subclasses of spatial smoothing: global and local.
|