1
|
Bao X, Bai X, Liu X, Shi Q, Zhang C. Spatially informed graph transformers for spatially resolved transcriptomics. Commun Biol 2025; 8:574. [PMID: 40188303 PMCID: PMC11972348 DOI: 10.1038/s42003-025-08015-w] [Received: 09/12/2024] [Accepted: 03/28/2025] [Indexed: 04/07/2025] Open
Abstract
Spatially resolved transcriptomics (SRT) has emerged as a powerful technique for mapping gene expression landscapes within spatial contexts. However, significant challenges persist in effectively integrating gene expression with spatial information to elucidate the heterogeneity of biological tissues. Here, we present a Spatially informed Graph Transformers framework, SpaGT, which leverages both node and edge channels to model spatially aware graph representation for denoising gene expression and identifying spatial domains. Unlike conventional graph neural networks, which rely on static, localized convolutional aggregation, SpaGT employs a structure-reinforced self-attention mechanism that iteratively evolves topological structural information and transcriptional signal representation. By replacing graph convolution with global self-attention, SpaGT enables the integration of both global and spatially localized information, thereby improving the detection of fine-grained spatial domains. We demonstrate that SpaGT achieves superior performance in identifying spatial domains and denoising gene expression data across diverse platforms and species. Furthermore, SpaGT facilitates the discovery of spatially variable genes with significant prognostic potential in cancer tissues. These findings establish SpaGT as a powerful tool for unraveling the complexities of biological tissues.
Affiliation(s)
- Xinyu Bao
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Xiaosheng Bai
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Xiaoping Liu
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China.
| | - Qianqian Shi
- Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, Wuhan, China.
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China.
| | - Chuanchao Zhang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China.
| |
|
2
|
Yang J, Zheng Z, Jiao Y, Yu K, Bhatara S, Yang X, Natarajan S, Zhang J, Pan Q, Easton J, Yan KK, Peng J, Liu K, Yu J. Spotiphy enables single-cell spatial whole transcriptomics across an entire section. Nat Methods 2025; 22:724-736. [PMID: 40074951 PMCID: PMC11978521 DOI: 10.1038/s41592-025-02622-5] [Received: 03/13/2024] [Accepted: 01/29/2025] [Indexed: 03/14/2025]
Abstract
Spatial transcriptomics (ST) has advanced our understanding of tissue regionalization by enabling the visualization of gene expression within whole-tissue sections, but current approaches remain plagued by the challenge of achieving single-cell resolution without sacrificing whole-genome coverage. Here we present Spotiphy (spot imager with pseudo-single-cell-resolution histology), a computational toolkit that transforms sequencing-based ST data into single-cell-resolved whole-transcriptome images. Spotiphy delivers the most precise cellular proportions in extensive benchmarking evaluations. Spotiphy-derived inferred single-cell profiles reveal astrocyte and disease-associated microglia regional specifications in Alzheimer's disease and healthy mouse brains. Spotiphy identifies multiple spatial domains and alterations in tumor-tumor microenvironment interactions in human breast ST data. Spotiphy bridges the information gap and enables visualization of cell localization and transcriptomic profiles throughout entire sections, offering highly informative outputs and an innovative spatial analysis pipeline for exploring complex biological systems.
Affiliation(s)
- Jiyuan Yang
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Ziqian Zheng
- Department of Industrial & Systems Engineering, University of Wisconsin-Madison, Madison, WI, USA
| | - Yun Jiao
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Kaiwen Yu
- Center of Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Sheetal Bhatara
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Xu Yang
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Sivaraman Natarajan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Jiahui Zhang
- Department of Industrial & Systems Engineering, University of Wisconsin-Madison, Madison, WI, USA
| | - Qingfei Pan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - John Easton
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Koon-Kiu Yan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Junmin Peng
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
| | - Kaibo Liu
- Department of Industrial & Systems Engineering, University of Wisconsin-Madison, Madison, WI, USA.
| | - Jiyang Yu
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
| |
|
3
|
Liang X, Torkel M, Cao Y, Yang JYH. Multi-task benchmarking of spatially resolved gene expression simulation models. Genome Biol 2025; 26:57. [PMID: 40098171 PMCID: PMC11912772 DOI: 10.1186/s13059-025-03505-w] [Received: 05/15/2024] [Accepted: 02/12/2025] [Indexed: 03/19/2025] Open
Abstract
BACKGROUND: Computational methods for spatially resolved transcriptomics (SRT) are often developed and assessed using simulated data. The effectiveness of these evaluations relies on the ability of simulation methods to accurately reflect experimental data. However, a systematic evaluation framework for spatial simulators is currently lacking. RESULTS: Here, we present SpatialSimBench, a comprehensive evaluation framework that assesses 13 simulation methods using ten distinct SRT datasets. We introduce simAdaptor, a tool that extends single-cell simulators by incorporating spatial variables, enabling them to simulate spatial data. SimAdaptor ensures SpatialSimBench is backwards compatible, facilitating direct comparisons between spatially aware simulators and existing non-spatial single-cell simulators through this adaptation. Using SpatialSimBench, we demonstrate the feasibility of leveraging existing single-cell simulators for SRT data and highlight performance differences among methods. Additionally, we evaluate the simulation methods based on a total of 35 metrics across data property estimation, various downstream analyses, and scalability. In total, we generated 4550 results from 13 simulation methods, ten spatial datasets, and 35 metrics. CONCLUSIONS: Our findings reveal that model estimation can be influenced by distribution assumptions and dataset characteristics. In summary, our evaluation framework provides guidelines for selecting appropriate methods for specific scenarios and informs future method development.
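The total of 4550 results is consistent with a full factorial design, that is, one result per combination of method, dataset, and metric. The short check below reproduces that arithmetic; the labels are placeholders, since the abstract gives only the counts and not the names of the methods, datasets, or metrics.

```python
from itertools import product

# Placeholder labels (hypothetical): the abstract reports counts only.
methods = [f"method_{i}" for i in range(13)]
datasets = [f"dataset_{i}" for i in range(10)]
metrics = [f"metric_{i}" for i in range(35)]

# One result per (method, dataset, metric) combination.
n_results = sum(1 for _ in product(methods, datasets, metrics))
print(n_results)  # 4550
```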
Affiliation(s)
- Xiaoqi Liang
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
| | - Marni Torkel
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
| | - Yue Cao
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia.
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.
| | - Jean Yee Hwa Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, 2006, Australia.
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.
| |
|
4
|
Wang Y, Liu Z, Ma X. MuCST: restoring and integrating heterogeneous morphology images and spatial transcriptomics data with contrastive learning. Genome Med 2025; 17:21. [PMID: 40082941 PMCID: PMC11907906 DOI: 10.1186/s13073-025-01449-1] [Received: 11/17/2024] [Accepted: 03/07/2025] [Indexed: 03/16/2025] Open
Abstract
Spatially resolved transcriptomics (SRT) simultaneously measures spatial location, histology images, and transcriptional profiles of cells or regions in undissociated tissues. Integrative analysis of multi-modal SRT data holds immense potential for understanding biological mechanisms. Here, we present a flexible multi-modal contrastive learning method for the integration of SRT data (MuCST), which combines denoising, heterogeneity elimination, and compatible feature learning. MuCST accurately identifies spatial domains and is applicable to diverse datasets and platforms. Overall, MuCST provides an alternative for integrative analysis of multi-modal SRT data (https://github.com/xkmaxidian/MuCST).
Affiliation(s)
- Yu Wang
- School of Computer Science and Technology, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
- Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
| | - Zaiyi Liu
- Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, 106 Zhongshan Er Road, Guangzhou, 510080, Guangdong, China
- Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, 106 Zhongshan Er Road, Guangzhou, 510080, Guangdong, China
| | - Xiaoke Ma
- School of Computer Science and Technology, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China.
- Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China.
| |
|
5
|
Pang Y, Wang C, Zhang YZ, Wang Z, Imoto S, Lee TY. STForte: tissue context-specific encoding and consistency-aware spatial imputation for spatially resolved transcriptomics. Brief Bioinform 2025; 26:bbaf174. [PMID: 40254832 DOI: 10.1093/bib/bbaf174] [Received: 01/08/2025] [Revised: 03/06/2025] [Accepted: 03/17/2025] [Indexed: 04/22/2025] Open
Abstract
Encoding spatially resolved transcriptomics (SRT) data serves to identify the biological semantics of RNA expression within the tissue while preserving spatial characteristics. Depending on the analytical scenario, one may focus on different contextual structures of tissues. For instance, anatomical regions reveal consistent patterns by focusing on spatial homogeneity, while elucidating complex tumor micro-environments requires more expression heterogeneity. However, current spatial encoding methods lack consideration of the tissue context. Meanwhile, most developed SRT technologies are still limited in providing exact patterns of intact tissues due to limitations such as low resolution or missed measurements. Here, we propose STForte, a novel pairwise graph autoencoder-based approach with cross-reconstruction and adversarial distribution matching, to model the spatial homogeneity and expression heterogeneity of SRT data. STForte extracts interpretable latent encodings, enabling downstream analysis by accurately portraying various tissue contexts. Moreover, STForte allows spatial imputation using only spatial consistency to restore the biological patterns of unobserved locations or low-quality cells, thereby providing fine-grained views to enhance the SRT analysis. Extensive evaluations of datasets under different scenarios and SRT platforms demonstrate that STForte is a scalable and versatile tool for providing enhanced insights into spatial data analysis.
Affiliation(s)
- Yuxuan Pang
- Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1, Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
| | - Chunxuan Wang
- School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), 2001 Longxiang Road, Longgang, Shenzhen, 518172, China
| | - Yao-Zhong Zhang
- Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1, Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
| | - Zhuo Wang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), 2001 Longxiang Road, Longgang, Shenzhen, 518172, China
- School of Medicine, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), 2001 Longxiang Road, Longgang, Shenzhen, 518172, China
| | - Seiya Imoto
- Division of Health Medical Intelligence, Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1, Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan
| | - Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, No. 75 Bo-Ai Street, Hsinchu 300, Taiwan
| |
|
6
|
Wang R, Qian Y, Guo X, Song F, Xiong Z, Cai S, Bian X, Wong MH, Cao Q, Cheng L, Lu G, Leung KS. STModule: identifying tissue modules to uncover spatial components and characteristics of transcriptomic landscapes. Genome Med 2025; 17:18. [PMID: 40033360 DOI: 10.1186/s13073-025-01441-9] [Received: 07/29/2024] [Accepted: 02/17/2025] [Indexed: 03/05/2025] Open
Abstract
Here we present STModule, a Bayesian method developed to identify tissue modules from spatially resolved transcriptomics that reveal spatial components and essential characteristics of tissues. STModule uncovers diverse expression signals in transcriptomic landscapes such as cancer, intraepithelial neoplasia, immune infiltration, outcome-related molecular features and various cell types, which facilitate downstream analysis and provide insights into tumor microenvironments, disease mechanisms, treatment development, and histological organization of tissues. STModule captures a broader spectrum of biological signals compared to other methods and detects novel spatial components. The tissue modules characterized by gene sets demonstrate greater robustness and transferability across different biopsies. STModule is available at https://github.com/rwang-z/STModule.git.
Affiliation(s)
- Ran Wang
- CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China
- Center for Neuromusculoskeletal Restorative Medicine, Hong Kong Science Park, Shatin, New Territories, Hong Kong, 999077, China
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China
| | - Yan Qian
- Department of Gastrointestinal Surgery Center, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 519082, China
| | - Xiaojing Guo
- Health Data Science Center, Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Shenzhen, 518020, China
| | - Fangda Song
- School of Data Science, The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
| | - Zhiqiang Xiong
- CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China
| | - Shirong Cai
- Department of Gastrointestinal Surgery Center, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 519082, China
| | - Xiuwu Bian
- Jinfeng Laboratory, Chongqing, 401329, China
| | - Man Hon Wong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China
| | - Qin Cao
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China.
- Shenzhen Research Institute, the Chinese University of Hong Kong, Shenzhen, 518172, China.
| | - Lixin Cheng
- Health Data Science Center, Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Shenzhen, 518020, China.
| | - Gang Lu
- CUHK-SDU Joint Laboratory on Reproductive Genetics, School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China.
- Center for Neuromusculoskeletal Restorative Medicine, Hong Kong Science Park, Shatin, New Territories, Hong Kong, 999077, China.
- Jinfeng Laboratory, Chongqing, 401329, China.
- Shenzhen Research Institute, the Chinese University of Hong Kong, Shenzhen, 518172, China.
| | - Kwong Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, 999077, China.
- Jinfeng Laboratory, Chongqing, 401329, China.
- Department of Applied Data Science, Hong Kong Shue Yan University, North Point, Hong Kong Island, Hong Kong, 999077, China.
| |
|
7
|
Zhang H, Patrick MT, Zhao J, Zhai X, Liu J, Li Z, Gu Y, Welch J, Zhou X, Modlin RL, Tsoi LC, Gudjonsson JE. Techniques and analytic workflow for spatial transcriptomics and its application to allergy and inflammation. J Allergy Clin Immunol 2025; 155:678-687. [PMID: 39837466 PMCID: PMC11875981 DOI: 10.1016/j.jaci.2025.01.009] [Received: 08/02/2024] [Revised: 01/02/2025] [Accepted: 01/14/2025] [Indexed: 01/23/2025]
Abstract
Spatial profiling, through single-cell gene-level expression data paired with cell localization, offers unprecedented biologic insights within the intact spatial context of cells in healthy and diseased tissue, adding a novel dimension to data interpretation. This review summarizes recent developments in this field, its application to allergy and inflammation, and recent single-cell resolution platforms designed for spatial transcriptomics with a focus on data processing and analyses for efficient biologic interpretation of data. By preserving spatial context, these technologies provide critical insights into tissue architecture and cellular interactions that are unattainable with traditional transcriptomics methods, such as revealing localized inflammatory cell network in atopic dermatitis and T-cell interactions in the lung in chronic obstructive pulmonary disease. Spatial profiling offers opportunities for discovering novel biomarkers, defining compartmentalization of immune responses within tissues and individual diseases, and accelerating novel discoveries toward a greater understanding of fundamental disease mechanisms and, eventually, toward the development of future targeted therapies.
Affiliation(s)
- Haihan Zhang
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich
| | - Matthew T Patrick
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Mich
| | - Jingyu Zhao
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich
| | - Xintong Zhai
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich
| | - Jialin Liu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Mich
| | - Zheng Li
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich
| | - Yiqian Gu
- Department of Internal Medicine, Division of Dermatology, UCLA, Los Angeles, Calif
| | - Joshua Welch
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Mich; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Mich
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich
| | - Robert L Modlin
- Department of Internal Medicine, Division of Dermatology, UCLA, Los Angeles, Calif
| | - Lam C Tsoi
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich; Department of Dermatology, University of Michigan Medical School, Ann Arbor, Mich; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Mich.
| | - Johann E Gudjonsson
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Mich; Taubman Medical Research Institute, University of Michigan Medical School, Ann Arbor, Mich.
| |
|
8
|
Gingerich IK, Goods BA, Frost HR. Randomized Spatial PCA (RASP): a computationally efficient method for dimensionality reduction of high-resolution spatial transcriptomics data. RESEARCH SQUARE 2025:rs.3.rs-6050441. [PMID: 40034439 PMCID: PMC11875318 DOI: 10.21203/rs.3.rs-6050441/v1] [Indexed: 03/05/2025]
Abstract
Spatial transcriptomics (ST) provides critical insights into the spatial organization of gene expression, enabling researchers to unravel the intricate relationship between cellular environments and biological function. Identifying spatial domains within tissues is key to understanding tissue architecture and mechanisms underlying development and disease progression. Here, we present Randomized Spatial PCA (RASP), a novel spatially-aware dimensionality reduction method for ST data. RASP is designed to be orders-of-magnitude faster than existing techniques, scale to datasets with 100,000+ locations, support flexible integration of non-transcriptomic covariates, and reconstruct de-noised, spatially-smoothed gene expression values. It employs a randomized two-stage PCA framework with sparse matrix operations and configurable spatial smoothing. RASP was compared to BASS, GraphST, SEDR, SpatialPCA, and STAGATE using diverse ST datasets (10x Visium, Stereo-Seq, MERFISH, 10x Xenium) on human and mouse tissues. RASP demonstrates comparable or superior tissue domain detection with substantial improvements in computational speed, enhancing exploration of high-resolution subcellular datasets.
Affiliation(s)
- Ian K. Gingerich
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA
| | | | - H. Robert Frost
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
| |
|
9
|
Su H, Wu Y, Chen B, Cui Y. STANCE: a unified statistical model to detect cell-type-specific spatially variable genes in spatial transcriptomics. Nat Commun 2025; 16:1793. [PMID: 39979358 PMCID: PMC11842841 DOI: 10.1038/s41467-025-57117-w] [Received: 10/06/2024] [Accepted: 02/10/2025] [Indexed: 02/22/2025] Open
Abstract
One of the major challenges in spatial transcriptomics is to detect spatially variable genes (SVGs), whose expression patterns are non-random across tissue locations. Many SVGs correlate with cell type compositions, introducing the concept of cell type-specific SVGs (ctSVGs). Existing ctSVG detection methods treat cell type-specific spatial effects as fixed effects, leading to results that depend on the spatial rotation of the tissue. Moreover, SVGs may exhibit random spatial patterns within cell types, meaning an SVG is not always a ctSVG, and vice versa, further complicating detection. We propose STANCE, a unified statistical model for detecting both SVGs and ctSVGs under a linear mixed-effect model framework that integrates gene expression, spatial location, and cell type composition information. STANCE ensures tissue rotation-invariant results, with a two-stage approach: initial SVG/ctSVG detection followed by ctSVG-specific testing. We demonstrate its performance through extensive simulations and analyses of public datasets. Downstream analyses reveal STANCE's potential in spatial transcriptomics analysis.
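One way to see why a model can be rotation-invariant is that any spatial covariance built only from pairwise distances is unchanged when the tissue coordinates are rotated, whereas raw x and y coordinates used as covariates are not. The sketch below illustrates this with a Gaussian kernel; the kernel choice, the bandwidth, and the toy coordinates are assumptions made for illustration, since the abstract does not specify which kernel STANCE uses.

```python
import math

def rotate(points, theta):
    """Rotate 2D points counter-clockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def gaussian_kernel(points, bandwidth=1.0):
    """Kernel matrix that depends only on pairwise squared distances."""
    return [
        [
            math.exp(-((xi - xj) ** 2 + (yi - yj) ** 2) / (2 * bandwidth ** 2))
            for (xj, yj) in points
        ]
        for (xi, yi) in points
    ]

# Toy spot coordinates (hypothetical).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
rotated = rotate(coords, math.pi / 3)

k_orig = gaussian_kernel(coords)
k_rot = gaussian_kernel(rotated)

# Largest entrywise difference between the two kernel matrices.
max_diff = max(
    abs(a - b) for row_a, row_b in zip(k_orig, k_rot) for a, b in zip(row_a, row_b)
)
print(max_diff < 1e-12)  # the kernel is unchanged up to floating-point error

# The raw coordinates themselves do change under rotation.
coords_changed = any(
    abs(x0 - x1) > 1e-6 or abs(y0 - y1) > 1e-6
    for (x0, y0), (x1, y1) in zip(coords, rotated)
)
print(coords_changed)
```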
Affiliation(s)
- Haohao Su
- Department of Statistics and Probability, Michigan State University, East Lansing, 48824, MI, USA
| | - Yuesong Wu
- Department of Statistics and Probability, Michigan State University, East Lansing, 48824, MI, USA
| | - Bin Chen
- Department of Pharmacology and Toxicology, Michigan State University, East Lansing, 48824, MI, USA
- Department of Computer Science and Engineering, Michigan State University, East Lansing, 48824, MI, USA
- Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, 49503, MI, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, 48824, MI, USA.
| |
|
10
|
Reynoso S, Schiebout C, Krishna R, Zhang F. STEAM: Spatial Transcriptomics Evaluation Algorithm and Metric for clustering performance. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.17.636505. [PMID: 40027655 PMCID: PMC11870515 DOI: 10.1101/2025.02.17.636505] [Indexed: 03/05/2025]
Abstract
Motivation: Spatial transcriptomic technologies allow researchers to explore the diversity and specificity of gene expression within their original tissue structure. Accurately identifying regions that are spatially coherent in both gene expression and physical tissue structures is an emerging topic, but it is challenging because the lack of ground truth labels complicates validation of clustering consistency and reproducibility. This highlights a need for a computational evaluation framework to rigorously and unbiasedly assess clustering performance. Results: To address this gap, we propose STEAM (Spatial Transcriptomics Evaluation Algorithm and Metric), a user-friendly computational pipeline designed to evaluate the consistency and reliability of clustering results by leveraging machine learning classification and prediction methods, with the goal of maintaining the spatial proximity and gene expression patterns within clusters. We benchmarked STEAM on various public datasets, spanning multi-cell to single-cell resolution, as well as spatial transcriptomics and proteomics. The results highlighted its robustness and generalizability through comprehensive statistical evaluation metrics, such as Kappa score, F1 score, accuracy, and adjusted Rand index. Notably, STEAM supports multi-sample training, enabling cross-replicate clustering consistency assessment. Moreover, STEAM provides practical guidance by comparing clustering results across multiple approaches; here, we evaluated four different methods, including spatial-aware and spatial-ignorant approaches. In summary, we believe that STEAM provides researchers a promising tool for evaluating clustering robustness and benchmarking clustering performance for spatial omics data, offering valuable insights to drive reproducible discoveries in spatial biology. Availability and implementation: Source code and the R software tool STEAM are available from https://github.com/fanzhanglab/STEAM.
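Two of the metrics named in the abstract, accuracy and the Kappa score, can be computed by comparing held-out cluster labels with the labels a classifier predicts for the same spots. The sketch below is a plain-Python illustration of the standard definitions, not the STEAM implementation (which is an R tool); the function names and the toy label vectors are hypothetical.

```python
from collections import Counter

def accuracy(true_labels, pred_labels):
    """Fraction of positions where the predicted label matches the true label."""
    matches = sum(t == p for t, p in zip(true_labels, pred_labels))
    return matches / len(true_labels)

def cohen_kappa(true_labels, pred_labels):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(true_labels)
    p_observed = accuracy(true_labels, pred_labels)
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)
    # Chance agreement from the marginal label frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both vectors contain one identical label; agreement is trivial
    return (p_observed - p_expected) / (1.0 - p_expected)

# Toy example: cluster assignments for four spots and a classifier's predictions.
clusters = ["A", "A", "B", "B"]
predicted = ["A", "A", "B", "A"]

print(accuracy(clusters, predicted))     # 0.75
print(cohen_kappa(clusters, predicted))  # 0.5
```

Kappa is lower than accuracy here because half of the observed agreement would be expected by chance given the label frequencies.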
Affiliation(s)
- Samantha Reynoso
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Computational Bioscience PhD Program, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Courtney Schiebout
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Revanth Krishna
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Fan Zhang
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Medicine Rheumatology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| |
|
11
|
Zhang H, Zhang Y, Ting KM, Zhang J, Zhao Q. Kernel-bounded clustering for spatial transcriptomics enables scalable discovery of complex spatial domains. Genome Res 2025; 35:355-367. [PMID: 39909714 PMCID: PMC11874963 DOI: 10.1101/gr.278983.124] [Received: 01/17/2024] [Accepted: 12/19/2024] [Indexed: 02/07/2025]
Abstract
Spatial transcriptomics are a collection of technologies that have enabled characterization of gene expression profiles and spatial information in tissue samples. Existing methods for clustering spatial transcriptomics data have primarily focused on data transformation techniques to represent the data suitably for subsequent clustering analysis, often using an existing clustering algorithm. These methods have limitations in handling complex data characteristics with varying densities, sizes, and shapes (in the transformed space on which clustering is performed), and they have high computational complexity, resulting in unsatisfactory clustering outcomes and slow execution time even with GPUs. Rather than focusing on data transformation techniques, we propose a new clustering algorithm called kernel-bounded clustering (KBC). It has two unique features: (1) It is the first clustering algorithm that employs a distributional kernel to recruit members of a cluster, enabling clusters of varying densities, sizes, and shapes to be discovered, and (2) it is a linear-time clustering algorithm that significantly enhances the speed of clustering analysis, enabling researchers to effectively handle large-scale spatial transcriptomics data sets. We show that (1) KBC works well with a simple data transformation technique called the Weisfeiler-Lehman scheme, and (2) a combination of KBC and the Weisfeiler-Lehman scheme produces good clustering outcomes, and it is faster and easier-to-use than many methods that employ existing clustering algorithms and data transformation techniques.
Affiliation(s)
- Hang Zhang
  - National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  - School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
- Yi Zhang
  - National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  - School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
- Kai Ming Ting
  - National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  - School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
- Jie Zhang
  - National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  - School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
- Qiuran Zhao
  - National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  - School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
|
12
|
Reshef Y, Sood L, Curtis M, Rumker L, Stein DJ, Palshikar MG, Nayar S, Filer A, Jonsson AH, Korsunsky I, Raychaudhuri S. Powerful and accurate case-control analysis of spatial molecular data with deep learning-defined tissue microniches. bioRxiv 2025:2025.02.07.637149. [PMID: 39975274 PMCID: PMC11839118 DOI: 10.1101/2025.02.07.637149] [Indexed: 02/21/2025]
Abstract
As spatial molecular data grow in scope and resolution, there is a pressing need to identify key spatial structures associated with disease. Current approaches often rely on hand-crafted features such as local abundances of manually annotated, discrete cell types, which may overlook important signals. Here we introduce variational inference-based microniche analysis (VIMA), a method that combines deep learning with principled statistics to discover associated spatial features with greater flexibility and precision. VIMA uses a variational autoencoder to extract numerical "fingerprints" from small tissue patches that capture their biological content. It uses these fingerprints to define a large number of "microniches" - small, potentially overlapping groups of tissue patches with highly similar biology that span multiple samples. It then uses rigorous statistics to identify microniches whose abundance correlates with case-control status. We show in simulations that VIMA is well calibrated and more powerful and accurate than other approaches. We then apply VIMA to a 140-gene spatial transcriptomics dataset in Alzheimer's dementia, a 54-marker CO-Detection by indEXing (CODEX) dataset in ulcerative colitis (UC), and a 7-marker immunohistochemistry dataset in rheumatoid arthritis (RA), in each case recapitulating known biology and identifying novel spatial features of disease.
Affiliation(s)
- Yakir Reshef
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Lakshay Sood
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Michelle Curtis
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Laurie Rumker
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Daniel J. Stein
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Mukta G. Palshikar
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Saba Nayar
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and Department of Inflammation and Ageing, College of Medicine & Health, University of Birmingham, Birmingham, UK
  - Birmingham Tissue Analytics, College of Medicine and Health, University of Birmingham, Birmingham, UK
- Andrew Filer
  - NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust and Department of Inflammation and Ageing, College of Medicine & Health, University of Birmingham, Birmingham, UK
- Anna Helena Jonsson
  - University of Colorado Anschutz Medical Campus, Division of Rheumatology, Aurora, CO, USA
- Ilya Korsunsky
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Soumya Raychaudhuri
  - Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
|
13
|
Liu Y, Li Z, Chen X, Cui X, Gao Z, Jiang R. INSTINCT: Multi-sample integration of spatial chromatin accessibility sequencing data via stochastic domain translation. Nat Commun 2025; 16:1247. [PMID: 39893190 PMCID: PMC11787322 DOI: 10.1038/s41467-025-56535-0] [Received: 05/31/2024] [Accepted: 01/13/2025] [Indexed: 02/04/2025] Open
Abstract
Recent advances in spatial epigenomic techniques have given rise to spatial assay for transposase-accessible chromatin using sequencing (spATAC-seq) data, enabling the characterization of epigenomic heterogeneity and spatial information simultaneously. Integrative analysis of multiple spATAC-seq samples, for which no method has been developed, allows for effective identification and elimination of unwanted non-biological factors within the data, enabling comprehensive exploration of tissue structures and providing a holistic epigenomic landscape, thereby facilitating the discovery of biological implications and the study of regulatory processes. In this article, we present INSTINCT, a method for multi-sample INtegration of Spatial chromaTIN accessibility sequencing data via stochastiC domain Translation. INSTINCT can efficiently handle the high dimensionality of spATAC-seq data and eliminate the complex noise and batch effects of samples through a stochastic domain translation procedure. We demonstrate the superiority and robustness of INSTINCT in integrating spATAC-seq data across multiple simulated scenarios and real datasets. Additionally, we highlight the advantages of INSTINCT in spatial domain identification, visualization, spot-type annotation, and various downstream analyses, including motif enrichment analysis, expression enrichment analysis, and partitioned heritability analysis.
Affiliation(s)
- Yuyao Liu
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zhen Li
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xiaoyang Chen
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xuejian Cui
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zijing Gao
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Rui Jiang
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
|
14
|
Park J, Cook S, Lee D, Choi J, Yoo S, Bae S, Im HJ, Lee D, Choi H. Generation of super-resolution images from barcode-based spatial transcriptomics by deep image prior. Cell Rep Methods 2025; 5:100937. [PMID: 39729996 PMCID: PMC11840945 DOI: 10.1016/j.crmeth.2024.100937] [Received: 01/10/2024] [Revised: 06/17/2024] [Accepted: 12/04/2024] [Indexed: 12/29/2024]
Abstract
Spatially resolved transcriptomics (ST) has revolutionized the field of biology by providing a powerful tool for analyzing gene expression in situ. However, current ST methods, particularly barcode-based methods, have limitations in reconstructing high-resolution images from barcodes sparsely distributed in slides. Here, we present SuperST, an algorithm that enables the reconstruction of dense matrices (higher-resolution and non-zero-inflated matrices) from low-resolution ST libraries. SuperST is based on deep image prior, which reconstructs spatial gene expression patterns as image matrices. Compared with previous methods, SuperST generated output images that more closely resembled immunofluorescence images for given gene expression maps. Furthermore, we demonstrated how one can combine images created by SuperST with computer vision algorithms. In this context, we proposed a method for extracting features from the images, which can aid in spatial clustering of genes. By providing a dense matrix for each gene in situ, SuperST can successfully address the resolution and zero-inflation issue.
Affiliation(s)
- Jeongbin Park
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Seungho Cook
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Dongjoo Lee
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Jinyeong Choi
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Seongjin Yoo
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Sungwoo Bae
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Hyung-Jun Im
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
  - Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 08826, Republic of Korea
  - Department of Applied Bioengineering, Graduate School of Convergence Science and Technology, Seoul National University, Seoul 08826, Republic of Korea
  - Cancer Research Institute, Seoul National University, 03080 Seoul, Republic of Korea
  - Research Institute for Convergence Science, Seoul National University, 08826 Seoul, Republic of Korea
- Daeseung Lee
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
- Hongyoon Choi
  - Portrai, Inc., Dongsullagil, 78-18 Jongrogu, Seoul, Republic of Korea
  - Department of Nuclear Medicine, Seoul National University Hospital, 03080 Seoul, Republic of Korea
  - Department of Nuclear Medicine, Seoul National University College of Medicine, 03080 Seoul, Republic of Korea
|
15
|
Shang L, Wu P, Zhou X. Statistical identification of cell type-specific spatially variable genes in spatial transcriptomics. Nat Commun 2025; 16:1059. [PMID: 39865128 PMCID: PMC11770176 DOI: 10.1038/s41467-025-56280-4] [Received: 04/04/2024] [Accepted: 01/06/2025] [Indexed: 01/28/2025] Open
Abstract
An essential task in spatial transcriptomics is identifying spatially variable genes (SVGs). Here, we present Celina, a statistical method for systematically detecting cell type-specific SVGs (ct-SVGs), a subset of SVGs exhibiting distinct spatial expression patterns within specific cell types. Celina utilizes a spatially varying coefficient model to accurately capture each gene's spatial expression pattern in relation to the distribution of cell types across tissue locations, ensuring effective type I error control and high power. Celina proves powerful compared to existing methods in single-cell resolution spatial transcriptomics and stands as the only effective solution for spot-resolution spatial transcriptomics. Applied to five real datasets, Celina uncovers ct-SVGs associated with tumor progression and patient survival in lung cancer, identifies metagenes with unique spatial patterns linked to cell proliferation and immune response in kidney cancer, and detects genes preferentially expressed near amyloid-β plaques in an Alzheimer's model.
Affiliation(s)
- Lulu Shang
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Peijun Wu
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
|
16
|
Zhou Y, Tang C, Xiao X, Zhan X, Wang T, Xiao G, Xu L. Dimensionality reduction for visualizing spatially resolved profiling data using SpaSNE. Gigascience 2025; 14:giaf002. [PMID: 39960663 PMCID: PMC11831803 DOI: 10.1093/gigascience/giaf002] [Received: 05/03/2024] [Revised: 11/05/2024] [Accepted: 01/06/2025] [Indexed: 02/20/2025] Open
Abstract
BACKGROUND: Spatially resolved profiling technologies to quantify transcriptomes, epigenomes, and proteomes have been emerging as groundbreaking methods for comprehensive molecular characterizations. Dimensionality reduction and visualization is an essential step to analyze and interpret spatially resolved profiling data. However, state-of-the-art dimensionality reduction methods for single-cell sequencing data, such as the t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), were not tailored for spatially resolved profiling data.
RESULTS: Here we developed a spatially resolved t-SNE (SpaSNE) method to integrate both spatial and molecular information. We applied it to a variety of public spatially resolved profiling datasets that were generated from 3 experimental platforms and consisted of cells from different diseases, tissues, and cell types. To compare the performances of SpaSNE, t-SNE, and UMAP, we applied them to 4 spatially resolved profiling datasets obtained from 3 distinct experimental platforms (Visium, STARmap, and MERFISH) on both diseased and normal tissues. Comparisons between SpaSNE and these state-of-the-art approaches reveal that SpaSNE achieves more accurate and meaningful visualization that better elucidates the underlying spatial and molecular data structures.
CONCLUSIONS: This work demonstrates the broad application of SpaSNE for reliable and robust interpretation of cell types based on both molecular and spatial information, which can set the foundation for many subsequent analysis steps, such as differential gene expression and trajectory or pseudotime analysis on the spatially resolved profiling data.
Affiliation(s)
- Yuansheng Zhou
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Chen Tang
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Xue Xiao
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Xiaowei Zhan
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Tao Wang
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Lin Xu
  - Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Department of Pediatrics, Division of Hematology/Oncology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
|
17
|
Zhou L, Peng X, Chen M, He X, Tian G, Yang J, Peng L. Unveiling patterns in spatial transcriptomics data: a novel approach utilizing graph attention autoencoder and multiscale deep subspace clustering network. Gigascience 2025; 14:giae103. [PMID: 39804726 PMCID: PMC11727722 DOI: 10.1093/gigascience/giae103] [Received: 02/07/2024] [Revised: 07/06/2024] [Accepted: 11/21/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND: The accurate deciphering of spatial domains, along with the identification of differentially expressed genes and the inference of cellular trajectory based on spatial transcriptomic (ST) data, holds significant potential for enhancing our understanding of tissue organization and biological functions. However, most spatial clustering methods can neither decipher complex structures in ST data nor entirely employ features embedded in different layers.
RESULTS: This article introduces STMSGAL, a novel framework for analyzing ST data by incorporating graph attention autoencoder and multiscale deep subspace clustering. First, STMSGAL constructs ctaSNN, a cell type-aware shared nearest neighbor graph, using Louvain clustering exclusively based on gene expression profiles. Subsequently, it integrates expression profiles and ctaSNN to generate spot latent representations using a graph attention autoencoder and multiscale deep subspace clustering. Lastly, STMSGAL implements spatial clustering, differential expression analysis, and trajectory inference, providing comprehensive capabilities for thorough data exploration and interpretation. STMSGAL was evaluated against 7 methods, including SCANPY, SEDR, CCST, DeepST, GraphST, STAGATE, and SiGra, using four 10x Genomics Visium datasets, 1 mouse visual cortex STARmap dataset, and 2 Stereo-seq mouse embryo datasets. The comparison showcased STMSGAL's remarkable performance across Davies-Bouldin, Calinski-Harabasz, S_Dbw, and ARI values. STMSGAL significantly enhanced the identification of layer structures across ST data with different spatial resolutions and accurately delineated spatial domains in 2 breast cancer tissues, adult mouse brain (FFPE), and mouse embryos.
CONCLUSIONS: STMSGAL can serve as an essential tool for bridging the analysis of cellular spatial organization and disease pathology, offering valuable insights for researchers in the field.
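Of the evaluation metrics named in this abstract, the adjusted Rand index (ARI) has a compact closed form based on pair counts in the contingency table of two labelings. The sketch below computes it with the Python standard library only. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions, not code or data from STMSGAL, and the handling of degenerate inputs (returning 1.0) is a convention chosen here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same items (pair-counting form)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two items: treat the labelings as trivially identical.
        return 1.0
    # Pair counts within contingency cells, true clusters, and predicted clusters.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case, e.g. both labelings place every item in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (hypothetical labels): same partition under different label names.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
partial_match = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])
```

ARI is invariant to how clusters are named, which is why it is commonly used to compare predicted spatial domains with manual annotations.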
Affiliation(s)
- Liqian Zhou
  - School of Computer Science, Hunan University of Technology, Zhuzhou 412007, Hunan, China
- Xinhuai Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou 412007, Hunan, China
- Min Chen
  - School of Computer Science, Hunan Institute of Technology, Hengyang 421002, Hunan, China
- Xianzhi He
  - School of Computer Science, Hunan University of Technology, Zhuzhou 412007, Hunan, China
- Geng Tian
  - Geneis (Beijing) Co. Ltd., Beijing 100102, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou 412007, Hunan, China
  - College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou 412007, Hunan, China
|
18
|
Sun X, Zhang W, Li W, Yu N, Zhang D, Zou Q, Dong Q, Zhang X, Liu Z, Yuan Z, Gao R. SpaGRA: Graph augmentation facilitates domain identification for spatially resolved transcriptomics. J Genet Genomics 2025; 52:93-104. [PMID: 39362628 DOI: 10.1016/j.jgg.2024.09.015] [Received: 06/23/2024] [Revised: 09/07/2024] [Accepted: 09/22/2024] [Indexed: 10/05/2024]
Abstract
Recent advances in spatially resolved transcriptomics (SRT) have provided new opportunities for characterizing spatial structures of various tissues. Graph-based geometric deep learning has gained widespread adoption for spatial domain identification tasks. Currently, most methods define adjacency relation between cells or spots by their spatial distance in SRT data, which overlooks key biological interactions like gene expression similarities, and leads to inaccuracies in spatial domain identification. To tackle this challenge, we propose a novel method, SpaGRA (https://github.com/sunxue-yy/SpaGRA), for automatic multi-relationship construction based on graph augmentation. SpaGRA uses spatial distance as prior knowledge and dynamically adjusts edge weights with multi-head graph attention networks (GATs). This helps SpaGRA to uncover diverse node relationships and enhance message passing in geometric contrastive learning. Additionally, SpaGRA uses these multi-view relationships to construct negative samples, addressing sampling bias posed by random selection. Experimental results show that SpaGRA presents superior domain identification performance on multiple datasets generated from different protocols. Using SpaGRA, we analyze the functional regions in the mouse hypothalamus, identify key genes related to heart development in mouse embryos, and observe cancer-associated fibroblasts enveloping cancer cells in the latest Visium HD data. Overall, SpaGRA can effectively characterize spatial structures across diverse SRT datasets.
Affiliation(s)
- Xue Sun
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Wei Zhang
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Wenrui Li
  - MOE Key Lab of Bioinformatics and Bioinformatics Division of BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Na Yu
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Daoliang Zhang
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Qi Zou
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Qiongye Dong
  - Institute of Precision Medicine, Peking University Shenzhen Hospital, Shenzhen, Guangdong 518036, China
- Xianglin Zhang
  - Department of Clinical Laboratory, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
- Zhiping Liu
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Zhiyuan Yuan
  - Institute of Science and Technology for Brain-Inspired Intelligence, Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai 200433, China
- Rui Gao
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
|
19
|
Chen X, Li K, Wu X, Li Z, Jiang Q, Cui X, Gao Z, Wu Y, Jiang R. Descart: a method for detecting spatial chromatin accessibility patterns with inter-cellular correlations. Genome Biol 2024; 25:322. [PMID: 39736655 PMCID: PMC11686967 DOI: 10.1186/s13059-024-03458-6] [Received: 03/25/2024] [Accepted: 12/09/2024] [Indexed: 01/01/2025] Open
Abstract
Spatial epigenomic technologies enable simultaneous capture of spatial location and chromatin accessibility of cells within tissue slices. Identifying peaks that display spatial variation and cellular heterogeneity is the key analytic task for characterizing the spatial chromatin accessibility landscape of complex tissues. Here, we propose an efficient and iterative model, Descart, for spatially variable peaks identification based on the graph of inter-cellular correlations. Through the comprehensive benchmarking, we demonstrate the superiority of Descart in revealing cellular heterogeneity and capturing tissue structure. Utilizing the graph of inter-cellular correlations, Descart shows its potential to denoise data, identify peak modules, and detect gene-peak interactions.
Affiliation(s)
- Xiaoyang Chen
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Keyi Li
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xiaoqing Wu
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zhen Li
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Qun Jiang
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xuejian Cui
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zijing Gao
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Yanhong Wu
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Rui Jiang
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
|
20
|
Schaub DP, Yousefi B, Kaiser N, Khatri R, Puelles VG, Krebs CF, Panzer U, Bonn S. PCA-based spatial domain identification with state-of-the-art performance. Bioinformatics 2024; 41:btaf005. [PMID: 39775801 PMCID: PMC11761416 DOI: 10.1093/bioinformatics/btaf005] [Received: 08/26/2024] [Revised: 11/25/2024] [Accepted: 01/06/2025] [Indexed: 01/11/2025] Open
Abstract
MOTIVATION The identification of biologically meaningful domains is a central step in the analysis of spatial transcriptomic data. RESULTS Following Occam's razor, we show that a simple PCA-based algorithm for unsupervised spatial domain identification rivals the performance of ten competing state-of-the-art methods across six single-cell spatial transcriptomic datasets. Our reductionist approach, NichePCA, provides researchers with intuitive domain interpretation and excels in execution speed, robustness, and scalability. AVAILABILITY AND IMPLEMENTATION The code is available at https://github.com/imsb-uke/nichepca.
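The general idea of PCA on spatially aggregated expression can be sketched as follows. This is a minimal illustration only, not the NichePCA implementation: the function name `neighborhood_pca`, the k-nearest-neighbour mean aggregation, and the default parameters are assumptions made for this example, and the real package at the URL above may differ in every step. The resulting embedding would then be passed to a clustering method of choice to obtain domains.

```python
import numpy as np


def neighborhood_pca(expr, coords, k=10, n_components=5):
    """Hypothetical sketch: average expression over spatial neighbours, then PCA.

    expr   : (n_cells, n_genes) normalised expression matrix
    coords : (n_cells, n_dims) spatial coordinates
    """
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Dense pairwise squared distances; fine for a toy example,
    # a spatial index would be needed for large datasets.
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)

    # The k + 1 closest points per cell (normally the cell itself plus k neighbours).
    nbr_idx = np.argsort(dist2, axis=1)[:, : k + 1]

    # Mean expression over each cell's neighbourhood.
    aggregated = expr[nbr_idx].mean(axis=1)

    # PCA via SVD of the column-centred aggregated matrix.
    centered = aggregated - aggregated.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

Averaging over neighbours before PCA is what injects spatial context here; setting `k=0` reduces the sketch to ordinary PCA on per-cell expression.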
Affiliation(s)
- Darius P Schaub
- Institute of Medical Systems Bioinformatics, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- III Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Behnam Yousefi
- Institute of Medical Systems Bioinformatics, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- German Center for Child and Adolescent Health (DZKJ), Partner Site Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Nico Kaiser
- Institute of Medical Systems Bioinformatics, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- III Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Robin Khatri
- Institute of Medical Systems Bioinformatics, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Victor G Puelles
- III Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Kidney Health (HCKH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Department of Clinical Medicine, Aarhus University, Aarhus 8200, Denmark
- Department of Pathology, Aarhus University Hospital, Aarhus 8200, Denmark
| | - Christian F Krebs
- III Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Kidney Health (HCKH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Translational Immunology (HCTI), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Ulf Panzer
- III Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Kidney Health (HCKH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Translational Immunology (HCTI), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
| | - Stefan Bonn
- Institute of Medical Systems Bioinformatics, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- German Center for Child and Adolescent Health (DZKJ), Partner Site Hamburg, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Kidney Health (HCKH), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Hamburg Center for Translational Immunology (HCTI), University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
|
21
|
Sun Y, Kong L, Huang J, Deng H, Bian X, Li X, Cui F, Dou L, Cao C, Zou Q, Zhang Z. A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data. Brief Funct Genomics 2024; 23:733-744. [PMID: 38860675 DOI: 10.1093/bfgp/elae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/29/2024] [Accepted: 05/27/2024] [Indexed: 06/12/2024] Open
Abstract
In recent years, the application of single-cell transcriptomics and spatial transcriptomics analysis techniques has become increasingly widespread. Whether dealing with single-cell transcriptomic or spatial transcriptomic data, dimensionality reduction and clustering are indispensable. Both single-cell and spatial transcriptomic data are often high-dimensional, making the analysis and visualization of such data challenging. Through dimensionality reduction, it becomes possible to visualize the data in a lower-dimensional space, allowing for the observation of relationships and differences between cell subpopulations. Clustering enables the grouping of similar cells into the same cluster, aiding in the identification of distinct cell subpopulations and revealing cellular diversity, providing guidance for downstream analyses. In this review, we systematically summarized the most widely recognized algorithms employed for the dimensionality reduction and clustering analysis of single-cell transcriptomic and spatial transcriptomic data. This endeavor provides valuable insights and ideas that can contribute to the development of novel tools in this rapidly evolving field.
Affiliation(s)
- Yidi Sun
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Lingling Kong
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Jiayi Huang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Hongyan Deng
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Xinling Bian
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Xingfeng Li
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Feifei Cui
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Lijun Dou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, United States
| | - Chen Cao
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 210029, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zilong Zhang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
|
22
|
Agrawal A, Thomann S, Basu S, Grün D. NiCo identifies extrinsic drivers of cell state modulation by niche covariation analysis. Nat Commun 2024; 15:10628. [PMID: 39639035 PMCID: PMC11621405 DOI: 10.1038/s41467-024-54973-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024] Open
Abstract
Cell states are modulated by intrinsic driving forces such as gene expression noise and extrinsic signals from the tissue microenvironment. The distinction between intrinsic and extrinsic cell state determinants is essential for understanding the regulation of cell fate in tissues during development, homeostasis and disease. The rapidly growing availability of single-cell resolution spatial transcriptomics makes it possible to meet this challenge. However, available computational methods to infer topological tissue domains, spatially variable genes, or ligand-receptor interactions are limited in their capacity to capture cell state changes driven by crosstalk between individual cell types within the same niche. We present NiCo, a computational framework for integrating single-cell resolution spatial transcriptomics with matched single-cell RNA-sequencing reference data to infer the influence of the spatial niche on the cell state. By applying NiCo to mouse embryogenesis, adult small intestine and liver data, we demonstrate the ability to predict novel niche interactions that govern cell state variation underlying tissue development and homeostasis. In particular, NiCo predicts a feedback mechanism between Kupffer cells and neighboring stellate cells dampening stellate cell activation in the normal liver. NiCo provides a powerful tool to elucidate tissue architecture and to identify drivers of cellular states in local niches.
Affiliation(s)
- Ankit Agrawal
- Würzburg Institute of Systems Immunology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany
| | - Stefan Thomann
- Würzburg Institute of Systems Immunology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany
| | - Sukanya Basu
- Würzburg Institute of Systems Immunology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany
| | - Dominic Grün
- Würzburg Institute of Systems Immunology, Julius-Maximilians-Universität Würzburg, Würzburg, Germany.
- CAIDAS - Center for Artificial Intelligence and Data Science, Würzburg, Germany.
|
23
|
Das Adhikari S, Yang J, Wang J, Cui Y. Recent advances in spatially variable gene detection in spatial transcriptomics. Comput Struct Biotechnol J 2024; 23:883-891. [PMID: 38370977 PMCID: PMC10869304 DOI: 10.1016/j.csbj.2024.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 01/22/2024] [Accepted: 01/22/2024] [Indexed: 02/20/2024] Open
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
| | - Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
|
24
|
Wu P, Zhou X. Statistical and computational methods for enabling the clinical and translational application of spatial transcriptomics. Clin Transl Med 2024; 14:e70119. [PMID: 39644148 PMCID: PMC11624480 DOI: 10.1002/ctm2.70119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Accepted: 11/19/2024] [Indexed: 12/09/2024] Open
Affiliation(s)
- Peijun Wu
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA
|
25
|
Zhang C, Wang L, Shi Q. Computational modeling for deciphering tissue microenvironment heterogeneity from spatially resolved transcriptomics. Comput Struct Biotechnol J 2024; 23:2109-2115. [PMID: 38800634 PMCID: PMC11126885 DOI: 10.1016/j.csbj.2024.05.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/29/2024] Open
Abstract
Spatial transcriptomics techniques, while measuring gene expression, retain spatial location information, aiding in situ studies of organismal tissue architecture and the progression of pathological processes. These techniques generate vast amounts of omics data, necessitating the development of computational methods to reveal the underlying tissue microenvironment heterogeneity. The main directions in spatial transcriptomics data analysis are spatial domain detection and spatial deconvolution, which can identify spatial functional regions and parse the distribution of cell types in spatial transcriptomics data by integrating single-cell transcriptomics data. In these two research directions, many computational methods have been successively proposed. This article will categorize them into three types: machine learning-based methods, probabilistic models-based methods, and deep learning-based methods. It will list and discuss the representative algorithms of each type along with their advantages and disadvantages and describe the datasets and evaluation metrics used to assess these computational methods, facilitating researchers in selecting suitable computational methods according to their research needs. Finally, combining the latest technological developments and the advantages and disadvantages of current algorithms, this article will look forward to the future directions of computational method development.
Affiliation(s)
- Chuanchao Zhang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
| | - Lequn Wang
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Qianqian Shi
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, Wuhan 430070, Hubei, China
|
26
|
Wang Y, Woyshner K, Sriworarat C, Stein-O'Brien G, Goff LA, Hansen KD. Multi-sample non-negative spatial factorization. bioRxiv 2024:2024.07.01.599554. [PMID: 39005356 PMCID: PMC11244884 DOI: 10.1101/2024.07.01.599554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Analyzing multi-sample spatial transcriptomics data requires accounting for biological variation. We present multi-sample non-negative spatial factorization (mNSF), an alignment-free framework extending single-sample spatial factorization (NSF) to multi-sample datasets. mNSF incorporates sample-specific spatial correlation modeling and extracts low-dimensional data representations. Through simulations and real data analysis, we demonstrate mNSF's efficacy in identifying true factors, shared anatomical regions, and region-specific biological functions. mNSF's performance is comparable to alignment-based methods when alignment is feasible, while enabling analysis in scenarios where spatial alignment is unfeasible. mNSF shows promise as a robust method for analyzing spatially resolved transcriptomics data across multiple samples.
|
27
|
Qin F, Luo X, Lu Q, Cai B, Xiao F, Cai G. Spatial pattern and differential expression analysis with spatial transcriptomic data. Nucleic Acids Res 2024; 52:e101. [PMID: 39470725 PMCID: PMC11602167 DOI: 10.1093/nar/gkae962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 10/03/2024] [Accepted: 10/11/2024] [Indexed: 10/30/2024] Open
Abstract
The emergence of spatial transcriptomic technologies has opened new avenues for investigating gene activities while preserving the spatial context of tissues. Utilizing data generated by such technologies, the identification of spatially variable (SV) genes is an essential step in exploring tissue landscapes and biological processes. Particularly in typical experimental designs, such as case-control or longitudinal studies, identifying SV genes between groups is crucial for discovering significant biomarkers or developing targeted therapies for diseases. However, current methods available for analyzing spatial transcriptomic data are still in their infancy, and none of the existing methods are capable of identifying SV genes between groups. To overcome this challenge, we developed SPADE for spatial pattern and differential expression analysis to identify SV genes in spatial transcriptomic data. SPADE is based on a machine learning model of Gaussian process regression with a gene-specific Gaussian kernel, enabling the detection of SV genes both within and between groups. Through benchmarking against existing methods in extensive simulations and real data analyses, we demonstrated the preferred performance of SPADE in detecting SV genes within and between groups. The SPADE source code and documentation are publicly available at https://github.com/thecailab/SPADE.
Affiliation(s)
- Fei Qin
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 921 Assembly Street, Columbia, SC, 29208, USA
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD, 20850, USA
| | - Xizhi Luo
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 921 Assembly Street, Columbia, SC, 29208, USA
- Data and Statistical Sciences, AbbVie Inc., 1 N. Waukegan Road, North Chicago, IL, 60064, USA
| | - Qing Lu
- Department of Biostatistics, College of Public Health and Health Professions and College of Medicine, University of Florida, 2004 Mowry Rd., Gainesville, FL, 32608, USA
| | - Bo Cai
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 921 Assembly Street, Columbia, SC, 29208, USA
| | - Feifei Xiao
- Department of Biostatistics, College of Public Health and Health Professions and College of Medicine, University of Florida, 2004 Mowry Rd., Gainesville, FL, 32608, USA
| | - Guoshuai Cai
- Department of Biostatistics, College of Public Health and Health Professions and College of Medicine, University of Florida, 2004 Mowry Rd., Gainesville, FL, 32608, USA
- Department of Surgery, College of Medicine, University of Florida, 1600 SW Archer Rd., Gainesville, FL, 32610, USA
|
28
|
Zhao Y, Long C, Shang W, Si Z, Liu Z, Feng Z, Zuo Y. A composite scaling network of EfficientNet for improving spatial domain identification performance. Commun Biol 2024; 7:1567. [PMID: 39587274 PMCID: PMC11589849 DOI: 10.1038/s42003-024-07286-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Accepted: 11/18/2024] [Indexed: 11/27/2024] Open
Abstract
Spatial transcriptomics (ST) leverages gene expression profiling while preserving spatial location and histological images. However, processing the vast and noisy image data in ST for precise recognition of spatial domains remains a challenge. In this study, we propose EfNST, a method for recognizing spatial domains that employs an efficient composite scaling network of EfficientNet to learn multi-scale image features. Compared with other relevant algorithms on six data sets from three sequencing platforms, EfNST exhibits higher accuracy in discerning fine tissue structures, highlighting its strong scalability to data and operational efficiency. Under limited computing resources, the testing results on multiple data sets show that the EfNST algorithm runs faster while maintaining accuracy. The ablation studies of the EfNST model demonstrate the significant effectiveness of EfficientNet. Within the annotated data sets, EfNST showcases the ability to finely identify subregions within tissue structure and discover corresponding marker genes. In the unannotated data sets, EfNST successfully identifies minute regions within complex tissues and elucidates their spatial expression patterns in biological processes. In summary, EfNST presents a novel approach to inferring cellular spatial organization from discrete data spots with significant implications for the exploration of tissue structure and function.
Affiliation(s)
- Yanan Zhao
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Chunshen Long
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot, China
| | - Wenjing Shang
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
| | - Zhihao Si
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China
| | - Zhigang Liu
- Department of Pediatrics, Foshan Women and Children Hospital, Foshan, China.
| | - Zhenxing Feng
- College of Sciences, Inner Mongolia University of Technology, Hohhot, China.
| | - Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot, China.
|
29
|
Yan Y, Luo X. BACT: nonparametric Bayesian cell typing for single-cell spatial transcriptomics data. Brief Bioinform 2024; 26:bbae689. [PMID: 39751646 PMCID: PMC11697130 DOI: 10.1093/bib/bbae689] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 11/21/2024] [Accepted: 12/16/2024] [Indexed: 01/04/2025] Open
Abstract
Spatial transcriptomics is a rapidly evolving biological technology that simultaneously measures the gene expression profiles and the spatial locations of spots. With progressive advances, current spatial transcriptomic techniques can achieve cellular or even subcellular resolution, making it possible to explore the fine-grained spatial pattern of cell types within one tissue section. However, most existing cell spatial clustering methods require a correct specification of the cell type number, which is hard to determine in practical exploratory data analysis. To address this issue, we present a nonparametric Bayesian model, BACT, to perform BAyesian Cell Typing by utilizing gene expression information and spatial coordinates of cells. BACT incorporates a nonparametric Potts prior to induce neighboring cells' spatial dependency, and, more importantly, it can automatically learn the cell type number directly from the data without prespecification. Evaluations on three single-cell spatial transcriptomic datasets demonstrate that BACT outperforms competing spatial cell typing methods. The R package and the user manual of BACT are publicly available at https://github.com/yinqiaoyan/BACT.
Affiliation(s)
- Yinqiao Yan
- School of Mathematics, Statistics and Mechanics, Beijing University of Technology, No. 100 Pingleyuan, 100124 Beijing, China
| | - Xiangyu Luo
- Institute of Statistics and Big Data, Renmin University of China, No. 59 Zhongguancun Street, 100872 Beijing, China
|
30
|
Túrós D, Vasiljevic J, Hahn K, Rottenberg S, Valdeolivas A. Chrysalis: decoding tissue compartments in spatial transcriptomics with archetypal analysis. Commun Biol 2024; 7:1520. [PMID: 39550461 PMCID: PMC11569261 DOI: 10.1038/s42003-024-07165-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 10/29/2024] [Indexed: 11/18/2024] Open
Abstract
Dissecting tissue compartments in spatial transcriptomics (ST) remains challenging due to limited spatial resolution and dependence on single-cell reference data. We present Chrysalis, a computational method that rapidly uncovers tissue compartments through spatially variable gene (SVG) detection and archetypal analysis without requiring external reference data. Additionally, it offers a unique visualisation approach for swift tissue characterisation and provides access to the underlying gene expression signatures, enabling the identification of spatially and functionally distinct cellular niches. Chrysalis was evaluated through various benchmarks and validated against deconvolution, independently obtained cell type abundance data, and histopathological annotations, demonstrating superior performance compared to other algorithms on both in silico and real-world test examples. Furthermore, we showcased its versatility across different technologies, such as Visium, Visium HD, Slide-seq, and Stereo-seq.
Affiliation(s)
- Demeter Túrós
- Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, Bern, Switzerland.
| | - Jelica Vasiljevic
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Kerstin Hahn
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Sven Rottenberg
- Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, Bern, Switzerland.
- Bern Center for Precision Medicine (BCPM), University of Bern, Bern, Switzerland.
| | - Alberto Valdeolivas
- Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland.
|
31
|
Yang J, Wang L, Liu L, Zheng X. GraphPCA: a fast and interpretable dimension reduction algorithm for spatial transcriptomics data. Genome Biol 2024; 25:287. [PMID: 39511664 PMCID: PMC11545739 DOI: 10.1186/s13059-024-03429-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 10/29/2024] [Indexed: 11/15/2024] Open
Abstract
The rapid advancement of spatial transcriptomics technologies has revolutionized our understanding of cell heterogeneity and intricate spatial structures within tissues and organs. However, the high dimensionality and noise in spatial transcriptomic data present significant challenges for downstream data analyses. Here, we develop GraphPCA, an interpretable and quasi-linear dimension reduction algorithm that leverages the strengths of graphical regularization and principal component analysis. Comprehensive evaluations on simulated and multi-resolution spatial transcriptomic datasets generated from various platforms demonstrate the capacity of GraphPCA to enhance downstream analysis tasks including spatial domain detection, denoising, and trajectory inference compared to other state-of-the-art methods.
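Combining graph regularization with PCA can be illustrated with a generic Laplacian-regularized PCA. The sketch below is an assumption-laden example, not the GraphPCA code: it minimises ||X - Z Wᵀ||² + λ·tr(Zᵀ L Z) subject to WᵀW = I, where L is the Laplacian of a k-nearest-neighbour spatial graph. For this objective the loadings W are the leading eigenvectors of Xᵀ(I + λL)⁻¹X and the embedding is Z = (I + λL)⁻¹XW. The function names, the unweighted kNN graph, and the default values of `lam` and `k` are all hypothetical choices for illustration.

```python
import numpy as np


def knn_laplacian(coords, k=6):
    """Unnormalised Laplacian of a symmetrised k-nearest-neighbour graph (dense)."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # exclude self from the neighbour search
    nbr_idx = np.argsort(dist2, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbr_idx.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)  # make the graph undirected
    return np.diag(adj.sum(axis=1)) - adj


def graph_regularized_pca(expr, coords, n_components=5, lam=1.0, k=6):
    """Hypothetical sketch of Laplacian-regularised PCA with a closed-form solution."""
    x = np.asarray(expr, dtype=float)
    x = x - x.mean(axis=0)  # centre each gene
    m = np.eye(x.shape[0]) + lam * knn_laplacian(coords, k)
    smoothed = np.linalg.solve(m, x)  # (I + lam * L)^{-1} X
    gram = x.T @ smoothed  # X^T (I + lam * L)^{-1} X
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh((gram + gram.T) / 2)
    w = eigvecs[:, ::-1][:, :n_components]
    z = smoothed @ w  # spatially smoothed low-dimensional embedding
    return z, w
```

With `lam=0` the sketch reduces to ordinary PCA, which makes the role of the spatial penalty easy to inspect. The dense solve costs cubic time in the number of spots, so a sparse solver would be needed for realistic data sizes.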
Affiliation(s)
- Jiyuan Yang
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Lu Wang
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- The Guangxi Key Laboratory of Intelligent Precision Medicine, Guangxi Zhuang Autonomous Region, Nanning, China
| | - Lin Liu
- Institute of Natural Sciences, MOE-LSC, School of Mathematical Sciences, CMA-Shanghai, SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University and Shanghai Artificial Intelligence Laboratory, Shanghai, China
| | - Xiaoqi Zheng
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
|
32
|
Stouffer KM, Chen X, Zeng H, Charlier B, Younes L, Trouvé A, Miller MI. xIV-LDDMM Toolkit: A Suite of Image-Varifold Based Technologies for Representing and Mapping 3D Imaging and Spatial-omics Data Simultaneously Across Scales. bioRxiv 2024:2024.11.04.621983. [PMID: 39574713 PMCID: PMC11580965 DOI: 10.1101/2024.11.04.621983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2024]
Abstract
Advancements in imaging and molecular techniques enable the collection of subcellular-scale data. Diversity in measured features, resolution, and physical scope of capture across technologies and experimental protocols poses numerous challenges to integrating data with reference coordinate systems and across scales. This resource paper describes a collection of technologies that we have developed for cross-modality 3D mapping for the alignment of transcriptomics at the micron scales of genes and cells to the anatomical tissue scales. Our collection of technologies includes (i) an explicit censored data representation for the partial matching problem mapping whole brains to subsampled subvolumes, (ii) image-varifold measure norms for supporting nearly universal crossing of modality, (iii) a multi, scale-space optimization technology for generating resampling grids optimized to represent spatial geometry at fixed complexities, and (iv) mutual-information based functional feature selection. Collectively, these methods afford efficient representations of peta-scale imagery providing the algorithms for mapping from the nano to millimeter scales which we term cross-modality image-varifold LDDMM (xIV-LDDMM).
Affiliation(s)
- Kaitlin M. Stouffer
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Centre Borelli ENS Paris-Saclay, Gif-Sur-Yvette, France
| | - Xiaoyin Chen
- Allen Institute for Brain Science, Seattle, WA, USA
| | - Hongkui Zeng
- Allen Institute for Brain Science, Seattle, WA, USA
| | | | - Laurent Younes
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA
| | - Alain Trouvé
- Centre Borelli ENS Paris-Saclay, Gif-Sur-Yvette, France
| | - Michael I. Miller
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
|
33
|
Daccache J, Park E, Junejo M, Abdelghaffar M, Hwang E, Mohanty C, Singh CK, Wang G, Wheeler JO, Shields BE, Nelson CA, Wang Y, Damsky W. Spatial transcriptomics reveals organized and distinct immune activation in cutaneous granulomatous disorders. J Allergy Clin Immunol 2024; 154:1216-1231. [PMID: 39098508 PMCID: PMC11560686 DOI: 10.1016/j.jaci.2024.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 08/06/2024]
Abstract
BACKGROUND: Noninfectious (inflammatory) cutaneous granulomatous disorders include cutaneous sarcoidosis (CS), granuloma annulare (GA), necrobiosis lipoidica (NL), and necrobiotic xanthogranuloma (NXG). These disorders share macrophage-predominant inflammation histologically, but the inflammatory architecture and the pattern of extracellular matrix alteration vary. The underlying molecular explanations for these differences remain unclear. OBJECTIVE: We sought to understand spatial gene expression characteristics in these disorders. METHODS: We performed spatial transcriptomics in cases of CS, GA, NL, and NXG to compare patterns of immune activation and other molecular features in a spatially resolved fashion. RESULTS: CS is characterized by a polarized, spatially organized type 1-predominant response with classical macrophage activation. GA is characterized by a mixed but spatially organized pattern of type 1 and type 2 polarization with both classical and alternative macrophage activation. NL showed concomitant activation of type 1, type 2, and type 3 immunity with a mixed pattern of macrophage activation. Activation of type 1 immunity was shared among CS, GA, and NL and included upregulation of IL-32. NXG showed upregulation of CXCR4-CXCL12/14 chemokine signaling and exaggerated alternative macrophage polarization. Histologic alteration of extracellular matrix correlated with hypoxia and glycolysis programs and type 2 immune activation. CONCLUSIONS: Inflammatory cutaneous granulomatous disorders show distinct and spatially organized immune activation that correlates with hallmark histologic changes.
Affiliation(s)
- Joseph Daccache
- Department of Pathology, NYU Langone Health, New York, NY; Department of Dermatology, Yale School of Medicine, New Haven, Conn.
| | - Eunsuh Park
- Department of Dermatology, Yale School of Medicine, New Haven, Conn
| | - Muhammad Junejo
- Department of Dermatology, Yale School of Medicine, New Haven, Conn
| | | | - Erica Hwang
- Department of Dermatology, Yale School of Medicine, New Haven, Conn
| | - Chitrasen Mohanty
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wis
| | - Chandra K Singh
- Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wis
| | - Guilin Wang
- Keck Microarray Shared Resource, Yale School of Medicine, New Haven, Conn
| | - John O Wheeler
- Keck Microarray Shared Resource, Yale School of Medicine, New Haven, Conn
| | - Bridget E Shields
- Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wis
| | | | - Yiwei Wang
- Department of Dermatology, Yale School of Medicine, New Haven, Conn
| | - William Damsky
- Department of Dermatology, Yale School of Medicine, New Haven, Conn; Department of Pathology, Yale School of Medicine, New Haven, Conn.
|
34
|
Zhong C, Ang KS, Chen J. Interpretable spatially aware dimension reduction of spatial transcriptomics with STAMP. Nat Methods 2024; 21:2072-2083. [PMID: 39407016 PMCID: PMC11541207 DOI: 10.1038/s41592-024-02463-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 09/12/2024] [Indexed: 11/08/2024]
Abstract
Spatial transcriptomics produces high-dimensional gene expression measurements with spatial context. Obtaining a biologically meaningful low-dimensional representation of such data is crucial for effective interpretation and downstream analysis. Here, we present Spatial Transcriptomics Analysis with topic Modeling to uncover spatial Patterns (STAMP), an interpretable spatially aware dimension reduction method built on a deep generative model that returns biologically relevant, low-dimensional spatial topics and associated gene modules. STAMP can analyze data ranging from a single section to multiple sections and from different technologies to time-series data, returning topics matching known biological domains and associated gene modules containing established markers highly ranked within. In a lung cancer sample, STAMP delineated cell states with supporting markers at a higher resolution than the original annotation and uncovered cancer-associated fibroblasts concentrated on the tumor edge's exterior. In time-series data of mouse embryonic development, STAMP disentangled the erythro-myeloid hematopoiesis and hepatocyte developmental trajectories within the liver. STAMP is highly scalable and can handle more than 500,000 cells.
Affiliation(s)
- Chengwei Zhong
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Kok Siong Ang
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Jinmiao Chen
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
- Centre for Computational Biology and Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, Singapore.
- Immunology Translational Research Program, Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore (NUS), Singapore, Singapore.
|
35
|
Liu Y, Li N, Qi J, Xu G, Zhao J, Wang N, Huang X, Jiang W, Wei H, Justet A, Adams TS, Homer R, Amei A, Rosas IO, Kaminski N, Wang Z, Yan X. SDePER: a hybrid machine learning and regression method for cell-type deconvolution of spatial barcoding-based transcriptomic data. Genome Biol 2024; 25:271. [PMID: 39402626 PMCID: PMC11475911 DOI: 10.1186/s13059-024-03416-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Accepted: 10/01/2024] [Indexed: 10/19/2024] Open
Abstract
Spatial barcoding-based transcriptomic (ST) data require deconvolution for cellular-level downstream analysis. Here we present SDePER, a hybrid machine learning and regression method to deconvolve ST data using reference single-cell RNA sequencing (scRNA-seq) data. SDePER tackles platform effects between ST and scRNA-seq data, ensuring a linear relationship between them while addressing sparsity and spatial correlations in cell types across capture spots. SDePER estimates cell-type proportions, enabling enhanced resolution tissue mapping by imputing cell-type compositions and gene expressions at unmeasured locations. Applications to simulated data and four real datasets showed SDePER's superior accuracy and robustness over existing methods.
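The linear relationship mentioned above can be illustrated with a toy regression: a spot's expression is modeled as a non-negative mixture of reference cell-type profiles, and the mixture weights are the estimated proportions. The sketch below uses plain projected gradient descent on made-up numbers. It is not SDePER's algorithm (platform-effect handling, sparsity, and spatial correlation are all omitted), and the function name and data are hypothetical.

```python
def deconvolve_spot(reference, spot, n_iter=500):
    """Estimate non-negative cell-type proportions for one spot.

    reference: one row per gene, one mean expression value per cell type.
    spot: observed expression per gene.
    Minimizes 0.5 * ||R w - y||^2 subject to w >= 0, then normalizes w to sum to 1.
    """
    n_genes, n_types = len(reference), len(reference[0])
    # The squared Frobenius norm bounds the largest eigenvalue of R^T R,
    # so its inverse is a safe gradient step size.
    step = 1.0 / sum(v * v for row in reference for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(reference[g][k] * w[k] for k in range(n_types)) - spot[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(reference[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_types)]
    total = sum(w)
    return [v / total for v in w] if total > 0 else w

# Hypothetical reference: 4 genes x 2 cell types, and a spot mixed 30% / 70%.
reference = [[10.0, 0.0], [0.0, 8.0], [5.0, 5.0], [1.0, 1.0]]
spot = [0.3 * a + 0.7 * b for a, b in reference]
proportions = deconvolve_spot(reference, spot)
```

On this noise-free toy spot the recovered proportions match the mixing weights used to build it; real data would need a noise model and the corrections the abstract describes.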
Affiliation(s)
- Yunqing Liu
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Ningshan Li
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- The Second Affiliated Hospital of The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong, China
| | - Ji Qi
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Gang Xu
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Department of Mathematical Sciences, University of Nevada, Las Vegas, NV, USA
| | - Jiayi Zhao
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Nating Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Xiayuan Huang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Wenhao Jiang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Huanhuan Wei
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Aurélien Justet
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, USA
- Service de Pneumologie, Centre de Competences de Maladies Pulmonaires Rares, CHU de Caen UNICAEN, CEA, CNRS, ISTCT/CERVOxy Group, GIP CYCERON, Normandie University, Caen, France
| | - Taylor S Adams
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Robert Homer
- Department of Pathology, Yale School of Medicine, New Haven, CT, USA
| | - Amei Amei
- Department of Mathematical Sciences, University of Nevada, Las Vegas, NV, USA
| | - Ivan O Rosas
- Department of Medicine, Baylor College of Medicine, Houston, TX, USA
| | - Naftali Kaminski
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, USA
| | - Zuoheng Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
- Department of Biomedical Informatics & Data Science, Yale School of Medicine, New Haven, CT, USA.
| | - Xiting Yan
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, USA.
|
36
|
Zhang Y, Yu B, Ming W, Zhou X, Wang J, Chen D. SpaTopic: A statistical learning framework for exploring tumor spatial architecture from spatially resolved transcriptomic data. Sci Adv 2024; 10:eadp4942. [PMID: 39331720 PMCID: PMC11430467 DOI: 10.1126/sciadv.adp4942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 08/21/2024] [Indexed: 09/29/2024]
Abstract
Tumor tissues exhibit a complex spatial architecture within the tumor microenvironment (TME). Spatially resolved transcriptomics (SRT) is promising for unveiling the spatial structures of the TME at both cellular and molecular levels, but identifying pathology-relevant spatial domains remains challenging. Here, we introduce SpaTopic, a statistical learning framework that harmonizes spot clustering and cell-type deconvolution by integrating single-cell transcriptomics and SRT data. Through topic modeling, SpaTopic stratifies the TME into spatial domains with coherent cellular organization, facilitating refined annotation of the spatial architecture with improved performance. We assess SpaTopic across various tumor types and show accurate prediction of tertiary lymphoid structures and tumor boundaries. Moreover, marker genes derived from SpaTopic are transferrable and can be applied to mark spatial domains in other datasets. In addition, SpaTopic enables quantitative comparison and functional characterization of spatial domains across SRT datasets. Overall, SpaTopic presents an innovative analytical framework for exploring, comparing, and interpreting tumor SRT data.
Affiliation(s)
- Yuelei Zhang
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Bianjiong Yu
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Wenxuan Ming
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Xiaolong Zhou
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Jin Wang
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Dijun Chen
- Department of Gastroenterology, Nanjing Drum Tower Hospital, National Resource Center for Mutant Mice, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Central Laboratory of Stomatology, Nanjing Stomatological Hospital, Medical School of Nanjing University, Nanjing, China
- Chemistry and Biomedicine Innovation Center, Nanjing University, Nanjing, China
|
37
|
Zhao J, Zhang X, Wang G, Lin Y, Liu T, Chang RB, Zhao H. INSPIRE: interpretable, flexible and spatially-aware integration of multiple spatial transcriptomics datasets from diverse sources. bioRxiv 2024:2024.09.23.614539. [PMID: 39386646 PMCID: PMC11463460 DOI: 10.1101/2024.09.23.614539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Recent advances in spatial transcriptomics technologies have led to a growing number of diverse datasets, offering unprecedented opportunities to explore tissue organizations and functions within spatial contexts. However, it remains a significant challenge to effectively integrate and interpret these data, often originating from different samples, technologies, and developmental stages. In this paper, we present INSPIRE, a deep learning method for integrative analyses of multiple spatial transcriptomics datasets to address this challenge. With designs of graph neural networks and an adversarial learning mechanism, INSPIRE enables spatially informed and adaptable integration of data from varying sources. By incorporating non-negative matrix factorization, INSPIRE uncovers interpretable spatial factors with corresponding gene programs, revealing tissue architectures, cell type distributions and biological processes. We demonstrate the capabilities of INSPIRE by applying it to human cortex slices from different samples, mouse brain slices with complementary views, mouse hippocampus and embryo slices generated through different technologies, and spatiotemporal organogenesis atlases containing half a million spatial spots. INSPIRE shows superior performance in identifying detailed biological signals, effectively borrowing information across distinct profiling technologies, and elucidating dynamical changes during embryonic development. Furthermore, we utilize INSPIRE to build 3D models of tissues and whole organisms from multiple slices, demonstrating its power and versatility.
Affiliation(s)
- Jia Zhao
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
| | - Xiangyu Zhang
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
| | - Gefei Wang
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
| | - Yingxin Lin
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
| | - Tianyu Liu
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Rui B. Chang
- Department of Neuroscience, School of Medicine, Yale University, New Haven, CT, USA
- Department of Cellular and Molecular Physiology, School of Medicine, Yale University, New Haven, CT, USA
| | - Hongyu Zhao
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
|
38
|
Xu Y, Lv D, Zou X, Wu L, Xu X, Zhao X. BFAST: joint dimension reduction and spatial clustering with Bayesian factor analysis for zero-inflated spatial transcriptomics data. Brief Bioinform 2024; 25:bbae594. [PMID: 39552067 PMCID: PMC11570543 DOI: 10.1093/bib/bbae594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 09/03/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024] Open
Abstract
The development of spatially resolved transcriptomics (ST) technologies has made it possible to measure gene expression profiles coupled with cellular spatial context and assist biologists in comprehensively characterizing cellular phenotype heterogeneity and tissue microenvironment. Spatial clustering is vital for biological downstream analysis. However, high noise and dropout events make clustering spatial transcriptomics data challenging, and effective algorithms are lacking. Here we develop a novel method, jointly performing dimension reduction and spatial clustering with Bayesian Factor Analysis for zero-inflated Spatial Transcriptomics data (BFAST). BFAST has showcased exceptional performance on simulation data and real spatial transcriptomics datasets, as proven by benchmarking against currently available methods. It effectively extracts more biologically informative low-dimensional features compared to traditional dimensionality reduction approaches, thereby enhancing the accuracy and precision of clustering.
Affiliation(s)
- Yang Xu
- BGI-Research, 313, Gaoteng Avenue, Jiulongpo, Chongqing 400039, China
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
| | - Dian Lv
- BGI-Research, 313, Gaoteng Avenue, Jiulongpo, Chongqing 400039, China
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
| | - Xuanxuan Zou
- BGI-Research, 313, Gaoteng Avenue, Jiulongpo, Chongqing 400039, China
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
| | - Liang Wu
- BGI-Research, 313, Gaoteng Avenue, Jiulongpo, Chongqing 400039, China
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
| | - Xun Xu
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
| | - Xin Zhao
- BGI-Research, 313, Gaoteng Avenue, Jiulongpo, Chongqing 400039, China
- BGI-Research, 9, Yunhua Road, Yantian, Shenzhen 518083, China
|
39
|
Qiu X, Zhong P, Yue L, Li C, Yun Z, Si G, Li M, Chen Z, Tan Y, Bao P. Spatial transcriptomic sequencing reveals immune microenvironment features of Mycobacterium tuberculosis granulomas in lung and omentum. Theranostics 2024; 14:6185-6201. [PMID: 39431015 PMCID: PMC11488093 DOI: 10.7150/thno.99038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 09/04/2024] [Indexed: 10/22/2024] Open
Abstract
Granulomas are a key pathological feature of tuberculosis (TB), characterized by cell heterogeneity, spatial composition, and cellular interactions, which play crucial roles in granuloma progression and host prognosis. This study aims to analyze the transcriptome profiles of cell populations based on their spatial location and to understand the core transcriptome characteristics of granuloma formation and development. Methods: In this study, we collected four clinical biopsy samples including Mycobacterium tuberculosis (Mtb)-infected lung (MTB-L) and omentum tissues (MTB-O), as well as two lung and omentum biopsies from non-TB patients. The tissues were analyzed by spatial transcriptomics to create a spatial atlas. Utilizing cell enrichment scores and intercellular communication analysis, we investigated the transcriptome signatures of cell populations in various spatial regions and identified genes that may play a decisive role in the formation of pulmonary and omental tuberculosis granulomas. To validate our major findings, an in vitro TB model based on organoid-macrophage co-culture was established. Results: Spatial transcriptomics mapped the cell composition and spatial distribution characteristics of tuberculosis granulomas in lung and omental tissues infected with Mtb. The characteristics and evolutionary relationships of major cell populations in granulomas reveal a shift in the immune microenvironment: from a predominance of B cells and fibroblasts in pulmonary granulomas to a predominance of myeloid cells and fibroblasts in omental granulomas. Furthermore, our data identified key differentially expressed genes across cell clusters and regions, showing that upregulation of collagen genes is a common feature of granulomas. Using an organoid-macrophage co-culture model, we demonstrated the notable efficacy of Thrombospondin-1 (THBS1) in reducing protein expression levels related to extracellular matrix remodeling. Conclusion: These results provide insights into the pathogenesis and development of tuberculosis, enhancing our understanding of the composition and interactions of tuberculosis granuloma cells from a spatial perspective, and pave the way for novel adjuvant treatments for tuberculosis.
Affiliation(s)
- Xiaochen Qiu
- The Eighth Medical Center, Chinese PLA General Hospital, 100039, Beijing, China
- Senior Department of General Surgery, Chinese PLA General Hospital, Beijing, 100093, China
| | - Pengfei Zhong
- Graduate School, Hebei North University, 075000, Zhangjiakou, Hebei Province, China
| | - Liang Yue
- Academy of Military Medical Sciences, Beijing, 100850, China
| | - Chaofan Li
- Graduate School, Hebei North University, 075000, Zhangjiakou, Hebei Province, China
| | - Zhimin Yun
- Academy of Military Medical Sciences, Beijing, 100850, China
| | - Guangqian Si
- Graduate School, Hebei North University, 075000, Zhangjiakou, Hebei Province, China
| | - Mengfan Li
- Graduate School, Hebei North University, 075000, Zhangjiakou, Hebei Province, China
| | - Zhi Chen
- The Eighth Medical Center, Chinese PLA General Hospital, 100039, Beijing, China
- Senior Department of Tuberculosis, Chinese PLA General Hospital, Beijing, 100093, China
| | - Yingxia Tan
- Academy of Military Medical Sciences, Beijing, 100850, China
| | - Pengtao Bao
- The Eighth Medical Center, Chinese PLA General Hospital, 100039, Beijing, China
- Senior Department of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, 100093, China
|
40
|
Pang M, Roy TK, Wu X, Tan K. CelloType: A Unified Model for Segmentation and Classification of Tissue Images. bioRxiv 2024:2024.09.15.613139. [PMID: 39345491 PMCID: PMC11429831 DOI: 10.1101/2024.09.15.613139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Cell segmentation and classification are critical tasks in spatial omics data analysis. We introduce CelloType, an end-to-end model designed for cell segmentation and classification of biomedical microscopy images. Unlike the traditional two-stage approach of segmentation followed by classification, CelloType adopts a multi-task learning approach that connects the segmentation and classification tasks and simultaneously boosts the performance of both tasks. CelloType leverages Transformer-based deep learning techniques for enhanced accuracy of object detection, segmentation, and classification. It outperforms existing segmentation methods using ground-truths from public databases. In terms of classification, CelloType outperforms a baseline model comprised of state-of-the-art methods for individual tasks. Using multiplexed tissue images, we further demonstrate the utility of CelloType for multi-scale segmentation and classification of both cellular and non-cellular elements in a tissue. The enhanced accuracy and multi-task-learning ability of CelloType facilitate automated annotation of rapidly growing spatial omics data.
Affiliation(s)
- Minxing Pang
- Applied Mathematics & Computational Science Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
| | - Tarun Kanti Roy
- Department of Computer Science, The University of Iowa, Iowa City, IA, USA
| | - Xiaodong Wu
- Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA, USA
- Department of Radiation Oncology, University of Iowa, Iowa City, IA, USA
| | - Kai Tan
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of Oncology and Center for Childhood Cancer Research, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Single Cell Biology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
|
41
|
Cahill R, Wang Y, Xian RP, Lee AJ, Zeng H, Yu B, Tasic B, Abbasi-Asl R. Unsupervised pattern identification in spatial gene expression atlas reveals mouse brain regions beyond established ontology. Proc Natl Acad Sci U S A 2024; 121:e2319804121. [PMID: 39226356 PMCID: PMC11406299 DOI: 10.1073/pnas.2319804121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 07/24/2024] [Indexed: 09/05/2024] Open
Abstract
The rapid growth of large-scale spatial gene expression data demands efficient and reliable computational tools to extract major trends of gene expression in their native spatial context. Here, we used stability-driven unsupervised learning (i.e., staNMF) to identify principal patterns (PPs) of 3D gene expression profiles and understand spatial gene distribution and anatomical localization at the whole mouse brain level. Our subsequent spatial correlation analysis systematically compared the PPs to known anatomical regions and ontology from the Allen Mouse Brain Atlas using spatial neighborhoods. We demonstrate that our stable and spatially coherent PPs, whose linear combinations accurately approximate the spatial gene data, are highly correlated with combinations of expert-annotated brain regions. These PPs yield a brain ontology based purely on spatial gene expression. Our PP identification approach outperforms principal component analysis and typical clustering algorithms on the same task. Moreover, we show that the stable PPs reveal marked regional imbalance of brainwide genetic architecture, leading to region-specific marker genes and gene coexpression networks. Our findings highlight the advantages of stability-driven machine learning for plausible biological discovery from dense spatial gene expression data, streamlining tasks that are infeasible by conventional manual approaches.
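The idea that a few non-negative patterns can linearly approximate a gene-by-location matrix can be sketched with plain non-negative matrix factorization. The example below uses the standard Lee-Seung multiplicative updates on a hypothetical toy matrix. It shows only the factorization step; the stability-driven model selection that distinguishes staNMF (comparing factorizations across repeated runs) is not implemented, and all names and data here are assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(x, w, h):
    """Squared Frobenius reconstruction error ||X - W H||^2."""
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2 for i in range(len(x)) for j in range(len(x[0])))

def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor X (genes x locations) into non-negative W (genes x rank) and H (rank x locations)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[r][j] * num[r][j] / (den[r][j] + eps) for j in range(m)] for r in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][r] * num[i][r] / (den[i][r] + eps) for r in range(rank)] for i in range(n)]
    return w, h

# Hypothetical toy matrix: 4 genes x 4 locations with two block-shaped patterns.
x = [[5.0, 5.0, 0.0, 0.0],
     [4.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 3.0],
     [0.0, 0.0, 6.0, 6.0]]
w0, h0 = nmf(x, rank=2, n_iter=0)   # random initialization only
w, h = nmf(x, rank=2, n_iter=200)   # same initialization, then updates
error_before = frob_error(x, w0, h0)
error_after = frob_error(x, w, h)
```

Each row of H plays the role of a spatial pattern and each row of W gives a gene's loadings on those patterns; a stability-driven variant would repeat this over several seeds and ranks and keep the rank whose patterns are most reproducible.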
Affiliation(s)
- Robert Cahill
- Department of Neurology, University of California, San Francisco, CA 94143
- UCSF Weill Institute for Neurosciences, San Francisco, CA 94143
| | - Yu Wang
- Department of Statistics, University of California, Berkeley, CA 94720
| | - R Patrick Xian
- Department of Neurology, University of California, San Francisco, CA 94143
- UCSF Weill Institute for Neurosciences, San Francisco, CA 94143
| | - Alex J Lee
- Department of Neurology, University of California, San Francisco, CA 94143
- UCSF Weill Institute for Neurosciences, San Francisco, CA 94143
| | - Hongkui Zeng
- Allen Institute for Brain Science, Seattle, WA 98109
| | - Bin Yu
- Department of Statistics, University of California, Berkeley, CA 94720
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720
| | | | - Reza Abbasi-Asl
- Department of Neurology, University of California, San Francisco, CA 94143
- UCSF Weill Institute for Neurosciences, San Francisco, CA 94143
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94143
|
42
|
Jena SG, Verma A, Engelhardt BE. Answering open questions in biology using spatial genomics and structured methods. BMC Bioinformatics 2024; 25:291. [PMID: 39232666 PMCID: PMC11375982 DOI: 10.1186/s12859-024-05912-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 08/22/2024] [Indexed: 09/06/2024] Open
Abstract
Genomics methods have uncovered patterns in a range of biological systems, but obscure important aspects of cell behavior: the shapes, relative locations, movement, and interactions of cells in space. Spatial technologies that collect genomic or epigenomic data while preserving spatial information have begun to overcome these limitations. These new data promise a deeper understanding of the factors that affect cellular behavior, and in particular the ability to directly test existing theories about cell state and variation in the context of morphology, location, motility, and signaling that could not be tested before. Rapid advancements in resolution, ease-of-use, and scale of spatial genomics technologies to address these questions also require an updated toolkit of statistical methods with which to interrogate these data. We present a framework to respond to this new avenue of research: four open biological questions that can now be answered using spatial genomics data paired with methods for analysis. We outline spatial data modalities for each open question that may yield specific insights, discuss how conflicting theories may be tested by comparing the data to conceptual models of biological behavior, and highlight statistical and machine learning-based tools that may prove particularly helpful to recover biological understanding.
Affiliation(s)
- Siddhartha G Jena
- Department of Stem Cell and Regenerative Biology, Harvard, 7 Divinity Ave, Cambridge, MA, USA
| | - Archit Verma
- Gladstone Institutes, 1650 Owens Street, San Francisco, CA, 94158, USA
| | | |
|
43
|
Adhikari SD, Steele NG, Theisen B, Wang J, Cui Y. SPACE: Spatially variable gene clustering adjusting for cell type effect for improved spatial domain detection. bioRxiv 2024:2024.08.23.609477. [PMID: 39229093 PMCID: PMC11370608 DOI: 10.1101/2024.08.23.609477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Recent advances in spatial transcriptomics have significantly deepened our understanding of biology. A primary focus has been identifying spatially variable genes (SVGs) which are crucial for downstream tasks like spatial domain detection. Traditional methods often use all or a set number of top SVGs for this purpose. However, in diverse datasets with many SVGs, this approach may not ensure accurate results. Instead, grouping SVGs by expression patterns and using all SVG groups in downstream analysis can improve accuracy. Furthermore, classifying SVGs in this manner is akin to identifying cell type marker genes, offering valuable biological insights. The challenge lies in accurately categorizing SVGs into relevant clusters, aggravated by the absence of prior knowledge regarding the number and spectrum of spatial gene patterns. Addressing this challenge, we propose SPACE, SPatially variable gene clustering Adjusting for Cell type Effect, a framework that classifies SVGs based on their spatial patterns by adjusting for confounding effects caused by shared cell types, to improve spatial domain detection. This method does not require prior knowledge of gene cluster numbers, spatial patterns, or cell type information. Our comprehensive simulations and real data analyses demonstrate that SPACE is an efficient and promising tool for spatial transcriptomics analysis.
Affiliation(s)
- Sikta Das Adhikari
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI
- Department of Statistics and Probability, Michigan State University, East Lansing, MI
- Nina G. Steele
- Department of Surgery, Henry Ford Pancreatic Cancer Center, Henry Ford Hospital, Detroit, MI
- Department of Pathology, Wayne State University, Detroit, MI
- Department of Oncology, Wayne State University, Detroit, MI
- Department of Pharmacology and Toxicology, Michigan State University, East Lansing, MI
- Brian Theisen
- Department of Pathology, Henry Ford Health, Detroit, MI
- Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI, 48824, USA
|
44
|
Liu L, Chen A, Li Y, Mulder J, Heyn H, Xu X. Spatiotemporal omics for biology and medicine. Cell 2024; 187:4488-4519. [PMID: 39178830 DOI: 10.1016/j.cell.2024.07.040] [Received: 03/20/2024] [Revised: 07/05/2024] [Accepted: 07/23/2024] [Indexed: 08/26/2024]
Abstract
The completion of the Human Genome Project has provided a foundational blueprint for understanding human life. Nonetheless, understanding the intricate mechanisms through which our genetic blueprint is involved in disease or orchestrates development across temporal and spatial dimensions remains a profound scientific challenge. Recent breakthroughs in cellular omics technologies have paved new pathways for understanding the regulation of genomic elements and the relationship between gene expression, cellular functions, and cell fate determination. The advent of spatial omics technologies, encompassing both imaging and sequencing-based methodologies, has enabled a comprehensive understanding of biological processes from a cellular ecosystem perspective. This review offers an updated overview of how spatial omics has advanced our understanding of the translation of genetic information into cellular heterogeneity and tissue structural organization and their dynamic changes over time. It emphasizes the discovery of various biological phenomena, related to organ functionality, embryogenesis, species evolution, and the pathogenesis of diseases.
Affiliation(s)
- Ao Chen
- BGI Research, Shenzhen 518083, China
- Jan Mulder
- Department of Neuroscience, Karolinska Institute, Stockholm, Sweden
- Holger Heyn
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- Xun Xu
- BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China
|
45
|
Wu Z, Kondo A, McGrady M, Baker EAG, Chidester B, Wu E, Rahim MK, Bracey NA, Charu V, Cho RJ, Cheng JB, Afkarian M, Zou J, Mayer AT, Trevino AE. Discovery and generalization of tissue structures from spatial omics data. Cell Rep Methods 2024; 4:100838. [PMID: 39127044 PMCID: PMC11384092 DOI: 10.1016/j.crmeth.2024.100838] [Received: 12/13/2023] [Revised: 04/15/2024] [Accepted: 07/19/2024] [Indexed: 08/12/2024]
Abstract
Tissues are organized into anatomical and functional units at different scales. New technologies for high-dimensional molecular profiling in situ have enabled the characterization of structure-function relationships in increasing molecular detail. However, it remains a challenge to consistently identify key functional units across experiments, tissues, and disease contexts, a task that demands extensive manual annotation. Here, we present spatial cellular graph partitioning (SCGP), a flexible method for the unsupervised annotation of tissue structures. We further present a reference-query extension pipeline, SCGP-Extension, that generalizes reference tissue structure labels to previously unseen samples, performing data integration and tissue structure discovery. Our experiments demonstrate reliable, robust partitioning of spatial data in a wide variety of contexts and best-in-class accuracy in identifying expertly annotated structures. Downstream analysis on SCGP-identified tissue structures reveals disease-relevant insights regarding diabetic kidney disease, skin disorder, and neoplastic diseases, underscoring its potential to drive biological insight and discovery from spatial datasets.
Affiliation(s)
- Zhenqin Wu
- Enable Medicine, Menlo Park, CA 94025, USA
- Eric Wu
- Enable Medicine, Menlo Park, CA 94025, USA; Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
- Nathan A Bracey
- Institute of Immunity, Transplantation and Infection, Stanford University, Stanford, CA 94305, USA
- Vivek Charu
- Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Raymond J Cho
- Department of Dermatology, University of California, San Francisco, San Francisco, CA, USA
- Jeffrey B Cheng
- Department of Dermatology, University of California, San Francisco, San Francisco, CA, USA; Department of Dermatology, Veterans Affairs Medical Center, San Francisco, CA, USA
- Maryam Afkarian
- Division of Nephrology, Department of Medicine, University of California, Davis, Davis, CA 95618, USA
- James Zou
- Enable Medicine, Menlo Park, CA 94025, USA; Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
|
46
|
Hu Y, Xie M, Li Y, Rao M, Shen W, Luo C, Qin H, Baek J, Zhou XM. Benchmarking clustering, alignment, and integration methods for spatial transcriptomics. Genome Biol 2024; 25:212. [PMID: 39123269 PMCID: PMC11312151 DOI: 10.1186/s13059-024-03361-0] [Received: 03/12/2024] [Accepted: 07/30/2024] [Indexed: 08/12/2024] Open
Abstract
BACKGROUND: Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remains challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of comprehensive benchmark studies complicates the selection of methods and future method development.
RESULTS: In this study, we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics and analyses, including eight metrics for spatial clustering accuracy and contiguity, uniform manifold approximation and projection visualization, layer-wise and spot-to-spot alignment accuracy, and 3D reconstruction, which are designed to assess method performance as well as data quality. The code used for evaluation is available on our GitHub. Additionally, we provide online notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets.
CONCLUSIONS: Our analyses lead to comprehensive recommendations that cover multiple aspects, helping users to select optimal tools for their specific needs and guide future method development.
Affiliation(s)
- Yunfei Hu
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Manfei Xie
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Yikang Li
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Mingxing Rao
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Wenjun Shen
- Department of Bioinformatics, Shantou University Medical College, 515041, Shantou, China
- Can Luo
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Haoran Qin
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Jihoon Baek
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Xin Maizie Zhou
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
|
47
|
Mohd ON, Heng YJ, Wang L, Thavamani A, Massicott ES, Wulf GM, Slack FJ, Doyle PS. Sensitive Multiplexed MicroRNA Spatial Profiling and Data Classification Framework Applied to Murine Breast Tumors. Anal Chem 2024; 96:12729-12738. [PMID: 39044395 DOI: 10.1021/acs.analchem.4c01773] [Indexed: 07/25/2024]
Abstract
MicroRNAs (miRNAs) are small RNAs that are often dysregulated in many diseases, including cancers. They are highly tissue-specific and stable, thus making them particularly useful as biomarkers. As the spatial transcriptomics field advances, protocols that enable highly sensitive and spatially resolved detection become necessary to maximize the information gained from samples. This is especially true of miRNAs, where the location of their expression within tissue can provide prognostic value with regard to patient outcome. Equally as important as detection are ways to assess and visualize the miRNA's spatial information in order to leverage the power of spatial transcriptomics over that of traditional nonspatial bulk assays. We present a highly sensitive methodology that simultaneously quantitates and spatially detects seven miRNAs in situ on formalin-fixed paraffin-embedded tissue sections. This method utilizes rolling circle amplification (RCA) in conjunction with a dual scanning approach in nanoliter well arrays with embedded hydrogel posts. The hydrogel posts are functionalized with DNA probes that enable the detection of miRNAs across a large dynamic range (4 orders of magnitude) and a limit of detection of 0.17 zeptomoles (1.7 × 10⁻⁴ attomoles). We applied our methodology coupled with a data analysis pipeline to K14-Cre Brca1f/fTp53f/f murine breast tumors to showcase the information gained from this approach.
Affiliation(s)
- Omar N Mohd
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Yujing J Heng
- Departments of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Lin Wang
- Departments of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Abhishek Thavamani
- Departments of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Erica S Massicott
- Departments of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Gerburg M Wulf
- Departments of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Frank J Slack
- Departments of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Harvard Medical School Initiative for RNA Medicine, Departments of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Patrick S Doyle
- Harvard Medical School Initiative for RNA Medicine, Departments of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, United States
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
48
|
Dezem FS, Arjumand W, DuBose H, Morosini NS, Plummer J. Spatially Resolved Single-Cell Omics: Methods, Challenges, and Future Perspectives. Annu Rev Biomed Data Sci 2024; 7:131-153. [PMID: 38768396 DOI: 10.1146/annurev-biodatasci-102523-103640] [Indexed: 05/22/2024]
Abstract
Overlaying omics data onto spatial biological dimensions has been a promising technology to provide high-resolution insights into the interactome and cellular heterogeneity relative to the organization of the molecular microenvironment of tissue samples in normal and disease states. Spatial omics can be categorized into three major modalities: (a) next-generation sequencing-based assays, (b) imaging-based spatially resolved transcriptomics approaches including in situ hybridization/in situ sequencing, and (c) imaging-based spatial proteomics. These modalities allow assessment of transcripts and proteins at a cellular level, generating large and computationally challenging datasets. The lack of standardized computational pipelines to analyze and integrate these nonuniform structured data has made it necessary to apply artificial intelligence and machine learning strategies to best visualize and translate their complexity. In this review, we summarize the currently available techniques and computational strategies, highlight their advantages and limitations, and discuss their future prospects in the scientific field.
Affiliation(s)
- Felipe Segato Dezem
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Wani Arjumand
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Hannah DuBose
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Natalia Silva Morosini
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Jasmine Plummer
- Department of Cellular and Molecular Biology and Comprehensive Cancer Center, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
|
49
|
Tian T, Zhang J, Lin X, Wei Z, Hakonarson H. Dependency-aware deep generative models for multitasking analysis of spatial omics data. Nat Methods 2024; 21:1501-1513. [PMID: 38783067 DOI: 10.1038/s41592-024-02257-y] [Received: 01/02/2023] [Accepted: 03/25/2024] [Indexed: 05/25/2024]
Abstract
Spatially resolved transcriptomics (SRT) technologies have significantly advanced biomedical research, but their data analysis remains challenging due to the discrete nature of the data and the high levels of noise, compounded by complex spatial dependencies. Here, we propose spaVAE, a dependency-aware, deep generative spatial variational autoencoder model that probabilistically characterizes count data while capturing spatial correlations. spaVAE introduces a hybrid embedding combining a Gaussian process prior with a Gaussian prior to explicitly capture spatial correlations among spots. It then optimizes the parameters of deep neural networks to approximate the distributions underlying the SRT data. With the approximated distributions, spaVAE can contribute to several analytical tasks that are essential for SRT data analysis, including dimensionality reduction, visualization, clustering, batch integration, denoising, differential expression, spatial interpolation, resolution enhancement and identification of spatially variable genes. Moreover, we have extended spaVAE to spaPeakVAE and spaMultiVAE to characterize spatial ATAC-seq (assay for transposase-accessible chromatin using sequencing) data and spatial multi-omics data, respectively.
Affiliation(s)
- Tian Tian
- School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, Hubei, China
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Jie Zhang
- National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, China
- Xiang Lin
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
50
|
Zhang M, Zhang W, Ma X. ST-SCSR: identifying spatial domains in spatial transcriptomics data via structure correlation and self-representation. Brief Bioinform 2024; 25:bbae437. [PMID: 39228303 PMCID: PMC11372132 DOI: 10.1093/bib/bbae437] [Received: 05/23/2024] [Revised: 07/31/2024] [Accepted: 08/20/2024] [Indexed: 09/05/2024] Open
Abstract
Recent advances in spatial transcriptomics (ST) enable measurement of the transcriptome within intact biological tissues while preserving spatial information, offering biologists unprecedented opportunities to comprehensively understand the tissue micro-environment, where spatial domains are the basic units of tissues. Although great effort has been devoted to identifying spatial domains, existing methods still have many shortcomings, such as ignoring local information and the relations among spatial domains, so alternatives are needed to solve these problems. Here, we propose a novel algorithm for spatial domain identification in Spatial Transcriptomics data with Structure Correlation and Self-Representation (ST-SCSR), which integrates local information, global information, and the similarity of spatial domains. Specifically, ST-SCSR utilizes matrix tri-factorization to simultaneously decompose the expression profiles and the spatial network of spots, where expressional and spatial features of spots are fused via the shared factor matrix, which is interpreted as the similarity of spatial domains. Furthermore, ST-SCSR learns an affinity graph of spots by manipulating expressional and spatial features, where local preservation and sparse constraints are employed, thereby enhancing the quality of the graph. The experimental results demonstrate that ST-SCSR not only outperforms state-of-the-art algorithms in terms of accuracy, but also identifies many potentially interesting patterns.
Affiliation(s)
- Min Zhang
- School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, 710071 Xi'an Shaanxi, China
- Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No. 2 South Taibai Road, 710071 Xi'an Shaanxi, China
- Wensheng Zhang
- School of Computer Science and Cyber Engineering, GuangZhou University, No. 230 Wai Huan Xi Road, Guangzhou Higher Education Mega Center, 510006 Guangzhou Guangdong, China
- Xiaoke Ma
- School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, 710071 Xi'an Shaanxi, China
- Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No. 2 South Taibai Road, 710071 Xi'an Shaanxi, China
|