1
Chen H, Nguyen ND, Ruffalo M, Bar-Joseph Z. A unified analysis of atlas single-cell data. Genome Res 2025; 35:1219-1233. [PMID: 39965934 PMCID: PMC12047537 DOI: 10.1101/gr.279631.124] [Received: 05/28/2024] [Accepted: 02/03/2025] [Indexed: 02/20/2025]
Abstract
Recent efforts to generate atlas-scale single-cell data provide opportunities for joint analysis across tissues and modalities. Existing methods use cells as the reference unit, hindering downstream gene-based analysis and removing genuine biological variation. Here we present GIANT, an integration method designed for atlas-scale gene analysis across cell types and tissues. GIANT converts data sets into gene graphs and recursively embeds genes without additional alignment. Applying GIANT to two recent atlas data sets yields unified gene-embedding spaces across human tissues and data modalities. Further evaluations demonstrate GIANT's usefulness in discovering diverse gene functions and underlying gene regulation in cells from different tissues.
Affiliation(s)
- Hao Chen
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Department of Computer Science, University of Illinois Chicago, Chicago, Illinois 60607, USA
- Nam D Nguyen
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Matthew Ruffalo
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Ziv Bar-Joseph
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
2
Danaher P, McGuire D, Wu L, Patrick M, Kroeppler D, Zhai H, Olgun DG, Gong D, Cao J, Hwang WL, Schmid J, Beechem JM. InSituCor: exploring spatially correlated genes conditional on the cell type landscape. Genome Biol 2025; 26:105. [PMID: 40275395 PMCID: PMC12020328 DOI: 10.1186/s13059-025-03554-1] [Received: 09/25/2023] [Accepted: 03/21/2025] [Indexed: 04/26/2025]
Abstract
In spatial transcriptomics data, spatially correlated genes promise to reveal high-interest phenomena like cell-cell interactions and latent variables. But in practice, most spatial correlations arise from the spatial arrangement of cell types, obscuring the more interesting relationships we hope to discover. We introduce InSituCor, a toolkit for discovering modules of spatially correlated genes. InSituCor returns only correlations not explainable by already-known factors like the cell type landscape; this spares precious analyst effort. InSituCor supports both unbiased discovery of whole-dataset correlations and knowledge-driven exploration of genes of interest. As a special case, it evaluates ligand-receptor pairs for spatial co-regulation.
Affiliation(s)
- Lidan Wu
- Bruker Spatial Biology, Seattle, WA, USA
- Deniz G Olgun
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neuroscience, University of Virginia School of Medicine, Charlottesville, VA, USA
- Dennis Gong
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard-MIT Health Sciences and Technology Program, Cambridge, MA, USA
- Jingyi Cao
- Gene Lay Institute of Immunology and Inflammation, Brigham and Women's Hospital, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- William L Hwang
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
3
Chen X, Ran Q, Tang J, Chen Z, Huang S, Shi X, Xi R. Benchmarking algorithms for spatially variable gene identification in spatial transcriptomics. Bioinformatics 2025; 41:btaf131. [PMID: 40139667 PMCID: PMC12036962 DOI: 10.1093/bioinformatics/btaf131] [Received: 08/13/2024] [Revised: 03/15/2025] [Accepted: 03/27/2025] [Indexed: 03/29/2025]
Abstract
MOTIVATION The rapid development of spatial transcriptomics has underscored the importance of identifying spatially variable genes. As a fundamental task in spatial transcriptomic data analysis, spatially variable gene identification has been extensively studied. However, the lack of a comprehensive benchmark makes it difficult to validate the effectiveness of various algorithms scattered across a large number of studies with real-world datasets. RESULTS In response, this article proposes a benchmark framework to evaluate algorithms for identifying spatially variable genes through the analysis of 30 synthesized and 74 real-world datasets, aiming to identify the best algorithms and their corresponding application scenarios. This framework can assist medical and life scientists in selecting suitable algorithms for their research, while also aiding bioinformatics scientists in developing more powerful and efficient computational methods in spatial transcriptomic research. AVAILABILITY AND IMPLEMENTATION The source code of this benchmarking framework is available at both GitHub (https://github.com/XiDsLab/svg-benchmark) and Zenodo (https://doi.org/10.5281/zenodo.15031083). In addition, all real and synthetic datasets considered in this study are also publicly available at Zenodo (https://doi.org/10.5281/zenodo.7227771).
Affiliation(s)
- Xuanwei Chen
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- Qinghua Ran
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Junjie Tang
- Center for Statistical Science, Peking University, Beijing 100871, China
- Zihao Chen
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- Siyuan Huang
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Xingjie Shi
- KLATASDS-MOE, Academy of Statistics and Interdisciplinary Sciences, School of Statistics, East China Normal University, Shanghai 200062, China
- Ruibin Xi
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Center for Statistical Science, Peking University, Beijing 100871, China
4
Liu C, Li X, Hu Q, Jia Z, Ye Q, Wang X, Zhao K, Liu L, Wang M. Decoding the blueprints of embryo development with single-cell and spatial omics. Semin Cell Dev Biol 2025; 167:22-39. [PMID: 39889540 DOI: 10.1016/j.semcdb.2025.01.002] [Received: 09/19/2023] [Revised: 01/18/2025] [Accepted: 01/18/2025] [Indexed: 02/03/2025]
Abstract
Embryonic development is a complex and intricately regulated process that encompasses precise control over cell differentiation, morphogenesis, and the underlying gene expression changes. Recent years have witnessed a remarkable acceleration in the development of single-cell and spatial omic technologies, enabling high-throughput profiling of transcriptomic and other multi-omic information at the individual cell level. These innovations offer fresh and multifaceted perspectives for investigating the intricate cellular and molecular mechanisms that govern embryonic development. In this review, we provide an in-depth exploration of the latest technical advancements in single-cell and spatial multi-omic methodologies and compile a systematic catalog of their applications in the field of embryonic development. We deconstruct the research strategies employed by recent studies that leverage single-cell sequencing techniques and underscore the unique advantages of spatial transcriptomics. Furthermore, we delve into the current applications, data analysis algorithms, and untapped potential of these technologies in advancing our understanding of embryonic development. With the continuous evolution of multi-omic technologies, we anticipate their widespread adoption and profound contributions to unraveling the intricate molecular foundations underpinning embryo development in the foreseeable future.
Affiliation(s)
- Chang Liu
- BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China; Shanxi Medical University-BGI Collaborative Center for Future Medicine, Shanxi Medical University, Taiyuan 030001, China; Shenzhen Proof-of-Concept Center of Digital Cytopathology, BGI Research, Shenzhen 518083, China
- Qinan Hu
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518005, China; Department of Pharmacology, School of Medicine, Southern University of Science and Technology, Shenzhen 518005, China
- Zihan Jia
- BGI Research, Hangzhou 310030, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Qing Ye
- BGI Research, Hangzhou 310030, China; China Jiliang University, Hangzhou 310018, China
- Kaichen Zhao
- College of Biomedicine and Health, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Longqi Liu
- BGI Research, Hangzhou 310030, China; Shanxi Medical University-BGI Collaborative Center for Future Medicine, Shanxi Medical University, Taiyuan 030001, China.
- Mingyue Wang
- BGI Research, Hangzhou 310030, China; Key Laboratory of Spatial Omics of Zhejiang Province, BGI Research, Hangzhou 310030, China.
5
Sun P, Bush SJ, Wang S, Jia P, Li M, Xu T, Zhang P, Yang X, Wang C, Xu L, Wang T, Ye K. STMiner: Gene-centric spatial transcriptomics for deciphering tumor tissues. Cell Genomics 2025; 5:100771. [PMID: 39947134 PMCID: PMC11872602 DOI: 10.1016/j.xgen.2025.100771] [Received: 09/14/2024] [Revised: 12/09/2024] [Accepted: 01/17/2025] [Indexed: 03/05/2025]
Abstract
Analyzing spatial transcriptomics data from tumor tissues poses several challenges beyond those of healthy samples, including unclear boundaries between different regions, uneven cell densities, and relatively higher cellular heterogeneity. Collectively, these bias the background against which spatially variable genes are identified, which can result in misidentification of spatial structures and hinder potential insight into complex pathologies. To overcome this problem, STMiner leverages 2D Gaussian mixture models and optimal transport theory to directly characterize the spatial distribution of genes rather than the capture locations of the cells expressing them (spots). By effectively mitigating the impacts of both background bias and data sparsity, STMiner reveals key gene sets and spatial structures overlooked by spot-based analytic tools, facilitating novel biological discoveries. The core concept of directly analyzing overall gene expression patterns also allows for a broader application beyond spatial transcriptomics, positioning STMiner for continuous expansion as spatial omics technologies evolve.
Affiliation(s)
- Peisen Sun
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Songbo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Peng Jia
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; Department of Gynecology and Obstetrics, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Mingxuan Li
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Tun Xu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Pengyu Zhang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Xiaofei Yang
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Chengyao Wang
- Department of Endocrinology, Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Linfeng Xu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Tingjie Wang
- The Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou, China
- Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China; Department of Gynecology and Obstetrics, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China; Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China; Faculty of Science, Leiden University, Leiden, the Netherlands.
6
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Science China Life Sciences 2025; 68:5-102. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
- Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
- Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
7
Chen X, Li K, Wu X, Li Z, Jiang Q, Cui X, Gao Z, Wu Y, Jiang R. Descart: a method for detecting spatial chromatin accessibility patterns with inter-cellular correlations. Genome Biol 2024; 25:322. [PMID: 39736655 PMCID: PMC11686967 DOI: 10.1186/s13059-024-03458-6] [Received: 03/25/2024] [Accepted: 12/09/2024] [Indexed: 01/01/2025]
Abstract
Spatial epigenomic technologies enable simultaneous capture of spatial location and chromatin accessibility of cells within tissue slices. Identifying peaks that display spatial variation and cellular heterogeneity is the key analytic task for characterizing the spatial chromatin accessibility landscape of complex tissues. Here, we propose an efficient and iterative model, Descart, for spatially variable peak identification based on the graph of inter-cellular correlations. Through comprehensive benchmarking, we demonstrate the superiority of Descart in revealing cellular heterogeneity and capturing tissue structure. Utilizing the graph of inter-cellular correlations, Descart shows its potential to denoise data, identify peak modules, and detect gene-peak interactions.
Affiliation(s)
- Xiaoyang Chen
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Keyi Li
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xiaoqing Wu
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zhen Li
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Qun Jiang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Xuejian Cui
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zijing Gao
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Yanhong Wu
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China
- Rui Jiang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China.
8
Das Adhikari S, Yang J, Wang J, Cui Y. Recent advances in spatially variable gene detection in spatial transcriptomics. Comput Struct Biotechnol J 2024; 23:883-891. [PMID: 38370977 PMCID: PMC10869304 DOI: 10.1016/j.csbj.2024.01.016] [Received: 11/27/2023] [Revised: 01/22/2024] [Accepted: 01/22/2024] [Indexed: 02/20/2024]
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
9
Zhang T, Sun H, Wu Z, Zhao Z, Zhao X, Zhang H, Gao B, Wang G. GAADE: identification spatially variable genes based on adaptive graph attention network. Brief Bioinform 2024; 26:bbae669. [PMID: 39701602 DOI: 10.1093/bib/bbae669] [Received: 09/03/2024] [Revised: 11/16/2024] [Accepted: 12/06/2024] [Indexed: 12/21/2024]
Abstract
The rapid advancement of spatial transcriptomics (ST) sequencing technology has made it possible to capture gene expression with spatial coordinate information at the cellular level. Although many methods in ST data analysis can detect spatially variable genes (SVGs), these methods often fail to identify genes with explicit spatial expression patterns due to the lack of consideration for spatial domains. Considering spatial domains is crucial for identifying SVGs as it focuses the analysis of gene expression changes on biologically relevant regions, aiding in the more accurate identification of SVGs associated with specific cell types. Existing methods for identifying SVGs based on spatial domains predefine spot similarity before training, which prevents adaptive learning and limits generalizability across different tissues or samples. This limitation may also lead to inaccurate identification of specific genes at boundary regions. To address these issues, we present GAADE, an unsupervised neural network architecture based on graph-structured data representation learning. GAADE stacks encoder/decoder layers and integrates a self-attention mechanism to reconstruct node attributes and graph structure, effectively capturing spatial domain structures of different sections. Consequently, we confine the identification of SVGs within spatial domains. By performing differential expression analysis on spots within the target spatial domain and their multi-order neighbors, GAADE detects genes with enriched expression patterns within defined domains. Comparative evaluations with five other popular methods on ST datasets across four different species, regions and tissues demonstrate that GAADE exhibits superior performance in detecting SVGs and capturing the extent of spatial gene expression variation.
Affiliation(s)
- Tianjiao Zhang
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Hao Sun
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Zhenao Wu
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Zhongqian Zhao
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Xingjie Zhao
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Hongfei Zhang
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
| | - Bo Gao
- Department of Radiology, The Second Affiliated Hospital of Harbin Medical University, No. 246 Xuefu Road, Nangang District, Harbin 150081, China
| | - Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Faculty of Computing, Harbin Institute of Technology, No. 92 West Da Zhi Street, Nangang District, Harbin 150001, China
| |
|
10
|
Yan G, Hua SH, Li JJ. Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. ARXIV 2024:arXiv:2405.18779v4. [PMID: 38855546 PMCID: PMC11160866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
Affiliation(s)
- Guanao Yan
- Department of Statistics, University of California, Los Angeles, CA 90095-1554
| | - Shuo Harper Hua
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
- Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
- Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
- Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA 02138
| |
|
11
|
Chang Y, Liu J, Jiang Y, Ma A, Yeo YY, Guo Q, McNutt M, Krull JE, Rodig SJ, Barouch DH, Nolan GP, Xu D, Jiang S, Li Z, Liu B, Ma Q. Graph Fourier transform for spatial omics representation and analyses of complex organs. Nat Commun 2024; 15:7467. [PMID: 39209833 PMCID: PMC11362340 DOI: 10.1038/s41467-024-51590-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 08/08/2024] [Indexed: 09/04/2024] Open
Abstract
Spatial omics technologies decipher functional components of complex organs at cellular and subcellular resolutions. We introduce Spatial Graph Fourier Transform (SpaGFT) and apply graph signal processing to a wide range of spatial omics profiling platforms to generate their interpretable representations. This representation supports spatially variable gene identification and improves gene expression imputation, outperforming existing tools in analyzing human and mouse spatial transcriptomics data. SpaGFT can identify immunological regions for B cell maturation in human lymph nodes Visium data and characterize variations in secondary follicles using in-house human tonsil CODEX data. Furthermore, it can be integrated seamlessly into other machine learning frameworks, enhancing accuracy in spatial domain identification, cell type annotation, and subcellular feature inference by up to 40%. Notably, SpaGFT detects rare subcellular organelles, such as Cajal bodies and Set1/COMPASS complexes, in high-resolution spatial proteomics data. This approach provides an explainable graph representation method for exploring tissue biology and function.
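The graph signal processing idea behind this abstract can be illustrated with a generic graph Fourier transform: treat a gene's expression across spots as a signal on a spatial neighbor graph and project it onto the eigenvectors of the graph Laplacian. The sketch below is a textbook construction under stated assumptions (unnormalized Laplacian, dense eigendecomposition, hypothetical function names); SpaGFT's own graph construction and scoring may differ.

```python
import numpy as np

def graph_fourier_transform(adjacency, signal):
    """Project a per-spot signal onto the Laplacian eigenbasis of a graph.

    adjacency: symmetric (n_spots, n_spots) weight matrix.
    signal: length n_spots vector, e.g. one gene's expression per spot.
    Returns eigenvalues (graph frequencies, ascending), eigenvectors
    (Fourier modes as columns), and the Fourier coefficients.
    """
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a  # unnormalized Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    coeffs = eigvecs.T @ np.asarray(signal, dtype=float)
    return eigvals, eigvecs, coeffs

def low_frequency_energy(coeffs, n_low):
    """Fraction of signal energy carried by the n_low lowest frequencies."""
    energy = coeffs ** 2
    total = energy.sum()
    return float(energy[:n_low].sum() / total) if total > 0 else 0.0
```

A signal that varies smoothly over the tissue concentrates its energy in the low-frequency coefficients, while a spatially noisy signal spreads energy into high frequencies, which is the intuition that makes such a representation usable for ranking spatially variable genes.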
Affiliation(s)
- Yuzhou Chang
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
- Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, USA
| | - Jixin Liu
- School of Mathematics, Shandong University, 250100, Jinan, China
| | - Yi Jiang
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
| | - Anjun Ma
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
- Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, USA
| | - Yao Yu Yeo
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA, 02115, USA
- Program in Virology, Division of Medical Sciences, Harvard Medical School, Boston, MA, 02115, USA
| | - Qi Guo
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
| | - Megan McNutt
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
| | - Jordan E Krull
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
- Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, USA
| | - Scott J Rodig
- Department of Pathology, Dana Farber Cancer Institute, Boston, MA, 02115, USA
- Department of Pathology, Brigham & Women's Hospital, Boston, MA, 02115, USA
| | - Dan H Barouch
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA, 02115, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, 02139, USA
| | - Garry P Nolan
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Dong Xu
- Department of Electrical Engineering and Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
| | - Sizun Jiang
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA, 02115, USA
- Program in Virology, Division of Medical Sciences, Harvard Medical School, Boston, MA, 02115, USA
- Department of Pathology, Dana Farber Cancer Institute, Boston, MA, 02115, USA
| | - Zihai Li
- Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, USA
| | - Bingqiang Liu
- School of Mathematics, Shandong University, 250100, Jinan, China.
| | - Qin Ma
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA.
- Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, USA.
| |
|
12
|
Dezem FS, Arjumand W, DuBose H, Morosini NS, Plummer J. Spatially Resolved Single-Cell Omics: Methods, Challenges, and Future Perspectives. Annu Rev Biomed Data Sci 2024; 7:131-153. [PMID: 38768396 DOI: 10.1146/annurev-biodatasci-102523-103640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Overlaying omics data onto spatial biological dimensions has been a promising technology to provide high-resolution insights into the interactome and cellular heterogeneity relative to the organization of the molecular microenvironment of tissue samples in normal and disease states. Spatial omics can be categorized into three major modalities: (a) next-generation sequencing-based assays, (b) imaging-based spatially resolved transcriptomics approaches including in situ hybridization/in situ sequencing, and (c) imaging-based spatial proteomics. These modalities allow assessment of transcripts and proteins at a cellular level, generating large and computationally challenging datasets. The lack of standardized computational pipelines to analyze and integrate these nonuniform structured data has made it necessary to apply artificial intelligence and machine learning strategies to best visualize and translate their complexity. In this review, we summarize the currently available techniques and computational strategies, highlight their advantages and limitations, and discuss their future prospects in the scientific field.
Affiliation(s)
- Felipe Segato Dezem
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA;
| | - Wani Arjumand
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA;
| | - Hannah DuBose
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA;
| | - Natalia Silva Morosini
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA;
| | - Jasmine Plummer
- Department of Cellular and Molecular Biology and Comprehensive Cancer Center, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Center for Spatial Omics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA;
| |
|
13
|
Yuan X, Ma Y, Gao R, Cui S, Wang Y, Fa B, Ma S, Wei T, Ma S, Yu Z. HEARTSVG: a fast and accurate method for identifying spatially variable genes in large-scale spatial transcriptomics. Nat Commun 2024; 15:5700. [PMID: 38972896 PMCID: PMC11228050 DOI: 10.1038/s41467-024-49846-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 06/19/2024] [Indexed: 07/09/2024] Open
Abstract
Identifying spatially variable genes (SVGs) is crucial for understanding the spatiotemporal characteristics of diseases and tissue structures, posing a distinctive challenge in spatial transcriptomics research. We propose HEARTSVG, a distribution-free, test-based method for fast and accurate identification of spatially variable genes in large-scale spatial transcriptomic data. Extensive simulations demonstrate that HEARTSVG outperforms state-of-the-art methods with higher F1 scores (average F1 score = 0.948), improved computational efficiency, scalability, and reduced false positives (FPs). Through analysis of twelve real datasets from various spatial transcriptomic technologies, HEARTSVG identifies a greater number of biologically significant SVGs (average AUC = 0.792) than other comparative methods without prespecifying spatial patterns. Furthermore, by clustering SVGs, we uncover two distinct tumor spatial domains characterized by unique spatial expression patterns, spatial-temporal locations, and biological functions in human colorectal cancer data, unraveling the complexity of tumors.
Affiliation(s)
- Xin Yuan
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
| | - Yanran Ma
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Ruitian Gao
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Shuya Cui
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
| | - Yifan Wang
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Botao Fa
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Shiyang Ma
- Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Ting Wei
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Shuangge Ma
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China.
- Department of Biostatistics, Yale University, New Haven, USA.
| | - Zhangsheng Yu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China.
- Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Center for Biomedical Data Science, Translational Science Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
| |
|
14
|
Ali M, Kuijs M, Hediyeh-zadeh S, Treis T, Hrovatin K, Palla G, Schaar AC, Theis FJ. GraphCompass: spatial metrics for differential analyses of cell organization across conditions. Bioinformatics 2024; 40:i548-i557. [PMID: 38940138 PMCID: PMC11256915 DOI: 10.1093/bioinformatics/btae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
Summary: Spatial omics technologies are increasingly leveraged to characterize how disease disrupts tissue organization and cellular niches. While multiple methods to analyze spatial variation within a sample have been published, statistical and computational approaches to compare cell spatial organization across samples or conditions are mostly lacking. We present GraphCompass, a comprehensive set of omics-adapted graph analysis methods to quantitatively evaluate and compare the spatial arrangement of cells in samples representing diverse biological conditions. GraphCompass builds upon the Squidpy spatial omics toolbox and encompasses various statistical approaches to perform cross-condition analyses at the level of individual cell types, niches, and samples. Additionally, GraphCompass provides custom visualization functions that enable effective communication of results. We demonstrate how GraphCompass can be used to address key biological questions, such as how cellular organization and tissue architecture differ across various disease states and which spatial patterns correlate with a given pathological condition. GraphCompass can be applied to various popular omics techniques, including, but not limited to, spatial proteomics (e.g. MIBI-TOF), spot-based transcriptomics (e.g. 10× Genomics Visium), and single-cell resolved transcriptomics (e.g. Stereo-seq). In this work, we showcase the capabilities of GraphCompass through its application to three different studies that may also serve as benchmark datasets for further method development. With its easy-to-use implementation, extensive documentation, and comprehensive tutorials, GraphCompass is accessible to biologists with varying levels of computational expertise. By facilitating comparative analyses of cell spatial organization, GraphCompass promises to be a valuable asset in advancing our understanding of tissue function in health and disease.
Affiliation(s)
- Mayar Ali
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- Institute for Tissue Engineering and Regenerative Medicine, Helmholtz Munich, Neuherberg, 85764, Germany
- Graduate School of Systemic Neurosciences, Ludwig Maximilian University of Munich, Planegg-Martinsried, 82152, Germany
| | - Merel Kuijs
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- Department of Mathematics, TUM School of Computation, Information and Technology, Technical University of Munich, Munich, 80333, Germany
| | - Soroor Hediyeh-zadeh
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- TUM School of Life Sciences, Technical University of Munich, Freising, 85354, Germany
| | - Tim Treis
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- TUM School of Life Sciences, Technical University of Munich, Freising, 85354, Germany
| | - Karin Hrovatin
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- TUM School of Life Sciences, Technical University of Munich, Freising, 85354, Germany
| | - Giovanni Palla
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- TUM School of Life Sciences, Technical University of Munich, Freising, 85354, Germany
| | - Anna C Schaar
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- Department of Mathematics, TUM School of Computation, Information and Technology, Technical University of Munich, Munich, 80333, Germany
- Munich Center for Machine Learning, Technical University of Munich, Munich, 80333, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Munich, Neuherberg, 85764, Germany
- Department of Mathematics, TUM School of Computation, Information and Technology, Technical University of Munich, Munich, 80333, Germany
- TUM School of Life Sciences, Technical University of Munich, Freising, 85354, Germany
| |
|
15
|
Yu S, Li WV. spVC for the detection and interpretation of spatial gene expression variation. Genome Biol 2024; 25:103. [PMID: 38641849 PMCID: PMC11027374 DOI: 10.1186/s13059-024-03245-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Accepted: 04/10/2024] [Indexed: 04/21/2024] Open
Abstract
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Affiliation(s)
- Shan Yu
- Department of Statistics, University of Virginia, Charlottesville, 22903, VA, USA.
| | - Wei Vivian Li
- Department of Statistics, University of California, Riverside, 92521, CA, USA.
| |
|
16
|
Li R, Chen X, Yang X. Navigating the landscapes of spatial transcriptomics: How computational methods guide the way. WILEY INTERDISCIPLINARY REVIEWS. RNA 2024; 15:e1839. [PMID: 38527900 DOI: 10.1002/wrna.1839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 02/24/2024] [Accepted: 03/04/2024] [Indexed: 03/27/2024]
Abstract
Spatially resolved transcriptomics has been dramatically transforming biological and medical research in various fields. It enables transcriptome profiling at single-cell, multi-cellular, or sub-cellular resolution, while retaining the information of geometric localizations of cells in complex tissues. The coupling of cell spatial information and its molecular characteristics generates a novel multi-modal high-throughput data source, which poses new challenges for the development of analytical methods for data-mining. Spatial transcriptomic data are often highly complex, noisy, and biased, presenting a series of difficulties, many unresolved, for data analysis and generation of biological insights. In addition, to keep pace with the ever-evolving spatial transcriptomic experimental technologies, the existing analytical theories and tools need to be updated and reformed accordingly. In this review, we provide an overview and discussion of the current computational approaches for mining of spatial transcriptomics data. Future directions and perspectives of methodology design are proposed to stimulate further discussions and advances in new analytical models and algorithms. This article is categorized under: RNA Methods > RNA Analyses in Cells; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Export and Localization > RNA Localization.
Affiliation(s)
- Runze Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xu Chen
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xuerui Yang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| |
|
17
|
Zhang C, Gao J, Chen HY, Kong L, Cao G, Guo X, Liu W, Ren B, Wei DQ. STGIC: A graph and image convolution-based method for spatial transcriptomic clustering. PLoS Comput Biol 2024; 20:e1011935. [PMID: 38416785 PMCID: PMC10927115 DOI: 10.1371/journal.pcbi.1011935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024] Open
Abstract
Spatial transcriptomic (ST) clustering employs spatial and transcriptional information to group spots that are spatially coherent and transcriptionally similar into the same spatial domain. Graph convolution networks (GCN) and graph attention networks (GAT), fed with an adjacency matrix derived from spatial coordinates and a feature matrix derived from transcription profiles, are often used to solve the problem. Our proposed method, STGIC (spatial transcriptomic clustering with graph and image convolution), is designed for techniques with regular lattices on chips. It uses adaptive graph convolution (AGC) to obtain high-quality pseudo-labels and then applies a dilated convolution framework (DCF) to a virtual image converted from the gene expression information and spatial coordinates of spots. The dilation rates and kernel sizes are set appropriately, and the updating of kernel weights is made subject to the spatial distance from each element's position to the kernel center, so that feature extraction for each spot is better guided by spatial distance to neighboring spots. The training objective of DCF combines self-supervision realized by Kullback-Leibler (KL) divergence, a spatial continuity loss, and cross-entropy calculated among spots with high-confidence pseudo-labels. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark dataset of 10x Visium human dorsolateral prefrontal cortex (DLPFC). It is also capable of depicting fine structures of other tissues from other species and of guiding the identification of marker genes. Finally, STGIC is extensible to Stereo-seq data with high spatial resolution.
Affiliation(s)
- Chen Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Junhui Gao
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Hong-Yu Chen
- College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Lingxin Kong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Guangshuo Cao
- State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang
| | - Xiangyu Guo
- Smart-Health Initiative, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Wei Liu
- Marine Science and Technology College, Zhejiang Ocean University, Zhoushan, China
| | - Bin Ren
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| |
|
18
|
Zahedi R, Ghamsari R, Argha A, Macphillamy C, Beheshti A, Alizadehsani R, Lovell NH, Lotfollahi M, Alinejad-Rokny H. Deep learning in spatially resolved transcriptomics: a comprehensive technical view. Brief Bioinform 2024; 25:bbae082. [PMID: 38483255 PMCID: PMC10939360 DOI: 10.1093/bib/bbae082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/22/2024] [Accepted: 02/13/2024] [Indexed: 03/17/2024] Open
Abstract
Spatially resolved transcriptomics (SRT) is a pioneering method for simultaneously studying morphological contexts and gene expression at single-cell precision. Data emerging from SRT are multifaceted, presenting researchers with intricate gene expression matrices, precise spatial details and comprehensive histology visuals. Such rich and intricate datasets, unfortunately, render many conventional methods like traditional machine learning and statistical models ineffective. The unique challenges posed by the specialized nature of SRT data have led the scientific community to explore more sophisticated analytical avenues. Recent trends indicate an increasing reliance on deep learning algorithms, especially in areas such as spatial clustering, identification of spatially variable genes and data alignment tasks. In this manuscript, we provide a rigorous critique of these advanced deep learning methodologies, probing into their merits, limitations and avenues for further refinement. Our in-depth analysis underscores that while the recent innovations in deep learning tailored for SRT have been promising, there remains a substantial potential for enhancement. A crucial area that demands attention is the development of models that can incorporate intricate biological nuances, such as phylogeny-aware processing or in-depth analysis of minuscule histology image segments. Furthermore, addressing challenges like the elimination of batch effects, perfecting data normalization techniques and countering the overdispersion and zero inflation patterns seen in gene expression is pivotal. To support the broader scientific community in their SRT endeavors, we have meticulously assembled a comprehensive directory of readily accessible SRT databases, hoping to serve as a foundation for future research initiatives.
Affiliation(s)
- Roxana Zahedi
- UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
| | - Reza Ghamsari
- UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
| | - Ahmadreza Argha
- The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
| | - Callum Macphillamy
- School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, 5371, Australia
| | - Amin Beheshti
- School of Computing, Macquarie University, Sydney, 2109, Australia
| | - Roohallah Alizadehsani
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Melbourne, VIC, 3216, Australia
| | - Nigel H Lovell
- The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
| | - Mohammad Lotfollahi
- Computational Health Center, Helmholtz Munich, Germany
- Wellcome Sanger Institute, Cambridge, UK
| | - Hamid Alinejad-Rokny
- UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
| |
|
19
|
Liang Y, Shi G, Cai R, Yuan Y, Xie Z, Yu L, Huang Y, Shi Q, Wang L, Li J, Tang Z. PROST: quantitative identification of spatially variable genes and domain detection in spatial transcriptomics. Nat Commun 2024; 15:600. [PMID: 38238417 PMCID: PMC10796707 DOI: 10.1038/s41467-024-44835-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 04/04/2023] [Accepted: 12/19/2023] [Indexed: 01/22/2024] Open
Abstract
Computational methods have been proposed to leverage spatially resolved transcriptomic data, pinpointing genes with spatial expression patterns and delineating tissue domains. However, existing approaches fall short in uniformly quantifying spatially variable genes (SVGs). Moreover, from a methodological viewpoint, while SVGs are naturally associated with depicting spatial domains, they are technically dissociated in most methods. Here, we present a framework (PROST) for the quantitative recognition of spatial transcriptomic patterns, consisting of (i) quantitatively characterizing spatial variations in gene expression patterns through the PROST Index; and (ii) unsupervised clustering of spatial domains via a self-attention mechanism. We demonstrate that PROST performs superior SVG identification and domain segmentation with various spatial resolutions, from multicellular to cellular levels. Importantly, PROST Index can be applied to prioritize spatial expression variations, facilitating the exploration of biological insights. Together, our study provides a flexible and robust framework for analyzing diverse spatial transcriptomic data.
Affiliation(s)
- Yuchen Liang
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
| | - Guowei Shi
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
| | - Runlin Cai
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
| | - Yuchen Yuan
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
| | - Ziying Xie
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
| | - Long Yu
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
| | - Yingjian Huang
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
| | - Qian Shi
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
| | - Lizhe Wang
- School of Computer Science, China University of Geosciences, Wuhan, 430078, China
| | - Jun Li
- School of Computer Science, China University of Geosciences, Wuhan, 430078, China.
| | - Zhonghui Tang
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China.
| |
|
20
|
Li Z, Patel ZM, Song D, Yan G, Li JJ, Pinello L. Benchmarking computational methods to identify spatially variable genes and peaks. bioRxiv 2023:2023.12.02.569717. [PMID: 38076922 PMCID: PMC10705556 DOI: 10.1101/2023.12.02.569717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2023]
Abstract
Spatially resolved transcriptomics offers unprecedented insight by enabling the profiling of gene expression within the intact spatial context of cells, effectively adding a new and essential dimension to data interpretation. To efficiently detect spatial structure of interest, an essential step in analyzing such data involves identifying spatially variable genes. Despite researchers having developed several computational methods to accomplish this task, the lack of a comprehensive benchmark evaluating their performance remains a considerable gap in the field. Here, we present a systematic evaluation of 14 methods using 60 simulated datasets generated by four different simulation strategies, 12 real-world transcriptomics, and three spatial ATAC-seq datasets. We find that spatialDE2 consistently outperforms the other benchmarked methods, and Moran's I achieves competitive performance in different experimental settings. Moreover, our results reveal that more specialized algorithms are needed to identify spatially variable peaks.
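Moran's I, which the abstract above reports as competitive, can be illustrated with a minimal sketch. This is not the benchmark's code: the binary k-nearest-neighbour weights, the function names `knn_weights` and `morans_i`, and the toy 10 x 10 grid are all assumptions made for illustration.

```python
import numpy as np


def knn_weights(coords, k=4):
    """Binary spatial weights: w[i, j] = 1 if j is among the k nearest spots to i."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)      # pairwise squared distances
    np.fill_diagonal(d2, np.inf)         # a spot is never its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest spots per row
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I: (n / sum(w)) * (z' W z) / (z' z), with z the centred values."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    denom = np.sum(z ** 2)
    if denom == 0:
        return 0.0                       # constant expression carries no spatial signal
    return (x.size / weights.sum()) * (z @ weights @ z) / denom


# Toy data (hypothetical): a 10 x 10 grid of spots.
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
w = knn_weights(coords, k=4)

gradient = coords[:, 0]                  # expression rises smoothly from left to right
shuffled = np.random.default_rng(0).permutation(gradient)  # same values, no spatial structure

i_gradient = morans_i(gradient, w)       # strongly positive for the smooth pattern
i_shuffled = morans_i(shuffled, w)       # close to zero for the shuffled pattern
print(i_gradient, i_shuffled)
```

In practice a permutation test or an analytical null would be needed to turn the statistic into a per-gene p-value; the sketch only shows the statistic itself.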
Affiliation(s)
- Zhijian Li
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Zain M. Patel
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Dongyuan Song
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA, USA
| | - Guanao Yan
- Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
| | - Jingyi Jessica Li
- Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
| | - Luca Pinello
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| |
|
21
|
Adhikari SD, Yang J, Wang J, Cui Y. A selective review of recent developments in spatially variable gene detection for spatial transcriptomics. arXiv 2023:arXiv:2311.13801v1. [PMID: 38045476 PMCID: PMC10690303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
| |
|
22
|
Seal S, Bitler BG, Ghosh D. SMASH: Scalable Method for Analyzing Spatial Heterogeneity of genes in spatial transcriptomics data. PLoS Genet 2023; 19:e1010983. [PMID: 37862362 PMCID: PMC10619839 DOI: 10.1371/journal.pgen.1010983] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/06/2023] [Revised: 11/01/2023] [Accepted: 09/19/2023] [Indexed: 10/22/2023] Open
Abstract
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can be crucial to the biological understanding of both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms uncovering interesting biological insights.
Affiliation(s)
- Souvik Seal
- Department of Public Health Sciences, School of Medicine, Medical University of South Carolina, Charleston, South Carolina, United States of America
| | - Benjamin G. Bitler
- Department of Obstetrics and Gynecology, School of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
| | - Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
| |
|
23
|
Jones DC, Danaher P, Kim Y, Beechem JM, Gottardo R, Newell EW. An information theoretic approach to detecting spatially varying genes. Cell Reports Methods 2023; 3:100507. [PMID: 37426750 PMCID: PMC10326450 DOI: 10.1016/j.crmeth.2023.100507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 04/03/2023] [Accepted: 05/25/2023] [Indexed: 07/11/2023]
Abstract
A key step in spatial transcriptomics is identifying genes with spatially varying expression patterns. We adopt an information theoretic perspective to this problem by equating the degree of spatial coherence with the Jensen-Shannon divergence between pairs of nearby cells and pairs of distant cells. To avoid the notoriously difficult problem of estimating information theoretic divergences, we use modern approximation techniques to implement a computationally efficient algorithm designed to scale with in situ spatial transcriptomics technologies. In addition to being highly scalable, we show that our method, which we call maximization of spatial information (Maxspin), improves accuracy across several spatial transcriptomics platforms and a variety of simulations when compared with a variety of state-of-the-art methods. To further demonstrate the method, we generated in situ spatial transcriptomics data in a renal cell carcinoma sample using the CosMx Spatial Molecular Imager and used Maxspin to reveal novel spatial patterns of tumor cell gene expression.
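The idea in the abstract above, comparing pairs of nearby cells with pairs of distant cells via the Jensen-Shannon divergence, can be sketched with a naive histogram estimate. This is not Maxspin: the abstract says the method uses modern approximation techniques precisely to avoid direct estimation of this kind. The function names, the fixed `radius`, the binning, and the toy grid are all assumptions made for illustration, and the sketch assumes the expression values are not constant and that at least one nearby pair exists.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                     # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def spatial_jsd(coords, values, radius, n_bins=8):
    """JSD between binned expression pairs of nearby cells and of distant cells."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = values.size
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    off_diag = ~np.eye(n, dtype=bool)
    near_i, near_j = np.nonzero((dist <= radius) & off_diag)
    far_i, far_j = np.nonzero(dist > radius)
    if near_i.size == 0 or far_i.size == 0:
        raise ValueError("radius leaves no nearby or no distant pairs")
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # 2D histograms over (expression of cell i, expression of cell j)
    near_hist, _, _ = np.histogram2d(values[near_i], values[near_j], bins=[edges, edges])
    far_hist, _, _ = np.histogram2d(values[far_i], values[far_j], bins=[edges, edges])
    return js_divergence(near_hist, far_hist)


# Toy data (hypothetical): a 10 x 10 grid of cells.
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

gradient = coords[:, 0]                  # spatially coherent expression
shuffled = np.random.default_rng(0).permutation(gradient)  # same values, no spatial structure

jsd_gradient = spatial_jsd(coords, gradient, radius=1.5)
jsd_shuffled = spatial_jsd(coords, shuffled, radius=1.5)
print(jsd_gradient, jsd_shuffled)        # the coherent pattern scores higher
```

The histogram estimate is biased upward with few pairs or many bins, which is one reason a direct approach like this would not scale to in situ data sets.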
Affiliation(s)
- Youngmi Kim
- NanoString Technologies, Inc., Seattle, WA, USA
| | - Raphael Gottardo
- Fred Hutchinson Cancer Center, Seattle, WA, USA
- Biomedical Data Science Center, Lausanne University Hospital, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Ludwig Institute for Cancer Research, Lausanne Branch, Lausanne, Switzerland
| |
|
24
|
Meng-Lin K, Ung CY, Zhang C, Weiskittel TM, Wisniewski P, Zhang Z, Tan SH, Yeo KS, Zhu S, Correia C, Li H. SPIN-AI: A Deep Learning Model That Identifies Spatially Predictive Genes. Biomolecules 2023; 13:895. [PMID: 37371475 PMCID: PMC10296445 DOI: 10.3390/biom13060895] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/27/2023] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023] Open
Abstract
Spatially resolved sequencing technologies help us dissect how cells are organized in space. Several available computational approaches focus on the identification of spatially variable genes (SVGs), genes whose expression patterns vary in space. The detection of SVGs is analogous to the identification of differentially expressed genes and permits us to understand how genes and associated molecular processes are spatially distributed within cellular niches. However, the expression activities of SVGs fail to encode all information inherent in the spatial distribution of cells. Here, we devised a deep learning model, Spatially Informed Artificial Intelligence (SPIN-AI), to identify spatially predictive genes (SPGs), whose expression can predict how cells are organized in space. We used SPIN-AI on spatial transcriptomic data from squamous cell carcinoma (SCC) as a proof of concept. Our results demonstrate that SPGs not only recapitulate the biology of SCC but also identify genes distinct from SVGs. Moreover, we found a substantial number of ribosomal genes that were SPGs but not SVGs. Since SPGs possess the capability to predict spatial cellular organization, we reason that SPGs capture more biologically relevant information for a given cellular niche than SVGs. Thus, SPIN-AI has broad applications for detecting SPGs and uncovering which biological processes play important roles in governing cellular organization.
Affiliation(s)
- Kevin Meng-Lin
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Choong-Yong Ung
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Cheng Zhang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Taylor M. Weiskittel
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Philip Wisniewski
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Zhuofei Zhang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Shyang-Hong Tan
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Kok-Siong Yeo
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine and Science, Rochester, MN 55905, USA
| | - Shizhen Zhu
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine and Science, Rochester, MN 55905, USA
| | - Cristina Correia
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| | - Hu Li
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
| |
|
25
|
Lee RY, Ng CW, Rajapakse MP, Ang N, Yeong JPS, Lau MC. The promise and challenge of spatial omics in dissecting tumour microenvironment and the role of AI. Front Oncol 2023; 13:1172314. [PMID: 37197415 PMCID: PMC10183599 DOI: 10.3389/fonc.2023.1172314] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/23/2023] [Accepted: 04/18/2023] [Indexed: 05/19/2023] Open
Abstract
Growing evidence supports the critical role of the tumour microenvironment (TME) in tumour progression, metastases, and treatment response. However, the in-situ interplay among various TME components, particularly between immune and tumour cells, is largely unknown, hindering our understanding of how tumours progress and respond to treatment. While mainstream single-cell omics techniques allow deep, single-cell phenotyping, they lack crucial spatial information for in-situ cell-cell interaction analysis. On the other hand, tissue-based approaches such as hematoxylin and eosin and chromogenic immunohistochemistry staining can preserve the spatial information of TME components but are limited by their low-content staining. High-content spatial profiling technologies, termed spatial omics, have greatly advanced in the past decades to overcome these limitations. These technologies continue to emerge to include more molecular features (RNAs and/or proteins) and to enhance spatial resolution, opening new opportunities for discovering novel biological knowledge, biomarkers, and therapeutic targets. These advancements also spur the need for novel computational methods to mine useful TME insights from the increasing data complexity confounded by high molecular features and spatial resolution. In this review, we present state-of-the-art spatial omics technologies, their applications, major strengths, and limitations as well as the role of artificial intelligence (AI) in TME studies.
Affiliation(s)
- Ren Yuan Lee
- Singapore Thong Chai Medical Institution, Singapore, Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Chan Way Ng
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Nicholas Ang
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| | - Joe Poh Sheng Yeong
- Department of Anatomical Pathology, Singapore General Hospital, Singapore, Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
| | - Mai Chan Lau
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
| |
|
26
|
Lee AJ, Cahill R, Abbasi-Asl R. Machine Learning for Uncovering Biological Insights in Spatial Transcriptomics Data. arXiv 2023:arXiv:2303.16725v1. [PMID: 37033464 PMCID: PMC10081350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2023]
Abstract
Development and homeostasis in multicellular systems both require exquisite control over spatial molecular pattern formation and maintenance. Advances in spatially-resolved and high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to rapid development of innovative machine learning (ML) tools primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of a rapidly expanding toolbox of analytical tools in ST. To address this, we summarize major ST analysis goals that ML can help address and current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in their choices of the right tools for the right biological questions.
|