1. Yan G, Hua SH, Li JJ. Categorization of 31 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. ARXIV 2024:arXiv:2405.18779v2. [PMID: 38855546 PMCID: PMC11160866] [Indexed: 06/11/2024]
Abstract
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 31 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
Affiliation(s)
- Guanao Yan
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Shuo Harper Hua
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
  - Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
  - Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
  - Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
  - Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA 02138

2. Yuan X, Ma Y, Gao R, Cui S, Wang Y, Fa B, Ma S, Wei T, Ma S, Yu Z. HEARTSVG: a fast and accurate method for identifying spatially variable genes in large-scale spatial transcriptomics. Nat Commun 2024; 15:5700. [PMID: 38972896 PMCID: PMC11228050 DOI: 10.1038/s41467-024-49846-1] [Received: 06/29/2023] [Accepted: 06/19/2024] [Indexed: 07/09/2024]
Abstract
Identifying spatially variable genes (SVGs) is crucial for understanding the spatiotemporal characteristics of diseases and tissue structures, posing a distinctive challenge in spatial transcriptomics research. We propose HEARTSVG, a distribution-free, test-based method for fast and accurately identifying spatially variable genes in large-scale spatial transcriptomic data. Extensive simulations demonstrate that HEARTSVG outperforms state-of-the-art methods with higher F1 scores (average F1 score = 0.948), improved computational efficiency, scalability, and reduced false positives (FPs). Through analysis of twelve real datasets from various spatial transcriptomic technologies, HEARTSVG identifies a greater number of biologically significant SVGs (average AUC = 0.792) than other comparative methods without prespecifying spatial patterns. Furthermore, by clustering SVGs, we uncover two distinct tumor spatial domains characterized by unique spatial expression patterns, spatial-temporal locations, and biological functions in human colorectal cancer data, unraveling the complexity of tumors.
Affiliation(s)
- Xin Yuan
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
- Yanran Ma
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Ruitian Gao
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Shuya Cui
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
- Yifan Wang
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Botao Fa
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xi'an Jiaotong University, Xi'an, Shanxi, China
- Shiyang Ma
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ting Wei
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Shuangge Ma
  - SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
  - Department of Biostatistics, Yale University, New Haven, USA
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
  - Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Center for Biomedical Data Science, Translational Science Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China

3. Zhuang H, Ji Z. PreTSA: computationally efficient modeling of temporal and spatial gene expression patterns. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.20.585926. [PMID: 38585819 PMCID: PMC10996487 DOI: 10.1101/2024.03.20.585926] [Indexed: 04/09/2024]
Abstract
Modeling temporal and spatial gene expression patterns in large-scale single-cell and spatial transcriptomics data is a computationally intensive task. We present PreTSA, a method that offers computational efficiency in modeling these patterns and is applicable to single-cell and spatial transcriptomics data comprising millions of cells. PreTSA consistently matches the results of state-of-the-art methods while significantly reducing computational time. PreTSA provides a unique solution for studying gene expression patterns in extremely large datasets.
Affiliation(s)
- Haotian Zhuang
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA
- Zhicheng Ji
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA

4. Ruitenberg MJ, Nguyen QH. Cellular neighborhood analysis in spatial omics reveals new tissue domains and cell subtypes. Nat Genet 2024; 56:362-364. [PMID: 38413724 DOI: 10.1038/s41588-023-01646-x] [Indexed: 02/29/2024]
Affiliation(s)
- Marc J Ruitenberg
  - School of Biomedical Science, Faculty of Medicine, The University of Queensland, Brisbane, Australia
- Quan H Nguyen
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
  - QIMR Berghofer Medical Research Institute, Brisbane, Australia

5. Kang H, Lee J. Adipose tissue macrophage heterogeneity in the single-cell genomics era. Mol Cells 2024; 47:100031. [PMID: 38354858 PMCID: PMC10960114 DOI: 10.1016/j.mocell.2024.100031] [Received: 01/16/2024] [Revised: 02/07/2024] [Accepted: 02/07/2024] [Indexed: 02/16/2024]
Abstract
It is now well-accepted that obesity-induced inflammation plays an important role in the development of insulin resistance and type 2 diabetes. A key source of the inflammation is the murine epididymal and human visceral adipose tissue. The current paradigm is that obesity activates multiple proinflammatory immune cell types in adipose tissue, including adipose-tissue macrophages (ATMs), T Helper 1 (Th1) T cells, and natural killer (NK) cells, while concomitantly suppressing anti-inflammatory immune cells such as T Helper 2 (Th2) T cells and regulatory T cells (Tregs). A key feature of the current paradigm is that obesity induces the anti-inflammatory M2 ATMs in lean adipose tissue to polarize into proinflammatory M1 ATMs. However, recent single-cell transcriptomics studies suggest that the story is much more complex. Here we describe the single-cell genomics technologies that have been developed recently and the emerging results from studies using these technologies. While further studies are needed, it is clear that ATMs are highly heterogeneous. Moreover, while a variety of ATM clusters with quite distinct features have been found to be expanded by obesity, none truly resemble classical M1 ATMs. It is likely that single-cell transcriptomics technology will further revolutionize the field, thereby promoting our understanding of ATMs, adipose-tissue inflammation, and insulin resistance and accelerating the development of therapies for type 2 diabetes.
Affiliation(s)
- Haneul Kang
  - Soonchunhyang Institute of Medi-Bio Science (SIMS) and Department of Integrated Biomedical Science, Soonchunhyang University, Cheonan-si, South Korea
- Jongsoon Lee
  - Soonchunhyang Institute of Medi-Bio Science (SIMS) and Department of Integrated Biomedical Science, Soonchunhyang University, Cheonan-si, South Korea

6. Zahedi R, Ghamsari R, Argha A, Macphillamy C, Beheshti A, Alizadehsani R, Lovell NH, Lotfollahi M, Alinejad-Rokny H. Deep learning in spatially resolved transcriptomics: a comprehensive technical view. Brief Bioinform 2024; 25:bbae082. [PMID: 38483255 PMCID: PMC10939360 DOI: 10.1093/bib/bbae082] [Received: 10/23/2023] [Revised: 12/22/2024] [Accepted: 02/13/2024] [Indexed: 03/17/2024]
Abstract
Spatially resolved transcriptomics (SRT) is a pioneering method for simultaneously studying morphological contexts and gene expression at single-cell precision. Data emerging from SRT are multifaceted, presenting researchers with intricate gene expression matrices, precise spatial details and comprehensive histology visuals. Such rich and intricate datasets, unfortunately, render many conventional methods like traditional machine learning and statistical models ineffective. The unique challenges posed by the specialized nature of SRT data have led the scientific community to explore more sophisticated analytical avenues. Recent trends indicate an increasing reliance on deep learning algorithms, especially in areas such as spatial clustering, identification of spatially variable genes and data alignment tasks. In this manuscript, we provide a rigorous critique of these advanced deep learning methodologies, probing into their merits, limitations and avenues for further refinement. Our in-depth analysis underscores that while the recent innovations in deep learning tailored for SRT have been promising, there remains a substantial potential for enhancement. A crucial area that demands attention is the development of models that can incorporate intricate biological nuances, such as phylogeny-aware processing or in-depth analysis of minuscule histology image segments. Furthermore, addressing challenges like the elimination of batch effects, perfecting data normalization techniques and countering the overdispersion and zero inflation patterns seen in gene expression is pivotal. To support the broader scientific community in their SRT endeavors, we have meticulously assembled a comprehensive directory of readily accessible SRT databases, hoping to serve as a foundation for future research initiatives.
Affiliation(s)
- Roxana Zahedi
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Reza Ghamsari
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Ahmadreza Argha
  - The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
- Callum Macphillamy
  - School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, 5371, Australia
- Amin Beheshti
  - School of Computing, Macquarie University, Sydney, 2109, Australia
- Roohallah Alizadehsani
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Melbourne, VIC, 3216, Australia
- Nigel H Lovell
  - The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
- Mohammad Lotfollahi
  - Computational Health Center, Helmholtz Munich, Germany
  - Wellcome Sanger Institute, Cambridge, UK
- Hamid Alinejad-Rokny
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia

7. Li Z, Patel ZM, Song D, Yan G, Li JJ, Pinello L. Benchmarking computational methods to identify spatially variable genes and peaks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.02.569717. [PMID: 38076922 PMCID: PMC10705556 DOI: 10.1101/2023.12.02.569717] [Indexed: 12/17/2023]
Abstract
Spatially resolved transcriptomics offers unprecedented insight by enabling the profiling of gene expression within the intact spatial context of cells, effectively adding a new and essential dimension to data interpretation. To efficiently detect spatial structure of interest, an essential step in analyzing such data involves identifying spatially variable genes. Despite researchers having developed several computational methods to accomplish this task, the lack of a comprehensive benchmark evaluating their performance remains a considerable gap in the field. Here, we present a systematic evaluation of 14 methods using 60 simulated datasets generated by four different simulation strategies, 12 real-world transcriptomics, and three spatial ATAC-seq datasets. We find that spatialDE2 consistently outperforms the other benchmarked methods, and Moran's I achieves competitive performance in different experimental settings. Moreover, our results reveal that more specialized algorithms are needed to identify spatially variable peaks.
Affiliation(s)
- Zhijian Li
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Zain M. Patel
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Dongyuan Song
  - Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA, USA
- Guanao Yan
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
- Jingyi Jessica Li
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
- Luca Pinello
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA