1
Yi M, Zhan T, Rui H, Chervoneva I. Functional protein biomarkers based on distributions of expression levels in single-cell imaging data. Bioinformatics 2025; 41:btaf182. [PMID: 40257750 PMCID: PMC12070390 DOI: 10.1093/bioinformatics/btaf182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Revised: 04/01/2025] [Accepted: 04/19/2025] [Indexed: 04/22/2025] [Open Access]
Abstract
MOTIVATION: The intra-tumor heterogeneity of protein expression is well recognized and may provide important information for cancer prognosis and for predicting treatment responses. Analytic methods that account for spatial heterogeneity remain methodologically complex and computationally demanding for single-cell protein expression. For many functional proteins, single-cell expressions vary independently of spatial localization in a substantial proportion of the tumor tissues, and incorporation of spatial information may not affect the prognostic value of such protein biomarkers.

RESULTS: We developed a new framework for using the distributions of functional single-cell protein expression levels as cancer biomarkers. The quantile functions of single-cell expressions are used to fully capture the heterogeneity of protein expression across all cancer cells. The quantile index (QI) biomarker is defined as an integral of an unspecified function which may depend linearly or nonlinearly on a tissue-specific quantile function. Linear and nonlinear versions of QI biomarkers based on single-cell expressions of ER, Ki67, TS, and CyclinD3 were derived and evaluated as predictors of progression-free survival or high mitotic index in a large breast cancer dataset. We evaluated performance and demonstrated the advantages of nonlinear QI biomarkers through simulation studies.

AVAILABILITY AND IMPLEMENTATION: The associated R package Qindex is available at https://CRAN.R-project.org/package=Qindex and the R package hyper.gam is available at https://github.com/tingtingzhan/hyper.gam. Examples of R code and detailed instructions can be found in the vignette quantile-index-predictor (https://CRAN.R-project.org/package=hyper.gam/vignettes/applications.html#quantile-index-predictor).
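The definition of the quantile index above can be illustrated with a minimal Python sketch. It is not the authors' implementation (their software is the R packages named in the abstract), and every name in it is hypothetical: `quantile_function`, `quantile_index`, and the two example integrands are chosen only for illustration. In the paper the integrand is unspecified and estimated from data; here it is simply supplied by hand, and the integral over the probability scale is approximated on a grid from 0.01 to 0.99 as an assumption of the sketch.

```python
import numpy as np


def quantile_function(expr, probs):
    """Empirical quantile function of one tissue's single-cell expression levels."""
    return np.quantile(np.asarray(expr, dtype=float), probs)


def quantile_index(expr, integrand, n_grid=99):
    """Approximate the integral of integrand(p, Q(p)) over p by the trapezoidal rule.

    Assumption of this sketch: the grid runs from 0.01 to 0.99 rather than
    over the full interval (0, 1), to stay away from the extreme quantiles.
    """
    probs = np.linspace(0.01, 0.99, n_grid)
    q = quantile_function(expr, probs)
    values = np.asarray(integrand(probs, q), dtype=float)
    # Trapezoidal rule written out by hand, so it does not depend on the NumPy version.
    return float(np.sum((values[1:] + values[:-1]) * np.diff(probs)) / 2.0)


# Hypothetical integrands, chosen by hand purely for illustration.
def linear_integrand(p, q):
    # Linear in Q(p): a weight function beta(p) = p times the quantile.
    return p * q


def nonlinear_integrand(p, q):
    # Nonlinear in Q(p): the same weight applied to a log-transformed quantile.
    return p * np.log1p(q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated single-cell expression levels for one tissue (not real data).
    cells = rng.gamma(shape=2.0, scale=1.5, size=5000)
    print("linear QI:   ", quantile_index(cells, linear_integrand))
    print("nonlinear QI:", quantile_index(cells, nonlinear_integrand))
```

Each tissue contributes one scalar per integrand, which could then enter a survival or classification model as an ordinary covariate. How the integrand is actually estimated is described in the paper and the package vignette, not in this sketch.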
Affiliation(s)
- Misung Yi
  - Department of Statistics & Data Science, College of Software and Convergence, Dankook University, Suji-gu, Gyeonggi-do 16890, Korea
- Tingting Zhan
  - Division of Biostatistics & Bioinformatics, Department of Pharmacology, Physiology & Cancer Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA 19107, United States
- Hallgeir Rui
  - Division of Cancer Biology, Department of Pharmacology, Physiology & Cancer Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA 19107, United States
- Inna Chervoneva
  - Division of Biostatistics & Bioinformatics, Department of Pharmacology, Physiology & Cancer Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA 19107, United States
2
Seal S, Neelon B, Angel PM, O’Quinn EC, Hill E, Vu T, Ghosh D, Mehta AS, Wallace K, Alekseyenko AV. SpaceANOVA: Spatial Co-occurrence Analysis of Cell Types in Multiplex Imaging Data Using Point Process and Functional ANOVA. J Proteome Res 2024; 23:1131-1143. [PMID: 38417823 PMCID: PMC11002919 DOI: 10.1021/acs.jproteome.3c00462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 01/04/2024] [Accepted: 01/26/2024] [Indexed: 03/01/2024]
Abstract
Multiplex imaging platforms have enabled the identification of the spatial organization of different types of cells in complex tissue or the tumor microenvironment. Exploring the potential variations in the spatial co-occurrence or colocalization of different cell types across distinct tissue or disease classes can provide significant pathological insights, paving the way for intervention strategies. However, the existing methods in this context either rely on stringent statistical assumptions or suffer from a lack of generalizability. We present a highly powerful method to study differential spatial co-occurrence of cell types across multiple tissue or disease groups, based on the theories of the Poisson point process and functional analysis of variance. Notably, the method accommodates multiple images per subject and addresses the problem of missing tissue regions, commonly encountered due to data-collection complexities. We demonstrate the superior statistical power and robustness of the method in comparison with existing approaches through realistic simulation studies. Furthermore, we apply the method to three real data sets on different diseases collected using different imaging platforms. In particular, one of these data sets reveals novel insights into the spatial characteristics of various types of colorectal adenoma.
Affiliation(s)
- Souvik Seal
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Brian Neelon
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Peggi M. Angel
  - Department of Cell and Molecular Pharmacology and Experimental Therapeutics, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Elizabeth C. O’Quinn
  - Translational Science Laboratory, Hollings Cancer Center, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Elizabeth Hill
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Thao Vu
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado 80045, United States
- Debashis Ghosh
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado 80045, United States
- Anand S. Mehta
  - Department of Cell and Molecular Pharmacology and Experimental Therapeutics, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Kristin Wallace
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina 29425, United States
- Alexander V. Alekseyenko
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina 29425, United States
3
Seal S, Bitler BG, Ghosh D. SMASH: Scalable Method for Analyzing Spatial Heterogeneity of genes in spatial transcriptomics data. PLoS Genet 2023; 19:e1010983. [PMID: 37862362 PMCID: PMC10619839 DOI: 10.1371/journal.pgen.1010983] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/06/2023] [Revised: 11/01/2023] [Accepted: 09/19/2023] [Indexed: 10/22/2023] [Open Access]
Abstract
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can be crucial to the biological understanding of both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms uncovering interesting biological insights.
Affiliation(s)
- Souvik Seal
  - Department of Public Health Sciences, School of Medicine, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Benjamin G. Bitler
  - Department of Obstetrics and Gynecology, School of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
- Debashis Ghosh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, United States of America
4
Seal S, Neelon B, Angel P, O’Quinn EC, Hill E, Vu T, Ghosh D, Mehta A, Wallace K, Alekseyenko AV. SpaceANOVA: Spatial co-occurrence analysis of cell types in multiplex imaging data using point process and functional ANOVA. bioRxiv 2023:2023.07.06.548034. [PMID: 37461579 PMCID: PMC10350074 DOI: 10.1101/2023.07.06.548034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/31/2023]
Abstract
Motivation: Multiplex imaging platforms have enabled the identification of the spatial organization of different types of cells in complex tissue or the tumor microenvironment (TME). Exploring the potential variations in the spatial co-occurrence or co-localization of different cell types across distinct tissue or disease classes can provide significant pathological insights, paving the way for intervention strategies. However, the existing methods in this context either rely on stringent statistical assumptions or suffer from a lack of generalizability.

Results: We present a highly powerful method to study differential spatial co-occurrence of cell types across multiple tissue or disease groups, based on the theories of the Poisson point process (PPP) and functional analysis of variance (FANOVA). Notably, the method accommodates multiple images per subject and addresses the problem of missing tissue regions, commonly encountered in such a context due to the complex nature of the data-collection procedure. We demonstrate the superior statistical power and robustness of the method in comparison to existing approaches through realistic simulation studies. Furthermore, we apply the method to three real datasets on different diseases collected using different imaging platforms. In particular, one of these datasets reveals novel insights into the spatial characteristics of various types of precursor lesions associated with colorectal cancer.

Availability: The associated R package is available at https://github.com/sealx017/SpaceANOVA.
Affiliation(s)
- Souvik Seal
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Brian Neelon
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Peggi Angel
  - Department of Cell and Molecular Pharmacology and Experimental Therapeutics, Medical University of South Carolina, Charleston, South Carolina
- Elizabeth C. O’Quinn
  - Translational Science Laboratory, Hollings Cancer Center, Medical University of South Carolina, Charleston, South Carolina
- Elizabeth Hill
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Thao Vu
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado
- Debashis Ghosh
  - Department of Biostatistics and Informatics, University of Colorado CU Anschutz Medical Campus, Aurora, Colorado
- Anand Mehta
  - Department of Cell and Molecular Pharmacology and Experimental Therapeutics, Medical University of South Carolina, Charleston, South Carolina
- Kristin Wallace
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Alexander V. Alekseyenko
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
5
Osher N, Kang J, Krishnan S, Rao A, Baladandayuthapani V. SPARTIN: a Bayesian method for the quantification and characterization of cell type interactions in spatial pathology data. Front Genet 2023; 14:1175603. [PMID: 37274781 PMCID: PMC10232864 DOI: 10.3389/fgene.2023.1175603] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/27/2023] [Accepted: 04/25/2023] [Indexed: 06/07/2023] [Open Access]
Abstract
Introduction: The acquisition of high-resolution digital pathology imaging data has sparked the development of methods to extract context-specific features from such complex data. In the context of cancer, this has led to increased exploration of the tumor microenvironment with respect to the presence and spatial composition of immune cells. Spatial statistical modeling of the immune microenvironment may yield insights into the role played by the immune system in the natural development of cancer as well as downstream therapeutic interventions.

Methods: In this paper, we present SPatial Analysis of paRtitioned Tumor-Immune imagiNg (SPARTIN), a Bayesian method for the spatial quantification of immune cell infiltration from pathology images. SPARTIN uses Bayesian point processes to characterize a novel measure of local tumor-immune cell interaction, Cell Type Interaction Probability (CTIP). CTIP allows rigorous incorporation of uncertainty and is highly interpretable, both within and across biopsies, and can be used to assess associations with genomic and clinical features.

Results: Through simulations, we show SPARTIN can accurately distinguish various patterns of cellular interactions as compared to existing methods. Using SPARTIN, we characterized the local spatial immune cell infiltration within and across 335 melanoma biopsies and evaluated their association with genomic, phenotypic, and clinical outcomes. We found that CTIP was significantly (negatively) associated with deconvolved immune cell prevalence scores including CD8+ T-Cells and Natural Killer cells. Furthermore, average CTIP scores differed significantly across previously established transcriptomic classes and were significantly associated with survival outcomes.

Discussion: SPARTIN provides a general framework for investigating spatial cellular interactions in high-resolution digital histopathology imaging data and their associations with patient-level characteristics. The results of our analysis have potential implications relevant to both treatment and prognosis in the context of Skin Cutaneous Melanoma. The R package for SPARTIN is available at https://github.com/bayesrx/SPARTIN, along with a visualization tool for the images and results at https://nateosher.github.io/SPARTIN.
Affiliation(s)
- Nathaniel Osher
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Santhoshi Krishnan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, United States
- Arvind Rao
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, United States
  - Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, United States
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, United States
- Veerabhadran Baladandayuthapani
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
6
Seal S, Bitler BG, Ghosh D. SMASH: Scalable Method for Analyzing Spatial Heterogeneity of genes in spatial transcriptomics data. bioRxiv 2023:2023.03.23.533980. [PMID: 36993287 PMCID: PMC10055313 DOI: 10.1101/2023.03.23.533980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
Abstract
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can be crucial to the biological understanding of both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms revealing interesting biological insights.
Affiliation(s)
- Souvik Seal
  - Department of Public Health Sciences, School of Medicine, Medical University of South Carolina, Charleston, USA
- Benjamin G. Bitler
  - Department of Obstetrics and Gynecology, School of Medicine, University of Colorado Denver - Anschutz Medical Campus, Aurora, USA
- Debashis Ghosh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver - Anschutz Medical Campus, Aurora, USA