1
|
Marmarelis MG, Littman R, Battaglin F, Niedzwiecki D, Venook A, Ambite JL, Galstyan A, Lenz HJ, Ver Steeg G. q-Diffusion leverages the full dimensionality of gene coexpression in single-cell transcriptomics. Commun Biol 2024; 7:400. [PMID: 38565955 DOI: 10.1038/s42003-024-06104-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2023] [Accepted: 03/25/2024] [Indexed: 04/04/2024] Open
Abstract
Unlocking the full dimensionality of single-cell RNA sequencing data (scRNAseq) is the next frontier to a richer, fuller understanding of cell biology. We introduce q-diffusion, a framework for capturing the coexpression structure of an entire library of genes, improving on state-of-the-art analysis tools. The method is demonstrated via three case studies. In the first, q-diffusion helps gain statistical significance for differential effects on patient outcomes when analyzing the CALGB/SWOG 80405 randomized phase III clinical trial, suggesting precision guidance for the treatment of metastatic colorectal cancer. Secondly, q-diffusion is benchmarked against existing scRNAseq classification methods using an in vitro PBMC dataset, in which the proposed method discriminates IFN-γ stimulation more accurately. The same case study demonstrates improvements in unsupervised cell clustering with the recent Tabula Sapiens human atlas. Finally, a local distributional segmentation approach for spatial scRNAseq, driven by q-diffusion, yields interpretable structures of human cortical tissue.
Collapse
Affiliation(s)
- Myrl G Marmarelis
- Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA.
| | - Russell Littman
- University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Francesca Battaglin
- Keck School of Medicine, University of Southern California, 1975 Zonal Ave., Los Angeles, CA, 90033, USA
| | | | - Alan Venook
- University of California San Francisco, San Francisco, CA, 94143, USA
| | - Jose-Luis Ambite
- Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA
| | - Aram Galstyan
- Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA
| | - Heinz-Josef Lenz
- Keck School of Medicine, University of Southern California, 1975 Zonal Ave., Los Angeles, CA, 90033, USA
| | - Greg Ver Steeg
- Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA
- University of California Riverside, Riverside, CA, 92521, USA
| |
Collapse
|
2
|
Marmarelis MG, Ver Steeg G, Galstyan A. A Metric Space for Point Process Excitations. J ARTIF INTELL RES 2022. [DOI: 10.1613/jair.1.13610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open
Abstract
A multivariate Hawkes process enables self- and cross-excitations through a triggering matrix that behaves like an asymmetrical covariance structure, characterizing pairwise interactions between the event types. Full-rank estimation of all interactions is often infeasible in empirical settings. Models that specialize on a spatiotemporal application alleviate this obstacle by exploiting spatial locality, allowing the dyadic relationships between events to depend only on separation in time and relative distances in real Euclidean space. Here we generalize this framework to any multivariate Hawkes process, and harness it as a vessel for embedding arbitrary event types in a hidden metric space. Specifically, we propose a Hidden Hawkes Geometry (HHG) model to uncover the hidden geometry between event excitations in a multivariate point process. The low dimensionality of the embedding regularizes the structure of the inferred interactions. We develop a number of estimators and validate the model by conducting several experiments. In particular, we investigate regional infectivity dynamics of COVID-19 in an early South Korean record and recent Los Angeles confirmed cases. By additionally performing synthetic experiments on short records as well as explorations into options markets and the Ebola epidemic, we demonstrate that learning the embedding alongside a point process uncovers salient interactions in a broad range of applications.
Collapse
|