1
Gaspard-Boulinc LC, Gortana L, Walter T, Barillot E, Cavalli FMG. Cell-type deconvolution methods for spatial transcriptomics. Nat Rev Genet 2025:10.1038/s41576-025-00845-y. [PMID: 40369312 DOI: 10.1038/s41576-025-00845-y] [Accepted: 04/14/2025] [Indexed: 05/16/2025]
Abstract
Spatial transcriptomics is a powerful method for studying the spatial organization of cells, which is a critical feature in the development, function and evolution of multicellular life. However, sequencing-based spatial transcriptomics has not yet achieved cellular-level resolution, so advanced deconvolution methods are needed to infer cell-type contributions at each location in the data. Recent progress has led to diverse tools for cell-type deconvolution that are helping to describe tissue architectures in health and disease. In this Review, we describe the varied types of cell-type deconvolution methods for spatial transcriptomics, contrast their capabilities and summarize them in a web-based, interactive table to enable more efficient method selection.
Affiliation(s)
- Lucie C Gaspard-Boulinc
  - Institut Curie, PSL University, Paris, France
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1331, Paris, France
  - Mines Paris, PSL University, CBIO - Centre for Computational Biology, Paris, France
- Luca Gortana
  - Institut Curie, PSL University, Paris, France
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1331, Paris, France
  - Mines Paris, PSL University, CBIO - Centre for Computational Biology, Paris, France
- Thomas Walter
  - Institut Curie, PSL University, Paris, France
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1331, Paris, France
  - Mines Paris, PSL University, CBIO - Centre for Computational Biology, Paris, France
- Emmanuel Barillot
  - Institut Curie, PSL University, Paris, France
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1331, Paris, France
  - Mines Paris, PSL University, CBIO - Centre for Computational Biology, Paris, France
- Florence M G Cavalli
  - Institut Curie, PSL University, Paris, France
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1331, Paris, France
  - Mines Paris, PSL University, CBIO - Centre for Computational Biology, Paris, France
2
Dong Q, Yang Y, Luo Z, Shen H, Shi X, Liu J. Robust Spatial Cell-Type Deconvolution with Qualitative Reference for Spatial Transcriptomics. Small Methods 2025; 9:e2401145. [PMID: 40059456 PMCID: PMC12103236 DOI: 10.1002/smtd.202401145] [Received: 07/24/2024] [Revised: 01/14/2025] [Indexed: 05/26/2025]
Abstract
Many spatially resolved transcriptomic technologies have been developed to provide gene expression profiles for spots that may contain heterogeneous mixtures of cells. To decompose cellular composition and expression levels, various deconvolution methods have been developed using single-cell RNA sequencing (scRNA-seq) data with known cell-type labels as a reference. However, in the absence of a reliable reference dataset or in the presence of heterogeneous batch effects, these methods may introduce bias. Here, a Qualitative-Reference-based Spatially-Informed Deconvolution method (QR-SIDE) is developed for multi-cellular spatial transcriptomic data. Uniquely, QR-SIDE provides a detailed map of spatial heterogeneity for individual marker genes and performs robust deconvolution by adaptively adjusting the contributions of each marker gene. Simultaneously, QR-SIDE unifies cell-type deconvolution with spatial clustering and incorporates spatial information via a Potts model to promote spatial continuity. The identified spatial domains represent a meaningful biological effect in potential tissue segments. Using simulated data and three real spatial transcriptomic datasets from the 10x Visium and ST platforms, QR-SIDE demonstrates improved accuracy and robustness in cell-type deconvolution and its superiority over established methods in recognizing and delineating spatial structures within a given context. These results can facilitate a range of downstream analyses and provide a refined understanding of cellular heterogeneity.
Affiliation(s)
- Qishi Dong
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
- Yi Yang
  - The Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing 210018, China
- Ziye Luo
  - Department of Biometrics, AstraZeneca Global R&D (China) Co. Ltd., Shanghai 200085, China
- Haipeng Shen
  - Faculty of Business and Economics, Hong Kong University, Pokfulam, Hong Kong SAR, China
- Xingjie Shi
  - KLATASDS-MOE, Academy of Statistics and Interdisciplinary Sciences, School of Statistics, East China Normal University, Shanghai 200062, China
- Jin Liu
  - School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen 518172, China
3
Huuki-Myers LA, Montgomery KD, Kwon SH, Cinquemani S, Eagles NJ, Gonzalez-Padilla D, Maden SK, Kleinman JE, Hyde TM, Hicks SC, Maynard KR, Collado-Torres L. Benchmark of cellular deconvolution methods using a multi-assay dataset from postmortem human prefrontal cortex. Genome Biol 2025; 26:88. [PMID: 40197307 PMCID: PMC11978107 DOI: 10.1186/s13059-025-03552-3] [Received: 04/09/2024] [Accepted: 03/21/2025] [Indexed: 04/10/2025] Open
Abstract
Cellular deconvolution of bulk RNA-sequencing data using single cell/nuclei RNA-seq reference data is an important strategy for estimating cell type composition in heterogeneous tissues, such as the human brain. Here, we generate a multi-assay dataset in postmortem human dorsolateral prefrontal cortex from 22 tissue blocks, including bulk RNA-seq, reference snRNA-seq, and orthogonal measurement of cell type proportions with RNAScope/ImmunoFluorescence. We use this dataset to evaluate six deconvolution algorithms. Bisque and hspe were the most accurate methods. The dataset, as well as the Mean Ratio gene marker finding method, is made available in the DeconvoBuddies R/Bioconductor package.
Affiliation(s)
- Louise A Huuki-Myers
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - UK Dementia Research Institute at the University of Cambridge, Cambridge, UK
  - Department of Clinical Neurosciences, School of Clinical Medicine, The University of Cambridge, Cambridge, UK
- Kelsey D Montgomery
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Sang Ho Kwon
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Sophia Cinquemani
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Nicholas J Eagles
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Sean K Maden
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Joel E Kleinman
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Thomas M Hyde
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Stephanie C Hicks
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, 21218, USA
- Kristen R Maynard
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Leonardo Collado-Torres
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA
4
Feng T, Li P, Li S, Wang Y, Lv J, Xia T, Lee HJ, Piao HL, Chen D, Ma Y. Metabolic state uncovers prognosis insights of esophageal squamous cell carcinoma patients. J Transl Med 2025; 23:342. [PMID: 40098145 PMCID: PMC11912770 DOI: 10.1186/s12967-025-06087-0] [Received: 10/23/2024] [Accepted: 01/06/2025] [Indexed: 03/19/2025] Open
Abstract
BACKGROUND: Metabolite-protein interactions (MPIs) are crucial regulators of cancer metabolism; however, their roles and coordination within the esophageal squamous cell carcinoma (ESCC) microenvironment remain largely unexplored. This study is the first to comprehensively map the metabolic landscape of the ESCC microenvironment by integrating an MPI network with multi-scale transcriptomics data.
METHODS: First, we characterized the metabolic states of cells in ESCC using single-cell transcriptome profiles of key metabolite-interacting proteins. Next, we determined the metabolic patterns of each ESCC patient based on the composition of different metabolic states within bulk samples. Finally, the ESCC samples were clustered into unique subtypes.
RESULTS: Sixteen ESCC metabolic states across 7 cell types were identified based on the re-analysis of single-cell RNA-sequencing data of 208,659 cells in 64 ESCC samples. Each of the 7 cell types within the tumor microenvironment exhibited distinct metabolic states, highlighting the high metabolic heterogeneity of ESCC. Based on differences in the compositions of the metabolic states, 4 ESCC subtypes were identified in two independent cohorts (n = 79 and 119), which were associated with significant variations in prognosis, clinical features, gene expression, and pathways. Notably, the inactivation of cellular detoxification processes may contribute to the poor prognosis of ESCC patients.
CONCLUSIONS: Overall, we redefined robust ESCC prognostic subtypes and identified key MPI pathways that link metabolism to tumor heterogeneity. This study provides the first comprehensive mapping of the ESCC metabolic microenvironment, offering novel insights into ESCC metabolic diversity and its clinical applications.
Affiliation(s)
- Tingze Feng
  - Department of Thoracic Surgery, Liaoning Cancer Hospital & Institute, Cancer Hospital of China Medical University, Shenyang, 110042, China
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Pengfei Li
  - Department of Thoracic Surgery, Liaoning Cancer Hospital & Institute, Cancer Hospital of China Medical University, Shenyang, 110042, China
- Siyi Li
  - Department of Thoracic Surgery, Liaoning Cancer Hospital & Institute, Cancer Hospital of China Medical University, Shenyang, 110042, China
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Yuhan Wang
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Jing Lv
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Tian Xia
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Hoy-Jong Lee
  - School of Pharmacy, Sungkyunkwan University, Suwon, 16419, Republic of Korea
- Hai-Long Piao
  - Department of Thoracic Surgery, Liaoning Cancer Hospital & Institute, Cancer Hospital of China Medical University, Shenyang, 110042, China
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Di Chen
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Yegang Ma
  - Department of Thoracic Surgery, Liaoning Cancer Hospital & Institute, Cancer Hospital of China Medical University, Shenyang, 110042, China
5
Li Y, Xu S, Wang X, Ertekin-Taner N, Chen D. An augmented GSNMF model for complete deconvolution of bulk RNA-seq data. Math Biosci Eng 2025; 22:988-1018. [PMID: 40296800 PMCID: PMC12043048 DOI: 10.3934/mbe.2025036] [Indexed: 04/30/2025]
Abstract
Performing complete deconvolution analysis for bulk RNA-seq data to obtain both cell type specific gene expression profiles (GEP) and relative cell abundances is a challenging task. One of the fundamental models used, the nonnegative matrix factorization (NMF), is mathematically ill-posed. Although several complete deconvolution methods have been developed, and their estimates compared to ground truth for some datasets appear promising, a comprehensive understanding of how to circumvent the ill-posedness and improve solution accuracy is lacking. In this paper, we first investigated the necessary requirements for a given dataset to satisfy the solvability conditions in NMF theory. Even with solvability conditions, the "unique" solutions of NMF are subject to a rescaling matrix. Therefore, we provide estimates of the converged local minima and the possible rescaling matrix, based on informative initial conditions. Using these strategies, we developed a new pipeline of pseudo-bulk tissue data augmented, geometric structure guided NMF model (GSNMF+). In our approach, pseudo-bulk tissue data was generated, by statistical distribution simulated pseudo cellular compositions and single-cell RNA-seq (scRNA-seq) data, and then mixed with the original dataset. The constituent matrices of the hybrid dataset then satisfy the weak solvability conditions of NMF. Furthermore, an estimated rescaling matrix was used to adjust the minimizer of the NMF, which was expected to reduce mean square root errors of solutions. Our algorithms are tested on several realistic bulk-tissue datasets and showed significant improvements in scenarios with singular cellular compositions.
Affiliation(s)
- Yujie Li
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
  - School of Data Science, University of North Carolina at Charlotte, USA
- Su Xu
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
- Xue Wang
  - Department of Quantitative Health Sciences, Mayo Clinic, Florida, USA
- Nilüfer Ertekin-Taner
  - Department of Neurosciences, Mayo Clinic, Florida, USA
  - Department of Neurology, Mayo Clinic, Florida, USA
- Duan Chen
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
6
Shen W, Liu C, Hu Y, Lei Y, Wong HS, Wu S, Zhou XM. CSsingle: A Unified Tool for Robust Decomposition of Bulk and Spatial Transcriptomic Data Across Diverse Single-Cell References. bioRxiv 2025:2024.04.07.588458. [PMID: 38645128 PMCID: PMC11030304 DOI: 10.1101/2024.04.07.588458] [Indexed: 04/23/2024]
Abstract
We introduce CSsingle, a novel method that enhances the decomposition of bulk and spatial transcriptomic (ST) data by addressing key challenges in cellular heterogeneity. CSsingle applies cell size correction using ERCC spike-in controls, enabling it to account for variations in RNA content between cell types and achieve accurate bulk data deconvolution. In addition, it enables fine-scale analysis for ST data, advancing our understanding of tissue architecture and cellular interactions, particularly in complex microenvironments. We provide a unified tool for integrating bulk and ST with scRNA-seq data, advancing the study of complex biological systems and disease processes. The benchmark results demonstrate that CSsingle outperforms existing methods in accuracy and robustness. Validation using more than 700 normal and diseased samples from gastroesophageal tissue reveals the predominant presence of mosaic columnar cells (MCCs), which exhibit a gastric and intestinal mosaic phenotype in Barrett's esophagus and esophageal adenocarcinoma (EAC), in contrast to their very low detectable levels in esophageal squamous cell carcinoma and normal gastroesophageal tissue. We revealed a dynamic relationship between MCCs and squamous cells during immune checkpoint inhibitors (ICI)-based treatment in EAC patients, suggesting MCC expression signatures as predictive and prognostic markers of immunochemotherapy outcomes. Our findings reveal the critical role of MCC in the treatment of EAC and its potential as a biomarker to predict outcomes of immunochemotherapy, providing insight into tumor epithelial plasticity to guide personalized immunotherapeutic strategies.
Affiliation(s)
- Wenjun Shen
  - Department of Bioinformatics, Shantou University Medical College, Shantou, China
  - Chaoshan Branch of State Key Laboratory for Esophageal Cancer Prevention and Treatment, Shantou University Medical College, Shantou, China
- Cheng Liu
  - Department of Computer Science, Shantou University, Shantou, China
- Yunfei Hu
  - Department of Computer Science, Vanderbilt University, Nashville, USA
- Yuanfan Lei
  - Department of Bioinformatics, Shantou University Medical College, Shantou, China
- Hau-San Wong
  - Department of Computer Sciences, City University of Hong Kong, Kowloon, Hong Kong
- Si Wu
  - Department of Computer Science, South China University of Technology, Guangzhou, China
- Xin Maizie Zhou
  - Department of Computer Science, Vanderbilt University, Nashville, USA
  - Department of Biomedical Engineering, Vanderbilt University, Nashville, USA
7
Zhang Y, Lu Z, Guo J, Wang Q, Zhang X, Yang H, Li X. Advanced Carriers for Precise Delivery and Therapeutic Mechanisms of Traditional Chinese Medicines: Integrating Spatial Multi-Omics and Delivery Visualization. Adv Healthc Mater 2025; 14:e2403698. [PMID: 39828637 DOI: 10.1002/adhm.202403698] [Received: 10/17/2024] [Revised: 12/01/2024] [Indexed: 01/22/2025]
Abstract
The complex composition of traditional Chinese medicines (TCMs) has posed challenges for in-depth study and global application, despite their abundance of bioactive compounds that make them valuable resources for disease treatment. To overcome these obstacles, it is essential to modernize TCMs by focusing on precise disease treatment. This involves elucidating the structure-activity relationships within their complex compositions, ensuring accurate in vivo delivery, and monitoring the delivery process. This review discusses the research progress of TCMs in precision disease treatment from three perspectives: spatial multi-omics technology for precision therapeutic activity, carrier systems for precise in vivo delivery, and medical imaging technology for visualizing the delivery process. The aim is to establish a novel research paradigm that advances the precision therapy of TCMs.
Affiliation(s)
- Yusheng Zhang
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, 100700, P. R. China
- Zhiguo Lu
  - State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
  - Key Laboratory of Biopharmaceutical Preparation and Delivery, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
- Jing Guo
  - State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
  - Key Laboratory of Biopharmaceutical Preparation and Delivery, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
- Qing Wang
  - School of Life Sciences, Beijing University of Chinese Medicine, Beijing, 100029, P. R. China
- Xin Zhang
  - State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
  - Key Laboratory of Biopharmaceutical Preparation and Delivery, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, P. R. China
- Hongjun Yang
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, China Academy of Chinese Medical Sciences, Beijing, 100029, P. R. China
- Xianyu Li
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, 100700, P. R. China
8
Yang H, Atak D, Yuan M, Li M, Altay O, Demirtas E, Peltek IB, Ulukan B, Yigit B, Sipahioglu T, Álvez MB, Meng L, Yüksel B, Turkez H, Kirimlioglu H, Saka B, Yurdaydin C, Akyildiz M, Dayangac M, Uhlen M, Boren J, Zhang C, Mardinoglu A, Zeybel M. Integrative proteo-transcriptomic characterization of advanced fibrosis in chronic liver disease across etiologies. Cell Rep Med 2025; 6:101935. [PMID: 39889710 PMCID: PMC11866494 DOI: 10.1016/j.xcrm.2025.101935] [Received: 05/29/2024] [Revised: 09/20/2024] [Accepted: 01/08/2025] [Indexed: 02/03/2025]
Abstract
Chronic hepatic injury and inflammation from various causes can lead to fibrosis and cirrhosis, potentially predisposing to hepatocellular carcinoma. The molecular mechanisms underlying fibrosis and its progression remain incompletely understood. Using a proteo-transcriptomics approach, we analyze liver and plasma samples from 330 individuals, including 40 healthy individuals and 290 patients with histologically characterized fibrosis due to chronic viral infection, alcohol consumption, or metabolic dysfunction-associated steatotic liver disease. Our findings reveal dysregulated pathways related to extracellular matrix, immune response, inflammation, and metabolism in advanced fibrosis. We also identify 132 circulating proteins associated with advanced fibrosis, with neurofascin and growth differentiation factor 15 demonstrating superior predictive performance for advanced fibrosis (area under the receiver operating characteristic curve [AUROC] 0.89 [95% confidence interval (CI) 0.81-0.97]) compared to the fibrosis-4 model (AUROC 0.85 [95% CI 0.78-0.93]). These findings provide insights into fibrosis pathogenesis and highlight the potential for more accurate non-invasive diagnosis.
Affiliation(s)
- Hong Yang
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Dila Atak
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- Meng Yuan
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Mengzhen Li
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Ozlem Altay
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Elif Demirtas
  - School of Medicine, Koç University, Istanbul 34450, Turkiye
- Burge Ulukan
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- Buket Yigit
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- Tarik Sipahioglu
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- María Bueno Álvez
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Lingqi Meng
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Hasan Turkez
  - Department of Medical Biology, Faculty of Medicine, Atatürk University, Erzurum 25240, Turkiye
- Hale Kirimlioglu
  - Department of Pathology, School of Medicine, Acibadem Mehmet Ali Aydinlar University, Istanbul 34752, Turkiye
- Burcu Saka
  - Department of Pathology, School of Medicine, Koç University, Istanbul 34010, Turkiye
- Cihan Yurdaydin
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- Murat Akyildiz
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
- Murat Dayangac
  - Department of General Surgery, International School of Medicine, Medipol University, Istanbul 34010, Turkiye
- Mathias Uhlen
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Jan Boren
  - Department of Molecular and Clinical Medicine, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
- Cheng Zhang
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
- Adil Mardinoglu
  - Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden
  - Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, London SE1 9RT, UK
- Mujdat Zeybel
  - Department of Gastroenterology and Hepatology, School of Medicine, Koç University, İstanbul 34010, Turkiye
  - Clinical Trials Unit, Koç University Hospital, Istanbul 34010, Turkiye
9
Tu JJ, Yan H, Zhang XF, Lin Z. Precise gene expression deconvolution in spatial transcriptomics with STged. Nucleic Acids Res 2025; 53:gkaf087. [PMID: 39970279 PMCID: PMC11838043 DOI: 10.1093/nar/gkaf087] [Received: 07/01/2024] [Revised: 01/07/2025] [Accepted: 02/02/2025] [Indexed: 02/21/2025] Open
Abstract
Spatially resolved transcriptomics (SRT) has transformed tissue biology by linking gene expression profiles with spatial information. However, sequencing-based SRT methods aggregate signals from multiple cell types within capture locations ("spots"), masking cell-type-specific gene expression patterns. Traditional cell-type deconvolution methods estimate cell compositions within spots but fail to resolve cell-type-specific gene expression, limiting their ability to uncover critical biological processes such as cellular interactions and microenvironmental dynamics. Here, we present STged (spatial transcriptomic gene expression deconvolution), a novel computational framework that goes beyond traditional deconvolution by reconstructing cell-type-specific gene expression profiles from mixed spots. STged integrates graph-based spatial correlations and reference-derived gene signatures using a non-negative least-squares regression framework, achieving precise and biologically meaningful deconvolution. Comprehensive simulations show that STged consistently outperforms existing methods in accuracy and robustness. Applications to human pancreatic ductal adenocarcinoma and human squamous cell carcinoma datasets reveal its capacity to identify microenvironment-specific highly variable genes, reconstruct spatial cell-cell communication networks, and resolve tissue architecture at near-single-cell resolution. In mouse kidney tissues, STged uncovers dynamic spatial gene expression patterns and distinct gene programs, advancing our understanding of tissue heterogeneity and cellular dynamics.
Affiliation(s)
- Jia-Juan Tu
  - School of Science, Hubei University of Technology, Wuhan 430079, China
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong 999077, China
- Hong Yan
  - Centre for Intelligent Multidimensional Data Analysis, Hong Kong 999077, China
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
- Xiao-Fei Zhang
  - School of Mathematics and Statistics, and Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan 430079, China
  - Key Laboratory of Nonlinear Analysis & Applications (Ministry of Education), Central China Normal University, Wuhan 430079, China
- Zhixiang Lin
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong 999077, China
10
Guo S, Liu X, Cheng X, Jiang Y, Ji S, Liang Q, Koval A, Li Y, Owen LA, Kim IK, Aparicio A, Lee S, Sood AK, Kopetz S, Shen JP, Weinstein JN, DeAngelis MM, Chen R, Wang W. A deconvolution framework that uses single-cell sequencing plus a small benchmark data set for accurate analysis of cell type ratios in complex tissue samples. Genome Res 2025; 35:147-161. [PMID: 39586714 PMCID: PMC11789644 DOI: 10.1101/gr.278822.123] [Received: 12/05/2023] [Accepted: 11/19/2024] [Indexed: 11/27/2024]
Abstract
Bulk deconvolution with single-cell/nucleus RNA-seq data is critical for understanding heterogeneity in complex biological samples, yet the technological discrepancy across sequencing platforms limits deconvolution accuracy. To address this, we utilize an experimental design to match inter-platform biological signals, hence revealing the technological discrepancy, and then develop a deconvolution framework called DeMixSC using this well-matched, that is, benchmark, data. Built upon a novel weighted nonnegative least-squares framework, DeMixSC identifies and adjusts genes with high technological discrepancy and aligns the benchmark data with large patient cohorts of matched-tissue-type for large-scale deconvolution. Our results using two benchmark data sets of healthy retinas and ovarian cancer tissues suggest much-improved deconvolution accuracy. Leveraging tissue-specific benchmark data sets, we applied DeMixSC to a large cohort of 453 age-related macular degeneration patients and a cohort of 30 ovarian cancer patients with various responses to neoadjuvant chemotherapy. Only DeMixSC successfully unveiled biologically meaningful differences across patient groups, demonstrating its broad applicability in diverse real-world clinical scenarios. Our findings reveal the impact of technological discrepancy on deconvolution performance and underscore the importance of a well-matched data set to resolve this challenge. The developed DeMixSC framework is generally applicable for accurately deconvolving large cohorts of disease tissues, including cancers, when a well-matched benchmark data set is available.
Affiliation(s)
- Shuai Guo
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Xiaoqian Liu
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Xuesen Cheng
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Yujie Jiang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Department of Statistics, Rice University, Houston, Texas 77005, USA
| | - Shuangxi Ji
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Qingnan Liang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Andrew Koval
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Department of Statistics, Rice University, Houston, Texas 77005, USA
| | - Yumei Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Leah A Owen
- Department of Ophthalmology, Jacobs School of Medicine and Biomedical Engineering, SUNY University at Buffalo, Buffalo, New York 14209, USA
- Department of Population Health Sciences, University of Utah School of Medicine, Salt Lake City, Utah 84108, USA
- Department of Ophthalmology and Visual Sciences, University of Utah School of Medicine, Salt Lake City, Utah 84132, USA
| | - Ivana K Kim
- Retina Service, Harvard Medical School, Massachusetts Eye and Ear, Boston, Massachusetts 02114, USA
| | - Ana Aparicio
- Department of Genitourinary Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77230, USA
| | - Sanghoon Lee
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas 77230, USA
| | - Anil K Sood
- Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, Texas 77230, USA
| | - Scott Kopetz
- Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - John Paul Shen
- Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - John N Weinstein
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Margaret M DeAngelis
- Department of Ophthalmology, Jacobs School of Medicine and Biomedical Engineering, SUNY University at Buffalo, Buffalo, New York 14209, USA
- Department of Population Health Sciences, University of Utah School of Medicine, Salt Lake City, Utah 84108, USA
- Department of Ophthalmology and Visual Sciences, University of Utah School of Medicine, Salt Lake City, Utah 84132, USA
- VA Western New York Healthcare System, Buffalo, New York 14215, USA
| | - Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Wenyi Wang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
|
11
|
Yuan M, Zhang C, Von Feilitzen K, Zwahlen M, Shi M, Li X, Yang H, Song X, Turkez H, Uhlén M, Mardinoglu A. The Human Pathology Atlas for deciphering the prognostic features of human cancers. EBioMedicine 2025; 111:105495. [PMID: 39662180 PMCID: PMC11683280 DOI: 10.1016/j.ebiom.2024.105495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 11/21/2024] [Accepted: 11/27/2024] [Indexed: 12/13/2024]
Abstract
BACKGROUND: Cancer is one of the leading causes of mortality worldwide, highlighting the urgent need for a deeper molecular understanding and the development of personalized treatments. The present study aims to establish a solid association between gene expression and patient survival outcomes to enhance the utility of the Human Pathology Atlas for cancer research. METHODS: In this updated analysis, we examined the expression profiles of 6918 patients across 21 cancer types. We integrated data from 10 independent cancer cohorts, creating a cross-validated, reliable collection of prognostic genes. We applied a systems biology approach to identify the association between gene expression profiles and patient survival outcomes. We further constructed prognostic regulatory networks for kidney renal clear cell carcinoma (KIRC) and liver hepatocellular carcinoma (LIHC), which elucidate the molecular underpinnings associated with patient survival in these cancers. FINDINGS: We observed that gene expression during the transition from normal to tumorous tissue exhibited diverse shifting patterns in their original tissue locations. Significant correlations between gene expression and patient survival outcomes were identified in KIRC and LIHC among the major cancer types. Additionally, the prognostic regulatory network established for these two cancers showed the indicative capabilities of the Human Pathology Atlas and provides actionable insights for cancer research. INTERPRETATION: The updated Human Pathology Atlas provides a significant foundation for precision oncology and the formulation of personalized treatment strategies. These findings deepen our understanding of cancer biology and have the potential to advance targeted therapeutic approaches in clinical practice. FUNDING: The Knut and Alice Wallenberg Foundation (72110), the China Scholarship Council (Grant No. 202006940003).
Affiliation(s)
- Meng Yuan
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Cheng Zhang
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Kalle Von Feilitzen
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Martin Zwahlen
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Mengnan Shi
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Xiangyu Li
- Guangzhou National Laboratory, Guangzhou, Guangdong Province 510005, China
| | - Hong Yang
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Xiya Song
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Hasan Turkez
- Department of Medical Biology, Faculty of Medicine, Atatürk University, Erzurum, Turkey
| | - Mathias Uhlén
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden
| | - Adil Mardinoglu
- Science for Life Laboratory, KTH-Royal Institute of Technology, Stockholm SE-17165, Sweden; Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, London SE1 9RT, UK
|
12
|
Feng S, Huang L, Pournara AV, Huang Z, Yang X, Zhang Y, Brazma A, Shi M, Papatheodorou I, Miao Z. Alleviating batch effects in cell type deconvolution with SCCAF-D. Nat Commun 2024; 15:10867. [PMID: 39738054 PMCID: PMC11686230 DOI: 10.1038/s41467-024-55213-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 12/02/2024] [Indexed: 01/01/2025]
Abstract
Cell type deconvolution methods can impute cell proportions from bulk transcriptomics data, revealing changes in disease progression or organ development. But benchmarking studies often use simulated bulk data from the same source as the reference, which limits its application scenarios. This study examines batch effects in deconvolution and introduces SCCAF-D, a computational workflow that ensures a Pearson Correlation Coefficient above 0.75 across simulated and real bulk data for various tissue types. Applied to non-alcoholic fatty liver disease, SCCAF-D unveils meaningful insights into changes in cell proportions during disease progression.
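The accuracy criterion quoted in the abstract, a Pearson correlation coefficient above 0.75 between estimated and known cell proportions, can be checked with a few lines of code. The sketch below is an illustration only and is not part of SCCAF-D: the function name `pearson_correlation` and the toy proportion values are assumptions, and proportions from all samples are simply flattened into one vector before comparison.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical proportions for two samples and three cell types (rows sum to 1).
true_props = [[0.50, 0.30, 0.20], [0.10, 0.60, 0.30]]
est_props = [[0.45, 0.35, 0.20], [0.15, 0.55, 0.30]]

# Flatten sample-by-cell-type tables into vectors before correlating.
flat_true = [p for row in true_props for p in row]
flat_est = [p for row in est_props for p in row]

pcc = pearson_correlation(flat_true, flat_est)
print(f"PCC = {pcc:.3f}, passes 0.75 threshold: {pcc > 0.75}")
```

Because simulated bulk data drawn from the same source as the reference tends to flatter a method, the same check is more informative when the true proportions come from an independent dataset, which is the batch-effect setting the abstract describes.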
Grants
- This work was supported by the Natural Science Foundation of China (32270707), the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903), the R&D Programs of Guangzhou Laboratory, Grant No. GZNL2024A01002, GZNL2023A01006, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102.
Affiliation(s)
- Shuo Feng
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, China
| | - Liangfeng Huang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
| | - Anna Vathrakokoili Pournara
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Ziliang Huang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
| | - Xinlu Yang
- Department of Obstetrics and Gynaecology, Harbin Red Cross Central Hospital, Harbin, 150001, China
| | - Yongjian Zhang
- Harbin Medical University the Sixth Affiliated Hospital, Harbin, 150023, China
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Ming Shi
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
| | - Irene Papatheodorou
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Medical School, University of East Anglia, Norwich Research Park, Norwich, NR4 7UA, UK
| | - Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
|
13
|
Huang B, Chen Y, Yuan S. Application of Spatial Transcriptomics in Digestive System Tumors. Biomolecules 2024; 15:21. [PMID: 39858416 PMCID: PMC11761220 DOI: 10.3390/biom15010021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2024] [Revised: 12/15/2024] [Accepted: 12/24/2024] [Indexed: 01/27/2025]
Abstract
In the field of digestive system tumor research, spatial transcriptomics technologies are used to delve into the spatial structure and the spatial heterogeneity of tumors and to analyze the tumor microenvironment (TME) and the inter-cellular interactions within it by revealing gene expression in tumors. These technologies are also instrumental in the diagnosis, prognosis, and treatment of digestive system tumors. This review provides a concise introduction to spatial transcriptomics and summarizes recent advances, application prospects, and technical challenges of these technologies in digestive system tumor research. This review also discusses the importance of combining spatial transcriptomics with single-cell RNA sequencing (scRNA-seq), artificial intelligence, and machine learning in digestive system cancer research.
Affiliation(s)
- Bowen Huang
- Department of Gastric Surgery, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Guangzhou 510060, China
| | - Yingjia Chen
- Health Science Center, Peking University, Beijing 100191, China
| | - Shuqiang Yuan
- Department of Gastric Surgery, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Guangzhou 510060, China
|
14
|
Teleman M, Gabriel AAG, Hérault L, Gfeller D. SuperSpot: coarse graining spatial transcriptomics data into metaspots. Bioinformatics 2024; 41:btae734. [PMID: 39657949 PMCID: PMC11725322 DOI: 10.1093/bioinformatics/btae734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 11/28/2024] [Accepted: 12/06/2024] [Indexed: 12/12/2024]
Abstract
SUMMARY: Spatial Transcriptomics is revolutionizing our ability to phenotypically characterize complex biological tissues and decipher cellular niches. With current technologies such as VisiumHD, thousands of genes can be detected across millions of spots (also called cells or bins depending on the technologies). Building upon the metacell concept, we present a workflow, called SuperSpot, to combine adjacent and transcriptomically similar spots into "metaspots". The process involves representing spots as nodes in a graph with edges connecting spots in spatial proximity and edge weights representing transcriptomic similarity. Hierarchical clustering is used to aggregate spots into metaspots at a user-defined resolution. We demonstrate that metaspots reduce the size and sparsity of spatial transcriptomic data and facilitate the analysis of large datasets generated with the most recent technologies. AVAILABILITY AND IMPLEMENTATION: SuperSpot is an R package available at https://github.com/GfellerLab/SuperSpot and archived on Zenodo (https://doi.org/10.5281/zenodo.14222088). The code to reproduce the figures is available at https://github.com/GfellerLab/SuperSpot/tree/main/figures (https://doi.org/10.5281/zenodo.14222088).
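The graph-based aggregation described in the abstract can be sketched as a greedy agglomerative merge over a spatial neighbour graph. This is a simplified stand-in and not the SuperSpot algorithm or its R API: the function name `build_metaspots`, the distance threshold `radius`, the use of negative Euclidean distance as the similarity weight, and the greedy merge order are all assumptions chosen to keep the example short.

```python
import math


def build_metaspots(coords, expr, radius, n_metaspots):
    """Greedily merge spatially adjacent, transcriptomically similar spots.

    coords: list of (x, y) positions; expr: list of equal-length expression vectors.
    Returns one metaspot label per spot.
    """
    n = len(coords)
    members = {i: [i] for i in range(n)}            # cluster id -> spot indices
    profile = {i: list(expr[i]) for i in range(n)}  # cluster id -> mean expression
    # Spatial graph: an edge joins two spots lying within `radius` of each other.
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= radius:
                neighbours[i].add(j)
                neighbours[j].add(i)

    def similarity(a, b):
        # Edge weight: higher means more similar mean expression profiles.
        return -math.dist(profile[a], profile[b])

    while len(members) > n_metaspots:
        edges = [(similarity(a, b), a, b)
                 for a in neighbours for b in neighbours[a] if a < b]
        if not edges:
            break  # no spatially adjacent clusters remain to merge
        _, a, b = max(edges)
        size_a, size_b = len(members[a]), len(members[b])
        # Size-weighted mean keeps the profile equal to the mean over member spots.
        profile[a] = [(pa * size_a + pb * size_b) / (size_a + size_b)
                      for pa, pb in zip(profile[a], profile[b])]
        members[a].extend(members.pop(b))
        del profile[b]
        # Reattach the neighbours of b to the merged cluster a.
        for c in neighbours.pop(b):
            neighbours[c].discard(b)
            if c != a:
                neighbours[c].add(a)
                neighbours[a].add(c)

    labels = [0] * n
    for new_id, spots in enumerate(members.values()):
        for s in spots:
            labels[s] = new_id
    return labels


# Four spots on a line: the left pair and the right pair have similar expression.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
expr = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
print(build_metaspots(coords, expr, radius=1.0, n_metaspots=2))
```

The spatial constraint matters as much as the similarity weight: two spots with identical expression are never merged unless a chain of neighbouring spots connects them. The pairwise edge search here is quadratic, so this sketch is only suitable for small toy inputs.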
Affiliation(s)
- Matei Teleman
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, Lausanne 1011, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Lausanne 1015, Switzerland
- Agora Cancer Research Center, Lausanne 1011, Switzerland
- Swiss Cancer Center Leman (SCCL), Switzerland
| | - Aurélie A G Gabriel
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, Lausanne 1011, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Lausanne 1015, Switzerland
- Agora Cancer Research Center, Lausanne 1011, Switzerland
- Swiss Cancer Center Leman (SCCL), Switzerland
| | - Léonard Hérault
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, Lausanne 1011, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Lausanne 1015, Switzerland
- Agora Cancer Research Center, Lausanne 1011, Switzerland
- Swiss Cancer Center Leman (SCCL), Switzerland
| | - David Gfeller
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, Lausanne 1011, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Lausanne 1015, Switzerland
- Agora Cancer Research Center, Lausanne 1011, Switzerland
- Swiss Cancer Center Leman (SCCL), Switzerland
|
15
|
Maurizio A, Tascini AS, Morelli MJ. SurfR: Riding the wave of RNA-seq data with a comprehensive bioconductor package to identify surface protein-coding genes. Bioinformatics Advances 2024; 5:vbae201. [PMID: 39735574 PMCID: PMC11671034 DOI: 10.1093/bioadv/vbae201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 11/26/2024] [Accepted: 12/12/2024] [Indexed: 12/31/2024]
Abstract
Motivation: Proteins at the cell surface connect signaling networks and largely determine a cell's capacity to communicate and interact with its environment. In particular, variations in transcriptomic profiles are often observed between healthy and diseased cells, leading to distinct sets of cell-surface proteins. For these reasons, cell-surface proteins may act as biomarkers for the detection of cells of interest in tissues or body fluids, are often the target of pharmaceutical agents, and hold significant promise in clinical practice for diagnosis, prognosis, treatment development, and evaluation of therapy response. Therefore, implementing robust methods to identify condition-specific cell-surface proteins is of pivotal importance to advance biomedical research. Results: We developed SurfR, an R/Bioconductor package providing a streamlined end-to-end workflow for computationally identifying surface protein-coding genes from expression data. Our user-friendly, comprehensive workflow performs systematic expression data retrieval from public databases, differential gene expression across conditions, integration of datasets, enrichment analysis, identification of targetable proteins on a condition of interest, and data visualization. Availability and implementation: SurfR is released under GNU-GPL-v3.0 License. Source code, documentation, examples, and tutorials are available through Bioconductor (http://www.bioconductor.org/packages/SurfR). RMD notebooks with the use cases code described in the manuscript can be found on GitHub (https://github.com/auroramaurizio/SurfR_UseCases).
Affiliation(s)
- Aurora Maurizio
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy
| | - Anna Sofia Tascini
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy
- Università Vita-Salute San Raffaele, Milan 20132, Italy
| | - Marco J Morelli
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy
- Università Vita-Salute San Raffaele, Milan 20132, Italy
|
16
|
Yang CX, Sin DD, Ng RT. SMART: spatial transcriptomics deconvolution using marker-gene-assisted topic model. Genome Biol 2024; 25:304. [PMID: 39623485 PMCID: PMC11610197 DOI: 10.1186/s13059-024-03441-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/14/2023] [Accepted: 11/20/2024] [Indexed: 12/06/2024]
Abstract
While spatial transcriptomics offers valuable insights into gene expression patterns within the spatial context of tissue, many technologies do not have single-cell resolution. Here, we present SMART, a marker gene-assisted deconvolution method that simultaneously infers the cell type-specific gene expression profile and the cellular composition at each spot. Using multiple datasets, we show that SMART outperforms the existing methods in realistic settings. It also provides a two-stage approach to enhance its performance on cell subtypes. The covariate model of SMART enables the identification of cell type-specific differentially expressed genes across conditions, elucidating biological changes at a single-cell-type resolution.
Affiliation(s)
- Chen Xi Yang
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, BC, Canada
- Department of Bioinformatics, Faculty of Science, University of British Columbia, Vancouver, BC, Canada
| | - Don D Sin
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, BC, Canada
- Division of Respiratory Medicine, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Raymond T Ng
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, BC, Canada
- Department of Bioinformatics, Faculty of Science, University of British Columbia, Vancouver, BC, Canada
- Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
|
17
|
Sango J, Carcamo S, Sirenko M, Maiti A, Mansour H, Ulukaya G, Tomalin LE, Cruz-Rodriguez N, Wang T, Olszewska M, Olivier E, Jaud M, Nadorp B, Kroger B, Hu F, Silverman L, Chung SS, Wagenblast E, Chaligne R, Eisfeld AK, Demircioglu D, Landau DA, Lito P, Papaemmanuil E, DiNardo CD, Hasson D, Konopleva M, Papapetrou EP. RAS-mutant leukaemia stem cells drive clinical resistance to venetoclax. Nature 2024; 636:241-250. [PMID: 39478230 PMCID: PMC11618090 DOI: 10.1038/s41586-024-08137-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/26/2023] [Accepted: 09/30/2024] [Indexed: 12/06/2024]
Abstract
Cancer driver mutations often show distinct temporal acquisition patterns, but the biological basis for this, if any, remains unknown. RAS mutations occur invariably late in the course of acute myeloid leukaemia, upon progression or relapsed/refractory disease [1-6]. Here, by using human leukaemogenesis models, we first show that RAS mutations are obligatory late events that need to succeed earlier cooperating mutations. We provide the mechanistic explanation for this in a requirement for mutant RAS to specifically transform committed progenitors of the myelomonocytic lineage (granulocyte-monocyte progenitors) harbouring previously acquired driver mutations, showing that advanced leukaemic clones can originate from a different cell type in the haematopoietic hierarchy than ancestral clones. Furthermore, we demonstrate that RAS-mutant leukaemia stem cells (LSCs) give rise to monocytic disease, as observed frequently in patients with poor responses to treatment with the BCL2 inhibitor venetoclax. We show that this is because RAS-mutant LSCs, in contrast to RAS-wild-type LSCs, have altered BCL2 family gene expression and are resistant to venetoclax, driving clinical resistance and relapse with monocytic features. Our findings demonstrate that a specific genetic driver shapes the non-genetic cellular hierarchy of acute myeloid leukaemia by imposing a specific LSC target cell restriction and critically affects therapeutic outcomes in patients.
MESH Headings
- Animals
- Female
- Humans
- Mice
- Antineoplastic Agents/pharmacology
- Antineoplastic Agents/therapeutic use
- Bridged Bicyclo Compounds, Heterocyclic/pharmacology
- Bridged Bicyclo Compounds, Heterocyclic/therapeutic use
- Cell Lineage/genetics
- Drug Resistance, Neoplasm/drug effects
- Drug Resistance, Neoplasm/genetics
- Leukemia, Myeloid, Acute/genetics
- Leukemia, Myeloid, Acute/drug therapy
- Leukemia, Myeloid, Acute/pathology
- Monocytes/metabolism
- Monocytes/drug effects
- Mutation
- Neoplastic Stem Cells/pathology
- Neoplastic Stem Cells/drug effects
- Neoplastic Stem Cells/metabolism
- Proto-Oncogene Proteins c-bcl-2/metabolism
- Proto-Oncogene Proteins c-bcl-2/genetics
- Proto-Oncogene Proteins c-bcl-2/antagonists & inhibitors
- ras Proteins/metabolism
- ras Proteins/genetics
- Sulfonamides/pharmacology
- Sulfonamides/therapeutic use
- Granulocytes
- Clone Cells/metabolism
- Clone Cells/pathology
- Stem Cells/metabolism
- Stem Cells/pathology
- Recurrence
Affiliation(s)
- Junya Sango
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Saul Carcamo
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bioinformatics for Next Generation Sequencing Shared Resource Facility, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Maria Sirenko
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Louis V. Gerstner Jr Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Abhishek Maiti
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Hager Mansour
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gulay Ulukaya
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bioinformatics for Next Generation Sequencing Shared Resource Facility, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Lewis E Tomalin
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bioinformatics for Next Generation Sequencing Shared Resource Facility, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Nataly Cruz-Rodriguez
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Tiansu Wang
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Malgorzata Olszewska
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Emmanuel Olivier
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Manon Jaud
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bettina Nadorp
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Benjamin Kroger
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Medical Scientist Training Program, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Feng Hu
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Lewis Silverman
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Stephen S Chung
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Children's Medical Center Research Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Elvin Wagenblast
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ronan Chaligne
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Ann-Kathrin Eisfeld
- Clara D. Bloomfield Center for Leukemia Outcomes Research, The Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
| | - Deniz Demircioglu
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bioinformatics for Next Generation Sequencing Shared Resource Facility, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Dan A Landau
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Piro Lito
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Elli Papaemmanuil
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Courtney D DiNardo
- Department of Leukemia, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Dan Hasson
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bioinformatics for Next Generation Sequencing Shared Resource Facility, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Marina Konopleva
- Department of Medicine (Oncology), Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Molecular Pharmacology, Albert Einstein College of Medicine, Bronx, NY, USA
- Montefiore Einstein Comprehensive Cancer Center, Bronx, NY, USA
| | - Eirini P Papapetrou
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Center for Advancement of Blood Cancer Therapies, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
18
|
Xu X, Li R, Mo O, Liu K, Li J, Hao P. Cell-type deconvolution for bulk RNA-seq data using single-cell reference: a comparative analysis and recommendation guideline. Brief Bioinform 2024; 26:bbaf031. [PMID: 39899596 PMCID: PMC11789683 DOI: 10.1093/bib/bbaf031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Revised: 12/06/2024] [Accepted: 01/20/2025] [Indexed: 02/05/2025]
Abstract
The accurate estimation of cell type proportions in tissues is crucial for various downstream analyses. With the increasing availability of single-cell sequencing data, numerous deconvolution methods that use single-cell RNA sequencing data as a reference have been developed. However, a unified understanding of how these deconvolution approaches perform in practical applications is still lacking. To address this, we systematically assessed the accuracy and robustness of nine deconvolution methods that use single-cell RNA sequencing data as a reference, evaluating them on real bulk data with cell proportions verified through flow cytometry, as well as simulated bulk data generated from five single-cell RNA sequencing datasets. Our study highlights the importance of several factors (reference dataset construction strategies, dataset size, cell type subdivision, and cell type inconsistency) for the accuracy and robustness of deconvolution results. We also propose a set of recommended guidelines for software users in diverse scenarios.
Affiliation(s)
- Xintian Xu
- Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, 1 Yanqihu East Road, Huairou District, Beijing 100039, China
| | - Rui Li
- Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, 1 Yanqihu East Road, Huairou District, Beijing 100039, China
| | - Ouyang Mo
- Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, 1 Yanqihu East Road, Huairou District, Beijing 100039, China
| | - Kai Liu
- Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- Department of Colorectal Surgery, Fudan University Shanghai Cancer Center, 270 Dong'an Road, Xuhui District, Shanghai 200032, China
| | - Justin Li
- Department of Mathematics, University of Connecticut, 352 Mansfield Road, Storrs, CT 06269, USA
| | - Pei Hao
- Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, 1 Yanqihu East Road, Huairou District, Beijing 100039, China
| |
|
19
|
Luo S, Zhu M, Lin L, Xie J, Lin S, Chen Y, Zhu J, Huang J. DECA: harnessing interpretable transformer model for cellular deconvolution of chromatin accessibility profile. Brief Bioinform 2024; 26:bbaf069. [PMID: 39987573 PMCID: PMC11847511 DOI: 10.1093/bib/bbaf069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 01/09/2025] [Accepted: 02/06/2025] [Indexed: 02/25/2025]
Abstract
The assay for transposase-accessible chromatin with sequencing (ATAC-seq) identifies chromatin accessibility across the genome, which is crucial for regulating gene expression. However, bulk ATAC-seq obscures cellular heterogeneity, while single-cell ATAC-seq suffers from issues such as sparsity and high cost. To this end, we introduce DECA, a deep learning model based on a vision transformer that deconvolves cell type information from bulk chromatin accessibility profiles, using single-cell ATAC-seq datasets as a reference for enhanced precision and resolution. Notably, the patch attention generated by DECA's multi-head attention mechanism aligns with chromatin interactions detected by Hi-C. Additionally, DECA predicted lineage-specific changes in cell composition due to genetic perturbation. The chromatin accessibility signatures predicted by DECA are enriched with cell-type-specific genetic variations. Finally, we applied DECA to pan-cancer ATAC-seq datasets and demonstrated its capability to deconvolve cell type proportions with clinical significance. Taken together, DECA deconvolves cellular proportions and predicts their chromatin accessibility profiles from bulk chromatin accessibility data, which enables exploration of gene regulatory programs in development and disease.
Affiliation(s)
- Shijie Luo
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
- National Institute for Data Science in Health and Medicine, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Ming Zhu
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Liquan Lin
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Jiajing Xie
- National Institute for Data Science in Health and Medicine, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Shihao Lin
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Ying Chen
- School of Informatics, Xiamen University, No. 4221, Xiang'an South Road, Fujian 361000, China
| | - Jiali Zhu
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| | - Jialiang Huang
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
- National Institute for Data Science in Health and Medicine, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China
| |
|
20
|
Tejima K, Kozawa S, Sato TN. Cell type-specific weighting-factors to solve solid organs-specific limitations of single cell RNA-sequencing. PLoS Genet 2024; 20:e1011436. [PMID: 39556600 PMCID: PMC11573148 DOI: 10.1371/journal.pgen.1011436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 09/20/2024] [Indexed: 11/20/2024]
Abstract
While single-cell RNA-sequencing (scRNA-seq) is a popular method for analyzing gene expression and cellular composition at single-cell resolution, it has shortcomings: it fails to account for cell-to-cell variation in transcriptome size (i.e., the total number of transcripts per cell) and for cryptic gene expression induced by cell dissociation and processing. This is particularly a problem when analyzing highly heterogeneous solid tissues/organs, which require cell dissociation for the analysis. As a result, there is a discrepancy between bulk RNA-seq results and bulk RNA-seq results virtually reconstituted from the corresponding scRNA-seq data. To fix this problem, we propose a computationally calculated coefficient, the "cell type-specific weighting-factor (cWF)". Here, we introduce the concept and a method for computing it, and report cWFs for 76 cell types across 10 solid organs. Their fidelity is validated by more accurate reconstitution and deconvolution of bulk RNA-seq data of diverse solid organs using the scRNA-seq data and the cWFs of their composite cells. Furthermore, we show that cWFs effectively predict aging progression, suggesting diagnostic applications and an association with aging mechanisms. Our study provides an important method for addressing critical limitations of scRNA-seq analysis of complex solid tissues/organs. Furthermore, our findings suggest a diagnostic utility and biological significance of cWFs.
Affiliation(s)
- Kengo Tejima
- Karydo TherapeutiX, Inc., Kyoto, Japan
- ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan
- The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, Kyoto, Japan
- V-iClinix Laboratory, Nara Medical University, Nara, Japan
| | - Satoshi Kozawa
- Karydo TherapeutiX, Inc., Kyoto, Japan
- ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan
- The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, Kyoto, Japan
- V-iClinix Laboratory, Nara Medical University, Nara, Japan
| | - Thomas N. Sato
- Karydo TherapeutiX, Inc., Kyoto, Japan
- ERATO Sato Live Bio-Forecasting Project, Kyoto, Japan
- The Thomas N. Sato BioMEC-X Laboratories, Advanced Telecommunications Research Institute International, Kyoto, Japan
- V-iClinix Laboratory, Nara Medical University, Nara, Japan
| |
|
21
|
Fiorini MR, Dilliott AA, Thomas RA, Farhan SMK. Transcriptomics of Human Brain Tissue in Parkinson's Disease: a Comparison of Bulk and Single-cell RNA Sequencing. Mol Neurobiol 2024; 61:8996-9015. [PMID: 38578357 PMCID: PMC11496323 DOI: 10.1007/s12035-024-04124-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/12/2024] [Indexed: 04/06/2024]
Abstract
Parkinson's disease (PD) is a chronic and progressive neurodegenerative disease leading to motor dysfunction and, in some cases, dementia. Transcriptome analysis is one promising approach for characterizing PD and other neurodegenerative disorders by informing how specific disease events influence gene expression and contribute to pathogenesis. With the emergence of single-cell and single-nucleus RNA sequencing (scnRNA-seq) technologies, the transcriptional landscape of neurodegenerative diseases can now be described at the cellular level. As the application of scnRNA-seq is becoming routine, it calls into question how results at single-cell resolution compare to those obtained from RNA sequencing of whole tissues (bulk RNA-seq), whether the findings are compatible, and how the assays are complementary for unraveling the elusive transcriptional changes that drive neurodegenerative disease. Herein, we review the studies that have leveraged RNA-seq technologies to investigate PD. Through the integration of bulk and scnRNA-seq findings from human post-mortem brain tissue, we use the PD literature as a case study to evaluate the compatibility of the results generated from each assay and demonstrate the complementarity of the sequencing technologies. Finally, through the lens of the PD transcriptomic literature, we evaluate the current feasibility of bulk and scnRNA-seq technologies to illustrate the necessity of both technologies for achieving a comprehensive insight into the mechanism by which gene expression promotes neurodegenerative disease. We conclude that the continued application of both assays will provide the greatest insight into neurodegenerative disease pathology, providing both cell-specific and whole-tissue level information.
Affiliation(s)
- Michael R Fiorini
- The Montreal Neurological Institute-Hospital, Montreal, QC, Canada
- Department of Human Genetics, McGill University, Montreal, QC, Canada
| | - Allison A Dilliott
- The Montreal Neurological Institute-Hospital, Montreal, QC, Canada
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
| | - Rhalena A Thomas
- The Montreal Neurological Institute-Hospital, Montreal, QC, Canada.
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada.
| | - Sali M K Farhan
- The Montreal Neurological Institute-Hospital, Montreal, QC, Canada.
- Department of Human Genetics, McGill University, Montreal, QC, Canada.
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada.
| |
|
22
|
Marcos Rubio Á, Oh S, Roelandt S, Stevens D, Van Damme E, Vermaelen K, De Preter K, Everaert C. Defining the optimal setting for transcriptomic analyses on blood samples for response prediction in immunotherapy-treated NSCLC patients. Sci Rep 2024; 14:26026. [PMID: 39472635 PMCID: PMC11522423 DOI: 10.1038/s41598-024-76982-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 10/18/2024] [Indexed: 11/02/2024]
Abstract
Transcriptomic profiling of blood immune cells offers a promising alternative to invasive, sampling bias-prone tissue-based biomarker assays for predicting immune checkpoint inhibitor (ICI) therapy response in non-small cell lung cancer (NSCLC) patients. However, the optimal analytical approach to identify systemic correlates of response still needs to be explored. We collected peripheral blood mononuclear cells and whole blood (WB) samples from 33 ICI-treated NSCLC patients before ICI treatment and at the first response evaluation. After bulk polyadenylated RNA-sequencing, we assessed differences in gene expression profiles between non-responders and responders using differential expression analysis, single sample gene set enrichment analysis (ssGSEA), and cell type deconvolution. We evaluated gene expression values, ssGSEA scores, and deconvolved cell type proportions to distinguish non-responders from responders via ROC curve (AUC) analysis, training a logistic regression classification model. Gene expression values and deconvolved proportions yielded the best results with WB samples after treatment (AUC = 0.87 and 0.85, respectively). Overall, ssGSEA scores showed superior classification performance across all sample types and timepoints (AUC > 0.7). In conclusion, transcriptomic analysis through ssGSEA demonstrated the best performance as a non-invasive biomarker for predicting clinical benefit in ICI-treated NSCLC patients, with gene expression and deconvolution on post-treatment WB samples also showing promising results.
Affiliation(s)
- Álvaro Marcos Rubio
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
| | - Seoyeon Oh
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
| | - Sofie Roelandt
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
| | - Dieter Stevens
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Department of Pulmonary Medicine and Immuno-Oncology Network Ghent, Ghent University Hospital, Ghent, Belgium
| | - Eufra Van Damme
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
| | - Karim Vermaelen
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
- Department of Pulmonary Medicine and Immuno-Oncology Network Ghent, Ghent University Hospital, Ghent, Belgium
| | - Katleen De Preter
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium
| | - Celine Everaert
- Department of Biomolecular Medicine, VIB-UGent Center for Medical Biotechnology, Ghent University, Technologiepark-Zwijnaarde 75, 9052, Ghent, Belgium.
- Cancer Research Institute Ghent (CRIG), Medical Research Building 2 (MRB2) - UZ Gent - Corneel Heymanslaan 10, 9000, Ghent, Belgium.
| |
|
23
|
Wang C, Lin Y, Li S, Guan J. Deconvolution from bulk gene expression by leveraging sample-wise and gene-wise similarities and single-cell RNA-Seq data. BMC Genomics 2024; 25:875. [PMID: 39294558 PMCID: PMC11409548 DOI: 10.1186/s12864-024-10728-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 08/20/2024] [Indexed: 09/20/2024]
Abstract
BACKGROUND: The widely adopted bulk RNA-seq measures the average gene expression of cells, masking cell type heterogeneity, which confounds downstream analyses. Therefore, identifying the cellular composition and cell type-specific gene expression profiles (GEPs) facilitates the study of the underlying mechanisms of various biological processes. Although single-cell RNA-seq focuses on cell type heterogeneity in gene expression, it requires specialized and expensive resources and currently is not practical for a large number of samples or a routine clinical setting. Recently, computational deconvolution methodologies have been developed, but many of them estimate only cell type composition or only cell type-specific GEPs, requiring the other as input. The development of more accurate deconvolution methods to infer cell type abundance and cell type-specific GEPs is still essential. RESULTS: We propose a new deconvolution algorithm, DSSC, which infers cell type-specific gene expression and cell type proportions of heterogeneous samples simultaneously by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with other existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC is robust to changes in marker gene number and sample size and is also cost- and time-efficient. CONCLUSIONS: DSSC provides a practical and promising alternative to experimental techniques for characterizing cellular composition and heterogeneity in the gene expression of heterogeneous samples.
Affiliation(s)
- Chenqi Wang
- Department of Automation, Xiamen University, Xiamen, China
| | - Yifan Lin
- Department of Automation, Xiamen University, Xiamen, China
| | - Shuchao Li
- Department of Automation, Xiamen University, Xiamen, China
| | - Jinting Guan
- Department of Automation, Xiamen University, Xiamen, China.
- Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai, China.
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China.
| |
|
24
|
Larsen JH, Jensen IS, Svenningsen P. Benchmarking transcriptome deconvolution methods for estimating tissue- and cell-type-specific extracellular vesicle abundances. J Extracell Vesicles 2024; 13:e12511. [PMID: 39320021 PMCID: PMC11423344 DOI: 10.1002/jev2.12511] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/29/2024] [Accepted: 08/28/2024] [Indexed: 09/26/2024]
Abstract
Extracellular vesicles (EVs) contain cell-derived lipids, proteins and RNAs; however, determining the tissue- and cell-type-specific EV abundances in body fluids remains a significant hurdle for our understanding of EV biology. While tissue- and cell-type-specific EV abundances can be estimated by matching the EV's transcriptome to a tissue's/cell type's expression signature using deconvolution methods, a comparative assessment of deconvolution methods' performance on EV transcriptome data is currently lacking. We benchmarked 11 deconvolution methods using data from four cell lines and their EVs, in silico mixtures, 118 human plasma and 88 urine EVs. We identified deconvolution methods that estimated cell type-specific abundances of pure and in silico mixed cell line-derived EV samples with high accuracy. Using data from two urine EV cohorts with different EV isolation procedures, four deconvolution methods produced highly similar results. The three methods were also concordant in their tissue- and cell-type-specific plasma EV abundance estimates. We identified driving factors for deconvolution accuracy and highlighted the importance of implementing biological knowledge in creating the tissue/cell type signature. Overall, our analyses demonstrate that the deconvolution algorithms DWLS and CIBERSORTx produce highly similar and accurate estimates of tissue- and cell-type-specific EV abundances in biological fluids.
Affiliation(s)
| | - Iben Skov Jensen
- Department of Molecular Medicine, University of Southern Denmark, Odense, Denmark
| | - Per Svenningsen
- Department of Molecular Medicine, University of Southern Denmark, Odense, Denmark
| |
|
25
|
Zhu S, Kubota N, Wang S, Wang T, Xiao G, Hoshida Y. STIE: Single-cell level deconvolution, convolution, and clustering in in situ capturing-based spatial transcriptomics. Nat Commun 2024; 15:7559. [PMID: 39214995 PMCID: PMC11364663 DOI: 10.1038/s41467-024-51728-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 08/14/2024] [Indexed: 09/04/2024]
Abstract
In in situ capturing-based spatial transcriptomics, spots of uniform size printed at fixed locations cannot precisely capture randomly located single cells and therefore inherently fail to profile the transcriptome at the single-cell level. To address this, we present STIE, an expectation-maximization algorithm that aligns the spatial transcriptome to the nuclear morphology in its matched histology image and recovers missing cells from the ~70% gap area, thereby achieving true single-cell-level, whole-slide-scale deconvolution, convolution, and clustering for both low- and high-resolution spots. STIE characterizes cell-type-specific gene expression and shows better concordance with true cell-type-specific transcriptomic signatures than other spot- and subspot-level methods. Furthermore, STIE reveals single-cell-level insights, for instance, a lower actual spot resolution than the reported spot size, unbiased evaluation of cell type colocalization, superior power of high-resolution spots in distinguishing nuanced cell types, and spatial cell-cell interactions at the single-cell level rather than the spot level.
Affiliation(s)
- Shijia Zhu
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA.
- Division of Digestive and Liver Diseases, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA.
| | - Naoto Kubota
- Division of Digestive and Liver Diseases, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Shidan Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Guanghua Xiao
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Yujin Hoshida
- Division of Digestive and Liver Diseases, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA.
| |
|
26
|
Li Y, Luo Y. STdGCN: spatial transcriptomic cell-type deconvolution using graph convolutional networks. Genome Biol 2024; 25:206. [PMID: 39103939 DOI: 10.1186/s13059-024-03353-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/01/2023] [Accepted: 07/26/2024] [Indexed: 08/07/2024]
Abstract
Spatially resolved transcriptomics integrates high-throughput transcriptome measurements with preserved spatial cellular organization information. However, many technologies cannot reach single-cell resolution. We present STdGCN, a graph model leveraging single-cell RNA sequencing (scRNA-seq) as reference for cell-type deconvolution in spatial transcriptomic (ST) data. STdGCN incorporates expression profiles from scRNA-seq and spatial localization from ST data for deconvolution. Extensive benchmarking on multiple datasets demonstrates that STdGCN outperforms 17 state-of-the-art models. In a human breast cancer Visium dataset, STdGCN delineates stroma, lymphocytes, and cancer cells, aiding tumor microenvironment analysis. In human heart ST data, STdGCN identifies changes in endothelial-cardiomyocyte communications during tissue development.
Affiliation(s)
- Yawei Li
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
| | - Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
- Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
| |
|
27
|
Aronson SL, Walker C, Thijssen B, van de Vijver KK, Horlings HM, Sanders J, Alkemade M, Koole SN, Lopez-Yurda M, Lok CAR, Rottenberg S, van Rheenen J, Sonke GS, van Driel WJ, Kester LA, Hahn K. Tumour microenvironment characterisation to stratify patients for hyperthermic intraperitoneal chemotherapy in high-grade serous ovarian cancer (OVHIPEC-1). Br J Cancer 2024; 131:565-576. [PMID: 38866963 PMCID: PMC11300911 DOI: 10.1038/s41416-024-02731-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 05/14/2024] [Accepted: 05/17/2024] [Indexed: 06/14/2024]
Abstract
BACKGROUND: Hyperthermic intraperitoneal chemotherapy (HIPEC) improves survival in patients with Stage III ovarian cancer following interval cytoreductive surgery (CRS). Optimising patient selection is essential to maximise treatment efficacy and avoid overtreatment. This study aimed to identify biomarkers that predict HIPEC benefit by analysing gene signatures and cellular composition of tumours from participants in the OVHIPEC-1 trial. METHODS: Whole-transcriptome RNA sequencing data were retrieved from high-grade serous ovarian cancer (HGSOC) samples from 147 patients obtained during interval CRS. We performed differential gene expression analysis and applied deconvolution methods to estimate cell-type proportions in bulk mRNA data, validated by histological assessment. We tested the interaction between treatment and potential predictors on progression-free survival using Cox proportional hazards models. RESULTS: While differential gene expression analysis did not yield any predictive biomarkers, the cellular composition, as characterised by deconvolution, indicated that the absence of macrophages and the presence of B cells in the tumour microenvironment are potential predictors of HIPEC benefit. The histological assessment confirmed the predictive value of macrophage absence. CONCLUSION: Immune cell composition, in particular macrophage absence, may predict response to HIPEC in HGSOC, and these hypothesis-generating findings warrant further investigation. CLINICAL TRIAL REGISTRATION: NCT00426257.
Affiliation(s)
- S Lot Aronson
- Center for Gynecologic Oncology Amsterdam, Department of Gynecologic Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Cédric Walker
- Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
| | - Bram Thijssen
- Division of Molecular Carcinogenesis, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Oncode Institute, Utrecht, The Netherlands
| | - Koen K van de Vijver
- Center for Gynecologic Oncology Amsterdam, Department of Gynecologic Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Pathology & Cancer Research Institute Ghent (CRIG), Ghent University Hospital, Ghent, Belgium
| | - Hugo M Horlings
- Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Joyce Sanders
- Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Maartje Alkemade
- Core Facility Molecular Pathology and Biobanking, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Simone N Koole
- Center for Gynecologic Oncology Amsterdam, Department of Gynecologic Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Marta Lopez-Yurda
- Department of Biometrics, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Christianne A R Lok
- Center for Gynecologic Oncology Amsterdam, Department of Gynecologic Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Sven Rottenberg
- Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Bern Center for Precision Medicine, University of Bern, Bern, Switzerland
| | - Jacco van Rheenen
- Department of Molecular Pathology, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Gabe S Sonke
- Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Willemien J van Driel
- Center for Gynecologic Oncology Amsterdam, Department of Gynecologic Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Lennart A Kester
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
| | - Kerstin Hahn
- Department of Molecular Pathology, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands
|
28
|
Li M, Su Y, Gao Y, Tian W. ReCIDE: robust estimation of cell type proportions by integrating single-reference-based deconvolutions. Brief Bioinform 2024; 25:bbae422. [PMID: 39177263 PMCID: PMC11342246 DOI: 10.1093/bib/bbae422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 07/16/2024] [Accepted: 08/12/2024] [Indexed: 08/24/2024] Open
Abstract
In this study, we introduce ReCIDE (Robust estimation of Cell type proportions by Integrating single-reference-based DEconvolutions), a framework that integrates single-reference-based deconvolutions to estimate cell type proportions robustly. ReCIDE outperforms existing approaches in benchmark and real datasets, particularly excelling in estimating rare cell type proportions. Through exploratory analysis on public bulk data of triple-negative breast cancer (TNBC) patients using ReCIDE, we demonstrate a significant correlation between the prognosis of TNBC patients and the proportions of both T cell and perivascular-like cell subtypes. Building on this discovery, we develop a prognostic assessment model for TNBC patients. Our contribution presents a novel framework for enhancing deconvolution accuracy, showcasing its effectiveness in medical research.
Affiliation(s)
- Minghan Li
- State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Yangpu District, Shanghai 200438, China
| | - Yuqing Su
- State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Yangpu District, Shanghai 200438, China
| | - Yanbo Gao
- Shanghai SPH Jiaolian Pharmaceutical Technology Company, Limited, Building 4, 998 Ha Lei Road, Pudong District, Shanghai 201203, China
| | - Weidong Tian
- State Key Laboratory of Genetic Engineering, Department of Computational Biology, School of Life Sciences, Fudan University, 2005 Songhu Road, Yangpu District, Shanghai 200438, China
- Children’s Hospital of Fudan University, 399 Wanyuan Road, Minhang District, Shanghai 201102, China
- Children’s Hospital of Shandong University, 23976 Jingshi Road, Huaiyin District, Jinan, Shandong 250022, China
|
29
|
Wang J, Fonseca GJ, Ding J. scSemiProfiler: Advancing large-scale single-cell studies through semi-profiling with deep generative models and active learning. Nat Commun 2024; 15:5989. [PMID: 39013867 PMCID: PMC11252419 DOI: 10.1038/s41467-024-50150-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 06/28/2024] [Indexed: 07/18/2024] Open
Abstract
Single-cell sequencing is a crucial tool for dissecting the cellular intricacies of complex diseases. Its prohibitive cost, however, hampers its application in expansive biomedical studies. Traditional cellular deconvolution approaches can infer cell type proportions from more affordable bulk sequencing data, yet they fall short in providing the detailed resolution required for single-cell-level analyses. To overcome this challenge, we introduce "scSemiProfiler", an innovative computational framework that marries deep generative models with active learning strategies. This method adeptly infers single-cell profiles across large cohorts by fusing bulk sequencing data with targeted single-cell sequencing from a few rigorously chosen representatives. Extensive validation across heterogeneous datasets verifies the precision of our semi-profiling approach, aligning closely with true single-cell profiling data and empowering refined cellular analyses. Originally developed for extensive disease cohorts, "scSemiProfiler" is adaptable for broad applications. It provides a scalable, cost-effective solution for single-cell profiling, facilitating in-depth cellular investigation in various biological domains.
Affiliation(s)
- Jingtao Wang
- Meakins-Christie Laboratories, Research Institute of McGill University Health Centre, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
- Department of Medicine, Division of Experimental Medicine, McGill University, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
| | - Gregory J Fonseca
- Meakins-Christie Laboratories, Research Institute of McGill University Health Centre, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
- Department of Medicine, Division of Experimental Medicine, McGill University, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
- Quantitative Life Sciences, McGill University, 845 Rue Sherbrooke Ouest, Montreal, H3A 0G4, Quebec, Canada
| | - Jun Ding
- Meakins-Christie Laboratories, Research Institute of McGill University Health Centre, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
- Department of Medicine, Division of Experimental Medicine, McGill University, 1001 Decarie Blvd, Montreal, H4A 3J1, Quebec, Canada
- Quantitative Life Sciences, McGill University, 845 Rue Sherbrooke Ouest, Montreal, H3A 0G4, Quebec, Canada
- School of Computer Science, McGill University, 3480 Rue University, Montreal, H3A 2A7, Quebec, Canada
- Mila-Quebec AI Institute, 6666 Rue Saint-Urbain, Montreal, H2S 3H1, Quebec, Canada
|
30
|
Görtler F, Mensching-Buhr M, Skaar Ø, Schrod S, Sterr T, Schäfer A, Beißbarth T, Joshi A, Zacharias HU, Grellscheid SN, Altenbuchinger M. Adaptive digital tissue deconvolution. Bioinformatics 2024; 40:i100-i109. [PMID: 38940181 PMCID: PMC11256946 DOI: 10.1093/bioinformatics/btae263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION The inference of cellular compositions from bulk and spatial transcriptomics data increasingly complements data analyses. Multiple computational approaches have been suggested and, recently, machine learning techniques were developed to systematically improve estimates. Such approaches make it possible to infer additional, less abundant cell types. However, they rely on training data which do not capture the full biological diversity encountered in transcriptomics analyses; data can contain cellular contributions not seen in the training data and, as such, analyses can be biased or blurred. Thus, computational approaches have to deal with unknown, hidden contributions. Moreover, most methods are based on cellular archetypes which serve as a reference; e.g. a generic T-cell profile is used to infer the proportion of T-cells. It is well known that cells adapt their molecular phenotype to the environment and that pre-specified cell archetypes can distort the inference of cellular compositions. RESULTS We propose Adaptive Digital Tissue Deconvolution (ADTD) to estimate cellular proportions of pre-selected cell types together with possibly unknown and hidden background contributions. Moreover, ADTD adapts prototypic reference profiles to the molecular environment of the cells, which further resolves cell-type specific gene regulation from bulk transcriptomics data. We verify this in simulation studies and demonstrate that ADTD improves on existing approaches in estimating cellular compositions. In an application to bulk transcriptomics data from breast cancer patients, we demonstrate that ADTD provides insights into cell-type specific molecular differences between breast cancer subtypes. AVAILABILITY AND IMPLEMENTATION A Python implementation of ADTD and a tutorial are available at GitLab and Zenodo (doi:10.5281/zenodo.7548362).
Affiliation(s)
- Franziska Görtler
- Computational Biology Unit, Department of Biological Sciences, University of Bergen, N-5008 Bergen, Norway
- Department of Oncology and Medical Physics, Haukeland University Hospital, 5021 Bergen, Norway
| | - Malte Mensching-Buhr
- Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Ørjan Skaar
- Department of Informatics, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
| | - Stefan Schrod
- Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Thomas Sterr
- Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
| | - Andreas Schäfer
- Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
| | - Tim Beißbarth
- Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Anagha Joshi
- Department of Clinical Science, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
| | - Helena U Zacharias
- Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, 30625 Hannover, Germany
| | | | - Michael Altenbuchinger
- Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
|
31
|
Tiong KL, Luzhbin D, Yeang CH. Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data. BMC Bioinformatics 2024; 25:209. [PMID: 38867193 PMCID: PMC11167951 DOI: 10.1186/s12859-024-05825-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open
Abstract
BACKGROUND Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also possess a high level of noise, abundant missing entries and sometimes inadequate or no cell type annotations at all. Bulk-level gene expression data lack direct information of cell population composition but are more robust and complete and often better annotated. We propose a modeling framework to integrate bulk-level and single-cell RNASeq data to address the deficiencies and leverage the mutual strengths of each type of data and enable a more comprehensive inference of their transcriptomic heterogeneity. Contrary to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed the probabilistic graphical models of cell-level gene expressions from the decomposition outcomes, and compared the log-likelihood scores of these models in single-cell data. We term this framework backward deconvolution as inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for inclusion or exclusion of zero entries in log-likelihood score computation. RESULTS We selected nine deconvolution algorithms and validated backward deconvolution in five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors of mixture coefficients and cell type specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown but the log-likelihood scores were strongly correlated with accuracy rates of inferred cell types. 
In the data of autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model that tumors of each subtype were dominated by one cell type persistently outperformed an alternative model that each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated by simulated sc-RNASeq data. CONCLUSIONS The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.
Affiliation(s)
- Khong-Loon Tiong
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
| | - Dmytro Luzhbin
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
|
32
|
Tao Q, Xu Y, He Y, Luo T, Li X, Han L. Benchmarking mapping algorithms for cell-type annotating in mouse brain by integrating single-nucleus RNA-seq and Stereo-seq data. Brief Bioinform 2024; 25:bbae250. [PMID: 38796691 PMCID: PMC11128029 DOI: 10.1093/bib/bbae250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/17/2024] [Accepted: 05/08/2024] [Indexed: 05/28/2024] Open
Abstract
Limited gene capture efficiency and spot size of spatial transcriptome (ST) data pose significant challenges in cell-type characterization. The heterogeneity and complexity of cell composition in the mammalian brain make it more challenging to accurately annotate ST data from the brain. Many algorithms attempt to characterize neuronal subtypes by integrating ST data with single-nucleus RNA sequencing (snRNA-seq) or single-cell RNA sequencing. However, the accuracy of these algorithms on Stereo-seq ST data has not yet been assessed. Here, we benchmarked 9 mapping algorithms using 10 ST datasets from four mouse brain regions in two different resolutions and 24 pseudo-ST datasets from snRNA-seq. Both actual ST data and pseudo-ST data were mapped using snRNA-seq datasets from the corresponding brain regions as reference data. After comparing performance across different areas and resolutions of the mouse brain, we concluded that both Robust Cell Type Decomposition (RCTD) and SpatialDWLS demonstrated superior robustness and accuracy in cell-type annotation. Testing with publicly available snRNA-seq data from another sequencing platform in the cortex region further validated our conclusions. Altogether, we developed a workflow for assessing the suitability of mapping algorithms for ST datasets, which can improve the efficiency and accuracy of spatial data annotation.
Affiliation(s)
- Quyuan Tao
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310012, China
| | - Yiheng Xu
- Department of Neurobiology and Department of Neurology of Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, MOE Frontier Center of Brain Science and Brain-machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou 310058, China
| | - Youzhe He
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310012, China
| | - Ting Luo
- BGI Research, Hangzhou 310012, China
- BGI Research, Shenzhen 518103, China
| | - Xiaoming Li
- Department of Neurobiology and Department of Neurology of Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, MOE Frontier Center of Brain Science and Brain-machine Integration, School of Brain Science and Brain Medicine, Zhejiang University, Hangzhou 310058, China
- Research Units for Emotion and Emotion disorders, Chinese Academy of Medical Sciences, Beijing 100730, China
| | - Lei Han
- BGI Research, Hangzhou 310012, China
- BGI Research, Shenzhen 518103, China
|
33
|
Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
| | - Ha Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
| | - Duc Tran
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Advaita Bioinformatics, Ann Arbor, MI, USA
| | - Tin Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
|
34
|
Gao ZJ, Fang H, Sun S, Liu SQ, Fang Z, Liu Z, Li B, Wang P, Sun SR, Meng XY, Wu Q, Chen CS. Single-cell analyses reveal evolution mimicry during the specification of breast cancer subtype. Theranostics 2024; 14:3104-3126. [PMID: 38855191 PMCID: PMC11155410 DOI: 10.7150/thno.96163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 05/12/2024] [Indexed: 06/11/2024] Open
Abstract
Background: The stem or progenitor antecedents confer developmental plasticity and unique cell identities to cancer cells via genetic and epigenetic programs. A comprehensive characterization and mapping of the cell-of-origin of breast cancer using novel technologies to unveil novel subtype-specific therapeutic targets is still absent. Methods: We integrated 195,144 high-quality cells from normal breast tissues and 406,501 high-quality cells from primary breast cancer samples to create a large-scale single-cell atlas of human normal and cancerous breasts. Potential heterogeneous origin of malignant cells was explored by contrasting cancer cells against reference normal epithelial cells. Multi-omics analyses and both in vitro and in vivo experiments were performed to screen and validate potential subtype-specific treatment targets. Novel biomarkers of identified immune and stromal cell subpopulations were validated by immunohistochemistry in our cohort. Results: Tumor stratification based on cancer cell-of-origin patterns correlated with clinical outcomes, genomic aberrations and diverse microenvironment constitutions. We found that the luminal progenitor (LP) subtype was robustly associated with poor prognosis, genomic instability and dysfunctional immune microenvironment. However, the LP subtype patients were sensitive to neoadjuvant chemotherapy (NAC), PARP inhibitors (PARPi) and immunotherapy. The LP subtype-specific target PLK1 was investigated by both in vitro and in vivo experiments. Besides, large-scale single-cell profiling of breast cancer inspired us to identify a range of clinically relevant immune and stromal cell subpopulations, including subsets of innate lymphoid cells (ILCs), macrophages and endothelial cells. Conclusion: The present single-cell study revealed the cellular repertoire and cell-of-origin patterns of breast cancer. 
Combining single-cell and bulk transcriptome data, we elucidated the evolution mimicry from normal to malignant subtypes and expounded the LP subtype with vital clinical implications. Novel immune and stromal cell subpopulations of breast cancer identified in our study could be potential therapeutic targets. Taken together, our findings lay the foundation for the precise prognostic and therapeutic stratification of breast cancer.
Affiliation(s)
- Zhi-Jie Gao
- Department of Breast and Thyroid Surgery, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Huan Fang
- Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Kunming College of Life Sciences, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Si Sun
- Department of Clinical Laboratory, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Si-Qing Liu
- Department of Breast and Thyroid Surgery, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Zhou Fang
- Department of Breast and Thyroid Surgery, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Zhou Liu
- Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Bei Li
- Department of Pathology, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Ping Wang
- Medical College, Anhui University of Science and Technology, Huainan, Anhui, China
- Tongji University Cancer Center, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Sheng-Rong Sun
- Department of Breast and Thyroid Surgery, Renmin Hospital of Wuhan University, Wuhan, Hubei, China
| | - Xiang-Yu Meng
- Health Science Center, Hubei Minzu University, Enshi, Hubei, China
| | - Qi Wu
- Tongji University Cancer Center, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Ce-Shi Chen
- Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Academy of Biomedical Engineering, Kunming Medical University, Kunming, Yunnan, China
- The Third Affiliated Hospital, Kunming Medical University, Kunming, Yunnan, China
|
35
|
Meng G, Pan Y, Tang W, Zhang L, Cui Y, Schumacher FR, Wang M, Wang R, He S, Krischer J, Li Q, Feng H. imply: improving cell-type deconvolution accuracy using personalized reference profiles. Genome Med 2024; 16:65. [PMID: 38685057 PMCID: PMC11057104 DOI: 10.1186/s13073-024-01338-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 04/18/2024] [Indexed: 05/02/2024] Open
Abstract
Using computational tools, bulk transcriptomics can be deconvoluted to estimate the abundance of constituent cell types. However, existing deconvolution methods are conditioned on the assumption that the whole study population is served by a single reference panel, ignoring person-to-person heterogeneity. Here, we present imply, a novel algorithm to deconvolute cell type proportions using personalized reference panels. Simulation studies demonstrate reduced bias compared with existing methods. Real data analyses on longitudinal consortia show disparities in cell type proportions are associated with several disease phenotypes in Type 1 diabetes and Parkinson's disease. imply is available through the R/Bioconductor package ISLET at https://bioconductor.org/packages/ISLET/.
Affiliation(s)
- Guanqun Meng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Yue Pan
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, 38105, TN, USA
| | - Wen Tang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Lijun Zhang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Ying Cui
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, CA, USA
| | - Fredrick R Schumacher
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Ming Wang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Rui Wang
- Department of Surgery, Division of Surgical Oncology, University Hospitals Cleveland Medical Center, Cleveland, 44106, OH, USA
| | - Sijia He
- Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, USA
| | - Jeffrey Krischer
- Health Informatics Institute, University of South Florida, Tampa, 38105, FL, USA
| | - Qian Li
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, 38105, TN, USA
| | - Hao Feng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
|
36
|
Lu Y, Chen QM, An L. SPADE: spatial deconvolution for domain specific cell-type estimation. Commun Biol 2024; 7:469. [PMID: 38632414 PMCID: PMC11024133 DOI: 10.1038/s42003-024-06172-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 04/10/2024] [Indexed: 04/19/2024] Open
Abstract
Understanding gene expression in different cell types within their spatial context is a key goal in genomics research. SPADE (SPAtial DEconvolution), our proposed method, addresses this by integrating spatial patterns into the analysis of cell type composition. This approach uses a combination of single-cell RNA sequencing, spatial transcriptomics, and histological data to accurately estimate the proportions of cell types in various locations. Our analyses of synthetic data have demonstrated SPADE's capability to discern cell type-specific spatial patterns effectively. When applied to real-life datasets, SPADE provides insights into cellular dynamics and the composition of tumor tissues. This enhances our comprehension of complex biological systems and aids in exploring cellular diversity. SPADE represents a significant advancement in deciphering spatial gene expression patterns, offering a powerful tool for the detailed investigation of cell types in spatial transcriptomics.
Affiliation(s)
- Yingying Lu
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
| | - Qin M Chen
- College of Pharmacy, University of Arizona, Tucson, AZ, 85721, USA
| | - Lingling An
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
- Department of Biosystems Engineering, University of Arizona, Tucson, AZ, 85721, USA
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, 85721, USA
|
37
|
Huuki-Myers LA, Montgomery KD, Kwon SH, Cinquemani S, Eagles NJ, Gonzalez-Padilla D, Maden SK, Kleinman JE, Hyde TM, Hicks SC, Maynard KR, Collado-Torres L. Benchmark of cellular deconvolution methods using a multi-assay reference dataset from postmortem human prefrontal cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.09.579665. [PMID: 38405805 PMCID: PMC10888823 DOI: 10.1101/2024.02.09.579665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Background Cellular deconvolution of bulk RNA-sequencing (RNA-seq) data using single cell or nuclei RNA-seq (sc/snRNA-seq) reference data is an important strategy for estimating cell type composition in heterogeneous tissues, such as human brain. Computational methods for deconvolution have been developed and benchmarked against simulated data, pseudobulked sc/snRNA-seq data, or immunohistochemistry reference data. A major limitation in developing improved deconvolution algorithms has been the lack of integrated datasets with orthogonal measurements of gene expression and estimates of cell type proportions on the same tissue sample. Deconvolution algorithm performance has not yet been evaluated across different RNA extraction methods (cytosolic, nuclear, or whole cell RNA), different library preparation types (mRNA enrichment vs. ribosomal RNA depletion), or with matched single cell reference datasets. Results A rich multi-assay dataset was generated in postmortem human dorsolateral prefrontal cortex (DLPFC) from 22 tissue blocks. Assays included spatially-resolved transcriptomics, snRNA-seq, bulk RNA-seq (across six library/extraction RNA-seq combinations), and RNAScope/Immunofluorescence (RNAScope/IF) for six broad cell types. The Mean Ratio method, implemented in the DeconvoBuddies R package, was developed for selecting cell type marker genes. Six computational deconvolution algorithms were evaluated in DLPFC and predicted cell type proportions were compared to orthogonal RNAScope/IF measurements. Conclusions Bisque and hspe were the most accurate methods and were robust to differences in RNA library types and extractions. This multi-assay dataset showed that cell size differences, marker genes differentially quantified across RNA libraries, and cell composition variability in reference snRNA-seq impact the accuracy of current deconvolution methods.
Affiliation(s)
- Louise A. Huuki-Myers
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
| | - Kelsey D. Montgomery
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
| | - Sang Ho Kwon
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
| | - Sophia Cinquemani
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
| | - Nicholas J. Eagles
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
| | | | - Sean K. Maden
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
| | - Joel E. Kleinman
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
| | - Thomas M. Hyde
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
| | - Stephanie C. Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Kristen R. Maynard
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, 21205, USA
| | - Leonardo Collado-Torres
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21205, USA
| |
|
38
|
Slabowska AO, Pyke C, Hvid H, Jessen LE, Baumgart S, Das V. A systematic evaluation of state-of-the-art deconvolution methods in spatial transcriptomics: insights from cardiovascular disease and chronic kidney disease. Front Bioinform 2024; 4:1352594. [PMID: 38601476 PMCID: PMC11004278 DOI: 10.3389/fbinf.2024.1352594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/11/2024] [Indexed: 04/12/2024] [Open Access]
Abstract
A major challenge in sequencing-based spatial transcriptomics (ST) is resolution limitations. Tissue sections are divided into hundreds of thousands of spots, where each spot invariably contains a mixture of cell types. Methods have been developed to deconvolute the mixed transcriptional signal into its constituents. Although ST is becoming essential for drug discovery, especially in cardiometabolic diseases, to date, no deconvolution benchmark has been performed on these types of tissues and diseases. However, the three methods, Cell2location, RCTD, and spatialDWLS, have previously been shown to perform well in brain tissue and simulated data. Here, we compare these methods to assess the best performance when using human data from cardiovascular disease (CVD) and chronic kidney disease (CKD) from patients in different pathological states, evaluated using expert annotation. In this study, we found that all three methods performed comparably well in deconvoluting verifiable cell types, including smooth muscle cells and macrophages in vascular samples and podocytes in kidney samples. RCTD shows the best performance accuracy scores in CVD samples, while Cell2location, on average, achieved the highest performance across all test experiments. Although all three methods had similar accuracies, Cell2location needed less reference data to converge at the expense of higher computational intensity. Finally, we also report that RCTD has the fastest computational time and the simplest workflow, requiring fewer computational dependencies. In conclusion, we find that each method has particular advantages, and the optimal choice depends on the use case.
Affiliation(s)
- Alban Obel Slabowska
- Digital Science and Innovation, Computational Biology—AI and Digital Research, Novo Nordisk A/S, Måløv, Denmark
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, Kgs Lyngby, Denmark
| | - Charles Pyke
- Pathology and Imaging, Global Drug Development, Novo Nordisk A/S, Måløv, Denmark
| | - Henning Hvid
- Pathology and Imaging, Global Drug Development, Novo Nordisk A/S, Måløv, Denmark
| | - Leon Eyrich Jessen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, DTU, Kgs Lyngby, Denmark
| | - Simon Baumgart
- Digital Science and Innovation, Computational Biology—AI and Digital Research, Novo Nordisk A/S, Måløv, Denmark
| | - Vivek Das
- Digital Science and Innovation, Computational Biology—AI and Digital Research, Novo Nordisk A/S, Måløv, Denmark
| |
|
39
|
Vathrakokoili Pournara A, Miao Z, Beker OY, Nolte N, Brazma A, Papatheodorou I. CATD: a reproducible pipeline for selecting cell-type deconvolution methods across tissues. Bioinform Adv 2024; 4:vbae048. [PMID: 38638280 PMCID: PMC11023940 DOI: 10.1093/bioadv/vbae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 02/20/2024] [Accepted: 03/21/2024] [Indexed: 04/20/2024]
Abstract
Motivation: Cell-type deconvolution methods aim to infer cell composition from bulk transcriptomic data. The proliferation of developed methods, coupled with inconsistent results obtained in many cases, highlights the pressing need for guidance in the selection of appropriate methods. Additionally, the growing accessibility of single-cell RNA sequencing datasets, often accompanied by bulk expression from related samples, enables the benchmarking of existing methods. Results: In this study, we conduct a comprehensive assessment of 31 methods, utilizing single-cell RNA-sequencing data from diverse human and mouse tissues. Employing various simulation scenarios, we reveal the efficacy of regression-based deconvolution methods, highlighting their sensitivity to reference choices. We investigate the impact of bulk-reference differences, incorporating variables such as sample, study and technology. We provide validation using a gold standard dataset from mononuclear cells and suggest a consensus prediction of proportions when ground truth is not available. We validated the consensus method on data from the stomach and studied its spillover effect. Importantly, we propose the use of the critical assessment of transcriptomic deconvolution (CATD) pipeline, which encompasses functionalities for generating references and pseudo-bulks and running implemented deconvolution methods. CATD streamlines simultaneous deconvolution of numerous bulk samples, providing a practical solution for speeding up the evaluation of newly developed methods. Availability and implementation: https://github.com/Papatheodorou-Group/CATD_snakemake.
Affiliation(s)
- Anna Vathrakokoili Pournara
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Zhichao Miao
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Open Targets, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- GMU-GIBH Joint School of Life Sciences, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, 511436, China
| | - Ozgur Yilimaz Beker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Faculty of Engineering and Natural Sciences, Sabanci University, Tuzla 34956, Turkey
| | - Nadja Nolte
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, 121-1000, Slovenia
| | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Open Targets, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| |
|
40
|
Tebben K, Yirampo S, Coulibaly D, Koné AK, Laurens MB, Stucke EM, Dembélé A, Tolo Y, Traoré K, Niangaly A, Berry AA, Kouriba B, Plowe CV, Doumbo OK, Lyke KE, Takala-Harrison S, Thera MA, Travassos MA, Serre D. Gene expression analyses reveal differences in children's response to malaria according to their age. Nat Commun 2024; 15:2021. [PMID: 38448421 PMCID: PMC10918175 DOI: 10.1038/s41467-024-46416-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 02/26/2024] [Indexed: 03/08/2024] [Open Access]
Abstract
In Bandiagara, Mali, children experience on average two clinical malaria episodes per year. However, even in the same transmission area, the number of uncomplicated symptomatic infections, and their parasitemia, can vary dramatically among children. We simultaneously characterize host and parasite gene expression profiles from 136 Malian children with symptomatic falciparum malaria and examine differences in the relative proportion of immune cells and parasite stages, as well as in gene expression, associated with infection and/or patient characteristics. Parasitemia explains much of the variation in host and parasite gene expression, and infections with higher parasitemia display proportionally more neutrophils and fewer T cells, suggesting parasitemia-dependent neutrophil recruitment and/or T cell extravasation to secondary lymphoid organs. The child's age also strongly correlates with variations in gene expression: Plasmodium falciparum genes associated with age suggest that older children carry more male gametocytes, while variations in host gene expression indicate a stronger innate response in younger children and stronger adaptive response in older children. These analyses highlight the variability in host responses and parasite regulation during P. falciparum symptomatic infections and emphasize the importance of considering the children's age when studying and treating malaria infections.
Affiliation(s)
- Kieran Tebben
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Salif Yirampo
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Drissa Coulibaly
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Abdoulaye K Koné
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Matthew B Laurens
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Emily M Stucke
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Ahmadou Dembélé
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Youssouf Tolo
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Karim Traoré
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Amadou Niangaly
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Andrea A Berry
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Bourema Kouriba
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Christopher V Plowe
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Ogobara K Doumbo
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Kirsten E Lyke
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Shannon Takala-Harrison
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Mahamadou A Thera
- Malaria Research and Training Center, University of Sciences, Techniques and Technologies, Bamako, Mali
| | - Mark A Travassos
- Malaria Research Program, Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
| | - David Serre
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA.
| |
|
41
|
Garmire LX, Li Y, Huang Q, Xu C, Teichmann SA, Kaminski N, Pellegrini M, Nguyen Q, Teschendorff AE. Challenges and perspectives in computational deconvolution of genomics data. Nat Methods 2024; 21:391-400. [PMID: 38374264 DOI: 10.1038/s41592-023-02166-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/04/2022] [Accepted: 12/26/2023] [Indexed: 02/21/2024]
Abstract
Deciphering cell-type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach for estimating cell-type abundances from a variety of omics data. Despite substantial methodological progress in computational deconvolution in recent years, challenges are still outstanding. Here we outline four important challenges related to computational deconvolution: the quality of the reference data, generation of ground truth data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions of computational methodologies, and strategies to promote rigorous benchmarking.
Affiliation(s)
- Lana X Garmire
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
| | - Yijun Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Qianhui Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Chuan Xu
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | | | - Naftali Kaminski
- Pulmonary, Critical Care & Sleep Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - Matteo Pellegrini
- Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Quan Nguyen
- Institute for Molecular Bioscience, The University of Queensland and QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Andrew E Teschendorff
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- UCL Cancer Institute, University College London, London, UK
| |
|
42
|
Pérez-Jurado LA, Cáceres A, Balagué-Dobón L, Esko T, López de Heredia M, Quintela I, Cruz R, Lapunzina P, Carracedo Á, González JR. Clonal chromosomal mosaicism and loss of chromosome Y in elderly men increase vulnerability for SARS-CoV-2. Commun Biol 2024; 7:202. [PMID: 38374351 PMCID: PMC10876565 DOI: 10.1038/s42003-024-05805-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 01/11/2024] [Indexed: 02/21/2024] [Open Access]
Abstract
The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, COVID-19) had an estimated overall case fatality ratio of 1.38% (pre-vaccination), being 53% higher in males and increasing exponentially with age. Among 9578 individuals diagnosed with COVID-19 in the SCOURGE study, we found 133 cases (1.42%) with detectable clonal mosaicism for chromosome alterations (mCA) and 226 males (5.08%) with acquired loss of chromosome Y (LOY). Individuals with clonal mosaic events (mCA and/or LOY) showed a 54% increase in the risk of COVID-19 lethality. LOY is associated with transcriptomic biomarkers of immune dysfunction, pro-coagulation activity and cardiovascular risk. Interferon-induced genes involved in the initial immune response to SARS-CoV-2 are also down-regulated in LOY. Thus, mCA and LOY underlie at least part of the sex-biased severity and mortality of COVID-19 in aging patients. Given its potential therapeutic and prognostic relevance, evaluation of clonal mosaicism should be implemented as a biomarker of COVID-19 severity in elderly people.
Affiliation(s)
- Luis A Pérez-Jurado
- Genetics Unit, Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain.
- Genetics Service, Hospital del Mar & Hospital del Mar Research Institute (IMIM), Barcelona, Spain.
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain.
| | - Alejandro Cáceres
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
| | - Laura Balagué-Dobón
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
| | - Tonu Esko
- Estonian Genome Science Centre, University of Tartu, Tartu, Estonia
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
| | - Miguel López de Heredia
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain
| | - Inés Quintela
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain
- Centro Nacional de Genotipado (CEGEN), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Raquel Cruz
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain
- Centro Nacional de Genotipado (CEGEN), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Instituto de Investigación Sanitaria de Santiago (IDIS), Santiago de Compostela, Spain
- Centro Singular de Investigación en Medicina Molecular y Enfermedades Crónicas (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Pablo Lapunzina
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain
- Instituto de Genética Médica y Molecular (INGEMM), Hospital Universitario La Paz-IDIPAZ, Madrid, Spain
- ERN-ITHACA-European Reference Network, Paris, France
| | - Ángel Carracedo
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, Barcelona, Spain
- Centro Nacional de Genotipado (CEGEN), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Instituto de Investigación Sanitaria de Santiago (IDIS), Santiago de Compostela, Spain
- Centro Singular de Investigación en Medicina Molecular y Enfermedades Crónicas (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Fundación Pública Galega de Medicina Xenómica, Sistema Galego de Saúde (SERGAS), Santiago de Compostela, Spain
| | - Juan R González
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain.
- Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain.
- Department of Mathematics, Universitat Autònoma de Barcelona, Bellaterra, Spain.
| |
|
43
|
Farris KM, Senior AM, Sobreira DR, Mitchell RM, Weber ZT, Ingerslev LR, Barrès R, Simpson SJ, Crean AJ, Nobrega MA. Dietary macronutrient composition impacts gene regulation in adipose tissue. Commun Biol 2024; 7:194. [PMID: 38365885 PMCID: PMC10873408 DOI: 10.1038/s42003-024-05876-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 01/30/2024] [Indexed: 02/18/2024] [Open Access]
Abstract
Diet is a key lifestyle component that influences metabolic health through several factors, including total energy intake and macronutrient composition. While the impact of caloric intake on gene expression and physiological phenomena in various tissues is well described, the influence of dietary macronutrient composition on these parameters is less well studied. Here, we use the Nutritional Geometry framework to investigate the role of macronutrient composition on metabolic function and gene regulation in adipose tissue. Using ten isocaloric diets that vary systematically in their proportion of energy from fat, protein, and carbohydrates, we find that gene expression and splicing are highly responsive to macronutrient composition, with distinct sets of genes regulated by different macronutrient interactions. Specifically, the expression of many genes associated with Bardet-Biedl syndrome is responsive to dietary fat content. Splicing and expression changes occur in largely separate gene sets, highlighting distinct mechanisms by which dietary composition influences the transcriptome and emphasizing the importance of considering splicing changes to more fully capture the gene regulation response to environmental changes such as diet. Our study provides insight into the gene regulation plasticity of adipose tissue in response to macronutrient composition, beyond the already well-characterized response to caloric intake.
Affiliation(s)
- Kathryn M Farris
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.
| | - Alistair M Senior
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, 2006, Australia
| | - Débora R Sobreira
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
| | - Robert M Mitchell
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
| | - Zachary T Weber
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
| | - Lars R Ingerslev
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, DK-2200, Copenhagen, Denmark
| | - Romain Barrès
- Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, DK-2200, Copenhagen, Denmark.
- Institut de Pharmacologie Moléculaire et Cellulaire, Université Côte d'Azur & Centre National pour la Recherche Scientifique (CNRS), Valbonne, 06560, France.
| | - Stephen J Simpson
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, 2006, Australia.
| | - Angela J Crean
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, 2006, Australia.
| | - Marcelo A Nobrega
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.
| |
|
44
|
Hüttmann N, Li Y, Poolsup S, Zaripov E, D’Mello R, Susevski V, Minic Z, Berezovski MV. Surface Proteome of Extracellular Vesicles and Correlation Analysis Reveal Breast Cancer Biomarkers. Cancers (Basel) 2024; 16:520. [PMID: 38339272 PMCID: PMC10854524 DOI: 10.3390/cancers16030520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/13/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024] [Open Access]
Abstract
Breast cancer (BC) is the second most frequently diagnosed cancer and accounts for approximately 25% of new cancer cases in Canadian women. Using biomarkers as a less-invasive BC diagnostic method is currently under investigation but is not ready for practical application in clinical settings. During the last decade, extracellular vesicles (EVs) have emerged as a promising source of biomarkers because they contain cancer-derived proteins, RNAs, and metabolites. In this study, EV proteins from small EVs (sEVs) and medium EVs (mEVs) were isolated from BC MDA-MB-231 and MCF7 and non-cancerous breast epithelial MCF10A cell lines and then analyzed by two approaches: global proteomic analysis and enrichment of EV surface proteins by Sulfo-NHS-SS-Biotin labeling. From the first approach, proteomic profiling identified 2459 proteins, which were subjected to comparative analysis and correlation network analysis. Twelve potential biomarker proteins were identified based on cell line-specific expression and filtered by their predicted co-localization with known EV marker proteins, CD63, CD9, and CD81. This approach resulted in the identification of 11 proteins, four of which were further investigated by Western blot analysis. The presence of transmembrane serine protease matriptase (ST14), claudin-3 (CLDN3), and integrin alpha-7 (ITGA7) in each cell line was validated by Western blot, revealing that ST14 and CLDN3 may be further explored as potential EV biomarkers for BC. The surface labeling approach enriched proteins that were not identified using the first approach. Ten potential BC biomarkers (Glutathione S-transferase P1 (GSTP1), Elongation factor 2 (EEF2), DEAD/H box RNA helicase (DDX10), progesterone receptor (PGR), Ras-related C3 botulinum toxin substrate 2 (RAC2), Disintegrin and metalloproteinase domain-containing protein 10 (ADAM10), Aconitase 2 (ACO2), UTP20 small subunit processome component (UTP20), NEDD4 binding protein 2 (N4BP2), Programmed cell death 6 (PDCD6)) were selected from surface proteins commonly identified from MDA-MB-231 and MCF7, but not identified in MCF10A EVs. In total, 846 surface proteins were identified from the second approach, of which 11 were already known as BC markers. This study supports the proposition that EVs are a rich source of known and novel biomarkers that may be used for non-invasive detection of BC. Furthermore, the presented datasets could be further explored for the identification of potential biomarkers in BC.
Affiliation(s)
- Nico Hüttmann
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
- John L. Holmes Mass Spectrometry Facility, Faculty of Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada
| | - Yingxi Li
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
| | - Suttinee Poolsup
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
| | - Emil Zaripov
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
| | - Rochelle D’Mello
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
| | - Vanessa Susevski
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
| | - Zoran Minic
- John L. Holmes Mass Spectrometry Facility, Faculty of Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada
| | - Maxim V. Berezovski
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON K1N 6N5, Canada; (N.H.); (Y.L.); (S.P.); (E.Z.); (R.D.); (V.S.)
- John L. Holmes Mass Spectrometry Facility, Faculty of Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada
| |
|
45
|
Roy G, Syed R, Lazaro O, Robertson S, McCabe SD, Rodriguez D, Mawla AM, Johnson TS, Kalwat MA. Identification of type 2 diabetes- and obesity-associated human β-cells using deep transfer learning. bioRxiv 2024:2024.01.18.576260. [PMID: 38328172 PMCID: PMC10849510 DOI: 10.1101/2024.01.18.576260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Diabetes affects >10% of adults worldwide and is caused by impaired production or response to insulin, resulting in chronic hyperglycemia. Pancreatic islet β-cells are the sole source of endogenous insulin and our understanding of β-cell dysfunction and death in type 2 diabetes (T2D) is incomplete. Single-cell RNA-seq data supports heterogeneity as an important factor in β-cell function and survival. However, it is difficult to identify which β-cell phenotypes are critical for T2D etiology and progression. Our goal was to prioritize specific disease-related β-cell subpopulations to better understand T2D pathogenesis and identify relevant genes for targeted therapeutics. To address this, we applied a deep transfer learning tool, DEGAS, which maps disease associations onto single-cell RNA-seq data from bulk expression data. Independent runs of DEGAS using T2D or obesity status identified distinct β-cell subpopulations. A singular cluster of T2D-associated β-cells was identified; however, β-cells with high obese-DEGAS scores contained two subpopulations derived largely from either non-diabetic or T2D donors. The obesity-associated non-diabetic cells were enriched for translation and unfolded protein response genes compared to T2D cells. We selected DLK1 for validation by immunostaining in human pancreas sections from healthy and T2D donors. DLK1 was heterogeneously expressed among β-cells and appeared depleted from T2D islets. In conclusion, DEGAS has the potential to advance our holistic understanding of the β-cell transcriptomic phenotypes, including features that distinguish β-cells in obese non-diabetic or lean T2D states. Future work will expand this approach to additional human islet omics datasets to reveal the complex multicellular interactions driving T2D.
46
Ricker CA, Meli K, Van Allen EM. Historical perspective and future directions: computational science in immuno-oncology. J Immunother Cancer 2024; 12:e008306. [PMID: 38191244 PMCID: PMC10826578 DOI: 10.1136/jitc-2023-008306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/07/2023] [Indexed: 01/10/2024]
Abstract
Immuno-oncology holds promise for transforming patient care having achieved durable clinical response rates across a variety of advanced and metastatic cancers. Despite these achievements, only a minority of patients respond to immunotherapy, underscoring the importance of elucidating molecular mechanisms responsible for response and resistance to inform the development and selection of treatments. Breakthroughs in molecular sequencing technologies have led to the generation of an immense amount of genomic and transcriptomic sequencing data that can be mined to uncover complex tumor-immune interactions using computational tools. In this review, we discuss existing and emerging computational methods that contextualize the composition and functional state of the tumor microenvironment, infer the reactivity and clonal dynamics from reconstructed immune cell receptor repertoires, and predict the antigenic landscape for immune cell recognition. We further describe the advantage of multi-omics analyses for capturing multidimensional relationships and artificial intelligence techniques for integrating omics data with histopathological and radiological images to encapsulate patterns of treatment response and tumor-immune biology. Finally, we discuss key challenges impeding their widespread use and clinical application and conclude with future perspectives. We are hopeful that this review will both serve as a guide for prospective researchers seeking to use existing tools for scientific discoveries and inspire the optimization or development of novel tools to enhance precision, ultimately expediting advancements in immunotherapy that improve patient survival and quality of life.
Affiliation(s)
- Cora A Ricker
  - Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Kevin Meli
  - Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
47
George N, Fexova S, Fuentes AM, Madrigal P, Bi Y, Iqbal H, Kumbham U, Nolte N, Zhao L, Thanki A, Yu I, Marugan Calles J, Erdos K, Vilmovsky L, Kurri S, Vathrakokoili-Pournara A, Osumi-Sutherland D, Prakash A, Wang S, Tello-Ruiz M, Kumari S, Ware D, Goutte-Gattat D, Hu Y, Brown N, Perrimon N, Vizcaíno JA, Burdett T, Teichmann S, Brazma A, Papatheodorou I. Expression Atlas update: insights from sequencing data at both bulk and single cell level. Nucleic Acids Res 2024; 52:D107-D114. [PMID: 37992296 PMCID: PMC10767917 DOI: 10.1093/nar/gkad1021] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 09/15/2023] [Revised: 10/13/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc) are EMBL-EBI's knowledgebases for gene and protein expression and localisation in bulk and at single cell level. These resources aim to allow users to investigate their expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell level anatomograms and cell type gene expression heatmaps.
Affiliation(s)
- Nancy George
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Silvie Fexova
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Alfonso Munoz Fuentes
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Pedro Madrigal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Yalan Bi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Haider Iqbal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Upendra Kumbham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Nadja Francesca Nolte
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Lingyun Zhao
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Anil S Thanki
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Iris D Yu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Jose C Marugan Calles
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Karoly Erdos
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Liora Vilmovsky
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Sandeep R Kurri
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- David Osumi-Sutherland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Ananth Prakash
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Shengbo Wang
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Marcela K Tello-Ruiz
  - Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Sunita Kumari
  - Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA
  - USDA ARS NEA, Plant Soil & Nutrition Laboratory Research Unit, Ithaca, NY 14853, USA
- Damien Goutte-Gattat
  - FlyBase-Cambridge, Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Yanhui Hu
  - Perrimon Lab, Department of Genetics, Harvard Medical School, Boston MA 02115, USA
- Nick Brown
  - FlyBase-Cambridge, Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
- Norbert Perrimon
  - Perrimon Lab, Department of Genetics, Harvard Medical School, Boston MA 02115, USA
  - FlyBase-Harvard Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Sarah Teichmann
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Alvis Brazma
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
- Irene Papatheodorou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton CB10 1SD, UK
48
Su J, Reynier JB, Fu X, Zhong G, Jiang J, Escalante RS, Wang Y, Aparicio L, Izar B, Knowles DA, Rabadan R. Smoother: a unified and modular framework for incorporating structural dependency in spatial omics data. Genome Biol 2023; 24:291. [PMID: 38110959 PMCID: PMC10726548 DOI: 10.1186/s13059-023-03138-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/02/2023] [Accepted: 12/04/2023] [Indexed: 12/20/2023]
Abstract
Spatial omics technologies can help identify spatially organized biological processes, but existing computational approaches often overlook structural dependencies in the data. Here, we introduce Smoother, a unified framework that integrates positional information into non-spatial models via modular priors and losses. In simulated and real datasets, Smoother enables accurate data imputation, cell-type deconvolution, and dimensionality reduction with remarkable efficiency. In colorectal cancer, Smoother-guided deconvolution reveals plasma cell and fibroblast subtype localizations linked to tumor microenvironment restructuring. Additionally, joint modeling of spatial and single-cell human prostate data with Smoother allows for spatial mapping of reference populations with significantly reduced ambiguity.
Affiliation(s)
- Jiayu Su
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - New York Genome Center, New York, NY, USA
- Jean-Baptiste Reynier
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Xi Fu
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Guojie Zhong
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Jiahao Jiang
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Yiping Wang
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Division of Hematology/Oncology, Department of Medicine, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Luis Aparicio
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Benjamin Izar
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Division of Hematology/Oncology, Department of Medicine, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- David A Knowles
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - New York Genome Center, New York, NY, USA
  - Department of Computer Science, Columbia University, New York, NY, USA
- Raul Rabadan
  - Program for Mathematical Genomics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
49
Maden SK, Kwon SH, Huuki-Myers LA, Collado-Torres L, Hicks SC, Maynard KR. Challenges and opportunities to computationally deconvolve heterogeneous tissue with varying cell sizes using single-cell RNA-sequencing datasets. Genome Biol 2023; 24:288. [PMID: 38098055 PMCID: PMC10722720 DOI: 10.1186/s13059-023-03123-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 05/11/2023] [Accepted: 11/24/2023] [Indexed: 12/17/2023]
Abstract
Deconvolution of cell mixtures in "bulk" transcriptomic samples from homogenate human tissue is important for understanding disease pathologies. However, several experimental and computational challenges impede transcriptomics-based deconvolution approaches using single-cell/nucleus RNA-seq reference atlases. Cells from the brain and blood have substantially different sizes, total mRNA, and transcriptional activities, and existing approaches may quantify total mRNA instead of cell type proportions. Further, standards are lacking for the use of cell reference atlases and integrative analyses of single-cell and spatial transcriptomics data. We discuss how to approach these key challenges with orthogonal "gold standard" datasets for evaluating deconvolution methods.
Affiliation(s)
- Sean K Maden
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Sang Ho Kwon
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Louise A Huuki-Myers
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Leonardo Collado-Torres
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Stephanie C Hicks
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
  - Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA
- Kristen R Maynard
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
50
Lu Y, Chen QM, An L. Semi-reference based cell type deconvolution with application to human metastatic cancers. NAR Genom Bioinform 2023; 5:lqad109. [PMID: 38143958 PMCID: PMC10748484 DOI: 10.1093/nargab/lqad109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 11/01/2023] [Accepted: 12/13/2023] [Indexed: 12/26/2023]
Abstract
Bulk RNA-seq experiments, commonly used to discern gene expression changes across conditions, often neglect critical cell type-specific information due to their focus on average transcript abundance. Recognizing cell type contribution is crucial to understanding phenotype and disease variations. The advent of single-cell RNA sequencing has allowed detailed examination of cellular heterogeneity; however, the cost and analytic caveat prohibits such sequencing for a large number of samples. We introduce a novel deconvolution approach, SECRET, that employs cell type-specific gene expression profiles from single-cell RNA-seq to accurately estimate cell type proportions from bulk RNA-seq data. Notably, SECRET can adapt to scenarios where the cell type present in the bulk data is unrepresented in the reference, thereby offering increased flexibility in reference selection. SECRET has demonstrated superior accuracy compared to existing methods using synthetic data and has identified unknown tissue-specific cell types in real human metastatic cancers. Its versatility makes it broadly applicable across various human cancer studies.
Affiliation(s)
- Yingying Lu
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, USA
- Qin M Chen
  - College of Pharmacy, University of Arizona, Tucson, AZ, USA
  - Cancer Biology Program, University of Arizona, Tucson, AZ, USA
- Lingling An
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, USA
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, USA
  - Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, USA