1. Deng F, Zou J, Wang M, Gu Y, Wu J, Gao L, Ji Y, Tong HHY, Chen J, Chen W, Tan L, Chu Y, Zou X, Hao J. DECEPTICON: a correlation-based strategy for RNA-seq deconvolution inspired by a variation of the Anna Karenina principle. Brief Bioinform 2025; 26:bbaf234. [PMID: 40421659 PMCID: PMC12107245 DOI: 10.1093/bib/bbaf234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Revised: 02/22/2025] [Accepted: 04/29/2025] [Indexed: 05/28/2025] Open
Abstract
Accurately deconvoluting cellular composition from bulk RNA-seq data is pivotal for understanding the tumor microenvironment and advancing precision medicine. Existing methods often struggle to consistently and accurately quantify cell types across heterogeneous RNA-seq datasets, particularly when ground truths are unavailable. In this study, we introduce DECEPTICON, a deconvolution strategy inspired by the Anna Karenina principle, which postulates that successful outcomes share common traits, while failures are more varied. DECEPTICON selects top-performing methods by leveraging correlations between different strategies and combines them dynamically to enhance performance. Our approach demonstrates superior accuracy in predicting cell-type proportions across multiple tumor datasets, improving correlation by 23.9% and reducing root mean square error by 73.5% compared to the best of 50 analyzed strategies. Applied to The Cancer Genome Atlas (TCGA) datasets for breast carcinoma, cervical squamous cell carcinoma, and lung adenocarcinoma, DECEPTICON-based predictions showed improved differentiation between patient prognoses. This correlation-based strategy offers a reliable, flexible tool for deconvoluting complex transcriptomic data and highlights its potential in refining prognostic assessments in oncology and advancing cancer biology.
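The selection idea in this abstract (methods that agree with one another are more likely to be right, while poor methods fail in different ways) can be illustrated with a minimal sketch. This is not the DECEPTICON implementation: the function name `consensus_from_correlation`, the `top_k` parameter, the agreement score (mean pairwise Pearson correlation), and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def consensus_from_correlation(estimates, top_k=3):
    """Pick the methods that agree most with the others and average them.

    estimates: dict mapping a method name to an array (samples x cell types).
    No ground truth is used; the score is the mean Pearson correlation
    between one method's estimates and those of every other method.
    """
    names = list(estimates)
    flat = np.array([np.asarray(estimates[n], dtype=float).ravel() for n in names])
    corr = np.corrcoef(flat)               # method x method correlation matrix
    np.fill_diagonal(corr, np.nan)         # ignore self-correlation
    agreement = np.nanmean(corr, axis=1)   # mean agreement with the other methods
    order = np.argsort(agreement)[::-1][:top_k]
    selected = [names[i] for i in order]
    consensus = np.mean([estimates[n] for n in selected], axis=0)
    return selected, consensus

# Hypothetical data: three methods close to the truth, three unrelated to it.
rng = np.random.default_rng(0)
truth = rng.dirichlet(np.ones(4), size=20)
estimates = {f"good_{i}": truth + rng.normal(0, 0.02, truth.shape) for i in range(3)}
estimates.update({f"bad_{i}": rng.dirichlet(np.ones(4), size=20) for i in range(3)})

selected, consensus = consensus_from_correlation(estimates, top_k=3)
print(sorted(selected))
```

In this toy setting the accurate methods correlate strongly with each other, whereas the inaccurate ones correlate with nothing, so the agreement score separates the two groups without access to the true proportions.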
Affiliation(s)
- Fulan Deng
- School of Materials Science and Engineering, Shanghai Institute of Technology, 100 Haiquan Road, Fengxian District, Shanghai 201418, China
- Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macao SAR 999078, China
- Jiawei Zou
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yueyang Road, Xuhui District, Shanghai 200031, China
- Miaochen Wang
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, College of Stomatology, Shanghai Jiao Tong University, 1908 Gaoke West Road, Pudong New District, Shanghai 200240, China
- Yida Gu
- Guangdong Provincial/Zhuhai Key Laboratory of Interdisciplinary Research and Application for Data Science, Beijing Normal-Hong Kong Baptist University, 2000 Jintong Road, Tangjiawan, Xiangzhou District, Zhuhai 519087, China
- Jiale Wu
- Mathematics and Science College, Shanghai Normal University, 100 Guilin Road, Xuhui District, Shanghai 200233, China
- Lianchong Gao
- Shanghai Centre for Systems Biomedicine, Key Laboratory of Systems Biomedicine (Ministry of Education), Shanghai Centre for Systems Biomedicine, Shanghai Jiao Tong University, 800 Dong Chuan Road, Minhang District, Shanghai 200240, China
- Yuan Ji
- Molecular Pathology Center, Department of Pathology, Zhongshan Hospital, Fudan University, 966 Huaihai Middle Road, Xuhui District, Shanghai 200032, China
- Henry H Y Tong
- Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macao SAR 999078, China
- Jie Chen
- Center for Ultrafast Science and Technology, Key Laboratory for Laser Plasmas (Ministry of Education), School of Physics and Astronomy, Collaborative Innovation Center of IFSA (CICIFSA), Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
- Wantao Chen
- Ninth People's Hospital, Shanghai Key Laboratory of Stomatology & Shanghai Research Institute of Stomatology, National Clinical Research Center of Stomatology, Shanghai Jiao Tong University School of Medicine, Huangpu District, Shanghai 200011, China
- Lianjiang Tan
- School of Materials Science and Engineering, Shanghai Institute of Technology, 100 Haiquan Road, Fengxian District, Shanghai 201418, China
- Yaoqing Chu
- School of Materials Science and Engineering, Shanghai Institute of Technology, 100 Haiquan Road, Fengxian District, Shanghai 201418, China
- Xin Zou
- School of Medicine, Linyi University, Shuangling Road, Lanshan District, Linyi, Shandong 276000, China
- Digital Diagnosis and Treatment Innovation Center for Cancer, Institute of Translational Medicine, Shanghai Jiao Tong University, 800 Dong Chuan Road, Minhang District, Shanghai 200240, China
- Jie Hao
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, 3888 Chenhua Road, Songjiang District, Shanghai 201602, China
- Institute of Clinical Science, Zhongshan Hospital, Fudan University, No.180 Fenglin Road, Xuhui District, Shanghai 200032, China
2. Ahn C, Divoux A, Zhou M, Seldin MM, Sparks LM, Whytock KL. Optimized RNA sequencing deconvolution illustrates the impact of obesity and weight loss on cell composition of human adipose tissue. Obesity (Silver Spring) 2025; 33:936-948. [PMID: 40176378 PMCID: PMC12018139 DOI: 10.1002/oby.24264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 01/24/2025] [Accepted: 01/27/2025] [Indexed: 04/04/2025]
Abstract
OBJECTIVE: Cellular heterogeneity of human adipose tissue is linked to the pathophysiology of obesity and may impact the response to energy restriction and changes in fat mass. Herein, we provide an optimized pipeline to estimate cellular composition in human abdominal subcutaneous adipose tissue (ASAT) bulk RNA sequencing (RNA-seq) datasets using a single-nuclei RNA-seq signature matrix. METHODS: A deconvolution pipeline for ASAT was optimized by benchmarking publicly available algorithms using a signature matrix derived from ASAT single-nuclei RNA-seq data from 20 adults and then applied to estimate ASAT cell-type proportions in publicly available obesity and weight loss studies. RESULTS: Individuals with obesity had greater proportions of macrophages and lower proportions of adipocyte subpopulations and vascular cells compared with lean individuals. Two months of diet-induced weight loss increased the estimated proportions of macrophages; however, 2 years of diet-induced weight loss reduced the estimated proportions of macrophages, thereby suggesting a biphasic nature of cellular remodeling of ASAT during weight loss. CONCLUSIONS: Our optimized high-throughput pipeline facilitates the assessment of composition changes of highly characterized cell types in large numbers of ASAT samples using low-cost bulk RNA-seq. Our data reveal novel changes in cellular heterogeneity and its association with cardiometabolic health in humans with obesity and following weight loss.
Affiliation(s)
- Cheehoon Ahn
- Translational Research Institute, AdventHealth, Orlando, Florida, USA
- Adeline Divoux
- Translational Research Institute, AdventHealth, Orlando, Florida, USA
- Mingqi Zhou
- Department of Biological Chemistry and Center for Epigenetics and Metabolism, University of California, Irvine, California, USA
- Marcus M Seldin
- Department of Biological Chemistry and Center for Epigenetics and Metabolism, University of California, Irvine, California, USA
- Lauren M Sparks
- Translational Research Institute, AdventHealth, Orlando, Florida, USA
- Katie L Whytock
- Translational Research Institute, AdventHealth, Orlando, Florida, USA
3. Maden SK, Huuki-Myers LA, Kwon SH, Collado-Torres L, Maynard KR, Hicks SC. lute: estimating the cell composition of heterogeneous tissue with varying cell sizes using gene expression. BMC Genomics 2025; 26:433. [PMID: 40312738 PMCID: PMC12045009 DOI: 10.1186/s12864-025-11508-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/10/2025] [Accepted: 03/19/2025] [Indexed: 05/03/2025] Open
Abstract
BACKGROUND: Relative cell type fraction estimates in bulk RNA-sequencing data are important to control for cell composition differences across heterogeneous tissue samples. While algorithms exist to estimate cell type proportions in tissues, a major challenge is that they can show reduced performance on tissues with varying cell sizes, such as brain tissue. Without adjusting for differences in cell sizes, computational algorithms estimate the relative fraction of RNA attributable to each cell type rather than the relative fraction of cells of each type, leading to potentially biased estimates of cellular composition. Furthermore, these tools were built on different frameworks with non-uniform input data formats while addressing different types of systematic errors or unwanted bias. RESULTS: We present lute, a software tool to accurately deconvolute cell types with varying sizes. lute wraps existing deconvolution algorithms in a flexible and extensible framework to enable easy benchmarking and comparison. Using simulated and real datasets, we demonstrate how lute adjusts for differences in cell sizes to improve the accuracy of cell composition estimates. CONCLUSIONS: Our software (https://bioconductor.org/packages/lute) can be used to enhance existing deconvolution algorithms and applies broadly to any tissue containing cell types with varying cell sizes.
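The distinction between RNA fractions and cell fractions can be shown with a small, noiseless simulation. The sketch below does not use the lute API (lute is an R/Bioconductor package); the variable names, the cell size factors, and the use of ordinary least squares in place of a real deconvolution algorithm are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-cell reference profiles: 50 genes x 2 cell types.
Z = rng.gamma(2.0, 1.0, size=(50, 2))
sizes = np.array([10.0, 2.0])        # assumed relative RNA content per cell
true_frac = np.array([0.3, 0.7])     # true fraction of cells of each type

# Each cell type contributes to bulk in proportion to size x cell fraction.
bulk = (Z * sizes) @ true_frac

def deconvolve(reference, bulk):
    """Least-squares fit, clipped to be non-negative and rescaled to sum to 1."""
    coef, *_ = np.linalg.lstsq(reference, bulk, rcond=None)
    coef = np.clip(coef, 0, None)
    return coef / coef.sum()

unadjusted = deconvolve(Z, bulk)          # reference ignores cell size
adjusted = deconvolve(Z * sizes, bulk)    # reference columns scaled by cell size

print("unadjusted:", unadjusted.round(3))
print("adjusted:  ", adjusted.round(3))
```

Without the size adjustment the fit returns the share of RNA from each type (about 0.68 and 0.32 here), which overstates the larger cell type; scaling the reference columns by cell size recovers the cell fractions 0.3 and 0.7.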
Affiliation(s)
- Sean K Maden
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Louise A Huuki-Myers
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Department of Clinical Neurosciences, School of Clinical Medicine, The University of Cambridge, Cambridge, UK
- UK Dementia Research Institute at The University of Cambridge, Cambridge, UK
- Sang Ho Kwon
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Leonardo Collado-Torres
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Kristen R Maynard
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Stephanie C Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA.
4. Ivich A, Davidson NR, Grieshober L, Li W, Hicks SC, Doherty JA, Greene CS. Missing cell types in single-cell references impact deconvolution of bulk data but are detectable. Genome Biol 2025; 26:86. [PMID: 40197327 PMCID: PMC11974051 DOI: 10.1186/s13059-025-03506-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Accepted: 02/12/2025] [Indexed: 04/10/2025] Open
Abstract
BACKGROUND: Advancements in RNA sequencing have expanded our ability to study gene expression profiles of biological samples in bulk tissue and single cells. Deconvolution of bulk data with single-cell references provides the ability to study relative cell-type proportions, but most methods assume a reference is present for every cell type in bulk data. This is not true in all circumstances: cell types can be missing in single-cell profiles for many reasons. In this study, we examine the impact of missing cell types on deconvolution methods. RESULTS: Using paired single-cell and single-nucleus data, we simulate realistic scenarios where cell types are missing. Because single-nucleus RNA sequencing captures cell types that are absent in single-cell counterparts, it allows us to study their effects on deconvolution. We evaluate three different methods and find that performance is influenced by both the number and similarity of missing cell types. Additionally, missing cell-type profiles can be recovered from residuals using a simple non-negative matrix factorization strategy. We also analyzed real bulk data of cancerous and non-cancerous samples and observe results consistent with simulation, namely that expression patterns from cell types likely to be missing appear in residuals. CONCLUSIONS: We expect our results to provide a starting point for those developing new deconvolution methods and to help them better account for missing cell types. Our results suggest that deconvolution methods should consider the possibility of missing cell types.
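The residual-based recovery mentioned in the results can be sketched on toy data. Everything below is an assumption for illustration rather than the authors' pipeline: the block-structured marker-gene reference, plain least squares as the deconvolution step, and a hand-written rank-1 NMF with multiplicative updates.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical reference: 30 genes, 3 cell types, 10 marker genes per type.
n_genes, block = 30, 10
reference = np.ones((n_genes, 3))
for k in range(3):
    reference[k * block:(k + 1) * block, k] = 10.0

# Simulated bulk samples that contain all three cell types.
proportions = rng.dirichlet(np.ones(3), size=8).T   # cell types x samples
bulk = reference @ proportions                       # genes x samples

# Deconvolve with an incomplete reference (the third cell type is missing).
# Plain least squares stands in for a real deconvolution method here.
known = reference[:, :2]
estimate, *_ = np.linalg.lstsq(known, bulk, rcond=None)
estimate = np.clip(estimate, 0, None)

# Keep only the positive signal that the incomplete reference cannot explain.
residual = np.clip(bulk - known @ estimate, 0, None)

def nmf(X, rank, n_iter=200, eps=1e-9):
    """Basic NMF (X ~ W @ H) using multiplicative updates."""
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        H *= (W.T @ X) / ((W.T @ W) @ H + eps)
    return W, H

W, H = nmf(residual, rank=1)

# Genes with the largest weight in the recovered profile.
recovered_markers = set(np.argsort(W[:, 0])[-block:].tolist())
print(sorted(recovered_markers))
```

In this construction the genes with the highest weight in the recovered factor are the marker genes of the omitted cell type (indices 20 to 29), which mirrors the abstract's observation that missing cell types leave a detectable signature in the residuals.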
Affiliation(s)
- Adriana Ivich
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Natalie R Davidson
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Laurie Grieshober
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
- Weishan Li
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Stephanie C Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA
- Casey S Greene
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
5. Li Y, Xu S, Wang X, Ertekin-Taner N, Chen D. An augmented GSNMF model for complete deconvolution of bulk RNA-seq data. Math Biosci Eng 2025; 22:988-1018. [PMID: 40296800 PMCID: PMC12043048 DOI: 10.3934/mbe.2025036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/30/2025]
Abstract
Performing complete deconvolution analysis for bulk RNA-seq data to obtain both cell type specific gene expression profiles (GEP) and relative cell abundances is a challenging task. One of the fundamental models used, nonnegative matrix factorization (NMF), is mathematically ill-posed. Although several complete deconvolution methods have been developed, and their estimates compared to ground truth for some datasets appear promising, a comprehensive understanding of how to circumvent the ill-posedness and improve solution accuracy is lacking. In this paper, we first investigated the necessary requirements for a given dataset to satisfy the solvability conditions in NMF theory. Even when the solvability conditions hold, the "unique" solutions of NMF are determined only up to a rescaling matrix. Therefore, we provide estimates of the converged local minima and the possible rescaling matrix, based on informative initial conditions. Using these strategies, we developed a new pipeline, the pseudo-bulk tissue data augmented, geometric structure guided NMF model (GSNMF+). In our approach, pseudo-bulk tissue data was generated from single-cell RNA-seq (scRNA-seq) data and pseudo cellular compositions simulated from statistical distributions, and then mixed with the original dataset. The constituent matrices of the hybrid dataset then satisfy the weak solvability conditions of NMF. Furthermore, an estimated rescaling matrix was used to adjust the minimizer of the NMF, which was expected to reduce the root mean square errors of solutions. Our algorithms were tested on several realistic bulk-tissue datasets and showed significant improvements in scenarios with singular cellular compositions.
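The rescaling ambiguity mentioned in this abstract follows from a general property of matrix factorization: any positive diagonal matrix can be inserted between the two factors without changing their product. The numerical check below is a minimal sketch of that property only; the symbols `G` (expression profiles), `C` (compositions) and `D` (rescaling matrix) and the toy dimensions are assumptions, and the code is not part of GSNMF+.

```python
import numpy as np

rng = np.random.default_rng(3)

G = rng.random((6, 3))                     # genes x cell types (expression profiles)
C = rng.dirichlet(np.ones(3), size=5).T    # cell types x samples, columns sum to 1
D = np.diag([2.0, 0.5, 4.0])               # arbitrary positive rescaling matrix

# A second, equally non-negative factorization of the same bulk matrix.
G2 = G @ D
C2 = np.linalg.inv(D) @ C

same_product = bool(np.allclose(G @ C, G2 @ C2))
still_proportions = bool(np.allclose(C2.sum(axis=0), 1.0))

print(same_product)        # both factor pairs reproduce the same bulk data
print(still_proportions)   # rescaled compositions no longer sum to 1
```

Both factor pairs fit the bulk data equally well, so the data alone cannot pick out the correct scale; this is why an estimate of the rescaling matrix, or a constraint such as compositions summing to one, is needed to obtain interpretable proportions.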
Affiliation(s)
- Yujie Li
- Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
- School of Data Science, University of North Carolina at Charlotte, USA
- Su Xu
- Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
- Xue Wang
- Department of Quantitative Health Sciences, Mayo Clinic, Florida, USA
- Nilüfer Ertekin-Taner
- Department of Neurosciences, Mayo Clinic, Florida, USA
- Department of Neurology, Mayo Clinic, Florida, USA
- Duan Chen
- Department of Mathematics and Statistics, University of North Carolina at Charlotte, USA
6. Kim M, Wang J, Pilley SE, Lu RJ, Xu A, Kim Y, Liu M, Fu X, Booth SL, Mullen PJ, Benayoun BA. Estropausal gut microbiota transplant improves measures of ovarian function in adult mice. bioRxiv 2025:2024.05.03.592475. [PMID: 40060387 PMCID: PMC11888174 DOI: 10.1101/2024.05.03.592475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2025]
Abstract
Decline in ovarian function with age not only affects fertility but is also linked to a higher risk of age-related diseases in women (e.g. osteoporosis, dementia). Intriguingly, earlier menopause is linked to shorter lifespan; however, the underlying molecular mechanisms of ovarian aging are not well understood. Recent evidence suggests the gut microbiota may influence ovarian health. In this study, we characterized ovarian aging associated microbial profiles in mice and investigated the effect of the gut microbiome from young and estropausal female mice on ovarian health through fecal microbiota transplantation. We demonstrate that the ovarian transcriptome can be broadly remodeled after heterochronic microbiota transplantation, with a reduction in inflammation-related gene expression and trends consistent with transcriptional rejuvenation. Consistently, these mice exhibited enhanced ovarian health and increased fertility. Using metagenomics-based causal mediation analyses and serum untargeted metabolomics, we identified candidate microbial species and metabolites that may contribute to the observed effects of fecal microbiota transplantation. Our findings reveal a direct link between the gut microbiota and ovarian health.
Affiliation(s)
- Minhoo Kim
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Justin Wang
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
- Steven E Pilley
- Department of Molecular Microbiology and Immunology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Ryan J Lu
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Graduate Program in the Biology of Aging, University of Southern California, Los Angeles, CA 90089, USA
- Alan Xu
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Thomas Lord Department of Computer Science, USC Viterbi School of Engineering, Los Angeles, CA 90089, USA
- Younggyun Kim
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Alfred E. Mann Department of Biomedical Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Minying Liu
- Jean Mayer USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA 02111, USA
- Xueyan Fu
- Jean Mayer USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA 02111, USA
- Sarah L Booth
- Jean Mayer USDA Human Nutrition Research Center on Aging at Tufts University, Boston, MA 02111, USA
- Peter J Mullen
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Molecular Microbiology and Immunology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Bérénice A Benayoun
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA 90089, USA
- Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Molecular and Computational Biology Department, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA 90089, USA
- Biochemistry and Molecular Medicine Department, USC Keck School of Medicine, Los Angeles, CA 90089, USA
- USC Stem Cell Initiative, Los Angeles, CA 90089, USA
7. Sémon M, Mouginot M, Peltier M, Corneloup C, Veber P, Guéguen L, Pantalacci S. Comparative transcriptomics in serial organs uncovers early and pan-organ developmental changes associated with organ-specific morphological adaptation. Nat Commun 2025; 16:768. [PMID: 39824799 PMCID: PMC11742040 DOI: 10.1038/s41467-025-55826-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/24/2024] [Indexed: 01/20/2025] Open
Abstract
Mice have evolved a new dental plan with two additional cusps on the upper molar, while hamsters have retained the ancestral plan. By comparing the dynamics of molar development with transcriptome time series, we found at least three early changes in mouse upper molar development. Together, they redirect spatio-temporal dynamics to ultimately form two additional cusps. The mouse lower molar has undergone much more limited phenotypic evolution. Nevertheless, its developmental trajectory evolved as much as that of the upper molar and co-evolved with it. Among the coevolving changes, some are clearly involved in the new upper molar phenotype. We found a similar level of coevolution in bat limbs. In conclusion, our study reveals how serial organ morphology has adapted through organ-specific developmental changes, as expected, but also through shared changes that have organ-specific effects on the final phenotype. This highlights the important role of developmental system drift in one organ to accommodate adaptation in another.
Affiliation(s)
- Marie Sémon
- Laboratoire de Biologie et Modelisation de la Cellule, Ecole Normale Superieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Universite Claude Bernard Lyon 1, 46 allee d'Italie, F-69364, Lyon, France.
- Marion Mouginot
- Laboratoire de Biologie et Modelisation de la Cellule, Ecole Normale Superieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Universite Claude Bernard Lyon 1, 46 allee d'Italie, F-69364, Lyon, France
- Manon Peltier
- Laboratoire de Biologie et Modelisation de la Cellule, Ecole Normale Superieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Universite Claude Bernard Lyon 1, 46 allee d'Italie, F-69364, Lyon, France
- Claudine Corneloup
- Laboratoire de Biologie et Modelisation de la Cellule, Ecole Normale Superieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Universite Claude Bernard Lyon 1, 46 allee d'Italie, F-69364, Lyon, France
- Philippe Veber
- Laboratoire de Biometrie et Biologie Evolutive, Universite Claude Bernard Lyon 1, UMR CNRS 5558, 69622, Villeurbanne, France
- Laurent Guéguen
- Laboratoire de Biometrie et Biologie Evolutive, Universite Claude Bernard Lyon 1, UMR CNRS 5558, 69622, Villeurbanne, France
- Sophie Pantalacci
- Laboratoire de Biologie et Modelisation de la Cellule, Ecole Normale Superieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Universite Claude Bernard Lyon 1, 46 allee d'Italie, F-69364, Lyon, France.
8. Feng S, Huang L, Pournara AV, Huang Z, Yang X, Zhang Y, Brazma A, Shi M, Papatheodorou I, Miao Z. Alleviating batch effects in cell type deconvolution with SCCAF-D. Nat Commun 2024; 15:10867. [PMID: 39738054 PMCID: PMC11686230 DOI: 10.1038/s41467-024-55213-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 12/02/2024] [Indexed: 01/01/2025] Open
Abstract
Cell type deconvolution methods can impute cell proportions from bulk transcriptomics data, revealing changes in disease progression or organ development. However, benchmarking studies often use simulated bulk data from the same source as the reference, which limits how well their conclusions transfer to real application scenarios. This study examines batch effects in deconvolution and introduces SCCAF-D, a computational workflow that achieves a Pearson correlation coefficient above 0.75 across simulated and real bulk data for various tissue types. Applied to non-alcoholic fatty liver disease, SCCAF-D unveils meaningful insights into changes in cell proportions during disease progression.
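The Pearson correlation coefficient cited here, together with the root mean square error used elsewhere in this list, is a common way to score estimated proportions against known ones. The following is a minimal sketch of such a scoring step; the function name `evaluate`, the choice to pool all samples and cell types into one vector, and the example numbers are assumptions, not the evaluation code of SCCAF-D.

```python
import numpy as np

def evaluate(estimated, truth):
    """Return (Pearson correlation, RMSE) over all samples and cell types."""
    e = np.asarray(estimated, dtype=float).ravel()
    t = np.asarray(truth, dtype=float).ravel()
    pcc = float(np.corrcoef(e, t)[0, 1])
    rmse = float(np.sqrt(np.mean((e - t) ** 2)))
    return pcc, rmse

# Hypothetical proportions: 2 samples x 3 cell types.
truth = np.array([[0.50, 0.30, 0.20],
                  [0.10, 0.60, 0.30]])
estimated = np.array([[0.45, 0.35, 0.20],
                      [0.15, 0.55, 0.30]])

pcc, rmse = evaluate(estimated, truth)
print(f"PCC = {pcc:.3f}, RMSE = {rmse:.4f}")
```

The two metrics answer different questions: correlation measures whether the estimates rise and fall with the truth, while RMSE measures how far off they are in absolute terms, so a method can score well on one and poorly on the other.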
Grants
- This work was supported by the Natural Science Foundation of China (32270707), the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903), the R&D Programs of Guangzhou Laboratory, Grant No. GZNL2024A01002, GZNL2023A01006, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102.
Affiliation(s)
- Shuo Feng
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, China
- Liangfeng Huang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai, China
- Anna Vathrakokoili Pournara
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
- Ziliang Huang
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Xinlu Yang
- Department of Obstetrics and Gynaecology, Harbin Red Cross Central Hospital, Harbin, 150001, China
- Yongjian Zhang
- Harbin Medical University the Sixth Affiliated Hospital, Harbin, 150023, China
- Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
- Ming Shi
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150001, China.
- Irene Papatheodorou
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK.
- Medical School, University of East Anglia, Norwich Research Park, Norwich, NR4 7UA, UK.
- Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China.
- Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai, China.
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Cambridge, CB10 1SD, UK.
9. Zeng D, Fang Y, Qiu W, Luo P, Wang S, Shen R, Gu W, Huang X, Mao Q, Wang G, Lai Y, Rong G, Xu X, Shi M, Wu Z, Yu G, Liao W. Enhancing immuno-oncology investigations through multidimensional decoding of tumor microenvironment with IOBR 2.0. Cell Rep Methods 2024; 4:100910. [PMID: 39626665 PMCID: PMC11704618 DOI: 10.1016/j.crmeth.2024.100910] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/27/2024] [Revised: 07/30/2024] [Accepted: 11/06/2024] [Indexed: 12/19/2024]
Abstract
The use of large transcriptome datasets has greatly improved our understanding of the tumor microenvironment (TME) and helped develop precise immunotherapies. The growing application of multi-omics, single-cell RNA sequencing (scRNA-seq), and spatial transcriptome sequencing has led to many new insights, yet these findings still require clinical validation in large cohorts. To advance multi-omics integration in TME research, we have upgraded the Immuno-Oncology Biological Research (IOBR) package to IOBR 2.0, restructuring and standardizing its analytical workflow. IOBR 2.0 offers six modules for TME analysis based on multi-omics data, including data preprocessing, TME estimation, TME infiltration pattern identification, cellular interaction analysis, genome and TME interaction, and feature visualization, as well as modeling. Additionally, IOBR 2.0 enables constructing gene signatures and reference matrices from scRNA-seq data for TME deconvolution. The user-friendly pipeline provides comprehensive insights into tumor-immune interactions, and a detailed GitBook (https://iobr.github.io/book/) offers a complete manual and analysis guide for each module.
Affiliation(s)
- Dongqiang Zeng
- Cancer Center, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China; Foshan Key Laboratory of Translational Medicine in Oncology, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China; Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
- Yiran Fang
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
- Wenjun Qiu
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
- Peng Luo
- Department of Oncology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Shixiang Wang
- Department of Biomedical Informatics, School of Life Sciences, Central South University, Changsha, P.R. China
| | - Rongfang Shen
- Department of Thyroid and Neck Surgery, Beijing Chaoyang Hospital, Capital Medical University, Beijing, P.R. China
| | - Wenchao Gu
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Xiatong Huang
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Qianqian Mao
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Gaofeng Wang
- Department of Plastic and Aesthetic Surgery, Nanfang Hospital of Southern Medical University, Guangzhou, Guangdong, P.R. China; Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Yonghong Lai
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Guangda Rong
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Xi Xu
- The First School of Clinical Medical, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Min Shi
- Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China
| | - Zuqiang Wu
- Cancer Center, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China; Foshan Key Laboratory of Translational Medicine in Oncology, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China.
| | - Guangchuang Yu
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, P.R. China.
| | - Wangjun Liao
- Cancer Center, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China; Foshan Key Laboratory of Translational Medicine in Oncology, the Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Foshan, P.R. China; Department of Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, P.R. China.
10
|
Yin H, Duo H, Li S, Qin D, Xie L, Xiao Y, Sun J, Tao J, Zhang X, Li Y, Zou Y, Yang Q, Yang X, Hao Y, Li B. Unlocking biological insights from differentially expressed genes: Concepts, methods, and future perspectives. J Adv Res 2024:S2090-1232(24)00560-5. [PMID: 39647635 DOI: 10.1016/j.jare.2024.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2024] [Revised: 10/12/2024] [Accepted: 12/03/2024] [Indexed: 12/10/2024]
Abstract
BACKGROUND: Identifying differentially expressed genes (DEGs) is a core task of transcriptome analysis, as DEGs can reveal the molecular mechanisms underlying biological processes. However, interpreting the biological significance of large DEG lists is challenging. Currently, gene ontology, pathway enrichment and protein-protein interaction analysis are common strategies employed by biologists. Additionally, emerging analytical strategies/approaches (such as network module analysis, knowledge graph, drug repurposing, cell marker discovery, trajectory analysis, and cell communication analysis) have been proposed. Despite these advances, comprehensive guidelines for systematically and thoroughly mining the biological information within DEGs remain lacking. AIM OF REVIEW: This review aims to provide an overview of essential concepts and methodologies for the biological interpretation of DEGs, enhancing the contextual understanding. It also addresses the current limitations and future perspectives of these approaches, highlighting their broad applications in deciphering the molecular mechanism of complex diseases and phenotypes. To assist users in extracting insights from extensive datasets, especially various DEG lists, we developed DEGMiner (https://www.ciblab.net/DEGMiner/), which integrates over 300 easily accessible databases and tools. KEY SCIENTIFIC CONCEPTS OF REVIEW: This review offers strong support and guidance for exploring DEGs, and also will accelerate the discovery of hidden biological insights within genomes.
Affiliation(s)
- Huachun Yin
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China; Department of Neurosurgery, Xinqiao Hospital, The Army Medical University, Chongqing 400037, PR China; Department of Neurobiology, Chongqing Key Laboratory of Neurobiology, The Army Medical University, Chongqing 400038, PR China
- Hongrui Duo
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Song Li
  - Department of Neurosurgery, Xinqiao Hospital, The Army Medical University, Chongqing 400037, PR China
- Dan Qin
  - Department of Biology, College of Science, Northeastern University, Boston, MA 02115, USA
- Lingling Xie
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Yingxue Xiao
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Jing Sun
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Jingxin Tao
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Xiaoxi Zhang
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Yinghong Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, PR China
- Yue Zou
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Qingxia Yang
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, PR China
- Xian Yang
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Youjin Hao
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, PR China
11
|
Lu Q, Liu Z, Wang X. Inferring tumor purity using multi-omics data based on a uniform machine learning framework MoTP. Brief Bioinform 2024; 26:bbaf056. [PMID: 39950745 PMCID: PMC11826339 DOI: 10.1093/bib/bbaf056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2024] [Revised: 12/24/2024] [Accepted: 01/27/2025] [Indexed: 02/17/2025]
Abstract
Existing algorithms for assessing tumor purity are limited to a single omics data type, such as gene expression, somatic copy number variations, somatic mutations, or DNA methylation. Here we proposed the machine learning Multi-omics Tumor Purity prediction (MoTP) algorithm to estimate tumor purity based on multiple types of omics data. MoTP utilizes Bayesian Regularized Neural Networks as the prediction algorithm, and Consensus Tumor Purity Estimates as labels. We trained MoTP using multi-omics data (mRNA, microRNA, long non-coding RNA, and DNA methylation) across 21 TCGA solid cancer types. By testing MoTP in TCGA validation sets, TCGA test sets, and eight datasets outside the TCGA cancer cohorts, we showed that although MoTP could achieve excellent performance in predicting tumor purity based on a single omics data type, the integration of multiple single omics data-based predictions can enhance the prediction performance. Moreover, we demonstrated the robustness of MoTP by testing it in datasets with Gaussian noise and missing features. Benchmark analysis showed that MoTP outperformed most established tumor purity prediction algorithms, and that it required less running time and fewer computational resources to fulfill the predictive task. Thus, MoTP would be an attractive option for computational tumor purity inference.
Affiliation(s)
- Qiqi Lu
  - Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
  - Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
  - Big Data Research Institute, China Pharmaceutical University, Nanjing, China
- Zhixian Liu
  - Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, China
- Xiaosheng Wang
  - Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
  - Cancer Genomics Research Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
  - Big Data Research Institute, China Pharmaceutical University, Nanjing, China
12
|
Sokolowski DJ, Hou H, Yuki KE, Roy A, Chan C, Choi W, Faykoo-Martinez M, Hudson M, Corre C, Uusküla-Reimand L, Goldenberg A, Palmert MR, Wilson MD. Age, sex, and cell type-resolved hypothalamic gene expression across the pubertal transition in mice. Biol Sex Differ 2024; 15:83. [PMID: 39449090 PMCID: PMC11515584 DOI: 10.1186/s13293-024-00661-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 10/07/2024] [Indexed: 10/26/2024]
Abstract
BACKGROUND: The hypothalamus plays a central role in regulating puberty. However, our knowledge of the postnatal gene regulatory networks that control the pubertal transition in males and females is incomplete. Here, we investigate the age-, sex- and cell-type-specific gene regulation in the hypothalamus across the pubertal transition. METHODS: We used RNA-seq to profile hypothalamic gene expression in male and female mice at five time points spanning the onset of puberty (postnatal days (PD) 12, 22, 27, 32, and 37). By combining these data with hypothalamic single nuclei RNA-seq data from pre- and postpubertal mice, we assigned gene expression changes to their most likely cell types of origin. In our colony, pubertal onset occurs earlier in male mice, allowing us to focus on genes whose expression is dynamic across ages and offset between sexes, and to explore the bases of sex effects. RESULTS: Our age-by-sex pattern of expression was enriched for biological pathways involved in hormone production, neuronal activation, and glial maturation. Additionally, we inferred a robust expansion of oligodendrocyte precursor cells into mature oligodendrocytes spanning the prepubertal (PD12) to peri-pubertal (PD27) timepoints. Using spatial transcriptomic data from postpubertal mice, we observed that the lateral hypothalamic area and zona incerta were the most oligodendrocyte-rich regions and that these cells expressed genes known to be involved in pubertal regulation. CONCLUSION: Together, by incorporating multiple biological timepoints and using sex as a variable, we identified gene and cell-type changes that may participate in orchestrating the pubertal transition and provided a resource for future studies of postnatal hypothalamic gene regulation.
Affiliation(s)
- Dustin J Sokolowski
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Huayun Hou
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Kyoko E Yuki
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Anna Roy
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Cadia Chan
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Donnelly Centre for Cellular & Biomolecular Research, Toronto, ON, Canada
- Wendy Choi
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Mariela Faykoo-Martinez
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
- Matt Hudson
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Christina Corre
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Anna Goldenberg
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
  - CIFAR, Toronto, ON, Canada
- Mark R Palmert
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Division of Endocrinology, The Hospital for Sick Children, Toronto, ON, Canada
  - Departments of Pediatrics and Physiology, University of Toronto, Toronto, ON, Canada
  - Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Michael D Wilson
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
13
|
Gabriel AAG, Racle J, Falquet M, Jandus C, Gfeller D. Robust estimation of cancer and immune cell-type proportions from bulk tumor ATAC-Seq data. eLife 2024; 13:RP94833. [PMID: 39383060 PMCID: PMC11464006 DOI: 10.7554/elife.94833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2024]
Abstract
Assay for Transposase-Accessible Chromatin sequencing (ATAC-Seq) is a widely used technique to explore gene regulatory mechanisms. For most ATAC-Seq data from healthy and diseased tissues such as tumors, chromatin accessibility measurement represents a mixed signal from multiple cell types. In this work, we derive reliable chromatin accessibility marker peaks and reference profiles for most non-malignant cell types frequently observed in the microenvironment of human tumors. We then integrate these data into the EPIC deconvolution framework (Racle et al., 2017) to quantify cell-type heterogeneity in bulk ATAC-Seq data. Our EPIC-ATAC tool accurately predicts non-malignant and malignant cell fractions in tumor samples. When applied to a human breast cancer cohort, EPIC-ATAC accurately infers the immune contexture of the main breast cancer subtypes.
Affiliation(s)
- Aurélie Anne-Gaëlle Gabriel
  - Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
  - Agora Cancer Research Center, Lausanne, Switzerland
  - Swiss Cancer Center Leman (SCCL), Geneva, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Julien Racle
  - Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
  - Agora Cancer Research Center, Lausanne, Switzerland
  - Swiss Cancer Center Leman (SCCL), Geneva, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Maryline Falquet
  - Swiss Cancer Center Leman (SCCL), Geneva, Switzerland
  - Ludwig Institute for Cancer Research, Lausanne Branch, Lausanne, Switzerland
  - Department of Pathology and Immunology, Faculty of Medicine, University of Geneva, Geneva, Switzerland
  - Geneva Center for Inflammation Research, Geneva, Switzerland
- Camilla Jandus
  - Swiss Cancer Center Leman (SCCL), Geneva, Switzerland
  - Ludwig Institute for Cancer Research, Lausanne Branch, Lausanne, Switzerland
  - Department of Pathology and Immunology, Faculty of Medicine, University of Geneva, Geneva, Switzerland
  - Geneva Center for Inflammation Research, Geneva, Switzerland
- David Gfeller
  - Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
  - Agora Cancer Research Center, Lausanne, Switzerland
  - Swiss Cancer Center Leman (SCCL), Geneva, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
14
|
Ahn C, Divoux A, Zhou M, Seldin MM, Sparks LM, Whytock KL. An optimized pipeline for high-throughput bulk RNA-Seq deconvolution illustrates the impact of obesity and weight loss on cell composition of human adipose tissue. bioRxiv 2024:2024.09.23.614489. [PMID: 39386599 PMCID: PMC11463495 DOI: 10.1101/2024.09.23.614489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Cellular heterogeneity of human adipose tissue is linked to the pathophysiology of obesity and may impact the response to energy restriction and changes in fat mass. Here, we provide an optimized pipeline to estimate cellular composition in human abdominal subcutaneous adipose tissue (ASAT) from publicly available bulk RNA-Seq using signature profiles from our previously published full-length single nuclei (sn)RNA-Seq of the same depot. Individuals with obesity had greater proportions of macrophages and lower proportions of adipocyte sub-populations and vascular cells compared with lean individuals. Two months of diet-induced weight loss (DIWL) increased the estimated proportions of macrophages; however, two years of DIWL reduced the estimated proportions of macrophages, thereby suggesting a bi-phasic nature of cellular remodeling of ASAT during weight loss. Our optimized high-throughput pipeline facilitates the assessment of composition changes of highly characterized cell types in large numbers of ASAT samples using low-cost bulk RNA-Seq. Our data reveal novel changes in cellular heterogeneity and its association with cardiometabolic health in humans with obesity and following weight loss.
Affiliation(s)
- Cheehoon Ahn
  - Translational Research Institute, AdventHealth, Orlando, FL, USA
- Adeline Divoux
  - Translational Research Institute, AdventHealth, Orlando, FL, USA
- Mingqi Zhou
  - Department of Biological Chemistry and Center for Epigenetics and Metabolism, University of California, Irvine, Irvine, CA, USA
- Marcus M Seldin
  - Department of Biological Chemistry and Center for Epigenetics and Metabolism, University of California, Irvine, Irvine, CA, USA
- Lauren M Sparks
  - Translational Research Institute, AdventHealth, Orlando, FL, USA
- Katie L Whytock
  - Translational Research Institute, AdventHealth, Orlando, FL, USA
15
|
Zhou X, Cai M, Yue M, Celedón JC, Wang J, Ding Y, Chen W, Li Y. Molecular group and correlation guided structural learning for multi-phenotype prediction. Brief Bioinform 2024; 25:bbae585. [PMID: 39541190 PMCID: PMC11562839 DOI: 10.1093/bib/bbae585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 08/09/2024] [Accepted: 10/30/2024] [Indexed: 11/16/2024]
Abstract
We propose a supervised learning bioinformatics tool, Biological gRoup guIded muLtivariate muLtiple lIneAr regression with peNalizaTion (Brilliant), designed for feature selection and outcome prediction in genomic data with multi-phenotypic responses. Brilliant specifically incorporates genome and/or phenotype grouping structures, as well as phenotype correlation structures, in feature selection, effect estimation, and outcome prediction under a penalized multi-response linear regression model. Extensive simulations demonstrate its superior performance compared to competing methods. We applied Brilliant to two omics studies. In the first study, we identified novel association signals between multivariate gene expressions and high-dimensional DNA methylation profiles, providing biological insights for the baseline CpG-to-gene regulation patterns in a Puerto Rican children asthma cohort. The second study focused on cell-type deconvolution prediction using high-dimensional gene expression profiles. Using Brilliant, we improved the accuracy for cell-type fraction prediction and identified novel cell-type signature genes.
Affiliation(s)
- Xueping Zhou
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15216, United States
- Manqi Cai
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15216, United States
- Molin Yue
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15216, United States
- Juan C Celedón
  - Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, United States
- Jiebiao Wang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15216, United States
- Ying Ding
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15216, United States
- Wei Chen
  - Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, United States
- Yanming Li
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas, KS 66160, United States
16
|
Wang C, Lin Y, Li S, Guan J. Deconvolution from bulk gene expression by leveraging sample-wise and gene-wise similarities and single-cell RNA-Seq data. BMC Genomics 2024; 25:875. [PMID: 39294558 PMCID: PMC11409548 DOI: 10.1186/s12864-024-10728-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 08/20/2024] [Indexed: 09/20/2024]
Abstract
BACKGROUND: The widely adopted bulk RNA-seq measures the gene expression average of cells, masking cell type heterogeneity, which confounds downstream analyses. Therefore, identifying the cellular composition and cell type-specific gene expression profiles (GEPs) facilitates the study of the underlying mechanisms of various biological processes. Although single-cell RNA-seq focuses on cell type heterogeneity in gene expression, it requires specialized and expensive resources and currently is not practical for a large number of samples or a routine clinical setting. Recently, computational deconvolution methodologies have been developed, but many of them only estimate cell type composition or cell type-specific GEPs by requiring the other as input. The development of more accurate deconvolution methods to infer cell type abundance and cell type-specific GEPs is still essential. RESULTS: We propose a new deconvolution algorithm, DSSC, which infers cell type-specific gene expression and cell type proportions of heterogeneous samples simultaneously by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with the other existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC shows robustness to the change of marker gene number and sample size and also has cost and time efficiencies. CONCLUSIONS: DSSC provides a practical and promising alternative to the experimental techniques to characterize cellular composition and heterogeneity in the gene expression of heterogeneous samples.
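The premise stated in the abstract above, that a bulk measurement is an average over the cells in a sample, can be illustrated with a minimal sketch. It is not the DSSC algorithm: the function name `mix_bulk`, the two cell types, and all numbers are made-up illustration values, and the sketch only shows the forward direction (profiles and proportions in, pseudo-bulk out) that deconvolution methods try to invert.

```python
def mix_bulk(profiles, proportions):
    """Return pseudo-bulk expression as the proportion-weighted sum of profiles.

    profiles:    dict mapping cell type -> list of per-gene expression values
    proportions: dict mapping cell type -> fraction of the sample (sums to 1)
    Both inputs are hypothetical illustration data.
    """
    n_genes = len(next(iter(profiles.values())))
    bulk = [0.0] * n_genes
    for cell_type, fraction in proportions.items():
        for g, value in enumerate(profiles[cell_type]):
            bulk[g] += fraction * value
    return bulk


# Three genes: one T-cell marker, one B-cell marker, one shared gene.
profiles = {"T cell": [10.0, 0.0, 2.0], "B cell": [0.0, 8.0, 2.0]}
proportions = {"T cell": 0.25, "B cell": 0.75}

bulk = mix_bulk(profiles, proportions)
print(bulk)  # [2.5, 6.0, 2.0]
```

The shared gene carries no information about composition (its bulk value is 2.0 for any mixture), which is one reason marker selection matters for deconvolution.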
Affiliation(s)
- Chenqi Wang
  - Department of Automation, Xiamen University, Xiamen, China
- Yifan Lin
  - Department of Automation, Xiamen University, Xiamen, China
- Shuchao Li
  - Department of Automation, Xiamen University, Xiamen, China
- Jinting Guan
  - Department of Automation, Xiamen University, Xiamen, China
  - Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
17
|
Liu Y, Vierkant R, Bhagwate A, Jons W, Stallings-Mann M, McCauley B, Carter J, Stephens M, Pfrender M, Littlepage L, Radisky D, Cunningham J, Degnim A, Winham S, Wang C. Evaluating cell type deconvolution in FFPE breast tissue: application to benign breast disease. NAR Genom Bioinform 2024; 6:lqae098. [PMID: 40162103 PMCID: PMC11952925 DOI: 10.1093/nargab/lqae098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 07/11/2024] [Accepted: 07/23/2024] [Indexed: 04/02/2025]
Abstract
Transcriptome profiling using RNA sequencing (RNA-seq) of bulk formalin-fixed paraffin-embedded (FFPE) tissue blocks is a standard method in biomedical research. However, when used on tissues with diverse cell type compositions, it yields averaged gene expression profiles, complicating biomarker identification due to variations in cell proportions. To address the need for optimized strategies for defining individual cell type compositions from bulk FFPE samples, we constructed single-cell RNA-seq reference data for breast tissue and tested cell type deconvolution methods. Initial simulation experiments showed similar performances across multiple commonly used deconvolution methods. However, the introduction of FFPE artifacts significantly impacted their performances, with a root mean squared error (RMSE) ranging between 0.04 and 0.17. Scaden, a deep learning-based method, consistently outperformed the others, demonstrating robustness against FFPE artifacts. Testing these methods on our 62-sample RNA-seq benign breast disease cohort in which cell type composition was estimated using digital pathology approaches, we found that pre-filtering of the reference data enhanced the accuracy of most methods, realizing up to a 32% reduction in RMSE. To support further research efforts in this domain, we introduce SCdeconR, an R package designed for streamlined cell type deconvolution assessments and downstream analyses.
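For readers unfamiliar with the metric quoted in the abstract above, the following is a minimal sketch of how a root mean squared error between estimated and reference cell-type proportions, and a relative reduction in it, can be computed. The function names and all numbers are hypothetical illustration values; they are not results from the study and this is not the SCdeconR interface.

```python
import math


def rmse(estimated, truth):
    """Root mean squared error between two equal-length lists of proportions."""
    if len(estimated) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimated, truth)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Fractional decrease of an error value, e.g. 0.32 for a 32% reduction."""
    return (before - after) / before


# Hypothetical proportions for three cell types in one sample.
truth = [0.5, 0.3, 0.2]
estimated = [0.4, 0.4, 0.2]

error = rmse(estimated, truth)
print(round(error, 4))

# Hypothetical RMSE values before and after some change to the reference data.
print(round(relative_reduction(0.25, 0.17), 2))
```

Because proportions lie between 0 and 1, an RMSE in the 0.04 to 0.17 range reported above corresponds to typical per-cell-type errors of roughly 4 to 17 percentage points.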
Affiliation(s)
- Yuanhang Liu
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Robert A Vierkant
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Aditya Bhagwate
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- William A Jons
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
  - Biomedical Engineering and Physiology Graduate Program, Mayo Clinic Graduate School of Biomedical Sciences, Rochester, MN 55905, USA
- Bryan M McCauley
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Jodi M Carter
  - Department of Laboratory Medicine & Pathology, University of Alberta, Edmonton, AB T6G 2R3, Canada
- Melissa T Stephens
  - Genomics and Bioinformatics Core Facility, University of Notre Dame, Notre Dame, IN 46556, USA
- Michael E Pfrender
  - Department of Biological Sciences, 109B Galvin Life Science Center, University of Notre Dame, Notre Dame, IN 46556, USA
- Laurie E Littlepage
  - Department of Chemistry and Biochemistry, Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN 46556, USA
- Derek C Radisky
  - Department of Cancer Biology, Mayo Clinic, 4500 San Pablo Road, Jacksonville, FL 32224, USA
- Julie M Cunningham
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 1st Street SW, Rochester, MN 55905, USA
- Amy C Degnim
  - Department of Surgery, Mayo Clinic, 200 1st Street SW, Rochester, MN 55905, USA
- Stacey J Winham
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Chen Wang
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
18
|
Larsen JH, Jensen IS, Svenningsen P. Benchmarking transcriptome deconvolution methods for estimating tissue- and cell-type-specific extracellular vesicle abundances. J Extracell Vesicles 2024; 13:e12511. [PMID: 39320021 PMCID: PMC11423344 DOI: 10.1002/jev2.12511] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/29/2024] [Accepted: 08/28/2024] [Indexed: 09/26/2024]
Abstract
Extracellular vesicles (EVs) contain cell-derived lipids, proteins and RNAs; however, determining the tissue- and cell-type-specific EV abundances in body fluids remains a significant hurdle for our understanding of EV biology. While tissue- and cell-type-specific EV abundances can be estimated by matching the EV's transcriptome to a tissue's/cell type's expression signature using deconvolutional methods, a comparative assessment of deconvolution methods' performance on EV transcriptome data is currently lacking. We benchmarked 11 deconvolution methods using data from four cell lines and their EVs, in silico mixtures, 118 human plasma and 88 urine EVs. We identified deconvolution methods that estimated cell type-specific abundances of pure and in silico mixed cell line-derived EV samples with high accuracy. Using data from two urine EV cohorts with different EV isolation procedures, four deconvolution methods produced highly similar results. The three methods were also concordant in their tissue- and cell-type-specific plasma EV abundance estimates. We identified driving factors for deconvolution accuracy and highlighted the importance of implementing biological knowledge in creating the tissue/cell type signature. Overall, our analyses demonstrate that the deconvolution algorithms DWLS and CIBERSORTx produce highly similar and accurate estimates of tissue- and cell-type-specific EV abundances in biological fluids.
Affiliation(s)
- Iben Skov Jensen
  - Department of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Per Svenningsen
  - Department of Molecular Medicine, University of Southern Denmark, Odense, Denmark
19
|
Li Y, Luo Y. STdGCN: spatial transcriptomic cell-type deconvolution using graph convolutional networks. Genome Biol 2024; 25:206. [PMID: 39103939 DOI: 10.1186/s13059-024-03353-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/01/2023] [Accepted: 07/26/2024] [Indexed: 08/07/2024]
Abstract
Spatially resolved transcriptomics integrates high-throughput transcriptome measurements with preserved spatial cellular organization information. However, many technologies cannot reach single-cell resolution. We present STdGCN, a graph model leveraging single-cell RNA sequencing (scRNA-seq) as reference for cell-type deconvolution in spatial transcriptomic (ST) data. STdGCN incorporates expression profiles from scRNA-seq and spatial localization from ST data for deconvolution. Extensive benchmarking on multiple datasets demonstrates that STdGCN outperforms 17 state-of-the-art models. In a human breast cancer Visium dataset, STdGCN delineates stroma, lymphocytes, and cancer cells, aiding tumor microenvironment analysis. In human heart ST data, STdGCN identifies changes in endothelial-cardiomyocyte communications during tissue development.
Affiliation(s)
- Yawei Li
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Yuan Luo
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
  - Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
|
20
|
Kilian C, Ulrich H, Zouboulis VA, Sprezyna P, Schreiber J, Landsberger T, Büttner M, Biton M, Villablanca EJ, Huber S, Adlung L. Longitudinal single-cell data informs deterministic modelling of inflammatory bowel disease. NPJ Syst Biol Appl 2024; 10:69. [PMID: 38914538 PMCID: PMC11196733 DOI: 10.1038/s41540-024-00395-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 06/14/2024] [Indexed: 06/26/2024] Open
Abstract
Single-cell-based methods such as flow cytometry or single-cell mRNA sequencing (scRNA-seq) allow deep molecular and cellular profiling of immunological processes. Despite their high throughput, however, these measurements represent only a snapshot in time. Here, we explore how longitudinal single-cell-based datasets can be used for deterministic ordinary differential equation (ODE)-based modelling to mechanistically describe immune dynamics. We derived longitudinal changes in cell numbers of colonic cell types during inflammatory bowel disease (IBD) from flow cytometry and scRNA-seq data of murine colitis using ODE-based models. Our mathematical model generalised well across different protocols and experimental techniques, and we hypothesised that the estimated model parameters reflect biological processes. We validated this prediction of cellular turnover rates with KI-67 staining and with gene expression information from the scRNA-seq data not used for model fitting. Finally, we tested the translational relevance of the mathematical model by deconvolution of longitudinal bulk mRNA-sequencing data from a cohort of human IBD patients treated with olamkicept. We found that neutrophil depletion may contribute to IBD patients entering remission. The predictive power of IBD deterministic modelling highlights its potential to advance our understanding of immune dynamics in health and disease.
Affiliation(s)
- Christoph Kilian
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany
- Hanna Ulrich
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany
- Viktor A Zouboulis
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany
- Paulina Sprezyna
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany
- Jasmin Schreiber
  - Leibniz Institute for the Analysis of Biodiversity Change, D-20146, Hamburg, Germany
- Tomer Landsberger
  - Department of Statistics and Data Science, Hebrew University of Jerusalem, Jerusalem, Israel
- Maren Büttner
  - Calico Life Sciences, LLC, South San Francisco, CA, USA
- Moshe Biton
  - Department of Immunology and Regenerative Biology, Weizmann Institute of Science, Rehovot, Israel
- Eduardo J Villablanca
  - Division of Immunology and Allergy, Department of Medicine Solna, Karolinska Institutet and University Hospital, Stockholm, Sweden
  - Center of Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Samuel Huber
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany
- Lorenz Adlung
  - I. Department of Medicine, University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany.
  - Hamburg Center for Translational Immunology (HCTI) and Center for Biomedical AI (bAIome), University Medical Center Hamburg-Eppendorf (UKE), D-20246, Hamburg, Germany.
|
21
|
Wang L, Izadmehr S, Sfakianos JP, Tran M, Beaumont KG, Brody R, Cordon-Cardo C, Horowitz A, Sebra R, Oh WK, Bhardwaj N, Galsky MD, Zhu J. Single-cell transcriptomic-informed deconvolution of bulk data identifies immune checkpoint blockade resistance in urothelial cancer. iScience 2024; 27:109928. [PMID: 38812546 PMCID: PMC11133924 DOI: 10.1016/j.isci.2024.109928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 11/23/2023] [Accepted: 05/03/2024] [Indexed: 05/31/2024] Open
Abstract
Interactions within the tumor microenvironment (TME) significantly influence tumor progression and treatment responses. While single-cell RNA sequencing (scRNA-seq) and spatial genomics facilitate TME exploration, many clinical cohorts are assessed at the bulk tissue level. Integrating scRNA-seq and bulk tissue RNA-seq data through computational deconvolution is essential for obtaining clinically relevant insights. Our method, ProM, enables the examination of major and minor cell types. Through evaluation against existing methods using paired single-cell and bulk RNA sequencing of human urothelial cancer (UC) samples, ProM demonstrates superiority. Application to UC cohorts treated with immune checkpoint inhibitors reveals pre-treatment cellular features associated with poor outcomes, such as elevated SPP1 expression in macrophage/monocytes (MM). Our deconvolution method and paired single-cell and bulk tissue RNA-seq dataset contribute novel insights into TME heterogeneity and resistance to immune checkpoint blockade.
Affiliation(s)
- Li Wang
  - Department of Precision Medicine, Aitia, Somerville, MA 02143, USA
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
- Sudeh Izadmehr
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
- John P. Sfakianos
  - Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Michelle Tran
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
  - The Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Kristin G. Beaumont
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Rachel Brody
  - Pathology, Molecular and Cell-Based Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Carlos Cordon-Cardo
  - Pathology, Molecular and Cell-Based Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Amir Horowitz
  - The Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Robert Sebra
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- William K. Oh
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
- Nina Bhardwaj
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
  - The Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Matthew D. Galsky
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
- Jun Zhu
  - Department of Medicine, Division of Hematology Oncology, Icahn School of Medicine at Mount Sinai, Tisch Cancer Institute, New York, NY 10029, USA
|
22
|
Tiong KL, Luzhbin D, Yeang CH. Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data. BMC Bioinformatics 2024; 25:209. [PMID: 38867193 PMCID: PMC11167951 DOI: 10.1186/s12859-024-05825-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open
Abstract
BACKGROUND Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also possess a high level of noise, abundant missing entries and sometimes inadequate or no cell type annotations at all. Bulk-level gene expression data lack direct information of cell population composition but are more robust and complete and often better annotated. We propose a modeling framework to integrate bulk-level and single-cell RNASeq data to address the deficiencies and leverage the mutual strengths of each type of data and enable a more comprehensive inference of their transcriptomic heterogeneity. Contrary to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed the probabilistic graphical models of cell-level gene expressions from the decomposition outcomes, and compared the log-likelihood scores of these models in single-cell data. We term this framework backward deconvolution as inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for inclusion or exclusion of zero entries in log-likelihood score computation. RESULTS We selected nine deconvolution algorithms and validated backward deconvolution in five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors of mixture coefficients and cell type specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown but the log-likelihood scores were strongly correlated with accuracy rates of inferred cell types. 
In the data of autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model that tumors of each subtype were dominated by one cell type persistently outperformed an alternative model that each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated by simulated sc-RNASeq data. CONCLUSIONS The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.
Affiliation(s)
- Khong-Loon Tiong
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Dmytro Luzhbin
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
|
23
|
Yang M, Kaarbø M, Myhre V, Reims HM, Karlsen TH, Wang J, Rognes T, Halvorsen B, Fevang B, Lundin KEA, Aukrust P, Bjørås M, Jørgensen SF. Altered Genome-Wide DNA Methylation in the Duodenum of Common Variable Immunodeficiency Patients. J Clin Immunol 2024; 44:133. [PMID: 38780872 PMCID: PMC11116262 DOI: 10.1007/s10875-024-01726-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 04/29/2024] [Indexed: 05/25/2024]
Abstract
PURPOSE A large proportion of common variable immunodeficiency (CVID) patients have duodenal inflammation with increased intraepithelial lymphocytes (IEL) of unknown aetiology. The histologic similarities to celiac disease lead to confusion regarding treatment (gluten-free diet) of these patients. We aimed to elucidate the role of epigenetic DNA methylation in the aetiology of duodenal inflammation in CVID and differentiate it from true celiac disease. METHODS DNA was isolated from snap-frozen pieces of duodenal biopsies and analysed for differences in genome-wide epigenetic DNA methylation between CVID patients with increased IEL (CVID_IEL; n = 5), CVID patients without increased IEL (CVID_N; n = 3), celiac disease patients (n = 3) and healthy controls (n = 3). RESULTS The DNA methylation data of 5-methylcytosine in CpG sites separated CVID and celiac disease from healthy controls. Genes with differentially methylated promoters were identified as potential novel mediators in CVID and celiac disease. There was limited overlap of methylation-associated genes between CVID_IEL and celiac disease. A high frequency of differentially methylated CpG sites was detected in over 100 genes near the transcription start site (TSS) in both CVID_IEL and celiac disease, compared to healthy controls. Differential methylation of genes involved in regulation of TNF/cytokine production was enriched in CVID_IEL, compared to healthy controls. CONCLUSION This is the first study to reveal a role of epigenetic DNA methylation in the aetiology of duodenal inflammation of CVID patients, distinguishing CVID_IEL from celiac disease. We identified potential biomarkers and therapeutic targets within gene promoters and in high-frequency differentially methylated CpG regions proximal to TSS in both CVID_IEL and celiac disease.
Affiliation(s)
- Mingyi Yang
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
  - Department of Medical Biochemistry, Oslo University Hospital and University of Oslo, Oslo, Norway
- Mari Kaarbø
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
- Vegard Myhre
  - Research Institute of Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Henrik M Reims
  - Department of Pathology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Tom H Karlsen
  - Research Institute of Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
  - Norwegian PSC Research Center, Department of Transplantation Medicine, Oslo University Hospital, Oslo, Norway
  - Section of Gastroenterology, Department of Transplantation Medicine, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Junbai Wang
  - Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital and University of Oslo, Lørenskog, Norway
- Torbjørn Rognes
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
  - Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Bente Halvorsen
  - Research Institute of Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Børre Fevang
  - Section of Clinical Immunology and Infectious Diseases, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Knut E A Lundin
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
  - Section of Gastroenterology, Department of Transplantation Medicine, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Pål Aukrust
  - Research Institute of Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway
  - Section of Clinical Immunology and Infectious Diseases, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Magnar Bjørås
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, NTNU, Trondheim, Norway
  - The Proteomics and Modomics Experimental Core Facility (PROMEC) at Norwegian University of Science and Technology, Trondheim, Norway
- Silje F Jørgensen
  - Research Institute of Internal Medicine, Division of Surgery, Inflammatory Diseases and Transplantation, Oslo University Hospital, Oslo, Norway.
  - Section of Clinical Immunology and Infectious Diseases, Oslo University Hospital, Rikshospitalet, Oslo, Norway.
|
24
|
Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
  - Advaita Bioinformatics, Ann Arbor, MI, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
|
25
|
Meng G, Pan Y, Tang W, Zhang L, Cui Y, Schumacher FR, Wang M, Wang R, He S, Krischer J, Li Q, Feng H. imply: improving cell-type deconvolution accuracy using personalized reference profiles. Genome Med 2024; 16:65. [PMID: 38685057 PMCID: PMC11057104 DOI: 10.1186/s13073-024-01338-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 04/18/2024] [Indexed: 05/02/2024] Open
Abstract
Using computational tools, bulk transcriptomics can be deconvoluted to estimate the abundance of constituent cell types. However, existing deconvolution methods are conditioned on the assumption that the whole study population is served by a single reference panel, ignoring person-to-person heterogeneity. Here, we present imply, a novel algorithm to deconvolute cell type proportions using personalized reference panels. Simulation studies demonstrate reduced bias compared with existing methods. Real data analyses on longitudinal consortia show disparities in cell type proportions are associated with several disease phenotypes in Type 1 diabetes and Parkinson's disease. imply is available through the R/Bioconductor package ISLET at https://bioconductor.org/packages/ISLET/ .
Affiliation(s)
- Guanqun Meng
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
- Yue Pan
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, 38105, TN, USA
- Wen Tang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
- Lijun Zhang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
- Ying Cui
  - Department of Biomedical Data Science, Stanford University, Stanford, 94305, CA, USA
- Fredrick R Schumacher
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
- Ming Wang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
- Rui Wang
  - Department of Surgery, Division of Surgical Oncology, University Hospitals Cleveland Medical Center, Cleveland, 44106, OH, USA
- Sijia He
  - Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, USA
- Jeffrey Krischer
  - Health Informatics Institute, University of South Florida, Tampa, 38105, FL, USA
- Qian Li
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, 38105, TN, USA.
- Hao Feng
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA.
|
26
|
Garmire LX, Li Y, Huang Q, Xu C, Teichmann SA, Kaminski N, Pellegrini M, Nguyen Q, Teschendorff AE. Challenges and perspectives in computational deconvolution of genomics data. Nat Methods 2024; 21:391-400. [PMID: 38374264 DOI: 10.1038/s41592-023-02166-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/04/2022] [Accepted: 12/26/2023] [Indexed: 02/21/2024]
Abstract
Deciphering cell-type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach for estimating cell-type abundances from a variety of omics data. Despite substantial methodological progress in computational deconvolution in recent years, challenges remain. Here we list four important challenges related to computational deconvolution: the quality of the reference data, generation of ground truth data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions of computational methodologies, and strategies to promote rigorous benchmarking.
Affiliation(s)
- Lana X Garmire
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Yijun Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Qianhui Huang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Chuan Xu
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Naftali Kaminski
  - Pulmonary, Critical Care & Sleep Medicine, Yale University School of Medicine, New Haven, CT, USA
- Matteo Pellegrini
  - Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
- Quan Nguyen
  - Institute for Molecular Bioscience, The University of Queensland and QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Andrew E Teschendorff
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - UCL Cancer Institute, University College London, London, UK
|
27
|
Zhang J, Zhang L, Gongol B, Hayes J, Borowsky A, Bailey-Serres J, Girke T. spatialHeatmap: visualizing spatial bulk and single-cell assays in anatomical images. NAR Genom Bioinform 2024; 6:lqae006. [PMID: 38312938 PMCID: PMC10836942 DOI: 10.1093/nargab/lqae006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 01/18/2024] [Indexed: 02/06/2024] Open
Abstract
Visualizing spatial assay data in anatomical images is vital for understanding biological processes in cell, tissue, and organ organizations. Technologies requiring this functionality include traditional one-at-a-time assays, and bulk and single-cell omics experiments, including RNA-seq and proteomics. The spatialHeatmap software provides a series of powerful new methods for these needs, and allows users to work with adequately formatted anatomical images from public collections or custom images. It colors the spatial features (e.g. tissues) annotated in the images according to the measured or predicted abundance levels of biomolecules (e.g. mRNAs) using a color key. This core functionality of the package is called a spatial heatmap plot. Single-cell data can be co-visualized in composite plots that combine spatial heatmaps with embedding plots of high-dimensional data. The resulting spatial context information is essential for gaining insights into the tissue-level organization of single-cell data, or vice versa. Additional core functionalities include the automated identification of biomolecules with spatially selective abundance patterns and clusters of biomolecules sharing similar abundance profiles. To appeal to both non-expert and computational users, spatialHeatmap provides a graphical and a command-line interface, respectively. It is distributed as a free, open-source Bioconductor package (https://bioconductor.org/packages/spatialHeatmap) that users can install on personal computers, shared servers, or cloud systems.
Affiliation(s)
- Jianhai Zhang
  - Institute for Integrative Genome Biology, Department of Botany and Plant Sciences, 1207F Genomics Building, University of California, Riverside, CA 92521, USA
- Le Zhang
  - Institute for Integrative Genome Biology, Department of Botany and Plant Sciences, 1207F Genomics Building, University of California, Riverside, CA 92521, USA
- Brendan Gongol
  - Institute for Integrative Genome Biology, Department of Botany and Plant Sciences, 1207F Genomics Building, University of California, Riverside, CA 92521, USA
- Jordan Hayes
  - Institute for Integrative Genome Biology, Department of Botany and Plant Sciences, 1207F Genomics Building, University of California, Riverside, CA 92521, USA
- Alexander T Borowsky
  - Center for Plant Cell Biology, Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA 92521, USA
- Julia Bailey-Serres
  - Center for Plant Cell Biology, Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA 92521, USA
- Thomas Girke
  - Institute for Integrative Genome Biology, Department of Botany and Plant Sciences, 1207F Genomics Building, University of California, Riverside, CA 92521, USA
|
28
|
Nishikawa T, Lee M, Amau M. New generative methods for single-cell transcriptome data in bulk RNA sequence deconvolution. Sci Rep 2024; 14:4156. [PMID: 38378978 PMCID: PMC10879528 DOI: 10.1038/s41598-024-54798-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 02/16/2024] [Indexed: 02/22/2024] Open
Abstract
Numerous methods for bulk RNA sequence deconvolution have been developed to identify cellular targets of diseases by understanding the composition of cell types in disease-related tissues. However, heterogeneity in gene expression between subjects and the shortage of reference single-cell RNA sequence data remain obstacles to accurate bulk deconvolution. In our study, we investigated whether a new data generative method named sc-CMGAN and benchmark generative methods (Copula, CTGAN and TVAE) could solve these issues and improve bulk deconvolution. We also evaluated the robustness of sc-CMGAN using three deconvolution methods and four public datasets. In almost all conditions, the generative methods contributed to improved deconvolution. Notably, sc-CMGAN outperformed the benchmark methods and demonstrated higher robustness. This study is the first to examine the impact of data augmentation on bulk deconvolution. The new generative method, sc-CMGAN, is expected to become a powerful tool for preprocessing in bulk deconvolution.
Affiliation(s)
- Toui Nishikawa
  - Faculty of Medicine, Wakayama Medical University, 811-1 Kimiidera, Wakayama, 641-8509, Japan.
- Masatoshi Lee
  - Faculty of Medicine, Wakayama Medical University, 811-1 Kimiidera, Wakayama, 641-8509, Japan
|
29
|
Guo X, Huang Z, Ju F, Zhao C, Yu L. Highly Accurate Estimation of Cell Type Abundance in Bulk Tissues Based on Single-Cell Reference and Domain Adaptive Matching. Adv Sci (Weinh) 2024; 11:e2306329. [PMID: 38072669 PMCID: PMC10870031 DOI: 10.1002/advs.202306329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 10/27/2023] [Indexed: 02/17/2024]
Abstract
Accurately identifying the cellular composition of complex tissues is critical for understanding disease pathogenesis, early diagnosis, and prevention. However, current methods for deconvoluting bulk RNA sequencing (RNA-seq) typically rely on matched single-cell RNA sequencing (scRNA-seq) as a reference, which can be limiting due to differences in sequencing distribution and the potential for invalid information from single-cell references. Hence, a novel computational method named SCROAM is introduced to address these challenges. SCROAM transforms scRNA-seq and bulk RNA-seq into a shared feature space, effectively eliminating distributional differences in the latent space. Subsequently, cell-type-specific expression matrices are generated from the scRNA-seq data, facilitating the precise identification of cell types within bulk tissues. The performance of SCROAM is assessed through benchmarking against simulated and real datasets, demonstrating its accuracy and robustness. To further validate SCROAM's performance, single-cell and bulk RNA-seq experiments are conducted on mouse spinal cord tissue, with SCROAM applied to identify cell types in bulk tissue. Results indicate that SCROAM is a highly effective tool for identifying similar cell types. An integrated analysis of liver cancer and primary glioblastoma is then performed. Overall, this research offers a novel perspective for delivering precise insights into disease pathogenesis and potential therapeutic strategies.
Affiliation(s)
- Xinyang Guo
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
| | - Zhaoyang Huang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
| | - Fen Ju
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an 710032, China
| | - Chenguang Zhao
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an 710032, China
| | - Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
| |
|
30
|
Zhou M, Tamburini I, Van C, Molendijk J, Nguyen CM, Chang IYY, Johnson C, Velez LM, Cheon Y, Yeo R, Bae H, Le J, Larson N, Pulido R, Nascimento-Filho CHV, Jang C, Marazzi I, Justice J, Pannunzio N, Hevener AL, Sparks L, Kershaw EE, Nicholas D, Parker BL, Masri S, Seldin MM. Leveraging inter-individual transcriptional correlation structure to infer discrete signaling mechanisms across metabolic tissues. eLife 2024; 12:RP88863. [PMID: 38224289 PMCID: PMC10945578 DOI: 10.7554/elife.88863] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/16/2024] Open
Abstract
Inter-organ communication is a vital process to maintain physiologic homeostasis, and its dysregulation contributes to many human diseases. Given that circulating bioactive factors are stable in serum, occur naturally, and are easily assayed from blood, they present obvious focal molecules for therapeutic intervention and biomarker development. Recently, studies have shown that secreted proteins mediating inter-tissue signaling could be identified by 'brute force' surveys of all genes within RNA-sequencing measures across tissues within a population. Expanding on this intuition, we reasoned that parallel strategies could be used to understand how individual genes mediate signaling across metabolic tissues through correlative analyses of gene variation between individuals. Thus, comparison of quantitative levels of gene expression relationships between organs in a population could aid in understanding cross-organ signaling. Here, we surveyed gene-gene correlation structure across 18 metabolic tissues in 310 human individuals and 7 tissues in 103 diverse strains of mice fed a normal chow or high-fat/high-sucrose (HFHS) diet. Variation of genes such as FGF21, ADIPOQ, GCG, and IL6 showed enrichments which recapitulate experimental observations. Further, similar analyses were applied to explore both within-tissue signaling mechanisms (liver PCSK9) and genes encoding enzymes producing metabolites (adipose PNPLA2), where inter-individual correlation structure aligned with known roles for these critical metabolic pathways. Examination of sex hormone receptor correlations in mice highlighted the difference of tissue-specific variation in relationships with metabolic traits. We refer to this resource as gene-derived correlations across tissues (GD-CAT) where all tools and data are built into a web portal enabling users to perform these analyses without a single line of code (gdcat.org). 
This resource enables querying of any gene in any tissue to find correlated patterns of genes, cell types, pathways, and network architectures across metabolic organs.
Affiliation(s)
- Mingqi Zhou
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Ian Tamburini
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Cassandra Van
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Jeffrey Molendijk
- Department of Anatomy and Physiology, University of Melbourne, Melbourne, Australia
| | - Christy M Nguyen
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | | | - Casey Johnson
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Leandro M Velez
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Youngseo Cheon
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Reichelle Yeo
- Translational Research Institute, AdventHealth, Orlando, United States
| | - Hosung Bae
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Johnny Le
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Natalie Larson
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Ron Pulido
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Carlos HV Nascimento-Filho
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Cholsoon Jang
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Ivan Marazzi
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Jamie Justice
- Veterans Administration Greater Los Angeles Healthcare System, Geriatric Research Education and Clinical Center (GRECC), Los Angeles, United States
| | - Nicholas Pannunzio
- Division of Hematology/Oncology, Department of Medicine, UC Irvine Health, Irvine, United States
| | - Andrea L Hevener
- Department of Medicine, Division of Endocrinology, Diabetes, and Hypertension, David Geffen School of Medicine at UCLA, Los Angeles, United States
- Iris Cantor-UCLA Women’s Health Research Center, David Geffen School of Medicine at UCLA, Los Angeles, United States
| | - Lauren Sparks
- Translational Research Institute, AdventHealth, Orlando, United States
| | - Erin E Kershaw
- Division of Endocrinology, Department of Medicine, University of Pittsburgh, Pittsburgh, United States
| | - Dequina Nicholas
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
- Department of Molecular Biology and Biochemistry, School of Biological Sciences, University of California Irvine, Irvine, United States
| | - Benjamin L Parker
- Department of Anatomy and Physiology, University of Melbourne, Melbourne, Australia
| | - Selma Masri
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| | - Marcus M Seldin
- Department of Biological Chemistry, UC Irvine, Irvine, United States
- Center for Epigenetics and Metabolism, UC Irvine, Irvine, United States
| |
|
31
|
Uzuner D, İlgün A, Düz E, Bozkurt FB, Çakır T. Multilayer Analysis of RNA Sequencing Data in Alzheimer's Disease to Unravel Molecular Mysteries. ADVANCES IN NEUROBIOLOGY 2024; 41:219-246. [PMID: 39589716 DOI: 10.1007/978-3-031-69188-1_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2024]
Abstract
Alzheimer's disease (AD) is a complex disease, and numerous cellular events may be involved in etiology. RNAseq-based transcriptome data hold multilayer information content, which could be crucial in unraveling molecular mysteries of AD. It enables quantification of gene expression levels, identification of genomic variants, and elucidation of splicing anomalies such as exon skipping and intron retention. Additional integration of this information into protein-protein interaction networks and genome-scale metabolic models from the literature has potential to decipher functional modules and affected mechanisms for complex scenarios such as AD. In this chapter, we review the application areas of the multilayer content of RNAseq and associated integrative approaches available, with a special focus on AD.
Affiliation(s)
- Dilara Uzuner
- Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
| | - Atılay İlgün
- Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
| | - Elif Düz
- Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
| | - Fatma Betül Bozkurt
- Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey
| | - Tunahan Çakır
- Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, Turkey.
| |
|
32
|
Sidiropoulos DN, Ho WJ, Jaffee EM, Kagohara LT, Fertig EJ. Systems immunology spanning tumors, lymph nodes, and periphery. CELL REPORTS METHODS 2023; 3:100670. [PMID: 38086385 PMCID: PMC10753389 DOI: 10.1016/j.crmeth.2023.100670] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/06/2023] [Revised: 10/20/2023] [Accepted: 11/17/2023] [Indexed: 12/21/2023]
Abstract
The immune system defines a complex network of tissues and cell types that orchestrate responses across the body in a dynamic manner. The local and systemic interactions between immune and cancer cells contribute to disease progression. Lymphocytes are activated in lymph nodes, traffic through the periphery, and impact cancer progression through their interactions with tumor cells. As a result, therapeutic response and resistance are mediated across tissues, and a comprehensive understanding of lymphocyte dynamics requires a systems-level approach. In this review, we highlight experimental and computational methods that can leverage the study of leukocyte trafficking through an immunomics lens and reveal how adaptive immunity shapes cancer.
Affiliation(s)
- Dimitrios N Sidiropoulos
- Johns Hopkins University School of Medicine, Baltimore, MD, USA; Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Won Jin Ho
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Elizabeth M Jaffee
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
| | - Luciane T Kagohara
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA.
| | - Elana J Fertig
- Johns Hopkins Convergence Institute, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA; Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA; Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| |
|
33
|
Maden SK, Kwon SH, Huuki-Myers LA, Collado-Torres L, Hicks SC, Maynard KR. Challenges and opportunities to computationally deconvolve heterogeneous tissue with varying cell sizes using single-cell RNA-sequencing datasets. Genome Biol 2023; 24:288. [PMID: 38098055 PMCID: PMC10722720 DOI: 10.1186/s13059-023-03123-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 05/11/2023] [Accepted: 11/24/2023] [Indexed: 12/17/2023] Open
Abstract
Deconvolution of cell mixtures in "bulk" transcriptomic samples from homogenate human tissue is important for understanding disease pathologies. However, several experimental and computational challenges impede transcriptomics-based deconvolution approaches using single-cell/nucleus RNA-seq reference atlases. Cells from the brain and blood have substantially different sizes, total mRNA, and transcriptional activities, and existing approaches may quantify total mRNA instead of cell type proportions. Further, standards are lacking for the use of cell reference atlases and integrative analyses of single-cell and spatial transcriptomics data. We discuss how to approach these key challenges with orthogonal "gold standard" datasets for evaluating deconvolution methods.
Affiliation(s)
- Sean K Maden
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Sang Ho Kwon
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Louise A Huuki-Myers
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
| | - Leonardo Collado-Torres
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
| | - Stephanie C Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA.
| | - Kristen R Maynard
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA.
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA.
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA.
| |
|
34
|
Zhou M, Tamburini IJ, Van C, Molendijk J, Nguyen CM, Chang IYY, Johnson C, Velez LM, Cheon Y, Yeo RX, Bae H, Le J, Larson N, Pulido R, Filho C, Jang C, Marazzi I, Justice JN, Pannunzio N, Hevener A, Sparks LM, Kershaw EE, Nicholas D, Parker B, Masri S, Seldin M. Leveraging inter-individual transcriptional correlation structure to infer discrete signaling mechanisms across metabolic tissues. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.10.540142. [PMID: 37214953 PMCID: PMC10197628 DOI: 10.1101/2023.05.10.540142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Abstract/Introduction: Inter-organ communication is a vital process to maintain physiologic homeostasis, and its dysregulation contributes to many human diseases. Beginning with the discovery of insulin over a century ago, characterization of molecules responsible for signaling between tissues has required careful and elegant experimentation, and these observations have been integral to deciphering physiology and disease. Given that circulating bioactive factors are stable in serum, occur naturally, and are easily assayed from blood, they present obvious focal molecules for therapeutic intervention and biomarker development. For example, physiologic dissection of the actions of soluble proteins such as proprotein convertase subtilisin/kexin type 9 (PCSK9) and glucagon-like peptide 1 (GLP1) has yielded some of the most promising therapeutics to treat cardiovascular disease and obesity, respectively (1–4). A major obstacle in the characterization of such soluble factors is that defining their tissues and pathways of action requires extensive experimental testing in cells and animal models. Recently, studies have shown that secreted proteins mediating inter-tissue signaling could be identified by “brute-force” surveys of all genes within RNA-sequencing measures across tissues within a population (5–9). Expanding on this intuition, we reasoned that parallel strategies could be used to understand how individual genes mediate signaling across metabolic tissues through correlative analyses of gene variation between individuals. Thus, comparison of quantitative levels of gene expression relationships between organs in a population could aid in understanding cross-organ signaling. Here, we surveyed gene-gene correlation structure across 18 metabolic tissues in 310 human individuals and 7 tissues in 103 diverse strains of mice fed a normal chow or HFHS diet. Variation of genes such as FGF21, ADIPOQ, GCG and IL6 showed enrichments which recapitulate experimental observations. 
Further, similar analyses were applied to explore both within-tissue signaling mechanisms (liver PCSK9) as well as genes encoding enzymes producing metabolites (adipose PNPLA2), where inter-individual correlation structure aligned with known roles for these critical metabolic pathways. Examination of sex hormone receptor correlations in mice highlighted the difference of tissue-specific variation in relationships with metabolic traits. We refer to this resource as Gene-Derived Correlations Across Tissues (GD-CAT) where all tools and data are built into a web portal enabling users to perform these analyses without a single line of code (gdcat.org). This resource enables querying of any gene in any tissue to find correlated patterns of genes, cell types, pathways and network architectures across metabolic organs.
Affiliation(s)
- Mingqi Zhou
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Ian J. Tamburini
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Cassandra Van
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Jeffrey Molendijk
- Department of Anatomy and Physiology, University of Melbourne, Melbourne, VIC, Australia
| | - Christy M Nguyen
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | | | - Casey Johnson
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Leandro M. Velez
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Youngseo Cheon
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Reichelle X. Yeo
- Translational Research Institute, AdventHealth, Orlando, FL, USA
| | - Hosung Bae
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Johnny Le
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Natalie Larson
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Ron Pulido
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Carlos Filho
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Cholsoon Jang
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Ivan Marazzi
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Jamie N. Justice
- Veterans Administration Greater Los Angeles Healthcare System, Geriatric Research Education and Clinical Center (GRECC), Los Angeles, CA, USA
| | - Nicholas Pannunzio
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Andrea Hevener
- Department of Medicine, Division of Endocrinology, Diabetes, and Hypertension, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Iris Cantor-UCLA Women’s Health Research Center, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - Lauren M. Sparks
- Translational Research Institute, AdventHealth, Orlando, FL, USA
| | - Erin E. Kershaw
- Department of Internal Medicine, Section On Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Dequina Nicholas
- Division of Endocrinology, Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Benjamin Parker
- Department of Anatomy and Physiology, University of Melbourne, Melbourne, VIC, Australia
| | - Selma Masri
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
- Center for Epigenetics and Metabolism, UC Irvine. Irvine, CA, USA
| | - Marcus Seldin
- Department of Biological Chemistry, UC Irvine. Irvine, CA, USA
| |
|
35
|
Meng G, Pan Y, Tang W, Zhang L, Cui Y, Schumacher FR, Wang M, Wang R, He S, Krischer J, Li Q, Feng H. imply: improving cell-type deconvolution accuracy using personalized reference profiles. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.27.559579. [PMID: 37808714 PMCID: PMC10557724 DOI: 10.1101/2023.09.27.559579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Real-world clinical samples are often admixtures of signal mosaics from multiple pure cell types. Using computational tools, bulk transcriptomics can be deconvoluted to solve for the abundance of constituent cell types. However, existing deconvolution methods are conditioned on the assumption that the whole study population is served by a single reference panel, which ignores person-to-person heterogeneity. Here we present imply, a novel algorithm to deconvolute cell type proportions using personalized reference panels. imply can borrow information across repeatedly measured samples for each subject, and obtain precise cell type proportion estimations. Simulation studies demonstrate reduced bias in cell type abundance estimation compared with existing methods. Real data analyses on large longitudinal consortia show more realistic deconvolution results that align with biological facts. Our results suggest that disparities in cell type proportions are associated with several disease phenotypes in type 1 diabetes and Parkinson's disease. Our proposed tool imply is available through the R/Bioconductor package ISLET at https://bioconductor.org/packages/ISLET/.
Affiliation(s)
- Guanqun Meng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Yue Pan
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, 38105, TN, USA
| | - Wen Tang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Lijun Zhang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Ying Cui
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, CA, USA
| | - Fredrick R. Schumacher
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Ming Wang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| | - Rui Wang
- Department of Surgery, Division of Surgical Oncology, University Hospitals Cleveland Medical Center, Cleveland, 44106, OH, USA
| | - Sijia He
- Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, USA
| | - Jeffrey Krischer
- Health Informatics Institute, University of South Florida, Tampa, 38105, FL, USA
| | - Qian Li
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, 38105, TN, USA
| | - Hao Feng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, OH, USA
| |
|
36
|
Karagiannis K, Gannavaram S, Verma C, Pacheco-Fernandez T, Bhattacharya P, Nakhasi HL, Satoskar AR. Dual-scRNA-seq analysis reveals rare and uncommon parasitized cell populations in chronic L. donovani infection. Cell Rep 2023; 42:113097. [PMID: 37682713 DOI: 10.1016/j.celrep.2023.113097] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/05/2022] [Revised: 06/21/2023] [Accepted: 08/22/2023] [Indexed: 09/10/2023] Open
Abstract
Although phagocytic cells are documented targets of Leishmania parasites, it is unclear whether other cell types can be infected. Here, we use unbiased single-cell RNA sequencing (scRNA-seq) to simultaneously analyze host cell and Leishmania donovani transcriptomes to identify and annotate parasitized cells in spleen and bone marrow in chronically infected mice. Our dual-scRNA-seq methodology allows the detection of heterogeneous parasitized populations. In the spleen, monocytes and macrophages are the dominant parasitized cells, while megakaryocytes, basophils, and natural killer (NK) cells are found to be unexpectedly infected. In the bone marrow, the hematopoietic stem cells (HSCs) expressing phagocytic receptors FcγR and CD93 are the main parasitized cells. Additionally, we also detect parasitized cycling basal cells, eosinophils, and macrophages in chronically infected mice. Flow cytometric analysis confirms the presence of parasitized HSCs. Our unbiased dual-scRNA-seq method identifies rare, parasitized cells, potentially implicated in pathogenesis, persistence, and protective immunity, using a non-targeted approach.
Affiliation(s)
| | - Sreenivas Gannavaram
- Division of Emerging and Transfusion Transmitted Diseases, CBER, FDA, Silver Spring, MD, USA
| | - Chaitenya Verma
- Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
| | | | - Parna Bhattacharya
- Division of Emerging and Transfusion Transmitted Diseases, CBER, FDA, Silver Spring, MD, USA
| | - Hira L Nakhasi
- Division of Emerging and Transfusion Transmitted Diseases, CBER, FDA, Silver Spring, MD, USA
| | - Abhay R Satoskar
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA; Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA.
| |
|
37
|
Vallelonga V, Gandolfi F, Ficara F, Della Porta MG, Ghisletti S. Emerging Insights into Molecular Mechanisms of Inflammation in Myelodysplastic Syndromes. Biomedicines 2023; 11:2613. [PMID: 37892987 PMCID: PMC10603842 DOI: 10.3390/biomedicines11102613] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/11/2023] [Revised: 09/15/2023] [Accepted: 09/21/2023] [Indexed: 10/29/2023] Open
Abstract
Inflammation impacts human hematopoiesis across physiologic and pathologic conditions, as signals derived from the bone marrow microenvironment, such as pro-inflammatory cytokines and chemokines, have been shown to alter hematopoietic stem cell (HSC) homeostasis. Dysregulated inflammation can skew HSC fate-related decisions, leading to aberrant hematopoiesis and potentially contributing to the pathogenesis of hematological disorders such as myelodysplastic syndromes (MDS). Recently, emerging studies have used single-cell sequencing and multi-omic approaches to investigate HSC cellular heterogeneity and gene expression in normal hematopoiesis as well as in myeloid malignancies. This review summarizes recent reports mechanistically dissecting the role of inflammatory signaling and innate immune response activation due to MDS progression. Furthermore, we highlight the growing importance of using multi-omic techniques, such as single-cell profiling and deconvolution methods, to unravel MDS heterogeneity. These approaches have provided valuable insights into the patterns of clonal evolution that drive MDS progression and have elucidated the impact of inflammation on the composition of the bone marrow immune microenvironment in MDS.
Affiliation(s)
- Veronica Vallelonga
- Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, 20139 Milan, Italy
| | - Francesco Gandolfi
- Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, 20139 Milan, Italy
| | - Francesca Ficara
- Milan Unit, CNR-IRGB, 20090 Milan, Italy
- IRCCS Humanitas Research Hospital, 20089 Milan, Italy
| | - Matteo Giovanni Della Porta
- IRCCS Humanitas Research Hospital, 20089 Milan, Italy
- Department of Biomedical Sciences, Humanitas University, 20072 Milan, Italy
| | - Serena Ghisletti
- Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, 20139 Milan, Italy
| |
|
38
|
Wang J, Lu L, Zheng S, Wang D, Jin L, Zhang Q, Li M, Zhang Z. DeCOOC Deconvoluted Hi-C Map Characterizes the Chromatin Architecture of Cells in Physiologically Distinctive Tissues. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2301058. [PMID: 37515382 PMCID: PMC10520690 DOI: 10.1002/advs.202301058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 07/06/2023] [Indexed: 07/30/2023]
Abstract
Deciphering variations in chromosome conformations based on bulk three-dimensional (3D) genomic data from heterogeneous tissues is key to understanding cell-type-specific genome architecture and dynamics. Surprisingly, computational deconvolution methods for high-throughput chromosome conformation capture (Hi-C) data remain very rare in the literature. Here, deCOOC (deconvolve bulk Hi-C data), a deep convolutional neural network (CNN) that outperformed all the state-of-the-art tools in the deconvolution task, is developed. Interestingly, it is noticed that the chromatin accessibility or the Hi-C contact frequency alone is insufficient to explain the power of deCOOC, suggesting the existence of a latent embedded layer of information pertaining to the cell-type-specific 3D genome architecture. By applying deCOOC to in-house-generated bulk Hi-C data from visceral and subcutaneous adipose tissues, it is found that the characteristic chromatin features of M2 cells in the two anatomical loci are distinctively bound to different physiological functionalities. Taken together, deCOOC is both a reliable Hi-C data deconvolution method and a powerful tool for functional extraction of 3D genome architecture.
Affiliation(s)
- Junmei Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Lu Lu
- Livestock and Poultry Multiomics Key Laboratory of Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Animal Breeding and Genetics Key Laboratory of Sichuan Province, Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Chengdu 611130, China
| | - Shiqi Zheng
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Danyang Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing 100049, China
- Sars-Fang Centre & MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266100, China
| | - Long Jin
- Livestock and Poultry Multiomics Key Laboratory of Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Animal Breeding and Genetics Key Laboratory of Sichuan Province, Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Chengdu 611130, China
| | - Qing Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
| | - Mingzhou Li
- Livestock and Poultry Multiomics Key Laboratory of Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Animal Breeding and Genetics Key Laboratory of Sichuan Province, Institute of Animal Genetics and Breeding, Sichuan Agricultural University, Chengdu 611130, China
| | - Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing 100049, China
|
39
|
Tsalenchuk M, Gentleman SM, Marzi SJ. Linking environmental risk factors with epigenetic mechanisms in Parkinson's disease. NPJ Parkinsons Dis 2023; 9:123. [PMID: 37626097 PMCID: PMC10457362 DOI: 10.1038/s41531-023-00568-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 12/08/2022] [Accepted: 08/16/2023] [Indexed: 08/27/2023] Open
Abstract
Sporadic Parkinson's disease (PD) is a progressive neurodegenerative disease, with a complex risk structure thought to be influenced by interactions between genetic variants and environmental exposures, although the full aetiology is unknown. Environmental factors, including pesticides, have been reported to increase the risk of developing the disease. Growing evidence suggests epigenetic changes are key mechanisms by which these environmental factors act upon gene regulation in disease-relevant cell types. We present a systematic review critically appraising and summarising the current body of evidence on the relationship between epigenetic mechanisms and environmental risk factors in PD to inform future research in this area. Epigenetic studies of relevant environmental risk factors in animal and cell models have yielded promising results; however, research in humans is just emerging. While published studies in humans are currently relatively limited, the importance of the field for the elucidation of molecular mechanisms of pathogenesis opens clear and promising avenues for the future of PD research. Carefully designed epidemiological studies carried out in PD patients hold great potential to uncover disease-relevant gene regulatory mechanisms. Therefore, to advance this burgeoning field, we recommend broadening the scope of investigations to include more environmental exposures, increasing sample sizes, focusing on disease-relevant cell types, and recruiting more diverse cohorts.
Affiliation(s)
- Maria Tsalenchuk
- UK Dementia Research Institute, Imperial College London, London, UK
- Department of Brain Sciences, Imperial College London, London, UK
| | | | - Sarah J Marzi
- UK Dementia Research Institute, Imperial College London, London, UK.
- Department of Brain Sciences, Imperial College London, London, UK.
|
40
|
Cain A, Taga M, McCabe C, Green GS, Hekselman I, White CC, Lee DI, Gaur P, Rozenblatt-Rosen O, Zhang F, Yeger-Lotem E, Bennett DA, Yang HS, Regev A, Menon V, Habib N, De Jager PL. Multicellular communities are perturbed in the aging human brain and Alzheimer's disease. Nat Neurosci 2023; 26:1267-1280. [PMID: 37336975 PMCID: PMC10789499 DOI: 10.1038/s41593-023-01356-x] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 12/23/2020] [Accepted: 05/10/2023] [Indexed: 06/21/2023]
Abstract
The role of different cell types and their interactions in Alzheimer's disease (AD) is a complex and open question. Here, we pursued this question by assembling a high-resolution cellular map of the aging frontal cortex using single-nucleus RNA sequencing of 24 individuals with a range of clinicopathologic characteristics. We used this map to infer the neocortical cellular architecture of 638 individuals profiled by bulk RNA sequencing, providing the sample size necessary for identifying statistically robust associations. We uncovered diverse cell populations associated with AD, including a somatostatin inhibitory neuronal subtype and oligodendroglial states. We further identified a network of multicellular communities, each composed of coordinated subpopulations of neuronal, glial and endothelial cells, and we found that two of these communities are altered in AD. Finally, we used mediation analyses to prioritize cellular changes that might contribute to cognitive decline. Thus, our deconstruction of the aging neocortex provides a roadmap for evaluating the cellular microenvironments underlying AD and dementia.
Affiliation(s)
- Anael Cain
- Edmond & Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Mariko Taga
- Center for Translational & Computational Immunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, USA
| | - Cristin McCabe
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Gilad S Green
- Edmond & Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | | | - Dylan I Lee
- Center for Translational & Computational Immunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, USA
| | - Pallavi Gaur
- Center for Translational & Computational Immunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, USA
| | - Orit Rozenblatt-Rosen
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Genentech, South San Francisco, CA, USA
| | - Feng Zhang
- Broad Institute, Cambridge, MA, USA
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Hyun-Sik Yang
- Broad Institute, Cambridge, MA, USA
- Harvard Medical School, Boston, MA, USA
- Center for Alzheimer Research and Treatment, Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
| | - Aviv Regev
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Koch Institute of Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Genentech, South San Francisco, CA, USA
| | - Vilas Menon
- Center for Translational & Computational Immunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, USA.
| | - Naomi Habib
- Edmond & Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
| | - Philip L De Jager
- Center for Translational & Computational Immunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, USA.
- Broad Institute, Cambridge, MA, USA.
|
41
|
Alonso-Moreda N, Berral-González A, De La Rosa E, González-Velasco O, Sánchez-Santos JM, De Las Rivas J. Comparative Analysis of Cell Mixtures Deconvolution and Gene Signatures Generated for Blood, Immune and Cancer Cells. Int J Mol Sci 2023; 24:10765. [PMID: 37445946 DOI: 10.3390/ijms241310765] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/25/2023] [Revised: 06/19/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023] Open
Abstract
In the last two decades, many detailed full transcriptomic studies on complex biological samples have been published and included in large gene expression repositories. These studies primarily provide a bulk expression signal for each sample, including multiple cell types mixed within the global signal. The cellular heterogeneity in these mixtures does not allow the activity of specific genes in specific cell types to be identified. Therefore, inferring relative cellular composition is a very powerful tool to achieve a more accurate molecular profiling of complex biological samples. In recent decades, computational techniques have been developed to solve this problem by applying deconvolution methods, designed to decompose cell mixtures into their cellular components and calculate the relative proportions of these elements. Some of them only calculate the cell proportions (supervised methods), while other deconvolution algorithms can also identify the gene signatures specific for each cell type (unsupervised methods). In this work, five deconvolution methods (CIBERSORT, FARDEEP, DECONICA, LINSEED and ABIS) were implemented and used to analyze blood and immune cells, and also cancer cells, in complex mixture samples (using three bulk expression datasets). Our study provides three analytical tools (corrplots, cell-signature plots and bar-mixture plots) that allow a thorough comparative analysis of the cell mixture data. The work indicates that CIBERSORT is a robust method optimized for the identification of immune cell types, but not as efficient in the identification of cancer cells. We also found that LINSEED is a very powerful unsupervised method that provides precise and specific gene signatures for each of the main immune cell types tested: neutrophils and monocytes (of the myeloid lineage), B-cells, NK cells and T-cells (of the lymphoid lineage), and also for cancer cells.
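As a rough illustration of the supervised setting described in this abstract, where a known signature matrix is used to recover cell proportions from a bulk profile, the sketch below solves a small non-negative least-squares problem by projected gradient descent. It is not the algorithm of any of the five benchmarked tools (CIBERSORT, for instance, is based on support vector regression). The function name, signature matrix and bulk vector are made-up toy values, assumed purely for illustration.

```python
def deconvolve_nnls(signature, bulk, n_iter=2000):
    """Estimate fractions x >= 0 that minimize ||S x - b||^2 (toy sketch).

    signature: rows are genes, columns are cell types (hypothetical values).
    bulk: one bulk expression value per gene.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # 1 / ||S||_F^2 is a safe step size: it never exceeds
    # 1 / (largest eigenvalue of S^T S).
    step = 1.0 / sum(v * v for row in signature for v in row)
    x = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual r = S x - b, then gradient direction g = S^T r.
        resid = [sum(signature[g][k] * x[k] for k in range(n_types)) - bulk[g]
                 for g in range(n_genes)]
        grad = [sum(signature[g][k] * resid[g] for g in range(n_genes))
                for k in range(n_types)]
        # Gradient step, then projection onto the non-negative orthant.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_types)]
    total = sum(x)
    # Rescale so the estimates read as relative proportions.
    return [v / total for v in x] if total > 0 else x


# Toy signature: 4 marker genes (rows) x 3 cell types (columns).
SIGNATURE = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
TRUE_FRACTIONS = [0.5, 0.3, 0.2]
# Noise-free bulk profile built from the toy signature and fractions.
BULK = [sum(s * f for s, f in zip(row, TRUE_FRACTIONS)) for row in SIGNATURE]

estimated = deconvolve_nnls(SIGNATURE, BULK)
```

Because the toy bulk profile is an exact, noise-free mixture, the estimate should land very close to `TRUE_FRACTIONS`. Real data add noise, platform effects and cell types missing from the signature, which is where the benchmarked methods differ.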
Affiliation(s)
- Natalia Alonso-Moreda
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
| | - Alberto Berral-González
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
| | - Enrique De La Rosa
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
| | - Oscar González-Velasco
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
| | - José Manuel Sánchez-Santos
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
- Department of Statistics, University of Salamanca (USAL), 37008 Salamanca, Spain
| | - Javier De Las Rivas
- Cancer Research Center (CiC-IBMCC, CSIC/USAL & IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL), & Instituto de Investigación Biomédica de Salamanca (IBSAL), 37007 Salamanca, Spain
|
42
|
Li Y, Luo Y. Spatial Transcriptomic Cell-type Deconvolution Using Graph Neural Networks. bioRxiv 2023:2023.03.10.532112. [PMID: 37333198 PMCID: PMC10274700 DOI: 10.1101/2023.03.10.532112] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/20/2023]
Abstract
Spatially resolved transcriptomics performs high-throughput measurement of transcriptomes while preserving spatial information about the cellular organizations. However, many spatially resolved transcriptomic technologies can only distinguish spots consisting of a mixture of cells instead of working at single-cell resolution. Here, we present STdGCN, a graph neural network model designed for cell type deconvolution of spatial transcriptomic (ST) data that can leverage abundant single-cell RNA sequencing (scRNA-seq) data as reference. STdGCN is the first model incorporating the expression profiles from single cell data as well as the spatial localization information from the ST data for cell type deconvolution. Extensive benchmarking experiments on multiple ST datasets showed that STdGCN outperformed 14 published state-of-the-art models. Applied to a human breast cancer Visium dataset, STdGCN discerned spatial distributions between stroma, lymphocytes and cancer cells for tumor microenvironment dissection. In a human heart ST dataset, STdGCN detected the changes of potential endothelial-cardiomyocyte communications during tissue development.
Affiliation(s)
- Yawei Li
- Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
- Center for Collaborative AI in Healthcare, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Yuan Luo
- Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
- Center for Collaborative AI in Healthcare, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
|
43
|
Heiling HM, Wilson DR, Rashid NU, Sun W, Ibrahim JG. Estimating cell type composition using isoform expression one gene at a time. Biometrics 2023; 79:854-865. [PMID: 34921386 PMCID: PMC11245124 DOI: 10.1111/biom.13614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2020] [Accepted: 12/08/2021] [Indexed: 11/29/2022]
Abstract
Human tissue samples are often mixtures of heterogeneous cell types, which can confound the analyses of gene expression data derived from such tissues. The cell type composition of a tissue sample may itself be of interest and is needed for proper analysis of differential gene expression. A variety of computational methods have been developed to estimate cell type proportions using gene-level expression data. However, RNA isoforms can also be differentially expressed across cell types, and isoform-level expression could be equally or more informative for determining cell type origin than gene-level expression. We propose a new computational method, IsoDeconvMM, which estimates cell type fractions using isoform-level gene expression data. A novel and useful feature of IsoDeconvMM is that it can estimate cell type proportions using only a single gene, though in practice we recommend aggregating estimates of a few dozen genes to obtain more accurate results. We demonstrate the performance of IsoDeconvMM using a unique data set with cell type-specific RNA-seq data across more than 135 individuals. This data set allows us to evaluate different methods given the biological variation of cell type-specific gene expression data across individuals. We further complement this analysis with additional simulations.
Affiliation(s)
- Hillary M Heiling
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Douglas R Wilson
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Naim U Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
44
|
Luo J, Wu X, Cheng Y, Chen G, Wang J, Song X. Expression quantitative trait locus studies in the era of single-cell omics. Front Genet 2023; 14:1182579. [PMID: 37284065 PMCID: PMC10239882 DOI: 10.3389/fgene.2023.1182579] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/10/2023] [Accepted: 04/26/2023] [Indexed: 06/08/2023] Open
Abstract
Genome-wide association studies have revealed that the regulation of gene expression bridges genetic variants and complex phenotypes. Profiling of the bulk transcriptome coupled with linkage analysis (expression quantitative trait locus (eQTL) mapping) has advanced our understanding of the relationship between genetic variants and gene regulation in the context of complex phenotypes. However, bulk transcriptomics has inherent limitations, as the regulation of gene expression tends to be cell-type-specific. The advent of single-cell RNA-seq technology now enables the identification of the cell-type-specific regulation of gene expression through a single-cell eQTL (sc-eQTL). In this review, we first provide an overview of sc-eQTL studies, including data processing and the mapping procedure of the sc-eQTL. We then discuss the benefits and limitations of sc-eQTL analyses. Finally, we present an overview of the current and future applications of sc-eQTL discoveries.
Affiliation(s)
- Jie Luo
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xinyi Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Yuan Cheng
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Guang Chen
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Jian Wang
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xijiao Song
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
|
45
|
Nava A, Alves da Quinta D, Prato L, Girotti R, Moron G, Llera AS, Fernández EA. Novel evaluation approach for molecular signature-based deconvolution methods. J Biomed Inform 2023; 142:104387. [PMID: 37172634 DOI: 10.1016/j.jbi.2023.104387] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/18/2022] [Revised: 03/17/2023] [Accepted: 05/07/2023] [Indexed: 05/15/2023]
Abstract
The tumoral immune microenvironment (TIME) plays a key role in prognosis, therapeutic approach and pathophysiological understanding of oncological processes. Several computational immune cell-type deconvolution methods (DM), supported by diverse molecular signatures (MS), have been developed to uncover such TIME interplay from RNA-seq tumor biopsies. MS-DM pairs were benchmarked against each other by means of different metrics, such as Pearson's correlation, R² and RMSE, but these only evaluate the linear association of the estimated proportion with the expected one, missing the analysis of prediction-dependent bias trends and cell identification accuracy. We present a novel protocol composed of four tests allowing appropriate evaluation of the cell type identification performance and proportion prediction accuracy of molecular signature-deconvolution method pairs by means of certainty and confidence cell-type identification scores (F1-score, distance to the optimal point and error rates) as well as the Bland-Altman method for error-trend analysis. Our protocol was used to benchmark six state-of-the-art DMs (CIBERSORTx, DCQ, DeconRNASeq, EPIC, MIXTURE and quanTIseq) paired to five murine tissue-specific MSs, revealing a systematic overestimation of the number of different cell types across almost all methods.
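The point that correlation-style metrics can miss systematic bias is easy to see on a toy case. The sketch below is not the authors' four-test protocol. It only computes Pearson's correlation, RMSE, and the bias with 95% limits of agreement that usually summarize a Bland-Altman analysis, on made-up proportions in which every estimate is 0.05 too high. All function names and values are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rmse(estimated, expected):
    """Root mean square error between estimated and expected proportions."""
    sq = [(e - t) ** 2 for e, t in zip(estimated, expected)]
    return math.sqrt(sum(sq) / len(sq))


def bland_altman(estimated, expected):
    """Return (bias, lower limit, upper limit) of the differences.

    Limits are bias +/- 1.96 sample standard deviations of the differences.
    """
    diffs = [e - t for e, t in zip(estimated, expected)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical proportions: each estimate overshoots by a constant 0.05.
expected = [0.10, 0.20, 0.30, 0.40]
estimated = [0.15, 0.25, 0.35, 0.45]

r = pearson(estimated, expected)
error = rmse(estimated, expected)
bias, lower, upper = bland_altman(estimated, expected)
```

Here the correlation is essentially perfect even though every estimate is biased upward. The Bland-Altman bias makes that offset explicit, and RMSE reports its size without its direction.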
Affiliation(s)
- A Nava
- Fundación Instituto Leloir-CONICET, Buenos Aires, Argentina; Fundación Huésped, Buenos Aires, Argentina
| | - D Alves da Quinta
- Fundación Instituto Leloir-CONICET, Buenos Aires, Argentina; Universidad Argentina de la Empresa (UADE). Instituto de Tecnología (INTEC), Buenos Aires, Argentina
| | - L Prato
- Universidad de Villa María, Córdoba, Argentina
| | - R Girotti
- Universidad Argentina de la Empresa (UADE). Instituto de Tecnología (INTEC), Buenos Aires, Argentina
| | - G Moron
- Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina; Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina
| | - A S Llera
- Fundación Instituto Leloir-CONICET, Buenos Aires, Argentina
| | - E A Fernández
- Facultad de Ingeniería, Carrera de Bioinformática, Universidad Católica de Córdoba (UCC), Córdoba, Argentina; Facultad de Ciencias Exactas Físicas y Naturales, Universidad Nacional de Córdoba, Córdoba, Argentina; Centro de Investigaciòn en Inmunología y Enfermedades Infecciosas, UCC, CONICET, Córdoba, Argentina.
|
46
|
Revkov E, Kulshrestha T, Sung KWK, Skanderup AJ. PUREE: accurate pan-cancer tumor purity estimation from gene expression data. Commun Biol 2023; 6:394. [PMID: 37041233 PMCID: PMC10090153 DOI: 10.1038/s42003-023-04764-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/26/2022] [Accepted: 03/27/2023] [Indexed: 04/13/2023] Open
Abstract
Tumors are complex masses composed of malignant and non-malignant cells. Variation in tumor purity (proportion of cancer cells in a sample) can both confound integrative analysis and enable studies of tumor heterogeneity. Here we developed PUREE, which uses a weakly supervised learning approach to infer tumor purity from a tumor gene expression profile. PUREE was trained on gene expression data and genomic consensus purity estimates from 7864 solid tumor samples. PUREE predicted purity with high accuracy across distinct solid tumor types and generalized to tumor samples from unseen tumor types and cohorts. Gene features of PUREE were further validated using single-cell RNA-seq data from distinct tumor types. In a comprehensive benchmark, PUREE outperformed existing transcriptome-based purity estimation approaches. Overall, PUREE is a highly accurate and versatile method for estimating tumor purity and interrogating tumor heterogeneity from bulk tumor gene expression data, which can complement genomics-based approaches or be used in settings where genomic data is unavailable.
Affiliation(s)
- Egor Revkov
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore, 117417, Republic of Singapore
| | - Tanmay Kulshrestha
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
| | - Ken Wing-Kin Sung
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore, 117417, Republic of Singapore
| | - Anders Jacobsen Skanderup
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore.
- School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore, 117417, Republic of Singapore.
- National Cancer Centre Singapore, Division of Medical Oncology, 30 Hospital Boulevard, Singapore, 168583, Republic of Singapore.
|
47
|
Charytonowicz D, Brody R, Sebra R. Interpretable and context-free deconvolution of multi-scale whole transcriptomic data with UniCell deconvolve. Nat Commun 2023; 14:1350. [PMID: 36906603 PMCID: PMC10008582 DOI: 10.1038/s41467-023-36961-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/09/2022] [Accepted: 02/27/2023] [Indexed: 03/13/2023] Open
Abstract
We introduce UniCell: Deconvolve Base (UCDBase), a pre-trained, interpretable, deep learning model to deconvolve cell type fractions and predict cell identity across Spatial, bulk-RNA-Seq, and scRNA-Seq datasets without contextualized reference data. UCD is trained on 10 million pseudo-mixtures from a fully-integrated scRNA-Seq training database comprising over 28 million annotated single cells spanning 840 unique cell types from 898 studies. We show that our UCDBase and transfer-learning models achieve comparable or superior performance on in-silico mixture deconvolution to existing, reference-based, state-of-the-art methods. Feature attribute analysis uncovers gene signatures associated with cell-type specific inflammatory-fibrotic responses in ischemic kidney injury, discerns cancer subtypes, and accurately deconvolves tumor microenvironments. UCD identifies pathologic changes in cell fractions among bulk-RNA-Seq data for several disease states. Applied to lung cancer scRNA-Seq data, UCD annotates and distinguishes normal from cancerous cells. Overall, UCD enhances transcriptomic data analysis, aiding in assessment of cellular and spatial context.
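For readers unfamiliar with the term, a pseudo-mixture is a simulated bulk sample built by pooling single-cell profiles in known proportions, so that the true cell fractions are available as training labels. The sketch below shows one simple way to build such a sample. It is an assumption-laden toy, not the UCD training pipeline: the function name, the sampling scheme and the three-gene expression vectors are all hypothetical.

```python
import random


def make_pseudo_mixture(cells_by_type, n_cells, rng):
    """Pool n_cells randomly drawn single cells into one simulated bulk sample.

    cells_by_type: dict mapping a cell-type label to a list of expression
    vectors (toy data). Returns (mixture, fractions), where fractions holds
    the realized proportion of each cell type among the drawn cells.
    """
    types = sorted(cells_by_type)
    # Draw random target proportions for this sample.
    weights = [rng.random() for _ in types]
    total = sum(weights)
    target = [w / total for w in weights]

    n_genes = len(cells_by_type[types[0]][0])
    mixture = [0.0] * n_genes
    counts = {t: 0 for t in types}
    for _ in range(n_cells):
        # Pick a cell type according to the target proportions,
        # then one cell of that type, and add its expression.
        cell_type = rng.choices(types, weights=target)[0]
        cell = rng.choice(cells_by_type[cell_type])
        counts[cell_type] += 1
        for g, value in enumerate(cell):
            mixture[g] += value
    fractions = {t: counts[t] / n_cells for t in types}
    return mixture, fractions


# Toy single-cell profiles over 3 genes; every cell sums to 10 counts.
TOY_CELLS = {
    "T cell": [[8, 1, 1], [7, 2, 1]],
    "B cell": [[1, 8, 1], [2, 7, 1]],
    "Monocyte": [[1, 1, 8], [1, 2, 7]],
}

mixture, fractions = make_pseudo_mixture(TOY_CELLS, 100, random.Random(0))
```

Repeating the call with different seeds yields many (mixture, fractions) pairs, which is the general shape of the supervised training data the abstract describes.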
Affiliation(s)
- Daniel Charytonowicz
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Rachel Brody
- Department of Pathology, Molecular and Cell-Based Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Genomics Institute, New York, NY, USA.
- Black Family Stem Cell Institute, New York, NY, USA.
|
48
|
Li J, Li L, You P, Wei Y, Xu B. Towards artificial intelligence to multi-omics characterization of tumor heterogeneity in esophageal cancer. Semin Cancer Biol 2023; 91:35-49. [PMID: 36868394 DOI: 10.1016/j.semcancer.2023.02.009] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 01/07/2023] [Revised: 02/21/2023] [Accepted: 02/28/2023] [Indexed: 03/05/2023]
Abstract
Esophageal cancer is a unique and complex malignancy with substantial tumor heterogeneity: at the cellular level, tumors are composed of tumor and stromal cellular components; at the genetic level, they comprise genetically distinct tumor clones; at the phenotypic level, cells in distinct microenvironmental niches acquire diverse phenotypic features. This heterogeneity affects almost every process of esophageal cancer progression, from onset to metastasis and recurrence. Intertumoral and intratumoral heterogeneity are major obstacles in the treatment of esophageal cancer, but also offer the potential to manipulate the heterogeneity itself as a new therapeutic strategy. The high-dimensional, multi-faceted characterization of genomics, epigenomics, transcriptomics, proteomics, metabonomics, etc. of esophageal cancer has opened novel horizons for dissecting tumor heterogeneity. Artificial intelligence, especially machine learning and deep learning algorithms, is able to make decisive interpretations of data from multi-omics layers. To date, artificial intelligence has emerged as a promising computational tool for analyzing and dissecting patient-specific multi-omics data in esophageal cancer. This review provides a comprehensive overview of tumor heterogeneity from a multi-omics perspective. In particular, we discuss the novel techniques of single-cell sequencing and spatial transcriptomics, which have revolutionized our understanding of the cell compositions of esophageal cancer and allowed us to determine novel cell types. We focus on the latest advances in artificial intelligence in integrating multi-omics data of esophageal cancer. Artificial intelligence-based computational tools for multi-omics data integration play a key role in tumor heterogeneity assessment, which will potentially boost the development of precision oncology in esophageal cancer.
Affiliation(s)
- Junyu Li
  - Department of Radiation Oncology, Jiangxi Cancer Hospital, Nanchang 330029, Jiangxi, China; Jiangxi Health Committee Key (JHCK) Laboratory of Tumor Metastasis, Jiangxi Cancer Hospital, Nanchang 330029, Jiangxi, China
- Lin Li
  - Department of Thoracic Oncology, Jiangxi Cancer Hospital, Nanchang 330029, Jiangxi, China
- Peimeng You
  - Nanchang University, Department of Radiation Oncology, Jiangxi Cancer Hospital, Nanchang 330029, Jiangxi, China
- Yiping Wei
  - Department of Thoracic Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang 330006, Jiangxi, China
- Bin Xu
  - Jiangxi Health Committee Key (JHCK) Laboratory of Tumor Metastasis, Jiangxi Cancer Hospital, Nanchang 330029, Jiangxi, China
|
49
|
Deng W, Li B, Wang J, Jiang W, Yan X, Li N, Vukmirovic M, Kaminski N, Wang J, Zhao H. A novel Bayesian framework for harmonizing information across tissues and studies to increase cell type deconvolution accuracy. Brief Bioinform 2023; 24:bbac616. [PMID: 36631398 PMCID: PMC9851324 DOI: 10.1093/bib/bbac616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 11/28/2022] [Accepted: 12/14/2022] [Indexed: 01/13/2023]
Abstract
Computational cell type deconvolution on bulk transcriptomics data can reveal cell type proportion heterogeneity across samples. One critical factor for accurate deconvolution is the reference signature matrix for different cell types. Compared with inferring reference signature matrices from cell lines, rapidly accumulating single-cell RNA-sequencing (scRNA-seq) data provide a richer and less biased resource. However, deriving cell type signatures from scRNA-seq data is challenging because of high biological and technical noise. In this article, we introduce a novel Bayesian framework, tranSig, to improve signature matrix inference from scRNA-seq by leveraging shared cell type-specific expression patterns across different tissues and studies. Our simulations show that tranSig is robust to the number of signature genes and tissues specified in the model. Applications of tranSig to bulk RNA sequencing data from peripheral blood, bronchoalveolar lavage and aorta demonstrate its accuracy and power to characterize biological heterogeneity across groups. In summary, tranSig offers an accurate and robust approach to defining gene expression signatures of different cell types, facilitating improved in silico cell type deconvolution.
Affiliation(s)
- Wenxuan Deng
  - Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT, USA
- Bolun Li
  - Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT, USA
  - State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Department of Pathophysiology, Peking Union Medical College, Beijing, China
- Jiawei Wang
  - Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT, USA
- Wei Jiang
  - Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT, USA
- Xiting Yan
  - Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Ningshan Li
  - Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Milica Vukmirovic
  - Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
  - Leslie Dan Faculty of Pharmacy, University of Toronto, 144 College St., ON, Canada
- Naftali Kaminski
  - Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Jing Wang
  - State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Department of Pathophysiology, Peking Union Medical College, Beijing, China
- Hongyu Zhao
  - Department of Biostatistics, Yale School of Public Health, 60 College Street, New Haven, CT, USA
|
50
|
Tu JJ, Li HS, Yan H, Zhang XF. EnDecon: cell type deconvolution of spatially resolved transcriptomics data via ensemble learning. Bioinformatics 2023; 39:6969103. [PMID: 36610709 PMCID: PMC9825263 DOI: 10.1093/bioinformatics/btac825] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/23/2022] [Revised: 12/08/2022] [Accepted: 12/21/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Spatially resolved gene expression profiles are key to exploring the spatial distribution of cell types and understanding tissue architecture. Many spatially resolved transcriptomics (SRT) techniques do not provide single-cell resolution; instead, they measure gene expression profiles at captured locations (spots), which are mixtures of potentially heterogeneous cell types. Several cell-type deconvolution methods have been proposed to deconvolute SRT data. Because these methods use different modeling strategies, their deconvolution results also vary.
RESULTS: Leveraging the strengths of multiple deconvolution methods, we introduce a new weighted ensemble learning deconvolution method, EnDecon, to predict cell-type compositions on SRT data. EnDecon integrates multiple base deconvolution results using a weighted optimization model to generate a more accurate result. Simulation studies demonstrate that EnDecon outperforms the competing methods and that the learned weights assigned to the base deconvolution methods correlate strongly and positively with the performance of those methods. Applied to real datasets from different spatial techniques, EnDecon identifies multiple cell types on spots, localizes these cell types to specific spatial regions and distinguishes distinct spatial colocalization and enrichment patterns, providing valuable insights into the spatial heterogeneity and regionalization of tissues.
AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/Zhangxf-ccnu/EnDecon.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong Yan
  - Centre for Intelligent Multidimensional Data Analysis, Hong Kong Science Park, Hong Kong 999077, China
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong 999077, China
|