1
Davidson NR, Zhang F, Greene CS. BuDDI: Bulk Deconvolution with Domain Invariance to predict cell-type-specific perturbations from bulk. PLoS Comput Biol 2025; 21:e1012742. [PMID: 39823522 PMCID: PMC11790236 DOI: 10.1371/journal.pcbi.1012742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 02/03/2025] [Accepted: 12/20/2024] [Indexed: 01/19/2025] Open
Abstract
While single-cell experiments provide deep cellular resolution within a single sample, some single-cell experiments are inherently more challenging than bulk experiments due to dissociation difficulties, cost, or limited tissue availability. This creates a situation where we have deep cellular profiles of one sample or condition, and bulk profiles across multiple samples and conditions. To bridge this gap, we propose BuDDI (BUlk Deconvolution with Domain Invariance). BuDDI utilizes domain adaptation techniques to effectively integrate available corpora of case-control bulk and reference scRNA-seq observations to infer cell-type-specific perturbation effects. BuDDI achieves this by learning independent latent spaces within a single variational autoencoder (VAE) encompassing at least four sources of variability: 1) cell type proportion, 2) perturbation effect, 3) structured experimental variability, and 4) remaining variability. Since each latent space is encouraged to be independent, we simulate perturbation responses by independently composing each latent space to simulate cell-type-specific perturbation responses. We evaluated BuDDI's performance on simulated and real data with experimental designs of increasing complexity. We first validated that BuDDI could learn domain invariant latent spaces on data with matched samples across each source of variability. Then we validated that BuDDI could accurately predict cell-type-specific perturbation response when no single-cell perturbed profiles were used during training; instead, only bulk samples had both perturbed and non-perturbed observations. Finally, we validated BuDDI on predicting sex-specific differences, an experimental design where it is not possible to have matched samples. In each experiment, BuDDI outperformed all other comparative methods and baselines. 
As more reference atlases are completed, BuDDI provides a path to combine these resources with bulk-profiled treatment or disease signatures to study perturbations, sex differences, or other factors at single-cell resolution.
Affiliation(s)
- Natalie R. Davidson
  - Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- Fan Zhang
  - Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
  - Department of Medicine Rheumatology, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- Casey S. Greene
  - Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
2
Xiong X, Liu Y, Pu D, Yang Z, Bi Z, Tian L, Li X. DeSide: A unified deep learning approach for cellular deconvolution of tumor microenvironment. Proc Natl Acad Sci U S A 2024; 121:e2407096121. [PMID: 39514318 PMCID: PMC11573681 DOI: 10.1073/pnas.2407096121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 09/23/2024] [Indexed: 11/16/2024] Open
Abstract
Cellular deconvolution via bulk RNA sequencing (RNA-seq) presents a cost-effective and efficient alternative to experimental methods such as flow cytometry and single-cell RNA-seq (scRNA-seq) for analyzing the complex cellular composition of tumor microenvironments. Despite challenges due to heterogeneity within and among tumors, our innovative deep learning-based approach, DeSide, shows exceptional accuracy in estimating the proportions of 16 distinct cell types and subtypes within solid tumors. DeSide integrates biological pathways and assesses noncancerous cell types first, effectively sidestepping the issue of highly variable gene expression profiles (GEPs) associated with cancer cells. By leveraging scRNA-seq data from six cancer types and 185 cancer cell lines across 22 cancer types as references, our method introduces distinctive sampling and filtering techniques to generate a high-quality training set that closely replicates real tumor GEPs, based on The Cancer Genome Atlas (TCGA) bulk RNA-seq data. With this model and high-quality training set, DeSide outperforms existing methods in estimating tumor purity and the proportions of noncancerous cells within solid tumors. Our model precisely predicts cellular compositions across 19 cancer types from TCGA and proves its effectiveness with multiple additional external datasets. Crucially, DeSide enables the identification and analysis of combinatorial cell type pairs, facilitating the stratification of cancer patients into prognostically significant groups. This approach not only provides deeper insights into the dynamics of tumor biology but also highlights potential therapeutic targets by underscoring the importance of specific cell type or subtype interactions.
Affiliation(s)
- Xin Xiong
  - Department of Physics, Hong Kong Baptist University, Hong Kong, China
- Yerong Liu
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Dandan Pu
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Zhu Yang
  - State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong, China
- Zedong Bi
  - Lingang Laboratory, Shanghai 200031, China
- Liang Tian
  - Department of Physics, Hong Kong Baptist University, Hong Kong, China
  - State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong, China
  - Institute of Computational and Theoretical Studies, Hong Kong Baptist University, Hong Kong, China
  - Institute of Systems Medicine and Health Sciences, Hong Kong Baptist University, Hong Kong, China
- Xuefei Li
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
3
White BS, de Reyniès A, Newman AM, Waterfall JJ, Lamb A, Petitprez F, Lin Y, Yu R, Guerrero-Gimenez ME, Domanskyi S, Monaco G, Chung V, Banerjee J, Derrick D, Valdeolivas A, Li H, Xiao X, Wang S, Zheng F, Yang W, Catania CA, Lang BJ, Bertus TJ, Piermarocchi C, Caruso FP, Ceccarelli M, Yu T, Guo X, Bletz J, Coller J, Maecker H, Duault C, Shokoohi V, Patel S, Liliental JE, Simon S, Saez-Rodriguez J, Heiser LM, Guinney J, Gentles AJ. Community assessment of methods to deconvolve cellular composition from bulk gene expression. Nat Commun 2024; 15:7362. [PMID: 39191725 PMCID: PMC11350143 DOI: 10.1038/s41467-024-50618-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 07/11/2024] [Indexed: 08/29/2024] Open
Abstract
We evaluate deconvolution methods, which infer levels of immune infiltration from bulk expression of tumor samples, through a community-wide DREAM Challenge. We assess six published and 22 community-contributed methods using in vitro and in silico transcriptional profiles of admixed cancer and healthy immune cells. Several published methods predict most cell types well, though they either were not trained to evaluate all functional CD8+ T cell states or do so with low accuracy. Several community-contributed methods address this gap, including a deep learning-based approach, whose strong performance establishes the applicability of this paradigm to deconvolution. Despite being developed largely using immune cells from healthy tissues, deconvolution methods predict levels of tumor-derived immune cells well. Our admixed and purified transcriptional profiles will be a valuable resource for developing deconvolution methods, including in response to common challenges we observe across methods, such as sensitive identification of functional CD4+ T cell states.
Affiliation(s)
- Brian S White
  - Sage Bionetworks, Seattle, WA, USA
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Aurélien de Reyniès
  - Centre de Recherche des Cordeliers, INSERM U1138, Université Paris Cité, Paris, France
- Aaron M Newman
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Joshua J Waterfall
  - INSERM U830 and Translational Research Department, Institut Curie, PSL Research University, Paris, France
- Florent Petitprez
  - Programme Cartes d'Identité des Tumeurs, Ligue Nationale Contre le Cancer, Paris, France
  - MRC Centre for Reproductive Health, the Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Yating Lin
  - Xiamen University, Xiamen, Fujian, China
- Martin E Guerrero-Gimenez
  - Institute of Biochemistry and Biotechnology, School of Medicine, National University of Cuyo, Mendoza, Argentina
- Gianni Monaco
  - BIOGEM Institute of Molecular Biology and Genetics, Ariano Irpino, AV, Italy
- Daniel Derrick
  - Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Alberto Valdeolivas
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Haojun Li
  - Xiamen University, Xiamen, Fujian, China
- Xu Xiao
  - Xiamen University, Xiamen, Fujian, China
- Shun Wang
  - Department of Pathology, Cancer Hospital, Chinese Academy of Medical Science, Beijing, China
- Carlos A Catania
  - Laboratory of Intelligent Systems (LABSIN), Engineering School, National University of Cuyo, Mendoza, Argentina
- Benjamin J Lang
  - Department of Radiation Oncology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Francesca P Caruso
  - BIOGEM Institute of Molecular Biology and Genetics, Ariano Irpino, AV, Italy
- Michele Ceccarelli
  - BIOGEM Institute of Molecular Biology and Genetics, Ariano Irpino, AV, Italy
  - Sylvester Comprehensive Cancer Center, Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, Florida, USA
- John Coller
  - Stanford Functional Genomics Facility, Stanford University School of Medicine, Stanford, CA, USA
- Holden Maecker
  - Institute for Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA
- Caroline Duault
  - Institute for Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA
- Vida Shokoohi
  - Stanford Functional Genomics Facility, Stanford University School of Medicine, Stanford, CA, USA
- Shailja Patel
  - Translational Applications Service Center, Stanford University School of Medicine, Stanford, CA, USA
- Joanna E Liliental
  - Translational Applications Service Center, Stanford University School of Medicine, Stanford, CA, USA
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Laura M Heiser
  - Department of Biomedical Engineering, Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Andrew J Gentles
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Pathology, Stanford University, Stanford, CA, USA
4
Görtler F, Mensching-Buhr M, Skaar Ø, Schrod S, Sterr T, Schäfer A, Beißbarth T, Joshi A, Zacharias HU, Grellscheid SN, Altenbuchinger M. Adaptive digital tissue deconvolution. Bioinformatics 2024; 40:i100-i109. [PMID: 38940181 PMCID: PMC11256946 DOI: 10.1093/bioinformatics/btae263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: The inference of cellular compositions from bulk and spatial transcriptomics data increasingly complements data analyses. Multiple computational approaches have been suggested and, recently, machine learning techniques have been developed to systematically improve estimates. Such approaches allow additional, less abundant cell types to be inferred. However, they rely on training data which do not capture the full biological diversity encountered in transcriptomics analyses; data can contain cellular contributions not seen in the training data and, as such, analyses can be biased or blurred. Thus, computational approaches have to deal with unknown, hidden contributions. Moreover, most methods are based on cellular archetypes which serve as a reference; e.g. a generic T-cell profile is used to infer the proportion of T-cells. It is well known that cells adapt their molecular phenotype to the environment and that pre-specified cell archetypes can distort the inference of cellular compositions.
RESULTS: We propose Adaptive Digital Tissue Deconvolution (ADTD) to estimate cellular proportions of pre-selected cell types together with possibly unknown and hidden background contributions. Moreover, ADTD adapts prototypic reference profiles to the molecular environment of the cells, which further resolves cell-type specific gene regulation from bulk transcriptomics data. We verify this in simulation studies and demonstrate that ADTD improves existing approaches in estimating cellular compositions. In an application to bulk transcriptomics data from breast cancer patients, we demonstrate that ADTD provides insights into cell-type specific molecular differences between breast cancer subtypes.
AVAILABILITY AND IMPLEMENTATION: A Python implementation of ADTD and a tutorial are available at GitLab and Zenodo (doi:10.5281/zenodo.7548362).
Affiliation(s)
- Franziska Görtler
  - Computational Biology Unit, Department of Biological Sciences, University of Bergen, N-5008 Bergen, Norway
  - Department of Oncology and Medical Physics, Haukeland University Hospital, 5021 Bergen, Norway
- Malte Mensching-Buhr
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Ørjan Skaar
  - Department of Informatics, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
- Stefan Schrod
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Thomas Sterr
  - Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
- Andreas Schäfer
  - Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
- Tim Beißbarth
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Anagha Joshi
  - Department of Clinical Science, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
- Helena U Zacharias
  - Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, 30625 Hannover, Germany
- Michael Altenbuchinger
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
5
Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
  - Advaita Bioinformatics, Ann Arbor, MI, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
6
Davidson NR, Zhang F, Greene CS. BuDDI: Bulk Deconvolution with Domain Invariance to predict cell-type-specific perturbations from bulk. bioRxiv 2024:2023.07.20.549951. [PMID: 37503097 PMCID: PMC10370205 DOI: 10.1101/2023.07.20.549951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
While single-cell experiments provide deep cellular resolution within a single sample, some single-cell experiments are inherently more challenging than bulk experiments due to dissociation difficulties, cost, or limited tissue availability. This creates a situation where we have deep cellular profiles of one sample or condition, and bulk profiles across multiple samples and conditions. To bridge this gap, we propose BuDDI (BUlk Deconvolution with Domain Invariance). BuDDI utilizes domain adaptation techniques to effectively integrate available corpora of case-control bulk and reference scRNA-seq observations to infer cell-type-specific perturbation effects. BuDDI achieves this by learning independent latent spaces within a single variational autoencoder (VAE) encompassing at least four sources of variability: 1) cell type proportion, 2) perturbation effect, 3) structured experimental variability, and 4) remaining variability. Since each latent space is encouraged to be independent, we simulate perturbation responses by independently composing each latent space to simulate cell-type-specific perturbation responses. We evaluated BuDDI's performance on simulated and real data with experimental designs of increasing complexity. We first validated that BuDDI could learn domain invariant latent spaces on data with matched samples across each source of variability. Then we validated that BuDDI could accurately predict cell-type-specific perturbation response when no single-cell perturbed profiles were used during training; instead, only bulk samples had both perturbed and non-perturbed observations. Finally, we validated BuDDI on predicting sex-specific differences, an experimental design where it is not possible to have matched samples. In each experiment, BuDDI outperformed all other comparative methods and baselines. 
As more reference atlases are completed, BuDDI provides a path to combine these resources with bulk-profiled treatment or disease signatures to study perturbations, sex differences, or other factors at single-cell resolution.
Affiliation(s)
- Natalie R Davidson
  - Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America · Funded by the Gordon and Betty Moore Foundation (GBMF 4552), NHGRI of the National Institutes of Health (K99HG012945), NCI of the National Institutes of Health (R01CA237170, R01CA243188, R01CA200854)
- Fan Zhang
  - Department of Medicine Rheumatology, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America; Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America · Funded by the Arthritis National Research Foundation Award, the PhRMA foundation, and the University of Colorado Translational Research Scholars Program Award
- Casey S Greene
  - Department of Biomedical Informatics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America · Funded by the Gordon and Betty Moore Foundation (GBMF 4552), NCI of the National Institutes of Health (R01CA237170, R01CA243188, R01CA200854)
7
Chen G, Yu R, Chen X. Editorial: Integrative analysis of single-cell and/or bulk multi-omics sequencing data. Front Genet 2023; 13:1121999. [PMID: 36685891 PMCID: PMC9845394 DOI: 10.3389/fgene.2022.1121999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 12/13/2022] [Indexed: 01/05/2023] Open
Affiliation(s)
- Geng Chen
  - Stemirna Therapeutics Co., Ltd., Shanghai, China (corresponding author)
- Rongshan Yu
  - Department of Computer Science, School of Informatics, Xiamen University, Xiamen, China
- Xingdong Chen
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences, Fudan University, Shanghai, China
8
Xiao X, Guo Q, Cui C, Lin Y, Zhang L, Ding X, Li Q, Wang M, Yang W, Kong Y, Yu R. Multiplexed imaging mass cytometry reveals distinct tumor-immune microenvironments linked to immunotherapy responses in melanoma. Communications Medicine 2022; 2:131. [PMID: 36281356 PMCID: PMC9587266 DOI: 10.1038/s43856-022-00197-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/14/2022] [Accepted: 09/30/2022] [Indexed: 11/08/2022] Open
Abstract
Background: Single-cell technologies have enabled extensive analysis of complex immune composition, phenotype and interactions within tumor, which is crucial in understanding the mechanisms behind cancer progression and treatment resistance. Unfortunately, knowledge on cell phenotypes and their spatial interactions has only had limited impact on the pathological stratification of patients in the clinic so far. We explore the relationship between different tumor environments (TMEs) and response to immunotherapy by deciphering the composition and spatial relationships of different cell types.
Methods: Here we used imaging mass cytometry to simultaneously quantify 35 proteins in a spatially resolved manner on tumor tissues from 26 melanoma patients receiving anti-programmed cell death-1 (anti-PD-1) therapy. Using unsupervised clustering, we profiled 662,266 single cells to identify lymphocytes, myeloid derived monocytes, stromal and tumor cells, and characterized TME of different melanomas.
Results: Combined single-cell and spatial analysis reveals highly dynamic TMEs that are characterized with variable tumor and immune cell phenotypes and their spatial organizations in melanomas, and many of these multicellular features are associated with response to anti-PD-1 therapy. We further identify six distinct TME archetypes based on their multicellular compositions, and find that patients with different TME archetypes responded differently to anti-PD-1 therapy. Finally, we find that classifying patients based on the gene expression signature derived from TME archetypes predicts anti-PD-1 therapy response across multiple validation cohorts.
Conclusions: Our results demonstrate the utility of multiplex proteomic imaging technologies in studying complex molecular events in a spatially resolved manner for the development of new strategies for patient stratification and treatment outcome prediction.
Affiliation(s)
- Xu Xiao
  - School of Informatics, Xiamen University, Xiamen, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
- Qian Guo
  - Peking University Cancer Hospital and Institute, Beijing, China
- Chuanliang Cui
  - Peking University Cancer Hospital and Institute, Beijing, China
- Yating Lin
  - School of Informatics, Xiamen University, Xiamen, China
- Lei Zhang
  - School of Life Science, Xiamen University, Xiamen, China
- Xin Ding
  - Zhongshan Hospital, Xiamen University, Xiamen, China
- Qiyuan Li
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
  - School of Medicine, Xiamen University, Xiamen, China
- Minshu Wang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
  - School of Medicine, Xiamen University, Xiamen, China
- Yan Kong
  - Peking University Cancer Hospital and Institute, Beijing, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
  - Aginome Scientific, Xiamen, China
9
Lin Y, Wu S, Xiao X, Zhao J, Wang M, Li H, Wang K, Zhang M, Zheng F, Yang W, Zhang L, Han J, Yu R. Protocol to estimate cell type proportions from bulk RNA-seq using DAISM-DNNXMBD. STAR Protoc 2022; 3:101587. [PMID: 35942344 PMCID: PMC9356155 DOI: 10.1016/j.xpro.2022.101587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022] Open
Abstract
Computational protocols for cell type deconvolution from bulk RNA-seq data have been used to understand cellular heterogeneity in disease-related samples, but their performance can be impacted by batch effect among datasets. Here, we present a DAISM-DNN protocol to achieve robust cell type proportion estimation on the target dataset. We describe the preparation of calibrated samples from human blood samples. We then detail steps to train a dataset-specific deep neural network (DNN) model and cell type proportion estimation using the trained model. For complete details on the use and execution of this protocol, please refer to Lin et al. (2022).
- A protocol for accurate cell type deconvolution with data-driven DNN-based approach
- Obtain expression and cell proportions from calibrated samples
- DAISM-DNN model training including parameter tuning and data formatting
- Trained model can be applied to other biomedical experiments under the same conditions
Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.
Affiliation(s)
- Yating Lin
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Shangze Wu
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Xu Xiao
  - School of Informatics, Xiamen University, Xiamen 361005, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Minshu Wang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China; School of Medicine, Xiamen University, Xiamen 361102, China
- Haojun Li
  - School of Informatics, Xiamen University, Xiamen 361005, China
- Kejia Wang
  - School of Medicine, Xiamen University, Xiamen 361102, China
- Minwei Zhang
  - Department of Critical Care Medicine, The First Affiliated Hospital of Xiamen University, Xiamen 361003, China
- Lei Zhang
  - School of Life Science, Xiamen University, Xiamen 361102, China
- Jiahuai Han
  - Research Unit of Cellular Stress of CAMS, Cancer Research Center of Xiamen University, School of Medicine, Xiamen University, Xiamen 361102, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen 361005, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China; Aginome Scientific, Xiamen 361005, China
10
Tegner JN, Gomez-Cabrero D. Data-driven bioinformatics to disentangle cells within a tissue microenvironment. Trends Cell Biol 2022; 32:467-469. [DOI: 10.1016/j.tcb.2022.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 03/30/2022] [Indexed: 10/18/2022]