1
Murillo OD, Petrosyan V, LaPlante EL, Dobrolecki LE, Lewis MT, Milosavljevic A. Deconvolution of cancer cell states by the XDec-SM method. PLoS Comput Biol 2023; 19:e1011365. [PMID: 37578979 PMCID: PMC10449115 DOI: 10.1371/journal.pcbi.1011365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 08/24/2023] [Accepted: 07/17/2023] [Indexed: 08/16/2023] Open
Abstract
Proper characterization of cancer cell states within the tumor microenvironment is key to accurately identifying matching experimental models and developing precision therapies. To reconstruct this information from bulk RNA-seq profiles, we developed the XDec Simplex Mapping (XDec-SM) reference-optional deconvolution method that maps tumors and the states of constituent cells onto a biologically interpretable low-dimensional space. The method identifies gene sets informative for deconvolution from relevant single-cell profiling data when such profiles are available. When applied to breast tumors in The Cancer Genome Atlas (TCGA), XDec-SM infers the identity of constituent cell types and their proportions. XDec-SM also infers cancer cell states within individual tumors that are associated with DNA methylation patterns, driver somatic mutations, pathway activation and metabolic coupling between stromal and breast cancer cells. By projecting tumors, cancer cell lines, and PDX models onto the same map, we identify in vitro and in vivo models with matching cancer cell states. Map position is also predictive of therapy response, opening the prospect of precision therapy informed by experiments in model systems matched to tumors in vivo by cancer cell state.
Affiliation(s)
- Oscar D. Murillo
  - Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Varduhi Petrosyan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Emily L. LaPlante
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Lacey E. Dobrolecki
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, United States of America
- Michael T. Lewis
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, United States of America
  - Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, United States of America
  - Departments of Molecular and Cellular Biology and Radiology, Baylor College of Medicine, Houston, Texas, United States of America
- Aleksandar Milosavljevic
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
  - Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, United States of America
2
Dragomir MP, Calina TG, Perez E, Schallenberg S, Chen M, Albrecht T, Koch I, Wolkenstein P, Goeppert B, Roessler S, Calin GA, Sers C, Horst D, Roßner F, Capper D. DNA methylation-based classifier differentiates intrahepatic pancreato-biliary tumours. EBioMedicine 2023; 93:104657. [PMID: 37348162 DOI: 10.1016/j.ebiom.2023.104657] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/04/2022] [Revised: 05/21/2023] [Accepted: 06/02/2023] [Indexed: 06/24/2023] Open
Abstract
BACKGROUND Differentiating intrahepatic cholangiocarcinomas (iCCA) from hepatic metastases of pancreatic ductal adenocarcinoma (PAAD) is challenging. Both tumours have similar morphological and immunohistochemical patterns and share multiple driver mutations. We hypothesised that DNA methylation-based machine-learning algorithms may help perform this task. METHODS We assembled genome-wide DNA methylation data for iCCA (n = 259), PAAD (n = 431), and normal bile duct (n = 70) from publicly available sources. We split this cohort into a reference set (n = 399) and a validation set (n = 361). Using the reference cohort, we trained three machine learning models to differentiate between these entities. Furthermore, we validated the classifiers on the technical validation set and used an internal cohort (n = 72) to test our classifier. FINDINGS On the validation cohort, the neural network, support vector machine, and random forest classifiers reached accuracies of 97.68%, 95.62%, and 96.5%, respectively. Filtering by anomaly detection and thresholds improved the accuracy to 99.07% (37 samples excluded by filtering), 96.22% (17 samples excluded), and 100% (44 samples excluded) for the neural network, support vector machine and random forest, respectively. Because it offered the best balance between accuracy and number of predictable cases, we tested the neural network with applied filters on the in-house cohort, obtaining an accuracy of 95.45%. INTERPRETATION We developed a classifier that can differentiate between iCCAs, intrahepatic metastases of a PAAD, and normal bile duct tissue with high accuracy. This tool can be used for improving the diagnosis of pancreato-biliary cancers of the liver. FUNDING This work was supported by Berlin Institute of Health (JCS Program), DKTK Berlin (Young Investigator Grant 2022), German Research Foundation (493697503 and 314905040 - SFB/TRR 209 Liver Cancer B01), and German Cancer Aid (70113922).
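The trade-off described in the findings, where excluding low-confidence predictions raises accuracy at the cost of fewer predictable cases, can be illustrated with a minimal sketch. This is not the authors' pipeline: it covers only a probability threshold (not the anomaly-detection step), and the function name `filtered_accuracy`, the integer class indices, and the toy probabilities are all hypothetical.

```python
def filtered_accuracy(probabilities, true_labels, threshold):
    """Accuracy on samples whose top class probability reaches `threshold`.

    probabilities: one list of class probabilities per sample (hypothetical input).
    true_labels:   the true class index of each sample.
    Returns (accuracy on retained samples, number of excluded samples).
    """
    retained = 0
    correct = 0
    for probs, truth in zip(probabilities, true_labels):
        # Predicted class is the index of the highest probability.
        predicted = max(range(len(probs)), key=lambda k: probs[k])
        if probs[predicted] < threshold:
            continue  # low-confidence prediction: sample is excluded
        retained += 1
        correct += int(predicted == truth)
    excluded = len(true_labels) - retained
    accuracy = correct / retained if retained else float("nan")
    return accuracy, excluded


# Toy data: three classes, four samples (illustrative values only).
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],  # ambiguous and misclassified
    [0.10, 0.80, 0.10],
    [0.20, 0.20, 0.60],
]
toy_truth = [0, 1, 1, 2]

unfiltered = filtered_accuracy(toy_probs, toy_truth, threshold=0.0)  # (0.75, 0)
filtered = filtered_accuracy(toy_probs, toy_truth, threshold=0.5)    # (1.0, 1)
```

In the toy data the single ambiguous sample is also the single error, so dropping it lifts accuracy from 0.75 to 1.0 while reducing the number of predictable cases by one.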
Affiliation(s)
- Mihnea P Dragomir
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany; German Cancer Consortium (DKTK), Partner Site Berlin, and German Cancer Research Center (DKFZ), Heidelberg, Germany; Berlin Institute of Health, Berlin, Germany
- Eilís Perez
  - Department of Neuropathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany; Berlin School of Integrative Oncology (BSIO), Charité - Universitätsmedizin Berlin (CVK), Berlin, Germany
- Simon Schallenberg
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Meng Chen
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Thomas Albrecht
  - Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Ines Koch
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Peggy Wolkenstein
  - German Cancer Consortium (DKTK), Partner Site Berlin, and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Benjamin Goeppert
  - Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Institute of Pathology and Neuropathology, Hospital RKH Kliniken Ludwigsburg, 71640 Ludwigsburg, Germany
- Stephanie Roessler
  - Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- George A Calin
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Center for RNA Interference and Non-coding RNAs, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Christine Sers
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- David Horst
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany; German Cancer Consortium (DKTK), Partner Site Berlin, and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Florian Roßner
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- David Capper
  - German Cancer Consortium (DKTK), Partner Site Berlin, and German Cancer Research Center (DKFZ), Heidelberg, Germany; Department of Neuropathology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
3
Romagnoli D, Nardone A, Galardi F, Paoli M, De Luca F, Biagioni C, Franceschini GM, Pestrin M, Sanna G, Moretti E, Demichelis F, Migliaccio I, Biganzoli L, Malorni L, Benelli M. MIMESIS: minimal DNA-methylation signatures to quantify and classify tumor signals in tissue and cell-free DNA samples. Brief Bioinform 2023; 24:6991124. [PMID: 36653909 DOI: 10.1093/bib/bbad015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 12/17/2022] [Accepted: 01/03/2023] [Indexed: 01/20/2023] Open
Abstract
DNA-methylation alterations are common in cancer and display unique characteristics that make them ideal markers for tumor quantification and classification. Here we present MIMESIS, a computational framework exploiting minimal DNA-methylation signatures composed of a few dozen informative DNA-methylation sites to quantify and classify tumor signals in tissue and cell-free DNA samples. Extensive analyses of multiple independent and heterogeneous datasets including >7200 samples demonstrate the capability of MIMESIS to provide precise estimations of tumor content and to enable accurate classification of tumor type and molecular subtype. To assess our framework for clinical applications, we designed a MIMESIS-informed assay incorporating the minimal signatures for breast cancer. Using both artificial samples and clinical serial cell-free DNA samples from patients with metastatic breast cancer, we show that our approach provides accurate estimations of tumor content, sensitive detection of tumor signal and the ability to capture clinically relevant molecular subtype in patients' circulation. This study provides evidence that our extremely parsimonious approach can be used to develop cost-effective and highly scalable DNA-methylation assays that could support and facilitate the implementation of precision oncology in clinical practice.
Affiliation(s)
- Agostina Nardone
  - "Sandro Pitigliani" Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Francesca Galardi
  - "Sandro Pitigliani" Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Marta Paoli
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Francesca De Luca
  - "Sandro Pitigliani" Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Chiara Biagioni
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
  - "Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Gian Marco Franceschini
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Marta Pestrin
  - Medical Oncology Unit, Azienda Sanitaria Universitaria Giuliano Isontina, 34170 Gorizia, Italy
- Giuseppina Sanna
  - Medical Oncology, Ospedale Civile SS Annunziata, 07100 Sassari, Italy
- Erica Moretti
  - "Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Francesca Demichelis
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
  - Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Ilenia Migliaccio
  - "Sandro Pitigliani" Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Laura Biganzoli
  - "Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Luca Malorni
  - "Sandro Pitigliani" Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
  - "Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Matteo Benelli
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
  - "Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
4
Song J, Kuan PF. A systematic assessment of cell type deconvolution algorithms for DNA methylation data. Brief Bioinform 2022; 23:bbac449. [PMID: 36242584 PMCID: PMC9947552 DOI: 10.1093/bib/bbac449] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/24/2022] [Revised: 08/11/2022] [Accepted: 09/20/2022] [Indexed: 12/14/2022] Open
Abstract
We performed a systematic assessment of computational deconvolution methods that play an important role in the estimation of cell type proportions from bulk methylation data. The proposed framework methylDeConv (available as an R package) integrates several deconvolution methods for methylation profiles (Illumina HumanMethylation450 and MethylationEPIC arrays) and offers different cell-type-specific CpG selection to construct the extended reference library, which incorporates the main immune cell subsets, epithelial cells and cell-free DNAs. We compared the performance of different deconvolution algorithms via simulations and benchmark datasets and further investigated the associations of the estimated cell type proportions with cancer therapy in breast cancer and with subtypes in melanoma methylation case studies. Our results indicated that deconvolution based on the extended reference library is critical for obtaining accurate estimates of cell proportions in non-blood tissues.
Affiliation(s)
- Junyan Song
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY
- Pei-Fen Kuan
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY
5
Staaf J, Aine M. Tumor purity adjusted beta values improve biological interpretability of high-dimensional DNA methylation data. PLoS One 2022; 17:e0265557. [PMID: 36084090 PMCID: PMC9462735 DOI: 10.1371/journal.pone.0265557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Accepted: 08/15/2022] [Indexed: 11/19/2022] Open
Abstract
A common issue affecting DNA methylation analysis in tumor tissue is the presence of a substantial amount of non-tumor methylation signal derived from the surrounding microenvironment. Although approaches for quantifying and correcting for the infiltration component have been proposed previously, we believe these have not fully addressed the issue in a comprehensive and universally applicable way. We present a multi-population framework for adjusting DNA methylation beta values on the Illumina 450/850K platform using generic purity estimates to account for non-tumor signal. Our approach also provides an indirect estimate of the aggregate methylation state of the surrounding normal tissue. Using whole exome sequencing derived purity estimates and Illumina 450K methylation array data generated by The Cancer Genome Atlas project (TCGA), we provide a demonstration of this framework in breast cancer illustrating the effect of beta correction on the aggregate methylation beta value distribution, clustering accuracy, and global methylation profiles.
Affiliation(s)
- Johan Staaf
  - Department of Clinical Sciences Lund, Division of Oncology, Lund University, Medicon Village, Lund, Sweden
- Mattias Aine
  - Department of Clinical Sciences Lund, Division of Oncology, Lund University, Medicon Village, Lund, Sweden
6
Batchu S, Hakim A, Henry OS, Madzo J, Atabek U, Spitz FR, Hong YK. Transcriptome-guided resolution of tumor microenvironment interactions in pheochromocytoma and paraganglioma subtypes. J Endocrinol Invest 2022; 45:989-998. [PMID: 35088383 DOI: 10.1007/s40618-021-01729-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/14/2021] [Accepted: 12/19/2021] [Indexed: 12/26/2022]
Abstract
BACKGROUND Pheochromocytomas and paragangliomas (PCPG) are rare catecholamine-secreting endocrine tumors deriving from chromaffin cells of the embryonic neural crest. Although distinct molecular PCPG subtypes have been elucidated, certain characteristics of these tumors have yet to be fully examined, namely the tumor microenvironment (TME). To further understand tumor-stromal interactions in PCPG subtypes, the present study deconvoluted bulk tumor gene expression to examine ligand-receptor interactions. METHODS RNA-sequencing data from primary solid PCPG tumors were derived from The Cancer Genome Atlas (TCGA). Tumor purity was estimated using two robust algorithms. The tumor purity estimates and bulk tumor expression values allowed for non-negative linear regression to predict the average expression of each gene in the stromal and tumor compartments for each PCPG molecular subtype. The predicted expression values were then used in conjunction with a previously curated ligand-receptor database and scoring system to evaluate top ligand-receptor interactions. RESULTS Across all PCPG subtypes compared to normal samples, tumor-to-tumor signaling between bone morphogenetic proteins 7 (BMP7) and 15 (BMP15) and cognate receptors ACVR2B and BMPR1B was increased. In addition, tumor-to-stroma signaling was enriched for interactions between predicted tumor-originating delta-like ligand 3 (DLL3) and predicted stromal NOTCH receptors. Stroma-to-tumor signaling was enriched for interactions between ephrins A1 and A4 and ephrin receptors EphA5, EphA7, and EphA8. Pseudohypoxia subtype tumors displayed increased predicted stromal expression of genes related to immune-exhausted T-cell response, including those for inhibitory receptors HAVCR2 and CTLA4. CONCLUSION The current exploratory study predicted stromal and tumor expression through compartmental deconvolution and yielded previously unrecognized interactions and putative biomarkers in PCPG.
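The compartmental deconvolution step in the methods can be sketched for a single gene. The sketch assumes the common two-compartment mixing model, bulk_i = p_i * tumor + (1 - p_i) * stroma, where p_i is the purity of sample i; the abstract does not spell out the exact model, so this form, the function name `fit_compartments`, and the toy numbers are assumptions rather than the authors' implementation. With only two unknowns, the non-negative least-squares solution can be found by checking the unconstrained fit and then the boundary cases.

```python
def fit_compartments(purity, bulk):
    """Non-negative least-squares fit of bulk = p*tumor + (1-p)*stroma.

    purity: tumor purity per sample, each in [0, 1].
    bulk:   bulk expression of one gene in the same samples.
    Returns (tumor, stroma), both constrained to be >= 0.
    """
    a = list(purity)
    b = [1.0 - p for p in purity]
    saa = sum(x * x for x in a)
    sbb = sum(z * z for z in b)
    sab = sum(x * z for x, z in zip(a, b))
    say = sum(x * y for x, y in zip(a, bulk))
    sby = sum(z * y for z, y in zip(b, bulk))

    def sse(tumor, stroma):
        # Sum of squared residuals for a candidate solution.
        return sum((y - tumor * x - stroma * z) ** 2
                   for x, z, y in zip(a, b, bulk))

    # Unconstrained solution of the 2x2 normal equations, if well-posed.
    det = saa * sbb - sab * sab
    if det > 1e-12:
        tumor = (say * sbb - sby * sab) / det
        stroma = (sby * saa - say * sab) / det
        if tumor >= 0.0 and stroma >= 0.0:
            return tumor, stroma

    # Otherwise the optimum lies on the boundary: one or both values are 0.
    candidates = [(0.0, 0.0)]
    if saa > 0.0:
        candidates.append((max(say / saa, 0.0), 0.0))
    if sbb > 0.0:
        candidates.append((0.0, max(sby / sbb, 0.0)))
    return min(candidates, key=lambda c: sse(*c))


# Toy gene simulated with tumor = 10 and stroma = 4 (illustrative values only).
toy_purity = [0.2, 0.5, 0.8]
toy_bulk = [p * 10.0 + (1.0 - p) * 4.0 for p in toy_purity]
tumor_hat, stroma_hat = fit_compartments(toy_purity, toy_bulk)
```

In practice the fit would be repeated for every gene within each molecular subtype, and a general-purpose solver such as `scipy.optimize.nnls` could replace the hand-written two-variable case.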
Affiliation(s)
- S Batchu
  - Cooper Medical School at Rowan University, 401 Broadway, Camden, NJ, 08103, USA
- A Hakim
  - Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- O S Henry
  - Cooper Medical School at Rowan University, 401 Broadway, Camden, NJ, 08103, USA
- J Madzo
  - Coriell Institute, Camden, NJ, USA
- U Atabek
  - Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- F R Spitz
  - Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- Y K Hong
  - Department of Surgery, Cooper University Hospital, Camden, NJ, USA
7
Obtaining spatially resolved tumor purity maps using deep multiple instance learning in a pan-cancer study. PATTERNS (NEW YORK, N.Y.) 2022; 3:100399. [PMID: 35199060 PMCID: PMC8848022 DOI: 10.1016/j.patter.2021.100399] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/03/2021] [Revised: 09/07/2021] [Accepted: 11/03/2021] [Indexed: 02/07/2023]
Abstract
Tumor purity is the percentage of cancer cells within a tissue section. Pathologists estimate tumor purity to select samples for genomic analysis by manually reading hematoxylin-eosin (H&E)-stained slides, which is tedious, time consuming, and prone to inter-observer variability. Besides, pathologists' estimates do not correlate well with genomic tumor purity values, which are inferred from genomic data and accepted as accurate for downstream analysis. We developed a deep multiple instance learning model predicting tumor purity from H&E-stained digital histopathology slides. Our model successfully predicted tumor purity in eight The Cancer Genome Atlas (TCGA) cohorts and a local Singapore cohort. The predictions were highly consistent with genomic tumor purity values. Thus, our model can be utilized to select samples for genomic analysis, which will help reduce pathologists' workload and decrease inter-observer variability. Furthermore, our model provided tumor purity maps showing the spatial variation within sections. They can help better understand the tumor microenvironment.
- MIL model successfully predicts a sample's tumor purity from histopathology slides
- MIL model learns to spatially resolve tumor purity from sample-level labels
- Tumor purity varies spatially within a sample
- Pathologists' region selection is vital for correct percentage tumor nuclei estimation
Given some big data and coarse-level labels, extracting fine-level information is a demanding yet rewarding challenge in data science. This study develops a machine learning model utilizing big data and exploiting coarse-level labels to reveal fine-level details within the data. Although it can be applied to different data science tasks with enormous data and coarse labels, we applied it to a computational histopathology task with gigapixel histopathology slides and sample-level labels. Specifically, the model revealed spatial resolution of tumor purity within histopathology slides using only sample-level genomic tumor purity values during training. This can also be extended to other omics features, providing precious information about cancer biology and promising personalized, precision medicine. Such studies are of great clinical importance in discovering imaging biomarkers and better understanding the tumor microenvironment.
8
Zhao L, Zhang J, Xuan S, Liu Z, Wang Y, Zhao P. Molecular and Clinicopathological Characterization of a Prognostic Immune Gene Signature Associated With MGMT Methylation in Glioblastoma. Front Cell Dev Biol 2021; 9:600506. [PMID: 33614641 PMCID: PMC7892978 DOI: 10.3389/fcell.2021.600506] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/30/2020] [Accepted: 01/08/2021] [Indexed: 02/06/2023] Open
Abstract
Background: O6-methylguanine-DNA methyltransferase (MGMT) methylation status affects tumor chemo-resistance and the prognosis of glioblastoma (GBM) patients. We aimed to investigate the role of MGMT methylation in the regulation of GBM immunophenotype and discover an effective biomarker to improve prognosis prediction of GBM patients. Methods: A total of 769 GBM patients with clinical information from five independent cohorts were enrolled in the present study. Samples from The Cancer Genome Atlas (TCGA) dataset were used as the training set, whereas transcriptome data from the Chinese Glioma Genome Atlas (CGGA) RNA-seq, CGGA microarray, GSE16011, and the Repository for Molecular Brain Neoplasia (REMBRANDT) cohort were used for validation. A series of bioinformatics approaches were carried out to construct a prognostic signature based on immune-related genes, which were tightly related to the MGMT methylation status. In silico analyses were performed to investigate the influence of the signature on immunosuppression and remodeling of the tumor microenvironment. Then, the utility of this immune gene signature was analyzed by the development and evaluation of a nomogram. In vitro experiments were further used to verify the immunologic function of the genes in the signature. Results: We found that MGMT unmethylation was closely associated with immune-related biological processes in GBM. Sixty-five immune genes were more highly expressed in the MGMT-unmethylated group than in the MGMT-methylated group. An immune gene-based risk model was further established to divide patients into high- and low-risk groups, and the prognostic value of this signature was validated in several GBM cohorts. Functional analyses revealed a universal up-regulation of immune-related pathways in the high-risk group. Furthermore, the risk score was highly correlated with immune cell infiltration, immunosuppression, inflammatory activities, and the expression levels of immune checkpoints. A nomogram was developed for clinical application. Knockdown of the five genes in the signature remodeled the immunosuppressive microenvironment by restraining M2 macrophage polarization and suppressing the production of immunosuppressive cytokines. Conclusions: MGMT methylation is strongly related to the immune responses in GBM. The immune gene-based signature we identified may have potential implications in predicting the prognosis of GBM patients and in understanding the mechanisms underlying the role of MGMT methylation.
Affiliation(s)
- Liang Zhao
  - Department of Neurosurgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jiayue Zhang
  - Department of Neurosurgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Shurui Xuan
  - Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Zhiyuan Liu
  - Department of Neurosurgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yu Wang
  - Department of Neurosurgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Peng Zhao
  - Department of Neurosurgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
9
Galardi F, De Luca F, Romagnoli D, Biagioni C, Moretti E, Biganzoli L, Di Leo A, Migliaccio I, Malorni L, Benelli M. Cell-Free DNA-Methylation-Based Methods and Applications in Oncology. Biomolecules 2020; 10:E1677. [PMID: 33334040 PMCID: PMC7765488 DOI: 10.3390/biom10121677] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 10/27/2020] [Revised: 12/07/2020] [Accepted: 12/14/2020] [Indexed: 12/11/2022] Open
Abstract
Liquid biopsy based on cell-free DNA (cfDNA) enables non-invasive dynamic assessment of disease status in patients with cancer, both in the early and advanced settings. The analysis of DNA-methylation (DNAm) from cfDNA samples holds great promise due to the intrinsic characteristics of DNAm being more prevalent, pervasive, and cell- and tumor-type specific than genomics, for which established cfDNA assays already exist. Herein, we report on recent advances in experimental strategies for the analysis of DNAm in cfDNA samples. We describe the main steps of DNAm-based analysis workflows, including pre-analytics of cfDNA samples, DNA treatment, assays for DNAm evaluation, and methods for data analysis. We report on protocols, biomolecular techniques, and computational strategies enabling DNAm evaluation in the context of cfDNA analysis, along with practical considerations on input sample requirements and costs. We provide an overview of existing studies exploiting cell-free DNAm biomarkers for the detection and monitoring of cancer in early and advanced settings, for the evaluation of drug resistance, and for the identification of the cell-of-origin of tumors. Finally, we report on DNAm-based tests approved for clinical use and summarize their performance in the context of liquid biopsy.
Affiliation(s)
- Francesca Galardi
  - «Sandro Pitigliani» Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Francesca De Luca
  - «Sandro Pitigliani» Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Dario Romagnoli
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
- Chiara Biagioni
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
  - «Sandro Pitigliani» Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Erica Moretti
  - «Sandro Pitigliani» Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Laura Biganzoli
  - «Sandro Pitigliani» Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Angelo Di Leo
  - «Sandro Pitigliani» Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Ilenia Migliaccio
  - «Sandro Pitigliani» Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
- Luca Malorni
  - «Sandro Pitigliani» Translational Research Unit, Hospital of Prato, 59100 Prato, Italy
  - «Sandro Pitigliani» Medical Oncology Department, Hospital of Prato, 59100 Prato, Italy
- Matteo Benelli
  - Bioinformatics Unit, Hospital of Prato, 59100 Prato, Italy
10
Noorbakhsh J, Farahmand S, Foroughi Pour A, Namburi S, Caruana D, Rimm D, Soltanieh-Ha M, Zarringhalam K, Chuang JH. Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images. Nat Commun 2020; 11:6367. [PMID: 33311458 PMCID: PMC7733499 DOI: 10.1038/s41467-020-20030-5] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 02/27/2020] [Accepted: 11/05/2020] [Indexed: 02/07/2023] Open
Abstract
Histopathological images are a rich but incompletely explored data type for studying cancer. Manual inspection is time consuming, making it challenging to use for image data mining. Here we show that convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons to reveal shared spatial behaviors. We develop CNN architectures to analyze 27,815 hematoxylin and eosin scanned images from The Cancer Genome Atlas for tumor/normal, cancer subtype, and mutation classification. Our CNNs are able to classify TCGA pathologist-annotated tumor/normal status of whole slide images (WSIs) in 19 cancer types with consistently high AUCs (0.995 ± 0.008), as well as subtypes with lower but significant accuracy (AUC 0.87 ± 0.1). Remarkably, tumor/normal CNNs trained on one tissue are effective in others (AUC 0.88 ± 0.11), with classifier relationships also recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45 ± 0.16 between classifier pairs. Breast cancers, bladder cancers, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting these cancers can be canonical types for image analysis. Patterns for TP53 mutations can also be detected, with WSI self- and cross-tissue AUCs ranging from 0.65-0.80. Finally, we comparatively evaluate CNNs on 170 breast and colon cancer images with pathologist-annotated nuclei, finding that both cellular and intercellular regions contribute to CNN accuracy. These results demonstrate the power of CNNs not only for histopathological classification, but also for cross-comparisons to reveal conserved spatial behaviors across tumors.
Affiliation(s)
- Javad Noorbakhsh
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Saman Farahmand
- Computational Sciences PhD Program, University of Massachusetts-Boston, Boston, MA, USA
- Sandeep Namburi
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Dennis Caruana
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- David Rimm
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Kourosh Zarringhalam
- Computational Sciences PhD Program, University of Massachusetts-Boston, Boston, MA, USA
- Department of Mathematics, University of Massachusetts-Boston, Boston, MA, USA
- Jeffrey H Chuang
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- UCONN Health, Department of Genetics and Genome Sciences, Farmington, CT, USA
|
11
|
Zuo Y, Song M, Li H, Chen X, Cao P, Zheng L, Cao G. Analysis of the Epigenetic Signature of Cell Reprogramming by Computational DNA Methylation Profiles. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190919103752] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/22/2022]
Abstract
Background:
DNA methylation plays an important role in the reprogramming process.
Understanding the underlying molecular mechanism of reprogramming is crucial for answering
fundamental questions regarding the transition of cell identity.
Methods:
In this study, based on the genome-wide DNA methylation data from different cell lines,
comparative methylation profiles were proposed to identify the epigenetic signature of cell
reprogramming.
Results:
The density profile of CpG methylation showed that pluripotent cells are more polarized
than Human Dermal Fibroblasts (HDF) cells. The heterogeneity of iPS has a greater deviation in
the DNA hypermethylation pattern. The result of regional distribution showed that the differential
CpG sites between pluripotent cells and HDFs tend to accumulate in the gene body and CpG shelf
regions, whereas the internal differential methylation CpG sites (DMCs) of three types of
pluripotent cells tend to accumulate in the TSS1500 region. Furthermore, a series of endogenous
markers of cell reprogramming were identified based on the integrative analysis, including focal
adhesion, pluripotency maintenance and transcription regulation. The calcium signaling pathway
was detected as one of the signatures between NT cells and iPS cells. Finally, the regional bias of
DNA methylation for key pluripotency factors was discussed. Our studies provide new insight into
the barrier identification of cell reprogramming.
Conclusion:
Our studies analyzed some epigenetic markers and barriers of nuclear reprogramming,
hoping to provide new insight into understanding the underlying molecular mechanism
of reprogramming.
Affiliation(s)
- Yongchun Zuo
- The College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot 010018, China
- Mingmin Song
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Hanshuang Li
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Xing Chen
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Pengbo Cao
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Lei Zheng
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Guifang Cao
- The College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot 010018, China
|
12
|
Abstract
Background DNA methylation is a key epigenetic regulator contributing to cancer development. To understand the role of DNA methylation in tumorigenesis, it is important to investigate and compare differential methylation (DM) patterns between normal and case samples across different cancer types. However, current pan-cancer analyses call DM separately for each cancer, which suffers from lower statistical power and fails to provide a comprehensive view for patterns across cancers. Methods In this work, we propose a rigorous statistical model, PanDM, to jointly characterize DM patterns across diverse cancer types. PanDM uses the hidden correlations in the combined dataset to improve statistical power through joint modeling. PanDM takes summary statistics from separate analyses as input and performs methylation site clustering, differential methylation detection, and pan-cancer pattern discovery. We demonstrate the favorable performance of PanDM using simulation data. We apply our model to 12 cancer methylome data collected from The Cancer Genome Atlas (TCGA) project. We further conduct ontology- and pathway-enrichment analyses to gain new biological insights into the pan-cancer DM patterns learned by PanDM. Results PanDM outperforms two types of separate analyses in the power of DM calling in the simulation study. Application of PanDM to TCGA data reveals 37 pan-cancer DM patterns in the 12 cancer methylomes, including both common and cancer-type-specific patterns. These 37 patterns are in turn used to group cancer types. Functional ontology and biological pathways enriched in the non-common patterns not only underpin the cancer-type-specific etiology and pathogenesis but also unveil the common environmental risk factors shared by multiple cancer types. Moreover, we also identify PanDM-specific DM CpG sites that the common strategy fails to detect. 
Conclusions PanDM is a powerful tool that provides a systematic way to investigate aberrant methylation patterns across multiple cancer types. Results from real data analyses suggest a novel angle for us to understand the common and specific DM patterns in different cancers. Moreover, as PanDM works on the summary statistics for each cancer type, the same framework can in principle be applied to pan-cancer analyses of other functional genomic profiles. We implement PanDM as an R package, which is freely available at http://www.sta.cuhk.edu.hk/YWei/PanDM.html.
Affiliation(s)
- Mai Shi
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, SAR, China
- Stephen Kwok-Wing Tsui
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, SAR, China; Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, SAR, China
- Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, Georgia, 30322, USA
- Yingying Wei
- Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, SAR, China
|
13
|
Lee D, Park Y, Kim S. Towards multi-omics characterization of tumor heterogeneity: a comprehensive review of statistical and machine learning approaches. Brief Bioinform 2020; 22:5896573. [PMID: 34020548 DOI: 10.1093/bib/bbaa188] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 06/02/2020] [Revised: 06/29/2020] [Accepted: 07/21/2020] [Indexed: 12/19/2022] Open
Abstract
The multi-omics molecular characterization of cancer opened a new horizon for our understanding of cancer biology and therapeutic strategies. However, a tumor biopsy comprises diverse types of cells, including not only cancerous cells but also tumor microenvironmental cells and adjacent normal cells. This heterogeneity is a major confounding factor that hampers robust and reproducible bioinformatic analysis for biomarker identification using multi-omics profiles. In addition, the heterogeneity itself has been recognized over the years for its significant prognostic value in some cancer types, thus offering another promising avenue for therapeutic intervention. A number of computational approaches to unravel such heterogeneity from high-throughput molecular profiles of a tumor sample have been proposed, but most of them rely on data from an individual omics layer. Since the heterogeneity of cells is widely distributed across multi-omics layers, methods based on an individual layer can only partially characterize the heterogeneous admixture of cells. To help facilitate further development of methodologies that simultaneously account for several multi-omics profiles, we wrote a comprehensive review of diverse approaches to characterize tumor heterogeneity based on three different omics layers: genome, epigenome and transcriptome. As a result, this review can be useful for the analysis of multi-omics profiles produced by many large-scale consortia. Contact: sunkim.bioinfo@snu.ac.kr.
Affiliation(s)
- Dohoon Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Youngjune Park
- Department of Computer Science and Engineering, Institute of Engineering Research, Seoul National University, Seoul 08826, Korea
- Sun Kim
- Bioinformatics Institute, Seoul National University, Seoul 08826, Korea
|
14
|
Pang S, Wang L, Wang S, Zhang Y, Wang X. PESM: A novel approach of tumor purity estimation based on sample specific methylation sites. J Bioinform Comput Biol 2020; 18:2050027. [PMID: 32757807 DOI: 10.1142/s0219720020500274] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Background: Tumor purity is of great significance for the study of tumor genotyping and the prediction of recurrence, and it is significantly affected by tumor heterogeneity. Tumor heterogeneity is the basis of drug resistance in various cancer treatments, and DNA methylation plays a core role in the generation of tumor heterogeneity. Almost all types of cancer cells are associated with abnormal DNA methylation in certain regions of the genome. The selection of tumor-related differential methylation sites, which can be used as an indicator of tumor purity, has important implications for purity assessment. At present, the selection of information sites mostly focuses on inter-tumor heterogeneity and ignores the spatial heterogeneity of tumor growth, that is, sample specificity. Results: Considering the specificity of tumor samples and the information gain of an individual tumor sample relative to the normal samples, we present an approach, PESM, to evaluate tumor purity through the sample-specific differential methylation sites of tumor samples. Applied to more than 200 tumor samples of prostate adenocarcinoma (PRAD) and kidney renal clear cell carcinoma (KIRC), PESM yields tumor purity estimates that are highly consistent with other existing methods. In addition, PESM performs better than the method that uses the integrated signal of methylation sites to estimate purity. Therefore, the choice of information site selection method has an important impact on the estimation of tumor purity, and the selection of sample-specific information sites contributes to accurate identification of the tumor purity of samples.
Affiliation(s)
- Shanchen Pang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, P. R. China
- Lihua Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, P. R. China
- Shudong Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, P. R. China
- Yuanyuan Zhang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, P. R. China; School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, P. R. China
- Xinzeng Wang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, Shandong, P. R. China
|
15
|
Azim R, Wang S, Zhou S, Zhong X. Purity estimation from differentially methylated sites using Illumina Infinium methylation microarray data. Cell Cycle 2020; 19:2028-2039. [PMID: 32627651 PMCID: PMC7469651 DOI: 10.1080/15384101.2020.1789315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2020] [Revised: 06/11/2020] [Accepted: 06/23/2020] [Indexed: 10/23/2022] Open
Abstract
Solid tissues collected from patients in clinical settings are composed of both normal and cancer cells, which often complicates data analysis and epigenetic findings. Purity estimation of samples is crucial for reliable identification of genomic aberrations and for uniform inter-sample and inter-patient comparisons. Here, an effective and flexible method has been developed to estimate the level of methylation, from which tumor purity is inferred without prior knowledge from other datasets. A comprehensive analysis of our approach on Illumina Infinium 450k methylation microarray data from TCGA breast cancer shows improved performance for purity assessment. This assessment correlates strongly with other advanced methods.
Affiliation(s)
- Riasat Azim
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, P.R. China
- Shulin Wang
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, P.R. China
- Su Zhou
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, P.R. China
- Xing Zhong
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, P.R. China
|
16
|
Wang S, Wang L, Zhang Y, Pang S, Wang X. PEIS: a novel approach of tumor purity estimation by identifying information sites through integrating signal based on DNA methylation data. BMC Bioinformatics 2019; 20:714. [PMID: 31888435 PMCID: PMC6936156 DOI: 10.1186/s12859-019-3227-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Tumor purity plays an important role in understanding the pathogenic mechanism of tumors. The purity of tumor samples is highly sensitive to tumor heterogeneity. Because intratumoral heterogeneity is reflected in genetic and epigenetic data, such data are suitable for studying tumor purity. Many purity estimation methods are based on copy number variation, gene expression and other data, while few use DNA methylation data, and those that do are often based on selected information sites. Consequently, how methylation sites are chosen as information sites has an important influence on the purity estimation results. At present, the selection of information sites is often based on differentially methylated sites that only consider the mean signal, without considering other possible signals and the strong correlation among adjacent sites. RESULTS By integrating multiple signals and the strong correlation among adjacent sites, we propose an approach, PEIS, to estimate the purity of tumor samples by selecting informative differential methylation sites. Applied to 12 publicly available tumor datasets, PEIS provides accurate estimates of tumor purity that are highly consistent with other existing methods. Also, a comparison of different information site selection methods in the evaluation of tumor purity shows that PEIS is superior to the other methods. CONCLUSIONS A new method to estimate the purity of tumor samples is proposed. This approach integrates multiple signals of the CpG sites and the correlation between the sites. Experimental analysis shows that this method is in good agreement with other existing methods for estimating tumor purity.
Affiliation(s)
- Shudong Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China
- Lihua Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China
- Yuanyuan Zhang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China; School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Shanchen Pang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong, China
- Xinzeng Wang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, Shandong, China
|
17
|
Li Y, Umbach DM, Bingham A, Li QJ, Zhuang Y, Li L. Putative biomarkers for predicting tumor sample purity based on gene expression data. BMC Genomics 2019; 20:1021. [PMID: 31881847 PMCID: PMC6933652 DOI: 10.1186/s12864-019-6412-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/28/2019] [Accepted: 12/18/2019] [Indexed: 12/29/2022] Open
Abstract
Background Tumor purity is the percent of cancer cells present in a sample of tumor tissue. The non-cancerous cells (immune cells, fibroblasts, etc.) have an important role in tumor biology. The ability to determine tumor purity is important to understand the roles of cancerous and non-cancerous cells in a tumor. Methods We applied a supervised machine learning method, XGBoost, to data from 33 TCGA tumor types to predict tumor purity using RNA-seq gene expression data. Results Across the 33 tumor types, the median correlation between observed and predicted tumor-purity ranged from 0.75 to 0.87 with small root mean square errors, suggesting that tumor purity can be accurately predicted using expression data. We further confirmed that expression levels of a ten-gene set (CSF2RB, RHOH, C1S, CCDC69, CCL22, CYTIP, POU2AF1, FGR, CCL21, and IL7R) were predictive of tumor purity regardless of tumor type. We tested whether our set of ten genes could accurately predict tumor purity of a TCGA-independent data set. We showed that expression levels from our set of ten genes were highly correlated (ρ = 0.88) with the actual observed tumor purity. Conclusions Our analyses suggested that the ten-gene set may serve as a biomarker for tumor purity prediction using gene expression data.
Affiliation(s)
- Yuanyuan Li
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA MD A3-03, Durham, NC, 27709, USA.
- David M Umbach
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA MD A3-03, Durham, NC, 27709, USA
- Adrienna Bingham
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA MD A3-03, Durham, NC, 27709, USA
- Qi-Jing Li
- Department of Immunology, Duke University, Durham, North Carolina, 27710, USA
- Yuan Zhuang
- Department of Immunology, Duke University, Durham, North Carolina, 27710, USA
- Leping Li
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA MD A3-03, Durham, NC, 27709, USA
|
18
|
Sun W, Bunn P, Jin C, Little P, Zhabotynsky V, Perou CM, Hayes DN, Chen M, Lin DY. The association between copy number aberration, DNA methylation and gene expression in tumor samples. Nucleic Acids Res 2019. [PMID: 29529299 PMCID: PMC5887505 DOI: 10.1093/nar/gky131] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Indexed: 12/28/2022] Open
Abstract
We systematically studied the association between somatic copy number aberration (SCNA), DNA methylation and gene expression using -omic data from The Cancer Genome Atlas (TCGA) on six cancer types: breast cancer, colon cancer, glioblastoma, leukemia, lower-grade glioma and prostate cancer. A major challenge for such an integrated study is that the association between DNA methylation and gene expression is severely confounded by tumor purity and cell type composition, which are often unobserved and difficult to estimate. To overcome this challenge, we developed a method to remove confounding effects by calculating the principal components that span the space of the latent factors. Another intriguing finding of our study is that there could be both positive and negative associations between SCNA and DNA methylation, while the CpGs with negative/positive associations with SCNA are often located around CpG islands/ocean, respectively. A joint study of SCNA, DNA methylation, and gene expression suggests that SCNA often affects DNA methylation and gene expression independently.
Affiliation(s)
- Wei Sun
- Public Health Science Division, Fred Hutchinson Cancer Research Center, USA
- Paul Bunn
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA
- Chong Jin
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA
- Paul Little
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA
- Vasyl Zhabotynsky
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA
- Charles M Perou
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, USA; Department of Genetics, University of North Carolina, Chapel Hill, USA
- David Neil Hayes
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, USA; Department of Medicine, Division of Hematology/Oncology, University of North Carolina, Chapel Hill, USA
- Mengjie Chen
- Department of Medicine, University of Chicago, USA; Department of Human Genetics, University of Chicago, USA
- Dan-Yu Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, USA
|
19
|
Benelli M, Romagnoli D, Demichelis F. Tumor purity quantification by clonal DNA methylation signatures. Bioinformatics 2019; 34:1642-1649. [PMID: 29325057 DOI: 10.1093/bioinformatics/bty011] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 09/15/2017] [Accepted: 01/06/2018] [Indexed: 12/18/2022] Open
Abstract
Motivation Controlling for tumor purity in molecular analyses is essential to allow for reliable genomic aberration calls, for inter-sample comparison and to monitor heterogeneity of cancer cell populations. In genome wide screening studies, the assessment of tumor purity is typically performed by means of computational methods that exploit somatic copy number aberrations. Results We present a strategy, called Purity Assessment from clonal MEthylation Sites (PAMES), which uses the methylation level of a few dozen, highly clonal, tumor type specific CpG sites to estimate the purity of tumor samples, without the need of a matched benign control. We trained and validated our method in more than 6000 samples from different datasets. Purity estimates by PAMES were highly concordant with other state-of-the-art tools and its evaluation in a cancer cell line dataset highlights its reliability to accurately estimate tumor admixtures. We extended the capability of PAMES to the analysis of CpG islands instead of the more platform-specific CpG sites and demonstrated its accuracy in a set of advanced tumors profiled by high throughput DNA methylation sequencing. These analyses show that PAMES is a valuable tool to assess the purity of tumor samples in the settings of clinical research and diagnostics. Availability and implementation https://github.com/cgplab/PAMES. Contact matteo.benelli@uslcentro.toscana.it or f.demichelis@unitn.it. Supplementary information Supplementary data are available at Bioinformatics online.
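The idea of reading purity off a small set of clonal CpG sites can be illustrated with a toy calculation. The sketch below is not the PAMES implementation; it rests on a simplifying assumption that is not stated in the abstract: at tumor-specific hypermethylated sites the non-tumor cells are fully unmethylated (so the beta value approximates purity), and at hypomethylated sites the reverse holds (so 1 - beta approximates purity). The function name, site identifiers, and beta values are hypothetical.

```python
from statistics import median


def purity_from_clonal_sites(beta, hyper_sites, hypo_sites):
    """Toy purity estimate from methylation beta values (fractions in [0, 1]).

    Assumption (illustrative only): hyper_sites are methylated only in tumor
    cells, hypo_sites are unmethylated only in tumor cells.
    """
    # At hypermethylated sites the measured beta tracks the tumor fraction.
    estimates = [beta[s] for s in hyper_sites if s in beta]
    # At hypomethylated sites the tumor fraction is the unmethylated share.
    estimates += [1.0 - beta[s] for s in hypo_sites if s in beta]
    if not estimates:
        raise ValueError("no informative sites found in the sample")
    # The median limits the influence of individual noisy sites.
    return median(estimates)


# Hypothetical sample with made-up site names and beta values.
sample_beta = {"site_a": 0.60, "site_b": 0.62, "site_c": 0.58,
               "site_d": 0.40, "site_e": 0.41}
purity = purity_from_clonal_sites(
    sample_beta,
    hyper_sites=["site_a", "site_b", "site_c"],
    hypo_sites=["site_d", "site_e"],
)
print(round(purity, 2))
```

Using a median rather than a mean is a design choice of this sketch, made so that a single site violating the clonality assumption does not shift the estimate much.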
Affiliation(s)
- Matteo Benelli
- Centre for Integrative Biology, University of Trento, Trento, Italy; Bioinformatics Unit, Hospital of Prato, Istituto Toscano Tumori, Prato, Italy
- Dario Romagnoli
- Centre for Integrative Biology, University of Trento, Trento, Italy; Bioinformatics Unit, Hospital of Prato, Istituto Toscano Tumori, Prato, Italy
- Francesca Demichelis
- Centre for Integrative Biology, University of Trento, Trento, Italy; Caryl and Israel Englander Institute for Precision Medicine, New York Presbyterian Hospital-Weill Cornell Medicine, New York, NY, USA
|
20
|
A novel matched-pairs feature selection method considering with tumor purity for differential gene expression analyses. Math Biosci 2019; 311:39-48. [DOI: 10.1016/j.mbs.2019.02.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/17/2019] [Revised: 02/21/2019] [Accepted: 02/22/2019] [Indexed: 12/13/2022]
|
21
|
Zhang R, Lai L, Dong X, He J, You D, Chen C, Lin L, Zhu Y, Huang H, Shen S, Wei L, Chen X, Guo Y, Liu L, Su L, Shafer A, Moran S, Fleischer T, Bjaanæs MM, Karlsson A, Planck M, Staaf J, Helland Å, Esteller M, Wei Y, Chen F, Christiani DC. SIPA1L3 methylation modifies the benefit of smoking cessation on lung adenocarcinoma survival: an epigenomic-smoking interaction analysis. Mol Oncol 2019; 13:1235-1248. [PMID: 30924596 PMCID: PMC6487703 DOI: 10.1002/1878-0261.12482] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 11/15/2018] [Revised: 03/13/2019] [Accepted: 03/18/2019] [Indexed: 01/10/2023] Open
Abstract
Smoking cessation prolongs survival and decreases mortality of patients with non-small-cell lung cancer (NSCLC). In addition, epigenetic alterations of some genes are associated with survival. However, potential interactions between smoking cessation and epigenetics have not been assessed. Here, we conducted an epigenome-wide interaction analysis between DNA methylation and smoking cessation on NSCLC survival. We used a two-stage study design to identify DNA methylation-smoking cessation interactions that affect overall survival for early-stage NSCLC. The discovery phase contained NSCLC patients from Harvard, Spain, Norway, and Sweden. A histology-stratified Cox proportional hazards model adjusted for age, sex, clinical stage, and study center was used to test DNA methylation-smoking cessation interaction terms. Interactions with false discovery rate-q ≤ 0.05 were further confirmed in a validation phase using The Cancer Genome Atlas database. Histology-specific interactions were identified by stratification analysis in lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients. We identified one CpG probe (cg02268510, SIPA1L3) that significantly and exclusively modified the effect of smoking cessation on survival in LUAD patients [hazard ratio (HR) for interaction = 1.12; 95% confidence interval (CI): 1.07-1.16; P = 4.30 × 10^-7]. Further, the effect of smoking cessation on early-stage LUAD survival varied across patients with different methylation levels of cg02268510 (SIPA1L3). Smoking cessation only benefited LUAD patients with low methylation (HR = 0.53; 95% CI: 0.34-0.82; P = 4.61 × 10^-3) rather than medium or high methylation (HR = 1.21; 95% CI: 0.86-1.70; P = 0.266) of cg02268510 (SIPA1L3). Moreover, there was an antagonistic interaction between elevated methylation of cg02268510 (SIPA1L3) and smoking cessation (HR for interaction = 2.1835; 95% CI: 1.27-3.74; P = 4.46 × 10^-3).
In summary, smoking cessation benefited survival of LUAD patients with low methylation at cg02268510 (SIPA1L3). The results have implications for not only smoking cessation after diagnosis, but also possible methylation-specific drug targeting.
|
22
|
Blum Y, Meiller C, Quetel L, Elarouci N, Ayadi M, Tashtanbaeva D, Armenoult L, Montagne F, Tranchant R, Renier A, de Koning L, Copin MC, Hofman P, Hofman V, Porte H, Le Pimpec-Barthes F, Zucman-Rossi J, Jaurand MC, de Reyniès A, Jean D. Dissecting heterogeneity in malignant pleural mesothelioma through histo-molecular gradients for clinical applications. Nat Commun 2019; 10:1333. [PMID: 30902996 PMCID: PMC6430832 DOI: 10.1038/s41467-019-09307-6] [Citation(s) in RCA: 118] [Impact Index Per Article: 19.7] [Received: 06/06/2018] [Accepted: 02/28/2019] [Indexed: 12/19/2022] Open
Abstract
Malignant pleural mesothelioma (MPM) is recognized as heterogeneous based both on histology and molecular profiling. Histology addresses inter-tumor and intra-tumor heterogeneity in MPM and describes three major types: epithelioid, sarcomatoid and biphasic, a combination of the former two types. Molecular profiling studies have not addressed intra-tumor heterogeneity in MPM to date. Here, we use a deconvolution approach and show that molecular gradients shed new light on the intra-tumor heterogeneity of MPM, leading to a reconsideration of MPM molecular classifications. We show that each tumor can be decomposed as a combination of epithelioid-like and sarcomatoid-like components whose proportions are highly associated with the prognosis. Moreover, we show that this more subtle way of characterizing MPM heterogeneity provides a better understanding of the underlying oncogenic pathways and the related epigenetic regulation and immune and stromal contexts. We discuss the implications of these findings for guiding therapeutic strategies, particularly immunotherapies and targeted therapies.
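The notion of decomposing a tumor into two components with varying proportions can be illustrated with a toy two-component mixture. This is a minimal sketch under stated assumptions, not the authors' deconvolution method: it assumes a tumor profile is a linear mix of two known reference profiles and fits the single mixing proportion by least squares. The function name, the reference profiles, and the 70/30 mix are all hypothetical.

```python
def mixing_proportion(mixed, comp_e, comp_s):
    """Least-squares estimate of p in: mixed ~ p * comp_e + (1 - p) * comp_s."""
    # Rearranged model: (mixed - comp_s) ~ p * (comp_e - comp_s),
    # which is a one-parameter regression through the origin.
    num = sum((m - s) * (e - s) for m, e, s in zip(mixed, comp_e, comp_s))
    den = sum((e - s) ** 2 for e, s in zip(comp_e, comp_s))
    if den == 0:
        raise ValueError("reference profiles are identical")
    # Clip to [0, 1] so the result can be read as a proportion.
    return min(1.0, max(0.0, num / den))


# Hypothetical marker-level reference profiles (made-up values).
epithelioid_like = [8.0, 1.0, 5.0, 2.0]
sarcomatoid_like = [2.0, 7.0, 5.0, 6.0]

# Simulated tumor: 70% epithelioid-like, 30% sarcomatoid-like.
tumor = [0.7 * e + 0.3 * s for e, s in zip(epithelioid_like, sarcomatoid_like)]

p_hat = mixing_proportion(tumor, epithelioid_like, sarcomatoid_like)
print(round(p_hat, 3))
```

In this noise-free simulation the fit recovers the simulated proportion; with real profiles the reference components would themselves have to be estimated, which this sketch does not attempt.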
Affiliation(s)
- Yuna Blum
- Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre Le Cancer, 75013, Paris, France
- Clément Meiller
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Lisa Quetel
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Nabila Elarouci
- Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre Le Cancer, 75013, Paris, France
- Mira Ayadi
- Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre Le Cancer, 75013, Paris, France
- Danisa Tashtanbaeva
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Lucile Armenoult
- Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre Le Cancer, 75013, Paris, France
- François Montagne
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Service de Chirurgie Thoracique, Hôpital Calmette - CHRU de Lille, 59000, Lille, France
- Université de Lille, 59045, Lille, France
- Service de Chirurgie Générale et Thoracique, CHU de Rouen, 76000, Rouen, France
- Robin Tranchant
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Laboratoire de Biochimie (LBC), ESPCI Paris, PSL Research University, CNRS UMR8231 Chimie Biologie Innovation, 75005, Paris, France
- Annie Renier
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Leanne de Koning
- Translational Research Department, Institut Curie, PSL Research University, 75005, Paris, France
- Marie-Christine Copin
- Université de Lille, 59045, Lille, France
- Institut de Pathologie, Centre de Biologie-Pathologie, CHRU de Lille, 59037, Lille, France
- Paul Hofman
- Laboratoire de Pathologie Clinique et Expérimentale (LPCE) et biobanque (BB-0033-00025), CHRU de Nice, 06003, Nice, France
- Université Côte d'Azur, 06108, Nice, France
| | - Véronique Hofman
- Laboratoire de Pathologie Clinique et Expérimentale (LPCE) et biobanque (BB-0033-00025), CHRU de Nice, 06003, Nice, France
- Université Côte d'Azur, 06108, Nice, France
| | - Henri Porte
- Service de Chirurgie Thoracique, Hôpital Calmette - CHRU de Lille, 59000, Lille, France
- Université de Lille, 59045, Lille, France
| | - Françoise Le Pimpec-Barthes
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
- Assistance Publique-Hôpitaux de Paris, Hôpital Européen Georges Pompidou, 75015, Paris, France
- Département de Chirurgie Thoracique, Hôpital Européen Georges Pompidou, 75015, Paris, France
| | - Jessica Zucman-Rossi
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
| | - Marie-Claude Jaurand
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France
| | - Aurélien de Reyniès
- Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre Le Cancer, 75013, Paris, France.
| | - Didier Jean
- Centre de Recherche des Cordeliers, Sorbonne Universités, Inserm, UMRS-1138, 75006, Paris, France.
- Functional Genomics of Solid Tumors, USPC, Université Paris Descartes, Université Paris Diderot, Université Paris 13, Labex Immuno-Oncology, 75000, Paris, France.
|
23
|
Zhang W, Long H, He B, Yang J. DECtp: Calling Differential Gene Expression Between Cancer and Normal Samples by Integrating Tumor Purity Information. Front Genet 2018; 9:321. [PMID: 30210526 PMCID: PMC6121016 DOI: 10.3389/fgene.2018.00321] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/14/2018] [Accepted: 07/30/2018] [Indexed: 11/13/2022] Open
Abstract
Identifying differentially expressed genes (DEGs) between tumor and normal samples is critical for studying tumorigenesis, and has been routinely applied to identify diagnostic, prognostic, and therapeutic biomarkers for many cancers. It is well known that solid tumor tissue samples obtained from clinical settings are always mixtures of cancer and normal cells. However, tumor purity information is largely ignored in traditional differential expression analyses, which might decrease the power of differential gene identification or even bias the results. In this paper, we have developed a novel differential gene calling method called DECtp by integrating tumor purity information into a generalized least squares procedure, followed by the Wald test. We compared DECtp with popular methods such as the t-test and limma on nine simulation datasets with different sample sizes and noise levels. DECtp achieved the highest area under the curve (AUC) in all comparisons, suggesting that cancer purity information is critical for DEG calling between tumor and normal samples. In addition, we applied DECtp to cancer and normal samples of 14 tumor types collected from The Cancer Genome Atlas (TCGA) and compared the DEGs with those called by limma. As a result, DECtp achieved more sensitive, consistent, and biologically meaningful results and identified a few novel DEGs for further experimental validation.
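A minimal sketch of how purity can enter a differential expression test is given below. It is a simplification and not the DECtp implementation: it uses ordinary rather than generalized least squares, treats normal samples as purity 0, and the function name `purity_wald` and the example values are hypothetical. The fitted slope estimates the cancer-minus-normal expression difference for one gene, and the slope divided by its standard error is a Wald statistic.

```python
import math

def purity_wald(expr, purity):
    """OLS fit of expr = mu_normal + delta * purity; returns (delta, wald_stat)."""
    n = len(expr)
    mean_p = sum(purity) / n
    mean_y = sum(expr) / n
    sxx = sum((p - mean_p) ** 2 for p in purity)
    sxy = sum((p - mean_p) * (y - mean_y) for p, y in zip(purity, expr))
    delta = sxy / sxx                       # estimated cancer-minus-normal difference
    intercept = mean_y - delta * mean_p     # estimated pure-normal expression
    rss = sum((y - intercept - delta * p) ** 2 for p, y in zip(purity, expr))
    se = math.sqrt(rss / (n - 2) / sxx)     # standard error of the slope
    return delta, delta / se

# Three normal samples (purity 0) and three tumors of varying purity.
purity = [0.0, 0.0, 0.0, 0.4, 0.6, 0.8]
expr = [5.1, 4.9, 5.0, 5.8, 6.3, 6.5]      # roughly 5 + 2 * purity plus noise
delta, wald_stat = purity_wald(expr, purity)
```

A two-group t-test on the same data would estimate the tumor-versus-normal gap at the average purity, which understates the difference between pure cancer and normal cells.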
Affiliation(s)
- Weiwei Zhang
- School of Science, East China University of Technology, Nanchang, China
| | - Haixia Long
- Department of Information Science and Technology, Hainan Normal University, Haikou, China
| | - Binsheng He
- The First Affiliated Hospital, Changsha Medical University, Changsha, China
| | - Jialiang Yang
- College of Information Engineering, Changsha Medical University, Changsha, China.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
|
24
|
Pisanic TR, Cope LM, Lin SF, Yen TT, Athamanolap P, Asaka R, Nakayama K, Fader AN, Wang TH, Shih IM, Wang TL. Methylomic Analysis of Ovarian Cancers Identifies Tumor-Specific Alterations Readily Detectable in Early Precursor Lesions. Clin Cancer Res 2018; 24:6536-6547. [PMID: 30108103 DOI: 10.1158/1078-0432.ccr-18-1199] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 04/23/2018] [Revised: 07/12/2018] [Accepted: 08/09/2018] [Indexed: 12/22/2022]
Abstract
PURPOSE: High-grade serous ovarian carcinoma (HGSOC) typically remains undiagnosed until advanced stages when peritoneal dissemination has already occurred. Here, we sought to identify HGSOC-specific alterations in DNA methylation and assess their potential to provide sensitive and specific detection of HGSOC at its earliest stages. EXPERIMENTAL DESIGN: MethylationEPIC genome-wide methylation analysis was performed on a discovery cohort comprising 23 HGSOC, 37 non-HGSOC malignant, and 36 histologically unremarkable gynecologic tissue samples. The resulting data were processed using selective bioinformatic criteria to identify regions of high-confidence HGSOC-specific differential methylation. Quantitative methylation-specific real-time PCR (qMSP) assays were then developed for 8 of the top-performing regions and analytically validated in a cohort of 90 tissue samples. Lastly, qMSP assays were used to assess and compare methylation in 30 laser-capture microdissected (LCM) fallopian tube epithelia samples obtained from cancer-free and serous tubal intraepithelial carcinoma (STIC) positive women. RESULTS: Bioinformatic selection identified 91 regions of robust, HGSOC-specific hypermethylation, 23 of which exhibited an area under the receiver-operator curve (AUC) value ≥ 0.9 in the discovery cohort. Seven of 8 top-performing regions demonstrated AUC values between 0.838 and 0.968 when analytically validated by qMSP in a 90-patient cohort. A panel of the 3 top-performing genes (c17orf64, IRX2, and TUBB6) was able to perfectly discriminate HGSOC (AUC 1.0). Hypermethylation within these loci was found exclusively in LCM fallopian tube epithelia from women with STIC lesions, but not in cancer-free fallopian tubes. CONCLUSIONS: A panel of methylation biomarkers can be used to accurately identify HGSOC, even at precursor stages of the disease.
Affiliation(s)
- Thomas R Pisanic
- Johns Hopkins Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland.
| | - Leslie M Cope
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Oncology and Biostatistics, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Shiou-Fu Lin
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Ting-Tai Yen
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Pornpat Athamanolap
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Ryoichi Asaka
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Kentaro Nakayama
- Department of Obstetrics and Gynecology, Shimane University School of Medicine, Izumo, Japan
| | - Amanda N Fader
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Tza-Huei Wang
- Johns Hopkins Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland.,Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Department of Mechanical Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Ie-Ming Shih
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Tian-Li Wang
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.,Departments of Gynecology and Obstetrics and Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
|
25
|
Shen L, Zhu J, Robert Li SY, Fan X. Detect differentially methylated regions using non-homogeneous hidden Markov model for methylation array data. Bioinformatics 2018; 33:3701-3708. [PMID: 29036320 DOI: 10.1093/bioinformatics/btx467] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 02/26/2017] [Accepted: 07/18/2017] [Indexed: 12/12/2022] Open
Abstract
Motivation: DNA methylation is an important epigenetic mechanism in gene regulation and the detection of differentially methylated regions (DMRs) is enthralling for many disease studies. There are several aspects that we can improve over existing DMR detection methods: (i) methylation statuses of nearby CpG sites are highly correlated, but this fact has seldom been modelled rigorously due to the uneven spacing; (ii) it is practically important to be able to handle both paired and unpaired samples; and (iii) the capability to detect DMRs from a single pair of samples is demanded. Results: We present DMRMark (DMR detection based on non-homogeneous hidden Markov model), a novel Bayesian framework for detecting DMRs from methylation array data. It combines the constrained Gaussian mixture model that incorporates the biological knowledge with the non-homogeneous hidden Markov model that models spatial correlation. Unlike existing methods, our DMR detection is achieved without predefined boundaries or decision windows. Furthermore, our method can detect DMRs from a single pair of samples and can also incorporate unpaired samples. Both simulation studies and real datasets from The Cancer Genome Atlas showed the significant improvement of DMRMark over other methods. Availability and implementation: DMRMark is freely available as an R package at the CRAN R package repository. Contact: xfan@cuhk.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
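One way to see what "non-homogeneous" means for unevenly spaced CpG sites is a transition probability that depends on the gap between neighbouring probes. The sketch below is an illustration only, not DMRMark's parameterization (which the abstract does not give): it assumes a symmetric two-state continuous-time Markov chain (for example DMR and non-DMR), and the function names and `rate_per_bp` value are hypothetical.

```python
import math

def stay_probability(distance_bp, rate_per_bp):
    """P(same hidden state at the next CpG) for a symmetric two-state chain."""
    return 0.5 + 0.5 * math.exp(-2.0 * rate_per_bp * distance_bp)

def transition_matrices(positions, rate_per_bp=1e-3):
    """One 2x2 transition matrix per gap between consecutive CpG positions."""
    matrices = []
    for left, right in zip(positions, positions[1:]):
        stay = stay_probability(right - left, rate_per_bp)
        matrices.append([[stay, 1.0 - stay], [1.0 - stay, stay]])
    return matrices

# Hypothetical probe coordinates with gaps of 10 bp, 490 bp and 7500 bp.
positions = [1000, 1010, 1500, 9000]
matrices = transition_matrices(positions)
```

Closely spaced probes are almost forced to share a state, while distant probes are nearly independent, which is how spatial correlation can be modelled without fixed windows.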
Affiliation(s)
- Linghao Shen
- Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong
| | - Jun Zhu
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York University, New York, NY, USA
| | - Shuo-Yen Robert Li
- University of Electronic Science and Technology of China, Sichuan, China
| | - Xiaodan Fan
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong
|
26
|
Zhang W, Feng H, Wu H, Zheng X. Accounting for tumor purity improves cancer subtype classification from DNA methylation data. Bioinformatics 2018; 33:2651-2657. [PMID: 28472248 DOI: 10.1093/bioinformatics/btx303] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/25/2016] [Accepted: 05/03/2017] [Indexed: 11/12/2022] Open
Abstract
Motivation: Tumor sample classification has long been an important task in cancer research. Classifying tumors into different subtypes greatly benefits therapeutic development and facilitates application of precision medicine on patients. In practice, solid tumor tissue samples obtained from clinical settings are always mixtures of cancer and normal cells. Thus, the data obtained from these samples are mixed signals. The 'tumor purity', or the percentage of cancer cells in cancer tissue sample, will bias the clustering results if not properly accounted for. Results: In this article, we developed a model-based clustering method and an R function which uses DNA methylation microarray data to infer tumor subtypes with the consideration of tumor purity. Simulation studies and the analyses of The Cancer Genome Atlas data demonstrate improved results compared with existing methods. Availability and implementation: InfiniumClust is part of R package InfiniumPurify, which is freely available from CRAN (https://cran.r-project.org/web/packages/InfiniumPurify/index.html). Contact: hao.wu@emory.edu or xqzheng@shnu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
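The "mixed signal" problem can be made concrete with a linear mixing model for methylation beta values. The sketch below is not InfiniumClust, which is a model-based clustering method; it only shows, under the assumption that the observed beta is a purity-weighted average of cancer and normal methylation, how the cancer component could be recovered when purity and a normal reference are known. The function name and values are hypothetical.

```python
def deconvolve_tumor_beta(observed, normal, purity):
    """Solve observed = purity*tumor + (1-purity)*normal per CpG, clipped to [0, 1]."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return [min(1.0, max(0.0, (o - (1.0 - purity) * n) / purity))
            for o, n in zip(observed, normal)]

# Hypothetical beta values at three CpG sites.
normal = [0.10, 0.80, 0.50]
true_tumor = [0.90, 0.20, 0.50]
purity = 0.6
# Simulate what an array would report for a sample that is 60% cancer cells.
observed = [purity * t + (1.0 - purity) * n for t, n in zip(true_tumor, normal)]
recovered = deconvolve_tumor_beta(observed, normal, purity)
```

Two tumors with the same cancer-cell methylation but different purity give different observed values, which is why clustering on raw beta values can group samples by purity rather than subtype.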
Affiliation(s)
- Weiwei Zhang
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, China.,School of Science, East China University of Technology, Nanchang, Jiangxi 330013, China
| | - Hao Feng
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
| | - Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
| | - Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
|
27
|
Grasso CS, Giannakis M, Wells DK, Hamada T, Mu XJ, Quist M, Nowak JA, Nishihara R, Qian ZR, Inamura K, Morikawa T, Nosho K, Abril-Rodriguez G, Connolly C, Escuin-Ordinas H, Geybels MS, Grady WM, Hsu L, Hu-Lieskovan S, Huyghe JR, Kim YJ, Krystofinski P, Leiserson MDM, Montoya DJ, Nadel BB, Pellegrini M, Pritchard CC, Puig-Saus C, Quist EH, Raphael BJ, Salipante SJ, Shin DS, Shinbrot E, Shirts B, Shukla S, Stanford JL, Sun W, Tsoi J, Upfill-Brown A, Wheeler DA, Wu CJ, Yu M, Zaidi SH, Zaretsky JM, Gabriel SB, Lander ES, Garraway LA, Hudson TJ, Fuchs CS, Ribas A, Ogino S, Peters U. Genetic Mechanisms of Immune Evasion in Colorectal Cancer. Cancer Discov 2018; 8:730-749. [PMID: 29510987 DOI: 10.1158/2159-8290.cd-17-1327] [Citation(s) in RCA: 375] [Impact Index Per Article: 53.6] [Received: 11/27/2017] [Revised: 02/13/2018] [Accepted: 02/27/2018] [Indexed: 12/16/2022]
Abstract
To understand the genetic drivers of immune recognition and evasion in colorectal cancer, we analyzed 1,211 colorectal cancer primary tumor samples, including 179 classified as microsatellite instability-high (MSI-high). This set includes The Cancer Genome Atlas colorectal cancer cohort of 592 samples, completed and analyzed here. MSI-high, a hypermutated, immunogenic subtype of colorectal cancer, had a high rate of significantly mutated genes in important immune-modulating pathways and in the antigen presentation machinery, including biallelic losses of B2M and HLA genes due to copy-number alterations and copy-neutral loss of heterozygosity. WNT/β-catenin signaling genes were significantly mutated in all colorectal cancer subtypes, and activated WNT/β-catenin signaling was correlated with the absence of T-cell infiltration. This large-scale genomic analysis of colorectal cancer demonstrates that MSI-high cases frequently undergo an immunoediting process that provides them with genetic events allowing immune escape despite high mutational load and frequent lymphocytic infiltration and, furthermore, that colorectal cancer tumors have genetic and methylation events associated with activated WNT signaling and T-cell exclusion. Significance: This multi-omic analysis of 1,211 colorectal cancer primary tumors reveals that it should be possible to better monitor resistance in the 15% of cases that respond to immune blockade therapy and also to use WNT signaling inhibitors to reverse immune exclusion in the 85% of cases that currently do not. Cancer Discov; 8(6); 730-49. ©2018 AACR. This article is highlighted in the In This Issue feature, p. 663.
Affiliation(s)
- Catherine S Grasso
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Marios Giannakis
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Daniel K Wells
- Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Tsuyoshi Hamada
- Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Xinmeng Jasmine Mu
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Michael Quist
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Jonathan A Nowak
- Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
| | - Reiko Nishihara
- Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts.,Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts.,Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.,Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.,Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Zhi Rong Qian
- Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Kentaro Inamura
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts
| | - Teppei Morikawa
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts
| | - Katsuhiko Nosho
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts
| | - Gabriel Abril-Rodriguez
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Charles Connolly
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Helena Escuin-Ordinas
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Milan S Geybels
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - William M Grady
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington.,Department of Medicine, University of Washington School of Medicine, Seattle, Washington
| | - Li Hsu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Siwen Hu-Lieskovan
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Yeon Joo Kim
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Paige Krystofinski
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Mark D M Leiserson
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Dennis J Montoya
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, Los Angeles, California
| | - Brian B Nadel
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, Los Angeles, California
| | - Matteo Pellegrini
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, Los Angeles, California
| | - Colin C Pritchard
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
| | - Cristina Puig-Saus
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Elleanor H Quist
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Ben J Raphael
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Stephen J Salipante
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
| | - Daniel Sanghoon Shin
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Eve Shinbrot
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
| | - Brian Shirts
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
| | - Sachet Shukla
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Department of Statistics, Iowa State University, Ames, Iowa
| | - Janet L Stanford
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington.,Department of Epidemiology, School of Public Health, University of Washington, Seattle, Washington
| | - Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Jennifer Tsoi
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, California
| | - Alexander Upfill-Brown
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
| | - Catherine J Wu
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Ming Yu
- Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Syed H Zaidi
- Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
| | - Jesse M Zaretsky
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | | | - Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Levi A Garraway
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts.,Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Thomas J Hudson
- Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada.,AbbVie Inc., Redwood City, California
| | - Charles S Fuchs
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts.,Yale Cancer Center, New Haven, Connecticut.,Department of Medicine, Yale School of Medicine, New Haven, Connecticut.,Smilow Cancer Hospital, New Haven, Connecticut
| | - Antoni Ribas
- Department of Medicine, Division of Hematology-Oncology, University of California, Los Angeles, and the Jonsson Comprehensive Cancer Center, Los Angeles, California.,Parker Institute for Cancer Immunotherapy, San Francisco, California
| | - Shuji Ogino
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, Massachusetts.,Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts.,Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington.,Department of Epidemiology, School of Public Health, University of Washington, Seattle, Washington
|
28
|
Qin Y, Feng H, Chen M, Wu H, Zheng X. InfiniumPurify: An R package for estimating and accounting for tumor purity in cancer methylation research. Genes Dis 2018; 5:43-45. [PMID: 30258934 PMCID: PMC6147081 DOI: 10.1016/j.gendis.2018.02.003] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 02/02/2018] [Accepted: 02/04/2018] [Indexed: 01/07/2023] Open
Abstract
The proportion of cancer cells in a tumor sample, known as tumor purity, is an intrinsic property of tumor samples and can strongly influence a variety of analyses, including differential methylation, subclonal deconvolution and subtype clustering. InfiniumPurify is an integrated R package for estimating and accounting for tumor purity based on DNA methylation Infinium 450k array data. InfiniumPurify has three main functions: getPurity, InfiniumDMC and InfiniumClust, which respectively infer tumor purity, perform differential methylation analysis, and cluster tumor samples while accounting for estimated or user-provided tumor purities. The InfiniumPurify package provides a comprehensive analysis of tumor purity in cancer methylation research.
Affiliation(s)
- Yufang Qin
- College of Information Technology, Shanghai Ocean University, Shanghai, 201306, PR China.,Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, 201306, PR China
| | - Hao Feng
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Georgia 30322, USA
| | - Ming Chen
- College of Information Technology, Shanghai Ocean University, Shanghai, 201306, PR China.,Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, 201306, PR China
| | - Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Georgia 30322, USA
| | - Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai, 200234, PR China
|
29
|
Dou H, Fang Y, Zheng X. Universal informative CpG sites for inferring tumor purity from DNA methylation microarray data. J Bioinform Comput Biol 2018; 16:1750030. [PMID: 29347875 DOI: 10.1142/s0219720017500305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Tumor purity is an intrinsic property of tumor samples and can severely affect many types of data analysis. We have previously developed a statistical method, InfiniumPurify, which could infer the purity of a tumor sample given its tumor type (available in TCGA) or a set of informative CpG (iDMC) sites. However, in clinical practice, researchers may focus on a specific tumor type that is not included in TCGA, with too few samples to identify reliable iDMCs. This greatly restricts the application of InfiniumPurify in cancer research. In this paper, we propose an updated version of InfiniumPurify (termed uiInfiniumPurify) by identifying a universal set of iDMCs (uiDMCs) and redesigning the algorithm to determine the hyper- and hypo-methylation status of each uiDMC. Using this method, we estimated tumor purities of 8830 tumor samples from TCGA. The results show that our estimates are highly consistent with those from other available methods. Consequently, the updated uiInfiniumPurify can be applied to a single sample (or a few samples) of interest whose tumor type is not included in TCGA. This characteristic will greatly broaden the application of uiInfiniumPurify in cancer research.
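Why the hyper- or hypo-methylation status of each informative site matters can be shown with an idealized example. The sketch below is not the uiInfiniumPurify algorithm, whose estimator the abstract does not describe. It assumes that a hypermethylated informative site is fully methylated in cancer cells and unmethylated in normal cells (and the reverse for hypomethylated sites), so that beta or 1 - beta at each site approximates purity; the median is used as a simple summary, and all names and values are hypothetical.

```python
import statistics

def estimate_purity(betas, is_hyper):
    """Median of beta (hyper sites) and 1 - beta (hypo sites) as a rough purity estimate."""
    transformed = [b if hyper else 1.0 - b for b, hyper in zip(betas, is_hyper)]
    return statistics.median(transformed)

# Hypothetical informative sites for one sample with purity near 0.6.
is_hyper = [True, True, False, False, True]
betas = [0.62, 0.58, 0.41, 0.39, 0.60]
purity = estimate_purity(betas, is_hyper)
```

If the status of a site were assigned the wrong way round, its transformed value would point to 1 minus the purity, which is why determining the status per site is a central step.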
Affiliation(s)
- Haixia Dou
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, P. R. China
| | - Yun Fang
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, P. R. China
| | - Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, P. R. China
|
30
|
Wen Y, Wei Y, Zhang S, Li S, Liu H, Wang F, Zhao Y, Zhang D, Zhang Y. Cell subpopulation deconvolution reveals breast cancer heterogeneity based on DNA methylation signature. Brief Bioinform 2017; 18:426-440. [PMID: 27016391 DOI: 10.1093/bib/bbw028] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/08/2015] [Indexed: 12/21/2022] Open
Abstract
Tumour heterogeneity describes the coexistence of divergent tumour cell clones within tumours, which is often caused by underlying epigenetic changes. DNA methylation is commonly regarded as a significant regulator that differs across cells and tissues. In this study, we comprehensively reviewed research progress on estimating tumour heterogeneity. Bioinformatics-based analysis of DNA methylation has revealed the evolutionary relationships between breast cancer cell lines and tissues. Further analysis of the DNA methylation profiles in 33 breast cancer-related cell lines identified cell line-specific methylation patterns. Next, we reviewed the computational methods for inferring clonal evolution of tumours from different perspectives and then proposed a deconvolution strategy for modelling the dynamics of subclonal cell populations in breast cancer tissues based on DNA methylation. Further analysis of simulated cancer tissues and real cell lines revealed that this approach exhibits satisfactory performance and relative stability in estimating the composition and proportions of cellular subpopulations. The application of this strategy to breast cancer individuals in The Cancer Genome Atlas identified different cellular subpopulations with distinct molecular phenotypes. Moreover, the current and potential future applications of this deconvolution strategy to clinical breast cancer research are discussed, with emphasis on the DNA methylation-based recognition of intra-tumour heterogeneity. The wide use of these methods for estimating heterogeneity in further clinical cohorts will improve our understanding of neoplastic progression and the design of therapeutic interventions for treating breast cancer and other malignancies.
|
31
|
Statistical and integrative system-level analysis of DNA methylation data. Nat Rev Genet 2017; 19:129-147. [PMID: 29129922 DOI: 10.1038/nrg.2017.86] [Citation(s) in RCA: 197] [Impact Index Per Article: 24.6] [Indexed: 02/01/2023]
Abstract
Epigenetics plays a key role in cellular development and function. Alterations to the epigenome are thought to capture and mediate the effects of genetic and environmental risk factors on complex disease. Currently, DNA methylation is the only epigenetic mark that can be measured reliably and genome-wide in large numbers of samples. This Review discusses some of the key statistical challenges and algorithms associated with drawing inferences from DNA methylation data, including cell-type heterogeneity, feature selection, reverse causation and system-level analyses that require integration with other data types such as gene expression, genotype, transcription factor binding and other epigenetic information.
|
32
|
Liu XS, Mardis ER. Applications of Immunogenomics to Cancer. Cell 2017; 168:600-612. [PMID: 28187283 DOI: 10.1016/j.cell.2017.01.014] [Citation(s) in RCA: 149] [Impact Index Per Article: 18.6] [Received: 12/01/2016] [Revised: 01/10/2017] [Accepted: 01/10/2017] [Indexed: 01/05/2023]
Abstract
Cancer immunogenomics originally was framed by research supporting the hypothesis that cancer mutations generated novel peptides seen as "non-self" by the immune system. The search for these "neoantigens" has been facilitated by the combination of new sequencing technologies, specialized computational analyses, and HLA binding predictions that evaluate somatic alterations in a cancer genome and interpret their ability to produce an immune-stimulatory peptide. The resulting information can characterize a tumor's neoantigen load, its cadre of infiltrating immune cell types, the T or B cell receptor repertoire, and direct the design of a personalized therapeutic.
Affiliation(s)
- X Shirley Liu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, 450 Brookline Ave, Boston MA 02215, USA.
| | - Elaine R Mardis
- Institute for Genomic Medicine, Nationwide Children's Hospital, and The Ohio State University College of Medicine, 575 Children's Crossroad, Columbus OH 43205, USA.
|
33
|
Zheng X, Zhang N, Wu HJ, Wu H. Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome Biol 2017; 18:17. [PMID: 28122605 PMCID: PMC5267453 DOI: 10.1186/s13059-016-1143-5] [Citation(s) in RCA: 93] [Impact Index Per Article: 11.6] [Received: 09/20/2016] [Accepted: 12/20/2016] [Indexed: 01/03/2023] Open
Abstract
We present a set of statistical methods for the analysis of DNA methylation microarray data, which account for tumor purity. These methods are an extension of our previously developed method for purity estimation; our updated method is flexible, efficient, and does not require data from reference samples or matched normal controls. We also present a method for incorporating purity information for differential methylation analysis. In addition, we propose a control-free differential methylation calling method when normal controls are not available. Extensive analyses of TCGA data demonstrate that our methods provide accurate results. All methods are implemented in InfiniumPurify.
Affiliation(s)
- Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai, 200234, China.
| | - Naiqian Zhang
- Department of Mathematics, Weifang University, Weifang, Shandong, 261061, China
| | - Hua-Jun Wu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, MA, 02215, USA
| | - Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, Georgia, 30322, USA.
|
34
|
Han Y, He X. Integrating Epigenomics into the Understanding of Biomedical Insight. Bioinform Biol Insights 2016; 10:267-289. [PMID: 27980397 PMCID: PMC5138066 DOI: 10.4137/bbi.s38427] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/17/2016] [Revised: 11/01/2016] [Accepted: 11/06/2016] [Indexed: 12/13/2022] Open
Abstract
Epigenetics is one of the most rapidly expanding fields in biomedical research, and the popularity of high-throughput next-generation sequencing (NGS) highlights the accelerating speed of epigenomics discovery over the past decade. Epigenetics studies the heritable phenotypes resulting from chromatin changes that occur without alteration of the DNA sequence. Epigenetic factors and their interactive network regulate almost all of the fundamental biological processes, and incorrect epigenetic information may lead to complex diseases. A comprehensive, genome-wide understanding of epigenetic mechanisms, their interactions, and their alterations in health and disease has become a priority in biological research. Bioinformatics is expected to make a remarkable contribution to this purpose, especially in processing and interpreting large-scale NGS datasets. In this review, we introduce the pioneering achievements of epigenetics in health and complex diseases; next, we give a systematic review of epigenomics data generation, summarize public resources and integrative analysis approaches, and finally outline the challenges and future directions in computational epigenomics.
Affiliation(s)
- Yixing Han
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, USA. Present address: Genetics and Biochemistry Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Ximiao He
- Laboratory of Metabolism, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA. Present address: Department of Medical Genetics, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
|
35
|
Bernhart SH, Kretzmer H, Holdt LM, Jühling F, Ammerpohl O, Bergmann AK, Northoff BH, Doose G, Siebert R, Stadler PF, Hoffmann S. Changes of bivalent chromatin coincide with increased expression of developmental genes in cancer. Sci Rep 2016; 6:37393. [PMID: 27876760 PMCID: PMC5120258 DOI: 10.1038/srep37393] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 06/01/2016] [Accepted: 10/27/2016] [Indexed: 02/08/2023] Open
Abstract
Bivalent (poised or paused) chromatin comprises activating and repressing histone modifications at the same location. This combination of epigenetic marks at promoter or enhancer regions keeps genes expressed at low levels but poised for rapid activation. Typically, DNA at bivalent promoters is only lowly methylated in normal cells, but frequently shows elevated methylation levels in cancer samples. Here, we developed a universal classifier built from chromatin data that can identify cancer samples solely from hypermethylation of bivalent chromatin. Tested on over 7,000 DNA methylation data sets from several cancer types, it reaches an AUC of 0.92. Although higher levels of DNA methylation are often associated with transcriptional silencing, counter-intuitive positive statistical dependencies between DNA methylation and expression levels have been recently reported for two cancer types. Here, we re-analyze combined expression and DNA methylation data sets, comprising over 5,000 samples, and demonstrate that the conjunction of hypermethylation of bivalent chromatin and up-regulation of the corresponding genes is a general phenomenon in cancer. This up-regulation affects many developmental genes and transcription factors, including dozens of homeobox genes and other genes implicated in cancer. Thus, we reason that the disturbance of bivalent chromatin may be intimately linked to tumorigenesis.
Affiliation(s)
- Stephan H Bernhart
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany
| | - Helene Kretzmer
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany
| | - Lesca M Holdt
- Ludwig-Maximilians-University, Institute of Laboratory Medicine, Munich, 81377, Germany
| | - Frank Jühling
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany; Inserm, U1110 - Institut de Recherche sur les Maladies Virales et Hépatiques, Strasbourg, 67000, France; Université de Strasbourg, Strasbourg, 67000, France
| | - Ole Ammerpohl
- Christian Albrechts University & University Hospital Schleswig-Holstein - Campus Kiel, Institute of Human Genetics, Kiel, 24105, Germany
| | - Anke K Bergmann
- Christian Albrechts University & University Hospital Schleswig-Holstein - Campus Kiel, Institute of Human Genetics, Kiel, 24105, Germany; Christian Albrechts University Kiel & University Hospital Schleswig-Holstein - Campus Kiel, Department of Pediatrics, Kiel, 24105, Germany
| | - Bernd H Northoff
- Ludwig-Maximilians-University, Institute of Laboratory Medicine, Munich, 81377, Germany
| | - Gero Doose
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany
| | - Reiner Siebert
- Christian Albrechts University & University Hospital Schleswig-Holstein - Campus Kiel, Institute of Human Genetics, Kiel, 24105, Germany; Ulm University & Ulm University Medical Center, Institute for Human Genetics, Ulm, 89081, Germany
| | - Peter F Stadler
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany; Leipzig University, LIFE - Leipzig Research Center for Civilization Diseases, Leipzig, 04107, Germany; University of Vienna, Department of Theoretical Chemistry, Vienna, 1090, Austria; Max-Planck-Institute for Mathematics in Sciences, Leipzig, 04103, Germany; Santa Fe Institute, Santa Fe, NM 87501, USA
| | - Steve Hoffmann
- Leipzig University, Chair of Bioinformatics, Leipzig, 04107, Germany; Leipzig University, Transcriptome Bioinformatics Group - Interdisciplinary Center for Bioinformatics, Leipzig, 04107, Germany; Leipzig University, LIFE - Leipzig Research Center for Civilization Diseases, Leipzig, 04107, Germany
|
36
|
Weinhold L, Wahl S, Pechlivanis S, Hoffmann P, Schmid M. A statistical model for the analysis of beta values in DNA methylation studies. BMC Bioinformatics 2016; 17:480. [PMID: 27875981 PMCID: PMC5120494 DOI: 10.1186/s12859-016-1347-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 04/05/2016] [Accepted: 11/09/2016] [Indexed: 12/31/2022] Open
Abstract
Background: The analysis of DNA methylation is a key component in the development of personalized treatment approaches. A common way to measure DNA methylation is the calculation of beta values, which are bounded variables of the form M/(M+U) that are generated by Illumina’s 450k BeadChip array. The statistical analysis of beta values is considered to be challenging, as traditional methods for the analysis of bounded variables, such as M-value regression and beta regression, are based on regularity assumptions that are often too strong to adequately describe the distribution of beta values. Results: We develop a statistical model for the analysis of beta values that is derived from a bivariate gamma distribution for the signal intensities M and U. By allowing for possible correlations between M and U, the proposed model explicitly takes into account the data-generating process underlying the calculation of beta values. Using simulated data and a real sample of DNA methylation data from the Heinz Nixdorf Recall cohort study, we demonstrate that the proposed model fits our data significantly better than beta regression and M-value regression. Conclusion: The proposed model contributes to an improved identification of associations between beta values and covariates such as clinical variables and lifestyle factors in epigenome-wide association studies. It is as easy to apply to a sample of beta values as beta regression and M-value regression.
Affiliation(s)
- Leonie Weinhold
- Department of Medical Biometry, Informatics and Epidemiology, University of Bonn, Sigmund-Freud-Str. 25, Bonn, D-53127, Germany.
| | - Simone Wahl
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, Neuherberg, D-85764, Germany
| | - Sonali Pechlivanis
- Department of Medical Informatics, Biometry and Epidemiology, University Hospital Essen, Hufelandstr. 55, Essen, D-45122, Germany
| | - Per Hoffmann
- Human Genomics Research Group, Department of Biomedicine, University Hospital Basel, Hebelstr. 20, Basel, CH-4031, Switzerland
| | - Matthias Schmid
- Department of Medical Biometry, Informatics and Epidemiology, University of Bonn, Sigmund-Freud-Str. 25, Bonn, D-53127, Germany
|
37
|
Ruan P, Shen J, Santella RM, Zhou S, Wang S. NEpiC: a network-assisted algorithm for epigenetic studies using mean and variance combined signals. Nucleic Acids Res 2016; 44:e134. [PMID: 27302130 PMCID: PMC5027497 DOI: 10.1093/nar/gkw546] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 12/22/2015] [Accepted: 06/04/2016] [Indexed: 12/13/2022] Open
Abstract
DNA methylation plays an important role in many biological processes. Existing epigenome-wide association studies (EWAS) have successfully identified aberrantly methylated genes in many diseases and disorders, with most studies focusing on analysing methylation sites one at a time. Incorporating prior biological information such as biological networks has proven to be powerful in identifying disease-associated genes in both gene expression studies and genome-wide association studies (GWAS) but has been understudied in EWAS. Although recent studies have noticed that methylation variation differs between groups, only a few existing methods consider variance signals in DNA methylation studies. Here, we present a network-assisted algorithm, NEpiC, that combines both mean and variance signals in searching for differentially methylated sub-networks using the protein–protein interaction (PPI) network. In simulation studies, we demonstrate the power gain from using both the prior biological information and variance signals compared with using only one of the two or neither. Applications to several DNA methylation datasets from the Cancer Genome Atlas (TCGA) project and DNA methylation data on hepatocellular carcinoma (HCC) from the Columbia University Medical Center (CUMC) suggest that the proposed NEpiC algorithm identifies more cancer-related genes and generates better replication results.
Affiliation(s)
- Peifeng Ruan
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China
| | - Jing Shen
- Department of Environmental Health Science, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
| | - Regina M Santella
- Department of Environmental Health Science, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
| | - Shuigeng Zhou
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China
| | - Shuang Wang
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
|
38
|
Wang F, Zhang N, Wang J, Wu H, Zheng X. Tumor purity and differential methylation in cancer epigenomics. Brief Funct Genomics 2016; 15:408-419. [PMID: 27199459 DOI: 10.1093/bfgp/elw016] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open
Abstract
DNA methylation is an epigenetic modification of the DNA molecule that plays a vital role in gene expression regulation. It is not only involved in many basic biological processes, but also considered an important factor in tumorigenesis and other human diseases. The study of DNA methylation has been an active field in cancer epigenomics research. With the advances of high-throughput technologies and the accumulation of enormous amounts of data, method development for analyzing these data has attracted tremendous interest in the fields of computational biology and bioinformatics. In this review, we systematically summarize the recent developments of computational methods and software tools in high-throughput methylation data analysis, with a focus on two aspects: differential methylation analysis and tumor purity estimation in cancer studies.
|