1
|
Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338 PMCID: PMC10805075 DOI: 10.1093/bib/bbad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] Open
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks.
|
2
|
The Many Faces of Oligoadenylate Synthetases. J Interferon Cytokine Res 2023; 43:487-494. [PMID: 37751211 PMCID: PMC10654648 DOI: 10.1089/jir.2023.0098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 08/13/2023] [Indexed: 09/27/2023] Open
Abstract
2'-5' Oligoadenylate synthetases (OAS) are interferon-stimulated genes that are best known for protecting hosts from viral infections. They are evolutionarily related to an ancient family of nucleotidyltransferases, which are primarily involved in pathogen sensing and the innate immune response. The classical function of OAS proteins involves double-stranded RNA-stimulated polymerization of adenosine triphosphate into 2'-5' oligoadenylates (2-5A), which can activate the latent RNase (RNase L) to degrade RNA. However, evidence accumulated over the years has suggested alternative modes of antiviral function for several OAS family proteins. Furthermore, recent studies have connected some OAS proteins with wider functions beyond viral infection. Here, we review some of the canonical and noncanonical functions of OAS proteins and their mechanisms.
|
3
|
InDEP: an interpretable machine learning approach to predict cancer driver genes from multi-omics data. Brief Bioinform 2023; 24:bbad318. [PMID: 37649392 DOI: 10.1093/bib/bbad318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Revised: 06/14/2023] [Accepted: 08/16/2023] [Indexed: 09/01/2023] Open
Abstract
Cancer driver genes are critical in driving tumor cell growth, and precisely identifying these genes is crucial in advancing our understanding of cancer pathogenesis and developing targeted cancer drugs. Although current methods for discovering cancer driver genes mainly rely on integrating multi-omics data, many existing models are overly complex, and their results are difficult to interpret accurately. This study aims to address this issue by introducing InDEP, an interpretable machine learning framework based on cascade forests. InDEP is designed with easy-to-interpret features, cascade forests based on decision trees and a KernelSHAP module that enables fine-grained post-hoc interpretation. Integrating multi-omics data, InDEP can identify essential features of classified driver genes at both the gene and cancer-type levels. The framework accurately identifies driver genes, discovers new patterns that make genes driver genes and refines the cancer driver gene catalog. In comparison with state-of-the-art methods, InDEP proved to be more accurate on the test set and identified reliable candidate driver genes. Mutational features were the primary drivers of InDEP's driver gene identification, with other omics features also contributing. At the gene level, the framework concluded that substitution-type mutations were the main reason most genes were identified as driver genes. InDEP's ability to identify reliable candidate driver genes opens up new avenues for precision oncology and discovering new biomedical knowledge. This framework can help advance cancer research by providing an interpretable method for identifying cancer driver genes and their contribution to cancer pathogenesis, facilitating the development of targeted cancer drugs.
|
4
|
Multi-Omics Data Analysis Identifies Prognostic Biomarkers across Cancers. Med Sci (Basel) 2023; 11:44. [PMID: 37489460 PMCID: PMC10366886 DOI: 10.3390/medsci11030044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 06/18/2023] [Accepted: 06/20/2023] [Indexed: 07/26/2023] Open
Abstract
Combining omics data from different layers using integrative methods provides a better understanding of the biology of a complex disease such as cancer. The discovery of biomarkers related to cancer development or prognosis helps to find more effective treatment options. This study integrates multi-omics data of different cancer types with a network-based approach to explore common gene modules among different tumors by running community detection methods on the integrated network. The common modules were evaluated by several biological metrics adapted to cancer. Then, a new prognostic scoring method was developed by weighting mRNA expression, methylation, and mutation status of genes. The survival analysis pointed out statistically significant results for GNG11, CBX2, CDKN3, ARHGEF10, CLN8, SEC61G and PTDSS1 genes. The literature search reveals that the identified biomarkers are associated with the same or different types of cancers. Our method does not only identify known cancer-specific biomarker genes, but also proposes new potential biomarkers. Thus, this study provides a rationale for identifying new gene targets and expanding treatment options across cancer types.
|
5
|
Microbial Synthesis of Heme b: Biosynthetic Pathways, Current Strategies, Detection, and Future Prospects. Molecules 2023; 28:molecules28083633. [PMID: 37110868 PMCID: PMC10144233 DOI: 10.3390/molecules28083633] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/28/2023] [Revised: 04/10/2023] [Accepted: 04/19/2023] [Indexed: 04/29/2023] Open
Abstract
Heme b, which is characterized by a ferrous ion and a porphyrin macrocycle, acts as a prosthetic group for many enzymes and contributes to various physiological processes. Consequently, it has wide applications in medicine, food, chemical production, and other burgeoning fields. Due to the shortcomings of chemical syntheses and bio-extraction techniques, alternative biotechnological methods have drawn increasing attention. In this review, we provide the first systematic summary of the progress in the microbial synthesis of heme b. Three different pathways are described in detail, and the metabolic engineering strategies for the biosynthesis of heme b via the protoporphyrin-dependent and coproporphyrin-dependent pathways are highlighted. The UV spectrophotometric detection of heme b is gradually being replaced by newly developed detection methods, such as HPLC and biosensors, and for the first time, this review summarizes the methods used in recent years. Finally, we discuss the future prospects, with an emphasis on the potential strategies for improving the biosynthesis of heme b and understanding the regulatory mechanisms for building efficient microbial cell factories.
|
6
|
Bayesian Machine Learning Enables Identification of Transcriptional Network Disruptions Associated with Drug-Resistant Prostate Cancer. Cancer Res 2023; 83:1361-1380. [PMID: 36779846 PMCID: PMC10102853 DOI: 10.1158/0008-5472.can-22-1910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/29/2022] [Accepted: 02/08/2023] [Indexed: 02/14/2023]
Abstract
Survival rates of patients with metastatic castration-resistant prostate cancer (mCRPC) are low due to lack of response or acquired resistance to available therapies, such as abiraterone (Abi). A better understanding of the underlying molecular mechanisms is needed to identify effective targets to overcome resistance. Given the complexity of the transcriptional dynamics in cells, differential gene expression analysis of bulk transcriptomics data cannot provide sufficient detailed insights into resistance mechanisms. Incorporating network structures could overcome this limitation to provide a global and functional perspective of Abi resistance in mCRPC. Here, we developed TraRe, a computational method using sparse Bayesian models to examine phenotypically driven transcriptional mechanistic differences at three distinct levels: transcriptional networks, specific regulons, and individual transcription factors (TF). TraRe was applied to transcriptomic data from 46 patients with mCRPC with Abi-response clinical data and uncovered abrogated immune response transcriptional modules that showed strong differential regulation in Abi-responsive compared with Abi-resistant patients. These modules were replicated in an independent mCRPC study. Furthermore, key rewiring predictions and their associated TFs were experimentally validated in two prostate cancer cell lines with different Abi-resistance features. Among them, ELK3, MXD1, and MYB played a differential role in cell survival in Abi-sensitive and Abi-resistant cells. Moreover, ELK3 regulated cell migration capacity, which could have a direct impact on mCRPC. Collectively, these findings shed light on the underlying transcriptional mechanisms driving Abi response, demonstrating that TraRe is a promising tool for generating novel hypotheses based on identified transcriptional network disruptions. 
SIGNIFICANCE The computational method TraRe built on Bayesian machine learning models for investigating transcriptional network structures shows that disruption of ELK3, MXD1, and MYB signaling cascades impacts abiraterone resistance in prostate cancer.
|
7
|
Identifying key multifunctional components shared by critical cancer and normal liver pathways via SparseGMM. CELL REPORTS METHODS 2023; 3:100392. [PMID: 36814838 PMCID: PMC9939431 DOI: 10.1016/j.crmeth.2022.100392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/16/2022] [Accepted: 12/21/2022] [Indexed: 01/19/2023]
Abstract
Despite the abundance of multimodal data, suitable statistical models that can improve our understanding of diseases with genetic underpinnings are challenging to develop. Here, we present SparseGMM, a statistical approach for gene regulatory network discovery. SparseGMM uses latent variable modeling with sparsity constraints to learn Gaussian mixtures from multiomic data. By combining coexpression patterns with a Bayesian framework, SparseGMM quantitatively measures confidence in regulators and uncertainty in target gene assignment by computing gene entropy. We apply SparseGMM to liver cancer and normal liver tissue data and evaluate discovered gene modules in an independent single-cell RNA sequencing (scRNA-seq) dataset. SparseGMM identifies PROCR as a regulator of angiogenesis and PDCD1LG2 and HNF4A as regulators of immune response and blood coagulation in cancer. Furthermore, we show that more genes have significantly higher entropy in cancer compared with normal liver. Among high-entropy genes are key multifunctional components shared by critical pathways, including p53 and estrogen signaling.
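The idea of measuring uncertainty in target gene assignment through gene entropy can be illustrated with a small sketch. It assumes that each gene carries a vector of module-assignment probabilities and that entropy is the Shannon entropy of that vector; the function name `assignment_entropy` and the example probabilities are hypothetical and are not taken from SparseGMM itself.

```python
import math


def assignment_entropy(probs):
    """Shannon entropy (in bits) of one gene's module-assignment probabilities.

    `probs` is a hypothetical list of non-negative weights, one per module.
    The weights are normalized to sum to 1 before the entropy is computed.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # terms with zero probability contribute nothing
            entropy -= q * math.log2(q)
    return entropy


# A gene assigned almost entirely to one module has low entropy.
confident = assignment_entropy([0.97, 0.01, 0.01, 0.01])
# A gene spread evenly over four modules reaches the maximum, log2(4) bits.
ambiguous = assignment_entropy([0.25, 0.25, 0.25, 0.25])
print(confident, ambiguous)
```

Under this reading, a high-entropy gene is one whose assignment is split across several modules, which is consistent with the abstract's description of high-entropy genes as multifunctional components shared by several pathways.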
|
8
|
Artificial intelligence-based multi-omics analysis fuels cancer precision medicine. Semin Cancer Biol 2023; 88:187-200. [PMID: 36596352 DOI: 10.1016/j.semcancer.2022.12.009] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 11/17/2022] [Revised: 12/16/2022] [Accepted: 12/29/2022] [Indexed: 01/02/2023]
Abstract
With biotechnological advancements, innovative omics technologies are constantly emerging that have enabled researchers to access multi-layer information from the genome, epigenome, transcriptome, proteome, metabolome, and more. A wealth of omics technologies, including bulk and single-cell omics approaches, have made it possible to characterize different molecular layers at unprecedented scale and resolution, providing a holistic view of tumor behavior. Multi-omics analysis allows systematic interrogation of various molecular information at each biological layer while posing tricky challenges regarding how to extract valuable insights from the exponentially increasing amount of multi-omics data. Therefore, efficient algorithms are needed to reduce the dimensionality of the data while simultaneously dissecting the mysteries behind the complex biological processes of cancer. Artificial intelligence has demonstrated the ability to analyze complementary multi-modal data streams within the oncology realm. The coincident development of multi-omics technologies and artificial intelligence algorithms has fuelled the development of cancer precision medicine. Here, we present state-of-the-art omics technologies and outline a roadmap of multi-omics integration analysis using an artificial intelligence strategy. The advances made using artificial intelligence-based multi-omics approaches are described, especially concerning early cancer screening, diagnosis, response assessment, and prognosis prediction. Finally, we discuss the challenges faced in multi-omics analysis, along with tentative future trends in this field. With the increasing application of artificial intelligence in multi-omics analysis, we anticipate a paradigm shift toward precision medicine driven by artificial intelligence-based multi-omics technologies.
|
9
|
Multi-omics analysis: Paving the path toward achieving precision medicine in cancer treatment and immuno-oncology. Front Mol Biosci 2022; 9:962743. [PMID: 36304921 PMCID: PMC9595279 DOI: 10.3389/fmolb.2022.962743] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/06/2022] [Accepted: 09/21/2022] [Indexed: 11/13/2022] Open
Abstract
The acceleration of large-scale sequencing and the progress in high-throughput computational analyses, defined as omics, were a milestone in the comprehension of the biological processes in human health and diseases. In cancerology, the omics approach, initiated by genomics and transcriptomics studies, has revealed an incredible complexity with unsuspected molecular diversity within the same tumor type as well as spatial and temporal heterogeneity of tumors. The integration of multiple biological layers of omics studies brought oncology to a new paradigm, from tumor site classification to pan-cancer molecular classification, offering new therapeutic opportunities for precision medicine. In this review, we will provide a comprehensive overview of the latest innovations for multi-omics integration in oncology and summarize the largest multi-omics datasets available for adult and pediatric cancers. We will present multi-omics techniques for characterizing cancer biology and show how multi-omics data can be combined with clinical data for the identification of prognostic and treatment-specific biomarkers, opening the way to personalized therapy. To conclude, we will detail the newest strategies for dissecting the tumor immune environment and host–tumor interaction. We will explore the advances in immunomics and microbiomics for biomarker identification to guide therapeutic decisions in immuno-oncology.
|
10
|
SimiC enables the inference of complex gene regulatory dynamics across cell phenotypes. Commun Biol 2022; 5:351. [PMID: 35414121 PMCID: PMC9005655 DOI: 10.1038/s42003-022-03319-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2021] [Accepted: 03/24/2022] [Indexed: 11/08/2022] Open
Abstract
Single-cell RNA-Sequencing has the potential to provide deep biological insights by revealing complex regulatory interactions across diverse cell phenotypes at single-cell resolution. However, current single-cell gene regulatory network inference methods produce a single regulatory network per input dataset, limiting their capability to uncover complex regulatory relationships across related cell phenotypes. We present SimiC, a single-cell gene regulatory inference framework that overcomes this limitation by jointly inferring distinct, but related, gene regulatory dynamics per phenotype. We show that SimiC uncovers key regulatory dynamics missed by previously proposed methods across a range of systems, both model and non-model alike. In particular, SimiC was able to uncover CAR T cell dynamics after tumor recognition and key regulatory patterns on a regenerating liver, and was able to implicate glial cells in the generation of distinct behavioral states in honeybees. SimiC hence establishes a new approach to quantitating regulatory architectures between distinct cellular phenotypes, with far-reaching implications for systems biology.
|
11
|
Single-Cell Multiomics Techniques: From Conception to Applications. Front Cell Dev Biol 2022; 10:854317. [PMID: 35386194 PMCID: PMC8979110 DOI: 10.3389/fcell.2022.854317] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/13/2022] [Accepted: 02/14/2022] [Indexed: 01/16/2023] Open
Abstract
Recent advances in methods for single-cell analyses and barcoding strategies have led to considerable progress in research. The development of multiplexed assays offers the possibility to conduct parallel analyses of multiple factors and processes for comprehensive characterization of cellular and molecular states in health and disease. These technologies have expanded extremely rapidly in the past years and constantly evolve and provide better specificity, precision and resolution. This review summarizes recent progress in single-cell multiomics approaches, and focuses, in particular, on the most innovative techniques that integrate genome, epigenome and transcriptome profiling. It describes the methodologies, discusses their advantages and limitations, and explains how they have been applied to studies on cell heterogeneity and differentiation, and epigenetic reprogramming.
|
12
|
Multi-Omics Profiling of the Tumor Microenvironment. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:283-326. [DOI: 10.1007/978-3-030-91836-1_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
13
|
From DNA Copy Number Gains and Tumor Dependencies to Novel Therapeutic Targets for High-Risk Neuroblastoma. J Pers Med 2021; 11:1286. [PMID: 34945759 PMCID: PMC8707517 DOI: 10.3390/jpm11121286] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/11/2021] [Revised: 11/19/2021] [Accepted: 11/20/2021] [Indexed: 12/15/2022] Open
Abstract
Neuroblastoma is a pediatric tumor arising from the sympatho-adrenal lineage and a worldwide leading cause of childhood cancer-related deaths. About half of high-risk patients die from the disease while survivors suffer from multiple therapy-related side-effects. While neuroblastomas present with a low mutational burden, focal and large segmental DNA copy number aberrations are highly recurrent and associated with poor survival. It can be assumed that the affected chromosomal regions contain critical genes implicated in neuroblastoma biology and behavior. More specifically, evidence has emerged that several of these genes are implicated in tumor dependencies, thus potentially providing novel therapeutic entry points. In this review, we briefly outline the current status of recurrent DNA copy number aberrations in neuroblastoma and provide an overview of the genes affected by these genomic variants for which a direct role in neuroblastoma has been established. Several of these genes are implicated in networks that positively regulate MYCN expression or stability as well as cell cycle control and apoptosis. Finally, we summarize alternative approaches to identify and prioritize candidate copy-number driven dependency genes for neuroblastoma offering novel therapeutic opportunities.
Grants
- P30 CA008748 NCI NIH HHS
- G087221N, G.0507.12, G049720N,12U4718N, 11C3921N, 11J8313N, 12B5313N, 1514215N, 1197617N,1238420N, 12Q8322N, 3F018519, 12N6917N Fund for Scientific Research Flanders
- 2018-087, 2018-125, 2020-112 Belgian Foundation against Cancer
|
14
|
Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022] Open
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever, that cancer should not only be viewed as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches in which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations and network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
|
15
|
Integrated multi-omics analysis of ovarian cancer using variational autoencoders. Sci Rep 2021; 11:6265. [PMID: 33737557 PMCID: PMC7973750 DOI: 10.1038/s41598-021-85285-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 08/19/2020] [Accepted: 02/28/2021] [Indexed: 02/06/2023] Open
Abstract
Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high dimensional multi-omics data are generally imbalanced with too many molecular features and relatively few patient samples. This imbalance makes a DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution to balance high dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we did an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representations of omics data can be used in cancer prognosis.
Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE in most omics datasets.
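The Maximum Mean Discrepancy named in MMD-VAE can be sketched in a few lines. The example below is a generic, biased estimator of squared MMD with a Gaussian kernel, written in plain Python; it is not the architecture from the paper. The function names, the bandwidth value and the toy samples are assumptions made for illustration. In MMD-based VAE variants as they are usually described, a term of this kind compares latent codes produced by the encoder with samples drawn from the prior.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of vectors."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Toy data (hypothetical): stand-ins for latent codes and prior samples.
latent_codes = [[0.0, 0.0], [1.0, 1.0]]
prior_like = [[0.0, 0.0], [1.0, 1.0]]
shifted = [[5.0, 5.0], [6.0, 6.0]]

print(mmd_squared(latent_codes, prior_like))  # identical samples
print(mmd_squared(latent_codes, shifted))     # clearly separated samples
```

The estimate is zero when the two samples coincide and grows as they move apart, which is what makes it usable as a penalty that pulls the distribution of latent codes toward a chosen prior.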
|
16
|
A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci 2021; 22:2822. [PMID: 33802234 PMCID: PMC8000236 DOI: 10.3390/ijms22062822] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 02/06/2023] Open
Abstract
Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.
|
17
|
Comparison of single and module-based methods for modeling gene regulatory networks. Bioinformatics 2020; 36:558-567. [PMID: 31287491 DOI: 10.1093/bioinformatics/btz549] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/05/2018] [Revised: 06/11/2019] [Accepted: 07/06/2019] [Indexed: 01/02/2023] Open
Abstract
MOTIVATION Gene regulatory networks describe the regulatory relationships among genes, and developing methods for reverse engineering these networks is an ongoing challenge in computational biology. The majority of the initially proposed methods for gene regulatory network discovery create a network of genes and then mine it in order to uncover previously unknown regulatory processes. More recent approaches have focused on inferring modules of co-regulated genes, linking these modules with regulatory genes and then mining them to discover new molecular biology. RESULTS In this work we analyze module-based network approaches to build gene regulatory networks, and compare their performance to single gene network approaches. In the process, we propose a novel approach to estimate gene regulatory networks drawing from the module-based methods. We show that generating modules of co-expressed genes which are predicted by a sparse set of regulators using a variational Bayes method, and then building a bipartite graph on the generated modules using sparse regression, yields more informative networks than previous single and module-based network approaches as measured by: (i) the rate of enriched gene sets, (ii) a network topology assessment, (iii) ChIP-Seq evidence and (iv) the KnowEnG Knowledge Network collection of previously characterized gene-gene interactions. AVAILABILITY AND IMPLEMENTATION The code is written in R and can be downloaded from https://github.com/mikelhernaez/linker. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
18
|
Circulating Cell-Free Nucleic Acids as Epigenetic Biomarkers in Precision Medicine. Front Genet 2020; 11:844. [PMID: 32849827 PMCID: PMC7431953 DOI: 10.3389/fgene.2020.00844] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 02/27/2020] [Accepted: 07/13/2020] [Indexed: 12/20/2022] Open
Abstract
The circulating cell-free nucleic acids (ccfNAs) are a mixture of single- or double-stranded nucleic acids, released into the blood plasma/serum by different tissues via apoptosis, necrosis, and secretions. Under healthy conditions, ccfNAs originate from the hematopoietic system, whereas under various clinical scenarios, the concomitant tissues release ccfNAs into the bloodstream. These ccfNAs include DNA, RNA, microRNA (miRNA), long non-coding RNA (lncRNA), fetal DNA/RNA, and mitochondrial DNA/RNA, and act as potential biomarkers in various clinical conditions. They carry different epigenetic modifications that show disease-related variations, which makes them candidate epigenetic biomarkers in clinical settings. This field has recently emerged as the latest advance in precision medicine because of its clinical relevance for diagnosis, prognosis, and prediction. DNA methylation detected in ccfDNA has been widely used in personalized clinical diagnosis; furthermore, ccfRNAs such as miRNA and lncRNA are emerging as epigenetic biomarkers. This review focuses on the novel approaches for exploring ccfNAs as epigenetic biomarkers in personalized clinical diagnosis and prognosis, their potential as therapeutic targets and disease progression monitors, and reveals the tremendous potential that epigenetic biomarkers present to improve precision medicine. We explore the latest techniques for both quantitative and qualitative detection of epigenetic modifications in ccfNAs. The data on epigenetic modifications of ccfNAs are complex and often milieu-specific, posing challenges for their interpretation. Artificial intelligence and deep networks are novel approaches for decoding complex data and providing insight into decision-making in precision medicine.
|
19
|
Genomic data imputation with variational auto-encoders. Gigascience 2020; 9:giaa082. [PMID: 32761097 PMCID: PMC7407276 DOI: 10.1093/gigascience/giaa082] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/27/2020] [Revised: 05/14/2020] [Accepted: 07/03/2020] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND As missing values are frequently present in genomic data, practical methods to handle missing data are necessary for downstream analyses that require complete data sets. State-of-the-art imputation techniques, including methods based on singular value decomposition and K-nearest neighbors, can be computationally expensive for large data sets and it is difficult to modify these algorithms to handle certain cases not missing at random. RESULTS In this work, we use a deep-learning framework based on the variational auto-encoder (VAE) for genomic missing value imputation and demonstrate its effectiveness in transcriptome and methylome data analysis. We show that in the vast majority of our testing scenarios, VAE achieves similar or better performances than the most widely used imputation standards, while having a computational advantage at evaluation time. When dealing with data missing not at random (e.g., few values are missing), we develop simple yet effective methodologies to leverage the prior knowledge about missing data. Furthermore, we investigate the effect of varying latent space regularization strength in VAE on the imputation performances and, in this context, show why VAE has a better imputation capacity compared to a regular deterministic auto-encoder. CONCLUSIONS We describe a deep learning imputation framework for transcriptome and methylome data using a VAE and show that it can be a preferable alternative to traditional methods for data imputation, especially in the setting of large-scale data and certain missing-not-at-random scenarios.
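The abstract above compares a VAE against K-nearest-neighbors imputation but does not describe the evaluation protocol. A common way to compare imputation methods, assumed here for illustration, is to hide known values, impute them, and measure the reconstruction error. The sketch below implements only a small KNN baseline (not the authors' VAE) using the Python standard library; the function names `knn_impute` and `masked_rmse`, the distance definition, and all parameter values are hypothetical.

```python
import math
import random


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""

    def dist(a, b):
        # Mean squared difference over columns observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return sum(shared) / len(shared) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows in which column j is observed.
            donors = sorted(
                (dist(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:  # leave the gap if no row observes this column
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


def masked_rmse(complete, mask_fraction=0.1, k=2, seed=0):
    """Hide a random fraction of known cells, impute them, and return the RMSE."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(complete))
             for j in range(len(complete[0]))]
    hidden = rng.sample(cells, max(1, int(mask_fraction * len(cells))))
    masked = [list(r) for r in complete]
    for i, j in hidden:
        masked[i][j] = None
    imputed = knn_impute(masked, k)
    errors = [(imputed[i][j] - complete[i][j]) ** 2 for i, j in hidden]
    return math.sqrt(sum(errors) / len(errors))
```

Any other imputer with the same input and output shape could be swapped in for `knn_impute`, so that several methods are scored on the same hidden cells.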
|
20
|
Imputing missing RNA-sequencing data from DNA methylation by using a transfer learning-based neural network. Gigascience 2020; 9:giaa076. [PMID: 32649756 PMCID: PMC7350980 DOI: 10.1093/gigascience/giaa076] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/20/2019] [Revised: 04/23/2020] [Accepted: 06/24/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Gene expression plays a key intermediate role in linking molecular features at the DNA level and phenotype. However, owing to various limitations in experiments, RNA-seq data are missing in many samples even though high-quality DNA methylation data exist. Because DNA methylation is an important epigenetic modification that regulates gene expression, it can be used to predict RNA-seq data. For this purpose, many methods have been developed. A common limitation of these methods is that they mainly focus on a single cancer dataset and do not fully utilize information from large pan-cancer datasets. RESULTS Here, we have developed a novel method to impute missing gene expression data from DNA methylation data through a transfer learning-based neural network, namely, TDimpute. In the method, the pan-cancer dataset from The Cancer Genome Atlas (TCGA) was utilized for training a general model, which was then fine-tuned on the specific cancer dataset. By testing on 16 cancer datasets, we found that our method significantly outperforms other state-of-the-art methods in imputation accuracy with a 7-11% improvement under different missing rates. The imputed gene expression was further shown to be useful for downstream analyses, including the identification of both methylation-driving and prognosis-related genes, clustering analysis, and survival analysis on the TCGA dataset. More importantly, an independent test on the Wilms tumor dataset from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project indicated that our method is useful for general purposes. CONCLUSIONS TDimpute is an effective method for RNA-seq imputation with limited training samples.
|
21
|
Integrated Multi-Omics Analyses in Oncology: A Review of Machine Learning Methods and Tools. Front Oncol 2020; 10:1030. [PMID: 32695678 PMCID: PMC7338582 DOI: 10.3389/fonc.2020.01030] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 01/30/2020] [Accepted: 05/26/2020] [Indexed: 12/16/2022] Open
Abstract
In recent years, high-throughput sequencing technologies have provided unprecedented opportunities to depict cancer samples at multiple molecular levels. The integration and analysis of these multi-omics datasets is a critical step to gain actionable knowledge in a precision medicine framework. This paper explores recent data-driven methodologies that have been developed and applied to respond to major challenges of stratified medicine in oncology, including patients' phenotyping, biomarker discovery, and drug repurposing. We systematically retrieved peer-reviewed journals published from 2014 to 2019, then selected and thoroughly described the tools presenting the most promising innovations regarding the integration of heterogeneous data, the machine learning methodologies that successfully tackled the complexity of multi-omics data, and the frameworks to deliver actionable results for clinical practice. The review is organized according to the applied methods: Deep learning, Network-based methods, Clustering, Features Extraction, and Transformation, Factorization. We provide an overview of the tools available in each methodological group and underline the relationship among the different categories. Our analysis revealed how multi-omics datasets could be exploited to drive precision oncology, but also the current limitations in the development of multi-omics data integration.
|
22
|
Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes. JCO Clin Cancer Inform 2020; 4:421-435. [PMID: 32383980 PMCID: PMC7265792 DOI: 10.1200/cci.19.00125] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Accepted: 03/16/2020] [Indexed: 12/18/2022] Open
Abstract
PURPOSE The availability of increasing volumes of multiomics, imaging, and clinical data in complex diseases such as cancer opens opportunities for the formulation and development of computational imaging genomics methods that can link multiomics, imaging, and clinical data. METHODS Here, we present the Imaging-AMARETTO algorithms and software tools to systematically interrogate regulatory networks derived from multiomics data within and across related patient studies for their relevance to radiography and histopathology imaging features predicting clinical outcomes. RESULTS To demonstrate its utility, we applied Imaging-AMARETTO to integrate three patient studies of brain tumors, specifically, multiomics with radiography imaging data from The Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and low-grade glioma (LGG) cohorts and transcriptomics with histopathology imaging data from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort. Our results show that Imaging-AMARETTO recapitulates known key drivers of tumor-associated microglia and macrophage mechanisms, mediated by STAT3, AHR, and CCR2, and neurodevelopmental and stemness mechanisms, mediated by OLIG2. Imaging-AMARETTO provides interpretation of their underlying molecular mechanisms in light of imaging biomarkers of clinical outcomes and uncovers novel master drivers, THBS1 and MAP2, that establish relationships across these distinct mechanisms. CONCLUSION Our network-based imaging genomics tools serve as hypothesis generators that facilitate the interrogation of known and uncovering of novel hypotheses for follow-up with experimental validation studies. We anticipate that our Imaging-AMARETTO imaging genomics tools will be useful to the community of biomedical researchers for applications to similar studies of cancer and other complex diseases with available multiomics, imaging, and clinical data.
|
23
|
Computational Oncology in the Multi-Omics Era: State of the Art. Front Oncol 2020; 10:423. [PMID: 32318338 PMCID: PMC7154096 DOI: 10.3389/fonc.2020.00423] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 12/01/2019] [Accepted: 03/10/2020] [Indexed: 12/24/2022] Open
Abstract
Cancer is the quintessential complex disease. As technologies evolve faster each day, we are able to quantify the different layers of biological elements that contribute to the emergence and development of malignancies. In this multi-omics context, the use of integrative approaches is mandatory in order to gain further insights into oncological phenomena, and to move forward toward the precision medicine paradigm. In this review, we will focus on computational oncology as an integrative discipline that incorporates knowledge from the mathematical, physical, and computational fields to further the biomedical understanding of cancer. We will discuss the current roles of computation in oncology in the context of multi-omic technologies, which include: data acquisition and processing; data management in the clinical and research settings; classification, diagnosis, and prognosis; and the development of models in the research setting, including their use for therapeutic target identification. We will discuss machine learning and network approaches as two of the most promising emerging paradigms in computational oncology. These approaches provide a foundation for integrating different layers of biological description into coherent frameworks that allow advances in both the basic and clinical settings.
|
24
|
MeinteR: A framework to prioritize DNA methylation aberrations based on conformational and cis-regulatory element enrichment. Sci Rep 2019; 9:19148. [PMID: 31844073 PMCID: PMC6915744 DOI: 10.1038/s41598-019-55453-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/01/2019] [Accepted: 11/19/2019] [Indexed: 12/16/2022] Open
Abstract
DNA methylation studies have been reformed with the advent of single-base resolution arrays and bisulfite sequencing methods, enabling deeper investigation of methylation-mediated mechanisms. In addition to these advancements, numerous bioinformatics tools address important computational challenges, covering DNA methylation calling up to multi-modal interpretative analyses. However, contrary to the analytical frameworks that detect driver mutational signatures, the identification of putatively actionable epigenetic events remains an unmet need. The present work describes a novel computational framework, called MeinteR, that prioritizes critical DNA methylation events based on the following hypothesis: critical aberrations of DNA methylation more likely occur on a genomic substrate that is enriched in cis-acting regulatory elements with distinct structural characteristics, rather than in genomic “deserts”. In this context, the framework incorporates functional cis-elements, e.g. transcription factor binding sites, tentative splice sites, as well as conformational features, such as G-quadruplexes and palindromes, to identify critical epigenetic aberrations with potential implications on transcriptional regulation. The evaluation on multiple, public cancer datasets revealed significant associations between the highest-ranking loci with gene expression and known driver genes, enabling for the first time the computational identification of high impact epigenetic changes based on high-throughput DNA methylation data.
|
25
|
Combined Analysis of Metabolomes, Proteomes, and Transcriptomes of Hepatitis C Virus-Infected Cells and Liver to Identify Pathways Associated With Disease Development. Gastroenterology 2019; 157:537-551.e9. [PMID: 30978357 PMCID: PMC8318381 DOI: 10.1053/j.gastro.2019.04.003] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 08/03/2018] [Revised: 03/01/2019] [Accepted: 04/04/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND & AIMS The mechanisms of hepatitis C virus (HCV) infection, liver disease progression, and hepatocarcinogenesis are only partially understood. We performed genomic, proteomic, and metabolomic analyses of HCV-infected cells and chimeric mice to learn more about these processes. METHODS Huh7.5.1dif (hepatocyte-like cells) were infected with culture-derived HCV and used in RNA sequencing, proteomic, metabolomic, and integrative genomic analyses. uPA/SCID (urokinase-type plasminogen activator/severe combined immunodeficiency) mice were injected with serum from HCV-infected patients; 8 weeks later, liver tissues were collected and analyzed by RNA sequencing and proteomics. Using differential expression, gene set enrichment analyses, and protein interaction mapping, we identified pathways that changed in response to HCV infection. We validated our findings in studies of liver tissues from 216 patients with HCV infection and early-stage cirrhosis and paired biopsy specimens from 99 patients with hepatocellular carcinoma, including 17 patients with histologic features of steatohepatitis. Cirrhotic liver tissues from patients with HCV infection were classified into 2 groups based on relative peroxisome function; outcomes assessed included Child-Pugh class, development of hepatocellular carcinoma, survival, and steatohepatitis. Hepatocellular carcinomas were classified according to steatohepatitis; the outcome was relative peroxisomal function. RESULTS We quantified 21,950 messenger RNAs (mRNAs) and 8297 proteins in HCV-infected cells. Upon HCV infection of hepatocyte-like cells and chimeric mice, we observed significant changes in levels of mRNAs and proteins involved in metabolism and hepatocarcinogenesis. HCV infection of hepatocyte-like cells significantly increased levels of the mRNAs, but not proteins, that regulate the innate immune response; we believe this was due to the inhibition of translation in these cells. 
HCV infection of hepatocyte-like cells increased glucose consumption and metabolism and the STAT3 signaling pathway and reduced peroxisome function. Peroxisomes mediate β-oxidation of very long-chain fatty acids; we found intracellular accumulation of very long-chain fatty acids in HCV-infected cells, which is also observed in patients with fatty liver disease. Cells in livers from HCV-infected mice had significant reductions in levels of the mRNAs and proteins associated with peroxisome function, indicating perturbation of peroxisomes. We found that defects in peroxisome function were associated with outcomes and features of HCV-associated cirrhosis, fatty liver disease, and hepatocellular carcinoma in patients. CONCLUSIONS We performed combined transcriptome, proteome, and metabolome analyses of liver tissues from HCV-infected hepatocyte-like cells and HCV-infected mice. We found that HCV infection increases glucose metabolism and the STAT3 signaling pathway and thereby reduces peroxisome function; alterations in the expression levels of peroxisome genes were associated with outcomes of patients with liver diseases. These findings provide insights into liver disease pathogenesis and might be used to identify new therapeutic targets.
|
26
|
The impact of DNA methylation on the cancer proteome. PLoS Comput Biol 2019; 15:e1007245. [PMID: 31356589 PMCID: PMC6695193 DOI: 10.1371/journal.pcbi.1007245] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/15/2018] [Revised: 08/15/2019] [Accepted: 07/02/2019] [Indexed: 12/29/2022] Open
Abstract
Aberrant DNA methylation disrupts normal gene expression in cancer and broadly contributes to oncogenesis. We previously developed MethylMix, a model-based algorithmic approach to identify epigenetically regulated driver genes. MethylMix identifies genes where methylation likely executes a functional role by using transcriptomic data to select only methylation events that can be linked to changes in gene expression. However, given that proteins more closely link genotype to phenotype, recent high-throughput proteomic data provide an opportunity to more accurately identify functionally relevant abnormal methylation events. Here we present a MethylMix analysis that refines nominations for epigenetic driver genes by leveraging quantitative high-throughput proteomic data to select only genes where DNA methylation is predictive of protein abundance. Applying our algorithm across three cancer cohorts, we find that using protein abundance data narrows candidate nominations, where the effect of DNA methylation is often buffered at the protein level. Next, we find that MethylMix genes predictive of protein abundance are enriched for biological processes involved in cancer, including functions involved in epithelial and mesenchymal transition. Moreover, our results are also enriched for tumor markers that are predictive of clinical features like tumor stage, and we find that clustering using MethylMix genes predictive of protein abundance captures cancer subtypes. To elucidate the molecular basis of cancer, we examine the variation and dynamics characterizing the flow of information from the epigenome to the transcriptome and proteome. Conducting the first genome-wide analysis of epigenome-proteome associations, we present a MethylMix analysis that leverages protein abundance data, taking advantage of recent high-throughput proteomic data generated using mass-spectrometry technology to elucidate the role of DNA methylation in cancer.
By integrating across molecular data types, we confirm the benefit of using protein abundance data to provide additional insights into pathways and processes involved in oncogenesis and how they manifest as clinical phenotypes. Applying our method across three large cancer cohorts including breast cancer, ovarian cancer and colorectal cancer, MethylMix identifies key genes and describes molecular features and subtypes in these cancers.
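The selection step described above (keeping only genes whose DNA methylation is predictive of protein abundance) can be illustrated with a simple correlation filter. This is a deliberately simplified sketch and not MethylMix itself, whose model is not specified in the abstract: the inverse-correlation criterion, the threshold `max_r`, the function names, and the gene labels in the usage example are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no linear signal
    return sxy / math.sqrt(sxx * syy)


def select_genes(methylation, protein, max_r=-0.5):
    """Keep genes whose methylation is inversely correlated with protein abundance.

    Both arguments map a gene name to per-sample values in the same sample order.
    Only genes present in both mappings are considered.
    """
    selected = {}
    for gene in methylation.keys() & protein.keys():
        r = pearson(methylation[gene], protein[gene])
        if r <= max_r:
            selected[gene] = r
    return selected


# Hypothetical toy profiles: GENE_A is silenced as methylation rises, GENE_B is not.
methylation = {"GENE_A": [0.1, 0.2, 0.8, 0.9], "GENE_B": [0.1, 0.5, 0.3, 0.9]}
protein = {"GENE_A": [5.0, 4.8, 1.2, 1.0], "GENE_B": [2.0, 2.1, 1.9, 2.0]}
candidates = select_genes(methylation, protein)
```

Running the same filter against transcript levels and against protein levels, then comparing the two gene lists, gives a rough picture of the narrowing effect the abstract reports.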
|
27
|
Development and validation of radiomic signatures of head and neck squamous cell carcinoma molecular features and subtypes. EBioMedicine 2019; 45:70-80. [PMID: 31255659 PMCID: PMC6642281 DOI: 10.1016/j.ebiom.2019.06.034] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 03/19/2019] [Revised: 06/18/2019] [Accepted: 06/18/2019] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Radiomics-based non-invasive biomarkers are promising to facilitate the translation of therapeutically related molecular subtypes for treatment allocation of patients with head and neck squamous cell carcinoma (HNSCC). METHODS We included 113 HNSCC patients from The Cancer Genome Atlas (TCGA-HNSCC) project. Molecular phenotypes analyzed were RNA-defined HPV status, five DNA methylation subtypes, four gene expression subtypes and five somatic gene mutations. A total of 540 quantitative image features were extracted from pre-treatment CT scans. Features were selected and used in a regularized logistic regression model to build binary classifiers for each molecular subtype. Models were evaluated using the average area under the Receiver Operator Characteristic curve (AUC) of a stratified 10-fold cross-validation procedure repeated 10 times. Next, an HPV model was trained with the TCGA-HNSCC, and tested on a Stanford cohort (N = 53). FINDINGS Our results show that quantitative image features are capable of distinguishing several molecular phenotypes. We obtained significant predictive performance for RNA-defined HPV+ (AUC = 0.73), DNA methylation subtypes MethylMix HPV+ (AUC = 0.79), non-CIMP-atypical (AUC = 0.77) and Stem-like-Smoking (AUC = 0.71), and mutation of NSD1 (AUC = 0.73). We externally validated the HPV prediction model (AUC = 0.76) on the Stanford cohort. When compared to clinical models, radiomic models were superior for subtypes such as NOTCH1 mutation and DNA methylation subtype non-CIMP-atypical, but inferior for DNA methylation subtype CIMP-atypical and NSD1 mutation. INTERPRETATION Our study demonstrates that radiomics can potentially serve as a non-invasive tool to identify treatment-relevant subtypes of HNSCC, opening up the possibility for patient stratification, treatment allocation and inclusion in clinical trials. FUND: Dr. Gevaert reports grants from National Institute of Dental & Craniofacial Research (NIDCR) U01 DE025188, grants from National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health (NIBIB), R01 EB020527, grants from National Cancer Institute (NCI), U01 CA217851, during the conduct of the study; Dr. Huang and Dr. Zhu report grants from China Scholarship Council (Grant NO:201606320087), grants from China Medical Board Collaborating Program (Grant NO:15-216), the Cyrus Tang Foundation, and the Zhejiang University Education Foundation during the conduct of the study; Dr. Cintra reports grants from São Paulo State Foundation for Teaching and Research (FAPESP), during the conduct of the study.
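The evaluation protocol named in the METHODS above (a regularized logistic regression scored by the average AUC of a stratified 10-fold cross-validation repeated 10 times) can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' pipeline: it omits their feature selection step, and the L2 penalty, learning rate, epoch count, and all function names are assumptions.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Assign indices to k folds while keeping the class ratio similar per fold."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def fit_logreg(X, y, l2=0.1, lr=0.1, epochs=200):
    """L2-regularized logistic regression trained by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [l2 * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(xi):
                gw[j] += (p - yi) * xj / n
            gb += (p - yi) / n
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    # Return a scoring function; the raw logit is enough for ranking by AUC.
    return lambda x: sum(wj * xj for wj, xj in zip(w, x)) + b


def repeated_cv_auc(X, y, k=10, repeats=10, seed=0):
    """Mean held-out AUC over `repeats` rounds of stratified k-fold CV.

    Every fold must contain both classes, so each class needs at least k samples.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for fold in stratified_folds(y, k, rng):
            held = set(fold)
            train = [i for i in range(len(y)) if i not in held]
            score = fit_logreg([X[i] for i in train], [y[i] for i in train])
            aucs.append(auc([y[i] for i in fold], [score(X[i]) for i in fold]))
    return sum(aucs) / len(aucs)
```

Repeating the split with fresh shuffles reduces the dependence of the reported AUC on any single partition, which matters for a cohort of about a hundred patients.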
|
28
|
Differential Effect of Smoking on Gene Expression in Head and Neck Cancer Patients. Int J Environ Res Public Health 2018; 15:ijerph15071558. [PMID: 30041465 PMCID: PMC6069101 DOI: 10.3390/ijerph15071558] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/27/2018] [Revised: 07/11/2018] [Accepted: 07/17/2018] [Indexed: 12/13/2022]
Abstract
Smoking is a well-known behavior that has an important negative impact on human health, and is considered to be a significant factor related to the development and progression of head and neck squamous cell carcinomas (HNSCCs). Use of high-dimensional datasets to discern novel HNSCC driver genes related to smoking represents an important challenge. The Cancer Genome Atlas (TCGA) analysis was performed in three co-existing groups of HNSCC in order to assess whether gene expression landscape is affected by tobacco smoking, having quit, or non-smoking status. We identified a set of differentially expressed genes that discriminate between smokers and non-smokers or based on human papilloma virus (HPV)16 status, or the co-occurrence of these two exposome components in HNSCC. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways classification shows that most of the genes are specific to cellular metabolism, emphasizing metabolic detoxification pathways, metabolism of chemical carcinogenesis, or drug metabolism. In the case of HPV16-positive patients it has been demonstrated that the altered genes are related to cellular adhesion and inflammation. The correlation between smoking and the survival rate was not statistically significant. This emphasizes the importance of the complex environmental exposure and genetic factors in order to establish prevention assays and personalized care system for HNSCC, with the potential for being extended to other cancer types.
|