1
Labory J, Njomgue-Fotso E, Bottini S. Benchmarking feature selection and feature extraction methods to improve the performances of machine-learning algorithms for patient classification using metabolomics biomedical data. Comput Struct Biotechnol J 2024; 23:1274-1287. [PMID: 38560281 PMCID: PMC10979063 DOI: 10.1016/j.csbj.2024.03.016] [Received: 12/21/2023] [Revised: 03/12/2024] [Accepted: 03/18/2024] [Indexed: 04/04/2024] Open
Abstract
Objective: Classification tasks are an open challenge in the field of biomedicine. While several machine-learning techniques exist to accomplish this objective, several peculiarities of biomedical data, especially omics measurements, prevent their use or limit their performance. Omics approaches aim to understand a complex biological system through systematic analysis of its content at the molecular level. However, omics data are heterogeneous, sparse and affected by the classical "curse of dimensionality" problem, i.e. having far fewer observations, or samples (n), than omics features (p). Furthermore, a major problem with multi-omics data is imbalance at either the class or the feature level. The objective of this work is to study whether feature extraction and/or feature selection techniques can improve the performance of classification machine-learning algorithms on omics measurements.
Methods: Among all omics, metabolomics has emerged as a powerful tool in cancer research, facilitating a deeper understanding of the complex metabolic landscape associated with tumorigenesis and tumor progression. Thus, we selected three publicly available metabolomics datasets, applied several feature extraction techniques, both linear and non-linear, coupled or not with feature selection methods, and evaluated patient-classification performance in the different configurations for the three datasets.
Results: We provide a general workflow and guidelines on when to use those techniques depending on the characteristics of the data available. To further test the extension of our approach to other omics data, we included a transcriptomics and a proteomics dataset. Overall, for all datasets, we showed that applying supervised feature selection improves the performance of feature extraction methods for classification purposes. Scripts used to perform all analyses are available at: https://github.com/Plant-Net/Metabolomic_project/.
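To make the p >> n setting described in this abstract concrete, the following is a minimal sketch of supervised univariate feature selection on a toy two-class matrix. It is not the authors' pipeline (their scripts are in the repository linked above). The scoring rule (a Welch-style t statistic), the function names `welch_t` and `select_top_k`, and the simulated data are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic between two groups of values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def select_top_k(X, y, k):
    """Return the sorted indices of the k features that best separate classes 0 and 1.

    X is a list of samples (rows), each a list of p feature values.
    y is a list of 0/1 class labels, one per sample.
    """
    p = len(X[0])
    scores = []
    for j in range(p):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((welch_t(g0, g1), j))
    scores.sort(reverse=True)  # highest score first
    return sorted(j for _, j in scores[:k])


# Hypothetical toy data: n = 20 samples, p = 200 features, so p >> n.
# Only features 0, 1 and 2 carry a class signal; the rest are pure noise.
rng = random.Random(0)
n, p, informative = 20, 200, (0, 1, 2)
y = [i % 2 for i in range(n)]
X = [[rng.gauss(8.0 * label if j in informative else 0.0, 1.0) for j in range(p)]
     for label in y]

selected = select_top_k(X, y, k=3)
print(selected)
```

In a workflow of the kind the abstract describes, the reduced matrix restricted to `selected` would then be passed to a feature extraction step and a classifier. To avoid optimistic performance estimates, the selection should be fitted on training folds only.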
Affiliation(s)
- Justine Labory
  - Université Côte d'Azur, Center of Modeling Simulation and Interactions, Nice, France
  - INRAE, Université Côte d'Azur, CNRS, Institut Sophia Agrobiotech, Sophia-Antipolis, France
  - Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Nice, France
- Silvia Bottini
  - Université Côte d'Azur, Center of Modeling Simulation and Interactions, Nice, France
  - INRAE, Université Côte d'Azur, CNRS, Institut Sophia Agrobiotech, Sophia-Antipolis, France
2
Afroz S, Islam N, Habib MA, Reza MS, Ashad Alam M. Multi-omics data integration and drug screening of AML cancer using Generative Adversarial Network. Methods 2024; 226:138-150. [PMID: 38670415 DOI: 10.1016/j.ymeth.2024.04.017] [Received: 07/18/2023] [Revised: 04/02/2024] [Accepted: 04/20/2024] [Indexed: 04/28/2024] Open
Abstract
In the era of precision medicine, accurate disease phenotype prediction for heterogeneous diseases, such as cancer, is emerging due to advanced technologies that link genotypes and phenotypes. However, it is difficult to integrate different types of biological data because they are so varied. In this study, we focused on predicting the traits of a blood cancer called Acute Myeloid Leukemia (AML) by combining different kinds of biological data. We used a recently developed method called Omics Generative Adversarial Network (GAN) to better classify cancer outcomes. The primary advantages of a GAN include its ability to create synthetic data that is nearly indistinguishable from real data, its high flexibility, and its wide range of applications, including multi-omics data analysis. In addition, the GAN was effective at combining two types of biological data. We created synthetic datasets for gene activity and DNA methylation. Our method was more accurate in predicting disease traits than using the original data alone. The experimental results provided evidence that the creation of synthetic data through interacting multi-omics data analysis using GANs improves the overall prediction quality. Furthermore, we identified the top-ranked significant genes through statistical methods and pinpointed potential candidate drug agents through in-silico studies. The proposed drugs, also supported by other independent studies, might play a crucial role in the treatment of AML cancer. The code is available on GitHub: https://github.com/SabrinAfroz/omicsGAN_codes?fbclid=IwAR1-/stuffmlE0hyWgSu2wlXo6dYlKUei3faLdlvpxTOOUPVlmYCloXf4Uk9ejK4I.
Affiliation(s)
- Sabrin Afroz
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh
- Nadira Islam
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh
- Md Ahsan Habib
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh; Statistical Learning Group, Bangladesh
- Md Selim Reza
  - Tulane Center for Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University, New Orleans, LA 70112, USA; Statistical Learning Group, Bangladesh
- Md Ashad Alam
  - Ochsner Center for Outcomes Research, Ochsner Research, Ochsner Clinic Foundation, New Orleans, LA 70121, USA; Statistical Learning Group, Bangladesh
3
Yang H, Zhao L, Li D, An C, Fang X, Chen Y, Liu J, Xiao T, Wang Z. Subtype-WGME enables whole-genome-wide multi-omics cancer subtyping. Cell Rep Methods 2024:100781. [PMID: 38761803 DOI: 10.1016/j.crmeth.2024.100781] [Received: 06/10/2023] [Revised: 01/05/2024] [Accepted: 04/26/2024] [Indexed: 05/20/2024]
Abstract
We present an innovative strategy for integrating whole-genome-wide multi-omics data, which facilitates adaptive amalgamation by leveraging hidden layer features derived from high-dimensional omics data through a multi-task encoder. Empirical evaluations on eight benchmark cancer datasets substantiated that our proposed framework outstripped the comparative algorithms in cancer subtyping, delivering superior subtyping outcomes. Building upon these subtyping results, we establish a robust pipeline for identifying whole-genome-wide biomarkers, unearthing 195 significant biomarkers. Furthermore, we conduct an exhaustive analysis to assess the importance of each omic and non-coding region features at the whole-genome-wide level during cancer subtyping. Our investigation shows that both omics and non-coding region features substantially impact cancer development and survival prognosis. This study emphasizes the potential and practical implications of integrating genome-wide data in cancer research, demonstrating the potency of comprehensive genomic characterization. Additionally, our findings offer insightful perspectives for multi-omics analysis employing deep learning methodologies.
Affiliation(s)
- Hai Yang
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Liang Zhao
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Dongdong Li
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Congcong An
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Xiaoyang Fang
  - Cornell Tech, Cornell University, New York, NY 14853, USA
- Yiwen Chen
  - Center for Continuing and Lifelong Education, National University of Singapore, Singapore 119077, Singapore
- Jingping Liu
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Ting Xiao
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhe Wang
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
4
Ali HR, West RB. Spatial Biology of Breast Cancer. Cold Spring Harb Perspect Med 2024; 14:a041335. [PMID: 38110242 PMCID: PMC11065165 DOI: 10.1101/cshperspect.a041335] [Indexed: 12/20/2023]
Abstract
Spatial findings have shaped our understanding of breast cancer. In this review, we discuss how spatial methods, including spatial transcriptomics and proteomics, and the resultant understanding of spatial relationships have contributed to concepts regarding cancer progression and treatment. In addition to discussing traditional approaches, we examine how emerging multiplex imaging technologies have contributed to the field and how they might influence future research.
Affiliation(s)
- H Raza Ali
  - Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge CB2 0RE, United Kingdom
- Robert B West
  - Department of Pathology, Stanford University Medical Center, Stanford, California 94305, USA
5
Williams A. Multiomics data integration, limitations, and prospects to reveal the metabolic activity of the coral holobiont. FEMS Microbiol Ecol 2024; 100:fiae058. [PMID: 38653719 PMCID: PMC11067971 DOI: 10.1093/femsec/fiae058] [Received: 09/26/2023] [Revised: 03/25/2024] [Accepted: 04/22/2024] [Indexed: 04/25/2024] Open
Abstract
Since their radiation in the Middle Triassic period ∼240 million years ago, stony corals have survived past climate fluctuations and five mass extinctions. Their long-term survival underscores the inherent resilience of corals, particularly when considering the nutrient-poor marine environments in which they have thrived. However, coral bleaching has emerged as a global threat to coral survival, requiring rapid advancements in coral research to understand holobiont stress responses and allow for interventions before extensive bleaching occurs. This review encompasses the potential, as well as the limits, of multiomics data applications when applied to the coral holobiont. Synopses for how different omics tools have been applied to date and their current restrictions are discussed, in addition to ways these restrictions may be overcome, such as recruiting new technology to studies, utilizing novel bioinformatics approaches, and generally integrating omics data. Lastly, this review presents considerations for the design of holobiont multiomics studies to support lab-to-field advancements of coral stress marker monitoring systems. Although much of the bleaching mechanism has eluded investigation to date, multiomic studies have already produced key findings regarding the holobiont's stress response, and have the potential to advance the field further.
Affiliation(s)
- Amanda Williams
  - Microbial Biology Graduate Program, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, United States
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, United States
6
Pang Y, Xu Y, Chen Q, Cheng K, Ling Y, Jang J, Ge J, Zhu W. FLRT3 and TGF-β/SMAD4 signalling: Impacts on apoptosis, autophagy and ion channels in supraventricular tachycardia. J Cell Mol Med 2024; 28:e18237. [PMID: 38509727 PMCID: PMC10955158 DOI: 10.1111/jcmm.18237] [Received: 10/17/2023] [Revised: 01/14/2024] [Accepted: 02/28/2024] [Indexed: 03/22/2024] Open
Abstract
To explore the underlying molecular mechanisms of supraventricular tachycardia (SVT), this study aimed to analyse the complex relationship between FLRT3 and the TGF-β/SMAD4 signalling pathway, which affects Na+ and K+ channels in cardiomyocytes. Bioinformatics analysis was performed on 85 SVT samples and 15 healthy controls to screen overlapping genes from the key module and differentially expressed genes (DEGs). Expression profiling of overlapping genes, coupled with Receiver Operating Characteristic (ROC) curve analyses, identified FLRT3 as a hub gene. In vitro studies utilizing Ang II-stimulated H9C2 cardiomyocytes were undertaken to elucidate the consequences of FLRT3 silencing on cardiomyocyte apoptosis and autophagic processes. A combination of techniques, including quantitative reverse-transcription polymerase chain reaction (qRT-PCR), western blotting (WB), flow cytometry, dual-luciferase reporter assays and chromatin immunoprecipitation polymerase chain reaction (ChIP-PCR), was used to decipher the intricate interactions between FLRT3, the TGF-β/SMAD4 signalling cascade and ion channel gene expression. Six genes (AADAC, DSC3, FLRT3, SYT4, PRR9 and SERTM1) demonstrated reduced expression in SVT samples, each possessing significant clinical diagnostic potential. In H9C2 cardiomyocytes, FLRT3 silencing mitigated Ang II-induced apoptosis and modulated autophagy. With increasing TGF-β concentration, there was a dose-responsive decline in FLRT3 and SCN5A expression, while expression of both KCNIP2 and KCND2 was augmented. Moreover, a direct interaction between FLRT3 and SMAD4 was observed, and inhibition of SMAD4 expression resulted in increased FLRT3 expression. Our results demonstrated that the TGF-β/SMAD4 signalling pathway plays a critical role by regulating FLRT3 expression, with potential implications for ion channel function in SVT.
Affiliation(s)
- Yang Pang
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Ye Xu
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Qingxing Chen
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Kuan Cheng
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Yunlong Ling
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Jun Jang
  - State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Science, Fudan University, Shanghai, China
- Junbo Ge
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
- Wenqing Zhu
  - Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
7
Araiza-Olivera D, Prudnikova TY, Uribe-Alvarez C, Cai KQ, Franco-Barraza J, Dones JM, Raines RT, Chernoff J. Identifying and targeting key driver genes for collagen production within the 11q13/14 breast cancer amplicon. bioRxiv 2024:2024.03.27.587019. [PMID: 38586042 PMCID: PMC10996585 DOI: 10.1101/2024.03.27.587019] [Indexed: 04/09/2024]
Abstract
Genetic studies indicate that breast cancer can be divided into several basic molecular groups. One of these groups, termed IntClust-2, is characterized by amplification of a small portion of chromosome 11 and has a median survival of only five years. Several cancer-relevant genes occupy this portion of chromosome 11, and it is thought that overexpression of a combination of driver genes in this region is responsible for the poor outcome of women in this group. In this study we used a gene editing method to knock out, one by one, each of 198 genes that are located within the amplified region of chromosome 11 and determined how much each of these genes contributed to the survival of breast cancer cells. In addition to well-known drivers such as CCND1 and PAK1, we identified two different genes (SERPINH1 and P4HA3) that encode proteins involved in collagen synthesis and organization. Using both in vitro and in vivo functional analyses, we determined that P4HA3 and/or SERPINH1 provide a critical driver function on IntClust-2 basic processes, such as viability, proliferation, and migration. Inhibiting these enzymes via genetic or pharmacologic means reduced collagen synthesis and impeded oncogenic signal transduction in cell culture models, and a small-molecule inhibitor of P4HA3 was effective in treating 11q13 tumor growth in an animal model. As collagen has a well-known association with tissue stiffness and aggressive forms of breast cancer, we believe that the two genes we identified provide an opportunity for a new therapeutic strategy in IntClust-2 breast cancers.
8
Yan H, Weng D, Li D, Gu Y, Ma W, Liu Q. Prior knowledge-guided multilevel graph neural network for tumor risk prediction and interpretation via multi-omics data integration. Brief Bioinform 2024; 25:bbae184. [PMID: 38670157 PMCID: PMC11052635 DOI: 10.1093/bib/bbae184] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 04/06/2024] [Indexed: 04/28/2024] Open
Abstract
The interrelation and complementary nature of multi-omics data can provide valuable insights into the intricate molecular mechanisms underlying diseases. However, challenges such as limited sample size, high data dimensionality and differences in omics modalities pose significant obstacles to fully harnessing the potential of these data. Prior knowledge, such as gene regulatory networks and pathway information, harbors useful gene-gene interaction and gene functional module information. To effectively integrate multi-omics data and make full use of this prior knowledge, we propose a Multilevel-graph neural network (GNN): a hierarchically designed deep learning algorithm that sequentially leverages multi-omics data, gene regulatory networks and pathway information to extract features and enhance accuracy in predicting survival risk. Our method achieved better accuracy than existing methods. Furthermore, key factors nonlinearly associated with tumor pathogenesis are prioritized by employing two interpretation algorithms for neural networks (i.e. GNN-Explainer and IGscore) at the gene and pathway levels, respectively. The top genes and pathways exhibit strong associations with disease in survival analyses, and many of them, such as SEC61G and CYP27B1, have been previously reported in the literature.
Affiliation(s)
- Hongxi Yan
  - Department of Computer Science, Beihang University, XueYuan Road, 100191, BeiJing, China
- Dawei Weng
  - School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
- Dongguo Li
  - School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
- Yu Gu
  - School of Biomedical Engineering, Capital Medical University, 10 You An Men WaiXi Tou Tiao, 100069, Beijing, China
- Wenji Ma
  - Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, 227 South Chongqing Road, 200025, Shanghai, China
- Qingjie Liu
  - Department of Computer Science, Beihang University, XueYuan Road, 100191, BeiJing, China
9
Niciura SCM, Cardoso TF, Ibelli AMG, Okino CH, Andrade BG, Benavides MV, Chagas ACDS, Esteves SN, Minho AP, Regitano LCDA, Gondro C. Multi-omics data elucidate parasite-host-microbiota interactions and resistance to Haemonchus contortus in sheep. Parasit Vectors 2024; 17:102. [PMID: 38429820 PMCID: PMC10908167 DOI: 10.1186/s13071-024-06205-9] [Received: 12/27/2023] [Accepted: 02/18/2024] [Indexed: 03/03/2024] Open
Abstract
BACKGROUND: The integration of molecular data from hosts, parasites, and microbiota can enhance our understanding of the complex biological interactions underlying the resistance of hosts to parasites. Haemonchus contortus, the predominant sheep gastrointestinal parasite species in the tropics, causes significant production and economic losses, which are further compounded by the diminishing efficiency of chemical control owing to anthelmintic resistance. Knowledge of how the host responds to infection and how the parasite, in combination with microbiota, modulates host immunity can guide selection decisions to breed animals with improved parasite resistance. This understanding will help refine management practices and advance the development of new therapeutics for long-term helminth control.
METHODS: Eggs per gram (EPG) of feces were obtained from Morada Nova sheep subjected to two artificial infections with H. contortus and used as a proxy to select animals with high resistance or susceptibility for transcriptome sequencing (RNA-seq) of the abomasum and 50 K single-nucleotide genotyping. Additionally, RNA-seq data for H. contortus were generated, and amplicon sequence variants (ASV) were obtained using polymerase chain reaction amplification and sequencing of bacterial and archaeal 16S ribosomal RNA genes from sheep feces and rumen content.
RESULTS: The heritability estimate for EPG was 0.12. GAST, GNLY, IL13, MGRN1, FGF14, and RORC genes and transcripts were differentially expressed between resistant and susceptible animals. A genome-wide association study identified regions on chromosomes 2 and 11 that harbor candidate genes for resistance, immune response, body weight, and adaptation. Trans-expression quantitative trait loci were found between significant variants and differentially expressed transcripts. Functional co-expression modules based on sheep genes and ASVs correlated with resistance to H. contortus, showing enrichment in pathways of response to bacteria, immune and inflammatory responses, and hub features of the Christensenellaceae, Bacteroides, and Methanobrevibacter genera; Prevotellaceae family; and Verrucomicrobiota phylum. In H. contortus, some mitochondrial, collagen-, and cuticle-related genes were expressed only in parasites isolated from susceptible sheep.
CONCLUSIONS: The present study identified chromosome regions, genes, transcripts, and pathways involved in the elaborate interactions between the sheep host, its gastrointestinal microbiota, and the H. contortus parasite. These findings will assist in the development of animal selection strategies for parasite resistance and interdisciplinary approaches to control H. contortus infection in sheep.
10
Salahudeen AA, Seoane JA, Yuki K, Mah AT, Smith AR, Kolahi K, De la O SM, Hart DJ, Ding J, Ma Z, Barkal SA, Shukla ND, Zhang CH, Cantrell MA, Batish A, Usui T, Root DE, Hahn WC, Curtis C, Kuo CJ. Functional screening of amplification outlier oncogenes in organoid models of early tumorigenesis. Cell Rep 2023; 42:113355. [PMID: 37922313 PMCID: PMC10841581 DOI: 10.1016/j.celrep.2023.113355] [Received: 11/03/2021] [Revised: 08/30/2023] [Accepted: 10/12/2023] [Indexed: 11/05/2023] Open
Abstract
Somatic copy number gains are pervasive across cancer types, yet their roles in oncogenesis are insufficiently evaluated. This inadequacy is partly due to copy gains spanning large chromosomal regions, obscuring causal loci. Here, we employed organoid modeling to evaluate candidate oncogenic loci identified via integrative computational analysis of extreme copy gains overlapping with extreme expression dysregulation in The Cancer Genome Atlas. Subsets of "outlier" candidates were contextually screened as tissue-specific cDNA lentiviral libraries within cognate esophagus, oral cavity, colon, stomach, pancreas, and lung organoids bearing initial oncogenic mutations. Iterative analysis nominated the kinase DYRK2 at 12q15 as an amplified head and neck squamous carcinoma oncogene in p53-/- oral mucosal organoids. Similarly, FGF3, amplified at 11q13 in 41% of esophageal squamous carcinomas, promoted p53-/- esophageal organoid growth reversible by small molecule and soluble receptor antagonism of FGFRs. Our studies establish organoid-based contextual screening of candidate genomic drivers, enabling functional evaluation during early tumorigenesis.
Affiliation(s)
- Ameen A Salahudeen
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA; University of Illinois at Chicago College of Medicine, Department of Medicine, Division of Hematology and Oncology, Chicago, IL 60612, USA; Department of Biochemistry and Molecular Genetics, University of Illinois at Chicago College of Medicine, Chicago, IL 60612, USA; University of Illinois Cancer Center, Chicago, IL 60612, USA
- Jose A Seoane
  - Stanford University School of Medicine, Department of Medicine, Divisions of Oncology, Stanford, CA 94305, USA; Cancer Computational Biology Group, Vall d'Hebron Institute of Oncology (VHIO), 08035 Barcelona, Spain
- Kanako Yuki
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Amanda T Mah
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Amber R Smith
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Kevin Kolahi
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Sean M De la O
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Daniel J Hart
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Jie Ding
  - Stanford University School of Medicine, Department of Medicine, Divisions of Oncology, Stanford, CA 94305, USA
- Zhicheng Ma
  - Stanford University School of Medicine, Department of Medicine, Divisions of Oncology, Stanford, CA 94305, USA
- Sammy A Barkal
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Navika D Shukla
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Chuck H Zhang
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Michael A Cantrell
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Arpit Batish
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- Tatsuya Usui
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
- David E Root
  - Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- William C Hahn
  - Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA; Dana-Farber Cancer Institute, Department of Medical Oncology, 450 Brookline Avenue, Boston, MA 02215, USA
- Christina Curtis
  - Stanford University School of Medicine, Department of Medicine, Divisions of Oncology, Stanford, CA 94305, USA; Stanford University School of Medicine, Department of Medicine, Divisions of Genetics, Stanford, CA 94305, USA
- Calvin J Kuo
  - Stanford University School of Medicine, Department of Medicine, Divisions of Hematology, Stanford, CA 94305, USA
11
Khalili-Tanha G, Mohit R, Asadnia A, Khazaei M, Dashtiahangar M, Maftooh M, Nassiri M, Hassanian SM, Ghayour-Mobarhan M, Kiani MA, Ferns GA, Batra J, Nazari E, Avan A. Identification of ZMYND19 as a novel biomarker of colorectal cancer: RNA-sequencing and machine learning analysis. J Cell Commun Signal 2023:10.1007/s12079-023-00779-2. [PMID: 37428302 DOI: 10.1007/s12079-023-00779-2] [Received: 07/01/2022] [Accepted: 05/29/2023] [Indexed: 07/11/2023] Open
Abstract
Colorectal cancer (CRC) is the third most common cause of cancer-related deaths. The five-year relative survival rate for CRC is estimated to be approximately 90% for patients diagnosed at an early stage and 14% for those diagnosed at an advanced stage of disease. Hence, the development of accurate prognostic markers is required. Bioinformatics enables the identification of dysregulated pathways and novel biomarkers. RNA expression profiling was performed in CRC patients from the TCGA database using a machine learning approach to identify differentially expressed genes (DEGs). Survival curves were assessed using Kaplan-Meier analysis to identify prognostic biomarkers. Furthermore, the molecular pathways, protein-protein interactions, the co-expression of DEGs, and the correlation between DEGs and clinical data were evaluated. The diagnostic markers were then determined based on machine learning analysis. The results indicated that key upregulated genes are associated with RNA processing and the heterocycle metabolic process, including C10orf2, NOP2, DKC1, BYSL, RRP12, PUS7, MTHFD1L, and PPAT. Furthermore, the survival analysis identified NOP58, OSBPL3, DNAJC2, and ZMYND19 as prognostic markers. The combineROC curve analysis indicated that the combination C10orf2-PPAT-ZMYND19 can be considered a diagnostic marker panel, with sensitivity, specificity, and AUC values of 0.98, 1.00, and 0.99, respectively. Finally, the ZMYND19 gene was validated in CRC patients. In conclusion, novel biomarkers of CRC have been identified that may support early diagnosis, potential treatment, and better prognosis.
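As a reading aid for the sensitivity, specificity, and AUC figures quoted in this abstract, the following is a minimal sketch of how the three quantities are computed from binary labels and classifier scores. The labels, scores, threshold, and function names are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 4 patients (label 1) and 4 controls (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
print(sens, spec, area)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a marker panel can report all three values side by side.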
Affiliation(s)
- Ghazaleh Khalili-Tanha
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Reza Mohit
- Department of Anesthesia, Bushehr University of Medical Sciences, Bushehr, Iran
- Alireza Asadnia
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mina Maftooh
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammadreza Nassiri
- Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
- Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Ghayour-Mobarhan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammad Ali Kiani
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Department of Pediatrics, Ghaem Hospital, Mashhad University of Medical Sciences, Mashhad, Iran
- Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex, BN1 9PH, UK
- Jyotsna Batra
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, 4059, Australia
- Translational Research Institute, Queensland University of Technology, Brisbane, 4102, Australia
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
- Elham Nazari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- College of Medicine, University of Warith Al-Anbiyaa, Karbala, Iraq
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
|
12
|
Yin F, Zhao H, Lu S, Shen J, Li M, Mao X, Li F, Shi J, Li J, Dong B, Xue W, Zuo X, Yang X, Fan C. DNA-framework-based multidimensional molecular classifiers for cancer diagnosis. Nat Nanotechnol 2023; 18:677-686. [PMID: 36973399 DOI: 10.1038/s41565-023-01348-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 05/13/2022] [Accepted: 02/10/2023] [Indexed: 06/18/2023]
Abstract
A molecular classification of diseases that accurately reflects clinical behaviour lays the foundation of precision medicine. The development of in silico classifiers coupled with molecular implementation based on DNA reactions marks a key advance in more powerful molecular classification, but it nevertheless remains a challenge to process multiple molecular datatypes. Here we introduce a DNA-encoded molecular classifier that can physically implement the computational classification of multidimensional molecular clinical data. To produce unified electrochemical sensing signals across heterogeneous molecular binding events, we exploit DNA-framework-based programmable atom-like nanoparticles with n valence to develop valence-encoded signal reporters that enable linearity in translating virtually any biomolecular binding events to signal gains. Multidimensional molecular information in computational classification is thus precisely assigned weights for bioanalysis. We demonstrate the implementation of a molecular classifier based on programmable atom-like nanoparticles to perform biomarker panel screening and analyse a panel of six biomarkers across three-dimensional datatypes for a near-deterministic molecular taxonomy of prostate cancer patients.
Affiliation(s)
- Fangfei Yin
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Haipei Zhao
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Zhangjiang Institute for Advanced Study, and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shasha Lu
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Zhangjiang Institute for Advanced Study, and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou, China
- Juwen Shen
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Min Li
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Xiuhai Mao
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Fan Li
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Jiye Shi
- Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai, China
- Jiang Li
- Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai, China
- The Interdisciplinary Research Center, Shanghai Synchrotron Radiation Facility, Zhangjiang Laboratory, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, China
- Baijun Dong
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Wei Xue
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Xiaolei Zuo
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Zhangjiang Institute for Advanced Study, and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
- Xiurong Yang
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Zhangjiang Institute for Advanced Study, and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, China
- Chunhai Fan
- Institute of Molecular Medicine, Department of Urology, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Zhangjiang Institute for Advanced Study, and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China
|
13
|
Zhao J, Zhao B, Song X, Lyu C, Chen W, Xiong Y, Wei DQ. Subtype-DCC: decoupled contrastive clustering method for cancer subtype identification based on multi-omics data. Brief Bioinform 2023; 24:7005165. [PMID: 36702755 DOI: 10.1093/bib/bbad025] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/21/2022] [Revised: 12/21/2022] [Accepted: 01/08/2023] [Indexed: 01/28/2023] Open
Abstract
Due to the high heterogeneity and complexity of cancers, patients with different cancer subtypes often have distinct groups of genomic and clinical characteristics. Therefore, the discovery and identification of cancer subtypes are crucial to cancer diagnosis, prognosis and treatment. Recent technological advances have accelerated the increasing availability of multi-omics data for cancer subtyping. To take advantage of the complementary information from multi-omics data, it is necessary to develop computational models that can represent and integrate different layers of data into a single framework. Here, we propose a decoupled contrastive clustering method (Subtype-DCC) based on multi-omics data integration for clustering to identify cancer subtypes. The idea of contrastive learning is introduced into deep clustering based on deep neural networks to learn clustering-friendly representations. Experimental results demonstrate the superior performance of the proposed Subtype-DCC model in identifying cancer subtypes over the currently available state-of-the-art clustering methods. The strength of Subtype-DCC is also supported by the survival and clinical analysis.
Affiliation(s)
- Jing Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Bowen Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiaotong Song
- School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
- Chujun Lyu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Weizhi Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nayang, Henan, 473006, China
|
14
|
Ochoa S, Hernández-Lemus E. Functional impact of multi-omic interactions in breast cancer subtypes. Front Genet 2023; 13:1078609. [PMID: 36685900 PMCID: PMC9850112 DOI: 10.3389/fgene.2022.1078609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 12/15/2022] [Indexed: 01/07/2023] Open
Abstract
Multi-omic approaches are expected to deliver a broader molecular view of cancer. However, the promised mechanistic explanations have not quite settled yet. Here, we propose a theoretical and computational analysis framework to semi-automatically produce network models of the regulatory constraints influencing a biological function. This way, we identified functions significantly enriched on the analyzed omics and described associated features, for each of the four breast cancer molecular subtypes. For instance, we identified functions sustaining over-representation of invasion-related processes in the basal subtype and DNA modification processes in the normal tissue. We found limited overlap on the omics-associated functions between subtypes; however, a startling feature intersection within subtype functions also emerged. The examples presented highlight new, potentially regulatory features, with sound biological reasons to expect a connection with the functions. Multi-omic regulatory networks thus constitute reliable models of the way omics are connected, demonstrating a capability for systematic generation of mechanistic hypothesis.
Affiliation(s)
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico; Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico; *Correspondence: Enrique Hernández-Lemus
|
15
|
Sun Q, Cheng L, Meng A, Ge S, Chen J, Zhang L, Gong P. SADLN: Self-attention based deep learning network of integrating multi-omics data for cancer subtype recognition. Front Genet 2023; 13:1032768. [PMID: 36685873 PMCID: PMC9846505 DOI: 10.3389/fgene.2022.1032768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 12/15/2022] [Indexed: 01/05/2023] Open
Abstract
Integrating multi-omics data for cancer subtype recognition is an important task in bioinformatics. Recently, deep learning has been applied to recognize cancer subtypes. However, most existing studies integrate the multi-omics data simply by concatenating them into a single dataset and then learning a latent low-dimensional representation through a deep learning model, which does not account for the differing distributions of the omics data. Moreover, these methods ignore the relationships between samples. To tackle these problems, we proposed SADLN: a self-attention based deep learning network for integrating multi-omics data for cancer subtype recognition. SADLN combines an encoder, self-attention, a decoder, and a discriminator into a unified framework, which can not only integrate multi-omics data but also adaptively model the relationships between samples to learn an accurate latent low-dimensional representation. With the integrated representation learned by the network, SADLN uses a Gaussian Mixture Model to identify cancer subtypes. Experiments on ten TCGA cancer datasets demonstrated the advantages of SADLN compared with ten other methods. The Self-Attention Based Deep Learning Network (SADLN) is an effective method for integrating multi-omics data for cancer subtype recognition.
Affiliation(s)
- Qiuwen Sun
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, China
- Lei Cheng
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, China
- Ao Meng
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, China
- Shuguang Ge
- School of Information and Control Engineering, University of Mining and Technology, Xuzhou, China
- Jie Chen
- Department of Radiation Oncology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Longzhen Zhang
- Department of Radiation Oncology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Ping Gong
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, China; *Correspondence: Ping Gong
|
16
|
Ravindran U, Gunavathi C. A survey on gene expression data analysis using deep learning methods for cancer diagnosis. Prog Biophys Mol Biol 2023; 177:1-13. [PMID: 35988771 DOI: 10.1016/j.pbiomolbio.2022.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/27/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 02/07/2023]
Abstract
Gene expression data are biological data from which meaningful hidden information about genes can be extracted. This information is used for disease diagnosis, especially in cancer treatment, based on variations in gene expression levels. DNA microarray is an efficient method for gene expression classification and for predicting specific types of cancer. Due to the abundance of computing power, deep learning (DL) has become a widespread technique in the healthcare sector. Gene expression datasets have a limited number of samples but a large number of features. Data augmentation is needed for gene expression datasets to overcome this dimensionality problem. It is a technique for generating synthetic samples to increase the diversity of the data. Deep learning methods are designed to learn and extract features from raw input data in the form of multidimensional arrays. This paper reviews existing research on deep learning techniques such as the Feed Forward Neural Network (FFN), Convolutional Neural Network (CNN), Autoencoder (AE) and Recurrent Neural Network (RNN) for the classification and prediction of cancer and its types through gene expression data analysis.
|
17
|
Nisa MU, Farooq S, Ali S, Eachkoti R, Rehman MU, Hafiz S. Proteomics: A modern tool for identifying therapeutic targets in different types of carcinomas. Proteomics 2023. [DOI: 10.1016/b978-0-323-95072-5.00013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
|
18
|
Hao X, Cheng S, Jiang B, Xin S. Applying multi-omics techniques to the discovery of biomarkers for acute aortic dissection. Front Cardiovasc Med 2022; 9:961991. [PMID: 36588568 PMCID: PMC9797526 DOI: 10.3389/fcvm.2022.961991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Acute aortic dissection (AAD) is a cardiovascular disease that manifests suddenly and fatally. Due to the lack of specific early symptoms, many patients with AAD are often overlooked or misdiagnosed, which is undoubtedly catastrophic for patients. The particular pathogenic mechanism of AAD is yet unknown, which makes clinical pharmacological therapy extremely difficult. Therefore, it is necessary and crucial to find and employ unique biomarkers for Acute aortic dissection (AAD) as soon as possible in clinical practice and research. This will aid in the early detection of AAD and give clear guidelines for the creation of focused treatment agents. This goal has been made attainable over the past 20 years by the quick advancement of omics technologies and the development of high-throughput tissue specimen biomarker screening. The primary histology data support and add to one another to create a more thorough and three-dimensional picture of the disease. Based on the introduction of the main histology technologies, in this review, we summarize the current situation and most recent developments in the application of multi-omics technologies to AAD biomarker discovery and emphasize the significance of concentrating on integration concepts for integrating multi-omics data. In this context, we seek to offer fresh concepts and recommendations for fundamental investigation, perspective innovation, and therapeutic development in AAD.
Affiliation(s)
- Xinyu Hao
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Shuai Cheng
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Bo Jiang
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Shijie Xin
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China; *Correspondence: Shijie Xin
|
19
|
Xu Y, Wu M, Ma S. Multidimensional molecular measurements-environment interaction analysis for disease outcomes. Biometrics 2022; 78:1542-1554. [PMID: 34213006 PMCID: PMC9366385 DOI: 10.1111/biom.13526] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/06/2020] [Revised: 02/27/2021] [Accepted: 06/28/2021] [Indexed: 12/30/2022]
Abstract
Multiple types of molecular (genetic, genomic, epigenetic, etc.) measurements, environmental risk factors, and their interactions have been found to contribute to the outcomes and phenotypes of complex diseases. In each of the previous studies, only the interactions between one type of molecular measurement and environmental risk factors have been analyzed. In recent biomedical studies, multidimensional profiling, in which data from multiple types of molecular measurements are collected from the same subjects, is becoming popular. A myriad of recent studies have shown that collectively analyzing multiple types of molecular measurements is not only biologically sensible but also leads to improved estimation and prediction. In this study, we conduct an M-E interaction analysis, with M standing for multidimensional molecular measurements and E standing for environmental risk factors. This can accommodate multiple types of molecular measurements and sufficiently account for their overlapping as well as independent information. Extensive simulation shows that it outperforms several closely related alternatives. In the analysis of TCGA (The Cancer Genome Atlas) data on lung adenocarcinoma and cutaneous melanoma, we make some stable biological findings and achieve stable prediction.
Affiliation(s)
- Yaqing Xu
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
- Mengyun Wu
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Shuangge Ma
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
|
20
|
Maghsoudi Z, Nguyen H, Tavakkoli A, Nguyen T. A comprehensive survey of the approaches for pathway analysis using multi-omics data integration. Brief Bioinform 2022; 23:6761962. [PMID: 36252928 PMCID: PMC9677478 DOI: 10.1093/bib/bbac435] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/28/2022] [Revised: 08/26/2022] [Accepted: 09/08/2022] [Indexed: 02/07/2023] Open
Abstract
Pathway analysis has been widely used to detect pathways and functions associated with complex disease phenotypes. The proliferation of this approach is due to better interpretability of its results and its higher statistical power compared with the gene-level statistics. A plethora of pathway analysis methods that utilize multi-omics setup, rather than just transcriptomics or proteomics, have recently been developed to discover novel pathways and biomarkers. Since multi-omics gives multiple views into the same problem, different approaches are employed in aggregating these views into a comprehensive biological context. As a result, a variety of novel hypotheses regarding disease ideation and treatment targets can be formulated. In this article, we review 32 such pathway analysis methods developed for multi-omics and multi-cohort data. We discuss their availability and implementation, assumptions, supported omics types and databases, pathway analysis techniques and integration strategies. A comprehensive assessment of each method's practicality, and a thorough discussion of the strengths and drawbacks of each technique will be provided. The main objective of this survey is to provide a thorough examination of existing methods to assist potential users and researchers in selecting suitable tools for their data and analysis purposes, while highlighting outstanding challenges in the field that remain to be addressed for future development.
Affiliation(s)
- Zeynab Maghsoudi
- Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Ha Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Alireza Tavakkoli
- Department of Computer Science and Engineering, University of Nevada, Reno, 89557, Nevada, USA
- Tin Nguyen
- Corresponding author: Tin Nguyen, Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA. Tel.: +1-775-784-6619
|
21
|
Abstract
We propose a method for supervised learning with multiple sets of features ("views"). The multiview problem is especially important in biology and medicine, where "-omics" data, such as genomics, proteomics, and radiomics, are measured on a common set of samples. "Cooperative learning" combines the usual squared-error loss of predictions with an "agreement" penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. One version of our fitting procedure is modular, where one can choose different fitting mechanisms (e.g., lasso, random forests, boosting, or neural networks) appropriate for different data views. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty, yielding feature sparsity. The method can be especially powerful when the different data views share some underlying relationship in their signals that can be exploited to boost the signals. We show that cooperative learning achieves higher predictive accuracy on simulated data and real multiomics examples of labor-onset prediction. By leveraging aligned signals and allowing flexible fitting mechanisms for different modalities, cooperative learning offers a powerful approach to multiomics data fusion.
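The objective described in this abstract can be sketched for two views X and Z with linear models as minimising ||y - Xθx - Zθz||² + ρ||Xθx - Zθz||², where ρ is the weight of the agreement penalty. The sketch below is an illustration under stated assumptions, not the authors' implementation: it stacks an augmented block of rows so that ordinary least squares minimises this objective, replaces the lasso penalty mentioned in the abstract with a tiny ridge term for numerical stability, and uses made-up toy data and hypothetical function names.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def cooperative_fit(X, Z, y, rho, ridge=1e-8):
    """Least-squares fit with an agreement penalty of weight rho.

    Rows [x, z] target y; extra rows [-sqrt(rho) x, sqrt(rho) z] target 0,
    so the summed squared residual equals the penalised objective.
    The ridge term is an assumption added here for stability only.
    """
    s = rho ** 0.5
    rows = [xr + zr for xr, zr in zip(X, Z)]
    rows += [[-s * v for v in xr] + [s * v for v in zr] for xr, zr in zip(X, Z)]
    target = list(y) + [0.0] * len(y)
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(p)]
    theta = solve_linear(A, b)
    px = len(X[0])
    return theta[:px], theta[px:]


def agreement_gap(X, Z, theta_x, theta_z):
    """Sum of squared differences between the two views' predictions."""
    return sum(
        (sum(a * t for a, t in zip(xr, theta_x))
         - sum(a * t for a, t in zip(zr, theta_z))) ** 2
        for xr, zr in zip(X, Z)
    )


# Toy data (hypothetical): two one-feature views measured on four samples.
X = [[1.0], [2.0], [3.0], [4.0]]
Z = [[1.1], [1.9], [3.2], [3.9]]
y = [2.0, 4.1, 6.0, 8.1]

for rho in (0.0, 1.0):
    tx, tz = cooperative_fit(X, Z, y, rho)
    print(rho, agreement_gap(X, Z, tx, tz))
```

With ρ = 0 the fit is ordinary least squares on the concatenated views, and increasing ρ pulls the two views' predictions toward each other. In practice ρ would be chosen on a validation set or by cross-validation, as the abstract describes.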
|
22
|
Li Y, Li T, Zhai D, Xie C, Kuang X, Lin Y, Shao N. Quantification of ferroptosis pathway status revealed heterogeneity in breast cancer patients with distinct immune microenvironment. Front Oncol 2022; 12:956999. [PMID: 36119477 PMCID: PMC9478851 DOI: 10.3389/fonc.2022.956999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Accepted: 07/29/2022] [Indexed: 11/17/2022] Open
Abstract
The clinical significance and biological functions of the ferroptosis pathway have been addressed in all aspects of cancer at the multi-omics level; however, the overall status of ferroptosis pathway alteration has been hard to evaluate. The aim of this study is to comprehensively analyze the putative biological, pathological, and clinical functions of the ferroptosis pathway in breast cancer at the pathway level. By adopting the bioinformatic algorithm “pathifier”, we quantified five programmed cell death (PCD) pathways (KO04210 Apoptosis; KO04216 Ferroptosis; KO04217 Necroptosis; GO:0070269 Pyroptosis; GO:0048102 Autophagic cell death) in breast cancer patients. We characterized the clinical features and prognostic value of each pathway in breast cancer and found significantly activated PCD in cancer patients, among which ferroptosis demonstrated a significant correlation with the prognosis of breast cancer. Correlation analysis between PCD pathways identified intra-tumor heterogeneity of breast cancer. Therefore, patients were clustered based on the status of their PCD pathways. Comparisons between subgroups highlighted specifically activated ferroptosis in cluster 2 patients, whose tumor immunity and microenvironment status was distinct from that of other clusters, indicating putative correlations with ferroptosis. NDUFA13 was identified and selected as a putative biomarker for cluster 2 patients. Experimental validation was carried out at the cellular level; NDUFA13 showed an important role in regulating ferroptosis activation and can serve as a biomarker for ferroptosis pathway status. In conclusion, the status of the ferroptosis pathway significantly correlated with the clinical outcomes and intra-tumor heterogeneity of breast cancer, and NDUFA13 expression was identified as a positive biomarker for ferroptosis pathway activation in breast cancer patients.
Affiliation(s)
- Yuying Li
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Laboratory of Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Tianfu Li
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Laboratory of Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Duanyang Zhai
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Laboratory of Surgery, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Chuanbo Xie
- Cancer Prevention Center, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
| | - Xiaying Kuang
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Ying Lin
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- *Correspondence: Nan Shao; Ying Lin
| | - Nan Shao
- Breast Disease Center, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- *Correspondence: Nan Shao; Ying Lin
| |
|
23
|
Takahashi H, Kawahara D, Kikuchi Y. Understanding Breast Cancers through Spatial and High-Resolution Visualization Using Imaging Technologies. Cancers (Basel) 2022; 14:4080. [PMID: 36077616 PMCID: PMC9454728 DOI: 10.3390/cancers14174080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 08/12/2022] [Accepted: 08/18/2022] [Indexed: 11/17/2022] Open
Abstract
Breast cancer is the most common cancer affecting women worldwide. Although many analyses and treatments have traditionally targeted the breast cancer cells themselves, recent studies have focused on investigating entire cancer tissues, including breast cancer cells. To understand the structure of breast cancer tissues, including breast cancer cells, it is necessary to investigate the three-dimensional location of the cells and/or proteins comprising the tissues and to clarify the relationship between the three-dimensional structure and malignant transformation or metastasis of breast cancers. In this review, we aim to summarize the methods for analyzing the three-dimensional structure of breast cancer tissue, paying particular attention to the recent technological advances in the combination of the tissue-clearing method and optical three-dimensional imaging. We also aimed to identify the latest methods for exploring the relationship between the three-dimensional cell arrangement in breast cancer tissues and the gene expression of each cell. Finally, we aimed to describe the three-dimensional imaging features of breast cancer tissues using noninvasive photoacoustic imaging methods.
|
24
|
Diakun A, Khosrawipour T, Mikolajczyk-Martinez A, Nicpoń J, Kiełbowicz Z, Prządka P, Liszka B, Kielan W, Zielinski K, Migdal P, Lau H, Li S, Khosrawipour V. The Onset of In-Vivo Dehydration in Gas-Based Intraperitoneal Hyperthermia and Its Cytotoxic Effects on Colon Cancer Cells. Front Oncol 2022; 12:927714. [PMID: 35847916 PMCID: PMC9278806 DOI: 10.3389/fonc.2022.927714] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/24/2022] [Accepted: 05/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background Peritoneal metastasis (PM) is an ongoing challenge in surgical oncology. Current therapeutic options, including intravenous and intraperitoneal (i.p.) chemotherapies, display limited clinical efficacy, resulting in an overall poor prognosis in affected patients. Combined hyperthermia and dehydration induced by a high-flow, gas-based i.p. hyperthermic procedure could be a novel approach in PM treatment. Our study is the first to evaluate the therapeutic potential of i.p. dehydration, hyperthermia, as well as the combination of both mechanisms in an in-vivo setting. Methods For this study, three swine were subjected to diagnostic laparoscopy under a high-flow air stream at 48 °C, 49 °C and 50 °C. Hygrometry of the in- and outflow airstream was measured to calculate surface evaporation and i.p. dehydration. To analyze the effects of this concept, in vitro colon cancer cells (HT-29) were treated with hyperthermia and dehydration. Cytotoxicity and cell viability were measured at different time intervals. Additionally, structural changes of dehydrated cells were analyzed using scanning electron microscopy. Results According to our results, both dehydration and hyperthermia were cytotoxic to HT-29 cells. However, while dehydration reduced cell viability, hyperthermia did not. Nevertheless, the effect of dehydration on cell viability was significantly increased when combined with hyperthermia (p<0.01). Conclusions Changes to the physiological milieu of the peritoneal cavity could significantly reduce PM. Therefore, limited dehydration of the abdominal cavity might be a feasible, additional tool in PM treatment. Further studies are required to investigate dehydration effects and their applicability in PM management.
Affiliation(s)
- Agata Diakun
- 2nd Department of General Surgery and Surgical Oncology, Wroclaw Medical University, Wroclaw, Poland
| | - Tanja Khosrawipour
- Department of Surgery (A), University-Hospital Düsseldorf, Düsseldorf, Germany; Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany
| | - Agata Mikolajczyk-Martinez
- Department of Biochemistry and Molecular Biology, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
| | - Jakub Nicpoń
- Department of Surgery, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
| | - Zdzisław Kiełbowicz
- Department of Surgery, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
| | - Przemysław Prządka
- Department of Surgery, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
| | - Bartłomiej Liszka
- Department of Surgery, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
| | - Wojciech Kielan
- 2nd Department of General Surgery and Surgical Oncology, Wroclaw Medical University, Wroclaw, Poland
| | - Kacper Zielinski
- Department of Anesthesiology, Wroclaw Medical University, Wroclaw, Poland
| | - Pawel Migdal
- Department of Environment, Hygiene and Animal Welfare, University of Environmental and Life Sciences, Wroclaw, Poland
| | - Hien Lau
- Department of Surgery, University of California, Irvine, Irvine, CA, United States
| | - Shiri Li
- Division of Colon and Rectal Surgery, Department of Surgery, New York Presbyterian Hospital- Weill Cornell College of Medicine, New York, NY, United States
| | - Veria Khosrawipour
- Department of Biochemistry and Molecular Biology, Faculty of Veterinary Sciences, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland; Department of Surgery, Petrus-Hospital Wuppertal, Wuppertal, Germany
| |
|
25
|
Rao X, Cao H, Yu Q, Ou X, Deng R, Huang J. NEAT1/MALAT1/XIST/PKD--Hsa-Mir-101-3p--DLGAP5 Axis as a Novel Diagnostic and Prognostic Biomarker Associated With Immune Cell Infiltration in Bladder Cancer. Front Genet 2022; 13:892535. [PMID: 35873473 PMCID: PMC9305813 DOI: 10.3389/fgene.2022.892535] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/09/2022] [Accepted: 06/16/2022] [Indexed: 12/24/2022] Open
Abstract
Background: The clinical value of the biomarkers of bladder cancer (BC) is limited due to their low sensitivity or specificity. As a biomarker, DLG associated protein 5 (DLGAP5) is a potential cell cycle regulator in cancer cell carcinogenesis. However, its functional part in BC remains unclear. Therefore, this study aims to identify DLGAP5 expression in BC and its potential diagnostic and prognostic values. Eventually, it predicts the possible RNA regulatory pathways of BC. Methods: Data on DLGAP5 expression levels in BC and normal bladder tissues were obtained from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) databases. The receiver operating characteristic (ROC), Kaplan–Meier survival curves, and the univariate and multivariate Cox regression analysis determined the diagnostic and prognostic values of DLGAP5 in BC patients. Finally, the StarBase predicted the target RNAs and constructed networks using Cytoscape. Results: DLGAP5 expression was significantly upregulated in BC tissue, verified by the TCGA (p < 0.001), GSE3167, GSE7476, and GSE65635 datasets (p < 0.01). BC patients with increased DLGAP5 had poor overall survival (OS) (p = 0.01), disease specific survival (DSS) (p = 0.006) and progress free interval (DFI) (p = 0.007). The area under the ROC curve (AUC) was 0.913. The multivariate Cox analysis identified that lymphovascular invasion (p = 0.007) and DLGAP5 (p = 0.002) were independent prognostic factors. Conclusion: Increased DLGAP5 expression was closely associated with a poor prognosis in BC patients. In this case, DLGAP5 might be a diagnostic and prognostic biomarker for BC. DLGAP5 expression might be regulated by NEAT1/MALAT1/XIST/PKD--Hsa-mir-101-3p pathways.
Affiliation(s)
- Xiaosheng Rao
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Haiyan Cao
- Department of Nephrology, Tianjin Medical University General Hospital, Tianjin, China
| | - Qingfeng Yu
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Xiuyu Ou
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Ruiqi Deng
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Jinkun Huang
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- *Correspondence: Jinkun Huang
| |
|
26
|
Crawford J, Christensen BC, Chikina M, Greene CS. Widespread redundancy in -omics profiles of cancer mutation states. Genome Biol 2022; 23:137. [PMID: 35761387 PMCID: PMC9238138 DOI: 10.1186/s13059-022-02705-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2021] [Accepted: 06/14/2022] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and which layer of cellular function is most suitable to capture the relevant signal remains unclear. RESULTS We consider prediction of cancer mutation status (presence or absence) from functional -omics data as a representative problem that presents an opportunity to quantify and compare the ability of different -omics readouts to capture signals of dysregulation in cancer. From the TCGA Pan-Cancer Atlas that contains genetic alteration data, we focus on RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures as -omics readouts. Across a collection of genes recurrently mutated in cancer, RNA sequencing tends to be the most effective predictor of mutation state. We find that one or more other data types for many of the genes are approximately equally effective predictors. Performance is more variable between mutations than that between data types for the same mutation, and there is little difference between the top data types. We also find that combining data types into a single multi-omics model provides little or no improvement in predictive ability over the best individual data type. CONCLUSIONS Based on our results, for the design of studies focused on the functional outcomes of cancer mutations, there are often multiple -omics types that can serve as effective readouts, although gene expression seems to be a reasonable default option.
Affiliation(s)
- Jake Crawford
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Brock C. Christensen
- Department of Epidemiology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA; Department of Molecular and Systems Biology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
| | - Maria Chikina
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| | - Casey S. Greene
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, USA; Center for Health AI, University of Colorado School of Medicine, Aurora, CO, USA
| |
|
27
|
Courbariaux M, De Santiago K, Dalmasso C, Danjou F, Bekadar S, Corvol JC, Martinez M, Szafranski M, Ambroise C. A Sparse Mixture-of-Experts Model With Screening of Genetic Associations to Guide Disease Subtyping. Front Genet 2022; 13:859462. [PMID: 35734430 PMCID: PMC9207464 DOI: 10.3389/fgene.2022.859462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 04/21/2022] [Indexed: 11/27/2022] Open
Abstract
Motivation: Identifying new genetic associations in non-Mendelian complex diseases is an increasingly difficult challenge. These diseases sometimes appear to have a significant component of heritability requiring explanation, and this missing heritability may be due to the existence of subtypes involving different genetic factors. Taking genetic information into account in clinical trials might potentially have a role in guiding the process of subtyping a complex disease. Most methods dealing with multiple sources of information rely on data transformation, and in disease subtyping, the two main strategies used are 1) the clustering of clinical data followed by posterior genetic analysis and 2) the concomitant clustering of clinical and genetic variables. Both of these strategies have limitations that we propose to address. Contribution: This work proposes an original method for disease subtyping on the basis of both longitudinal clinical variables and high-dimensional genetic markers via a sparse mixture-of-regressions model. The added value of our approach lies in its interpretability in relation to two aspects. First, our model links both clinical and genetic data with regard to their initial nature (i.e., without transformation) and does not require post-processing where the original information is accessed a second time to interpret the subtypes. Second, it can address large-scale problems because of a variable selection step that is used to discard genetic variables that may not be relevant for subtyping. Results: The proposed method was validated on simulations. A dataset from a cohort of Parkinson's disease patients was also analyzed. Several subtypes of the disease and genetic variants that potentially have a role in this typology were identified. Software availability: The R code for the proposed method, named DiSuGen, and a tutorial are available for download (see the references).
Affiliation(s)
- Marie Courbariaux
- Université Paris-Saclay, CNRS, Université d’Évry, Laboratoire de Mathématiques et Modélisation d’Évry, Évry-Courcouronnes, France
| | - Kylliann De Santiago
- Université Paris-Saclay, CNRS, Université d’Évry, Laboratoire de Mathématiques et Modélisation d’Évry, Évry-Courcouronnes, France
| | - Cyril Dalmasso
- Université Paris-Saclay, CNRS, Université d’Évry, Laboratoire de Mathématiques et Modélisation d’Évry, Évry-Courcouronnes, France
| | - Fabrice Danjou
- Sorbonne Université, Paris Brain Institute–ICM, Inserm, CNRS, Assistance Publique Hôpitaux de Paris, Pitié-Salpêtrière Hospital, Department of Neurology, Paris, France
| | - Samir Bekadar
- Sorbonne Université, Paris Brain Institute–ICM, Inserm, CNRS, Assistance Publique Hôpitaux de Paris, Pitié-Salpêtrière Hospital, Department of Neurology, Paris, France
| | - Jean-Christophe Corvol
- Sorbonne Université, Paris Brain Institute–ICM, Inserm, CNRS, Assistance Publique Hôpitaux de Paris, Pitié-Salpêtrière Hospital, Department of Neurology, Paris, France
| | - Maria Martinez
- Institut de Recherche en Santé Digestive, Inserm, CHU Purpan, Toulouse, France
| | - Marie Szafranski
- Université Paris-Saclay, CNRS, Université d’Évry, Laboratoire de Mathématiques et Modélisation d’Évry, Évry-Courcouronnes, France
- ENSIIE, Évry-Courcouronnes, France
| | - Christophe Ambroise
- Université Paris-Saclay, CNRS, Université d’Évry, Laboratoire de Mathématiques et Modélisation d’Évry, Évry-Courcouronnes, France
| |
|
28
|
Yu C, Wang J. Data mining and mathematical models in cancer prognosis and prediction. Med Rev (Berl) 2022; 2:285-307. [PMID: 37724193 PMCID: PMC10388766 DOI: 10.1515/mr-2021-0026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 12/29/2021] [Indexed: 09/20/2023]
Abstract
Cancer is a fatal and complex disease. Individual differences within the same cancer type, or in the same patient at different stages of cancer development, may require distinct treatments. Pathological differences are reflected at the tissue, cell and gene levels. The interactions between the cancer cells and nearby microenvironments can also influence cancer progression and metastasis. It is a huge challenge to understand all of these mechanistically and quantitatively. Researchers have applied pattern recognition algorithms such as machine learning or data mining to predict cancer types or classifications. With rapidly growing and increasingly available computing power, researchers have begun to integrate huge data sets, multi-dimensional data types and information. Cells are controlled by gene expression, which is determined by promoter sequences and transcription regulators. For example, changes in gene expression through these underlying mechanisms can modify cell progression through the cell cycle. Such molecular activities can be governed by gene regulation through the underlying gene regulatory networks, which are essential for cancer study when the information and gene regulations are clear and available. In this review, we briefly introduce several machine learning methods for cancer prediction and classification, including Artificial Neural Networks (ANNs), Decision Trees (DTs), Support Vector Machine (SVM) and naive Bayes. Then we describe a few typical models for building gene regulatory networks, such as Correlation, Regression and Bayes methods, based on available data. These methods can help in cancer diagnosis and in predicting susceptibility, recurrence and survival. Finally, we summarize and compare the modeling methods used to analyze the development and progression of cancer through gene regulatory networks. These models can provide possible physical strategies to analyze cancer progression in a systematic and quantitative way.
Affiliation(s)
- Chong Yu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
- Department of Statistics, Jilin University of Finance and Economics, Changchun, Jilin Province, China
| | - Jin Wang
- Department of Chemistry and of Physics and Astronomy, State University of New York, Stony Brook, NY, USA
| |
|
29
|
Pouryahya M, Oh JH, Javanmard P, Mathews JC, Belkhatir Z, Deasy JO, Tannenbaum AR. aWCluster: A Novel Integrative Network-Based Clustering of Multiomics for Subtype Analysis of Cancer Data. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1472-1483. [PMID: 33226952 PMCID: PMC9518829 DOI: 10.1109/tcbb.2020.3039511] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
The remarkable growth of multi-platform genomic profiles has led to the challenge of multiomics data integration. In this study, we present a novel network-based multiomics clustering founded on the Wasserstein distance from optimal mass transport. This distance has many important geometric properties making it a suitable choice for application in machine learning and clustering. Our proposed method of aggregating multiomics and Wasserstein distance clustering (aWCluster) is applied to breast carcinoma as well as bladder carcinoma, colorectal adenocarcinoma, renal carcinoma, lung non-small cell adenocarcinoma, and endometrial carcinoma from The Cancer Genome Atlas project. Subtypes were characterized by the concordant effect of mRNA expression, DNA copy number alteration, and DNA methylation of genes and their neighbors in the interaction network. aWCluster successfully clusters all cancer types into classes with significantly different survival rates. Also, a gene ontology enrichment analysis of significant genes in the low survival subgroup of breast cancer leads to the well-known phenomenon of tumor hypoxia and the transcription factor ETS1 whose expression is induced by hypoxia. We believe aWCluster has the potential to discover novel subtypes and biomarkers by accentuating the genes that have concordant multiomics measurements in their interaction network, which are challenging to find without the network inference or with single omics analysis.
|
30
|
Hajjaji N, Aboulouard S, Cardon T, Bertin D, Robin YM, Fournier I, Salzet M. Path to Clonal Theranostics in Luminal Breast Cancers. Front Oncol 2022; 11:802177. [PMID: 35096604 PMCID: PMC8793283 DOI: 10.3389/fonc.2021.802177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/26/2021] [Accepted: 12/06/2021] [Indexed: 12/18/2022] Open
Abstract
Integrating tumor heterogeneity in the drug discovery process is a key challenge to tackle breast cancer resistance. Identifying protein targets for functionally distinct tumor clones is particularly important to tailor therapy to the heterogeneous tumor subpopulations and achieve clonal theranostics. For this purpose, we performed an unsupervised, label-free, spatially resolved shotgun proteomics guided by MALDI mass spectrometry imaging (MSI) on 124 selected tumor clonal areas from early luminal breast cancers, tumor stroma, and breast cancer metastases. 2868 proteins were identified. The main protein classes found in the clonal proteome dataset were enzymes, cytoskeletal proteins, membrane-traffic, translational or scaffold proteins, or transporters. As a comparison, gene-specific transcriptional regulators, chromatin related proteins or transmembrane signal receptor were more abundant in the TCGA dataset. Moreover, 26 mutated proteins have been identified. Similarly, expanding the search to alternative proteins databases retrieved 126 alternative proteins in the clonal proteome dataset. Most of these alternative proteins were coded mainly from non-coding RNA. To fully understand the molecular information brought by our approach and its relevance to drug target discovery, the clonal proteomic dataset was further compared to the TCGA breast cancer database and two transcriptomic panels, BC360 (nanoString®) and CDx (Foundation One®). We retrieved 139 pathways in the clonal proteome dataset. Only 55% of these pathways were also present in the TCGA dataset, 68% in BC360 and 50% in CDx. Seven of these pathways have been suggested as candidate for drug targeting, 22 have been associated with breast cancer in experimental or clinical reports, the remaining 19 pathways have been understudied in breast cancer. Among the anticancer drugs, 35 drugs matched uniquely with the clonal proteome dataset, with only 7 of them already approved in breast cancer. 
The number of target and drug interactions with non-anticancer drugs (such as agents targeting the cardiovascular system, metabolism, the musculoskeletal or the nervous systems) was higher in the clonal proteome dataset (540 interactions) compared to TCGA (83 interactions), BC360 (419 interactions), or CDx (172 interactions). Many of the protein targets identified and drugs screened were clinically relevant to breast cancer and are in clinical trials. Thus, we described the non-redundant knowledge brought by this clone-tailored approach compared to TCGA or transcriptomic panels, the targetable proteins identified in the clonal proteome dataset, and the potential of this approach for drug discovery and repurposing through drug interactions with antineoplastic agents and non-anticancer drugs.
Affiliation(s)
- Nawale Hajjaji
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France; Breast Cancer Unit, Oscar Lambret Center, Lille, France
| | - Soulaimane Aboulouard
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France
| | - Tristan Cardon
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France
| | - Delphine Bertin
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France; Breast Cancer Unit, Oscar Lambret Center, Lille, France
| | - Yves-Marie Robin
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France; Breast Cancer Unit, Oscar Lambret Center, Lille, France
| | - Isabelle Fournier
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France; Institut universitaire de France, Paris, France
| | - Michel Salzet
- Univ. Lille, Inserm, CHU Lille, U1192, Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse (PRISM), Lille, France; Institut universitaire de France, Paris, France
| |
|
31
|
Abstract
Rheumatoid arthritis (RA) is an autoimmune disorder characterized by inflammation and bone erosion. The exact mechanism of RA is still unknown, but various immune cytokines, signaling pathways and effector cells are involved. Disease-modifying antirheumatic drugs (DMARDs) are commonly used in RA treatment and classified into different categories. Nevertheless, RA treatment is based on a "trial-and-error" approach, and a substantial proportion of patients show failed therapy for each DMARD. Over the past decades, great efforts have been made to overcome treatment failure, including identification of biomarkers, exploration of the reasons for loss of efficacy, development of sequential or combinational DMARDs strategies and approval of new DMARDs. Here, we summarize these efforts, which would provide valuable insights for accurate RA clinical medication. While gratifying, researchers realize that these efforts are still far from enough to recommend specific DMARDs for individual patients. Precision medicine is an emerging medical model that proposes a highly individualized and tailored approach for disease management. In this review, we also discuss the potential of precision medicine for overcoming RA treatment failure, with the introduction of various cutting-edge technologies and big data.
Affiliation(s)
- Zhuqian Wang
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China; Institute of Integrated Bioinfomedicine and Translational Science (IBTS), School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China; Law Sau Fai Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China
| | - Jie Huang
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Duoli Xie
- Institute of Integrated Bioinfomedicine and Translational Science (IBTS), School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China; Law Sau Fai Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China
| | - Dongyi He
- Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China; Department of Rheumatology, Shanghai Guanghua Hospital of Integrative Medicine, Shanghai, China
| | - Aiping Lu
- Institute of Integrated Bioinfomedicine and Translational Science (IBTS), School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China; Law Sau Fai Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China; Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China; Guangdong-Hong Kong-Macau Joint Lab on Chinese Medicine and Immune Disease Research, Guangzhou, China
| | - Chao Liang
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China; Institute of Integrated Bioinfomedicine and Translational Science (IBTS), School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China; Law Sau Fai Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, Hong Kong SAR, China
| |
|
32
|
Wang Z, Peng X, Xia A, Shah AA, Huang Y, Zhu X, Zhu X, Liao Q. The role of machine learning to boost the bioenergy and biofuels conversion. Bioresour Technol 2022; 343:126099. [PMID: 34626766 DOI: 10.1016/j.biortech.2021.126099] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 08/09/2021] [Revised: 10/04/2021] [Accepted: 10/05/2021] [Indexed: 06/13/2023]
Abstract
The development and application of bioenergy and biofuels conversion technology can play a significant role in the production of renewable and sustainable energy sources in the future. However, the complexity of bioenergy systems and the limitations of human understanding make it difficult to build models based on experience or theory for accurate predictions. Recent developments in data science and machine learning (ML) can provide new opportunities. Accordingly, this critical review provides a deep insight into the application of ML in the bioenergy context. The latest advances in ML-assisted bioenergy technology, including energy utilization of lignocellulosic biomass, microalgae cultivation, biofuels conversion and application, are reviewed in detail. The strengths and limitations of ML in bioenergy systems are comprehensively analysed. Moreover, we highlight the capabilities and potential of advanced ML methods for the multifarious tasks that lie ahead in advancing a new generation of bioenergy and biofuels conversion technologies.
Affiliation(s)
- Zhengxin Wang
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
- Xinggan Peng
- School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore
- Ao Xia
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China.
- Akeel A Shah
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
- Yun Huang
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
- Xianqing Zhu
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
- Xun Zhu
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
- Qiang Liao
- Key Laboratory of Low-grade Energy Utilization Technologies and Systems, Chongqing University, Ministry of Education, Chongqing 400044, PR China; Institute of Engineering Thermophysics, School of Energy and Power Engineering, Chongqing University, Chongqing 400044, PR China
33
Fotopoulou C, Rockall A, Lu H, Lee P, Avesani G, Russo L, Petta F, Ataseven B, Waltering KU, Koch JA, Crum WR, Cunnea P, Heitz F, Harter P, Aboagye EO, du Bois A, Prader S. Validation analysis of the novel imaging-based prognostic radiomic signature in patients undergoing primary surgery for advanced high-grade serous ovarian cancer (HGSOC). Br J Cancer 2021; 126:1047-1054. [PMID: 34923575 PMCID: PMC8979975 DOI: 10.1038/s41416-021-01662-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/21/2021] [Revised: 11/23/2021] [Accepted: 12/01/2021] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND Predictive models based on radiomics features are novel, highly promising approaches for gynaecological oncology. Here, we wish to assess the prognostic value of the newly discovered Radiomic Prognostic Vector (RPV) in an independent cohort of high-grade serous ovarian cancer (HGSOC) patients, treated within a Centre of Excellence, thus avoiding any bias in treatment quality. METHODS RPV was calculated using standardised algorithms following segmentation of routine preoperative imaging of patients (n = 323) who underwent upfront debulking surgery (01/2011-07/2018). RPV was correlated with operability, survival and adjusted for well-established prognostic factors (age, postoperative residual disease, stage), and compared to previous validation models. RESULTS The distribution of low, medium and high RPV scores was 54.2% (n = 175), 33.4% (n = 108) and 12.4% (n = 40) across the cohort, respectively. High RPV scores independently associated with significantly worse progression-free survival (PFS) (HR = 1.69; 95% CI:1.06-2.71; P = 0.038), even after adjusting for stage, age, performance status and residual disease. Moreover, lower RPV was significantly associated with total macroscopic tumour clearance (OR = 2.02; 95% CI:1.56-2.62; P = 0.00647). CONCLUSIONS RPV was validated to independently identify those HGSOC patients who will not be operated tumour-free in an optimal setting, and those who will relapse early despite complete tumour clearance upfront. Further prospective, multicentre trials with a translational aspect are warranted for the incorporation of this radiomics approach into clinical routine.
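The category shares quoted in the RESULTS of this abstract can be re-derived from the reported counts. The following is a minimal sketch in Python; the names `rpv_counts` and `category_shares` are illustrative and do not come from the article, and only the counts (175, 108, 40) and the cohort size (n = 323) are taken from the abstract.

```python
# Counts per RPV category as reported in the abstract (cohort size n = 323).
# Variable and function names are hypothetical, chosen for illustration only.
rpv_counts = {"low": 175, "medium": 108, "high": 40}


def category_shares(counts):
    """Return each category's share of the cohort as a percentage, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_patients = sum(rpv_counts.values())
shares = category_shares(rpv_counts)

for name, share in shares.items():
    print(f"{name}: {share}% (n = {rpv_counts[name]} of {total_patients})")
```

The three counts sum to the stated cohort size, and the rounded shares agree with the percentages given in the abstract.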
Affiliation(s)
- Christina Fotopoulou
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK.
- Andrea Rockall
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK.,Department of Radiology, Imperial College Healthcare NHS Trust, London, W12 0HS, UK.,Cancer Imaging Centre, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK
- Haonan Lu
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK
- Philippa Lee
- Department of Radiology, Imperial College Healthcare NHS Trust, London, W12 0HS, UK
- Giacomo Avesani
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK.,Cancer Imaging Centre, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK.,Department of Imaging, Oncological Radiotherapy, and Hematology, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Luca Russo
- Department of Imaging, Oncological Radiotherapy, and Hematology, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Federica Petta
- Department of Imaging, Oncological Radiotherapy, and Hematology, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Beyhan Ataseven
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany.,Department of Obstetrics and Gynecology, University Hospital, LMU Munich, München, Germany
- Kai-Uwe Waltering
- Department of Radiology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany
- Jens Albrecht Koch
- Department of Radiology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany
- William R Crum
- Cancer Imaging Centre, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK.,Institute of Translational Medicine and Therapeutics (ITMAT), Imperial College, London, UK
- Paula Cunnea
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK
- Florian Heitz
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany.,Department for Gynecology with the Center for Oncologic Surgery Charité Campus Virchow-Klinikum, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Philipp Harter
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany
- Eric O Aboagye
- Cancer Imaging Centre, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, W12 0HS, UK
- Andreas du Bois
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany
- Sonia Prader
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Henricistr.92, 45136, Essen, Germany.,Department of Obstetrics and Gynecology, Brixen General Hospital, Brixen, Italy.,Department of Obstetrics and Gynecology, Innsbruck Medical University, Innsbruck, Austria
34
Abstract
Precision oncology is perceived as a way forward to treat individual cancer patients. However, knowing particular cancer mutations is not enough for optimal therapeutic treatment, because cancer genotype-phenotype relationships are nonlinear and dynamic. Systems biology studies the biological processes at the systems' level, using an array of techniques, ranging from statistical methods to network reconstruction and analysis, to mathematical modeling. Its goal is to reconstruct the complex and often counterintuitive dynamic behavior of biological systems and quantitatively predict their responses to environmental perturbations. In this paper, we review the impact of systems biology on precision oncology. We show examples of how the analysis of signal transduction networks allows to dissect resistance to targeted therapies and inform the choice of combinations of targeted drugs based on tumor molecular alterations. Patient-specific biomarkers based on dynamical models of signaling networks can have a greater prognostic value than conventional biomarkers. These examples support systems biology models as valuable tools to advance clinical and translational oncological research.
Affiliation(s)
- Andrea Rocca
- Hygiene and Public Health, Local Health Unit of Romagna, 47121 Forlì, Italy
- Boris N. Kholodenko
- Systems Biology Ireland, School of Medicine, University College Dublin, Belfield, D04 V1W8 Dublin, Ireland
- Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, D04 V1W8 Dublin, Ireland
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06520, USA
35
Mosca E, Bersanelli M, Matteuzzi T, Di Nanni N, Castellani G, Milanesi L, Remondini D. Characterization and comparison of gene-centered human interactomes. Brief Bioinform 2021; 22:bbab153. [PMID: 34010955 PMCID: PMC8574298 DOI: 10.1093/bib/bbab153] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2020] [Revised: 03/22/2021] [Accepted: 04/01/2021] [Indexed: 01/04/2023] Open
Abstract
The complex web of macromolecular interactions occurring within cells, the interactome, is the backbone of an increasing number of studies, but a clear consensus on the exact structure of this network is still lacking. Different genome-scale maps of the human interactome have been obtained through several experimental techniques and functional analyses. Moreover, these maps can be enriched through literature-mining approaches, and different combinations of various 'source' databases have been used in the literature. It is therefore unclear to what extent the various interactomes yield similar results when used in the context of interactome-based approaches in network biology. We compared a comprehensive list of human interactomes on the basis of topology, protein complexes, molecular pathways, pathway cross-talk and disease gene prediction. In a general context of relevant heterogeneity, our study provides a series of qualitative and quantitative parameters that describe the state of the art of human interactomes and guidelines for selecting interactomes in future applications.
Affiliation(s)
- Ettore Mosca
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Matteo Bersanelli
- Humanitas University, Department of Biomedical Sciences, Pieve Emanuele (Milan), 20090, Italy
- Tommaso Matteuzzi
- Department of Physics and Astronomy, University of Bologna, Bologna, 40127, Italy
- Noemi Di Nanni
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Gastone Castellani
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, 40127, Italy
- Luciano Milanesi
- Institute of Biomedical Technologies, National Research Council, Segrate (Milan), 20090, Italy
- Daniel Remondini
- Department of Physics and Astronomy, University of Bologna, Bologna, 40127, Italy
36
Shi K, Lin W, Zhao XM. Identifying Molecular Biomarkers for Diseases With Machine Learning Based on Integrative Omics. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:2514-2525. [PMID: 32305934 DOI: 10.1109/tcbb.2020.2986387] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
Molecular biomarkers are molecules or sets of molecules that can help in the diagnosis or prognosis of diseases or disorders. In the past decades, thanks to advances in high-throughput technologies, a huge amount of molecular 'omics' data, e.g., transcriptomics and proteomics, has been accumulated. The availability of these omics data makes it possible to screen biomarkers for diseases or disorders. Accordingly, a number of computational approaches have been developed to identify biomarkers by exploring the omics data. In this review, we present a comprehensive survey of recent progress in the identification of molecular biomarkers with machine learning approaches. Specifically, we categorize the machine learning approaches into supervised, unsupervised and recommendation approaches, where the biomarkers include single genes, gene sets and small gene networks. In addition, we discuss potential problems underlying biomedical data that may pose challenges for machine learning, and provide possible directions for future biomarker identification.
37
Lai X, Zhou J, Wessely A, Heppt M, Maier A, Berking C, Vera J, Zhang L. A disease network-based deep learning approach for characterizing melanoma. Int J Cancer 2021; 150:1029-1044. [PMID: 34716589 DOI: 10.1002/ijc.33860] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/26/2021] [Revised: 10/08/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]
Abstract
Multiple types of genomic variations are present in cutaneous melanoma and some of the genomic features may have an impact on the prognosis of the disease. The access to genomics data via public repositories such as The Cancer Genome Atlas (TCGA) allows for a better understanding of melanoma at the molecular level, therefore making characterization of substantial heterogeneity in melanoma patients possible. Here, we proposed an approach that integrates genomics data, a disease network, and a deep learning model to classify melanoma patients for prognosis, assess the impact of genomic features on the classification and provide interpretation to the impactful features. We integrated genomics data into a melanoma network and applied an autoencoder model to identify subgroups in TCGA melanoma patients. The model utilizes communities identified in the network to effectively reduce the dimensionality of genomics data into a patient score profile. Based on the score profile, we identified three patient subtypes that show different survival times. Furthermore, we quantified and ranked the impact of genomic features on the patient score profile using a machine-learning technique. Follow-up analysis of the top-ranking features provided us with the biological interpretation of them at both pathway and molecular levels, such as their mutation and interactome profiles in melanoma and their involvement in pathways associated with signaling transduction, immune system and cell cycle. Taken together, we demonstrated the ability of the approach to identify disease subgroups using a deep learning model that captures the most relevant information of genomics data in the melanoma network.
Affiliation(s)
- Xin Lai
- Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Deutsches Zentrum Immuntherapie, Erlangen, Germany.,Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Jinfei Zhou
- College of Computer Science, Sichuan University, Chengdu, China
- Anja Wessely
- Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Deutsches Zentrum Immuntherapie, Erlangen, Germany.,Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Markus Heppt
- Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Deutsches Zentrum Immuntherapie, Erlangen, Germany.,Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Andreas Maier
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Carola Berking
- Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Deutsches Zentrum Immuntherapie, Erlangen, Germany.,Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Julio Vera
- Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.,Deutsches Zentrum Immuntherapie, Erlangen, Germany.,Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Le Zhang
- College of Computer Science, Sichuan University, Chengdu, China
38
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022] Open
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever, that cancer should not only be viewed as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches in which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations and network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
- Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
- Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
39
Zhou L, Yang Y, Liu M, Gan Y, Liu R, Ren M, Zheng Y, Wang Y, Zhou Y. Identification of the RP11-21C4.1/SVEP1 gene pair associated with FAT2 mutations as a potential biomarker in gastric cancer. Bioengineered 2021; 12:4361-4373. [PMID: 34308747 PMCID: PMC8806586 DOI: 10.1080/21655979.2021.1953211] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
Gastric cancer (GC) is one of the most common malignancies worldwide. Despite rapid advances in systemic therapy, GC remains the third leading cause of cancer-related deaths. We aimed to identify a novel prognostic signature associated with FAT2 mutations in GC. We analyzed the expression levels of FAT2-mutant and FAT2-wildtype GC samples obtained from The Cancer Genome Atlas (TCGA). The Kaplan–Meier survival curve showed that patients with FAT2 mutations had a better prognosis than those without the mutation. Sixteen long non-coding RNAs (lncRNAs) and 62 messenger RNAs (mRNAs) associated with FAT2 mutations were correlated with the prognosis of GC. We then constructed a 4-mRNA signature and a 5-lncRNA signature for GC. Finally, using weighted gene correlation network analysis (WGCNA) and Cox proportional hazards regression analysis, we identified the most relevant gene pair, RP11-21C4.1/SVEP1, as a prognostic signature of GC that exhibited superior predictive performance in comparison with the 4-mRNA or 5-lncRNA signature. In this study, we constructed a prognostic signature of GC by integrative genomics analysis, which also provided insights into the molecular mechanisms linked to FAT2 mutations in GC.
Affiliation(s)
- Lingshan Zhou
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China.,Department of Geriatrics Ward 2, The First Hospital of Lanzhou University, Lanzhou, China
- Yuan Yang
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China
- Min Liu
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China
- Yuling Gan
- Department 1nd Department of Bone and Soft Tissue Oncology, Gansu Provincial Cancer Hospital, Lanzhou, China
- Rong Liu
- Department of Geriatrics Ward 2, The First Hospital of Lanzhou University, Lanzhou, China
- Man Ren
- Department of Geriatrics Ward 2, The First Hospital of Lanzhou University, Lanzhou, China
- Ya Zheng
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China
- Yuping Wang
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China
- Yongning Zhou
- Department of Gastroenterology, The First Hospital of Lanzhou University, Lanzhou, China
40
Abstract
In recent years, machine learning (ML) researchers have changed their focus towards biological problems that are difficult to analyse with standard approaches. Large initiatives such as The Cancer Genome Atlas (TCGA) have allowed the use of omic data for the training of these algorithms. In order to study the state of the art, this review is provided to cover the main works that have used ML with TCGA data. Firstly, the principal discoveries made by the TCGA consortium are presented. Once these bases have been established, we begin with the main objective of this study, the identification and discussion of those works that have used the TCGA data for the training of different ML approaches. After a review of more than 100 different papers, it has been possible to make a classification according to following three pillars: the type of tumour, the type of algorithm and the predicted biological problem. One of the conclusions drawn in this work shows a high density of studies based on two major algorithms: Random Forest and Support Vector Machines. We also observe the rise in the use of deep artificial neural networks. It is worth emphasizing, the increase of integrative models of multi-omic data analysis. The different biological conditions are a consequence of molecular homeostasis, driven by both protein coding regions, regulatory elements and the surrounding environment. It is notable that a large number of works make use of genetic expression data, which has been found to be the preferred method by researchers when training the different models. The biological problems addressed have been classified into five types: prognosis prediction, tumour subtypes, microsatellite instability (MSI), immunological aspects and certain pathways of interest. A clear trend was detected in the prediction of these conditions according to the type of tumour. 
That is the reason for which a greater number of works have focused on the BRCA cohort, while specific works for survival, for example, were centred on the GBM cohort, due to its large number of events. Throughout this review, it will be possible to go in depth into the works and the methodologies used to study TCGA cancer data. Finally, it is intended that this work will serve as a basis for future research in this field of study.
Affiliation(s)
- Jose Liñares-Blanco
- CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
- Alejandro Pazos
- CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
- CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
41
Abstract
Integrative network modeling of data arising from multiple genomic platforms provides insight into the holistic picture of the interactive system, as well as the flow of information across many disease domains including cancer. The basic data structure consists of a sequence of hierarchically ordered datasets for each individual subject, which facilitates integration of diverse inputs, such as genomic, transcriptomic, and proteomic data. A primary analytical task in such contexts is to model the layered architecture of networks where the vertices can be naturally partitioned into ordered layers, dictated by multiple platforms, and exhibit both undirected and directed relationships. We propose a multi-layered Gaussian graphical model (mlGGM) to investigate conditional independence structures in such multi-level genomic networks in human cancers. We implement a Bayesian node-wise selection (BANS) approach based on variable selection techniques that coherently accounts for the multiple types of dependencies in mlGGM; this flexible strategy exploits edge-specific prior knowledge and selects sparse and interpretable models. Through simulated data generated under various scenarios, we demonstrate that BANS outperforms other existing multivariate regression-based methodologies. Our integrative genomic network analysis for key signaling pathways across multiple cancer types highlights commonalities and differences of p53 integrative networks and epigenetic effects of BRCA2 on p53 and its interaction with T68 phosphorylated CHK2, that may have translational utilities of finding biomarkers and therapeutic targets.
Affiliation(s)
- Min Jin Ha
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center
- Francesco Claudio Stingo
- Department of Statistics, Computer Science, Applications "G. Parenti", The University of Florence
42
Peng Y, Zhao J, Yin F, Sharen G, Wu Q, Chen Q, Sun X, Yang J, Wang H, Zhang D. A methylation-driven gene panel predicts survival in patients with colon cancer. FEBS Open Bio 2021; 11:2490-2506. [PMID: 34184409 PMCID: PMC8409306 DOI: 10.1002/2211-5463.13242] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/23/2021] [Revised: 05/14/2021] [Accepted: 06/28/2021] [Indexed: 02/06/2023] Open
Abstract
The accumulation of various genetic and epigenetic changes in colonic epithelial cells has been identified as one of the fundamental processes that drive the initiation and progression of colorectal cancer (CRC). This study aimed to explore functional genes regulated by DNA methylation and their potential utilization as biomarkers for the prediction of CRC prognoses. Methylation‐driven genes (MDGs) were explored by applying the integrative analysis tool (methylmix) to The Cancer Genome Atlas CRC project. The prognostic MDG panel was identified by combining the Cox regression model with the least absolute shrinkage and selection operator regularization. Gene set enrichment analysis was used to determine the pathways associated with the six‐MDG panel. Cluster of differentiation 40 (CD40) expression and methylation in CRC samples were validated by using additional datasets from the Gene Expression Omnibus. Methylation‐specific PCR and bisulfite sequencing were used to confirm DNA methylation in CRC cell lines. A prognostic MDG panel consisting of six gene members was identified: TMEM88, HOXB2, FGD1, TOGARAM1, ARHGDIB and CD40. The high‐risk phenotype classified by the six‐MDG panel was associated with cancer‐related biological processes, including invasion and metastasis, angiogenesis and the tumor immune microenvironment. The prognostic value of the six‐MDG panel was found to be independent of tumor node metastasis stage and, in combination with tumor node metastasis stage and age, could help improve survival prediction. In addition, the expression of CD40 was confirmed to be regulated by promoter region methylation in CRC samples and cell lines. The proposed six‐MDG panel represents a promising signature for estimating the prognosis of patients with CRC.
Affiliation(s)
- Yaojun Peng
- Emergency Department, The First Medical Center, Chinese PLA General Hospital, Beijing, China.,College of Graduate, Chinese PLA General Hospital, Beijing, China
- Jing Zhao
- Department of Scientific Research Administration, Chinese PLA General Hospital, Beijing, China
- Fan Yin
- Department of Oncology, The Second Medical Center & National Clinical Research Center of Geriatric Disease, Chinese PLA General Hospital, Beijing, China
- Gaowa Sharen
- Department of Pathology, The First Affiliated Hospital of Inner Mongolia Medical University, Hohhot City, China
- Qiyan Wu
- Department of Oncology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
- Qi Chen
- Department of Traditional Chinese Medicine, The First Medical Center, Chinese PLA General Hospital, Beijing, China
| | - Xiaoxuan Sun
- National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, China.,Department of Oncology Surgery, Tianjin Cancer Hospital Airport Free Trade Zone Hospital, China
| | - Juan Yang
- Department of Cardiothoracic Surgery, Tianjin Fourth Center Hospital, China
| | - Huan Wang
- Department of Scientific Research Administration, Chinese PLA General Hospital, Beijing, China
| | - Dong Zhang
- Department of Oncology, The Second Medical Center & National Clinical Research Center of Geriatric Disease, Chinese PLA General Hospital, Beijing, China
| |
Collapse
|
43
|
Tarazona S, Arzalluz-Luque A, Conesa A. Undisclosed, unmet and neglected challenges in multi-omics studies. Nat Comput Sci 2021; 1:395-402. [PMID: 38217236 DOI: 10.1038/s43588-021-00086-z] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 03/18/2021] [Accepted: 05/17/2021] [Indexed: 01/15/2024]
Abstract
Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies.
Affiliation(s)
- Sonia Tarazona
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Angeles Arzalluz-Luque
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Ana Conesa
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Institute for Integrative Systems Biology, Spanish National Research Council, Valencia, Spain
|
44
|
Núñez-Carpintero I, Petrizzelli M, Zinovyev A, Cirillo D, Valencia A. The multilayer community structure of medulloblastoma. iScience 2021; 24:102365. [PMID: 33889829 PMCID: PMC8050854 DOI: 10.1016/j.isci.2021.102365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/13/2020] [Revised: 03/17/2021] [Accepted: 03/24/2021] [Indexed: 01/20/2023] Open
Abstract
Multilayer networks allow interpreting the molecular basis of diseases, which is particularly challenging in rare diseases where the number of cases is small compared with the size of the associated multi-omics datasets. In this work, we develop a dimensionality reduction methodology to identify the minimal set of genes that characterize disease subgroups based on their persistent association in multilayer network communities. We apply this approach to the study of medulloblastoma, a childhood brain tumor, using proteogenomic data. Our approach is able to recapitulate known medulloblastoma subgroups (accuracy >94%) and provide a clear characterization of gene associations, with downstream implications for diagnosis and therapeutic interventions. We verified the general applicability of our method on an independent medulloblastoma dataset (accuracy >98%). This approach opens the door to a new generation of multilayer network-based methods able to overcome the specific dimensionality limitations of rare disease datasets. The molecular interpretation of rare diseases is a challenging task. Multilayer networks allow patient stratification and explainability. We identify subgroup-specific genes and multilayer associations in medulloblastoma. Multilayer community analysis enables the molecular interpretation of rare diseases.
Affiliation(s)
- Marianyela Petrizzelli
  - Institut Curie, PSL Research University, 75005 Paris, France
  - INSERM, U900, 75005 Paris, France
  - MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, 75006 Paris, France
- Andrei Zinovyev
  - Institut Curie, PSL Research University, 75005 Paris, France
  - INSERM, U900, 75005 Paris, France
  - MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, 75006 Paris, France
  - Lobachevsky University, 603000 Nizhny Novgorod, Russia
- Davide Cirillo
  - Barcelona Supercomputing Center (BSC), C/ Jordi Girona 29, 08034, Barcelona, Spain
  - Corresponding author
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC), C/ Jordi Girona 29, 08034, Barcelona, Spain
  - ICREA - Institució Catalana de Recerca i Estudis Avançats, Pg. Lluís Companys 23, 08010, Barcelona, Spain
|
45
|
Yang XL, Shi Y, Zhang DD, Xin R, Deng J, Wu TM, Wang HM, Wang PY, Liu JB, Li W, Ma YS, Fu D. Quantitative proteomics characterization of cancer biomarkers and treatment. Mol Ther Oncolytics 2021; 21:255-63. [PMID: 34095463 DOI: 10.1016/j.omto.2021.04.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023]
Abstract
Cancer accounted for 16% of all deaths worldwide in 2018. Significant progress has been made in understanding tumor occurrence, progression, diagnosis, treatment, and prognosis at the molecular level. However, genomic changes cannot truly reflect the state of protein activity in the body due to the poor correlation between genes and proteins. Quantitative proteomics, capable of quantifying relative differences in protein abundance in cancer patients, has been increasingly adopted in cancer research. Quantitative proteomics has great application potential, including cancer diagnosis, personalized therapeutic drug selection, real-time evaluation of therapeutic effects and toxicity, prognosis and drug resistance evaluation, and new therapeutic target discovery. In this review, the development, testing samples, and detection methods of quantitative proteomics are introduced. The biomarkers identified by quantitative proteomics for clinical diagnosis, prognosis, and drug resistance are reviewed. The challenges and prospects of quantitative proteomics for personalized medicine are also discussed.
|
46
|
Ochoa S, de Anda-Jáuregui G, Hernández-Lemus E. An Information Theoretical Multilayer Network Approach to Breast Cancer Transcriptional Regulation. Front Genet 2021; 12:617512. [PMID: 33815463 PMCID: PMC8014033 DOI: 10.3389/fgene.2021.617512] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/14/2020] [Accepted: 02/05/2021] [Indexed: 12/13/2022] Open
Abstract
Breast cancer is a complex, highly heterogeneous disease at multiple levels ranging from its genetic origins and molecular processes to clinical manifestations. This heterogeneity has given rise to the so-called intrinsic or molecular breast cancer subtypes. Aside from classification, these subtypes have set a basis for differential prognosis and treatment. Multiple regulatory mechanisms, involving a variety of biomolecular entities, suffer from alterations leading to the diseased phenotypes. Information theoretical approaches have been found to be useful in the description of these complex regulatory programs. In this work, we identified the interactions occurring between three main mechanisms of regulation of the gene expression program: transcription factor regulation, regulation via noncoding RNA, and epigenetic regulation through DNA methylation. Using data from The Cancer Genome Atlas, we inferred probabilistic multilayer networks, identifying key regulatory circuits able to (partially) explain the alterations that lead from a healthy phenotype to different manifestations of breast cancer, as captured by its molecular subtype classification. We also found some general trends in the topology of the multi-omic regulatory networks: tumor subtype networks present longer shortest paths than their normal tissue counterpart; epigenomic regulation has frequently focused on genes enriched for certain biological processes; CpG methylation and miRNA interactions are often part of a regulatory core of conserved interactions. The use of probabilistic measures to infer information regarding theoretically derived multilayer networks based on multi-omic high-throughput data is hence presented as a useful methodological approach to capture some of the molecular heterogeneity behind regulatory phenomena in breast cancer, and potentially other diseases.
Affiliation(s)
- Soledad Ochoa
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Guillermo de Anda-Jáuregui
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Conacyt Research Chairs, National Council on Science and Technology, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
47
|
Cabassi A, Kirk PDW. Multiple kernel learning for integrative consensus clustering of omic datasets. Bioinformatics 2021; 36:4789-4796. [PMID: 32592464 PMCID: PMC7750932 DOI: 10.1093/bioinformatics/btaa593] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/01/2020] [Revised: 05/18/2020] [Accepted: 06/19/2020] [Indexed: 12/19/2022] Open
Abstract
Motivation: Diverse applications, particularly in tumour subtyping, have demonstrated the importance of integrative clustering techniques for combining information from multiple data sources. Cluster Of Clusters Analysis (COCA) is one such approach that has been widely applied in the context of tumour subtyping. However, the properties of COCA have never been systematically explored, and its robustness to the inclusion of noisy datasets is unclear. Results: We rigorously benchmark COCA, and present Kernel Learning Integrative Clustering (KLIC) as an alternative strategy. KLIC frames the challenge of combining clustering structures as a multiple kernel learning problem, in which different datasets each provide a weighted contribution to the final clustering. This allows the contribution of noisy datasets to be down-weighted relative to more informative datasets. We compare the performances of KLIC and COCA in a variety of situations through simulation studies. We also present the output of KLIC and COCA in real data applications to cancer subtyping and transcriptional module discovery. Availability and implementation: R packages klic and coca are available on the Comprehensive R Archive Network. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul D W Kirk
  - MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK
  - Cambridge Institute of Therapeutic Immunology & Infectious Disease, University of Cambridge, Cambridge CB2 0AW, UK
|
48
|
Chang SM, Yang M, Lu W, Huang YJ, Huang Y, Hung H, Miecznikowski JC, Lu TP, Tzeng JY. Gene-Set Integrative Analysis of Multi-Omics Data Using Tensor-based Association Test. Bioinformatics 2021; 37:2259-2265. [PMID: 33674827 PMCID: PMC8388036 DOI: 10.1093/bioinformatics/btab125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/15/2020] [Revised: 12/30/2020] [Accepted: 02/24/2021] [Indexed: 11/12/2022] Open
Abstract
Motivation: Facilitated by technological advances and the decrease in costs, it is feasible to gather subject data from several omics platforms. Each platform assesses different molecular events, and the challenge lies in efficiently analyzing these data to discover novel disease genes or mechanisms. A common strategy is to regress the outcomes on all omics variables in a gene set. However, this approach suffers from problems associated with high-dimensional inference. Results: We introduce a tensor-based framework for variable-wise inference in multi-omics analysis. By accounting for the matrix structure of an individual's multi-omics data, the proposed tensor methods incorporate the relationship among omics effects, reduce the number of parameters, and boost the modeling efficiency. We derive the variable-specific tensor test and enhance computational efficiency of tensor modeling. Using simulations and data applications on the Cancer Cell Line Encyclopedia (CCLE), we demonstrate our method performs favorably over baseline methods and will be useful for gaining biological insights in multi-omics analysis. Availability and implementation: R function and instruction are available from the authors' website: https://www4.stat.ncsu.edu/~jytzeng/Software/TR.omics/TRinstruction.pdf. Supplementary information: Supplementary materials are available at Bioinformatics online.
Affiliation(s)
- Sheng-Mao Chang
  - Department of Statistics, National Cheng Kung University, Tainan, Taiwan
- Meng Yang
  - Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA
- Yu-Jyun Huang
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- Yueyang Huang
  - Bioinformatics Research Center, North Carolina State University, Raleigh NC, 27695, USA
- Hung Hung
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- Tzu-Pin Lu
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- Jung-Ying Tzeng
  - Department of Statistics, National Cheng Kung University, Tainan, Taiwan
  - Department of Statistics, North Carolina State University, Raleigh NC, 27695, USA
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
  - Bioinformatics Research Center, North Carolina State University, Raleigh NC, 27695, USA
|
49
|
Choe EK, Lee S, Kim SY, Shivakumar M, Park KJ, Chai YJ, Kim D. Prognostic Effect of Inflammatory Genes on Stage I-III Colorectal Cancer-Integrative Analysis of TCGA Data. Cancers (Basel) 2021; 13:cancers13040751. [PMID: 33670198 PMCID: PMC7916934 DOI: 10.3390/cancers13040751] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/14/2020] [Revised: 02/05/2021] [Accepted: 02/07/2021] [Indexed: 12/24/2022] Open
Abstract
Simple Summary: Research interest in the role of inflammation in the progression and prognosis of colorectal cancer (CRC) is growing. In this study, we evaluated the expression and DNA methylation levels of inflammation-related genes in CRC tissues using the TCGA-COREAD dataset by integratively combining multi-omics features using machine learning. Statistical analysis was additionally performed to allow for interpretable, understandable, and clinically practical results. An integrative model combining expression, methylation, and clinical features had the highest performance. In multivariate analysis, the methylation levels of CEP250, RAB21, and TNPO3 were significantly associated with overall survival. Our study results implicate the importance of integrating expression and methylation information along with clinical information in the prediction of survival. CEP250, RAB21, and TNPO3 in the prediction model might have a crucial role in CRC prognosis and further improve our understanding of potential mechanisms linking inflammatory reactions and CRC progression.
Background: Inflammatory status indicators have been reported as prognostic biomarkers of colorectal cancer (CRC). However, since inflammatory interactions with the colon involve various modes of action, the biological mechanism linking inflammation and CRC prognosis has not been fully elucidated. We comprehensively evaluated the predictive roles of the expression and methylation levels of inflammation-related genes for CRC prognosis and their pathophysiological associations. Method: An integrative analysis of 247 patients with stage I-III CRC from The Cancer Genome Atlas was conducted. Lasso-penalized Cox proportional hazards regression (Lasso-Cox) and statistical Cox proportional hazard regression (CPH) were used for the analysis. Results: Models to predict overall survival were designed with respective combinations of clinical variables, including age, sex, stage, gene expression, and methylation. An integrative model combining expression, methylation, and clinical features performed better (median C-index = 0.756) than the model with clinical features alone (median C-index = 0.726). Based on multivariate CPH with features from the best model, the methylation levels of CEP250, RAB21, and TNPO3 were significantly associated with overall survival. They did not share any biological process in functional networks. The 5-year survival rate was 29.8% in the low methylation group of CEP250 and 79.1% in the high methylation group (p < 0.001). Conclusion: Our study results implicate the importance of integrating expression and methylation information along with clinical information in the prediction of survival. CEP250, RAB21, and TNPO3 in the prediction model might have a crucial role in CRC prognosis and further improve our understanding of potential mechanisms linking inflammatory reactions and CRC progression.
Affiliation(s)
- Eun Kyung Choe
  - Department of Surgery, Seoul National University Hospital Healthcare System Gangnam Center, Seoul 06236, Korea
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6116, USA
  - Department of Surgery, Seoul National University College of Medicine, Seoul 03080, Korea
- Sangwoo Lee
  - Department of Future Convergence, Cyber University of Korea, Seoul 03051, Korea
- So Yeon Kim
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6116, USA
  - Department of Software and Computer Engineering, Ajou University, Suwon 16499, Korea
- Manu Shivakumar
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6116, USA
- Kyu Joo Park
  - Department of Surgery, Seoul National University College of Medicine, Seoul 03080, Korea
- Young Jun Chai
  - Department of Surgery, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul 07061, Korea
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6116, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104-6116, USA
  - Correspondence: Tel.: +1-215-573-5336
|
50
|
Du Y, Fan K, Lu X, Wu C. Integrating Multi–Omics Data for Gene-Environment Interactions. BioTech 2021; 10:biotech10010003. [PMID: 35822775 PMCID: PMC9245467 DOI: 10.3390/biotech10010003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/24/2020] [Revised: 01/22/2021] [Accepted: 01/22/2021] [Indexed: 01/05/2023] Open
Abstract
Gene-environment (G×E) interaction is critical for understanding the genetic basis of complex disease beyond genetic and environmental main effects. In addition to existing tools for interaction studies, penalized variable selection has emerged as a promising alternative for dissecting G×E interactions. Despite its success, variable selection is limited in terms of accounting for multidimensional measurements. Published variable selection methods cannot accommodate structured sparsity in the framework of integrating multi-omics data for disease outcomes. In this paper, we have developed a novel variable selection method in order to integrate multi-omics measurements in G×E interaction studies. Extensive studies have already revealed that analyzing omics data across multiple platforms is not only sensible biologically, but also results in improved identification and prediction performance. Our integrative model can efficiently pinpoint important regulators of gene expression through sparse dimensionality reduction, and link the disease outcomes to multiple effects in the integrative G×E studies by accommodating a sparse bi-level structure. The simulation studies show that the integrative model leads to better identification of G×E interactions and regulators than alternative methods. In two G×E lung cancer studies with high-dimensional multi-omics data, the integrative model leads to improved prediction and findings with important biological implications.
|