1. Li G, Hu Z, Luo X, Liu J, Wu J, Peng W, Zhu X. Identification of cancer driver genes based on hierarchical weak consensus model. Health Inf Sci Syst 2024; 12:21. [PMID: 38464463 PMCID: PMC10917728 DOI: 10.1007/s13755-024-00279-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 01/31/2024] [Indexed: 03/12/2024] Open
Abstract
Cancer is a complex gene mutation disease that derives from the accumulation of mutations during somatic cell evolution. With the advent of high-throughput technology, a large amount of omics data has been generated, and finding cancer-related driver genes in this volume of data remains a challenge. Early on, researchers developed many frequency-based driver gene identification methods, but these could not reliably identify driver genes with low mutation rates. Later, researchers developed network-based methods that fuse multi-omics data, but these rarely considered the connections among features. In this paper, after analyzing a large number of methods for integrating multi-omics data, a hierarchical weak consensus model for fusing multiple features is proposed according to the connections among features. By analyzing the connection between the PPI network and the co-mutation hypergraph network, this paper first proposes a new topological feature, called the co-mutation clustering coefficient (CMCC). Then, a hierarchical weak consensus model is used to integrate CMCC with mRNA and miRNA differential expression scores, yielding a new driver gene identification method, HWC. HWC is compared with seven current state-of-the-art methods on three types of cancer. The comparison results show that HWC has the best identification performance in terms of statistical evaluation indices, functional consistency, and the partial area under the ROC curve. Supplementary Information: The online version contains supplementary material available at 10.1007/s13755-024-00279-6.
Affiliation(s)
- Gaoshi Li
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
- Zhipeng Hu
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
- Xinlong Luo
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
- Jiafei Liu
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
- Jingli Wu
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
- Xiaoshu Zhu
  - Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, 541004 China
  - Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin, 541004 Guangxi China
  - College of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004 Guangxi China
  - School of Computer and Information Security & School of Software Engineering, Guilin University of Electronic Science and Technology, Guilin, China

2. Wei PJ, Zhu AD, Cao R, Zheng C. Personalized Driver Gene Prediction Using Graph Convolutional Networks with Conditional Random Fields. Biology 2024; 13:184. [PMID: 38534453 DOI: 10.3390/biology13030184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 03/03/2024] [Accepted: 03/10/2024] [Indexed: 03/28/2024]
Abstract
Cancer is a complex, evolving disease driven mainly by the accumulation of genetic variations in genes. Identifying cancer driver genes is important, but most related studies have focused on the population level. Because cancer is highly heterogeneous, discovering driver genes at the individual level is becoming more valuable, yet it remains a great challenge. Although some computational methods have been proposed to tackle this challenge, few cover all patient samples well, and there is still room for performance improvement. In this study, to identify individual-level driver genes more efficiently, we propose the PDGCN method. PDGCN integrates multiple types of data features, including mutation, expression, methylation, copy number data, and system-level gene features, along with network structural features extracted using Node2vec, in order to construct a sample-gene interaction network. Prediction is performed using a graph convolutional network model with a conditional random field layer, which is able to better combine the network structural features with biological attribute features. Experiments on the ACC (Adrenocortical Cancer) and KICH (Kidney Chromophobe) datasets from TCGA (The Cancer Genome Atlas) demonstrated that the method performs better than other similar methods. It can identify not only frequently mutated driver genes, but also rare candidate driver genes and novel biomarker genes. The results of the survival and enrichment analyses of these detected genes demonstrate that the method can identify important driver genes at the individual level.
Affiliation(s)
- Pi-Jing Wei
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- An-Dong Zhu
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- Ruifen Cao
  - School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei 230601, China

3. Xu X, Qi Z, Wang L, Zhang M, Geng Z, Han X. GSW-FI: a GLM model incorporating shrinkage and double-weighted strategies for identifying cancer driver genes with functional impact. BMC Bioinformatics 2024; 25:99. [PMID: 38448819 PMCID: PMC10916024 DOI: 10.1186/s12859-024-05707-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 02/16/2024] [Indexed: 03/08/2024] Open
Abstract
BACKGROUND: Cancer, a disease with high morbidity and mortality rates, poses a significant threat to human health. Driver genes, which harbor mutations accountable for the initiation and progression of tumors, play a crucial role in cancer development. Identifying driver genes stands as a paramount objective in cancer research and precision medicine. RESULTS: In the present work, we propose a method for identifying driver genes using a Generalized Linear Regression Model (GLM) with Shrinkage and double-Weighted strategies based on Functional Impact, which is named GSW-FI. Firstly, an estimating model is proposed for assessing the background functional impacts of genes based on GLM, utilizing gene features as predictors. Secondly, the shrinkage and double-weighted strategies are integrated as two revising approaches to ensure the rationality of the identified driver genes. Lastly, a statistical hypothesis test is designed to identify driver genes by leveraging the estimated background functional impacts. Experimental results on 31 datasets from The Cancer Genome Atlas demonstrate that GSW-FI outperforms ten other prediction methods in terms of the overlap fraction with well-known databases and consensus predictions among different methods. CONCLUSIONS: GSW-FI presents a novel approach that efficiently identifies driver genes with functional impact mutations using computational methods, thereby advancing the development of precision medicine for cancer.
Affiliation(s)
- Xiaolu Xu
  - School of Computer and Artificial Intelligence, Liaoning Normal University, Dalian, China
- Zitong Qi
  - Department of Statistics, University of Washington, Seattle, USA
- Lei Wang
  - Center for Reproductive and Genetic Medicine, Dalian Women and Children's Medical Group, Dalian, China
- Meiwei Zhang
  - Center for Reproductive and Genetic Medicine, Dalian Women and Children's Medical Group, Dalian, China
- Zhaohong Geng
  - Department of Cardiology, Second Affiliated Hospital of Dalian Medical University, Dalian, China
- Xiumei Han
  - College of Artificial Intelligence, Dalian Maritime University, Dalian, China

4. Song J, Song Z, Zhang J, Gong Y. Privacy-Preserving Identification of Cancer Subtype-Specific Driver Genes Based on Multigenomics Data with Privatedriver. J Comput Biol 2024; 31:99-116. [PMID: 38271572 DOI: 10.1089/cmb.2023.0115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024] Open
Abstract
Identifying cancer subtype-specific driver genes from a large number of irrelevant passengers is crucial for targeted therapy in cancer treatment. Recently, the rapid accumulation of large-scale cancer genomics data from multiple institutions has presented remarkable opportunities for identification of cancer subtype-specific driver genes. However, insufficient subtype samples, privacy issues, and the heterogeneity of aberration events pose great challenges in precisely identifying cancer subtype-specific driver genes. To address this, we introduce privatedriver, the first model for identifying subtype-specific driver genes that integrates genomics data from multiple institutions in a privacy-preserving, collaborative manner. The process of identifying subtype-specific cancer driver genes using privatedriver involves two steps: genomics data integration and collaborative training. In the integration process, the aberration events from multiple genomics data sources are combined for each institution using the forward and backward propagation method of NetICS. In the collaborative training process, each institution utilizes the federated learning framework to upload encrypted model parameters, instead of the raw data of all institutions, to train a global model using the non-negative matrix factorization algorithm. We applied privatedriver to head and neck squamous cell and colon cancer data from The Cancer Genome Atlas website and evaluated it with two benchmarks using macro-Fscore. The comparison analysis demonstrates that privatedriver achieves comparable results to centralized learning models and outperforms most other nonprivacy-preserving models, all while ensuring the confidentiality of patient information. We also demonstrate that, for varying predicted driver gene distributions in subtype, our model fully considers the heterogeneity of subtype and identifies subtype-specific driver genes corresponding to the given prognosis and therapeutic effect.
The success of privatedriver reveals the feasibility and effectiveness of identifying cancer subtype-specific driver genes in a data protection manner, providing new insights for future privacy-preserving driver gene identification studies.
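For readers unfamiliar with the macro-Fscore mentioned in this abstract, the following is a minimal sketch of how a macro-averaged F score is commonly computed. It assumes the F1 variant (precision and recall weighted equally) and uses hypothetical labels; the abstract does not state the exact definition or labeling scheme the authors used, so this should not be read as their evaluation code.

```python
def macro_f_score(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean.

    Assumption: F1 (beta = 1) is used; the abstract only says "macro-Fscore".
    """
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical example: two subtype labels "A" and "B" for four genes.
truth = ["A", "A", "B", "B"]
predicted = ["A", "B", "B", "B"]
score = macro_f_score(truth, predicted)
```

Because every class contributes equally to the mean, a macro-averaged score penalizes poor performance on small classes, which is one plausible reason to prefer it when subtype sample sizes are unbalanced.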
Affiliation(s)
- Junrong Song
  - School of Information; Kunming, P.R. China
  - Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
- Zhiming Song
  - School of Information; Kunming, P.R. China
  - Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
- Jinpeng Zhang
  - School of Information; Kunming, P.R. China
  - Yunnan Key Laboratory of Service Computing; Yunnan University of Finance and Economics, Kunming, P.R. China
  - The School of Computer Science and Engineering, Yunnan University, Kunming, P.R. China

5. Geng H, Qian R, Zhong Y, Tang X, Zhang X, Zhang L, Yang C, Li T, Dong Z, Wang C, Zhang Z, Zhu C. Leveraging synthetic lethality to uncover potential therapeutic target in gastric cancer. Cancer Gene Ther 2024; 31:334-348. [PMID: 38040871 DOI: 10.1038/s41417-023-00706-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 11/10/2023] [Accepted: 11/16/2023] [Indexed: 12/03/2023]
Abstract
Since trastuzumab was approved in 2012 for the first-line treatment of gastric cancer (GC), no significant advancement in GC targeted therapies has occurred. Synthetic lethality refers to the concept that simultaneous dysfunction of a pair of genes results in a lethal effect on cells, while the loss of either gene alone does not cause this effect. By exploiting synthetic lethality, novel targeted therapies can be developed for the individualized treatment of GC. In this study, we proposed a computational strategy named Gastric cancer Specific Synthetic Lethality inference (GSSL) to identify synthetic lethal interactions in GC. GSSL analysis was used to infer probable synthetic lethality in GC using four accessible clinical datasets. In addition, prediction results were confirmed by experiments. GSSL analysis identified a total of 34 candidate synthetic lethal pairs, which included 33 unique targets. Among the synthetic lethal gene pairs, TP53-CHEK1 was selected for further experimental validation. Both computational and experimental results indicated that inhibiting CHEK1 could be a potential therapeutic strategy for GC patients with TP53 mutation. Meanwhile, in vitro experimental validation of two novel synthetic lethal pairs, TP53-AURKB and ARID1A-EP300, further supported the universality and reliability of GSSL. Collectively, GSSL has been shown to be a reliable and feasible method for comprehensively inferring synthetic lethal interactions in GC, which may offer novel insight into precision medicine and individualized treatment of GC.
Affiliation(s)
- Haigang Geng
  - Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Ruolan Qian
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yiqing Zhong
  - Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Xiangyu Tang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xiaojun Zhang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Linmeng Zhang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Chen Yang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Tingting Li
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Zhongyi Dong
  - Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Cun Wang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zizhen Zhang
  - Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Chunchao Zhu
  - Department of Gastrointestinal Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China

6. Wang X, Kostrzewa C, Reiner A, Shen R, Begg C. Adaptation of a mutual exclusivity framework to identify driver mutations within oncogenic pathways. Am J Hum Genet 2024; 111:227-241. [PMID: 38232729 PMCID: PMC10870134 DOI: 10.1016/j.ajhg.2023.12.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 12/05/2023] [Accepted: 12/05/2023] [Indexed: 01/19/2024] Open
Abstract
Distinguishing genomic alterations in cancer-associated genes that have a functional impact on tumor growth and disease progression from those that are passengers and confer no fitness advantage has important clinical implications. Evidence-based methods for nominating drivers are limited by existing knowledge on the oncogenic effects and therapeutic benefits of specific variants from clinical trials or experimental settings. As clinical sequencing becomes a mainstay of patient care, applying computational methods to mine the rapidly growing clinical genomic data holds promise in uncovering functional candidates beyond the existing knowledge base and expanding the patient population that could potentially benefit from genetically targeted therapies. We propose a statistical and computational method (MAGPIE) that builds on a likelihood approach leveraging the mutual exclusivity pattern within an oncogenic pathway to identify, probabilistically, both the specific genes within a pathway and the individual mutations within such genes that are truly the drivers. Alterations in a cancer-associated gene are assumed to be a mixture of driver and passenger mutations, with the passenger rates modeled in relation to tumor mutational burden. We use simulations to study the operating characteristics of the method and assess false-positive and false-negative rates in driver nomination. When applied to a large study of primary melanomas, the method accurately identifies the known driver genes within the RTK-RAS pathway and nominates several rare variants as prime candidates for functional validation. A comprehensive evaluation of MAGPIE against existing tools has also been conducted using The Cancer Genome Atlas data.
Affiliation(s)
- Xinjun Wang
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Caroline Kostrzewa
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Allison Reiner
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ronglai Shen
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Colin Begg
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA

7. Huang Y, Chen F, Sun H, Zhong C. Exploring gene-patient association to identify personalized cancer driver genes by linear neighborhood propagation. BMC Bioinformatics 2024; 25:34. [PMID: 38254011 PMCID: PMC10804660 DOI: 10.1186/s12859-024-05662-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/18/2024] [Indexed: 01/24/2024] Open
Abstract
BACKGROUND: Driver genes play a vital role in the development of cancer. Identifying driver genes is critical for diagnosing and understanding cancer. However, challenges remain in identifying personalized driver genes due to tumor heterogeneity. Although many computational methods have been developed to solve this problem, few efforts have been undertaken to explore gene-patient associations to identify personalized driver genes. RESULTS: Here we propose a method called LPDriver to identify personalized cancer driver genes by applying a linear neighborhood propagation model to individual genetic data. LPDriver builds a personalized gene network based on the genetic data of individual patients, extracts the gene-patient associations from the bipartite graph of the personalized gene network, and utilizes a linear neighborhood propagation model to mine gene-patient associations to detect personalized driver genes. The experimental results demonstrate that, compared to existing methods, our method shows competitive performance and can predict cancer driver genes more accurately. Furthermore, these results also show that, besides revealing novel driver genes that have been reported to be related to cancer, LPDriver is also able to identify personalized cancer driver genes for individual patients by their network characteristics even if the mutation data of genes are hidden. CONCLUSIONS: LPDriver can provide an effective approach to predict personalized cancer driver genes, which could promote the diagnosis and treatment of cancer. The source code and data are freely available at https://github.com/hyr0771/LPDriver.
Affiliation(s)
- Yiran Huang
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China
- Fuhao Chen
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Hongtao Sun
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Cheng Zhong
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China

8. Wang Y, Zhou B, Ru J, Meng X, Wang Y, Liu W. Advances in computational methods for identifying cancer driver genes. Mathematical Biosciences and Engineering: MBE 2023; 20:21643-21669. [PMID: 38124614 DOI: 10.3934/mbe.2023958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Cancer driver genes (CDGs) are crucial in cancer prevention, diagnosis and treatment. This study reviewed computational methods for identifying CDGs and categorized them into four groups. The major frameworks for each of these four categories were summarized. Additionally, we systematically gathered data from public databases and biological networks, and we elaborated on computational methods for identifying CDGs using these databases. Further, we summarized the algorithms used for identifying CDGs, which mainly involve statistics and machine learning. Notably, the performances of nine typical identification methods on eight types of cancer were compared to analyze where these methods are applicable. Finally, we discussed the challenges and prospects associated with methods for identifying CDGs. The present study revealed that network-based algorithms and machine learning-based methods demonstrated superior performance.
Affiliation(s)
- Ying Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Bohao Zhou
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Jidong Ru
  - School of Textile Garment and Design, Changshu Institute of Technology, Changshu 215500, China
- Xianglian Meng
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
- Yundong Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Wenjie Liu
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China

9. Dou R, Kang S, Yang H, Zhang W, Zhang Y, Liu Y, Ping Y, Pang B. Identifying the driver miRNAs with somatic copy number alterations driving dysregulated ceRNA networks in cancers. Biol Direct 2023; 18:79. [PMID: 37993951 PMCID: PMC10666415 DOI: 10.1186/s13062-023-00438-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 11/15/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND: MicroRNAs (miRNAs) play critical roles in cancer initiation and progression and are critical components in maintaining the dynamic balance of competing endogenous RNA (ceRNA) networks. Somatic copy number alterations (SCNAs) in the cancer genome could disturb the transcriptome level of miRNAs and deregulate this balance. However, the driving effects of SCNAs of miRNAs are insufficiently understood. METHODS: In this study, we proposed a method to dissect the functional roles of miRNAs under different copy number states and identify driver miRNAs by integrating miRNA SCNA profiles, miRNA-target relationships and expression profiles of miRNA, mRNA and lncRNA. RESULTS: Applying our method to 813 TCGA breast cancer (BRCA) samples, we identified 29 driver miRNAs whose SCNAs significantly and concordantly regulated their own expression levels and further inversely dysregulated the expression levels of their targets or disturbed the miRNA-target networks in which they were directly involved. Based on miRNA-target networks, we further constructed dynamic ceRNA networks driven by driver SCNAs of miRNAs and identified three different patterns of SCNA interference in the miRNA-mediated dynamic ceRNA networks. Survival analysis of driver miRNAs showed that high-level amplifications of four driver miRNAs (hsa-miR-30d-3p, hsa-miR-30b-5p, hsa-miR-30d-5p and hsa-miR-151a-3p) in 8q24 characterized a new BRCA subtype with poor prognosis and contributed to the dysfunction of cancer-associated hallmarks in a complementary way. The SCNAs of driver miRNAs across different cancer types contributed to cancer development by dysregulating different components of the same cancer hallmarks, suggesting the cancer specificity of driver miRNAs. CONCLUSIONS: These results demonstrate the efficacy of our method in identifying driver miRNAs and elucidating their functional roles driven by endogenous SCNAs, which is useful for interpreting cancer genomes and pathogenic mechanisms.
Affiliation(s)
- Renjie Dou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Shaobo Kang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Huan Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Wanmei Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Yijing Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Yuanyuan Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Yanyan Ping
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Bo Pang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China

10. Xu J, Pang B, Lan Y, Dou R, Wang S, Kang S, Zhang W, Liu Y, Zhang Y, Ping Y. Identifying the personalized driver gene sets maximally contributing to abnormality of transcriptome phenotype in glioblastoma multiforme individuals. Mol Oncol 2023; 17:2472-2490. [PMID: 37491836 PMCID: PMC10620122 DOI: 10.1002/1878-0261.13499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 06/21/2023] [Accepted: 07/24/2023] [Indexed: 07/27/2023] Open
Abstract
High heterogeneity in the genome and phenotype of cancer populations makes it difficult to apply population-based common driver genes to the diagnosis and treatment of individual cancer patients. Characterizing and identifying the personalized driver mechanisms of glioblastoma multiforme (GBM) individuals is pivotal for the realization of precision medicine. We proposed an integrative method to identify personalized driver gene sets by integrating the profiles of gene expression and genetic alterations in cancer individuals. This method couples a genetic algorithm with a random walk to identify the optimal gene sets that explain the abnormality of the transcriptome phenotype to the maximum extent. Personalized driver gene sets were identified for 99 GBM individuals using our method. We found that genomic alterations in between one and seven driver genes could maximally and cumulatively explain the dysfunction of cancer hallmarks across GBM individuals. The driver gene sets were distinct even in GBM individuals with significantly similar transcriptomic phenotypes. Our method identified MCM4, which has rare genetic alterations, as a previously unknown oncogenic gene whose high expression was significantly associated with poor GBM prognosis. Functional experiments confirmed that knockdown of MCM4 significantly inhibited proliferation, invasion, migration, and clone formation of the GBM cell lines U251 and U118MG, and that overexpression of MCM4 significantly promoted the proliferation, invasion, migration, and clone formation of the GBM cell line U87MG. Our method could dissect the personalized driver genetic alteration sets that are pivotal for developing targeted therapy strategies and precision medicine. It could also be extended to identify key drivers at other levels and applied to more cancer types.
Affiliation(s)
- Jinyuan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Bo Pang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yujia Lan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Renjie Dou
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shuai Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shaobo Kang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Wanmei Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yuanyuan Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yijing Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yanyan Ping
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| |
|
11
|
Gillman R, Field MA, Schmitz U, Karamatic R, Hebbard L. Identifying cancer driver genes in individual tumours. Comput Struct Biotechnol J 2023; 21:5028-5038. [PMID: 37867967 PMCID: PMC10589724 DOI: 10.1016/j.csbj.2023.10.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/24/2023]
Abstract
Cancer is a heterogeneous disease with a strong genetic component making it suitable for precision medicine approaches aimed at identifying the underlying molecular drivers within a tumour. Large scale population-level cancer sequencing consortia have identified many actionable mutations common across both cancer types and sub-types, resulting in an increasing number of successful precision medicine programs. Nonetheless, such approaches fail to consider the effects of mutations unique to an individual patient and may miss rare driver mutations, necessitating personalised approaches to driver-gene prioritisation. One approach is to quantify the functional importance of individual mutations in a single tumour based on how they affect the expression of genes in a gene interaction network (GIN). These GIN-based approaches can be broadly divided into those that utilise an existing reference GIN and those that construct de novo patient-specific GINs. These single-tumour approaches have several limitations that likely influence their results, such as use of reference cohort data, network choice, and approaches to mathematical approximation, and more research is required to evaluate the in vitro and in vivo applicability of their predictions. This review examines the current state of the art methods that identify driver genes in single tumours with a focus on GIN-based driver prioritisation.
Affiliation(s)
- Rhys Gillman
- Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
| | - Matt A. Field
- Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
- Immunogenomics Lab, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
- Menzies School of Health Research, Charles Darwin University, Darwin, Northern Territory, Australia
| | - Ulf Schmitz
- Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
| | - Rozemary Karamatic
- Gastroenterology and Hepatology, Townsville University Hospital, PO Box 670, Townsville, Queensland 4810, Australia
- College of Medicine and Dentistry, Division of Tropical Health and Medicine, James Cook University, Townsville, Queensland, Australia
| | - Lionel Hebbard
- Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
- Storr Liver Centre, Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, New South Wales, Australia
- Australian Institute for Tropical Health and Medicine, Townsville, Queensland, Australia
| |
|
12
|
Peng W, Yu P, Dai W, Fu X, Liu L, Pan Y. A Graph Convolution Network-Based Model for Prioritizing Personalized Cancer Driver Genes of Individual Patients. IEEE Trans Nanobioscience 2023; 22:744-754. [PMID: 37195839 DOI: 10.1109/tnb.2023.3277316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2023]
Abstract
Cancer driver genes are mutated genes that play a key role in the growth of cancer cells. Accurately identifying cancer driver genes helps us understand cancer's pathogenesis and develop effective treatment strategies. However, cancers are highly heterogeneous diseases; patients with the same cancer type may have different genomic characteristics and clinical symptoms. Hence, effective methods are urgently needed to identify the personalized cancer driver genes of individual patients and help determine whether a patient can be treated with a certain targeted drug. This work presents NIGCNDriver, a method for predicting personalized cancer driver genes of individual patients based on graph convolution networks and neighbor interactions. NIGCNDriver first constructs a gene-sample association matrix using the associations between a sample and its known driver genes. Then, it applies graph convolution models to the gene-sample network to aggregate the features of each node and of its neighbors, and combines them with element-wise interactions between neighbors to learn new feature representations for the sample and gene nodes. Finally, a linear correlation coefficient decoder reconstructs the association between the sample and the mutant gene, enabling the prediction of a personalized driver gene for the individual sample. We applied NIGCNDriver to predict cancer driver genes for individual samples in the TCGA and cancer cell line datasets. The results show that our method outperforms the baseline methods in cancer driver gene prediction for individual samples.
|
13
|
Li Y, Zhang SW, Xie MY, Zhang T. PhenoDriver: interpretable framework for studying personalized phenotype-associated driver genes in breast cancer. Brief Bioinform 2023; 24:bbad291. [PMID: 37738403 DOI: 10.1093/bib/bbad291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 07/12/2023] [Accepted: 07/27/2023] [Indexed: 09/24/2023]
Abstract
Identifying personalized cancer driver genes and further revealing their oncogenic mechanisms is critical for understanding the mechanisms of cell transformation and aiding clinical diagnosis. Almost all existing methods focus on identifying driver genes at the cohort or individual level but fail to uncover their underlying oncogenic mechanisms. To fill this gap, we present an interpretable framework, PhenoDriver, to identify personalized cancer driver genes, elucidate their roles in cancer development and uncover the association between driver genes and clinical phenotypic alterations. By analyzing 988 breast cancer patients, we demonstrate the outstanding performance of PhenoDriver in identifying breast cancer driver genes at the cohort level compared to other state-of-the-art methods. In addition, PhenoDriver can effectively identify driver genes with both recurrent and rare mutations in individual patients. We further explore and reveal the oncogenic mechanisms of some known and unknown breast cancer driver genes (e.g. TP53, MAP3K1, HTT, etc.) identified by PhenoDriver, and construct the subnetworks through which they regulate abnormal clinical phenotypes. Notably, most of our findings are consistent with existing biological knowledge. Based on the personalized driver profiles, we discover two existing and one unreported breast cancer subtypes and uncover their molecular mechanisms. These results deepen our understanding of breast cancer mechanisms, guide therapeutic decisions and assist in the development of targeted anticancer therapies.
Affiliation(s)
- Yan Li
- School of Automation from Northwestern Polytechnical University, China
| | - Shao-Wu Zhang
- School of Automation from Northwestern Polytechnical University, China
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, China
| | - Ming-Yu Xie
- School of Automation from Northwestern Polytechnical University, China
| | - Tong Zhang
- School of Automation from Northwestern Polytechnical University, China
| |
|
14
|
Berber I, Erten C, Kazan H. Predator: Predicting the Impact of Cancer Somatic Mutations on Protein-Protein Interactions. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3163-3172. [PMID: 37030791 DOI: 10.1109/tcbb.2023.3262119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Since many biological processes are governed by protein-protein interactions, understanding which mutations disrupt these interactions is profoundly important for cancer research. Most existing methods focus on the stability of the protein without considering the specific effects of a mutation on its interactions with other proteins. Here, we focus on somatic mutations that appear in the interface regions of the protein and predict the interactions that would be affected by a mutation of interest. We build an ensemble model, Predator, that classifies interface mutations as disruptive or nondisruptive based on the predicted effects of mutations on specific protein-protein interactions. We show that Predator outperforms existing approaches in the literature in terms of prediction accuracy. We then apply Predator to various TCGA cancer cohorts and perform a comprehensive analysis at the cohort, patient, and gene levels to determine the genes whose interface mutations tend to disrupt their interactions. The predictions obtained by Predator shed light on interesting patterns in several genes for each cohort regarding their potential as cancer drivers. Our analyses further reveal that the identified genes and their frequently disrupted partners exhibit patterns of mutual exclusivity across the cancer cohorts under study.
|
15
|
Alotaibi FM, Khan YD. A Framework for Prediction of Oncogenomic Progression Aiding Personalized Treatment of Gastric Cancer. Diagnostics (Basel) 2023; 13:2291. [PMID: 37443684 DOI: 10.3390/diagnostics13132291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 06/05/2023] [Accepted: 06/13/2023] [Indexed: 07/15/2023]
Abstract
Mutations in genes can alter their DNA patterns, and by recognizing these mutations, many carcinomas can be diagnosed in the progression stages. The human body contains many hidden and enigmatic features that humankind has not yet fully understood. A total of 7539 neoplasm cases were reported from 1 January 2021 to 31 December 2021. Of these, 3156 (41.9%) were seen in male patients and 4383 (58.1%) in female patients. Several machine learning and deep learning frameworks have already been implemented to detect mutations, but these techniques lack generalized datasets and need to be optimized for better results. Deep learning-based neural networks provide the computational power to model the complex structures of gastric carcinoma driver gene mutations. This study proposes deep learning approaches, namely long short-term memory (LSTM), gated recurrent units (GRU) and bidirectional LSTM (Bi-LSTM), to help identify the progression of gastric carcinoma in an optimized manner. This study includes 61 carcinogenic driver genes whose mutations can cause gastric cancer. The mutation information was downloaded from intOGen.org and normal gene sequences were downloaded from asia.ensembl.org, as explained in the data collection section. The proposed deep learning models are validated using the self-consistency test (SCT), 10-fold cross-validation test (FCVT), and independent set test (IST). In the IST, accuracy, sensitivity, specificity, MCC and AUC are 97.18%, 98.35%, 96.01%, 0.94 and 0.98 for LSTM; 99.46%, 98.93%, 100%, 0.989 and 1.00 for Bi-LSTM; and 99.46%, 98.93%, 100%, 0.989 and 1.00 for GRU.
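For readers unfamiliar with the threshold-based metrics quoted in this abstract, the sketch below shows how accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC) follow from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage note are hypothetical; the abstract does not report its confusion matrices, so this does not reproduce the paper's figures, and AUC is omitted because it needs ranked scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and MCC from confusion counts.

    tp/fn: positives predicted correctly / missed
    tn/fp: negatives predicted correctly / flagged by mistake
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

For example, `binary_metrics(tp=90, fn=10, tn=95, fp=5)` uses made-up counts for a 200-sample test set and returns all four values in one dictionary.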
Affiliation(s)
- Fahad M Alotaibi
- Department of Information System, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Yaser Daanial Khan
- Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan
| |
|
16
|
Zhu X, Zhao W, Zhou Z, Gu X. Unraveling the Drivers of Tumorigenesis in the Context of Evolution: Theoretical Models and Bioinformatics Tools. J Mol Evol 2023:10.1007/s00239-023-10117-0. [PMID: 37246992 DOI: 10.1007/s00239-023-10117-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 05/09/2023] [Indexed: 05/30/2023]
Abstract
Cancer originates from somatic cells that have accumulated mutations. These mutations alter the phenotype of the cells, allowing them to escape homeostatic regulation that maintains normal cell numbers. The emergence of malignancies is an evolutionary process in which the random accumulation of somatic mutations and sequential selection of dominant clones cause cancer cells to proliferate. The development of technologies such as high-throughput sequencing has provided a powerful means to measure subclonal evolutionary dynamics across space and time. Here, we review the patterns that may be observed in cancer evolution and the methods available for quantifying the evolutionary dynamics of cancer. An improved understanding of the evolutionary trajectories of cancer will enable us to explore the molecular mechanism of tumorigenesis and to design tailored treatment strategies.
Affiliation(s)
- Xunuo Zhu
- Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Wenyi Zhao
- Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Zhan Zhou
- Innovation Institute for Artificial Intelligence in Medicine, Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, 322000, China.
- Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 310058, China.
| | - Xun Gu
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA, 50011, USA.
| |
|
17
|
Bhin J, Paes Dias M, Gogola E, Rolfs F, Piersma SR, de Bruijn R, de Ruiter JR, van den Broek B, Duarte AA, Sol W, van der Heijden I, Andronikou C, Kaiponen TS, Bakker L, Lieftink C, Morris B, Beijersbergen RL, van de Ven M, Jimenez CR, Wessels LFA, Rottenberg S, Jonkers J. Multi-omics analysis reveals distinct non-reversion mechanisms of PARPi resistance in BRCA1- versus BRCA2-deficient mammary tumors. Cell Rep 2023; 42:112538. [PMID: 37209095 DOI: 10.1016/j.celrep.2023.112538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 03/16/2023] [Accepted: 05/03/2023] [Indexed: 05/22/2023]
Abstract
BRCA1 and BRCA2 both function in DNA double-strand break repair by homologous recombination (HR). Due to their HR defect, BRCA1/2-deficient cancers are sensitive to poly(ADP-ribose) polymerase inhibitors (PARPis), but they eventually acquire resistance. Preclinical studies yielded several PARPi resistance mechanisms that do not involve BRCA1/2 reactivation, but their relevance in the clinic remains elusive. To investigate which BRCA1/2-independent mechanisms drive spontaneous resistance in vivo, we combine molecular profiling with functional analysis of HR of matched PARPi-naive and PARPi-resistant mouse mammary tumors harboring large intragenic deletions that prevent reactivation of BRCA1/2. We observe restoration of HR in 62% of PARPi-resistant BRCA1-deficient tumors but none in the PARPi-resistant BRCA2-deficient tumors. Moreover, we find that 53BP1 loss is the prevalent resistance mechanism in HR-proficient BRCA1-deficient tumors, whereas resistance in BRCA2-deficient tumors is mainly induced by PARG loss. Furthermore, combined multi-omics analysis identifies additional genes and pathways potentially involved in modulating PARPi response.
Affiliation(s)
- Jinhyuk Bhin
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Department of Biomedical System Informatics, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Mariana Paes Dias
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Ewa Gogola
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Frank Rolfs
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; OncoProteomics Laboratory, Department Medical Oncology, Amsterdam UMC, 1081HV Amsterdam, the Netherlands
| | - Sander R Piersma
- OncoProteomics Laboratory, Department Medical Oncology, Amsterdam UMC, 1081HV Amsterdam, the Netherlands
| | - Roebi de Bruijn
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Julian R de Ruiter
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Bram van den Broek
- Division of Cell Biology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Alexandra A Duarte
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Wendy Sol
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Ingrid van der Heijden
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Christina Andronikou
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Cancer Therapy Resistance Cluster and Bern Center for Precision Medicine, Department for Biomedical Research, University of Bern, 3088 Bern, Switzerland; Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
| | - Taina S Kaiponen
- Cancer Therapy Resistance Cluster and Bern Center for Precision Medicine, Department for Biomedical Research, University of Bern, 3088 Bern, Switzerland; Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
| | - Lara Bakker
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Cor Lieftink
- Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Ben Morris
- Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Roderick L Beijersbergen
- Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Marieke van de Ven
- Mouse Clinic for Cancer and Aging, Preclinical Intervention Unit, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands
| | - Connie R Jimenez
- OncoProteomics Laboratory, Department Medical Oncology, Amsterdam UMC, 1081HV Amsterdam, the Netherlands
| | - Lodewyk F A Wessels
- Division of Molecular Carcinogenesis, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands.
| | - Sven Rottenberg
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands; Cancer Therapy Resistance Cluster and Bern Center for Precision Medicine, Department for Biomedical Research, University of Bern, 3088 Bern, Switzerland; Institute of Animal Pathology, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland.
| | - Jos Jonkers
- Division of Molecular Pathology, Oncode Institute, the Netherlands Cancer Institute, 1066CX Amsterdam, the Netherlands.
| |
|
18
|
Meng P, Wang G, Guo H, Jiang T. Identifying cancer driver genes using a two-stage random walk with restart on a gene interaction network. Comput Biol Med 2023; 158:106810. [PMID: 37011433 DOI: 10.1016/j.compbiomed.2023.106810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 03/08/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023]
Abstract
Cancer development and progression are significantly influenced by cancer driver genes. Understanding cancer driver genes and their mechanisms of action is essential for developing effective cancer treatments. As a result, identifying driver genes is important for drug development, cancer diagnosis, and treatment. Here, we present an algorithm to discover driver genes based on a two-stage random walk with restart (RWR) and a modified method for calculating the transition probability matrix in the random walk algorithm. First, we performed the first stage of RWR on the whole gene interaction network, using the new method for calculating the transition probability matrix, and extracted a subnetwork from the nodes that had a high correlation with the seed nodes. The second stage of RWR was then run on this subnetwork to re-rank its nodes. Our approach outperformed existing methods in identifying driver genes. We also compared the effects of three gene interaction networks, of using two rounds of random walk, and of the choice of seed nodes. In addition, we identified several potential driver genes, some of which are involved in driving cancer development. Overall, our method is efficient in various cancer types, significantly outperforms existing methods, and can identify possible driver genes.
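The two-stage procedure in this abstract can be sketched as follows. This is a minimal illustration only: it assumes the standard degree-normalised transition matrix, because the abstract does not specify the authors' modified one, and it approximates "high correlation with the seed nodes" by keeping the `top_k` nodes with the highest stage-one score. The function names, the `restart` value and `top_k` are all hypothetical.

```python
def rwr(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adj   : dict mapping each node to the set of its neighbours
    seeds : nodes the walker jumps back to with probability `restart`
    Returns a dict of steady-state visiting probabilities.
    """
    seeds = [s for s in seeds if s in adj]
    if not seeds:
        raise ValueError("at least one seed must be a node of the graph")
    p0 = {n: 0.0 for n in adj}
    for s in seeds:
        p0[s] += 1.0 / len(seeds)
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in adj}
        for u, nbrs in adj.items():
            if nbrs:
                # Standard transition: split mass equally among neighbours.
                share = (1.0 - restart) * p[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Isolated node: the walking mass stays where it is.
                nxt[u] += (1.0 - restart) * p[u]
        delta = sum(abs(nxt[n] - p[n]) for n in adj)
        p = nxt
        if delta < tol:
            break
    return p


def two_stage_rwr(adj, seeds, top_k, restart=0.3):
    """Stage 1 ranks the whole network; stage 2 re-ranks the top_k subnetwork."""
    stage1 = rwr(adj, seeds, restart)
    ranked = sorted(adj, key=lambda n: stage1[n], reverse=True)
    keep = set(ranked[:top_k]) | {s for s in seeds if s in adj}
    # Induced subnetwork: keep only edges whose endpoints both survive.
    sub = {u: adj[u] & keep for u in adj if u in keep}
    stage2 = rwr(sub, seeds, restart)
    ranking = sorted(sub, key=lambda n: stage2[n], reverse=True)
    return ranking, stage2
```

Calling `two_stage_rwr(adj, seeds=["A"], top_k=3)` on a small adjacency dictionary returns the re-ranked subnetwork nodes together with their stage-two scores.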
|
19
|
Chen HH, Hsueh CW, Lee CH, Hao TY, Tu TY, Chang LY, Lee JC, Lin CY. SWEET: a single-sample network inference method for deciphering individual features in disease. Brief Bioinform 2023; 24:7017366. [PMID: 36719112 PMCID: PMC10025435 DOI: 10.1093/bib/bbad032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/13/2022] [Revised: 01/05/2023] [Accepted: 01/14/2023] [Indexed: 02/01/2023]
Abstract
Recently, extracting inherent biological system information (e.g. cellular networks) from genome-wide expression profiles for developing personalized diagnostic and therapeutic strategies has become increasingly important. However, accurately constructing single-sample networks (SINs) to capture individual characteristics and heterogeneity in disease remains challenging. Here, we propose a sample-specific-weighted correlation network (SWEET) method to model SINs by integrating the genome-wide sample-to-sample correlation (i.e. sample weights) with the differential network between perturbed and aggregate networks. For a group of samples, the genome-wide sample weights can be assessed without prior knowledge of intrinsic subpopulations to address the network edge number bias caused by sample size differences. Compared with state-of-the-art SIN inference methods, the SWEET SINs in 16 cancers are more likely to fit the scale-free property, display higher overlap with the human interactomes and perform better in identifying three types of cancer-related genes. Moreover, integrating SWEET SINs with a network proximity measure facilitates characterizing individual features and therapy in diseases, such as somatic mutation, mut-driver and essential genes. Biological experiments further validated two candidate repurposable drugs, albendazole for head and neck squamous cell carcinoma (HNSCC) and lung adenocarcinoma (LUAD) and encorafenib for HNSCC. By applying SWEET, we also identified two possible LUAD subtypes that exhibit distinct clinical features and molecular mechanisms. Overall, the SWEET method complements current SIN inference and analysis methods and presents a view of biological systems at the network level to offer numerous clues for further investigation and clinical translation in network medicine and precision medicine.
Affiliation(s)
- Hsin-Hua Chen
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Chun-Wei Hsueh
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Chia-Hwa Lee
- School of Medical Laboratory Science and Biotechnology, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei 110, Taiwan
- Ph.D. Program in Medical Biotechnology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - Ting-Yi Hao
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Tzu-Ying Tu
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Lan-Yun Chang
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Jih-Chin Lee
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei 110, Taiwan
| | - Chun-Yu Lin
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
- Institute of Data Science and Engineering, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
- Center for Intelligent Drug Systems and Smart Bio-devices, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
- School of Dentistry, Kaohsiung Medical University, Kaohsiung 807, Taiwan
| |
|
20
|
Quan C, Liu F, Qi L, Tie Y. LRT-CLUSTER: A New Clustering Algorithm Based on Likelihood Ratio Test to Identify Driving Genes. Interdiscip Sci 2023; 15:217-230. [PMID: 36848004 DOI: 10.1007/s12539-023-00554-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 01/31/2023] [Accepted: 02/01/2023] [Indexed: 03/01/2023]
Abstract
Somatic mutations often occur at highly recurrent sites in protein sequences, which indicates that the positional clustering of somatic missense mutations can be used to identify driver genes. However, traditional clustering algorithms over-fit the background signal, are not well suited to mutation data, and perform poorly at identifying genes with low-frequency mutations. In this paper, we propose a linear clustering algorithm based on the likelihood ratio test to identify driver genes. First, the polynucleotide mutation rate is calculated based on the prior knowledge of the likelihood ratio test. Then, a simulation data set is obtained through the background mutation rate model. Finally, an unsupervised peak clustering algorithm is used to evaluate the somatic mutation data and the simulation data separately to identify the driver genes. The experimental results show that our method achieves a better balance of precision and sensitivity. It can also identify driver genes missed by other methods, making it an effective supplement to them. We also discover some potential linkages between genes and between genes and mutation sites, which are of great value to targeted drug therapy research. Method framework: Our proposed model framework is as follows.
a. Counting mutation sites and the number of mutations in tumor gene elements.
b. The nucleotide context mutation frequency is counted based on the likelihood ratio test knowledge, and the background mutation rate model is obtained.
c. Based on the Monte Carlo simulation method, data sets with the same number of mutations as the gene elements are randomly sampled to obtain simulated mutation data, and the sampling frequency of each mutation site is related to the mutation rate of the polynucleotide.
d. The original mutation data and the simulated mutation data after random reconstruction are each clustered by peak density, and the corresponding clustering scores are obtained.
e. We can obtain the clustering information statistics in each gene segment and the score of each gene segment from the original single nucleotide mutation data through step d.
f. According to the observed score and the simulated clustering score, the p-value of the corresponding gene fragment is calculated.
g. We can obtain the clustering information statistics in each gene segment and the score of each gene segment from the simulated single nucleotide mutation data through step d.
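The simulation and p-value steps of the framework above (steps c, d and f) can be sketched as follows. This is an illustration under stated assumptions, not the LRT-CLUSTER implementation: `cluster_score` is a stand-in sliding-window count rather than the paper's peak density clustering, `site_weights` plays the role of the background mutation rate model, and every name and default value here is hypothetical.

```python
import random


def cluster_score(positions, window=5):
    """Stand-in clustering score: the largest number of mutations that fall
    within any stretch of `window` consecutive positions."""
    pos = sorted(positions)
    best = 0
    left = 0
    for right in range(len(pos)):
        while pos[right] - pos[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best


def empirical_p_value(observed_positions, site_weights, n_sim=1000,
                      window=5, seed=0):
    """Monte Carlo p-value for the observed clustering score.

    site_weights[i] is the relative background mutation rate of site i.
    Each simulation draws as many mutations as were observed.
    """
    rng = random.Random(seed)
    sites = list(range(len(site_weights)))
    n_mut = len(observed_positions)
    observed = cluster_score(observed_positions, window)
    hits = 0
    for _ in range(n_sim):
        simulated = rng.choices(sites, weights=site_weights, k=n_mut)
        if cluster_score(simulated, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)
```

A gene whose mutations pile up in a few adjacent sites should receive a small p-value under a flat background, whereas evenly spread mutations should not.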
Affiliation(s)
- Chenxu Quan
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China
- Department of Respiratory and Sleep Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Fenghui Liu
- Department of Respiratory and Sleep Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Lin Qi
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China
| | - Yun Tie
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China.
| |
Collapse
|
21
|
Identification of Cancer Driver Genes by Integrating Multiomics Data with Graph Neural Networks. Metabolites 2023; 13:metabo13030339. [PMID: 36984779 PMCID: PMC10052551 DOI: 10.3390/metabo13030339] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/05/2023] [Revised: 02/20/2023] [Accepted: 02/22/2023] [Indexed: 03/02/2023] Open
Abstract
Cancer is a heterogeneous disease driven by the accumulation of both genetic and nongenetic alterations, so integrating multiomics data and extracting effective information from them is expected to be an effective way to predict cancer driver genes. In this paper, we first generate comprehensive instructive features for each gene at the genomic, epigenomic, and transcriptomic levels, together with attributes derived from protein–protein interaction (PPI) networks, and then propose a novel semisupervised deep graph learning framework, GGraphSAGE, to predict cancer driver genes according to the impact of the alterations on a biological system. When applied to eight tumor types, experimental results suggest that GGraphSAGE outperforms several state-of-the-art computational methods for driver gene identification. Moreover, it broadens our current understanding of cancer driver genes at the multiomics level and identifies driver genes specific to a tumor type rather than pan-cancer. We expect GGraphSAGE to open new avenues in precision medicine and even to predict drivers for other complex diseases.
|
22
|
Cheng X, Amanullah M, Liu W, Liu Y, Pan X, Zhang H, Xu H, Liu P, Lu Y. WMDS.net: a network control framework for identifying key players in transcriptome programs. Bioinformatics 2023; 39:7023921. [PMID: 36727489 PMCID: PMC9925106 DOI: 10.1093/bioinformatics/btad071] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/29/2022] [Revised: 01/16/2023] [Accepted: 02/01/2023] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Mammalian cells can be transcriptionally reprogramed to other cellular phenotypes. Controllability of such complex transitions in transcriptional networks underlying cellular phenotypes is an inherent biological characteristic. This network controllability can be interpreted by operating a few key regulators to guide the transcriptional program from one state to another. Finding the key regulators in the transcriptional program can provide key insights into the network state transition underlying cellular phenotypes. RESULTS To address this challenge, here, we proposed to identify the key regulators in the transcriptional co-expression network as a minimum dominating set (MDS) of driver nodes that can fully control the network state transition. Based on the theory of structural controllability, we developed a weighted MDS network model (WMDS.net) to find the driver nodes of differential gene co-expression networks. The weight of WMDS.net integrates the degree of nodes in the network and the significance of gene co-expression difference between two physiological states into the measurement of node controllability of the transcriptional network. To confirm its validity, we applied WMDS.net to the discovery of cancer driver genes in RNA-seq datasets from The Cancer Genome Atlas. WMDS.net is powerful among various cancer datasets and outperformed the other top-tier tools with a better balance between precision and recall. AVAILABILITY AND IMPLEMENTATION https://github.com/chaofen123/WMDS.net. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
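To make the weighted minimum dominating set idea concrete, the following Python sketch selects a dominating set on a small undirected graph with a greedy heuristic. It is an illustration under stated assumptions, not the WMDS.net algorithm: the abstract does not say how the optimisation is solved, the greedy rule is a swapped-in approximation, and treating `weight` as a per-node cost (lower is preferred) is an assumption made here.

```python
def greedy_weighted_dominating_set(adj, weight):
    """Greedy approximation of a weighted dominating set.

    adj:    dict mapping each node to the set of its neighbours (undirected graph)
    weight: dict mapping each node to a positive cost (hypothetical; lower = preferred)

    A node dominates itself and all of its neighbours. At each step the node that
    newly dominates the most nodes per unit of cost is selected.
    """
    undominated = set(adj)
    chosen = []
    while undominated:
        def gain(v):
            return len(({v} | adj[v]) & undominated) / weight[v]

        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(adj), key=gain)
        chosen.append(best)
        undominated -= {best} | adj[best]
    return chosen


if __name__ == "__main__":
    # Star graph: hub "A" connected to three leaves.
    star = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
    print(greedy_weighted_dominating_set(star, {"A": 1, "B": 1, "C": 1, "D": 1}))
    print(greedy_weighted_dominating_set(star, {"A": 10, "B": 1, "C": 1, "D": 1}))
```

With equal costs the hub alone dominates the star; making the hub expensive pushes the heuristic toward the leaves, which shows how node weights can change which genes are reported as driver nodes.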
Affiliation(s)
- Xiang Cheng
- Department of Gynecologic Oncology, Women's Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310006, China; Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China
- Md Amanullah
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China; Department of Respiratory Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310016, China
- Weigang Liu
- Department of Respiratory Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310016, China
- Yi Liu
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China; Department of Respiratory Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310016, China
- Xiaoqing Pan
- Department of Mathematics, Shanghai Normal University, Xuhui 200234, China
- Honghe Zhang
- Department of Pathology, Research Unit of Intelligence Classification of Tumor Pathology and Precision Therapy, Chinese Academy of Medical Sciences, Zhejiang University School of Medicine, Hangzhou 310058, China
- Haiming Xu
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China
- Pengyuan Liu
- Department of Gynecologic Oncology, Women's Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310006, China; Department of Physiology, Center of Systems Molecular Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA; Cancer Center, Zhejiang University, Hangzhou 310029, China
- Yan Lu
- Department of Gynecologic Oncology, Women's Hospital and Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou 310006, China; Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China; Cancer Center, Zhejiang University, Hangzhou 310029, China
|
23
|
Dutta D, Sen A, Satagopan J. Sparse canonical correlation to identify breast cancer related genes regulated by copy number aberrations. PLoS One 2022; 17:e0276886. [PMID: 36584096 PMCID: PMC9803132 DOI: 10.1371/journal.pone.0276886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 10/16/2022] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Copy number aberrations (CNAs) in cancer affect disease outcomes by regulating molecular phenotypes, such as gene expressions, that drive important biological processes. To gain comprehensive insights into molecular biomarkers for cancer, it is critical to identify key groups of CNAs, the associated gene modules, regulatory modules, and their downstream effect on outcomes. METHODS In this paper, we demonstrate an innovative use of sparse canonical correlation analysis (sCCA) to effectively identify the ensemble of CNAs, and gene modules in the context of binary and censored disease endpoints. Our approach detects potentially orthogonal gene expression modules which are highly correlated with sets of CNA and then identifies the genes within these modules that are associated with the outcome. RESULTS Analyzing clinical and genomic data on 1,904 breast cancer patients from the METABRIC study, we found 14 gene modules to be regulated by groups of proximally located CNA sites. We validated this finding using an independent set of 1,077 breast invasive carcinoma samples from The Cancer Genome Atlas (TCGA). Our analysis of 7 clinical endpoints identified several novel and interpretable regulatory associations, highlighting the role of CNAs in key biological pathways and processes for breast cancer. Genes significantly associated with the outcomes were enriched for early estrogen response pathway, DNA repair pathways as well as targets of transcription factors such as E2F4, MYC, and ETS1 that have recognized roles in tumor characteristics and survival. Subsequent meta-analysis across the endpoints further identified several genes through the aggregation of weaker associations. CONCLUSIONS Our findings suggest that sCCA analysis can aggregate weaker associations to identify interpretable and important genes, modules, and clinically consequential pathways.
Affiliation(s)
- Diptavo Dutta
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, United States of America; Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, United States of America
- Ananda Sen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America; Department of Family Medicine, University of Michigan, Ann Arbor, MI, United States of America
- Jaya Satagopan
- Department of Biostatistics and Epidemiology, Rutgers University, New Brunswick, NJ, United States of America
|
24
|
Liu Y, Han J, Kong T, Xiao N, Mei Q, Liu J. DriverMP enables improved identification of cancer driver genes. Gigascience 2023; 12:giad106. [PMID: 38091511 PMCID: PMC10716827 DOI: 10.1093/gigascience/giad106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 10/30/2023] [Accepted: 11/22/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Cancer is widely regarded as a complex disease primarily driven by genetic mutations. A critical concern and significant obstacle lies in discerning driver genes amid an extensive array of passenger genes. FINDINGS We present a new method termed DriverMP for effectively prioritizing altered genes on a cancer-type level by considering mutated gene pairs. It is designed to first apply nonsilent somatic mutation data, protein‒protein interaction network data, and differential gene expression data to prioritize mutated gene pairs, and then individual mutated genes are prioritized based on prioritized mutated gene pairs. Application of this method in 10 cancer datasets from The Cancer Genome Atlas demonstrated its great improvements over all the compared state-of-the-art methods in identifying known driver genes. Then, a comprehensive analysis demonstrated the reliability of the novel driver genes that are strongly supported by clinical experiments, disease enrichment, or biological pathway analysis. CONCLUSIONS The new method, DriverMP, which is able to identify driver genes by effectively integrating the advantages of multiple kinds of cancer data, is available at https://github.com/LiuYangyangSDU/DriverMP. In addition, we have developed a novel driver gene database for 10 cancer types and an online service that can be freely accessed without registration for users. The DriverMP method, the database of novel drivers, and the user-friendly online server are expected to contribute to new diagnostic and therapeutic opportunities for cancers.
Affiliation(s)
- Yangyang Liu
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Jiyun Han
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Tongxin Kong
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Nannan Xiao
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
- Qinglin Mei
- MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing 100084, China
- Juntao Liu
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai 264209, China
|
25
|
Li F, Li H, Shang J, Liu JX, Dai L, Liu X, Li Y. A network-based method for identifying cancer driver genes based on node control centrality. Exp Biol Med (Maywood) 2022; 248:232-241. [PMID: 36573462 PMCID: PMC10107394 DOI: 10.1177/15353702221139201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022] Open
Abstract
Cancer is one of the major contributors to human mortality and has a serious influence on human survival and health. In biomedical research, the identification of cancer driver genes (cancer drivers for short) is an important task, because cancer drivers promote the initiation and progression of cancer. Many methods have been developed to identify cancer drivers. These computational models identify only coding cancer drivers; however, non-coding drivers likewise play significant roles in the progression of cancer. Hence, we propose a Network-based Method for identifying cancer Driver Genes based on node Control Centrality (NMDGCC), which can identify both coding and non-coding cancer driver genes. NMDGCC identifies driver genes in two main steps. In the first step, we construct a gene interaction network from mRNA and miRNA expression data in the cancer state. In the second step, the control centrality of each node is used to identify cancer drivers in the constructed network. We use the breast cancer dataset from The Cancer Genome Atlas (TCGA) to verify the effectiveness of NMDGCC. Compared with existing methods for cancer driver gene identification, NMDGCC performs better. NMDGCC also identifies 295 miRNAs as non-coding cancer drivers, of which 158 are related to tumorigenesis of BRCA. We also apply NMDGCC to identify driver genes related to the different breast cancer subtypes. The results show that NMDGCC detects many cancer drivers of specific cancer subtypes.
Affiliation(s)
- Feng Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Han Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Lingyun Dai
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Xikui Liu
- Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
- Yan Li
- Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
|
26
|
He Z, Lin Y, Wei R, Liu C, Jiang D. Repulsion and attraction in searching: A hybrid algorithm based on gravitational kernel and vital few for cancer driver gene prediction. Comput Biol Med 2022; 151:106236. [PMID: 36370584 DOI: 10.1016/j.compbiomed.2022.106236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 10/15/2022] [Accepted: 10/22/2022] [Indexed: 12/27/2022]
Abstract
Taking a new perspective that combines a machine learning method with an evolutionary algorithm, a new hybrid algorithm is developed to predict cancer driver genes. Firstly, inspired by the global search strategy of evolutionary algorithms, a gravitational kernel is proposed to act on the full range of gene features. Constructed by fusing PPI and mutation features, the gravitational kernel is capable of producing repulsion effects. Candidate genes with greater mutation effects and PPI have higher similarity scores. Owing to repulsion, the similarity scores of these promising genes are larger than those of ordinary genes, which makes the promising genes easier to find. Secondly, inspired by the idea of elite populations in evolutionary algorithms, the concept of the vital few is proposed. Targeted at a local scale, it acts on the candidate genes associated with vital few genes. Under the attraction effect, these vital few driver genes attract genes with similar mutational effects, which leads to greater similarity scores. Lastly, the model and its parameters are optimized with an evolutionary algorithm to obtain the optimal model and parameters for cancer driver gene prediction. A comparison is performed with six other advanced methods of cancer driver gene prediction. According to the experimental results, the method proposed in this study outperforms these six state-of-the-art algorithms on the pan-oncogene dataset.
Affiliation(s)
- Zhihui He
- Department of Computer Science, Shantou University, 515063, China
- Yingqing Lin
- Department of Computer Science, Shantou University, 515063, China
- Runguo Wei
- Department of Computer Science, Shantou University, 515063, China
- Cheng Liu
- Department of Computer Science, Shantou University, 515063, China
- Dazhi Jiang
- Department of Computer Science, Shantou University, 515063, China; Guangdong Provincial Key Laboratory of Information Security Technology, Sun Yat-sen University, Guangzhou 510399, China
|
27
|
A t(11;14)(q13;q32)/CCND1::IGH carrying progenitor germinal B-cell with subsequent cytogenetic aberrations contributes to the development of classic Hodgkin lymphoma. Cancer Genet 2022; 268-269:97-102. [PMID: 36288644 DOI: 10.1016/j.cancergen.2022.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 06/13/2022] [Accepted: 09/30/2022] [Indexed: 01/25/2023]
Abstract
Classic Hodgkin lymphoma (cHL) is characterized by the presence of Hodgkin Reed-Sternberg (HRS) cells. Although HRS cells express PAX5, cHL frequently lacks other B-cell markers. There is now evidence that HRS cells are monoclonal and are derived from germinal center B-cells. In terms of genetic aberrations, cHL frequently exhibits an activated NF-κB signaling pathway. In this study, we present a case of cHL harboring a t(11;14)(q13;q32)/CCND1::IGH, identified by chromosome and fluorescence in situ hybridization analysis, with CCND1 expression in HRS cells. We also analyzed recurrent cytogenetic aberrations in t(11;14)-positive mantle cell lymphoma (MCL) and those found in cHL from the literature to assess genetic overlap and clonal evolution and to identify potential signaling pathways in cHL with CCND1::IGH. This analysis suggests the development of t(11;14)+ cHL and MCL from a transformed precursor cell with t(11;14) through genetic evolution and consequent deregulated pathways, including NF-κB and NOTCH1 signaling.
|
28
|
Tao Y, Ma X, Palmer D, Schwartz R, Lu X, Osmanbeyoglu H. Interpretable deep learning for chromatin-informed inference of transcriptional programs driven by somatic alterations across cancers. Nucleic Acids Res 2022; 50:10869-10881. [PMID: 36243974 PMCID: PMC9638905 DOI: 10.1093/nar/gkac881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 09/23/2022] [Accepted: 09/29/2022] [Indexed: 11/14/2022] Open
Abstract
Cancer is a disease of gene dysregulation, where cells acquire somatic and epigenetic alterations that drive aberrant cellular signaling. These alterations adversely impact transcriptional programs and cause profound changes in gene expression. Interpreting somatic alterations within context-specific transcriptional programs will facilitate personalized therapeutic decisions but is a monumental task. Toward this goal, we develop a partially interpretable neural network model called Chromatin-informed Inference of Transcriptional Regulators Using Self-attention mechanism (CITRUS). CITRUS models the impact of somatic alterations on transcription factors and downstream transcriptional programs. Our approach employs a self-attention mechanism to model the contextual impact of somatic alterations. Furthermore, CITRUS uses a layer of hidden nodes to explicitly represent the state of transcription factors (TFs) to learn the relationships between TFs and their target genes based on TF binding motifs in the open chromatin regions of tumor samples. We apply CITRUS to genomic, transcriptomic, and epigenomic data from 17 cancer types profiled by The Cancer Genome Atlas. CITRUS predicts patient-specific TF activities and reveals transcriptional program variations between and within tumor types. We show that CITRUS yields biological insights into delineating TFs associated with somatic alterations in individual tumors. Thus, CITRUS is a promising tool for precision oncology.
Affiliation(s)
- Drake Palmer
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Russell Schwartz
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
- Xinghua Lu
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA; Department of Pharmaceutical Science, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
|
29
|
Azadifar S, Ahmadi A. A novel candidate disease gene prioritization method using deep graph convolutional networks and semi-supervised learning. BMC Bioinformatics 2022; 23:422. [PMID: 36241966 PMCID: PMC9563530 DOI: 10.1186/s12859-022-04954-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 09/20/2022] [Indexed: 11/18/2022] Open
Abstract
Background Selecting and prioritizing candidate disease genes is necessary before conducting laboratory studies, because identifying disease genes from a large number of candidates using laboratory methods is very costly and time-consuming. There are many machine learning-based gene prioritization methods. These methods differ in various aspects, including the feature vectors of genes, the datasets used and their structures, and the learning model. Creating a suitable feature vector for genes, building an appropriate learning model on a variety of data with different and non-Euclidean structures (including graphs), and the lack of negative data are important challenges for these methods. Graph neural networks have recently emerged in machine learning and related fields and have demonstrated superior performance on a broad range of problems. Methods In this study, a new semi-supervised learning method based on graph convolutional networks is presented, using novel feature vectors constructed for each gene. In the proposed method, we first construct three feature vectors for each gene using terms from the Gene Ontology (GO) database. Then, we train a graph convolutional network on these vectors using protein–protein interaction (PPI) network data to identify candidate disease genes. Our model learns hidden-layer representations that encode both the local graph structure and the features of nodes. The method is characterized by the simultaneous consideration of topological information from the biological network (e.g., PPI) and other sources of evidence. Finally, a validation is carried out to demonstrate the efficiency of our method. Results Several experiments are performed on 16 diseases to evaluate the proposed method's performance. 
The experiments demonstrate that our proposed method achieves the best results, in terms of precision, the area under the ROC curve (AUCs), and F1-score values, when compared with eight state-of-the-art network and machine learning-based disease gene prioritization methods. Conclusion This study shows that the proposed semi-supervised learning method appropriately classifies and ranks candidate disease genes using a graph convolutional network and an innovative method to create three feature vectors for genes based on the molecular function, cellular component, and biological process terms from GO data.
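The statement that the hidden layers encode both local graph structure and node features can be illustrated with a single graph convolution step. The Python sketch below assumes the widely used propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the abstract does not state which rule the authors use, and the function and argument names (`gcn_layer`, `adj`, `features`, `weights`) are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph convolution step on a small dense graph.

    adj:      n x n adjacency matrix of the PPI network (0/1 entries)
    features: n x f matrix, one feature row per gene
    weights:  f x h weight matrix (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so each gene keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation by the square roots of the node degrees.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    aggregated = matmul(norm, features)      # mix each gene with its neighbours
    projected = matmul(aggregated, weights)  # linear map to the hidden size
    return [[max(0.0, x) for x in row] for row in projected]  # ReLU


if __name__ == "__main__":
    adj = [[0, 1], [1, 0]]          # two interacting genes
    features = [[1.0], [3.0]]       # one toy feature per gene
    print(gcn_layer(adj, features, [[1.0]]))
```

In this two-gene example each output row is the average of the gene's own feature and its neighbour's, which is the sense in which the representation combines topology with node attributes.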
Affiliation(s)
- Saeid Azadifar
- Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran
- Ali Ahmadi
- Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran
|
30
|
Chen D, Liu Z, Wang J, Yang C, Pan C, Tang Y, Zhang P, Liu N, Li G, Li Y, Wu Z, Xia F, Zhang C, Nie H, Tang Z. Integrative genomic analysis facilitates precision strategies for glioblastoma treatment. iScience 2022; 25:105276. [PMID: 36300002 PMCID: PMC9589211 DOI: 10.1016/j.isci.2022.105276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 08/29/2022] [Accepted: 09/30/2022] [Indexed: 11/13/2022] Open
Abstract
Glioblastoma (GBM) is the most common form of malignant primary brain tumor and has a dismal prognosis. Currently, the standard treatments for GBM rarely achieve satisfactory results, which suggests that current treatments are not individualized and precise enough. In this study, a multiomics-based GBM classification was established and three subclasses (GPA, GPB, and GPC) were identified, which have different molecular features both in bulk samples and at single-cell resolution. A robust GBM poor prognostic signature (GPS) score model was then developed using a machine learning method and showed an excellent ability to predict the survival of GBM patients. NVP-BEZ235, GDC-0980, dasatinib, and XL765 were ultimately identified as having subclass-specific efficacy in patients with a high risk of poor prognosis. Furthermore, the GBM classification and GPS score model could be considered as potential biomarkers for immunotherapy response. In summary, an integrative genomic analysis was conducted to advance individual-based therapies in GBM. Highlights: A multiomics-based classification of GBM was established. Single-cell transcriptomic profiling of GBM subclasses was revealed using Scissor. A robust prognostic risk model was developed for GBM by a machine learning method. Potential agents were predicted based on molecular and prognostic risk stratification.
Affiliation(s)
- Danyang Chen
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Zhicheng Liu
- Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Jingxuan Wang
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Chen Yang
- State Key Laboratory of Oncogenes and Related Genes, Department of Liver Surgery and Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200032, China
- Chao Pan
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Yingxin Tang
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Ping Zhang
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Na Liu
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Gaigai Li
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Yan Li
- State Key Laboratory of Oncogenes and Related Genes, Department of Liver Surgery and Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200032, China; Department of Immunology, Sun Yat-Sen University, Zhongshan School of Medicine, Guangzhou, Guangdong 510080, China
- Zhuojin Wu
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Feng Xia
- Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Cuntai Zhang
- Department of Geriatrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Hao Nie
- Department of Geriatrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China; Corresponding author
- Zhouping Tang
- Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China; Corresponding author
|
31
|
Somatic variation in normal tissues: friend or foe of cancer early detection? Ann Oncol 2022; 33:1239-1249. [PMID: 36162751 DOI: 10.1016/j.annonc.2022.09.156] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/19/2022] [Revised: 09/03/2022] [Accepted: 09/10/2022] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Seemingly normal tissues progressively become populated by mutant clones over time. Most of these clones bear mutations in well-known cancer genes but only rarely do they transform into cancer. This poses questions on what triggers cancer initiation and what implications somatic variation has for cancer early detection. DESIGN We analysed recent mutational screens of healthy and cancer-free diseased tissues to compare somatic drivers and the causes of somatic variation across tissues. We then reviewed the mechanisms of clonal expansion and their relationships with age and diseases other than cancer. We finally discussed the relevance of somatic variation for cancer initiation and how it can help or hinder cancer detection and prevention. RESULTS The extent of somatic variation is highly variable across tissues and depends on intrinsic features, such as tissue architecture and turnover, as well as the exposure to endogenous and exogenous insults. Most somatic mutations driving clonal expansion are tissue-specific and inactivate tumor suppressor genes involved in chromatin modification and cell growth signaling. Some of these genes are more frequently mutated in normal tissues than cancer, indicating a context-dependent cancer promoting or protective role. Mutant clones can persist over a long time or disappear rapidly, suggesting that their fitness depends on the dynamic equilibrium with the environment. The disruption of this equilibrium is likely responsible for their transformation into malignant clones and knowing what triggers this process is key for cancer prevention and early detection. Somatic variation should be considered in liquid biopsy, where it may contribute cancer-independent mutations, and in the identification of cancer drivers, since not all mutated genes favoring clonal expansion also drive tumorigenesis. 
CONCLUSIONS Somatic variation and the factors governing homeostasis of normal tissues should be taken into account when devising strategies for cancer prevention and early detection.
32
Identification of key somatic oncogenic mutation based on a confounder-free causal inference model. PLoS Comput Biol 2022; 18:e1010529. [PMID: 36137089 PMCID: PMC9499235 DOI: 10.1371/journal.pcbi.1010529] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/10/2022] [Accepted: 08/31/2022] [Indexed: 11/30/2022] Open
Abstract
Abnormal cell proliferation and epithelial-mesenchymal transition (EMT) are the essential events that induce cancer initiation and progression. A fundamental goal in cancer research is to develop an efficient method to detect mutated genes capable of driving cancer. Although several computational methods have been proposed to identify these key mutations, many of them focus on the association between genetic mutations and functional changes in relevant biological processes rather than on true causality. Causal effect inference provides a way to estimate the actual effect of a given mutation on the vital biological processes of cancer initiation and progression by addressing the confounder bias due to neutral mutations and unobserved latent variables. In this study, integrating genomic and transcriptomic data, we construct a novel causal inference model based on a deep variational autoencoder to identify key oncogenic somatic mutations. Applied to 10 cancer types, our method quantifies the causal effect of genetic mutations on cell proliferation and EMT by reducing both observed and unobserved confounding biases. The experimental results indicate that genes with higher mutation frequency are not necessarily more potent in inducing cancer and promoting cancer development. Moreover, our study fills a gap in the use of machine learning for causal inference to identify oncogenic mutations.
Identifying key mutations of cancers helps to better understand the mechanisms of cancer cell transformation and is critical for therapeutic approaches. Besides sequence- and structure-based computational approaches, some functional impact-based methods, which consider the association between mutation events and the activity of cancer-related biological processes, have also been developed to detect key mutations. However, these methods mainly consider correlation and ignore that correlation is far from causation because of observed and unobserved confounding factors. We develop a confounder-free machine learning-based causal inference framework to estimate the causal effect of mutations on abnormal cell proliferation and epithelial-mesenchymal transition (EMT). It fills a gap in the use of causal mechanisms to discover potential driver mutations in cancer biological systems. When our method is applied to 10 cancer types, the identified key mutations are highly consistent with publicly available, well-verified ones. Additionally, some new key mutations have also been discovered.
33
Zhang SW, Wang ZN, Li Y, Guo WF. Prioritization of cancer driver gene with prize-collecting steiner tree by introducing an edge weighted strategy in the personalized gene interaction network. BMC Bioinformatics 2022; 23:341. [PMID: 35974311 PMCID: PMC9380343 DOI: 10.1186/s12859-022-04802-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/15/2021] [Accepted: 06/13/2022] [Indexed: 11/10/2022] Open
Abstract
Background Cancer is a heterogeneous disease in which tumor genes cooperate as well as adapt and evolve to the changing conditions for individual patients. It is a meaningful task to discover the personalized cancer driver genes that can inform diagnosis and drug targeting for individual patients. However, most existing methods rank potential personalized cancer driver genes by considering only patient-specific node information on the gene/protein interaction network. These methods ignore the personalized edge weight information in the gene interaction network, leading to false positive results. Results In this work, we present a novel algorithm (called PDGPCS) to predict the Personalized cancer Driver Genes based on the Prize-Collecting Steiner tree model by considering the personalized edge weight information. PDGPCS first constructs the personalized weighted gene interaction network by integrating the personalized gene expression data and prior knowledge of the gene/protein interaction network. Then the gene mutation data and pathway data are integrated to quantify the impact of each mutant gene on every dysregulated pathway with the prize-collecting Steiner tree model. Finally, the mutant genes are ranked by their aggregated impact scores on all dysregulated pathways to prioritize the personalized cancer driver genes. Experimental results on four TCGA cancer datasets show that PDGPCS performs better than other personalized driver gene prediction methods. In addition, we verified that personalized edge weights in the gene interaction network can improve the prediction performance. Conclusions PDGPCS can more accurately identify the personalized driver genes and takes a step further toward personalized medicine and treatment. The source code of PDGPCS can be freely downloaded from https://github.com/NWPU-903PR/PDGPCS.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04802-y.
Affiliation(s)
- Shao-Wu Zhang: Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Zhen-Nan Wang: Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Yan Li: Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Wei-Feng Guo: School of Electrical Engineering, Zhengzhou University, Zhengzhou, 450001, China
34
Belikov AV, Vyatkin AD, Leonov SV. Novel Driver Strength Index highlights important cancer genes in TCGA PanCanAtlas patients. PeerJ 2022; 10:e13860. [PMID: 35975235 PMCID: PMC9375969 DOI: 10.7717/peerj.13860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 07/18/2022] [Indexed: 01/18/2023] Open
Abstract
Background Cancer driver genes are usually ranked by mutation frequency, which does not necessarily reflect their driver strength. We hypothesize that driver strength is higher for genes preferentially mutated in patients with few driver mutations overall, because these few mutations should be strong enough to initiate cancer. Methods We propose formulas for the Driver Strength Index (DSI) and the Normalized Driver Strength Index (NDSI), the latter independent of gene mutation frequency. We validate them using TCGA PanCanAtlas datasets, established driver prediction algorithms and custom computational pipelines integrating SNA, CNA and aneuploidy driver contributions at the patient-level resolution. Results DSI and especially NDSI provide substantially different gene rankings compared to the frequency approach. E.g., NDSI prioritized members of specific protein families, including G proteins GNAQ, GNA11 and GNAS, isocitrate dehydrogenases IDH1 and IDH2, and fibroblast growth factor receptors FGFR2 and FGFR3. KEGG analysis shows that top NDSI-ranked genes comprise EGFR/FGFR2/GNAQ/GNA11-NRAS/HRAS/KRAS-BRAF pathway, AKT1-MTOR pathway, and TCEB1-VHL-HIF1A pathway. Conclusion Our indices are able to select for driver gene attributes not selected by frequency sorting, potentially for driver strength. Genes and pathways prioritized are likely the strongest contributors to cancer initiation and progression and should become future therapeutic targets.
35
Liu Z, Lin D, Zhou Y, Zhang L, Yang C, Guo B, Xia F, Li Y, Chen D, Wang C, Chen Z, Leng C, Xiao Z. Exploring synthetic lethal network for the precision treatment of clear cell renal cell carcinoma. Sci Rep 2022; 12:13222. [PMID: 35918352 PMCID: PMC9345903 DOI: 10.1038/s41598-022-16657-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/02/2022] [Accepted: 07/13/2022] [Indexed: 11/29/2022] Open
Abstract
The emerging targeted therapies have revolutionized the treatment of advanced clear cell renal cell carcinoma (ccRCC) over the past 15 years. Nevertheless, lack of personalized treatment limits the development of effective clinical guidelines and improvement of patient prognosis. In this study, large-scale genomic profiles from ccRCC cohorts were explored for integrative analysis. A credible method was developed to identify synthetic lethality (SL) pairs and a list of 72 candidate pairs was determined, which might be utilized to selectively eliminate tumors with genetic aberrations using SL partners of specific mutations. Further analysis identified BRD4 and PRKDC as novel medical targets for patients with BAP1 mutations. After mapping these target genes to the comprehensive drug datasets, two agents (BI-2536 and PI-103) were found to have considerable therapeutic potentials in the BAP1 mutant tumors. Overall, our findings provided insight into the overview of ccRCC mutation patterns and offered novel opportunities for improving individualized cancer treatment.
Affiliation(s)
- Zhicheng Liu: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Dongxu Lin: Department and Institute of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Yi Zhou: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Linmeng Zhang: State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Chen Yang: State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Bin Guo: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Feng Xia: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Yan Li: Department of Immunology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, 510080, Guangdong, China
- Danyang Chen: Department of Neurology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Cun Wang: State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Zhong Chen: Department and Institute of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Chao Leng: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Zhenyu Xiao: Department of Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
36
A nonlinear model and an algorithm for identifying cancer driver pathways. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
37
Khalighi S, Joseph P, Babu D, Singh S, LaFramboise T, Guda K, Varadan V. SYSMut: decoding the functional significance of rare somatic mutations in cancer. Brief Bioinform 2022; 23:bbac280. [PMID: 35804437 PMCID: PMC9618165 DOI: 10.1093/bib/bbac280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Revised: 06/14/2022] [Accepted: 06/16/2022] [Indexed: 11/13/2022] Open
Abstract
Current tailored-therapy efforts in cancer are largely focused on a small number of highly recurrently mutated driver genes but therapeutic targeting of these oncogenes remains challenging. However, the vast number of genes mutated infrequently across cancers has received less attention, in part, due to a lack of understanding of their biological significance. We present SYSMut, an extendable systems biology platform that can robustly infer the biologic consequences of somatic mutations by integrating routine multiomics profiles in primary tumors. We establish SYSMut's improved performance vis-à-vis state-of-the-art driver gene identification methodologies by recapitulating the functional impact of known driver genes, while additionally identifying novel functionally impactful mutated genes across 29 cancers. Subsequent application of SYSMut on low-frequency gene mutations in head and neck squamous cell (HNSC) cancers, followed by molecular and pharmacogenetic validation, revealed the lipidogenic network as a novel therapeutic vulnerability in aggressive HNSC cancers. SYSMut is thus a robust scalable framework that enables the discovery of new targetable avenues in cancer.
Affiliation(s)
- Sirvan Khalighi: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center; Department of Genetics and Genome Sciences
- Peronne Joseph: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center
- Deepak Babu: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center
- Salendra Singh: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center
- Kishore Guda: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center; Digestive Health Research Institute; Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Vinay Varadan: Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center
38
Bao Z, Zheng Y, Li X, Huo Y, Zhao G, Zhang F, Li X, Xu P, Liu W, Han H. A simple pre-disease state prediction method based on variations of gene vector features. Comput Biol Med 2022; 148:105890. [DOI: 10.1016/j.compbiomed.2022.105890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 06/27/2022] [Accepted: 07/16/2022] [Indexed: 11/24/2022]
39
Petrov I, Alexeyenko A. Individualized discovery of rare cancer drivers in global network context. eLife 2022; 11:74010. [PMID: 35593700 PMCID: PMC9159755 DOI: 10.7554/elife.74010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2021] [Accepted: 05/20/2022] [Indexed: 11/13/2022] Open
Abstract
Recent advances in genome sequencing have expanded the space of known cancer driver genes several-fold. However, most of this surge was based on computational analysis of somatic mutation frequencies and/or their impact on protein function. In contrast, experimental research necessarily accounted for the functional context of mutations interacting with other genes and conferring cancer phenotypes. Ultimately, it is such results that become the ‘hard currency’ of cancer biology. The new method, NEAdriver, employs the knowledge accumulated thus far, in the form of a global interaction network and functionally annotated pathways, to recover known driver genes and predict novel ones. The driver discovery was individualized by accounting for the co-occurrence of mutations in each tumour genome, as an alternative to summarizing information over whole cancer patient cohorts. For each somatic genome change, probabilistic estimates from two lanes of network analysis were combined into joint likelihoods of being a driver. Thus, the ability to detect previously unnoticed candidate driver events emerged from combining individual genomic context with a network perspective. The procedure was applied to the 10 largest cancer cohorts, followed by an evaluation of error rates against previous cancer gene sets. The discovered driver combinations were shown to be informative on cancer outcome. This revealed driver genes with individually sparse mutation patterns that would not be detectable by other computational methods and that relate to cancer biology domains poorly covered by previous analyses. In particular, recurrent mutations of collagen, laminin, and integrin genes were observed in the adenocarcinoma and glioblastoma cancers. Considering constellation patterns of candidate drivers in individual cancer genomes opens a novel avenue for personalized cancer medicine.
Affiliation(s)
- Iurii Petrov: Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden; Science for Life Laboratory, Solna, Sweden
- Andrey Alexeyenko: Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden; Science for Life Laboratory, Solna, Sweden; Evi-networks, enskild konsultföretag, Huddinge, Sweden
40
Erten C, Houdjedj A, Kazan H, Taleb Bahmed AA. PersonaDrive: A Method for the Identification and Prioritization of Personalized Cancer Drivers. Bioinformatics 2022; 38:3407-3414. [PMID: 35579340 DOI: 10.1093/bioinformatics/btac329] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/30/2021] [Revised: 05/06/2022] [Accepted: 05/11/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION A major challenge in cancer genomics is to distinguish the driver mutations that are causally linked to cancer from passenger mutations that do not contribute to cancer development. The majority of existing methods provide a single driver gene list for the entire cohort of patients. However, since mutation profiles of patients from the same cancer type show a high degree of heterogeneity, a more ideal approach is to identify patient-specific drivers. RESULTS We propose a novel method that integrates genomic data, biological pathways, and protein connectivity information for personalized identification of driver genes. The method is formulated on a personalized bipartite graph for each patient. Our approach provides a personalized ranking of the mutated genes of a patient based on the sum of weighted 'pairwise pathway coverage' scores across all the samples, where appropriate pairwise patient similarity scores are used as weights to normalize these coverage scores. We compare our method against three state-of-the-art patient-specific cancer gene prioritization methods. The comparisons are with respect to a novel evaluation method that takes into account the personalized nature of the problem. We show that our approach outperforms the existing alternatives for both the TCGA and the cell line data. Additionally, we show that the KEGG/Reactome pathways enriched in our ranked genes and those that are enriched in cell lines' reference sets overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods. Our findings can provide valuable information towards the development of personalized treatments and therapies. AVAILABILITY All the code and data are available at https://github.com/abu-compbio/PersonaDrive (archived at https://doi.org/10.5281/zenodo.6520187). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cesim Erten: Department of Computer Engineering, Antalya Bilim University, Antalya, 07190, Turkey
- Aissa Houdjedj: Department of Computer Engineering, Antalya Bilim University, Antalya, 07190, Turkey; Department of Computer Engineering, Akdeniz University, Antalya, 07070, Turkey
- Hilal Kazan: Department of Computer Engineering, Antalya Bilim University, Antalya, 07190, Turkey
- Ahmed Amine Taleb Bahmed: Electrical and Computer Engineering Graduate Program, Antalya Bilim University, Antalya, 07190, Turkey
41
Chen Z, Lu Y, Cao B, Zhang W, Edwards A, Zhang K. Driver gene detection through Bayesian network integration of mutation and expression profiles. Bioinformatics 2022; 38:2781-2790. [PMID: 35561191 PMCID: PMC9113331 DOI: 10.1093/bioinformatics/btac203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 03/12/2022] [Accepted: 04/06/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION The identification of mutated driver genes and the corresponding pathways is one of the primary goals in understanding tumorigenesis at the patient level. Integration of multi-dimensional genomic data from existing repositories, e.g., The Cancer Genome Atlas (TCGA), offers an effective way to tackle this issue. In this study, we aimed to leverage the complementary genomic information of individuals and create an integrative framework to identify cancer-related driver genes. Specifically, based on pinpointed differentially expressed genes, variants in somatic mutations and a gene interaction network, we proposed an unsupervised Bayesian network integration (BNI) method to detect driver genes and estimate the disease propagation at the patient and/or cohort levels. This new method first captures inherent structural information to construct a functional gene mutation network and then extracts the driver genes and their controlled downstream modules using the minimum cover subset method. RESULTS Using other credible sources (e.g. Cancer Gene Census and Network of Cancer Genes), we validated the driver genes predicted by the BNI method in three TCGA pan-cancer cohorts. The proposed method provides an effective approach to address tumor heterogeneity faced by personalized medicine. The pinpointed drivers warrant further wet laboratory validation. AVAILABILITY AND IMPLEMENTATION The supplementary tables and source code can be obtained from https://xavieruniversityoflouisiana.sharefile.com/d-se6df2c8d0ebe4800a3030311efddafe5. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
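The abstract above mentions extracting driver genes with a "minimum cover subset method" but gives no details. As a rough illustration only, the following sketch uses the standard greedy approximation to minimum set cover; it is not the BNI implementation. The mapping from mutated genes to the differentially expressed genes they "cover", and all gene names, are hypothetical.

```python
def greedy_minimum_cover(coverage, universe):
    """Greedy approximation to minimum set cover.

    coverage: dict mapping a candidate (e.g. a mutated gene) to the set of
              items it covers (e.g. downstream differentially expressed genes).
    universe: iterable of items that should be covered.
    Returns (chosen candidates in selection order, items left uncovered).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate covering the most still-uncovered items;
        # iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(coverage), key=lambda c: len(coverage[c] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # the remaining items cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical toy input: mutated genes M1..M3 and expression targets D1..D5.
toy_coverage = {
    "M1": {"D1", "D2", "D3"},
    "M2": {"D3", "D4"},
    "M3": {"D4"},
}
drivers, missed = greedy_minimum_cover(toy_coverage, ["D1", "D2", "D3", "D4", "D5"])
print(drivers, missed)
```

The greedy rule does not guarantee the smallest possible cover, but it is a common baseline when an exact solution is too expensive.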
Affiliation(s)
- Zhong Chen: Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA; Bioinformatics Core of Xavier RCMI Center for Cancer Research, Xavier University of Louisiana, New Orleans, LA 70125, USA
- You Lu: Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA; Bioinformatics Core of Xavier RCMI Center for Cancer Research, Xavier University of Louisiana, New Orleans, LA 70125, USA
- Bo Cao: Division of Basic and Pharmaceutical Sciences, College of Pharmacy, Xavier University of Louisiana, New Orleans, LA 70125, USA
- Wensheng Zhang: Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA; Bioinformatics Core of Xavier RCMI Center for Cancer Research, Xavier University of Louisiana, New Orleans, LA 70125, USA
- Andrea Edwards: Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA
- Kun Zhang: To whom correspondence should be addressed
42
Rahimi M, Teimourpour B, Akhavan-Safar M. DGRanker: Cancer Driver Gene Detection in Human Transcriptional Regulatory Network. Iranian Journal of Biotechnology 2022; 20:e3066. [PMID: 36337068 PMCID: PMC9583818 DOI: 10.30498/ijb.2022.289013.3066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
BACKGROUND Cancer is a group of diseases that has received much attention in biological research because of its high mortality rate and the lack of accurate identification of its root causes. In such studies, researchers usually try to identify the cancer driver genes (CDGs) that initiate cancer in a cell. The majority of the methods proposed so far for the identification of CDGs are based on gene expression data and the concept of mutation in genomic data. Recently, some models based on networking techniques and the concept of influence maximization have been proposed to identify these genes. OBJECTIVES We aimed to construct the cancer transcriptional regulatory network and identify cancer driver genes using a network science approach without the use of mutation and genomic data. MATERIALS AND METHODS In this study, we employed social influence network theory, which is based on the concept of the influence and power of webpages, to identify CDGs in the human gene regulatory network (GRN). First, we created GRNs using gene expression data and existing nodes and edges. Next, we applied the modified algorithm to the GRNs under study, weighting the regulatory interaction edges using the influence spread concept. Nodes with the highest ratings were selected as the CDGs. RESULTS The results show that our proposed method outperforms most of the other computational and network-based methods in identifying CDGs. In addition, the proposed method can identify many CDGs that are overlooked by all previously published methods. CONCLUSIONS Our study demonstrated that Google's PageRank algorithm can be modified and used as a network-based method for identifying cancer driver genes in transcriptional regulatory networks. Furthermore, the proposed method can be considered complementary to other computational tools for cancer driver gene identification.
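To make the ranking idea above concrete, here is a minimal sketch of standard weighted PageRank computed by power iteration. It is not the modified algorithm of the paper, whose edge-weighting scheme is not given in the abstract; the damping factor 0.85, the gene names, and the edge weights are illustrative assumptions. Note that plain PageRank rewards incoming edges, and whether the authors reverse edge directions is not stated here.

```python
def weighted_pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Standard weighted PageRank by power iteration.

    edges: dict mapping (source, target) -> positive edge weight.
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(max_iter):
        # Score held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy regulatory network: (regulator, target) -> made-up weight.
toy_edges = {
    ("G1", "G2"): 1.0,
    ("G1", "G3"): 2.0,
    ("G2", "G3"): 1.0,
    ("G3", "G1"): 1.0,
}
scores = weighted_pagerank(toy_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network G3 receives the most weighted incoming influence and ranks first, followed by G1 and G2.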
Affiliation(s)
- Majid Rahimi: Department of Information Technology, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Babak Teimourpour: School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Chamran/Al-e-Ahmad Highways Intersection, Tehran, Iran
- Mostafa Akhavan-Safar: Department of Computer and Information Technology Engineering, Payame Noor University, Tehran, Iran
43
Huo Y, Li X, Xu P, Bao Z, Liu W. Analysis of Breast Cancer Based on the Dysregulated Network. Front Genet 2022; 13:856075. [PMID: 35242172 PMCID: PMC8886234 DOI: 10.3389/fgene.2022.856075] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/16/2022] [Accepted: 01/28/2022] [Indexed: 11/13/2022] Open
Abstract
Breast cancer is a heterogeneous disease, and its development is closely associated with the underlying molecular regulatory network. In this paper, we propose a new way to measure the regulation strength between genes based on their expression values, and construct the dysregulated networks (DNs) for the four subtypes of breast cancer. Our results show that the key dysregulated networks (KDNs) are significantly enriched in critical breast cancer-related pathways and driver genes; closely related to drug targets; and have significant differences in survival analysis. Moreover, the key dysregulated genes could serve as potential driver genes, drug targets, and prognostic markers for each breast cancer subtype. Therefore, the KDN is expected to be an effective and novel way to understand the mechanisms of breast cancer.
Affiliation(s)
- Yanhao Huo: Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Xianbin Li: Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Peng Xu: Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China; School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
- Zhenshen Bao: Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China; School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
- Wenbin Liu: Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
44
Identification of a Five-Gene Panel to Assess Prognosis for Gastric Cancer. BioMed Research International 2022; 2022:5593619. [PMID: 35187167 PMCID: PMC8850031 DOI: 10.1155/2022/5593619] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/18/2021] [Revised: 12/30/2021] [Accepted: 01/04/2022] [Indexed: 11/25/2022]
Abstract
Methods Two datasets were used as training and validation cohorts to establish the predictive model. We used three types of screening criteria: background analysis, pathway analysis, and functional analysis provided by the cBioPortal website. Fisher's exact test and multivariable logistic regression were performed to screen out related genes. Furthermore, we performed receiver operating characteristic (ROC) and Kaplan–Meier curve analyses to evaluate the correlation between the selected genes and overall survival. Results We screened five genes (KNL1, NRXN1, C6, CCDC169-SOHLH2, and TTN) that were highly related to recurrence of GC. The area under the ROC curve (AUC) was 0.813, which was much higher than that of the baseline model (AUC = 0.699). This result suggested that mutations in the five selected genes had a significant effect on the prediction of recurrence compared with other factors (age, stage, history, etc.). Furthermore, the Kaplan–Meier estimator also revealed that mutations in the five genes positively correlated with patient survival. Conclusions Patients who have mutations in these five genes may experience longer survival than those who do not. This five-gene panel will likely be a practical tool for prognostic evaluation and will provide another possible way for clinicians to determine therapy.
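The AUC values quoted above compare how well two models rank patients. As a reminder of what that number measures, the sketch below computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores are made-up illustrative values, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: sequence of 0/1 outcomes (1 = event, e.g. recurrence).
    scores: sequence of predicted risk scores, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three events and three non-events.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(toy_labels, toy_scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation; the pairwise loop is quadratic in the cohort size, which is acceptable for small examples.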
45
Akhavan-Safar M, Teimourpour B, Nowzari-Dalini A. A network-based method for detecting cancer driver gene in transcriptional regulatory networks using the structure analysis of weighted regulatory interactions. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220127094224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
The identification of genes that instigate cell anomalies and cause cancer in humans is an important field in oncology research. Abnormalities in these genes are transferred to other genes in the cell, disrupting its normal functionality. Such genes are known as cancer driver genes (CDGs). Various methods have been proposed for predicting CDGs, most of which are based on genomic data and computational methods. Some novel bioinformatic approaches have been developed.
Objective:
In this article, we propose a network-based algorithm, SalsaDriver (Stochastic approach for link-structure analysis to driver detection), which can calculate the receiving and influencing power of each gene using the stochastic analysis of regulatory interaction structures in gene regulatory networks.
Method:
First, regulatory networks related to breast, colon, and lung cancers were constructed using gene expression data and a list of regulatory interactions, the weights of which were then calculated using biological and topological features of the network. After that, the weighted regulatory interactions were used in the structure analysis of interactions achieved using two separate Markov chains on the bipartite graph taken from the main graph of the gene network and implementing the stochastic approach for link-structure analysis. The proposed algorithm categorizes higher-ranked genes as driver genes.
Results:
The proposed algorithm was compared with 24 other computational and network tools based on the F-measure value and the number of detected CDGs. The results were validated using four valid databases. The findings of this study show that SalsaDriver outperforms other methods and can identify a significant number of driver genes not identified using other methods.
Conclusion:
The SalsaDriver network-based approach is suitable for predicting CDGs and can be used as a complementary method along with other computational tools.
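As a rough illustration of the "two separate Markov chains on the bipartite graph" idea described in the Method paragraph above, the following is a minimal sketch of the generic SALSA scheme (stochastic approach for link-structure analysis) on a weighted regulatory edge list. It is not the authors' SalsaDriver implementation: the function name `salsa_scores`, the gene labels, and the edge weights are hypothetical, and the mapping of hub scores to "influencing power" and authority scores to "receiving power" is an assumption made for illustration.

```python
from collections import defaultdict


def salsa_scores(edges, iterations=100):
    """Generic SALSA on a weighted directed graph.

    edges: list of (regulator, target, weight) tuples with weight > 0.
    Returns (hub, authority) dicts, each summing to 1.
    """
    out_w = defaultdict(float)  # total outgoing weight per regulator (hub side)
    in_w = defaultdict(float)   # total incoming weight per target (authority side)
    for u, v, w in edges:
        out_w[u] += w
        in_w[v] += w

    hubs, auths = sorted(out_w), sorted(in_w)
    hub = {g: 1.0 / len(hubs) for g in hubs}
    auth = {g: 1.0 / len(auths) for g in auths}

    for _ in range(iterations):
        # Authority chain: step backward along an in-edge, then forward along an out-edge.
        via_hub = defaultdict(float)
        for u, v, w in edges:
            via_hub[u] += auth[v] * w / in_w[v]
        new_auth = defaultdict(float)
        for u, v, w in edges:
            new_auth[v] += via_hub[u] * w / out_w[u]

        # Hub chain: step forward along an out-edge, then backward along an in-edge.
        via_auth = defaultdict(float)
        for u, v, w in edges:
            via_auth[v] += hub[u] * w / out_w[u]
        new_hub = defaultdict(float)
        for u, v, w in edges:
            new_hub[u] += via_auth[v] * w / in_w[v]

        auth = {g: new_auth[g] for g in auths}
        hub = {g: new_hub[g] for g in hubs}
    return hub, auth


# Hypothetical toy network: two regulators, three targets.
toy_edges = [("TF1", "A", 2.0), ("TF1", "B", 1.0), ("TF2", "B", 1.0), ("TF2", "C", 3.0)]
hub_scores, auth_scores = salsa_scores(toy_edges)
ranked = sorted(auth_scores, key=auth_scores.get, reverse=True)
```

On a connected graph such as this toy example, the authority scores converge to values proportional to weighted in-degree and the hub scores to values proportional to weighted out-degree, which is a known property of SALSA; a ranking such as `ranked` is then the kind of ordering from which higher-ranked genes could be selected.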
Affiliation(s)
- Mostafa Akhavan-Safar
  - Department of Computer and Information Technology Engineering, Payame Noor University (PNU), P.O. Box 19395-4697, Tehran, Iran
  - Department of Information Technology Engineering, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Babak Teimourpour
  - Department of Information Technology Engineering, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Abbas Nowzari-Dalini
  - Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran
|
46
|
Dressler L, Bortolomeazzi M, Keddar MR, Misetic H, Sartini G, Acha-Sagredo A, Montorsi L, Wijewardhane N, Repana D, Nulsen J, Goldman J, Pollitt M, Davis P, Strange A, Ambrose K, Ciccarelli FD. Comparative assessment of genes driving cancer and somatic evolution in non-cancer tissues: an update of the Network of Cancer Genes (NCG) resource. Genome Biol 2022; 23:35. [PMID: 35078504 PMCID: PMC8790917 DOI: 10.1186/s13059-022-02607-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 08/30/2021] [Accepted: 01/10/2022] [Indexed: 12/30/2022] Open
Abstract
Background: Genetic alterations of somatic cells can drive non-malignant clone formation and promote cancer initiation. However, the link between these processes remains unclear and hampers our understanding of tissue homeostasis and cancer development. Results: Here, we collect a literature-based repertoire of 3355 well-known or predicted drivers of cancer and non-cancer somatic evolution in 122 cancer types and 12 non-cancer tissues. Mapping the alterations of these genes in 7953 pan-cancer samples reveals that, despite its large size, the known compendium of drivers is still incomplete and biased towards frequently occurring coding mutations. High overlap exists between drivers of cancer and non-cancer somatic evolution, although significant differences emerge in their recurrence. We confirm and expand the unique properties of drivers and identify a core of evolutionarily conserved and essential genes whose germline variation is strongly counter-selected. Somatic alteration in even one of these genes is sufficient to drive clonal expansion but not malignant transformation. Conclusions: Our study offers a comprehensive overview of our current understanding of the genetic events initiating clone expansion and cancer, revealing significant gaps and biases that still need to be addressed. The compendium of cancer and non-cancer somatic drivers, their literature support, and properties are accessible in the Network of Cancer Genes and Healthy Drivers resource at http://www.network-cancer-genes.org/. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02607-z.
|
47
|
Comprehensive patient-level classification and quantification of driver events in TCGA PanCanAtlas cohorts. PLoS Genet 2022; 18:e1009996. [PMID: 35030162 PMCID: PMC8759692 DOI: 10.1371/journal.pgen.1009996] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/24/2021] [Accepted: 12/14/2021] [Indexed: 12/14/2022] Open
Abstract
There is a growing need to develop novel therapeutics for targeted treatment of cancer. The prerequisite to success is the knowledge about which types of molecular alterations are predominantly driving tumorigenesis. To shed light onto this subject, we have utilized the largest database of human cancer mutations, TCGA PanCanAtlas, together with multiple established algorithms for cancer driver prediction (2020plus, CHASMplus, CompositeDriver, dNdScv, DriverNet, HotMAPS, OncodriveCLUSTL, OncodriveFML), and developed four novel computational pipelines: SNADRIF (Single Nucleotide Alteration DRIver Finder), GECNAV (Gene Expression-based Copy Number Alteration Validator), ANDRIF (ANeuploidy DRIver Finder) and PALDRIC (PAtient-Level DRIver Classifier). A unified workflow integrating all these pipelines, algorithms and datasets at cohort and patient levels was created. We have found that there are on average 12 driver events per tumour, of which 0.6 are single nucleotide alterations (SNAs) in oncogenes, 1.5 are amplifications of oncogenes, 1.2 are SNAs in tumour suppressors, 2.1 are deletions of tumour suppressors, 1.5 are driver chromosome losses, 1 is a driver chromosome gain, 2 are driver chromosome arm losses, and 1.5 are driver chromosome arm gains. The average number of driver events per tumour increases with age (from 7 to 15) and cancer stage (from 10 to 15) and varies strongly between cancer types (from 1 to 24). Patients with 1 and 7 driver events per tumour are the most frequent, and there are very few patients with more than 40 events. In tumours having only one driver event, this event is most often an SNA in an oncogene. However, with an increasing number of driver events per tumour, the contribution of SNAs decreases, whereas the contribution of copy-number alterations and aneuploidy events increases.
By analysing genomic and transcriptomic data from 10,000 cancer patients through our custom-built computational pipelines and previously established third-party algorithms, we have found that half of all driver events in a patient’s tumour appear to be gains and losses of chromosomal arms and whole chromosomes. We therefore suggest that future therapeutics development efforts should be focused on targeting aneuploidy. We have also found that approximately a third of driver events in a patient are whole gene amplifications and deletions. Thus, therapies aimed at copy-number alterations also appear very promising. On the other hand, drugs aimed at point mutations are predicted to be less successful, as these alterations are responsible for just a couple of drivers per tumour. One notable exception is patients having only one driver event in their tumours, as this event is almost always a single nucleotide alteration in an oncogene.
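The per-category averages quoted in this abstract can be checked against its summary statements ("half", "approximately a third", "a couple") with a few lines of arithmetic. The sketch below uses only the numbers stated above; the grouping into three classes and the variable names are illustrative assumptions, not part of the authors' pipelines.

```python
# Average driver events per tumour, as quoted in the abstract.
events = {
    "SNA in oncogene": 0.6,
    "oncogene amplification": 1.5,
    "SNA in tumour suppressor": 1.2,
    "tumour suppressor deletion": 2.1,
    "chromosome loss": 1.5,
    "chromosome gain": 1.0,
    "chromosome arm loss": 2.0,
    "chromosome arm gain": 1.5,
}

# Illustrative grouping (an assumption) into the three classes the summary discusses.
groups = {
    "point mutations": ["SNA in oncogene", "SNA in tumour suppressor"],
    "copy-number alterations": ["oncogene amplification", "tumour suppressor deletion"],
    "aneuploidy": ["chromosome loss", "chromosome gain",
                   "chromosome arm loss", "chromosome arm gain"],
}

QUOTED_TOTAL = 12  # "on average 12 driver events per tumour"

component_sum = sum(events.values())
group_sums = {name: sum(events[k] for k in keys) for name, keys in groups.items()}
group_shares = {name: s / QUOTED_TOTAL for name, s in group_sums.items()}

for name in groups:
    print(f"{name}: {group_sums[name]:.1f} events, {group_shares[name]:.0%} of {QUOTED_TOTAL}")
print(f"sum of listed components: {component_sum:.1f}")
```

The listed components sum to 11.4 rather than 12, a gap that may simply reflect rounding of the individual averages. Relative to the quoted total of 12, aneuploidy events (6.0) account for half, copy-number alterations (3.6) for about 30%, and point mutations (1.8) for about 15%, which matches the abstract's wording.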
|
48
|
Huang XT, Jia S, Gao L, Wu J. Reconstruction of human protein-coding gene functional association network based on machine learning. Brief Bioinform 2022; 23:6502555. [PMID: 35021191 DOI: 10.1093/bib/bbab552] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2021] [Revised: 11/13/2021] [Accepted: 12/02/2021] [Indexed: 01/02/2023] Open
Abstract
Networks consisting of molecular interactions are intrinsically dynamical systems of an organism. The interactions curated in molecular interaction databases are still incomplete and contain false positives introduced by high-throughput screening experiments. In this study, we propose a framework to integrate interactions of functionally associated protein-coding genes from 31 data sources to reconstruct a network with high coverage and quality. For each interaction, 369 features were constructed, including properties of both the interaction and the involved genes. The training and validation sets were built with pathway interactions as positives and with potential negative instances resulting from our proposed semi-supervised strategy. A random forest classifier was then trained and applied multiple times to give a score for each interaction. After setting a threshold estimated by a binomial distribution, a Human protein-coding Gene Functional Association Network (HuGFAN) was reconstructed with 20,383 genes and 1,185,429 high-confidence interactions. HuGFAN was then compared with other networks from the data sources with respect to network properties, suggesting that HuGFAN is more function and pathway related. Finally, HuGFAN was applied to identify cancer drivers through two well-known network-based methods (DriverNet and HotNet2) to show its outstanding performance compared with other networks. HuGFAN and other supplementary files are freely available at https://github.com/xthuang226/HuGFAN.
Affiliation(s)
- Xiao-Tai Huang
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Songwei Jia
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, Shaanxi, China
- Jing Wu
  - School of Mechanical Engineering, Dongguan University of Technology, Dongguan, 523808, Guangdong, China
|
49
|
Sudhakar M, Rengaswamy R, Raman K. Novel ratio-metric features enable the identification of new driver genes across cancer types. Sci Rep 2022; 12:5. [PMID: 34997044 PMCID: PMC8741763 DOI: 10.1038/s41598-021-04015-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/26/2021] [Accepted: 12/13/2021] [Indexed: 12/27/2022] Open
Abstract
An emergent area of cancer genomics is the identification of driver genes. Driver genes confer a selective growth advantage to the cell. While several driver genes have been discovered, many remain undiscovered, especially those mutated at a low frequency across samples. This study defines new features and builds a pan-cancer model, cTaG, to identify new driver genes. The features capture the functional impact of the mutations as well as their recurrence across samples, which helps build a model that is not biased against genes mutated at low frequency. The model classifies genes into the functional categories of driver genes, tumour suppressor genes (TSGs) and oncogenes (OGs), which have distinct mutation-type profiles. We overcome overfitting and show that certain mutation types, such as nonsense mutations, are more important for classification. Further, cTaG was employed to identify tissue-specific driver genes. Some known cancer driver genes predicted by cTaG as TSGs with high probability are ARID1A, TP53, and RB1. In addition to these known genes, the potential driver genes predicted are CD36, ZNF750 and ARHGAP35 as TSGs and TAB3 as an oncogene. Overall, our approach surmounts the issue of low recall and bias towards genes with high mutation rates and predicts potential new driver genes for further experimental screening. cTaG is available at https://github.com/RamanLab/cTaG.
Affiliation(s)
- Malvika Sudhakar
  - Department of Biotechnology, Bhupat Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology Madras, Chennai, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, India
- Raghunathan Rengaswamy
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology Madras, Chennai, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, India
  - Department of Chemical Engineering, Indian Institute of Technology Madras, Chennai, India
- Karthik Raman
  - Department of Biotechnology, Bhupat Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology Madras, Chennai, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, India
|
50
|
Zeng J, Shufean MA. Molecular-based precision oncology clinical decision making augmented by artificial intelligence. Emerg Top Life Sci 2021; 5:757-764. [PMID: 34874054 PMCID: PMC8786281 DOI: 10.1042/etls20210220] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/20/2021] [Revised: 11/08/2021] [Accepted: 11/16/2021] [Indexed: 01/03/2023]
Abstract
The rapid growth and decreasing cost of next-generation sequencing (NGS) technologies have made it possible to conduct routine large-panel genomic sequencing in many disease settings, especially in the oncology domain. Furthermore, it is now known that optimal disease management of patients depends on individualized cancer treatment guided by comprehensive molecular testing. However, translating results from molecular sequencing reports into actionable clinical insights remains a challenge for most clinicians. In this review, we discuss some representative systems that leverage artificial intelligence (AI) to facilitate parts of clinicians' decision making based upon molecular data, focusing on their application in precision oncology. Some limitations and pitfalls of the current application of AI in clinical decision making are also discussed.
Affiliation(s)
- Jia Zeng
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX, U.S.A
- Md Abu Shufean
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX, U.S.A
|