1. Zhang H, Lin C, Chen Y, Shen X, Wang R, Chen Y, Lyu J. Enhancing Molecular Network-Based Cancer Driver Gene Prediction Using Machine Learning Approaches: Current Challenges and Opportunities. J Cell Mol Med 2025; 29:e70351. [PMID: 39804102 PMCID: PMC11726689 DOI: 10.1111/jcmm.70351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2024] [Revised: 12/24/2024] [Accepted: 01/02/2025] [Indexed: 01/16/2025] Open
Abstract
Cancer is a complex disease driven by mutations in genes that play critical roles in cellular processes. The identification of cancer driver genes is crucial for understanding tumorigenesis, developing targeted therapies and identifying rational drug targets. Experimental identification and validation of cancer driver genes are time-consuming and costly. Studies have demonstrated that genes that interact with one another are associated with similar phenotypes. Therefore, identifying cancer driver genes using molecular network-based approaches is necessary. Random walk-based approaches on molecular networks, which integrate mutation data with protein-protein interaction networks, have been widely employed in predicting cancer driver genes and have demonstrated robust predictive potential. However, recent advancements in deep learning, particularly graph-based models, have provided novel opportunities for enhancing the prediction of cancer driver genes. This review aimed to comprehensively explore how machine learning methodologies, particularly network propagation, graph neural networks, autoencoders, graph embeddings, and attention mechanisms, improve the scalability and interpretability of molecular network-based cancer gene prediction.
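As a rough illustration of the random walk idea this abstract refers to, the sketch below runs a random walk with restart over a tiny, hypothetical interaction network, with mutation evidence as the restart distribution. The gene names, the network, the function name and the parameter values are all assumptions made for illustration; none of them comes from the review, and this is not the implementation of any specific published tool.

```python
from typing import Dict, Set


def random_walk_with_restart(
    adjacency: Dict[str, Set[str]],
    seed_scores: Dict[str, float],
    restart: float = 0.5,
    tol: float = 1e-12,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Propagate seed scores over an undirected network (toy sketch).

    Assumes `adjacency` is symmetric and every neighbour is also a key.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive seed score")
    # Restart distribution p0: seed scores normalised to sum to 1.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Each step: jump back to the seeds with probability `restart`,
        # otherwise move to a uniformly chosen neighbour.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # An isolated node keeps its non-restart mass.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path network G1 - G2 - G3 - G4 in which only G1 is mutated.
toy_network = {"G1": {"G2"}, "G2": {"G1", "G3"}, "G3": {"G2", "G4"}, "G4": {"G3"}}
scores = random_walk_with_restart(toy_network, {"G1": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting the score decays with network distance from the mutated gene, which is the intuition behind ranking unmutated neighbours of mutated genes as candidate drivers.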
Affiliation(s)
- Hao Zhang
  - Postgraduate Training Base Alliance of Wenzhou Medical University, Wenzhou, Zhejiang, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Chaohuan Lin
  - Postgraduate Training Base Alliance of Wenzhou Medical University, Wenzhou, Zhejiang, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Ying'ao Chen
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
- Ruizhe Wang
  - Wenzhou Longwan High School, Wenzhou, Zhejiang, China
- Yiqi Chen
  - Wenzhou Longwan High School, Wenzhou, Zhejiang, China
- Jie Lyu
  - Postgraduate Training Base Alliance of Wenzhou Medical University, Wenzhou, Zhejiang, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, China
2. Pham VVH, Jue TR, Bell JL, Luciani F, Michniewicz F, Cirillo G, Vahdat L, Mayoh C, Vittorio O. A novel network-based method identifies a cuproplasia-related pan-cancer gene signature to predict patient outcome. Hum Genet 2024; 143:1145-1162. [PMID: 38642129 PMCID: PMC11485146 DOI: 10.1007/s00439-024-02673-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/26/2024] [Indexed: 04/22/2024]
Abstract
Copper is a vital micronutrient involved in many biological processes and is an essential component of tumour cell growth and migration. Copper influences tumour growth through a process called cuproplasia, defined as abnormal copper-dependent cell growth and proliferation. Copper-chelation therapy targeting this process has demonstrated efficacy in several clinical trials against cancer. While the molecular pathways associated with cuproplasia are partially known, genetic heterogeneity across different cancer types has limited the understanding of how cuproplasia impacts patient survival. Utilising RNA-sequencing data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) datasets, we generated gene regulatory networks to identify the critical cuproplasia-related genes across 23 different cancer types. From this, we identified a novel 8-gene cuproplasia-related gene signature associated with pan-cancer survival, and a 6-gene prognostic risk score model in low-grade glioma. These findings highlight the use of gene regulatory networks to identify cuproplasia-related gene signatures that could be used to generate risk score models. This can potentially identify patients who could benefit from copper-chelation therapy and point to novel targeted therapeutic strategies.
Affiliation(s)
- Vu Viet Hoang Pham
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Toni Rose Jue
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Jessica Lilian Bell
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Fabio Luciani
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Filip Michniewicz
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Giuseppe Cirillo
  - Department of Pharmacy, Health and Nutritional Sciences, University of Calabria, Rende, Italy
- Linda Vahdat
  - Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire, US
- Chelsea Mayoh
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Kensington, NSW, Australia
- Orazio Vittorio
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
3. Zhang T, Zhang SW, Xie MY, Li Y. Identifying cooperating cancer driver genes in individual patients through hypergraph random walk. J Biomed Inform 2024; 157:104710. [PMID: 39159864 DOI: 10.1016/j.jbi.2024.104710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 07/30/2024] [Accepted: 08/14/2024] [Indexed: 08/21/2024]
Abstract
OBJECTIVE: Identifying cancer driver genes, especially rare or patient-specific cancer driver genes, is a primary goal in cancer therapy. Although researchers have proposed some methods to tackle this problem, these methods mostly identify cancer driver genes at the single-gene level, overlooking the cooperative relationships among cancer driver genes. Identifying cooperating cancer driver genes in individual patients is pivotal for understanding cancer etiology and advancing the development of personalized therapies.
METHODS: Here, we propose a novel Personalized Cooperating cancer Driver Genes (PCoDG) method that uses hypergraph random walk to identify the cancer driver genes that cooperatively drive cancer progression in individual patients. By leveraging the ability of hypergraphs to represent multi-way relationships, PCoDG first employs a personalized hypergraph to depict the complex interactions among mutated genes and differentially expressed genes of an individual patient. Then, a hypergraph random walk algorithm based on hyperedge similarity is used to calculate the importance scores of mutated genes, and these scores are integrated with signaling pathway data to identify the cooperating cancer driver genes in individual patients.
RESULTS: The experimental results on three TCGA cancer datasets (i.e., BRCA, LUAD, and COADREAD) demonstrate the effectiveness of PCoDG in identifying personalized cooperating cancer driver genes. The genes identified by PCoDG not only offer valuable insights into patient stratification correlating with clinical outcomes, but also provide a useful reference resource for tailoring personalized treatments.
CONCLUSION: We propose a novel method that can effectively identify cooperating cancer driver genes for individual patients, thereby deepening our understanding of the cooperative relationships among personalized cancer driver genes and advancing the development of precision oncology.
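To make the notion of a random walk on a hypergraph concrete, the sketch below uses the textbook two-step walk: from a gene, pick one of its hyperedges in proportion to the hyperedge weight, then pick a gene inside that hyperedge uniformly, with a restart to the seed genes. The abstract does not specify PCoDG's hyperedge-similarity weighting, so uniform weights stand in for it here; the gene names, hyperedges, function name and restart value are hypothetical and purely illustrative.

```python
from typing import Dict, Set


def hypergraph_random_walk(
    hyperedges: Dict[str, Set[str]],
    weights: Dict[str, float],
    seeds: Dict[str, float],
    restart: float = 0.3,
    n_iter: int = 200,
) -> Dict[str, float]:
    """Score nodes by a random walk with restart on a hypergraph (toy sketch)."""
    nodes = sorted(set().union(*hyperedges.values()))
    # Weighted degree of a node: total weight of the hyperedges containing it.
    degree = {
        v: sum(weights[e] for e, members in hyperedges.items() if v in members)
        for v in nodes
    }
    total = sum(seeds.get(v, 0.0) for v in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive seed score")
    p0 = {v: seeds.get(v, 0.0) / total for v in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for e, members in hyperedges.items():
            # Step 1: mass entering hyperedge e from each of its members,
            # chosen in proportion to the hyperedge weight.
            inflow = sum(p[u] * weights[e] / degree[u] for u in members)
            # Step 2: that mass is spread uniformly over the members of e.
            share = (1.0 - restart) * inflow / len(members)
            for v in members:
                nxt[v] += share
        p = nxt
    return p


# Hypothetical patient hypergraph: e1 links G1, G2, G3; e2 links G3, G4.
toy_hyperedges = {"e1": {"G1", "G2", "G3"}, "e2": {"G3", "G4"}}
toy_weights = {"e1": 1.0, "e2": 1.0}  # uniform weights (an assumption)
hg_scores = hypergraph_random_walk(toy_hyperedges, toy_weights, {"G1": 1.0})
hg_ranking = sorted(hg_scores, key=hg_scores.get, reverse=True)
```

Unlike a walk on an ordinary graph, one step here can move between any two genes that share a hyperedge, which is how multi-way relationships enter the scores.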
Affiliation(s)
- Tong Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
  - School of Electrical and Mechanical Engineering, Pingdingshan University, Pingdingshan 467000, China
- Shao-Wu Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Ming-Yu Xie
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Yan Li
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
4. Zhang N, Ma F, Guo D, Pang Y, Wang C, Zhang Y, Zheng X, Wang M. A novel hypergraph model for identifying and prioritizing personalized drivers in cancer. PLoS Comput Biol 2024; 20:e1012068. [PMID: 38683860 PMCID: PMC11081510 DOI: 10.1371/journal.pcbi.1012068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 05/09/2024] [Accepted: 04/09/2024] [Indexed: 05/02/2024] Open
Abstract
Cancer development is driven by the accumulation of a small number of driver genetic mutations that confer a selective growth advantage on the cell, while most passenger mutations do not contribute to tumor progression. The identification of the driver genes responsible for tumorigenesis is a crucial step in designing effective cancer treatments. Although many computational methods have been developed for this purpose, the majority of existing methods provide only a single driver gene list for the entire cohort of patients, ignoring the high heterogeneity of driver events across patients. It remains challenging to identify personalized driver genes. Here, we propose a novel method (PDRWH), which aims to prioritize the mutated genes of a single patient based on their impact on the abnormal expression of downstream genes across a group of patients who share co-mutated genes and similar gene expression profiles. Extensive experiments on 16 cancer datasets from TCGA showed that PDRWH excels in identifying known general driver genes and tumor-specific drivers. In comparative testing across five cancer types, PDRWH outperformed existing individual-level methods as well as cohort-level methods. Our results also demonstrated that PDRWH could identify both common and rare drivers. The personalized driver profiles could improve tumor stratification, providing new insights into tumor heterogeneity and taking a further step toward personalized treatment. We also validated one of our predicted novel personalized driver genes with in vitro cell-based assays, which showed that high expression of low-density lipoprotein receptor-related protein 1 (LRP1) promotes tumor cell proliferation.
Affiliation(s)
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Fubin Ma
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Dong Guo
  - School of Mathematics and Statistics, Shandong University, Weihai, China
  - Department of Central Lab, Weihai Municipal Hospital, Shandong University, Weihai, China
- Yuxuan Pang
  - SDU-ANU Joint Science College, Shandong University, Weihai, China
- Chenye Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Xiaoqi Zheng
  - Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Mingyi Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
  - Department of Central Lab, Weihai Municipal Hospital, Shandong University, Weihai, China
5. Wei PJ, Zhu AD, Cao R, Zheng C. Personalized Driver Gene Prediction Using Graph Convolutional Networks with Conditional Random Fields. Biology 2024; 13:184. [PMID: 38534453 DOI: 10.3390/biology13030184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 03/03/2024] [Accepted: 03/10/2024] [Indexed: 03/28/2024]
Abstract
Cancer is a complex and evolutionary disease mainly driven by the accumulation of genetic variations in genes. Identifying cancer driver genes is important. However, most related studies have focused on the population level. Cancer is a disease with high heterogeneity. Thus, the discovery of driver genes at the individual level is becoming more valuable but remains a great challenge. Although some computational methods have been proposed to tackle this challenge, few can cover all patient samples well, and there is still room for performance improvement. In this study, to identify individual-level driver genes more efficiently, we propose the PDGCN method. PDGCN integrates multiple types of data features, including mutation, expression, methylation, copy number data, and system-level gene features, along with network structural features extracted using Node2vec, in order to construct a sample-gene interaction network. Prediction is performed using a graph convolutional neural network model with a conditional random field layer, which is able to better combine the network structural features with biological attribute features. Experiments on the ACC (Adrenocortical Cancer) and KICH (Kidney Chromophobe) datasets from TCGA (The Cancer Genome Atlas) demonstrated that the method performs better than other similar methods. It can identify not only frequently mutated driver genes, but also rare candidate driver genes and novel biomarker genes. The results of the survival and enrichment analyses of these detected genes demonstrate that the method can identify important driver genes at the individual level.
Affiliation(s)
- Pi-Jing Wei
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- An-Dong Zhu
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- Ruifen Cao
  - School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei 230601, China
- Chunhou Zheng
  - School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei 230601, China
6. Huang Y, Chen F, Sun H, Zhong C. Exploring gene-patient association to identify personalized cancer driver genes by linear neighborhood propagation. BMC Bioinformatics 2024; 25:34. [PMID: 38254011 PMCID: PMC10804660 DOI: 10.1186/s12859-024-05662-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/18/2024] [Indexed: 01/24/2024] Open
Abstract
BACKGROUND: Driver genes play a vital role in the development of cancer. Identifying driver genes is critical for diagnosing and understanding cancer. However, challenges remain in identifying personalized driver genes due to tumor heterogeneity. Although many computational methods have been developed to solve this problem, few efforts have been undertaken to explore gene-patient associations to identify personalized driver genes.
RESULTS: Here we propose a method called LPDriver to identify personalized cancer driver genes by employing a linear neighborhood propagation model on individual genetic data. LPDriver builds a personalized gene network based on the genetic data of individual patients, extracts the gene-patient associations from the bipartite graph of the personalized gene network and utilizes a linear neighborhood propagation model to mine gene-patient associations to detect personalized driver genes. The experimental results demonstrate that, compared to existing methods, our method shows competitive performance and can predict cancer driver genes more accurately. Furthermore, these results also show that besides revealing novel driver genes that have been reported to be related to cancer, LPDriver is also able to identify personalized cancer driver genes for individual patients by their network characteristics even if the mutation data of genes are hidden.
CONCLUSIONS: LPDriver can provide an effective approach to predict personalized cancer driver genes, which could promote the diagnosis and treatment of cancer. The source code and data are freely available at https://github.com/hyr0771/LPDriver .
Affiliation(s)
- Yiran Huang
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China
- Fuhao Chen
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Hongtao Sun
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Cheng Zhong
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China
7. De Marzio M, Glass K, Kuijjer ML. Single-sample network modeling on omics data. BMC Biol 2023; 21:296. [PMID: 38155351 PMCID: PMC10755944 DOI: 10.1186/s12915-023-01783-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 12/30/2023] Open
Affiliation(s)
- Margherita De Marzio
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Biostatistics Department, Harvard Chan School of Public Health, Boston, MA, USA
- Marieke L Kuijjer
  - Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
8. Gillman R, Field MA, Schmitz U, Karamatic R, Hebbard L. Identifying cancer driver genes in individual tumours. Comput Struct Biotechnol J 2023; 21:5028-5038. [PMID: 37867967 PMCID: PMC10589724 DOI: 10.1016/j.csbj.2023.10.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/24/2023] Open
Abstract
Cancer is a heterogeneous disease with a strong genetic component, making it suitable for precision medicine approaches aimed at identifying the underlying molecular drivers within a tumour. Large-scale population-level cancer sequencing consortia have identified many actionable mutations common across both cancer types and sub-types, resulting in an increasing number of successful precision medicine programs. Nonetheless, such approaches fail to consider the effects of mutations unique to an individual patient and may miss rare driver mutations, necessitating personalised approaches to driver-gene prioritisation. One approach is to quantify the functional importance of individual mutations in a single tumour based on how they affect the expression of genes in a gene interaction network (GIN). These GIN-based approaches can be broadly divided into those that utilise an existing reference GIN and those that construct de novo patient-specific GINs. These single-tumour approaches have several limitations that likely influence their results, such as use of reference cohort data, network choice, and approaches to mathematical approximation, and more research is required to evaluate the in vitro and in vivo applicability of their predictions. This review examines the current state-of-the-art methods that identify driver genes in single tumours, with a focus on GIN-based driver prioritisation.
Affiliation(s)
- Rhys Gillman
  - Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
- Matt A. Field
  - Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
  - Immunogenomics Lab, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
  - Menzies School of Health Research, Charles Darwin University, Darwin, Northern Territory, Australia
- Ulf Schmitz
  - Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
- Rozemary Karamatic
  - Gastroenterology and Hepatology, Townsville University Hospital, PO Box 670, Townsville, Queensland 4810, Australia
  - College of Medicine and Dentistry, Division of Tropical Health and Medicine, James Cook University, Townsville, Queensland, Australia
- Lionel Hebbard
  - Department of Biomedical Sciences and Molecular and Cell Biology, College of Public Health, Medical, and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, Queensland, Australia
  - Storr Liver Centre, Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, New South Wales, Australia
  - Australian Institute for Tropical Health and Medicine, Townsville, Queensland, Australia
9. Pham VVH, Liu L, Bracken C, Goodall G, Li J, Le TD. Computational methods for cancer driver discovery: A survey. Theranostics 2021; 11:5553-5568. [PMID: 33859763 PMCID: PMC8039954 DOI: 10.7150/thno.52670] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/01/2020] [Accepted: 01/20/2021] [Indexed: 12/21/2022] Open
Abstract
Identifying the genes responsible for driving cancer is of critical importance for directing treatment. Accordingly, multiple computational tools have been developed to facilitate this task. Due to the different methods employed by these tools, the different data they consider, and the rapidly evolving nature of the field, the selection of an appropriate tool for cancer driver discovery is not straightforward. This survey seeks to provide a comprehensive review of the different computational methods for discovering cancer drivers. We categorise the methods into three groups: methods for single driver identification, methods for driver module identification, and methods for identifying personalised cancer drivers. In addition to providing a “one-stop” reference of these methods, by evaluating and comparing their performance, we also provide readers with information about the different capabilities of the methods in identifying biologically significant cancer drivers. The biologically relevant information identified by these tools can be seen through the enrichment of discovered cancer drivers in GO biological processes and KEGG pathways and through our identification of a small cancer-driver cohort that is capable of stratifying patient survival.