1
Xu X, Qi Z, Wang L, Zhang M, Geng Z, Han X. GSW-FI: a GLM model incorporating shrinkage and double-weighted strategies for identifying cancer driver genes with functional impact. BMC Bioinformatics 2024; 25:99. [PMID: 38448819 PMCID: PMC10916024 DOI: 10.1186/s12859-024-05707-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 02/16/2024] [Indexed: 03/08/2024] Open
Abstract
BACKGROUND: Cancer, a disease with high morbidity and mortality rates, poses a significant threat to human health. Driver genes, which harbor mutations responsible for the initiation and progression of tumors, play a crucial role in cancer development. Identifying driver genes is a central objective in cancer research and precision medicine.
RESULTS: In the present work, we propose a method for identifying driver genes using a Generalized Linear Regression Model (GLM) with Shrinkage and double-Weighted strategies based on Functional Impact, which is named GSW-FI. First, a GLM-based estimating model is proposed for assessing the background functional impacts of genes, using gene features as predictors. Second, the shrinkage and double-weighted strategies are integrated as two revising approaches to ensure that the identified driver genes are reasonable. Last, a hypothesis-testing procedure is designed to identify driver genes by leveraging the estimated background functional impacts. Experiments on 31 datasets from The Cancer Genome Atlas show that GSW-FI outperforms ten other prediction methods in terms of the overlap fraction with well-known databases and consensus predictions among different methods.
CONCLUSIONS: GSW-FI presents a novel approach that efficiently identifies driver genes with functional impact mutations using computational methods, thereby advancing the development of precision medicine for cancer.
Affiliation(s)
- Xiaolu Xu
  - School of Computer and Artificial Intelligence, Liaoning Normal University, Dalian, China
- Zitong Qi
  - Department of Statistics, University of Washington, Seattle, USA
- Lei Wang
  - Center for Reproductive and Genetic Medicine, Dalian Women and Children's Medical Group, Dalian, China
- Meiwei Zhang
  - Center for Reproductive and Genetic Medicine, Dalian Women and Children's Medical Group, Dalian, China
- Zhaohong Geng
  - Department of Cardiology, Second Affiliated Hospital of Dalian Medical University, Dalian, China
- Xiumei Han
  - College of Artificial Intelligence, Dalian Maritime University, Dalian, China
2
Wang Y, Zhou B, Ru J, Meng X, Wang Y, Liu W. Advances in computational methods for identifying cancer driver genes. Mathematical Biosciences and Engineering: MBE 2023; 20:21643-21669. [PMID: 38124614 DOI: 10.3934/mbe.2023958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Cancer driver genes (CDGs) are crucial in cancer prevention, diagnosis and treatment. This study surveyed computational methods for identifying CDGs and categorized them into four groups. The major frameworks for each of these four categories were summarized. Additionally, we systematically gathered data from public databases and biological networks, and we elaborated on computational methods for identifying CDGs using these databases. Further, we summarized the algorithms, mainly involving statistics and machine learning, used for identifying CDGs. Notably, the performances of nine typical identification methods for eight types of cancer were compared to analyze the areas in which each method is applicable. Finally, we discussed the challenges and prospects associated with methods for identifying CDGs. The present study revealed that the network-based algorithms and machine learning-based methods demonstrated superior performance.
Affiliation(s)
- Ying Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Bohao Zhou
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Jidong Ru
  - School of Textile Garment and Design, Changshu Institute of Technology, Changshu 215500, China
- Xianglian Meng
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
- Yundong Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Wenjie Liu
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
3
Yagi H, Onoyama I, Asanoma K, Kawakami M, Maenohara S, Kodama K, Matsumura Y, Hamada N, Hori E, Hachisuga K, Yasunaga M, Ohgami T, Okugawa K, Yahata H, Kato K. Tumor-derived ARHGAP35 mutations enhance the Gα13-Rho signaling axis in human endometrial cancer. Cancer Gene Ther 2023; 30:313-323. [PMID: 36257976 DOI: 10.1038/s41417-022-00547-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 09/26/2022] [Accepted: 10/06/2022] [Indexed: 11/08/2022]
Abstract
Dysregulated G protein-coupled receptor signaling is involved in the formation and progression of human cancers. The heterotrimeric G protein Gα13 is highly expressed in various cancers and regulates diverse cancer-related transcriptional networks and cellular functions by activating Rho. Herein, we demonstrate that increased expression of Gα13 promotes cell proliferation through activation of Rho and the transcription factor AP-1 in human endometrial cancer. Of interest, the Rho GTPase-activating protein (RhoGAP) ARHGAP35 is frequently mutated in human endometrial cancers. Among the 509 endometrial cancer samples in The Cancer Genome Atlas database, 108 harbor 152 mutations at 126 different positions within ARHGAP35, representing a somatic mutation frequency of 20.2%. We evaluated the effect of 124 tumor-derived ARHGAP35 mutations on Gα13-mediated Rho and AP-1 activation. The RhoGAP activity of ARHGAP35 was impaired by 55 of the 124 tumor-derived mutations, comprising 23 nonsense mutations, 15 frame-shift mutations, 15 missense mutations, and two in-frame deletions. Considering that ARHGAP35 is mutated in >2% of all tumors, it ranks among the top 30 most significantly mutated genes in human cancer. Our data suggest potential roles of ARHGAP35 as an oncogenic driver gene, providing novel therapeutic opportunities for endometrial cancer.
Affiliation(s)
- Hiroshi Yagi
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Ichiro Onoyama
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Kazuo Asanoma
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Minoru Kawakami
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Shoji Maenohara
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Keisuke Kodama
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Yumiko Matsumura
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Norio Hamada
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Emiko Hori
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Kazuhisa Hachisuga
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Masafumi Yasunaga
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Tatsuhiro Ohgami
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Kaoru Okugawa
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Hideaki Yahata
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Kiyoko Kato
  - Department of Obstetrics and Gynecology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
4
Liao J, Li X, Gan Y, Han S, Rong P, Wang W, Li W, Zhou L. Artificial intelligence assists precision medicine in cancer treatment. Front Oncol 2023; 12:998222. [PMID: 36686757 PMCID: PMC9846804 DOI: 10.3389/fonc.2022.998222] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2022] [Accepted: 11/22/2022] [Indexed: 01/06/2023] Open
Abstract
Cancer is a major medical problem worldwide. Because of its high heterogeneity, the same drugs or surgical methods can have different curative effects in patients with the same tumor, which creates a need for more accurate tumor treatments and personalized care. Precise treatment of tumors is essential, which makes it urgent to obtain an in-depth understanding of the changes that tumors undergo, including changes in their genes, proteins, and cancer cell phenotypes, in order to develop targeted treatment strategies for patients. Artificial intelligence (AI) based on big data can extract the hidden patterns, important information, and corresponding knowledge behind enormous amounts of data. For example, machine learning (ML) and deep learning, which are subsets of AI, can be used to mine deep-level information in genomics, transcriptomics, proteomics, radiomics, digital pathological images, and other data, helping clinicians understand tumors comprehensively. In addition, AI can find new biomarkers in data to assist tumor screening, detection, diagnosis, treatment, and prognosis prediction, so as to provide the best treatment for individual patients and improve their clinical outcomes.
Affiliation(s)
- Jinzhuang Liao
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Xiaoying Li
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Yu Gan
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Shuangze Han
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Pengfei Rong
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Wei Wang
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Wei Li
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Li Zhou
  - Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
  - Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
  - Department of Pathology, The Xiangya Hospital of Central South University, Changsha, Hunan, China
Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
5
Anilkumar Sithara A, Maripuri D, Moorthy K, Amirtha Ganesh S, Philip P, Banerjee S, Sudhakar M, Raman K. iCOMIC: a graphical interface-driven bioinformatics pipeline for analyzing cancer omics data. NAR Genom Bioinform 2022; 4:lqac053. [PMID: 35899080 PMCID: PMC9310080 DOI: 10.1093/nargab/lqac053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Revised: 06/17/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Despite the tremendous increase in omics data generated by modern sequencing technologies, their analysis can be tricky and often requires substantial expertise in bioinformatics. To address this concern, we have developed a user-friendly pipeline to analyze (cancer) genomic data that takes in raw sequencing data (FASTQ format) as input and outputs insightful statistics. Our iCOMIC toolkit pipeline, which features many independent workflows, is embedded in the popular Snakemake workflow management system. It can analyze whole-genome and transcriptome data and is characterized by a user-friendly GUI that offers several advantages, including minimal execution steps and no need for complex command-line arguments. Notably, we have integrated algorithms developed in-house to predict pathogenicity among cancer-causing mutations and to differentiate between tumor suppressor genes and oncogenes from somatic mutation data. We benchmarked our tool against the Genome in a Bottle benchmark dataset (NA12878) and obtained the highest F1 scores of 0.971 and 0.988 for indels and SNPs, respectively, using the BWA MEM-GATK HC DNA-Seq pipeline. Similarly, we achieved a correlation coefficient of r = 0.85 using the HISAT2-StringTie-ballgown and STAR-StringTie-ballgown RNA-Seq pipelines on the human monocyte dataset (SRP082682). Overall, our tool enables easy analyses of omics datasets, significantly simplifying complex data analysis pipelines.
Affiliation(s)
- Anjana Anilkumar Sithara
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Devi Priyanka Maripuri
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Keerthika Moorthy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Sai Sruthi Amirtha Ganesh
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Philge Philip
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Shayantan Banerjee
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Malvika Sudhakar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Karthik Raman
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
6
Sudhakar M, Rengaswamy R, Raman K. Multi-Omic Data Improve Prediction of Personalized Tumor Suppressors and Oncogenes. Front Genet 2022; 13:854190. [PMID: 35620468 PMCID: PMC9127508 DOI: 10.3389/fgene.2022.854190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Accepted: 04/04/2022] [Indexed: 12/12/2022] Open
Abstract
The progression of tumorigenesis starts with a few mutational and structural driver events in the cell. Various cohort-based computational tools exist to identify driver genes but require multiple samples to identify less frequently mutated driver genes. Many studies use different methods to distinguish driver mutations/genes from mutations that have no impact on tumor progression; however, a small fraction of patients show no mutational events in any known driver genes. Current unsupervised methods map somatic and expression data onto a network to identify personalized driver genes based on changes in expression. Our method is the first machine learning model to classify genes as tumor suppressor gene (TSG), oncogene (OG), or neutral, thus assigning the functional impact of the gene in the patient. In this study, we develop a multi-omic approach, PIVOT (Personalized Identification of driVer OGs and TSGs), to train on experimentally or computationally validated mutational and structural driver events. Given the lack of any gold standards for the identification of personalized driver genes, we label the data using four strategies and, based on classification metrics, show that gene-based labeling strategies perform best. We build different models using SNV, RNA, and multi-omic features to be used based on the data available. Our models trained on multi-omic data improved predictions compared with mutation and expression data, achieving an accuracy ≥0.99 for BRCA, LUAD, and COAD datasets. We show that network and expression-based features contribute the most to PIVOT. Our predictions on BRCA, COAD, and LUAD cancer types reveal commonly altered genes such as TP53 and PIK3CA, which are predicted drivers for multiple cancer types. Along with known driver genes, our models also identify new driver genes such as PRKCA, SOX9, and PSMD4. Our multi-omic model labels both CNV and mutations, with a more considerable contribution by CNV alterations. While predicting labels for genes mutated in multiple samples, we also label rare driver events occurring in as few as one sample. We also identify genes with dual roles within the same cancer type. Overall, PIVOT labels personalized driver genes as TSGs and OGs and also identifies rare driver genes.
Affiliation(s)
- Malvika Sudhakar
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, India
- Raghunathan Rengaswamy
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Chemical Engineering, IIT Madras, Chennai, India
- Karthik Raman
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, India