1
Petti M, Farina L. Network medicine for patients' stratification: From single-layer to multi-omics. WIREs Mech Dis 2023; 15:e1623. [PMID: 37323106 DOI: 10.1002/wsbm.1623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 03/08/2023] [Accepted: 05/30/2023] [Indexed: 06/17/2023]
Abstract
Precision medicine research increasingly relies on the integrated analysis of multiple types of omics. In the era of big data, the large availability of different health-related information represents a great, but at the same time untapped, chance with a potentially fundamental role in the prevention, diagnosis and prognosis of diseases. Computational methods are needed to combine this data to create a comprehensive view of a given disease. Network science can model biomedical data in terms of relationships among molecular players of different nature and has been successfully proposed as a new paradigm for studying human diseases. Patient stratification is an open challenge aimed at identifying subtypes with different disease manifestations, severity, and expected survival time. Several stratification approaches based on high-throughput gene expression measurements have been successfully applied. However, few attempts have been proposed to exploit the integration of various genotypic and phenotypic data to discover novel sub-types or improve the detection of known groupings. This article is categorized under: Cancer > Biomedical Engineering Cancer > Computational Models Cancer > Genetics/Genomics/Epigenetics.
Affiliation(s)
- Manuela Petti
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
- Lorenzo Farina
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
2
Luo J, Feng Y, Wu X, Li R, Shi J, Chang W, Wang J. ForestSubtype: a cancer subtype identifying approach based on high-dimensional genomic data and a parallel random forest. BMC Bioinformatics 2023; 24:289. [PMID: 37468832 DOI: 10.1186/s12859-023-05412-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 07/13/2023] [Indexed: 07/21/2023] Open
Abstract
BACKGROUND Cancer subtype classification is helpful for personalized cancer treatment. Although some approaches have been developed to classify cancer subtypes based on high-dimensional gene expression data, it is difficult to obtain satisfactory classification results. Meanwhile, some cancers have been well studied and classified into subtypes that are adopted by most researchers. Hence, this a priori knowledge is significant for further identifying new meaningful subtypes. RESULTS In this paper, we present ForestSubtype, a combined parallel random forest and autoencoder approach for cancer subtype identification based on high-dimensional gene expression data. ForestSubtype first adopts the parallel RF and the a priori knowledge of cancer subtypes to train a module and extract significant candidate features. Second, ForestSubtype uses a random forest as the base module and ten parallel random forests to compute each feature weight and rank the features separately. Then, the intersection of the features with the larger weights output by the ten parallel random forests is taken as the subsequent candidate features. Third, ForestSubtype uses an autoencoder to condense the selected features into two-dimensional data. Fourth, ForestSubtype utilizes k-means++ to obtain new cancer subtype identification results. The breast cancer gene expression data obtained from The Cancer Genome Atlas are used for training and validation, and an independent breast cancer dataset from the Molecular Taxonomy of Breast Cancer International Consortium is used for testing. Additionally, we use two other cancer datasets to validate the generalizability of ForestSubtype. ForestSubtype outperforms the other two methods in terms of the distribution of clusters and the internal and external metric results. The open-source code is available at https://github.com/lffyd/ForestSubtype .
CONCLUSIONS Our work shows that the combination of high-dimensional gene expression data, parallel random forests, and an autoencoder, guided by a priori knowledge, can identify new subtypes more effectively than existing methods of cancer subtype classification.
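Two steps of the pipeline described in this abstract, intersecting the top-weighted features from several rankers and seeding clusters with k-means++, can be sketched in a few lines. This is only an illustration under stated assumptions: the feature weights, the three rankers (the abstract uses ten random forests), the gene names, and the helper names `top_features`, `candidate_features`, and `kmeans_pp_init` are all hypothetical and are not taken from the ForestSubtype code.

```python
import random


def top_features(weights, k):
    """Return the k feature names with the largest weights."""
    return set(sorted(weights, key=weights.get, reverse=True)[:k])


def candidate_features(rankings, k):
    """Intersect the top-k feature sets produced by several independent rankers."""
    return set.intersection(*(top_features(w, k) for w in rankings))


def kmeans_pp_init(points, n_clusters, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest existing centre."""
    centres = [rng.choice(points)]
    while len(centres) < n_clusters:
        d2 = [
            min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
            for p in points
        ]
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres


# Hypothetical feature weights from three rankers (illustrative values only).
rankings = [
    {"g1": 0.9, "g2": 0.7, "g3": 0.1, "g4": 0.6},
    {"g1": 0.8, "g2": 0.2, "g3": 0.3, "g4": 0.9},
    {"g1": 0.7, "g2": 0.6, "g3": 0.2, "g4": 0.8},
]
selected = candidate_features(rankings, k=3)

# Toy two-dimensional embedding, standing in for the autoencoder output.
embedding = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centres = kmeans_pp_init(embedding, n_clusters=2, rng=random.Random(0))
```

Only features that every ranker places in its top k survive the intersection, which is why the candidate set can be much smaller than k.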
Affiliation(s)
- Junwei Luo
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Yading Feng
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Xuyang Wu
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Ruimin Li
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Jiawei Shi
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Wenjing Chang
  - School of Software, Henan Polytechnic University, Jiaozuo, China
- Junfeng Wang
  - School of Software, Henan Polytechnic University, Jiaozuo, China
3
Muniyappan S, Rayan AXA, Varrieth GT. DTiGNN: Learning drug-target embedding from a heterogeneous biological network based on a two-level attention-based graph neural network. Mathematical Biosciences and Engineering: MBE 2023; 20:9530-9571. [PMID: 37161255 DOI: 10.3934/mbe.2023419] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/11/2023]
Abstract
MOTIVATION In vitro experiment-based drug-target interaction (DTI) exploration demands more human, financial and data resources. In silico approaches have been recommended for predicting DTIs to reduce time and cost. During the drug development process, one can analyze the therapeutic effect of the drug for a particular disease by identifying how the drug binds to the target for treating that disease. Hence, DTI plays a major role in drug discovery. Many computational methods have been developed for DTI prediction. However, the existing methods have limitations in terms of capturing the interactions via multiple semantics between drug and target nodes in a heterogeneous biological network (HBN). METHODS In this paper, we propose a DTiGNN framework for identifying unknown drug-target pairs. The DTiGNN first calculates the similarity between the drug and target from multiple perspectives. Then, the features of drugs and targets from each perspective are learned separately by using a novel method termed an information entropy-based random walk. Next, all of the learned features from different perspectives are integrated into a single drug and target similarity network by using a multi-view convolutional neural network. Using the integrated similarity networks, drug interactions, drug-disease associations, protein interactions and protein-disease association, the HBN is constructed. Next, a novel embedding algorithm called a meta-graph guided graph neural network is used to learn the embedding of drugs and targets. Then, a convolutional neural network is employed to infer new DTIs after balancing the sample using oversampling techniques. RESULTS The DTiGNN is applied to various datasets, and the result shows better performance in terms of the area under receiver operating characteristic curve (AUC) and area under precision-recall curve (AUPR), with scores of 0.98 and 0.99, respectively. There are 23,739 newly predicted DTI pairs in total.
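The AUC reported in this abstract can be read as the probability that a randomly chosen true interaction is scored above a randomly chosen non-interaction. A minimal sketch of that rank-based definition follows; the labels and scores are made up for illustration and are not DTiGNN outputs, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = known interaction, 0 = non-interaction.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

In this toy case three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75.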
Affiliation(s)
- Saranya Muniyappan
  - Computer Science and Engineering, CEG Campus, Anna University, Tamil Nadu, India
4
Chen X, Yang C, Wang W, He X, Sun H, Lyu W, Zou K, Fang S, Dai Z, Dong H. Exploration of prognostic genes and risk signature in breast cancer patients based on RNA binding proteins associated with ferroptosis. Front Genet 2023; 14:1025163. [PMID: 36911389 PMCID: PMC9998954 DOI: 10.3389/fgene.2023.1025163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 01/23/2023] [Indexed: 03/14/2023] Open
Abstract
Background: Breast cancer (BRCA) is a life-threatening malignancy in women with an unsatisfactory prognosis. The purpose of this study was to explore prognostic biomarkers and a risk signature based on ferroptosis-related RNA-binding proteins (FR-RBPs). Methods: FR-RBPs were identified using Spearman correlation analysis. Differentially expressed genes (DEGs) were identified with the "limma" R package. Univariate and multivariate Cox analyses were performed to determine the prognostic genes. The risk signature was constructed and verified with the training set, testing set, and validation set. Mutation analysis, immune checkpoint expression analysis in high- and low-risk groups, and correlation between the risk signature and chemotherapeutic agents were conducted using the "maftools" package, the "ggplot2" package, and the CellMiner database, respectively. The Human Protein Atlas (HPA) database was employed to confirm protein expression trends of prognostic genes in BRCA and normal tissues. The expression of prognostic genes in cell lines was verified by real-time quantitative polymerase chain reaction (RT-qPCR). Kaplan-Meier (KM) plotter database analysis was applied to predict the correlation between the expression levels of signature genes and survival status. Results: Five prognostic genes (GSPT2, RNASE1, TIPARP, TSEN54, and SAMD4A) were identified to construct an FR-RBPs-related risk signature, which was validated in the International Cancer Genome Consortium (ICGC) cohort. Univariate and multivariate Cox regression analyses demonstrated that the risk score was a robust independent prognostic factor in overall survival prediction. The Tumor Mutational Burden (TMB) analysis implied that the high- and low-risk groups responded differently to immunotherapy. Drug sensitivity analysis suggested that the risk signature may serve as a chemosensitivity predictor. The results of GSEA suggested that the five prognostic genes might be related to DNA replication and immune-related pathways. RT-qPCR results demonstrated that the expression trends of prognostic genes in cell lines were consistent with the results from public databases. KM plotter database analysis suggested that high expression levels of GSPT2, RNASE1, and SAMD4A contributed to poor prognoses. Conclusion: This study identified FR-RBPs-related prognostic genes and developed an FR-RBPs-related risk signature for the prognosis of BRCA, which will be of great significance in developing new therapeutic targets and prognostic molecular biomarkers for BRCA.
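A risk signature of this kind is commonly computed as a weighted sum of gene expression values, with patients split at the median score into high- and low-risk groups. The abstract does not give the coefficients or the cut-off rule, so the sketch below assumes both: the coefficient values are hypothetical placeholders, the median split is an assumed convention, and `risk_score` and `stratify` are illustrative names.

```python
from statistics import median

# Hypothetical Cox coefficients for the five signature genes.
# These values are placeholders and are NOT taken from the paper.
COEFS = {
    "GSPT2": 0.30,
    "RNASE1": 0.20,
    "TIPARP": -0.25,
    "TSEN54": 0.15,
    "SAMD4A": 0.10,
}


def risk_score(expr):
    """Linear risk score: sum of coefficient x expression over signature genes."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def stratify(patients):
    """Label each patient 'high' or 'low' risk relative to the median score
    (an assumed cut-off; other thresholds are possible)."""
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy cohort: every gene has the same expression level within a patient.
patients = {
    f"p{level}": {gene: float(level) for gene in COEFS}
    for level in (1, 2, 3, 4)
}
groups = stratify(patients)
```

The resulting group labels are what a Kaplan-Meier comparison or a Cox model would then take as input.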
Affiliation(s)
- Xiang Chen
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Changcheng Yang
  - Department of Medical Oncology, The First Affiliated Hospital of Hainan Medical University, Haikou, China
- Wei Wang
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Xionghui He
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Hening Sun
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Wenzhi Lyu
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Kejian Zou
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
- Shuo Fang
  - Department of Clinical Oncology, The University of Hong Kong, Hong Kong SAR, China
  - Department of Oncology, The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen, China
- Zhijun Dai
  - Department of Breast Surgery, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Huaying Dong
  - Department of General Surgery, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China
5
Hassan Zada MS, Yuan B, Khan WA, Anjum A, Reiff-Marganiec S, Saleem R. A unified graph model based on molecular data binning for disease subtyping. J Biomed Inform 2022; 134:104187. [PMID: 36055637 DOI: 10.1016/j.jbi.2022.104187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 08/05/2022] [Accepted: 08/25/2022] [Indexed: 11/19/2022]
Abstract
Molecular disease subtype discovery from omics data is an important research problem in precision medicine. The biggest challenges are the skewed distribution and data variability in the measurements of omics data. These challenges complicate the efficient identification of molecular disease subtypes defined by clinical differences, such as survival. Existing approaches adopt kernels to construct patient similarity graphs from each view through pairwise matching. However, the distance functions used in kernels are unable to utilize the potentially critical information of extreme values and data variability, which leads to a lack of robustness. In this paper, a novel robust distance metric (ROMDEX) is proposed to construct similarity graphs for molecular disease subtypes from omics data; it addresses the challenges of data variability and extreme values. The proposed approach is validated on multiple TCGA cancer datasets, and the results are compared with multiple baseline disease subtyping methods. The evaluation of results is based on Kaplan-Meier survival time analysis, which is validated using statistical tests, e.g., the Cox proportional hazards test (Cox p-value). We reject the null hypothesis that the cohorts have the same hazard for P-values less than 0.05. The proposed approach achieved best P-values of 0.00181, 0.00171, and 0.00758 for Gene Expression, DNA Methylation, and MicroRNA data, respectively, which shows a significant difference in survival between the cohorts. The proposed approach outperformed the existing state-of-the-art disease subtyping approaches (MRGC, PINS, SNF, Consensus Clustering, and iCluster+) on various individual disease views of multiple TCGA datasets.
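The Kaplan-Meier analysis used for evaluation in this abstract estimates, for each cohort, the probability of surviving past each observed event time. A minimal sketch of the product-limit estimator is given below; the follow-up times and event flags are toy values rather than TCGA data, and `kaplan_meier` is an illustrative helper name.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Toy cohort: the patient with time 2 is censored.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

Comparing such curves between the discovered subtypes, for instance with a log-rank or Cox test, is what yields P-values like those quoted in the abstract.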
Affiliation(s)
- Bo Yuan
  - School of Computing and Mathematical Sciences, University of Leicester, United Kingdom
- Wajahat Ali Khan
  - School of Computing and Engineering, University of Derby, United Kingdom
- Ashiq Anjum
  - School of Computing and Mathematical Sciences, University of Leicester, United Kingdom
- Rabia Saleem
  - School of Computing and Engineering, University of Derby, United Kingdom
6
A Cell Differentiation Trajectory-Related Signature for Predicting the Prognosis of Lung Adenocarcinoma. Genet Res (Camb) 2022; 2022:3483498. [PMID: 36072012 PMCID: PMC9398881 DOI: 10.1155/2022/3483498] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/26/2022] [Accepted: 07/07/2022] [Indexed: 11/17/2022] Open
Abstract
Objective To screen cell differentiation trajectory-related genes and build a cell differentiation trajectory-related signature for predicting the prognosis of lung adenocarcinoma (LUAD). Methods The LUAD single-cell mRNA expression profile and TCGA-LUAD transcriptome data were obtained from the GEO and TCGA databases. Single-cell RNA-seq data were used for cell clustering and pseudotime analysis after dimensionality reduction, and the cell differentiation trajectory-related genes were acquired from differential expression analysis between the main branches. Consensus clustering analysis was then carried out on TCGA-LUAD samples, GSEA was performed, and the expression levels of immune checkpoint genes and the immunotherapy response were compared among clusters. The prognostic model was constructed, and the GSE42127 dataset was used for validation. A nomogram evaluation model was used to predict prognosis. Results Two subsets with distinct differentiation states were found after cell differentiation trajectory analysis. TCGA-LUAD samples were divided into two clusters based on cell differentiation trajectory-related genes. GSEA found that cluster 1 was significantly related to 20 pathways and cluster 2 was significantly enriched in three pathways; the clusters could also better predict immune checkpoint gene expression and immunotherapy response. A prognostic signature based on six cell differentiation-related genes was constructed, and the patients in the high-risk group had a poorer prognosis than those in the low-risk group. Moreover, a nomogram was constructed based on the prognostic signature and clinicopathological features, and this nomogram had strong predictive performance and high accuracy. Conclusion The cell differentiation-related signature and the prognostic nomogram could accurately predict survival.
7
Wei X, Liu J, Hong Z, Chen X, Wang K, Cai J. Identification of novel tumor microenvironment-associated genes in gastric cancer based on single-cell RNA-sequencing datasets. Front Genet 2022; 13:896064. [PMID: 36046240 PMCID: PMC9421061 DOI: 10.3389/fgene.2022.896064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Accepted: 07/06/2022] [Indexed: 11/13/2022] Open
Abstract
Tumor microenvironment and heterogeneity play vital roles in the development and progression of gastric cancer (GC). In the past decade, a considerable amount of single-cell RNA-sequencing (scRNA-seq) studies have been published in the fields of oncology and immunology, which improve our knowledge of the GC immune microenvironment. However, much uncertainty still exists about the relationship between the macroscopic and microscopic data in transcriptomics. In the current study, we made full use of scRNA-seq data from the Gene Expression Omnibus database (GSE134520) to identify 25 cell subsets, including 11 microenvironment-related cell types. The MIF signaling pathway network was obtained upon analysis of receptor–ligand pairs and cell–cell interactions. By comparing the gene expression in a wide variety of cells between intestinal metaplasia and early gastric cancer, we identified 64 differentially expressed genes annotated as immune response and cellular communication. Subsequently, we screened these genes for prognostic clinical value based on the patients’ follow-up data from The Cancer Genome Atlas. TMPRSS15, VIM, APOA1, and RNASE1 were then selected for the construction of LASSO risk scores, and a nomogram model incorporating another five clinical risk factors was successfully created. The effectiveness of least absolute shrinkage and selection operator risk scores was validated using gene set enrichment analysis and levels of immune cell infiltration. These findings will drive the development of prognostic evaluations affected by the immune tumor microenvironment in GC.
Affiliation(s)
- Xujin Wei
  - The Graduate School of Fujian Medical University, Fuzhou, China
  - Department of Gastrointestinal Surgery, Zhongshan Hospital of Xiamen University, Institute of Gastrointestinal Oncology, School of Medicine, Xiamen University, Xiamen, China
  - Xiamen Municipal Key Laboratory of Gastrointestinal Oncology, Xiamen, China
- Jie Liu
  - The Graduate School of Fujian Medical University, Fuzhou, China
- Zhijun Hong
  - Department of Gastrointestinal Surgery, Zhongshan Hospital of Xiamen University, Institute of Gastrointestinal Oncology, School of Medicine, Xiamen University, Xiamen, China
  - Xiamen Municipal Key Laboratory of Gastrointestinal Oncology, Xiamen, China
- Xin Chen
  - The Graduate School of Fujian Medical University, Fuzhou, China
  - Department of Gastrointestinal Surgery, Zhongshan Hospital of Xiamen University, Institute of Gastrointestinal Oncology, School of Medicine, Xiamen University, Xiamen, China
  - Xiamen Municipal Key Laboratory of Gastrointestinal Oncology, Xiamen, China
- Kang Wang
  - Department of Gastrointestinal Surgery, Zhongshan Hospital of Xiamen University, Institute of Gastrointestinal Oncology, School of Medicine, Xiamen University, Xiamen, China
  - Xiamen Municipal Key Laboratory of Gastrointestinal Oncology, Xiamen, China
- Jianchun Cai
  - The Graduate School of Fujian Medical University, Fuzhou, China
  - Department of Gastrointestinal Surgery, Zhongshan Hospital of Xiamen University, Institute of Gastrointestinal Oncology, School of Medicine, Xiamen University, Xiamen, China
  - Xiamen Municipal Key Laboratory of Gastrointestinal Oncology, Xiamen, China
  - *Correspondence: Jianchun Cai,
8
MotieGhader H, Tabrizi-Nezhadi P, Deldar Abad Paskeh M, Baradaran B, Mokhtarzadeh A, Hashemi M, Lanjanian H, Jazayeri SM, Maleki M, Khodadadi E, Nematzadeh S, Kiani F, Maghsoudloo M, Masoudi-Nejad A. Drug repositioning in non-small cell lung cancer (NSCLC) using gene co-expression and drug–gene interaction networks analysis. Sci Rep 2022; 12:9417. [PMID: 35676421 PMCID: PMC9177601 DOI: 10.1038/s41598-022-13719-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/22/2022] [Accepted: 05/16/2022] [Indexed: 12/14/2022] Open
Abstract
Lung cancer is the most common cancer in men and women. This cancer is divided into two main types, namely non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). Around 85 to 90 percent of lung cancers are NSCLC. Repositioning potent candidate drugs in NSCLC treatment is one of the important topics in cancer studies. Drug repositioning (DR) or drug repurposing is a method for identifying new therapeutic uses of existing drugs. The current study applies a computational drug repositioning method to identify candidate drugs to treat NSCLC patients. To this end, at first, the transcriptomics profile of NSCLC and healthy (control) samples was obtained from the GEO database with the accession number GSE21933. Then, the gene co-expression network was reconstructed for NSCLC samples using the WGCNA, and two significant purple and magenta gene modules were extracted. Next, a list of transcription factor genes that regulate purple and magenta modules' genes was extracted from the TRRUST V2.0 online database, and the TF–TG (transcription factors–target genes) network was drawn. Afterward, a list of drugs targeting TF–TG genes was obtained from the DGIdb V4.0 database, and two drug–gene interaction networks, including drug-TG and drug-TF, were drawn. After analyzing gene co-expression TF–TG, and drug–gene interaction networks, 16 drugs were selected as potent candidates for NSCLC treatment. Out of 16 selected drugs, nine drugs, namely Methotrexate, Olanzapine, Haloperidol, Fluorouracil, Nifedipine, Paclitaxel, Verapamil, Dexamethasone, and Docetaxel, were chosen from the drug-TG sub-network. In addition, nine drugs, including Cisplatin, Daunorubicin, Dexamethasone, Methotrexate, Hydrocortisone, Doxorubicin, Azacitidine, Vorinostat, and Doxorubicin Hydrochloride, were selected from the drug-TF sub-network. Methotrexate and Dexamethasone are common in drug-TG and drug-TF sub-networks. 
In conclusion, this study proposed 16 drugs as potent candidates for NSCLC treatment through analyzing gene co-expression, TF–TG, and drug–gene interaction networks.
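The count of 16 candidates follows from the two sub-network lists given in the abstract: nine drugs each, with two shared. A short check with Python sets, using only the drug names listed in the abstract (the variable names are illustrative):

```python
# Drugs selected from the drug-TG sub-network, as listed in the abstract.
drug_tg = {
    "Methotrexate", "Olanzapine", "Haloperidol", "Fluorouracil", "Nifedipine",
    "Paclitaxel", "Verapamil", "Dexamethasone", "Docetaxel",
}

# Drugs selected from the drug-TF sub-network, as listed in the abstract.
drug_tf = {
    "Cisplatin", "Daunorubicin", "Dexamethasone", "Methotrexate",
    "Hydrocortisone", "Doxorubicin", "Azacitidine", "Vorinostat",
    "Doxorubicin Hydrochloride",
}

shared = drug_tg & drug_tf      # drugs present in both sub-networks
candidates = drug_tg | drug_tf  # all distinct candidate drugs

print(sorted(shared))
print(len(candidates))
```

By inclusion-exclusion, 9 + 9 - 2 = 16, which matches the total reported.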
9
Sarcopenia and a 5-mRNA risk module as a combined factor to predict prognosis for patients with stomach adenocarcinoma. Genomics 2021; 114:361-377. [PMID: 34933074 DOI: 10.1016/j.ygeno.2021.12.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/01/2021] [Revised: 11/18/2021] [Accepted: 12/04/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Sarcopenia is an important factor affecting the prognostic outcomes in adult cancer patients. Gastric cancer is considered an age-related disease and is one of the leading causes of global cancer mortality. We aimed to establish an effective age-related model at a molecular level to predict the prognosis of patients with gastric cancer. METHODS The TCGA STAD (stomach adenocarcinoma) and NCBI GEO databases were utilized in this study to explore the expression, clinical relevance, and prognostic value of age-related mRNAs in stomach adenocarcinoma through an integrated bioinformatics analysis. A WGCNA co-expression network, univariate Cox regression analysis, LASSO regression, and multivariate Cox regression analysis were implemented to construct an age-related prognostic signature. RESULTS Sarcopenia is not only an unfavorable factor for OS (overall survival) in patients with gastric tumors (HR: 1.707, 95%CI: 1.437-2.026), but also increases the risk of postoperative complications in patients with gastric cancer (OR: 2.904, 95%CI: 2.150-3.922). A panel of 5 mRNAs (DCBLD1, DLC1, IGFBP1, RNASE1, and SPC24) was identified that dichotomized patients into groups with significantly different OS and independently predicted OS in TCGA STAD (HR = 3.044, 95%CI = 2.078-4.460, P < 0.001). CONCLUSION The study provided novel insights into STAD at a molecular level and indicated that the 5 mRNAs might act as independent, promising prognostic biomarkers for STAD. Sarcopenia and the 5-mRNA risk module as a combined factor to predict prognosis may play an important role in clinical diagnosis.
10
Chen Z, Shen Z, Zhang Z, Zhao D, Xu L, Zhang L. RNA-Associated Co-expression Network Identifies Novel Biomarkers for Digestive System Cancer. Front Genet 2021; 12:659788. [PMID: 33841514 PMCID: PMC8033200 DOI: 10.3389/fgene.2021.659788] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2021] [Accepted: 02/25/2021] [Indexed: 01/04/2023] Open
Abstract
Cancers of the digestive system are malignant diseases. Our study focused on colon cancer, esophageal cancer (ESCC), rectal cancer, gastric cancer (GC), and rectosigmoid junction cancer to identify possible biomarkers for these diseases. The transcriptome data were downloaded from the TCGA database (The Cancer Genome Atlas Program), and a network was constructed using the WGCNA algorithm. Two significant modules were found, and coexpression networks were constructed. CytoHubba was used to identify hub genes of the two networks. GO analysis suggested that the network genes were involved in metabolic processes, biological regulation, and membrane and protein binding. KEGG analysis indicated that the significant pathways were the calcium signaling pathway, fatty acid biosynthesis, and pathways in cancer and insulin resistance. The most significant hub genes of the two networks were, respectively, hsa-let-7b-3p, hsa-miR-378a-5p, hsa-miR-26a-5p, hsa-miR-382-5p, and hsa-miR-29b-2-5p, and SECISBP2L, NCOA1, HERC1, HIPK3, and MBNL1. These genes were predicted to be associated with tumor prognosis in this patient population.
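WGCNA builds its network by raising pairwise expression correlations to a soft-thresholding power, which suppresses weak correlations while keeping the network weighted. A minimal sketch of the unsigned adjacency, a_ij = |cor(x_i, x_j)|^beta, is shown below; the expression values, the gene labels, and the choice beta = 6 are assumptions for illustration, not values from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression profiles across four samples (hypothetical values).
expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with A
    "C": [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with A
}
adj = adjacency(expr)
```

A correlation of 0.8 shrinks to roughly 0.26 at beta = 6, while a perfect correlation stays at 1, which is how the power transform sharpens module structure.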
Affiliation(s)
- Zheng Chen
  - School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, Shenzhen, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Zijie Shen
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Zilong Zhang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Da Zhao
  - School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, Shenzhen, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Lijun Zhang
  - School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, Shenzhen, China
11
Feng J, Jiang L, Li S, Tang J, Wen L. Multi-Omics Data Fusion via a Joint Kernel Learning Model for Cancer Subtype Discovery and Essential Gene Identification. Front Genet 2021; 12:647141. [PMID: 33747053 PMCID: PMC7969795 DOI: 10.3389/fgene.2021.647141] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/29/2020] [Accepted: 02/02/2021] [Indexed: 01/17/2023] Open
Abstract
The multiple sources of cancer determine its multiple causes, and the same cancer can be composed of many different subtypes. Identification of cancer subtypes is a key part of personalized cancer treatment and provides an important reference for clinical diagnosis and treatment. Some studies have shown that there are significant differences in the genetic and epigenetic profiles among different cancer subtypes during carcinogenesis and development. In this study, we first collect seven cancer datasets from the Broad Institute GDAC Firehose, including gene expression profile, isoform expression profile, DNA methylation expression data, and survival information correspondingly. Furthermore, we employ kernel principal component analysis (PCA) to extract features for each expression profile, convert them into three similarity kernel matrices by Gaussian kernel function, and then fuse these matrices as a global kernel matrix. Finally, we apply it to spectral clustering algorithm to get the clustering results of different cancer subtypes. In the experimental results, besides using the P-value from the Cox regression model and survival analysis as the primary evaluation measures, we also introduce statistical indicators such as Rand index (RI) and adjusted RI (ARI) to verify the performance of clustering. Then combining with gene expression profile, we obtain the differential expression of genes among different subtypes by gene set enrichment analysis. For lung cancer, GMPS, EPHA10, C10orf54, and MAGEA6 are highly expressed in different subtypes; for liver cancer, CMYA5, DEPDC6, FAU, VPS24, RCBTB2, LOC100133469, and SLC35B4 are significantly expressed in different subtypes.
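The kernel step in this abstract turns each omics view into a patient-by-patient similarity matrix, combines the matrices, and then scores the resulting clustering with the Rand index. The sketch below illustrates those pieces on toy data. The abstract does not say how the kernels are fused, so the unweighted mean used here is an assumption, as are the bandwidth `sigma`, the sample values, and the helper names.

```python
from itertools import combinations
from math import exp


def gaussian_kernel(samples, sigma=1.0):
    """Pairwise Gaussian (RBF) similarity matrix for a list of feature vectors."""
    def k(x, y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, y))
        return exp(-d2 / (2 * sigma ** 2))
    return [[k(x, y) for y in samples] for x in samples]


def fuse(kernels):
    """Combine kernel matrices by an unweighted element-wise mean
    (an assumed fusion rule, chosen only for illustration)."""
    n = len(kernels[0])
    return [
        [sum(K[i][j] for K in kernels) / len(kernels) for j in range(n)]
        for i in range(n)
    ]


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree
    (both together or both apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Two toy views of three patients (hypothetical one-dimensional features).
expression_view = [(0.0,), (0.1,), (5.0,)]
methylation_view = [(1.0,), (1.2,), (4.0,)]
fused = fuse([gaussian_kernel(expression_view), gaussian_kernel(methylation_view)])
```

The fused matrix is what a spectral clustering routine would take as its affinity input; the Rand index then compares the resulting labels with a reference partition.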
Affiliation(s)
- Jie Feng
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Limin Jiang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Shuhao Li
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jijun Tang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - School of Computational Science and Engineering, University of South Carolina, Columbia, SC, United States
  - Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
- Lan Wen
  - Changsha Municipal Center of Disease Control, Changsha, China