1
|
Qiu X, Liu P, Lin H, Peng Z, Sun X, Dong G, Han Y, Huang Z. Pan-cancer analysis and experimental verification of cytochrome B561 as a prognostic and therapeutic biomarker in breast cancer. Discov Oncol 2025; 16:330. [PMID: 40091073 PMCID: PMC11911281 DOI: 10.1007/s12672-025-02094-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Accepted: 03/07/2025] [Indexed: 03/19/2025] Open
Abstract
OBJECTIVE This study investigates Cytochrome B561 (CYB561) expression in Pan-Cancer, its relationship with immune invasion, and its prognostic value in Breast Cancer (BRCA) patients. METHODS Data from The Cancer Genome Atlas (TCGA) were analyzed. CYB561 expression in normal and tumor tissues was examined, with correlations to immune invasion, mutation, and immune checkpoints. Wilcoxon rank-sum test assessed expression differences. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted. Logistic regression, Kaplan-Meier, and Cox regression analyses evaluated clinicopathological features and survival outcomes. A Cox multivariate analysis-based Nomogram predicted CYB561's prognostic impact. CYB561 knockout in breast cancer cells assessed functional effects. Single-cell RNA sequencing identified prognostic biomarkers. RESULTS CYB561 was highly expressed in most tumors. BRCA showed the highest correlation with ESTIMATE scores and significant negative correlation with immune checkpoints. High CYB561 expression correlated with specific clinicopathological features and survival outcomes. The nomogram predicted BRCA prognosis. CYB561 knockout inhibited breast cancer cell proliferation. Seven predictive agents for CYB561 inhibition were identified. CONCLUSIONS CYB561 exhibits aberrant expression in tumors, particularly in BRCA, and serves as a predictive marker for immune-related therapies and a prognostic indicator in BRCA.
Affiliation(s)
- Xiaoting Qiu
  - Department of Breast Surgical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, 350014, China
- Peizhang Liu
  - School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350108, China
- Hongxiang Lin
  - School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350108, China
- Zeyi Peng
  - Massachusetts College Of Pharmacy And Health Sciences, Boston, MA, 02115, USA
- Xinhao Sun
  - College of Science, Northeastern University, Boston, MA, 02115, USA
- Guanting Dong
  - School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350108, China
- Yuanyuan Han
  - Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, 650000, China
- Zhijian Huang
  - Department of Breast Surgical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, 350014, China
|
2
|
Kallah-Dagadu G, Mohammed M, Nasejje JB, Mchunu NN, Twabi HS, Batidzirai JM, Singini GC, Nevhungoni P, Maposa I. Breast cancer prediction based on gene expression data using interpretable machine learning techniques. Sci Rep 2025; 15:7594. [PMID: 40038307 DOI: 10.1038/s41598-025-85323-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 01/01/2025] [Indexed: 03/06/2025] Open
Abstract
Breast cancer remains a global health burden, and deaths from the disease are increasing. Accurate prediction and diagnosis of breast cancer are important for treatment development and patient survival. This study aimed to predict breast cancer using a dataset comprising 1208 observations and 3602 genes. Feature selection techniques were used to identify the most influential predictive genes for breast cancer with machine learning (ML) models: K-nearest neighbors (KNN), random forests (RF), and a support vector machine (SVM). The study also applied feature- and model-based importance measures and explainable ML methods, including Shapley values, partial dependence plots (PDPs), and Accumulated Local Effects (ALE) plots, to explain the gene importance rankings produced by the ML methods. Shapley values highlighted the significance of some of the genes in predicting cancer presence. Model-based feature ranking techniques, particularly the Leaving-One-Covariate-In (LOCI) method, identified the ten most critical genes for predicting tumor cases. The LOCI rankings from the SVM and RF methods were aligned. Additionally, the PDPs and ALE plots showed how changes in individual features affect predictions and interactions with other genes. By combining feature selection techniques and explainable ML methods, this study demonstrated the interpretability and reliability of machine learning models for breast cancer prediction, emphasizing the importance of incorporating explainable ML approaches in medical decision-making.
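The Leaving-One-Covariate-In idea mentioned in this abstract can be illustrated with a minimal sketch: fit a model on one gene at a time and rank genes by how well each predicts the label on its own. Everything below is an assumption made for illustration. The gene names and expression values are invented, and a simple midpoint-threshold classifier stands in for the SVM and RF models the study reports using.

```python
from statistics import mean

def single_feature_accuracy(values, labels):
    """Training accuracy of a midpoint-threshold classifier using one feature only.

    Stand-in model (assumption): the threshold is halfway between the class means.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    threshold = (mean(pos) + mean(neg)) / 2
    # Class 1 is predicted on the side of the threshold where its mean lies.
    high_is_positive = mean(pos) >= mean(neg)
    correct = 0
    for v, y in zip(values, labels):
        pred = int((v > threshold) == high_is_positive)
        correct += pred == y
    return correct / len(labels)

def loci_ranking(expression, labels):
    """Rank features by the accuracy obtained when each one is used alone."""
    scores = {
        gene: single_feature_accuracy(values, labels)
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented toy data: three hypothetical genes, six samples (0 = normal, 1 = tumor).
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 3.1, 3.3, 2.9],
    "GENE_B": [2.0, 2.5, 1.0, 2.1, 2.4, 1.1],
    "GENE_C": [0.5, 2.8, 0.7, 2.9, 0.6, 3.0],
}
labels = [0, 0, 0, 1, 1, 1]

ranking = loci_ranking(expression, labels)
for gene, accuracy in ranking:
    print(f"{gene}: {accuracy:.2f}")
```

In practice the per-gene score would come from cross-validated performance of the actual model rather than training accuracy of a toy classifier.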
Affiliation(s)
- Gabriel Kallah-Dagadu
  - Department of Statistics and Actuarial Science, University of Ghana, Accra, Ghana
  - School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
- Mohanad Mohammed
  - School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
- Justine B Nasejje
  - School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg-Braamfontein, South Africa
- Halima S Twabi
  - Department of Mathematical Sciences, University of Malawi, Zomba, Malawi
- Jesca Mercy Batidzirai
  - School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
- Portia Nevhungoni
  - Biostatistics Research Unit, South African Medical Research Council, Pretoria, South Africa
- Innocent Maposa
  - Division of Epidemiology and Biostatistics, Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town, South Africa
|
3
|
A N, Lyu P, Yu Y, Liu M, Cheng S, Chen M, Liu Y, Cao X. PICALM as a Novel Prognostic Biomarker and Its Correlation with Immune Infiltration in Breast Cancer. Appl Biochem Biotechnol 2024; 196:6011-6027. [PMID: 38175412 DOI: 10.1007/s12010-023-04840-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/19/2023] [Indexed: 01/05/2024]
Abstract
PICALM (phosphatidylinositol-binding clathrin assembly protein) mutations have been linked to a number of human disorders, including leukemia, Alzheimer's disease, and Parkinson's disease. Nevertheless, the effect of PICALM on cancer, particularly on prognosis and immune infiltration in individuals with BRCA, is unknown. We obtained data on breast cancer patients from The Cancer Genome Atlas (TCGA) database and analyzed the expression of PICALM in breast cancer, its impact on survival, and its role in tumor immune invasion. Finally, in vitro cellular experiments were performed to validate the results. PICALM expression was downregulated in BRCA and was substantially linked with clinical stage, histological type, PAM50, and age. PICALM downregulation was linked to lower overall survival (OS) and disease-specific survival (DSS) in BRCA patients. A multivariate Cox analysis revealed that PICALM is an independent predictor of OS. The enriched pathways revealed by functional enrichment analysis included oxidative phosphorylation, angiogenesis, the TGF signaling pathway, and the IL-6/JAK/STAT3 signaling system. Furthermore, the amount of immune cell infiltration by B cells, eosinophils, mast cells, neutrophils, and T cells was positively linked with PICALM expression. Finally, we experimentally verified that low expression of PICALM can reduce proliferation, migration, and invasion in tumor cells. This evidence shows that PICALM expression impacts prognosis, immune infiltration, and pathway expression in breast cancer patients, and it might be a potential predictive biomarker for the disease.
Affiliation(s)
- Naer A
  - The First Department of Breast Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Huanhuxi Road, Hexi District, Tianjin, 300060, China
  - Key Laboratory of Cancer Prevention and Therapy, Tianjin, 300060, China
  - Tianjin's Clinical Research Center for Cancer, Tianjin, 300060, China
  - Key Laboratory of Breast Cancer Prevention and Therapy, Tianjin Medical University, Ministry of Education, Tianjin, 300060, China
- Pengfei Lyu
  - Department of Breast Surgery, The First Affiliated Hospital of Hainan Medical University, Haikou, 570102, China
- Yue Yu
  - The First Department of Breast Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Huanhuxi Road, Hexi District, Tianjin, 300060, China
- Meiling Liu
  - Department of Thyroid and Breast Surgery, Shenzhen Bao'an District Songgang People's Hospital, No. 2 Shajiang Road, Shenzhen City, 518105, Guangdong Province, China
- Shaohua Cheng
  - Department of Thyroid and Breast Surgery, Shenzhen Bao'an District Songgang People's Hospital, No. 2 Shajiang Road, Shenzhen City, 518105, Guangdong Province, China
- Meiyan Chen
  - Department of Thyroid and Breast Surgery, Shenzhen Bao'an District Songgang People's Hospital, No. 2 Shajiang Road, Shenzhen City, 518105, Guangdong Province, China
- Yunhong Liu
  - Department of Thyroid and Breast Surgery, Shenzhen Bao'an District Songgang People's Hospital, No. 2 Shajiang Road, Shenzhen City, 518105, Guangdong Province, China
- Xuchen Cao
  - The First Department of Breast Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Huanhuxi Road, Hexi District, Tianjin, 300060, China
  - Key Laboratory of Cancer Prevention and Therapy, Tianjin, 300060, China
  - Tianjin's Clinical Research Center for Cancer, Tianjin, 300060, China
  - Key Laboratory of Breast Cancer Prevention and Therapy, Tianjin Medical University, Ministry of Education, Tianjin, 300060, China
|
4
|
Yang L, Zhang S, Zheng L, Kong F, Pu P, Li X, Jia L. Association of ADP-ribosylation factor family genes with prognosis and immune infiltration of breast cancer. Oncol Lett 2024; 27:280. [PMID: 38699662 PMCID: PMC11063756 DOI: 10.3892/ol.2024.14413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 03/19/2024] [Indexed: 05/05/2024] Open
Abstract
Breast cancer (BC) is the most common type of cancer found in women. ADP-ribosylation factors (ARFs) are a group of small proteins that bind to GTP and are involved in controlling different cellular functions. The function and evolution of multiple ARFs in BC remain to be fully elucidated, despite existing studies on this protein family in Homo sapiens and other species. In the present study, a systematic analysis of ARF expression levels in BC tissues compared to normal breast tissues was performed using data from The Cancer Genome Atlas database. The analysis revealed significantly higher expression of ARFs in BC tissues. In addition, the prognostic significance of ARF1 and ARF3-6 expression levels was assessed in patients with BC. Of note, elevated ARF1 expression was associated with reduced rates of distant metastasis-free survival (DMFS), overall survival (OS) and recurrence-free survival (RFS) in affected individuals. Similarly, patients with high expression levels of ARF3 had lower post-progression survival (PPS) rates. In addition, patients with higher ARF4 expression had worse PPS and patients with high ARF5 expression exhibited lower DMFS. Patients with high ARF6 expression had worse DMFS, OS, RFS and PPS. Furthermore, the expression of ARF was found to be strongly linked to the infiltration of various immune cell types, namely dendritic cells, macrophages, neutrophils, CD8+ T cells and B cells. These significant associations offer a solid foundation for the potential utilization of new therapeutic targets and predictive markers for the treatment of BC.
Affiliation(s)
- Lixian Yang
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Shiyu Zhang
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Lei Zheng
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Fanting Kong
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Pengpeng Pu
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Xiaowei Li
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
- Lining Jia
  - Department of Breast Surgery, Xingtai People's Hospital, Xingtai, Hebei 054000, P.R. China
|
5
|
Yao D, Mei S, Tang W, Xu X, Lu Q, Shi Z. AAAKB: A manually curated database for tracking and predicting genes of Abdominal aortic aneurysm (AAA). PLoS One 2023; 18:e0289966. [PMID: 38100461 PMCID: PMC10723669 DOI: 10.1371/journal.pone.0289966] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2023] [Accepted: 07/31/2023] [Indexed: 12/17/2023] Open
Abstract
Abdominal aortic aneurysm (AAA), an extremely dangerous vascular disease with high mortality, causes massive internal bleeding due to aneurysm rupture. To advance research on AAA, information about AAA-related genes and their functions needs to be organized and linked. Currently, most researchers screen genetic databases manually, which is cumbersome and time-consuming. Here, we developed "AAAKB", a manually curated knowledgebase containing genes, SNPs and pathways associated with AAA. To help researchers further explore the mechanism network of AAA, AAAKB provides predicted genes that are potentially associated with AAA. The prediction is based on the protein interaction information of genes collected in the database, and the random forest (RF) algorithm is used to build the prediction model. Some of these predicted genes are differentially expressed in patients with AAA, and some have been reported to play a role in other cardiovascular diseases, illustrating the utility of the knowledgebase in predicting novel genes. AAAKB also integrates a protein interaction visualization tool to quickly determine the shortest paths between target proteins. As the first knowledgebase to provide a comprehensive catalog of AAA-related genes, AAAKB will be an ideal research platform for AAA. Database URL: http://www.lqlgroup.cn:3838/AAAKB/.
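Finding the shortest path between two proteins in an interaction network, as the visualization tool in this abstract is described as doing, can be sketched with a breadth-first search over an unweighted graph. This is only an illustrative sketch: the abstract does not say which algorithm AAAKB uses, and the protein names and edges below are hypothetical.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Return one shortest path from source to target in an unweighted graph,
    or None if the target cannot be reached."""
    if source == target:
        return [source]
    parents = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical undirected protein-protein interaction network (adjacency lists).
interactions = {
    "P1": ["P2", "P5"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P1", "P4"],
    "P6": [],
}

path = shortest_path(interactions, "P1", "P4")
print(path)
```

Breadth-first search suffices when every interaction counts equally; weighted interaction scores would call for a method such as Dijkstra's algorithm instead.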
Affiliation(s)
- Di Yao
  - Institute of Industrial Internet and Internet of Things, China Academy of Information and Communications Technology (CAICT), China
- Shuyuan Mei
  - Key Laboratory of Cardiovascular and Cerebrovascular Medicine, School of Pharmacy, Nanjing Medical University, Nanjing, China
- Wangyang Tang
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Xingyu Xu
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Qiulun Lu
  - Key Laboratory of Cardiovascular and Cerebrovascular Medicine, School of Pharmacy, Nanjing Medical University, Nanjing, China
- Zhiguang Shi
  - Key Laboratory of Cardiovascular and Cerebrovascular Medicine, School of Pharmacy, Nanjing Medical University, Nanjing, China
|
6
|
Odhiambo P, Okello H, Wakaanya A, Wekesa C, Okoth P. Mutational signatures for breast cancer diagnosis using artificial intelligence. J Egypt Natl Canc Inst 2023; 35:14. [PMID: 37184779 DOI: 10.1186/s43046-023-00173-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Accepted: 04/19/2023] [Indexed: 05/16/2023] Open
Abstract
BACKGROUND Breast cancer is the most common female cancer worldwide. Its diagnosis and prognosis remain scanty, imprecise, and poorly documented. Previous studies have indicated that some genetic mutational signatures are suspected to lead to progression of various breast cancer scenarios. There is a paucity of data on the role of AI tools in delineating breast cancer mutational signatures. This study sought to investigate the relationship between breast cancer genetic mutational profiles using artificial intelligence models, with a view to developing an accurate prognostic prediction based on breast cancer genetic signatures. Prior research on breast cancer has been based on symptoms, origin, and tumor size. It has not been investigated whether diagnosis of breast cancer can be made utilizing AI platforms like Cytoscape, Phenolyzer, and Geneshot, with potential for better prognostic power. This is the first attempt at a combinatorial approach to breast cancer diagnosis using different AI platforms. METHOD Artificial intelligence (AI) tools are mathematical algorithms that simulate human cognitive abilities and solve difficult healthcare issues, such as the complicated biological abnormalities seen in breast cancer. The current models aimed to predict outcomes and prognosis by correlating imaging phenotypes with genetic mutations, tumor profiles, and hormone receptor status, and by developing imaging biomarkers that combine tumor and patient-specific features. Geneshot 2021, Cytoscape 3.9.1, and Phenolyzer (Nature Methods 12:841-843, 2015) were used to mine breast cancer-associated mutational signatures and provided useful alternative computational tools for discerning pathways and enriched networks of similar genes, with the overall goal of providing a systematic view of the variety of mutational processes that lead to breast cancer development. The development of novel tailored pharmaceuticals, as well as the distribution of prospective treatment alternatives, would be aided by the collection of massive datasets and the use of such tools as diagnostic markers. RESULTS Specific DNA-maintenance defects, endogenous or environmental exposures, and cancer genomic signatures are connected. The PubMed database (Geneshot) search for the keywords yielded a total of 21,921 genes associated with breast cancer. Then, based on their propensity to result in gene mutations, the genes were screened using the Phenolyzer software. These platforms lend credence to the fact that breast cancer diagnosis using Cytoscape 3.9.1, Phenolyzer, and Geneshot 2021 reveals a high profile of the following mutational signatures: BRCA1, BRCA2, TP53, CHEK2, PTEN, CDH1, BRIP1, RAD51C, CASP3, CREBBP, and SMAD3.
Affiliation(s)
- Patrick Odhiambo
  - Department of Biological Sciences, School of Natural and Applied Sciences, Masinde Muliro University of Science and Technology, P.O. Box 190, Kakamega, 50100, Kenya
- Harrison Okello
  - Department of Biological Sciences, School of Natural and Applied Sciences, Masinde Muliro University of Science and Technology, P.O. Box 190, Kakamega, 50100, Kenya
- Annette Wakaanya
  - Department of Mathematics, School of Natural and Applied Sciences, Masinde Muliro University of Science and Technology, P.O. Box 190, Kakamega, 50100, Kenya
- Clabe Wekesa
  - Department of Biological Sciences, School of Natural and Applied Sciences, Masinde Muliro University of Science and Technology, P.O. Box 190, Kakamega, 50100, Kenya
- Patrick Okoth
  - Department of Biological Sciences, School of Natural and Applied Sciences, Masinde Muliro University of Science and Technology, P.O. Box 190, Kakamega, 50100, Kenya
|
7
|
Chen Y, Liu S, Papageorgiou LG, Theofilatos K, Tsoka S. Optimisation Models for Pathway Activity Inference in Cancer. Cancers (Basel) 2023; 15:1787. [PMID: 36980673 PMCID: PMC10046797 DOI: 10.3390/cancers15061787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 02/24/2023] [Accepted: 03/08/2023] [Indexed: 03/18/2023] Open
Abstract
BACKGROUND With advances in high-throughput technologies, there has been an enormous increase in data related to profiling the activity of molecules in disease. While such data provide more comprehensive information on cellular actions, their large volume and complexity pose difficulty in accurate classification of disease phenotypes. Therefore, novel modelling methods that can improve accuracy while offering interpretable means of analysis are required. Biological pathways can be used to incorporate a priori knowledge of biological interactions to decrease data dimensionality and increase the biological interpretability of machine learning models. METHODOLOGY A mathematical optimisation model is proposed for pathway activity inference towards precise disease phenotype prediction and is applied to RNA-Seq datasets. The model is based on mixed-integer linear programming (MILP) mathematical optimisation principles and infers pathway activity as the linear combination of pathway member gene expression, multiplying expression values with model-determined gene weights that are optimised to maximise discrimination of phenotype classes and minimise incorrect sample allocation. RESULTS The model is evaluated on the transcriptome of breast and colorectal cancer, and exhibits solution results of good optimality as well as good prediction performance on related cancer subtypes. Two baseline pathway activity inference methods and three advanced methods are used for comparison. Sample prediction accuracy, robustness against noise expression data, and survival analysis suggest competitive prediction performance of our model while providing interpretability and insight on key pathways and genes. Overall, our work demonstrates that the flexible nature of mathematical programming lends itself well to developing efficient computational strategies for pathway activity inference and disease subtype prediction.
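The core quantity described in this abstract, pathway activity as a linear combination of member gene expression and gene weights, can be shown with a short sketch. The weights here are fixed placeholder numbers chosen for illustration; in the paper they are decision variables determined by the MILP, which this sketch does not attempt to solve. Gene and sample names are hypothetical.

```python
def pathway_activity(sample_expression, gene_weights):
    """Pathway activity of one sample: sum over member genes of weight * expression."""
    return sum(
        weight * sample_expression[gene]
        for gene, weight in gene_weights.items()
    )

# Placeholder weights for a hypothetical three-gene pathway (not MILP-optimised).
gene_weights = {"G1": 0.5, "G2": -0.25, "G3": 1.0}

# Invented expression values for two hypothetical samples.
samples = {
    "sample_1": {"G1": 2.0, "G2": 4.0, "G3": 1.0},
    "sample_2": {"G1": 4.0, "G2": 0.0, "G3": 2.0},
}

activities = {
    name: pathway_activity(expression, gene_weights)
    for name, expression in samples.items()
}
print(activities)
```

An optimisation model of the kind described would search for the weights that best separate the phenotype classes along this activity axis.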
Affiliation(s)
- Yongnan Chen
  - Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London WC2B 4BG, UK
- Songsong Liu
  - School of Management, Harbin Institute of Technology, Harbin 150001, China
- Lazaros G Papageorgiou
  - The Sargent Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE, UK
- Konstantinos Theofilatos
  - King's College London British Heart Foundation Centre, School of Cardiovascular and Metabolic Medicine and Sciences, London SE1 7EH, UK
- Sophia Tsoka
  - Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London WC2B 4BG, UK
|
8
|
Huang Z, Zhen S, Jin L, Chen J, Han Y, Lei W, Zhang F. miRNA-1260b Promotes Breast Cancer Cell Migration and Invasion by Downregulating CCDC134. Curr Gene Ther 2023; 23:60-71. [PMID: 36056852 DOI: 10.2174/1566523222666220901112314] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/08/2022] [Revised: 07/25/2022] [Accepted: 08/02/2022] [Indexed: 02/08/2023]
Abstract
BACKGROUND Breast cancer (BRCA) is the most common type of cancer among women worldwide. MiR-1260b has been widely demonstrated to participate in multiple crucial biological functions in tumorigenesis, but its functional effect and mechanism in human breast cancer are not fully understood. METHODS qRT-PCR was used to detect miR-1260b expression in 29 pairs of breast cancer tissues and normal adjacent tissues. The expression level of miR-1260b in BRCA cells was further validated by qRT-PCR. The prognostic role of miR-1260b was assessed using Kaplan-Meier curves. In addition, a miR-1260b knockdown and target gene CCDC134 overexpression model was constructed in the MDA-MB-231 cell line. Transwell migration and invasion assays were performed to analyze the effect of miR-1260b and CCDC134 on the biological function of BRCA cells. TargetScan and miRNAWalk were used to find possible target mRNAs. The relationship between CCDC134 and immune cell surface markers was analyzed using the TIMER database and the XIANTAO platform. GSEA analysis was used to identify possible CCDC134-associated molecular mechanisms and pathways. RESULTS In the present study, miR-1260b expression was significantly upregulated in human breast cancer tissue and a panel of human breast cancer cell lines, while the secretory protein coiled-coil domain containing 134 (CCDC134) exhibited lower mRNA expression. High expression of miR-1260b was associated with poor overall survival among the patients by KM plot. Knockdown of miR-1260b significantly suppressed breast cancer cell migration and invasion and yielded the opposite result. In addition, overexpression of CCDC134 could inhibit breast cancer migration and invasion, and knockdown yielded the opposite result. There were significant positive correlations of CCDC134 with CD25 (IL2RA), CD80 and CD86. GSEA showed that miR-1260b could function through the MAPK pathway by downregulating CCDC134. CONCLUSION Collectively, these results suggested that miR-1260b might be an oncogene of breast cancer and might promote the migration and invasion of BRCA cells by down-regulating its target gene CCDC134 and activating the MAPK signaling pathway, as well as by inhibiting immune function and causing immune escape in human breast cancer.
Affiliation(s)
- Zhijian Huang
  - Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
- Shijian Zhen
  - Department of Pathology, The First Affiliated Hospital of Hunan Traditional Chinese Medical College (Hunan Province Directly Affiliated TCM Hospital), Zhuzhou 412000, China
- Liangzi Jin
  - Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, China
- Jian Chen
  - Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
- Yuanyuan Han
  - Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, China
- Wen Lei
  - Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
- Fuqing Zhang
  - Department of Anesthesiology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
|
9
|
He B, Wang K, Xiang J, Bing P, Tang M, Tian G, Guo C, Xu M, Yang J. DGHNE: network enhancement-based method in identifying disease-causing genes through a heterogeneous biomedical network. Brief Bioinform 2022; 23:6712302. [PMID: 36151744 DOI: 10.1093/bib/bbac405] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/16/2022] [Revised: 08/01/2022] [Accepted: 08/21/2022] [Indexed: 12/14/2022] Open
Abstract
The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and clinical manipulation in disease prevention and treatment. Yet the existing approaches in tackling this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease-disease association network was constructed by the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease-gene associations to connect the disease-disease network and gene-gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on the Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods into the most updated disease-gene association database termed DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and the precision-recall curve, as well as the highest precision and recall, in both the global 5-fold cross-validation and predicting new disease-gene associations. We further performed DGHNE in identifying the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causing genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene-disease associations were highly evidenced by independent experimental studies.
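Two of the steps named in this abstract, cosine similarity between phenotype annotation vectors and network propagation, can be sketched in a few lines. The propagation below is a random walk with restart, one common form of network propagation; the abstract does not state which variant DGHNE uses, so that choice is an assumption, as are the restart probability, the toy vectors, and the node names. The Gaussian random projection enhancement step is not shown.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def propagate(adjacency, seeds, restart=0.5, iterations=50):
    """Random walk with restart: each node passes (1 - restart) of its score
    evenly to its neighbours, and the seed nodes are topped up every step."""
    nodes = list(adjacency)
    initial = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(initial)
    for _ in range(iterations):
        updated = {n: restart * initial[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                continue
            share = (1 - restart) * scores[n] / len(neighbors)
            for m in neighbors:
                updated[m] += share
        scores = updated
    return scores

# Hypothetical binary phenotype annotation vectors for three diseases.
disease_a = [1, 0, 1, 1]
disease_b = [1, 0, 1, 0]
disease_c = [0, 1, 0, 0]
print(cosine_similarity(disease_a, disease_b), cosine_similarity(disease_a, disease_c))

# Hypothetical network: g1 is a known disease gene (the seed), g4 is isolated.
network = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"], "g4": []}
scores = propagate(network, seeds={"g1"})
print(sorted(scores, key=scores.get, reverse=True))
```

Nodes closer to the seed end up with higher scores, which is the basis for ranking candidate genes in propagation-based methods.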
Affiliation(s)
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China
  - School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
- Kun Wang
  - School of Mathematical Sciences, Ocean University of China, Qingdao 266100, China
- Ju Xiang
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China
  - School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang 212001, Jiangsu, China
- Geng Tian
  - Geneis (Beijing) Co., Ltd., Beijing 100102, China
- Cheng Guo
  - Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
- Miao Xu
  - Broad institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA
- Jialiang Yang
  - Academician Workstation, Changsha Medical University, Changsha 410219, China
  - Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, P. R. China
  - School of pharmacy, Changsha Medical University, Changsha 410219, P. R. China
  - Geneis (Beijing) Co., Ltd., Beijing 100102, China
|
10
|
Leng X, Yang J, Liu T, Zhao C, Cao Z, Li C, Sun J, Zheng S. A bioinformatics framework to identify the biomarkers and potential drugs for the treatment of colorectal cancer. Front Genet 2022; 13:1017539. [PMID: 36238159 PMCID: PMC9551025 DOI: 10.3389/fgene.2022.1017539] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/12/2022] [Accepted: 09/08/2022] [Indexed: 11/13/2022] Open
Abstract
Colorectal cancer (CRC), a common malignant tumor, is one of the main causes of cancer death worldwide. Therefore, it is critical to understand the molecular mechanism of CRC and identify its diagnostic and prognostic biomarkers. The purpose of this study is to reveal the genes involved in the development of CRC and to predict drug candidates that may help treat CRC through bioinformatics analyses. Two independent CRC gene expression datasets, one from The Cancer Genome Atlas (TCGA) database and GSE104836, were used in this study. Differentially expressed genes (DEGs) were identified separately in the two datasets and then intersected for further analyses. A total of 249 drug candidates for CRC were identified according to the intersected DEGs and the Crowd Extracted Expression of Differential Signatures (CREEDS) database. In addition, hub genes were identified from the DEGs using Cytoscape, and survival analysis showed that one of the hub genes, TIMP1, was related to the prognosis of CRC patients. Thus, we further focused on drugs that could reverse the expression level of TIMP1. Among the 249 drugs, eight potential drugs with documentary evidence and two new drugs that could reverse the expression of TIMP1 were found. In conclusion, we identified potential biomarkers for CRC and achieved drug repurposing using bioinformatics methods. Further exploration is needed to understand the molecular mechanisms of these identified genes and drugs/small molecules in the occurrence, development and treatment of CRC.
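The intersection step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the gene names and statistics are hypothetical toy values, the cutoffs are arbitrary, and the requirement that a gene change in the same direction in both datasets is one plausible reading of "intersected", not something the abstract states.

```python
def call_degs(stats, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return {gene: direction} for genes passing both cutoffs.

    stats maps gene -> (log2 fold change, adjusted p-value).
    The cutoffs are illustrative defaults, not the study's values.
    """
    degs = {}
    for gene, (lfc, padj) in stats.items():
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff:
            degs[gene] = "up" if lfc > 0 else "down"
    return degs


def intersect_degs(degs_a, degs_b):
    """Keep genes called in both datasets with the same direction (assumption)."""
    return {g: d for g, d in degs_a.items() if degs_b.get(g) == d}


# Hypothetical toy statistics for two datasets
dataset_1 = {"GENE_A": (2.1, 0.001), "GENE_B": (-1.5, 0.01), "GENE_C": (0.3, 0.2)}
dataset_2 = {"GENE_A": (1.8, 0.004), "GENE_B": (1.2, 0.03), "GENE_D": (-2.0, 0.001)}

shared = intersect_degs(call_degs(dataset_1), call_degs(dataset_2))
print(shared)
```

Here GENE_B is dropped because its direction differs between the two toy datasets, and GENE_C and GENE_D are dropped because each is a DEG in only one of them.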
|
11
|
Li D, Li L, Quan F, Wang T, Xu S, Li S, Tian K, Feng M, He N, Tian L, Chen B, Zhang H, Wang L, Wang J. Identification of circulating immune landscape in ischemic stroke based on bioinformatics methods. Front Genet 2022; 13:921582. [PMID: 35957686 PMCID: PMC9358692 DOI: 10.3389/fgene.2022.921582] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/16/2022] [Accepted: 07/06/2022] [Indexed: 11/19/2022]
Abstract
Ischemic stroke (IS) is a high-incidence disease that seriously threatens human life and health. Neuroinflammation and immune responses are key players in the pathophysiological processes of IS. However, the underlying immune mechanisms are not fully understood. In this study, we attempted to identify immune biomarkers associated with IS. We first retrospectively collected validated human IS immune-related genes (IS-IRGs) as seed genes. Potential IS-IRGs were then discovered by applying random walk with restart on the PPI network, with a permutation test as the screening strategy. The validated and potential sets of IS-IRGs were merged into an IS-IRG catalog. Two microarray profiles were subsequently used to explore the expression patterns of the IS-IRG catalog, and only IS-IRGs that were differentially expressed between IS patients and controls in both profiles were retained for biomarker selection by Random Forest rankings. CLEC4D and CD163 were finally identified as immune biomarkers of IS, and a classification model was constructed and verified based on the weights of the two biomarkers obtained from the Neural Network algorithm. Furthermore, the CIBERSORT algorithm was used to determine the proportions of circulating immune cells. Correlation analyses between IS immune biomarkers and immune cell proportions demonstrated that CLEC4D was strongly correlated with the proportion of neutrophils (r = 0.72). These results may provide potential targets for further studies on immuno-neuroprotection therapies against reperfusion injury.
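As a rough illustration of random walk with restart on a network, the following sketch iterates the standard update p ← r·p0 + (1 − r)·W·p on a tiny undirected graph. The graph, node names, restart probability, and function name are hypothetical; the study's actual PPI network, parameters, and permutation-test screening are not reproduced here.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-8, max_iter=1000):
    """Score nodes by proximity to the seed set.

    adj maps node -> list of neighbours (undirected, no isolated nodes assumed).
    At each step the walker restarts at a seed with probability `restart`,
    otherwise it moves to a uniformly chosen neighbour.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:  # L1 change small enough: treat as converged
            break
    return p


# Hypothetical toy network: a path A - B - C - D with A as the only seed
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_network, seeds={"A"})
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

In this toy example the scores decrease with distance from the seed, which is the property used to rank candidate genes near known ones.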
Affiliation(s)
- Danyang Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Lifang Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Fei Quan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Tianfeng Wang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Si Xu
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Shuang Li
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Kuo Tian
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Meng Feng
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Ni He
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Liting Tian
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Biying Chen
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Huixue Zhang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- *Correspondence: Huixue Zhang; Lihua Wang; Jianjian Wang
| | - Lihua Wang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Jianjian Wang
- Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
|
12
|
Lung Cancer Stage Prediction Using Multi-Omics Data. Comput Math Methods Med 2022; 2022:2279044. [PMID: 35880092 PMCID: PMC9308511 DOI: 10.1155/2022/2279044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 06/27/2022] [Indexed: 12/24/2022]
Abstract
Lung cancer is one of the leading causes of cancer death. Patients with early-stage lung cancer can be treated by surgery, while patients in the middle and late stages need chemotherapy or radiotherapy. Therefore, accurate staging of lung cancer is crucial for doctors to formulate appropriate treatment plans for patients. In this paper, the random forest algorithm is used as the lung cancer stage prediction model. The accuracy of stage prediction is compared across the microbiome, transcriptome, and fused microbiome-transcriptome groups, and model performance is measured by indicators such as accuracy (ACC), recall, and precision. The results showed that the fused microbiome-transcriptome analysis achieved the highest prediction accuracy, reaching 0.809. The study reveals the role of multimodal data and fusion algorithms in accurately diagnosing lung cancer stage, which could aid doctors in clinics.
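For readers unfamiliar with the indicators named in this abstract, the sketch below computes accuracy, precision, and recall from a pair of label lists. The labels are hypothetical toy values and the helper names are assumptions; the random forest model itself and the study's data are not reproduced.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy stage labels (not data from the study)
y_true = ["early", "early", "late", "late", "late", "early"]
y_pred = ["early", "late", "late", "late", "early", "early"]

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred, positive="late")
print(acc, prec, rec)
```

With more than two stages, the per-class values would typically be averaged (for example macro-averaged); which averaging the study used is not stated in the abstract.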
|
13
|
Huang Z, Yang L, Chen J, Li S, Huang J, Chen Y, Liu J, Wang H, Yu H. CCDC134 as a Prognostic-Related Biomarker in Breast Cancer Correlating With Immune Infiltrates. Front Oncol 2022; 12:858487. [PMID: 35311121 PMCID: PMC8927640 DOI: 10.3389/fonc.2022.858487] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/20/2022] [Accepted: 02/08/2022] [Indexed: 12/24/2022]
Abstract
Background The expression of Coiled-Coil Domain Containing 134 (CCDC134) is up-regulated in multiple cancer types. However, its prognostic value and correlation with immune infiltration in breast cancer are unclear. Therefore, we evaluated the prognostic role of CCDC134 in breast cancer and its correlation with immune infiltration. Methods We downloaded the transcription profiles of CCDC134 in breast cancer and normal tissues from The Cancer Genome Atlas (TCGA). CCDC134 protein expression was assessed using the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and the Human Protein Atlas. Gene set enrichment analysis (GSEA) was used for pathway analysis. A receiver operating characteristic (ROC) curve was used to differentiate breast cancer from adjacent normal tissues. The Kaplan-Meier method was used to evaluate the effect of CCDC134 on survival. The protein-protein interaction (PPI) network was built with STRING. Functional enrichment analysis was performed using the ClusterProfiler package. The Tumor Immune Estimation Resource (TIMER) and the Tumor and Immune System Interaction Database (TISIDB) were used to determine the relationship between CCDC134 expression level and immune infiltration. The CTD database was used to predict drugs that inhibit CCDC134, and the PubChem database was used to determine the molecular structure of the identified drugs. Results CCDC134 mRNA expression in breast cancer tissues was significantly higher than in adjacent normal tissues. ROC curve analysis showed that the AUC value of CCDC134 was 0.663. Kaplan-Meier survival analysis showed that patients with high CCDC134 expression had a poorer prognosis (57.27 months vs 36.96 months, P = 2.0E-6). Correlation analysis showed that CCDC134 mRNA expression was associated with tumor purity and immune infiltration. In addition, CTD database analysis identified abrine, benzo(a)pyrene, bisphenol A, soman, sunitinib, tetrachloroethylene, and valproic acid as seven drugs that may be effective in inhibiting CCDC134. Conclusion In breast cancer, upregulated CCDC134 is significantly associated with lower survival and immune infiltration. Our study suggests that CCDC134 can serve as a biomarker of poor prognosis and a potential immunotherapy target in breast cancer. Seven drugs with significant potential to inhibit CCDC134 were identified.
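The AUC reported in this abstract summarizes how well a single expression value separates tumor from normal samples. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample (the Mann-Whitney formulation). The expression values and function name below are hypothetical and are not data from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical toy expression values
tumor = [5.1, 6.3, 4.8, 7.0]
normal = [4.9, 5.0, 3.2]

auc = roc_auc(tumor, normal)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so a value such as 0.663 indicates modest discriminative ability.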
Affiliation(s)
- Zhijian Huang
- Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China; The Graduate School of Fujian Medical University, Fuzhou, China
| | - Linhui Yang
- Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Jian Chen
- Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Shixiong Li
- Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Jing Huang
- Department of Pharmacy, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Yijie Chen
- Department of Ultrasound, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Jingbo Liu
- Pathology Department, Daqing Longnan Hospital, The Fifth Affiliated Hospital of Qiqihar Medical College, Daqing, China
| | - Hongyan Wang
- Department of Pathology, Daqing Oilfield General Hospital, Daqing, China
| | - Hui Yu
- Department of Pharmacy, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
|
14
|
Huang Z, Yang J, Qiu W, Huang J, Chen Z, Han Y, Ye C. HAUS5 Is A Potential Prognostic Biomarker With Functional Significance in Breast Cancer. Front Oncol 2022; 12:829777. [PMID: 35280773 PMCID: PMC8913513 DOI: 10.3389/fonc.2022.829777] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/07/2021] [Accepted: 02/04/2022] [Indexed: 11/13/2022]
Abstract
Background Breast cancer (BRCA) has become the most frequent, lethal, and aggressive cancer, with increasing morbidity and mortality. The HAUS5 protein was previously found to be involved in centrosome integrity, spindle assembly, and the completion of cytokinesis during mitosis. By promoting chromosome missegregation and aneuploidy, HAUS5 has the potential to cause cancer. The significance of HAUS5 in BRCA and the relationship between its expression and clinical outcomes or immune infiltration remain unclear. Methods A pan-cancer analysis was performed with the TIMER2 web server to identify differential expression of HAUS5. The prognostic value of HAUS5 for BRCA was evaluated with KM plotter and confirmed with a Gene Expression Omnibus (GEO) dataset. We then examined the relationship between HAUS5 expression level (high vs low) and the clinical characteristics of breast cancer. Signaling pathways linked to HAUS5 expression were identified using Gene Set Enrichment Analysis (GSEA). The relative immune cell infiltration of each sample was assessed using the CIBERSORT algorithm and the ESTIMATE method. We compared the Tumor Mutation Burden (TMB) between samples with high and low HAUS5 expression, as well as the differences in gene mutations between the two groups. Changes in BRCA cell proliferation after knockdown of HAUS5 were evaluated by fluorescence cell counting and colony formation assays. Result HAUS5 is strongly expressed in most malignancies, and distinct associations exist between HAUS5 and prognosis in BRCA patients. Upregulated HAUS5 was associated with poor clinicopathological characteristics such as tumor T stage, ER, PR, and HER2 status. According to GSEA, mitotic prometaphase, primary immunodeficiency, DNA replication, and cell cycle-related signaling pathways were all enriched in samples with elevated HAUS5 expression. HAUS5, a core gene of the BRCA microenvironment, was associated with infiltrating immune cell subtypes and tumor cell stemness. TMB in the HAUS5-low expression group was significantly higher than that in the high expression group. The mutation frequency of 15 genes differed substantially between the high and low expression groups. Knockdown of HAUS5 decreased the proliferative capacity of BRCA cells. Conclusion These findings show that HAUS5 is a positive regulator of BRCA progression that contributes to BRCA cell proliferation. As a result, HAUS5 might be a novel prognostic indicator and therapeutic target for BRCA patients.
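Survival comparisons of the kind mentioned in this abstract rest on the Kaplan-Meier (product-limit) estimator. The sketch below is a minimal, generic version of that estimator, not the implementation used by the KM plotter tool; the follow-up times, event flags, and function name are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical toy follow-up data (months, event flag)
toy_times = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]

curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

In practice one curve is estimated per expression group (high vs low), and the curves are compared with a log-rank test, which is not shown here.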
Affiliation(s)
- Zhijian Huang
- Breast Center, Nanfang Hospital, Southern Medical University, Guangzhou, China; Department of Breast Surgical Oncology, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Jiasheng Yang
- School of Electrical and Information Engineering, Anhui University of Technology, Maanshan, China
| | - Wenjing Qiu
- School of Electrical and Information Engineering, Anhui University of Technology, Maanshan, China
| | - Jing Huang
- Department of Pharmacy, Fujian Medical University Cancer Hospital, Fujian Cancer Hospital, Fuzhou, China
| | - Zhirong Chen
- Biomedical Research Center of South China, Fujian Normal University, Fuzhou, China
| | - Yuanyuan Han
- Institute of Medical Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Kunming, China
| | - Changsheng Ye
- Breast Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
|
15
|
Fan Y, Dong X, Li M, Liu P, Zheng J, Li H, Zhang Y. LncRNA KRT19P3 Is Involved in Breast Cancer Cell Proliferation, Migration and Invasion. Front Oncol 2022; 11:799082. [PMID: 35059320 PMCID: PMC8763666 DOI: 10.3389/fonc.2021.799082] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/21/2021] [Accepted: 12/08/2021] [Indexed: 12/13/2022]
Abstract
Long non-coding RNAs (lncRNAs) are recognized as critical regulatory molecules in breast carcinoma (BC). In addition, the progression of BC is closely associated with the immune system. However, the relationship between lncRNAs and the tumor immune system in BC has not been fully studied. LncRNA KRT19P3 has been reported to inhibit the progression of gastric cancer. In the present study, we first discovered that KRT19P3 was downregulated in BC tissues compared with paracancerous tissue. We then showed that KRT19P3 could be used as a marker to differentiate BC from paracancerous tissue. Increased expression of KRT19P3 markedly inhibited the proliferation, migration, and invasion of BC cells in vitro and tumor growth of BC in vivo. Conversely, KRT19P3 knockdown by siRNA transfection markedly promoted the proliferation, migration, and invasion of BC cells. Comparison of clinical parameters showed an inverse relationship between the expression of KRT19P3 and pathological grade. Furthermore, immunohistochemistry (IHC) was applied to determine the positive rates of Ki-67, programmed death-ligand 1 (PD-L1), and CD8 expression in BC tissues. Correlation analysis showed that Ki-67 and PD-L1 were inversely correlated with KRT19P3, whereas CD8 was positively correlated with KRT19P3. In conclusion, this study demonstrated that lncRNA KRT19P3 inhibits BC progression and may affect the expression of PD-L1 in BC, which in turn affects CD8+ T cells (CD8-positive cytotoxic T lymphocytes) in the immune microenvironment.
Affiliation(s)
- Yanping Fan
- Pathology Department, First Affiliated Hospital of Weifang Medical University (Weifang People's Hospital), Weifang, China; Department of Basic Medicine, Weifang Medical University, Weifang, China
| | - Xiaotong Dong
- Pathology Department, First Affiliated Hospital of Weifang Medical University (Weifang People's Hospital), Weifang, China; Department of Basic Medicine, Weifang Medical University, Weifang, China
| | - Meizeng Li
- Pathology Department, First Affiliated Hospital of Weifang Medical University (Weifang People's Hospital), Weifang, China; Department of Basic Medicine, Weifang Medical University, Weifang, China
| | - Pengju Liu
- School of Economics, Qingdao University, Qingdao, China
| | - Jie Zheng
- Department of Basic Medicine, Weifang Medical University, Weifang, China
| | - Hongli Li
- Department of Basic Medicine, Weifang Medical University, Weifang, China
| | - Yunxiang Zhang
- Pathology Department, First Affiliated Hospital of Weifang Medical University (Weifang People's Hospital), Weifang, China
|
16
|
Zhuang J, Liu D, Lin M, Qiu W, Liu J, Chen S. PseUdeep: RNA Pseudouridine Site Identification with Deep Learning Algorithm. Front Genet 2021; 12:773882. [PMID: 34868261 PMCID: PMC8637112 DOI: 10.3389/fgene.2021.773882] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/10/2021] [Accepted: 10/04/2021] [Indexed: 11/16/2022]
Abstract
Background: Pseudouridine (Ψ) is a common ribonucleotide modification that plays a significant role in many biological processes. The identification of Ψ modification sites is of great significance for research on disease mechanisms and biological processes. Machine learning algorithms are desirable for this task because laboratory techniques are expensive and time-consuming. Results: In this work, we propose a deep learning framework, called PseUdeep, to identify Ψ sites in three species: H. sapiens, S. cerevisiae, and M. musculus. In this method, three encoding methods are used to extract the features of RNA sequences: one-hot encoding, K-tuple nucleotide frequency pattern, and position-specific nucleotide composition. The three feature matrices are convolved twice and fed into a capsule neural network and a bidirectional gated recurrent unit network with a self-attention mechanism for classification. Conclusion: Compared with other state-of-the-art methods, our model achieves the highest prediction accuracy on the independent testing data set S-200, where the accuracy improves by 12.38%; on the independent testing data set H-200, the accuracy improves by 0.68%. Moreover, the dimensions of the features we derive from the RNA sequences are only 109, 109, and 119 in H. sapiens, M. musculus, and S. cerevisiae, respectively, which is much smaller than those used in traditional algorithms. On evaluation via tenfold cross-validation and two independent testing data sets, PseUdeep outperforms the best traditional machine learning model available. PseUdeep source code and data sets are available at https://github.com/dan111262/PseUdeep.
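Two of the sequence encodings named in this abstract can be sketched generically as follows. This is an illustration of one-hot encoding and k-mer frequency counting under stated assumptions, not PseUdeep's actual code: the function names, the ACGU alphabet ordering, the choice k = 2, and the example sequence are hypothetical, and the paper's exact definition of the K-tuple nucleotide frequency pattern may differ.

```python
from itertools import product

BASES = "ACGU"  # assumed alphabet and column order


def one_hot(seq):
    """Encode each nucleotide as a 4-element indicator vector."""
    return [[1 if nt == base else 0 for base in BASES] for nt in seq]


def kmer_frequencies(seq, k=2):
    """Relative frequency of every possible k-mer, in lexicographic order.

    Windows containing characters outside BASES are ignored in the counts.
    """
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
    return [counts[m] / total for m in kmers]


# Hypothetical toy RNA fragment
seq = "ACGUAC"
encoded = one_hot(seq)
freqs = kmer_frequencies(seq, k=2)
print(encoded[0], freqs[:4])
```

For a sequence of length L, the one-hot matrix has shape L × 4 and the k-mer vector has 4^k entries, so the two encodings capture positional and compositional information respectively.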
Affiliation(s)
- Jujuan Zhuang
- College of Science, Dalian Maritime University, Dalian, China
| | - Danyang Liu
- College of Science, Dalian Maritime University, Dalian, China
| | - Meng Lin
- College of Science, Dalian Maritime University, Dalian, China
| | - Wenjing Qiu
- Electrical and Information Engineering, Anhui University of Technology, Anhui, China
- Geneis (Beijing) Co., Ltd., Beijing, China
| | | | - Size Chen
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Guangdong Provincial Engineering Research Center for Esophageal Cancer Precise Therapy, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Central Laboratory, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- *Correspondence: Size Chen
|