1
|
Abbas MN, Broneske D, Saake G. A multi-objective evolutionary algorithm for detecting protein complexes in PPI networks using gene ontology. Sci Rep 2025; 15:16855. [PMID: 40374682 DOI: 10.1038/s41598-025-01667-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2025] [Accepted: 05/07/2025] [Indexed: 05/17/2025] Open
Abstract
Detecting protein complexes is crucial in computational biology for understanding cellular mechanisms and facilitating drug discovery. Evolutionary algorithms (EAs) have proven effective in uncovering protein complexes within networks of protein-protein interactions (PPIs). However, their integration with functional insights from gene ontology (GO) annotations remains underexplored. This paper presents two primary contributions: First, it proposes a novel multi-objective optimization model for detecting protein complexes, conceptualizing the task as a problem with inherently conflicting objectives based on biological data. Second, it introduces an innovative gene ontology-based mutation operator, termed the Functional Similarity-Based Protein Translocation Operator ([Formula: see text]). This operator enhances collaboration between the canonical model and the GO-informed mutation strategy, thereby improving the algorithm's performance. As far as we know, this is the initial effort to incorporate the biological characteristics of PPIs into both the problem formulation and the development of intricate perturbation strategies. We assess the effectiveness of the proposed multi-objective evolutionary algorithm through experiments conducted on two widely recognized PPI networks and two standard complex datasets provided by the Munich Information Center for Protein Sequences (MIPS). To further assess the robustness of our algorithm, we create artificial networks by introducing different noise levels into the original Saccharomyces cerevisiae (yeast) PPI networks. This allows us to evaluate how perturbations in protein interactions affect the algorithm's performance compared to other approaches. The experimental results highlight that our algorithm outperforms several state-of-the-art methods in accurately identifying protein complexes. 
Moreover, the findings emphasize the substantial advantages of incorporating our heuristic perturbation operator, which significantly improves the quality of the detected complexes over other evolutionary algorithm-based methods.
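The multi-objective formulation described in this abstract can be illustrated with Pareto dominance, the usual way candidate solutions are compared when objectives conflict. This is a minimal sketch, not the authors' algorithm: the two objectives (a topological score and a functional score, both maximized) and all values are hypothetical.

```python
from typing import List, Tuple

# Hypothetical pair (topological_score, functional_score); both are maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep the candidates that no other candidate dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Illustrative values only; they are not results from the paper.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9)]
front = pareto_front(candidates)
```

Here `(0.5, 0.5)` is dropped because `(0.6, 0.6)` is better on both objectives, while the other three candidates each represent a different trade-off and remain on the front.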
Affiliation(s)
- Mustafa N Abbas
  - Databases and Software Engineering, Otto-von-Guericke-University, Magdeburg, Germany
- David Broneske
  - German Centre for Higher Education Research and Science Studies, Hannover, Germany
- Gunter Saake
  - Databases and Software Engineering, Otto-von-Guericke-University, Magdeburg, Germany
|
2
|
Ayub U, Naveed H. GSLAlign: community detection and local PPI network alignment. J Biomol Struct Dyn 2025; 43:4174-4182. [PMID: 38214492 DOI: 10.1080/07391102.2024.2301757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Accepted: 12/29/2023] [Indexed: 01/13/2024]
Abstract
High throughput protein-protein interaction (PPI) profiling and computational techniques have resulted in generating a large amount of PPI network data. The study of PPI networks helps in understanding the biological processes of the proteins. The comparative study of the PPI networks helps in identifying the conserved interactions across the species. This article presents a novel local PPI network aligner 'GSLAlign' that consists of two stages. It first detects the communities from the PPI networks by applying the GraphSAGE algorithm using gene expression data. In the second stage, the detected communities are aligned using a community aligner that is based on protein sequence similarity. The community detection algorithm produces more separable and biologically accurate communities as compared to previous community detection algorithms. Moreover, the proposed community alignment algorithm achieves 3-8% better results in terms of semantic similarity as compared to previous local aligners. The average connectivity and coverage of the proposed algorithm are also better than the existing aligners.
Affiliation(s)
- Umair Ayub
  - Department of Computer Science, Bahria University, Lahore, Pakistan
- Hammad Naveed
  - National University of Computer and Emerging Sciences, Lahore, Pakistan and Computational Biology Research Lab, National University of Computer and Emerging Sciences, Lahore, Pakistan
|
3
|
Tsai YH, Lai YH, Chen SJ, Cheng YC, Pai TW. DNA Methylation Biomarker Discovery for Colorectal Cancer Diagnosis Assistance Through Integrated Analysis. Cancer Inform 2025; 24:11769351251324545. [PMID: 40291817 PMCID: PMC12033546 DOI: 10.1177/11769351251324545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 02/13/2025] [Indexed: 04/30/2025] Open
Abstract
Objective: This study aimed to identify biomarkers for colorectal cancer (CRC) with representative gene functions and high classification accuracy in tissue and blood samples. Methods: We integrated CRC DNA methylation profiles from The Cancer Genome Atlas and comorbidity patterns of CRC to select biomarker candidates. We clustered these candidates near the promoter regions into multiple functional groups based on their functional annotations. To validate the selected biomarkers, we applied 3 machine learning techniques to construct models and compare their prediction performances. Results: The 10 screened genes showed significant methylation differences in both tissue and blood samples. Our test results showed that 3-gene combinations achieved outstanding classification performance. Selecting 3 representative biomarkers from different genetic functional clusters, the combination of ADHFE1, ADAMTS5, and MIR129-2 exhibited the best performance across the 3 prediction models, achieving a Matthews correlation coefficient > .85 and an F1-score of .9. Conclusions: Using integrated DNA methylation analysis, we identified 3 CRC-related biomarkers with remarkable classification performance. These biomarkers can be used to design a practical clinical toolkit for CRC diagnosis assistance and may also serve as candidate biomarkers for further clinical experiments through liquid biopsies.
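The two metrics reported in this abstract, the Matthews correlation coefficient (MCC) and the F1-score, are both computed from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from the study.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * TP / (2 * TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts for a tumour-vs-normal classifier.
tp, tn, fp, fn = 45, 47, 3, 5
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

Unlike F1, MCC also uses the true negatives, which is why the two are often reported together for diagnostic classifiers.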
Affiliation(s)
- Yi-Hsuan Tsai
  - Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei, Taiwan
- Yi-Husan Lai
  - Department of Product Development, ACT Genomics Co., Ltd., Taipei, Taiwan
- Shu-Jen Chen
  - Department of Product Development, ACT Genomics Co., Ltd., Taipei, Taiwan
- Yi-Chiao Cheng
  - Division of Colon and Rectal Surgery, Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Tun-Wen Pai
  - Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei, Taiwan
|
4
|
Edera AA, Stegmayer G, Milone DH. gGN: Representing the Gene Ontology as low-rank Gaussian distributions. Comput Biol Med 2024; 183:109234. [PMID: 39395345 DOI: 10.1016/j.compbiomed.2024.109234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 09/06/2024] [Accepted: 09/30/2024] [Indexed: 10/14/2024]
Abstract
Computational representations of knowledge graphs are critical for several tasks in bioinformatics, including large-scale graph analysis and gene function characterization. In this study, we introduce gGN, an unsupervised neural network for learning node representations as Gaussian distributions. Unlike prior efforts, where the covariance matrices of these distributions are simplified to diagonal, we propose representing them with a low-rank approximation. This representation not only maintains manageable learning complexity, allowing for scaling to large graphs, but is also more effective for modeling the structural features of knowledge graphs, such as their hierarchical and directional relationships between nodes. To learn the low-rank Gaussian distributions, we introduce a semantic-based loss function that effectively preserves these structural features. Systematic experiments reveal that gGN preserves structural features more effectively than existing approaches and scales efficiently on large knowledge graphs. Furthermore, applying gGN to represent the Gene Ontology, a widely used knowledge graph in bioinformatics, outperformed multiple baseline methods in ubiquitous gene characterization tasks. Altogether, the proposed low-rank Gaussian distributions not only effectively represent knowledge graphs but also open new avenues for enhancing bioinformatics tasks. gGN is publicly available as an easily installable package at https://github.com/aedera/ggn.
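To make the trade-off between diagonal, low-rank, and full covariance matrices concrete, the sketch below builds a covariance of the form Σ = diag(d) + U Uᵀ and counts the parameters each option needs. The abstract does not give gGN's exact parameterization, so the diagonal-plus-low-rank form, the function names, and the numbers are assumptions for illustration only; this is not the gGN package API.

```python
from typing import Dict, List


def low_rank_covariance(diag: List[float], factors: List[List[float]]) -> List[List[float]]:
    """Return Sigma = diag(diag) + U @ U.T, where U has one row of r factors per dimension."""
    n = len(diag)
    sigma = [
        [sum(a * b for a, b in zip(factors[i], factors[j])) for j in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        sigma[i][i] += diag[i]
    return sigma


def parameter_counts(dim: int, rank: int) -> Dict[str, int]:
    """Free covariance parameters per node for each representation."""
    return {
        "diagonal": dim,
        "low_rank": dim + dim * rank,
        "full": dim * (dim + 1) // 2,
    }


# Hypothetical 3-dimensional example with a rank-1 factor.
sigma = low_rank_covariance([1.0, 1.0, 1.0], [[1.0], [2.0], [0.0]])
counts = parameter_counts(dim=100, rank=2)
```

The off-diagonal entries of `sigma` are nonzero wherever two dimensions share a factor, which a purely diagonal covariance cannot express, while the parameter count still grows linearly with the dimension.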
Affiliation(s)
- Alejandro A Edera
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL 3000, Santa Fe, Argentina
- Georgina Stegmayer
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL 3000, Santa Fe, Argentina
- Diego H Milone
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL 3000, Santa Fe, Argentina
|
5
|
Hong Y, Yuan Q, Wang L, Yang Z, Xu P, Guan X, Chen C. Integrative bioinformatics analysis to identify ferroptosis-related genes in non-obstructive azoospermia. J Assist Reprod Genet 2024; 41:2145-2161. [PMID: 38902567 PMCID: PMC11339017 DOI: 10.1007/s10815-024-03155-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 05/23/2024] [Indexed: 06/22/2024] Open
Abstract
PURPOSE: The objective of this study was to discern ferroptosis-related genes (FRGs) linked to non-obstructive azoospermia and investigate the associated molecular mechanisms. METHOD: A dataset related to azoospermia was retrieved from the Gene Expression Omnibus database, and FRGs were sourced from GeneCards. Ferroptosis-related differentially expressed genes (FRDEGs) were discerned. Subsequently, these genes underwent analyses encompassing Gene Ontology and Kyoto Encyclopedia of Genes and Genomes, as well as protein-protein interaction (PPI) networks and assessments of functional similarity. Following the identification of hub genes, an exploration of immune infiltration, single-cell expression, diagnostic utility, and interactions involving hub genes, RNA-binding proteins (RBPs), transcription factors (TFs), microRNAs (miRNAs), and drugs was conducted. RESULTS: A total of 35 differentially expressed FRGs were discerned. These genes demonstrated enrichment in functions and pathways associated with ferroptosis. From the PPI network, eight hub genes were selected. Functional similarity analysis highlighted the potential pivotal roles of HMOX1 and GPX4 in azoospermia. Analysis of immune cell infiltration indicated a significant decrease in activated dendritic cells in the azoospermia group, with notable correlations between hub genes, particularly SAT1 and HMGCR, and immune cell infiltration. Unique expression patterns of hub genes across various cell types in the human testis were observed, with GPX4 prominently enriched in spermatid/sperm. Eight hub genes exhibited robust diagnostic value (AUC > 0.75). Lastly, a comprehensive hub gene-miRNA-TF-RBP-drug network was constructed. CONCLUSION: In summary, our investigation unveiled eight FRDEGs associated with azoospermia, which hold potential as biomarkers for the diagnosis and treatment of azoospermia.
Affiliation(s)
- Yanggang Hong
  - The Second School of Medicine, Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Qichao Yuan
  - Department of Pediatric Urology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Lingfei Wang
  - Department of Pediatric Urology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Zihan Yang
  - Department of Pediatric Urology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Peiyu Xu
  - The Second School of Medicine, Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Xiaoju Guan
  - Department of Pediatric Urology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
- Congde Chen
  - Department of Pediatric Urology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
  - Key Laboratory of Children Genitourinary Diseases of Wenzhou, Wenzhou, 325000, Zhejiang, China
|
6
|
Kartheeswaran KP, Rayan AXA, Varrieth GT. Enhanced disease-disease association with information enriched disease representation. Mathematical Biosciences and Engineering: MBE 2023; 20:8892-8932. [PMID: 37161227 DOI: 10.3934/mbe.2023391] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/11/2023]
Abstract
OBJECTIVE: Quantification of disease-disease association (DDA) enables the understanding of disease relationships for discovering disease progression and finding comorbidity. For effective DDA strength calculation, the main challenge is to integrate the various biomedical aspects of DDA into an information-rich disease representation. MATERIALS AND METHODS: An enhanced and integrated DDA framework is developed that integrates enriched literature-based and concept-based DDA representations. The literature component of the proposed framework uses PubMed abstracts and consists of an improved neural network model that classifies DDAs for an enhanced literature-based DDA representation. Similarly, in the ontology component, an ontology-based joint multi-source association embedding model is proposed that uses Disease Ontology (DO), UMLS, insurance claims, clinical notes, etc. RESULTS AND DISCUSSION: The obtained information-rich disease representation is evaluated on different aspects of DDA datasets, such as Gene, Variant, and Gene Ontology (GO), and on a human-rated benchmark dataset. The DDA scores calculated using the proposed method achieved a high correlation, mainly on the gene-based dataset. The quantified scores also showed a better correlation of 0.821 when evaluated on 213 human-rated disease pairs. In addition, the generated disease representation is shown to have a substantial effect on the correlation of DDA scores for different categories of disease pairs. CONCLUSION: The enhanced context and semantic DDA framework provides an enriched disease representation, resulting in highly correlated results with different DDA datasets. We have also presented the biological interpretation of disease pairs. The developed framework can also be used for deriving the strength of other biomedical associations.
|
7
|
Zhong Y, Zhao J, Deng H, Wu Y, Zhu L, Yang M, Liu Q, Luo G, Ma W, Li H. Integrative bioinformatics analysis to identify novel biomarkers associated with non-obstructive azoospermia. Front Immunol 2023; 14:1088261. [PMID: 36969237 PMCID: PMC10031032 DOI: 10.3389/fimmu.2023.1088261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 02/22/2023] [Indexed: 03/11/2023] Open
Abstract
Aim: This study aimed to identify autophagy-related genes (ARGs) associated with non-obstructive azoospermia and explore the underlying molecular mechanisms. Methods: Two datasets associated with azoospermia were downloaded from the Gene Expression Omnibus database, and ARGs were obtained from the Human Autophagy-dedicated Database. Autophagy-related differentially expressed genes were identified in the azoospermia and control groups. These genes were subjected to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes, protein–protein interaction (PPI) network, and functional similarity analyses. After identifying the hub genes, immune infiltration and hub gene–RNA-binding protein (RBP)–transcription factor (TF)–miRNA–drug interactions were analyzed. Results: A total of 46 differentially expressed ARGs were identified between the azoospermia and control groups. These genes were enriched in autophagy-associated functions and pathways. Eight hub genes were selected from the PPI network. Functional similarity analysis revealed that HSPA5 may play a key role in azoospermia. Immune cell infiltration analysis revealed that activated dendritic cells were significantly decreased in the azoospermia group compared to those in the control groups. Hub genes, especially ATG3, KIAA0652, MAPK1, and EGFR, were strongly correlated with immune cell infiltration. Finally, a hub gene–miRNA–TF–RBP–drug network was constructed. Conclusion: The eight hub genes, including EGFR, HSPA5, ATG3, KIAA0652, and MAPK1, may serve as biomarkers for the diagnosis and treatment of azoospermia. The study findings suggest potential targets and mechanisms for the occurrence and development of this disease.
Affiliation(s)
- Yucheng Zhong
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Jun Zhao
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Hao Deng
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Yaqin Wu
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Li Zhu
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Meiqiong Yang
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Qianru Liu
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Guoqun Luo
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
- Wenmin Ma
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China
  - Assist Reproductive Medical Center, Zhaoqing West River Hospital, Zhaoqing, Guangdong, China
- Huan Li
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal and Child Health Hospital of Foshan, Foshan, Guangdong, China

*Correspondence: Wenmin Ma; Huan Li
|
8
|
Zhong Y, Chen X, Zhao J, Deng H, Li X, Xie Z, Zhou B, Xian Z, Li X, Luo G, Li H. Integrative analyses of potential biomarkers and pathways for non-obstructive azoospermia. Front Genet 2022; 13:988047. [PMID: 36506310 PMCID: PMC9730279 DOI: 10.3389/fgene.2022.988047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Accepted: 11/01/2022] [Indexed: 11/26/2022] Open
Abstract
Background: Non-obstructive azoospermia (NOA) is the most severe form of male infertility. Currently, the molecular mechanisms underlying NOA pathology have not yet been elucidated. Hence, elucidation of the mechanisms of NOA and exploration of potential biomarkers are essential for accurate diagnosis and treatment of this disease. In the present study, we aimed to screen for biomarkers and pathways involved in NOA and reveal their potential molecular mechanisms using integrated bioinformatics. Methods: We downloaded two gene expression datasets from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between NOA and matched control tissues were identified using the limma package in R software. Subsequently, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), gene set enrichment analysis (GSEA), protein-protein interaction (PPI) network, gene-microRNA network, and transcription factor (TF)-hub gene regulatory network analyses were performed to identify hub genes and associated pathways. Finally, we conducted immune infiltration analysis using CIBERSORT to evaluate the relationship between the hub genes and the NOA immune infiltration levels. Results: We identified 698 common DEGs, including 87 commonly upregulated and 611 commonly downregulated genes in the two datasets. GO analysis indicated that the most significantly enriched term was protein polyglycylation, and KEGG pathway analysis revealed that the DEGs were most significantly enriched in taste transduction and pancreatic secretion signaling pathways. GSEA showed that DEGs affected the biological functions of the ribosome, focal adhesion, and protein export. We further identified the top 31 hub genes from the PPI network, and friends analysis of hub genes in the PPI network showed that NR4A2 had the highest score. In addition, immune infiltration analysis found that CD8+ T cells and plasma cells were significantly correlated with ODF3 expression, whereas naive B cells, plasma cells, monocytes, M2 macrophages, and resting mast cells showed significant variation between the NR4A2 gene expression groups, and regulatory T cell infiltration differed between the FOS gene expression groups. Conclusion: The present study successfully constructed a regulatory network of DEGs between NOA and normal controls and screened three hub genes using integrative bioinformatics analysis. In addition, our results suggest that functional changes in several immune cells in the immune microenvironment may play an important role in spermatogenesis. Our results provide a novel understanding of the molecular mechanisms of NOA and offer potential biomarkers for its diagnosis and treatment.
Affiliation(s)
- Yucheng Zhong
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Xiaoqing Chen
  - Department of Breast Surgical Oncology, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Jun Zhao
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Hao Deng
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Xiaohang Li
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Zhongju Xie
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Bingyu Zhou
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Zhuojie Xian
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Xiaoqin Li
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Guoqun Luo
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China
- Huan Li
  - Assisted Reproductive Technology Center, Southern Medical University Affiliated Maternal & Child Health Hospital of Foshan, Foshan, China

*Correspondence: Guoqun Luo; Huan Li
|