1
|
Zhao H, Xu H, Wang T, Liu G. Constructing multilayer PPI networks based on homologous proteins and integrating multiple PageRank to identify essential proteins. BMC Bioinformatics 2025; 26:80. [PMID: 40059137 PMCID: PMC11892321 DOI: 10.1186/s12859-025-06093-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2025] [Accepted: 02/21/2025] [Indexed: 05/13/2025] Open
Abstract
BACKGROUND Predicting and studying essential proteins not only helps to understand the fundamental requirements for cell survival and growth regulation mechanisms but also deepens our understanding of disease mechanisms and drives drug development. Existing methods for identifying essential proteins primarily focus on PPI networks within a single species, without fully exploiting interspecies homologous relationships. These homologous relationships connect proteins from different species, forming multilayer PPI networks. Some methods only construct interlayer edges based on homologous relationships between two species, without incorporating appropriate biological attributes to assess the biological significance of these edges. Furthermore, homologous proteins are often highly conserved across multiple species, and expanding homologous relationships to more species allows for a more accurate assessment of interlayer edge importance. RESULTS To address these issues, we propose a novel model, MLPR, which constructs a multilayer PPI network based on homologous proteins and integrates multiple PageRank algorithms to identify essential proteins. This study combines homologous protein data from three species to construct interlayer transition matrices and assigns weights to interlayer edges by integrating the biological attributes of homologous proteins and cross-species GO annotations. The MLPR model uses multiple PageRank methods to comprehensively consider homologous relationships across species and designs three key parameters to find the optimal combination that balances random walks within layers, global jumps, interlayer biases, and interspecies homologous relationships. CONCLUSIONS Experimental results show that MLPR outperforms other state-of-the-art methods in terms of performance. 
Ablation experiments further validate that integrating homologous relationships across three species effectively enhances the overall performance of MLPR and demonstrates the advantages of the multiple PageRank model in identifying essential proteins.
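PageRank is the building block named in the abstract above. As a rough illustration only, the sketch below runs plain single-layer PageRank by power iteration on a toy undirected graph. It is not the MLPR model: the interlayer transition matrices, homology-based edge weights, and the three tuning parameters are not reproduced. The protein names, the damping value of 0.85, and the function name `pagerank` are assumptions made for this example.

```python
def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [neighbours]}."""
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Mass held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node passes its rank in equal shares to its neighbours.
        for v in nodes:
            for w in adjacency[v]:
                new_rank[w] += damping * rank[v] / len(adjacency[v])
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy undirected PPI graph (hypothetical protein names), stored symmetrically.
toy_ppi = {
    "P1": ["P2", "P3", "P4"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2"],
    "P4": ["P1"],
}
scores = pagerank(toy_ppi)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the best-connected node ends up with the highest score; a multilayer variant would additionally let the walk move between species layers along homology edges.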
Affiliation(s)
- He Zhao
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Huan Xu
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Tao Wang
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Guixia Liu
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
|
2
|
Zhao H, Liu G, Cao X. A seed expansion-based method to identify essential proteins by integrating protein-protein interaction sub-networks and multiple biological characteristics. BMC Bioinformatics 2023; 24:452. [PMID: 38036960 PMCID: PMC10688502 DOI: 10.1186/s12859-023-05583-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Accepted: 11/24/2023] [Indexed: 12/02/2023] Open
Abstract
BACKGROUND The identification of essential proteins is of great significance in biology and pathology. However, protein-protein interaction (PPI) data obtained through high-throughput technology include a high number of false positives. To overcome this limitation, numerous computational algorithms based on biological characteristics and topological features have been proposed to identify essential proteins. RESULTS In this paper, we propose a novel method named SESN for identifying essential proteins. It is a seed expansion method based on PPI sub-networks and multiple biological characteristics. Firstly, SESN utilizes gene expression data to construct PPI sub-networks. Secondly, seed expansion is performed simultaneously in each sub-network, and the expansion process is based on the topological features of predicted essential proteins. Thirdly, the error correction mechanism is based on multiple biological characteristics and the entire PPI network. Finally, SESN analyzes the impact of each biological characteristic, including protein complex, gene expression data, GO annotations, and subcellular localization, and adopts the biological data with the best experimental results. The output of SESN is a set of predicted essential proteins. CONCLUSIONS The analysis of each component of SESN indicates the effectiveness of all components. We conduct comparison experiments using three datasets from two species, and the experimental results demonstrate that SESN achieves superior performance compared to other methods.
Affiliation(s)
- He Zhao
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Guixia Liu
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Xintian Cao
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
|
3
|
Wang R, Ma H, Wang C. An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks. Front Genet 2022; 13:839949. [PMID: 35281831 PMCID: PMC8908451 DOI: 10.3389/fgene.2022.839949] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/20/2021] [Accepted: 01/31/2022] [Indexed: 11/14/2022] Open
Abstract
Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they are usually based on a single type of training model, and the generalization of a single model is poor. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs a weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using the protein complex core mining strategy we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores and forms protein complexes by a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis illustrated that ELF-DPC could detect biologically meaningful protein complexes. The code/dataset is available for free download from https://github.com/RongquanWang/ELF-DPC.
Affiliation(s)
- Rongquan Wang
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Huimin Ma
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
  - *Correspondence: Huimin Ma
- Caixia Wang
  - School of International Economics, China Foreign Affairs University, Beijing, China
|
4
|
Chauhan V, Tiwari A, Joshi N, Khandelwal S. Multi-label classifier for protein sequence using heuristic-based deep convolution neural network. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02529-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
5
|
A Hybrid Shuffled Frog Leaping Algorithm and Its Performance Assessment in Multi-Dimensional Symmetric Function. Symmetry (Basel) 2022. [DOI: 10.3390/sym14010131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/01/2023] Open
Abstract
Ensemble learning of swarm intelligence evolutionary algorithm of artificial neural network (ANN) is one of the core research directions in the field of artificial intelligence (AI). As a representative swarm intelligence evolutionary algorithm, the shuffled frog leaping algorithm (SFLA) has the advantages of simple structure, easy implementation, short operation time, and strong global optimization ability. However, SFLA is prone to falling into local optima when optimizing complex, multi-dimensional symmetric functions, which reduces convergence accuracy. This paper proposes an improved shuffled frog leaping algorithm of threshold oscillation based on simulated annealing (SA-TO-SFLA). In this algorithm, the threshold oscillation strategy and simulated annealing strategy are introduced into the SFLA, which makes the local search behavior more diversified and strengthens the ability to escape from local optima. Using multi-dimensional symmetric functions such as the drop-wave function, Schaffer function N.2, Rastrigin function, and Griewank function, two groups (i: SFLA, SA-SFLA, TO-SFLA, and SA-TO-SFLA; ii: SFLA, ISFLA, MSFLA, DSFLA, and SA-TO-SFLA) of comparative experiments are designed to analyze the convergence accuracy and convergence time. The results show that the threshold oscillation strategy has strong robustness. Moreover, compared with SFLA, the convergence accuracy of the SA-TO-SFLA algorithm is significantly improved, and the median of convergence time is greatly reduced as a whole. The convergence accuracy of the SFLA algorithm on these four test functions is 90%, 100%, 78%, and 92.5%, respectively, and the median of convergence time is 63.67 s, 59.71 s, 12.93 s, and 8.74 s, respectively; the convergence accuracy of the SA-TO-SFLA algorithm on these four test functions is 99%, 100%, 100%, and 97.5%, respectively, and the median of convergence time is 48.64 s, 32.07 s, 24.06 s, and 3.04 s, respectively.
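The simulated annealing component mentioned in this abstract rests on the Metropolis acceptance rule, which occasionally accepts a worse candidate so that the search can leave a local optimum. The sketch below shows only that generic rule applied to the Rastrigin function; it does not implement SFLA or the threshold oscillation strategy, and the starting point, cooling schedule, step size, and function names are assumptions chosen for illustration.

```python
import math
import random


def rastrigin(x):
    """Rastrigin function; global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(start, steps=5000, t0=5.0, cooling=0.999, step_size=0.3, seed=0):
    """Generic simulated annealing with Gaussian moves and geometric cooling."""
    rng = random.Random(seed)
    current, f_current = list(start), rastrigin(start)
    best, f_best = list(current), f_current
    temperature = t0
    for _ in range(steps):
        candidate = [xi + rng.gauss(0.0, step_size) for xi in current]
        f_candidate = rastrigin(candidate)
        if accept(f_candidate - f_current, temperature, rng):
            current, f_current = candidate, f_candidate
            if f_current < f_best:
                best, f_best = list(current), f_current
        temperature *= cooling  # lower temperature makes worse moves rarer
    return best, f_best


start = [3.2, -2.7]  # arbitrary starting point for the example
best, f_best = anneal(start)
```

At high temperature most moves are accepted, which diversifies the search; as the temperature decays the rule approaches plain greedy descent.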
|
6
|
Wang R, Ma H, Wang C. An Improved Memetic Algorithm for Detecting Protein Complexes in Protein Interaction Networks. Front Genet 2022; 12:794354. [PMID: 34970305 PMCID: PMC8712950 DOI: 10.3389/fgene.2021.794354] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/13/2021] [Accepted: 11/22/2021] [Indexed: 11/13/2022] Open
Abstract
Identifying the protein complexes in protein-protein interaction (PPI) networks is essential for understanding cellular organization and biological processes. To address the high false positive/negative rates of PPI networks and detect protein complexes with multiple topological structures, we developed a novel improved memetic algorithm (IMA). IMA first combines the topological and biological properties to obtain a weighted PPI network with reduced noise. Next, it integrates various clustering results to construct the initial populations. Furthermore, a fitness function is designed based on the five topological properties of the protein complexes. Finally, we describe the rest of our IMA method, which primarily consists of four steps: selection operator, recombination operator, local optimization strategy, and updating the population operator. In particular, IMA is a combination of genetic algorithm and a local optimization strategy, which has a strong global search ability, and searches for local optimal solutions effectively. The experimental results demonstrate that IMA performs much better than the base methods and existing state-of-the-art techniques. The source code and datasets of the IMA can be found at https://github.com/RongquanWang/IMA.
Affiliation(s)
- Rongquan Wang
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Huimin Ma
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Caixia Wang
  - School of International Economics, China Foreign Affairs University, Beijing, China
|
7
|
A novel order evaluation model with nested probabilistic-numerical linguistic information applied to traditional order grabbing mode. APPL INTELL 2021. [DOI: 10.1007/s10489-020-02088-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
8
|
Fan Y, Wang P, Mafarja M, Wang M, Zhao X, Chen H. A bioinformatic variant fruit fly optimizer for tackling optimization problems. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106704] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/11/2022]
|
9
|
Yan C, Wu B, Ma J, Zhang G, Luo J, Wang J, Luo H. A Novel Hybrid Filter/Wrapper Feature Selection Approach Based on Improved Fruit Fly Optimization Algorithm and Chi-square Test for High Dimensional Microarray Data. Curr Bioinform 2021. [DOI: 10.2174/1574893615666200324125535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background: Microarray data is widely utilized for disease analysis and diagnosis. However, it is hard to process them directly and achieve high classification accuracy due to the intrinsic characteristics of high dimensionality and small size samples. As an important data preprocessing technique, feature selection is usually used to reduce the dimensionality of some datasets.
Methods: Given the limitations of employing filter or wrapper approaches individually for feature selection, in the study, a novel hybrid filter-wrapper approach, CS_IFOA, is proposed for high dimensional datasets. First, the Chi-square Test is utilized to filter out some irrelevant or redundant features. Next, an improved binary Fruit Fly Optimization algorithm is conducted to further search the optimal feature subset without degrading the classification accuracy. Here, the KNN classifier with the 10-fold-CV is utilized to evaluate the classification accuracy.
Results: Extensive experimental results on six benchmark biomedical datasets show that the proposed CS-IFOA can achieve superior performance compared with other state-of-the-art methods. The CS-IFOA can get a smaller number of features while achieving higher classification accuracy. Furthermore, the standard deviation of the experimental results is relatively small, which indicates that the proposed algorithm is relatively robust.
Conclusion: The results confirmed the efficiency of our approach in identifying some important genes for high-dimensional biomedical datasets, which can be used as an ideal pre-processing tool to help optimize the feature selection process, and improve the efficiency of disease diagnosis.
Affiliation(s)
- Chaokun Yan
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Bin Wu
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Ge Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Jianlin Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Huimin Luo
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
|
10
|
Enhancing QUasi-Affine TRansformation Evolution (QUATRE) with adaptation scheme on numerical optimization. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105908] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 01/28/2023]
|
11
|
Lei X, Yang X, Wu FX. Artificial Fish Swarm Optimization Based Method to Identify Essential Proteins. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:495-505. [PMID: 30113899 DOI: 10.1109/tcbb.2018.2865567] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 06/08/2023]
Abstract
It is well known that essential proteins play an extremely important role in controlling cellular activities in living organisms. Identifying essential proteins from protein-protein interaction (PPI) networks is conducive to the understanding of cellular functions and molecular mechanisms. Hitherto, many essential protein detection methods have been proposed. Nevertheless, those existing identification methods are not satisfactory because of low efficiency and low sensitivity to noisy data. This paper presents a novel computational approach based on artificial fish swarm optimization for essential protein prediction in PPI networks (called AFSO_EP). In AFSO_EP, first, a subset of known essential proteins is randomly chosen as artificial fishes of prior knowledge. Then, essential proteins are detected by imitating four principal biological behaviors of artificial fishes when searching for food or companions, namely foraging, following, swarming, and random behavior; in this process, the network topology, gene expression, gene ontology (GO) annotation, and subcellular localization information are utilized. To evaluate the performance of AFSO_EP, we conduct experiments on two species (Saccharomyces cerevisiae and Drosophila melanogaster). The experimental results show that AFSO_EP achieves better performance in identifying essential proteins than several other well-known identification methods, which confirms its effectiveness.
|
12
|
Abderazek H, Yildiz AR, Mirjalili S. Comparison of recent optimization algorithms for design optimization of a cam-follower mechanism. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.105237] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Indexed: 02/04/2023]
|
13
|
|
14
|
Zhao J, Lei X. Detecting overlapping protein complexes in weighted PPI network based on overlay network chain in quotient space. BMC Bioinformatics 2019; 20:682. [PMID: 31874605 PMCID: PMC6929339 DOI: 10.1186/s12859-019-3256-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Protein complexes are the cornerstones of many biological processes; they assemble into various types of molecular machinery that perform a vast array of biological functions. In fact, a protein may belong to multiple protein complexes. Most existing protein complex detection algorithms cannot reflect overlapping protein complexes. To solve this problem, a novel overlapping protein complex identification algorithm is proposed. RESULTS In this paper, a new clustering algorithm based on overlay network chain in quotient space, denoted ONCQS, is proposed to detect overlapping protein complexes in weighted PPI networks. In the quotient space, a multilevel overlay network is constructed by using the maximal complete subgraph to mine overlapping protein complexes. The GO annotation data is used to weight the PPI network. According to the compatibility relation, the overlay network chain in quotient space is calculated. The protein complexes are contained in the last level of the overlay network. Experiments were carried out on four PPI databases, and ONCQS was compared with five other state-of-the-art methods in the identification of protein complexes. CONCLUSIONS We applied ONCQS to four PPI databases (DIP, Gavin, Krogan and MIPS). The results show that it is superior to five existing algorithms (MCODE, MCL, CORE, ClusterONE and COACH) in detecting overlapping protein complexes.
Affiliation(s)
- Jie Zhao
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, Shaanxi, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, Shaanxi, China
|
15
|
Wang X, Yu B, Ma A, Chen C, Liu B, Ma Q. Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique. Bioinformatics 2019; 35:2395-2402. [PMID: 30520961 PMCID: PMC6612859 DOI: 10.1093/bioinformatics/bty995] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 06/24/2018] [Revised: 11/19/2018] [Accepted: 12/03/2018] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION The prediction of protein-protein interaction (PPI) sites is a key to mutation design, catalytic reaction and the reconstruction of PPI networks. It is a challenging task considering the significant abundant sequences and the imbalance issue in samples. RESULTS A new ensemble learning-based method, Ensemble Learning of synthetic minority oversampling technique (SMOTE) for Unbalancing samples and RF algorithm (EL-SMURF), was proposed for PPI sites prediction in this study. The sequence profile feature and the residue evolution rates were combined for feature extraction of neighboring residues using a sliding window, and the SMOTE was applied to oversample interface residues in the feature space for the imbalance problem. The Multi-dimensional Scaling feature selection method was implemented to reduce feature redundancy and subset selection. Finally, the Random Forest classifiers were applied to build the ensemble learning model, and the optimal feature vectors were inserted into EL-SMURF to predict PPI sites. The performance validation of EL-SMURF on two independent validation datasets showed 77.1% and 77.7% accuracy, which were 6.2-15.7% and 6.1-18.9% higher than the other existing tools, respectively. AVAILABILITY AND IMPLEMENTATION The source codes and data used in this study are publicly available at http://github.com/QUST-AIBBDRC/EL-SMURF/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
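The oversampling step described in this abstract can be pictured with a small sketch of the core SMOTE idea: a synthetic minority sample is placed at a random position on the segment between a minority point and one of its nearest minority neighbours. This is a simplified illustration, not the EL-SMURF pipeline; the sliding-window features, feature selection, and Random Forest ensemble are omitted, and the function name `smote_like`, the toy feature vectors, and the choice of k = 2 are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-dimensional feature vectors for the minority class.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority, n_new=6)
```

Because every synthetic point lies between two real minority samples, the new data stay inside the region already occupied by the minority class rather than duplicating existing samples.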
Affiliation(s)
- Xiaoying Wang
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
  - School of Mathematics, Shandong University, Jinan, China
  - Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
- Bin Yu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
  - Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
  - School of Life Sciences, University of Science and Technology of China, Hefei, China
- Anjun Ma
  - Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA
  - Department Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Cheng Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, China
  - Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, China
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan, China
- Qin Ma
  - Bioinformatics and Mathematical Biosciences Lab, Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA
  - Department Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
|
16
|
Liu X, Hong Z, Liu J, Lin Y, Rodríguez-Patón A, Zou Q, Zeng X. Computational methods for identifying the critical nodes in biological networks. Brief Bioinform 2019; 21:486-497. [DOI: 10.1093/bib/bbz011] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 10/25/2018] [Revised: 12/03/2018] [Accepted: 01/11/2019] [Indexed: 12/28/2022] Open
Abstract
A biological network is complex. A group of critical nodes determines the quality and state of such a network. Increasing studies have shown that diseases and biological networks are closely and mutually related and that certain diseases are often caused by errors occurring in certain nodes in biological networks. Thus, studying biological networks and identifying critical nodes can help determine the key targets in treating diseases. The problem is how to find the critical nodes in a network efficiently and with low cost. Existing experimental methods in identifying critical nodes generally require much time, manpower and money. Accordingly, many scientists are attempting to solve this problem by researching efficient and low-cost computing methods. To facilitate calculations, biological networks are often modeled as several common networks. In this review, we classify biological networks according to the network types used by several kinds of common computational methods and introduce the computational methods used by each type of network.
Affiliation(s)
- Xiangrong Liu
  - Department of Computer Science, Xiamen University, China
- Zengyan Hong
  - Department of Computer Science, Xiamen University, China
- Juan Liu
  - Department of Computer Science, Xiamen University, China
- Yuan Lin
  - ITOP Section, DNB Bank ASA, Solheimsgaten, Bergen, Norway
- Alfonso Rodríguez-Patón
  - Universidad Politécnica de Madrid (UPM) Campus Montegancedo s/n, Boadilla del Monte, Madrid, Spain
- Quan Zou
  - Department of Computer Science, Xiamen University, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, China
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
|
17
|
Zhang W, Xu J, Li Y, Zou X. Integrating network topology, gene expression data and GO annotation information for protein complex prediction. J Bioinform Comput Biol 2018; 17:1950001. [PMID: 30803297 DOI: 10.1142/s021972001950001x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/06/2023]
Abstract
The prediction of protein complexes based on the protein interaction network is a fundamental task for the understanding of cellular life as well as the mechanisms underlying complex disease. A great number of methods have been developed to predict protein complexes based on protein-protein interaction (PPI) networks in recent years. However, because the high throughput data obtained from experimental biotechnology are incomplete, and usually contain a large number of spurious interactions, most of the network-based protein complex identification methods are sensitive to the reliability of the PPI network. In this paper, we propose a new method, Identification of Protein Complex based on Refined Protein Interaction Network (IPC-RPIN), which integrates the topology, gene expression profiles and GO functional annotation information to predict protein complexes from the reconstructed networks. To demonstrate the performance of the IPC-RPIN method, we evaluated the IPC-RPIN on three PPI networks of Saccharomyces cerevisiae and compared it with four state-of-the-art methods. The simulation results show that the IPC-RPIN achieved a better result than the other methods on most of the measurements and is able to discover small protein complexes which have traditionally been neglected.
Affiliation(s)
- Wei Zhang
  - School of Science, East China Jiaotong University, Nanchang 330013, P. R. China
- Jia Xu
  - School of Mechatronic Engineering, East China Jiaotong University, Nanchang 330013, P. R. China
- Yuanyuan Li
  - School of Mathematics and Statistics, Wuhan Institute of Technology in Wuhan, Wuhan 430072, P. R. China
- Xiufen Zou
  - School of Mathematics and Statistics, Wuhan University, Wuhan 430072, P. R. China
|
18
|
Lei X, Zhao J, Fujita H, Zhang A. Predicting essential proteins based on RNA-Seq, subcellular localization and GO annotation datasets. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.03.027] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 11/24/2022]
|
19
|
Lei X, Yang X. A new method for predicting essential proteins based on participation degree in protein complex and subgraph density. PLoS One 2018; 13:e0198998. [PMID: 29894517 PMCID: PMC5997351 DOI: 10.1371/journal.pone.0198998] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/24/2018] [Accepted: 05/30/2018] [Indexed: 12/11/2022] Open
Abstract
Essential proteins are crucial to living cells. Identification of essential proteins from protein-protein interaction (PPI) networks can be applied to pathway analysis and function prediction; furthermore, it can contribute to disease diagnosis and drug design. Several experimental and computational methods have been designed to identify essential proteins; however, the prediction precision remains to be improved. In this paper, we propose a new method for identifying essential proteins based on Participation degree of a protein in protein Complexes and Subgraph Density, named PCSD. To test the performance of PCSD, four PPI datasets (DIP, Krogan, MIPS and Gavin) are used to conduct experiments. The experimental results demonstrate that PCSD achieves better performance in predicting essential proteins than competing methods including DC, SC, EC, IC, LAC, NC, WDC, PeC, and UDoNC. Compared with the most recent method, LBCC, PCSD correctly predicts more essential proteins among given numbers of top-ranked proteins on the DIP dataset, which indicates that PCSD is very effective in discovering essential proteins in most cases.
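The evaluation style described in this abstract, counting how many known essential proteins appear among the top-ranked candidates, can be sketched with the simplest baseline it lists. The example assumes that DC denotes degree centrality, as is usual in this literature, and it does not implement PCSD itself. The edge list, the set of known essential proteins, and the function names are hypothetical.

```python
def degree_centrality(edges):
    """Count how many interaction partners each protein has."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def hits_in_top_k(scores, known_essential, k):
    """Number of known essential proteins among the k highest-scoring ones."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return sum(1 for p in ranked[:k] if p in known_essential)


# Hypothetical toy network and hypothetical ground-truth labels.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
known_essential = {"A", "D"}
dc = degree_centrality(toy_edges)
hits = hits_in_top_k(dc, known_essential, k=2)
```

Any other scoring method can be compared under the same protocol by passing its scores to `hits_in_top_k` with the same values of k.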
Affiliation(s)
- Xiujuan Lei: School of Computer Science, Shaanxi Normal University, Xi’an, China
- Xiaoqin Yang: School of Computer Science, Shaanxi Normal University, Xi’an, China
|
20
|
Wu L, Liu Q, Tian X, Zhang J, Xiao W. A new improved fruit fly optimization algorithm IAFOA and its application to solve engineering optimization problems. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2017.12.031] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Indexed: 01/30/2023]
|
21
|
Jalili M, Gebhardt T, Wolkenhauer O, Salehzadeh-Yazdi A. Unveiling network-based functional features through integration of gene expression into protein networks. Biochim Biophys Acta Mol Basis Dis 2018; 1864:2349-2359. [PMID: 29466699 DOI: 10.1016/j.bbadis.2018.02.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/01/2017] [Revised: 01/31/2018] [Accepted: 02/13/2018] [Indexed: 02/02/2023]
Abstract
Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches that integrate gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease-related genes, prioritizing genes, clustering protein interactions, developing modules, and extracting active subnetworks and static or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers.
Affiliation(s)
- Mahdi Jalili: Hematology, Oncology and SCT Research Center, Tehran University of Medical Sciences, Tehran, Iran; Hematologic Malignancies Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Tom Gebhardt: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Olaf Wolkenhauer: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Ali Salehzadeh-Yazdi: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
|
22
|
Han X, Liu Q, Wang H, Wang L. Novel fruit fly optimization algorithm with trend search and co-evolution. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2017.11.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/19/2023]
|
23
|
Wang G, Ma L, Chen J. A bilevel improved fruit fly optimization algorithm for the nonlinear bilevel programming problem. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.09.038] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/14/2022]
|
24
|
|
25
|
|
26
|
A tuned hybrid intelligent fruit fly optimization algorithm for fuzzy rule generation and classification. Neural Comput Appl 2017. [DOI: 10.1007/s00521-017-3115-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]
|
27
|
Ye F, Lou XY, Sun LF. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications. PLoS One 2017; 12:e0173516. [PMID: 28369096 PMCID: PMC5378331 DOI: 10.1371/journal.pone.0173516] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 09/16/2016] [Accepted: 02/21/2017] [Indexed: 12/03/2022] Open
Abstract
This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fruit fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter tuning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm to search for the optimal solution both in the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in an SVM to perform both parameter tuning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.
Affiliation(s)
- Fei Ye: School of Information Science and Technology, Southwest Jiaotong University, ChengDu, China
- Xin Yuan Lou: School of Information Science and Technology, Southwest Jiaotong University, ChengDu, China
- Lin Fu Sun: School of Information Science and Technology, Southwest Jiaotong University, ChengDu, China
|
28
|
|
29
|
Jiang ZB, Yang Q. A Discrete Fruit Fly Optimization Algorithm for the Traveling Salesman Problem. PLoS One 2016; 11:e0165804. [PMID: 27812175 PMCID: PMC5094794 DOI: 10.1371/journal.pone.0165804] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/14/2016] [Accepted: 10/18/2016] [Indexed: 11/30/2022] Open
Abstract
The fruit fly optimization algorithm (FOA) is a newly developed bio-inspired algorithm. The continuous variant version of FOA has been proven to be a powerful evolutionary approach to determining the optima of a numerical function on a continuous definition domain. In this study, a discrete FOA (DFOA) is developed and applied to the traveling salesman problem (TSP), a common combinatorial problem. In the DFOA, the TSP tour is represented by an ordering of city indices, and the bio-inspired meta-heuristic search processes are executed with two elaborately designed main procedures: the smelling and tasting processes. In the smelling process, an effective crossover operator is used by the fruit fly group to search for the neighbors of the best-known swarm location. During the tasting process, an edge intersection elimination (EXE) operator is designed to improve the neighbors of the non-optimum food location in order to enhance the exploration performance of the DFOA. In addition, benchmark instances from the TSPLIB are classified in order to test the searching ability of the proposed algorithm. Furthermore, the effectiveness of the proposed DFOA is compared to that of other meta-heuristic algorithms. The results indicate that the proposed DFOA can be effectively used to solve TSPs, especially large-scale problems.
Affiliation(s)
- Zi-bin Jiang: College of Business Administration, Hunan University, Changsha, Hunan, China
- Qiong Yang: College of Business Administration, Hunan University, Changsha, Hunan, China
|
30
|
A novel locust swarm algorithm for the joint replenishment problem considering multiple discounts simultaneously. Knowl Based Syst 2016. [DOI: 10.1016/j.knosys.2016.08.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/20/2022]
|