1
Li DX, Zhou P, Zhao BW, Su XR, Li GD, Zhang J, Hu PW, Hu L. Biocaiv: an integrative webserver for motif-based clustering analysis and interactive visualization of biological networks. BMC Bioinformatics 2023; 24:451. [PMID: 38030973 PMCID: PMC10685597 DOI: 10.1186/s12859-023-05574-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 11/20/2023] [Indexed: 12/01/2023]
Abstract
BACKGROUND As an important task in bioinformatics, clustering analysis plays a critical role in understanding the functional mechanisms of many complex biological systems, which can be modeled as biological networks. The purpose of clustering analysis in biological networks is to identify functional modules of interest, but online clustering tools that both visualize biological networks and provide in-depth biological analysis of the discovered clusters are lacking.
RESULTS Here we present BioCAIV, a novel webserver designed to maximize the accessibility and applicability of clustering analysis for biological networks. Together with its user-friendly interface, this helps biological researchers perform accurate clustering analysis of biological networks and identify functionally significant modules for further assessment.
CONCLUSIONS BioCAIV is an efficient clustering analysis webserver designed for a variety of biological networks. It is freely available without registration at http://bioinformatics.tianshanzw.cn:8888/BioCAIV/.
Affiliation(s)
- Dong-Xu Li
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Peng Zhou
  - School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China
- Bo-Wei Zhao
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Xiao-Rui Su
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Guo-Dong Li
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Jun Zhang
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Peng-Wei Hu
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Lun Hu
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
2
Luo X, Wang L, Hu P, Hu L. Predicting Protein-Protein Interactions Using Sequence and Network Information via Variational Graph Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3182-3194. [PMID: 37155405 DOI: 10.1109/tcbb.2023.3273567] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/10/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in proteomics research, and a variety of computational algorithms have been developed to predict PPIs. Though effective, their performance is constrained by the high false-positive and false-negative rates observed in PPI data. To overcome this problem, a novel PPI prediction algorithm, namely PASNVGA, is proposed in this work by combining the sequence and network information of proteins via a variational graph autoencoder. To do so, PASNVGA first applies different strategies to extract the features of proteins from their sequence and network information, and obtains a more compact form of these features using principal component analysis. In addition, PASNVGA designs a scoring function to measure the higher-order connectivity between proteins and thereby obtain a higher-order adjacency matrix. With all these features and adjacency matrices, PASNVGA trains a variational graph autoencoder model to further learn the integrated embeddings of proteins. The prediction task is then completed using a simple feedforward neural network. Extensive experiments have been conducted on five PPI datasets collected from different species. Compared with several state-of-the-art algorithms, PASNVGA has been demonstrated to be a promising PPI prediction algorithm.
3
Palukuri MV, Patil RS, Marcotte EM. Molecular complex detection in protein interaction networks through reinforcement learning. BMC Bioinformatics 2023; 24:306. [PMID: 37532987 PMCID: PMC10394916 DOI: 10.1186/s12859-023-05425-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 07/20/2023] [Indexed: 08/04/2023]
Abstract
BACKGROUND Proteins often assemble into higher-order complexes to perform their biological functions. Such protein-protein interactions (PPIs) are often experimentally measured for pairs of proteins and summarized in a weighted PPI network, to which community detection algorithms can be applied to define the various higher-order protein complexes. Current methods include unsupervised and supervised approaches, often assuming that protein complexes manifest only as dense subgraphs. In supervised approaches, the focus is not on how to find complexes in a network, which is currently solved using heuristics, but only on learning which subgraphs correspond to complexes. However, learning to walk trajectories on a network to identify protein complexes leads naturally to a reinforcement learning (RL) approach, a strategy not extensively explored for community detection. Here, we develop and evaluate a reinforcement learning pipeline for community detection on weighted protein-protein interaction networks to detect new protein complexes. The algorithm is trained to calculate the value of different subgraphs encountered while walking on the network to reconstruct known complexes. A distributed prediction algorithm then scales the RL pipeline to search for novel protein complexes on large PPI networks.
RESULTS The reinforcement learning pipeline is applied to a human PPI network consisting of 8k proteins and 60k PPIs, yielding 1,157 protein complexes. The method demonstrated competitive accuracy with improved speed compared to previous algorithms. We highlight protein complexes such as C4orf19, C18orf21, and KIAA1522 which are currently minimally characterized. Additionally, the results suggest that TMC04 is a putative additional subunit of the KICSTOR complex and confirm the involvement of C15orf41 in a higher-order complex with HIRA, CDAN1, ASF1A, and by 3D structural modeling.
CONCLUSIONS Reinforcement learning offers several distinct advantages for community detection, including scalability and knowledge of the walk trajectories defining those communities. Applied to currently available human protein interaction networks, this method had accuracy comparable to that of other algorithms with notable savings in computational time, and in turn led to clear predictions of protein function and interactions for several uncharacterized human proteins.
Affiliation(s)
- Meghana V Palukuri
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
  - Oden Institute for Computational Engineering and Sciences, University of Texas, Austin, TX, 78712, USA
- Ridhi S Patil
  - Department of Biomedical Engineering, University of Texas, Austin, TX, 78712, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
  - Oden Institute for Computational Engineering and Sciences, University of Texas, Austin, TX, 78712, USA
4
Pan X, Hu L, Hu P, You ZH. Identifying Protein Complexes From Protein-Protein Interaction Networks Based on Fuzzy Clustering and GO Semantic Information. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2882-2893. [PMID: 34242171 DOI: 10.1109/tcbb.2021.3095947] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Protein complexes provide valuable insights into the mechanisms of the biological processes that proteins carry out. A variety of computational algorithms have thus been proposed to identify protein complexes in a protein-protein interaction network. However, few of them take into account both network topology and protein attribute information in a unified fuzzy-based clustering framework. Since proteins in the same complex are similar in terms of their attribute information, and fuzzy clustering also makes it possible to identify overlapping complexes, we propose such a fuzzy-based clustering framework, namely FCAN-PCI, to improve identification accuracy. To do so, the semantic similarity between the attribute information of proteins is calculated and then integrated, together with the network topology, into a well-established fuzzy clustering model. After that, a momentum method is adopted to accelerate the clustering procedure. Finally, FCAN-PCI applies a heuristic search strategy to identify overlapping protein complexes. A series of extensive experiments have been conducted to evaluate the performance of FCAN-PCI by comparing it with state-of-the-art identification algorithms, and the results demonstrate the promising performance of FCAN-PCI.
5
Sheng J, Xue J, Li P, Yi N. [A protein complex recognition method based on spatial-temporal graph convolution neural network]. Nan Fang Yi Ke Da Xue Xue Bao = Journal of Southern Medical University 2022; 42:1075-1081. [PMID: 35869773 DOI: 10.12122/j.issn.1673-4254.2022.07.17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
OBJECTIVE To propose a new method for mining complexes in dynamic protein networks using a spatiotemporal convolution neural network.
METHODS The edge strength, node strength and edge existence probability were defined to model the dynamic protein network. Based on the time series information and structure information on the graph, two convolution operators were designed using the Hilbert-Huang transform, an attention mechanism and residual connections to represent and learn the characteristics of the proteins in the network, and the dynamic protein network characteristic map was constructed. Finally, spectral clustering was used to identify the protein complexes.
RESULTS The simulation results on several public biological datasets showed that the F value of the proposed algorithm exceeded 90% on the DIP and MIPS datasets. Compared with 4 other recognition algorithms (DPCMNE, GE-CFI, VGAE and NOCD), the proposed algorithm improved the recognition efficiency by 34.5%, 28.7%, 25.4% and 17.6%, respectively.
CONCLUSION The application of deep learning technology can improve the efficiency of dynamic protein network analysis.
Affiliation(s)
- J Sheng
  - Clinical nursing teaching and Research Office, The Second Xiangya Hospital of Central South University, Changsha 410011, China
  - Department of ultrasound diagnosis, The Second Xiangya Hospital of Central South University, Changsha 410011, China
- J Xue
  - Operation center, The Third Xiangya Hospital of Central South University, Changsha 410013, China
- P Li
  - School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China
- N Yi
  - School of Informatics, Hunan University of Chinese Medicine, Changsha 410208, China
6
Karthic S, Kumar SM. Hybrid Optimized Deep Neural Network with Enhanced Conditional Random Field Based Intrusion Detection on Wireless Sensor Network. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10892-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
7
Ayyadurai M, Seetha J, Haque SMFU, Juliana R, Karthikeyan C. Routing Algorithm for Underwater Acoustic Sensor Network. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10891-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
8
Ammari AC, Labidi W, Mnif F, Yuan H, Zhou M, Sarrab M. Firefly algorithm and learning-based geographical task scheduling for operational cost minimization in distributed green data centers. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
9
Deep Learning Based-Virtual Screening Using 2D Pharmacophore Fingerprint in Drug Discovery. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
10
Meng X, Xiang J, Zheng R, Wu FX, Li M. DPCMNE: Detecting Protein Complexes From Protein-Protein Interaction Networks Via Multi-Level Network Embedding. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1592-1602. [PMID: 33417563 DOI: 10.1109/tcbb.2021.3050102] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
Biological functions of a cell are typically carried out through protein complexes. The detection of protein complexes is therefore of great significance for understanding cellular organization and protein functions. In the past decades, many computational methods have been proposed to detect protein complexes. However, most existing methods search only local topological information to mine dense subgraphs as protein complexes, ignoring global topological information. To tackle this issue, we propose the DPCMNE method to detect protein complexes via multi-level network embedding. It preserves both the local and global topological information of biological networks. First, DPCMNE employs a hierarchical compressing strategy to recursively compress the input protein-protein interaction (PPI) network into multi-level smaller PPI networks. Then, a network embedding method is applied to these smaller PPI networks to learn protein embeddings at different levels of granularity. The embeddings learned from all the compressed PPI networks are concatenated to represent the final protein embeddings of the original input PPI network. Finally, a core-attachment based strategy is adopted to detect protein complexes in the weighted PPI network constructed from the pairwise similarity of protein embeddings. To assess the efficiency of the proposed method, DPCMNE is compared with eight other clustering algorithms on two yeast datasets. The experimental results show that DPCMNE outperforms those state-of-the-art complex detection methods in terms of F1 and F1+Acc. Furthermore, the results of functional enrichment analysis indicate that protein complexes detected by DPCMNE are more biologically significant in terms of P-score.
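The concatenation and re-weighting steps described in this abstract can be illustrated with a minimal Python sketch. It is not the DPCMNE implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the helper names (`concatenate_levels`, `cosine_similarity`, `reweight_edges`) and the toy embeddings are hypothetical.

```python
import math


def concatenate_levels(per_level):
    """Concatenate each protein's embeddings across levels (assumes every level covers every protein)."""
    proteins = per_level[0].keys()
    return {p: [x for level in per_level for x in level[p]] for p in proteins}


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reweight_edges(edges, embeddings):
    """Assign each existing edge a weight equal to the similarity of its endpoint embeddings."""
    return {(u, v): cosine_similarity(embeddings[u], embeddings[v]) for u, v in edges}


# Hypothetical embeddings from two compression levels (one dimension each).
level_1 = {"P1": [1.0], "P2": [1.0], "P3": [0.0]}
level_2 = {"P1": [0.0], "P2": [0.0], "P3": [1.0]}
embeddings = concatenate_levels([level_1, level_2])

toy_edges = [("P1", "P2"), ("P2", "P3")]
weighted = reweight_edges(toy_edges, embeddings)
```

In this toy case P1 and P2 share identical embeddings and P2 and P3 are orthogonal, so a core-attachment search run on `weighted` would favor grouping P1 with P2.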
11
Wang R, Ma H, Wang C. An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks. Front Genet 2022; 13:839949. [PMID: 35281831 PMCID: PMC8908451 DOI: 10.3389/fgene.2022.839949] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/20/2021] [Accepted: 01/31/2022] [Indexed: 11/14/2022]
Abstract
Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they are usually based on a single type of training model, and the generalization of a single model is poor. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs a weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using the protein complex core mining strategy we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores and forms protein complexes using a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis shows that ELF-DPC can detect biologically meaningful protein complexes. The code/dataset is available for free download from https://github.com/RongquanWang/ELF-DPC.
Affiliation(s)
- Rongquan Wang
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Huimin Ma
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
  - *Correspondence: Huimin Ma
- Caixia Wang
  - School of International Economics, China Foreign Affairs University, Beijing, China
12
Hu L, Yang S, Luo X, Yuan H, Sedraoui K, Zhou M. A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce. IEEE/CAA Journal of Automatica Sinica 2022; 9:160-172. [DOI: 10.1109/jas.2021.1004198] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 01/03/2025]
13
Hu L, Zhang J, Pan X, Yan H, You ZH. HiSCF: leveraging higher-order structures for clustering analysis in biological networks. Bioinformatics 2021; 37:542-550. [PMID: 32931549 DOI: 10.1093/bioinformatics/btaa775] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 01/19/2020] [Revised: 05/12/2020] [Accepted: 09/03/2020] [Indexed: 02/06/2023]
Abstract
MOTIVATION Clustering analysis in a biological network groups biological entities into functional modules, thus providing valuable insight into complex biological systems. Existing clustering techniques make use of lower-order connectivity patterns at the level of individual biological entities and their connections, but few of them take into account higher-order connectivity patterns at the level of small network motifs.
RESULTS Here, we present a novel clustering framework, namely HiSCF, to identify functional modules based on the higher-order structure information available in a biological network. Taking advantage of a higher-order Markov stochastic process, HiSCF is able to perform the clustering analysis by exploiting a variety of network motifs. When compared with several state-of-the-art clustering models, HiSCF yields the best accuracy for two practical clustering applications, i.e. protein complex identification and gene co-expression module detection. The promising performance of HiSCF demonstrates that considering higher-order network motifs provides new insight into the analysis of biological networks, such as the identification of overlapping protein complexes and the inference of new signaling pathways, and also reveals the rich higher-order organizational structures present in biological networks.
AVAILABILITY AND IMPLEMENTATION HiSCF is available at https://github.com/allenv5/HiSCF.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
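As a rough illustration of what "higher-order connectivity at the level of small network motifs" can mean in practice, the Python sketch below weights each edge by the number of triangles it participates in. This is a generic triangle-motif weighting chosen for illustration, not the HiSCF algorithm, whose motif handling and higher-order Markov process are not detailed in the abstract; the function name and toy network are hypothetical.

```python
def triangle_motif_weights(edges):
    """Map each undirected edge (u, v), with u < v, to the number of triangles containing it."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot be part of a triangle
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    weights = {}
    for u, nbrs in neighbors.items():
        for v in nbrs:
            if u < v:
                # Every common neighbor of u and v closes one triangle on edge (u, v).
                shared = len(nbrs & neighbors[v])
                if shared:
                    weights[(u, v)] = shared
    return weights


# Toy network: one triangle (A, B, C) plus a pendant edge C-D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
motif_weights = triangle_motif_weights(toy_edges)
```

Edges that belong to no triangle, such as C-D here, are dropped from the motif-weighted network, which is why motif-based clustering tends to separate loosely attached nodes from densely connected modules.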
Affiliation(s)
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
- Jun Zhang
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
- Xiangyu Pan
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
- Hong Yan
  - Department of Electrical Engineering, City University of Hong Kong, Hong Kong 999077, China
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China
14
Automatic Detection of Melanins and Sebums from Skin Images Using a Generative Adversarial Network. Cognit Comput 2021. [DOI: 10.1007/s12559-021-09870-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]