1
Pan H, Wu Z, Liu W, Zhang G. AlphaFun: Structural-Alignment-Based Proteome Annotation Reveals why the Functionally Unknown Proteins (uPE1) Are So Understudied. J Proteome Res 2024; 23:1593-1602. [PMID: 38626392 PMCID: PMC11078154 DOI: 10.1021/acs.jproteome.3c00678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 03/27/2024] [Accepted: 04/03/2024] [Indexed: 04/18/2024]
Abstract
With the rapid expansion of genome sequencing, the functional annotation of proteins has become a bottleneck in understanding proteomes. The Chromosome-centric Human Proteome Project (C-HPP) aims to identify all proteins encoded by the human genome and find functional annotations for them. However, 1137 identified human proteins, called uPE1 proteins, still lack functional annotation. Sequence alignment was insufficient to predict their functions, and the crystal structures of most proteins were unavailable. In this study, we demonstrated a new functional annotation strategy, AlphaFun, based on structural alignment using deep-learning-predicted protein structures. Using this strategy, we functionally annotated 99% of the human proteome, including the uPE1 proteins and missing proteins, which have not been identified yet. The accuracy of the functional annotations was validated using the known-function proteins. The uPE1 proteins shared similar functions with the known-function PE1 proteins and tend to be expressed only in very limited tissues. They are evolutionarily young genes and thus likely function only in specific tissues and conditions, limiting their occurrence in commonly studied biological models. Such functional annotations provide hints for functional investigations of the uPE1 proteins. This proteome-wide functional annotation strategy is also applicable to any other species.
Affiliation(s)
- Hengxin Pan
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
- Zhenqi Wu
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
- Wanting Liu
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
- Gong Zhang
- MOE Key Laboratory of Tumor Molecular Biology and Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
2
Zhan Y, Liu J, Wu M, Tan CSH, Li X, Ou-Yang L. A partially shared joint clustering framework for detecting protein complexes from multiple state-specific signed interaction networks. Comput Biol Med 2023; 159:106936. [PMID: 37105110 DOI: 10.1016/j.compbiomed.2023.106936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023]
Abstract
Detecting protein complexes is critical for studying cellular organizations and functions. The accumulation of protein-protein interaction (PPI) data enables the identification of protein complexes computationally. Although a great number of computational methods have been proposed to identify protein complexes from PPI networks, most of them ignore the signs of PPIs that reflect the ways proteins interact (activation or inhibition). As not all PPIs imply co-complex relationships, taking into account the signs of PPIs can benefit the identification of protein complexes. Moreover, PPI networks are not static, but vary with changes in cell states or environments. However, existing methods are primarily designed for single-network clustering, and rarely consider joint clustering of multiple PPI networks. In this study, we propose a novel partially shared signed network clustering (PS-SNC) model for jointly identifying protein complexes from multiple state-specific signed PPI networks. PS-SNC not only considers the signs of PPIs, but also identifies the common and unique protein complexes in different states. Experimental results on synthetic and real datasets show that our PS-SNC model can achieve better performance than other state-of-the-art protein complex detection methods. Extensive analyses on real datasets demonstrate the effectiveness of PS-SNC in revealing novel insights about the underlying patterns of different cell lines.
Affiliation(s)
- Youlin Zhan
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Jiahan Liu
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Min Wu
- Institute for Infocomm Research (I2R), Agency of Science, Technology, and Research (A*STAR), 138632, Singapore
- Chris Soon Heng Tan
- Department of Chemistry, College of Science, Southern University of Science and Technology, Shenzhen, 518055, China
- Xiaoli Li
- Institute for Infocomm Research (I2R), Agency of Science, Technology, and Research (A*STAR), 138632, Singapore
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518129, China
3
Lyu J, Yao Z, Liang B, Liu Y, Zhang Y. Small protein complex prediction algorithm based on protein-protein interaction network segmentation. BMC Bioinformatics 2022; 23:405. [PMID: 36180820 PMCID: PMC9524060 DOI: 10.1186/s12859-022-04960-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 09/19/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Identifying protein complexes from protein-protein interaction networks (PPINs) is one of the significant tasks in the postgenome era. Protein complexes of no more than 10 members, such as PSD-95, CD44, PKM2 and BRD4, play an irreplaceable role in life activities and are a hotspot of scientific research. In the MIPS, CYC2008, SGD, Aloy and TAP06 datasets, the proportion of small protein complexes is over 75%. However, existing protein complex identification methods do not perform well on small protein complexes. Results: In this paper, we propose a novel method, called BOPS. It is a three-step procedure. First, it calculates balanced weights to replace the original weights. Second, it divides the graphs larger than MAXP until the original PPIN is divided into small PPINs. Third, it enumerates the connected subsets of each small PPIN, identifies potential protein complexes based on cohesion and removes those that are similar. Conclusions: In four yeast PPINs, experimental results show that BOPS achieves an improvement of about 5% over the SOTA model. In addition, we constructed a weighted Homo sapiens PPIN based on STRINGdb and BioGRID, and BOPS achieves the best result on it. These results give new insights into the identification of small protein complexes, and the weighted Homo sapiens PPIN provides more data for related research.
Affiliation(s)
- Jiaqing Lyu
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Zhen Yao
- School of Chemical Engineering, Dalian University of Technology, Dalian, China
- Bing Liang
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian, China
- Yiwei Liu
- School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian, China
- Yijia Zhang
- School of Information Science and Technology, Dalian Maritime University, Dalian, China
4
Omranian S, Nikoloski Z, Grimm DG. Computational identification of protein complexes from network interactions: Present state, challenges, and the way forward. Comput Struct Biotechnol J 2022; 20:2699-2712. [PMID: 35685359 PMCID: PMC9166428 DOI: 10.1016/j.csbj.2022.05.049] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/21/2022] [Revised: 05/25/2022] [Accepted: 05/25/2022] [Indexed: 01/05/2023] Open
5
Shahidul Islam M, Rafiqul Islam M, Ali AS. Protein complex prediction in large protein-protein interaction network. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.100947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022] Open
6
Dilmaghani S, Brust MR, Ribeiro CHC, Kieffer E, Danoy G, Bouvry P. From communities to protein complexes: A local community detection algorithm on PPI networks. PLoS One 2022; 17:e0260484. [PMID: 35085263 PMCID: PMC8794110 DOI: 10.1371/journal.pone.0260484] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/13/2021] [Accepted: 11/10/2021] [Indexed: 11/18/2022] Open
Abstract
Identifying protein complexes in protein-protein interaction (PPI) networks is often handled as a community detection problem, with algorithms generally relying exclusively on the network topology for discovering a solution. The advancement of experimental techniques on PPI has motivated the generation of many Gene Ontology (GO) databases. Incorporating the functionality extracted from GO with the topological properties from the underlying PPI network yields a novel approach to identify protein complexes. Additionally, most of the existing algorithms use global measures that operate on the entire network to identify communities. The result of using global metrics is large communities that are often not correlated with the functionality of the proteins. Moreover, PPI network analysis shows that most of the biological functions possibly lie between local neighbours in PPI networks, which are not identifiable with global metrics. In this paper, we propose a local community detection algorithm (LCDA-GO) that uniquely exploits information of functionality from GO combined with the network topology. LCDA-GO identifies the community of each protein based on the topological and functional knowledge acquired solely from the local neighbour proteins within the PPI network. Experimental results using the Krogan dataset demonstrate that our algorithm outperforms state-of-the-art approaches in most cases in assessment based on Precision, Sensitivity, and particularly Composite Score. We also deployed LCDA, the local-topology based precursor of LCDA-GO, to compare with a similar state-of-the-art approach that exclusively incorporates topological information of PPI networks for community detection. In addition to the high quality of the results, one main advantage of LCDA-GO is its low computational time complexity.
Affiliation(s)
- Saharnaz Dilmaghani
- Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Matthias R. Brust
- Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Carlos H. C. Ribeiro
- Computer Science Division, Aeronautics Institute of Technology (ITA), São José dos Campos, Brazil
- Emmanuel Kieffer
- Faculty of Science, Technology and Medicine (FSTM), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Grégoire Danoy
- Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Faculty of Science, Technology and Medicine (FSTM), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Pascal Bouvry
- Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Faculty of Science, Technology and Medicine (FSTM), University of Luxembourg, Esch-sur-Alzette, Luxembourg
7
Defining and measuring probabilistic ego networks. Social Network Analysis and Mining 2021. [DOI: 10.1007/s13278-020-00708-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
Analyzing ego networks to investigate local properties and behaviors of individuals is a fundamental task in social network research. In this paper we show that there is not a unique way of defining ego networks when the existence of edges is uncertain, since there are two different ways of defining the neighborhood of a node in such network models. Therefore, we introduce two definitions of probabilistic ego networks, called V-Alters-Ego and F-Alters-Ego, both rooted in the literature. Following that, we investigate three fundamental measures (degree, betweenness and closeness) for each definition. We also propose a method to approximate betweenness of an ego node among the neighbors which are connected via shortest paths with length 2. We show that this approximation method is faster to compute and it has high correlation with ego betweenness under the V-Alters-Ego definition in many datasets. Therefore, it can be a reasonable alternative to represent the extent to which a node plays the role of an intermediate node among its neighbors.
8
Omranian S, Angeleska A, Nikoloski Z. Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient. Comput Struct Biotechnol J 2021; 19:5255-5263. [PMID: 34630943 PMCID: PMC8479235 DOI: 10.1016/j.csbj.2021.09.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/14/2021] [Revised: 09/13/2021] [Accepted: 09/13/2021] [Indexed: 12/23/2022] Open
Abstract
We provide a family of efficient network algorithms for protein complex identification. The parameter-free family outperforms existing approaches on different networks. It exactly recovered ~35% of protein complexes in a pan-plant PPI network. We examined the effect of network perturbations on predicted protein complexes.
Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v results in the exact recovery of ∼35% of protein complexes in a pan-plant PPI network and discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications to assess the impact of the PPI network quality on the predicted protein complexes.
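As a rough illustration of the quantity this abstract builds on, the sketch below computes the unweighted local clustering coefficient of a protein in a toy PPI network. It is not the GCC-v algorithm itself; the helper names (`build_adjacency`, `clustering_coefficient`) and the toy edge list are assumptions made for this example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        # With fewer than two neighbors no triangle is possible.
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network: a triangle A-B-C with a pendant protein D on C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(edges)
coefficients = {node: clustering_coefficient(adj, node) for node in adj}
```

Proteins inside tightly knit groups score close to 1, while hubs that bridge otherwise unconnected neighbors score lower, which is why the coefficient is a natural signal for dense candidate complexes.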
Affiliation(s)
- Sara Omranian
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany
- Zoran Nikoloski
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany
9
Hong Z, Liu J, Chen Y. An interpretable machine learning method for homo-trimeric protein interface residue-residue interaction prediction. Biophys Chem 2021; 278:106666. [PMID: 34418678 DOI: 10.1016/j.bpc.2021.106666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/25/2021] [Revised: 08/09/2021] [Accepted: 08/09/2021] [Indexed: 12/29/2022]
Abstract
Protein-protein interaction plays an important role in life activities. A more fine-grained analysis, at the level of residues and atoms, helps us better understand the mechanisms of inter-protein interaction and supports drug design. Developing efficient computational methods to reduce trial and error and to assist experimental researchers in determining complex structures is an ongoing effort in the field. The trimer protein interface, especially that of homotrimers, has rarely been studied. In this paper, we propose an interpretable machine learning method for homo-trimeric protein interface residue pair prediction. The structure, sequence, and physicochemical information are integrated as feature input to the model for training. A graph model is used to represent intra-protein spatial information. Matrix factorization captures the interactions between different features. A kernel function is designed to automatically acquire the adjacent information of the target residue pairs. The accuracy reaches 54.5% on an independent test set. Sequence and structure alignment exhibit the model's ability for self-study. Our model indicates the biological significance between sequence and structure, and could help reduce trial and error in protein complex determination, protein-protein docking, etc. SIGNIFICANCE: Protein complex structures are significant for understanding protein function and for functional protein design. With increasing data, some computational tools have been developed for protein complex residue contact prediction, which is one of the most significant steps in complex structure prediction. However, for homo-trimeric proteins, the sequence-based deep learning predictors are infeasible for homologous sequences, and the black-box nature of these algorithms prevents us from understanding each step of their operation. We therefore propose an interpretable machine learning method for homo-trimeric protein interface residue-residue interaction prediction, and the predictor shows good performance. Our work provides a computational aid for determining homo-trimeric protein interface residue pairs, which will be further verified by wet experiments, and assists downstream work such as protein-protein docking, protein complex structure prediction and drug design.
Affiliation(s)
- Zhonghua Hong
- Jiaxing Hospital of Traditional Chinese Medicine, Jiaxing University, Jiaxing 314001, PR China
- Jiale Liu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, PR China
- Yinggao Chen
- Shantou Central Hospital, Shantou 515041, PR China
10
Swamy KBS, Schuyler SC, Leu JY. Protein Complexes Form a Basis for Complex Hybrid Incompatibility. Front Genet 2021; 12:609766. [PMID: 33633780 PMCID: PMC7900514 DOI: 10.3389/fgene.2021.609766] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/28/2020] [Accepted: 01/20/2021] [Indexed: 12/20/2022] Open
Abstract
Proteins are the workhorses of the cell and execute many of their functions by interacting with other proteins, forming protein complexes. Multi-protein complexes are an admixture of subunits, change their interaction partners, and modulate their functions and cellular physiology in response to environmental changes. When two species mate, the hybrid offspring are usually inviable or sterile because large-scale differences in the genetic makeup between the two parents cause incompatible genetic interactions. Such reciprocal-sign epistasis between inter-specific alleles is not limited to incompatible interactions between just one gene pair and usually involves multiple genes. Many of these multi-locus incompatibilities show visible defects only in the presence of all the interactions, making them hard to characterize. Understanding the dynamics of protein-protein interactions (PPIs) leading to multi-protein complexes is better suited to characterizing multi-locus incompatibilities than the traditional approaches of genetics and molecular biology. The advances in omics technologies, which include genomics, transcriptomics, and proteomics, can help achieve this end. This is especially relevant when studying non-model organisms. Here, we discuss the recent progress in the understanding of hybrid genetic incompatibility and omics technologies, and how together they have helped in characterizing protein complexes and, in turn, multi-locus incompatibilities. We also review advances in bioinformatic techniques suitable for this purpose and propose directions for leveraging the knowledge gained from model organisms to identify genetic incompatibilities in non-model organisms.
Affiliation(s)
- Krishna B. S. Swamy
- Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Ahmedabad, India
- Scott C. Schuyler
- Department of Biomedical Sciences, College of Medicine, Chang Gung University, Taoyuan, Taiwan
- Division of Head and Neck Surgery, Department of Otolaryngology, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Jun-Yi Leu
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
11
Omranian S, Angeleska A, Nikoloski Z. PC2P: Parameter-free network-based prediction of protein complexes. Bioinformatics 2021; 37:73-81. [PMID: 33416831 PMCID: PMC8034538 DOI: 10.1093/bioinformatics/btaa1089] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/06/2020] [Revised: 12/17/2020] [Accepted: 12/30/2020] [Indexed: 11/12/2022] Open
Abstract
Motivation: Prediction of protein complexes from protein–protein interaction (PPI) networks is an important problem in systems biology, as they control different cellular functions. The existing solutions employ algorithms for network community detection that identify dense subgraphs in PPI networks. However, gold standards in yeast and human indicate that protein complexes can also induce sparse subgraphs, introducing further challenges in protein complex prediction. Results: To address this issue, we formalize protein complexes as biclique spanned subgraphs, which include both sparse and dense subgraphs. We then cast the problem of protein complex prediction as a network partitioning into biclique spanned subgraphs with removal of minimum number of edges, called coherent partition. Since finding a coherent partition is a computationally intractable problem, we devise a parameter-free greedy approximation algorithm, termed Protein Complexes from Coherent Partition (PC2P), based on key properties of biclique spanned subgraphs. Through comparison with nine contenders, we demonstrate that PC2P: (i) successfully identifies modular structure in networks, as a prerequisite for protein complex prediction, (ii) outperforms the existing solutions with respect to a composite score of five performance measures on 75% and 100% of the analyzed PPI networks and gold standards in yeast and human, respectively, and (iii,iv) does not compromise GO semantic similarity and enrichment score of the predicted protein complexes. Therefore, our study demonstrates that clustering of networks in terms of biclique spanned subgraphs is a promising framework for detection of complexes in PPI networks. Availability and implementation: https://github.com/SaraOmranian/PC2P. Supplementary information: Supplementary data are available at Bioinformatics online.
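To make the notion of a biclique spanned subgraph more concrete, here is a minimal sketch under a stated assumption: a graph counts as biclique spanned when its vertices can be split into two non-empty sets such that every cross pair is adjacent. Under that assumption, the property holds exactly when the complement graph is disconnected, which is what the code tests. The function name `is_biclique_spanned` is hypothetical, and this is not the PC2P implementation from the linked repository.

```python
def is_biclique_spanned(nodes, edges):
    """Return True if the graph has a spanning complete bipartite subgraph.

    Assumed definition: the vertex set splits into two non-empty parts with
    every cross pair adjacent. This is equivalent to the complement graph
    being disconnected, so we explore the complement from one start node.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return False
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    all_nodes = set(nodes)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        # Complement neighbors of u: nodes that are NOT adjacent to u.
        for w in all_nodes - adj[u] - seen:
            seen.add(w)
            stack.append(w)
    # If the complement search missed a node, the complement is disconnected.
    return len(seen) < len(nodes)


# A sparse star (hub plus three leaves) is biclique spanned; a 4-node path is not.
star = is_biclique_spanned("habc", [("h", "a"), ("h", "b"), ("h", "c")])
path = is_biclique_spanned("abcd", [("a", "b"), ("b", "c"), ("c", "d")])
```

The star example shows why this formalization covers sparse complexes: a hub-and-spoke subgraph has low edge density yet still passes the check.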
Affiliation(s)
- Sara Omranian
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Zoran Nikoloski
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany; Centre of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
12
He Z, Zhao C, Liang H, Xu B, Zou Q. Protein Complexes Identification with Family-Wise Error Rate Control. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2062-2073. [PMID: 31027047 DOI: 10.1109/tcbb.2019.2912602] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
The detection of protein complexes from protein-protein interaction network is a fundamental issue in bioinformatics and systems biology. To solve this problem, numerous methods have been proposed from different angles in the past decades. However, the study on detecting statistically significant protein complexes still has not received much attention. Although there are a few methods available in the literature for identifying statistically significant protein complexes, none of these methods can provide a more strict control on the error rate of a protein complex in terms of family-wise error rate (FWER). In this paper, we propose a new detection method SSF that is capable of controlling the FWER of each reported protein complex. More precisely, we first present a p-value calculation method based on Fisher's exact test to quantify the association between each protein and a given candidate protein complex. Consequently, we describe the key modules of the SSF algorithm: a seed expansion procedure for significant protein complexes search and a set cover strategy for redundancy elimination. The experimental results on five benchmark data sets show that: (1) our method can achieve the highest precision; (2) it outperforms three competing methods in terms of normalized mutual information (NMI) and F1 score in most cases.
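The p-value step mentioned above can be sketched with a one-sided Fisher's exact test. The 2x2 table layout used here (neighbors of a protein inside or outside a candidate complex versus non-neighbors inside or outside) is an assumption for illustration and may differ from the table SSF actually uses; the function name `fisher_right_tail` and the example counts are likewise hypothetical.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact test p-value, P(X >= a), for the table

                      in complex   outside complex
    neighbor of p         a              b
    not a neighbor        c              d

    computed from the hypergeometric distribution with fixed margins.
    """
    n_total = a + b + c + d
    row1 = a + b  # number of neighbors of the protein (its degree)
    col1 = a + c  # number of proteins in the candidate complex
    denom = comb(n_total, row1)
    upper = min(row1, col1)
    tail = 0
    for x in range(a, upper + 1):
        tail += comb(col1, x) * comb(n_total - col1, row1 - x)
    return tail / denom


# Hypothetical counts: 3 of the protein's 4 neighbors fall inside a 4-protein
# candidate complex in a 10-protein network.
p_value = fisher_right_tail(3, 1, 1, 5)
```

A small p-value suggests the protein is connected to the candidate complex more strongly than chance would explain; how such per-protein values are combined to bound the family-wise error rate is specific to the paper and not reproduced here.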
13
Elahi A, Babamir SM. Identification of Protein Complexes Based on Core-Attachment Structure and Combination of Centrality Measures and Biological Properties in PPI Weighted Networks. Protein J 2020; 39:681-702. [PMID: 33040223 DOI: 10.1007/s10930-020-09922-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/28/2020] [Indexed: 02/02/2023]
Abstract
In protein interaction networks, a complex is a group of proteins that causes a biological process to take place. The correct identification of complexes can help to better understand the function of cells for therapeutic purposes, such as drug discovery. This paper uses core-attachment structure, centrality measures, and biological properties of proteins to identify protein complexes, with the aim of enhancing prediction accuracy compared to related work. We use the inherent organization of complexes for identification, whereas most methods, typically clustering methods applied to protein interaction networks, have not considered such properties. Using this method, we determined the core of each complex and its attachment proteins using centrality measures, biological properties and weight density, whereby the weight of each interaction was calculated using the protein information in the Gene Ontology. The approach to weighting the network and measuring the importance of proteins follows our previous work. To compare with other methods, we used the DIP, Collins, Krogan, and Human datasets. The results show that our method significantly improved on other methods in detecting protein complexes. Using p-values, we show the biological significance of our predicted complexes. The proposed method could identify an acceptable number of protein complexes, with the highest proportion of biological significance in collaborating on the functional annotation of proteins.
14
Wu Z, Liao Q, Liu B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein-protein interaction networks. Brief Bioinform 2020; 21:1531-1548. [PMID: 31631226 DOI: 10.1093/bib/bbz085] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/08/2019] [Revised: 06/17/2019] [Accepted: 06/17/2019] [Indexed: 01/03/2025] Open
Abstract
Protein complexes are the fundamental units for many cellular processes. Identifying protein complexes accurately is critical for understanding the functions and organizations of cells. With the increment of genome-scale protein-protein interaction (PPI) data for different species, various computational methods focus on identifying protein complexes from PPI networks. In this article, we give a comprehensive and updated review on the state-of-the-art computational methods in the field of protein complex identification, especially focusing on the newly developed approaches. The computational methods are organized into three categories, including cluster-quality-based methods, node-affinity-based methods and ensemble clustering methods. Furthermore, the advantages and disadvantages of different methods are discussed, and then, the performance of 17 state-of-the-art methods is evaluated on two widely used benchmark data sets. Finally, the bottleneck problems and their potential solutions in this important field are discussed.
Affiliation(s)
- Zhourun Wu
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Qing Liao
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
|
15
|
Paul M, Anand A. Impact of low-confidence interactions on computational identification of protein complexes. J Bioinform Comput Biol 2020; 18:2050025. [PMID: 32757809 DOI: 10.1142/s0219720020500250] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
Protein complexes are the cornerstones of most of the biological processes. Identifying protein complexes is crucial in understanding the principles of cellular organization with several important applications, including in disease diagnosis. Several computational techniques have been developed to identify protein complexes from protein-protein interaction (PPI) data (equivalently, from PPI networks). These PPI data have a significant amount of false positives, which is a bottleneck in identifying protein complexes correctly. Gene ontology (GO)-based semantic similarity measures can be used to assign a confidence score to PPIs. Consequently, low-confidence PPIs are highly likely to be false positives. In this paper, we systematically study the impact of low-confidence PPIs on the performance of complex detection methods using GO-based semantic similarity measures. We consider five state-of-the-art complex detection algorithms and nine GO-based similarity measures in the evaluation. We find that each complex detection algorithm significantly improves its performance after the filtration of low-similarity scored PPIs. It is also observed that the percentage improvement and the filtration percentage (of low-confidence PPIs) are highly correlated.
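Illustrative sketch (not from the paper): the filtration step described in this abstract can be pictured as a simple threshold on a per-interaction confidence score. The function name, the toy scores, and the 0.5 cutoff below are all hypothetical; the GO-based semantic similarity values are assumed to be precomputed by some external measure.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def filter_low_confidence(scores: Dict[Edge, float], threshold: float) -> List[Edge]:
    """Keep only the interactions whose confidence score is at least `threshold`."""
    return [edge for edge, score in scores.items() if score >= threshold]


# Hypothetical confidence scores, standing in for a GO-based semantic
# similarity measure computed elsewhere.
toy_scores = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.2,
    ("B", "C"): 0.6,
    ("C", "D"): 0.1,
}

kept = filter_low_confidence(toy_scores, threshold=0.5)
# Fraction of interactions removed by the filter (the "filtration percentage").
removed_fraction = 1 - len(kept) / len(toy_scores)
print(kept, removed_fraction)
```

A complex detection algorithm would then be run on the filtered network; which algorithm and which cutoff to use is left open here.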
Affiliation(s)
- Madhusudan Paul
  - Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, Assam, India
  - Department of Computer and System Sciences, Visva-Bharati, Santiniketan 731235, West Bengal, India
- Ashish Anand
  - Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, Assam, India
|
16
|
SabziNezhad A, Jalili S. DPCT: A Dynamic Method for Detecting Protein Complexes From TAP-Aware Weighted PPI Network. Front Genet 2020; 11:567. [PMID: 32676097 PMCID: PMC7333736 DOI: 10.3389/fgene.2020.00567] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/25/2020] [Accepted: 05/11/2020] [Indexed: 12/13/2022] Open
Abstract
Detecting protein complexes from the protein-protein interaction (PPI) network is the essence of discovering the rules of the cellular world. A large amount of PPI data is available, generated by high-throughput experiments. The enormous size of the data persuaded us to use computational rather than experimental methods to detect protein complexes. In past years, many researchers have presented algorithms to detect protein complexes. Most of these algorithms use static PPI networks. Recent research has shown that cellular systems are dynamic, so the PPI network is not static over time. In this paper, we introduce DPCT to detect protein complexes from dynamic PPI networks. In the proposed method, TAP and GO data are used to build a weighted PPI network and to reduce the noise in the PPI data. Gene expression data are also used to build dynamic subnetworks from the PPI network. A memetic algorithm is used to bicluster gene expression data and to create a dynamic subnetwork for each bicluster. Experimental results show that DPCT can detect protein complexes with better correctness than state-of-the-art detection algorithms. The source code and datasets of DPCT can be found at https://github.com/alisn72/DPCT.
Affiliation(s)
- Ali SabziNezhad
  - Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
- Saeed Jalili
  - Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
|
17
|
CDAP: An Online Package for Evaluation of Complex Detection Methods. Sci Rep 2019; 9:12751. [PMID: 31485005 PMCID: PMC6726630 DOI: 10.1038/s41598-019-49225-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/16/2019] [Accepted: 08/21/2019] [Indexed: 01/21/2023] Open
Abstract
Methods for detecting protein complexes from protein-protein interaction networks are among the most critical computational approaches. Numerous methods have been proposed in this area, so it is necessary to evaluate them. Various metrics have been proposed to compare these methods. Nevertheless, it is essential to define new metrics that evaluate methods both qualitatively and quantitatively. In addition, there is no tool for the comprehensive comparison of such methods. In this paper, a new criterion is introduced that can fully evaluate protein complex detection algorithms. We introduce CDAP (Complex Detection Analyzer Package), an online package for comparing protein complex detection methods. CDAP can quickly rank the performance of methods based on previously defined as well as newly introduced criteria in various settings (4 PPI datasets and 3 gold standards). It is capable of integrating various methods and applying several filters to the results. CDAP can be easily extended to include new datasets, gold standards, and methods. Furthermore, the user can compare the results of a custom method with the results of existing methods. Thus, the authors of future papers can use CDAP to compare their method with previous ones. A case study is done on YGR198W, a well-known protein, and the detected clusters are compared to the known complexes of this protein.
|
18
|
Zhang F, Liu M, Li Q, Song FX. Exploration of attractor modules for sporadic amyotrophic lateral sclerosis via systemic module inference and attract method. Exp Ther Med 2019; 17:2575-2580. [PMID: 30906448 PMCID: PMC6425136 DOI: 10.3892/etm.2019.7264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2018] [Accepted: 02/01/2019] [Indexed: 12/01/2022] Open
Abstract
Sporadic amyotrophic lateral sclerosis (SALS) is a devastating neurodegenerative disorder. However, the understanding of SALS is still poor. This research aimed to find attractor modules for SALS by integrating the systemic module inference and attract methods. To achieve this, gene expression data and protein-protein interaction (PPI) data were collected and preprocessed. Then, based on the Spearman's correlation coefficient (SCC) of the interactions under the two conditions, two PPI networks were built: one with 870 nodes (979 interactions) for the normal control group and one with 601 nodes (777 interactions) for the SALS group. The systemic module inference method was used to identify modules, and the attract method was used to identify attractor modules. Finally, pathway enrichment analysis was performed to reveal the functional enrichment of these attractor modules. In total, 44 and 118 modules were identified for the normal control and SALS groups, respectively. Among them, 6 modules had similar gene composition in the two groups, and all 6 were classified as attractor modules by the attract method. These attractor modules might be potential biomarkers for early diagnosis and therapy of SALS, which could provide insight into the disease biology and suggest possible directions for drug screening programs.
Affiliation(s)
- Fang Zhang
  - Department of Rehabilitation, The Second Hospital of Lanzhou University, Lanzhou, Gansu 730030, P.R. China
- Mei Liu
  - Department of Rehabilitation, The Second Hospital of Lanzhou University, Lanzhou, Gansu 730030, P.R. China
- Qun Li
  - Department of Rehabilitation, The Second Hospital of Lanzhou University, Lanzhou, Gansu 730030, P.R. China
- Fei-Xue Song
  - Department of Oncology, The Second Hospital of Lanzhou University, Lanzhou, Gansu 730030, P.R. China
|
19
|
Lu X, Wu Z, Zhao XY, Li CF, Kan SF. Systematic tracking of altered modules identifies the key biomarkers involved in chronic lymphocytic leukemia. Oncol Lett 2019; 17:2351-2355. [PMID: 30675301 PMCID: PMC6341787 DOI: 10.3892/ol.2018.9812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2018] [Accepted: 11/27/2018] [Indexed: 11/26/2022] Open
Abstract
Key genes in chronic lymphocytic leukemia (CLL) were investigated by systematically tracking the dysregulated modules from protein-protein interaction (PPI) networks. Microarray data of normal subjects and CLL patients retrieved from the ArrayExpress database were used to extract differentially expressed genes (DEGs). Additionally, we re-weighted the PPI networks of the normal and CLL conditions by means of Pearson's correlation coefficient (PCC). Furthermore, a clique-merging method was applied to extract the modules, and the altered modules were then screened out. The intersection genes were selected from the miss and add genes in the altered modules. The common genes were screened from the intersection genes and the DEGs in CLL. A total of 734 DEGs were identified by statistical analysis. There were 1,805 and 703 modules in the normal and disease PPI networks, respectively. In addition, 875 altered modules were obtained, which included 145 miss genes, 353 add genes and 85 intersection genes. Finally, in-depth analysis revealed 9 genes shared between the intersection genes and the DEGs in CLL. Our analysis revealed several key genes associated with CLL by systematically tracking the dysregulated modules, which might be candidate targets for diagnosis and management of CLL.
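Illustrative sketch (not from the paper): re-weighting a PPI network by Pearson's correlation coefficient can be pictured as assigning each interaction the correlation between the expression profiles of its two genes. The gene names and expression values below are hypothetical, and taking the absolute value of the coefficient as the edge weight is an assumption of this sketch, not something the abstract states.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (norm_x * norm_y)


def reweight(edges: List[Tuple[str, str]],
             expression: Dict[str, Sequence[float]]) -> Dict[Tuple[str, str], float]:
    """Weight each interaction by |PCC| of its endpoints (assumed weighting)."""
    return {(u, v): abs(pearson(expression[u], expression[v])) for u, v in edges}


# Hypothetical expression profiles for one condition (e.g. normal samples).
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
toy_edges = [("g1", "g2"), ("g1", "g3")]
weights = reweight(toy_edges, toy_expression)
print(weights)
```

Running the same function once per condition would give one weighted network for normal samples and one for disease samples, which is the setting in which modules are then compared.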
Affiliation(s)
- Xin Lu
  - Department of Blood Transfusion, Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
- Zhen Wu
  - Department of Blood Transfusion, Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
- Xue-Ying Zhao
  - Department of Blood Transfusion, Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
- Chun-Feng Li
  - Department of Blood Transfusion, Second Hospital of Shandong University, Jinan, Shandong 250033, P.R. China
- Shi-Feng Kan
  - Department of Laboratory Medicine, Qilu Hospital of Shandong University, Jinan, Shandong 250012, P.R. China
|
20
|
Zahiri J, Emamjomeh A, Bagheri S, Ivazeh A, Mahdevar G, Sepasi Tehrani H, Mirzaie M, Fakheri BA, Mohammad-Noori M. Protein complex prediction: A survey. Genomics 2019; 112:174-183. [PMID: 30660789 DOI: 10.1016/j.ygeno.2019.01.011] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 10/10/2018] [Revised: 11/27/2018] [Accepted: 01/15/2019] [Indexed: 02/08/2023]
Abstract
Protein complexes are among the most important functional units driving biological processes within the cell. Experimental methods have provided valuable data to infer protein complexes. However, these methods have inherent limitations. Considering these limitations, many computational methods have been proposed to predict protein complexes in the last decade. Almost all of these in-silico methods predict protein complexes from the ever-increasing protein-protein interaction (PPI) data. These computational approaches usually take the PPI data as a large protein-protein interaction network (PPIN) and output various sub-networks of the given PPIN as the predicted protein complexes. Some of these methods have already reached a promising efficiency in protein complex detection. Nonetheless, there are challenges in the prediction of other types of protein complexes, especially sparse and small ones. New methods should further incorporate knowledge of the biological properties of proteins to improve performance. Additionally, there are several challenges that should be considered more effectively when designing new complex prediction algorithms in the future. This article not only reviews the history of computational protein complex prediction but also provides new insight for improving new methodologies. In this article, the most important computational methods for protein complex prediction are evaluated and compared. In addition, some of the challenges in the reconstruction of protein complexes are discussed. Finally, various tools for protein complex prediction and PPIN analysis, as well as the current high-throughput databases, are reviewed.
Affiliation(s)
- Javad Zahiri
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Abbasali Emamjomeh
  - Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology, University of Zabol, Zabol, Iran
- Samaneh Bagheri
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Asma Ivazeh
  - Database Research Group (DBRG), Control and intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Ghasem Mahdevar
  - Department of Mathematics, Faculty of Sciences, University of Isfahan, Isfahan, Iran
- Hessam Sepasi Tehrani
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Mehdi Mirzaie
  - Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran
- Barat Ali Fakheri
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Morteza Mohammad-Noori
  - School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran
|
21
|
Xu B, Li K, Zheng W, Liu X, Zhang Y, Zhao Z, He Z. Protein complexes identification based on GO attributed network embedding. BMC Bioinformatics 2018; 19:535. [PMID: 30572820 PMCID: PMC6302388 DOI: 10.1186/s12859-018-2555-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/22/2018] [Accepted: 11/30/2018] [Indexed: 01/19/2023] Open
Abstract
Background Identifying protein complexes from protein-protein interaction (PPI) networks is one of the most important tasks in proteomics. Existing computational methods try to incorporate a variety of biological evidence to enhance the quality of predicted complexes. However, it is still a challenge to integrate different types of biological information into the complex discovery process under a unified framework. Recently, attributed network embedding methods have been proved to be remarkably effective in generating vector representations for nodes in the network. In the transformed vector space, both the topological proximity and the node attribute affinity between different nodes are preserved. Therefore, such attributed network embedding methods provide a unified framework to integrate various biological evidence into the protein complex identification process. Results In this article, we propose a new method called GANE to predict protein complexes based on Gene Ontology (GO) attributed network embedding. Firstly, it learns the vector representation for each protein from a GO attributed PPI network. Based on the pair-wise vector representation similarity, a weighted adjacency matrix is constructed. Secondly, it uses the clique mining method to generate candidate cores. Seed cores are then obtained by ranking candidate cores based on their densities on the weighted adjacency matrix and removing redundant cores. For each seed core, its attachments are the proteins with a correlation score larger than a given threshold. The combination of a seed core and its attachment proteins is reported as a predicted protein complex by the GANE algorithm. For performance evaluation, we compared GANE with six protein complex identification methods on five yeast PPI networks. Experimental results show that GANE performs better than the competing algorithms in terms of different evaluation metrics.
Conclusions GANE provides a framework that integrates many valuable and different kinds of biological information into the task of protein complex identification. The protein vector representation learned from our attributed PPI network can also be used in other tasks, such as PPI prediction and disease gene prediction. Electronic supplementary material The online version of this article (10.1186/s12859-018-2555-x) contains supplementary material, which is available to authorized users.
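Illustrative sketch (not the GANE implementation): the core-attachment steps named in this abstract, ranking candidate cores by density on a weighted adjacency structure, dropping redundant cores, and attaching proteins whose score exceeds a threshold, can be outlined as follows. The weights, the overlap-based redundancy rule, the `max_overlap` value, and the use of mean edge weight to the core as the "correlation score" are all assumptions made for this example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List

Weights = Dict[FrozenSet[str], float]


def core_density(core: FrozenSet[str], w: Weights) -> float:
    """Mean pairwise weight inside a core (assumed density definition)."""
    pairs = list(combinations(sorted(core), 2))
    if not pairs:
        return 0.0
    return sum(w.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


def attachment_score(protein: str, core: FrozenSet[str], w: Weights) -> float:
    """Mean weight between a protein and the core members (assumed score)."""
    return sum(w.get(frozenset((protein, c)), 0.0) for c in core) / len(core)


def predict_complexes(candidate_cores: Iterable[FrozenSet[str]],
                      proteins: Iterable[str],
                      w: Weights,
                      threshold: float,
                      max_overlap: float = 0.5) -> List[FrozenSet[str]]:
    # Rank candidate cores from densest to sparsest.
    ranked = sorted(candidate_cores, key=lambda c: core_density(c, w), reverse=True)
    seeds: List[FrozenSet[str]] = []
    for core in ranked:
        # Hypothetical redundancy rule: skip cores that overlap an already
        # selected seed by more than `max_overlap` of the smaller core.
        if all(len(core & s) / min(len(core), len(s)) <= max_overlap for s in seeds):
            seeds.append(core)
    proteins = list(proteins)
    complexes = []
    for core in seeds:
        attachments = {p for p in proteins
                       if p not in core and attachment_score(p, core, w) > threshold}
        complexes.append(core | attachments)
    return complexes


# Hypothetical similarity weights between protein pairs.
toy_w: Weights = {
    frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.85, frozenset(("A", "D")): 0.7,
    frozenset(("B", "D")): 0.6, frozenset(("C", "D")): 0.1,
    frozenset(("D", "E")): 0.2,
}
toy_cores = [frozenset("ABC"), frozenset("BCD"), frozenset("DE")]
complexes = predict_complexes(toy_cores, "ABCDE", toy_w, threshold=0.4)
print(complexes)
```

In this toy run the core {B, C, D} is discarded as redundant with the denser core {A, B, C}, and D is then attached to that core because its mean weight to the core members exceeds the threshold.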
Affiliation(s)
- Bo Xu
  - School of Software Technology, Dalian University of Technology, No.321 Tuqiang Road, Economic Development Zone, Dalian, 116024, China
  - Key Laboratory for Ubiquitous Network and Service Software of Liaoning, Dalian, 116000, China
- Kun Li
  - School of Software Technology, Dalian University of Technology, No.321 Tuqiang Road, Economic Development Zone, Dalian, 116024, China
- Wei Zheng
  - College of Computer Science and Technology, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, 116024, China
  - College of Software, Dalian JiaoTong University, Dalian, 116000, China
- Xiaoxia Liu
  - College of Computer Science and Technology, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, 116024, China
- Yijia Zhang
  - College of Computer Science and Technology, Dalian University of Technology, No.2 Linggong Road, Ganjingzi District, Dalian, 116024, China
- Zhehuan Zhao
  - School of Software Technology, Dalian University of Technology, No.321 Tuqiang Road, Economic Development Zone, Dalian, 116024, China
  - Key Laboratory for Ubiquitous Network and Service Software of Liaoning, Dalian, 116000, China
- Zengyou He
  - School of Software Technology, Dalian University of Technology, No.321 Tuqiang Road, Economic Development Zone, Dalian, 116024, China
  - Key Laboratory for Ubiquitous Network and Service Software of Liaoning, Dalian, 116000, China
|
22
|
Abdulateef AH, Attea BA, Rashid AN, Al-Ani M. A new evolutionary algorithm with locally assisted heuristic for complex detection in protein interaction networks. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2018.09.031] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
|
23
|
Zhang W, Xu J, Li Y, Zou X. Integrating network topology, gene expression data and GO annotation information for protein complex prediction. J Bioinform Comput Biol 2018; 17:1950001. [PMID: 30803297 DOI: 10.1142/s021972001950001x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/06/2023]
Abstract
The prediction of protein complexes based on the protein interaction network is a fundamental task for the understanding of cellular life as well as the mechanisms underlying complex disease. A great number of methods have been developed to predict protein complexes based on protein-protein interaction (PPI) networks in recent years. However, because the high-throughput data obtained from experimental biotechnology are incomplete and usually contain a large number of spurious interactions, most of the network-based protein complex identification methods are sensitive to the reliability of the PPI network. In this paper, we propose a new method, Identification of Protein Complex based on Refined Protein Interaction Network (IPC-RPIN), which integrates the topology, gene expression profiles and GO functional annotation information to predict protein complexes from the reconstructed networks. To demonstrate the performance of the IPC-RPIN method, we evaluated it on three PPI networks of Saccharomyces cerevisiae and compared it with four state-of-the-art methods. The simulation results show that IPC-RPIN achieved a better result than the other methods on most of the measurements and is able to discover small protein complexes, which have traditionally been neglected.
Affiliation(s)
- Wei Zhang
  - School of Science, East China Jiaotong University, Nanchang 330013, P. R. China
- Jia Xu
  - School of Mechatronic Engineering, East China Jiaotong University, Nanchang 330013, P. R. China
- Yuanyuan Li
  - School of Mathematics and Statistics, Wuhan Institute of Technology in Wuhan, Wuhan 430072, P. R. China
- Xiufen Zou
  - School of Mathematics and Statistics, Wuhan University, Wuhan 430072, P. R. China
|
24
|
Performance evaluation measures for protein complex prediction. Genomics 2018; 111:1483-1492. [PMID: 30312661 DOI: 10.1016/j.ygeno.2018.10.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2018] [Revised: 09/25/2018] [Accepted: 10/04/2018] [Indexed: 02/01/2023]
Abstract
Protein complexes play a dominant role in cellular organization and function. Prediction of protein complexes from the network of physical interactions between proteins (PPI networks) has thus become one of the important research areas. Recently, many computational approaches have been developed to identify these complexes. Various performance assessment measures have been proposed for evaluating the efficiency of these methods. However, there are many inconsistencies in the definitions and usage of the measures across the literature. To address this issue, we have gathered and presented the most important performance evaluation measures and developed a tool, named CompEvaluator, to critically assess the protein complex prediction methods. The tool and documentation are publicly available at https://sourceforge.net/projects/compevaluator/files/.
|
25
|
Janani S, Ramyachitra D, Ranjani Rani R. PCD-DPPI: Protein complex detection from dynamic PPI using shuffled frog-leaping algorithm. GENE REPORTS 2018. [DOI: 10.1016/j.genrep.2018.06.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
|
26
|
Reciprocal Perspective for Improved Protein-Protein Interaction Prediction. Sci Rep 2018; 8:11694. [PMID: 30076341 PMCID: PMC6076239 DOI: 10.1038/s41598-018-30044-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/15/2018] [Accepted: 07/20/2018] [Indexed: 02/06/2023] Open
Abstract
All protein-protein interaction (PPI) predictors require the determination of an operational decision threshold when differentiating positive PPIs from negatives. Historically, a single global threshold, typically optimized via cross-validation testing, is applied to all protein pairs. However, we here use data visualization techniques to show that no single decision threshold is suitable for all protein pairs, given the inherent diversity of protein interaction profiles. The recent development of high throughput PPI predictors has enabled the comprehensive scoring of all possible protein-protein pairs. This, in turn, has given rise to context, enabling us now to evaluate a PPI within the context of all possible predictions. Leveraging this context, we introduce a novel modeling framework called Reciprocal Perspective (RP), which estimates a localized threshold on a per-protein basis using several rank order metrics. By considering a putative PPI from the perspective of each of the proteins within the pair, RP rescores the predicted PPI and applies a cascaded Random Forest classifier leading to improvements in recall and precision. We here validate RP using two state-of-the-art PPI predictors, the Protein-protein Interaction Prediction Engine and the Scoring PRotein INTeractions methods, over five organisms: Homo sapiens, Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, and Mus musculus. Results demonstrate the application of a post hoc RP rescoring layer significantly improves classification (p < 0.001) in all cases over all organisms and this new rescoring approach can apply to any PPI prediction method.
|
27
|
Yuan FC, Li B, Zhang LJ. Identification of differential modules in ankylosing spondylitis using systemic module inference and the attract method. Exp Ther Med 2018; 16:149-154. [PMID: 29977361 PMCID: PMC6030912 DOI: 10.3892/etm.2018.6134] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/15/2016] [Accepted: 04/28/2017] [Indexed: 02/03/2023] Open
Abstract
The objective of the present study was to identify differential modules in ankylosing spondylitis (AS) by integrating network analysis, module inference and the attract method. To achieve this objective, four steps were conducted. The first step was inference of the disease objective network (DON) for AS and the healthy objective network (HON) from gene expression data, protein-protein interaction networks and Spearman's correlation coefficient. In the second step, module detection was performed using a clique-merging algorithm, which comprised exploring maximal cliques with a clique algorithm and refining or merging maximal cliques with a high overlap. The third step was seed module evaluation through module pair matching by Jaccard score and module correlation density (MCD) calculation. Finally, in the fourth step, differential modules between the AS and healthy groups were identified based on a gene set enrichment analysis-analysis of variance model in the attract method. There were 5,301 nodes and 28,176 interactions in both the DON and the HON. A total of 20 and 21 modules were detected for the AS and healthy groups, respectively. Notably, six seed modules across the two groups were identified with Jaccard score ≥0.5, and these were ranked in descending order of differential MCD (ΔC). Seed module 1 had the highest ΔC of 0.077 and a Jaccard score of 1.000. Using the attract method, one differential module between the AS group and the healthy group was identified. In conclusion, the present study successfully identified one differential module for AS that may be a potential marker for AS target therapy and provide insights for future research on this disease.
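Illustrative sketch (not from the paper): matching module pairs across two conditions by Jaccard score, with the ≥0.5 cutoff mentioned in this abstract, can be written as below. The module contents and gene names are hypothetical, and the module correlation density step is not modeled here.

```python
from typing import Iterable, List, Set, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard score: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_modules(disease_modules: List[Set[str]],
                  healthy_modules: List[Set[str]],
                  cutoff: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (disease index, healthy index, score) for pairs at or above the cutoff."""
    pairs = []
    for i, d in enumerate(disease_modules):
        for j, h in enumerate(healthy_modules):
            score = jaccard(d, h)
            if score >= cutoff:
                pairs.append((i, j, score))
    return pairs


# Hypothetical modules detected in the disease and healthy networks.
toy_disease = [{"G1", "G2", "G3"}, {"G4", "G5"}]
toy_healthy = [{"G1", "G2", "G3"}, {"G4", "G6", "G7"}]
seed_pairs = match_modules(toy_disease, toy_healthy)
print(seed_pairs)
```

Only the first pair survives the cutoff in this toy example: the two identical modules score 1.0, while the second pair shares one gene out of four and scores 0.25.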
Affiliation(s)
- Fang-Chang Yuan
  - Department of Orthopedics, People's Hospital of Rizhao, Rizhao, Shandong 276826, P.R. China
- Bo Li
  - Department of Joint Surgery, Hospital of Xinjiang Production and Construction Corps, Urumchi, Xinjiang Uygur Autonomous Region 830002, P.R. China
- Li-Jun Zhang
  - Department of Orthopedics, The Fifth People's Hospital of Jinan, Jinan, Shandong 250022, P.R. China
|
28
|
Ma L, Du H, Chen G. Differential network as an indicator of osteoporosis with network entropy. Exp Ther Med 2018; 16:328-332. [PMID: 29896257 PMCID: PMC5995033 DOI: 10.3892/etm.2018.6169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2017] [Accepted: 05/10/2018] [Indexed: 02/02/2023] Open
Abstract
Osteoporosis is a common skeletal disorder characterized by a decrease in bone mass and density. The peak bone mass (PBM) is a significant determinant of osteoporosis. To gain insights into the indicating effect of PBM to osteoporosis, this study focused on characterizing the PBM networks and identifying key genes. One biological data set with 12 monocyte low PBM samples and 11 high PBM samples was derived to construct protein-protein interaction networks (PPINs). Based on clique-merging, module-identification algorithm was used to identify modules from PPINs. The systematic calculation and comparison were performed to test whether the network entropy can discriminate the low PBM network from high PBM network. We constructed 32 destination networks with 66 modules divided from monocyte low and high PBM networks. Among them, network 11 was the only significantly differential one (P<0.05) with 8 nodes and 28 edges. All genes belonged to precursors of osteoclasts, which were related to calcium transport as well as blood monocytes. In conclusion, based on the entropy in PBM PPINs, the differential network appears to be a novel therapeutic indicator for osteoporosis during the bone monocyte progression; these findings are helpful in disclosing the pathogenetic mechanisms of osteoporosis.
Affiliation(s)
- Lili Ma
  - Department of Orthopaedics, Hebei Cangzhou Central Hospital, Cangzhou, Hebei 061001, P.R. China
- Hongmei Du
  - Department of Orthopaedics, Hebei Cangzhou Central Hospital, Cangzhou, Hebei 061001, P.R. China
- Guangdong Chen
  - Department of Orthopaedics, Hebei Cangzhou Central Hospital, Cangzhou, Hebei 061001, P.R. China
|
29
|
Attea BA, Abdullah QZ. Improving the performance of evolutionary-based complex detection models in protein–protein interaction networks. Soft comput 2018. [DOI: 10.1007/s00500-017-2593-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
30
|
MTGO: PPI Network Analysis Via Topological and Functional Module Identification. Sci Rep 2018; 8:5499. [PMID: 29615773 PMCID: PMC5882952 DOI: 10.1038/s41598-018-23672-0] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 11/01/2017] [Accepted: 02/28/2018] [Indexed: 11/08/2022] Open
Abstract
Protein-protein interaction (PPI) networks are viable tools to understand cell functions, disease machinery, and drug design/repositioning. Interpreting a PPI network, however, is a particularly challenging task because of network complexity. Several algorithms have been proposed for automatic PPI interpretation, at first by solely considering the network topology, and later by integrating Gene Ontology (GO) terms as node similarity attributes. Here we present MTGO (Module detection via Topological information and GO knowledge), a novel functional module identification approach. MTGO reveals the biomolecular machinery underpinning PPI networks by leveraging both biological knowledge and topological properties. In particular, it directly exploits GO terms during the module assembling process, and labels each module with its best-fit GO term, easing its functional interpretation. MTGO shows markedly better results than other state-of-the-art algorithms (including recent GO-based ones) when searching for small or sparse functional modules, while providing comparable or better results in all other cases. MTGO correctly identifies molecular complexes and literature-consistent processes in an experimentally derived PPI network of myocardial infarction. A software version of MTGO is freely available for non-commercial purposes at https://gitlab.com/d1vella/MTGO.
|
31
|
Liu X, Yang Z, Zhou Z, Sun Y, Lin H, Wang J, Xu B. The impact of protein interaction networks’ characteristics on computational complex detection methods. J Theor Biol 2018; 439:141-151. [DOI: 10.1016/j.jtbi.2017.12.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/31/2017] [Revised: 11/29/2017] [Accepted: 12/03/2017] [Indexed: 11/25/2022]
|
32
|
Maruyama O, Kuwahara Y. RocSampler: regularizing overlapping protein complexes in protein-protein interaction networks. BMC Bioinformatics 2017; 18:491. [PMID: 29244010 PMCID: PMC5731504 DOI: 10.1186/s12859-017-1920-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022] Open
Abstract
Background: In recent years, protein-protein interaction (PPI) networks have been well recognized as important resources to elucidate various biological processes and cellular mechanisms. In this paper, we address the problem of predicting protein complexes from a PPI network. This problem has two difficulties. One is related to small complexes, which contain two or three components. It is relatively difficult to identify them due to their simpler internal structure, but unfortunately complexes of such sizes are dominant in major protein complex databases, such as CYC2008. The other difficulty is how to model overlaps between predicted complexes, that is, how to evaluate different predicted complexes sharing common proteins, because CYC2008 and other databases include such protein complexes. Thus, modeling overlaps between predicted complexes is critical to identifying them simultaneously. Results: In this paper, we propose a sampling-based protein complex prediction method, RocSampler (Regularizing Overlapping Complexes), which exploits, as part of the whole scoring function, a regularization term for the overlaps of predicted complexes and another for the distribution of sizes of predicted complexes. We have implemented RocSampler in MATLAB, and its executable file for Windows is available at http://imi.kyushu-u.ac.jp/~om/software/RocSampler/. Conclusions: We have applied RocSampler to five yeast PPI networks and shown that it is superior to other existing methods. This implies that the design of scoring functions including regularization terms is an effective approach for protein complex prediction.
Affiliation(s)
- Osamu Maruyama
  - Institute of Mathematics for Industry, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, 819-0395, Japan
- Yuki Kuwahara
  - Graduate School of Mathematics, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, 819-0395, Japan
|
33
|
Li H, Guo Q. Characterization of biomarkers in stroke based on ego-networks and pathways. Biotechnol Lett 2017; 39:1835-1842. [DOI: 10.1007/s10529-017-2430-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2017] [Accepted: 09/01/2017] [Indexed: 02/02/2023]
|
34
|
Protein Complexes Prediction Method Based on Core-Attachment Structure and Functional Annotations. Int J Mol Sci 2017; 18:ijms18091910. [PMID: 28878201 PMCID: PMC5618559 DOI: 10.3390/ijms18091910] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/25/2017] [Revised: 08/31/2017] [Accepted: 09/01/2017] [Indexed: 11/17/2022] Open
Abstract
Recent advances in high-throughput laboratory techniques have captured large-scale protein–protein interaction (PPI) data, making it possible to create a detailed map of protein interaction networks and thus to detect protein complexes from these PPI networks. However, most current state-of-the-art studies still have some problems, for instance, an inability to identify overlapping clusters, a failure to consider the inherent organization within protein complexes, and neglect of the biological meaning of complexes. Therefore, we present a novel overlapping protein complexes prediction method based on core–attachment structure and function annotations (CFOCM), which works in two stages. First, it detects protein complex cores with the maximum value of our defined cluster closeness function, in which the proteins are also closely related to at least one common function. Then it appends attachment proteins to these detected cores to form the returned complexes. For performance evaluation, CFOCM and six classical methods were used to identify protein complexes on three different yeast PPI networks, with three sets of real complexes selected as benchmark sets: the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD), and the Catalogues of Yeast protein Complexes (CYC2008). The results show that CFOCM is effective and robust, achieving the highest F-measure values in all tests.
|
35
|
Identifying protein complexes in PPI network using non-cooperative sequential game. Sci Rep 2017; 7:8410. [PMID: 28827597 PMCID: PMC5566343 DOI: 10.1038/s41598-017-08760-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/27/2017] [Accepted: 07/13/2017] [Indexed: 11/14/2022] Open
Abstract
Identifying protein complexes from protein-protein interaction (PPI) networks is an important and challenging task in computational biology, as it helps in better understanding cellular mechanisms in various organisms. In this paper we propose a non-cooperative sequential game based model for protein complex detection from PPI networks. The key hypothesis is that protein complex formation is driven by a mechanism that eventually optimizes the number of interactions within the complex, leading to a dense subgraph. The hypothesis is drawn from the observed network property known as small world. The proposed multi-player game model translates the hypothesis into game strategies. The Nash equilibrium of the game corresponds to a network partition where each protein either belongs to a complex or forms a singleton cluster. We further propose an algorithm to find the Nash equilibrium of the sequential game. Exhaustive experiments on synthetic benchmarks and real-life yeast networks evaluate the structural as well as the biological significance of the network partitions.
|
36
|
Qi J, Ma L, Wang X, Li Y, Wang K. Observation of significant biomarkers in osteosarcoma via integrating module-identification method with attract. Cancer Biomark 2017; 20:87-93. [PMID: 28759958 DOI: 10.3233/cbm-170144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
OBJECTIVE: Osteosarcoma (OS) is the most frequent type of bone malignancy, and this disease has a poor prognosis. We aimed to identify the significant genes related to OS by integrating a module-identification method with the attract approach. METHODS: OS-related microarray data E-GEOD-36001 were obtained from the ArrayExpress database, and the protein-protein interaction (PPI) networks of normal and OS samples were re-weighted by means of the Spearman correlation coefficient (SCC). Next, maximal cliques were detected from the re-weighted PPI networks using a clustering approach based on maximal cliques. Afterwards, highly overlapped cliques were merged according to their interconnectivity, followed by identification of candidate modules and seed modules. The attract method, proposed by Mar et al., can extract and annotate the gene sets that distinguish disease samples from control samples; the differences in these gene sets across the expression profiles of the samples are defined as attractors. Thus, we applied the attract method to extract differential modules from the seed modules, and these differential modules were defined as attractors. The genes in attractors were designated attractor genes. RESULTS: After eliminating the maximal cliques with fewer than 4 nodes, there were 1,884 and 528 maximal cliques in the normal and OS PPI networks, respectively, which were used to conduct module analysis. A total of 60 and 19 candidate modules were obtained in the control and OS PPI networks, respectively. Comparison with the normal group revealed 2 seed module pairs with similar gene composition. Significantly, based on the attract method, we found that these 2 modules were differential. Both modules contained 4 genes. Of note, the genes CCNB1 and KIF11 appeared in both attractors.
CONCLUSIONS: We successfully identified two attractors by integrating a module-identification method with the attract approach, and attractor genes such as CCNB1 and KIF11 might play pathophysiological roles in OS development and progression.
Affiliation(s)
- Jie Qi
  - Department of Orthopaedics, Shaanxi Provincial People's Hospital, Xi'an 710068, Shaanxi, China
- Liang Ma
  - Department of Orthopaedics, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, Shandong, China
- Xiaogang Wang
  - Out-patient Department, Affiliated Tumor Hospital of Xinjiang Medical University, Wulumuqi 830011, Xinjiang, China
- Ying Li
  - Beijing Spirallink Medical Research Institute, Beijing 100054, China
- Kejun Wang
  - Department of Orthopaedics, Jingzhou Central Hospital, Jingzhou 434020, Hubei, China
|
37
|
Wei ST, Sun YH, Zong SH. A novel method to identify hub pathways of rheumatoid arthritis based on differential pathway networks. Mol Med Rep 2017; 16:3187-3193. [PMID: 28713940 PMCID: PMC5547957 DOI: 10.3892/mmr.2017.6985] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/10/2015] [Accepted: 11/08/2016] [Indexed: 12/29/2022] Open
Abstract
The aim of the current study was to identify hub pathways of rheumatoid arthritis (RA) using a novel method based on differential pathway network (DPN) analysis. The present study proposed a DPN in which a protein-protein interaction (PPI) network was integrated with pathway-pathway interactions. Pathway data were obtained from the background PPI network and the Reactome pathway database. Subsequently, pathway interactions were extracted from the pathway data by building randomized gene-gene interactions, and a weight value was assigned to each pathway interaction using the Spearman correlation coefficient (SCC) to identify differential pathway interactions. Differential pathway interactions were visualized using Cytoscape to construct a DPN. Topological analysis was conducted to identify hub pathways, defined as those in the top 5% of the DPN degree distribution. Modules of the DPN were mined using ClusterONE. A total of 855 pathways were selected to build pathway interactions. By retaining pathway interactions with weight values >0.7, a DPN with 312 nodes and 791 edges was obtained. Topological degree analysis revealed 15 hub pathways for RA based on the DPN, such as heparan sulfate/heparin-glycosaminoglycan (HS-GAG) degradation, HS-GAG metabolism and keratan sulfate degradation. Furthermore, hub pathways were also important in modules, which validated the significance of hub pathways. In conclusion, the proposed method is a computationally efficient way to identify hub pathways of RA; it identified 15 hub pathways that may be potential biomarkers and provide insight for future investigation and treatment of RA.
Affiliation(s)
- Shi-Tong Wei
  - Department of Rheumatology, Yantai Yantaishan Hospital, Yantai, Shandong 264000, P.R. China
- Yong-Hua Sun
  - Department of Rheumatology, Yantai Yantaishan Hospital, Yantai, Shandong 264000, P.R. China
- Shi-Hua Zong
  - Department of Rheumatology, Yantai Yantaishan Hospital, Yantai, Shandong 264000, P.R. China
|
38
|
Maddi AMA, Eslahchi C. Discovering overlapped protein complexes from weighted PPI networks by removing inter-module hubs. Sci Rep 2017; 7:3247. [PMID: 28607455 PMCID: PMC5468366 DOI: 10.1038/s41598-017-03268-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/27/2016] [Accepted: 04/26/2017] [Indexed: 12/21/2022] Open
Abstract
Detecting known protein complexes and predicting undiscovered protein complexes from protein-protein interaction (PPI) networks helps us to understand the principles of cell organization and its functions. Nevertheless, the experimental discovery of protein complexes remains incomplete, so computational methods are useful for overcoming the experimental limitations. Extracting protein complexes from a PPI network is, however, often nontrivial. Two major constraints are the large amount of noise and the neglect of the occurrence times of different interactions in the PPI network. In this paper, an efficient algorithm, Inter Module Hub Removal Clustering (IMHRC), is developed based on inter-module hub removal in the weighted PPI network, which can detect overlapped complexes. By removing some of the inter-module hubs and module hubs, IMHRC eliminates a large amount of noise in the dataset and implicitly considers the different occurrence times of the PPIs in the network. The performance of IMHRC was evaluated on several benchmark datasets, and the results were compared with some of the state-of-the-art models. The protein complexes discovered with the IMHRC method show significantly better agreement with the real complexes than those of other current methods. Our algorithm provides an accurate and scalable method for detecting and predicting protein complexes from PPI networks.
Affiliation(s)
- A M A Maddi
  - Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 1983963113, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, 193955746, Iran
- Ch Eslahchi
  - Department of Computer Sciences, Faculty of Mathematics, Shahid Beheshti University, Tehran, 1983963113, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, 193955746, Iran
|
39
|
Attractors of hypertrophic cardiomyopathy using maximal cliques and attract methods. Comput Biol Chem 2017; 67:194-199. [DOI: 10.1016/j.compbiolchem.2017.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2016] [Revised: 01/06/2017] [Accepted: 01/16/2017] [Indexed: 10/20/2022]
|
40
|
Vella D, Zoppis I, Mauri G, Mauri P, Di Silvestre D. From protein-protein interactions to protein co-expression networks: a new perspective to evaluate large-scale proteomic data. EURASIP J Bioinform Syst Biol 2017; 2017:6. [PMID: 28477207 PMCID: PMC5359264 DOI: 10.1186/s13637-017-0059-z] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 11/18/2016] [Accepted: 03/09/2017] [Indexed: 12/19/2022]
Abstract
The reductionist approach of dissecting biological systems into their constituents was successful in the first stage of molecular biology in elucidating the chemical basis of several biological processes. This knowledge helped biologists to understand the complexity of biological systems, showing that most biological functions do not arise from individual molecules and that the emergent properties of biological systems therefore cannot be explained or predicted by investigating individual molecules without taking their relations into consideration. Thanks to improvements in current -omics technologies and the increasing understanding of molecular relationships, ever more studies are evaluating biological systems through approaches based on graph theory. Genomic and proteomic data are often combined with protein-protein interaction (PPI) networks, whose structure is routinely analyzed by algorithms and tools to characterize hubs/bottlenecks and topological, functional, and disease modules. On the other hand, co-expression networks represent a complementary procedure that allows system-level evaluation, including for organisms that lack information on PPIs. Based on these premises, we introduce the reader to PPI and co-expression networks, including aspects of reconstruction and analysis. In particular, the new idea of evaluating large-scale proteomic data by means of co-expression networks is discussed, with some examples of application. Their use to infer biological knowledge is shown, and special attention is devoted to topological and module analysis.
Affiliation(s)
- Danila Vella
  - Institute for Biomedical Technologies - National Research Council (ITB-CNR), 93 Fratelli Cervi, Segrate, Milan, Italy
  - Department of Computer Science, Systems and Communication DiSCo, University of Milano-Bicocca, 336 Viale Sarca, Milan, Italy
- Italo Zoppis
  - Department of Computer Science, Systems and Communication DiSCo, University of Milano-Bicocca, 336 Viale Sarca, Milan, Italy
- Giancarlo Mauri
  - Department of Computer Science, Systems and Communication DiSCo, University of Milano-Bicocca, 336 Viale Sarca, Milan, Italy
- Pierluigi Mauri
  - Institute for Biomedical Technologies - National Research Council (ITB-CNR), 93 Fratelli Cervi, Segrate, Milan, Italy
- Dario Di Silvestre
  - Institute for Biomedical Technologies - National Research Council (ITB-CNR), 93 Fratelli Cervi, Segrate, Milan, Italy
|
41
|
Cardioprotection Effects of Sevoflurane by Regulating the Pathway of Neuroactive Ligand-Receptor Interaction in Patients Undergoing Coronary Artery Bypass Graft Surgery. Comput Math Methods Med 2017; 2017:3618213. [PMID: 28348638 PMCID: PMC5350303 DOI: 10.1155/2017/3618213] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/20/2016] [Revised: 02/10/2017] [Accepted: 02/19/2017] [Indexed: 01/17/2023]
Abstract
This study was designed to identify attractor modules and further reveal the potential biological processes involved in sevoflurane-induced anesthesia in patients treated with coronary artery bypass graft (CABG) surgery. Microarray profile data (ID: E-GEOD-4386) on atrial samples obtained from patients receiving the anesthetic gas sevoflurane prior to and following the CABG procedure were downloaded from the EMBL-EBI database for further analysis. Protein-protein interaction (PPI) networks of the baseline and sevoflurane groups were inferred and reweighted according to the Spearman correlation coefficient (SCC), followed by systematic module inference using a clique-merging approach. Subsequently, the attract method was used to explore attractor modules. Finally, pathway enrichment analyses for genes in the attractor modules were performed to illuminate the biological processes in the sevoflurane group. Using the clique-merging approach, 27 and 36 modules were obtained from the PPI networks of baseline and sevoflurane-treated samples, respectively. Comparison with the baseline condition identified 5 module pairs with the same gene composition. Subsequently, 1 of the 5 modules was identified as an attractor based on the attract method. Additionally, pathway analysis indicated that genes in the attractor module were associated with neuroactive ligand-receptor interaction. Accordingly, sevoflurane might exert important functions in cardioprotection in patients following CABG, partially through regulating the pathway of neuroactive ligand-receptor interaction.
|
42
|
Chen JY, Pandey R, Nguyen TM. HAPPI-2: a Comprehensive and High-quality Map of Human Annotated and Predicted Protein Interactions. BMC Genomics 2017; 18:182. [PMID: 28212602 PMCID: PMC5314692 DOI: 10.1186/s12864-017-3512-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/08/2015] [Accepted: 01/24/2017] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND: Human protein-protein interaction (PPI) data are essential to network and systems biology studies. PPI data can help biochemists hypothesize how proteins form complexes by binding to each other, how extracellular signals propagate through post-translational modification of de-activated signaling molecules, and how chemical reactions are coupled by enzymes involved in a complex biological process. Our capability to develop good public database resources for human PPI data has a direct impact on the quality of future research on genome biology and medicine. RESULTS: The database of Human Annotated and Predicted Protein Interactions (HAPPI) version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPIs) linked by 23,060 human proteins, making it the most comprehensive database covering human PPI data today. These PPIs contain both physical/direct interactions and high-quality functional/indirect interactions. Compared with the HAPPI 1.0 database release, HAPPI database version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides users with a friendly search, curation, and data retrieval interface, allowing them to retrieve human PPIs and available annotation information on the interaction type, interaction quality, interacting partner drug targeting data, and disease information. The updated HAPPI-2 can be freely accessed by academic users at http://discovery.informatics.uab.edu/HAPPI. CONCLUSIONS: While the underlying data for HAPPI-2 are integrated from diverse data sources, the new HAPPI-2 release represents a good balance between data coverage and data quality of human PPIs, making it ideally suited for network biology.
Affiliation(s)
- Jake Y Chen
  - Wenzhou Medical University First Affiliate Hospital, Wenzhou, Zhejiang Province, China
  - Medeolinx, LLC, Indianapolis, IN, 46280, USA
  - The Informatics Institute, University of Alabama at Birmingham School of Medicine, Birmingham, AL, 35294, USA
  - Indiana Center for Systems Biology and Personalized Medicine, Indiana University School of Informatics and Computing, Indianapolis, IN, 46202, USA
- Thanh M Nguyen
  - Indiana Center for Systems Biology and Personalized Medicine, Indiana University School of Informatics and Computing, Indianapolis, IN, 46202, USA
|
43
|
Liu X, Li C, Zhang L, Shi X, Wu S. Personalized Identification of Differentially Expressed Modules in Osteosarcoma. Med Sci Monit 2017; 23:774-779. [PMID: 28190021 PMCID: PMC5319443 DOI: 10.12659/msm.899638] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND: Osteosarcoma (OS), an aggressive malignant neoplasm, is the most common primary bone cancer, occurring mainly in adolescents and young adults. Differentially expressed modules tend to capture differences as a whole. Identifying modules at the individual level is crucial for understanding OS mechanisms and for applying customized therapeutic decisions in the future. MATERIAL AND METHODS: Samples from individuals in the control group (n=15) and the OS group (n=84) were used. A module-identification algorithm based on clique-merging was used to identify modules from OS PPI networks. A novel approach, the individualized module aberrance score (iMAS), was used to distinguish differences, making special use of accumulated normal samples (ANS). We performed biological process ontology analysis to classify the modules functionally. A Support Vector Machine (SVM) was then used to test how well the screened modules separated the normal and OS groups. RESULTS: We identified 83 modules containing 2084 genes from the PPI network, of which 61 modules were significantly different. Cluster analysis of OS using the iMAS method identified 5 module clusters. A specificity of 1.00 and a sensitivity of 1.00 showed that the classification obtained with the screened modules was largely consistent with that obtained with the total data, which suggested the effectiveness of the 61 modules. CONCLUSIONS: We developed a novel pipeline that identifies dysregulated modules in individuals with OS. The constructed process is expected to aid personalized health care and may suggest fruitful strategies for medical therapy.
Affiliation(s)
- Xiaozhou Liu
  - Department of Orthopedics, Jinling Hospital affiliated to Nanjing University, Nanjing, Jiangsu, China (mainland)
- Chengjun Li
  - Department of Orthopedics, Jinling Hospital affiliated to Nanjing University, Nanjing, Jiangsu, China (mainland)
- Lei Zhang
  - Department of Orthopedics, Jinling Hospital affiliated to Nanjing University, Nanjing, Jiangsu, China (mainland)
- Xin Shi
  - Department of Orthopedics, Jinling Hospital affiliated to Nanjing University, Nanjing, Jiangsu, China (mainland)
- Sujia Wu
  - Department of Orthopedics, Jinling Hospital affiliated to Nanjing University, Nanjing, Jiangsu, China (mainland)
|
44
|
Pellegrini M, Baglioni M, Geraci F. Protein complex prediction for large protein protein interaction networks with the Core&Peel method. BMC Bioinformatics 2016; 17:372. [PMID: 28185552 PMCID: PMC5123419 DOI: 10.1186/s12859-016-1191-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 02/01/2023] Open
Abstract
Background: Biological networks play an increasingly important role in the exploration of functional modularity and cellular organization at a systemic level. Quite often the first tools used to analyze these networks are clustering algorithms. We concentrate here on the specific task of predicting protein complexes (PC) in large protein-protein interaction networks (PPIN). Currently, many state-of-the-art algorithms work well for networks of small or moderate size. However, their performance on much larger networks, which are becoming increasingly common in modern proteome-wide studies, needs to be re-assessed. Results and discussion: We present a new fast algorithm for clustering large sparse networks: Core&Peel, which runs essentially in time and storage O(a(G)m+n) for a network G of n nodes and m arcs, where a(G) is the arboricity of G (which is roughly proportional to the maximum average degree of any induced subgraph in G). We evaluated Core&Peel on five PPI networks of large size and one of medium size from both yeast and Homo sapiens, comparing its performance against those of ten state-of-the-art methods. We demonstrate that Core&Peel consistently outperforms the ten competitors in its ability to identify known protein complexes and in the functional coherence of its predictions. Our method is remarkably robust, being quite insensitive to the injection of random interactions. Core&Peel is also empirically efficient, attaining the second-best running time over large networks among the tested algorithms. Conclusions: Our algorithm Core&Peel pushes forward the state of the art in PPIN clustering, providing an algorithmic solution with polynomial running time that attains experimentally demonstrable good output quality and speed on challenging large real networks.
Affiliation(s)
- Marco Pellegrini
  - Laboratory for Integrative Systems Medicine - Istituto di Informatica e Telematica and Istituto di Fisiologia Clinica del CNR, via Moruzzi 1, Pisa, 56124, Italy
- Miriam Baglioni
  - Laboratory for Integrative Systems Medicine - Istituto di Informatica e Telematica and Istituto di Fisiologia Clinica del CNR, via Moruzzi 1, Pisa, 56124, Italy
- Filippo Geraci
  - Laboratory for Integrative Systems Medicine - Istituto di Informatica e Telematica and Istituto di Fisiologia Clinica del CNR, via Moruzzi 1, Pisa, 56124, Italy
|
45
|
Rudashevskaya EL, Sickmann A, Markoutsa S. Global profiling of protein complexes: current approaches and their perspective in biomedical research. Expert Rev Proteomics 2016; 13:951-964. [PMID: 27602509 DOI: 10.1080/14789450.2016.1233064] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]
Abstract
INTRODUCTION: Despite the rapid evolution of proteomic methods, protein interactions and their participation in protein complexes, an important aspect of their function, have rarely been investigated at the proteome-wide level. Disease states, such as muscular dystrophy or viral infection, are induced by interference in protein-protein interactions within complexes. The purpose of this review is to describe the current methods for global complexome analysis and to critically discuss the challenges and opportunities for the application of these methods in biomedical research. Areas covered: We discuss advancements in experimental techniques and computational tools that facilitate profiling of the complexome. The main focus is on the separation of native protein complexes via size exclusion chromatography and gel electrophoresis, which has recently been combined with quantitative mass spectrometry for global protein-complex profiling. The development of this approach has been supported by advanced bioinformatics strategies and fast and sensitive mass spectrometers that have allowed the analysis of whole cell lysates. The application of this technique to biomedical research is assessed, and future directions are anticipated. Expert commentary: The methodology is quite new, and has already shown great potential when combined with complementary methods for detection of protein complexes.
Affiliation(s)
- Elena L Rudashevskaya
  - Department of Bioanalytics, Leibniz-Institut für Analytische Wissenschaften - ISAS eV, Dortmund, Germany
- Albert Sickmann
  - Department of Bioanalytics, Leibniz-Institut für Analytische Wissenschaften - ISAS eV, Dortmund, Germany
  - Medizinisches Proteom-Center, Ruhr-Universität Bochum, Bochum, Germany
  - School of Natural & Computing Sciences, Department of Chemistry, University of Aberdeen, Aberdeen, UK
- Stavroula Markoutsa
  - Department of Bioanalytics, Leibniz-Institut für Analytische Wissenschaften - ISAS eV, Dortmund, Germany
|
46
|
Su L, Meng X, Ma Q, Bai T, Liu G. LPRP: A Gene-Gene Interaction Network Construction Algorithm and Its Application in Breast Cancer Data Analysis. Interdiscip Sci 2016; 10:131-142. [PMID: 27640171 PMCID: PMC5838217 DOI: 10.1007/s12539-016-0185-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/23/2016] [Revised: 08/25/2016] [Accepted: 09/06/2016] [Indexed: 10/30/2022]
Abstract
The importance of constructing gene-gene interaction (GGI) networks to better understand breast cancer has previously been highlighted. In this study, we propose a novel GGI network construction method called linear and probabilistic relations prediction (LPRP) and use it to gain system-level insight into breast cancer mechanisms. We construct separate genome-wide GGI networks for tumor and normal breast samples by applying LPRP to their gene expression datasets profiled by The Cancer Genome Atlas. According to our analysis, the tumor GGI network showed a large loss of gene interactions (7436; 88.7% reduction) and contained fewer functional genes (4757; 32% reduction) than the normal network. The tumor GGI network was characterized by a larger network diameter and a longer characteristic path length but a smaller clustering coefficient and much sparser network connections. In addition, many known cancer pathways, especially immune response pathways, are enriched in genes in the tumor GGI network. Furthermore, potential cancer genes, which may serve as drug-target genes, are identified in this study. These findings will allow for a better understanding of breast cancer mechanisms.
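The three topological measures named in this abstract (network diameter, characteristic path length, clustering coefficient) can be illustrated with a minimal sketch. It does not reproduce LPRP or the TCGA networks: the two toy edge lists and all function names below are hypothetical, and path statistics are taken over reachable node pairs only, which is an assumption.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def path_metrics(adj):
    """Return (diameter, characteristic path length) over reachable pairs."""
    lengths = []
    for s in adj:
        d = bfs_distances(adj, s)
        lengths.extend(l for t, l in d.items() if t != s)
    return max(lengths), sum(lengths) / len(lengths)

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for node, nbs in adj.items():
        k = len(nbs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

# Hypothetical toy networks: a denser one and a sparser one on the same nodes.
dense = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
sparse = build_graph([("A", "B"), ("B", "C"), ("C", "D")])

for name, g in (("dense", dense), ("sparse", sparse)):
    diameter, cpl = path_metrics(g)
    print(name, diameter, round(cpl, 3), round(average_clustering(g), 3))
```

In this toy case the sparser network has the larger diameter, the longer characteristic path length and the smaller clustering coefficient, which is the same qualitative pattern the abstract reports for the tumor network.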
Affiliation(s)
- Lingtao Su
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China.,Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
| | - Xiangyu Meng
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China. .,Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.
| | - Qingshan Ma
- The First Clinical Hospital of Jilin University, Changchun, 130021, China
| | - Tian Bai
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China.,Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
| | - Guixia Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130012, China. .,Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.
| |
|
47
|
Zhang MH, Shen QH, Qin ZM, Wang QL, Chen X. Systematic tracking of disrupted modules identifies significant genes and pathways in hepatocellular carcinoma. Oncol Lett 2016; 12:3285-3295. [PMID: 27899995 PMCID: PMC5103943 DOI: 10.3892/ol.2016.5039] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/06/2015] [Accepted: 07/12/2016] [Indexed: 12/17/2022] Open
Abstract
The objective of the present study is to identify significant genes and pathways associated with hepatocellular carcinoma (HCC) by systematically tracking the dysregulated modules of re-weighted protein-protein interaction (PPI) networks. Firstly, normal and HCC PPI networks were inferred and re-weighted based on Pearson correlation coefficient. Next, modules in the PPI networks were explored by a clique-merging algorithm, and disrupted modules were identified utilizing a maximum weight bipartite matching in non-increasing order. Then, the gene compositions of the disrupted modules were studied and compared with differentially expressed (DE) genes, and pathway enrichment analysis for these genes was performed based on Expression Analysis Systematic Explorer. Finally, validations of significant genes in HCC were conducted using reverse transcription-quantitative polymerase chain reaction (RT-qPCR) analysis. The present study evaluated 394 disrupted module pairs, which comprised 236 dysregulated genes. When the dysregulated genes were compared with 211 DE genes, a total of 26 common genes [including phospholipase C beta 1, cytochrome P450 (CYP) 2C8 and CYP2B6] were obtained. Furthermore, 6 of these 26 common genes were validated by RT-qPCR. Pathway enrichment analysis of dysregulated genes demonstrated that neuroactive ligand-receptor interaction, purine and drug metabolism, and metabolism of xenobiotics mediated by CYP were significantly disrupted pathways. In conclusion, the present study greatly improved the understanding of HCC in a systematic manner and provided potential biomarkers for early detection and novel therapeutic methods.
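The re-weighting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the gene labels and expression values are made up, the helper names are hypothetical, and using the absolute value of the Pearson coefficient as the edge weight is an assumption the abstract does not confirm.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return cov / (sx * sy)

def reweight(edges, expression):
    """Assign each PPI edge the absolute Pearson correlation (an assumption)
    between the expression profiles of its two genes."""
    return {
        (u, v): abs(pearson(expression[u], expression[v]))
        for u, v in edges
        if u in expression and v in expression
    }

# Hypothetical toy data: three genes measured in four samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, 3.0, 2.0, 4.0],
}
ppi_edges = [("g1", "g2"), ("g1", "g3")]

weights = reweight(ppi_edges, expression)
for edge, w in weights.items():
    print(edge, round(w, 3))
```

Running the same function on the normal and the disease expression matrices would give two differently weighted copies of one PPI network, which is the input the module comparison in the abstract relies on.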
Affiliation(s)
- Meng-Hui Zhang
- Department of General Surgery, The Fourth Hospital of Jinan, Jinan, Shandong 250031, P.R. China
| | - Qin-Hai Shen
- Department of Medicine, Shandong Medical College, Jinan, Shandong 250002, P.R. China
| | - Zhao-Min Qin
- Department of Nursing, Shandong Medical College, Jinan, Shandong 250002, P.R. China
| | - Qiao-Ling Wang
- Department of Ophthalmology, The Second Hospital of Jinan, Jinan, Shandong 250022, P.R. China
| | - Xi Chen
- Department of Ophthalmology, The Ninth Hospital of Chongqing, Chongqing 400700, P.R. China
| |
|
48
|
Smith SEP, Neier SC, Reed BK, Davis TR, Sinnwell JP, Eckel-Passow JE, Sciallis GF, Wieland CN, Torgerson RR, Gil D, Neuhauser C, Schrum AG. Multiplex matrix network analysis of protein complexes in the human TCR signalosome. Sci Signal 2016; 9:rs7. [PMID: 27485017 DOI: 10.1126/scisignal.aad7279] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
Multiprotein complexes transduce cellular signals through extensive interaction networks, but the ability to analyze these networks in cells from small clinical biopsies is limited. To address this, we applied an adaptable multiplex matrix system to physiologically relevant signaling protein complexes isolated from a cell line or from human patient samples. Focusing on the proximal T cell receptor (TCR) signalosome, we assessed 210 pairs of PiSCES (proteins in shared complexes detected by exposed surface epitopes). Upon stimulation of Jurkat cells with superantigen-loaded antigen-presenting cells, this system produced high-dimensional data that enabled visualization of network activity. A comprehensive analysis platform generated PiSCES biosignatures by applying unsupervised hierarchical clustering, principal component analysis, an adaptive nonparametric with empirical cutoff analysis, and weighted correlation network analysis. We generated PiSCES biosignatures from 4-mm skin punch biopsies from control patients or patients with the autoimmune skin disease alopecia areata. This analysis distinguished disease patients from the controls, detected enhanced basal TCR signaling in the autoimmune patients, and identified a potential signaling network signature that may be indicative of disease. Thus, generation of PiSCES biosignatures represents an approach that can provide information about the activity of protein signaling networks in samples including low-abundance primary cells from clinical biopsies.
Affiliation(s)
- Stephen E P Smith
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Steven C Neier
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Brendan K Reed
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Tessa R Davis
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Jason P Sinnwell
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Jeanette E Eckel-Passow
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | | | | | | | - Diana Gil
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Claudia Neuhauser
- University of Minnesota Informatics Institute, University of Minnesota, Minneapolis, MN 55455, USA.
| | - Adam G Schrum
- Department of Immunology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.
| |
|
49
|
Lakizadeh A, Jalili S. BiCAMWI: A Genetic-Based Biclustering Algorithm for Detecting Dynamic Protein Complexes. PLoS One 2016; 11:e0159923. [PMID: 27462706 PMCID: PMC4963120 DOI: 10.1371/journal.pone.0159923] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/26/2016] [Accepted: 07/11/2016] [Indexed: 01/08/2023] Open
Abstract
Considering the roles of protein complexes in many biological processes in the cell, detection of protein complexes from available protein-protein interaction (PPI) networks is a key challenge in the post-genome era. Despite the high dynamicity of cellular systems and the dynamic interactions between proteins in a cell, most computational methods have focused on static networks, which cannot represent the inherent dynamicity of protein interactions. Recently, some researchers have tried to exploit the dynamicity of PPI networks by constructing a set of dynamic PPI subnetworks corresponding to each time point (column) in a gene expression dataset. However, many genes can participate in multiple biological processes, and cellular processes are not necessarily related to every sample but might be relevant only for a subset of samples. It is therefore more informative to explore each subnetwork based on a subset of genes and conditions (i.e., biclusters) in gene expression data. Here, we present a new method, called BiCAMWI, to employ dynamicity in detecting protein complexes. The preprocessing phase of the proposed method is based on a novel genetic algorithm that extracts sets of genes that are co-regulated under some conditions from the input gene expression data. Each extracted gene set is called a bicluster. In the detection phase, dynamic PPI subnetworks are then extracted from the input static PPI network based on the biclusters. Protein complexes are identified by applying a detection method to each dynamic PPI subnetwork and aggregating the results. Experimental results confirm that BiCAMWI effectively models the dynamicity inherent in static PPI networks and achieves significantly better results than state-of-the-art methods. We therefore suggest BiCAMWI as a more reliable method for protein complex detection.
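The detection phase described above (one dynamic subnetwork per bicluster, a detection method applied to each, results aggregated) can be sketched roughly as below. This is not BiCAMWI itself: the genetic biclustering step is omitted and the biclusters are given as hypothetical input, the protein labels are made up, and connected components stand in as a placeholder for the unspecified complex-detection method.

```python
def induced_subnetwork(edges, genes):
    """Keep only the static PPI edges whose two endpoints are in the bicluster."""
    genes = set(genes)
    return [(u, v) for u, v in edges if u in genes and v in genes]

def connected_components(edges):
    """Placeholder detection method: connected components of the subnetwork."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(frozenset(comp))
    return components

def detect_complexes(edges, biclusters, min_size=3):
    """Run the placeholder detector on each bicluster's subnetwork and
    aggregate the results as a set, so duplicates collapse."""
    found = set()
    for genes, _conditions in biclusters:
        subnetwork = induced_subnetwork(edges, genes)
        for comp in connected_components(subnetwork):
            if len(comp) >= min_size:
                found.add(comp)
    return found

# Hypothetical static PPI network and two biclusters (genes, conditions).
static_ppi = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p3", "p4"),
              ("p4", "p5"), ("p5", "p6"), ("p4", "p6")]
biclusters = [({"p1", "p2", "p3"}, {"t1", "t2"}),
              ({"p4", "p5", "p6", "p1"}, {"t3"})]

complexes = detect_complexes(static_ppi, biclusters)
print(sorted(sorted(c) for c in complexes))
```

On the full static network, the p3-p4 edge would join all six proteins into one component; restricting to each bicluster separates them into two candidate complexes, which is the effect the abstract attributes to the dynamic subnetworks.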
Affiliation(s)
- Amir Lakizadeh
- Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
| | - Saeed Jalili
- Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
| |
|
50
|
Identifying disrupted pathways by tracking altered modules in type 2 DM-related heart failure. Herz 2016; 42:98-106. [PMID: 27363418 DOI: 10.1007/s00059-016-4445-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/18/2015] [Revised: 04/21/2016] [Accepted: 05/13/2016] [Indexed: 10/21/2022]
Abstract
BACKGROUND: This study aimed to screen disrupted pathways in type 2 diabetes mellitus (T2DM) heart failure by systematically tracking the altered modules of reweighted protein-protein interaction (PPI) networks. METHODS: We implemented systematic identification and comparison of modules across non-T2DM and T2DM heart failure subjects by integrating gene expression data and PPI networks. The PPI networks of non-T2DM heart failure and T2DM heart failure were constructed and reweighted by means of Spearman's correlation coefficient (SCC). Subsequently, a clique-merging algorithm was used to explore the modules in the PPI network, followed by the identification of disrupted modules based on a maximum-weight bipartite matching and sorting in descending order. Finally, pathway enrichment analyses were conducted for genes in disrupted modules to determine the biological pathways in T2DM heart failure. RESULTS: By comparing the modules of non-T2DM heart failure and T2DM heart failure, 804 disrupted modules were explored. The genes in disrupted modules were significantly enriched in 39 categories (p < 1.00E-06). Of these, the most significant pathways were the focal adhesion, vascular endothelial growth factor (VEGF) signaling, and mitogen-activated protein kinase (MAPK) signaling pathways. CONCLUSION: The identified disrupted pathways (focal adhesion, VEGF signaling, and MAPK signaling) might play important roles in the progression of T2DM heart failure.
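The module-comparison step in the METHODS (maximum-weight bipartite matching followed by sorting in descending order) can be illustrated with a small sketch. Several choices here are assumptions rather than details from the abstract: Jaccard overlap as the pair weight, the 0.6 cutoff for calling a pair disrupted, the toy modules, and the exhaustive search, which is only practical for a handful of modules (a real analysis would use a polynomial-time assignment algorithm).

```python
from itertools import permutations

def jaccard(a, b):
    """Overlap between two gene sets, used here as the matching weight."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def max_weight_matching(left, right):
    """Exhaustive maximum-weight bipartite matching.
    Assumes len(left) <= len(right); returns (left_idx, right_idx, weight)."""
    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(right)), len(left)):
        pairs = [(i, j, jaccard(left[i], right[j])) for i, j in enumerate(perm)]
        score = sum(w for _, _, w in pairs)
        if score > best_score:
            best_score, best_pairs = score, pairs
    return best_pairs

# Hypothetical modules from the two conditions being compared.
control_modules = [{"a", "b", "c"}, {"d", "e", "f"}]
disease_modules = [{"d", "e", "x"}, {"a", "b", "c"}]

pairs = max_weight_matching(control_modules, disease_modules)
ranked = sorted(pairs, key=lambda p: p[2], reverse=True)  # descending order
threshold = 0.6  # hypothetical cutoff for a "disrupted" module pair
disrupted = [p for p in ranked if p[2] < threshold]

print(ranked)
print(disrupted)
```

Here the first control module is matched to an identical disease module, while the second keeps only part of its membership and is flagged; the genes of such flagged pairs are what would be passed on to pathway enrichment.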
|