Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang XF, Dai DQ, Ou-Yang L, Wu MY. Exploring overlapping functional units with various structure in protein interaction networks. PLoS One 2012;7:e43092. [PMID: 22916212 PMCID: PMC3423443 DOI: 10.1371/journal.pone.0043092] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 06/19/2012] [Accepted: 07/16/2012] [Indexed: 11/18/2022] Open

Cited by Other Article(s)

Zhu Y, Zhang DX, Zhang XF, Yi M, Ou-Yang L, Wu M. EC-PGMGR: Ensemble Clustering Based on Probability Graphical Model With Graph Regularization for Single-Cell RNA-seq Data. Front Genet 2020;11:572242. [PMID: 33329710 PMCID: PMC7673820 DOI: 10.3389/fgene.2020.572242] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/13/2020] [Accepted: 09/30/2020] [Indexed: 11/21/2022] Open

Abstract

Advances in technology have made it convenient to obtain large amounts of single-cell RNA sequencing (scRNA-seq) data. Since clustering is a very important step in identifying or defining cellular phenotypes, many clustering approaches have been developed recently for these applications. These methods can be roughly divided into normal clustering methods and integrated (ensemble) clustering methods, which combine more than two normal clustering methods with the aim of obtaining more informative results. To contrast with the integrated clustering algorithm, a normal clustering method is often called an individual or base clustering method. Note that many individual clustering methods are developed to capture only one aspect of the data, and their results depend on the initial parameter settings, such as the cluster number and the distance metric. Although an integrative clustering method may be more accurate than individual clustering, its results depend on the base clustering results, and integrated systems are often not self-regulating. Designing a robust unsupervised clustering method therefore remains a challenge. To tackle the above limitations, we propose a novel Ensemble Clustering algorithm based on a Probability Graphical Model with Graph Regularization, called EC-PGMGR for short. On the one hand, we use parameter control in the Probability Graphical Model (PGM) to determine the cluster number automatically, without prior knowledge. On the other hand, we add a regularization term to reduce the effect of weak base clustering results. In particular, the results collected from the base clustering methods are combined using self-regulating weights obtained through a pre-learning process, which strengthens the effect of active clustering methods while weakening that of inactive ones.
Experiments are carried out on 7 data sets generated by different platforms, with the number of single cells ranging from 822 to 5,132. Results show that EC-PGMGR performs better than 4 alternative individual clustering methods and 2 ensemble methods in terms of accuracy, measured by the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), as well as robustness and effectiveness. EC-PGMGR provides an effective way to integrate different clustering results for more accurate and reliable results in further biological analysis. It may also provide new insights for other applications of clustering.
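The two accuracy measures named in this abstract, ARI and NMI, can both be computed from the contingency table of a reference labelling and a predicted labelling. The following is a minimal sketch, not the authors' evaluation code; the function names are hypothetical, and the NMI normalization (geometric mean of the two entropies) is an assumption, since several normalizations are in common use and the abstract does not say which one was applied.

```python
from collections import Counter
from math import comb, log, sqrt


def adjusted_rand_index(true_labels, pred_labels):
    """ARI of two flat partitions, computed from pair counts."""
    n = len(true_labels)
    if n != len(pred_labels) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    cells = Counter(zip(true_labels, pred_labels))  # contingency table
    rows = Counter(true_labels)
    cols = Counter(pred_labels)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_information(true_labels, pred_labels):
    """NMI with geometric-mean normalization (an assumed convention)."""
    n = len(true_labels)
    if n != len(pred_labels) or n == 0:
        raise ValueError("need two non-empty label lists of equal length")
    cells = Counter(zip(true_labels, pred_labels))
    rows = Counter(true_labels)
    cols = Counter(pred_labels)
    mutual_info = sum(
        (c / n) * log(c * n / (rows[a] * cols[b])) for (a, b), c in cells.items()
    )
    h_rows = -sum((c / n) * log(c / n) for c in rows.values())
    h_cols = -sum((c / n) * log(c / n) for c in cols.values())
    if h_rows == 0 or h_cols == 0:  # at least one partition has one cluster
        return 1.0 if h_rows == h_cols else 0.0
    return mutual_info / sqrt(h_rows * h_cols)


# Toy labels (made up for illustration): one cell type is split in two.
reference = [0, 0, 1, 1]
predicted = [0, 0, 1, 2]
ari = adjusted_rand_index(reference, predicted)
nmi = normalized_mutual_information(reference, predicted)
```

Both scores ignore the actual label values, so a clustering that matches the reference up to renaming of clusters scores 1 under either measure.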

Huang H, Luo B, Wang B, Wu Q, Liang Y, He Y. Identification of Potential Gene Interactions in Heart Failure Caused by Idiopathic Dilated Cardiomyopathy. Med Sci Monit 2018;24:7697-7709. [PMID: 30368515 PMCID: PMC6216482 DOI: 10.12659/msm.912984] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/18/2022] Open

Ou-Yang L, Yan H, Zhang XF. A multi-network clustering method for detecting protein complexes from multiple heterogeneous networks. BMC Bioinformatics 2017;18:463. [PMID: 29219066 PMCID: PMC5773919 DOI: 10.1186/s12859-017-1877-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/09/2025] Open

Ou-Yang L, Zhang XF, Dai DQ, Wu MY, Zhu Y, Liu Z, Yan H. Protein complex detection based on partially shared multi-view clustering. BMC Bioinformatics 2016;17:371. [PMID: 27623844 PMCID: PMC5022186 DOI: 10.1186/s12859-016-1164-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Accepted: 07/23/2016] [Indexed: 01/05/2023] Open

Abstract

Background

Protein complexes are the key molecular entities to perform many essential biological functions. In recent years, high-throughput experimental techniques have generated a large amount of protein interaction data. As a consequence, computational analysis of such data for protein complex detection has received increased attention in the literature. However, most existing works focus on predicting protein complexes from a single type of data, either physical interaction data or co-complex interaction data. These two types of data provide compatible and complementary information, so it is necessary to integrate them to discover the underlying structures and obtain better performance in complex detection.

Results

In this study, we propose a novel multi-view clustering algorithm, called the Partially Shared Multi-View Clustering model (PSMVC), to carry out such an integrated analysis. Unlike traditional multi-view learning algorithms that focus on mining either consistent or complementary information embedded in the multi-view data, PSMVC can jointly explore the shared and specific information inherent in different views. In our experiments, we compare the complexes detected by PSMVC from a single data source with those detected from multiple data sources. We observe that jointly analyzing multi-view data benefits the detection of protein complexes. Furthermore, extensive experimental results demonstrate that PSMVC performs much better than 16 state-of-the-art complex detection techniques, including ensemble clustering and data integration techniques.

Conclusions

In this work, we demonstrate that, when integrating multiple data sources, using a partially shared multi-view clustering model can help to identify protein complexes that are not readily identifiable by conventional single-view-based methods and other integrative analysis methods. All the results and source code are available at https://github.com/Oyl-CityU/PSMVC.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1164-9) contains supplementary material, which is available to authorized users.

Ou-Yang L, Wu M, Zhang XF, Dai DQ, Li XL, Yan H. A two-layer integration framework for protein complex detection. BMC Bioinformatics 2016;17:100. [PMID: 26911324 PMCID: PMC4765032 DOI: 10.1186/s12859-016-0939-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/12/2015] [Accepted: 01/27/2016] [Indexed: 01/05/2023] Open

Abstract

Background

Protein complexes carry out nearly all signaling and functional processes within cells. The study of protein complexes is an effective strategy to analyze cellular functions and biological processes. With the increasing availability of proteomics data, various computational methods have recently been developed to predict protein complexes. However, different computational methods are based on their own assumptions and designed to work on different data sources, while the various biological screening methods have their own experimental conditions and often differ in scale and noise level. Therefore, a single computational method applied to a specific data source is generally not able to generate comprehensive and reliable prediction results.

Results

In this paper, we develop a novel Two-layer INtegrative Complex Detection (TINCD) model to detect protein complexes, leveraging the information from both clustering results and raw data sources. In particular, we first integrate various clustering results to construct consensus matrices for proteins to measure their overall co-complex propensity. Second, we combine these consensus matrices with the co-complex score matrix derived from Tandem Affinity Purification/Mass Spectrometry (TAP) data and obtain an integrated co-complex similarity network via an unsupervised metric fusion method. Finally, a novel graph regularized doubly stochastic matrix decomposition model is proposed to detect overlapping protein complexes from the integrated similarity network.
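The first step described above, turning several clustering results into a consensus matrix that scores how often two proteins are placed together, can be sketched as a co-association matrix. This is an illustrative simplification and not the TINCD implementation: it assumes hard (non-overlapping) labels and equal weight for every base clustering, and the function name `co_association` is hypothetical. The later metric fusion and matrix decomposition steps are not shown.

```python
def co_association(partitions):
    """Fraction of base clusterings that put each pair of items together.

    partitions: list of label lists, one per base clustering, all of the
    same length (one label per protein, in a fixed protein order).
    Returns an n x n symmetric matrix as a list of lists.
    """
    if not partitions:
        raise ValueError("at least one base clustering is required")
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all clusterings must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


# Three made-up base clusterings of four proteins.
base_results = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
consensus = co_association(base_results)
```

Entries near 1 mark protein pairs that almost every base method groups together, and entries near 0 mark pairs that are almost never grouped, which is one way to read the "overall co-complex propensity" mentioned in the abstract.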

Conclusions

Extensive experimental results demonstrate that TINCD performs much better than 21 state-of-the-art complex detection techniques, including ensemble clustering and data integration techniques.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-0939-3) contains supplementary material, which is available to authorized users.

Ou-Yang L, Dai DQ, Zhang XF. Detecting Protein Complexes from Signed Protein-Protein Interaction Networks. IEEE/ACM Trans Comput Biol Bioinform 2015;12:1333-1344. [PMID: 26671805 DOI: 10.1109/tcbb.2015.2401014] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]

Zhang XF, Ou-Yang L, Hu X, Dai DQ. Identifying binary protein-protein interactions from affinity purification mass spectrometry data. BMC Genomics 2015;16:745. [PMID: 26438428 PMCID: PMC4595009 DOI: 10.1186/s12864-015-1944-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 11/06/2014] [Accepted: 09/22/2015] [Indexed: 02/04/2023] Open

Abstract

Background

The identification of protein-protein interactions contributes greatly to the understanding of functional organization within cells. With the development of affinity purification-mass spectrometry (AP-MS) techniques, several computational scoring methods have been proposed to detect protein interactions from AP-MS data. However, most of the current methods focus on the detection of co-complex interactions and do not discriminate between direct physical interactions and indirect interactions. Consequently, less is known about the precise physical wiring diagram within cells.

Results

In this paper, we develop a Binary Interaction Network Model (BINM) to computationally identify direct physical interactions among the co-complex interactions that can be inferred from purification data using previous scoring methods. This model provides a mathematical framework for capturing topological relationships between direct physical interactions and observed co-complex interactions. It reassigns a confidence score to each observed interaction to indicate its propensity to be a direct physical interaction. Observed interactions with high confidence scores are then predicted to be direct physical interactions. We run our model on two yeast co-complex interaction networks constructed by two different scoring methods from the same combined AP-MS data. The direct physical interactions identified by various methods are comprehensively benchmarked against different reference sets that provide both direct and indirect evidence for physical contacts. Experimental results show that our model performs competitively against state-of-the-art methods.

Conclusions

According to the results obtained in this study, BINM is a powerful scoring method that can use network topology alone to predict direct physical interactions from AP-MS data. This study provides an alternative approach to exploring the information inherent in AP-MS data. The software can be downloaded from https://github.com/Zhangxf-ccnu/BINM.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1944-z) contains supplementary material, which is available to authorized users.

Bennett L, Kittas A, Liu S, Papageorgiou LG, Tsoka S. Community structure detection for overlapping modules through mathematical programming in protein interaction networks. PLoS One 2014;9:e112821. [PMID: 25412367 PMCID: PMC4239042 DOI: 10.1371/journal.pone.0112821] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/17/2014] [Accepted: 10/15/2014] [Indexed: 12/05/2022] Open

Ou-Yang L, Dai DQ, Li XL, Wu M, Zhang XF, Yang P. Detecting temporal protein complexes from dynamic protein-protein interaction networks. BMC Bioinformatics 2014;15:335. [PMID: 25282536 PMCID: PMC4288635 DOI: 10.1186/1471-2105-15-335] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 05/10/2014] [Accepted: 09/23/2014] [Indexed: 12/13/2022] Open

Zhang XF, Dai DQ, Ou-Yang L, Yan H. Detecting overlapping protein complexes based on a generative model with functional and topological properties. BMC Bioinformatics 2014;15:186. [PMID: 24928559 PMCID: PMC4073817 DOI: 10.1186/1471-2105-15-186] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 03/07/2014] [Accepted: 06/09/2014] [Indexed: 11/20/2022] Open

Abstract

Background

Identification of protein complexes can help us gain a better understanding of cellular mechanisms. With the increasing availability of large-scale protein-protein interaction (PPI) data, numerous computational approaches have been proposed to detect complexes from PPI networks. However, most of the current approaches do not consider overlaps among complexes or the functional annotation information of individual proteins. Therefore, they might not be able to reflect the biological reality faithfully or make full use of the available domain-specific knowledge.

Results

In this paper, we develop a Generative Model with Functional and Topological Properties (GMFTP) to describe the generative processes of the PPI network and the functional profile. The model provides a working mechanism for capturing the interaction structures and the functional patterns of proteins. By combining the functional and topological properties, we formulate the problem of identifying protein complexes as that of detecting a group of proteins that frequently interact with each other in the PPI network and have similar annotation patterns in the functional profile. Using the idea of link communities, our method naturally deals with overlaps among complexes. The benefits brought by the functional properties are demonstrated by real data analysis. The results, evaluated using four criteria with respect to two gold standards, show that GMFTP performs competitively against state-of-the-art approaches. The effectiveness of detecting overlapping complexes is also demonstrated by analyzing the topological and functional features of multi- and mono-group proteins.

Conclusions

Based on the results obtained in this study, GMFTP proves to be a powerful approach for the identification of overlapping protein complexes using both the PPI network and the functional profile. The software can be downloaded from http://mail.sysu.edu.cn/home/stsddq@mail.sysu.edu.cn/dai/others/GMFTP.zip.

Brdar S, Crnojevic V, Zupan B. Integrative clustering by nonnegative matrix factorization can reveal coherent functional groups from gene profile data. IEEE J Biomed Health Inform 2014;19:698-708. [PMID: 24733033 DOI: 10.1109/jbhi.2014.2316508] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]

Zaki N, Mora A. A comparative analysis of computational approaches and algorithms for protein subcomplex identification. Sci Rep 2014;4:4262. [PMID: 24584908 PMCID: PMC3939454 DOI: 10.1038/srep04262] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 10/18/2013] [Accepted: 02/14/2014] [Indexed: 11/09/2022] Open

Huang L, Wang G, Wang Y, Blanzieri E, Su C. Link Clustering with Extended Link Similarity and EQ Evaluation Division. PLoS One 2013;8:e66005. [PMID: 23840390 PMCID: PMC3686866 DOI: 10.1371/journal.pone.0066005] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 02/25/2013] [Accepted: 04/29/2013] [Indexed: 11/24/2022] Open

Wu MY, Dai DQ, Zhang XF, Zhu Y. Cancer subtype discovery and biomarker identification via a new robust network clustering algorithm. PLoS One 2013;8:e66256. [PMID: 23799085 PMCID: PMC3684607 DOI: 10.1371/journal.pone.0066256] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 02/08/2013] [Accepted: 05/02/2013] [Indexed: 11/29/2022] Open

Abstract

In cancer biology, it is very important to understand the phenotypic changes of patients and to discover new cancer subtypes. Recently, microarray-based technologies have shed light on this problem through gene expression profiles, which may contain outliers due to either chemical or electrical causes. Undiscovered subtypes may be heterogeneous with respect to underlying networks or pathways, and may be related to only a few interdependent biomarkers. This motivates the need for robust gene expression-based methods capable of discovering such subtypes, elucidating the corresponding network structures and identifying cancer-related biomarkers. This study proposes a penalized model-based Student’s t clustering with unconstrained covariance (PMT-UC) to discover cancer subtypes with cluster-specific networks, taking gene dependencies into account and providing robustness against outliers. Meanwhile, biomarker identification and network reconstruction are achieved by imposing an adaptive penalty on the means and the inverse scale matrices. The model is fitted via the expectation maximization algorithm utilizing the graphical lasso. Here, a network-based gene selection criterion that identifies biomarkers not as individual genes but as subnetworks is applied. This allows us to implicate biomarkers with low discriminative power that play a central role in the subnetwork by interconnecting many differentially expressed genes, or that have cluster-specific underlying network structures. Experimental results on simulated datasets and one available cancer dataset attest to the effectiveness and robustness of PMT-UC in cancer subtype discovery. Moreover, PMT-UC is able to select cancer-related biomarkers that have been verified in biochemical or biomedical research and to learn biologically significant correlations among genes.
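The robustness to outliers that this abstract attributes to Student's t modelling comes from the EM algorithm itself: each observation receives a latent weight that shrinks as it moves away from the cluster centre. The sketch below shows that mechanism in the simplest possible setting, a single univariate t distribution with fixed degrees of freedom. It is not PMT-UC: there is no mixture, no penalty and no graphical lasso, and the function name `t_location_em`, the choice `nu=3.0` and the toy data are all assumptions made for illustration.

```python
from statistics import mean, median


def t_location_em(xs, nu=3.0, iters=200):
    """EM estimate of location and scale for a univariate Student's t.

    nu (degrees of freedom) is held fixed. Returns (mu, sigma2, weights),
    where weights are the E-step weights from the final iteration.
    """
    n = len(xs)
    mu = median(xs)
    sigma2 = max(sum((x - mu) ** 2 for x in xs) / n, 1e-12)
    weights = [1.0] * n
    for _ in range(iters):
        # E-step: points far from mu (in units of sigma) get small weights.
        weights = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]
        # M-step: weighted mean, then weighted scale around the new mean.
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        sigma2 = sum(w * (x - mu) ** 2 for w, x in zip(weights, xs)) / n
        sigma2 = max(sigma2, 1e-12)  # guard against a degenerate scale
    return mu, sigma2, weights


# Five well-behaved expression values and one gross outlier (made-up data).
values = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0]
mu_t, sigma2_t, w = t_location_em(values)
plain_mean = mean(values)
```

On data like this the ordinary mean is dragged towards the outlier, while the t-based estimate stays close to the bulk of the values because the outlier's weight becomes very small.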

Jiao QJ, Huang Y, Liu W, Wang XF, Chen XS, Shen HB. Revealing the hidden relationship by sparse modules in complex networks with a large-scale analysis. PLoS One 2013;8:e66020. [PMID: 23762457 PMCID: PMC3677904 DOI: 10.1371/journal.pone.0066020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/27/2012] [Accepted: 05/06/2013] [Indexed: 11/18/2022] Open

Zaki N, Efimov D, Berengueres J. Protein complex detection using interaction reliability assessment and weighted clustering coefficient. BMC Bioinformatics 2013;14:163. [PMID: 23688127 PMCID: PMC3680028 DOI: 10.1186/1471-2105-14-163] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.0] [Received: 12/27/2012] [Accepted: 05/09/2013] [Indexed: 11/10/2022] Open

Ou-Yang L, Dai DQ, Zhang XF. Protein complex detection via weighted ensemble clustering based on Bayesian nonnegative matrix factorization. PLoS One 2013;8:e62158. [PMID: 23658709 PMCID: PMC3642239 DOI: 10.1371/journal.pone.0062158] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 12/24/2012] [Accepted: 03/18/2013] [Indexed: 12/05/2022] Open

Abstract

Detecting protein complexes from protein-protein interaction (PPI) networks is a challenging task in computational biology. A vast number of computational methods have been proposed to undertake this task. However, each computational method is developed to capture one aspect of the network. The performance of different methods on the same network can differ substantially, and even the same method may perform differently on networks with different topological characteristics. The clustering result of each computational method can be regarded as a feature that describes the PPI network from one aspect. It is therefore desirable to utilize these features to produce a more accurate and reliable clustering. In this paper, a novel Bayesian Nonnegative Matrix Factorization (NMF)-based weighted Ensemble Clustering algorithm (EC-BNMF) is proposed to detect protein complexes from PPI networks. We first apply different computational algorithms to a PPI network to generate some base clustering results. Then we integrate these base clustering results into an ensemble PPI network in the form of a weighted combination. Finally, we identify overlapping protein complexes from this network by employing a Bayesian NMF model. When generating an ensemble PPI network, EC-BNMF can automatically optimize the values of the weights such that the ensemble algorithm delivers better results. Experimental results on four PPI networks of Saccharomyces cerevisiae verify the effectiveness of EC-BNMF in detecting protein complexes. EC-BNMF provides an effective way to integrate different clustering results for more accurate and reliable complex detection. Furthermore, EC-BNMF has a high degree of flexibility in the choice of base clustering results. It can be coupled with existing clustering methods to identify protein complexes.
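The second step in this abstract, combining base clustering results into an ensemble PPI network as a weighted combination, can be illustrated with overlapping clusters and one weight per base method. This is a sketch under stated assumptions and not the EC-BNMF code: the weights here are supplied by hand, whereas the abstract says EC-BNMF optimizes them automatically, and the final Bayesian NMF step is not shown. The function name `ensemble_network` and the protein identifiers are hypothetical.

```python
from itertools import combinations


def ensemble_network(base_clusterings, weights):
    """Weighted combination of co-cluster graphs.

    base_clusterings: one entry per base method; each entry is a list of
    sets of protein IDs (the sets may overlap).
    weights: one non-negative weight per base method (normalized here).
    Returns a dict mapping a sorted protein pair (a, b) to an edge weight.
    """
    if len(base_clusterings) != len(weights):
        raise ValueError("need exactly one weight per base clustering")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    edges = {}
    for clusters, weight in zip(base_clusterings, weights):
        # A pair counts once per method even if two clusters both contain it.
        co_clustered = set()
        for cluster in clusters:
            co_clustered.update(combinations(sorted(cluster), 2))
        for pair in co_clustered:
            edges[pair] = edges.get(pair, 0.0) + weight / total
    return edges


# Two made-up base results; P3 belongs to two clusters in method_a.
method_a = [{"P1", "P2", "P3"}, {"P3", "P4"}]
method_b = [{"P1", "P2"}, {"P3", "P4", "P5"}]
network = ensemble_network([method_a, method_b], weights=[0.75, 0.25])
```

Pairs supported by every base method end up with weight 1, and pairs proposed only by a low-weight method contribute little, which is the intuition behind letting the weights reflect how reliable each base clustering is.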