Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Li MH, Lin L, Wang XL, Liu T. Protein-protein interaction site prediction based on conditional random fields. Bioinformatics 2007;23:597-604. [PMID: 17234636 DOI: 10.1093/bioinformatics/btl660] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

For:	Li MH, Lin L, Wang XL, Liu T. Protein-protein interaction site prediction based on conditional random fields. Bioinformatics 2007;23:597-604. [PMID: 17234636 DOI: 10.1093/bioinformatics/btl660] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Number

Cited by Other Article(s)

Zhong J, Zhao H, Zhao Q, Zhou R, Zhang L, Guo F, Wang J. RGCNPPIS: A Residual Graph Convolutional Network for Protein-Protein Interaction Site Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024;21:1676-1684. [PMID: 38843057 DOI: 10.1109/tcbb.2024.3410350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2024]

Gao Q, Zhang C, Li M, Yu T. Protein-Protein Interaction Prediction Model Based on ProtBert-BiGRU-Attention. J Comput Biol 2024;31:797-814. [PMID: 39069885 DOI: 10.1089/cmb.2023.0297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/30/2024] Open

Roche R, Moussad B, Shuvo MH, Bhattacharya D. E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction. PLoS Comput Biol 2023;19:e1011435. [PMID: 37651442 PMCID: PMC10499216 DOI: 10.1371/journal.pcbi.1011435] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2023] [Revised: 09/13/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023] Open

Prediction of Protein-Protein Interaction Sites by Multifeature Fusion and RF with mRMR and IFS. DISEASE MARKERS 2022;2022:5892627. [PMID: 36246558 PMCID: PMC9553539 DOI: 10.1155/2022/5892627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/13/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/17/2022]

ProB-Site: Protein Binding Site Prediction Using Local Features. Cells 2022;11:cells11132117. [PMID: 35805201 PMCID: PMC9266162 DOI: 10.3390/cells11132117] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 01/16/2023] Open

Casadio R, Martelli PL, Savojardo C. Machine learning solutions for predicting protein–protein interactions. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]

Wang P, Zhang G, Yu ZG, Huang G. A Deep Learning and XGBoost-Based Method for Predicting Protein-Protein Interaction Sites. Front Genet 2021;12:752732. [PMID: 34764983 PMCID: PMC8576272 DOI: 10.3389/fgene.2021.752732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2021] [Accepted: 09/20/2021] [Indexed: 11/29/2022] Open

Wang X, Zhang Y, Yu B, Salhi A, Chen R, Wang L, Liu Z. Prediction of protein-protein interaction sites through eXtreme gradient boosting with kernel principal component analysis. Comput Biol Med 2021;134:104516. [PMID: 34119922 DOI: 10.1016/j.compbiomed.2021.104516] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2020] [Revised: 05/24/2021] [Accepted: 05/24/2021] [Indexed: 12/22/2022]

Zeng M, Zhang F, Wu FX, Li Y, Wang J, Li M. Protein-protein interaction site prediction through combining local and global features with deep neural networks. Bioinformatics 2020;36:1114-1120. [PMID: 31593229 DOI: 10.1093/bioinformatics/btz699] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2019] [Revised: 07/25/2019] [Accepted: 09/04/2019] [Indexed: 12/21/2022] Open

Savojardo C, Martelli PL, Casadio R. Protein–Protein Interaction Methods and Protein Phase Separation. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-011720-104428] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Zhu H, Du X, Yao Y. ConvsPPIS: Identifying Protein-protein Interaction Sites by an Ensemble Convolutional Neural Network with Feature Graph. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191105155713] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Liu B, Gao X, Zhang H. BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches. Nucleic Acids Res 2020;47:e127. [PMID: 31504851 PMCID: PMC6847461 DOI: 10.1093/nar/gkz740] [Citation(s) in RCA: 272] [Impact Index Per Article: 54.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 08/17/2019] [Indexed: 12/14/2022] Open

Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Liu B. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches. Brief Bioinform 2020;20:1280-1294. [PMID: 29272359 DOI: 10.1093/bib/bbx165] [Citation(s) in RCA: 194] [Impact Index Per Article: 38.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2017] [Revised: 11/08/2017] [Indexed: 01/07/2023] Open

Xie Z, Deng X, Shu K. Prediction of Protein-Protein Interaction Sites Using Convolutional Neural Network and Improved Data Sets. Int J Mol Sci 2020;21:E467. [PMID: 31940793 PMCID: PMC7013409 DOI: 10.3390/ijms21020467] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2019] [Revised: 12/23/2019] [Accepted: 01/08/2020] [Indexed: 12/20/2022] Open

Wang X, Yu B, Ma A, Chen C, Liu B, Ma Q. Protein-protein interaction sites prediction by ensemble random forests with synthetic minority oversampling technique. Bioinformatics 2019;35:2395-2402. [PMID: 30520961 PMCID: PMC6612859 DOI: 10.1093/bioinformatics/bty995] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2018] [Revised: 11/19/2018] [Accepted: 12/03/2018] [Indexed: 11/14/2022] Open

Identification of Intrinsically Disordered Proteins and Regions by Length-Dependent Predictors Based on Conditional Random Fields. MOLECULAR THERAPY-NUCLEIC ACIDS 2019;17:396-404. [PMID: 31307006 PMCID: PMC6626971 DOI: 10.1016/j.omtn.2019.06.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/25/2019] [Revised: 06/06/2019] [Accepted: 06/07/2019] [Indexed: 01/24/2023]

Viswanathan R, Fajardo E, Steinberg G, Haller M, Fiser A. Protein-protein binding supersites. PLoS Comput Biol 2019;15:e1006704. [PMID: 30615604 PMCID: PMC6336348 DOI: 10.1371/journal.pcbi.1006704] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2018] [Revised: 01/17/2019] [Accepted: 12/05/2018] [Indexed: 11/19/2022] Open

Yuan T, York JR, McCauley DW. Gliogenesis in lampreys shares gene regulatory interactions with oligodendrocyte development in jawed vertebrates. Dev Biol 2018;441:176-190. [DOI: 10.1016/j.ydbio.2018.07.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2018] [Revised: 07/02/2018] [Accepted: 07/02/2018] [Indexed: 01/09/2023]

Liu Y, Wang X, Liu B. IDP⁻CRF: Intrinsically Disordered Protein/Region Identification Based on Conditional Random Fields. Int J Mol Sci 2018;19:E2483. [PMID: 30135358 PMCID: PMC6164615 DOI: 10.3390/ijms19092483] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2018] [Revised: 08/14/2018] [Accepted: 08/18/2018] [Indexed: 12/16/2022] Open

Wu H, Wang K, Lu L, Xue Y, Lyu Q, Jiang M. Deep Conditional Random Field Approach to Transmembrane Topology Prediction and Application to GPCR Three-Dimensional Structure Modeling. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017;14:1106-1114. [PMID: 27576262 DOI: 10.1109/tcbb.2016.2602872] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Du X, Sun S, Hu C, Li X, Xia J. Prediction of protein-protein interaction sites by means of ensemble learning and weighted feature descriptor. ACTA ACUST UNITED AC 2016;23:10. [PMID: 27437195 PMCID: PMC4943499 DOI: 10.1186/s40709-016-0046-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]

Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016;17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022] Open

Prediction of Protein-Protein Interaction Sites Based on Naive Bayes Classifier. Biochem Res Int 2015;2015:978193. [PMID: 26697220 PMCID: PMC4677168 DOI: 10.1155/2015/978193] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2015] [Revised: 11/05/2015] [Accepted: 11/12/2015] [Indexed: 11/18/2022] Open

Survey of Natural Language Processing Techniques in Bioinformatics. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015;2015:674296. [PMID: 26525745 PMCID: PMC4615216 DOI: 10.1155/2015/674296] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/10/2015] [Revised: 06/12/2015] [Accepted: 06/21/2015] [Indexed: 01/02/2023]

Wei ZS, Yang JY, Shen HB, Yu DJ. A Cascade Random Forests Algorithm for Predicting Protein-Protein Interaction Sites. IEEE Trans Nanobioscience 2015;14:746-60. [DOI: 10.1109/tnb.2015.2475359] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]

Wierschin T, Wang K, Welter M, Waack S, Stanke M. Combining features in a graphical model to predict protein binding sites. Proteins 2015;83:844-52. [PMID: 25663045 DOI: 10.1002/prot.24775] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2014] [Revised: 01/16/2015] [Accepted: 01/26/2015] [Indexed: 11/08/2022]

Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015;10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022] Open

CRF-TM: A Conditional Random Field Method for Predicting Transmembrane Topology. INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING. BIG DATA AND MACHINE LEARNING TECHNIQUES 2015. [DOI: 10.1007/978-3-319-23862-3_52] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Dong Z, Wang K, Dang TKL, Gültas M, Welter M, Wierschin T, Stanke M, Waack S. CRF-based models of protein surfaces improve protein-protein interaction site predictions. BMC Bioinformatics 2014;15:277. [PMID: 25124108 PMCID: PMC4150965 DOI: 10.1186/1471-2105-15-277] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2013] [Accepted: 08/01/2014] [Indexed: 11/13/2022] Open

Abstract

Background

The identification of protein-protein interaction sites is a computationally challenging task and important for understanding the biology of protein complexes. There is a rich literature in this field. A broad class of approaches assign to each candidate residue a real-valued score that measures how likely it is that the residue belongs to the interface. The prediction is obtained by thresholding this score.

Some probabilistic models classify the residues on the basis of the posterior probabilities. In this paper, we introduce pairwise conditional random fields (pCRFs) in which edges are not restricted to the backbone as in the case of linear-chain CRFs utilized by Li et al. (2007). In fact, any 3D-neighborhood relation can be modeled. On grounds of a generalized Viterbi inference algorithm and a piecewise training process for pCRFs, we demonstrate how to utilize pCRFs to enhance a given residue-wise score-based protein-protein interface predictor on the surface of the protein under study. The features of the pCRF are solely based on the interface predictions scores of the predictor the performance of which shall be improved.

Results

We performed three sets of experiments with synthetic scores assigned to the surface residues of proteins taken from the data set PlaneDimers compiled by Zellner et al. (2011), from the list published by Keskin et al. (2004) and from the very recent data set due to Cukuroglu et al. (2014). That way we demonstrated that our pCRF-based enhancer is effective given the interface residue score distribution and the non-interface residue score are unimodal.

Moreover, the pCRF-based enhancer is also successfully applicable, if the distributions are only unimodal over a certain sub-domain. The improvement is then restricted to that domain. Thus we were able to improve the prediction of the PresCont server devised by Zellner et al. (2011) on PlaneDimers.

Conclusions

Our results strongly suggest that pCRFs form a methodological framework to improve residue-wise score-based protein-protein interface predictors given the scores are appropriately distributed. A prototypical implementation of our method is accessible at http://ppicrf.informatik.uni-goettingen.de/index.html.

Collapse

Du X, Cheng J, Zheng T, Duan Z, Qian F. A novel feature extraction scheme with ensemble coding for protein-protein interaction prediction. Int J Mol Sci 2014;15:12731-49. [PMID: 25046746 PMCID: PMC4139871 DOI: 10.3390/ijms150712731] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2014] [Revised: 06/23/2014] [Accepted: 07/14/2014] [Indexed: 11/16/2022] Open

Protein binding site prediction by combining hidden Markov support vector machine and profile-based propensities. ScientificWorldJournal 2014;2014:464093. [PMID: 25133234 PMCID: PMC4122092 DOI: 10.1155/2014/464093] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2014] [Accepted: 07/01/2014] [Indexed: 11/22/2022] Open

Bendell CJ, Liu S, Aumentado-Armstrong T, Istrate B, Cernek PT, Khan S, Picioreanu S, Zhao M, Murgita RA. Transient protein-protein interface prediction: datasets, features, algorithms, and the RAD-T predictor. BMC Bioinformatics 2014;15:82. [PMID: 24661439 PMCID: PMC4021185 DOI: 10.1186/1471-2105-15-82] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2013] [Accepted: 02/14/2014] [Indexed: 11/14/2022] Open

Abstract

Background

Transient protein-protein interactions (PPIs), which underly most biological processes, are a prime target for therapeutic development. Immense progress has been made towards computational prediction of PPIs using methods such as protein docking and sequence analysis. However, docking generally requires high resolution structures of both of the binding partners and sequence analysis requires that a significant number of recurrent patterns exist for the identification of a potential binding site. Researchers have turned to machine learning to overcome some of the other methods’ restrictions by generalising interface sites with sets of descriptive features. Best practices for dataset generation, features, and learning algorithms have not yet been identified or agreed upon, and an analysis of the overall efficacy of machine learning based PPI predictors is due, in order to highlight potential areas for improvement.

Results

The presence of unknown interaction sites as a result of limited knowledge about protein interactions in the testing set dramatically reduces prediction accuracy. Greater accuracy in labelling the data by enforcing higher interface site rates per domain resulted in an average 44% improvement across multiple machine learning algorithms. A set of 10 biologically unrelated proteins that were consistently predicted on with high accuracy emerged through our analysis. We identify seven features with the most predictive power over multiple datasets and machine learning algorithms. Through our analysis, we created a new predictor, RAD-T, that outperforms existing non-structurally specializing machine learning protein interface predictors, with an average 59% increase in MCC score on a dataset with a high number of interactions.

Conclusion

Current methods of evaluating machine-learning based PPI predictors tend to undervalue their performance, which may be artificially decreased by the presence of un-identified interaction sites. Changes to predictors’ training sets will be integral to the future progress of interface prediction by machine learning methods. We reveal the need for a larger test set of well studied proteins or domain-specific scoring algorithms to compensate for poor interaction site identification on proteins in general.

Collapse

Wang DD, Wang R, Yan H. Fast prediction of protein–protein interaction sites based on Extreme Learning Machines. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2012.12.062] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Li BQ, Feng KY, Chen L, Huang T, Cai YD. Prediction of protein-protein interaction sites by random forest algorithm with mRMR and IFS. PLoS One 2012;7:e43927. [PMID: 22937126 PMCID: PMC3429425 DOI: 10.1371/journal.pone.0043927] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2012] [Accepted: 07/26/2012] [Indexed: 11/19/2022] Open

Chen CT, Peng HP, Jian JW, Tsai KC, Chang JY, Yang EW, Chen JB, Ho SY, Hsu WL, Yang AS. Protein-protein interaction site predictions with three-dimensional probability distributions of interacting atoms on protein surfaces. PLoS One 2012;7:e37706. [PMID: 22701576 PMCID: PMC3368894 DOI: 10.1371/journal.pone.0037706] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2011] [Accepted: 04/23/2012] [Indexed: 11/18/2022] Open

Abstract

Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with the physicochemical complementarity features based on the non-covalent interaction data derived from protein interiors.

Collapse

Li B, Kihara D. Protein docking prediction using predicted protein-protein interface. BMC Bioinformatics 2012;13:7. [PMID: 22233443 PMCID: PMC3287255 DOI: 10.1186/1471-2105-13-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2011] [Accepted: 01/10/2012] [Indexed: 11/10/2022] Open

Abstract

Background

Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations.

Results

We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering.

Conclusion

We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

Collapse

Zellner H, Staudigel M, Trenner T, Bittkowski M, Wolowski V, Icking C, Merkl R. Prescont: Predicting protein-protein interfaces utilizing four residue properties. Proteins 2011;80:154-68. [DOI: 10.1002/prot.23172] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2011] [Revised: 08/18/2011] [Accepted: 08/29/2011] [Indexed: 12/26/2022]

Xu Q, Xiang EW, Yang Q. Transferring network topological knowledge for predicting protein-protein interactions. Proteomics 2011;11:3818-25. [DOI: 10.1002/pmic.201100146] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2011] [Revised: 06/03/2011] [Accepted: 06/22/2011] [Indexed: 02/01/2023]

Amos-Binks A, Patulea C, Pitre S, Schoenrock A, Gui Y, Green JR, Golshani A, Dehne F. Binding site prediction for protein-protein interactions and novel motif discovery using re-occurring polypeptide sequences. BMC Bioinformatics 2011;12:225. [PMID: 21635751 PMCID: PMC3120708 DOI: 10.1186/1471-2105-12-225] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2011] [Accepted: 06/02/2011] [Indexed: 11/25/2022] Open

Abstract

Background

While there are many methods for predicting protein-protein interaction, very few can determine the specific site of interaction on each protein. Characterization of the specific sequence regions mediating interaction (binding sites) is crucial for an understanding of cellular pathways. Experimental methods often report false binding sites due to experimental limitations, while computational methods tend to require data which is not available at the proteome-scale. Here we present PIPE-Sites, a novel method of protein specific binding site prediction based on pairs of re-occurring polypeptide sequences, which have been previously shown to accurately predict protein-protein interactions. PIPE-Sites operates at high specificity and requires only the sequences of query proteins and a database of known binary interactions with no binding site data, making it applicable to binding site prediction at the proteome-scale.

Results

PIPE-Sites was evaluated using a dataset of 265 yeast and 423 human interacting proteins pairs with experimentally-determined binding sites. We found that PIPE-Sites predictions were closer to the confirmed binding site than those of two existing binding site prediction methods based on domain-domain interactions, when applied to the same dataset. Finally, we applied PIPE-Sites to two datasets of 2347 yeast and 14,438 human novel interacting protein pairs predicted to interact with high confidence. An analysis of the predicted interaction sites revealed a number of protein subsequences which are highly re-occurring in binding sites and which may represent novel binding motifs.

Conclusions

PIPE-Sites is an accurate method for predicting protein binding sites and is applicable to the proteome-scale. Thus, PIPE-Sites could be useful for exhaustive analysis of protein binding patterns in whole proteomes as well as discovery of novel binding motifs. PIPE-Sites is available online at http://pipe-sites.cgmlab.org/.

Collapse

Deng L, Guan J, Dong Q, Zhou S. Prediction of protein-protein interaction sites using an ensemble method. BMC Bioinformatics 2009;10:426. [PMID: 20015386 PMCID: PMC2808167 DOI: 10.1186/1471-2105-10-426] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2009] [Accepted: 12/16/2009] [Indexed: 01/23/2023] Open

Liu B, Wang X, Lin L, Tang B, Dong Q, Wang X. Prediction of protein binding sites in protein structures using hidden Markov support vector machine. BMC Bioinformatics 2009;10:381. [PMID: 19925685 PMCID: PMC2785799 DOI: 10.1186/1471-2105-10-381] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2009] [Accepted: 11/20/2009] [Indexed: 01/08/2023] Open

Abstract

Background

Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines, conditional random field, etc. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance.

Results

In this study, we introduce a novel machine learning model hidden Markov support vector machine for protein binding site prediction. The model treats the protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including protein sequence profile and residue accessible surface area, are used to train hidden Markov support vector machine. When tested on six data sets, the method based on hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines and conditional random field. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods.

Conclusion

The improved prediction performance and computational efficiency of the method based on hidden Markov support vector machine can be attributed to the following three factors. Firstly, the relation between labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous to this field. Thirdly, the complexity of the training step for hidden Markov support vector machine is linear with the number of training samples by using the cutting-plane algorithm.

Collapse

Grammatical-Restrained Hidden Conditional Random Fields for Bioinformatics applications. Algorithms Mol Biol 2009;4:13. [PMID: 19849839 PMCID: PMC2776008 DOI: 10.1186/1748-7188-4-13] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2009] [Accepted: 10/22/2009] [Indexed: 11/30/2022] Open

Using Support Vector Machine Combined with Post-processing Procedure to Improve Prediction of Interface Residues in Transient Complexes. Protein J 2009;28:369-74. [DOI: 10.1007/s10930-009-9203-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]

Exploiting three kinds of interface propensities to identify protein binding sites. Comput Biol Chem 2009;33:303-11. [DOI: 10.1016/j.compbiolchem.2009.07.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2008] [Revised: 06/22/2009] [Accepted: 07/01/2009] [Indexed: 11/21/2022]

Improved Prediction of Protein Binding Sites from Sequences Using Genetic Algorithm. Protein J 2009;28:273-80. [DOI: 10.1007/s10930-009-9192-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]

Du X, Cheng J, Song J. Identifying protein-protein interaction sites using covering algorithm. Int J Mol Sci 2009;10:2190-2202. [PMID: 19564948 PMCID: PMC2695276 DOI: 10.3390/ijms10052190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2009] [Revised: 04/30/2009] [Accepted: 05/13/2009] [Indexed: 12/03/2022] Open

Ezkurdia I, Bartoli L, Fariselli P, Casadio R, Valencia A, Tress ML. Progress and challenges in predicting protein-protein interaction sites. Brief Bioinform 2009;10:233-46. [PMID: 19346321 DOI: 10.1093/bib/bbp021] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Identifying protein–protein interaction sites in transient complexes with temperature factor, sequence profile and accessible surface area. Amino Acids 2009;38:263-70. [DOI: 10.1007/s00726-009-0245-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2008] [Accepted: 01/21/2009] [Indexed: 11/26/2022]

Martin J, Regad L, Lecornet H, Camproux AC. Structural deformation upon protein-protein interaction: a structural alphabet approach. BMC STRUCTURAL BIOLOGY 2008;8:12. [PMID: 18307769 PMCID: PMC2315654 DOI: 10.1186/1472-6807-8-12] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/15/2007] [Accepted: 02/28/2008] [Indexed: 11/26/2022]