1
Shirali A, Stebliankin V, Karki U, Shi J, Chapagain P, Narasimhan G. A comprehensive survey of scoring functions for protein docking models. BMC Bioinformatics 2025; 26:25. [PMID: 39844036; PMCID: PMC11755896; DOI: 10.1186/s12859-024-05991-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Accepted: 11/18/2024] [Indexed: 01/24/2025] [Open access]
Abstract
BACKGROUND: While protein-protein docking is fundamental to our understanding of how proteins interact, scoring protein-protein complex conformations is a critical component of successful docking programs. Without accurate and efficient scoring functions to differentiate between native and non-native binding complexes, the accuracy of current docking tools cannot be guaranteed. Although many innovative scoring functions have been proposed, a good scoring function for docking remains elusive. Deep learning models offer alternatives to using explicit empirical or mathematical functions for scoring protein-protein complexes. RESULTS: In this study, we perform a comprehensive survey of the state-of-the-art scoring functions by considering the most popular and highly performant approaches, both classical and deep learning-based, for scoring protein-protein complexes. The methods were also compared based on their runtime, as it directly impacts their use in large-scale docking applications. CONCLUSIONS: We evaluate the strengths and weaknesses of classical and deep learning-based approaches across seven public and popular datasets to aid researchers in understanding the progress made in this field.
Affiliation(s)
- Azam Shirali
  - Bioinformatics Research Group (BioRG), Knight Foundation School of Computing and Information Sciences, Florida International University, 11200 SW 8th St, Miami, 33199, USA
- Vitalii Stebliankin
  - Bioinformatics Research Group (BioRG), Knight Foundation School of Computing and Information Sciences, Florida International University, 11200 SW 8th St, Miami, 33199, USA
- Ukesh Karki
  - Department of Physics, Florida International University, 11200 SW 8th St, Miami, 33199, USA
- Jimeng Shi
  - Bioinformatics Research Group (BioRG), Knight Foundation School of Computing and Information Sciences, Florida International University, 11200 SW 8th St, Miami, 33199, USA
- Prem Chapagain
  - Department of Physics, Florida International University, 11200 SW 8th St, Miami, 33199, USA
  - Biomolecular Sciences Institute, Florida International University, 11200 SW 8th St, Miami, 33199, USA
- Giri Narasimhan
  - Bioinformatics Research Group (BioRG), Knight Foundation School of Computing and Information Sciences, Florida International University, 11200 SW 8th St, Miami, 33199, USA
  - Biomolecular Sciences Institute, Florida International University, 11200 SW 8th St, Miami, 33199, USA
2
Orientation algorithm for PPI networks based on network propagation approach. J Biosci 2022. [DOI: 10.1007/s12038-022-00284-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
3
Design and characterization of high-affinity synthetic peptides as bioreceptors for diagnosis of cutaneous leishmaniasis. Anal Bioanal Chem 2021; 413:4545-4555. [PMID: 34037808; PMCID: PMC8149292; DOI: 10.1007/s00216-021-03424-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2021] [Revised: 05/12/2021] [Accepted: 05/20/2021] [Indexed: 11/01/2022]
Abstract
Cutaneous leishmaniasis (CL) is one of the illnesses caused by Leishmania parasite infection, which can be asymptomatic or severe depending on the infecting Leishmania strain. CL is commonly diagnosed by directly detecting the parasites or their DNA in tissue samples. New diagnostic methodologies target specific proteins (biomarkers) secreted by the parasite during the infection process. However, specific bioreceptors for the in vivo or in vitro detection of these novel biomarkers are rather limited in terms of sensitivity and specificity. For this reason, we here introduce three novel peptides as bioreceptors for the highly sensitive and selective identification of acid phosphatase (sAP) and proteophosphoglycan (PPG), which have a crucial role in leishmaniasis infection. These high-affinity peptides have been designed from the conserved domains of the lectin family and retain the ability to interact with the biological target and produce the same effect as the original protein. The synthetic peptides have been characterized, and the affinity and kinetic constants for their interaction with the targets (sAP and PPG) have been determined by a surface plasmon resonance biosensor. The values obtained for KD are in the nanomolar range, which is comparable to high-affinity antibodies, with the additional advantages of high biochemical stability and simpler production. Pep2854 exhibited a high affinity for sAP (KD = 1.48 nM), while Pep2856 had a good affinity for PPG (KD = 1.76 nM). This study shows that these peptidomimetics represent a novel alternative to high molecular weight proteins for biorecognition in diagnostic tests and biosensor devices for CL.
4
Nadalin F, Carbone A. Protein-protein interaction specificity is captured by contact preferences and interface composition. Bioinformatics 2018; 34:459-468. [PMID: 29028884; PMCID: PMC5860360; DOI: 10.1093/bioinformatics/btx584] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/31/2016] [Accepted: 09/18/2017] [Indexed: 12/24/2022] [Open access]
Abstract
Motivation: Large-scale computational docking will be increasingly used in future years to discriminate protein–protein interactions at the residue resolution. Complete cross-docking experiments make in silico reconstruction of protein–protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results: We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue–residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained either with all-atom or with coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates the predictive power of CIPS over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability and implementation: CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francesca Nadalin
  - Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France
  - Institut Universitaire de France, 75005 Paris, France
5
Wisitponchai T, Shoombuatong W, Lee VS, Kitidee K, Tayapiwatana C. AnkPlex: algorithmic structure for refinement of near-native ankyrin-protein docking. BMC Bioinformatics 2017; 18:220. [PMID: 28424069; PMCID: PMC5395911; DOI: 10.1186/s12859-017-1628-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2016] [Accepted: 04/07/2017] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND: Computational analysis of protein-protein interactions provides crucial information for increasing binding affinity without changing the basic conformation. Several docking programs have been used to predict near-native poses of a protein-protein complex among the 10 top-ranked poses. Universal criteria for discriminating the near-native pose are not available, since there are several classes of recognition protein. Explicit criteria for identifying the near-native pose of ankyrin-protein complexes (APKs) have not yet been reported. RESULTS: In this study, we established an ensemble computational model, named "AnkPlex", for discriminating the near-native docking pose of APKs. A dataset of APKs was generated from seven X-ray APKs, which consisted of 3 internal domains, using the reliable docking tool ZDOCK. The dataset was composed of 669 near-native and 44,334 non-near-native poses, and it was used to generate eleven informative features. Subsequently, a re-scoring rank was generated by AnkPlex using a combination of a decision tree algorithm and logistic regression. AnkPlex placed ≥1 near-native complex among the 10 top-ranked poses for nine X-ray complexes, whereas ZDOCK did so for only six. In addition, feature analysis demonstrated that the van der Waals feature was the dominant feature for identifying the near-native pose among the potential ankyrin-protein docking poses. CONCLUSION: The AnkPlex model succeeded in predicting near-native docking poses and led to the discovery of informative characteristics that could further improve our understanding of the ankyrin-protein complex. Our computational study could be useful for predicting the near-native poses of binding proteins and desired targets, especially for ankyrin-protein complexes. The AnkPlex web server is freely accessible at http://ankplex.ams.cmu.ac.th.
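The "≥1 near-native complex among the 10 top-ranked poses" criterion used in this abstract can be sketched as a small counting routine. This is only an illustration under stated assumptions, not the AnkPlex code: the function name `top_n_success` and the input layout (one list of near-native flags per target, ordered from best-ranked to worst-ranked pose) are hypothetical.

```python
from typing import Dict, List, Tuple


def top_n_success(rankings: Dict[str, List[bool]], n: int = 10) -> Tuple[int, float]:
    """Count targets with at least one near-native pose among the top n.

    rankings maps a target name to near-native flags, ordered from the
    best-ranked pose to the worst-ranked pose (hypothetical layout).
    Returns (number of successful targets, fraction of successful targets).
    """
    if not rankings:
        return 0, 0.0
    # A target counts as a success if any of its first n poses is near-native.
    hits = sum(1 for flags in rankings.values() if any(flags[:n]))
    return hits, hits / len(rankings)
```

Calling `top_n_success(rankings, n=10)` on the re-scored ranks of each complex would give the kind of count the abstract reports for AnkPlex and ZDOCK; the same routine with a different `n` covers "top 30" or "top 50" style criteria.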
Affiliation(s)
- Tanchanok Wisitponchai
  - Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
  - Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
- Watshara Shoombuatong
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
- Vannajan Sanghiran Lee
  - Thailand Center of Excellence in Physics, Commission on Higher Education, Bangkok, 10400, Thailand
  - Department of Chemistry, Faculty of Science, University of Malaya, Kuala Lumpur, 50603, Malaysia
- Kuntida Kitidee
  - Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
  - Center for Research and Innovation, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
- Chatchai Tayapiwatana
  - Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
  - Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
6
Zhang Z, Lu L, Zhang Y, Hua Li C, Wang CX, Zhang XY, Tan JJ. A combinatorial scoring function for protein-RNA docking. Proteins 2017; 85:741-752. [PMID: 28120375; DOI: 10.1002/prot.25253] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 08/19/2016] [Revised: 01/16/2017] [Accepted: 01/17/2017] [Indexed: 12/13/2022]
Abstract
Protein-RNA docking is still an open question. One of the main challenges is to develop an effective scoring function that can discriminate near-native structures from incorrect ones. To address this problem, we have constructed a knowledge-based residue-nucleotide pairwise potential that takes secondary structure information into account for nonribosomal protein-RNA docking. Here we developed a weighted combined scoring function, RpveScore, that consists of the pairwise potential and six physics-based energy terms. The weights were optimized using multiple linear regression by fitting the scoring function to L_rmsd for the bound docking decoys from Benchmark II. The scoring functions were tested on 35 unbound docking cases. The results show that RpveScore including all terms performs best. RpveScore was also compared with ITScore-PR, a potential derived by a statistical mechanics-based method, and with the united atom-based statistical potentials QUASI-RNP and DARS-RNP. The success rate of RpveScore is 71.6% for the top 1000 structures, and a near-native structure is ranked in the top 30 in 25 out of 35 cases. For 32 systems (91.4%), RpveScore can find in the top 5 a binding mode that has no fewer than 50% of the native interface residues on the protein and nucleotides on the RNA. Additionally, it was found that the long-range electrostatic attractive energy plays an important role in distinguishing near-native structures from incorrect ones. This work can be helpful for the development of protein-RNA docking methods and for the understanding of protein-RNA interactions. The RpveScore program is available to the public at http://life.bjut.edu.cn/kxyj/kycg/2017116/14845362285362368_1.html © 2016 Wiley Periodicals, Inc.
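The weight-fitting step described in this abstract (multiple linear regression of the energy terms against L_rmsd) can be illustrated with a minimal pure-Python sketch. It is not the RpveScore implementation: the function names `fit_weights`, `predict`, and `rank_decoys` are hypothetical, the intercept term is an assumption, and the regression is solved here through the normal equations with Gaussian elimination simply to keep the example free of dependencies.

```python
from typing import List, Sequence


def fit_weights(terms: Sequence[Sequence[float]], lrmsd: Sequence[float]) -> List[float]:
    """Ordinary least squares fit of L_rmsd on per-decoy energy terms.

    Returns [intercept, w_1, ..., w_k]. The intercept is an assumption of
    this sketch; the abstract does not say whether one was used.
    """
    rows = [[1.0] + [float(v) for v in row] for row in terms]
    n = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, lrmsd)) for i in range(n)]
    # Forward elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("energy terms are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(col + 1, n):
            factor = a[k][col] / a[col][col]
            for j in range(col, n):
                a[k][j] -= factor * a[col][j]
            b[k] -= factor * b[col]
    # Back substitution
    w = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - rest) / a[i][i]
    return w


def predict(weights: Sequence[float], term_values: Sequence[float]) -> float:
    """Weighted combined score: intercept plus the weighted sum of the terms."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], term_values))


def rank_decoys(weights: Sequence[float], decoys: Sequence[Sequence[float]]) -> List[int]:
    """Decoy indices ordered from lowest to highest predicted L_rmsd."""
    return sorted(range(len(decoys)), key=lambda i: predict(weights, decoys[i]))
```

Because the score is fitted to L_rmsd, a lower predicted value is read here as "closer to native", which is why `rank_decoys` sorts in ascending order; that reading is an inference from the abstract rather than something it states.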
Affiliation(s)
- Zhao Zhang
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Lin Lu
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Yue Zhang
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Chun Hua Li
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Cun Xin Wang
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Xiao Yi Zhang
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
- Jian Jun Tan
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
7
Li CH, Cao LB, Su JG, Yang YX, Wang CX. A new residue-nucleotide propensity potential with structural information considered for discriminating protein-RNA docking decoys. Proteins 2011; 80:14-24. [PMID: 21953889; DOI: 10.1002/prot.23117] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 02/01/2011] [Revised: 05/30/2011] [Accepted: 06/13/2011] [Indexed: 01/15/2023]
Abstract
Understanding the key factors that influence the preferences of residue-nucleotide interactions in specific protein-RNA interactions has remained a research focus. We propose an effective approach to derive residue-nucleotide propensity potentials through considering both the types of residues and nucleotides, and secondary structure information of proteins and RNAs from the currently largest nonredundant and nonribosomal protein-RNA interaction database. To test the validity of the potentials, we used them to select near-native structures from protein-RNA docking poses. The results show that considering secondary structure information, especially for RNAs, greatly improves the predictive power of pair potentials. The success rate is raised from 50.7 to 65.5% for the top 2000 structures, and the number of cases in which a near-native structure is ranked in top 50 is increased from 7 to 13 out of 17 cases. Furthermore, the exclusion of ribosomes from the database contributes 8.3% to the success rate. In addition, some very interesting findings follow: (i) the protein secondary structure element π-helix is strongly associated with RNA-binding sites; (ii) the nucleotide uracil occurs frequently in the most preferred pairs in which the unpaired and non-Watson-Crick paired uracils are predominant, which is probably significant in evolution. The new residue-nucleotide potentials can be helpful for the progress of protein-RNA docking methods, and for understanding the mechanisms of protein-RNA interactions.
Affiliation(s)
- Chun Hua Li
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing, China
8
Fink F, Hochrein J, Wolowski V, Merkl R, Gronwald W. PROCOS: computational analysis of protein-protein complexes. J Comput Chem 2011; 32:2575-86. [PMID: 21630291; DOI: 10.1002/jcc.21837] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 11/09/2010] [Revised: 04/15/2011] [Accepted: 04/15/2011] [Indexed: 11/11/2022]
Abstract
One of the main challenges in protein-protein docking is a meaningful evaluation of the many putative solutions. Here we present a program (PROCOS) that calculates, for a given complex, a probability-like measure of being native. In contrast to scores often used for analyzing complex structures, the calculated probabilities offer the advantage of providing a fixed range of expected values. This will allow, in principle, the comparison of models corresponding to different targets that were solved with the same algorithm. Judgments are based on distributions of properties derived from a large database of native and false complexes. For complex analysis, PROCOS uses these property distributions of native and false complexes together with a support vector machine (SVM). PROCOS was compared to the established scoring schemes of ZRANK and DFIRE. Employing a set of experimentally solved native complexes, high probability values above 50% were obtained for 90% of these structures. Next, the performance of PROCOS was tested on the 40 binary targets of the Dockground decoy set, on 14 targets of the RosettaDock decoy set, and on 9 targets that participated in the CAPRI scoring evaluation. Again the advantage of using a probability-based scoring system becomes apparent, and a reasonable number of near-native complexes was found within the top-ranked complexes. In conclusion, a novel, fully automated method is presented that allows the reliable evaluation of protein-protein complexes.
Affiliation(s)
- Florian Fink
  - Institute of Functional Genomics, University of Regensburg, Regensburg, Germany
9
Kowalsman N, Eisenstein M. Combining interface core and whole interface descriptors in postscan processing of protein-protein docking models. Proteins 2009; 77:297-318. [DOI: 10.1002/prot.22436] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
10
Tsuchiya Y, Kanamori E, Nakamura H, Kinoshita K. Classification of heterodimer interfaces using docking models and construction of scoring functions for the complex structure prediction. Adv Appl Bioinform Chem 2009; 2:79-100. [PMID: 21918618; PMCID: PMC3169947; DOI: 10.2147/aabc.s6347] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022] [Open access]
Abstract
Protein–protein docking simulations can provide predicted structural models of complexes. In a docking simulation, several putative structural models are selected by scoring functions from an ensemble of many complex models. Scoring functions based on statistical analyses of heterodimers are usually designed to select, as the correct model, the complex model with the most abundant interaction mode found among the known complexes. However, because the formation schemes of heterodimers are extremely diverse, a single scoring function does not seem to be sufficient to describe the fitness of predicted models other than those with the most abundant interaction mode. Thus, it is necessary to classify the heterodimers in terms of their individual interaction modes, and then to construct a separate scoring function for each heterodimer type. In this study, we constructed a classification method for heterodimers based on the characteristics that discriminate between near-native and decoy models, which were found by comparing the interfaces in terms of their complementarity in hydrophobicity, electrostatic potential, and shape. Consequently, we found four heterodimer clusters, and then constructed multiple scoring functions, each of which was optimized for one cluster. Our multiple scoring functions were applied to predictions in unbound docking.
Affiliation(s)
- Yuko Tsuchiya
  - Institute of Medical Science, University of Tokyo, Tokyo, Japan
11
Abstract
We present version 3.0 of our publicly available protein-protein docking benchmark. This update includes 40 new test cases, representing a 48% increase from Benchmark 2.0. For all of the new cases, the crystal structures of both binding partners are available. As with Benchmark 2.0, Structural Classification of Proteins (Murzin et al., J Mol Biol 1995;247:536-540) was used to remove redundant test cases. The 124 unbound-unbound test cases in Benchmark 3.0 are classified into 88 rigid-body cases, 19 medium-difficulty cases, and 17 difficult cases, based on the degree of conformational change at the interface upon complex formation. In addition to providing the community with more test cases for evaluating docking methods, the expansion of Benchmark 3.0 will facilitate the development of new algorithms that require a large number of training examples. Benchmark 3.0 is available to the public at http://zlab.bu.edu/benchmark.
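The counts quoted in this abstract can be cross-checked with a short worked calculation. The symbols $N_{3.0}$ and $N_{2.0}$ are introduced here only for illustration, and the size of Benchmark 2.0 is inferred from the abstract's own figures under the assumption that no test cases were removed between versions.

```latex
\begin{align*}
N_{3.0} &= 88 + 19 + 17 = 124 \\
N_{2.0} &= N_{3.0} - 40 = 84 \\
\frac{40}{N_{2.0}} &= \frac{40}{84} \approx 0.476 \approx 48\%
\end{align*}
```

Under that assumption, the three difficulty classes sum to the stated 124 unbound-unbound cases, and 40 new cases on top of 84 matches the reported 48% increase.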
Affiliation(s)
- Howook Hwang
  - Boston University Bioinformatics Program, Boston, USA
- Brian Pierce
  - Boston University Bioinformatics Program, Boston, USA
- Joël Janin
  - Yeast Structural Genomics, IBBMC Université Paris-Sud, CNRS UMR 8619, 91405-Orsay, France
- Zhiping Weng
  - Boston University Bioinformatics Program, Boston, USA
  - Boston University Biomedical Engineering Department, Boston, USA
12
Martin J, Regad L, Etchebest C, Camproux AC. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes. Proteins 2008; 73:672-89. [DOI: 10.1002/prot.22091] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
13
Gong XQ, Chang S, Zhang QH, Li CH, Shen LZ, Ma XH, Wang MH, Liu B, He HQ, Chen WZ, Wang CX. A filter enhanced sampling and combinatorial scoring study for protein docking in CAPRI. Proteins 2007; 69:859-65. [PMID: 17803223; DOI: 10.1002/prot.21738] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
Protein-protein docking is usually performed with a two-step strategy, i.e., conformational sampling and decoy scoring. In this work, a new filter enhanced sampling scheme was proposed and added to the RosettaDock algorithm to improve the conformational sampling efficiency. The filter term is based on the statistical result that backbone hydrogen bonds in native protein structures are wrapped by more than nine hydrophobic groups, which shield them from attack by water molecules (Fernandez and Scheraga, Proc Natl Acad Sci USA 2003;100:113-118). A combinatorial scoring function, ComScore, specially designed for the other-type protein-protein complexes, was also adopted to select the near-native docked modes. ComScore was composed of the atomic contact energy and the van der Waals and electrostatic interaction energies, and the weight of each term was fit by multiple linear regression. To analyze our docking results, the filter enhanced sampling scheme was applied to targets T12, T20, and T21 after the CAPRI blind test, and improvements were obtained. The ligand least root mean square deviations (L_rmsds) were reduced and the hit numbers were increased. ComScore was used in the scoring test for CAPRI rounds 9-12, with good success in rounds 9 and 11.
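The description of ComScore suggests a weighted linear combination of the three energy terms. A hedged sketch of that form is given below; the symbols $w$ and $E$ and their subscripts are introduced here for illustration only, and whether the fitted function also contains a constant term is not stated in the abstract.

```latex
\[
\mathrm{ComScore} = w_{\mathrm{ACE}}\,E_{\mathrm{ACE}}
                  + w_{\mathrm{vdW}}\,E_{\mathrm{vdW}}
                  + w_{\mathrm{ele}}\,E_{\mathrm{ele}}
\]
```

Here $E_{\mathrm{ACE}}$, $E_{\mathrm{vdW}}$, and $E_{\mathrm{ele}}$ stand for the atomic contact energy, the van der Waals energy, and the electrostatic interaction energy, and the weights would be the coefficients obtained from the multiple linear regression.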
Affiliation(s)
- Xin Qi Gong
  - College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100022, China