1
Yue Y, Li S, Cheng Y, Wang L, Hou T, Zhu Z, He S. Integration of molecular coarse-grained model into geometric representation learning framework for protein-protein complex property prediction. Nat Commun 2024; 15:9629. [PMID: 39511202 PMCID: PMC11544137 DOI: 10.1038/s41467-024-53583-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 10/16/2024] [Indexed: 11/15/2024] Open
Abstract
Structure-based machine learning algorithms have been utilized to predict the properties of protein-protein interaction (PPI) complexes, such as binding affinity, which is critical for understanding biological mechanisms and disease treatments. While most existing algorithms represent PPI complex graph structures at the atom-scale or residue-scale, these representations can be computationally expensive or may not sufficiently integrate finer chemical-plausible interaction details for improving predictions. Here, we introduce MCGLPPI, a geometric representation learning framework that combines graph neural networks (GNNs) with MARTINI molecular coarse-grained (CG) models to predict PPI overall properties accurately and efficiently. Extensive experiments on three types of downstream PPI property prediction tasks demonstrate that at the CG-scale, MCGLPPI achieves competitive performance compared with the counterparts at the atom- and residue-scale, but with only a third of computational resource consumption. Furthermore, CG-scale pre-training on protein domain-domain interaction structures enhances its predictive capabilities for PPI tasks. MCGLPPI offers an effective and efficient solution for PPI overall property predictions, serving as a promising tool for the large-scale analysis of biomolecular interactions.
Affiliation(s)
- Yang Yue
  - School of Computer Science, The University of Birmingham, Edgbaston, Birmingham, UK
- Shu Li
  - Macao Polytechnic University, Macao, China
- Yihua Cheng
  - School of Computer Science, The University of Birmingham, Edgbaston, Birmingham, UK
- Lie Wang
  - Bone Marrow Transplantation Center of the First Affiliated Hospital, Institute of Immunology, Zhejiang University School of Medicine, Hangzhou, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Zexuan Zhu
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
- Shan He
  - School of Computer Science, The University of Birmingham, Edgbaston, Birmingham, UK
  - Macao Polytechnic University, Macao, China

2
Zheng D, Liang S, Zhang C. B-Cell Epitope Predictions Using Computational Methods. Methods Mol Biol 2023; 2552:239-254. [PMID: 36346595 DOI: 10.1007/978-1-0716-2609-2_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Identifying protein antigenic epitopes that are recognizable by antibodies is a key step in immunologic research. This type of research has broad medical applications, such as new immunodiagnostic reagent discovery, vaccine design, and antibody design. However, due to the countless possibilities of potential epitopes, the experimental search through trial and error would be too costly and time-consuming to be practical. To facilitate this process and improve its efficiency, computational methods were developed to predict both linear epitopes and discontinuous antigenic epitopes. For linear B-cell epitope prediction, many methods were developed, including PREDITOP, PEOPLE, BEPITOPE, BepiPred, COBEpro, ABCpred, AAP, BCPred, BayesB, BEOracle/BROracle, BEST, LBEEP, DRREP, iBCE-EL, SVMTriP, etc. For the more challenging yet important task of discontinuous epitope prediction, methods were also developed, including CEP, DiscoTope, PEPITO, ElliPro, SEPPA, EPITOPIA, PEASE, EpiPred, SEPIa, EPCES, EPSVR, etc. In this chapter, we will discuss computational methods for B-cell epitope predictions of both linear and discontinuous epitopes. SVMTriP and EPCES/EPSVR, the most successful among the methods for each type of prediction, will be used as model methods to detail the standard protocols. For linear epitope prediction, SVMTriP was reported to achieve a sensitivity of 80.1% and a precision of 55.2% with a fivefold cross-validation based on a large dataset, yielding an AUC of 0.702. For discontinuous or conformational B-cell epitope prediction, EPCES and EPSVR were both benchmarked by a curated independent test dataset in which all antigens had no complex structures with the antibody. The epitopes identified by these methods were later independently validated by various biochemical experiments. For these three model methods, webservers and all datasets are publicly available at http://sysbio.unl.edu/SVMTriP, http://sysbio.unl.edu/EPCES/, and http://sysbio.unl.edu/EPSVR/.
Affiliation(s)
- Dandan Zheng
  - Department of Radiation Oncology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Shide Liang
  - Department of Research and Development, Bio-Thera Solutions, Guangzhou, China
- Chi Zhang
  - School of Biological Sciences, University of Nebraska, Lincoln, NE, USA

3
Mallavarpu Ambrose J, Veeraraghavan VP, Kullappan M, Velmurugan D, Vennila R, Rupert S, Dorairaj S, Surapaneni KM. Molecular modeling studies of the effects of withaferin A and its derivatives against oncoproteins associated with breast cancer stem cell activity. Process Biochem 2021. [DOI: 10.1016/j.procbio.2021.09.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]

4
Enhancement of SARS-CoV-2 Receptor Binding Domain-CR3022 Human Antibody Binding Affinity via In silico Engineering Approach. Journal of Medical Microbiology and Infectious Diseases 2021. [DOI: 10.52547/jommid.9.3.156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open

5
Chen KH, Hu YJ. Residue-Residue Interaction Prediction via Stacked Meta-Learning. Int J Mol Sci 2021; 22:6393. [PMID: 34203772 PMCID: PMC8232778 DOI: 10.3390/ijms22126393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Revised: 06/06/2021] [Accepted: 06/13/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions (PPIs) are the basis of most biological functions determined by residue-residue interactions (RRIs). Predicting residue pairs responsible for the interaction is crucial for understanding the cause of a disease and drug design. Computational approaches, which offer inexpensive and faster solutions for RRI prediction, have been widely used to predict protein interfaces for further analysis. This study presents RRI-Meta, an ensemble meta-learning-based method for RRI prediction. Its hierarchical learning structure comprises four base classifiers and one meta-classifier to integrate predictive strengths from different classifiers. It considers multiple feature types, including sequence-, structure-, and neighbor-based features, for characterizing other properties of a residue interaction environment to better distinguish between noninteracting and interacting residues. We conducted the same experiments using the same data as previously reported in the literature to demonstrate RRI-Meta's performance. Experimental results show that RRI-Meta is superior to several current prediction tools. Additionally, to analyze the factors that affect the performance of RRI-Meta, we conducted a comparative case study using different protein complexes.
Affiliation(s)
- Kuan-Hsi Chen
  - College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
- Yuh-Jyh Hu
  - Institute of Biomedical Engineering, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
  - Correspondence: Tel.: +886-3-571-2121

6
McCafferty CL, Marcotte EM, Taylor DW. Simplified geometric representations of protein structures identify complementary interaction interfaces. Proteins 2021; 89:348-360. [PMID: 33140424 PMCID: PMC7855953 DOI: 10.1002/prot.26020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/21/2020] [Revised: 09/22/2020] [Accepted: 10/25/2020] [Indexed: 12/12/2022]
Abstract
Protein-protein interactions are critical to protein function, but three-dimensional (3D) arrangements of interacting proteins have proven hard to predict, even given the identities and 3D structures of the interacting partners. Specifically, identifying the relevant pairwise interaction surfaces remains difficult, often relying on shape complementarity with molecular docking while accounting for molecular motions to optimize rigid 3D translations and rotations. However, such approaches can be computationally expensive, and faster, less accurate approximations may prove useful for large-scale prediction and assembly of 3D structures of multi-protein complexes. We asked if a reduced representation of protein geometry retains enough information about molecular properties to predict pairwise protein interaction interfaces that are tolerant of limited structural rearrangements. Here, we describe a reduced representation of 3D protein accessible surfaces on which molecular properties such as charge, hydrophobicity, and evolutionary rate can be easily mapped, implemented in the MorphProt package. Pairs of surfaces are compared to rapidly assess partner-specific potential surface complementarity. On two available benchmarks of 185 overall known protein complexes, we observe predictions comparable to other structure-based tools at correctly identifying protein interaction surfaces. Furthermore, we examined the effect of molecular motion through normal mode simulation on a benchmark receptor-ligand pair and observed no marked loss of predictive accuracy for distortions of up to 6 Å Cα-RMSD. Thus, a shape reduction of protein surfaces retains considerable information about surface complementarity, offers enhanced speed of comparison relative to more complex geometric representations, and exhibits tolerance to conformational changes.
Affiliation(s)
- Caitlyn L. McCafferty
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- Edward M. Marcotte
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- David W. Taylor
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
  - LIVESTRONG Cancer Institutes, Dell Medical School, Austin, Texas, USA

7
EPCES and EPSVR: Prediction of B-Cell Antigenic Epitopes on Protein Surfaces with Conformational Information. Methods Mol Biol 2020; 2131:289-297. [PMID: 32162262 DOI: 10.1007/978-1-0716-0389-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023]
Abstract
Accurate prediction of discontinuous antigenic epitopes is important for immunologic research and medical applications, but it is not an easy problem. Currently, there are only a few prediction servers available, though discontinuous epitopes constitute the majority of all B-cell antigenic epitopes. In this chapter, we describe two online servers, EPCES and EPSVR, for discontinuous epitope prediction. All methods were benchmarked by a curated independent test set, in which all antigens had no complex structures with the antibody, and their epitopes were identified by various biochemical experiments. The servers and all datasets are available at http://sysbio.unl.edu/EPCES/ and http://sysbio.unl.edu/EPSVR/ .

8
Guo F, Zou Q, Yang G, Wang D, Tang J, Xu J. Identifying protein-protein interface via a novel multi-scale local sequence and structural representation. BMC Bioinformatics 2019; 20:483. [PMID: 31874604 PMCID: PMC6929278 DOI: 10.1186/s12859-019-3048-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/14/2019] [Accepted: 08/21/2019] [Indexed: 12/23/2022] Open
Abstract
Background: Protein-protein interaction plays a key role in a multitude of biological processes, such as signal transduction, de novo drug design, immune responses, and enzymatic activities. Gaining insight into various binding abilities can deepen our understanding of the interaction. It is of great interest to understand how proteins in a complex interact with each other. Many efficient methods have been developed for identifying protein-protein interfaces. Results: In this paper, we obtain local information on the protein-protein interface through multi-scale local average block and hexagon structure construction. Given a pair of proteins, we use a trained support vector regression (SVR) model to select the best configurations. On Benchmark v4.0, our method achieves an average Irmsd of 3.28 Å and an overall Fnat of 63%, which improves upon an Irmsd of 3.89 Å and Fnat of 49% for ZRANK, and an Irmsd of 3.99 Å and Fnat of 46% for ClusPro. On CAPRI targets, our method achieves an average Irmsd of 3.45 Å and an overall Fnat of 46%, which improves upon an Irmsd of 4.18 Å and Fnat of 40% for ZRANK, and an Irmsd of 5.12 Å and Fnat of 32% for ClusPro. The success rates of our method, FRODOCK 2.0, InterEvDock and SnapDock on Benchmark v4.0 are 41.5%, 29.0%, 29.4% and 37.0%, respectively. Conclusion: Experiments show that our method performs better than some state-of-the-art methods, based on the improved prediction quality in terms of CAPRI evaluation criteria. All these results demonstrate that our method is a valuable technological tool for identifying protein-protein interfaces.
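The abstract above reports Fnat without defining it. As a minimal sketch, assuming the usual CAPRI-style reading (the fraction of native interface residue-residue contacts that are reproduced in the docked model), the quantity can be computed from two contact sets. The function name, the residue labels, and the example contacts below are hypothetical; the distance cutoff used to decide what counts as a contact is assumed to have been applied beforehand.

```python
def fnat(native_contacts, predicted_contacts):
    """Fraction of native residue-residue contacts also present in the model.

    Both arguments are iterables of (receptor_residue, ligand_residue) pairs.
    How contacts are detected (e.g. a distance cutoff) is assumed to be
    handled upstream and is not part of this sketch.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(predicted_contacts)) / len(native)


# Hypothetical contact lists, for illustration only (not data from the paper).
native = {("A:45", "B:12"), ("A:46", "B:12"), ("A:50", "B:30"), ("A:51", "B:31")}
model = {("A:45", "B:12"), ("A:50", "B:30"), ("A:52", "B:31")}

score = fnat(native, model)
print(f"Fnat = {score:.2f}")
```

Contacts that appear only in the model do not lower the score under this definition, which is why Fnat is usually reported alongside a geometric measure such as Irmsd.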
Affiliation(s)
- Fei Guo
  - College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, People's Republic of China
- Guang Yang
  - School of Economics, Nankai University, Tianjin, People's Republic of China
- Dan Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
- Jijun Tang
  - College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, USA
- Junhai Xu
  - College of Intelligence and Computing, Tianjin University, Tianjin, People's Republic of China

9
Kamal H, Minhas FUAA, Tripathi D, Abbasi WA, Hamza M, Mustafa R, Khan MZ, Mansoor S, Pappu HR, Amin I. βC1, pathogenicity determinant encoded by Cotton leaf curl Multan betasatellite, interacts with calmodulin-like protein 11 (Gh-CML11) in Gossypium hirsutum. PLoS One 2019; 14:e0225876. [PMID: 31794580 PMCID: PMC6890265 DOI: 10.1371/journal.pone.0225876] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/06/2018] [Accepted: 11/14/2019] [Indexed: 01/14/2023] Open
Abstract
Begomoviruses interfere with host plant machinery to evade host defense mechanisms by interacting with plant proteins. In the Old World, this group of viruses is usually associated with a betasatellite that induces severe disease symptoms by encoding a protein, βC1, which is a pathogenicity determinant. Here, we show that βC1 encoded by Cotton leaf curl Multan betasatellite (CLCuMB) requires Gossypium hirsutum calmodulin-like protein 11 (Gh-CML11) to infect cotton. First, we used an in silico approach to predict the interaction of CLCuMB-βC1 with Gh-CML11. A number of sequence- and structure-based in silico interaction prediction techniques suggested a strong putative binding of CLCuMB-βC1 with Gh-CML11 in a Ca2+-dependent manner. The in silico interaction prediction was then confirmed by three different experimental approaches: the Gh-CML11 interaction was confirmed using CLCuMB-βC1 in a yeast two-hybrid system and a pull-down assay. These results were further validated using a bimolecular fluorescence complementation system showing the interaction in cytoplasmic veins of Nicotiana benthamiana. Bioinformatics and molecular studies suggested that CLCuMB-βC1 induces the overexpression of Gh-CML11 protein and ultimately provides calcium as a nutrient source for virus movement and transmission. This is the first comprehensive study on the interaction between CLCuMB-βC1 and Gh-CML11 proteins, which provides insights into our understanding of the role of βC1 in cotton leaf curl disease.
Affiliation(s)
- Hira Kamal
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States of America
- Diwaker Tripathi
  - Department of Biology, University of Washington, Seattle, WA, United States of America
- Wajid Arshad Abbasi
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Muhammad Hamza
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Roma Mustafa
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Muhammad Zuhaib Khan
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Shahid Mansoor
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Hanu R. Pappu
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States of America
- Imran Amin
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan

10
Zhou HX, Pang X. Electrostatic Interactions in Protein Structure, Folding, Binding, and Condensation. Chem Rev 2018; 118:1691-1741. [PMID: 29319301 DOI: 10.1021/acs.chemrev.7b00305] [Citation(s) in RCA: 584] [Impact Index Per Article: 83.4] [Indexed: 02/07/2023]
Abstract
Charged and polar groups, through forming ion pairs, hydrogen bonds, and other less specific electrostatic interactions, impart important properties to proteins. Modulation of the charges on the amino acids, e.g., by pH and by phosphorylation and dephosphorylation, has significant effects such as protein denaturation and switch-like response of signal transduction networks. This review aims to present a unifying theme among the various effects of protein charges and polar groups. Simple models will be used to illustrate basic ideas about electrostatic interactions in proteins, and these ideas in turn will be used to elucidate the roles of electrostatic interactions in protein structure, folding, binding, condensation, and related biological functions. In particular, we will examine how charged side chains are spatially distributed in various types of proteins and how electrostatic interactions affect thermodynamic and kinetic properties of proteins. Our hope is to capture both important historical developments and recent experimental and theoretical advances in quantifying electrostatic contributions of proteins.
Affiliation(s)
- Huan-Xiang Zhou
  - Department of Chemistry and Department of Physics, University of Illinois at Chicago, Chicago, Illinois 60607, United States
  - Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, United States
- Xiaodong Pang
  - Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, United States

11
Qiu Z, Zhou B, Yuan J. Protein–protein interaction site predictions with minimum covariance determinant and Mahalanobis distance. J Theor Biol 2017; 433:57-63. [DOI: 10.1016/j.jtbi.2017.08.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/26/2017] [Revised: 08/26/2017] [Accepted: 08/30/2017] [Indexed: 10/18/2022]

12
Northey TC, Barešić A, Martin ACR. IntPred: a structure-based predictor of protein-protein interaction sites. Bioinformatics 2017; 34:223-229. [PMID: 28968673 PMCID: PMC5860208 DOI: 10.1093/bioinformatics/btx585] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 12/23/2016] [Revised: 08/21/2017] [Accepted: 09/15/2017] [Indexed: 11/17/2022] Open
Abstract
Motivation: Protein–protein interactions are vital for protein function, with the average protein having between three and ten interacting partners. Knowledge of precise protein–protein interfaces comes from crystal structures deposited in the Protein Data Bank (PDB), but only 50% of structures in the PDB are complexes. There is therefore a need to predict protein–protein interfaces in silico, and various methods exist for this purpose. Here we explore the use of a predictor that is based on structural features and exploits random forest machine learning, comparing its performance with a number of popular established methods. Results: On an independent test set of obligate and transient complexes, our IntPred predictor performs well (MCC = 0.370, ACC = 0.811, SPEC = 0.916, SENS = 0.411) and compares favourably with other methods. Overall, IntPred ranks second of six methods tested, with SPPIDER having slightly better overall performance (MCC = 0.410, ACC = 0.759, SPEC = 0.783, SENS = 0.676) but considerably worse specificity than IntPred. As with SPPIDER, using an independent test set of obligate complexes enhanced performance (MCC = 0.381), while performance is somewhat reduced on a dataset of transient complexes (MCC = 0.303). The trade-off between sensitivity and specificity compared with SPPIDER suggests that the choice of the appropriate tool is application-dependent. Availability and implementation: IntPred is implemented in Perl and may be downloaded for local use or run via a web server at www.bioinf.org.uk/intpred/. Supplementary information: Supplementary data are available at Bioinformatics online.
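The four figures quoted in the abstract above (MCC, ACC, SPEC, SENS) are standard confusion-matrix statistics. The sketch below shows how they relate to per-residue counts of true and false interface predictions, using their textbook definitions. The function name and the counts are hypothetical and are not taken from the IntPred evaluation.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Return ACC, SENS, SPEC and MCC for a binary interface/non-interface labelling."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    sens = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    spec = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SENS": sens, "SPEC": spec, "MCC": mcc}


# Hypothetical residue counts, for illustration only (not data from the paper).
m = confusion_metrics(tp=40, tn=90, fp=10, fn=60)
for name, value in m.items():
    print(f"{name} = {value:.3f}")
```

With imbalanced classes, as is typical when most surface residues are not in an interface, a high ACC and SPEC can coexist with a modest SENS, which is one reason MCC is reported as a single balanced summary.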
Affiliation(s)
- Thomas C Northey
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK
- Anja Barešić
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK
- Andrew C R Martin
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK

13
Molecular Simulations of Disulfide-Rich Venom Peptides with Ion Channels and Membranes. Molecules 2017; 22:362. [PMID: 28264446 PMCID: PMC6155311 DOI: 10.3390/molecules22030362] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/08/2017] [Revised: 02/23/2017] [Accepted: 02/24/2017] [Indexed: 12/12/2022] Open
Abstract
Disulfide-rich peptides isolated from the venom of arthropods and marine animals are a rich source of potent and selective modulators of ion channels. This makes these peptides valuable lead molecules for the development of new drugs to treat neurological disorders. Consequently, much effort goes into understanding their mechanism of action. This paper presents an overview of how molecular simulations have been used to study the interactions of disulfide-rich venom peptides with ion channels and membranes. The review is focused on the use of docking, molecular dynamics simulations, and free energy calculations to (i) predict the structure of peptide-channel complexes; (ii) calculate binding free energies including the effect of peptide modifications; and (iii) study the membrane-binding properties of disulfide-rich venom peptides. The review concludes with a summary and outlook.

14
da Cunha NB, Cobacho NB, Viana JFC, Lima LA, Sampaio KBO, Dohms SSM, Ferreira ACR, de la Fuente-Núñez C, Costa FF, Franco OL, Dias SC. The next generation of antimicrobial peptides (AMPs) as molecular therapeutic tools for the treatment of diseases with social and economic impacts. Drug Discov Today 2017; 22:234-248. [PMID: 27890668 PMCID: PMC7185764 DOI: 10.1016/j.drudis.2016.10.017] [Citation(s) in RCA: 128] [Impact Index Per Article: 16.0] [Received: 03/22/2016] [Revised: 10/28/2016] [Accepted: 10/31/2016] [Indexed: 12/02/2022]
Abstract
Anti-infective drugs have had a key role in the contemporary world, contributing to dramatically decrease mortality rates caused by infectious diseases worldwide. Antimicrobial peptides (AMPs) are multifunctional effectors of the innate immune system of mucosal surfaces and present antimicrobial activity against a range of pathogenic viruses, bacteria, and fungi. However, the discovery and development of new antibacterial drugs is a crucial step to overcome the great challenge posed by the emergence of antibiotic resistance. In this review, we outline recent advances in the development of novel AMPs with improved antimicrobial activities that were achieved through characteristic structural design. In addition, we describe recent progress made to overcome some of the major limitations that have hindered peptide biosynthesis.
Affiliation(s)
- Nicolau B da Cunha
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
  - Genomic Sciences and Biotechnology Program - Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Nicole B Cobacho
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Juliane F C Viana
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
  - Universidade Ceuma, Rua Josué Montello, 1, 65060-645 São Luís, MA, Brazil
- Loiane A Lima
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Kamila B O Sampaio
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Stephan S M Dohms
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Arthur C R Ferreira
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- César de la Fuente-Núñez
  - Synthetic Biology Group, MIT Synthetic Biology Center, Massachusetts Institute of Technology, 02139 Cambridge, MA, USA
  - Research Laboratory of Electronics, Massachusetts Institute of Technology, 02139 Cambridge, MA, USA
  - Department of Biological Engineering, Massachusetts Institute of Technology, 02139 Cambridge, MA, USA
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 02142 Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, 02142 Cambridge, MA, USA
  - Harvard Biophysics Program, Harvard University, 02115 Boston, MA, USA
- Fabrício F Costa
  - Genomic Sciences and Biotechnology Program - Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
- Octávio L Franco
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
  - Genomic Sciences and Biotechnology Program - Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
  - S-Inova Biotech, Post-Graduation in Biotechnology, Universidade Católica Dom Bosco, 79117-900 Campo Grande, MS, Brazil
- Simoni C Dias
  - Center of Proteomic and Biochemical Analysis, Post-Graduation in Genomic Sciences and Biotechnology Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil
  - Genomic Sciences and Biotechnology Program - Universidade Católica de Brasília UCB, SGAN 916, Modulo B, Bloco C, 70.790-160 Brasilia, DF, Brazil

15
Abstract
The ClusPro server (https://cluspro.org) is a widely used tool for protein-protein docking. The server provides a simple home page for basic use, requiring only two files in Protein Data Bank (PDB) format. However, ClusPro also offers a number of advanced options to modify the search; these include the removal of unstructured protein regions, application of attraction or repulsion, accounting for pairwise distance restraints, construction of homo-multimers, consideration of small-angle X-ray scattering (SAXS) data, and location of heparin-binding sites. Six different energy functions can be used, depending on the type of protein. Docking with each energy parameter set results in ten models defined by centers of highly populated clusters of low-energy docked structures. This protocol describes the use of the various options, the construction of auxiliary restraints files, the selection of the energy parameters, and the analysis of the results. Although the server is heavily used, runs are generally completed in <4 h.

16
Wei Q, La D, Kihara D. BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns. Methods Mol Biol 2017; 1529:279-289. [PMID: 27914057 DOI: 10.1007/978-1-4939-6637-0_14] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures for protein complex structures. Since prediction methods essentially assess the propensity of amino acids to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce BindML and BindML+, two methods for predicting protein-protein interaction sites. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input to the web server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/ .
Affiliation(s)
- Qing Wei
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- David La
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
17
Li M, Goncearenco A, Panchenko AR. Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols. Methods Mol Biol 2017; 1550:235-260. [PMID: 28188534 PMCID: PMC5388446 DOI: 10.1007/978-1-4939-6747-6_17] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 05/02/2023]
Abstract
In this review we describe a protocol to annotate the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases that store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address the particularly difficult problem of deciphering the molecular mechanisms of mutations in proteins and protein complexes in order to understand the origins and mechanisms of diseases.
Affiliation(s)
- Minghui Li
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Alexander Goncearenco
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Anna R Panchenko
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
18
Dai W, Wu A, Ma L, Li YX, Jiang T, Li YY. A novel index of protein-protein interface propensity improves interface residue recognition. BMC Syst Biol 2016; 10:112. [PMID: 28155660 PMCID: PMC5259823 DOI: 10.1186/s12918-016-0351-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 05/29/2023]
Abstract
Background: Protein-protein interfaces hold important information about protein-protein interactions, which play key roles in most biological processes. In the past few years, considerable effort has been made to improve interface residue recognition by characterizing protein-protein interfaces and extracting relevant features. However, most previous studies were carried out at a qualitative level, and there are also some inconsistencies between them. Results: In the present work, to improve interface residue recognition, we built a novel quantitative residue protein-protein interface propensity index (QIPI) and gained a comprehensive picture of protein-protein interfaces by analyzing our comprehensive protein-protein interface dataset (Astral2.05-40-4506). Furthermore, to assess the effect of QIPI on protein-protein interface prediction, we developed an interface residue recognition method, SPR (Single domain based Patch Recognition), based on the QIPI. The evaluation results showed that QIPI is able to improve interface residue recognition. Conclusions: Through a comprehensive quantitative analysis of protein-protein interfaces, we constructed a novel quantitative protein-protein interface propensity index (QIPI), which can easily be applied to improve interface residue recognition and is helpful in understanding protein-protein interfaces. Availability: QIPI and SPR are available to non-commercial users at our website: http://www.scbit.org/QIPI/. Electronic supplementary material: The online version of this article (doi:10.1186/s12918-016-0351-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wentao Dai
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China; Shanghai Industrial Technology Institute, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China
- Aiping Wu
- Suzhou Institute of Systems Medicine, Suzhou, Jiangsu, 215123, China
- Liangxiao Ma
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China
- Yi-Xue Li
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China; Shanghai Industrial Technology Institute, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China; Shanghai Engineering Research Center of Pharmaceutical Translation, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China
- Taijiao Jiang
- Suzhou Institute of Systems Medicine, Suzhou, Jiangsu, 215123, China; Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100005, China
- Yuan-Yuan Li
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China; Shanghai Industrial Technology Institute, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China; Shanghai Engineering Research Center of Pharmaceutical Translation, 1278 Keyuan Road, Shanghai, 201203, People's Republic of China
19
Du T, Liao L, Wu CH. Enhancing interacting residue prediction with integrated contact matrix prediction in protein-protein interaction. EURASIP J Bioinform Syst Biol 2016; 2016:17. [PMID: 27818677 PMCID: PMC5075339 DOI: 10.1186/s13637-016-0051-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/02/2016] [Accepted: 09/25/2016] [Indexed: 11/10/2022]
Abstract
Identifying the residues in a protein that are involved in protein-protein interaction and identifying the contact matrix for a pair of interacting proteins are two computational tasks at different levels of an in-depth analysis of protein-protein interaction. Various methods for solving these two problems have been reported in the literature. However, the interacting residue prediction and contact matrix prediction were handled by and large independently in those existing methods, though intuitively good prediction of interacting residues will help with predicting the contact matrix. In this work, we developed a novel protein interacting residue prediction system, contact matrix-interaction profile hidden Markov model (CM-ipHMM), with the integration of contact matrix prediction and the ipHMM interaction residue prediction. We propose to leverage what is learned from the contact matrix prediction and utilize the predicted contact matrix as "feedback" to enhance the interaction residue prediction. The CM-ipHMM model showed significant improvement over the previous method that uses the ipHMM for predicting interaction residues only. It indicates that the downstream contact matrix prediction could help the interaction site prediction.
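The link between the two tasks can be made concrete: given a contact matrix for a pair of proteins, the interacting residues of each protein are the rows and columns that contain at least one contact. The following is a minimal sketch of that relationship only, not the CM-ipHMM method; the function name `interacting_residues` and the toy matrix are assumptions for illustration.

```python
from typing import List, Tuple


def interacting_residues(contact: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Derive per-protein interacting residues from a binary contact matrix.

    contact[i][j] == 1 means residue i of protein A touches residue j of protein B.
    """
    # A residue of protein A interacts if its row holds at least one contact.
    rows = [i for i, row in enumerate(contact) if any(row)]
    # A residue of protein B interacts if its column holds at least one contact.
    n_cols = len(contact[0]) if contact else 0
    cols = [j for j in range(n_cols) if any(row[j] for row in contact)]
    return rows, cols


# Hypothetical 3 x 3 contact matrix.
toy_contact = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 1],
]
sites_a, sites_b = interacting_residues(toy_contact)
print(sites_a, sites_b)  # [0, 2] [0, 2]
```

The reverse direction is lossy: knowing only the interacting residues constrains which matrix entries can be nonzero but does not determine the matrix, which is one reason the two predictions can inform each other.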
Affiliation(s)
- Tianchuan Du
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
- Li Liao
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
- Cathy H Wu
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716 USA
20
Kuo TH, Li KB. Predicting Protein-Protein Interaction Sites Using Sequence Descriptors and Site Propensity of Neighboring Amino Acids. Int J Mol Sci 2016; 17:1788. [PMID: 27792167 PMCID: PMC5133789 DOI: 10.3390/ijms17111788] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/08/2016] [Revised: 10/14/2016] [Accepted: 10/18/2016] [Indexed: 12/17/2022]
Abstract
Information about the interface sites of Protein–Protein Interactions (PPIs) is useful in many areas of biological research. However, despite the advancement of experimental techniques, the identification of PPI sites remains a challenging task. Using a statistical learning technique, we propose a computational tool for predicting PPI sites. As an alternative to similar approaches requiring structural information, the proposed method takes all of its input from protein sequences. In addition to typical sequence features, our method takes into consideration that interaction sites are not randomly distributed over the protein sequence. We characterized this positional preference using protein complexes with known structures, proposed a numerical index to estimate the propensity, and then incorporated the index into a learning system. The resulting predictor, without using structural information, yields an area under the ROC curve (AUC) of 0.675, recall of 0.597, precision of 0.311 and accuracy of 0.583 in a ten-fold cross-validation experiment. This performance is comparable to that of the previous approach in which structural information was used. Upon introducing B-factor data to our predictor, we demonstrated that the AUC can be further improved to 0.750. The tool is accessible at http://bsaltools.ym.edu.tw/predppis.
Affiliation(s)
- Tzu-Hao Kuo
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan
- Kuo-Bin Li
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan
- Office of Information Management, National Yang-Ming University Hospital, Yilan 260, Taiwan
21
Guo F, Ding Y, Li SC, Shen C, Wang L. Protein–protein interface prediction based on hexagon structure similarity. Comput Biol Chem 2016; 63:83-88. [DOI: 10.1016/j.compbiolchem.2016.02.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/19/2016] [Accepted: 02/01/2016] [Indexed: 01/17/2023]
22
Im W, Liang J, Olson A, Zhou HX, Vajda S, Vakser IA. Challenges in structural approaches to cell modeling. J Mol Biol 2016; 428:2943-64. [PMID: 27255863 PMCID: PMC4976022 DOI: 10.1016/j.jmb.2016.05.024] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 02/21/2016] [Revised: 05/19/2016] [Accepted: 05/24/2016] [Indexed: 11/17/2022]
Abstract
Computational modeling is essential for structural characterization of biomolecular mechanisms across the broad spectrum of scales. Adequate understanding of biomolecular mechanisms inherently involves our ability to model them. Structural modeling of individual biomolecules and their interactions has been rapidly progressing. However, in terms of the broader picture, the focus is shifting toward larger systems, up to the level of a cell. Such modeling involves a more dynamic and realistic representation of the interactomes in vivo, in a crowded cellular environment, as well as membranes and membrane proteins, and other cellular components. Structural modeling of a cell complements computational approaches to cellular mechanisms based on differential equations, graph models, and other techniques to model biological networks, imaging data, etc. Structural modeling along with other computational and experimental approaches will provide a fundamental understanding of life at the molecular level and lead to important applications to biology and medicine. A cross section of diverse approaches presented in this review illustrates the developing shift from the structural modeling of individual molecules to that of cell biology. Studies in several related areas are covered: biological networks; automated construction of three-dimensional cell models using experimental data; modeling of protein complexes; prediction of non-specific and transient protein interactions; thermodynamic and kinetic effects of crowding; cellular membrane modeling; and modeling of chromosomes. The review presents an expert opinion on the current state-of-the-art in these various aspects of structural modeling in cellular biology, and the prospects of future developments in this emerging field.
Affiliation(s)
- Wonpil Im
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States
- Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, United States
- Arthur Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, United States
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States
- Ilya A Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States
23
Du X, Sun S, Hu C, Li X, Xia J. Prediction of protein-protein interaction sites by means of ensemble learning and weighted feature descriptor. J Biol Res (Thessalon) 2016; 23:10. [PMID: 27437195 PMCID: PMC4943499 DOI: 10.1186/s40709-016-0046-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/03/2022]
Abstract
Background: Reliable prediction of protein–protein interaction sites is an important goal in the field of bioinformatics. Many computational methods have been explored for the large-scale prediction of protein–protein interaction sites based on various data types, including protein sequence, structural and genomic data. Although much progress has been achieved in recent years, the problem has not yet been satisfactorily solved. Results: In this work, we present an efficient approach that uses an ensemble learning algorithm with a weighted feature descriptor (EL-WFD) to predict protein–protein interaction sites. The weighted feature descriptor was designed to describe the distance influence of neighboring residues on interaction sites. The results on two datasets (Hetero and Homo) show that the proposed method yields satisfactory accuracy, with 83.8% recall and 96.3% precision on the Hetero dataset and 84.2% recall and 96.3% precision on the Homo dataset. On both datasets, our method tends to obtain a high Matthews correlation coefficient compared with the state-of-the-art random forest method. Conclusions: The experimental results show that the EL-WFD method is quite effective in predicting protein–protein interaction sites. The novel weighted feature descriptor proved to be promising for discovering interaction sites. Overall, the proposed method can be considered a powerful new tool for predicting protein–protein interaction sites with excellent performance.
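For readers comparing the reported figures, recall, precision, and the Matthews correlation coefficient (MCC) all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function names and the confusion counts are hypothetical and are not taken from the paper.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted interaction sites that are true sites."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of true interaction sites that were predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for an imbalanced residue-level test set.
tp, fp, tn, fn = 80, 20, 880, 20
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 0.8
print(mcc(tp, fp, tn, fn))
```

Unlike accuracy, MCC uses all four cells, which is why it is often reported for interface prediction, where non-interface residues usually far outnumber interface residues.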
Affiliation(s)
- Xiuquan Du
- School of Computer Science and Technology, Anhui University, Hefei, 230601 Anhui, China; Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, 230601 Anhui, China
- Shiwei Sun
- School of Computer Science and Technology, Anhui University, Hefei, 230601 Anhui, China
- Changlin Hu
- School of Computer Science and Technology, Anhui University, Hefei, 230601 Anhui, China
- Xinrui Li
- School of Computer Science and Technology, Anhui University, Hefei, 230601 Anhui, China
- Junfeng Xia
- Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei, 230601 Anhui, China; Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, 230601 Anhui, China; Institute of Health Sciences, Anhui University, Hefei, 230601 Anhui, China
24
Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022]
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.
25
Xue LC, Dobbs D, Bonvin AMJJ, Honavar V. Computational prediction of protein interfaces: A review of data driven methods. FEBS Lett 2015; 589:3516-26. [PMID: 26460190 PMCID: PMC4655202 DOI: 10.1016/j.febslet.2015.10.003] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Received: 07/06/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/06/2023]
Abstract
Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction.
Affiliation(s)
- Li C Xue
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands
- Drena Dobbs
- Department of Genetics, Development & Cell Biology, Iowa State Univ., Ames, IA 50011, USA; Bioinformatics & Computational Biology Program, Iowa State Univ., Ames, IA 50011, USA
- Alexandre M J J Bonvin
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands
- Vasant Honavar
- College of Information Sciences & Technology, Pennsylvania State Univ., University Park, PA 16802, USA; Genomics & Bioinformatics Program, Pennsylvania State Univ., University Park, PA 16802, USA; Neuroscience Program, Pennsylvania State Univ., University Park, PA 16802, USA; The Huck Institutes of the Life Sciences, Pennsylvania State Univ., University Park, PA 16802, USA; Center for Big Data Analytics & Discovery Informatics, Pennsylvania State Univ., University Park, PA 16802, USA; Institute for Cyberscience, Pennsylvania State Univ., University Park, PA 16802, USA
26
Prediction of Protein-Protein Interaction Sites Based on Naive Bayes Classifier. Biochem Res Int 2015; 2015:978193. [PMID: 26697220 PMCID: PMC4677168 DOI: 10.1155/2015/978193] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/14/2015] [Revised: 11/05/2015] [Accepted: 11/12/2015] [Indexed: 11/18/2022]
Abstract
Proteins function through interactions with other proteins and biomolecules, and these interactions occur at the so-called interface residues of the protein sequences. Identifying interface residues helps us better understand the biological mechanism of protein interaction. Information about the interface residues also contributes to the understanding of metabolic and signal transduction networks and indicates directions in drug design. In recent years, researchers have focused on developing new computational methods for predicting protein interface residues. Here we used a 181-dimension protein sequence feature vector as input to a Naive Bayes Classifier (NBC)-based method to predict interaction sites in protein-protein complexes. The prediction of interaction sites in protein interactions is treated as a binary classification problem over amino acid residues by applying NBC with protein sequence features. Independent test results suggested that the Naive Bayes Classifier-based method with the protein sequence features as input vectors performed well.
27
Soner S, Ozbek P, Garzon JI, Ben-Tal N, Haliloglu T. DynaFace: Discrimination between Obligatory and Non-obligatory Protein-Protein Interactions Based on the Complex's Dynamics. PLoS Comput Biol 2015; 11:e1004461. [PMID: 26506003 PMCID: PMC4623975 DOI: 10.1371/journal.pcbi.1004461] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/30/2015] [Accepted: 07/08/2015] [Indexed: 12/31/2022]
Abstract
Protein-protein interfaces have been evolutionarily-designed to enable transduction between the interacting proteins. Thus, we hypothesize that analysis of the dynamics of the complex can reveal details about the nature of the interaction, and in particular whether it is obligatory, i.e., persists throughout the entire lifetime of the proteins, or not. Indeed, normal mode analysis, using the Gaussian network model, shows that for the most part obligatory and non-obligatory complexes differ in their decomposition into dynamic domains, i.e., the mobile elements of the protein complex. The dynamic domains of obligatory complexes often mix segments from the interacting chains, and the hinges between them do not overlap with the interface between the chains. In contrast, in non-obligatory complexes the interface often hinges between dynamic domains, held together through few anchor residues on one side of the interface that interact with their counterpart grooves in the other end. In automatic analysis, 117 of 139 obligatory (84.2%) and 203 of 246 non-obligatory (82.5%) complexes are correctly classified by our method: DynaFace. We further use DynaFace to predict obligatory and non-obligatory interactions among a set of 300 putative protein complexes. DynaFace is available at: http://safir.prc.boun.edu.tr/dynaface.
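The per-class rates quoted in the abstract can be reproduced directly from the stated counts, and the same counts give the pooled accuracy over both classes. This is a small arithmetic check; the helper name `percent` is an assumption, and the pooled figure is derived here rather than reported in the abstract.

```python
def percent(correct: int, total: int) -> float:
    """Percentage of correctly classified complexes, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Counts as stated in the abstract: (correct, total) per class.
obligatory = (117, 139)
non_obligatory = (203, 246)

obligatory_rate = percent(*obligatory)
non_obligatory_rate = percent(*non_obligatory)
# Pooled accuracy over all 385 complexes (derived, not quoted in the abstract).
pooled_rate = percent(obligatory[0] + non_obligatory[0],
                      obligatory[1] + non_obligatory[1])

print(obligatory_rate, non_obligatory_rate, pooled_rate)  # 84.2 82.5 83.1
```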
Affiliation(s)
- Seren Soner
- Department of Computer Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
- Pemra Ozbek
- Department of Bioengineering, Marmara University, Istanbul, Turkey
- Jose Ignacio Garzon
- Departments of Biochemistry and Molecular Biophysics and Systems Biology and Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Nir Ben-Tal
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Turkan Haliloglu
- Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
28
Guo F, Li SC, Wei Z, Zhu D, Shen C, Wang L. Structural neighboring property for identifying protein-protein binding sites. BMC Syst Biol 2015; 9 Suppl 5:S3. [PMID: 26356630 PMCID: PMC4565107 DOI: 10.1186/1752-0509-9-s5-s3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
Background: Protein-protein interactions play a key role in many biological functions and are relevant to applications such as drug design and functional analysis. Determination of binding sites is widely applied in molecular biology research, and many efficient methods have therefore been developed for identifying binding sites. In this paper, we calculate the structural neighboring property using a Voronoi diagram. Using 6,438 complexes, we study local biases of the structural neighboring property at the interface. Results: We propose a novel statistical method to extract interacting residues; interacting patches can then be clustered into predicted interface residues. In addition, the structural neighboring property can be adopted to construct a new energy function for evaluating docking solutions. It includes the new statistical property as well as existing energy terms. Compared with existing methods, our approach improves the overall Fnat value by at least 3%. On Benchmark v4.0, our method has an average Irmsd value of 3.31 Å and an overall Fnat value of 63%, which improves upon an Irmsd of 3.89 Å and Fnat of 49% for ZRANK, and an Irmsd of 3.99 Å and Fnat of 46% for ClusPro. On the CAPRI targets, our method has an average Irmsd value of 3.46 Å and an overall Fnat value of 45%, which improves upon an Irmsd of 4.18 Å and Fnat of 40% for ZRANK, and an Irmsd of 5.12 Å and Fnat of 32% for ClusPro. Conclusions: Experiments show that our method achieves better results than some state-of-the-art methods for identifying protein-protein binding sites, with the prediction quality improved in terms of the CAPRI evaluation criteria.
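Fnat, as used in CAPRI-style evaluation, is the fraction of native residue-residue contacts that are reproduced in a predicted complex. The sketch below assumes contacts have already been extracted as pairs of residue indices; the function name `fnat` and the toy contact sets are hypothetical and are not data from the paper.

```python
from typing import Set, Tuple

Contact = Tuple[int, int]  # (residue index in receptor, residue index in ligand)


def fnat(native: Set[Contact], predicted: Set[Contact]) -> float:
    """Fraction of native contacts that also appear in the predicted complex."""
    if not native:
        raise ValueError("native contact set must not be empty")
    return len(native & predicted) / len(native)


# Hypothetical contact sets: two of the four native contacts are recovered.
native_contacts = {(1, 10), (2, 11), (3, 12), (4, 13)}
predicted_contacts = {(1, 10), (2, 11), (5, 14)}
score = fnat(native_contacts, predicted_contacts)
print(score)  # 0.5
```

Extra predicted contacts that are absent from the native complex, such as `(5, 14)` here, do not lower Fnat, which is why it is usually reported together with a geometric measure such as Irmsd.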
29
Identification of Protein–Protein Interactions by Detecting Correlated Mutation at the Interface. J Chem Inf Model 2015; 55:2042-9. [DOI: 10.1021/acs.jcim.5b00320] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
30
Hwang H, Petrey D, Honig B. A hybrid method for protein-protein interface prediction. Protein Sci 2015; 25:159-65. [PMID: 26178156 DOI: 10.1002/pro.2744] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 04/16/2015] [Revised: 07/02/2015] [Accepted: 07/06/2015] [Indexed: 12/31/2022]
Abstract
The growing structural coverage of proteomes is making structural comparison a powerful tool for function annotation. Such template-based approaches are based on the observation that structural similarity is often sufficient to infer similar function. However, it seems clear that, in addition to structural similarity, the specific characteristics of a given protein should also be taken into account in predicting function. Here we describe PredUs 2.0, a method to predict regions on a protein surface likely to bind other proteins, that is, interfacial residues. PredUs 2.0 is based on the PredUs method that is entirely template-based and uses known binding sites in structurally similar proteins to predict interfacial residues. PredUs 2.0 uses a Bayesian approach to combine the template-based scoring of PredUs with a score that reflects the propensities of individual amino acids to be in interfaces. PredUs 2.0 includes a novel protein size dependent metric to determine the number of residues that should be reported as interfacial. PredUs 2.0 significantly outperforms PredUs as well as other published interface prediction methods.
Affiliation(s)
- Howook Hwang
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
- Donald Petrey
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
- Barry Honig
- Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Howard Hughes Medical Institute, Columbia University, 1130 St. Nicholas Ave., Room 815, New York, NY, 10032
31
Janin J, Wodak SJ, Lensink MF, Velankar S. Assessing Structural Predictions of Protein-Protein Recognition: The CAPRI Experiment. Reviews in Computational Chemistry 2015. [DOI: 10.1002/9781118889886.ch4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
32
Maheshwari S, Brylinski M. Predicting protein interface residues using easily accessible on-line resources. Brief Bioinform 2015; 16:1025-34. [PMID: 25797794 DOI: 10.1093/bib/bbv009] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 12/01/2014] [Indexed: 01/20/2023]
Abstract
It has been more than a decade since the completion of the Human Genome Project that provided us with a complete list of human proteins. The next obvious task is to figure out how various parts interact with each other. On that account, we review 10 methods for protein interface prediction, which are freely available as web servers. In addition, we comparatively evaluate their performance on a common data set comprising different quality target structures. We find that using experimental structures and high-quality homology models, structure-based methods outperform those using only protein sequences, with global template-based approaches providing the best performance. For moderate-quality models, sequence-based methods often perform better than those structure-based techniques that rely on fine atomic details. We note that post-processing protocols implemented in several methods quantitatively improve the results only for experimental structures, suggesting that these procedures should be tuned up for computer-generated models. Finally, we anticipate that advanced meta-prediction protocols are likely to enhance interface residue prediction. Notwithstanding further improvements, easily accessible web servers already provide the scientific community with convenient resources for the identification of protein-protein interaction sites.
|
33
|
Wierschin T, Wang K, Welter M, Waack S, Stanke M. Combining features in a graphical model to predict protein binding sites. Proteins 2015; 83:844-52. [PMID: 25663045 DOI: 10.1002/prot.24775] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/15/2014] [Revised: 01/16/2015] [Accepted: 01/26/2015] [Indexed: 11/08/2022]
Abstract
Large efforts have been made in classifying residues as binding sites in proteins using machine learning methods. The prediction task can be translated into the computational challenge of assigning each residue the label binding site or non-binding site. Observational data comes from various possibly highly correlated sources. It includes the structure of the protein but not the structure of the complex. The model class of conditional random fields (CRFs) has previously successfully been used for protein binding site prediction. Here, a new CRF-approach is presented that models the dependencies of residues using a general graphical structure defined as a neighborhood graph and thus our model makes fewer independence assumptions on the labels than sequential labeling approaches. A novel node feature "change in free energy" is introduced into the model, which is then denoted by ΔF-CRF. Parameters are trained with an online large-margin algorithm. Using the standard feature class relative accessible surface area alone, the general graph-structure CRF already achieves higher prediction accuracy than the linear chain CRF of Li et al. ΔF-CRF performs significantly better on a large range of false positive rates than the support-vector-machine-based program PresCont of Zellner et al. on a homodimer set containing 128 chains. ΔF-CRF has a broader scope than PresCont since it is not constrained to protein subgroups and requires no multiple sequence alignment. The improvement is attributed to the advantageous combination of the novel node feature with the standard feature and to the adopted parameter training method.
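The neighborhood graph mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how neighbors are defined, so the distance cutoff, the use of one representative coordinate per residue, and the function name `neighborhood_graph` are all assumptions made here for illustration, not the authors' construction.

```python
import math
from itertools import combinations
from typing import List, Set, Tuple

Point = Tuple[float, float, float]

def neighborhood_graph(coords: List[Point], cutoff: float) -> Set[Tuple[int, int]]:
    """Return undirected edges (i, j), i < j, between residues closer than `cutoff`.

    Assumption: each residue is represented by a single 3D point and two
    residues are neighbors when their Euclidean distance is <= cutoff.
    """
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if math.dist(a, b) <= cutoff:
            edges.add((i, j))
    return edges

# Hypothetical coordinates (in angstroms) for three residues.
example_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
edges = neighborhood_graph(example_coords, cutoff=5.0)  # only residues 0 and 1 are linked
```

Unlike a linear chain, which links only sequence-adjacent residues, a graph of this kind lets a labeling model couple residues that are close in space but distant in sequence.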
Affiliation(s)
- Torsten Wierschin
- Institute of Mathematics and Computer Science, University of Greifswald, 17487, Greifswald, Germany
|
34
|
Li Z, He Y, Wong L, Li J. Burial Level Change Defines a High Energetic Relevance for Protein Binding Interfaces. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:410-421. [PMID: 26357227 DOI: 10.1109/tcbb.2014.2361355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Protein-protein interfaces defined through atomic contact or solvent accessibility change are widely adopted in structural biology studies. But, these definitions cannot precisely capture energetically important regions at protein interfaces. The burial depth of an atom in a protein is related to the atom's energy. This work investigates how closely the change in burial level of an atom/residue upon complexation is related to the binding. Burial level change is different from burial level itself. An atom deeply buried in a monomer with a high burial level may not change its burial level after an interaction and it may have little burial level change. We hypothesize that an interface is a region of residues all undergoing burial level changes after interaction. By this definition, an interface can be decomposed into an onion-like structure according to the burial level change extent. We found that our defined interfaces cover energetically important residues more precisely, and that the binding free energy of an interface is distributed progressively from the outermost layer to the core. These observations are used to predict binding hot spots. Our approach's F-measure performance on a benchmark dataset of alanine mutagenesis residues is much superior or similar to those by complicated energy modeling or machine learning approaches.
|
35
|
Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015; 10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022]
Abstract
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
|
36
|
Robin G, Sato Y, Desplancq D, Rochel N, Weiss E, Martineau P. Restricted Diversity of Antigen Binding Residues of Antibodies Revealed by Computational Alanine Scanning of 227 Antibody–Antigen Complexes. J Mol Biol 2014; 426:3729-3743. [DOI: 10.1016/j.jmb.2014.08.013] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 05/28/2014] [Revised: 07/31/2014] [Accepted: 08/09/2014] [Indexed: 12/28/2022]
|
37
|
Cui D, Ou S, Patel S. Protein-spanning water networks and implications for prediction of protein-protein interactions mediated through hydrophobic effects. Proteins 2014; 82:3312-26. [DOI: 10.1002/prot.24683] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 06/04/2014] [Revised: 07/30/2014] [Accepted: 08/11/2014] [Indexed: 01/11/2023]
Affiliation(s)
- Di Cui
- Department of Chemistry and Biochemistry; University of Delaware; Newark Delaware 19716
- Shuching Ou
- Department of Chemistry and Biochemistry; University of Delaware; Newark Delaware 19716
- Sandeep Patel
- Department of Chemistry and Biochemistry; University of Delaware; Newark Delaware 19716
|
38
|
Dong Z, Wang K, Dang TKL, Gültas M, Welter M, Wierschin T, Stanke M, Waack S. CRF-based models of protein surfaces improve protein-protein interaction site predictions. BMC Bioinformatics 2014; 15:277. [PMID: 25124108 PMCID: PMC4150965 DOI: 10.1186/1471-2105-15-277] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/11/2013] [Accepted: 08/01/2014] [Indexed: 11/13/2022]
Abstract
Background The identification of protein-protein interaction sites is a computationally challenging task and important for understanding the biology of protein complexes. There is a rich literature in this field. A broad class of approaches assign to each candidate residue a real-valued score that measures how likely it is that the residue belongs to the interface. The prediction is obtained by thresholding this score. Some probabilistic models classify the residues on the basis of the posterior probabilities. In this paper, we introduce pairwise conditional random fields (pCRFs) in which edges are not restricted to the backbone as in the case of linear-chain CRFs utilized by Li et al. (2007). In fact, any 3D-neighborhood relation can be modeled. On grounds of a generalized Viterbi inference algorithm and a piecewise training process for pCRFs, we demonstrate how to utilize pCRFs to enhance a given residue-wise score-based protein-protein interface predictor on the surface of the protein under study. The features of the pCRF are solely based on the interface prediction scores of the predictor whose performance is to be improved. Results We performed three sets of experiments with synthetic scores assigned to the surface residues of proteins taken from the data set PlaneDimers compiled by Zellner et al. (2011), from the list published by Keskin et al. (2004) and from the very recent data set due to Cukuroglu et al. (2014). That way we demonstrated that our pCRF-based enhancer is effective provided that the interface residue score distribution and the non-interface residue score distribution are unimodal. Moreover, the pCRF-based enhancer is also successfully applicable if the distributions are only unimodal over a certain sub-domain. The improvement is then restricted to that domain. Thus we were able to improve the prediction of the PresCont server devised by Zellner et al. (2011) on PlaneDimers.
Conclusions Our results strongly suggest that pCRFs form a methodological framework to improve residue-wise score-based protein-protein interface predictors given the scores are appropriately distributed. A prototypical implementation of our method is accessible at http://ppicrf.informatik.uni-goettingen.de/index.html.
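The residue-wise score thresholding that this abstract takes as its starting point can be sketched as follows. This covers only the baseline step ("the prediction is obtained by thresholding this score"), not the pCRF enhancer itself; the function name, residue identifiers, score values and threshold are hypothetical.

```python
from typing import Dict, Set

def predict_interface(scores: Dict[str, float], threshold: float) -> Set[str]:
    """Label a residue as interface when its score reaches the threshold."""
    return {residue for residue, score in scores.items() if score >= threshold}

# Hypothetical scores for three surface residues (illustrative values only).
example_scores = {"A12": 0.91, "A13": 0.40, "A57": 0.75}
predicted = predict_interface(example_scores, threshold=0.7)  # {"A12", "A57"}
```

Each residue is classified independently here, which is the limitation a neighborhood-aware model such as a pCRF is meant to address.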
Affiliation(s)
- Stephan Waack
- Institute of Computer Science, University of Göttingen, Goldschmidtstr, 7, 37077 Göttingen, Germany.
|
39
|
Sudha G, Nussinov R, Srinivasan N. An overview of recent advances in structural bioinformatics of protein-protein interactions and a guide to their principles. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2014; 116:141-50. [PMID: 25077409 DOI: 10.1016/j.pbiomolbio.2014.07.004] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 05/16/2014] [Accepted: 07/13/2014] [Indexed: 12/20/2022]
Abstract
Rich data bearing on the structural and evolutionary principles of protein-protein interactions are paving the way to a better understanding of the regulation of function in the cell. This is particularly the case when these interactions are considered in the framework of key pathways. Knowledge of the interactions may provide insights into the mechanisms of crucial 'driver' mutations in oncogenesis. They also provide the foundation toward the design of protein-protein interfaces and inhibitors that can abrogate their formation or enhance them. The main features to learn from known 3-D structures of protein-protein complexes and the extensive literature which analyzes them computationally and experimentally include the interaction details which permit undertaking structure-based drug discovery, the evolution of complexes and their interactions, the consequences of alterations such as post-translational modifications, ligand binding, disease causing mutations, host pathogen interactions, oligomerization, aggregation and the roles of disorder, dynamics, allostery and more to the protein and the cell. This review highlights some of the recent advances in these areas, including design, inhibition and prediction of protein-protein complexes. The field is broad, and much work has been carried out in these areas, making it challenging to cover it in its entirety. Much of this is due to the fast increase in the number of molecules whose structures have been determined experimentally and the vast increase in computational power. Here we provide a concise overview.
Affiliation(s)
- Govindarajan Sudha
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India.
- Ruth Nussinov
- Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, MD 21702, USA; Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
|
40
|
Protein binding site prediction by combining hidden Markov support vector machine and profile-based propensities. ScientificWorldJournal 2014; 2014:464093. [PMID: 25133234 PMCID: PMC4122092 DOI: 10.1155/2014/464093] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/04/2014] [Accepted: 07/01/2014] [Indexed: 11/22/2022]
Abstract
Identification of protein binding sites is critical for studying the function of the proteins. In this paper, we proposed a method for protein binding site prediction, which combined the order profile propensities and hidden Markov support vector machine (HM-SVM). This method applied the sequential labeling technique to the field of protein binding site prediction. The input features of HM-SVM include the profile-based propensities, the Position-Specific Score Matrix (PSSM), and Accessible Surface Area (ASA). When tested on different data sets, the proposed method showed promising results, and outperformed some closely related methods by more than 10% in terms of AUC.
|
41
|
Andreani J, Guerois R. Evolution of protein interactions: From interactomes to interfaces. Arch Biochem Biophys 2014; 554:65-75. [DOI: 10.1016/j.abb.2014.05.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 02/10/2014] [Revised: 04/28/2014] [Accepted: 05/12/2014] [Indexed: 12/16/2022]
|
42
|
Esmaielbeiki R, Nebel JC. Scoring docking conformations using predicted protein interfaces. BMC Bioinformatics 2014; 15:171. [PMID: 24906633 PMCID: PMC4057934 DOI: 10.1186/1471-2105-15-171] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/11/2012] [Accepted: 05/29/2014] [Indexed: 12/22/2022]
Abstract
Background Since proteins function by interacting with other molecules, analysis of protein-protein interactions is essential for comprehending biological processes. Whereas understanding of atomic interactions within a complex is especially useful for drug design, limitations of experimental techniques have restricted their practical use. Despite progress in docking predictions, there is still room for improvement. In this study, we contribute to this topic by proposing T-PioDock, a framework for detection of a native-like docked complex 3D structure. T-PioDock supports the identification of near-native conformations among the 3D models produced by docking software by scoring those models using binding interfaces predicted by the interface predictor, Template based Protein Interface Prediction (T-PIP). Results First, exhaustive evaluation of interface predictors demonstrates that T-PIP, whose predictions are customised to target complexity, is a state-of-the-art method. Second, comparative study between T-PioDock and other state-of-the-art scoring methods establishes T-PioDock as the best performing approach. Moreover, there is good correlation between T-PioDock performance and quality of docking models, which suggests that progress in docking will lead to even better results at recognising near-native conformations. Conclusion Accurate identification of near-native conformations remains a challenging task. Although availability of 3D complexes will benefit from template-based methods such as T-PioDock, we have identified specific limitations which need to be addressed. First, docking software is still not able to produce native-like models for every target. Second, current interface predictors do not explicitly consider pairwise residue interactions between proteins and their interacting partners, which leaves ambiguity when assessing quality of complex conformations.
Affiliation(s)
- Reyhaneh Esmaielbeiki
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.
|
43
|
Dhole K, Singh G, Pai PP, Mondal S. Sequence-based prediction of protein–protein interaction sites with L1-logreg classifier. J Theor Biol 2014; 348:47-54. [DOI: 10.1016/j.jtbi.2014.01.028] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 07/26/2013] [Revised: 01/10/2014] [Accepted: 01/22/2014] [Indexed: 11/30/2022]
|
44
|
van Ingen H, Bonvin AMJJ. Information-driven modeling of large macromolecular assemblies using NMR data. JOURNAL OF MAGNETIC RESONANCE (SAN DIEGO, CALIF. : 1997) 2014; 241:103-114. [PMID: 24656083 DOI: 10.1016/j.jmr.2013.10.021] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 09/17/2013] [Accepted: 10/25/2013] [Indexed: 06/03/2023]
Abstract
Availability of high-resolution atomic structures is one of the prerequisites for a mechanistic understanding of biomolecular function. This atomic information can, however, be difficult to acquire for interesting systems such as high molecular weight and multi-subunit complexes. For these, low-resolution and/or sparse data from a variety of sources including NMR are often available to define the interaction between the subunits. To make best use of all the available information and shed light on these challenging systems, integrative computational tools are required that can judiciously combine and accurately translate the sparse experimental data into structural information. In this Perspective we discuss NMR techniques and data sources available for the modeling of large and multi-subunit complexes. Recent developments are illustrated by particularly challenging application examples taken from the literature. Within this context, we also position our data-driven docking approach, HADDOCK, which can integrate a variety of information sources to drive the modeling of biomolecular complexes. It is the synergy between experimentation and computational modeling that will provide us with detailed views on the machinery of life and lead to a mechanistic understanding of biomolecular function.
Affiliation(s)
- Hugo van Ingen
- NMR Spectroscopy Research Group, Bijvoet Center for Biomolecular Research, Utrecht University, Faculty of Science - Chemistry, Padulaan 8, 3854 CH Utrecht, The Netherlands.
- Alexandre M J J Bonvin
- NMR Spectroscopy Research Group, Bijvoet Center for Biomolecular Research, Utrecht University, Faculty of Science - Chemistry, Padulaan 8, 3854 CH Utrecht, The Netherlands.
|
45
|
Xue LC, Jordan RA, EL-Manzalawy Y, Dobbs D, Honavar V. DockRank: ranking docked conformations using partner-specific sequence homology-based protein interface prediction. Proteins 2014; 82:250-67. [PMID: 23873600 PMCID: PMC4417613 DOI: 10.1002/prot.24370] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 09/08/2012] [Revised: 06/27/2013] [Accepted: 07/09/2013] [Indexed: 12/11/2022]
Abstract
Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner-specific sequence homology-based protein-protein interface predictor (PS-HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) Variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving success rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/.
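The two evaluation measures defined in this abstract, Success Rate and Hit Rate over the top m ranked conformations, can be sketched as below. This is an illustration of the definitions as worded in the abstract, not the authors' evaluation code; the function names and example rankings are hypothetical, and computing Hit Rate per docking case (rather than pooled over all cases) is an assumption.

```python
from typing import Sequence

def success_rate(cases: Sequence[Sequence[bool]], m: int) -> float:
    """Percentage of cases with at least one near-native conformation in the top m.

    Each case is a ranked list of flags: True marks a near-native conformation.
    """
    if not cases:
        raise ValueError("need at least one docking case")
    hits = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * hits / len(cases)

def hit_rate(ranked: Sequence[bool], m: int) -> float:
    """Percentage of one case's near-native conformations found in the top m."""
    total = sum(ranked)
    if total == 0:
        raise ValueError("case has no near-native conformations")
    return 100.0 * sum(ranked[:m]) / total

# Hypothetical rankings for three docking cases, best-scored conformation first.
cases = [
    [False, True, False, True],
    [False, False, False, True],
    [True, False, False, False],
]
top2_success = success_rate(cases, m=2)  # two of the three cases have a hit in the top 2
top2_hits = hit_rate(cases[0], m=2)      # 50.0: one of two near-native models is in the top 2
```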
Affiliation(s)
- Li C. Xue
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Rafael A. Jordan
- Department of Computer Science, Iowa State University, Ames, Iowa
- Department of Systems and Computer Engineering, Pontificia Universidad Javeriana, Cali, Colombia
- Yasser EL-Manzalawy
- Department of Computer Science, Iowa State University, Ames, Iowa
- Department of Systems and Computer Engineering, Al-Azhar University, Cairo, Egypt
- Drena Dobbs
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, Iowa
- Vasant Honavar
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Department of Computer Science, Iowa State University, Ames, Iowa
|
46
|
de Moraes FR, Neshich IAP, Mazoni I, Yano IH, Pereira JGC, Salim JA, Jardine JG, Neshich G. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages. PLoS One 2014; 9:e87107. [PMID: 24489849 PMCID: PMC3904977 DOI: 10.1371/journal.pone.0087107] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 08/16/2012] [Accepted: 12/22/2013] [Indexed: 11/18/2022]
Abstract
Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through the STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html).
Affiliation(s)
- Fábio R. de Moraes
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Izabella A. P. Neshich
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Ivan Mazoni
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Inácio H. Yano
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- José G. C. Pereira
- Biology Institute, University of Campinas, Campinas, São Paulo, Brazil
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- José A. Salim
- School of Electrical and Computer Engineering, University of Campinas, Campinas, São Paulo, Brazil
- José G. Jardine
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
- Goran Neshich
- Brazilian Agricultural Research Corporation (EMBRAPA), National Center for Agricultural Informatics, Campinas, São Paulo, Brazil
|
47
|
Feiglin A, Ashkenazi S, Schlessinger A, Rost B, Ofran Y. Co-expression and co-localization of hub proteins and their partners are encoded in protein sequence. MOLECULAR BIOSYSTEMS 2014; 10:787-94. [PMID: 24457447 DOI: 10.1039/c3mb70411d] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/23/2023]
Abstract
Spatiotemporal coordination is a critical factor in biological processes. Some hubs in protein-protein interaction networks tend to be co-expressed and co-localized with their partners more strongly than others, a difference which is arguably related to functional differences between the hubs. Based on numerous analyses of yeast hubs, it has been suggested that differences in co-expression and co-localization are reflected in the structural and molecular characteristics of the hubs. We hypothesized that if indeed differences in co-expression and co-localization are encoded in the molecular characteristics of the protein, it may be possible to predict the tendency for co-expression and co-localization of human hubs based on features learned from systematically characterized yeast hubs. Thus, we trained a prediction algorithm on hubs from yeast that were classified as either strongly or weakly co-expressed and co-localized with their partners, and applied the trained model to 800 human hub proteins. We found that the algorithm significantly distinguishes between human hubs that are co-expressed and co-localized with their partners and hubs that are not. The prediction is based on sequence derived features such as "stickiness", i.e. the existence of multiple putative binding sites that enable multiple simultaneous interactions, "plasticity", i.e. the existence of predicted structural disorder which conjecturally allows for multiple consecutive interactions with the same binding site and predicted subcellular localization. These results suggest that spatiotemporal dynamics is encoded, at least in part, in the amino acid sequence of the protein and that this encoding is similar in yeast and in human.
Affiliation(s)
- Ariel Feiglin
- The Goodman faculty of life sciences, Bar Ilan University, Ramat Gan 52900, Israel.
|
48
|
Bhaskara RM, Padhi A, Srinivasan N. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling. Proteins 2013; 82:1219-34. [PMID: 24375512 DOI: 10.1002/prot.24486] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/19/2013] [Revised: 11/04/2013] [Accepted: 11/19/2013] [Indexed: 01/08/2023]
Abstract
With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and the knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train a naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (∼85%) and specific (∼95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions.
|
49
|
Minhas FUAA, Geiss BJ, Ben-Hur A. PAIRpred: partner-specific prediction of interacting residues from sequence and structure. Proteins 2013; 82:1142-55. [PMID: 24243399 DOI: 10.1002/prot.24479] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.9] [Received: 08/05/2013] [Revised: 11/04/2013] [Accepted: 11/09/2013] [Indexed: 11/10/2022]
Abstract
We present a novel partner-specific protein-protein interaction site prediction method called PAIRpred. Unlike most existing machine learning binding site prediction methods, PAIRpred uses information from both proteins in a protein complex to predict pairs of interacting residues from the two proteins. PAIRpred captures sequence and structure information about residue pairs through pairwise kernels that are used for training a support vector machine classifier. As a result, PAIRpred presents a more detailed model of protein binding and offers state-of-the-art accuracy in predicting binding sites at the protein level as well as inter-protein residue contacts at the complex level. We demonstrate PAIRpred's performance on Docking Benchmark 4.0 and recent CAPRI targets. We present a detailed performance analysis outlining the contribution of different sequence and structure features, together with a comparison to a variety of existing interface prediction techniques. We have also studied the impact of binding-associated conformational change on prediction accuracy and found PAIRpred to be more robust to such structural changes than existing schemes. As an illustration of the potential applications of PAIRpred, we provide a case study in which PAIRpred is used to analyze the nature and specificity of the interface in the interaction of human ISG15 protein with NS1 protein from influenza A virus. Python code for PAIRpred is available at http://combi.cs.colostate.edu/supplements/pairpred/.
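As a rough illustration of the pairwise-kernel idea mentioned in this abstract, the sketch below builds a kernel over residue pairs from a base kernel over single residues. The symmetric tensor-product form, the RBF base kernel, the toy feature vectors, and all function names are assumptions made for illustration; the abstract does not state which pairwise kernel or residue features PAIRpred actually uses.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Base kernel between two residue feature vectors (an assumed RBF kernel)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def pairwise_kernel(pair1, pair2, base=rbf_kernel):
    """Symmetric tensor-product kernel between residue pairs (a, b) and (c, d).

    K((a, b), (c, d)) = k(a, c) * k(b, d) + k(a, d) * k(b, c)
    The second term makes the value independent of the order within a pair.
    """
    a, b = pair1
    c, d = pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)

# Hypothetical two-dimensional residue features; the values are made up.
r1, r2 = [0.1, 0.9], [0.8, 0.2]
r3, r4 = [0.2, 0.7], [0.9, 0.1]

k = pairwise_kernel((r1, r2), (r3, r4))
k_swapped = pairwise_kernel((r2, r1), (r3, r4))  # same value as k by symmetry
```

A Gram matrix filled with such values over all training pairs could, in principle, be handed to any support vector machine implementation that accepts a precomputed kernel.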
|
50
|
Piatek MJ, Schramm MC, Burra DD, binShbreen A, Jankovic BR, Chowdhary R, Archer JA, Bajic VB. Simplified method for predicting a functional class of proteins in transcription factor complexes. PLoS One 2013; 8:e68857. [PMID: 23874789 PMCID: PMC3709904 DOI: 10.1371/journal.pone.0068857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/07/2012] [Accepted: 06/05/2013] [Indexed: 12/24/2022] Open
Abstract
Background Initiation of transcription is essential for most of the cellular responses to environmental conditions and for cell and tissue specificity. This process is regulated through numerous proteins, their ligands and mutual interactions, as well as interactions with DNA. The key such regulatory proteins are transcription factors (TFs) and transcription co-factors (TcoFs). TcoFs are important since they modulate the transcription initiation process through interaction with TFs. In eukaryotes, transcription requires that TFs form different protein complexes with various nuclear proteins. To better understand transcription regulation, it is important to know the functional class of proteins interacting with TFs during transcription initiation. Such information is not fully available, since not all proteins that act as TFs or TcoFs are yet annotated as such, due to generally partial functional annotation of proteins. In this study we have developed a method to predict, using only sequence composition of the interacting proteins, the functional class of human TF binding partners to be (i) TF, (ii) TcoF, or (iii) other nuclear protein. This allows for complementing the annotation of the currently known pool of nuclear proteins. Since only the knowledge of protein sequences is required in addition to protein interaction, the method should be easily applicable to many species.
Results Based on experimentally validated interactions between human TFs with different TFs, TcoFs and other nuclear proteins, our two classification systems (implemented as a web-based application) achieve high accuracies in distinguishing TFs and TcoFs from other nuclear proteins, and TFs from TcoFs respectively.
Conclusion As demonstrated, given the fact that two proteins are capable of forming direct physical interactions and using only information about their sequence composition, we have developed a completely new method for predicting a functional class of TF interacting protein partners with high precision and accuracy.
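The abstract says only that the sequence composition of the interacting proteins is used as input. One common reading of that phrase, shown here as a minimal sketch, is the 20-dimensional amino-acid frequency vector, concatenated for the TF and its partner. The feature definition, the function names, and the example sequences are assumptions for illustration, not the authors' published feature set.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues: return an all-zero vector instead of dividing by zero.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

def pair_features(tf_seq, partner_seq):
    """Concatenate the composition vectors of a TF and its interacting partner."""
    return composition(tf_seq) + composition(partner_seq)

# Hypothetical short sequences, used only to show the shape of the output.
features = pair_features("MKTAYIAKQR", "GSHMLEDPVA")
```

The resulting 40-element vector could then be passed to any standard classifier to separate the three partner classes named in the abstract.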
Affiliation(s)
- Marek J. Piatek
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Michael C. Schramm
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Dharani D. Burra
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Abdulaziz binShbreen
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Boris R. Jankovic
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Rajesh Chowdhary
- Biomedical Informatics Research Center, MCRF, Marshfield Clinic, Marshfield, Wisconsin, United States of America
- John A.C. Archer
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Vladimir B. Bajic
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
|