1. Using diverse potentials and scoring functions for the development of improved machine-learned models for protein-ligand affinity and docking pose prediction. J Comput Aided Mol Des 2021; 35:1095-1123. [PMID: 34708263 DOI: 10.1007/s10822-021-00423-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/03/2021] [Accepted: 10/11/2021] [Indexed: 10/20/2022]
Abstract
The advent of computational drug discovery holds the promise of significantly reducing the effort of experimentalists, along with monetary cost. More generally, predicting the binding of small organic molecules to biological macromolecules has far-reaching implications for a range of problems, including metabolomics. However, problems such as predicting the bound structure of a protein-ligand complex along with its affinity have proven to be an enormous challenge. In recent years, machine learning-based methods have proven to be more accurate than older methods, many based on simple linear regression. Nonetheless, there remains room for improvement, as these methods are often trained on a small set of features, with a single functional form for any given physical effect, and often with little mention of the rationale behind choosing one functional form over another. Moreover, it is not entirely clear why one machine learning method is favored over another. In this work, we endeavor to undertake a comprehensive effort towards developing high-accuracy, machine-learned scoring functions, systematically investigating the effects of machine learning method and choice of features, and, when possible, providing insights into the relevant physics using methods that assess feature importance. Here, we show synergism among disparate features, yielding adjusted R² with experimental binding affinities of up to 0.871 on an independent test set and enrichment for native bound structures of up to 0.913. When purely physical terms that model enthalpic and entropic effects are used in the training, we use feature importance assessments to probe the relevant physics and hopefully guide future investigators working on this and other computational chemistry problems.
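The abstract above reports an adjusted R² without stating the formula. As a minimal sketch, assuming the standard definition (R² penalized by the number of fitted features), the quantity can be computed as follows. The function names and any numbers passed to them are illustrative and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Standard adjusted R^2: penalizes R^2 for the number of features.

    Assumes an ordinary regression setting with an intercept, so the
    residual degrees of freedom are n_samples - n_features - 1.
    """
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

With many features and few test complexes the adjusted value falls noticeably below the raw R², which is why the two should not be compared directly across studies.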

2. Milanetti E, Miotto M, Di Rienzo L, Nagaraj M, Monti M, Golbek TW, Gosti G, Roeters SJ, Weidner T, Otzen DE, Ruocco G. In-Silico Evidence for a Two Receptor Based Strategy of SARS-CoV-2. Front Mol Biosci 2021; 8:690655. [PMID: 34179095 PMCID: PMC8219949 DOI: 10.3389/fmolb.2021.690655] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 04/03/2021] [Accepted: 05/19/2021] [Indexed: 01/04/2023]
Abstract
We propose a computational investigation on the interaction mechanisms between SARS-CoV-2 spike protein and possible human cell receptors. In particular, we make use of our newly developed numerical method able to determine efficiently and effectively the relationship of complementarity between portions of protein surfaces. This innovative and general procedure, based on the representation of the molecular isoelectronic density surface in terms of 2D Zernike polynomials, allows the rapid and quantitative assessment of the geometrical shape complementarity between interacting proteins, which was unfeasible with previous methods. Our results indicate that SARS-CoV-2 uses a dual strategy: in addition to the known interaction with angiotensin-converting enzyme 2, the viral spike protein can also interact with sialic-acid receptors of the cells in the upper airways.
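The abstract above relies on 2D Zernike polynomials as the basis for describing surface patches. The sketch below evaluates those basis functions only, assuming the standard radial formula and a real-valued, unnormalized convention. It is not the authors' complementarity pipeline, and the function names are illustrative.

```python
from math import cos, factorial, sin


def zernike_radial(n, m, rho):
    """Radial Zernike polynomial R_n^|m|(rho) for 0 <= rho <= 1.

    Returns 0.0 when n - |m| is odd or |m| > n, where the polynomial
    is defined to vanish.
    """
    m = abs(m)
    if m > n or (n - m) % 2:
        return 0.0
    total = 0.0
    for k in range((n - m) // 2 + 1):
        num = (-1) ** k * factorial(n - k)
        den = (factorial(k)
               * factorial((n + m) // 2 - k)
               * factorial((n - m) // 2 - k))
        total += num / den * rho ** (n - 2 * k)
    return total


def zernike(n, m, rho, phi):
    """Real 2D Zernike function on the unit disk (unnormalized).

    Uses cos(m*phi) for m >= 0 and sin(|m|*phi) for m < 0.
    """
    radial = zernike_radial(n, m, rho)
    return radial * (cos(m * phi) if m >= 0 else sin(-m * phi))
```

A surface patch projected onto the unit disk can then be expanded in these functions, and two patches compared through their expansion coefficients. The specific descriptor and distance used in the paper are not given in the abstract.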
Affiliation(s)
- Edoardo Milanetti
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
- Mattia Miotto
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
- Lorenzo Di Rienzo
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
- Madhu Nagaraj
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Aarhus, Denmark
- Michele Monti
- Centre for Genomic Regulation (CRG), the Barcelona Institute for Science and Technology, Barcelona, Spain
- RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa, Italy
- Giorgio Gosti
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
- Tobias Weidner
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- Daniel E. Otzen
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Aarhus, Denmark
- Giancarlo Ruocco
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy

3. Prediction of peptide binding to MHC using machine learning with sequence and structure-based feature sets. Biochim Biophys Acta Gen Subj 2020; 1864:129535. [DOI: 10.1016/j.bbagen.2020.129535] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/29/2019] [Revised: 01/09/2020] [Accepted: 01/14/2020] [Indexed: 11/18/2022]

4. Eren E, Watts NR, Dearborn AD, Palmer IW, Kaufman JD, Steven AC, Wingfield PT. Structures of Hepatitis B Virus Core- and e-Antigen Immune Complexes Suggest Multi-point Inhibition. Structure 2018; 26:1314-1326.e4. [PMID: 30100358 DOI: 10.1016/j.str.2018.06.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/07/2018] [Revised: 06/13/2018] [Accepted: 06/29/2018] [Indexed: 12/22/2022]
Abstract
Hepatitis B virus (HBV) is the leading cause of liver disease worldwide. While an adequate vaccine is available, current treatment options are limited, not highly effective, and associated with adverse effects, encouraging the development of alternative therapeutics. The HBV core gene encodes two different proteins: core, which forms the viral nucleocapsid, and pre-core, which serves as an immune modulator with multiple points of action. The two proteins mostly have the same sequence, although they differ at their N and C termini and in their dimeric arrangements. Previously, we engineered two human-framework antibody fragments (Fab/scFv) with nano- to picomolar affinities for both proteins. Here, by means of X-ray crystallography, analytical ultracentrifugation, and electron microscopy, we demonstrate that the antibodies have non-overlapping epitopes and effectively block biologically important assemblies of both proteins. These properties, together with the anticipated high tolerability and long half-lives of the antibodies, make them promising therapeutics.
Affiliation(s)
- Elif Eren
- Laboratory of Structural Biology Research, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Norman R Watts
- Protein Expression Laboratory, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Altaira D Dearborn
- Protein Expression Laboratory, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Ira W Palmer
- Protein Expression Laboratory, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Joshua D Kaufman
- Protein Expression Laboratory, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Alasdair C Steven
- Laboratory of Structural Biology Research, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA
- Paul T Wingfield
- Protein Expression Laboratory, NIAMS, National Institutes of Health, Bethesda, MD 20892, USA

5. Simões T, Lopes D, Dias S, Fernandes F, Pereira J, Jorge J, Bajaj C, Gomes A. Geometric Detection Algorithms for Cavities on Protein Surfaces in Molecular Graphics: A Survey. Computer Graphics Forum: Journal of the European Association for Computer Graphics 2017; 36:643-683. [PMID: 29520122 PMCID: PMC5839519 DOI: 10.1111/cgf.13158] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 05/17/2023]
Abstract
Detecting and analyzing protein cavities provides significant information about active sites for biological processes (e.g., protein-protein or protein-ligand binding) in molecular graphics and modeling. Using the three-dimensional structure of a given protein (i.e., atom types and their locations in 3D) as retrieved from a PDB (Protein Data Bank) file, it is now computationally viable to determine a description of these cavities. Such cavities correspond to pockets, clefts, invaginations, voids, tunnels, channels, and grooves on the surface of a given protein. In this work, we survey the literature on protein cavity computation and classify algorithmic approaches into three categories: evolution-based, energy-based, and geometry-based. Our survey focuses on geometric algorithms, whose taxonomy is extended to include not only sphere-, grid-, and tessellation-based methods, but also surface-based, hybrid geometric, consensus, and time-varying methods. Finally, we detail those techniques that have been customized for GPU (Graphics Processing Unit) computing.
Affiliation(s)
- Tiago Simões
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal
- Sérgio Dias
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal
- João Pereira
- INESC-ID Lisboa, Portugal
- Instituto Superior Técnico, Universidade de Lisboa, Portugal
- Joaquim Jorge
- INESC-ID Lisboa, Portugal
- Instituto Superior Técnico, Universidade de Lisboa, Portugal
- Abel Gomes
- Instituto de Telecomunicações, Portugal
- Universidade da Beira Interior, Portugal

6. Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022]
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.

7. Computing Discrete Fine-Grained Representations of Protein Surfaces. Computational Intelligence Methods for Bioinformatics and Biostatistics 2016. [DOI: 10.1007/978-3-319-44332-4_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]

8. Daily MD, Chun J, Heredia-Langner A, Wei G, Baker NA. Origin of parameter degeneracy and molecular shape relationships in geometric-flow calculations of solvation free energies. J Chem Phys 2014; 139:204108. [PMID: 24289345 DOI: 10.1063/1.4832900] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Abstract
Implicit solvent models are important tools for calculating solvation free energies for chemical and biophysical studies since they require fewer computational resources but can achieve accuracy comparable to that of explicit-solvent models. In past papers, geometric flow-based solvation models have been established for solvation analysis of small and large compounds. In the present work, the use of realistic experiment-based parameter choices for the geometric flow models is studied. We find that the experimental parameters of solvent internal pressure p = 172 MPa and surface tension γ = 72 mN/m produce solvation free energies within 1 RT of the global minimum root-mean-squared deviation from experimental data over the expanded set. Our results demonstrate that experimental values can be used for geometric flow solvent model parameters, thus eliminating the need for additional parameterization. We also examine the correlations between optimal values of p and γ which are strongly anti-correlated. Geometric analysis of the small molecule test set shows that these results are inter-connected with an approximately linear relationship between area and volume in the range of molecular sizes spanned by the data set. In spite of this considerable degeneracy between the surface tension and pressure terms in the model, both terms are important for the broader applicability of the model.
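The degeneracy described in the abstract above can be illustrated with a toy calculation. This sketch assumes the nonpolar energy is reduced to a surface term plus a volume term, γ·A + p·V, and that volume is exactly linear in area, V = a·A + b. The coefficients `a` and `b`, the areas, and the parameter values are hypothetical, in arbitrary consistent units, and the real geometric-flow functional contains further terms.

```python
def nonpolar_energy(area, volume, gamma, pressure):
    """Simplified nonpolar term: surface tension * area + pressure * volume."""
    return gamma * area + pressure * volume


def energy_offsets(areas, volumes, params_1, params_2):
    """Per-molecule energy difference between two (gamma, pressure) choices."""
    g1, p1 = params_1
    g2, p2 = params_2
    return [nonpolar_energy(A, V, g1, p1) - nonpolar_energy(A, V, g2, p2)
            for A, V in zip(areas, volumes)]


# Hypothetical linear area-volume relation over a small-molecule set.
a, b = 0.8, -5.0
areas = [100.0, 150.0, 200.0, 250.0]
volumes = [a * A + b for A in areas]

# Raise the pressure and lower gamma along the line gamma + a * p = const.
params_1 = (1.0, 0.5)
params_2 = (1.0 + a * (0.5 - 1.0), 1.0)

offsets = energy_offsets(areas, volumes, params_1, params_2)
```

Under these assumptions every entry of `offsets` equals the same constant, (p1 - p2)·b, so the two parameter sets rank the molecules identically. This is one way to see why the optimal p and γ are anti-correlated when area and volume are nearly proportional.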
Affiliation(s)
- Michael D Daily
- Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, USA

9. Zhu X, Ericksen SS, Demerdash ONA, Mitchell JC. Data-driven models for protein interaction and design. Proteins 2013; 81:2221-8. [PMID: 24038640 DOI: 10.1002/prot.24405] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/11/2013] [Revised: 08/12/2013] [Accepted: 08/21/2013] [Indexed: 12/13/2022]
Abstract
We describe methods and results for four new types of challenge in the Critical Assessment of PRedicted Interactions (CAPRI). Two new challenges asked predictors to create models related to protein interface design. The first of these was to distinguish binding interfaces from designed nonbinding interfaces. The second was to predict the effects of all single-point mutations on hemagglutinin binding to two small designed proteins. Two additional challenges asked predictors to submit high-resolution structures for interface-bound crystallographic waters and for binding heparin to a putative glycosylase.

10. Demerdash ONA, Mitchell JC. Using physical potentials and learned models to distinguish native binding interfaces from de novo designed interfaces that do not bind. Proteins 2013; 81:1919-30. [DOI: 10.1002/prot.24337] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/21/2012] [Revised: 04/04/2013] [Accepted: 05/23/2013] [Indexed: 11/05/2022]
Affiliation(s)
- Omar N. A. Demerdash
- Medical Scientist Training Program; University of Wisconsin-Madison; Madison Wisconsin
- Biophysics Program; University of Wisconsin-Madison; Madison Wisconsin
- Julie C. Mitchell
- Department of Biochemistry; University of Wisconsin-Madison; Madison Wisconsin
- Department of Mathematics; University of Wisconsin-Madison; Madison Wisconsin

11.
Abstract
In this study, we present the DNA-Binding Site Identifier (DBSI), a new structure-based method for predicting protein interaction sites for DNA binding. DBSI was trained and validated on a data set of 263 proteins (TRAIN-263), tested on an independent set of protein-DNA complexes (TEST-206) and data sets of 29 unbound (APO-29) and 30 bound (HOLO-30) protein structures distinct from the training data. We computed 480 candidate features for identifying protein residues that bind DNA, including new features that capture the electrostatic microenvironment within shells near the protein surface. Our iterative feature selection process identified features important in other models, as well as features unique to the DBSI model, such as a banded electrostatic feature with spatial separation comparable with the canonical width of the DNA minor groove. Validations and comparisons with established methods using a range of performance metrics clearly demonstrate the predictive advantage of DBSI, and its comparable performance on unbound (APO-29) and bound (HOLO-30) conformations demonstrates robustness to binding-induced protein conformational changes. Finally, we offer our feature data table to others for integration into their own models or for testing improved feature selection and model training strategies based on DBSI.
Affiliation(s)
- Xiaolei Zhu
- BACTER Institute, University of Wisconsin-Madison, Madison, WI, USA, Departments of Mathematics and Biochemistry, University of Wisconsin-Madison, Madison, WI, USA

12. Park JK, Jernigan R, Wu Z. Coarse grained normal mode analysis vs. refined Gaussian Network Model for protein residue-level structural fluctuations. Bull Math Biol 2013; 75:124-60. [PMID: 23296997 DOI: 10.1007/s11538-012-9797-y] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 05/09/2012] [Accepted: 11/08/2012] [Indexed: 11/26/2022]
Abstract
We investigate several approaches to coarse-grained normal mode analysis of protein residue-level structural fluctuations by choosing different ways of representing the residues and the forces among them. Single-atom representations using the backbone atoms Cα, C, N, and Cβ are considered. Combinations of some of these atoms are also tested. The force constants between the representative atoms are extracted from the Hessian matrix of the energy function and serve as the force constants between the corresponding residues. The residue mean-square fluctuations and their correlations with the experimental B-factors are calculated for a large set of proteins. The results are compared with all-atom normal mode analysis and the residue-level Gaussian Network Model. The coarse-grained methods perform more efficiently than all-atom normal mode analysis, while their B-factor correlations are also higher. Their B-factor correlations are comparable with those estimated by the Gaussian Network Model and in many cases better. The extracted force constants are surveyed for different pairs of residues with different numbers of separating residues in sequence. The statistical averages are used to build a refined Gaussian Network Model, which is able to predict residue-level structural fluctuations significantly better than the conventional Gaussian Network Model in many test cases.
Affiliation(s)
- Jun-Koo Park
- Department of Mathematics, Iowa State University, Ames, IA 50010, USA.

13. Xu B, Wei X, Deng L, Guan J, Zhou S. A semi-supervised boosting SVM for predicting hot spots at protein-protein interfaces. BMC Systems Biology 2012; 6 Suppl 2:S6. [PMID: 23282146 PMCID: PMC3521187 DOI: 10.1186/1752-0509-6-s2-s6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 01/18/2023]
Abstract
BACKGROUND: Hot spots are residues that contribute most of the binding free energy yet account for only a small portion of a protein interface. Experimental approaches to identifying hot spots, such as alanine scanning mutagenesis, are expensive and time-consuming, while computational methods are emerging as effective alternatives. RESULTS: In this study, we propose a semi-supervised boosting SVM, called sbSVM, to computationally predict hot spots at protein-protein interfaces by combining protein sequence and structure features. Feature selection is performed using random forests to avoid over-fitting. Because positive samples are scarce, our approach iteratively samples useful unlabeled data to boost the performance of hot spot prediction. The method is evaluated on a dataset generated from the ASEdb database for cross-validation and on a dataset from the BID database for independent testing. Furthermore, a balanced dataset with similar numbers of hot spots and non-hot spots (65 and 66, respectively), derived from the first training dataset, is used to further validate our method. All results show that our method yields good sensitivity, accuracy, and F1 score compared with existing methods. CONCLUSION: Our method boosts hot spot prediction performance by using unlabeled data to overcome the shortage of available training data. Experimental results show that our approach is more effective than traditional supervised algorithms and the major existing hot spot prediction methods.
Affiliation(s)
- Bin Xu
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China

14. Morrow JK, Zhang S. Computational prediction of protein hot spot residues. Curr Pharm Des 2012; 18:1255-65. [PMID: 22316154 DOI: 10.2174/138161212799436412] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Received: 10/27/2011] [Accepted: 12/06/2011] [Indexed: 11/22/2022]
Abstract
Most biological processes involve multiple proteins interacting with each other. It has been recently discovered that certain residues in these protein-protein interactions, which are called hot spots, contribute more significantly to binding affinity than others. Hot spot residues have unique and diverse energetic properties that make them challenging yet important targets in the modulation of protein-protein complexes. Design of therapeutic agents that interact with hot spot residues has proven to be a valid methodology in disrupting unwanted protein-protein interactions. Using biological methods to determine which residues are hot spots can be costly and time consuming. Recent advances in computational approaches to predict hot spots have incorporated a myriad of features, and have shown increasing predictive successes. Here we review the state of knowledge around protein-protein interactions, hot spots, and give an overview of multiple in silico prediction techniques of hot spot residues.
Affiliation(s)
- John Kenneth Morrow
- Department of Experimental Therapeutics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77054, USA

15. Demerdash ONA, Mitchell JC. Density-cluster NMA: A new protein decomposition technique for coarse-grained normal mode analysis. Proteins 2012; 80:1766-79. [PMID: 22434479 DOI: 10.1002/prot.24072] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 12/01/2011] [Revised: 02/13/2012] [Accepted: 03/12/2012] [Indexed: 11/10/2022]
Abstract
Normal mode analysis has emerged as a useful technique for investigating protein motions on long time scales. This is largely due to the advent of coarse-graining techniques, particularly Hooke's Law-based potentials and the rotational-translational blocking (RTB) method for reducing the size of the force-constant matrix, the Hessian. Here we present a new method for domain decomposition for use in RTB that is based on hierarchical clustering of atomic density gradients, which we call Density-Cluster RTB (DCRTB). The method reduces the number of degrees of freedom by 85-90% compared with the standard blocking approaches. We compared the normal modes from DCRTB against standard RTB using 1-4 residues in sequence in a single block, with good agreement between the two methods. We also show that Density-Cluster RTB and standard RTB perform well in capturing the experimentally determined direction of conformational change. Significantly, we report superior correlation of DCRTB with B-factors compared with 1-4 residue per block RTB. Finally, we show significant reduction in computational cost for Density-Cluster RTB that is nearly 100-fold for many examples.
Affiliation(s)
- Omar N A Demerdash
- Medical Scientist Training Program, University of Wisconsin-Madison, Madison, Wisconsin, USA

16. Prakash A, Luthra PM. Insilico study of the A2AR–D2R kinetics and interfacial contact surface for heteromerization. Amino Acids 2012; 43:1451-64. [DOI: 10.1007/s00726-012-1218-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 08/12/2011] [Accepted: 01/04/2012] [Indexed: 12/28/2022]

17. Wang YT, Lee WJ. Binding hot-spots in an antibody–ssDNA interface: a molecular dynamics study. Molecular BioSystems 2012; 8:3274-80. [DOI: 10.1039/c2mb25250c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]

18. Crystal structure of QscR, a Pseudomonas aeruginosa quorum sensing signal receptor. Proc Natl Acad Sci U S A 2011; 108:15763-8. [PMID: 21911405 DOI: 10.1073/pnas.1112398108] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Indexed: 11/18/2022]
Abstract
Acyl-homoserine lactone (AHL) quorum sensing controls gene expression in hundreds of Proteobacteria including a number of plant and animal pathogens. Generally, the AHL receptors are members of a family of related transcription factors, and although they have been targets for development of antivirulence therapeutics there is very little structural information about this class of bacterial receptors. We have determined the structure of the transcription factor, QscR, bound to N-3-oxo-dodecanoyl-homoserine lactone from the opportunistic human pathogen Pseudomonas aeruginosa at a resolution of 2.55 Å. The ligand-bound QscR is a dimer with a unique symmetric "cross-subunit" arrangement containing multiple dimerization interfaces involving both domains of each subunit. The QscR dimer appears poised to bind DNA. Predictions about signal binding and dimerization contacts were supported by studies of mutant QscR proteins in vivo. The acyl chain of the AHL is in close proximity to the dimerization interfaces. Our data are consistent with an allosteric mechanism of signal transmission in the regulation of DNA binding and thus virulence gene expression.

19. Zhu X, Mitchell JC. KFC2: a knowledge-based hot spot prediction method based on interface solvation, atomic density, and plasticity features. Proteins 2011; 79:2671-83. [PMID: 21735484 DOI: 10.1002/prot.23094] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Received: 12/06/2010] [Revised: 04/03/2011] [Accepted: 04/27/2011] [Indexed: 11/09/2022]
Abstract
Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods.
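The True Positive Rate and False Positive Rate quoted in the abstract above follow from a confusion matrix over residues. The sketch below uses the standard definitions; the function name is illustrative, and any labels passed in would be made-up rather than data from the KFC2 study.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels, where 1 marks a hot spot.

    TPR = TP / (TP + FN): fraction of true hot spots that are recovered.
    FPR = FP / (FP + TN): fraction of non-hot spots flagged as hot spots.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```

Reporting both rates matters here because, as the abstract notes for KFC2a, a high TPR can come at the cost of a higher FPR.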
Affiliation(s)
- Xiaolei Zhu
- BACTER Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA

20. Bai H, Yang K, Yu D, Zhang C, Chen F, Lai L. Predicting kinetic constants of protein-protein interactions based on structural properties. Proteins 2010; 79:720-34. [PMID: 21287608 DOI: 10.1002/prot.22904] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 05/21/2010] [Revised: 07/24/2010] [Accepted: 08/23/2010] [Indexed: 02/01/2023]
Abstract
Elucidating kinetic processes of protein-protein interactions (PPI) helps to understand how basic building blocks affect overall behavior of living systems. In this study, we used structure-based properties to build predictive models for kinetic constants of PPI. A highly diverse PPI dataset, protein-protein kinetic interaction data and structures (PPKIDS), was built. PPKIDS contains 62 PPI with complex structures and kinetic constants measured experimentally. The influence of structural properties on kinetics of PPI was studied using 35 structure-based features, describing different aspects of complex structures. Linear models for the prediction of kinetic constants were built by fitting with selected subsets of structure-based features. The models gave correlation coefficients of 0.801, 0.732, and 0.770 for k(off), k(on), and K(d), respectively, in leave-one-out cross validations. The predictive models reported here use only protein complex structures as input and can be generally applied in PPI studies as well as systems biology modeling. Our study confirmed that different properties play different roles in the kinetic process of PPI. For example, k(on) was affected by overall structural features of complexes, such as the composition of secondary structures, the change of translational and rotational entropy, and the electrostatic interaction; while k(off) was determined by interfacial properties, such as number of contacted atom pairs per 100 Ų. This information provides useful hints for PPI design.
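The leave-one-out cross-validation mentioned in the abstract above can be sketched as follows. For brevity this assumes a single structural feature and an ordinary least-squares line, whereas the paper fits linear models on selected subsets of 35 features. All function names are illustrative, and no data from PPKIDS is reproduced.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_predictions(xs, ys):
    """Predict each point from a model trained on all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)
```

The cross-validated correlation is then `pearson(loo_predictions(xs, ys), ys)`. With only 62 complexes, leave-one-out is a natural choice because it uses nearly all of the data for each fit.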
Affiliation(s)
- Hongjun Bai
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Structural Chemistry for Stable and Unstable Species, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China

21.
Abstract
Identification of epitopes that invoke strong responses from B-cells is one of the key steps in designing effective vaccines against pathogens. Because experimental determination of epitopes is expensive in terms of cost, time, and effort involved, there is an urgent need for computational methods for reliable identification of B-cell epitopes. Although several computational tools for predicting B-cell epitopes have become available in recent years, the predictive performance of existing tools remains far from ideal. We review recent advances in computational methods for B-cell epitope prediction, identify some gaps in the current state of the art, and outline some promising directions for improving the reliability of such methods.
|
22
|
Churchill MEA, Klass J, Zoetewey DL. Structural analysis of HMGD-DNA complexes reveals influence of intercalation on sequence selectivity and DNA bending. J Mol Biol 2010; 403:88-102. [PMID: 20800069 DOI: 10.1016/j.jmb.2010.08.031] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2010] [Revised: 08/03/2010] [Accepted: 08/16/2010] [Indexed: 10/19/2022]
Abstract
The ubiquitous, eukaryotic, high-mobility group box (HMGB) chromosomal proteins promote many chromatin-mediated cellular activities through their non-sequence-specific binding and bending of DNA. Minor-groove DNA binding by the HMG box results in substantial DNA bending toward the major groove owing to electrostatic interactions, shape complementarity, and DNA intercalation that occurs at two sites. Here, the structures of the complexes formed with DNA by a partially DNA intercalation-deficient mutant of Drosophila melanogaster HMGD have been determined by X-ray crystallography at a resolution of 2.85 Å. The six proteins and 50 bp of DNA in the crystal structure revealed a variety of bound conformations. All of the proteins bound in the minor groove, bridging DNA molecules, presumably because these DNA regions are easily deformed. The loss of the primary site of DNA intercalation decreased overall DNA bending and shape complementarity. However, DNA bending at the secondary site of intercalation was retained and most protein-DNA contacts were preserved. The mode of binding resembles the HMGB1 box A-cisplatin-DNA complex, which also lacks a primary intercalating residue. This study provides new insights into the binding mechanisms used by HMG boxes to recognize varied DNA structures and sequences as well as modulate DNA structure and DNA bending.
Affiliation(s)
- Mair E A Churchill
- Department of Pharmacology, University of Colorado Denver School of Medicine, Aurora, CO 80045, USA; Molecular Biology Program, University of Colorado Denver School of Medicine, Aurora, CO 80045, USA.
- Janet Klass
- Department of Pharmacology, University of Colorado Denver School of Medicine, Aurora, CO 80045, USA
- David L Zoetewey
- Molecular Biology Program, University of Colorado Denver School of Medicine, Aurora, CO 80045, USA
|
23
|
New measures for estimating surface complementarity and packing at protein-protein interfaces. FEBS Lett 2010; 584:1163-8. [PMID: 20153323 DOI: 10.1016/j.febslet.2010.02.021] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2009] [Revised: 01/25/2010] [Accepted: 02/05/2010] [Indexed: 11/22/2022]
Abstract
A number of methods exist that use different approaches to assess geometric properties like the surface complementarity and atom packing at the protein-protein interface. We have developed two new and conceptually different measures using the Delaunay tessellation and interface slice selection to compute the surface complementarity and atom packing at the protein-protein interface in a straightforward manner. Our measures show a strong correlation among themselves and with other existing measures, and can be calculated in a highly time-efficient manner. The measures are discriminative for evaluating biological, as well as non-biological protein-protein contacts, especially from large protein complexes and large-scale structural studies (http://pallab.serc.iisc.ernet.in/nip_nsc).
|
24
|
Demerdash ONA, Daily MD, Mitchell JC. Structure-based predictive models for allosteric hot spots. PLoS Comput Biol 2009; 5:e1000531. [PMID: 19816556 PMCID: PMC2748687 DOI: 10.1371/journal.pcbi.1000531] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2009] [Accepted: 09/09/2009] [Indexed: 12/12/2022] Open
Abstract
In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used, one consisting of dynamical, structural, network, and informatic measures, and another of structural measures defined by Daily and Gray [1]. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68–81% of known hotspots, and among total hotspot predictions, 58–67% were actual hotspots. Hence, these models have precision P = 58–67% and recall R = 68–81%. The corresponding models for Feature Set 2 had P = 55–59% and R = 81–92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73–81% and P = 64–71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. Moreover, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues. 
Allostery is the process whereby a molecule binds to one site in a protein and alters the function of a distant site. This phenomenon is ubiquitous, as proteins frequently must adapt their behavior to changes in the cellular milieu. The mechanism(s) underlying allostery remains incompletely understood. In particular, predictive models are needed that distinguish amino-acid residues that are critical to allostery, or “hotspots”, from non-hotspots. Here we have used data-mining approaches to infer rules that distinguish hotspots from non-hotspots. Starting with a data set of known hotspot and non-hotspot residues from a diverse set of allosteric proteins, the training data set, we applied machine learning to this data to “learn” models, or sets of rules, for distinguishing hotspots and non-hotspots by inferring associations between the classification (hotspot or non-hotspot) and an associated set of calculated attributes. Many models that showed the highest predictive power on the training data also exhibited high accuracy and sensitivity when applied to an independent data set. Moreover, the pattern of predicted hotspots in the proteins we studied was consistent with known structure/function relationships and previous work suggesting that a network of essential residues mediates the allosteric transition.
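The precision (P) and recall (R) figures quoted in the abstract follow the standard definitions, which can be sketched as below. The counts in the usage example are hypothetical and chosen only for illustration; they are not the confusion-matrix counts from the study.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from hotspot prediction counts.

    tp: predicted hotspots that are actual hotspots
    fp: predicted hotspots that are actually non-hotspots
    fn: actual hotspots that the model failed to predict
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)  # fraction of predictions that are correct
    recall = tp / (tp + fn)     # fraction of known hotspots recovered
    return precision, recall

# Hypothetical counts for one model on an independent test set.
p, r = precision_recall(tp=12, fp=7, fn=4)
print(f"P = {p:.1%}, R = {r:.1%}")
```

Reporting both numbers matters here because a model can raise recall simply by predicting more residues as hotspots, at the cost of precision.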
Affiliation(s)
- Omar N. A. Demerdash
- Biophysics Program, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Medical Scientist Training Program, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Michael D. Daily
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Julie C. Mitchell
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Department of Mathematics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- * E-mail:
|
25
|
Thompson EE, Kornev AP, Kannan N, Kim C, Ten Eyck LF, Taylor SS. Comparative surface geometry of the protein kinase family. Protein Sci 2009; 18:2016-26. [PMID: 19610074 PMCID: PMC2786965 DOI: 10.1002/pro.209] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
Abstract
Identifying conserved pockets on the surfaces of a family of proteins can provide insight into conserved geometric features and sites of protein-protein interaction. Here we describe mapping and comparison of the surfaces of aligned crystallographic structures, using the protein kinase family as a model. Pockets are rapidly computed using two computer programs, FADE and Crevasse. FADE uses gradients of atomic density to locate grooves and pockets on the molecular surface. Crevasse, a new piece of software, splits the FADE output into distinct pockets. The computation was run on 10 kinase catalytic cores aligned on the αF-helix, and the resulting pockets spatially clustered. The active site cleft appears as a large, contiguous site that can be subdivided into nucleotide and substrate docking sites. Substrate specificity determinants in the active site cleft between serine/threonine and tyrosine kinases are visible and distinct. The active site clefts cluster tightly, showing a conserved spatial relationship between the active site and αF-helix in the C-lobe. When the αC-helix is examined, there are multiple mechanisms for anchoring the helix using spatially conserved docking sites. A novel site at the top of the N-lobe is present in all the kinases, and there is a large conserved pocket over the hinge and the αC-β4 loop. Other pockets on the kinase core are strongly conserved but have not yet been mapped to a protein-protein interaction. Sites identified by this algorithm have revealed structural and spatially conserved features of the kinase family and potential conserved intermolecular and intramolecular binding sites.
Affiliation(s)
- Elaine E Thompson
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093
- Alexandr P Kornev
- Department of Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Natarajan Kannan
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602-7229; Institute of Bioinformatics, University of Georgia, Athens, GA 30602-7229
- Choel Kim
- Department of Pharmacology, Baylor College of Medicine, Houston, TX 77030
- Lynn F Ten Eyck
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093; San Diego Supercomputer Center, University of California at San Diego, La Jolla, CA 92093. *Correspondence to: Lynn F. Ten Eyck, San Diego Supercomputer Center, University of California at San Diego, La Jolla, CA 92093. E-mail:
- Susan S Taylor
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093; Department of Pharmacology, Baylor College of Medicine, Houston, TX 77030; Department of Pharmacology, University of California at San Diego, La Jolla, CA 92093
|
26
|
Potential for protein surface shape analysis using spherical harmonics and 3D Zernike descriptors. Cell Biochem Biophys 2009; 54:23-32. [PMID: 19521674 DOI: 10.1007/s12013-009-9051-x] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2009] [Accepted: 05/22/2009] [Indexed: 10/20/2022]
Abstract
With structure databases expanding at a rapid rate, the task at hand is to provide reliable clues to their molecular function and to be able to do so on a large scale. This, however, requires suitable encodings of the molecular structure which are amenable to fast screening. To this end, moment-based representations provide a compact and nonredundant description of molecular shape and other associated properties. In this article, we present an overview of some commonly used representations with specific focus on two schemes namely spherical harmonics and their extension, the 3D Zernike descriptors. Key features and differences of the two are reviewed and selected applications are highlighted. We further discuss recent advances covering aspects of shape and property-based comparison at both global and local levels and demonstrate their applicability through some of our studies.
|
27
|
Moreira IS, Fernandes PA, Ramos MJ. Protein-protein docking dealing with the unknown. J Comput Chem 2009; 31:317-42. [DOI: 10.1002/jcc.21276] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
|
28
|
Darnell SJ, LeGault L, Mitchell JC. KFC Server: interactive forecasting of protein interaction hot spots. Nucleic Acids Res 2008; 36:W265-9. [PMID: 18539611 PMCID: PMC2447760 DOI: 10.1093/nar/gkn346] [Citation(s) in RCA: 121] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model, a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user-submitted protein-protein or protein-DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots, and predicts whether the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org.
Affiliation(s)
- Steven J Darnell
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
|
29
|
Visual Analysis of Biomolecular Surfaces. 2008. [DOI: 10.1007/978-3-540-72630-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]
|
30
|
Acierno JP, Braden BC, Klinke S, Goldbaum FA, Cauerhff A. Affinity Maturation Increases the Stability and Plasticity of the Fv Domain of Anti-protein Antibodies. J Mol Biol 2007; 374:130-46. [DOI: 10.1016/j.jmb.2007.09.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2007] [Revised: 08/13/2007] [Accepted: 09/05/2007] [Indexed: 11/26/2022]
|
31
|
Makrodimitris K, Masica DL, Kim ET, Gray JJ. Structure Prediction of Protein−Solid Surface Interactions Reveals a Molecular Recognition Motif of Statherin for Hydroxyapatite. J Am Chem Soc 2007; 129:13713-22. [DOI: 10.1021/ja074602v] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Kosta Makrodimitris
- Contribution from the Department of Chemical and Biomolecular Engineering, Program in Molecular and Computational Biophysics, Departments of Biomedical Engineering and Computer Science, and Institute for NanoBioTechnology, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218
- David L. Masica
- Contribution from the Department of Chemical and Biomolecular Engineering, Program in Molecular and Computational Biophysics, Departments of Biomedical Engineering and Computer Science, and Institute for NanoBioTechnology, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218
- Eric T. Kim
- Contribution from the Department of Chemical and Biomolecular Engineering, Program in Molecular and Computational Biophysics, Departments of Biomedical Engineering and Computer Science, and Institute for NanoBioTechnology, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218
- Jeffrey J. Gray
- Contribution from the Department of Chemical and Biomolecular Engineering, Program in Molecular and Computational Biophysics, Departments of Biomedical Engineering and Computer Science, and Institute for NanoBioTechnology, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218
|
32
|
Darnell SJ, Page D, Mitchell JC. An automated decision-tree approach to predicting protein interaction hot spots. Proteins 2007; 68:813-23. [PMID: 17554779 DOI: 10.1002/prot.21474] [Citation(s) in RCA: 160] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface.
Affiliation(s)
- Steven J Darnell
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
33
|
Abstract
Pancreatic ribonuclease A (EC 3.1.27.5, RNase) is, perhaps, the best-studied enzyme of the 20th century. It was isolated by René Dubos, crystallized by Moses Kunitz, sequenced by Stanford Moore and William Stein, and synthesized in the laboratory of Bruce Merrifield, all at the Rockefeller Institute/University. It has proven to be an excellent model system for many different types of experiments, both as an enzyme and as a well-characterized protein for biophysical studies. Of major significance was the demonstration by Chris Anfinsen at NIH that the primary sequence of RNase encoded the three-dimensional structure of the enzyme. Many other prominent protein chemists/enzymologists have utilized RNase as a dominant theme in their research. In this review, the history of RNase and its offspring, RNase S (S-protein/S-peptide), will be considered, especially the work in the Merrifield group, as a preface to preliminary data and proposed experiments addressing topics of current interest. These include entropy-enthalpy compensation, entropy of ligand binding, the impact of protein modification on thermal stability, and the role of protein dynamics in enzyme action. In continuing to use RNase as a prototypical enzyme, we stand on the shoulders of the giants of protein chemistry to survey the future.
Affiliation(s)
- Garland R Marshall
- Center for Computational Biology, Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110, USA.
|
34
|
Rapberger R, Lukas A, Mayer B. Identification of discontinuous antigenic determinants on proteins based on shape complementarities. J Mol Recognit 2007; 20:113-21. [PMID: 17421048 DOI: 10.1002/jmr.819] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
Diverse procedures for identifying antigenic determinants on proteins have been developed, including experimental as well as computational approaches. However, most of these techniques focus on continuous epitopes, whereas fast and reliable identification and verification of discontinuous epitopes remains barely amenable. In this paper, we describe a computational workflow for the detection of discontinuous epitopes on proteins. The workflow uses a given protein 3D structure as input, and combines a per residue solvent accessibility constraint with epitope to paratope shape complementarity measures and binding energies for assigning antigenic determinants in the conformational context. We have developed the procedure on a given set of 26 antigen-antibody complexes with a known structure, and have further expanded the available paratope shapes by generating a virtual paratope library in order to improve the screening for candidate residues constituting discontinuous epitopes. Applying the workflow on the 26 given antigens with known discontinuous epitopes resulted in the correct identification of the spatial proximity of 12 antigen-antibody interaction sites. Combining solvent accessibility, shape complementarity and binding energies towards the identification of discontinuous epitopes clearly outperforms approaches solely considering accessibility and residue distance constraints.
Affiliation(s)
- Ronald Rapberger
- Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, A-1090 Vienna, Austria
|
35
|
Staadt OG, Natarajan V, Weber GH, Wiley DF, Hamann B. Interactive processing and visualization of image data for biomedical and life science applications. BMC Cell Biol 2007; 8 Suppl 1:S10. [PMID: 17634091 PMCID: PMC1924506 DOI: 10.1186/1471-2121-8-s1-s10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
Abstract
Background: Applications in biomedical science and life science produce large data sets using increasingly powerful imaging devices and computer simulations. It is becoming increasingly difficult for scientists to explore and analyze these data using traditional tools. Interactive data processing and visualization tools can help scientists overcome these limitations. Results: We show that new data processing tools and visualization systems can be used successfully in biomedical and life science applications. We present an adaptive high-resolution display system suitable for biomedical image data, algorithms for analyzing and visualizing protein surfaces and retinal optical coherence tomography data, and visualization tools for 3D gene expression data. Conclusion: We demonstrated that interactive processing and visualization methods and systems can support scientists in a variety of biomedical and life science application areas concerned with massive data analysis.
Affiliation(s)
- Oliver G Staadt
- Institute for Data Analysis and Visualization and Department of Computer Science, University of California, Davis, CA, USA
- Vijay Natarajan
- Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
- Gunther H Weber
- Computational Research Division, Lawrence Berkeley National Laboratory, California, USA
- Bernd Hamann
- Institute for Data Analysis and Visualization and Department of Computer Science, University of California, Davis, CA, USA
|
36
|
Johnson RJ, McCoy JG, Bingman CA, Phillips GN, Raines RT. Inhibition of human pancreatic ribonuclease by the human ribonuclease inhibitor protein. J Mol Biol 2007; 368:434-49. [PMID: 17350650 PMCID: PMC1993901 DOI: 10.1016/j.jmb.2007.02.005] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2006] [Revised: 01/27/2007] [Accepted: 02/02/2007] [Indexed: 11/26/2022]
Abstract
The ribonuclease inhibitor protein (RI) binds to members of the bovine pancreatic ribonuclease (RNase A) superfamily with an affinity in the femtomolar range. Here, we report on structural and energetic aspects of the interaction between human RI (hRI) and human pancreatic ribonuclease (RNase 1). The structure of the crystalline hRI·RNase 1 complex was determined at a resolution of 1.95 Å, revealing the formation of 19 intermolecular hydrogen bonds involving 13 residues of RNase 1. In contrast, only nine such hydrogen bonds are apparent in the structure of the complex between porcine RI and RNase A. hRI, which is anionic, also appears to use its horseshoe-shaped structure to engender long-range Coulombic interactions with RNase 1, which is cationic. In accordance with the structural data, the hRI·RNase 1 complex was found to be extremely stable (t(1/2) = 81 days; K(d) = 2.9 × 10^-16 M). Site-directed mutagenesis experiments enabled the identification of two cationic residues in RNase 1, Arg39 and Arg91, that are especially important for both the formation and stability of the complex, and are thus termed "electrostatic targeting residues". Disturbing the electrostatic attraction between hRI and RNase 1 yielded a variant of RNase 1 that maintained ribonucleolytic activity and conformational stability but had a 2.8 × 10^3-fold lower association rate for complex formation and 5.9 × 10^9-fold lower affinity for hRI. This variant of RNase 1, which exhibits the largest decrease in RI affinity of any engineered ribonuclease, is also toxic to human erythroleukemia cells. Together, these results provide new insight into an unusual and important protein-protein interaction, and could expedite the development of human ribonucleases as chemotherapeutic agents.
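The half-life and dissociation constant quoted in the abstract above can be related to rate constants with a short worked calculation. The sketch assumes first-order dissociation, so k(off) = ln 2 / t(1/2), and simple 1:1 binding, so K(d) = k(off) / k(on). These are textbook relations applied here for illustration, not a description of how the authors measured or fitted their data, and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 86400.0

def koff_from_half_life(t_half_s):
    """First-order dissociation rate constant (1/s) from a half-life in seconds."""
    return math.log(2) / t_half_s

def kon_from_kd(k_off, k_d):
    """Association rate constant (1/(M*s)), assuming K(d) = k(off) / k(on)."""
    return k_off / k_d

# Values quoted in the abstract: t(1/2) = 81 days, K(d) = 2.9e-16 M.
k_off = koff_from_half_life(81 * SECONDS_PER_DAY)
k_on = kon_from_kd(k_off, 2.9e-16)

print(f"k_off = {k_off:.2e} 1/s")
print(f"k_on  = {k_on:.2e} 1/(M s)")
```

Under these assumptions the implied association rate constant is on the order of 10^8 per molar per second, which is consistent with the abstract's emphasis on long-range electrostatic attraction.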
Affiliation(s)
- R Jeremy Johnson
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544, USA
|
37
|
Kedlaya RH, Bhat KM, Mitchell J, Darnell SJ, Setaluri V. TRP1 interacting PDZ-domain protein GIPC forms oligomers and is localized to intracellular vesicles in human melanocytes. Arch Biochem Biophys 2006; 454:160-9. [PMID: 16962991 PMCID: PMC2877380 DOI: 10.1016/j.abb.2006.08.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2006] [Revised: 08/05/2006] [Accepted: 08/08/2006] [Indexed: 11/18/2022]
Abstract
PDZ proteins coordinate assembly of protein complexes that participate in diverse biological processes. GIPC is a multifunctional PDZ protein that interacts with several soluble and membrane proteins. Unlike most PDZ proteins, GIPC contains a single PDZ domain, and the mechanisms by which GIPC mediates its actions remain unclear. We investigated the possibility that, in lieu of multiple PDZ domains, GIPC forms multimers. Here, we demonstrate that GIPC can bind to itself and that the PDZ domain is involved in GIPC-GIPC interaction. Gel filtration, sucrose gradient centrifugation and chemical cross-linking showed that whereas the bulk of cytosolic GIPC was present as monomer, oligomers with an estimated molecular mass corresponding to a GIPC homotrimer were readily detectable in the membrane fraction. Modeling of the GIPC PDZ domain showed the feasibility of trimerization. Immunogold electron microscopy showed that GIPC is present in clusters near vesicles. Our data suggest that oligomers of GIPC mediate its functions in melanocytes.
Affiliation(s)
- Kumar M.R. Bhat
- Department of Dermatology, University of Wisconsin, Madison, WI 53706, USA
- Julie Mitchell
- Department of Mathematics and Biochemistry, University of Wisconsin, Madison, WI 53706, USA
- Steven J. Darnell
- Department of Mathematics and Biochemistry, University of Wisconsin, Madison, WI 53706, USA
|
38
|
Roemer SC, Donham DC, Sherman L, Pon VH, Edwards DP, Churchill MEA. Structure of the progesterone receptor-deoxyribonucleic acid complex: novel interactions required for binding to half-site response elements. Mol Endocrinol 2006; 20:3042-52. [PMID: 16931575 PMCID: PMC2532839 DOI: 10.1210/me.2005-0511] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
The DNA binding domain (DBD) of nuclear hormone receptors contains a highly conserved globular domain and a less conserved carboxyl-terminal extension (CTE). Despite previous observations that the CTEs of some classes of nuclear receptors are structured and interact with DNA outside of the hexanucleotide hormone response element (HRE), there has been no evidence for such a CTE among the steroid receptors. We have determined the structure of the progesterone receptor (PR)-DBD-CTE DNA complex at a resolution of 2.5 Å, which revealed binding of the CTE to the minor groove flanking the HREs. Alanine substitutions of the interacting CTE residues reduced affinity for inverted repeat HREs separated by three nucleotides, and essentially abrogated binding to a single HRE. A highly compressed minor groove of the trinucleotide spacer and a novel dimerization interface were also observed. A PR binding site selection experiment revealed sequence preferences in the trinucleotide spacer and flanking DNA. These results, taken together, support the notion that sequences outside of the HREs influence the DNA binding affinity and specificity of steroid receptors.
Affiliation(s)
- Sarah C Roemer
- Program in Molecular Biology, Department of Pharmacology, University of Colorado at Denver and Health Sciences Center, Aurora, Colorado 80045, USA
|
39
|
Mitchell JC, Shahbaz S, Ten Eyck LF. Interfaces in Molecular Docking. MOLECULAR SIMULATION 2006. [DOI: 10.1080/0892702031000152217] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
40
|
Rutkoski TJ, Kurten EL, Mitchell JC, Raines RT. Disruption of shape-complementarity markers to create cytotoxic variants of ribonuclease A. J Mol Biol 2005; 354:41-54. [PMID: 16188273 DOI: 10.1016/j.jmb.2005.08.007] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2005] [Revised: 08/06/2005] [Accepted: 08/08/2005] [Indexed: 10/25/2022]
Abstract
Onconase (ONC), an amphibian member of the bovine pancreatic ribonuclease A (RNase A) superfamily, is in phase III clinical trials as a treatment for malignant mesothelioma. RNase A is a far more efficient catalyst of RNA cleavage than ONC but is not cytotoxic. The innate ability of ONC to evade the cytosolic ribonuclease inhibitor protein (RI) is likely to be a primary reason for its cytotoxicity. In contrast, the non-covalent interaction between RNase A and RI is one of the strongest known, with the RI·RNase A complex having a K(d) value in the femtomolar range. Here, we report on the use of the fast atomic density evaluation (FADE) algorithm to identify regions in the molecular interface of the RI·RNase A complex that exhibit a high degree of geometric complementarity. Guided by these "knobs" and "holes", we designed variants of RNase A that evade RI. The D38R/R39D/N67R/G88R substitution increased the K(d) value of the pRI·RNase A complex by 20 × 10^6-fold (to 1.4 µM) with little change to catalytic activity or conformational stability. This and two related variants of RNase A were more toxic to human cancer cells than was ONC. Notably, these cytotoxic variants exerted their toxic activity on cancer cells selectively, and more selectively than did ONC. Substitutions that further diminish affinity for RI (which has a cytosolic concentration of 4 µM) are unlikely to produce a substantial increase in cytotoxic activity. These results demonstrate the utility of the FADE algorithm in the examination of protein-protein interfaces and represent a landmark towards the goal of developing chemotherapeutics based on mammalian ribonucleases.
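The affinity figures in the abstract above lend themselves to a short worked calculation. The sketch below back-calculates the wild-type K(d) from the reported fold change, converts the fold change into a binding free-energy difference, and estimates the fraction of ribonuclease left unbound at the quoted RI concentration. It assumes 1:1 equilibrium binding, RI in large excess over the ribonuclease, and a temperature of 298.15 K. None of these assumptions or function names come from the source.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold(fold, temp_k=298.15):
    """Free-energy penalty (kcal/mol) for a fold increase in K(d); temperature is assumed."""
    return R_KCAL * temp_k * math.log(fold)

def fraction_free(k_d, inhibitor_conc):
    """Unbound fraction for 1:1 binding, assuming the inhibitor is in large excess."""
    return k_d / (k_d + inhibitor_conc)

fold = 20e6          # reported fold increase in K(d)
kd_variant = 1.4e-6  # reported K(d) of the variant complex, in M
ri_conc = 4e-6       # reported cytosolic RI concentration, in M

kd_wild_type = kd_variant / fold  # implied K(d) before the substitutions
ddg = ddg_from_fold(fold)
free = fraction_free(kd_variant, ri_conc)

print(f"implied wild-type K(d) = {kd_wild_type:.1e} M")
print(f"ddG of binding         = {ddg:.1f} kcal/mol")
print(f"fraction unbound       = {free:.2f}")
```

One reading of the result is that, once K(d) approaches the cytosolic RI concentration, a substantial fraction of the enzyme is already free, which is in line with the abstract's statement that further loss of affinity is unlikely to add much cytotoxic activity.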
Affiliation(s)
- Thomas J Rutkoski
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
|
41
|
Carpy AJM, Marchand-Geneste N. e-molecular shapes and properties. SAR QSAR Environ Res 2003; 14:329-337. [PMID: 14758977 DOI: 10.1080/10629360310001623926] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 05/24/2023]
Abstract
Owing to recent advances in computer technology, shape analysis has gained importance in all domains. In drug design and proteomics, molecular surfaces (van der Waals surface, solvent accessible surface, solvent excluded surface, polar surface area, electron density surface, separating surface, etc.) and buried surfaces (gap, cleft, cavity, etc.), as well as the shape properties of these surfaces, can be easily computed and visualized via the Internet. Internet resources that are freely available for academic use are reviewed.
Affiliation(s)
- A J M Carpy
- Laboratoire de Physico- et Toxico-Chimie des Systèmes Naturels, UMR 5472 CNRS, Université de Bordeaux 1, 351, Cours de la Libération 33405, Talence cedex, France.
|
42
|
Law DS, Ten Eyck LF, Katzenelson O, Tsigelny I, Roberts VA, Pique ME, Mitchell JC. Finding needles in haystacks: Reranking DOT results by using shape complementarity, cluster analysis, and biological information. Proteins 2003; 52:33-40. [PMID: 12784365 DOI: 10.1002/prot.10395] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Abstract
We present an evaluation of our results for the first Critical Assessment of PRedicted Interaction (CAPRI). The methods used include the molecular docking program DOT, shape analysis tool FADE, cluster analysis and filtering based on biological data. Good results were obtained for most of the seven CAPRI targets, and for two systems, submissions having the highest number of correctly predicted contacts were produced.
Affiliation(s)
- Dennis S Law
- San Diego Supercomputer Center, University of California at San Diego, La Jolla 92093-0527, USA
|
43
|
Abstract
A new method, using circular variance, is introduced for mapping macromolecular topography. Circular variance, generally used to measure angular spread, can be used to characterize molecular structures based on a simple idea. It will be shown that the circular variance of vectors drawn from some origin to a set of points is well correlated with the degree to which that origin is inside/outside the chosen points. In addition, it has continuous derivatives that are also easy to compute. This concept will be shown to be useful for: (i) distinguishing between atoms near the surface of a macromolecule and those in either the deep interior or remote exterior; (ii) identifying invaginations (even shallow ones); and (iii) detecting linker regions that interconnect two domains.
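The inside/outside behaviour described in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of circular variance, one minus the length of the mean of the unit vectors from the origin to each point; the abstract does not state the paper's exact formula, so the normalization here is an assumption. The function name `circular_variance` and the cube test data are hypothetical and are not taken from the paper.

```python
import itertools
import math


def circular_variance(origin, points):
    """Circular variance of the unit vectors drawn from origin to each point.

    Assumed definition: 1 - |sum of unit vectors| / n. The value lies in
    [0, 1]: near 1 when the points surround the origin, near 0 when they
    all lie in roughly the same direction from it.
    """
    sx = sy = sz = 0.0
    n = 0
    for px, py, pz in points:
        dx, dy, dz = px - origin[0], py - origin[1], pz - origin[2]
        length = math.sqrt(dx * dx + dy * dy + dz * dz)
        if length == 0.0:
            continue  # a point sitting on the origin has no direction
        sx += dx / length
        sy += dy / length
        sz += dz / length
        n += 1
    if n == 0:
        raise ValueError("need at least one point distinct from the origin")
    resultant = math.sqrt(sx * sx + sy * sy + sz * sz)
    return 1.0 - resultant / n


# Hypothetical toy "molecule": the eight corners of a cube centred on (0, 0, 0).
cube = list(itertools.product((-1.0, 1.0), repeat=3))

inside = circular_variance((0.0, 0.0, 0.0), cube)     # origin enclosed by the points
outside = circular_variance((100.0, 0.0, 0.0), cube)  # origin far outside the points
print(f"inside: {inside:.3f}, outside: {outside:.3f}")
```

With the origin at the cube's centre the unit vectors cancel and the value approaches 1; with the origin far away they are nearly parallel and the value approaches 0. Intermediate values would correspond to origins near the surface of the point set.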
Affiliation(s)
- Mihaly Mezei
- Department of Physiology and Biophysics, Mount Sinai School of Medicine, NYU, NY 10029, USA.
|
44
|
Hugot M, Bensel N, Vogel M, Reymond MT, Stadler B, Reymond JL, Baumann U. A structural basis for the activity of retro-Diels-Alder catalytic antibodies: evidence for a catalytic aromatic residue. Proc Natl Acad Sci U S A 2002; 99:9674-8. [PMID: 12093912 PMCID: PMC124973 DOI: 10.1073/pnas.142286599] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Received: 01/31/2002] [Accepted: 05/13/2002] [Indexed: 11/18/2022]
Abstract
The nitroxyl synthase catalytic antibodies 10F11, 9D9, and 27C5 catalyze the release of nitroxyl from a bicyclic pro-drug by accelerating a retro-Diels-Alder reaction. The Fabs (antigen-binding fragments) of these three catalytic antibodies were cloned and sequenced. Fab 9D9 was crystallized in the apo-form and in complex with one transition state analogue of the reaction. Crystal structures of Fab 10F11 in complex with ligands mimicking substrate, transition state, and product have been determined at resolutions ranging from 1.8 to 2.3 Å. Antibodies 9D9 and 10F11 show increased shape complementarity (as quantified by the program sc) to the hapten and to a modeled transition state as compared with substrate and product. The shape complementarity is mediated to a large extent by an aromatic residue (tyrosine or tryptophan) at the bottom of the hydrophobic active pocket, which undergoes pi-stacking interactions with the aromatic rings of the ligands. Another factor contributing to the different reactivity of the regioisomers probably arises from hydrogen-bonding interactions between the nitroxyl bridge and the backbone amide of PheH101 and possibly a conserved water molecule.
Affiliation(s)
- Marina Hugot
- Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, CH-3012 Berne, Switzerland
|