1
Abstract
In structural genomics projects, virtual high-throughput ligand screening can provide important functional details for newly determined protein structures. Using a variety of publicly available software tools, it is possible to computationally model, predict, and evaluate how different ligands interact with a given protein. At the Center for Structural Genomics of Infectious Diseases (CSGID), a series of protein analysis, docking, and molecular dynamics programs is scripted into a single hierarchical pipeline, allowing for an exhaustive investigation of protein-ligand interactions. The ability to make accurate computational predictions of protein-ligand binding is a vital component in improving both the efficiency and the economics of drug discovery. Computational simulations can minimize experimental effort, the slowest and most cost-prohibitive aspect of identifying new therapeutics.
Affiliation(s)
- T Andrew Binkowski
- Center for Structural Genomics of Infectious Diseases, Computation Institute, University of Chicago, Chicago, IL, USA,
2
Borchers CH, Kast J, Foster LJ, Siu KWM, Overall CM, Binkowski TA, Hildebrand WH, Scherer A, Mansoor M, Keown PA. The Human Proteome Organization Chromosome 6 Consortium: integrating chromosome-centric and biology/disease driven strategies. J Proteomics 2014; 100:60-7. [PMID: 23933161 PMCID: PMC4096956 DOI: 10.1016/j.jprot.2013.08.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/25/2013] [Accepted: 08/01/2013] [Indexed: 11/20/2022]
Abstract
The Human Proteome Project (HPP) is designed to generate a comprehensive map of the protein-based molecular architecture of the human body, to provide a resource to help elucidate biological and molecular function, and to advance diagnosis and treatment of diseases. Within this framework, the chromosome-based HPP (C-HPP) has allocated responsibility for mapping individual chromosomes by country or region, while the biology/disease HPP (B/D-HPP) coordinates these teams in cross-functional disease-based groups. Chromosome 6 (Ch6) provides an excellent model for integration of these two tasks. This metacentric chromosome has a complement of 1002-1034 genes that code for known, novel or putative proteins. Ch6 is functionally associated with more than 120 major human diseases, many with high population prevalence, devastating clinical impact and profound societal consequences. The unique combination of genomic, proteomic, metabolomic, phenomic and health services data being drawn together within the Ch6 program has enormous potential to advance personalized medicine by promoting robust biomarkers, subunit vaccines and new drug targets. The strong liaison between the clinical and laboratory teams, and the structured framework for technology transfer and health policy decisions within Canada will increase the speed and efficacy of this transition, and the value of this translational research. BIOLOGICAL SIGNIFICANCE Canada has been selected to play a leading role in the international Human Proteome Project, the global counterpart of the Human Genome Project designed to understand the structure and function of the human proteome in health and disease. Canada will lead an international team focusing on chromosome 6, which is functionally associated with more than 120 major human diseases, including immune and inflammatory disorders affecting the brain, skeletal system, heart and blood vessels, lungs, kidney, liver, gastrointestinal tract and endocrine system. 
Many of these chronic and persistent diseases have a high population prevalence, devastating clinical impact and profound societal consequences. As a result, they impose a multi-billion dollar economic burden on Canada and on all advanced societies through direct costs of patient care, the loss of health and productivity, and extensive caregiver burden. There is no definitive treatment at the present time for any of these disorders. The manuscript outlines the research which will involve a systematic assessment of all chromosome 6 genes, development of a knowledge base, and development of assays and reagents for all chromosome 6 proteins. We feel that the informatic infrastructure and MRM assays developed will place the chromosome 6 consortium in an excellent position to be a leading player in this major international research initiative. This article is part of a Special Issue: Can Proteomics Fill the Gap Between Genomics and Phenotypes?
Affiliation(s)
- C H Borchers
- University of Victoria/Genome BC Proteomics Centre, Victoria, BC, Canada
- J Kast
- Biomedical Research Centre, University of British Columbia, Vancouver, BC, Canada
- L J Foster
- Centre for High Throughput Biology, University of British Columbia, BC, Canada
- K W M Siu
- Centre for Research in Mass Spectrometry, York University, Ontario, Canada
- C M Overall
- Centre for Blood Research, Faculty of Dentistry, University of British Columbia, Canada
- T A Binkowski
- Midwest Centre for Structural Genomics, Argonne National Laboratory and Computation Institute, University of Chicago, USA
- W H Hildebrand
- Department of Microbiology and Immunology, University of Oklahoma, OK, USA
- A Scherer
- Australian Genome Research Facility, Walter and Eliza Hall Institute, Parkville, Australia
- M Mansoor
- Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- P A Keown
- Department of Medicine, University of British Columbia, Vancouver, BC, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada.
3
Tan K, Chhor G, Binkowski TA, Jedrzejczak RP, Makowska-Grzyska M, Joachimiak A. Sensor domain of histidine kinase KinB of Pseudomonas: a helix-swapped dimer. J Biol Chem 2014; 289:12232-44. [PMID: 24573685 DOI: 10.1074/jbc.m113.514836] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022] Open
Abstract
The overproduction of polysaccharide alginate is responsible for the formation of mucus in the lungs of cystic fibrosis patients. Histidine kinase KinB of the KinB-AlgB two-component system in Pseudomonas aeruginosa acts as a negative regulator of alginate biosynthesis. The modular architecture of KinB is similar to other histidine kinases. However, its periplasmic signal sensor domain is unique and is found only in the Pseudomonas genus. Here, we present the first crystal structures of the KinB sensor domain. The domain is a dimer in solution, and in the crystal it shows an atypical dimer of a helix-swapped four-helix bundle. A positively charged cavity is formed on the dimer interface and involves several strictly conserved residues, including Arg-60. A phosphate anion is bound asymmetrically in one of the structures. In silico docking identified several monophosphorylated sugars, including β-D-fructose 6-phosphate and β-D-mannose 6-phosphate, a precursor and an intermediate of alginate synthesis, respectively, as potential KinB ligands. Ligand binding was confirmed experimentally. Conformational transition from a symmetric to an asymmetric structure and decreasing dimer stability caused by ligand binding may be a part of the signal transduction mechanism of the KinB-AlgB two-component system.
Affiliation(s)
- Kemin Tan
- From the Midwest Center for Structural Genomics and
4
Binkowski TA, Marino SR, Joachimiak A. Predicting HLA class I non-permissive amino acid residues substitutions. PLoS One 2012; 7:e41710. [PMID: 22905104 PMCID: PMC3414483 DOI: 10.1371/journal.pone.0041710] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 03/13/2012] [Accepted: 06/27/2012] [Indexed: 12/20/2022] Open
Abstract
Prediction of peptide binding to human leukocyte antigen (HLA) molecules is essential to a wide range of clinical entities from vaccine design to stem cell transplant compatibility. Here we present a new structure-based methodology that applies robust computational tools to model peptide-HLA (p-HLA) binding interactions. The method leverages the structural conservation observed in p-HLA complexes to significantly reduce the search space and calculate the system’s binding free energy. This approach is benchmarked against existing p-HLA complexes and the prediction performance is measured against a library of experimentally validated peptides. The effect on binding activity across a large set of high-affinity peptides is used to investigate amino acid mismatches reported as high-risk factors in hematopoietic stem cell transplantation.
Affiliation(s)
- T Andrew Binkowski
- Biosciences Division, Argonne National Laboratory, Midwest Center for Structural Genomics, Argonne, Illinois, United States of America
5
Makowska-Grzyska M, Kim Y, Wu R, Wilton R, Gollapalli DR, Wang XK, Zhang R, Jedrzejczak R, Mack JC, Maltseva N, Mulligan R, Binkowski TA, Gornicki P, Kuhn ML, Anderson WF, Hedstrom L, Joachimiak A. Bacillus anthracis inosine 5'-monophosphate dehydrogenase in action: the first bacterial series of structures of phosphate ion-, substrate-, and product-bound complexes. Biochemistry 2012; 51:6148-63. [PMID: 22788966 DOI: 10.1021/bi300511w] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 01/14/2023]
Abstract
Inosine 5'-monophosphate dehydrogenase (IMPDH) catalyzes the first unique step of the GMP branch of the purine nucleotide biosynthetic pathway. This enzyme is found in organisms of all three kingdoms. IMPDH inhibitors have broad clinical applications in cancer treatment, as antiviral drugs and as immunosuppressants, and have also displayed antibiotic activity. We have determined three crystal structures of Bacillus anthracis IMPDH, in a phosphate ion-bound (termed "apo") form and in complex with its substrate, inosine 5'-monophosphate (IMP), and product, xanthosine 5'-monophosphate (XMP). This is the first example of a bacterial IMPDH in more than one state from the same organism. Furthermore, for the first time for a prokaryotic enzyme, the entire active site flap, containing the conserved Arg-Tyr dyad, is clearly visible in the structure of the apoenzyme. Kinetic parameters for the enzymatic reaction were also determined, and the inhibitory effect of XMP and mycophenolic acid (MPA) has been studied. In addition, the inhibitory potential of two known Cryptosporidium parvum IMPDH inhibitors was examined for the B. anthracis enzyme and compared with those of three bacterial IMPDHs from Campylobacter jejuni, Clostridium perfringens, and Vibrio cholerae. The structures contribute to the characterization of the active site and design of inhibitors that specifically target B. anthracis and other microbial IMPDH enzymes.
Affiliation(s)
- Magdalena Makowska-Grzyska
- Center for Structural Genomics of Infectious Diseases, University of Chicago, 5735 South Ellis Avenue, Chicago, IL 60637, USA
6
Binkowski TA, Marino SR, Joachimiak A. 13-OR Computational modeling of peptides into HLA-A*02:01 molecules. Hum Immunol 2011. [DOI: 10.1016/j.humimm.2011.07.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
7
Kern J, Wilton R, Zhang R, Binkowski TA, Joachimiak A, Schneewind O. Structure of surface layer homology (SLH) domains from Bacillus anthracis surface array protein. J Biol Chem 2011; 286:26042-9. [PMID: 21572039 DOI: 10.1074/jbc.m111.248070] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Indexed: 11/06/2022] Open
Abstract
Surface (S)-layers, para-crystalline arrays of protein, are deposited in the envelope of most bacterial species. These surface organelles are retained in the bacterial envelope through the non-covalent association of proteins with cell wall carbohydrates. Bacillus anthracis, a Gram-positive pathogen, produces S-layers of the protein Sap, which uses three consecutive repeats of the surface-layer homology (SLH) domain to engage secondary cell wall polysaccharides (SCWP). Using x-ray crystallography, we reveal here the structure of these SLH domains, which assume the shape of a three-prong spindle. Each SLH domain contributes to a three-helical bundle at the spindle base, whereas another α-helix and its connecting loops generate the three prongs. The inter-prong grooves contain conserved cationic and anionic residues, which are necessary for SLH domains to bind the B. anthracis SCWP. Modeling experiments suggest that the SLH domains of other S-layer proteins also fold into three-prong spindles and capture bacterial envelope carbohydrates by a similar mechanism.
Affiliation(s)
- Justin Kern
- Department of Microbiology, University of Chicago, Chicago, Illinois 60637, USA
8
Binkowski TA, Cuff M, Nocek B, Chang C, Joachimiak A. Assisted assignment of ligands corresponding to unknown electron density. J Struct Funct Genomics 2010; 11:21-30. [PMID: 20091237 DOI: 10.1007/s10969-010-9078-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/27/2009] [Accepted: 01/03/2010] [Indexed: 11/28/2022]
Abstract
A semi-automated computational procedure to assist in the identification of bound ligands from unknown electron density has been developed. The atomic surface surrounding the density blob is compared to a library of three-dimensional ligand binding surfaces extracted from the Protein Data Bank (PDB). Ligands corresponding to surfaces which share physicochemical texture and geometric shape similarities are considered for assignment. The method is benchmarked against a set of well represented ligands from the PDB, in which we show that we can identify the correct ligand based on the corresponding binding surface. Finally, we apply the method during model building and refinement stages from structural genomics targets in which unknown density blobs were discovered. A semi-automated computational method is described which aims to assist crystallographers with assigning the identity of a ligand corresponding to unknown electron density. Using shape and physicochemical similarity assessments between the protein surface surrounding the density and a database of known ligand binding surfaces, a plausible list of candidate ligands are identified for consideration. The method is validated against highly observed ligands from the Protein Data Bank and results are shown from its use in a high-throughput structural genomics pipeline.
Affiliation(s)
- T Andrew Binkowski
- Midwest Center for Structural Genomics (MCSG), Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA.
9
Liang J, Tseng YY, Dundas J, Binkowski TA, Joachimiak A, Ouyang Z, Adamian L. Chapter 4. Predicting and characterizing protein functions through matching geometric and evolutionary patterns of binding surfaces. Adv Protein Chem Struct Biol 2009; 75:107-41. [PMID: 20731991 PMCID: PMC2882714 DOI: 10.1016/s0065-3233(07)75004-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/06/2023]
Abstract
Predicting protein functions from structures is an important and challenging task. Although proteins are often thought to be packed as tightly as solids, closer examination based on geometric computation reveals that they contain numerous voids and pockets. Most of them are of random nature, but some are binding sites providing surfaces to interact with other molecules. A promising approach for function prediction is to infer functions through discovery of similarity in local binding pockets, as proteins binding to similar substrates/ligands and carrying out similar functions have similar physical constraints for binding and reactions. In this chapter, we describe computational methods to distinguish those surface pockets that are likely to be involved in important biological functions, and methods to identify key residues in these pockets. We further describe how to predict protein functions at large scale from structures by detecting binding surfaces similar in residue make-up, shape, and orientation. We also describe a Bayesian Monte Carlo method that can separate selection pressure due to biological function from pressure due to protein folding. We show how this method can be used to reconstruct the evolutionary history of binding surfaces for detecting similar binding surfaces. In addition, we briefly discuss how the negative image of a binding pocket can be cast, and how such information can be used to facilitate drug discovery.
Affiliation(s)
- Jie Liang
- Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, 200240, China
10
Binkowski TA, Joachimiak A. Protein functional surfaces: global shape matching and local spatial alignments of ligand binding sites. BMC Struct Biol 2008; 8:45. [PMID: 18954462 PMCID: PMC2626596 DOI: 10.1186/1472-6807-8-45] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Received: 06/25/2008] [Accepted: 10/27/2008] [Indexed: 12/02/2022]
Abstract
Background Protein surfaces comprise only a fraction of the total residues but are the most conserved functional features of proteins. Surfaces performing identical functions are found in proteins absent of any sequence or fold similarity. While biochemical activity can be attributed to a few key residues, the broader surrounding environment plays an equally important role. Results We describe a methodology that attempts to optimize two components, global shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and physicochemical texture similarity is assessed through a spatial alignment of conserved residues between the surfaces. The comparisons are used in tandem to efficiently search the Global Protein Surface Survey (GPSS), a library of annotated surfaces derived from structures in the PDB, for studying evolutionary relationships and uncovering novel similarities between proteins. Conclusion We provide an assessment of our method using library retrieval experiments for identifying functionally homologous surfaces binding different ligands, functionally diverse surfaces binding the same ligand, and binding surfaces of ubiquitous and conformationally flexible ligands. Results using surface similarity to predict function for proteins of unknown function are reported. Additionally, an automated analysis of the ATP binding surface landscape is presented to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.
Affiliation(s)
- T Andrew Binkowski
- Midwest Center for Structural Genomics and Structural Biology Center, Biosciences Division, Argonne National Laboratory, Argonne, Illinois 60439, USA.
11
Kim Y, Quartey P, Li H, Volkart L, Hatzos C, Chang C, Nocek B, Cuff M, Osipiuk J, Tan K, Fan Y, Bigelow L, Maltseva N, Wu R, Borovilos M, Duggan E, Zhou M, Binkowski TA, Zhang RG, Joachimiak A. Large-scale evaluation of protein reductive methylation for improving protein crystallization. Nat Methods 2008; 5:853-4. [PMID: 18825126 DOI: 10.1038/nmeth1008-853] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
12
Qiu Y, Zhang R, Binkowski TA, Tereshko V, Joachimiak A, Kossiakoff A. The 1.38 A crystal structure of DmsD protein from Salmonella typhimurium, a proofreading chaperone on the Tat pathway. Proteins 2008; 71:525-33. [PMID: 18175314 DOI: 10.1002/prot.21828] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
The DmsD protein is necessary for the biogenesis of dimethyl sulphoxide (DMSO) reductase in many prokaryotes. It performs a critical chaperone function initiated through its binding to the twin-arginine signal peptide of DmsA, the catalytic subunit of DMSO reductase. Upon binding to DmsD, DmsA is translocated to the periplasm via the so-called twin-arginine translocation (Tat) pathway. Here we report the 1.38 A crystal structure of the protein DmsD from Salmonella typhimurium and compare it with a close functional homolog, TorD. DmsD has an all-alpha fold structure with a notable helical extension located at its N-terminus with two solvent exposed hydrophobic residues. A major difference between DmsD and TorD is that TorD structure is a domain-swapped dimer, while DmsD exists as a monomer. Nevertheless, these two proteins have a number of common features suggesting they function by using similar mechanisms. A possible signal peptide-binding site is proposed based on structural similarities. Computational analysis was used to identify a potential GTP binding pocket on similar surfaces of DmsD and TorD structures.
Affiliation(s)
- Yang Qiu
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637, USA
13
Abstract
BACKGROUND Identifying structurally similar proteins with different chain topologies can aid studies in homology modeling, protein folding, protein design, and protein evolution. These include circularly permuted protein structures, and the more general cases of non-cyclic permutations between similar structures, which are related by non-topological rearrangement beyond circular permutation. We present a method based on an approximation algorithm that finds sequence-order independent structural alignments that are close to optimal. We formulate the structural alignment problem as a special case of the maximum-weight independent set problem, and solve this computationally intensive problem approximately by iteratively solving relaxations of a corresponding integer programming problem. The resulting structural alignment is sequence order independent. Our method is also insensitive to insertions, deletions, and gaps. RESULTS Using a novel similarity score and a statistical model for significance p-value, we are able to discover previously unknown circularly permuted proteins between nucleoplasmin-core protein and auxin binding protein, between aspartate racemase and 3-dehydroquinate dehydratase, as well as between migration inhibition factor and arginine repressor, which involves an additional strand-swapping. We also report the finding of non-cyclic permuted protein structures existing in nature between AML1/core binding factor and riboflavin synthase. Our method can be used for large scale alignment of protein structures regardless of the topology. CONCLUSION The approximation algorithm introduced in this work can find good solutions for the problem of protein structure alignment. Furthermore, this algorithm can detect topological differences between two spatially similar protein structures.
The alignment between MIF and the arginine repressor demonstrates our algorithm's ability to detect structural similarities even when spatial rearrangement of structural units has occurred. The effectiveness of our method is also demonstrated by the discovery of previously unknown circular permutations. In addition, we report in this study the finding of a naturally occurring non-cyclic permuted protein between AML1/Core Binding Factor chain F and riboflavin synthase chain A.
Affiliation(s)
- Joe Dundas
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607-7053, USA.
14
Binkowski TA, DasGupta B, Liang J. Order independent structural alignment of circularly permuted proteins. Conf Proc IEEE Eng Med Biol Soc 2007; 2004:2781-4. [PMID: 17270854 DOI: 10.1109/iembs.2004.1403795] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
Circular permutation connects the N and C termini of a protein and concurrently cleaves elsewhere in the chain, providing an important mechanism for generating novel protein folds and functions. However, their prevalence in genomes is unknown because current detection methods can miss many occurrences, mistaking random repeats as circular permutation. Here we develop a method for detecting circularly permuted proteins from structural comparison. Sequence order independent alignment of protein structures can be regarded as a special case of the maximum-weight independent set problem, which is known to be computationally hard. We develop an efficient approximation algorithm by repeatedly solving relaxations of an appropriate intermediate integer programming formulation, and we show that the approximation ratio is much better than the theoretical worst case ratio of r=1/4. Circularly permuted proteins reported in the literature can be identified rapidly with our method, while they escape detection by publicly available servers for structural alignment.
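As a rough illustration of the definition given in the abstract above, and not of the structure-based method the paper describes, a circular permutation can be viewed at the sequence level as a rotation of the chain: joining the termini and cutting elsewhere moves a prefix of the sequence to the end. The minimal Python sketch below tests for this using the standard "substring of the doubled string" trick. The function names `is_circular_permutation` and `cut_point` and the toy sequences are hypothetical, and the sketch assumes exact sequence identity, which real permuted proteins rarely show.

```python
from typing import Optional


def is_circular_permutation(a: str, b: str) -> bool:
    """Return True if sequence b is a rotation of sequence a.

    Any rotation of a appears as a contiguous substring of a + a.
    """
    return len(a) == len(b) and b in a + a


def cut_point(a: str, b: str) -> Optional[int]:
    """Return the index in a that becomes the new N-terminus in b.

    Returns None when b is not a rotation of a. An index of 0 means
    the two sequences are identical (no permutation).
    """
    if len(a) != len(b):
        return None
    idx = (a + a).find(b)
    return idx if idx != -1 else None


# Toy example (hypothetical sequences): the chain is cut after residue 4
# and the original termini are joined.
original = "MKTAYIAK"
permuted = "YIAKMKTA"
print(is_circular_permutation(original, permuted))  # True
print(cut_point(original, permuted))                # 4
```

An exact-match test of this kind would miss permuted proteins whose sequences have diverged, which is one reason a structure-based, sequence-order-independent comparison such as the one described in the abstract is of interest.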
15
Abstract
Structural genomics (SG) initiatives are expanding the universe of protein fold space by rapidly determining structures of proteins that were intentionally selected on the basis of low sequence similarity to proteins of known structure. Often these proteins have no associated biochemical or cellular functions. The SG success has resulted in an accelerated deposition of novel structures. In some cases the structural bioinformatics analysis applied to these novel structures has provided specific functional assignment. However, this approach has also uncovered limitations in the functional analysis of uncharacterized proteins using traditional sequence and backbone structure methodologies. A novel method, named pvSOAR (pocket and void Surface of Amino Acid Residues), of comparing the protein surfaces of geometrically defined pockets and voids was developed. pvSOAR was able to detect previously unrecognized and novel functional relationships between surface features of proteins. In this study, pvSOAR is applied to several structural genomics proteins. We examined the surfaces of YecM, BioH, and RpiB from Escherichia coli as well as the CBS domains from inosine-5'-monophosphate dehydrogenase from Streptococcus pyogenes, conserved hypothetical protein Ta549 from Thermoplasma acidophilum, and CBS domain protein mt1622 from Methanobacterium thermoautotrophicum with the goal of inferring information about their biochemical function.
Affiliation(s)
- T Andrew Binkowski
- Department of Bioengineering, The University of Illinois, 851 South Morgan St., Room 218, Chicago, IL 60607, USA.
16
Binkowski TA, Freeman P, Liang J. pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins. Nucleic Acids Res 2004; 32:W555-8. [PMID: 15215448 PMCID: PMC441528 DOI: 10.1093/nar/gkh390] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open
Abstract
Detecting similar protein surfaces provides an important route for discovering unrecognized or novel functional relationship between proteins. The web server pvSOAR (pocket and void Surfaces Of Amino acid Residues) provides an online resource to identify similar protein surface regions. pvSOAR can take a structure either uploaded by a user or obtained from the Protein Data Bank, and identifies similar surface patterns based on geometrically defined pockets and voids. It provides several search modes to compare protein surfaces by similarity in local sequence, local shape and local orientation. Statistically significant search results are reported for visualization and interactive exploration. pvSOAR can be used to predict biological functions of proteins with known three-dimensional structures but unknown biological roles. It can also be used to study functional relationship between proteins and for exploration of the evolutionary origins of structural elements important for protein function. We present an example using pvSOAR to explore the biological roles of a protein whose structure was solved by the structural genomics project. The pvSOAR web server is available at http://pvsoar.bioengr.uic.edu/.
Affiliation(s)
- T Andrew Binkowski
- Department of Bioengineering, The University of Illinois at Chicago, Chicago, IL 60607-7052, USA
17
Stitziel NO, Binkowski TA, Tseng YY, Kasif S, Liang J. topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucleic Acids Res 2004; 32:D520-2. [PMID: 14681472 PMCID: PMC308838 DOI: 10.1093/nar/gkh104] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022] Open
Abstract
The database of topographic mapping of Single Nucleotide Polymorphism (topoSNP) provides an online resource for analyzing non-synonymous SNPs (nsSNPs) that can be mapped onto known 3D structures of proteins. These include disease-associated nsSNPs derived from the Online Mendelian Inheritance in Man (OMIM) database and other nsSNPs derived from dbSNP, a resource at the National Center for Biotechnology Information that catalogs SNPs. TopoSNP further classifies each nsSNP site into three categories based on their geometric location: those located in a surface pocket or an interior void of the protein, those on a convex region or a shallow depressed region, and those that are completely buried in the interior of the protein structure. These unique geometric descriptions provide more detailed mapping of nsSNPs to protein structures. The current release also includes relative entropy of SNPs calculated from multiple sequence alignment as obtained from the Pfam database (a database of protein families and conserved protein motifs) as well as manually adjusted multiple alignments obtained from ClustalW. These structural and conservational data can be useful for studying whether nsSNPs in coding regions are likely to lead to phenotypic changes. TopoSNP includes an interactive structural visualization web interface, as well as downloadable batch data. The database will be updated at regular intervals and can be accessed at: http://gila.bioengr.uic.edu/snp/toposnp.
Affiliation(s)
- Nathan O Stitziel
- Department of Bioengineering, University of Illinois at Chicago, M/C 063, 851 S. Morgan Street, Chicago, IL 60607, USA
|
18
|
Abstract
We describe a novel approach for inferring functional relationships of proteins by detecting sequence and spatial patterns of protein surfaces. Well-formed concave surface regions in the form of pockets and voids are examined to identify similarity relationships that might be directly related to protein function. We first exhaustively identify and analytically measure all 910,379 surface pockets and interior voids on 12,177 protein structures from the Protein Data Bank. The similarity of the patterns of residues forming pockets and voids is then assessed in sequence, in spatial arrangement, and in orientational arrangement. Statistical significance in the form of E- and p-values is then estimated for each of the three types of similarity measurements. Our method is fully automated, requires no human intervention, and can be used without input of query patterns. It does not assume any prior knowledge of the functional residues of a protein, and can detect similarity based on both small and large surface patterns. It also tolerates, to some extent, conformational flexibility of functional sites. We show with examples that this method can detect functional relationships with specificity for members of the same protein family and superfamily, as well as remotely related functional surfaces from proteins of different fold structures. We envision that this method can be used for discovering novel functional relationships of protein surfaces, for functional annotation of protein structures with unknown biological roles, and for further inquiries into the evolutionary origins of structural elements important for protein function.
Affiliation(s)
- T Andrew Binkowski
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607-7052, USA
|
19
|
Binkowski TA, Naghibzadeh S, Liang J. CASTp: Computed Atlas of Surface Topography of proteins. Nucleic Acids Res 2003; 31:3352-5. [PMID: 12824325 PMCID: PMC168919 DOI: 10.1093/nar/gkg512] [Citation(s) in RCA: 482] [Impact Index Per Article: 23.0] [Received: 02/14/2003] [Accepted: 03/06/2003] [Indexed: 11/13/2022]
Abstract
Computed Atlas of Surface Topography of proteins (CASTp) provides an online resource for locating, delineating and measuring concave surface regions on three-dimensional structures of proteins. These include pockets located on protein surfaces and voids buried in the interior of proteins. The measurements include the area and volume of each pocket or void under the solvent accessible surface model (Richards' surface) and under the molecular surface model (Connolly's surface), all calculated analytically. CASTp can be used to study surface features and functional regions of proteins. CASTp includes a graphical user interface, flexible interactive visualization, as well as on-the-fly calculation for user-uploaded structures. CASTp is updated daily and can be accessed at http://cast.engr.uic.edu.
Affiliation(s)
- T Andrew Binkowski
- Department of Bioengineering, MC-063, University of Illinois at Chicago, 851 S. Morgan Street, Chicago, IL 60607, USA
|
20
|
Abstract
Higher-order interactions are important for protein folding and assembly. We introduce the concept of interhelical three-body interactions as derived from Delaunay triangulation and alpha shapes of protein structures. In addition to glycophorin A, where triplets are strongly correlated with protein stability, we found that tight interhelical triplet interactions exist extensively in other membrane proteins, where many types of triplets occur far more frequently than in soluble proteins. We developed a probabilistic model for estimating the value of membrane helical interaction triplet (MHIT) propensity. Because the number of known structures of membrane proteins is limited, we developed a bootstrap method for determining the 95% confidence intervals of estimated MHIT values. We identified triplets that have high propensity for interhelical interactions and are unique to membrane proteins, e.g. AGF, AGG, GLL, GFF and others. A significant fraction (32%) of triplet types contains triplets that may be involved in interhelical hydrogen bond interactions, suggesting a prevalent and important role for H-bonds in the assembly of TM helices. There are several well-defined spatial conformations for triplet interactions on helices with similar parallel or antiparallel orientations and with similar right-handed or left-handed crossing angles. Often, they contain small residues and correspond to the regions of closest contact between helices. Sequence motifs such as GG4 and AG4 can be part of three-body interactions that have similar conformations, which in turn can be part of a higher-order cooperative four-residue spatial motif observed in helical pairs from different proteins. In many cases, spatial motifs such as the serine zipper and the polar clamp are part of triplet interactions. On the basis of the analysis of the archaeal rhodopsin family of proteins, tightly packed triplet interactions can be achieved with several different choices of amino acid residues.
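The bootstrap step described in this abstract can be illustrated with a generic percentile bootstrap. The Python sketch below is not the authors' probabilistic model: the propensity here is simply an observed fraction divided by a caller-supplied expected fraction, and the names `propensity` and `bootstrap_ci`, the sample data, and the expected fraction of 0.1 are all hypothetical.

```python
import random


def propensity(triplets, target, expected_fraction):
    """Observed fraction of `target` triplets divided by an expected fraction.

    Simplified stand-in for a propensity score; the MHIT model itself is
    not reproduced here.
    """
    observed_fraction = sum(1 for t in triplets if t == target) / len(triplets)
    return observed_fraction / expected_fraction


def bootstrap_ci(triplets, target, expected_fraction,
                 n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the propensity."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(triplets)
    estimates = sorted(
        # Resample the observed triplets with replacement, same sample size.
        propensity([triplets[rng.randrange(n)] for _ in range(n)],
                   target, expected_fraction)
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1,
                          int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Made-up sample: 30 of 100 observed triplets are of type AGF.
sample = ["AGF"] * 30 + ["GLL"] * 70
point = propensity(sample, "AGF", expected_fraction=0.1)
low, high = bootstrap_ci(sample, "AGF", expected_fraction=0.1)
print(point, low, high)
```

With few observations the interval is wide, which is why resampling is useful when only a limited number of structures is available.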
Affiliation(s)
- Larisa Adamian
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, USA
|