1
|
Weaver GC, Arya R, Schneider CL, Hudson AW, Stern LJ. Structural Models for Roseolovirus U20 And U21: Non-Classical MHC-I Like Proteins From HHV-6A, HHV-6B, and HHV-7. Front Immunol 2022; 13:864898. [PMID: 35444636 PMCID: PMC9013968 DOI: 10.3389/fimmu.2022.864898] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/29/2022] [Accepted: 03/08/2022] [Indexed: 01/08/2023] Open
Abstract
Human roseolovirus U20 and U21 are type I membrane glycoproteins that have been implicated in immune evasion by interfering with recognition of classical and non-classical MHC proteins. U20 and U21 are predicted to be type I glycoproteins with extracytosolic immunoglobulin-like domains, but detailed structural information is lacking. AlphaFold and RoseTTAfold are next generation machine-learning-based prediction engines that recently have revolutionized the field of computational three-dimensional protein structure prediction. Here, we review the structural biology of viral immunoevasins and the current status of computational structure prediction algorithms. We use these computational tools to generate structural models for U20 and U21 proteins, which are predicted to adopt MHC-Ia-like folds with closed MHC platforms and immunoglobulin-like domains. We evaluate these structural models and place them within current understanding of the structural basis for viral immune evasion of T cell and natural killer cell recognition.
Affiliation(s)
- Grant C. Weaver
- Immunology and Microbiology Graduate Program, Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA, United States
- Department of Pathology, UMass Chan Medical School, Worcester, MA, United States
- Richa Arya
- Department of Pathology, UMass Chan Medical School, Worcester, MA, United States
- Amy W. Hudson
- Department of Microbiology and Molecular Genetics, Medical College of Wisconsin, Milwaukee, WI, United States
- Lawrence J. Stern
- Immunology and Microbiology Graduate Program, Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA, United States
- Department of Pathology, UMass Chan Medical School, Worcester, MA, United States
- Department of Biochemistry and Molecular Biotechnology, UMass Chan Medical School, Worcester, MA, United States
|
2
|
DRAMP 2.0, an updated data repository of antimicrobial peptides. Sci Data 2019; 6:148. [PMID: 31409791 PMCID: PMC6692298 DOI: 10.1038/s41597-019-0154-y] [Citation(s) in RCA: 173] [Impact Index Per Article: 34.6] [Received: 04/23/2019] [Accepted: 07/17/2019] [Indexed: 12/20/2022] Open
Abstract
Data Repository of Antimicrobial Peptides (DRAMP, http://dramp.cpu-bioinfor.org/) is an open-access comprehensive database containing general, patent and clinical antimicrobial peptides (AMPs). DRAMP has now been updated to version 2.0, which contains a total of 19,899 entries (2,550 newly added), including 5,084 general entries, 14,739 patent entries, and 76 clinical entries. The update covers new entries, structures, annotations, classifications and downloads. Compared with APD and CAMP, DRAMP contains 14,040 non-overlapping sequences (70.56% of DRAMP). To help users trace original references, the PubMed IDs of references are included in the activity information. The data of DRAMP can be downloaded by dataset and activity, and the website source code is also available on a dedicated download page. Although thousands of AMPs have been reported, only a few have entered the clinical stage. In the paper, we describe several AMPs in clinical trials, including their properties, indications and clinicaltrials.gov identifiers. Finally, we present applications of DRAMP in the development of AMPs.
|
3
|
Physics-Based Modeling of Side Chain-Side Chain Interactions in the UNRES Force Field. Springer Series on Bio- and Neurosystems 2019. [DOI: 10.1007/978-3-319-95843-9_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
|
4
|
Identifying key interactions stabilizing DOF zinc finger-DNA complexes using in silico approaches. J Theor Biol 2015; 382:150-9. [PMID: 26092376 DOI: 10.1016/j.jtbi.2015.06.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/20/2015] [Revised: 05/09/2015] [Accepted: 06/06/2015] [Indexed: 11/21/2022]
Abstract
DOF (DNA-binding with one finger) proteins, a family of DNA-binding transcription factors, are members of zinc fingers unique to plants. They are associated with different plant specific phenomena including germination, dormancy, light and defense responses. Until now, there is no report of experimentally solved structure for DOF proteins, making empirical investigation of DOF-DNA interaction more challenging. It has been shown that comparative modeling can be used to reliably predict the three-dimensional (3D) model of structurally unknown proteins whenever a suitable template is available. Furthermore, current molecular mechanics force fields allow prediction of interaction energies for macromolecular complexes. Therefore, the approaches considered in this work were to model the 3D structures of DOF zinc fingers (ZFs) from Arabidopsis thaliana complexed with DNA molecule, to calculate their binding energies, to identify key interactions established between ZFs and DNA, and to determine the impact of the different interactions on the binding energies. The results were used to predict the binding affinities for the novel designed ZFs and may be used in engineering DNA binding proteins.
|
5
|
Berrondo M, Gray JJ. Computed structures of point deletion mutants and their enzymatic activities. Proteins 2011; 79:2844-60. [PMID: 21905110 DOI: 10.1002/prot.23109] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/30/2010] [Revised: 04/08/2011] [Accepted: 05/13/2011] [Indexed: 11/11/2022]
Abstract
Point deletions in enzymes can vary in effect from negligible to complete loss of activity; however, these effects are not generally predictable. Deletions are widely observed in nature and often result in diseases such as cancer, cystic fibrosis, or osteogenesis imperfecta. Here, we have developed an algorithm to model the perturbed structures of deletion mutants with the ultimate goal of predicting their activities. The algorithm works by deleting the specified residue from the wild-type structure, creating a gap that is closed using a combination of local and global moves that change the backbone torsion angles of the protein structure. On a set of five proteins for which both wild-type and deletion mutant x-ray crystal structures are available, the algorithm produces deep, narrow energy funnels within 1.5 Å of the crystal structure for the deletion mutants. To assess the ability of our algorithm to predict activity from the predicted structures, we tested the correlation of experimental activity with several measures of the predicted structure ensemble using a set of 45 point deletions from ricin. Estimates incorporating likely prevalence of active and inactive deletion sites suggest that activity can be predicted correctly over 60% of the time from the active site root-mean squared deviation of the lowest energy predicted structures. The predictions are stronger than simple sequence organization measures, but more fundamental work is required in structure prediction and enzyme activity determination to allow consistent prediction of activity.
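The abstract above scores predicted structures by their root-mean-square deviation (RMSD) from a crystal structure. As a rough illustration of that measure only, here is a minimal Python sketch. It assumes the two coordinate sets list the same atoms in the same order and are already superposed; the function names and the 1.5 Å default cutoff usage are illustrative choices, not the authors' implementation.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumption: both structures are already superposed and the atoms
    are listed in the same order (no alignment is performed here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_cutoff(predicted, reference, cutoff=1.5):
    """True if the predicted coordinates lie within `cutoff` angstroms RMSD."""
    return rmsd(predicted, reference) <= cutoff
```

Restricting the input to active-site atoms would give an active-site RMSD of the kind the abstract mentions; which atoms count as the active site is left to the caller.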
Affiliation(s)
- Monica Berrondo
- Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
6
|
Bhattacharjee N, Biswas P. Position-specific propensities of amino acids in the β-strand. BMC Structural Biology 2010; 10:29. [PMID: 20920153 PMCID: PMC2955036 DOI: 10.1186/1472-6807-10-29] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 05/12/2010] [Accepted: 09/28/2010] [Indexed: 11/23/2022]
Abstract
Background: Despite the importance of β-strands as main building blocks in proteins, the propensity of amino acid in β-strands is not well-understood as it has been more difficult to determine experimentally compared to α-helices. Recent studies have shown that most of the amino acids have significantly high or low propensity towards both ends of β-strands. However, a comprehensive analysis of the sequence dependent amino acid propensities at positions between the ends of the β-strand has not been investigated. Results: The propensities of the amino acids calculated from a large non-redundant database of proteins are found to be highly position-specific and vary continuously throughout the length of the β-strand. They follow an unexpected characteristic periodic pattern in inner positions with respect to the cap residues in both termini of β-strands; this periodic nature is markedly different from that of the α-helices with respect to the strength and pattern in periodicity. This periodicity is not only different for different amino acids but it also varies considerably for the amino acids belonging to the same physico-chemical group. Average hydrophobicity is also found to be periodic with respect to the positions from both termini of β-strands. Conclusions: The results contradict the earlier perception of isotropic nature of amino acid propensities in the middle region of β-strands. These position-specific propensities should be of immense help in understanding the factors responsible for β-strand design and efficient prediction of β-strand structure in unknown proteins.
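The abstract does not state how the propensities were computed. A common convention, assumed here purely for illustration, divides the fraction of an amino acid observed at a given strand position by its overall fraction in the database. The sketch below applies that ratio to positions counted from the N-terminal end of each strand; the function name and input layout are hypothetical.

```python
from collections import Counter


def position_propensities(strands, background):
    """Position-specific propensities from a list of β-strand sequences.

    strands:    list of one-letter amino acid strings, one per strand
    background: dict mapping amino acid -> overall fraction in the database
                (assumed to cover every residue that occurs in `strands`)
    Returns {position: {amino_acid: propensity}}, positions 0-based from
    the N-terminal end. A propensity above 1 means over-representation.
    """
    counts = {}
    for strand in strands:
        for pos, aa in enumerate(strand):
            counts.setdefault(pos, Counter())[aa] += 1

    result = {}
    for pos, counter in counts.items():
        n = sum(counter.values())
        result[pos] = {aa: (c / n) / background[aa] for aa, c in counter.items()}
    return result
```

Counting positions from the C-terminal end instead only requires reversing each strand before the call.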
|
7
|
Odell LR, Howan D, Gordon CP, Robertson MJ, Chau N, Mariana A, Whiting AE, Abagyan R, Daniel JA, Gorgani NN, Robinson PJ, McCluskey A. The pthaladyns: GTP competitive inhibitors of dynamin I and II GTPase derived from virtual screening. J Med Chem 2010; 53:5267-80. [PMID: 20575553 DOI: 10.1021/jm100442u] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 11/28/2022]
Abstract
We report the development of a homology model for the GTP binding domain of human dynamin I based on the corresponding crystal structure of Dictyostelium discoideum dynamin A. Virtual screening identified 2-[(2-biphenyl-2-yl-1,3-dioxo-2,3-dihydro-1H-isoindole-5-carbonyl)amino]-4-chlorobenzoic acid (1) as an approximately 170 µM inhibitor. Homology modeling- and focused library-led synthesis resulted in development of a series of active compounds (the "pthaladyns") with 4-chloro-2-(2-(4-(hydroxymethyl)phenyl)-1,3-dioxoisoindoline-5-carboxamido)benzoic acid (29), a 4.58 ± 0.06 µM dynamin I GTPase inhibitor. Pthaladyn-29 displays borderline selectivity for dynamin I relative to dynamin II (approximately 5-10 fold). Only pthaladyn-23 (dynamin I IC50 17.4 ± 5.8 µM) was an effective inhibitor of dynamin I mediated synaptic vesicle endocytosis in brain synaptosomes with an IC50 of 12.9 ± 5.9 µM. This compound was also competitive with respect to Mg2+·GTP. Thus the pthaladyns are the first GTP competitive inhibitors of dynamin I and II GTPase and may be effective new tools for the study of neuronal endocytosis.
Affiliation(s)
- Luke R Odell
- Chemistry, The University of Newcastle, University Drive, Callaghan, NSW 2308, Australia
|
8
|
Saunders R, Deane CM. Synonymous codon usage influences the local protein structure observed. Nucleic Acids Res 2010; 38:6719-28. [PMID: 20530529 PMCID: PMC2965230 DOI: 10.1093/nar/gkq495] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.0] [Indexed: 01/07/2023] Open
Abstract
Translation of mRNA into protein is a unidirectional information flow process. Analysing the input (mRNA) and output (protein) of translation, we find that local protein structure information is encoded in the mRNA nucleotide sequence. The Coding Sequence and Structure (CSandS) database developed in this work provides a detailed mapping between over 4000 solved protein structures and their mRNA. CSandS facilitates a comprehensive analysis of codon usage over many organisms. In assigning translation speed, we find that relative codon usage is less informative than tRNA concentration. For all speed measures, no evidence was found that domain boundaries are enriched with slow codons. In fact, genes seemingly avoid slow codons around structurally defined domain boundaries. Translation speed, however, does decrease at the transition into secondary structure. Codons are identified that have structural preferences significantly different from the amino acid they encode. However, each organism has its own set of ‘significant codons’. Our results support the premise that codons encode more information than merely amino acids and give insight into the role of translation in protein folding.
Affiliation(s)
- Rhodri Saunders
- Department of Statistics, Oxford University, 1 South Parks Road, Oxford OX1 3TG, UK.
|
9
|
Narang P, Bhushan K, Bose S, Jayaram B. A computational pathway for bracketing native-like structures for small alpha helical globular proteins. Phys Chem Chem Phys 2009; 7:2364-75. [PMID: 19785123 DOI: 10.1039/b502226f] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 11/21/2022]
Abstract
Impressive advances in the applications of bioinformatics for protein structure prediction coupled with growing structural databases on one hand and the insurmountable time-scale problem with ab initio computational methods on the other continue to raise doubts whether a computational solution to the protein folding problem--categorized as an NP-hard problem--is within reach in the near future. Combining some specially designed biophysical filters and vector algebra tools with ab initio methods, we present here a promising computational pathway for bracketing native-like structures of small alpha helical globular proteins departing from secondary structural information. The automated protocol is initiated by generating multiple structures around the loops between secondary structural elements. A set of knowledge-based biophysical filters namely persistence length and radius of gyration, developed and calibrated on approximately 1000 globular proteins, is introduced to screen the trial structures to filter out improbable candidates for the native and reduce the size of the library of probable structures. The ensemble so generated encompasses a few structures with native-like topology. Monte Carlo optimizations of the loop dihedrals are then carried out to remove steric clashes. The resultant structures are energy minimized and ranked according to a scoring function tested previously on a series of decoy sets vis-a-vis their corresponding natives. We find that the 100 lowest energy structures culled from the ensemble of energy optimized trial structures comprise at least a few to within 3-5 angstroms of the native. Thus the formidable "needle in a haystack" problem is narrowed down to finding an optimal solution amongst a computationally tractable number of alternatives. Encouraging results obtained on twelve small alpha helical globular proteins with the above outlined pathway are presented and discussed.
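One of the biophysical filters named in the abstract above is the radius of gyration. The sketch below shows how such a filter could screen trial structures, using the standard unweighted definition (root-mean-square distance of the atoms from their centroid). The acceptance bounds `rg_min` and `rg_max` are hypothetical parameters; the calibrated values from the paper are not given in the abstract.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) tuples."""
    if not coords:
        raise ValueError("coords must be non-empty")
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)


def filter_by_rg(candidates, rg_min, rg_max):
    """Keep only candidate structures whose Rg lies in [rg_min, rg_max].

    The bounds are placeholders; a real filter would calibrate them on
    known globular proteins of similar length.
    """
    return [c for c in candidates if rg_min <= radius_of_gyration(c) <= rg_max]
```

A filter of this kind rejects over-extended or over-collapsed trial structures cheaply, before any energy minimization is attempted.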
Affiliation(s)
- Pooja Narang
- Department of Chemistry, Indian Institute of Technology, Hauz Khas, New Delhi 110016, India
|
10
|
Liu T, Horst JA, Samudrala R. A novel method for predicting and using distance constraints of high accuracy for refining protein structure prediction. Proteins 2009; 77:220-34. [PMID: 19422061 DOI: 10.1002/prot.22434] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints-based refinement method that identifies a high number of accurate input constraints from initial models and rebuilds them using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics-based residue-specific all-atom probability discriminatory function (RAPDF) to discriminate native-like models by measuring the probability of accuracy for atom type distances within a given model. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native-like state, obtain consensus of top scoring constraints amongst five initial models, and compile sets with no redundant residue pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations improves the quality of initial predictions significantly in all cases evaluated by us. Our procedure represents a significant step in solving the protein structure prediction and refinement problem, by enabling the use of consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement.
Affiliation(s)
- Tianyun Liu
- Department of Genetics, Stanford University, Stanford, California, USA
|
11
|
Giardi MT, Scognamiglio V, Rea G, Rodio G, Antonacci A, Lambreva M, Pezzotti G, Johanningmeier U. Optical biosensors for environmental monitoring based on computational and biotechnological tools for engineering the photosynthetic D1 protein of Chlamydomonas reinhardtii. Biosens Bioelectron 2009; 25:294-300. [PMID: 19674888 DOI: 10.1016/j.bios.2009.07.003] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 04/16/2009] [Revised: 06/23/2009] [Accepted: 07/09/2009] [Indexed: 11/19/2022]
Abstract
Homology-based protein modelling and computational screening followed by virtual mutagenesis analyses were used to identify functional amino acids in the D1 protein of the photosynthetic electron transfer chain interacting with herbicides. A library of functional mutations in the unicellular green alga Chlamydomonas reinhardtii for preparing biomediators was built and their interactions with herbicides were calculated. D1 proteins giving the lowest and highest binding energy with herbicides were considered as suitable for preparing the environmental biosensors for detecting specific herbicide classes. Arising from the results of theoretical calculations, three mutants were prepared by site-directed mutagenesis and characterized by fluorescence analysis. Their adsorption and selective recognition ability were studied by an equilibrium-adsorption method. The S268C and S264K biomediators showed high sensitivity and resistance, respectively, to both triazine and urea classes of herbicides. When immobilized on a silicon septum, the biomediators were found to be highly stable, remaining so for at least 1-month at room temperature. The fluorescence properties were exploited and a reusable and portable multiarray optical biosensor for environmental monitoring was developed with limits of detection between 0.8 × 10⁻¹¹ and 3.0 × 10⁻⁹, depending on the target analyte. In addition, biomediator regeneration without obvious deterioration in performance was demonstrated.
Affiliation(s)
- Maria Teresa Giardi
- Institute of Crystallography, Area of Research of Rome, Department of Agrofood, CNR, Via Salaria km 29.300, 00015, Monterotondo Scalo, Rome, Italy
|
12
|
Abstract
Determining the primary sequences of informational macromolecules is no longer a limiting factor for our ability to completely understand the biological functioning of cells and organisms. Similarly, our understanding of transcriptional regulation (transcriptomics) has been greatly enhanced by the availability of microarrays. Our next hurdle is to learn the biochemical functions of all the gene products (proteomics) and the totality of all the interactions among them (interactomics). Using traditional biochemical methods, this will take a very long time. More efficient methods are needed to address these questions, or at least to suggest possible candidates for further testing. High-resolution imaging using molecule-specific tags will reveal details of cellular architecture that are expected to provide additional insights and clues about the interactions and functions of many gene products. Computer modeling of macromolecular structures and functional systems will be of key importance. We present here a brief historical and futuristic perspective of genomics and some of its other ‘omics offshoots in the post-genomic era.
|
13
|
Liu T, Guerquin M, Samudrala R. Improving the accuracy of template-based predictions by mixing and matching between initial models. BMC Structural Biology 2008; 8:24. [PMID: 18457597 PMCID: PMC2424052 DOI: 10.1186/1472-6807-8-24] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 06/20/2007] [Accepted: 05/05/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Comparative modeling is a technique to predict the three dimensional structure of a given protein sequence based primarily on its alignment to one or more proteins with experimentally determined structures. A major bottleneck of current comparative modeling methods is the lack of methods to accurately refine a starting initial model so that it approaches the resolution of the corresponding experimental structure. We investigate the effectiveness of a graph-theoretic clique finding approach to solve this problem. RESULTS: Our method takes into account the information presented in multiple templates/alignments at the three-dimensional level by mixing and matching regions between different initial comparative models. This method enables us to obtain an optimized conformation ensemble representing the best combination of secondary structures, resulting in the refined models of higher quality. In addition, the process of mixing and matching accumulates near-native conformations, resulting in discriminating the native-like conformation in a more effective manner. In the seventh Critical Assessment of Structure Prediction (CASP7) experiment, the refined models produced are more accurate than the starting initial models. CONCLUSION: This novel approach can be applied without any manual intervention to improve the quality of comparative predictions where multiple template/alignment combinations are available for modeling, producing conformational models of higher quality than the starting initial predictions.
Affiliation(s)
- Tianyun Liu
- Department of Microbiology, University of Washington, School of Medicine, Seattle, WA 98195, USA
- Michal Guerquin
- Department of Microbiology, University of Washington, School of Medicine, Seattle, WA 98195, USA
- Ram Samudrala
- Department of Microbiology, University of Washington, School of Medicine, Seattle, WA 98195, USA
|
14
|
Abstract
This review presents the advances in protein structure prediction from the computational methods perspective. The approaches are classified into four major categories: comparative modeling, fold recognition, first principles methods that employ database information, and first principles methods without database information. Important advances along with current limitations and challenges are presented.
Affiliation(s)
- C A Floudas
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA.
|
15
|
|
16
|
Jayaram B, Bhushan K, Shenoy SR, Narang P, Bose S, Agrawal P, Sahu D, Pandey V. Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins. Nucleic Acids Res 2006; 34:6195-204. [PMID: 17090600 PMCID: PMC1693886 DOI: 10.1093/nar/gkl789] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.6] [Indexed: 11/29/2022] Open
Abstract
We describe here an energy based computer software suite for narrowing down the search space of tertiary structures of small globular proteins. The protocol comprises eight different computational modules that form an automated pipeline. It combines physics based potentials with biophysical filters to arrive at 10 plausible candidate structures starting from sequence and secondary structure information. The methodology has been validated here on 50 small globular proteins consisting of 2–3 helices and strands with known tertiary structures. For each of these proteins, a structure within 3–6 Å RMSD (root mean square deviation) of the native has been obtained in the 10 lowest energy structures. The protocol has been web enabled and is accessible at .
Affiliation(s)
- B Jayaram
- Department of Chemistry and Supercomputing Facility for Bioinformatics and Computational Biology, Indian Institute of Technology Delhi Hauz Khas, New Delhi 110 016, India.
|
17
|
Narang P, Bhushan K, Bose S, Jayaram B. Protein Structure Evaluation using an All-Atom Energy Based Empirical Scoring Function. J Biomol Struct Dyn 2006; 23:385-406. [PMID: 16363875 DOI: 10.1080/07391102.2006.10531234] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 10/28/2022]
Abstract
Arriving at the native conformation of a polypeptide chain characterized by minimum most free energy is a problem of long standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions--energy based or statistical--have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where native structure is ranked second and seventh, native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp.
Affiliation(s)
- Pooja Narang
- Department of Chemistry, Indian Institute of Technology, Hauz Khas, New Delhi - 110016, India.
|
18
|
Floudas C, Fung H, McAllister S, Mönnigmann M, Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 2006. [DOI: 10.1016/j.ces.2005.04.009] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 01/01/2023]
|
19
|
Levefelt C, Lundh D. A fold-recognition approach to loop modeling. J Mol Model 2005; 12:125-39. [PMID: 16096805 DOI: 10.1007/s00894-005-0003-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/29/2004] [Accepted: 06/22/2005] [Indexed: 11/26/2022]
Abstract
A novel approach is proposed for modeling loop regions in proteins. In this approach, a prerequisite sequence-structure alignment is examined for regions where the target sequence is not covered by the structural template. These regions, extended with a number of residues from adjacent stem regions, are submitted to fold recognition. The alignments produced by fold recognition are integrated into the initial alignment to create an alignment between the target sequence and several structures, where gaps in the main structural template are covered by local structural templates. This one-to-many (1:N) alignment is used to create a protein model by existing protein-modeling techniques. Several alternative approaches were evaluated using a set of ten proteins. One approach was selected and evaluated using another set of 31 proteins. The most promising result was for gap regions not located at the C-terminus or N-terminus of a protein, where the method produced an average RMSD 12% lower than the loop modeling provided with the program MODELLER. This improvement is shown to be statistically significant.
Affiliation(s)
- Christer Levefelt
- School of Humanities and Informatics, University of Skövde, Box 408, 54128 Skövde, Sweden
|
20
|
Floudas CA. Research challenges, opportunities and synergism in systems engineering and computational biology. AIChE J 2005. [DOI: 10.1002/aic.10620] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/11/2022]
|
21
|
Tung CS, Goodman JL, Lu H, Macken CA. Homology model of the structure of influenza B virus HA1. J Gen Virol 2004; 85:3249-3259. [PMID: 15483238 DOI: 10.1099/vir.0.80021-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022] Open
Abstract
Influenza B virus is one of two types of influenza virus that cause substantial morbidity and mortality in humans, the other being influenza A virus. The inability to provide lasting protection to humans against influenza B virus infection is due, in part, to antigenic drift of the viral surface glycoprotein, haemagglutinin (HA). Studies of the antigenicity of the HA of influenza B virus have been hampered by lack of knowledge of its structure. To address this gap, two possible models have been inferred for this structure, based on two known structures of the homologous HA of the influenza A virus (subtypes H3 and H9). Statistical, structural and functional analyses of these models suggested that they matched important details of experimental observations and did not differ from each other in any substantive way. These models were used to investigate two HA sites at which viral variants appeared to carry a selective advantage. It was found that each of these sites coevolved with nearby sites to compensate for either size or charge changes.
Affiliation(s)
- Chang-Shung Tung
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Joshua L Goodman
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Henry Lu
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Catherine A Macken
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
|
22
|
Kim R, Choi CY. Minimally complex problem set for an Ab Initio protein structure prediction study. BIOTECHNOL BIOPROC E 2004. [DOI: 10.1007/bf02933067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
|
23
|
Tramontano A, Morea V. Exploiting evolutionary relationships for predicting protein structures. Biotechnol Bioeng 2004; 84:756-62. [PMID: 14708116 DOI: 10.1002/bit.10850] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
In the last few years there have been many developments in computational biology, particularly with regard to novel, imaginative exploitation of genomic data. Disappointingly, there has been a lack of progress in the methodology for prediction of protein structures. In the last several years, however, promising new methods have finally begun to emerge. These methods are increasing the power and scope of the methodology, but, most importantly, they are generating new areas of investigation that we believe will accelerate progress in the field. In this review we describe recent developments and highlight the implications of their success as well as areas where efforts should be focused.
Affiliation(s)
- Anna Tramontano
- Department of Biochemical Sciences A. Rossi Fanelli, University La Sapienza, P. le Aldo Moro 5, 00185 Rome, Italy.
|
24
|
Abstract
Experimental protein structures often provide extensive insight into the mode and specificity of small molecule binding, and this information is useful for understanding protein function and for the design of drugs. We have performed an analysis of the reliability with which ligand-binding information can be deduced from computer model structures, as opposed to experimentally derived ones. Models produced as part of the CASP experiments are used. The accuracy of contacts between protein model atoms and experimentally determined ligand atom positions is the main criterion. Only comparative models are included (i.e., models based on a sequence relationship between the protein of interest and a known structure). We find that, as expected, contact errors increase with decreasing sequence identity used as a basis for modeling. Analysis of the causes of errors shows that sequence alignment errors between model and experimental template have the most deleterious effect. In general, good, but not perfect, insight into ligand binding can be obtained from models based on a sequence relationship, providing there are no alignment errors in the model. The results support a structural genomics strategy based on experimental sampling of structure space so that all protein domains can be modeled on the basis of 30% or higher sequence identity.
Affiliation(s)
- Carol DeWeese-Scott
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, Maryland, USA
|
25
|
|
26
|
Reinhardt A, Eisenberg D. DPANN: Improved sequence to structure alignments following fold recognition. Proteins 2004; 56:528-38. [PMID: 15229885 DOI: 10.1002/prot.20144] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
Abstract
In fold recognition (FR) a protein sequence of unknown structure is assigned to the closest known three-dimensional (3D) fold. Although FR programs can often identify among all possible folds the one a sequence adopts, they frequently fail to align the sequence to the equivalent residue positions in that fold. Such failures frustrate the next step in structure prediction, protein model building. Hence it is desirable to improve the quality of the alignments between the sequence and the identified structure. We have used artificial neural networks (ANN) to derive a substitution matrix to create alignments between a protein sequence and a protein structure through dynamic programming (DPANN: Dynamic Programming meets Artificial Neural Networks). The matrix is based on the amino acid type and the secondary structure state of each residue. In a database of protein pairs that have the same fold but lack sequence similarity, DPANN aligns over 30% of all sequences to the paired structure, resembling closely the structural superposition of the pair. In over half of these cases the DPANN alignment is close to the structural superposition, although the initial alignment from the step of fold recognition is not close. Conversely, the alignment created during fold recognition outperforms DPANN in only 10% of all cases. Thus application of DPANN after fold recognition leads to substantial improvements in alignment accuracy, which in turn provides more useful templates for the modeling of protein structures. In the artificial case of using actual instead of predicted secondary structures for the probe protein, over 50% of the alignments are successful.
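The dynamic-programming step described in this abstract can be illustrated with a minimal sketch of global (Needleman-Wunsch) alignment between a sequence and a string of per-residue secondary-structure states. This is not the DPANN implementation: the scoring function `toy_score`, the preference table `TOY_PREF`, and the linear gap penalty are hypothetical stand-ins for the neural-network-derived substitution matrix, chosen only to make the example runnable.

```python
from typing import Callable, List, Tuple

# Hypothetical preferences (NOT the DPANN matrix): H = helix, E = strand, C = coil.
TOY_PREF = {"A": "H", "E": "H", "L": "H", "V": "E", "I": "E", "Y": "E", "G": "C", "P": "C"}


def toy_score(aa: str, ss: str) -> float:
    """Reward a residue placed on its (assumed) preferred secondary-structure state."""
    return 1.0 if TOY_PREF.get(aa) == ss else -1.0


def global_align(
    seq: str,
    struct_ss: str,
    score: Callable[[str, str], float],
    gap: float = -2.0,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Needleman-Wunsch alignment of a sequence to per-residue structural states.

    Returns the optimal score and the list of aligned (sequence, structure) indices.
    """
    n, m = len(seq), len(struct_ss)
    # dp[i][j] holds the best score for aligning seq[:i] with struct_ss[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(seq[i - 1], struct_ss[j - 1]),  # match
                dp[i - 1][j] + gap,  # residue of seq left unaligned
                dp[i][j - 1] + gap,  # structure position left unaligned
            )
    # Trace back from the corner to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + score(seq[i - 1], struct_ss[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs
```

Calling `global_align("AEL", "HHH", toy_score)` pairs each residue with the helical position at the same index. A learned matrix would replace `toy_score` without changing the recurrence.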
|
27
|
Capriotti E, Fariselli P, Rossi I, Casadio R. A Shannon entropy-based filter detects high-quality profile-profile alignments in searches for remote homologues. Proteins 2003; 54:351-60. [PMID: 14696197 DOI: 10.1002/prot.10564] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
Detection of homologous proteins with low-sequence identity to a given target (remote homologues) is routinely performed with alignment algorithms that take advantage of sequence profile. In this article, we investigate the efficacy of different alignment procedures for the task at hand on a set of 185 protein pairs with similar structures but low-sequence similarity. Criteria based on the SCOP label detection and MaxSub scores are adopted to score the results. We investigate the efficacy of alignments based on sequence-sequence, sequence-profile, and profile-profile information. We confirm that with profile-profile alignments the results are better than with other procedures. In addition, we report, and this is novel, that the selection of the results of the profile-profile alignments can be improved by using Shannon entropy, indicating that this parameter is important to recognize good profile-profile alignments among a plethora of meaningless pairs. By this, we enhance the global search accuracy without losing sensitivity and filter out most of the erroneous alignments. We also show that when the entropy filtering is adopted, the quality of the resulting alignments is comparable to that computed for the target and template structures with CE, a structural alignment program.
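For readers unfamiliar with the quantity involved, the following minimal sketch computes the Shannon entropy of a sequence-profile column and its mean over a profile. The abstract does not say how the entropy enters the filter, so this shows only the underlying measure. The function names and the representation of a profile as a list of residue-frequency dictionaries are assumptions made for illustration.

```python
import math
from typing import Dict, List


def column_entropy(freqs: Dict[str, float]) -> float:
    """Shannon entropy (in bits) of one profile column of residue frequencies."""
    total = sum(freqs.values())
    h = 0.0
    for f in freqs.values():
        if f > 0:
            p = f / total  # normalise so counts or frequencies both work
            h -= p * math.log2(p)
    return h


def mean_profile_entropy(profile: List[Dict[str, float]]) -> float:
    """Average column entropy over a profile (assumed non-empty)."""
    return sum(column_entropy(col) for col in profile) / len(profile)
```

A fully conserved column has zero entropy, and a column split evenly between two residues has an entropy of one bit.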
|
28
|
Wang Z, Moult J. Three-dimensional structural location and molecular functional effects of missense SNPs in the T cell receptor Vβ domain. Proteins 2003; 53:748-57. [PMID: 14579365 DOI: 10.1002/prot.10522] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
The mechanisms by which human single nucleotide polymorphisms (SNPs) influence susceptibility to disease are not yet well understood. In a previous study, we developed a structure-based model that may be used to identify which missense SNPs are neutral and which are deleterious to protein function and so potentially involved in disease (Wang and Moult, Hum Mutat 2001;263-270). The model has now been applied to a set of 54 missense cSNPs in the 46 functional T-cell receptor Vbeta-genes. Most of these missense cSNPs are found to be neutral, but 10 are identified as likely deleterious to protein function. Only one was previously associated with disease. We suggest that the others may be disease related but that redundancy in the T-cell response prevents any simple, monogenic effect. Therefore, these SNPs are the most likely contributors to complex, polygenic disease traits. It has been noted that there is a surprisingly high (74%) fraction of nonsynonymous SNPs in these genes. Contrary to expectation, the analysis shows that these are not associated with an unusually high fraction of deleterious SNPs, nor do they significantly contribute to a larger range of antigen recognition or a reduced superantigen-binding repertoire.
MESH Headings
- Binding Sites
- Genes, Immunoglobulin
- Genetic Predisposition to Disease
- Immunoglobulin Variable Region/chemistry
- Immunoglobulin Variable Region/genetics
- Models, Molecular
- Mutation, Missense
- Polymorphism, Single Nucleotide
- Protein Binding
- Protein Structure, Tertiary
- Receptors, Antigen, T-Cell, alpha-beta/chemistry
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/metabolism
Affiliation(s)
- Zhen Wang
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, Maryland 20850, USA
|
29
|
Méndez R, Leplae R, De Maria L, Wodak SJ. Assessment of blind predictions of protein-protein interactions: current status of docking methods. Proteins 2003; 52:51-67. [PMID: 12784368 DOI: 10.1002/prot.10393] [Citation(s) in RCA: 333] [Impact Index Per Article: 15.9] [Indexed: 11/08/2022]
Abstract
The current status of docking procedures for predicting protein-protein interactions starting from their three-dimensional structure is assessed from a first major evaluation of blind predictions. This evaluation was performed as part of a communitywide experiment on Critical Assessment of PRedicted Interactions (CAPRI). Seven newly determined structures of protein-protein complexes were available as targets for this experiment. These were the complexes between a kinase and its protein substrate, between a T-cell receptor beta-chain and a superantigen, and five antigen-antibody complexes. For each target, the predictors were given the experimental structures of the free components, or of one free and one bound component in a random orientation. The structure of the complex was revealed only at the time of the evaluation. A total of 465 predictions submitted by 19 groups were evaluated. These groups used a wide range of algorithms and scoring functions, some of which were completely novel. The quality of the predicted interactions was evaluated by comparing residue-residue contacts and interface residues to those in the X-ray structures and by analyzing the fit of the ligand molecules (the smaller of the two proteins in the complex) or of interface residues only, in the predicted versus target complexes. A total of 14 groups produced predictions, ranking from acceptable to highly accurate for five of the seven targets. The use of available biochemical and biological information, and in one instance structural information, played a key role in achieving this result. It was essential for identifying the native binding modes for the five correctly predicted targets, including the kinase-substrate complex where the enzyme changes conformation on association. But it was also the cause for missing the correct solution for the two remaining unpredicted targets, which involve unexpected antigen-antibody binding modes. 
Overall, this analysis reveals genuine progress in docking procedures but also illustrates the remaining serious limitations and points out the need for better scoring functions and more effective ways for handling conformational flexibility.
Affiliation(s)
- Raúl Méndez
- Service de Conformation de Macromolecules Biologiques, et Bioinformatique, Centre de Biologie Structurale et Bioinformatique, CP 263, BC6, Université Libre de Bruxelles, Bruxelles, Belgium
|
30
|
Abstract
Technological advances in miniaturization have found a niche in biology and signal the beginning of a new revolution. Most of the attention and advances have centered on DNA chips, yet considerable progress is being made in the use of other biomolecules and cells. Existing reviews have each covered only particular aspects and technologies, while converging on the shared terminology of "biochips." This review provides a basic introduction and an in-depth survey of the different technologies and applications involving the use of non-DNA molecules such as proteins and cells. The review focuses on microarrays and microfluidics, but also describes some cellular systems (studies involving patterning and sensor chips) and nanotechnology. The principles of each technology including parameters involved in biochip design and operation are outlined. A discussion of the different biological and biomedical applications illustrates the significance of biochips in biotechnology.
Affiliation(s)
- Jocelyn H Ng
- IMI Consulting GmbH, Auf dem Amtshof 3, 30938 Burgwedel, Germany.
|
31
|
Abstract
An explosion of in vitro experimental data on the folding of proteins has revealed many examples of folding in the millisecond or faster timescale, often occurring in the absence of stable intermediate states. We review experimental methods for measuring fast protein folding kinetics, and then discuss various analytical models used to interpret these data. Finally, we classify general mechanisms that have been proposed to explain fast protein folding into two categories, heterogeneous and homogeneous, reflecting the nature of the transition state. One heterogeneous mechanism, the diffusion-collision mechanism, can be used to interpret experimental data for a number of proteins.
Affiliation(s)
- Jeffrey K Myers
- Department of Biochemistry, Duke University Medical Center, Box 3711, Durham, North Carolina 27710, USA.
|
32
|
Fukunishi H, Watanabe O, Takada S. On the Hamiltonian replica exchange method for efficient sampling of biomolecular systems: Application to protein structure prediction. J Chem Phys 2002. [DOI: 10.1063/1.1472510] [Citation(s) in RCA: 602] [Impact Index Per Article: 27.4] [Indexed: 11/14/2022]
|
33
|
Wohlfahrt G, Hangoc V, Schomburg D. Positioning of anchor groups in protein loop prediction: the importance of solvent accessibility and secondary structure elements. Proteins 2002; 47:370-8. [PMID: 11948790 DOI: 10.1002/prot.10098] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]
Abstract
The prediction of loop regions in the process of protein structure prediction by homology is still an unsolved problem. In an earlier publication, we could show that the correct placement of the amino acids serving as an anchor group to be connected by a loop fragment with a predicted geometry is a highly important step and an essential requirement within the process (Lessel and Schomburg, Proteins 1999; 37:56-64). In this article, we present an analysis of the quality of possible loop predictions with respect to gap length, fragment length, amino acid type, secondary structure, and solvent accessibility. For 550 insertions and 544 deletions, we test all possible positions for anchor groups with an inserted loop of a length between 3 and 12 amino acids. We could show that approximately 80% of the indel regions could be predicted within 1.5 Å RMSD from a knowledge-based loop data base if criteria for the correct localization of anchor groups could be found and the loops can be sorted correctly. From our analysis, several conclusions regarding the optimal placement of anchor groups become obvious: (1) The correct placement of anchor groups is even more important for longer gap lengths, (2) medium length fragments (length 5-8) perform better than short or long ones, (3) the placement of anchor groups at hydrophobic amino acids gives a higher chance to include the best possible loop, (4) anchor groups within secondary structure elements, in particular beta-sheets, are suitable, (5) amino acids with lower solvent accessibility are better anchor groups. A preliminary test using a combination of the anchor group positioning criteria deduced from our analysis shows very promising results.
Affiliation(s)
- Gerd Wohlfahrt
- University of Cologne, Institute of Biochemistry, Köln, Germany.
|
34
|
Fielden MR, Matthews JB, Fertuck KC, Halgren RG, Zacharewski TR. In silico approaches to mechanistic and predictive toxicology: an introduction to bioinformatics for toxicologists. Crit Rev Toxicol 2002; 32:67-112. [PMID: 11951993 DOI: 10.1080/20024091064183] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.1] [Indexed: 10/25/2022]
Abstract
Bioinformatics, or in silico biology, is a rapidly growing field that encompasses the theory and application of computational approaches to model, predict, and explain biological function at the molecular level. This information rich field requires new skills and new understanding of genome-scale studies in order to take advantage of the rapidly increasing amount of sequence, expression, and structure information in public and private databases. Toxicologists are poised to take advantage of the large public databases in an effort to decipher the molecular basis of toxicity. With the advent of high-throughput sequencing and computational methodologies, expressed sequences can be rapidly detected and quantitated in target tissues by database searching. Novel genes can also be isolated in silico, while their function can be predicted and characterized by virtue of sequence homology to other known proteins. Genomic DNA sequence data can be exploited to predict target genes and their modes of regulation, as well as identify susceptible genotypes based on single nucleotide polymorphism data. In addition, highly parallel gene expression profiling technologies will allow toxicologists to mine large databases of gene expression data to discover molecular biomarkers and other diagnostic and prognostic genes or expression profiles. This review serves to introduce to toxicologists the concepts of in silico biology most relevant to mechanistic and predictive toxicology, while highlighting the applicability of in silico methods using select examples.
Affiliation(s)
- Mark R Fielden
- Department of Biochemistry and Molecular Biology, National Food Safety and Toxicology Center, Michigan State University, East Lansing 48824, USA
|
35
|
Pristovsek P, Rüterjans H, Jerala R. Semiautomatic sequence-specific assignment of proteins based on the tertiary structure--the program st2nmr. J Comput Chem 2002; 23:335-40. [PMID: 11908496 DOI: 10.1002/jcc.10011] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 11/07/2022]
Abstract
The sequence-specific assignment of resonances is still the most time-consuming procedure that is necessary as the first step in high-resolution NMR studies of proteins. In many cases a reliable three-dimensional (3D) structure of the protein is available, for example, from X-ray spectroscopy or homology modeling. Here we introduce the st2nmr program that uses the 3D structure and Nuclear Overhauser Effect spectroscopy (NOESY) peak list(s) to evaluate and optimize trial sequence-specific assignments of spin systems derived from correlation spectra to residues of the protein. A distance-dependent target function that scores trial assignments based on the presence of expected NOESY crosspeaks is optimized in a Monte Carlo fashion. The performance of the program st2nmr is tested on real NMR data of an alpha-helical (cytochrome c) and beta-sheet (lipocalin) protein using homology models and/or X-ray structures; it succeeded in completely reproducing the correct sequence-specific assignments in most cases using 2D and/or 15N/13C Nuclear Overhauser Effect (NOE) data. In addition to amino acid residues, the program can handle ligands that are bound to the protein, such as heme, and can be used as a complementary tool to fully automated assignment procedures.
|
36
|
Affiliation(s)
- Arul Jayaraman
- Center for Engineering in Medicine and Surgical Services, Massachusetts General Hospital, Harvard Medical School, and Shriners Burns Hospital, Boston, Massachusetts 02114
- Martin L. Yarmush
- Center for Engineering in Medicine and Surgical Services, Massachusetts General Hospital, Harvard Medical School, and Shriners Burns Hospital, Boston, Massachusetts 02114
- Charles M. Roth
- Center for Engineering in Medicine and Surgical Services, Massachusetts General Hospital, Harvard Medical School, and Shriners Burns Hospital, Boston, Massachusetts 02114
|
37
|
Abstract
Genomics has changed our view of the biological world in the past decade, providing both new information and new tools to characterise biological systems. Over 100 microbial genomes - including many of substantial clinical importance - have been fully or partially sequenced, pushing the search for novel antimicrobial compounds into the post-genomic era. Genomic information and associated new technologies have the potential to revolutionise the drug discovery process. Genomic methods have created a wealth of potential new antimicrobial targets; strategies are evolving to provide validation for these targets before chemical inhibitors are identified. The ability to obtain large amounts of purified target proteins and advances in X-ray crystallography have caused significant increases in available protein structures, which may foreshadow an increased effort in structure-based drug design. The post-genomics strategies used in antimicrobial drug discovery may have application for small molecule drug discovery in numerous therapeutic areas.
Affiliation(s)
- Molly B Schmid
- Genencor International, 925 Page Mill Road, Palo Alto CA 94304, USA.
|
38
|
Di Gennaro JA, Siew N, Hoffman BT, Zhang L, Skolnick J, Neilson LI, Fetrow JS. Enhanced functional annotation of protein sequences via the use of structural descriptors. J Struct Biol 2001; 134:232-45. [PMID: 11551182 DOI: 10.1006/jsbi.2001.4391] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
Abstract
In order to circumvent limitations of sequence based methods in the process of making functional predictions for proteins, we have developed a methodology that uses a sequence-to-structure-to-function paradigm. First, an approximate three-dimensional structure is predicted. Then, a three-dimensional descriptor of the functional site, termed a Fuzzy Functional Form, or FFF, is used to screen the structure for the presence of the functional site of interest (Fetrow et al., 1998; Fetrow and Skolnick, 1998). Previously, a disulfide oxidoreductase FFF was developed and applied to predicted structures obtained from a small structural database. Here, using a substantially larger structural database, we expand the analysis of the disulfide oxidoreductase FFF to the B. subtilis genome. To ascertain the performance of the FFF, its results are compared to those obtained using both the sequence alignment method BLAST and three local sequence motif databases: PRINTS, Prosite, and Blocks. The FFF method is then compared in detail to Blocks and it is shown that the FFF is more flexible and sensitive in finding a specific function in a set of unknown proteins. In addition, the estimated false positive rate of function prediction is significantly lower using the FFF structural motif, rather than the standard sequence motif methods. We also present a second FFF and describe a specific example of the results of its whole-genome application to D. melanogaster using a newer threading algorithm. Our results from all of these studies indicate that the addition of three-dimensional structural information adds significant value in the prediction of biochemical function of genomic sequences.
Affiliation(s)
- J A Di Gennaro
- GeneFormatics, Incorporated, 5830 Oberlin Drive, Suite 200, San Diego, California 92121, USA.
|
39
|
van Hooft PA, Höltje HD. Construction of a full three-dimensional model of the transpeptidase domain of Streptococcus pneumoniae PBP2x starting from its Calpha-atom coordinates. J Comput Aided Mol Des 2000; 14:719-30. [PMID: 11131966 DOI: 10.1023/a:1008164914993] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
A new method is described for generating all-atom protein structures from Calpha-atom information. The method, which combines both local structural trace alignments and comparative side chain modeling with ab initio side chain modeling, makes use of both the virtual-bond and the dipole-path methods. Provided that 3D structures of structurally and functionally related proteins exist, the method presented here is highly suitable for generating all-atom coordinates of partly solved, low-resolution crystal structures. Particularly the active site region can be modeled accurately with this procedure, which enables investigation of the binding modes of different classes of ligands with molecular dynamics simulations. The method is applied to the trace of Streptococcus pneumoniae, in order to construct an all-atom structure of the transpeptidase domain. Since after generation of full coordinates of the transpeptidase domain the structure had been solved to 2.4 Å resolution, new X-ray coordinates for the worst modeled loop (residues T370 to M386; 17 out of a total number of 351 residues constituting the transpeptidase domain) were incorporated, as kindly provided by Dr. Dideberg. The structure was relaxed with molecular dynamics simulations and simulated annealing methods. The RMS deviation between the 144 aligned Calpha-atoms and the corresponding ones in the originally solved 3.5 Å resolution crystal structure was 0.98 Å. The 351 Calpha-atoms of the whole transpeptidase domain of the final model showed an RMS deviation of 1.58 Å. The Ramachandran plot showed that 79.3% of the residues are in the most favored regions, with only 1.0% occurring in disallowed regions. The model presented here can be used to investigate the three-dimensional influences of mutations around the active site of PBP2x.
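The RMS deviation reported in this abstract is the square root of the mean squared distance between paired atoms. A minimal sketch follows, assuming the two Calpha coordinate sets are already paired one-to-one and superimposed. The abstract does not describe the superposition procedure, and the function name `rmsd` is chosen here for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally long, pre-superimposed coordinate sets."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))
```

With coordinates in Å the result is also in Å. Comparing a model to a crystal structure in practice requires an optimal superposition first, which this sketch deliberately leaves out.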
Affiliation(s)
- P A van Hooft
- Institut für Pharmazeutische Chemie, Heinrich-Heine Universität Düsseldorf, Germany
|
40
|
Fischer KF, Marqusee S. A rapid test for identification of autonomous folding units in proteins. J Mol Biol 2000; 302:701-12. [PMID: 10986128 DOI: 10.1006/jmbi.2000.4049] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
The structure of a protein is dictated by a large number of weak interactions that cooperatively stabilize the native state. Usually, excised fragments smaller than a domain have little if any residual structure. When autonomous units of structure are found within domains, this challenges common assumptions about the cooperativity of protein structure. Such autonomous folding units (AFUs) are of wide interest and have applications in protein engineering and as simple model systems for studying the determinants of stability and specificity. A new method of identifying AFUs within proteins is presented here. The rapid autonomous fragment test (RAFT) identifies AFUs based on analysis of inter-residue contacts present in the three-dimensional structure of a protein. RAFT is fast enough to mine the entire PDB for AFUs and provide a library of potential small stable folds. We show that RAFT is able to predict whether a protein fragment will be structured if isolated from its parent domain.
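The abstract says only that RAFT analyses inter-residue contacts, so the following sketch is not the RAFT scoring scheme. It illustrates the general idea of a contact-based analysis: build a contact list from Calpha coordinates, then ask what fraction of the contacts made by a fragment stay inside that fragment. The distance cutoff, the minimum sequence separation, and both function names are hypothetical choices.

```python
import math
from typing import Sequence, Set, Tuple

Coord = Tuple[float, float, float]


def contact_pairs(
    ca: Sequence[Coord], cutoff: float = 8.0, min_sep: int = 3
) -> Set[Tuple[int, int]]:
    """Residue pairs (i, j), j - i >= min_sep, whose Calpha atoms lie within cutoff."""
    pairs = set()
    for i in range(len(ca)):
        for j in range(i + min_sep, len(ca)):
            if math.dist(ca[i], ca[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def internal_contact_fraction(
    pairs: Set[Tuple[int, int]], start: int, end: int
) -> float:
    """Fraction of contacts touching residues [start, end) that stay inside the fragment."""
    internal = touching = 0
    for i, j in pairs:
        in_i = start <= i < end
        in_j = start <= j < end
        if in_i or in_j:
            touching += 1
            if in_i and in_j:
                internal += 1
    return internal / touching if touching else 0.0
```

A fragment whose contacts are mostly internal depends less on the rest of the domain, which is the intuition behind looking for autonomous units. Any real criterion would need calibration against experimental data.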
Affiliation(s)
- K F Fischer
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3206, USA
|
41
|
Abstract
Functional analysis of the proteins discovered in fully sequenced genomes represents the next major challenge of life science research. Computational methods play an increasingly important role in this activity. Among them, comparative protein modelling will play a major role in this challenge, especially in the light of the Structural Genomics programmes about to be started around the world. In recent years, much progress has been made in automating these methods, enabling the production of models for genome scale problems. In this review we discuss how protein models can be applied to functional analysis, as well as some of the current issues and limitations inherent to these methods.
Affiliation(s)
- M C Peitsch
- Glaxo Wellcome Experimental Research, Plan-les-Ouates/GE, Switzerland.
|
42
|
Abstract
Several recent publications illustrated advantages of using sequence profiles in recognizing distant homologies between proteins. At the same time, the practical usefulness of distant homology recognition depends not only on the sensitivity of the algorithm, but also on the quality of the alignment between a prediction target and the template from the database of known proteins. Here, we study this question for several supersensitive protein algorithms that were previously compared in their recognition sensitivity (Rychlewski et al., 2000). A database of protein pairs with similar structures, but low sequence similarity is used to rate the alignments obtained with several different methods, which included sequence-sequence, sequence-profile, and profile-profile alignment methods. We show that incorporation of evolutionary information encoded in sequence profiles into alignment calculation methods significantly increases the alignment accuracy, bringing them closer to the alignments obtained from structure comparison. In general, alignment quality is correlated with recognition and alignment score significance. For every alignment method, alignments with statistically significant scores correlate with both correct structural templates and good quality alignments. At the same time, average alignment lengths differ in various methods, making the comparison between them difficult. For instance, the alignments obtained by FFAS, the profile-profile alignment algorithm developed in our group, are always longer than the alignments obtained with the PSI-BLAST algorithms. To address this problem, we develop methods to truncate or extend alignments to cover a specified percentage of protein lengths. In most cases, the elongation of the alignment by profile-profile methods is reasonable, adding fragments of similar structure. The examples of erroneous alignment are examined and it is shown that they can be identified based on the model quality.
Affiliation(s)
- L Jaroszewski
- The Burnham Institute, La Jolla, California 92037, USA
|
43
|
Abstract
A number of recent advances have been made in deriving function information from protein structure. A fold relationship to an already characterized protein will often allow general information about function to be deduced. More detailed information can be obtained using sequence relationships to already studied proteins. Methods of deducing function directly from structure, without the use of evolutionary relationships, are developing rapidly. All such methods may be used with models of protein structure, rather than with experimentally determined ones, but model accuracy imposes limitations. The rapid expansion of the structural genomics field has created a new urgency for improved methods of structure-based annotation of function.
Affiliation(s)
- J Moult
- Center for Advanced Research in Biotechnology, University of Maryland, Biotechnology Institute, Rockville, MD 20850, USA.
|