1
Vasquez JK, West KHJ, Yang T, Polaske TJ, Cornilescu G, Tonelli M, Blackwell HE. Conformational Switch to a β-Turn in a Staphylococcal Quorum Sensing Signal Peptide Causes a Dramatic Increase in Potency. J Am Chem Soc 2020; 142:750-761. [PMID: 31859506 DOI: 10.1021/jacs.9b05513] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/28/2022]
Abstract
We report the solution-phase structures of native signal peptides and related analogs capable of either strongly agonizing or antagonizing the AgrC quorum sensing (QS) receptor in the emerging pathogen Staphylococcus epidermidis. Chronic S. epidermidis infections are often recalcitrant to traditional therapies due to antibiotic resistance and formation of robust biofilms. The accessory gene regulator (agr) QS system plays an important role in biofilm formation in this opportunistic pathogen, and the binding of an autoinducing peptide (AIP) signal to its cognate transmembrane receptor (AgrC) is responsible for controlling agr. Small molecules or peptides capable of modulating this binding event are of significant interest as probes to investigate both the agr system and QS as a potential antivirulence target. We used NMR spectroscopy to characterize the structures of the three native S. epidermidis AIP signals and five non-native analogs with distinct activity profiles in the AgrC-I receptor from S. epidermidis. These studies revealed a suite of structural motifs critical for ligand activity. Interestingly, a unique β-turn was present in the macrocycles of the two most potent AgrC-I modulators, in both an agonist and an antagonist, which was distinct from the macrocycle conformation in the less-potent AgrC-I modulators and in the native AIP-I itself. This previously unknown β-turn provides a structural rationale for these ligands' respective biological activity profiles. Development of analogs to reinforce the β-turn resulted in our first antagonist with subnanomolar potency in AgrC-I, while analogs designed to contain a disrupted β-turn were dramatically less potent relative to their parent compounds. Collectively, these studies provide new insights into the AIP:AgrC interactions crucial for QS activation in S. epidermidis and advance the understanding of QS at the molecular level.
Affiliation(s)
- Joseph K Vasquez
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Korbin H J West
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Tian Yang
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Thomas J Polaske
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Gabriel Cornilescu
- National Magnetic Resonance Facility at Madison, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Marco Tonelli
- National Magnetic Resonance Facility at Madison, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Helen E Blackwell
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
2
Khan FI, Wei DQ, Gu KR, Hassan MI, Tabrez S. Current updates on computer aided protein modeling and designing. Int J Biol Macromol 2016; 85:48-62. [DOI: 10.1016/j.ijbiomac.2015.12.072] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 08/21/2015] [Revised: 12/17/2015] [Accepted: 12/21/2015] [Indexed: 12/15/2022]
3
Dhingra P, Jayaram B. A homology/ab initio hybrid algorithm for sampling near-native protein conformations. J Comput Chem 2013; 34:1925-36. [PMID: 23728619 DOI: 10.1002/jcc.23339] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/05/2012] [Revised: 03/09/2013] [Accepted: 04/21/2013] [Indexed: 12/19/2022]
Abstract
One of the major challenges for protein tertiary structure prediction strategies is the quality of conformational sampling algorithms, which can effectively and readily search the protein fold space to generate near-native conformations. In an effort to advance the field by making the best use of available homology as well as fold recognition approaches along with ab initio folding methods, we have developed Bhageerath-H Strgen, a homology/ab initio hybrid algorithm for protein conformational sampling. The methodology is tested on the benchmark CASP9 dataset of 116 targets. In 93% of the cases, a structure with TM-score ≥ 0.5 is generated in the pool of decoys. Further, the performance of Bhageerath-H Strgen was seen to be efficient in comparison with different decoy generation methods. The algorithm is web enabled as Bhageerath-H Strgen web tool which is made freely accessible for protein decoy generation (http://www.scfbio-iitd.res.in/software/Bhageerath-HStrgen1.jsp).
Affiliation(s)
- Priyanka Dhingra
- Department of Chemistry, Indian Institute of Technology, Hauz Khas, New Delhi, 110016, India
4
Lobito AA, Ramani SR, Tom I, Bazan JF, Luis E, Fairbrother WJ, Ouyang W, Gonzalez LC. Murine insulin growth factor-like (IGFL) and human IGFL1 proteins are induced in inflammatory skin conditions and bind to a novel tumor necrosis factor receptor family member, IGFLR1. J Biol Chem 2011; 286:18969-81. [PMID: 21454693 PMCID: PMC3099712 DOI: 10.1074/jbc.m111.224626] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 12/26/2022]
Abstract
Psoriasis is a human skin condition characterized by epidermal hyperproliferation and infiltration of multiple leukocyte populations. In characterizing a novel insulin growth factor (IGF)-like (IGFL) gene in mice (mIGFL), we found transcripts of this gene to be most highly expressed in skin with enhanced expression in models of skin wounding and psoriatic-like inflammation. A possible functional ortholog in humans, IGFL1, was uniquely and significantly induced in psoriatic skin samples. In vitro IGFL1 expression was up-regulated in cultured primary keratinocytes stimulated with tumor necrosis factor α but not by other psoriasis-associated cytokines. Finally, using a secreted and transmembrane protein library, we discovered high affinity interactions between human IGFL1 and mIGFL and the TMEM149 ectodomain. TMEM149 (renamed here as IGFLR1) is an uncharacterized gene with structural similarity to the tumor necrosis factor receptor family. Our studies demonstrate that IGFLR1 is expressed primarily on the surface of mouse T cells. The connection between mIGFL and IGFLR1 receptor suggests mIGFL may influence T cell biology within inflammatory skin conditions.
Affiliation(s)
- Adrian A Lobito
- Department of Protein Chemistry, Genentech, Inc, South San Francisco, California 94080-4918, USA
5
A historical perspective of template-based protein structure prediction. Methods in Molecular Biology (Clifton, N.J.) 2008; 413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
This chapter presents a broad and a historical overview of the problem of protein structure prediction. Different structure prediction methods, including homology modeling, fold recognition (FR)/protein threading, ab initio/de novo approaches, and hybrid techniques involving multiple types of approaches, are introduced in a historical context. The progress of the field as a whole, especially in the threading/FR area, as reflected by the CASP/CAFASP contests, is reviewed. At the end of the chapter, we discuss the challenging issues ahead in the field of protein structure prediction.
6
Skolnick J, Kolinski A. Monte Carlo Approaches to the Protein Folding Problem. Advances in Chemical Physics 2007. [DOI: 10.1002/9780470141649.ch7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/27/2023]
7
Gromiha MM, Selvaraj S. Inter-residue interactions in protein folding and stability. Progress in Biophysics and Molecular Biology 2004; 86:235-77. [PMID: 15288760 DOI: 10.1016/j.pbiomolbio.2003.09.003] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.3] [Indexed: 12/01/2022]
Abstract
During the process of protein folding, the amino acid residues along the polypeptide chain interact with each other in a cooperative manner to form the stable native structure. The knowledge about inter-residue interactions in protein structures is very helpful to understand the mechanism of protein folding and stability. In this review, we introduce the classification of inter-residue interactions into short, medium and long range based on a simple geometric approach. The features of these interactions in different structural classes of globular and membrane proteins, and in various folds have been delineated. The development of contact potentials and the application of inter-residue contacts for predicting the structural class and secondary structures of globular proteins, solvent accessibility, fold recognition and ab initio tertiary structure prediction have been evaluated. Further, the relationship between inter-residue contacts and protein-folding rates has been highlighted. Moreover, the importance of inter-residue interactions in protein-folding kinetics and for understanding the stability of proteins has been discussed. In essence, the information gained from the studies on inter-residue interactions provides valuable insights for understanding protein folding and de novo protein design.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
8
Karchin R, Cline M, Karplus K. Evaluation of local structure alphabets based on residue burial. Proteins 2004; 55:508-18. [PMID: 15103615 DOI: 10.1002/prot.20008] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.6] [Indexed: 11/07/2022]
Abstract
Residue burial, which describes a protein residue's exposure to solvent and neighboring atoms, is key to protein structure prediction, modeling, and analysis. We assessed 21 alphabets representing residue burial, according to their predictability from amino acid sequence, conservation in structural alignments, and utility in one fold-recognition scenario. This follows upon our previous work in assessing nine representations of backbone geometry. The alphabet found to be most effective overall has seven states and is based on a count of Cβ atoms within a 14 Å-radius sphere centered at the Cβ of a residue of interest. When incorporated into a hidden Markov model (HMM), this alphabet gave us a 38% performance boost in fold recognition and 23% in alignment quality.
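The burial measure described in this abstract can be illustrated with a minimal sketch. It assumes Cβ coordinates are already available as (x, y, z) tuples in ångströms and that the central residue is not counted toward its own total; the abstract does not say either way. The function names and the state boundaries passed to `to_states` are hypothetical, since the abstract gives only the number of states (seven), not the cut points.

```python
import math
from bisect import bisect_right


def burial_counts(cb_coords, radius=14.0):
    """For each residue, count the other C-beta atoms within `radius` angstroms.

    Assumption: the residue's own C-beta is excluded from its count.
    """
    counts = []
    for i, center in enumerate(cb_coords):
        n = sum(
            1
            for j, other in enumerate(cb_coords)
            if j != i and math.dist(center, other) <= radius
        )
        counts.append(n)
    return counts


def to_states(counts, boundaries):
    """Map raw counts to discrete states using sorted boundaries.

    The boundaries are hypothetical; the published cut points are not
    given in the abstract. k boundaries yield k + 1 states.
    """
    return [bisect_right(boundaries, c) for c in counts]


# Toy example: three C-beta atoms on a line, 10 angstroms apart.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
counts = burial_counts(coords)
states = to_states(counts, boundaries=[2])  # illustrative two-state split
```

A seven-state alphabet would pass six boundaries to `to_states`; the all-pairs loop is quadratic in chain length, which is acceptable for a sketch but not for large-scale use.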
Affiliation(s)
- Rachel Karchin
- Department of Biopharmaceutical Sciences, University of California, San Francisco 94143-2240, USA.
9
Capriotti E, Fariselli P, Rossi I, Casadio R. A Shannon entropy-based filter detects high-quality profile-profile alignments in searches for remote homologues. Proteins 2003; 54:351-60. [PMID: 14696197 DOI: 10.1002/prot.10564] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
Detection of homologous proteins with low-sequence identity to a given target (remote homologues) is routinely performed with alignment algorithms that take advantage of sequence profile. In this article, we investigate the efficacy of different alignment procedures for the task at hand on a set of 185 protein pairs with similar structures but low-sequence similarity. Criteria based on the SCOP label detection and MaxSub scores are adopted to score the results. We investigate the efficacy of alignments based on sequence-sequence, sequence-profile, and profile-profile information. We confirm that with profile-profile alignments the results are better than with other procedures. In addition, we report, and this is novel, that the selection of the results of the profile-profile alignments can be improved by using Shannon entropy, indicating that this parameter is important to recognize good profile-profile alignments among a plethora of meaningless pairs. By this, we enhance the global search accuracy without losing sensitivity and filter out most of the erroneous alignments. We also show that when the entropy filtering is adopted, the quality of the resulting alignments is comparable to that computed for the target and template structures with CE, a structural alignment program.
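As an illustration of the quantity involved, Shannon entropy for a sequence profile can be sketched as follows. This is not the authors' filter: the abstract does not state how entropy is computed or thresholded, so the per-column definition, the averaging over columns, and the function names are all assumptions made for this example.

```python
import math


def column_entropy(probs):
    """Shannon entropy (in bits) of one profile column.

    `probs` is a list of residue probabilities summing to 1;
    zero-probability terms contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mean_profile_entropy(profile):
    """Average column entropy over a profile (a list of columns).

    Assumption: averaging over columns is one plausible summary;
    the published filter may aggregate differently.
    """
    return sum(column_entropy(col) for col in profile) / len(profile)


# Toy profile over a 4-letter alphabet: one uninformative column
# (uniform) and one fully conserved column.
toy_profile = [
    [0.25, 0.25, 0.25, 0.25],
    [1.0, 0.0, 0.0, 0.0],
]
avg_bits = mean_profile_entropy(toy_profile)
```

Low entropy indicates conserved, informative positions; a filter in this spirit would compare such a score against a cutoff chosen on benchmark data.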
10
Marti‐Renom MA, Madhusudhan M, Eswar N, Pieper U, Shen M, Sali A, Fiser A, Mirkovic N, John B, Stuart A. Modeling Protein Structure from its Sequence. Curr Protoc Bioinformatics 2003. [DOI: 10.1002/0471250953.bi0501s03] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Affiliation(s)
- Marc A. Marti‐Renom
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- M.S. Madhusudhan
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Narayanan Eswar
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Ursula Pieper
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Min‐yi Shen
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Andrej Sali
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Andras Fiser
- Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York
- Nebojsa Mirkovic
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
- Bino John
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
- Ashley Stuart
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
11
Abstract
A new potential energy function representing the conformational preferences of sequentially local regions of a protein backbone is presented. This potential is derived from secondary structure probabilities such as those produced by neural network-based prediction methods. The potential is applied to the problem of remote homolog identification, in combination with a distance-dependent inter-residue potential and position-based scoring matrices. This fold recognition jury is implemented in a Java application called JThread. These methods are benchmarked on several test sets, including one released entirely after development and parameterization of JThread. In benchmark tests to identify known folds structurally similar to (but not identical with) the native structure of a sequence, JThread performs significantly better than PSI-BLAST, with 10% more structures identified correctly as the most likely structural match in a fold library, and 20% more structures correctly narrowed down to a set of five possible candidates. JThread also improves the average sequence alignment accuracy significantly, from 53% to 62% of residues aligned correctly. Reliable fold assignments and alignments are identified, making the method useful for genome annotation. JThread is applied to predicted open reading frames (ORFs) from the genomes of Mycoplasma genitalium and Drosophila melanogaster, identifying 20 new structural annotations in the former and 801 in the latter.
Affiliation(s)
- John Marc Chandonia
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94143-2240, USA
12
Hodder AN, Drew DR, Epa VC, Delorenzi M, Bourgon R, Miller SK, Moritz RL, Frecklington DF, Simpson RJ, Speed TP, Pike RN, Crabb BS. Enzymic, phylogenetic, and structural characterization of the unusual papain-like protease domain of Plasmodium falciparum SERA5. J Biol Chem 2003; 278:48169-77. [PMID: 13679369 DOI: 10.1074/jbc.m306755200] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.7] [Indexed: 11/06/2022]
Abstract
Serine repeat antigen 5 (SERA5) is an abundant antigen of the human malaria parasite Plasmodium falciparum and is the most strongly expressed member of the nine-gene SERA family. It appears to be essential for the maintenance of the erythrocytic cycle, unlike a number of other members of this family, and has been implicated in parasite egress and/or erythrocyte invasion. All SERA proteins possess a central domain that has homology to papain except in the case of SERA5 (and some other SERAs), where the active site cysteine has been replaced with a serine. To investigate if this domain retains catalytic activity, we expressed, purified, and refolded a recombinant form of the SERA5 enzyme domain. This protein possessed chymotrypsin-like proteolytic activity as it processed substrates downstream of aromatic residues, and its activity was reversed by the serine protease inhibitor 3,4-diisocoumarin. Although all Plasmodium SERA enzyme domain sequences share considerable homology, phylogenetic studies revealed two distinct clusters across the genus, separated according to whether they possess an active site serine or cysteine. All Plasmodia appear to have at least one member of each group. Consistent with separate biological roles for members of these two clusters, molecular modeling studies revealed that SERA5 and SERA6 enzyme domains have dramatically different surface properties, although both have a characteristic papain-like fold, catalytic cleft, and an appropriately positioned catalytic triad. This study provides impetus for the examination of SERA5 as a target for antimalarial drug design.
Affiliation(s)
- Anthony N Hodder
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, 1G Royal Parade, Parkville, Victoria 3050, Australia.
13
Aita T, Ota M, Husimi Y. An in silico exploration of the neutral network in protein sequence space. J Theor Biol 2003; 221:599-613. [PMID: 12713943 DOI: 10.1006/jtbi.2003.3209] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Designating amino-acid sequences that fold into a common main-chain structure as "neutral sequences" for the structure, regardless of their function or stability, we investigated the distribution of neutral sequences in protein sequence space. For four distinct target structures (α, β, α/β and α+β types) with the same chain length of 108, we generated the respective neutral sequences by using the inverse folding technique with a knowledge-based potential function. We assumed that neutral sequences for a protein structure have Z scores higher than or equal to fixed thresholds, where thresholds are defined as the Z score for the corresponding native sequence (case 1) or much greater Z score (case 2). An exploring walk simulation suggested that the neutral sequences mapped into the sequence space were connected with each other through straight neutral paths and formed an inherent neutral network over the sequence space. Through another exploring walk simulation, we investigated contiguous regions between or among the neutral networks for the distinct protein structures and obtained the following results. The closest approach distance between the two neutral networks ranged from 5 to 29 on the Hamming distance scale, showing a linear increase against the threshold values. The sequences located at the "interchange" regions between the two neutral networks have intermediate sequence-profile-scores for both corresponding structures. Introducing a "ball" in the sequence space that contains at least one neutral sequence for each of the four structures, we found that the minimal radius of the ball that is centered at an arbitrary position ranged from 35 to 50, while the minimal radius of the ball that is centered at a certain special position ranged from 20 to 30, on the Hamming distance scale. The relatively small Hamming distances (5-30) may support an evolution mechanism by transferring from a network for a structure to another network for a more beneficial structure via the interchange regions.
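The Hamming distance used throughout this abstract is the number of positions at which two equal-length sequences differ. A minimal sketch of that distance, and of a brute-force "closest approach" between two sets of sequences, is given below. The function names and the toy sequences are hypothetical, and the exhaustive pairwise search stands in for, and is not, the exploring walk simulation the authors describe.

```python
def hamming(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def closest_approach(network_a, network_b):
    """Smallest Hamming distance between any pair drawn from two sets.

    Brute force over all pairs; only practical for small sets.
    """
    return min(hamming(a, b) for a in network_a for b in network_b)


# Toy 4-residue "networks" (the study used chains of length 108).
net_a = ["AAAA", "AAAC"]
net_b = ["CCCC", "AACC"]
d_min = closest_approach(net_a, net_b)
```

Here `AAAC` and `AACC` differ at a single position, so the closest approach between the two toy sets is 1.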
Affiliation(s)
- Takuyo Aita
- Tsukuba Research Institute, Novartis Pharma K. K. Ohkubo 8, Tsukuba 300-2611, Japan
14
Imai K, Mitaku S. Common Pattern of Coarse-Grained Charge Distribution of Structurally Analogous Proteins. Chem-Bio Informatics Journal 2003. [DOI: 10.1273/cbij.3.194] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
Affiliation(s)
- Kenichiro Imai
- Nagoya University, Graduate School of Engineering, Department of Applied Physics
- Shigeki Mitaku
- Nagoya University, Graduate School of Engineering, Department of Applied Physics
15
Himly M, Jahn-Schmid B, Dedic A, Kelemen P, Wopfner N, Altmann F, van Ree R, Briza P, Richter K, Ebner C, Ferreira F. Art v 1, the major allergen of mugwort pollen, is a modular glycoprotein with a defensin-like and a hydroxyproline-rich domain. FASEB J 2003; 17:106-8. [PMID: 12475905 DOI: 10.1096/fj.02-0472fje] [Citation(s) in RCA: 100] [Impact Index Per Article: 4.8] [Indexed: 11/11/2022]
Abstract
In late summer, pollen grains originating from Compositae weeds (e.g., mugwort, ragweed) are a major source of allergens worldwide. Here, we report the isolation of a cDNA clone coding for Art v 1, the major allergen of mugwort pollen. Sequence analysis showed that Art v 1 is a secreted allergen with an N-terminal cysteine-rich domain homologous to plant defensins and a C-terminal proline-rich region containing several (Ser/Ala)(Pro)2-4 repeats. Structural analysis showed that some of the proline residues in the C-terminal domain of Art v 1 are posttranslationally modified by hydroxylation and O-glycosylation. The O-glycans are composed of 3 galactoses and 9-16 arabinoses linked to a hydroxyproline and represent a new type of plant O-glycan. A 3-D structural model of Art v 1 was generated showing a characteristic "head and tail" structure. Evaluation of the antibody binding properties of natural and recombinant Art v 1 produced in Escherichia coli revealed the involvement of the defensin fold and posttranslational modifications in the formation of epitopes recognized by IgE antibodies from allergic patients. However, posttranslational modifications did not influence T-cell recognition. Thus, recombinant nonglycosylated Art v 1 is a good starting template for engineering hypoallergenic vaccines for weed-pollen therapy.
Affiliation(s)
- Martin Himly
- Institute of Genetics and General Biology, University of Salzburg, A-5020 Salzburg, Austria
16
Jahn-Schmid B, Kelemen P, Himly M, Bohle B, Fischer G, Ferreira F, Ebner C. The T cell response to Art v 1, the major mugwort pollen allergen, is dominated by one epitope. Journal of Immunology (Baltimore, Md. : 1950) 2002; 169:6005-11. [PMID: 12421987 DOI: 10.4049/jimmunol.169.10.6005] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022]
Abstract
Mugwort (Artemisia vulgaris) pollen allergens represent the main cause of pollinosis in late summer in Europe. At least 95% of sera from mugwort pollen-allergic patients contain IgE against a highly glycosylated 24- to 28-kDa glycoprotein. Recently, this major allergen, termed Art v 1, was characterized, cloned in Escherichia coli, and produced in recombinant form. In the present study we characterized and compared the T cell responses to natural (nArt v 1) and recombinant Art v 1 (rArt v 1). In vitro T cell responses to nArt v 1 and rArt v 1 were studied in PBMC, T cell lines (TCL), and T cell clones (TCC) established from PBMC of mugwort-allergic patients. Stimulation of PBMC or allergen-specific TCL with either nArt v 1 or rArt v 1 resulted in comparable proliferative T cell responses. Eighty-five percent of the TCC reactive with rArt v 1 cross-reacted with the natural protein. The majority of the CD4+CD8−TCRαβ+ Art v 1-specific TCC, obtained from 10 different donors, belonged to the Th2 phenotype. Epitope mapping of TCL and TCC using overlapping peptides revealed a single immunodominant T cell epitope recognized by 81% of the patients. Inhibition experiments demonstrated that the presentation of this peptide is restricted by HLA-DR molecules. In conclusion, the T cell response to Art v 1 is characterized by one strong immunodominant epitope and evidently differs from the T cell responses to other common pollen allergens known to contain multiple T cell epitopes. Therefore, mugwort allergy may be an ideal candidate for a peptide-based immunotherapy approach.
MESH Headings
- Allergens/analysis
- Allergens/immunology
- Allergens/metabolism
- Antigen-Antibody Reactions
- Antigens, Plant
- Artemisia/immunology
- Cell Line
- Clone Cells
- Epitopes, T-Lymphocyte/analysis
- Epitopes, T-Lymphocyte/immunology
- Epitopes, T-Lymphocyte/metabolism
- Gene Rearrangement, beta-Chain T-Cell Antigen Receptor
- HLA-DQ Antigens/analysis
- HLA-DR Antigens/analysis
- Histocompatibility Testing
- Humans
- Immunoblotting
- Immunodominant Epitopes/analysis
- Immunodominant Epitopes/immunology
- Immunodominant Epitopes/metabolism
- Immunoglobulin E/blood
- Leukocytes, Mononuclear/chemistry
- Leukocytes, Mononuclear/immunology
- Leukocytes, Mononuclear/metabolism
- Lymphocyte Activation/immunology
- Plant Proteins/analysis
- Plant Proteins/immunology
- Plant Proteins/metabolism
- Pollen/immunology
- Receptors, Antigen, T-Cell, alpha-beta/analysis
- Receptors, Antigen, T-Cell, alpha-beta/biosynthesis
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Recombinant Proteins/immunology
- Recombinant Proteins/metabolism
- T-Lymphocytes/chemistry
- T-Lymphocytes/immunology
- T-Lymphocytes/metabolism
Affiliation(s)
- Beatrice Jahn-Schmid
- Department of Pathophysiology, Division of Immunopathology, University of Vienna, AKH-3Q, Waehringer Guertel 18-20, A-1090 Vienna, Austria
17
Sippl MJ, Lackner P, Domingues FS, Prlić A, Malik R, Andreeva A, Wiederstein M. Assessment of the CASP4 fold recognition category. Proteins 2002; Suppl 5:55-67. [PMID: 11835482 DOI: 10.1002/prot.10006] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022]
Abstract
We present the assessment of the CASP4 fold recognition category. The tasks we had to execute include the splitting of multidomain targets into single domains, the classification of target domains in terms of prediction categories, the numerical evaluation of predictions, the mapping of numerical scores to quality indices, the ranking of predictors, the selection of top-performing groups, and the analysis and critical discussion of the state of the art in this field. The 125 fold recognition groups were assessed by a total score that summarizes their performance over all targets and a quality score reflecting the average quality of the submitted models. Most of the top-performing groups achieved respectable results on both scores simultaneously. Several groups submitted models that were much closer to the respective target structures than any of the known folds in the Protein Data Bank. The CASP4 assessment included the automated servers of the parallel CAFASP experiment. For the total score, the highest rank achieved by a fully automated server is 12. Two thirds of the predictors have rather low scores.
Affiliation(s)
- M J Sippl
- Center for Applied Molecular Engineering, Institute for Chemistry and Biochemistry, University of Salzburg, Salzburg, Austria.
18
Reva B, Finkelstein A, Topiol S. Threading with chemostructural restrictions method for predicting fold and functionally significant residues: application to dipeptidylpeptidase IV (DPP-IV). Proteins 2002; 47:180-93. [PMID: 11933065 DOI: 10.1002/prot.10076] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
We present a new method for more accurate modeling of protein structure, called threading with chemostructural restrictions. This method addresses those cases in which a target sequence has only remote homologues of known structure for which sequence comparison methods cannot provide accurate alignments. Although remote homologues cannot provide an accurate model for the whole chain, they can be used in constructing practically useful models for the most conserved (and often the most interesting) part of the structure. For many proteins of interest, one can suggest certain chemostructural patterns for the native structure based on the available information on the structural superfamily of the protein, the type of activity, the sequence location of the functionally significant residues, and other factors. We use such patterns to restrict (1) the number of possible templates and (2) the number of allowed chain conformations on a template. The latter restrictions are imposed in the form of additional template potentials (including terms acting as sequence anchors) that act on certain residues. This approach is tested on remote homologues of α/β-hydrolases that have significant structural similarity in the positions of their catalytic triads. The study shows that, in spite of significant deviations between the model and the native structures, the surroundings of the catalytic triad (positions of Cα atoms of 20-30 nearby residues) can be reproduced with an accuracy of 2-3 Å. We then apply the approach to predict the structure of dipeptidylpeptidase IV (DPP-IV). Using experimentally available data identifying the catalytic triad residues of DPP-IV (David et al., J Biol Chem 1993;268:17247-17252), we predict a model structure of the catalytic domain of DPP-IV based on the 3D fold of prolyl oligopeptidase (Fulop et al., Cell 1998;94:161-170) and use this structure for modeling the interaction of DPP-IV with inhibitor.
Collapse
Affiliation(s)
- Boris Reva
- Novartis Institute for Biomedical Research, Summit, New Jersey, USA.
|
19
|
Pollastri G, Baldi P, Fariselli P, Casadio R. Prediction of coordination number and relative solvent accessibility in proteins. Proteins 2002; 47:142-53. [PMID: 11933061 DOI: 10.1002/prot.10069] [Citation(s) in RCA: 183] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Knowing the coordination number and relative solvent accessibility of all the residues in a protein is crucial for deriving constraints useful in modeling protein folding and protein structure and in scoring remote homology searches. We develop ensembles of bidirectional recurrent neural network architectures to improve the state of the art in both contact and accessibility prediction, leveraging a large corpus of curated data together with evolutionary information. The ensembles are used to discriminate between two different states of residue contacts or relative solvent accessibility, higher or lower than a threshold determined by the average value of the residue distribution or the accessibility cutoff. For coordination numbers, the ensemble achieves performances ranging within 70.6-73.9% depending on the radius adopted to discriminate contacts (6-12 Å). These performances represent gains of 16-20% over the baseline statistical predictor, always assigning an amino acid to the largest class, and are 4-7% better than any previous method. A combination of different radius predictors further improves performance. For accessibility thresholds in the relevant 15-30% range, the ensemble consistently achieves a performance above 77%, which is 10-16% above the baseline prediction and better than other existing predictors, by up to several percentage points. For both problems, we quantify the improvement due to evolutionary information in the form of PSI-BLAST-generated profiles over BLAST profiles. The prediction programs are implemented in the form of two web servers, CONpro and ACCpro, available at http://promoter.ics.uci.edu/BRNN-PRED/.
Affiliation(s)
- Gianluca Pollastri
- Department of Information and Computer Science, Institute for Genomics and Bioinformatics, University of California, Irvine, California 92697-3425, USA
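The two-state setup described in the abstract above can be illustrated with a minimal Python sketch. It counts neighbors within a radius, labels each residue as above or below the mean count, and computes the accuracy of a baseline that always predicts the larger class. The coordinates, the function names, and the 8 Å radius are hypothetical choices for illustration; this is not the CONpro or ACCpro implementation.

```python
import math


def coordination_numbers(coords, radius):
    """For each point, count how many other points lie within `radius`."""
    counts = []
    for i, a in enumerate(coords):
        n = sum(
            1
            for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= radius
        )
        counts.append(n)
    return counts


def two_state_labels(values, threshold):
    """Label a value 1 if it is strictly above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def majority_baseline_accuracy(labels):
    """Accuracy of a predictor that always outputs the most common class."""
    ones = sum(labels)
    return max(ones, len(labels) - ones) / len(labels)


# Hypothetical C-alpha-like positions along a line, in angstroms.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (30.0, 0.0, 0.0),
]

counts = coordination_numbers(toy_coords, radius=8.0)
mean_count = sum(counts) / len(counts)  # threshold taken as the average
labels = two_state_labels(counts, mean_count)
baseline = majority_baseline_accuracy(labels)
```

Any trained predictor would need to beat `baseline` on the same labels for its reported gain to be meaningful, which is the comparison the abstract makes.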
|
20
|
Robson B, Mordasini T, Curioni A. Studies in the assessment of folding quality for protein modeling and structure prediction. J Proteome Res 2002; 1:115-33. [PMID: 12643532 DOI: 10.1021/pr0155228] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
A diagnostic for assessing the quality of a fold has been developed to which further criteria can be progressively added. The goal is to create a measure that can follow the status of a protein structure in a simulation or modeling process, when the answer (the experimental structure) is not known in advance, rather than simply reject deliberate misfolds. This places greater emphasis on the need to study, and calibrate against, marginal cases, i.e., unusual native structures, incomplete structures, partially erroneous X-ray structures, good models, poor models, and the effect of cofactors. The first three terms introduced in the diagnostic are appropriate core-forming properties or noncore properties of residues in relation to tertiary structure, appropriate neighboring structure density for each residue in relation to tertiary structure, and secondary structure consistency. While the method emerges as a useful simulation analysis tool, we find a need for further fine-tuning to diminish sensitivity to minor conformational changes that retain essential features of the fold, balanced against the need to obtain a more sensitive response when a conformational change involves less physically meaningful interatomic interactions. This dual utility is difficult to obtain: the investigation highlights some of the issues. Initial attempts to obtain it have led to terms in the diagnostic that are admittedly complex: simplifications must also be explored.
Affiliation(s)
- Barry Robson
- IBM Research, T. J. Watson Research Laboratory, Yorktown Heights, New York 10598, USA
|
21
|
Chazalet V, Uehara K, Geremia RA, Breton C. Identification of essential amino acids in the Azorhizobium caulinodans fucosyltransferase NodZ. J Bacteriol 2001; 183:7067-75. [PMID: 11717264 PMCID: PMC95554 DOI: 10.1128/jb.183.24.7067-7075.2001] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
The nodZ gene, which is present in various rhizobial species, is involved in the addition of a fucose residue in an α1-6 linkage to the reducing N-acetylglucosamine residue of lipo-chitin oligosaccharide signal molecules, the so-called Nod factors. Fucosylation of Nod factors is known to affect nodulation efficiency and host specificity. Despite a lack of overall sequence identity, NodZ proteins share conserved peptide motifs with mammalian and plant fucosyltransferases that participate in the biosynthesis of complex glycans and polysaccharides. These peptide motifs are thought to play important roles in catalysis. NodZ was expressed as an active and soluble form in Escherichia coli and was subjected to site-directed mutagenesis to investigate the role of the most conserved residues. Enzyme assays demonstrate that the replacement of the invariant Arg-182 by either alanine, lysine, or aspartate results in products with no detectable activity. A similar result is obtained with the replacement of the conserved acidic position (Asp-275) by its corresponding amide form. The residues His-183 and Asn-185 appear to fulfill functions that are more specific to the NodZ subfamily. Secondary structure predictions and threading analyses suggest the presence of a "Rossmann-type" nucleotide binding domain in the C-terminal half of the catalytic domain of fucosyltransferases. Site-directed mutagenesis combined with theoretical approaches has shed light on the possible nucleotide donor recognition mode for NodZ and related fucosyltransferases.
Affiliation(s)
- V Chazalet
- Centre de Recherches sur les Macromolécules Végétales and Joseph Fourier University, CNRS, Grenoble, France
|
22
|
Fairbrother WJ, Gordon NC, Humke EW, O'Rourke KM, Starovasnik MA, Yin JP, Dixit VM. The PYRIN domain: a member of the death domain-fold superfamily. Protein Sci 2001; 10:1911-8. [PMID: 11514682 PMCID: PMC2253208 DOI: 10.1110/ps.13801] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]
Abstract
PYRIN domains were identified recently as putative protein-protein interaction domains at the N-termini of several proteins thought to function in apoptotic and inflammatory signaling pathways. The approximately 95-residue PYRIN domains have no statistically significant sequence homology to proteins with known three-dimensional structure. Using secondary structure prediction and potential-based fold recognition methods, however, the PYRIN domain is predicted to be a member of the six-helix bundle death domain-fold superfamily that includes death domains (DDs), death effector domains (DEDs), and caspase recruitment domains (CARDs). Members of the death domain-fold superfamily are well-established mediators of protein-protein interactions found in many proteins involved in apoptosis and inflammation, indicating further that the PYRIN domains serve a similar function. A homology model of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1, a member of the Apaf-1/Ced-4 family of proteins, was constructed using the three-dimensional structures of the FADD and p75 neurotrophin receptor DDs, and of the Apaf-1 and caspase-9 CARDs, as templates. Validation of the model using a variety of computational techniques indicates that the fold prediction is consistent with the sequence. Comparison of a circular dichroism spectrum of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1 with spectra of several proteins known to adopt the death domain-fold provides experimental support for the structure prediction.
Affiliation(s)
- W J Fairbrother
- Department of Protein Engineering, Genentech, Inc., South San Francisco, California 94080, USA.
|
23
|
Dunbrack RL, Gerloff DL, Bower M, Chen X, Lichtarge O, Cohen FE. Meeting review: the Second meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP2), Asilomar, California, December 13-16, 1996. FOLDING & DESIGN 2001; 2:R27-42. [PMID: 9135979 DOI: 10.1016/s1359-0278(97)00011-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
In most fields of scientific endeavor, the outcomes of important experiments are not always known before the experiments are performed. But in protein structure prediction, algorithms are usually developed and tested in situations where the answers are known. In December 1996, the Second Meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP2) was held in Asilomar, California to rectify this situation: protein sequences were provided in advance for which the experimental structure had not yet been published. Over 70 research groups provided bona fide predictions on 42 targets in four categories: comparative or 'homology' modeling, fold recognition or 'threading', ab initio structure predictions, and docking predictions. Since the previous CASP meeting in 1994, the role of fold recognition in structure prediction has increased enormously with the largest number of groups participating in this category. In this review, we highlight some of the important developments and give at least a qualitative sense of what kind of methods produced some of the better predictions.
Affiliation(s)
- R L Dunbrack
- Department of Cellular and Molecular Pharmacology, University of California at San Francisco 94143-0450, USA
|
24
|
Martí-Renom MA, Stuart AC, Fiser A, Sánchez R, Melo F, Sali A. Comparative protein structure modeling of genes and genomes. ANNUAL REVIEW OF BIOPHYSICS AND BIOMOLECULAR STRUCTURE 2001; 29:291-325. [PMID: 10940251 DOI: 10.1146/annurev.biophys.29.1.291] [Citation(s) in RCA: 2333] [Impact Index Per Article: 101.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Comparative modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. The number of protein sequences that can be modeled and the accuracy of the predictions are increasing steadily because of the growth in the number of known protein structures and because of the improvements in the modeling software. Further advances are necessary in recognizing weak sequence-structure similarities, aligning sequences with structures, modeling of rigid body shifts, distortions, loops and side chains, as well as detecting errors in a model. Despite these problems, it is currently possible to model with useful accuracy significant parts of approximately one third of all known protein sequences. The use of individual comparative models in biology is already rewarding and increasingly widespread. A major new challenge for comparative modeling is the integration of it with the torrents of data from genome sequencing projects as well as from functional and structural genomics. In particular, there is a need to develop an automated, rapid, robust, sensitive, and accurate comparative modeling pipeline applicable to whole genomes. Such large-scale modeling is likely to encourage new kinds of applications for the many resulting models, based on their large number and completeness at the level of the family, organism, or functional network.
Affiliation(s)
- M A Martí-Renom
- Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, Rockefeller University, New York, NY 10021, USA
|
25
|
Abstract
A homology-based structure prediction method ideally gives both a correct fold assignment and an accurate query-template alignment. In this article we show that the combination of two existing methods, PSI-BLAST and threading, leads to significant enhancement in the success rate of fold recognition. The combined approach, termed COBLATH, also yields much higher alignment accuracy than found in previous studies. It consists of two-way searches both by PSI-BLAST and by threading. In the PSI-BLAST portion, a query is used to search for hits in a library of potential templates and, conversely, each potential template is used to search for hits in a library of queries. In the threading portion, the scoring function is the sum of a sequence profile and a 6x6 substitution matrix between predicted query and known template secondary structure and solvent exposure. "Two-way" in threading means that the query's sequence profile is used to match the sequences of all potential templates and the sequence profiles of all potential templates are used to match the query's sequence. When tested on a set of 533 nonhomologous proteins, COBLATH was able to assign folds for 390 (73%). Among these 390 queries, 265 (68%) had root-mean-square deviations (RMSDs) of less than 8 Å between predicted and actual structures. Such a high success rate and accuracy make COBLATH an ideal tool for structural genomics.
Affiliation(s)
- Y Shan
- Department of Physics, Drexel University, Philadelphia, Pennsylvania 19104, USA
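The additive threading score described in the abstract above (a sequence-profile term plus a 6x6 structural-state term) can be sketched as follows. This is only an illustration: the abstract does not spell out the six states, so the sketch assumes they are three secondary-structure classes crossed with two exposure classes, and all matrix values and names are hypothetical. The last lines simply recompute the reported percentages from the reported counts.

```python
from itertools import product

# Assumed state alphabet: 3 secondary-structure classes x 2 exposure classes.
SECONDARY = ("H", "E", "C")   # helix, strand, coil
EXPOSURE = ("b", "e")         # buried, exposed
STATES = [s + x for s, x in product(SECONDARY, EXPOSURE)]  # 6 states


def toy_state_matrix(match=1.0, partial=0.25, mismatch=-0.5):
    """Build a hypothetical 6x6 matrix keyed by (query_state, template_state)."""
    matrix = {}
    for q, t in product(STATES, STATES):
        if q == t:
            matrix[(q, t)] = match
        elif q[0] == t[0] or q[1] == t[1]:
            matrix[(q, t)] = partial   # one of the two properties agrees
        else:
            matrix[(q, t)] = mismatch
    return matrix


def position_score(profile_score, query_state, template_state, matrix):
    """Sum of the sequence-profile term and the structural-state term."""
    return profile_score + matrix[(query_state, template_state)]


M = toy_state_matrix()

# Reported counts from the abstract, converted to rounded percentages.
fold_rate = round(100 * 390 / 533)      # folds assigned out of 533 queries
accuracy_rate = round(100 * 265 / 390)  # assigned folds with RMSD < 8 angstroms
```

In a full threading run, `position_score` would be evaluated inside a dynamic-programming alignment for each query-template pair, in both search directions.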
|
26
|
Yagnik AT, Lahm A, Meola A, Roccasecca RM, Ercole BB, Nicosia A, Tramontano A. A model for the hepatitis C virus envelope glycoprotein E2. Proteins 2000; 40:355-66. [PMID: 10861927 DOI: 10.1002/1097-0134(20000815)40:3<355::aid-prot20>3.0.co;2-k] [Citation(s) in RCA: 166] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
Several experimental studies on hepatitis C virus (HCV) have suggested the envelope glycoprotein E2 as a key antigen for an effective vaccine against the virus. Knowledge of its structure, therefore, would present a significant step forward in the fight against this disease. This paper reports the application of fold recognition methods in order to produce a model of the HCV E2 protein. Such investigation highlighted the envelope protein E of Tick Borne Encephalitis virus as a possible template for building a model of HCV E2. Mapping of experimental data onto the model allowed the prediction of a composite interaction site between E2 and its proposed cellular receptor CD81, as well as a heparin binding domain. In addition, experimental evidence is provided to show that CD81 recognition by E2 is isolate or strain specific and possibly mediated by the second hypervariable region (HVR2) of E2. Finally, the studies have also allowed a rough model for the quaternary structure of the envelope glycoproteins E1 and E2 complex to be proposed. Proteins 2000;40:355-366.
Affiliation(s)
- A T Yagnik
- Istituto di Ricerche di Biologia Molecolare P. Angeletti, Pomezia (Rome), Italy
|
27
|
Abstract
As the number of completely sequenced genomes rapidly increases, the postgenomic problem of gene function identification becomes ever more pressing. Predicting the structures of proteins encoded by genes of interest is one possible means to glean subtle clues as to the functions of these proteins. There are limitations to this approach to gene identification and a survey of the expected reliability of different protein structure prediction techniques has been undertaken.
Affiliation(s)
- D T Jones
- Department of Biological Sciences, Brunel University, Uxbridge, UB8 3PH, UK.
|
28
|
Domingues FS, Lackner P, Andreeva A, Sippl MJ. Structure-based evaluation of sequence comparison and fold recognition alignment accuracy. J Mol Biol 2000; 297:1003-13. [PMID: 10736233 DOI: 10.1006/jmbi.2000.3615] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
The biological role, biochemical function, and structure of uncharacterized protein sequences are often inferred from their similarity to known proteins. A constant goal is to increase the reliability, sensitivity, and accuracy of alignment techniques to enable the detection of increasingly distant relationships. Development, tuning, and testing of these methods benefit from appropriate benchmarks for the assessment of alignment accuracy. Here, we describe a benchmark protocol to estimate sequence-to-sequence and sequence-to-structure alignment accuracy. The protocol consists of structurally related pairs of proteins and procedures to evaluate alignment accuracy over the whole set. The set of protein pairs covers all the currently known fold types. The benchmark is challenging in the sense that it consists of proteins lacking clear sequence similarity. Correct target alignments are derived from the three-dimensional structures of these pairs by rigid body superposition. An evaluation engine computes the accuracy of alignments obtained from a particular algorithm in terms of alignment shifts with respect to the structure-derived alignments. Using this benchmark we estimate that the best results can be obtained from a combination of amino acid residue substitution matrices and knowledge-based potentials.
Affiliation(s)
- F S Domingues
- Center for Applied Molecular Engineering, Institute for Chemistry and Biochemistry, University of Salzburg, Jakob Haringer Strasse 3, Salzburg, A-5020, Austria
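The idea of scoring an alignment by its shifts relative to a structure-derived reference can be sketched in a few lines of Python. The abstract does not give the exact definition used by the evaluation engine, so the per-residue shift and the tolerance-based accuracy below are assumptions, and all names and data are hypothetical.

```python
def alignment_shifts(reference, test):
    """Per-residue shift of a test alignment against a reference alignment.

    Both alignments map a query residue index to a template residue index.
    A query residue left unaligned by the test alignment gets a shift of None.
    """
    shifts = {}
    for query_idx, ref_template_idx in reference.items():
        test_template_idx = test.get(query_idx)
        if test_template_idx is None:
            shifts[query_idx] = None
        else:
            shifts[query_idx] = test_template_idx - ref_template_idx
    return shifts


def fraction_within(shifts, tolerance=0):
    """Fraction of reference positions aligned within `tolerance` residues."""
    if not shifts:
        return 0.0
    ok = sum(1 for s in shifts.values() if s is not None and abs(s) <= tolerance)
    return ok / len(shifts)


# Hypothetical reference (from superposition) and test (from an algorithm).
reference = {0: 10, 1: 11, 2: 12, 3: 13}
test = {0: 10, 1: 12, 2: 13, 4: 20}

shifts = alignment_shifts(reference, test)
exact = fraction_within(shifts, tolerance=0)
near = fraction_within(shifts, tolerance=1)
```

Reporting accuracy at several tolerances separates alignments that are slightly off-register from those that are wrong altogether.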
|
29
|
Richardson CJ, Barlow DJ. The bottom line for prediction of residue solvent accessibility. PROTEIN ENGINEERING 1999; 12:1051-4. [PMID: 10611398 DOI: 10.1093/protein/12.12.1051] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
A simple method of predicting residue solvent accessibilities in proteins is described, with the intention that it should be used as a baseline by which more sophisticated approaches to prediction can be judged. Comparison with existing methods of predicting residue burial reveals that their performance is often little better than that of the baseline method. The problem of comparing different prediction methods is shown to be complicated by the proliferation of different schemes for classifying residue burial.
Affiliation(s)
- C J Richardson
- Pharmacy Department, King's College London, Manresa Road, London SW3 6LX, UK.
|
30
|
van Mourik J, Clementi C, Maritan A, Seno F, Banavar JR. Determination of interaction potentials of amino acids from native protein structures: Tests on simple lattice models. J Chem Phys 1999. [DOI: 10.1063/1.478885] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
31
|
Jones DT. GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol 1999; 287:797-815. [PMID: 10191147 DOI: 10.1006/jmbi.1999.2583] [Citation(s) in RCA: 694] [Impact Index Per Article: 27.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
A new protein fold recognition method is described which is both fast and reliable. The method uses a traditional sequence alignment algorithm to generate alignments which are then evaluated by a method derived from threading techniques. As a final step, each threaded model is evaluated by a neural network in order to produce a single measure of confidence in the proposed prediction. The speed of the method, along with its sensitivity and very low false-positive rate, makes it ideal for automatically predicting the structure of all the proteins in a translated bacterial genome (proteome). The method has been applied to the genome of Mycoplasma genitalium, and analysis of the results shows that as many as 46% of the proteins derived from the predicted protein coding regions have a significant relationship to a protein of known structure. In some cases, however, only one domain of the protein can be predicted, giving a total coverage of 30% when calculated as a fraction of the number of amino acid residues in the whole proteome.
Affiliation(s)
- D T Jones
- Department of Biological Sciences, University of Warwick, Coventry, CV4 7AL, UK.
|
32
|
Vijayakumar M, Qian H, Zhou HX. Hydrogen bonds between short polar side chains and peptide backbone: Prevalence in proteins and effects on helix-forming propensities. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(19990301)34:4<497::aid-prot9>3.0.co;2-g] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
33
|
Jermutus L, Guez V, Bedouelle H. Disordered C-terminal domain of tyrosyl-tRNA synthetase: secondary structure prediction. Biochimie 1999; 81:235-44. [PMID: 10385005 DOI: 10.1016/s0300-9084(99)80057-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The C-terminal domain (residues 320-419) of tyrosyl-tRNA synthetase (TyrRS) from Bacillus stearothermophilus is disordered in the crystal structure and involved in the binding of the anticodon arm of tRNA(Tyr). The sequences of 11 TyrRSs of prokaryotic or mitochondrial origins were aligned and the alignment showed the existence of conserved residues in the sequences of the C-terminal domains. A consensus could be deduced from the application of five programs of secondary structure prediction to the 11 sequences of the query set. These results suggested that the sequences of the C-terminal domains determined a precise and conserved secondary structure. They predicted that the C-terminal domain would have a mixed fold (α/β or α+β), with the α-helices in the first half of the sequence and the β-strands mainly in its second half. Several programs of fold recognition from sequence alone, by threading onto known structures, were applied but none of them identified a type of fold that would be common to the different sequences of the query set. Therefore, the fold of the C-terminal, anticodon binding domain might be novel.
Affiliation(s)
- L Jermutus
- Groupe d'Ingénierie des Protéines (CNRS URA 1129), Unité de Biochimie Cellulaire, Institut Pasteur, Paris, France
|
34
|
Domingues FS, Koppensteiner WA, Jaritz M, Prlic A, Weichenberger C, Wiederstein M, Floeckner H, Lackner P, Sippl MJ. Sustained performance of knowledge-based potentials in fold recognition. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(1999)37:3+<112::aid-prot15>3.0.co;2-r] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
35
|
Application of Reduced Models to Protein Structure Prediction. ACTA ACUST UNITED AC 1999. [DOI: 10.1016/s1380-7323(99)80086-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]
|
36
|
Abstract
We outline a general strategy for determining the effective coarse-grained interactions between the amino acids of a protein from the experimentally derived native-state structures. The method is, in principle, free from any adjustable or empirically determined parameters, and it is tested on simple models and compared with other existing approaches.
Affiliation(s)
- F Seno
- Istituto Nazionale per la Fisica della Materia, Dipartimento di Fisica G. Galilei, Università di Padova, Italy.
|
37
|
Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J Mol Biol 1998; 275:895-916. [PMID: 9480776 DOI: 10.1006/jmbi.1997.1479] [Citation(s) in RCA: 365] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
We present a formalism to compute the probability of an amino acid sequence conformation being native-like, given a set of pairwise atom-atom distances. The formalism is used to derive three discriminatory functions with different types of representations for the atom-atom contacts observed in a database of protein structures. These functions include two virtual atom representations and one all-heavy atom representation. When applied to six different decoy sets containing a range of correct and incorrect conformations of amino acid sequences, the all-atom distance-dependent discriminatory function is able to identify correct from incorrect more often than the discriminatory functions using approximate representations. We illustrate the importance of using a detailed atomic description for obtaining the most accurate discrimination, and the necessity for testing discriminatory functions against a wide variety of decoys. The discriminatory function is also shown to be capable of capturing the fine details of atom-atom preferences. These results suggest that the all-atom distance-dependent discriminatory function will be useful for protein structure prediction and model refinement.
Affiliation(s)
- R Samudrala
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600, Gudelsky Drive, Rockville, MD 20850, USA
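A distance-dependent conditional probability score of the kind described in the abstract above can be sketched as a sum of negative log-odds terms, comparing how often an atom-pair type is seen in a distance bin with how often any pair is seen in that bin. The sketch below is a simplified illustration under stated assumptions: the pair types, the bin indices, and the toy observations are hypothetical, and the lack of pseudocounts for unseen combinations is a simplification rather than the authors' procedure.

```python
import math
from collections import Counter


def build_tables(observations):
    """Tabulate counts from a list of (pair_type, distance_bin) observations."""
    observations = list(observations)
    pair_bin = Counter(observations)
    pair_total = Counter(pair for pair, _ in observations)
    bin_total = Counter(d for _, d in observations)
    return pair_bin, pair_total, bin_total, len(observations)


def log_odds_score(contacts, tables):
    """Sum of -ln[P(bin | pair type) / P(bin)] over the given contacts.

    Lower totals mean the contacts look more like the reference database.
    Unseen (pair type, bin) combinations raise an error in this toy version.
    """
    pair_bin, pair_total, bin_total, n = tables
    score = 0.0
    for pair, d in contacts:
        if pair_bin[(pair, d)] == 0:
            raise ValueError(f"no observations for {pair!r} in bin {d}")
        p_conditional = pair_bin[(pair, d)] / pair_total[pair]
        p_prior = bin_total[d] / n
        score += -math.log(p_conditional / p_prior)
    return score


# Hypothetical observations: pair type "CC" prefers bin 0, "CN" prefers bin 1.
observed = [("CC", 0), ("CC", 0), ("CC", 1), ("CN", 1), ("CN", 1), ("CN", 0)]
tables = build_tables(observed)

favorable = log_odds_score([("CC", 0)], tables)
unfavorable = log_odds_score([("CC", 1)], tables)
```

With these toy counts, a "CC" pair in bin 0 is enriched relative to the background and scores below zero, while the same pair in bin 1 is depleted and scores above zero.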
|
38
|
Lathrop RH, Rogers RG, White JV, Gaitatzes C, Smith TF, Bienkowska J, Bryant BK, Buturović LJ, Nambudripad R. Analysis and algorithms for protein sequence–structure alignment. COMPUTATIONAL METHODS IN MOLECULAR BIOLOGY 1998. [DOI: 10.1016/s0167-7306(08)60469-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
39
|
Benner SA, Cannarozzi G, Gerloff D, Turcotte M, Chelvanayagam G. Bona Fide Predictions of Protein Secondary Structure Using Transparent Analyses of Multiple Sequence Alignments. Chem Rev 1997; 97:2725-2844. [PMID: 11851479 DOI: 10.1021/cr940469a] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Steven A. Benner
- Department of Chemistry, University of Florida, Gainesville, Florida 32611-7200
|
40
|
|
41
|
Abstract
In fold recognition by threading one takes the amino acid sequence of a protein and evaluates how well it fits into one of the known three-dimensional (3D) protein structures. The quality of sequence-structure fit is typically evaluated using inter-residue potentials of mean force or other statistical parameters. Here, we present an alternative approach to evaluating sequence-structure fitness. Starting from the amino acid sequence we first predict secondary structure and solvent accessibility for each residue. We then thread the resulting one-dimensional (1D) profile of predicted structure assignments into each of the known 3D structures. The optimal threading for each sequence-structure pair is obtained using dynamic programming. The overall best sequence-structure pair constitutes the predicted 3D structure for the input sequence. The method is fine-tuned by adding information from direct sequence-sequence comparison and applying a series of empirical filters. Although the method relies on reduction of 3D information into 1D structure profiles, its accuracy is, surprisingly, not clearly inferior to methods based on evaluation of residue interactions in 3D. We therefore hypothesise that existing 1D-3D threading methods essentially do not capture more than the fitness of an amino acid sequence for a particular 1D succession of secondary structure segments and residue solvent accessibility. The prediction-based threading method on average finds any structurally homologous region at first rank in 29% of the cases (including sequence information). For the 22% first hits detected at highest scores, the expected accuracy rose to 75%. However, the task of detecting entire folds rather than homologous fragments was managed much better; 45 to 75% of the first hits correctly recognised the fold.
Affiliation(s)
- B Rost
- EMBL, Heidelberg, Germany
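The core step in the abstract above, aligning a predicted 1D structure string to the observed 1D strings of known structures by dynamic programming and keeping the best-scoring template, can be sketched as below. This is a simplified illustration, not the published method: it uses secondary structure only (no solvent accessibility, sequence information, or empirical filters), a plain global alignment, and hypothetical match, mismatch, and gap scores.

```python
def thread_score(predicted, observed, match=1, mismatch=-1, gap=-2):
    """Global alignment score between two 1D structure strings (H/E/C)."""
    rows, cols = len(predicted) + 1, len(observed) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap          # leading gaps in the template
    for j in range(1, cols):
        dp[0][j] = j * gap          # leading gaps in the query
    for i in range(1, rows):
        for j in range(1, cols):
            same = predicted[i - 1] == observed[j - 1]
            diagonal = dp[i - 1][j - 1] + (match if same else mismatch)
            dp[i][j] = max(diagonal, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[-1][-1]


def best_template(predicted, library):
    """Return the name of the library entry with the highest threading score."""
    return max(library, key=lambda name: thread_score(predicted, library[name]))


# Hypothetical predicted string for a query and a toy template library.
predicted = "HHHHCCEEEE"
library = {
    "helical": "HHHHHHHHHH",
    "mixed": "HHHHCEEEE",
    "strand": "EEEECCEEEE",
}

winner = best_template(predicted, library)
```

Here the "mixed" template wins because it reproduces the helix-coil-strand order of the prediction at the cost of a single gap.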
|
42
|
Abstract
If protein structure prediction methods are to make any impact on the impending onerous task of analyzing the large numbers of unknown protein sequences generated by the ongoing genome-sequencing projects, it is vital that they make the difficult transition from computational 'gedankenexperiments' to practical software tools. This has already happened in the field of comparative modelling and is currently happening in the threading field. Unfortunately, there is little evidence of this transition happening in the field of ab initio tertiary-structure prediction.
Affiliation(s)
- D T Jones
- Department of Biological Sciences, University of Warwick, Coventry, UK.
|
43
|
Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol 1997; 268:209-25. [PMID: 9149153 DOI: 10.1006/jmbi.1997.0959] [Citation(s) in RCA: 950] [Impact Index Per Article: 35.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
We explore the ability of a simple simulated annealing procedure to assemble native-like structures from fragments of unrelated protein structures with similar local sequences using Bayesian scoring functions. Environment and residue pair specific contributions to the scoring functions appear as the first two terms in a series expansion for the residue probability distributions in the protein database; the decoupling of the distance and environment dependencies of the distributions resolves the major problems with current database-derived scoring functions noted by Thomas and Dill. The simulated annealing procedure rapidly and frequently generates native-like structures for small helical proteins and better than random structures for small beta sheet containing proteins. Most of the simulated structures have native-like solvent accessibility and secondary structure patterns, and thus ensembles of these structures provide a particularly challenging set of decoys for evaluating scoring functions. We investigate the effects of multiple sequence information and different types of conformational constraints on the overall performance of the method, and the ability of a variety of recently developed scoring functions to recognize the native-like conformations in the ensembles of simulated structures.
Affiliation(s)
- K T Simons
- Department of Biochemistry, University of Washington, Seattle 98195, USA
|
44
|
Rice DW, Eisenberg D. A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol 1997; 267:1026-38. [PMID: 9135128 DOI: 10.1006/jmbi.1997.0924] [Citation(s) in RCA: 131] [Impact Index Per Article: 4.9] [Indexed: 02/04/2023]
Abstract
In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. In cases where the probe is sequentially distant from its homolog, we have developed a (7 x 3 x 2 x 7 x 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold, but have sequence identity less than 30%. Each probe sequence position is defined by one of seven residue classes and three secondary structure classes. Each homologous fold position is defined by one of seven residue classes, three secondary structure classes, and two burial classes. Thus the matrix is five-dimensional and contains 7 x 3 x 2 x 7 x 3 = 882 elements or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the three-state (helix, strand, coil) secondary structure of the probe; here we use the profile-based neural network prediction of secondary structure (PHD) program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold class diverse, and cross-validated benchmark assessment is used to compare the H3P2 matrix to the GONNET, PAM250, BLOSUM62 and a secondary structure only substitution matrix. For distantly related sequences the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (or SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility.
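To make the shape of the five-dimensional matrix concrete, the sketch below builds a table indexed by (fold residue class, fold secondary structure class, fold burial class, probe residue class, probe secondary structure class) and uses it in a simple dynamic programming alignment. The class sizes come from the abstract; the axis order, the integer class labels, the placeholder scores, the linear gap penalty, and the choice of a global (Needleman-Wunsch style) alignment are assumptions, not details of the published H3P2 method.

```python
from itertools import product

# Class counts from the abstract; axis order and integer labels are assumptions.
N_RES, N_SS, N_BURIAL = 7, 3, 2

def make_matrix(fill=0.0):
    """Return a dict keyed by (fold_res, fold_ss, fold_burial, probe_res, probe_ss)."""
    keys = product(range(N_RES), range(N_SS), range(N_BURIAL),
                   range(N_RES), range(N_SS))
    return {key: fill for key in keys}

def score(matrix, fold_pos, probe_pos):
    """Look up the 3D-1D score for one fold position against one probe position."""
    return matrix[fold_pos + probe_pos]

def align_score(matrix, fold, probe, gap=-1.0):
    """Global alignment score with a linear gap penalty (an assumed scheme)."""
    rows, cols = len(fold) + 1, len(probe) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(matrix, fold[i - 1], probe[j - 1]),
                dp[i - 1][j] + gap,   # gap in the probe
                dp[i][j - 1] + gap,   # gap in the fold
            )
    return dp[-1][-1]

matrix = make_matrix()
n_elements = len(matrix)  # 7 * 3 * 2 * 7 * 3

# Placeholder score for one environment pairing; real values would come from
# statistics over structural pairs, which are not reproduced here.
matrix[(0, 0, 0, 0, 0)] = 2.0
fold = [(0, 0, 0), (1, 1, 1)]   # (residue class, SS class, burial class) per position
probe = [(0, 0), (1, 1)]        # (residue class, predicted SS class) per position
total = align_score(matrix, fold, probe)
```

The table has one entry per combination of the five class variables, which is where the 882 figure in the abstract comes from.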
Affiliation(s)
- D W Rice
- UCLA-DOE Laboratory of Structural Biology and Molecular Medicine, Molecular Biology Institute, UCLA, Los Angeles, CA 90095-1570, USA
|
45
|
Osuna J, Soberón X, Morett E. A proposed architecture for the central domain of the bacterial enhancer-binding proteins based on secondary structure prediction and fold recognition. Protein Sci 1997; 6:543-55. [PMID: 9070437 PMCID: PMC2143673 DOI: 10.1002/pro.5560060304] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.0] [Indexed: 02/03/2023]
Abstract
The expression of genes transcribed by the RNA polymerase with the alternative sigma factor sigma 54 (E sigma 54) is absolutely dependent on activator proteins that bind to enhancer-like sites, located far upstream from the promoter. These unique prokaryotic proteins, known as enhancer-binding proteins (EBP), mediate open promoter complex formation in a reaction dependent on NTP hydrolysis. The best characterized proteins of this family of regulators are NtrC and NifA, which activate genes required for ammonia assimilation and nitrogen fixation, respectively. In a recent IRBM course ("Frontiers of protein structure prediction," IRBM, Pomezia, Italy, 1995; see web site http://www.mrc-cpe.cam.uk/irbm-course95/), one of us (J.O.) participated in the elaboration of the proposal that the Central domain of the EBPs might adopt the classical mononucleotide-binding fold. This suggestion was based on the results of a new protein fold recognition algorithm (Map) and on the mapping of correlated mutations calculated for the sequence family on the same mononucleotide-binding fold topology. In this work, we present new data that support the previous conclusion. The results from a number of different secondary structure prediction programs suggest that the Central domain could adopt an alpha/beta topology. The fold recognition programs ProFIT 0.9, 3D PROFILE combined with secondary structure prediction, and 123D suggest a mononucleotide-binding fold topology for the Central domain amino acid sequence. Finally, and most importantly, three of five reported residue alterations that impair the Central domain ATPase activity of the E sigma 54 activators map to polypeptide regions that might play roles equivalent to those involved in nucleotide binding in the mononucleotide-binding proteins. Furthermore, the known residue substitutions that alter the function of the E sigma 54 activators while leaving the Central domain ATPase activity intact map to a region proposed to play a role equivalent to that of the effector region of the GTPase superfamily.
Affiliation(s)
- J Osuna
- Departamento de Reconocimiento Molecular Bioestructura, Universidad Nacional Autónoma de México, México.
|
46
|
Abstract
Although we are still a long way from being able to predict the details of protein structure from the underlying chemistry, slow but steady progress is being made at modeling structural features by recognizing the patterns that connect sequence to structure.
Affiliation(s)
- D Shortle
- Department of Biological Chemistry, The Johns Hopkins University School of Medicine, 725 North Wolfe Street, Baltimore, Maryland 21205, USA
|
47
|
Abstract
The computational techniques for sorting out protein folds (these techniques include dynamic programming, self-consistent field theory, etc.) have already ceased to be the bottleneck of predictions. The main problem is that all the methods of recognition and prediction of protein structure can actually use only some part of the interactions operating in the chain, and that even their energies are not known precisely. This is now the principal source of errors. The errors can be reduced by using many distant homologues, but this makes it possible to predict a generalized folding pattern rather than a particular fold with all its details.
Affiliation(s)
- A V Finkelstein
- Institute of Protein Research, Russian Academy of Sciences, 142292 Pushchino, Moscow Region, Russia.
|
48
|
Kolinski A, Skolnick J, Godzik A, Hu WP. A method for the prediction of surface “U”-turns and transglobular connections in small proteins. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(199702)27:2<290::aid-prot14>3.0.co;2-h] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
|
49
|
|
50
|
Jones DT. Successful ab initio prediction of the tertiary structure of NK-lysin using multiple sequences and recognized supersecondary structural motifs. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(1997)1+<185::aid-prot24>3.0.co;2-j] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.4] [Indexed: 11/06/2022]
|