1551
Laganeckas M, Margelevicius M, Venclovas C. Identification of new homologs of PD-(D/E)XK nucleases by support vector machines trained on data derived from profile-profile alignments. Nucleic Acids Res 2010; 39:1187-96. [PMID: 20961958 PMCID: PMC3045609 DOI: 10.1093/nar/gkq958] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 01/09/2023]
Abstract
PD-(D/E)XK nucleases, initially represented by only Type II restriction enzymes, now comprise a large and extremely diverse superfamily of proteins. They participate in many different nucleic acids transactions including DNA degradation, recombination, repair and RNA processing. Different PD-(D/E)XK families, although sharing a structurally conserved core, typically display little or no detectable sequence similarity except for the active site motifs. This makes the identification of new superfamily members using standard homology search techniques challenging. To tackle this problem, we developed a method for the detection of PD-(D/E)XK families based on the binary classification of profile–profile alignments using support vector machines (SVMs). Using a number of both superfamily-specific and general features, SVMs were trained to identify true positive alignments of PD-(D/E)XK representatives. With this method we identified several PFAM families of uncharacterized proteins as putative new members of the PD-(D/E)XK superfamily. In addition, we assigned several unclassified restriction enzymes to the PD-(D/E)XK type. Results show that the new method is able to make confident assignments even for alignments that have statistically insignificant scores. We also implemented the method as a freely accessible web server at http://www.ibt.lt/bioinformatics/software/pdexk/.
1553
Graebsch A, Roche S, Kostrewa D, Söding J, Niessing D. Of bits and bugs--on the use of bioinformatics and a bacterial crystal structure to solve a eukaryotic repeat-protein structure. PLoS One 2010; 5:e13402. [PMID: 20976240 PMCID: PMC2954813 DOI: 10.1371/journal.pone.0013402] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 07/06/2010] [Accepted: 09/24/2010] [Indexed: 11/19/2022]
Abstract
Pur-α is a nucleic acid-binding protein involved in cell cycle control, transcription, and neuronal function. Initially no prediction of the three-dimensional structure of Pur-α was possible. However, recently we solved the X-ray structure of Pur-α from the fruitfly Drosophila melanogaster and showed that it contains a so-called PUR domain. Here we explain how we exploited bioinformatics tools in combination with X-ray structure determination of a bacterial homolog to obtain diffracting crystals and the high-resolution structure of Drosophila Pur-α. First, we used sensitive methods for remote-homology detection to find three repetitive regions in Pur-α. We realized that our lack of understanding how these repeats interact to form a globular domain was a major problem for crystallization and structure determination. With our information on the repeat motifs we then identified a distant bacterial homolog that contains only one repeat. We determined the bacterial crystal structure and found that two of the repeats interact to form a globular domain. Based on this bacterial structure, we calculated a computational model of the eukaryotic protein. The model allowed us to design a crystallizable fragment and to determine the structure of Drosophila Pur-α. Key for success was the fact that single repeats of the bacterial protein self-assembled into a globular domain, instructing us on the number and boundaries of repeats to be included for crystallization trials with the eukaryotic protein. This study demonstrates that the simpler structural domain arrangement of a distant prokaryotic protein can guide the design of eukaryotic crystallization constructs. Since many eukaryotic proteins contain multiple repeats or repeating domains, this approach might be instructive for structural studies of a range of proteins.
Affiliation(s)
- Almut Graebsch
  - Institute of Structural Biology, Helmholtz Zentrum München, Munich, Germany
  - Department of Biochemistry, Gene Center of the Ludwig-Maximilians-University Munich, Munich, Germany
- Stéphane Roche
  - Institute of Structural Biology, Helmholtz Zentrum München, Munich, Germany
  - Department of Biochemistry, Gene Center of the Ludwig-Maximilians-University Munich, Munich, Germany
- Dirk Kostrewa
  - Department of Biochemistry, Gene Center of the Ludwig-Maximilians-University Munich, Munich, Germany
- Johannes Söding
  - Department of Biochemistry, Gene Center of the Ludwig-Maximilians-University Munich, Munich, Germany
- Dierk Niessing
  - Institute of Structural Biology, Helmholtz Zentrum München, Munich, Germany
  - Department of Biochemistry, Gene Center of the Ludwig-Maximilians-University Munich, Munich, Germany
1554
Bateman A, Coggill P, Finn RD. DUFs: families in search of function. Acta Crystallogr Sect F Struct Biol Cryst Commun 2010; 66:1148-52. [PMID: 20944204 PMCID: PMC2954198 DOI: 10.1107/s1744309110001685] [Citation(s) in RCA: 184] [Impact Index Per Article: 12.3] [Received: 10/09/2009] [Accepted: 01/13/2010] [Indexed: 11/30/2022]
Abstract
Domains of unknown function (DUFs) are a large set of uncharacterized protein families that are found in the Pfam database. Here, the scale and growth of functionally uncharacterized families in biological databases are surveyed and the prospects for discovering their function are examined. In particular, the important role that structural genomics can play in identifying potential function is evaluated.
Affiliation(s)
- Alex Bateman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, England.
1555
Csaba G, Zimmer R. Vorescore--fold recognition improved by rescoring of protein structure models. Bioinformatics 2010; 26:i474-81. [PMID: 20823310 PMCID: PMC2935407 DOI: 10.1093/bioinformatics/btq369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Summary: The identification of good protein structure models and their appropriate ranking is a crucial problem in structure prediction and fold recognition. For many alignment methods, rescoring of alignment-induced models using structural information can improve the separation of useful and less useful models as compared with the alignment score. Vorescore, a template-based protein structure model rescoring system, is introduced. The method scores the model structure against the template used for the modeling using Vorolign. The method works on models from different alignment methods and incorporates both knowledge from the prediction method and the rescoring.
Results: The performance of Vorescore is evaluated in a large-scale and difficult protein structure prediction context. We use different threading methods to create models for 410 targets, in three scenarios: (i) family members are contained in the template set; (ii) superfamily members (but no family members); and (iii) only fold members (but no family or superfamily members). In all cases Vorescore improves significantly (e.g. 40% on both Gotoh and HHalign at the fold level) on the model quality, and clearly outperforms the state-of-the-art physics-based model scoring system Rosetta. Moreover, Vorescore improves on other successful rescoring approaches such as Pcons and ProQ. In an additional experiment we add high-quality models based on structural alignments to the set, which allows Vorescore to improve the fold recognition rate by another 50%.
Availability: All models of the test set (about 2 million, 44 GB gzipped) are available upon request.
Contact: csaba@bio.ifi.lmu.de; ralf.zimmer@ifi.lmu.de
Affiliation(s)
- Gergely Csaba
- Department of Informatics, Ludwig-Maximilians-Universität München, München, Germany.
1556
Essential biological processes of an emerging pathogen: DNA replication, transcription, and cell division in Acinetobacter spp. Microbiol Mol Biol Rev 2010; 74:273-97. [PMID: 20508250 DOI: 10.1128/mmbr.00048-09] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/20/2022]
Abstract
Within the last 15 years, members of the bacterial genus Acinetobacter have risen from relative obscurity to be among the most important sources of hospital-acquired infections. The driving force for this has been the remarkable ability of these organisms to acquire antibiotic resistance determinants, with some strains now showing resistance to every antibiotic in clinical use. There is an urgent need for new antibacterial compounds to combat the threat imposed by Acinetobacter spp. and other intractable bacterial pathogens. The essential processes of chromosomal DNA replication, transcription, and cell division are attractive targets for the rational design of antimicrobial drugs. The goal of this review is to examine the wealth of genome sequence and gene knockout data now available for Acinetobacter spp., highlighting those aspects of essential systems that are most suitable as drug targets. Acinetobacter spp. show several key differences from other pathogenic gammaproteobacteria, particularly in global stress response pathways. The involvement of these pathways in short- and long-term antibiotic survival suggests that Acinetobacter spp. cope with antibiotic-induced stress differently from other microorganisms.
1557
Calvanese L, Marasco D, Doti N, Saporito A, D'Auria G, Paolillo L, Ruvo M, Falcigno L. Structural investigations on the Nodal-Cripto binding: A theoretical and experimental approach. Biopolymers 2010; 93:1011-21. [DOI: 10.1002/bip.21517] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
1558
Insights into the binding of Phenyltiocarbamide (PTC) agonist to its target human TAS2R38 bitter receptor. PLoS One 2010; 5:e12394. [PMID: 20811630 PMCID: PMC2928277 DOI: 10.1371/journal.pone.0012394] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Received: 03/16/2010] [Accepted: 08/02/2010] [Indexed: 12/02/2022]
Abstract
Humans' bitter taste perception is mediated by the hTAS2R subfamily of the G protein-coupled membrane receptors (GPCRs). Structural information on these receptors is currently limited. Here we identify residues involved in the binding of phenylthiocarbamide (PTC) and in receptor activation in one of the most widely studied hTAS2Rs (hTAS2R38) by means of structural bioinformatics and molecular docking. The predictions are validated by site-directed mutagenesis experiments that involve specific residues located in the putative binding site and trans-membrane (TM) helices 6 and 7 putatively involved in receptor activation. Based on our measurements, we suggest that (i) residue N103 participates actively in PTC binding, in line with previous computational studies. (ii) W99, M100 and S259 contribute to define the size and shape of the binding cavity. (iii) W99 and M100, along with F255 and V296, play a key role for receptor activation, providing insights on bitter taste receptor activation not emerging from the previously reported computational models.
1559
Marsin S, Lopes A, Mathieu A, Dizet E, Orillard E, Guérois R, Radicella JP. Genetic dissection of Helicobacter pylori AddAB role in homologous recombination. FEMS Microbiol Lett 2010; 311:44-50. [DOI: 10.1111/j.1574-6968.2010.02077.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]
1560
On the evolutionary origins of "Fold Space Continuity": a study of topological convergence and divergence in mixed alpha-beta domains. J Struct Biol 2010; 172:244-52. [PMID: 20691788 DOI: 10.1016/j.jsb.2010.07.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 03/17/2010] [Revised: 06/25/2010] [Accepted: 07/31/2010] [Indexed: 11/21/2022]
Abstract
Existing protein structure classifications group proteins by overall structural similarity at the highest level and by evolutionary relationships at the lowest level, deriving higher-level groups by pairwise structure comparison. For this to be successful requires that large changes in structure are relatively rare in evolution and that proteins with no detectable evolutionary relationship do not converge on similar global chain conformations since this creates conflicts between structural and evolutionary consistency. Analysis of global structural changes using core topological descriptions for 4261 domains from classes C and D of the SCOP database and new measures of topological distance and consistency of classification showed that the topological consistency of SCOP folds is highly variable with some folds having no consistent description and significant overlaps between groups including some members of separate folds with identical topological descriptions. Topological clustering shows that including sufficient indels to allow family members to be joined would also require joining several distinct folds. We conclude that evolutionary changes in the global topology of protein domains are the root cause of many difficulties for present approaches to structure classification using pairwise comparison. As a resolution we propose that a purely structural classification should be created using an approach similar to that adopted by the Gene Ontology in which proteins are assigned labels describing structure.
1561
van der Meijden E, Janssens RWA, Lauber C, Bouwes Bavinck JN, Gorbalenya AE, Feltkamp MCW. Discovery of a new human polyomavirus associated with trichodysplasia spinulosa in an immunocompromized patient. PLoS Pathog 2010; 6:e1001024. [PMID: 20686659 PMCID: PMC2912394 DOI: 10.1371/journal.ppat.1001024] [Citation(s) in RCA: 346] [Impact Index Per Article: 23.1] [Received: 03/19/2010] [Accepted: 06/30/2010] [Indexed: 01/06/2023]
Abstract
The Polyomaviridae constitute a family of small DNA viruses infecting a variety of hosts. In humans, polyomaviruses can cause infections of the central nervous system, urinary tract, skin, and possibly the respiratory tract. Here we report the identification of a new human polyomavirus in plucked facial spines of a heart transplant patient with trichodysplasia spinulosa, a rare skin disease exclusively seen in immunocompromized patients. The trichodysplasia spinulosa-associated polyomavirus (TSV) genome was amplified through rolling-circle amplification and consists of a 5232-nucleotide circular DNA organized similarly to known polyomaviruses. Two putative “early” (small and large T antigen) and three putative “late” (VP1, VP2, VP3) genes were identified. The TSV large T antigen contains several domains (e.g. J-domain) and motifs (e.g. HPDKGG, pRb family-binding, zinc finger) described for other polyomaviruses and potentially involved in cellular transformation. Phylogenetic analysis revealed a close relationship of TSV with the Bornean orangutan polyomavirus and, more distantly, the Merkel cell polyomavirus that is found integrated in Merkel cell carcinomas of the skin. The presence of TSV in the affected patient's skin was confirmed by newly designed quantitative TSV-specific PCR, indicative of a viral load of 10^5 copies per cell. After topical cidofovir treatment, the lesions largely resolved coinciding with a reduction in TSV load. PCR screening demonstrated a 4% prevalence of TSV in an unrelated group of immunosuppressed transplant recipients without apparent disease. In conclusion, a new human polyomavirus was discovered and identified as the possible cause of trichodysplasia spinulosa in immunocompromized patients. The presence of TSV also in clinically unaffected individuals suggests frequent virus transmission causing subclinical, probably latent infections. Further studies have to reveal the impact of TSV infection in relation to other populations and diseases.
Diseases that occur exclusively in immunocompromized patients are often of an infectious nature. Trichodysplasia spinulosa (TS) is such a disease characterized by development of papules, spines and alopecia in the face. Fortunately this disease is rare, because facial features can change dramatically, as in the case of an adolescent TS patient who was on immunosuppressive drugs because of heart-transplantation. A viral cause of TS was suspected already for some time because virus particles had been seen in TS lesions. In pursuit of this unknown virus, we isolated DNA from collected TS spines and could detect a unique small circular DNA suggestive of a polyomavirus genome. Additional experiments confirmed the presence in these samples of a new polyomavirus that we tentatively called TS-associated polyomavirus (TSPyV or TSV). TSV shares several properties with other polyomaviruses, such as genome organization and proteome composition, association with disease in immunosuppressed patients and occurrence in individuals without overt disease. The latter indicates that TSV circulates in the human population. Future studies have to show how this newly identified polyomavirus spreads, how it causes disease and if it is related to other (skin) conditions as well.
Affiliation(s)
- Els van der Meijden
  - Department of Medical Microbiology, Leiden University Medical Center, Leiden, The Netherlands
- René W. A. Janssens
  - Department of Dermatology, Jeroen Bosch Hospital, ’s-Hertogenbosch, The Netherlands
- Chris Lauber
  - Department of Medical Microbiology, Leiden University Medical Center, Leiden, The Netherlands
- Alexander E. Gorbalenya
  - Department of Medical Microbiology, Leiden University Medical Center, Leiden, The Netherlands
- Mariet C. W. Feltkamp
  - Department of Medical Microbiology, Leiden University Medical Center, Leiden, The Netherlands
1562
Smolarek D, Bertrand O, Czerwinski M, Colin Y, Etchebest C, de Brevern AG. Multiple interests in structural models of DARC transmembrane protein. Transfus Clin Biol 2010; 17:184-96. [PMID: 20655787 DOI: 10.1016/j.tracli.2010.05.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 05/11/2010] [Accepted: 05/21/2010] [Indexed: 12/23/2022]
Abstract
Duffy Antigen Receptor for Chemokines (DARC) is an unusual transmembrane chemokine receptor which (i) binds the two main chemokine families and (ii) does not transduce any signal, as it lacks the DRY consensus sequence. It is considered a silent chemokine receptor, a tank useful for chemotaxis. DARC has been studied in particular as a major actor in malaria infection by Plasmodium vivax. It is also implicated in multiple chemokine inflammation, in inflammatory diseases and in cancer, and might play a role in HIV infection and AIDS. In this review, we focus on the value of building structural models of DARC to understand more precisely its ability to bind its physiological ligand CXCL8 and its malaria ligand. We also present innovative developments in VHHs able to bind the DARC protein. We underline the difficulties and limitations of such bioinformatics approaches and highlight the crucial importance of biological data for conducting this kind of research.
Affiliation(s)
- D Smolarek
- Inserm UMR-S 665, dynamique des structures et interactions des macromolecules biologiques (DSIMB), 6, rue Alexandre-Cabanel, 75739 Paris cedex 15, France
1563
Garcia Silva MR, Tosar JP, Frugier M, Pantano S, Bonilla B, Esteban L, Serra E, Rovira C, Robello C, Cayota A. Cloning, characterization and subcellular localization of a Trypanosoma cruzi argonaute protein defining a new subfamily distinctive of trypanosomatids. Gene 2010; 466:26-35. [PMID: 20621168 DOI: 10.1016/j.gene.2010.06.012] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 04/14/2010] [Revised: 06/22/2010] [Accepted: 06/29/2010] [Indexed: 01/02/2023]
Abstract
Over the last years an expanding family of small non-coding RNAs (sRNA) has been identified in eukaryotic genomes which behave as sequence-specific triggers for mRNA degradation, translation repression, heterochromatin formation and genome stability. To achieve their effector functions, sRNAs associate with members of the Argonaute protein family. Argonaute proteins are segregated into three paralogous groups: the AGO-like subfamily, the PIWI-like subfamily, and the WAGO subfamily (for Worm specific AGO). Detailed phylogenetic analysis of the small RNA-related machinery components revealed that they can be traced back to the common ancestor of eukaryotes. However, this machinery seems to be lost or excessively simplified in some unicellular organisms such as Saccharomyces cerevisiae, Trypanosoma cruzi, Leishmania major and Plasmodium falciparum which are unable to utilize dsRNA to trigger degradation of target RNAs. We report here a unique ORF encoding an AGO/PIWI protein in T. cruzi which was expressed in all stages of its life cycle at the transcript as well as the protein level. A database search for remote homologues revealed the presence of a divergent PAZ domain adjacent to the well supported PIWI domain. Our results strongly suggest that this unique AGO/PIWI protein from T. cruzi is a canonical Argonaute in terms of its domain architecture. We propose to reclassify all Argonaute members from trypanosomatids as a distinctive phylogenetic group representing a new subfamily of Argonaute proteins and propose the generic designation of AGO/PIWI-tryp to identify them. Inside the Trypanosomatid-specific node, AGO/PIWI-tryps were clearly segregated into two paralog groups designated as AGO-tryp and PIWI-tryp according to the presence or absence of a functional link with RNAi-related phenomena, respectively.
Affiliation(s)
- Maria R Garcia Silva
- Functional Genomics Unit, Institut Pasteur de Montevideo, Mataojo 2020 CP11400 Montevideo, Uruguay.
1564
Altschul SF, Wootton JC, Zaslavsky E, Yu YK. The construction and use of log-odds substitution scores for multiple sequence alignment. PLoS Comput Biol 2010; 6:e1000852. [PMID: 20657661 PMCID: PMC2904766 DOI: 10.1371/journal.pcbi.1000852] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 10/30/2009] [Accepted: 06/03/2010] [Indexed: 01/18/2023]
Abstract
Most pairwise and multiple sequence alignment programs seek alignments with optimal scores. Central to defining such scores is selecting a set of substitution scores for aligned amino acids or nucleotides. For local pairwise alignment, substitution scores are implicitly of log-odds form. We now extend the log-odds formalism to multiple alignments, using Bayesian methods to construct "BILD" ("Bayesian Integral Log-odds") substitution scores from prior distributions describing columns of related letters. This approach has been used previously only to define scores for aligning individual sequences to sequence profiles, but it has much broader applicability. We describe how to calculate BILD scores efficiently, and illustrate their uses in Gibbs sampling optimization procedures, gapped alignment, and the construction of hidden Markov model profiles. BILD scores enable automated selection of optimal motif and domain model widths, and can inform the decision of whether to include a sequence in a multiple alignment, and the selection of insertion and deletion locations. Other applications include the classification of related sequences into subfamilies, and the definition of profile-profile alignment scores. Although a fully realized multiple alignment program must rely upon more than substitution scores, many existing multiple alignment programs can be modified to employ BILD scores. We illustrate how simple BILD score based strategies can enhance the recognition of DNA binding domains, including the Api-AP2 domain in Toxoplasma gondii and Plasmodium falciparum.
Affiliation(s)
- Stephen F Altschul
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America.
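The abstract above notes that substitution scores for local pairwise alignment are implicitly of log-odds form. As a rough illustration of that pairwise form only (not the BILD construction, which the abstract does not spell out), a score can be written as the logarithm of the ratio between the probability of seeing two letters aligned in related sequences and the probability of seeing them together by chance. The function name and the numeric values below are hypothetical, chosen purely for the example.

```python
import math


def log_odds_score(q_ab: float, p_a: float, p_b: float, base: float = 2.0) -> float:
    """Pairwise log-odds substitution score: log_base(q_ab / (p_a * p_b)).

    q_ab : target frequency of letters a and b aligned in related sequences
    p_a, p_b : background frequencies of a and b
    All three values are assumptions supplied by the caller.
    """
    if min(q_ab, p_a, p_b) <= 0:
        raise ValueError("probabilities must be positive")
    return math.log(q_ab / (p_a * p_b), base)


# Hypothetical frequencies: the pair is seen 8x more often than chance,
# so the score is positive (about 3 bits with base 2).
favoured = log_odds_score(q_ab=0.02, p_a=0.05, p_b=0.05)

# A pair seen less often than chance gets a negative score.
disfavoured = log_odds_score(q_ab=0.001, p_a=0.05, p_b=0.05)

print(f"favoured pair: {favoured:.3f} bits, disfavoured pair: {disfavoured:.3f} bits")
```

A positive score marks a pairing that is more likely under the "related" model than under the background model, and a negative score the reverse, which is why summing such scores along an alignment gives a meaningful total.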
1565
Effective gene silencing in a microsporidian parasite associated with honeybee (Apis mellifera) colony declines. Appl Environ Microbiol 2010; 76:5960-4. [PMID: 20622131 DOI: 10.1128/aem.01067-10] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Indexed: 11/20/2022]
Abstract
Honeybee colonies are vulnerable to parasites and pathogens ranging from viruses to vertebrates. An increasingly prevalent disease of managed honeybees is caused by the microsporidian Nosema ceranae. Microsporidia are basal fungi and obligate parasites with much-reduced genomic and cellular components. A recent genome-sequencing effort for N. ceranae indicated the presence of machinery for RNA silencing in this species, suggesting that RNA interference (RNAi) might be exploited to regulate Nosema gene expression within bee hosts. Here we used controlled laboratory experiments to show that double-stranded RNA homologous to specific N. ceranae ADP/ATP transporter genes can specifically and differentially silence transcripts encoding these proteins. This inhibition also affects Nosema levels and host physiology. Gene silencing could be mediated solely by Nosema or in concert with known systemic RNAi mechanisms in their bee hosts. These results are novel for the microsporidia and provide a possible avenue for controlling a disease agent implicated in severe honeybee colony losses. Moreover, since microsporidia are pathogenic in several known veterinary and human diseases, this advance may have broader applications in the future for disease control.
1566
Essential roles for imuA'- and imuB-encoded accessory factors in DnaE2-dependent mutagenesis in Mycobacterium tuberculosis. Proc Natl Acad Sci U S A 2010; 107:13093-8. [PMID: 20615954 DOI: 10.1073/pnas.1002614107] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.3] [Indexed: 02/01/2023]
Abstract
In Mycobacterium tuberculosis (Mtb), damage-induced mutagenesis is dependent on the C-family DNA polymerase, DnaE2. Included with dnaE2 in the Mtb SOS regulon is a putative operon comprising Rv3395c, which encodes a protein of unknown function restricted primarily to actinomycetes, and Rv3394c, which is predicted to encode a Y-family DNA polymerase. These genes were previously identified as components of an imuA-imuB-dnaE2-type mutagenic cassette widespread among bacterial genomes. Here, we confirm that Rv3395c (designated imuA') and Rv3394c (imuB) are individually essential for induced mutagenesis and damage tolerance. Yeast two-hybrid analyses indicate that ImuB interacts with both ImuA' and DnaE2, as well as with the beta-clamp. Moreover, disruption of the ImuB-beta clamp interaction significantly reduces induced mutagenesis and damage tolerance, phenocopying imuA', imuB, and dnaE2 gene deletion mutants. Despite retaining structural features characteristic of Y-family members, ImuB homologs lack conserved active-site amino acids required for polymerase activity. In contrast, replacement of DnaE2 catalytic residues reproduces the dnaE2 gene deletion phenotype, strongly implying a direct role for the alpha-subunit in mutagenic lesion bypass. These data implicate differential protein interactions in specialist polymerase function and identify the split imuA'-imuB/dnaE2 cassette as a compelling target for compounds designed to limit mutagenesis in a pathogen increasingly associated with drug resistance.
1567
Jin H, White SR, Shida T, Schulz S, Aguiar M, Gygi SP, Bazan JF, Nachury MV. The conserved Bardet-Biedl syndrome proteins assemble a coat that traffics membrane proteins to cilia. Cell 2010; 141:1208-19. [PMID: 20603001 PMCID: PMC2898735 DOI: 10.1016/j.cell.2010.05.015] [Citation(s) in RCA: 475] [Impact Index Per Article: 31.7] [Received: 12/03/2009] [Revised: 03/19/2010] [Accepted: 04/16/2010] [Indexed: 12/18/2022]
Abstract
The BBSome is a complex of Bardet-Biedl Syndrome (BBS) proteins that shares common structural elements with COPI, COPII, and clathrin coats. Here, we show that the BBSome constitutes a coat complex that sorts membrane proteins to primary cilia. The BBSome is the major effector of the Arf-like GTPase Arl6/BBS3, and the BBSome and GTP-bound Arl6 colocalize at ciliary punctae in an interdependent manner. Strikingly, Arl6(GTP)-mediated recruitment of the BBSome to synthetic liposomes produces distinct patches of polymerized coat apposed onto the lipid bilayer. Finally, the ciliary targeting signal of somatostatin receptor 3 needs to be directly recognized by the BBSome in order to mediate targeting of membrane proteins to cilia. Thus, we propose that trafficking of BBSome cargoes to cilia entails the coupling of BBSome coat polymerization to the recognition of sorting signals by the BBSome.
Affiliation(s)
- Hua Jin
  - Department of Molecular and Cellular Physiology, Stanford University School of Medicine, CA 94305-5345, USA
- Susan Roehl White
  - Department of Molecular and Cellular Physiology, Stanford University School of Medicine, CA 94305-5345, USA
- Toshinobu Shida
  - Department of Molecular and Cellular Physiology, Stanford University School of Medicine, CA 94305-5345, USA
- Stefan Schulz
  - Institute of Pharmacology and Toxicology, Friedrich-Schiller-University, D-07743 Jena, Germany
- Mike Aguiar
  - Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Steven P. Gygi
  - Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Maxence V. Nachury
  - Department of Molecular and Cellular Physiology, Stanford University School of Medicine, CA 94305-5345, USA
1568
Krupovic M, Gribaldo S, Bamford DH, Forterre P. The evolutionary history of archaeal MCM helicases: a case study of vertical evolution combined with hitchhiking of mobile genetic elements. Mol Biol Evol 2010; 27:2716-32. [PMID: 20581330 DOI: 10.1093/molbev/msq161] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022]
Abstract
Genes encoding DNA replication proteins have been frequently exchanged between cells and mobile elements, such as viruses or plasmids. This raises potential problems to reconstruct their history. Here, we combine phylogenetic and genomic context analyses to study the evolution of the replicative minichromosome maintenance (MCM) helicases in Archaea. Several archaeal genomes encode more than one copy of the mcm gene. Genome context analysis reveals that most of these additional copies are encoded within mobile elements. Exhaustive analysis of these elements reveals diverse groups of integrated archaeal plasmids or viruses, including several head-and-tail proviruses. Some MCMs encoded by mobile elements are structurally distinct from their cellular counterparts, with one case of novel domain organization. Both genome context and phylogenetic analysis indicate that MCM encoded by mobile elements were recruited from cellular genomes. An accelerated evolution and a dramatic expansion of methanococcal MCMs suggest a host-to-virus-to-host transfer loop, possibly triggered by the loss of the archaeal initiator protein Cdc6 in Methanococcales. Surprisingly, despite extensive transfer of mcm genes between viruses, plasmids, and cells, the topology of the MCM tree is strikingly congruent with the consensus archaeal phylogeny, indicating that mobile elements encoding mcm have coevolved with their hosts and that DNA replication proteins can be also useful to reconstruct the history of the archaeal domain.
Affiliation(s)
- Mart Krupovic
- Department of Biosciences and Institute of Biotechnology, University of Helsinki, Helsinki, Finland
|
1569
|
Kerk D, Moorhead GBG. A phylogenetic survey of myotubularin genes of eukaryotes: distribution, protein structure, evolution, and gene expression. BMC Evol Biol 2010; 10:196. [PMID: 20576132 PMCID: PMC2927912 DOI: 10.1186/1471-2148-10-196] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/14/2009] [Accepted: 06/24/2010] [Indexed: 01/02/2023]
Abstract
Background Phosphorylated phosphatidylinositol (PtdIns) lipids, produced and modified by PtdIns kinases and phosphatases, are critical to the regulation of diverse cellular functions. The myotubularin PtdIns-phosphate phosphatases have been well characterized in yeast and especially animals, where multiple isoforms, both catalytically active and inactive, occur. Myotubularin mutations bring about disruption of cellular membrane trafficking, and in humans, disease. Previous studies have suggested that myotubularins are widely distributed amongst eukaryotes, but key evolutionary questions concerning the origin of different myotubularin isoforms remain unanswered, and little is known about the function of these proteins in most organisms. Results We have identified 80 myotubularin homologues amidst the completely sequenced genomes of 30 organisms spanning four eukaryotic supergroups. We have mapped domain architecture, and inferred evolutionary histories. We have documented an expansion in the Amoebozoa of a family of inactive myotubularins with a novel domain architecture, which we dub "IMLRK" (inactive myotubularin/LRR/ROCO/kinase). There is an especially large myotubularin gene family in the pathogen Entamoeba histolytica, the majority of them IMLRK proteins. We have analyzed published patterns of gene expression in this organism which indicate that myotubularins may be important to critical life cycle stage transitions and host infection. Conclusions This study presents an overall framework of eukaryotic myotubularin gene evolution. Inactive myotubularin homologues with distinct domain architectures appear to have arisen on three separate occasions in different eukaryotic lineages. The large and distinctive set of myotubularin genes found in an important pathogen species suggest that in this organism myotubularins might present important new targets for basic research and perhaps novel therapeutic strategies.
Affiliation(s)
- David Kerk
- Department of Biological Sciences, University of Calgary, Alberta, Canada
|
1570
|
Jung J, Park HJ, Uhm KN, Kim D, Kim HK. Asymmetric synthesis of (S)-ethyl-4-chloro-3-hydroxy butanoate using a Saccharomyces cerevisiae reductase: enantioselectivity and enzyme-substrate docking studies. Biochimica et Biophysica Acta - Proteins and Proteomics 2010; 1804:1841-9. [PMID: 20601218 DOI: 10.1016/j.bbapap.2010.06.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 02/02/2010] [Revised: 06/05/2010] [Accepted: 06/14/2010] [Indexed: 11/25/2022]
Abstract
Ethyl (S)-4-chloro-3-hydroxy butanoate (ECHB) is a building block for the synthesis of hypercholesterolemia drugs. In this study, various microbial reductases have been cloned and expressed in Escherichia coli. Their reductase activities toward ethyl-4-chloro oxobutanoate (ECOB) have been assayed. Among them, baker's yeast YDL124W, YOR120W, and YOL151W reductases showed high activities. YDL124W produced (S)-ECHB exclusively, whereas YOR120W and YOL151W made the (R)-form alcohol. The homology models and docking models with ECOB and NADPH elucidated their substrate specificities and enantioselectivities. A glucose dehydrogenase-coupling reaction was used as an NADPH recycling system to perform the reduction reaction continuously. Recombinant E. coli cells co-expressing YDL124W and Bacillus subtilis glucose dehydrogenase produced (S)-ECHB exclusively.
Affiliation(s)
- Jihye Jung
- Division of Biotechnology, The Catholic University of Korea, Bucheon 420-743, Republic of Korea
|
1571
|
Margelevicius M, Laganeckas M, Venclovas C. COMA server for protein distant homology search. Bioinformatics 2010; 26:1905-6. [PMID: 20529888 DOI: 10.1093/bioinformatics/btq306] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Abstract
SUMMARY Detection of distant homology is a widely used computational approach for studying protein evolution, structure and function. Here, we report a homology search web server based on sequence profile-profile comparison. The user may perform searches in one of several regularly updated profile databases using either a single sequence or a multiple sequence alignment as an input. The same profile databases can also be downloaded for local use. The capabilities of the server are illustrated with the identification of new members of the highly diverse PD-(D/E)XK nuclease superfamily. AVAILABILITY http://www.ibt.lt/bioinformatics/coma/
|
1572
|
Yamashita H, Shang M, Tripathi M, Jourquin J, Georgescu W, Liu S, Weidow B, Quaranta V. Epitope mapping of function-blocking monoclonal antibody CM6 suggests a "weak" integrin binding site on the laminin-332 LG2 domain. J Cell Physiol 2010; 223:541-8. [PMID: 20301201 PMCID: PMC2874318 DOI: 10.1002/jcp.22107] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
Laminin-332 (Ln-332) is an extracellular matrix molecule that regulates cell adhesion, spreading, and migration by interaction with cell surface receptors such as alpha3beta1 and alpha6beta4. Previously, we developed a function-blocking monoclonal antibody against rat Ln-332, CM6, which blocks hemidesmosome assembly induced by Ln-332-alpha6beta4 interactions. However, the location of its epitope on Ln-332 has remained unclear. In this study, we show that the CM6 epitope is located on the laminin G-like (LG)2 module of the Ln-332 alpha3 chain. To specify the residues involved in this epitope, we produced a series of GST-fused alpha3 LG2 mutant proteins in which rat-specific amino acids were replaced with human amino acids by a site-directed mutagenesis strategy. CM6 reactivity against these proteins showed that CM6 binds to the (1089)NERSVR(1094) sequence of the rat Ln-332 LG2 module. In a structural model, this sequence maps to an LG2 loop sequence that is exposed to solvent according to predictions, consistent with its accessibility to antibody. CM6 inhibits integrin-dependent cell adhesion on Ln-332 and inhibits cell spreading on both Ln-332 and recombinant LG2 (rLG2; but not rLG3), suggesting the presence of an alpha3beta1 binding site on LG2. However, we were unable to show that rLG2 supports adhesion in standard assays, suggesting that LG2 may contain a "weak" integrin binding site, only detectable in spreading assays that do not require washes. These results, together with our previous findings, indicate that binding sites for alpha3beta1 and alpha6beta4 are closely spaced in the Ln-332 LG domains where they regulate alternative cell functions, namely adhesion/migration or hemidesmosome anchoring.
Affiliation(s)
- Hironobu Yamashita
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
- Meiling Shang
- Department of Cell Biology, The Scripps Research Institute, La Jolla, CA 92037
- Manisha Tripathi
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
- Jerome Jourquin
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
- Walter Georgescu
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN 37232
- Shanshan Liu
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
- Brandy Weidow
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
- Vito Quaranta
- Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
|
1573
|
DeBartolo J, Hocky G, Wilde M, Xu J, Freed KF, Sosnick TR. Protein structure prediction enhanced with evolutionary diversity: SPEED. Protein Sci 2010; 19:520-34. [PMID: 20066664 DOI: 10.1002/pro.330] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]
Abstract
For naturally occurring proteins, similar sequence implies similar structure. Consequently, multiple sequence alignments (MSAs) often are used in template-based modeling of protein structure and have been incorporated into fragment-based assembly methods. Our previous homology-free structure prediction study introduced an algorithm that mimics the folding pathway by coupling the formation of secondary and tertiary structure. Moves in the Monte Carlo procedure involve only a change in a single pair of phi,psi backbone dihedral angles that are obtained from a Protein Data Bank-based distribution appropriate for each amino acid, conditional on the type and conformation of the flanking residues. We improve this method by using MSAs to enrich the sampling distribution, but in a manner that does not require structural knowledge of any protein sequence (i.e., not homologous fragment insertion). In combination with other tools, including clustering and refinement, the accuracies of the predicted secondary and tertiary structures are substantially improved and a global and position-resolved measure of confidence is introduced for the accuracy of the predictions. Performance of the method in the Critical Assessment of Structure Prediction (CASP8) is discussed.
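The single-pair move described in this abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the tiny `ANGLE_LIBRARY`, the `toy_energy` function, and the Metropolis acceptance rule are hypothetical stand-ins, not the authors' distributions, energy function, or acceptance scheme. In particular, the real sampling distribution is PDB-derived and conditional on the flanking residues, which this sketch ignores.

```python
import math
import random

# Hypothetical (phi, psi) library per amino acid type, in degrees.
# A real library would be PDB-derived and conditional on neighboring residues.
ANGLE_LIBRARY = {
    "A": [(-60.0, -45.0), (-120.0, 130.0)],
    "G": [(-60.0, -45.0), (60.0, 45.0), (-120.0, 130.0)],
}


def propose_move(sequence, angles, rng):
    """Return a copy of `angles` in which one residue's (phi, psi) pair is resampled."""
    i = rng.randrange(len(sequence))
    new_angles = list(angles)
    new_angles[i] = rng.choice(ANGLE_LIBRARY[sequence[i]])
    return new_angles


def metropolis_step(sequence, angles, energy, temperature, rng):
    """Propose one single-pair move and accept or reject it (Metropolis rule, an assumption)."""
    candidate = propose_move(sequence, angles, rng)
    delta = energy(candidate) - energy(angles)
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return candidate
    return angles


def toy_energy(angles):
    """Placeholder energy that simply favors phi near -60 degrees."""
    return sum(abs(phi + 60.0) for phi, _ in angles) / 100.0


rng = random.Random(0)
sequence = "AGAG"
angles = [(-120.0, 130.0)] * len(sequence)
for _ in range(200):
    angles = metropolis_step(sequence, angles, toy_energy, temperature=0.5, rng=rng)
print(angles)
```

Enriching the sampling distribution with a multiple sequence alignment, as the abstract describes, would correspond to widening each residue's entry in the library with angle pairs appropriate for the residues seen at that alignment column.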
Affiliation(s)
- Joe DeBartolo
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, USA
|
1574
|
Teichert F, Minning J, Bastolla U, Porto M. High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABERTOOTH. BMC Bioinformatics 2010; 11:251. [PMID: 20470364 PMCID: PMC2885375 DOI: 10.1186/1471-2105-11-251] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/07/2009] [Accepted: 05/14/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein alignments are an essential tool for many bioinformatics analyses. While sequence alignments are accurate for proteins of high sequence similarity, they become unreliable as they approach the so-called 'twilight zone' where sequence similarity gets indistinguishable from random. For such distant pairs, structure alignment is of much better quality. Nevertheless, sequence alignment is the only choice in the majority of cases where structural data is not available. This situation demands development of methods that extend the applicability of accurate sequence alignment to distantly related proteins. RESULTS We develop a sequence alignment method that combines the prediction of a structural profile based on the protein's sequence with the alignment of that profile using our recently published alignment tool SABERTOOTH. In particular, we predict the contact vector of protein structures using an artificial neural network based on position-specific scoring matrices generated by PSI-BLAST and align these predicted contact vectors. The resulting sequence alignments are assessed using two different tests: First, we assess the alignment quality by measuring the derived structural similarity for cases in which structures are available. In a second test, we quantify the ability of the significance score of the alignments to recognize structural and evolutionary relationships. As a benchmark we use a representative set of the SCOP (structural classification of proteins) database, with similarities ranging from closely related proteins at SCOP family level, to very distantly related proteins at SCOP fold level. Comparing these results with some prominent sequence alignment tools, we find that SABERTOOTH produces sequence alignments of better quality than those of Clustal W, T-Coffee, MUSCLE, and PSI-BLAST. 
HHpred, one of the most sophisticated and computationally expensive tools available, outperforms our alignment algorithm at family and superfamily levels, while the use of SABERTOOTH is advantageous for alignments at fold level. Our alignment scheme will profit from future improvements of structural profiles prediction. CONCLUSIONS We present the automatic sequence alignment tool SABERTOOTH that computes pairwise sequence alignments of very high quality. SABERTOOTH is especially advantageous when applied to alignments of remotely related proteins. The source code is available at http://www.fkp.tu-darmstadt.de/sabertooth_project/, free for academic users upon request.
Affiliation(s)
- Florian Teichert
- Institut für Festkörperphysik, Technische Universität Darmstadt, Hochschulstr, Darmstadt, Germany
|
1575
|
Civril F, Wehenkel A, Giorgi FM, Santaguida S, Di Fonzo A, Grigorean G, Ciccarelli FD, Musacchio A. Structural analysis of the RZZ complex reveals common ancestry with multisubunit vesicle tethering machinery. Structure 2010; 18:616-26. [PMID: 20462495 DOI: 10.1016/j.str.2010.02.014] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Received: 10/02/2009] [Revised: 01/22/2010] [Accepted: 02/19/2010] [Indexed: 01/31/2023]
Abstract
The RZZ complex recruits dynein to kinetochores. We investigated structure, topology, and interactions of the RZZ subunits (ROD, ZWILCH, and ZW10) in vitro, in vivo, and in silico. We identify neuroblastoma-amplified gene (NAG), a ZW10 binder, as a ROD homolog. ROD and NAG contain an N-terminal beta propeller followed by an alpha solenoid, which is the architecture of certain nucleoporins and vesicle coat subunits, suggesting a distant evolutionary relationship. ZW10 binding to ROD and NAG is mutually exclusive. The resulting ZW10 complexes (RZZ and NRZ) respectively contain ZWILCH and RINT1 as additional subunits. The X-ray structure of ZWILCH, the first for an RZZ subunit, reveals a novel fold distinct from RINT1's. The evolutionarily conserved NRZ likely acts as a tethering complex for retrograde trafficking of COPI vesicles from the Golgi to the endoplasmic reticulum. The RZZ, limited to metazoans, probably evolved from the NRZ, exploiting the dynein-binding capacity of ZW10 to direct dynein to kinetochores.
Affiliation(s)
- Filiz Civril
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, I-20139 Milan, Italy
|
1576
|
Zhang J, Wang Q, Barz B, He Z, Kosztin I, Shang Y, Xu D. MUFOLD: A new solution for protein 3D structure prediction. Proteins 2010; 78:1137-52. [PMID: 19927325 DOI: 10.1002/prot.22634] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 11/10/2022]
Abstract
There have been steady improvements in protein structure prediction during the past 2 decades. However, current methods are still far from consistently predicting structural models accurately with computing power accessible to common users. Toward achieving more accurate and efficient structure prediction, we developed a number of novel methods and integrated them into a software package, MUFOLD. First, a systematic protocol was developed to identify useful templates and fragments from the Protein Data Bank for a given target protein. Then, an efficient process was applied for iterative coarse-grain model generation and evaluation at the Cα or backbone level. In this process, we construct models using interresidue spatial restraints derived from alignments by multidimensional scaling, evaluate and select models through clustering and static scoring functions, and iteratively improve the selected models by integrating spatial restraints and previous models. Finally, the full-atom models were evaluated using molecular dynamics simulations based on structural changes under simulated heating. We have continuously improved the performance of MUFOLD by using a benchmark of 200 proteins from the Astral database, where no template with >25% sequence identity to any target protein is included. The average root-mean-square deviation of the best models from the native structures is 4.28 Å, which shows significant and systematic improvement over our previous methods. The computing time of MUFOLD is much shorter than many other tools, such as Rosetta. MUFOLD demonstrated some success in the 2008 community-wide experiment for protein structure prediction CASP8.
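The accuracy figure quoted in this abstract is a root-mean-square deviation between model and native coordinates. As a rough illustration of that metric only, the Python sketch below computes an RMSD over matched atom positions. It assumes the two structures are already superimposed and does no fitting; the toy coordinates are hypothetical, and nothing here reproduces the MUFOLD evaluation pipeline.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the structures are already optimally superimposed; no fitting is done here.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, native):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))


# Toy coordinates in angstroms (hypothetical, three pseudo-atoms on a line).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
print(round(rmsd(model, native), 3))  # sqrt((0 + 1 + 4) / 3), about 1.291
```

In practice the model is first rotated and translated onto the native structure (for example with a least-squares superposition) before this sum is taken, so the value reported is the minimum over rigid-body placements.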
Affiliation(s)
- Jingfen Zhang
- Department of Computer Science, University of Missouri, Columbia, Missouri 65211, USA
|
1577
|
Kim BH, Cong Q, Grishin NV. HangOut: generating clean PSI-BLAST profiles for domains with long insertions. Bioinformatics 2010; 26:1564-5. [PMID: 20413635 PMCID: PMC2881392 DOI: 10.1093/bioinformatics/btq208] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Summary: Profile-based similarity search is an essential step in structure-function studies of proteins. However, inclusion of non-homologous sequence segments into a profile causes its corruption and results in false positives. Profile corruption is common in multidomain proteins, and single domains with long insertions are a significant source of errors. We developed a procedure (HangOut) that, for a single domain with specified insertion position, cleans erroneously extended PSI-BLAST alignments to generate better profiles. Availability: HangOut is implemented in Python 2.3 and runs on all Unix-compatible platforms. The source code is available under the GNU GPL license at http://prodata.swmed.edu/HangOut/ Contact: kim@chop.swmed.edu; grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bong-Hyun Kim
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA.
|
1578
|
Jeong CS, Kim D. Linear predictive coding representation of correlated mutation for protein sequence alignment. BMC Bioinformatics 2010; 11 Suppl 2:S2. [PMID: 20406500 PMCID: PMC3165164 DOI: 10.1186/1471-2105-11-s2-s2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/11/2022]
Abstract
Background Although both conservation and correlated mutation (CM) carry important information reflecting different sorts of context in a multiple sequence alignment, most alignment methods use sequence profiles that represent only conservation. There is as yet no general way to represent correlated mutation and incorporate it into sequence alignment. Methods We develop a novel method, CM profile, to represent correlated mutation as a spectral feature derived by linear predictive coding, in which correlated mutations among different positions are represented by a fixed number of values. We combine CM profile with the conventional sequence profile to improve alignment quality. Results For distantly related protein pairs, using CM profile improves the profile-profile alignment with or without predicted secondary structure. In particular, at the superfamily level, combining CM profile with sequence profile improves profile-profile alignment by 9.5%, while predicted secondary structure improves it by 6.0%. More significantly, using both of them improves profile-profile alignment by 13.9%. We also exemplify the effectiveness of CM profile by demonstrating that the resulting alignment preserves shared coevolution and contacts. Conclusions In this work, we introduce a novel method, CM profile, which represents correlated mutation information in a parallel form, and apply it to the protein sequence alignment problem. When combined with the conventional sequence profile, CM profile improves alignment quality significantly more than predicted secondary structure information does, which should be beneficial for target-template alignment in protein structure prediction. Because of its generality, CM profile can be used for other bioinformatics applications in the same way as a sequence profile.
Affiliation(s)
- Chan-seok Jeong
- Department of Bio and Brain Engineering, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon, 305-701, Korea
|
1579
|
Abstract
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co-occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.
Affiliation(s)
- Vikram Alva
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen 72076, Germany
|
1580
|
Abstract
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing of online server systems for the state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Affiliation(s)
- Ambrish Roy
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Alper Kucukural
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
|
1581
|
Wang Z, Eickholt J, Cheng J. MULTICOM: a multi-level combination approach to protein structure prediction and its assessments in CASP8. Bioinformatics 2010; 26:882-8. [PMID: 20150411 PMCID: PMC2844995 DOI: 10.1093/bioinformatics/btq058] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 10/30/2009] [Revised: 02/02/2010] [Accepted: 02/08/2010] [Indexed: 11/14/2022]
Abstract
MOTIVATION Protein structure prediction is one of the most important problems in structural bioinformatics. Here we describe MULTICOM, a multi-level combination approach to improve the various steps in protein structure prediction. In contrast to those methods which look for the best templates, alignments and models, our approach tries to combine complementary and alternative templates, alignments and models to achieve on average better accuracy. RESULTS The multi-level combination approach was implemented via five automated protein structure prediction servers and one human predictor which participated in the eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8), 2008. The MULTICOM servers and human predictor were consistently ranked among the top predictors on the CASP8 benchmark. The methods can predict moderate- to high-resolution models for most template-based targets and low-resolution models for some template-free targets. The results show that the multi-level combination of complementary templates, alternative alignments and similar models aided by model quality assessment can systematically improve both template-based and template-free protein modeling. AVAILABILITY The MULTICOM server is freely available at http://casp.rnet.missouri.edu/multicom_3d.html .
Affiliation(s)
- Zheng Wang
- Department of Computer Science, Informatics Institute and C. Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
1582
|
Anantharaman V, Zhang D, Aravind L. OST-HTH: a novel predicted RNA-binding domain. Biol Direct 2010; 5:13. [PMID: 20302647 PMCID: PMC2848206 DOI: 10.1186/1745-6150-5-13] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Received: 02/20/2010] [Accepted: 03/19/2010] [Indexed: 02/02/2023]
Abstract
Background The mechanism by which the arthropod Oskar and vertebrate TDRD5/TDRD7 proteins nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. Using sequence profile searches we identify a novel domain in these proteins that is widely conserved across eukaryotes and bacteria. Results Using contextual information from domain architectures, sequence-structure superpositions and available functional information we predict that this domain is likely to adopt the winged helix-turn-helix fold and bind RNA with a potential specificity for dsRNA. We show that in eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid- interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Conclusions Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized domain (DUF88). We present evidence that it is an RNAse belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains and might be recruited to degrade certain RNAs. Reviewers This article was reviewed by Sandor Pongor and Arcady Mushegian.
Affiliation(s)
- Vivek Anantharaman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
1583
|
González JM, Esteban M. A poxvirus Bcl-2-like gene family involved in regulation of host immune response: sequence similarity and evolutionary history. Virol J 2010; 7:59. [PMID: 20230632 PMCID: PMC2907574 DOI: 10.1186/1743-422x-7-59] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Received: 12/21/2009] [Accepted: 03/15/2010] [Indexed: 01/28/2023]
Abstract
BACKGROUND: Poxviruses evade the immune system of the host through the action of virally encoded inhibitors that block various signalling pathways. The exact number of viral inhibitors is not yet known. Several members of the vaccinia virus A46 and N1 families, with a Bcl-2-like structure, are involved in the regulation of the host innate immune response where they act non-redundantly at different levels of the Toll-like receptor signalling pathway. N1 also maintains an anti-apoptotic effect by acting similarly to cellular Bcl-2 proteins. Whether there are related families that could have similar functions is the main subject of this investigation. RESULTS: We describe the sequence similarity existing among poxvirus A46, N1, N2 and C1 protein families, which share a common domain of approximately 110-140 amino acids at their C-termini that spans the entire N1 sequence. Secondary structure and fold recognition predictions suggest that this domain presents an all-alpha-helical fold compatible with the Bcl-2-like structures of vaccinia virus proteins N1, A52, B15 and K7. We propose that these protein families should be merged into a single one. We describe the phylogenetic distribution of this family and reconstruct its evolutionary history, which indicates an extensive gene gain in ancestral viruses and a further stabilization of its gene content. CONCLUSIONS: Based on the sequence/structure similarity, we propose that other members with unknown function, like vaccinia virus N2, C1, C6 and C16/B22, might have a similar role in the suppression of host immune response as A46, A52, B15 and K7, by antagonizing the TLR signalling pathways at different levels.
Affiliation(s)
- José M González
- Department of Molecular and Cellular Biology, Centro Nacional de Biotecnología-CSIC, Darwin 3, 28049 Madrid, Spain
|
1584
|
Chugunov AO, Efremov RG. [Prediction of the spatial structure of proteins: emphasis on membrane targets]. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2010; 35:744-60. [PMID: 20208575 DOI: 10.1134/s106816200906003x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Knowledge of the spatial structure of proteins is a prerequisite for both understanding their functional mechanisms and providing a framework for rational drug discovery and design. Meanwhile, direct structural determination is often hampered or impractical due to the complexity, expense, and limited capabilities of experimental techniques. These issues are especially pronounced for integral membrane proteins. On numerous occasions, the theoretical prediction of protein structures may facilitate the process by exploiting physical or empirical principles. This paper surveys modern techniques for the prediction of the spatial structure of proteins using computer algorithms, and the main emphasis is placed on the most "complex" targets - membrane proteins (MPs). The first part of the review describes de novo methods based on empirical physical principles; in the second part, a comparative modeling philosophy, which accounts for the structure of related proteins, is described. Special focus is placed on pharmacologically relevant classes of G-protein-coupled receptors, receptor tyrosine kinases, and other MPs. Algorithms for the assessment of model quality and potential fields of application of computer models are discussed.
|
1585
|
The orf virus inhibitor of apoptosis functions in a Bcl-2-like manner, binding and neutralizing a set of BH3-only proteins and active Bax. Apoptosis 2010; 14:1317-30. [PMID: 19779821 DOI: 10.1007/s10495-009-0403-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 11/27/2022]
Abstract
We have previously shown that the Orf virus protein, ORFV125, is a potent inhibitor of the mitochondrial pathway of apoptosis and displays rudimentary sequence similarities to cellular anti-apoptotic Bcl-2 proteins. Here we investigate the proposal that ORFV125 acts in a Bcl-2-like manner to inhibit apoptosis. We show that the viral protein interacted with a range of BH3-only proteins (Bik, Puma, DP5, Noxa and all 3 isoforms of Bim) and neutralized their pro-apoptotic activity. In addition, ORFV125 bound to the active, but not the inactive, form of Bax, and reduced the formation of Bax dimers. Mutation of specific amino acids in ORFV125 that are conserved and functionally important in mammalian Bcl-2 family proteins led to loss of both binding and inhibitory functions. We conclude that ORFV125's mechanism of action is Bcl-2-like and propose that the viral protein's combined ability to bind to a range of BH3-only proteins as well as the active form of Bax provides significant protection against apoptosis. Furthermore, we demonstrate that the binding profile of ORFV125 is distinct to that of other poxviral Bcl-2-like proteins.
|
1586
|
Lopes A, Amarir-Bouhram J, Faure G, Petit MA, Guerois R. Detection of novel recombinases in bacteriophage genomes unveils Rad52, Rad51 and Gp2.5 remote homologs. Nucleic Acids Res 2010; 38:3952-62. [PMID: 20194117 PMCID: PMC2896510 DOI: 10.1093/nar/gkq096] [Citation(s) in RCA: 95] [Impact Index Per Article: 6.3] [Indexed: 01/02/2023] Open
Abstract
Homologous recombination is a key contributor to bacteriophage genome repair, circularization and replication. No fewer than six kinds of recombinase genes have been reported so far in bacteriophage genomes, two (UvsX and Gp2.5) from virulent, and four (Sak, Redβ, Erf and Sak4) from temperate phages. Using profile–profile comparisons, structure-based modelling and gene-context analyses, we provide new views on the global landscape of recombinases in 465 bacteriophages. We show that Sak, Redβ and Erf belong to a common large superfamily adopting a shortcut Rad52-like fold. Remote homologs of Sak4 are predicted to adopt a shortcut Rad51/RecA fold and are found to be widespread among phage genomes. Unexpectedly, within temperate phages, gene-context analyses also pinpointed the presence of distant Gp2.5 homologs, believed to be restricted to virulent phages. All in all, three major superfamilies of phage recombinases emerged, related to either Rad52-like, Rad51-like or Gp2.5-like proteins. For two newly detected recombinases belonging to the Sak4 and Gp2.5 families, we provide experimental evidence of their recombination activity in vivo. Temperate versus virulent lifestyle together with the importance of genome mosaicism is discussed in the light of these novel recombinases. Screening for these recombinases in genomes can be performed at http://biodev.extra.cea.fr/virfam.
Affiliation(s)
- Anne Lopes
- CEA, iBiTecS, F-91191 Gif sur Yvette, France
|
1587
|
Kececioglu J, Kim E, Wheeler T. Aligning Protein Sequences with Predicted Secondary Structure. J Comput Biol 2010; 17:561-80. [DOI: 10.1089/cmb.2009.0222] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Affiliation(s)
- John Kececioglu
- Department of Computer Science, University of Arizona, Tucson, Arizona 85721
- Eagu Kim
- Work done while at Department of Computer Science, University of Arizona. Present affiliation: Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin 53706
- Travis Wheeler
- Work done while at Department of Computer Science, University of Arizona. Present affiliation: Janelia Farm Research Campus, Ashburn, Virginia 20147
|
1588
|
Ortore G, Di Colo F, Martinelli A. Docking of hydroxamic acids into HDAC1 and HDAC8: a rationalization of activity trends and selectivities. J Chem Inf Model 2010; 49:2774-85. [PMID: 19947584 DOI: 10.1021/ci900288e] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/30/2022]
Abstract
A docking protocol using Gold software was developed to predict the binding disposition of histone deacetylase (HDAC) inhibitors, starting from the X-ray structures of HDAC8. The optimized procedure was subsequently utilized to dock into HDAC8 and into a homology model of HDAC1 nearly 40 compounds that had been tested for their inhibitory activity against the two HDAC isozymes. Evaluation of the best binding poses allowed us to identify the ligand properties and the protein residues important for activity and selectivity. HDACs are important anticancer drug targets, and their study is currently being actively pursued. As such, our results could help design new isozyme-selective HDAC inhibitors. Furthermore, this strategy may also be used for the investigation of other HDACs.
Affiliation(s)
- Gabriella Ortore
- Dipartimento di Scienze Farmaceutiche, Universita di Pisa, via Bonanno 6, 56126 Pisa, Italy
|
1589
|
Margelevicius M, Venclovas C. Detection of distant evolutionary relationships between protein families using theory of sequence profile-profile comparison. BMC Bioinformatics 2010; 11:89. [PMID: 20158924 PMCID: PMC2837030 DOI: 10.1186/1471-2105-11-89] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 08/28/2009] [Accepted: 02/17/2010] [Indexed: 01/31/2023] Open
Abstract
Background: Detection of common evolutionary origin (homology) is a primary means of inferring protein structure and function. At present, comparison of protein families represented as sequence profiles is arguably the most effective homology detection strategy. However, finding the best way to represent evolutionary information of a protein sequence family in the profile, to compare profiles and to estimate the biological significance of such comparisons, remains an active area of research. Results: Here, we present a new homology detection method based on sequence profile-profile comparison. The method has a number of new features including position-dependent gap penalties and a global score system. Position-dependent gap penalties provide a more biologically relevant way to represent and align protein families as sequence profiles. The global score system enables an analytical solution of the statistical parameters needed to estimate the statistical significance of profile-profile similarities. The new method, together with other state-of-the-art profile-based methods (HHsearch, COMPASS and PSI-BLAST), is benchmarked in all-against-all comparison of a challenging set of SCOP domains that share at most 20% sequence identity. For benchmarking, we use a reference ("gold standard") free model-based evaluation framework. Evaluation results show that at the level of protein domains our method compares favorably to all other tested methods. We also provide examples of the new method outperforming structure-based similarity detection and alignment. The implementation of the new method both as a standalone software package and as a web server is available at http://www.ibt.lt/bioinformatics/coma. Conclusion: Due to a number of developments, the new profile-profile comparison method shows an improved ability to match distantly related protein domains. Therefore, the method should be useful for annotation and homology modeling of uncharacterized proteins.
|
1590
|
Fokkens L, Botelho SMC, Boekhorst J, Snel B. Enrichment of homologs in insignificant BLAST hits by co-complex network alignment. BMC Bioinformatics 2010; 11:86. [PMID: 20152020 PMCID: PMC2836305 DOI: 10.1186/1471-2105-11-86] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/19/2009] [Accepted: 02/12/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Homology is a crucial concept in comparative genomics. The algorithm probably most widely used for homology detection in comparative genomics is BLAST. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits are discarded that are in fact homologous. RESULTS: Analogous to the use of the genomics context in genome alignments, we test whether conserved functional context can be used to select candidate homologs from insignificant BLAST hits. We make a co-complex network alignment between complex subunits in yeast and human and find that proteins with an insignificant BLAST hit that are part of homologous complexes are likely to be homologous themselves. Further analysis of the distant homologs we recovered using the co-complex network alignment shows that a large majority of these distant homologs are in fact ancient paralogs. CONCLUSIONS: Our results show that, even though evolution takes place at the sequence and genome level, co-complex networks can be used as circumstantial evidence to improve confidence in the homology of distantly related sequences.
Affiliation(s)
- Like Fokkens
- Theoretical Biology and Bioinformatics group, Department of Biology, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, Utrecht, the Netherlands.
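The selection idea in this abstract (keep an insignificant BLAST hit when both proteins sit in complexes that are already linked by a significant hit) can be illustrated with a small sketch. This is not the authors' network-alignment procedure: the function name `rescue_hits`, the use of an E-value threshold as the "score cutoff", and the dictionary layout of the complex data are all assumptions made for illustration.

```python
from itertools import product


def _membership(complexes):
    """Invert {complex_id: {proteins}} into {protein: {complex_ids}}."""
    index = {}
    for complex_id, members in complexes.items():
        for protein in members:
            index.setdefault(protein, set()).add(complex_id)
    return index


def rescue_hits(hits, evalue_cutoff, complexes_a, complexes_b):
    """Return insignificant hits supported by co-complex context.

    hits: iterable of (protein_a, protein_b, evalue) tuples (hypothetical layout).
    complexes_a / complexes_b: {complex_id: set of subunit names} per species.
    """
    in_a, in_b = _membership(complexes_a), _membership(complexes_b)

    # Complex pairs already linked by at least one significant hit.
    linked = set()
    for a, b, evalue in hits:
        if evalue <= evalue_cutoff:
            linked.update(product(in_a.get(a, ()), in_b.get(b, ())))

    # Insignificant hits whose two proteins fall into a linked complex pair.
    rescued = []
    for a, b, evalue in hits:
        if evalue > evalue_cutoff:
            pairs = product(in_a.get(a, ()), in_b.get(b, ()))
            if any(pair in linked for pair in pairs):
                rescued.append((a, b, evalue))
    return rescued
```

In this toy form a single significant subunit pair is enough to link two complexes; a stricter rule (for example, requiring several linked subunits) would be a natural variation.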
|
1591
|
Madera M, Calmus R, Thiltgen G, Karplus K, Gough J. Improving protein secondary structure prediction using a simple k-mer model. Bioinformatics 2010; 26:596-602. [PMID: 20130034 PMCID: PMC2828123 DOI: 10.1093/bioinformatics/btq020] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 12/04/2022] Open
Abstract
Motivation: Some first order methods for protein sequence analysis inherently treat each position as independent. We develop a general framework for introducing longer range interactions. We then demonstrate the power of our approach by applying it to secondary structure prediction; under the independence assumption, sequences produced by existing methods can produce features that are not protein like, an extreme example being a helix of length 1. Our goal was to make the predictions from state of the art methods more realistic, without loss of performance by other measures. Results: Our framework for longer range interactions is described as a k-mer order model. We succeeded in applying our model to the specific problem of secondary structure prediction, to be used as an additional layer on top of existing methods. We achieved our goal of making the predictions more realistic and protein like, and remarkably this also improved the overall performance. We improve the Segment OVerlap (SOV) score by 1.8%, but more importantly we radically improve the probability of the real sequence given a prediction from an average of 0.271 per residue to 0.385. Crucially, this improvement is obtained using no additional information. Availability: http://supfam.cs.bris.ac.uk/kmer Contact: gough@cs.bris.ac.uk
Affiliation(s)
- Martin Madera
- Department of Computer Science, University of Bristol, Woodland Road, Bristol BS8 1UB, UK
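The "helix of length 1" mentioned in this abstract is the kind of non-protein-like feature that a per-position predictor can emit. A minimal sketch of how such features might be flagged is given below. It is only a run-length diagnostic, not the k-mer model described in the paper; the three-state alphabet (H, E, C) and the minimum lengths passed in are illustrative assumptions.

```python
from itertools import groupby


def short_segments(ss, min_len):
    """Return (start, state, length) for runs shorter than the state's minimum.

    ss: per-residue secondary structure string, e.g. "CCHHHHCCEEEC".
    min_len: {state: minimum plausible run length}; states not listed are
             never flagged. The thresholds are hypothetical, chosen by the caller.
    """
    flagged = []
    position = 0
    for state, run in groupby(ss):
        length = sum(1 for _ in run)
        if length < min_len.get(state, 1):
            flagged.append((position, state, length))
        position += length
    return flagged
```

For example, `short_segments("CCHCCEEEECC", {"H": 3, "E": 2})` reports the isolated single-residue helix at index 2 and leaves the four-residue strand alone.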
|
1592
|
Lane WJ, Darst SA. Molecular evolution of multisubunit RNA polymerases: sequence analysis. J Mol Biol 2010; 395:671-85. [PMID: 19895820 PMCID: PMC2813377 DOI: 10.1016/j.jmb.2009.10.062] [Citation(s) in RCA: 139] [Impact Index Per Article: 9.3] [Received: 07/09/2009] [Revised: 10/22/2009] [Accepted: 10/26/2009] [Indexed: 11/21/2022]
Abstract
Transcription in all cellular organisms is performed by multisubunit, DNA-dependent RNA polymerases that synthesize RNA from DNA templates. Previous sequence and structural studies have elucidated the importance of shared regions common to all multisubunit RNA polymerases. In addition, RNA polymerases contain multiple lineage-specific domain insertions involved in protein-protein and protein-nucleic acid interactions. We have created comprehensive multiple sequence alignments using all available sequence data for the multisubunit RNA polymerase large subunits, including the bacterial beta and beta' subunits and their homologs from archaebacterial RNA polymerases, the eukaryotic RNA polymerases I, II, and III, the nuclear-cytoplasmic large double-stranded DNA virus RNA polymerases, and plant plastid RNA polymerases. To overcome technical difficulties inherent to the large-subunit sequences, including large sequence length, small and large lineage-specific insertions, split subunits, and fused proteins, we created an automated and customizable sequence retrieval and processing system. In addition, we used our alignments to create a more expansive set of shared sequence regions and bacterial lineage-specific domain insertions. We also analyzed the intergenic gap between the bacterial beta and beta' genes.
Affiliation(s)
- William J. Lane
- The Rockefeller University, Box 224, 1230 York Avenue, New York, NY 10021, USA
- Seth A. Darst
- The Rockefeller University, Box 224, 1230 York Avenue, New York, NY 10021, USA
|
1593
|
Remmert M, Biegert A, Linke D, Lupas AN, Söding J. Evolution of outer membrane beta-barrels from an ancestral beta beta hairpin. Mol Biol Evol 2010; 27:1348-58. [PMID: 20106904 DOI: 10.1093/molbev/msq017] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022] Open
Abstract
Outer membrane beta-barrels (OMBBs) are the major class of outer membrane proteins from Gram-negative bacteria, mitochondria, and plastids. Their transmembrane domains consist of 8-24 beta-strands forming a closed, barrel-shaped beta-sheet around a central pore. Despite their obvious structural regularity, evidence for an origin by duplication or for a common ancestry has not been found. We use three complementary approaches to show that all OMBBs from Gram-negative bacteria evolved from a single, ancestral beta beta hairpin. First, we link almost all families of known single-chain bacterial OMBBs with each other through transitive profile searches. Second, we identify a clear repeat signature in the sequences of many OMBBs in which the repeating sequence unit coincides with the structural beta beta hairpin repeat. Third, we show that the observed sequence similarity between OMBB hairpins cannot be explained by structural or membrane constraints on their sequences. The third approach addresses a longstanding problem in protein evolution: how to distinguish between a very remotely homologous relationship and the opposing scenario of "sequence convergence." The origin of a diverse group of proteins from a single hairpin module supports the hypothesis that, around the time of transition from the RNA to the protein world, proteins arose by amplification and recombination of short peptide modules that had previously evolved as cofactors of RNAs.
Affiliation(s)
- M Remmert
- Department of Biochemistry, Gene Center Munich and Center for Integrated Protein Science (CIPSM), Ludwig-Maximilians-Universität München, Munich, Germany
|
1594
|
Hildebrand A, Remmert M, Biegert A, Söding J. Fast and accurate automatic structure prediction with HHpred. Proteins 2010; 77 Suppl 9:128-32. [PMID: 19626712 DOI: 10.1002/prot.22499] [Citation(s) in RCA: 355] [Impact Index Per Article: 23.7] [Indexed: 11/07/2022]
Abstract
Automated protein structure prediction is becoming a mainstream tool for biological research. This has been fueled by steady improvements of publicly available automated servers over the last decade, in particular their ability to build good homology models for an increasing number of targets by reliably detecting and aligning more and more remotely homologous templates. Here, we describe the three fully automated versions of the HHpred server that participated in the community-wide blind protein structure prediction competition CASP8. What makes HHpred unique is the combination of usability, short response times (typically under 15 min) and a model accuracy that is competitive with those of the best servers in CASP8.
Affiliation(s)
- Andrea Hildebrand
- Gene Center and Center for Integrated Protein Science (Munich), Ludwig-Maximilians-University Munich, 81377 Munich, Germany
|
1595
|
Abstract
Yeast exonuclease 5 is encoded by the YBR163w (DEM1) gene, and this gene has been renamed EXO5. It is distantly related to the Escherichia coli RecB exonuclease class. Exo5 is localized to the mitochondria, and EXO5 deletions or nuclease-defective EXO5 mutants invariably yield petites, amplifying either the ori3 or ori5 region of the mitochondrial genome. These petites remain unstable and undergo continuous rearrangement. The mitochondrial phenotype of exo5Δ strains suggests an essential role for the enzyme in DNA replication and recombination. No nuclear phenotype associated with EXO5 deletions has been detected. Exo5 is a monomeric 5' exonuclease that releases dinucleotides as products. It is specific for single-stranded DNA and does not hydrolyze RNA. However, Exo5 has the capacity to slide across 5' double-stranded DNA or 5' RNA sequences and resumes cutting two nucleotides downstream of the double-stranded-to-single-stranded junction or RNA-to-DNA junction, respectively.
|
1596
|
Santarella-Mellwig R, Franke J, Jaedicke A, Gorjanacz M, Bauer U, Budd A, Mattaj IW, Devos DP. The compartmentalized bacteria of the planctomycetes-verrucomicrobia-chlamydiae superphylum have membrane coat-like proteins. PLoS Biol 2010; 8:e1000281. [PMID: 20087413 PMCID: PMC2799638 DOI: 10.1371/journal.pbio.1000281] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.5] [Received: 05/28/2009] [Accepted: 12/08/2009] [Indexed: 02/06/2023] Open
Abstract
Compartmentalized bacteria have proteins that are structurally related to eukaryotic membrane coats, and one of these proteins localizes at the membrane of vesicles formed inside bacterial cells. The development of the endomembrane system was a major step in eukaryotic evolution. Membrane coats, which exhibit a unique arrangement of β-propeller and α-helical repeat domains, play key roles in shaping eukaryotic membranes. Such proteins are likely to have been present in the ancestral eukaryote but cannot be detected in prokaryotes using sequence-only searches. We have used a structure-based detection protocol to search all proteomes for proteins with this domain architecture. Apart from the eukaryotes, we identified this protein architecture only in the Planctomycetes-Verrucomicrobia-Chlamydiae (PVC) bacterial superphylum, many members of which share a compartmentalized cell plan. We determined that one such protein is partly localized at the membranes of vesicles formed inside the cells in the planctomycete Gemmata obscuriglobus. Our results demonstrate similarities between bacterial and eukaryotic compartmentalization machinery, suggesting that the bacterial PVC superphylum contributed significantly to eukaryogenesis. Despite decades of research, the origin of eukaryotic cells remains an unsolved issue. The endomembrane system defines the eukaryotic cell, and its origin is linked to that of eukaryotes. A search was conducted within all known sequences for proteins that are characteristic of the eukaryotic endomembrane system, using a combination of fold types that is uniquely found in the membrane coat proteins. Outside eukaryotes, such proteins were solely found in the Planctomycetes-Verrucomicrobia-Chlamydiae (PVC) bacterial superphylum. By immuno-electron microscopy, one of these bacterial proteins was found to localize adjacent to the membranes of vesicles found within the cells of one member of the PVC superphylum. 
Thus, there appear to be similarities between bacterial and eukaryotic compartmentalization systems, suggesting that the bacterial PVC superphylum may have contributed significantly to eukaryogenesis.
Affiliation(s)
- Josef Franke
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, New York, United States of America
- Ulrike Bauer
- European Molecular Biology Laboratory, Heidelberg, Germany
- Aidan Budd
- European Molecular Biology Laboratory, Heidelberg, Germany
- Iain W. Mattaj
- European Molecular Biology Laboratory, Heidelberg, Germany
- Damien P. Devos
- European Molecular Biology Laboratory, Heidelberg, Germany
|
1597
|
Raman S, Vernon R, Thompson J, Tyka M, Sadreyev R, Pei J, Kim D, Kellogg E, DiMaio F, Lange O, Kinch L, Sheffler W, Kim BH, Das R, Grishin NV, Baker D. Structure prediction for CASP8 with all-atom refinement using Rosetta. Proteins 2010; 77 Suppl 9:89-99. [PMID: 19701941 DOI: 10.1002/prot.22540] [Citation(s) in RCA: 378] [Impact Index Per Article: 25.2] [Indexed: 11/08/2022]
Abstract
We describe predictions made using the Rosetta structure prediction methodology for the Eighth Critical Assessment of Techniques for Protein Structure Prediction. Aggressive sampling and all-atom refinement were carried out for nearly all targets. A combination of alignment methodologies was used to generate starting models from a range of templates, and the models were then subjected to Rosetta all atom refinement. For the 64 domains with readily identified templates, the best submitted model was better than the best alignment to the best template in the Protein Data Bank for 24 cases, and improved over the best starting model for 43 cases. For 13 targets where only very distant sequence relationships to proteins of known structure were detected, models were generated using the Rosetta de novo structure prediction methodology followed by all-atom refinement; in several cases the submitted models were better than those based on the available templates. Of the 12 refinement challenges, the best submitted model improved on the starting model in seven cases. These improvements over the starting template-based models and refinement tests demonstrate the power of Rosetta structure refinement in improving model accuracy.
Affiliation(s)
- Srivatsan Raman
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
|
1598
|
Abstract
The I-TASSER algorithm for 3D protein structure prediction was tested in CASP8, with the procedure fully automated in both the Server and Human sections. The quality of the server models is close to that of the human ones, but the human predictions incorporate more diverse templates from other servers, which improves the human predictions for some of the distant homology targets. For the first time, the sequence-based contact predictions from machine learning techniques are found helpful for both template-based modeling (TBM) and template-free modeling (FM). In TBM, although the accuracy of the sequence-based contact predictions is on average lower than that of the template-based ones, the novel contacts in the sequence-based predictions, which are complementary to the threading templates in the weakly aligned or unaligned regions, are important for improving the global and local packing in these regions. Moreover, the newly developed atomic structural refinement algorithm was tested in CASP8 and found to improve the hydrogen-bonding networks and the overall TM-score, which is mainly due to its ability to remove steric clashes so that the models can be generated from cluster centroids. Nevertheless, one of the major issues of the I-TASSER pipeline is model selection: the best models could not be appropriately recognized when the correct templates were detected by only a minority of the threading algorithms. There are also problems related to domain splitting and mirror-image recognition, which mainly influence the performance of I-TASSER modeling in the FM-based structure predictions.
Affiliation(s)
- Yang Zhang
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, Lawrence, Kansas 66047, USA.
|
1599
|
Frank K, Gruber M, Sippl MJ. COPS Benchmark: interactive analysis of database search methods. Bioinformatics 2010; 26:574-5. [DOI: 10.1093/bioinformatics/btp712] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
|
1600
|
Khan F, Furuta Y, Kawai M, Kaminska KH, Ishikawa K, Bujnicki JM, Kobayashi I. A putative mobile genetic element carrying a novel type IIF restriction-modification system (PluTI). Nucleic Acids Res 2010; 38:3019-30. [PMID: 20071747 PMCID: PMC2875022 DOI: 10.1093/nar/gkp1221] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 01/24/2023] Open
Abstract
Genome comparison and genome context analysis were used to find a putative mobile element in the genome of Photorhabdus luminescens, an entomopathogenic bacterium. The element is composed of 16-bp direct repeats in the terminal regions, which are identical to a part of insertion sequences (ISs), a DNA methyltransferase gene homolog, two genes of unknown functions and an open reading frame (ORF) (plu0599) encoding a protein with no detectable sequence similarity to any known protein. The ORF (plu0599) product showed DNA endonuclease activity, when expressed in a cell-free expression system. Subsequently, the protein, named R.PluTI, was expressed in vivo, purified and found to be a novel type IIF restriction enzyme that recognizes 5′-GGCGC/C-3′ (/ indicates position of cleavage). R.PluTI cleaves a two-site supercoiled substrate at both the sites faster than a one-site supercoiled substrate. The modification enzyme homolog encoded by plu0600, named M.PluTI, was expressed in Escherichia coli and shown to protect DNA from R.PluTI cleavage in vitro, and to suppress the lethal effects of R.PluTI expression in vivo. These results suggested that they constitute a restriction–modification system, present on the putative mobile element. Our approach thus allowed detection of a previously uncharacterized family of DNA-interacting proteins.
Affiliation(s)
- Feroz Khan
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Japan
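The recognition sequence reported in this abstract, 5′-GGCGC/C-3′, can be located in a DNA string with a few lines of code. The sketch below is a toy in-silico digest of the top strand only: the function names are hypothetical, the sequence is treated as linear, and the two-site cooperativity that characterizes type IIF enzymes is deliberately ignored.

```python
import re

SITE = "GGCGCC"   # recognition sequence from the abstract: 5'-GGCGC/C-3'
CUT_OFFSET = 5    # the slash places the cut after the fifth base of the site


def cut_positions(seq):
    """Return 0-based indices at which the top strand would be cut."""
    # A lookahead match reports every site start, even if sites were to overlap.
    pattern = re.compile(f"(?={SITE})")
    return [m.start() + CUT_OFFSET for m in pattern.finditer(seq.upper())]


def digest(seq):
    """Split a linear sequence into fragments at every cut position."""
    bounds = [0, *cut_positions(seq), len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]
```

Applied to `"AAGGCGCCTT"`, the single site starts at index 2, so the cut falls at index 7 and the fragments are `"AAGGCGC"` and `"CTT"`.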
|