1
Ballard A, Bieniek S, Carlini DB. The fitness consequences of synonymous mutations in Escherichia coli: Experimental evidence for a pleiotropic effect of translational selection. Gene 2019; 694:111-120. [PMID: 30738968 DOI: 10.1016/j.gene.2019.01.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/06/2018] [Revised: 12/21/2018] [Accepted: 01/22/2019] [Indexed: 01/06/2023]
Abstract
Codon usage bias (CUB) is a universal feature of genomes, and in most species CUB of protein coding genes is positively correlated with expression level and degree of evolutionary conservation. There is mounting experimental evidence that CUB is due in part to selection for translational efficiency and/or accuracy, i.e., translational selection. However, there is a paucity of experimental data on whether and how CUB acts in trans - does the usage of preferred codons in a highly expressed gene affect the translation of other genes by freeing up more ribosomes, thereby increasing their availability to translate all mRNA transcripts in the cell? We investigated this question by creating two extreme versions of the highly expressed Escherichia coli β-lactamase (bla) gene, one comprised almost entirely of unpreferred codons, and a second comprised almost entirely of preferred codons. We monitored the fitness effects of these synonymous mutations over hundreds of generations in two selective environments that allowed us to disentangle translational effects acting in cis from those acting in trans. In a selective environment for maximizing translational efficiency in trans of a gene (tetA) encoding a tetracycline resistance protein, unpreferred synonymous mutations had a negative impact on long-term fitness, whereas preferred mutations had a positive impact on long-term fitness, providing strong experimental evidence for a pleiotropic effect of translational selection.
Affiliation(s)
- Anne Ballard
- Department of Biology, American University, 4400 Massachusetts Avenue, NW, Washington, DC, 20016, United States of America
- Sarah Bieniek
- Department of Biology, American University, 4400 Massachusetts Avenue, NW, Washington, DC, 20016, United States of America
- David B Carlini
- Department of Biology, American University, 4400 Massachusetts Avenue, NW, Washington, DC, 20016, United States of America
2
Sastry A, Monk J, Tegel H, Uhlen M, Palsson BO, Rockberg J, Brunk E. Machine learning in computational biology to accelerate high-throughput protein expression. Bioinformatics 2018; 33:2487-2495. [PMID: 28398465 DOI: 10.1093/bioinformatics/btx207] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/08/2016] [Accepted: 04/05/2017] [Indexed: 01/21/2023] Open
Abstract
Motivation: The Human Protein Atlas (HPA) enables the simultaneous characterization of thousands of proteins across various tissues to pinpoint their spatial location in the human body. This has been achieved through transcriptomics and high-throughput immunohistochemistry-based approaches, where over 40 000 unique human protein fragments have been expressed in E. coli. These datasets enable quantitative tracking of entire cellular proteomes and present new avenues for understanding molecular-level properties influencing expression and solubility.
Results: Combining computational biology and machine learning identifies protein properties that hinder the HPA high-throughput antibody production pipeline. We predict protein expression and solubility with accuracies of 70% and 80%, respectively, based on a subset of key properties (aromaticity, hydropathy and isoelectric point). We guide the selection of protein fragments based on these characteristics to optimize high-throughput experimentation.
Availability and implementation: We present the machine learning workflow as a series of IPython notebooks hosted on GitHub (https://github.com/SBRG/Protein_ML). The workflow can be used as a template for analysis of further expression and solubility datasets.
Contact: ebrunk@ucsd.edu or johanr@biotech.kth.se.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anand Sastry
- Department of Bioengineering, University of California, San Diego, CA, USA
- Jonathan Monk
- Department of Bioengineering, University of California, San Diego, CA, USA
- Hanna Tegel
- KTH - Royal Institute of Technology, Department of Proteomics and Nanobiotechnology, SE-106 91 Stockholm, Sweden
- Mathias Uhlen
- KTH - Royal Institute of Technology, Department of Proteomics and Nanobiotechnology, SE-106 91 Stockholm, Sweden; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
- Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, CA, USA; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
- Johan Rockberg
- KTH - Royal Institute of Technology, Department of Proteomics and Nanobiotechnology, SE-106 91 Stockholm, Sweden
- Elizabeth Brunk
- Department of Bioengineering, University of California, San Diego, CA, USA; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
3
Komar AA. The Yin and Yang of codon usage. Hum Mol Genet 2016; 25:R77-R85. [PMID: 27354349 DOI: 10.1093/hmg/ddw207] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Received: 04/12/2016] [Accepted: 06/24/2016] [Indexed: 01/07/2023] Open
Abstract
The genetic code is degenerate. With the exception of two amino acids (Met and Trp), all other amino acid residues are each encoded by multiple, so-called synonymous codons. Synonymous codons were initially presumed to have entirely equivalent functions; however, the finding that synonymous codons are not present at equal frequencies in genes/genomes suggested that codon choice might have functional implications beyond amino acid coding. The pattern of non-uniform codon use (known as codon usage bias) varies between organisms and represents a unique feature of an organism. Organism-specific codon choice is related to organism-specific differences in populations of cognate tRNAs. This implies that, in a given organism, frequently used codons will be translated more rapidly than infrequently used ones and vice versa. A theory of codon-tRNA co-evolution (necessary to balance accurate and efficient protein production) was put forward to explain the existence of codon usage bias. This model suggests that selection favours preferred (frequent) over un-preferred (rare) codons in order to sustain efficient protein production in cells and that a given un-preferred codon will have the same effect on an organism's fitness regardless of its position within an mRNA's open reading frame. However, many recent studies refute this prediction. Un-preferred codons have been found to have important functional roles and their effects appeared to be position-dependent. Synonymous codon usage affects the efficiency/stringency of mRNA decoding, mRNA biogenesis/stability, and protein secretion and folding. This review summarizes recent developments in the field that have identified novel functions of synonymous codons and their usage.
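The degeneracy described in this abstract can be checked directly against the standard genetic code. The following is a minimal Python sketch, not anything taken from the review: the function names `degeneracy` and `synonymous_fractions` and the toy leucine sequence are hypothetical, chosen only for illustration. It counts how many codons encode each amino acid and tallies the relative use of synonymous codons in a coding sequence.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest). "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy():
    """Number of sense codons encoding each amino acid."""
    return dict(Counter(aa for aa in CODON_TABLE.values() if aa != "*"))


def synonymous_fractions(cds, amino_acid):
    """Relative use of each synonymous codon for one amino acid in a CDS.

    Hypothetical helper: `cds` is read in frame from its first base and any
    trailing partial codon is ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    hits = Counter(c for c in codons if CODON_TABLE.get(c) == amino_acid)
    total = sum(hits.values())
    return {c: n / total for c, n in hits.items()} if total else {}


if __name__ == "__main__":
    counts = degeneracy()
    singletons = sorted(aa for aa, n in counts.items() if n == 1)
    print(singletons)  # amino acids with a single codon
    # Toy sequence (an assumption, not real data): four leucine codons.
    print(synonymous_fractions("CTGCTGCTGCTA", "L"))
```

Only Met (M) and Trp (W) come out with a single codon, matching the abstract's opening statement. Unequal fractions from `synonymous_fractions` on a real gene would be the simplest view of codon usage bias.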
Affiliation(s)
- Anton A Komar
- Center for Gene Regulation in Health and Disease and Department of Biological, Geological and Environmental Sciences, Cleveland State University, Cleveland, OH, USA
- Department of Biochemistry and Center for RNA Molecular Biology, Case Western Reserve University, Cleveland, OH, USA
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
4
Codon influence on protein expression in E. coli correlates with mRNA levels. Nature 2016; 529:358-363. [PMID: 26760206 DOI: 10.1038/nature16509] [Citation(s) in RCA: 294] [Impact Index Per Article: 32.7] [Received: 11/13/2014] [Accepted: 12/01/2015] [Indexed: 02/06/2023]
Abstract
Degeneracy in the genetic code, which enables a single protein to be encoded by a multitude of synonymous gene sequences, has an important role in regulating protein expression, but substantial uncertainty exists concerning the details of this phenomenon. Here we analyse the sequence features influencing protein expression levels in 6,348 experiments using bacteriophage T7 polymerase to synthesize messenger RNA in Escherichia coli. Logistic regression yields a new codon-influence metric that correlates only weakly with genomic codon-usage frequency, but strongly with global physiological protein concentrations and also mRNA concentrations and lifetimes in vivo. Overall, the codon content influences protein expression more strongly than mRNA-folding parameters, although the latter dominate in the initial ~16 codons. Genes redesigned based on our analyses are transcribed with unaltered efficiency but translated with higher efficiency in vitro. The less efficiently translated native sequences show greatly reduced mRNA levels in vivo. Our results suggest that codon content modulates a kinetic competition between protein elongation and mRNA degradation that is a central feature of the physiology and also possibly the regulation of translation in E. coli.
5
The Art of Gene Redesign and Recombinant Protein Production: Approaches and Perspectives. Topics in Medicinal Chemistry 2016. [DOI: 10.1007/7355_2016_2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]
6
The effects of codon context on in vivo translation speed. PLoS Genet 2014; 10:e1004392. [PMID: 24901308 PMCID: PMC4046918 DOI: 10.1371/journal.pgen.1004392] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Received: 12/23/2013] [Accepted: 04/04/2014] [Indexed: 11/19/2022] Open
Abstract
We developed a bacterial genetic system based on translation of the his operon leader peptide gene to determine the relative speed at which the ribosome reads single or multiple codons in vivo. Low frequency effects of so-called "silent" codon changes and codon neighbor (context) effects could be measured using this assay. An advantage of this system is that translation speed is unaffected by the primary sequence of the His leader peptide. We show that the apparent speed at which ribosomes translate synonymous codons can vary substantially even for synonymous codons read by the same tRNA species. Assaying translation through codon pairs for the 5'- and 3'- side positioning of the 64 codons relative to a specific codon revealed that the codon-pair orientation significantly affected in vivo translation speed. Codon pairs with rare arginine codons and successive proline codons were among the slowest codon pairs translated in vivo. This system allowed us to determine the effects of different factors on in vivo translation speed including Shine-Dalgarno sequence, rate of dipeptide bond formation, codon context, and charged tRNA levels.
7
Aslam F, Gardner QTAA, Zain H, Nadeem MS, Ali M, Rashid N, Akhtar M. Studies on the expression and processing of human proinsulin derivatives encoded by different DNA constructs. Biochimica et Biophysica Acta-Proteins and Proteomics 2013; 1834:2116-23. [DOI: 10.1016/j.bbapap.2013.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/09/2013] [Revised: 07/01/2013] [Accepted: 07/09/2013] [Indexed: 10/26/2022]
8
Strong purifying selection at synonymous sites in D. melanogaster. PLoS Genet 2013; 9:e1003527. [PMID: 23737754 PMCID: PMC3667748 DOI: 10.1371/journal.pgen.1003527] [Citation(s) in RCA: 151] [Impact Index Per Article: 12.6] [Received: 01/24/2013] [Accepted: 04/08/2013] [Indexed: 11/19/2022] Open
Abstract
Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in Drosophila melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
9
Schenk MF, Szendro IG, Krug J, de Visser JAGM. Quantifying the adaptive potential of an antibiotic resistance enzyme. PLoS Genet 2012; 8:e1002783. [PMID: 22761587 PMCID: PMC3386231 DOI: 10.1371/journal.pgen.1002783] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Received: 01/27/2012] [Accepted: 05/09/2012] [Indexed: 12/30/2022] Open
Abstract
For a quantitative understanding of the process of adaptation, we need to understand its "raw material," that is, the frequency and fitness effects of beneficial mutations. At present, most empirical evidence suggests an exponential distribution of fitness effects of beneficial mutations, as predicted for Gumbel-domain distributions by extreme value theory. Here, we study the distribution of mutation effects on cefotaxime (Ctx) resistance and fitness of 48 unique beneficial mutations in the bacterial enzyme TEM-1 β-lactamase, which were obtained by screening the products of random mutagenesis for increased Ctx resistance. Our contributions are threefold. First, based on the frequency of unique mutations among more than 300 sequenced isolates and correcting for mutation bias, we conservatively estimate that the total number of first-step mutations that increase Ctx resistance in this enzyme is 87 [95% CI 75-189], or 3.4% of all 2,583 possible base-pair substitutions. Of the 48 mutations, 10 are synonymous and the majority of the 38 non-synonymous mutations occur in the pocket surrounding the catalytic site. Second, we estimate the effects of the mutations on Ctx resistance by determining survival at various Ctx concentrations, and we derive their fitness effects by modeling reproduction and survival as a branching process. Third, we find that the distribution of both measures follows a Fréchet-type distribution characterized by a broad tail of a few exceptionally fit mutants. Such distributions have fundamental evolutionary implications, including an increased predictability of evolution, and may provide a partial explanation for recent observations of striking parallel evolution of antibiotic resistance.
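The percentages quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below uses only the figures stated above; the helper `percent` is a hypothetical name, and the reading of 2,583 as three alternative bases at each of 861 nucleotide positions is an inference from the arithmetic, not something the abstract states.

```python
# Figures quoted in the abstract.
POSSIBLE_SUBSTITUTIONS = 2583   # all possible base-pair substitutions
ESTIMATED_BENEFICIAL = 87       # estimated first-step mutations raising Ctx resistance
CI_LOW, CI_HIGH = 75, 189       # 95% confidence interval on that estimate
UNIQUE_MUTATIONS = 48           # unique beneficial mutations studied
SYNONYMOUS = 10                 # of which synonymous


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # Point estimate: 87 of 2,583 substitutions.
    print(percent(ESTIMATED_BENEFICIAL, POSSIBLE_SUBSTITUTIONS))  # 3.4
    # The same calculation applied to the interval bounds.
    print(percent(CI_LOW, POSSIBLE_SUBSTITUTIONS),
          percent(CI_HIGH, POSSIBLE_SUBSTITUTIONS))               # 2.9 7.3
    # Share of the 48 unique mutations that are synonymous.
    print(percent(SYNONYMOUS, UNIQUE_MUTATIONS))                  # 20.8
    # Assumption: each site can mutate to three other bases,
    # which would imply 2583 / 3 = 861 sites.
    print(POSSIBLE_SUBSTITUTIONS // 3, POSSIBLE_SUBSTITUTIONS % 3)  # 861 0
```

On these numbers roughly one in five of the beneficial mutations recovered is synonymous, which is why this entry sits in a list about synonymous codon effects.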
Affiliation(s)
- Martijn F. Schenk
- Institute for Genetics, University of Cologne, Köln, Germany
- Laboratory of Genetics, Wageningen University, Wageningen, The Netherlands
- Ivan G. Szendro
- Institute for Theoretical Physics, University of Cologne, Köln, Germany
- Joachim Krug
- Institute for Theoretical Physics, University of Cologne, Köln, Germany
- Systems Biology of Ageing Cologne (Sybacol), University of Cologne, Köln, Germany
10
Nocadello S, Swennen EF. The new pLAI (lux regulon based auto-inducible) expression system for recombinant protein production in Escherichia coli. Microb Cell Fact 2012; 11:3. [PMID: 22222111 PMCID: PMC3274441 DOI: 10.1186/1475-2859-11-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 09/14/2011] [Accepted: 01/05/2012] [Indexed: 11/26/2022] Open
Abstract
Background: After many years of intensive research, it is generally assumed that no universal expression system can exist for high-level production of a given recombinant protein. Among the different expression systems, the inducible systems are the most popular for their tight regulation. However, induction is in many cases less favorable due to the high cost and/or toxicity of inducers, incompatibilities with industrial scale-up or detrimental growth conditions. Expression systems using autoinduction (or self-induction) prove to be extremely versatile, allowing growth and induction of recombinant proteins without the need to monitor cell density or add inducer. Unfortunately, almost all current autoinducible expression systems need endogenous or induced metabolic changes during growth to trigger induction, both frequently linked to conditions detrimental to cell growth. In this context, we use a simple modular approach for a cell density-based genetic regulation in order to assemble an autoinducible recombinant protein expression system in E. coli.
Result: The newly designed pLAI expression system places the expression of recombinant proteins in Escherichia coli under control of the regulatory genes of the lux regulon of Vibrio fischeri's Quorum Sensing (QS) system. The pLAI system allows tight regulation of the recombinant gene, with negligible basal expression and expression only at high cell density. Sequence optimization of regulative genes of QS of V. fischeri for expression in E. coli upgraded the system to high level expression. Moreover, partition of regulative genes between the plasmid and the host genome and introduction of a molecular safety lock permitted tighter control of gene expression.
Conclusion: Coupling gene expression to cell density using cell-to-cell communication provides a promising approach for recombinant protein production. The system allows the control of expression of the target recombinant gene independently from external inducers or drastic changes in metabolic conditions, while enabling tight regulation of expression.
11
Chung DH, Min Z, Wang BC, Kushner SR. Single amino acid changes in the predicted RNase H domain of Escherichia coli RNase G lead to complementation of RNase E deletion mutants. RNA (New York, N.Y.) 2010; 16:1371-1385. [PMID: 20507976 PMCID: PMC2885686 DOI: 10.1261/rna.2104810] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 01/25/2010] [Accepted: 04/12/2010] [Indexed: 05/29/2023]
Abstract
The endoribonuclease RNase E of Escherichia coli is an essential enzyme that plays a major role in all aspects of RNA metabolism. In contrast, its paralog, RNase G, seems to have more limited functions. It is involved in the maturation of the 5' terminus of 16S rRNA, the processing of a few tRNAs, and the initiation of decay of a limited number of mRNAs but is not required for cell viability and cannot substitute for RNase E under normal physiological conditions. Here we show that neither the native nor N-terminal extended form of RNase G can restore the growth defect associated with either the rne-1 or rneΔ1018 alleles even when expressed at very high protein levels. In contrast, two distinct spontaneously derived single amino acid substitutions within the predicted RNase H domain of RNase G, generating the rng-219 and rng-248 alleles, result in complementation of the growth defect associated with various RNase E mutants, suggesting that this region of the two proteins may help distinguish their in vivo biological activities. Analysis of rneΔ1018/rng-219 and rneΔ1018/rng-248 double mutants has provided interesting insights into the distinct roles of RNase E and RNase G in mRNA decay and tRNA processing.
Affiliation(s)
- Dae-hwan Chung
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
12
Dreyfus M. Killer and protective ribosomes. Progress in Molecular Biology and Translational Science 2009; 85:423-66. [PMID: 19215779 DOI: 10.1016/s0079-6603(08)00811-8] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]
Abstract
In prokaryotes, translation influences mRNA decay. The breakdown of most Escherichia coli mRNAs is initiated by RNase E, a 5'-dependent endonuclease. Some mRNAs are protected by ribosomes even if these are located far upstream of cleavage sites ("protection at a distance"), whereas others require direct shielding of these sites. I argue that these situations reflect different modes of interaction of RNase E with mRNAs. Protection at a distance is most impressive in Bacilli, where ribosomes can protect kilobases of unstable downstream sequences. I propose that this protection reflects the role in mRNA decay of RNase J1, a 5'→3' exonuclease with no E. coli equivalent. Finally, recent years have shown that besides their protective role, ribosomes can also cleave their mRNA under circumstances that cause ribosome stalling. The endonuclease associated with this "killing" activity, which has a eukaryotic counterpart ("no-go decay"), is not characterized; it may be borne by the distressed ribosome itself.
13
Abstract
The persistent difficulties in the production of protein at high levels in heterologous systems, as well as the inability to understand pathologies associated with protein aggregation, highlight our limited knowledge on the mechanisms of protein folding in vivo. Attempts to improve yield and quality of recombinant proteins are diverse, frequently involving optimization of the cell growth temperature, the use of synonymous codons and/or the co-expression of tRNAs, chaperones and folding catalysts among others. Although protein secondary structure can be determined largely by the amino acid sequence, protein folding within the cell is affected by a range of factors beyond amino acid sequence. The folding pathway of a nascent polypeptide can be affected by transient interactions with other proteins and ligands, the ribosome, translocation through a pore membrane, redox conditions, among others. The translation rate as well as the translation machinery itself can dramatically affect protein folding, and thus the structure and function of the protein product. This review addresses current efforts to better understand how the use of synonymous codons in the mRNA and the availability of tRNAs can modulate translation kinetics, affecting the folding, the structure and the biological activity of proteins.
Affiliation(s)
- Monica Marin
- Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay.
14
Abstract
Ribonuclease LS in Escherichia coli is a potential antagonist of bacteriophage T4. When T4 dmd is mutated, this RNase efficiently cleaves T4 mRNAs and leads to the silencing of late genes, thus blocking T4 growth. We previously found that, when two consecutive ochre codons were placed in the open reading frame of T4 soc, RNase LS cleaved soc mRNA at a specific site downstream of the ochre codons. Here, we demonstrate that RNase LS cleaves soc RNA at the same site even when only a single ochre codon is present or is replaced with either an amber or an opal codon. On the other hand, disruption of the Shine-Dalgarno sequence, a ribosome-binding site required for the initiation of translation, eliminates the cleavage. These results strongly suggest that RNase LS cleaves in a manner dependent on translation termination. Consistent with this suggestion, the cleavage dependency on an amber codon was considerably reduced in the presence of amber-codon-suppressing tRNA. Instead, two other cleavages that depend on translation of the region containing the target sites occurred farther downstream. Additional analysis suggests that an interaction of the ribosome with a stop codon might affect the site of cleavage by RNase LS in an mRNA molecule. This effect of the ribosome could reflect remodeling of the high-order structure of the mRNA molecule.
Affiliation(s)
- Haruyo Yamanishi
- Department of Biology, Graduate School of Science, Osaka University, Japan
15
Abstract
The dmd gene of bacteriophage T4 is required for the stability of late-gene mRNAs. When this gene is mutated, late genes are globally silenced because of rapid degradation of their mRNAs. Our previous work suggested that a novel Escherichia coli endonuclease, RNase LS, is responsible for the rapid degradation of mRNAs. In this study, we demonstrated that rnlA (formerly yfjN) is essential for RNase LS activity both in vivo and in vitro. In addition, we investigated a role of RNase LS in the RNA metabolism of E. coli cells under vegetative growth conditions. A mutation in rnlA reduced the decay rate of many E. coli mRNAs, although there are differences in the mutational effects on the stabilization of different mRNAs. In addition, we found that a 307-nucleotide fragment with an internal sequence of 23S rRNA accumulated to a high level in rnlA mutant cells. These results strongly suggest that RNase LS plays a role in the RNA metabolism of E. coli as well as phage T4.
Affiliation(s)
- Yuichi Otsuka
- Department of Biology, Graduate School of Science, Osaka University, Osaka 560-0043, Japan
16
Giacani L, Sun ES, Hevner K, Molini BJ, Van Voorhis WC, Lukehart SA, Centurion-Lara A. Tpr homologs in Treponema paraluiscuniculi Cuniculi A strain. Infect Immun 2004; 72:6561-76. [PMID: 15501788 PMCID: PMC523035 DOI: 10.1128/iai.72.11.6561-6576.2004] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Treponema paraluiscuniculi, the etiologic agent of rabbit venereal syphilis, is morphologically indistinguishable from Treponema pallidum subsp. pallidum (T. pallidum), the human syphilis treponeme, and induces similar immune responses and histopathologic changes in the infected host. Because of their high degree of relatedness, comparative studies are likely to identify genetic determinants that contribute to pathogenesis or virulence in human syphilis. The tpr (Treponema pallidum repeat) genes are believed to code for potential virulence factors. In this study, we identified 10 tpr homologs in Treponema paraluiscuniculi Cuniculi A strain and determined their sequence architecture. Half of this group of paralogous genes were predicted to be nonfunctional due to the presence of frameshifts and premature stop codons. Furthermore, the immune response against the T. paraluiscuniculi Tpr homologs in long-term-infected rabbits was studied by enzyme-linked immunosorbent assay and lymphocyte proliferation assay, showing that TprK is the only target of the antibody and T-cell responses during experimental infection and emphasizing the importance of this putative virulence factor in venereal treponematosis.
Affiliation(s)
- Lorenzo Giacani
- Department of Medicine, University of Washington, Seattle, USA
17
Power PM, Jones RA, Beacham IR, Bucholtz C, Jennings MP. Whole genome analysis reveals a high incidence of non-optimal codons in secretory signal sequences of Escherichia coli. Biochem Biophys Res Commun 2004; 322:1038-44. [PMID: 15336569 DOI: 10.1016/j.bbrc.2004.08.022] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.9] [Received: 07/28/2004] [Indexed: 11/21/2022]
Abstract
Translational pausing may occur due to a number of mechanisms, including the presence of non-optimal codons, and it is thought to play a role in the folding of specific polypeptide domains during translation and in the facilitation of signal peptide recognition during sec-dependent protein targeting. In this whole genome analysis of Escherichia coli we have found that non-optimal codons in the signal peptide-encoding sequences of secretory genes are overrepresented relative to the "mature" portions of these genes; this is in addition to their overrepresentation in the 5'-regions of genes encoding non-secretory proteins. We also find increased non-optimal codon usage at the 3' ends of most E. coli genes, in both non-secretory and secretory sequences. Whereas presumptive translational pausing at the 5' and 3' ends of E. coli messenger RNAs may clearly have a general role in translation, we suggest that it also has a specific role in sec-dependent protein export, possibly in facilitating signal peptide recognition. This finding may have important implications for our understanding of how the majority of non-cytoplasmic proteins are targeted, a process that is essential to all biological cells.
Affiliation(s)
- Peter M Power
- School of Molecular and Microbial Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia
18
Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. J Mol Evol 2004; 57:694-701. [PMID: 14745538 DOI: 10.1007/s00239-003-2519-1] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.7] [Received: 04/22/2003] [Accepted: 06/30/2003] [Indexed: 11/29/2022]
Abstract
The usage of synonymous codons (SCs) in mammalian genes is highly correlated with local base composition and is therefore thought to be determined by mutation pressure. The usage is nonetheless structured. For instance, mammals share with Saccharomyces and Drosophila most preferences for the C-ending over the G-ending codon (or vice versa) within each fourfold-degenerate SC family and the fact that their SCs are placed along coding regions in ways that minimize the number of T|A and C|G dinucleotides ("|" being the codon boundary). TA and CG underrepresentations are observed everywhere in the mammalian genome affecting the SC usage, the amino acid composition of proteins, and the primary structure of introns and noncoding DNA. While the rarity of CG is ascribed to the high mutability of this dinucleotide, the rarity of TA in coding regions is considered adaptive because UA dinucleotides are cleaved by endoribonucleases. Here we present in vivo experimental evidence indicating that the number of T|A and/or C|G dinucleotides of a human gene can affect strongly the expression level and degradation of its mRNA. Our results are consistent with indirect evidence produced by other workers and with the detailed work that has been devoted to characterize UA cleavage in vitro and in vivo. We conclude that SC choice can influence strongly mRNA function and gene expression through effects not directly related to the codon-anticodon interaction. These effects should constrain heavily the nucleotide motif composition of the most abundant mRNAs in the transcriptome, in particular, their SC usage, a usage that must be reflected by cellular tRNA concentrations and thus defines for all other genes which SCs are translated fastest and most accurately. 
Furthermore, the need to avoid such effects genome-wide appears serious enough to have favored the evolution of biases in context-dependent mutation that reduce the occurrence of intrinsically unfavorable motifs, and/or, when possible, to have induced the molecular machinery mediating such effects to rely opportunistically on already existing motif rarities and abundances. This may explain why nucleotide motif preferences are very similar in transcribed and nontranscribed mammalian DNA even though the preferences appear to be adaptive only in transcribed DNA.
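The T|A and C|G counts discussed in this abstract can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function name `boundary_dinucleotides` and the example sequences are hypothetical and are not taken from the paper.

```python
def boundary_dinucleotides(cds):
    """Count T|A and C|G dinucleotides that span a codon boundary.

    Assumes `cds` is an in-frame coding sequence (length divisible by 3).
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = {"T|A": 0, "C|G": 0}
    # Index i is the third base of a codon; i + 1 is the first base of the next codon.
    for i in range(2, len(cds) - 1, 3):
        pair = cds[i] + cds[i + 1]
        if pair == "TA":
            counts["T|A"] += 1
        elif pair == "CG":
            counts["C|G"] += 1
    return counts


# Two synonymous encodings of Ala-Lys-Ala-Gly (illustrative sequences only).
print(boundary_dinucleotides("GCTAAAGCCGGT"))
print(boundary_dinucleotides("GCAAAAGCTGGT"))
```

The two example sequences encode the same peptide but differ in how many T|A and C|G boundary dinucleotides they contain, which is the kind of synonymous variation the abstract refers to.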
Affiliation(s)
- Jubao Duan
- Department of Psychiatry, The University of Chicago, 924 East 57th Street, R-004, Chicago, IL 60637, USA
|
19
|
Chamary JV, Hurst LD. Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: evidence for selectively driven codon usage. Mol Biol Evol 2004; 21:1014-23. [PMID: 15014158 DOI: 10.1093/molbev/msh087] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022] Open
Abstract
In mammals divergence at fourfold degenerate sites in codons (K(4)) and intronic sequence (K(i)) are both used to estimate the mutation rate, under the supposition that both evolve neutrally. Does it matter which of these we use? Using either class of sequence can be defended because (1) K(4) is the same as K(i) (at least in rodents) and (2) there is no selectively driven codon usage (hence no systematic selection on third sites). Here we re-examine these findings using 560 introns (for 136 genes) in the mouse-rat comparison, aligned by eye and using a new maximum likelihood protocol. We find that the rate of evolution at fourfold sites and at intronic sites is similar in magnitude, but only after eliminating putatively constrained sites from introns (first introns and sites flanking intron-exon junctions). Any approximate congruence between the two rates is not, however, owing to an underlying similarity in the mode of sequence evolution. Some dinucleotides are hypermutable and differently abundant in exons and introns (e.g., CpGs). More importantly, after controlling for relative abundance, all dinucleotides starting with A or T are more prevalent in mismatches in exons than in introns, whereas C-starting dinucleotides (except CG) are more common in introns. Although C content at intronic sites is lower than at flanking fourfold sites, G content is similar, demonstrating that there exists a strong strand-specific preference for C nucleotides that is unique to exons. Transcription-coupled mutational processes and biased gene conversion cannot explain this, as they should affect introns and flanking exons equally. Therefore, by elimination, we propose this to be strong evidence for selectively driven codon usage in mammals.
Affiliation(s)
- Jean-Vincent Chamary
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
20
|
Bernstein JA, Khodursky AB, Lin PH, Lin-Chao S, Cohen SN. Global analysis of mRNA decay and abundance in Escherichia coli at single-gene resolution using two-color fluorescent DNA microarrays. Proc Natl Acad Sci U S A 2002; 99:9697-702. [PMID: 12119387 PMCID: PMC124983 DOI: 10.1073/pnas.112318199] [Citation(s) in RCA: 638] [Impact Index Per Article: 27.7] [Accepted: 05/27/2002] [Indexed: 11/18/2022] Open
Abstract
Much of the information available about factors that affect mRNA decay in Escherichia coli, and by inference in other bacteria, has been gleaned from study of less than 25 of the approximately 4,300 predicted E. coli messages. To investigate these factors more broadly, we examined the half-lives and steady-state abundance of known and predicted E. coli mRNAs at single-gene resolution by using two-color fluorescent DNA microarrays. An rRNA-based strategy for normalization of microarray data was developed to permit quantitation of mRNA decay after transcriptional arrest by rifampicin. We found that globally, mRNA half-lives were similar in nutrient-rich media and defined media in which the generation time was approximately tripled. A wide range of stabilities was observed for individual mRNAs of E. coli, although approximately 80% of all mRNAs had half-lives between 3 and 8 min. Genes having biologically related metabolic functions were commonly observed to have similar stabilities. Whereas the half-lives of a limited number of mRNAs correlated positively with their abundance, we found that overall, increased mRNA stability is not predictive of increased abundance. Neither the density of putative sites of cleavage by RNase E, which is believed to initiate mRNA decay in E. coli, nor the free energy of folding of 5' or 3' untranslated region sequences was predictive of mRNA half-life. Our results identify previously unsuspected features of mRNA decay at a global level and also indicate that generalizations about decay derived from the study of individual gene transcripts may have limited applicability.
Affiliation(s)
- Jonathan A Bernstein
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
21
|
Lesnik T, Solomovici J, Deana A, Ehrlich R, Reiss C. Ribosome traffic in E. coli and regulation of gene expression. J Theor Biol 2000; 202:175-85. [PMID: 10640436 DOI: 10.1006/jtbi.1999.1047] [Citation(s) in RCA: 48] [Impact Index Per Article: 1.9] [Indexed: 11/22/2022]
Abstract
The ribosome traffic during translation of E. coli coding sequences was simulated, assuming that the rate of translation of individual codons is limited by the cognate tRNA availability. Actual translation rates were taken from Solomovici et al. (J. theor. Biol. 185, 511-521, 1997). The mean translation rates of the 4271 sequences cover a broad, two-fold range, whereas the local rate of translation along messengers varies three-fold on average. The simulation allows one to sketch the ribosome traffic on the polysome, in particular by providing the extent of mRNA sequences uncovered between consecutive ribosomes and the time during which these sequences are exposed. These parameters may participate in the control of mRNA stability and transcriptional polarity. By averaging the translation rates in a 17-codon window, assumed to be the sequence covered by a translating ribosome, and sliding this window along a given coding sequence, the addresses KMAX and KMIN, and the times TMAX and TMIN of respectively the slowest and the fastest translated window were determined. It is shown that under the assumptions made, TMAX sets the number of proteins translated from a given mRNA molecule per unit time, in case the delay between consecutive translation starts is below TMAX. Both windows display two strong biases, one as expected on the usage of codon frequencies, and the other surprisingly on the occurrence of amino acids.
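The sliding-window step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation: it takes a hypothetical list of per-codon translation times as input, averages them over a 17-codon window, and reports 0-based start addresses. The function name `window_extremes` and the toy input are invented for the example.

```python
def window_extremes(codon_times, window=17):
    """Return (kmax, tmax, kmin, tmin) for a window slid along a coding sequence.

    codon_times: per-codon translation times (hypothetical input values).
    kmax/kmin: 0-based start addresses of the slowest and fastest windows.
    tmax/tmin: mean per-codon time within those windows.
    """
    n = len(codon_times)
    if window <= 0 or n < window:
        raise ValueError("sequence must contain at least one full window")
    total = sum(codon_times[:window])
    slowest = fastest = total
    kmax = kmin = 0
    for start in range(1, n - window + 1):
        # Slide by one codon: add the codon entering the window, drop the one leaving.
        total += codon_times[start + window - 1] - codon_times[start - 1]
        if total > slowest:
            slowest, kmax = total, start
        if total < fastest:
            fastest, kmin = total, start
    return kmax, slowest / window, kmin, fastest / window


# Toy input with a 3-codon window so the result is easy to check by hand.
print(window_extremes([1, 1, 1, 5, 5, 5, 1, 1], window=3))
```

Ties are resolved in favor of the first window encountered; how the original simulation handled ties is not stated in the abstract.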
Affiliation(s)
- T Lesnik
- Centre de Génétique Moléculaire, CNRS, bat. 24, Ave. de la Terrasse, Gif Sur Yvette, F91198, France.
|
22
|
Deana A, Ehrlich R, Reiss C. Silent mutations in the Escherichia coli ompA leader peptide region strongly affect transcription and translation in vivo. Nucleic Acids Res 1998; 26:4778-82. [PMID: 9753749 PMCID: PMC147888 DOI: 10.1093/nar/26.20.4778] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
Abstract
In order to test the effect of silent mutations on the regulation of gene expression, we monitored several steps of transcription and translation of the ompA gene in vivo, in which some or all codons between codons 6 and 14, frequently used in Escherichia coli, had been exchanged for infrequent synonymous codons. Northern blot analysis revealed an up to 4-fold reduction in the half-life of the mutated messengers and a >10-fold reduction in their steady-state amounts. Western blot analysis showed a 10-fold reduction in the amount of OmpA protein. Use of a system expressing a Rho-specific anti-terminator allowed us to detect a strong transcription polarity effect in the silent mutants. These results demonstrate that silent mutations can severely inhibit several steps of gene expression in E. coli and that code degeneracy is efficiently exploited in this species for setting signals for gene control and regulation.
Affiliation(s)
- A Deana
- Centre de Génétique Moléculaire, Laboratoire Structure et Dynamique du Génome, CNRS, F91198 Gif-sur-Yvette, France.
|
23
|
Abstract
This map is an update of the edition 9 map by Berlyn et al. (M. K. B. Berlyn, K. B. Low, and K. E. Rudd, p. 1715-1902, in F. C. Neidhardt et al., ed., Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 2, 1996). It uses coordinates established by the completed sequence, expressed as 100 minutes for the entire circular map, and adds new genes discovered and established since 1996 and eliminates those shown to correspond to other known genes. The latter are included as synonyms. An alphabetical list of genes showing map location, synonyms, the protein or RNA product of the gene, phenotypes of mutants, and reference citations is provided. In addition to genes known to correspond to gene sequences, other genes, often older, that are described by phenotype and older mapping techniques and that have not been correlated with sequences are included.
Affiliation(s)
- M K Berlyn
- Department of Biology and School of Forestry and Environmental Studies, Yale University, New Haven, Connecticut 06520-8104, USA.
|
24
|
Thanaraj TA, Argos P. Protein secondary structural types are differentially coded on messenger RNA. Protein Sci 1996; 5:1973-83. [PMID: 8897597 PMCID: PMC2143259 DOI: 10.1002/pro.5560051003] [Citation(s) in RCA: 130] [Impact Index Per Article: 4.5] [Indexed: 02/02/2023]
Abstract
Tricodon regions on messenger RNAs corresponding to a set of proteins from Escherichia coli were scrutinized for their translation speed. The fractional frequency values of the individual codons as they occur in mRNAs of highly expressed genes from Escherichia coli were taken as an indicative measure of the translation speed. The tricodons were classified by the sum of the frequency values of the constituent codons. Examination of the conformation of the encoded amino acid residues in the corresponding protein tertiary structures revealed a correlation between codon usage in mRNA and topological features of the encoded proteins. Alpha helices on proteins tend to be preferentially coded by translationally fast mRNA regions while the slow segments often code for beta strands and coil regions. Fast regions correspondingly avoid coding for beta strands and coil regions while the slow regions similarly move away from encoding alpha helices. Structural and mechanistic aspects of the ribosome peptide channel support the relevance of sequence fragment translation and subsequent conformation. A discussion is presented relating the observation to the reported kinetic data on the formation and stabilization of protein secondary structural types during protein folding. The observed absence of such strong positive selection for codons in non-highly expressed genes is compatible with existing theories that mutation pressure may well dominate codon selection in non-highly expressed genes.
Affiliation(s)
- T A Thanaraj
- European Molecular Biology Laboratory, Heidelberg, Germany.
|