1. Vignolini T, Capitanio M, Caldini C, Gardini L, Pavone FS. Highly inclined light sheet allows volumetric super-resolution imaging of efflux pumps distribution in bacterial biofilms. Sci Rep 2024; 14:12902. [PMID: 38839922 PMCID: PMC11153600 DOI: 10.1038/s41598-024-63729-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 05/31/2024] [Indexed: 06/07/2024] [Open access]
Abstract
Bacterial biofilms are highly complex communities in which isogenic bacteria display different gene expression patterns and organize in a three-dimensional mesh gaining enhanced resistance to biocides. The molecular mechanisms behind such increased resistance remain mostly unknown, also because of the technical difficulties in biofilm investigation at the sub-cellular and molecular level. In this work we focus on the AcrAB-TolC protein complex, a multidrug efflux pump found in Enterobacteriaceae, whose overexpression is associated with most multiple drug resistance (MDR) phenotypes occurring in Gram-negative bacteria. We propose an optical method to quantify the expression level of the AcrAB-TolC pump within the biofilm volume at the sub-cellular level, with single-molecule sensitivity. Through a combination of super-resolution PALM with single objective light sheet and precision genome editing, we can directly quantify the spatial distribution of endogenous AcrAB-TolC pumps expressed in both planktonic bacteria and, importantly, within the bacterial biofilm volume. We observe a gradient of pump density within the biofilm volume and over the course of biofilm maturation. Notably, we propose an optical method that could be broadly employed to achieve volumetric super-resolution imaging of thick samples.
Affiliation(s)
- T Vignolini
  - European Laboratory for Non-Linear Spectroscopy, LENS, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
  - Department of Physics and Astronomy, University of Florence, Via G. Sansone 1, 50019, Sesto Fiorentino, Italy
  - Parasite RNA Biology Group, Institut Pasteur, Université Paris Cité, 75015, Paris, France
- M Capitanio
  - European Laboratory for Non-Linear Spectroscopy, LENS, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
  - Department of Physics and Astronomy, University of Florence, Via G. Sansone 1, 50019, Sesto Fiorentino, Italy
- C Caldini
  - European Laboratory for Non-Linear Spectroscopy, LENS, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
  - Department of Physics and Astronomy, University of Florence, Via G. Sansone 1, 50019, Sesto Fiorentino, Italy
- L Gardini
  - European Laboratory for Non-Linear Spectroscopy, LENS, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
  - National Institute of Optics, National Research Council, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
- F S Pavone
  - European Laboratory for Non-Linear Spectroscopy, LENS, Via N. Carrara 1, 50019, Sesto Fiorentino, Italy
  - Department of Physics and Astronomy, University of Florence, Via G. Sansone 1, 50019, Sesto Fiorentino, Italy
2. Fages‐Lartaud M, Hundvin K, Hohmann‐Marriott MF. Mechanisms governing codon usage bias and the implications for protein expression in the chloroplast of Chlamydomonas reinhardtii. The Plant Journal: for Cell and Molecular Biology 2022; 112:919-945. [PMID: 36071273 PMCID: PMC9828097 DOI: 10.1111/tpj.15970] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/08/2022] [Revised: 08/29/2022] [Accepted: 09/01/2022] [Indexed: 05/30/2023]
Abstract
Chloroplasts possess a considerably reduced genome that is decoded via an almost minimal set of tRNAs. These features make an excellent platform for gaining insights into fundamental mechanisms that govern protein expression. Here, we present a comprehensive and revised perspective of the mechanisms that drive codon selection in the chloroplast of Chlamydomonas reinhardtii and the functional consequences for protein expression. In order to extract this information, we applied several codon usage descriptors to genes with different expression levels. We show that highly expressed genes strongly favor translationally optimal codons, while genes with lower functional importance are rather affected by directional mutational bias. We demonstrate that codon optimality can be deduced from codon-anticodon pairing affinity and, for a small number of amino acids (leucine, arginine, serine, and isoleucine), tRNA concentrations. Finally, we review, analyze, and expand on the impact of codon usage on protein yield, secondary structures of mRNA, translation initiation and termination, and amino acid composition of proteins, as well as cotranslational protein folding. The comprehensive analysis of codon choice provides crucial insights into heterologous gene expression in the chloroplast of C. reinhardtii, which may also be applicable to other chloroplast-containing organisms and bacteria.
Affiliation(s)
- Maxime Fages‐Lartaud
  - Department of Biotechnology, Norwegian University of Science and Technology, Trondheim, N‐7491, Norway
- Kristoffer Hundvin
  - Department of Biotechnology, Norwegian University of Science and Technology, Trondheim, N‐7491, Norway
3. Deb B, Uddin A, Chakraborty S. Analysis of codon usage of Horseshoe Bat Hepatitis B virus and its host. Virology 2021; 561:69-79. [PMID: 34171764 DOI: 10.1016/j.virol.2021.05.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/11/2020] [Revised: 05/07/2021] [Accepted: 05/19/2021] [Indexed: 11/28/2022]
Abstract
In the present analysis, codon usage strategies and base distribution of Horseshoe bat hepatitis B virus (HBHBV) were analyzed and compared with its host Rhinolophus sinicus, as no work was yet reported. The magnitude of synonymous codon usage bias (CUB) in the virus and its host was low with higher proportion of the base C. Notably, 21 more frequently used codons, 19 less frequently used codons and 3 underrepresented codons (TCG, ACG and GCG) were found to be similar in both virus and its host coding sequences. Neutrality plot analysis reported greater role of natural selection in HBHBV (67.84%) and R. sinicus (76.90%) over mutation pressure. Base skewness and protein properties also influenced the CUB of genes. Further, codon usage analysis depicted, HBHBV and R. sinicus had many similarities in codon usage patterns that might reflect viral adaptation to its host.
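A neutrality plot is conventionally built by regressing GC content at codon positions 1 and 2 (GC12) on GC content at position 3 (GC3) across genes. The slope is read as the relative weight of mutation pressure and its complement as the weight of selection. The sketch below illustrates that convention only; it is not the authors' pipeline, and the function names are hypothetical.

```python
def gc_fraction(bases):
    """Fraction of G or C in a string of bases (0.0 for an empty string)."""
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc12_gc3(cds):
    """Return (GC12, GC3) for one coding sequence; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    pos1 = cds[0:usable:3]
    pos2 = cds[1:usable:3]
    pos3 = cds[2:usable:3]
    # pos1 and pos2 have equal length, so pooling them equals averaging GC1 and GC2.
    return gc_fraction(pos1 + pos2), gc_fraction(pos3)


def neutrality_slope(sequences):
    """Ordinary least-squares slope of GC12 (y) regressed on GC3 (x) across genes."""
    points = [gc12_gc3(s) for s in sequences]
    n = len(points)
    mean_x = sum(gc3 for _, gc3 in points) / n
    mean_y = sum(gc12 for gc12, _ in points) / n
    sxx = sum((gc3 - mean_x) ** 2 for _, gc3 in points)
    sxy = sum((gc3 - mean_x) * (gc12 - mean_y) for gc12, gc3 in points)
    return sxy / sxx  # raises ZeroDivisionError if every gene has the same GC3
```

Under this reading, a slope of about 0.32 would correspond to roughly 32% mutation pressure and 68% selection, which is the usual route to a figure of the kind quoted in the abstract.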
Affiliation(s)
- Bornali Deb
  - Department of Biotechnology, Assam University, Silchar, 788150, Assam, India
- Arif Uddin
  - Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi, 788150, Assam, India
- Supriyo Chakraborty
  - Department of Biotechnology, Assam University, Silchar, 788150, Assam, India
4. Barbhuiya PA, Uddin A, Chakraborty S. Understanding the codon usage patterns of mitochondrial CO genes among Amphibians. Gene 2021; 777:145462. [PMID: 33515725 DOI: 10.1016/j.gene.2021.145462] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/15/2020] [Revised: 12/18/2020] [Accepted: 01/20/2021] [Indexed: 11/17/2022]
Abstract
A universal phenomenon of using synonymous codons unequally in coding sequences known as codon usage bias (CUB) is observed in all forms of life. Mutation and natural selection drive CUB in many species but the relative role of evolutionary forces varies across species, genes and genomes. We studied the CUB in mitochondrial (mt) CO genes from three orders of Amphibia using bioinformatics approach as no work was reported yet. We observed that CUB of mt CO genes of Amphibians was weak across different orders. Order Caudata had higher CUB followed by Gymnophiona and Anura for all genes and CUB also varied across genes. Nucleotide composition analysis showed that CO genes were AT-rich. The AT content in Caudata was higher than that in Gymnophiona while Anura showed the least content. Multiple investigations namely nucleotide composition, correspondence analysis, parity plot analysis showed that the interplay of mutation pressure and natural selection caused CUB in these genes. Neutrality plot suggested the involvement of natural selection was more than the mutation pressure. The contribution of natural selection was higher in Anura than Gymnophiona and the lowest in Caudata. The codons CGA, TGA, AAA were found to be highly favoured by nature across all genes and orders.
Affiliation(s)
- Parvin A Barbhuiya
  - Department of Biotechnology, Assam University, Silchar 788150, Assam, India
- Arif Uddin
  - Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India
- Supriyo Chakraborty
  - Department of Biotechnology, Assam University, Silchar 788150, Assam, India
5. Zhou JH, Li H, Li X, Gao J, Xu L, Han S, Liu Y, Shang Y, Cao X. Tracing Brucella evolutionary dynamics in expanding host ranges through nucleotide, codon and amino acid usages in genomes. J Biomol Struct Dyn 2020; 39:3986-3995. [PMID: 32448095 DOI: 10.1080/07391102.2020.1773313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Abstract
The host range of Brucella organisms has expanded from terrestrial and marine mammals to fish and amphibians. The high homology genomes of different Brucella organisms promote us to investigate evolutionary patterns for nucleotide, codon and amino acid usage patterns at gene levels among Brucella species. Although the similar patterns for nucleotide and synonymous codon usages exist in gene population, GC composition at the first codon position has significant correlations to that of the second and third codon positions, respectively, suggesting that nucleotide usages surrounding one codon influence synonymous codon usage patterns. Evolutionary patterns represented by synonymous codon and amino acid usages reflect host factor impacting Brucella speciation. As for genetic variations of important virulent factors involved with different biological functions, genes encoding lipopolysaccharides (LPSs) display more distinctive codon adaptation to Brucella than those of the BvrR/BvrS system and type IV secretion system. By Bayesian analysis, the polygenetic constructions for these genes of virulent factors shared by Brucella species display the purifying/positive selections and partially host factor in mediating genetic variations of these genes. The systemic analyses for nucleotide, synonymous codon and amino acid usages at gene level and genetic variations of important virulent factor genes display that host limitation influences either genetic characterizations at gene level or a particular gene involved in virulent factors of Brucella. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Jian-Hua Zhou
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Hua Li
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
  - China Agricultural Vet Biology and Technology limited liability company, Lanzhou, Gansu, P.R. China
- Xuerui Li
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Jing Gao
  - Gansu Center for Animal Disease Prevention and Control, Lanzhou, Gansu, P.R. China
- Long Xu
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
  - College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, Gansu, P.R. China
- Shengyi Han
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
  - College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, Gansu, P.R. China
- Yongsheng Liu
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Youjun Shang
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
- Xiaoan Cao
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, P.R. China
6. Shaytan AK, Xiao H, Armeev GA, Gaykalova DA, Komarova GA, Wu C, Studitsky VM, Landsman D, Panchenko AR. Structural interpretation of DNA-protein hydroxyl-radical footprinting experiments with high resolution using HYDROID. Nat Protoc 2018; 13:2535-2556. [PMID: 30341436 PMCID: PMC6322412 DOI: 10.1038/s41596-018-0048-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022]
Abstract
Hydroxyl-radical footprinting (HRF) is a powerful method for probing structures of nucleic acid-protein complexes with single-nucleotide resolution in solution. To tap the full quantitative potential of HRF, we describe a protocol, hydroxyl-radical footprinting interpretation for DNA (HYDROID), to quantify HRF data and integrate them with atomistic structural models. The stages of the HYDROID protocol are extraction of the lane profiles from gel images, quantification of the DNA cleavage frequency at each nucleotide and theoretical estimation of the DNA cleavage frequency from atomistic structural models, followed by comparison of experimental and theoretical results. Example scripts for each step of HRF data analysis and interpretation are provided for several nucleosome systems; they can be easily adapted to analyze user data. As input, HYDROID requires polyacrylamide gel electrophoresis (PAGE) images of HRF products and optionally can use a molecular model of the DNA-protein complex. The HYDROID protocol can be used to quantify HRF over DNA regions of up to 100 nucleotides per gel image. In addition, it can be applied to the analysis of RNA-protein complexes and free RNA or DNA molecules in solution. Compared with other methods reported to date, HYDROID is unique in its ability to simultaneously integrate HRF data with the analysis of atomistic structural models. HYDROID is freely available. The complete protocol takes ~3 h. Users should be familiar with the command-line interface, the Python scripting language and Protein Data Bank (PDB) file formats. A graphical user interface (GUI) with basic functionality (HYDROID_GUI) is also available.
Affiliation(s)
- Alexey K Shaytan
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
  - Department of Biology, Lomonosov Moscow State University, Moscow, Russia
- Hua Xiao
  - Laboratory of Biochemistry and Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Grigoriy A Armeev
  - Department of Biology, Lomonosov Moscow State University, Moscow, Russia
- Daria A Gaykalova
  - Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Galina A Komarova
  - Department of Physics, Lomonosov Moscow State University, Moscow, Russia
- Carl Wu
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
  - Department of Molecular Biology & Genetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Vasily M Studitsky
  - Department of Biology, Lomonosov Moscow State University, Moscow, Russia
  - Fox Chase Cancer Center, Philadelphia, PA, USA
- David Landsman
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Anna R Panchenko
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
7. Sinha I, Woodrow CJ. Forces acting on codon bias in malaria parasites. Sci Rep 2018; 8:15984. [PMID: 30374097 PMCID: PMC6206010 DOI: 10.1038/s41598-018-34404-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 08/08/2017] [Accepted: 10/16/2018] [Indexed: 11/09/2022] [Open access]
Abstract
Malaria parasite genomes have a range of codon biases, with Plasmodium falciparum one of the most AT-biased genomes known. We examined the make-up of synonymous coding sites and stop codons in the core genomes of representative malaria parasites, showing first that local DNA context influences codon bias similarly across P. falciparum, P. vivax and P. berghei, with suppression of CpG dinucleotides and enhancement of CpC dinucleotides, both within and across codons. Intense asexual phase gene expression in P. falciparum and P. berghei is associated with increased A3:G3 bias but reduced T3:C3 bias at 2-fold sites, consistent with adaptation of codons to tRNA pools and avoidance of wobble tRNA interactions that potentially slow down translation. In highly expressed genes, the A3:G3 ratio can exceed 30-fold while the T3:C3 ratio can be less than 1, according to the encoded amino acid and subsequent base. Lysine codons (AAA/G) show distinctive behaviour with substantially reduced A3:G3 bias in highly expressed genes, perhaps because of selection against frameshifting when the AAA codon is followed by another adenine. Intense expression is also associated with a strong bias towards TAA stop codons (found in 94% and 89% of highly expressed P. falciparum and P. berghei genes respectively) and a proportional rise in the TAAA stop ‘tetranucleotide’. The presence of these expression-linked effects in the relatively AT-rich malaria parasite species adds weight to the suggestion that AT-richness in the Plasmodium genus might be a fitness adaptation. Potential explanations for the relative lack of codon bias in P. vivax include the distinct features of its lifecycle and its effective population size over evolutionary time.
Affiliation(s)
- I Sinha
  - Mahidol-Oxford Tropical Medicine Research Unit (MORU), Mahidol University, Bangkok, Thailand
  - Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
- C J Woodrow
  - Mahidol-Oxford Tropical Medicine Research Unit (MORU), Mahidol University, Bangkok, Thailand
  - Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
8. Cao XA, Hu W, Shang YJ, Liu YS, Han SY, Wang YN, Zhao L, Li XR, Zhou JH. Analyses of nucleotide, synonymous codon and amino acid usages at gene levels of Brucella melitensis strain QY1. Infection, Genetics and Evolution 2018; 65:257-264. [PMID: 30092351 DOI: 10.1016/j.meegid.2018.08.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/06/2018] [Revised: 07/20/2018] [Accepted: 08/04/2018] [Indexed: 12/20/2022]
Abstract
Brucella melitensis is the causative pathogen of the zoonotic disease brucellosis in China. This work focused on analyses of genetic features represented by nucleotide, synonymous codon and amino acid usages at gene levels of B. melitensis strain QY1 isolated from China. Although nucleotide usage biases at different codon positions all work on synonymous codon usage bias, nucleotide usage biases at the 1st and 3rd positions play more important roles in codon usages. Mutation pressure caused by nucleotide composition constraint influences the formation of over-representative synonymous codons, but neighboring nucleotides surrounding a codon strongly influence synonymous codon usage bias for B. melitensis strain QY1. There is significant correlation between amino acid usage bias and hydropathicity of proteins for B. melitensis strain QY1. Compared with different Brucella species about synonymous codon usage patterns, synonymous codon usages are not obviously influenced by hosts. Due to nucleotide usage bias at the 1st codon position influencing synonymous codon and amino acid usages, good interactions among nucleotide, synonymous codon and amino acid usages exist in the evolutionary process of B. melitensis.
Affiliation(s)
- Xiao-An Cao
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Wen Hu
  - Gansu Police Vocational College, Lanzhou 730046, Gansu, PR China
- You-Jun Shang
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Yong-Sheng Liu
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Sheng-Yi Han
  - College of Veterinary Medicine, Gansu Agricultural University, Lanzhou 730070, Gansu, PR China
- Yi-Ning Wang
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Lu Zhao
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Xue-Rui Li
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
- Jian-Hua Zhou
  - State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, PR China
9. Ma XX, Cao X, Ma P, Chang QY, Li LJ, Zhou XK, Zhang DR, Li MS, Ma ZR. Comparative genomic analysis for nucleotide, codon, and amino acid usage patterns of mycoplasmas. J Basic Microbiol 2018. [PMID: 29537653 DOI: 10.1002/jobm.201700490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
Abstract
The evolutionary factors in influencing the genetic characteristics of nucleotide, synonymous codon, and amino acid usage of 18 mycoplasma species were analyzed. The nucleotide usage at the 1st and 2nd codon position which determines amino acid composition of proteins has a significant correlation with the total nucleotide composition of gene population of these mycoplasma species, however, the nucleotide usage at the 3rd codon position which affects synonymous codon usage patterns has a slight correlation with either the total nucleotide composition or the nucleotide usage at the 1st and 2nd codon position. Other evolutionary factors join in the evolutionary process of mycoplasma apart from mutation pressure caused by nucleotide usage constraint based on the relationships between effective number of codons/codon adaptation index and nucleotide usage at the 3rd codon position. Although nucleotide usage of gene population in mycoplasma dominates in forming the overall codon usage trends, the relative abundance of codon with nucleotide context and amino acid usage pattern show that translation selection involved in translation accuracy and efficiency play an important role in synonymous codon usage patterns. In addition, synonymous codon usage patterns of gene population have a bigger power to represent genetic diversity among different species than amino acid usage. These results suggest that although the mycoplasmas reduce its genome size during the evolutionary process and shape the form, which is opposite to their hosts, of AT usages at high levels, this kind organism still depends on nucleotide usage at the 1st and 2nd codon positions to control syntheses of the requested proteins for surviving in their hosts and nucleotide usage at the 3rd codon position to develop genetic diversity of different mycoplasma species. This systemic analysis with 18 mycoplasma species may provide useful clues for further in vivo genetic studies on the related species.
Affiliation(s)
- Xiao-Xia Ma
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Xin Cao
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Peng Ma
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Qiu-Yan Chang
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Lin-Jie Li
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Xiao-Kai Zhou
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- De-Rong Zhang
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Ming-Sheng Li
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
- Zhong-Ren Ma
  - Key Laboratory of Bioengineering & Biotechnology of State Ethnic Affairs Commission, Engineering and Technology Research Center for Animal Cell, College of Life Science and Engineering, Northwest Minzu University, Lanzhou, Gansu, P.R. China
10.
Abstract
Synonymous mutations do not change the sequence of the polypeptide but they may still influence fitness. We investigated in Salmonella enterica how four synonymous mutations in the rpsT gene (encoding ribosomal protein S20) reduce fitness (i.e., growth rate) and the mechanisms by which this cost can be genetically compensated. The reduced growth rates of the synonymous mutants were correlated with reduced levels of the rpsT transcript and S20 protein. In an adaptive evolution experiment, these fitness impairments could be compensated by mutations that either caused up-regulation of S20 through increased gene dosage (due to duplications), increased transcription of the rpsT gene (due to an rpoD mutation or mutations in rpsT), or increased translation from the rpsT transcript (due to rpsT mutations). We suggest that the reduced levels of S20 in the synonymous mutants result in production of a defective subpopulation of 30S subunits lacking S20 that reduce protein synthesis and bacterial growth and that the compensatory mutations restore S20 levels and the number of functional ribosomes. Our results demonstrate how specific synonymous mutations can cause substantial fitness reductions and that many different types of intra- and extragenic compensatory mutations can efficiently restore fitness. Furthermore, this study highlights that also synonymous sites can be under strong selection, which may have implications for the use of dN/dS ratios as signature for selection.
Affiliation(s)
- Anna Knöppel
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Joakim Näsvall
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I Andersson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
11. Hauber DJ, Grogan DW, DeBry RW. Mutations to Less-Preferred Synonymous Codons in a Highly Expressed Gene of Escherichia coli: Fitness and Epistatic Interactions. PLoS One 2016; 11:e0146375. [PMID: 26727272 PMCID: PMC4699635 DOI: 10.1371/journal.pone.0146375] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/14/2015] [Accepted: 12/16/2015] [Indexed: 01/11/2023] [Open access]
Abstract
Codon-tRNA coevolution to maximize protein production has been, until recently, the dominant hypothesis to explain codon-usage bias in highly expressed bacterial genes. Two predictions of this hypothesis are 1) selection is weak; and 2) similar silent replacements at different codons should have similar fitness consequence. We used an allele-replacement strategy to change five specific 3rd-codon-position (silent) sites in the highly expressed Escherichia coli ribosomal protein gene rplQ from the wild type to a less-preferred alternative. We introduced the five mutations within a 10-codon region. Four of the silent sites were chosen to test the second prediction, with a CTG to CTA mutation being introduced at two closely linked leucine codons and an AAA to AAG mutation being introduced at two closely linked lysine codons. We also introduced a fifth silent mutation, a GTG to GTA mutation at a valine codon in the same genic region. We measured the fitness effect of the individual mutations by competing each single-mutant strain against the parental wild-type strain, using a disrupted form of the araA gene as a selectively neutral phenotypic marker to distinguish between strains in direct competition experiments. Three of the silent mutations had a fitness effect of |s| > 0.02, which is contradictory to the prediction that selection will be weak. The two leucine mutations had significantly different fitness effects, as did the two lysine mutations, contradictory to the prediction that similar mutations at different codons should have similar fitness effects. We also constructed a strain carrying all five silent mutations in combination. Its fitness effect was greater than that predicted from the individual fitness values, suggesting that negative synergistic epistasis acts on the combination allele.
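The abstract does not state which estimator was used for the fitness effect s. One widely used choice in head-to-head competition assays is the per-generation change in the natural log of the mutant-to-wild-type ratio. The sketch below assumes that estimator, and also assumes that single-mutation effects add on this log scale when there is no epistasis. Function and parameter names are hypothetical.

```python
import math


def selection_coefficient(mut_start, wt_start, mut_end, wt_end, generations):
    """Per-generation selection coefficient from competitor counts.

    Assumed estimator: s = ln(R_end / R_start) / generations,
    where R is the mutant : wild-type ratio. Negative s means the mutant is less fit.
    """
    ratio_start = mut_start / wt_start
    ratio_end = mut_end / wt_end
    return math.log(ratio_end / ratio_start) / generations


def epistasis(s_combined, s_singles):
    """Deviation of the multi-mutant effect from the sum of single-mutant effects.

    Assumes effects are additive on the log-ratio scale in the absence of
    epistasis; a negative result means the combination is worse than expected.
    """
    return s_combined - sum(s_singles)
```

With this sign convention, a combined strain whose s is more negative than the sum of the single-mutant values yields a negative deviation, matching the direction of the synergistic effect described above.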
Affiliation(s)
- David J. Hauber
  - Department of Biological Sciences, University of Cincinnati, Cincinnati, Ohio, United States of America
- Dennis W. Grogan
  - Department of Biological Sciences, University of Cincinnati, Cincinnati, Ohio, United States of America
- Ronald W. DeBry
  - Department of Biological Sciences, University of Cincinnati, Cincinnati, Ohio, United States of America
12
|
Camiolo S, Melito S, Porceddu A. New insights into the interplay between codon bias determinants in plants. DNA Res 2015; 22:461-70. [PMID: 26546225 PMCID: PMC4675714 DOI: 10.1093/dnares/dsv027] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 06/30/2015] [Accepted: 10/01/2015] [Indexed: 12/28/2022] Open
Abstract
Codon bias is the non-random use of synonymous codons, a phenomenon that has been observed in species as diverse as bacteria, plants and mammals. The preferential use of particular synonymous codons may reflect neutral mechanisms (e.g. mutational bias, GC-biased gene conversion, genetic drift) and/or selection for mRNA stability, translational efficiency and accuracy. The extent to which these different factors influence codon usage is unknown, so we dissected the contribution of mutational bias and selection towards codon bias in genes from 15 eudicots, 4 monocots and 2 mosses. We analysed the frequency of mononucleotides, dinucleotides and trinucleotides and investigated whether the compositional genomic background could account for the observed codon usage profiles. Neutral forces such as mutational pressure and GC-biased gene conversion appeared to underlie most of the observed codon bias, although there was also evidence for the selection of optimal translational efficiency and mRNA folding. Our data confirmed the compositional differences between monocots and dicots, with the former featuring in general a lower background compositional bias but a higher overall codon bias.
Affiliation(s)
- S Camiolo
- Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy
- S Melito
- Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy
- A Porceddu
- Dipartimento di Agraria, SACEG, Università degli Studi di Sassari, Sassari, Italy
|
13
|
Karimi Z, Nezafat N, Negahdaripour M, Berenjian A, Hemmati S, Ghasemi Y. The effect of rare codons following the ATG start codon on expression of human granulocyte-colony stimulating factor in Escherichia coli. Protein Expr Purif 2015; 114:108-14. [DOI: 10.1016/j.pep.2015.05.017] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/06/2015] [Revised: 05/27/2015] [Accepted: 05/29/2015] [Indexed: 10/23/2022]
|
14
|
The effects of the context-dependent codon usage bias on the structure of the nsp1α of porcine reproductive and respiratory syndrome virus. Biomed Res Int 2014; 2014:765320. [PMID: 25162025 PMCID: PMC4137607 DOI: 10.1155/2014/765320] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/18/2014] [Revised: 06/05/2014] [Accepted: 06/19/2014] [Indexed: 11/18/2022]
Abstract
The information about the crystal structure of porcine reproductive and respiratory syndrome virus (PRRSV) leader protease nsp1α is available to analyze the roles of tRNA abundance of pigs and codon usage of the nsp1 α gene in the formation of this protease. The effects of tRNA abundance of the pigs and the synonymous codon usage and the context-dependent codon bias (CDCB) of the nsp1 α on shaping the specific folding units (α-helix, β-strand, and the coil) in the nsp1α were analyzed based on the structural information about this protease from protein data bank (PDB: 3IFU) and the nsp1 α of the 191 PRRSV strains. By mapping the overall tRNA abundance along the nsp1 α, we found that there is no link between the fluctuation of the overall tRNA abundance and the specific folding units in the nsp1α, and the low translation speed of ribosome caused by the tRNA abundance exists in the nsp1 α. The strong correlation between some synonymous codon usage and the specific folding units in the nsp1α was found, and the phenomenon of CDCB exists in the specific folding units of the nsp1α. These findings provide an insight into the roles of the synonymous codon usage and CDCB in the formation of PRRSV nsp1α structure.
|
15
|
The effects of codon context on in vivo translation speed. PLoS Genet 2014; 10:e1004392. [PMID: 24901308 PMCID: PMC4046918 DOI: 10.1371/journal.pgen.1004392] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Received: 12/23/2013] [Accepted: 04/04/2014] [Indexed: 11/19/2022] Open
Abstract
We developed a bacterial genetic system based on translation of the his operon leader peptide gene to determine the relative speed at which the ribosome reads single or multiple codons in vivo. Low frequency effects of so-called "silent" codon changes and codon neighbor (context) effects could be measured using this assay. An advantage of this system is that translation speed is unaffected by the primary sequence of the His leader peptide. We show that the apparent speed at which ribosomes translate synonymous codons can vary substantially even for synonymous codons read by the same tRNA species. Assaying translation through codon pairs for the 5'- and 3'- side positioning of the 64 codons relative to a specific codon revealed that the codon-pair orientation significantly affected in vivo translation speed. Codon pairs with rare arginine codons and successive proline codons were among the slowest codon pairs translated in vivo. This system allowed us to determine the effects of different factors on in vivo translation speed including Shine-Dalgarno sequence, rate of dipeptide bond formation, codon context, and charged tRNA levels.
|
16
|
Lind PA, Andersson DI. Fitness costs of synonymous mutations in the rpsT gene can be compensated by restoring mRNA base pairing. PLoS One 2013; 8:e63373. [PMID: 23691039 PMCID: PMC3655191 DOI: 10.1371/journal.pone.0063373] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 03/18/2013] [Accepted: 03/27/2013] [Indexed: 01/12/2023] Open
Abstract
We previously reported that the distributions of fitness effects for non-synonymous and synonymous mutations in Salmonella typhimurium ribosomal proteins S20 and L1 are similar, suggesting that fitness constraints are present at the level of mRNA. Here we explore the hypothesis that synonymous mutations confer their fitness-reducing effect by altering the secondary structure of the mRNA. To this end, we constructed a set of synonymous substitutions in the rpsT gene, encoding ribosomal protein S20, that are located in predicted paired regions in the mRNA and measured their effect on bacterial fitness. Our results show that for 3/9 cases tested, the reduced fitness conferred by a synonymous mutation could be fully or partly restored by introducing a second synonymous substitution that restores base pairing in an mRNA stem. In addition, random mutations in predicted paired regions had larger fitness effects than those in unpaired regions. Finally, we did not observe any correlation between fitness effects of the synonymous mutations and their rarity. These results suggest that for ribosomal protein S20, the deleterious effects of synonymous mutations are not generally due to codon usage effects, but that mRNA secondary structure is a major fitness constraint.
Affiliation(s)
- Peter A. Lind
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I. Andersson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
|
17
|
Behura SK, Severson DW. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes. Biol Rev Camb Philos Soc 2012; 88:49-61. [PMID: 22889422 DOI: 10.1111/j.1469-185x.2012.00242.x] [Citation(s) in RCA: 134] [Impact Index Per Article: 10.3] [Indexed: 01/27/2023]
Abstract
Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance.
Affiliation(s)
- Susanta K Behura
- Department of Biological Sciences, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA.
|
18
|
Forster AC. Synthetic biology challenges long-held hypotheses in translation, codon bias and transcription. Biotechnol J 2012; 7:835-45. [DOI: 10.1002/biot.201200002] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/12/2012] [Revised: 04/28/2012] [Accepted: 05/08/2012] [Indexed: 11/09/2022]
|
19
|
Hua J, Lee RW. Factors Affecting Codon Bias in the Mitochondrial Genomes of the Streptophyte Mesostigma viride and the Chlorophyte Chlamydomonas reinhardtii. J Eukaryot Microbiol 2012; 59:287-9. [DOI: 10.1111/j.1550-7408.2011.00613.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/20/2011] [Accepted: 01/12/2012] [Indexed: 11/29/2022]
Affiliation(s)
- Jimeng Hua
- Department of Biology; Dalhousie University; Halifax; Nova Scotia; B3H 4R2; Canada
- Robert W. Lee
- Department of Biology; Dalhousie University; Halifax; Nova Scotia; B3H 4R2; Canada
|
20
|
Chattopadhyay S, Sahoo S, Kanner WA, Chakrabarti J. Pressures in archaeal protein coding genes: a comparative study. Comp Funct Genomics 2010; 4:56-65. [PMID: 18629113 PMCID: PMC2447400 DOI: 10.1002/cfg.246] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/13/2002] [Accepted: 11/25/2002] [Indexed: 11/06/2022] Open
Abstract
Our studies on the bases of codons from 11 completely sequenced archaeal genomes show that, as we move from GC-rich to AT-rich protein-coding gene-containing species, the differences between G and C and between A and T, the purine load (AG content), and also the overall persistence (i.e. the tendency of a base to be followed by the same base) within codons, all increase almost simultaneously, although the extent of increase is different over the three positions within codons. These findings suggest that the deviations from the second parity rule (through the increasing differences between complementary base contents) and the increasing purine load hinder the chance of formation of the intra-strand Watson-Crick base-paired secondary structures in mRNAs (synonymous with the protein-coding genes we dealt with), thereby increasing the translational efficiency. We hypothesize that the AT-rich protein-coding gene-containing archaeal species might have better translational efficiency than their GC-rich counterparts.
Affiliation(s)
- Sujay Chattopadhyay
- Department of Theoretical Physics, Indian Association for the Cultivation of Science, Jadavpur, Calcutta 700 032, India.
|
21
|
Abstract
The frequencies of alternative synonymous codons vary both among species and among genes from the same genome. These patterns have been inferred to reflect the action of natural selection. Here we evaluate this in bacteria. While intragenomic variation in many species is consistent with selection favouring translationally optimal codons, much of the variation among species appears to be due to biased patterns of mutation. The strength of selection on codon usage can be estimated by two different approaches. First, the extent of bias in favour of translationally optimal codons in highly expressed genes, compared to that in genes where selection is weak, reveals the long-term effectiveness of selection. Here we show that the strength of selected codon usage bias is highly correlated with bacterial growth rate, suggesting that selection has favoured translational efficiency. Second, the pattern of bias towards optimal codons at polymorphic sites reveals the ongoing action of selection. Using this approach we obtained results that were completely consistent with the first method; importantly, the frequency spectra of optimal codons at polymorphic sites were similar to those predicted under an equilibrium model. Highly expressed genes in Escherichia coli appear to be under continuing strong selection, whereas selection is very weak in genes expressed at low levels.
Affiliation(s)
- Paul M Sharp
- Institute of Evolutionary Biology, University of Edinburgh, Kings Buildings, Edinburgh EH9 3JT, UK.
|
22
|
Tats A, Tenson T, Remm M. Preferred and avoided codon pairs in three domains of life. BMC Genomics 2008; 9:463. [PMID: 18842120 PMCID: PMC2585594 DOI: 10.1186/1471-2164-9-463] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.9] [Received: 06/13/2008] [Accepted: 10/08/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Alternative synonymous codons are not used with equal frequencies. In addition, the contexts of codons - neighboring nucleotides and neighboring codons - can have certain patterns. The codon context can influence both translational accuracy and elongation rates. However, it is not known how strong or conserved the codon context preferences in different organisms are. We analyzed 138 organisms (bacteria, archaea and eukaryotes) to find conserved patterns of codon pairs. RESULTS: After removing the effects of single codon usage and dipeptide biases we discovered a set of neighboring codons for which avoidances or preferences were conserved in all three domains of life. Such biased codon pairs could be divided into subtypes on the basis of the nucleotide patterns that influence the bias. The most frequently avoided type of codon pair was nnUAnn. We discovered that 95.7% of avoided nnUAnn type patterns contain out-frame UAA or UAG triplets on the sense and/or antisense strand. On average, nnUAnn codon pairs are more frequently avoided in ORFeomes than in genomes. Thus we assume that translational selection plays a major role in the avoidance of these codon pairs. Among the preferred codon pairs, nnGCnn was the major type. CONCLUSION: Translational selection shapes codon pair usage in protein coding sequences by rules that are common to all three domains of life. The most frequently avoided codon pairs contain the patterns nnUAnn, nnGGnn, nnGnnC, nnCGCn, GUCCnn, CUCCnn, nnCnnA or UUCGnn. The most frequently preferred codon pairs contain the patterns nnGCnn, nnCAnn or nnUnCn.
Affiliation(s)
- Age Tats
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Riia str. 23, Tartu 51010, Estonia.
|
23
|
Cutler RW, Chantawannakul P. Synonymous codon usage bias dependent on local nucleotide context in the class Deinococci. J Mol Evol 2008; 67:301-14. [PMID: 18696025 DOI: 10.1007/s00239-008-9152-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 06/15/2008] [Accepted: 07/14/2008] [Indexed: 11/25/2022]
Abstract
To study the evolution of mutation biased synonymous codon usage, we examined nucleotide co-occurrence patterns in the Deinococcus radiodurans, D. geothermalis, and Thermus thermophilus genomes for nucleotide replacement dependent on the surrounding nucleotide context. Nucleotides on the third codon site were found to be strongly correlated with nucleotide sites at most six nucleotides away in all three species, where abundance patterns were dependent on whether two nucleotides share the same purine(R)/pyrimidine(Y) status. In the class Deinococci adjacent third site nucleotides were strongly correlated, where NNR|NNR and NNY|NNY codon pairs were overabundant while NNR|NNY and NNY|NNR codon pairs were underabundant. By far the largest deviations in all three species occur for NN(YR)|(YR)NN codon pairs. In the Thermus species, the NNY|YNN and NNR|RNN codon pairs were overabundant versus the underabundant NNY|RNN and NNR|YNN codon pairs, whereas in the Deinococcus species the opposite over-/underabundance relationship held for adjacent (GC) bases. We also observed a weaker overabundance of NNR|NRN and NNY|NYN codon pairs versus the underabundant NNR|NYN and NNY|NRN codon pairs. The perfect purine/pyrimidine symmetry of each of these cases, plus the lack of significant deviations for nucleotide pairs on other length scales up to 20 codons apart demonstrates that a pervasive pattern of nucleotide replacement dependent on local nucleotide context, and not codon bias, has occurred in these species. This nucleotide replacement has led to modified synonymous codon usage within the class Deinococci that affects which codons are positioned at particular codon sites dependent on the local nucleotide context.
|
24
|
Moura G, Pinheiro M, Arrais J, Gomes AC, Carreto L, Freitas A, Oliveira JL, Santos MAS. Large scale comparative codon-pair context analysis unveils general rules that fine-tune evolution of mRNA primary structure. PLoS One 2007; 2:e847. [PMID: 17786218 PMCID: PMC1952141 DOI: 10.1371/journal.pone.0000847] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Received: 05/25/2007] [Accepted: 07/31/2007] [Indexed: 11/18/2022] Open
Abstract
Background: Codon usage and codon-pair context are important gene primary structure features that influence mRNA decoding fidelity. In order to identify general rules that shape codon-pair context and minimize mRNA decoding error, we have carried out a large scale comparative codon-pair context analysis of 119 fully sequenced genomes. Methodologies/Principal Findings: We have developed mathematical and software tools for large scale comparative codon-pair context analysis. These methodologies unveiled general and species specific codon-pair context rules that govern evolution of mRNAs in the 3 domains of life. We show that evolution of bacterial and archaeal mRNA primary structure is mainly dependent on constraints imposed by the translational machinery, while in eukaryotes DNA methylation and tri-nucleotide repeats impose strong biases on codon-pair context. Conclusions: The data highlight fundamental differences between prokaryotic and eukaryotic mRNA decoding rules, which are partially independent of codon usage.
Affiliation(s)
- Gabriela Moura
- Department of Biology, Center for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal
- Miguel Pinheiro
- Institute of Electronics and Telematics Engineering, University of Aveiro, Aveiro, Portugal
- Joel Arrais
- Institute of Electronics and Telematics Engineering, University of Aveiro, Aveiro, Portugal
- Ana Cristina Gomes
- Department of Biology, Center for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal
- Laura Carreto
- Department of Biology, Center for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal
- Adelaide Freitas
- Department of Mathematics, University of Aveiro, Aveiro, Portugal
- José L. Oliveira
- Institute of Electronics and Telematics Engineering, University of Aveiro, Aveiro, Portugal
- Manuel A. S. Santos
- Department of Biology, Center for Environmental and Marine Studies, University of Aveiro, Aveiro, Portugal
|
25
|
Cutler RW, Chantawannakul P. The effect of local nucleotides on synonymous codon usage in the honeybee (Apis mellifera L.) genome. J Mol Evol 2007; 64:637-45. [PMID: 17541680 DOI: 10.1007/s00239-006-0198-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/31/2006] [Accepted: 02/12/2007] [Indexed: 10/23/2022]
Abstract
Using all currently predicted coding regions in the honeybee genome, a novel form of synonymous codon bias is presented that affects the usage of particular codons dependent on the surrounding nucleotides in the coding region. Nucleotides at the third codon site are correlated, dependent on their weak (adenine [A] or thymine [T]) versus strong (guanine [G] or cytosine [C]) status, to nucleotides on the first codon site which are dependent on their purine (A/G) versus pyrimidine (C/T) status. In particular, for adjacent third and first site nucleotides, weak-pyrimidine and strong-purine nucleotide combinations occur much more frequently than the underabundant weak-purine and strong-pyrimidine nucleotide combinations. Since a similar effect is also found in the noncoding regions, but is present for all adjacent nucleotides, this coding effect is most likely due to a genome-wide context-dependent mutation error correcting mechanism in combination with selective constraints on adjacent first and second nucleotide pairs within codons. The position-dependent relationship of synonymous codon usage is evidence for a novel form of codon position bias which utilizes the redundancy in the genetic code to minimize the effect of nucleotide mutations within coding regions.
Affiliation(s)
- Robert W Cutler
- Department of Biology, Bard College, Annandale-on-Hudson, NY 12504, USA
|
26
|
Moura G, Pinheiro M, Freitas AV, Oliveira JL, Santos MAS. Computational and statistical methodologies for ORFeome primary structure analysis. Methods Mol Biol 2007; 395:449-462. [PMID: 17993691 DOI: 10.1007/978-1-59745-514-5_28] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/25/2023]
Abstract
Codon usage and context are biased in open reading frames (ORFs) of most genomes. Codon usage is largely influenced by biased genome G+C pressure, in particular in prokaryotes, but the general rules that govern the evolution of codon context remain largely elusive. To shed new light on this question, we have developed computational, statistical, and graphical tools for analysis of codon context on an ORFeome-wide scale. Here, we describe these methodologies in detail and show how they can be used for analysis of ORFs of any sequenced genome.
|
27
|
Michel CJ. An analytical model of gene evolution with 9 mutation parameters: an application to the amino acids coded by the common circular code. Bull Math Biol 2006; 69:677-98. [PMID: 16952018 DOI: 10.1007/s11538-006-9147-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 01/17/2006] [Accepted: 05/31/2006] [Indexed: 10/24/2022]
Abstract
We develop here an analytical evolutionary model based on a trinucleotide mutation matrix 64 x 64 with nine substitution parameters associated with the three types of substitutions in the three trinucleotide sites. It generalizes the previous models based on the nucleotide mutation matrices 4 x 4 and the trinucleotide mutation matrix 64 x 64 with three and six parameters. It determines at some time t the exact occurrence probabilities of trinucleotides mutating randomly according to these nine substitution parameters. An application of this model allows an evolutionary study of the common circular code [Formula: see text] of eukaryotes and prokaryotes and its 12 coded amino acids. The main property of this code [Formula: see text] is the retrieval of the reading frames in genes, both locally, i.e. anywhere in genes and in particular without a start codon, and automatically with a window of a few nucleotides. However, since its identification in 1996, amino acid information coded by [Formula: see text] has never been studied. Very unexpectedly, this evolutionary model demonstrates that random substitutions in this code [Formula: see text] and with particular values for the nine substitutions parameters retrieve after a certain time of evolution a frequency distribution of these 12 amino acids very close to the one coded by the actual genes.
Affiliation(s)
- Christian J Michel
- Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France.
|
28
|
Frey G, Michel CJ. An analytical model of gene evolution with six mutation parameters: an application to archaeal circular codes. Comput Biol Chem 2006; 30:1-11. [PMID: 16324886 DOI: 10.1016/j.compbiolchem.2005.09.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 08/26/2005] [Revised: 09/04/2005] [Accepted: 09/05/2005] [Indexed: 11/17/2022]
Abstract
We develop here an analytical evolutionary model based on a trinucleotide mutation matrix 64 x 64 with six substitution parameters associated with the transitions and transversions in the three trinucleotide sites. It generalizes the previous models based on the nucleotide mutation matrices 4 x 4 and the trinucleotide mutation matrix 64 x 64 with three parameters. It determines at some time t the exact occurrence probabilities of trinucleotides mutating randomly according to six substitution parameters. An application of this model allows an evolutionary study of the common circular code COM and the 15 archaeal circular codes X which have been recently identified in several archaeal genomes. The main property of a circular code is the retrieval of the reading frames in genes, both locally, i.e. anywhere in genes and in particular without a start codon, and automatically with a window of a few nucleotides. In genes, the circular code is superimposed on the traditional genetic one. Very unexpectedly, the evolutionary model demonstrates that the archaeal circular codes can derive from the common circular code subjected to random substitutions with particular values for six substitution parameters. It has a strong correlation with the statistical observations of three archaeal codes in actual genes. Furthermore, the properties of these substitution rates allow proposal of an evolutionary classification of the 15 archaeal codes into three main classes according to this model. In almost all the cases, they agree with the actual degeneracy of the genetic code with substitutions more frequent in the third trinucleotide site and with transitions more frequent than transversions in any trinucleotide site.
Affiliation(s)
- Gabriel Frey
- Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France.
|
29
|
Gilchrist MA, Wagner A. A model of protein translation including codon bias, nonsense errors, and ribosome recycling. J Theor Biol 2006; 239:417-34. [PMID: 16171830 DOI: 10.1016/j.jtbi.2005.08.007] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Received: 01/18/2005] [Revised: 08/05/2005] [Accepted: 08/08/2005] [Indexed: 11/15/2022]
Abstract
We present and analyse a model of protein translation at the scale of an individual messenger RNA (mRNA) transcript. The model we develop is unique in that it incorporates the phenomena of ribosome recycling and nonsense errors. The model conceptualizes translation as a probabilistic wave of ribosome occupancy traveling down a heterogeneous medium, the mRNA transcript. Our results show that the heterogeneity of the codon translation rates along the mRNA results in short-scale spikes and dips in the wave. Nonsense errors attenuate this wave on a longer scale while ribosome recycling reinforces it. We find that the combination of nonsense errors and codon usage bias can have a large effect on the probability that a ribosome will completely translate a transcript. We also elucidate how these forces interact with ribosome recycling to determine the overall translation rate of an mRNA transcript. We derive a simple cost function for nonsense errors using our model and apply this function to the yeast (Saccharomyces cerevisiae) genome. Using this function we are able to detect position dependent selection on codon bias which correlates with gene expression levels as predicted a priori. These results indirectly validate our underlying model assumptions and confirm that nonsense errors can play an important role in shaping codon usage bias.
Affiliation(s)
- Michael A Gilchrist
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 37996, USA.
|
30
|
Das S, Paul S, Dutta C. Evolutionary constraints on codon and amino acid usage in two strains of human pathogenic actinobacteria Tropheryma whipplei. J Mol Evol 2006; 62:645-58. [PMID: 16557339 DOI: 10.1007/s00239-005-0164-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 07/04/2005] [Accepted: 12/20/2005] [Indexed: 12/13/2022]
Abstract
The factors governing codon and amino acid usages in the predicted protein-coding sequences of Tropheryma whipplei TW08/27 and Twist genomes have been analyzed. Multivariate analysis identifies the replicational-transcriptional selection coupled with DNA strand-specific asymmetric mutational bias as a major driving force behind the significant interstrand variations in synonymous codon usage patterns in T. whipplei genes, while a residual intrastrand synonymous codon bias is imparted by a selection force operating at the level of translation. The strand-specific mutational pressure has little influence on the amino acid usage, for which the mean hydropathy level and aromaticity are the major sources of variation, both having nearly equal impact. In spite of the intracellular lifestyle, the amino acid usage in highly expressed gene products of T. whipplei follows the cost-minimization hypothesis. The products of the highly expressed genes of these relatively A + T-rich actinobacteria prefer to use the residues encoded by GC-rich codons, probably due to greater conservation of a GC-rich ancestral state in the highly expressed genes, as suggested by the lower values of the rate of nonsynonymous divergences between orthologous sequences of highly expressed genes from the two strains of T. whipplei. Both the genomes under study are characterized by the presence of two distinct groups of membrane-associated genes, products of which exhibit significant differences in primary and potential secondary structures as well as in the propensity of protein disorder.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, 4 Raja S. C. Mullick Road, Kolkata 700 032, India
|
31
|
Abstract
The sequence of a stretch of nucleotides affects its propensity for errors during replication and expression. Are proteins encoded by stable or unstable nucleotide sequences? If selection for variability is prevalent, one could expect an excess of unstable sequences. Alternatively, if selection against targets for errors were substantial, an excess of stable sequences would be expected. We screened the genome sequences of different organisms for an important determinant of stability, the presence of mononucleotide repeats. We find that codons are used to encode proteins in a way that avoids the emergence of mononucleotide repeats, and we can attribute this bias to selection rather than a neutral process. This indicates that selection for stability, rather than for the generation of variation, substantially influences how information is encoded in the genome. Mutations are a double-edged sword. Most mutations are deleterious to an organism's fitness. On the other hand, without mutation, evolutionary change cannot occur. The rate of mutation is partially controlled by the organism, and one determinant of the mutation rate is the DNA sequence itself. Some DNA sequences are prone to mutations and errors during gene expression, whereas other sequences are more stable. Do organisms typically use stable or unstable DNA sequences in their genes? Both possibilities might seem plausible, and both have been postulated. To answer this question, the authors studied whether organisms' DNA sequences are more or less stable than expected by chance. Analyzing the genomes of a bacterium, a yeast, and a nematode, they find an overwhelming prevalence of stable DNA sequences, suggesting that selection for genetic stability is more important than selection for the generation of variation.
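The screen for mononucleotide repeats described in this abstract can be illustrated with a minimal Python sketch. The function name `mononucleotide_runs` and the default minimum run length of 4 are assumptions chosen for illustration; the abstract does not state the threshold or software the authors used.

```python
import re


def mononucleotide_runs(seq: str, min_len: int = 4) -> list[tuple[int, str]]:
    """Return (start index, run) for every run of one repeated base.

    min_len is a hypothetical threshold, not a value taken from the abstract.
    """
    # A base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(mononucleotide_runs("ATGAAAAACGGGGT"))
```

Comparing such counts between real coding sequences and sequences with synonymous codons reassigned at random would be one way to ask whether repeats are avoided, in the spirit of the question posed above.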
Affiliation(s)
- Martin Ackermann
- Division of Biological Sciences, University of California San Diego, La Jolla, California, United States of America.
|
32
|
Frey G, Michel CJ. Identification of circular codes in bacterial genomes and their use in a factorization method for retrieving the reading frames of genes. Comput Biol Chem 2006; 30:87-101. [PMID: 16439185 DOI: 10.1016/j.compbiolchem.2005.11.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 09/30/2005] [Revised: 11/07/2005] [Accepted: 11/07/2005] [Indexed: 10/25/2022]
Abstract
We developed a statistical method that allows each trinucleotide to be associated with a unique frame among the three possible ones in a (protein coding) gene. An extensive gene study in 175 complete bacterial genomes based on this statistical approach resulted in identification of 72 new circular codes. Finding a circular code enables an immediate retrieval of the reading frame locally anywhere in a gene. No knowledge of location of the start codon is required and a short window of only a few nucleotides is sufficient for automatic retrieval. We have therefore developed a factorization method (that explores previously found circular codes) for retrieving the reading frames of bacterial genes. Its principle is new and easy to understand. Neither complex treatment nor specific information on the nucleotide sequences is necessary. Moreover, the method can be used for short regions in nucleotide sequences (less than 25 nucleotides in protein coding genes). Selected additional properties of circular codes and their possible biological consequences are also discussed.
Affiliation(s)
- Gabriel Frey
- Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France.
|
33
|
Mitreva M, Wendl MC, Martin J, Wylie T, Yin Y, Larson A, Parkinson J, Waterston RH, McCarter JP. Codon usage patterns in Nematoda: analysis based on over 25 million codons in thirty-two species. Genome Biol 2006; 7:R75. [PMID: 26271136 PMCID: PMC1779591 DOI: 10.1186/gb-2006-7-8-r75] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.5] [Received: 04/20/2006] [Revised: 06/30/2006] [Accepted: 08/14/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Codon usage has direct utility in molecular characterization of species and is also a marker for molecular evolution. To understand codon usage within the diverse phylum Nematoda, we analyzed a total of 265,494 expressed sequence tags (ESTs) from 30 nematode species. The full genomes of Caenorhabditis elegans and C. briggsae were also examined. A total of 25,871,325 codons were analyzed and a comprehensive codon usage table for all species was generated. This is the first codon usage table available for 24 of these organisms. RESULTS Codon usage similarity in Nematoda usually persists over the breadth of a genus but then rapidly diminishes even within each clade. Globodera, Meloidogyne, Pristionchus, and Strongyloides have the most highly derived patterns of codon usage. The major factor affecting differences in codon usage between species is the coding sequence GC content, which varies in nematodes from 32% to 51%. Coding GC content (measured as GC3) also explains much of the observed variation in the effective number of codons (R = 0.70), which is a measure of codon bias, and it even accounts for differences in amino acid frequency. Codon usage is also affected by neighboring nucleotides (N1 context). Coding GC content correlates strongly with estimated noncoding genomic GC content (R = 0.92). On examining abundant clusters in five species, candidate optimal codons were identified that may be preferred in highly expressed transcripts. CONCLUSION Evolutionary models indicate that total genomic GC content, probably the product of directional mutation pressure, drives codon usage rather than the converse, a conclusion that is supported by examination of nematode genomes.
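GC3, the measure of coding GC content mentioned in this abstract, is conventionally the fraction of codons whose third position is G or C. A minimal Python sketch follows; the function name `gc3` is hypothetical, and the choice to ignore a trailing partial codon is an assumption, not something the abstract specifies.

```python
def gc3(cds: str) -> float:
    """Fraction of complete codons whose third base is G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop a trailing partial codon (assumption)
    thirds = cds[2:usable:3]              # every third base, in frame
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Codons ATG, GCC, AAA: third bases are G, C, A.
    print(gc3("ATGGCCAAA"))
```

Computed per species over all coding sequences, a value like this could then be set against a codon bias measure such as the effective number of codons.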
Affiliation(s)
- Makedonka Mitreva
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Michael C Wendl
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- John Martin
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Todd Wylie
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Yong Yin
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Allan Larson
- Department of Biology, Washington University, St. Louis, Missouri 63130, USA
- John Parkinson
- Hospital for Sick Children, Toronto, and Departments of Biochemistry/Medical Genetics and Microbiology, University of Toronto, M5G 1X8, Canada
- Robert H Waterston
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- James P McCarter
- Genome Sequencing Center, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Divergence Inc., St Louis, Missouri 63141, USA
|
34
|
Das S, Pan A, Paul S, Dutta C. Comparative Analyses of Codon and Amino Acid Usage in Symbiotic Island and Core Genome in Nitrogen-Fixing Symbiotic Bacterium Bradyrhizobium japonicum. J Biomol Struct Dyn 2005; 23:221-32. [PMID: 16060695 DOI: 10.1080/07391102.2005.10507061] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
Abstract
Genes involved in the symbiotic interactions between the nitrogen-fixing endosymbiont Bradyrhizobium japonicum and its leguminous host are mostly clustered in a symbiotic island (SI), acquired by the bacterium through a process of horizontal transfer. A comparative analysis of the codon and amino acid usage in core and SI genes/proteins of B. japonicum has been carried out in the present study. The mutational bias, translational selection, and gene length are found to be the major sources of variation in synonymous codon usage in the core genome as well as in SI, the strength of translational selection being higher in core genes than in SI. In core proteins, hydrophobicity is the main source of variation in amino acid usage, expressivity and aromaticity being the second and third important sources. But in SI proteins, aromaticity is the chief source of variation, followed by expressivity and hydrophobicity. In SI proteins, both the mean molecular weight and mean aromaticity of individual proteins exhibit significant positive correlation with gene expressivity, which violates the cost-minimization hypothesis. Investigation of nucleotide substitution patterns in B. japonicum and Mesorhizobium loti orthologous genes reveals that both synonymous and non-synonymous sites of highly expressed genes are more conserved than their lowly expressed counterparts and this conservation is more pronounced in the genes present in core genome than in SI.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, 4 Raja SC Mullick Road, Kolkata 700 032, India
|
35
|
Das S, Ghosh S, Pan A, Dutta C. Compositional variation in bacterial genes and proteins with potential expression level. FEBS Lett 2005; 579:5205-10. [PMID: 16165133 DOI: 10.1016/j.febslet.2005.08.042] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 07/11/2005] [Accepted: 08/22/2005] [Indexed: 11/22/2022]
Abstract
Usage of guanine and cytosine at three codon sites in eubacterial genes varies distinctly with potential expressivity, as predicted by Codon Adaptation Index (CAI). In bacteria with moderate/high GC-content, G(3) follows a biphasic relationship, while C(3) increases with CAI. In AT-rich bacteria, correlation of CAI is negative with G(3), but non-specific with C(3). Correlations of CAI with residues encoded by G-starting codons are positive, while those with residues encoded by C-starting codons are usually negative/random. Average Size/Complexity Score and aromaticity of gene-products decrease with CAI, confirming general validity of cost-minimization principle in free-living eubacteria. Alcoholicity of bacterial gene-products usually decreases with expressivity.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Center, Indian Institute of Chemical Biology, 4, Raja S.C. Mullick Road, Kolkata 700 032, India
|
36
|
Carlini DB. Context-dependent codon bias and messenger RNA longevity in the yeast transcriptome. Mol Biol Evol 2005; 22:1403-11. [PMID: 15772378 DOI: 10.1093/molbev/msi135] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
Context-dependent codon bias and its relationship with messenger RNA (mRNA) longevity was examined in 4,648 mRNA transcripts of the Saccharomyces cerevisiae transcriptome for which mRNA half-lives have been empirically determined. Surprisingly, rare codon usage (codons used <13 times per 1,000 codons in the genome) increased with mRNA half-life. However, it is shown that this pattern was not due to preference for rare codon use within codon families containing both rare and nonrare codons. Rather, the pattern was due to an increase in the frequency of amino acids encoded solely by rare codons, and a decrease in the frequency of amino acids never encoded by rare codons, with mRNA half-life. When standardized by open reading frame length, the use of consecutive rare codons was also positively correlated with mRNA half-life. There was negative correlation between the usage of synonymous A|T dinucleotides spanning codon boundaries and mRNA half-life, despite the fact that the frequency of AT dinucleotide usage overall, and AT dinucleotide usage at other codon position contexts (e.g., 1-2, 2-3, or 3|1 total), was not correlated with mRNA half-life. The use of A|T dinucleotides at synonymous dicodon boundaries could potentially allow for more efficient 3'-5' degradation by endonucleolytic cleavage.
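The "A|T dinucleotides spanning codon boundaries" in this abstract are pairs formed by the third base of one codon and the first base of the next (the 3|1 context). A minimal Python sketch of how such a frequency could be tallied is given here; the function name `boundary_dinucleotide_freq` is hypothetical, and the sketch counts all A|T boundaries rather than restricting to synonymous dicodon choices as the study did.

```python
def boundary_dinucleotide_freq(cds: str, dinuc: str = "AT") -> float:
    """Share of codon boundaries (3|1 context) occupied by the given dinucleotide."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    # Third base of codon k plus first base of codon k+1.
    boundaries = [cds[i + 2:i + 4] for i in range(0, usable - 3, 3)]
    if not boundaries:
        raise ValueError("need at least two complete codons")
    return boundaries.count(dinuc.upper()) / len(boundaries)


if __name__ == "__main__":
    # Codons GCA|TGG|TAA give boundaries A|T and G|T.
    print(boundary_dinucleotide_freq("GCATGGTAA"))
```

Passing `dinuc="TA"` or `dinuc="CG"` gives the corresponding boundary counts for other dinucleotides.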
|
37
|
Moura G, Pinheiro M, Silva R, Miranda I, Afreixo V, Dias G, Freitas A, Oliveira JL, Santos MAS. Comparative context analysis of codon pairs on an ORFeome scale. Genome Biol 2005; 6:R28. [PMID: 15774029 PMCID: PMC1088947 DOI: 10.1186/gb-2005-6-3-r28] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Received: 09/24/2004] [Revised: 11/25/2004] [Accepted: 01/17/2005] [Indexed: 11/10/2022] Open
Abstract
We have developed a system for comparative codon context analysis of open reading frames in whole genomes, providing insights into the rules that govern the evolution of codon-pair context. Codon context is an important feature of gene primary structure that modulates mRNA decoding accuracy. We have developed an analytical software package and a graphical interface for comparative codon context analysis of all the open reading frames in a genome (the ORFeome). Using the complete ORFeome sequences of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans and Escherichia coli, we show that this methodology permits large-scale codon context comparisons and provides new insight on the rules that govern the evolution of codon-pair context.
Affiliation(s)
- Gabriela Moura
- Centre for Cell Biology, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal
- Miguel Pinheiro
- Institute of Electronics and Telematics Engineering, University of Aveiro, 3810-193 Aveiro, Portugal
- Raquel Silva
- Centre for Cell Biology, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal
- Isabel Miranda
- Centre for Cell Biology, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal
- Vera Afreixo
- Institute of Electronics and Telematics Engineering, University of Aveiro, 3810-193 Aveiro, Portugal
- Gaspar Dias
- Institute of Electronics and Telematics Engineering, University of Aveiro, 3810-193 Aveiro, Portugal
- Adelaide Freitas
- Department of Mathematics, University of Aveiro, 3810-193 Aveiro, Portugal
- José L Oliveira
- Institute of Electronics and Telematics Engineering, University of Aveiro, 3810-193 Aveiro, Portugal
- Manuel AS Santos
- Centre for Cell Biology, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal
|
38
|
Duan J, Antezana MA. Mammalian mutation pressure, synonymous codon choice, and mRNA degradation. J Mol Evol 2004; 57:694-701. [PMID: 14745538 DOI: 10.1007/s00239-003-2519-1] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.6] [Received: 04/22/2003] [Accepted: 06/30/2003] [Indexed: 11/29/2022]
Abstract
The usage of synonymous codons (SCs) in mammalian genes is highly correlated with local base composition and is therefore thought to be determined by mutation pressure. The usage is nonetheless structured. For instance, mammals share with Saccharomyces and Drosophila most preferences for the C-ending over the G-ending codon (or vice versa) within each fourfold-degenerate SC family and the fact that their SCs are placed along coding regions in ways that minimize the number of T|A and C|G dinucleotides ("|" being the codon boundary). TA and CG underrepresentations are observed everywhere in the mammalian genome affecting the SC usage, the amino acid composition of proteins, and the primary structure of introns and noncoding DNA. While the rarity of CG is ascribed to the high mutability of this dinucleotide, the rarity of TA in coding regions is considered adaptive because UA dinucleotides are cleaved by endoribonucleases. Here we present in vivo experimental evidence indicating that the number of T|A and/or C|G dinucleotides of a human gene can affect strongly the expression level and degradation of its mRNA. Our results are consistent with indirect evidence produced by other workers and with the detailed work that has been devoted to characterize UA cleavage in vitro and in vivo. We conclude that SC choice can influence strongly mRNA function and gene expression through effects not directly related to the codon-anticodon interaction. These effects should constrain heavily the nucleotide motif composition of the most abundant mRNAs in the transcriptome, in particular, their SC usage, a usage that must be reflected by cellular tRNA concentrations and thus defines for all other genes which SCs are translated fastest and most accurately. 
Furthermore, the need to avoid such effects genome-wide appears serious enough to have favored the evolution of biases in context-dependent mutation that reduce the occurrence of intrinsically unfavorable motifs, and/or, when possible, to have induced the molecular machinery mediating such effects to rely opportunistically on already existing motif rarities and abundances. This may explain why nucleotide motif preferences are very similar in transcribed and nontranscribed mammalian DNA even though the preferences appear to be adaptive only in transcribed DNA.
Affiliation(s)
- Jubao Duan
- Department of Psychiatry, The University of Chicago, 924 East 57th Street, R-004, Chicago, IL 60637, USA
|
39
|
Fuglsang A. The relationship between palindrome avoidance and intragenic codon usage variations: a Monte Carlo study. Biochem Biophys Res Commun 2004; 316:755-62. [PMID: 15033465 DOI: 10.1016/j.bbrc.2004.02.117] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 02/03/2004] [Indexed: 10/26/2022]
Abstract
Several studies have shown that codon usage within genes varies, as it seems dependent on both codon context and codon position within the gene. Given that palindromes in addition often are avoided in genomes, this study aimed at finding out if intragenic variations in codon usage may be a way to control the amount and location of palindromes. A Monte Carlo algorithm was written which resampled the codons in genes while keeping the amino acid sequence of the translation product constant. On the resampled sequences, palindromes were counted and their intragenic positions mapped. Escherichia coli K12 uses type II restriction-modification systems and displays pronounced codon usage phenomena. Using this as a reference organism it was clearly shown that the number of palindromes in genes is generally lower than the amount of palindromes in resampled genes; thus, the succession of codons seems to be a way to decrease the number of palindromes. The intragenic position of palindromes in resampled sequences, however, was largely equal to the position in the native genes, so codon usage phenomena are unlikely to be a way to control the intragenic position of palindromes. The analysis was repeated on two bacteriophages and gave similar results, even though the virus genomes are much smaller. Studies on the endosymbionts Buchnera sp. APS and Wigglesworthia sp., which seemingly have no type II restriction-modification systems, showed that in these species there is only weak evidence for codon usage acting to control the number of palindromes.
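The resampling idea in this abstract (reshuffle codons, keep the protein unchanged, recount palindromes) can be sketched in Python as below. This is an illustration under stated assumptions, not the author's algorithm: it shuffles each gene's own codons among positions that encode the same amino acid, counts reverse-complement palindromes of a fixed length `k = 6`, and uses the standard genetic code. All function names are hypothetical, and input is assumed to be an in-frame sequence of A, C, G and T only.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def shuffle_synonymous(cds: str, rng: random.Random) -> str:
    """Permute codons among positions encoding the same amino acid."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    positions = {}
    for idx, codon in enumerate(codons):
        positions.setdefault(CODON_TO_AA[codon], []).append(idx)
    out = codons[:]
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            out[i] = codon
    return "".join(out)


def count_palindromes(seq: str, k: int = 6) -> int:
    """Count windows of length k equal to their own reverse complement."""
    return sum(
        seq[i:i + k] == seq[i:i + k].translate(COMPLEMENT)[::-1]
        for i in range(len(seq) - k + 1)
    )


def palindrome_excess(cds: str, n: int = 1000, k: int = 6, seed: int = 0):
    """Return (observed count, mean count over n resampled sequences)."""
    rng = random.Random(seed)
    observed = count_palindromes(cds, k)
    resampled = [count_palindromes(shuffle_synonymous(cds, rng), k) for _ in range(n)]
    return observed, sum(resampled) / n
```

An observed count well below the resampled mean would correspond to the palindrome avoidance reported for E. coli K12; shuffling within a gene keeps its overall codon usage fixed, which is a design choice of this sketch rather than a detail stated in the abstract.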
Affiliation(s)
- Anders Fuglsang
- Danish University of Pharmaceutical Sciences, Institute of Pharmacology, Copenhagen.
|
40
|
Rowley MJ, O'Connor K, Wijeyewickrema L. Phage display for epitope determination: a paradigm for identifying receptor-ligand interactions. Biotechnology Annual Review 2004; 10:151-88. [PMID: 15504706 DOI: 10.1016/s1387-2656(04)10006-9] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.1] [Indexed: 12/23/2022]
Abstract
Antibodies that react with many different molecular species of protein and non-protein nature are widely studied in biology and have particular utilities, but the precise epitopes recognized are seldom well defined. The definition of epitopes by X-ray crystallography of the antigen-antibody complex, the gold standard procedure, has shown that most antibody epitopes are conformational and specified by interactions with topographic determinants on the surface of the antigenic molecule. Techniques available for the definition of such epitopes are limited. Phage display using either gene-specific libraries, or random peptide libraries, provides a powerful technique for an approach to epitope identification. The technique can identify amino acids on protein antigens that are critical for antibody binding and, further, the isolation of peptide motifs that are both structural and functional mimotopes of both protein and non-protein antigens. This review discusses techniques used to isolate such mimotopes, to confirm their specificity, and to characterize peptide epitopes. Moreover there are direct practical applications to deriving epitopes or mimotopes by sequence, notably the development of new diagnostic reagents, or therapeutic agonist or antagonist molecules. The techniques developed for mapping of antibody epitopes are applicable to probing the origins of autoimmune diseases and certain cancers by identifying "immunofootprints" of unknown initiating agents, as we discuss herein, and are directly applicable to examination of a wider range of receptor-ligand interactions.
Affiliation(s)
- Merrill J Rowley
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia.
|
41
|
Abstract
A new statistical method associating each trinucleotide with a frame is developed for identifying circular codes. Its sensitivity allows the detection of several circular codes in the (protein coding) genes of archaeal genomes. Several properties of these circular codes are described, in particular the lengths of the minimal windows to retrieve the construction frames, a new definition of a parameter for measuring some probabilities of words generated by the circular codes, and the types of nucleotides in the trinucleotide sites. Some biological consequences are presented in the Discussion.
Affiliation(s)
- Gabriel Frey
- Equipe de Bioinformatique Théorique, LSIIT, UMR CNRS-ULP 7005, Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France.
|
42
|
McHardy AC, Pühler A, Kalinowski J, Meyer F. Comparing expression level-dependent features in codon usage with protein abundance: An analysis of ‘predictive proteomics’. Proteomics 2003; 4:46-58. [PMID: 14730671 DOI: 10.1002/pmic.200300501] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Abstract
Synonymous codon usage is a commonly used means for estimating gene expression levels of Escherichia coli genes and has also been used for predicting highly expressed genes for a number of prokaryotic genomes. By comparison of expression level-dependent features in codon usage with protein abundance data from two proteome studies of exponentially growing E. coli and Bacillus subtilis cells, we try to evaluate whether the implicit assumption of this approach can be confirmed with experimental data. Log-odds ratio scores are used to model differences in codon usage between highly expressed genes and genomic average. Using these, the strength and significance of expression level-dependent features in codon usage were determined for the genes of the Escherichia coli, Bacillus subtilis and Haemophilus influenzae genomes. The comparison of codon usage features with protein abundance data confirmed a relationship between these to be present, although exceptions to this, possibly related to functional context, were found. For species with expression level-dependent features in their codon usage, the applied methodology could be used to improve in silico simulations of the outcome of two-dimensional gel electrophoretic experiments.
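The log-odds scoring mentioned in this abstract compares codon frequencies in highly expressed genes with the genomic average. A minimal Python sketch follows; the function names, the add-one pseudocount, and the per-codon averaging of the gene score are all assumptions for illustration, since the abstract does not give the exact formulation.

```python
import math
from collections import Counter


def codon_counts(seqs):
    """Count in-frame codons over a collection of coding sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        counts.update(s[i:i + 3] for i in range(0, len(s) - len(s) % 3, 3))
    return counts


def log_odds_table(high_seqs, all_seqs, pseudo=1.0):
    """log(f_high(codon) / f_genome(codon)) with a pseudocount (assumption)."""
    high, background = codon_counts(high_seqs), codon_counts(all_seqs)
    codons = set(high) | set(background)
    h_total = sum(high.values()) + pseudo * len(codons)
    b_total = sum(background.values()) + pseudo * len(codons)
    return {
        c: math.log(((high[c] + pseudo) / h_total) / ((background[c] + pseudo) / b_total))
        for c in codons
    }


def gene_score(seq, table):
    """Mean log-odds over a gene's codons; codons missing from the table score 0."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    return sum(table.get(c, 0.0) for c in codons) / len(codons)
```

A positive gene score indicates codon usage closer to the highly expressed set than to the genomic average, which is the kind of feature that would then be compared with measured protein abundance.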
|
43
|
Abstract
The association of codon context and codon usage was studied in seven bacteria as well as Schizosaccharomyces pombe and Encephalitozoon cuniculi. The association is strongest in magnitude closest to the codons of interest but there is apparently no rule about which of the two contexts is generally most strongly associated with codon usage. In all bacterial species and in the intron-rich Sch. pombe it was furthermore observed from plots of χ² versus N that the wobble positions of codons in the proximity cause regular peaks both upstream and downstream. This observation is discussed in relation to a possible effect of mutational pressure on the association of codon usage and codon context. Absence of peaks corresponding to the wobble positions in the intron-poor En. cuniculi, and presence in Sch. pombe, may indicate that the role of introns in the context-dependent codon bias is negligible.
|
44
|
Akashi H, Gojobori T. Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proc Natl Acad Sci U S A 2002; 99:3695-700. [PMID: 11904428 PMCID: PMC122586 DOI: 10.1073/pnas.062526999] [Citation(s) in RCA: 488] [Impact Index Per Article: 21.2] [Received: 10/04/2001] [Indexed: 01/11/2023] Open
Abstract
Biosynthesis of an Escherichia coli cell, with organic compounds as sources of energy and carbon, requires approximately 20 to 60 billion high-energy phosphate bonds [Stouthamer, A. H. (1973) Antonie van Leeuwenhoek 39, 545-565]. A substantial fraction of this energy budget is devoted to biosynthesis of amino acids, the building blocks of proteins. The fueling reactions of central metabolism provide precursor metabolites for synthesis of the 20 amino acids incorporated into proteins. Thus, synthesis of an amino acid entails a dual cost: energy is lost by diverting chemical intermediates from fueling reactions and additional energy is required to convert precursor metabolites to amino acids. Among amino acids, costs of synthesis vary from 12 to 74 high-energy phosphate bonds per molecule. The energetic advantage to encoding a less costly amino acid in a highly expressed gene can be greater than 0.025% of the total energy budget. Here, we provide evidence that amino acid composition in the proteomes of E. coli and Bacillus subtilis reflects the action of natural selection to enhance metabolic efficiency. We employ synonymous codon usage bias as a measure of translation rates and show increases in the abundance of less energetically costly amino acids in highly expressed proteins.
Affiliation(s)
- Hiroshi Akashi
- Institute of Molecular Evolutionary Genetics and Department of Biology, 208 Mueller Laboratory, Pennsylvania State University, University Park, PA 16802, USA.
|
45
|
Fedorov A, Saxonov S, Gilbert W. Regularities of context-dependent codon bias in eukaryotic genes. Nucleic Acids Res 2002; 30:1192-7. [PMID: 11861911 PMCID: PMC101244 DOI: 10.1093/nar/30.5.1192] [Citation(s) in RCA: 79] [Impact Index Per Article: 3.4] [Indexed: 11/12/2022] Open
Abstract
Nucleotides surrounding a codon influence the choice of this particular codon from among the group of possible synonymous codons. The strongest influence on codon usage arises from the nucleotide immediately following the codon and is known as the N1 context. We studied the relative abundance of codons with N1 contexts in genes from four eukaryotes for which the entire genomes have been sequenced: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. For all the studied organisms it was found that 90% of the codons have a statistically significant N1 context-dependent codon bias. The relative abundance of each codon with an N1 context was compared with the relative abundance of the same 4mer oligonucleotide in the whole genome. This comparison showed that in about half of all cases the context-dependent codon bias could not be explained by the sequence composition of the genome. Ranking statistics were applied to compare context-dependent codon biases for codons from different synonymous groups. We found regularities in N1 context-dependent codon bias with respect to the codon nucleotide composition. Codons with the same nucleotides in the second and third positions and the same N1 context have a statistically significant correlation of their relative abundances.
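The N1 context defined in this abstract pairs each codon with the nucleotide immediately following it, so the unit being counted is an in-frame 4-mer. A minimal Python sketch of the tally is shown here; the function name `n1_context_counts` is hypothetical, and the statistical test and genome-wide 4-mer comparison used in the study are not reproduced.

```python
from collections import Counter


def n1_context_counts(cds: str) -> Counter:
    """Count (codon, following nucleotide) pairs along an in-frame sequence."""
    cds = cds.upper()
    counts = Counter()
    # Stop while one base still remains after the codon.
    for i in range(0, len(cds) - 3, 3):
        counts[(cds[i:i + 3], cds[i + 3])] += 1
    return counts


if __name__ == "__main__":
    for (codon, n1), n in sorted(n1_context_counts("ATGGCCGCCTAA").items()):
        print(codon, n1, n)
```

Dividing each count by the total for its synonymous group and N1 base would give the relative abundances that the abstract compares across codons.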
Affiliation(s)
- Alexei Fedorov
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
|
46
|
Abstract
Genetic mutations that lead to undetectable or minimal changes in phenotypes are said to reveal redundant functions. Redundancy is common among phenotypes of higher organisms that experience low mutation rates and small population sizes. Redundancy is less common among organisms with high mutation rates and large populations, or among the rapidly dividing cells of multicellular organisms. In these cases, one even observes the opposite tendency: a hypersensitivity to mutation, which we refer to as antiredundancy. In this paper we analyze the evolutionary dynamics of redundancy and antiredundancy. Assuming a cost of redundancy, we find that large populations will evolve antiredundant mechanisms for removing mutants and thereby bolster the robustness of wild-type genomes; whereas small populations will evolve redundancy to ensure that all individuals have a high chance of survival. We propose that antiredundancy is as important for developmental robustness as redundancy, and is an essential mechanism for ensuring tissue-level stability in complex multicellular organisms. We suggest that antiredundancy deserves greater attention in relation to cancer, mitochondrial disease, and virus infection.
Affiliation(s)
- David C Krakauer
- Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA.
|
47
|
Fedorov A, Saxonov S, Fedorova L, Daizadeh I. Comparison of intron-containing and intron-lacking human genes elucidates putative exonic splicing enhancers. Nucleic Acids Res 2001; 29:1464-9. [PMID: 11266547 PMCID: PMC31294 DOI: 10.1093/nar/29.7.1464] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
Of the rules used by the splicing machinery to precisely determine intron-exon boundaries only a fraction is known. Recent evidence suggests that specific short sequences within exons help in defining these boundaries. Such sequences are known as exonic splicing enhancers (ESE). A possible bioinformatical approach to studying ESE sequences is to compare genes that harbor introns with genes that do not. For this purpose two non-redundant samples of 719 intron-containing and 63 intron-lacking human genes were created. We performed a statistical analysis on these datasets of intron-containing and intron-lacking human coding sequences and found a statistically significant difference (P = 0.01) between these samples in terms of 5-6mer oligonucleotide distributions. The difference is not created by a few strong signals present in the majority of exons, but rather by the accumulation of multiple weak signals through small variations in codon frequencies, codon biases and context-dependent codon biases between the samples. A list of putative novel human splicing regulation sequences has been elucidated by our analysis.
Affiliation(s)
- A Fedorov
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
|
48
|
Abstract
According to New Synthesis doctrine, the direction of evolution is determined by selection and not by "internal causes" that act by way of propensities of variation. This doctrine rests on the theoretical claim that because mutation rates are small in comparison to selection coefficients, mutation is powerless to overcome opposing selection. Using a simple population-genetic model, this claim is shown to depend on assuming the prior availability of variation, so that mutation may act only as a "pressure" on the frequencies of existing alleles, and not as the evolutionary process that introduces novelty. As shown here, mutational bias in the introduction of novelty can strongly influence the course of evolution, even when mutation rates are small in comparison to selection coefficients. Recognizing this mode of causation provides a distinct mechanistic basis for an "internalist" approach to determining the contribution of mutational and developmental factors to evolutionary phenomena such as homoplasy, parallelism, and directionality.
Affiliation(s)
- L Y Yampolsky
- Center for Advanced Research in Biotechnology, Rockville, MD 20874, USA
|
49
|
Hooper SD, Berg OG. Gradients in nucleotide and codon usage along Escherichia coli genes. Nucleic Acids Res 2000; 28:3517-23. [PMID: 10982871 PMCID: PMC110745 DOI: 10.1093/nar/28.18.3517] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022]
Abstract
The usage of codons and nucleotide combinations varies along genes and systematic variation causes gradients in usage. We have studied such gradients of nucleotides and nucleotide combinations and their immediate context in Escherichia coli. To distinguish mutational and selectional effects, the genes were subdivided into three groups with different codon usage bias and the gradients of nucleotide usage were studied in each group. Some combinations that can be associated with a propensity for processivity errors show strong negative gradients that become weaker in genes with low codon bias, consistent with a selection on translational efficiency. One of the strongest gradients is for third position G, which shows a pervasive positive gradient in usage in most contexts of surrounding bases.
Affiliation(s)
- S D Hooper
- Department of Molecular Evolution, EBC, Uppsala University, Norbyvägen 18C, SE-75236, Uppsala, Sweden
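One simple way to look for a usage gradient of the kind the abstract above describes is to tabulate, for each codon position along the gene, the fraction of genes with G in the third position, and then fit a straight line through those fractions. The sketch below is an assumption-laden illustration rather than the authors' method: the function names and toy sequences are hypothetical, and the least-squares slope stands in for whatever gradient statistic the paper actually uses.

```python
def third_position_g_by_codon_index(genes, max_codons):
    """Fraction of genes with G at the third codon position, per codon index."""
    g_counts = [0] * max_codons
    n_counts = [0] * max_codons
    for seq in genes:
        seq = seq.upper()
        for k in range(min(len(seq) // 3, max_codons)):
            n_counts[k] += 1
            if seq[3 * k + 2] == "G":
                g_counts[k] += 1
    return [(k, g_counts[k] / n_counts[k])
            for k in range(max_codons) if n_counts[k] > 0]


def least_squares_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    if len(points) < 2:
        raise ValueError("need at least two points to fit a slope")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    toy_genes = ["ATGAAAAAGAAG", "ATGAAAAAAAAG"]  # made-up sequences
    profile = third_position_g_by_codon_index(toy_genes, max_codons=4)
    print(profile, least_squares_slope(profile))
```

A positive slope would correspond to third-position G becoming more frequent further along the gene. Repeating the fit separately for groups of genes with different codon usage bias mirrors the grouping described in the abstract.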
|
50
|
Freistroffer DV, Kwiatkowski M, Buckingham RH, Ehrenberg M. The accuracy of codon recognition by polypeptide release factors. Proc Natl Acad Sci U S A 2000; 97:2046-51. [PMID: 10681447 PMCID: PMC15751 DOI: 10.1073/pnas.030541097] [Citation(s) in RCA: 126] [Impact Index Per Article: 5.0] [Indexed: 11/18/2022]
Abstract
The precision with which individual termination codons in mRNA are recognized by protein release factors (RFs) has been measured and compared with the decoding of sense codons by tRNA. An Escherichia coli system for protein synthesis in vitro with purified components was used to study the accuracy of termination by RF1 and RF2 in the presence or absence of RF3. The efficiency of factor-dependent termination at all sense codons differing from any of the three stop codons by a single mutation was measured and compared with the efficiency of termination at the three stop codons. RF1 and RF2 discriminate against sense codons related to stop codons by between 3 and more than 6 orders of magnitude. This high level of accuracy is obtained without energy-driven error correction (proofreading), in contrast to codon-dependent aminoacyl-tRNA recognition by ribosomes. Two codons, UAU and UGG, stand out as hotspots for RF-dependent premature termination.
Affiliation(s)
- D V Freistroffer
- Department of Cell and Molecular Biology, Biomedical Centre, Uppsala University, Box 596, S-75124 Uppsala, Sweden
|