1
Gupta S, Singh R. Comparative study of codon usage profiles of Zingiber officinale and its associated fungal pathogens. Mol Genet Genomics 2021; 296:1121-1134. [PMID: 34181071 DOI: 10.1007/s00438-021-01808-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/16/2021] [Accepted: 06/22/2021] [Indexed: 01/08/2023]
Abstract
Codon usage bias influences the genetic features prevalent in genomes of all the organisms. It also plays a crucial role in establishing the host-pathogen relationship. The present study elucidates the role of codon usage pattern regarding the predilection of fungal pathogens Aspergillus flavus, Aspergillus niger, Fusarium oxysporum and Colletotrichum gloeosporioides towards host plant Zingiber officinale. We found a similar trend of codon usage pattern operative in plant and fungal pathogens. This concurrence might be attributed for the colonization of fungal pathogens in Z. officinale. The transcriptome of both plant and pathogens showed bias towards GC-ending codons. Natural selection and mutational pressure seem to be accountable for shaping the codon usage pattern of host and pathogen. We also identified some distinctive preferred codons in A. flavus, F. oxysporum and Z. officinale that could be regarded as signature codons for the identification of these organisms. Knowledge of favored, avoided and unique codons will help to devise strategies for reducing spice losses due to fungal pathogens.
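As an illustration of what "bias towards GC-ending codons" measures, the sketch below computes GC3, the fraction of codons whose third position is G or C. The function name `gc3_fraction` and the toy sequence are assumptions made for this example; the abstract does not say which tools or indices the authors used.

```python
def gc3_fraction(cds: str) -> float:
    """Return the fraction of codons in a coding sequence that end in G or C.

    Hypothetical helper for illustration only. The sequence is read in
    frame from its first base; a trailing partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # drop any incomplete final codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    gc_ending = sum(1 for codon in codons if codon[2] in "GC")
    return gc_ending / len(codons)


# Toy example: codons ATG, GCC, GCG, TTA -> three of four end in G or C.
print(gc3_fraction("ATGGCCGCGTTA"))  # 0.75
```

A value well above 0.5 across many genes would be consistent with the GC-ending preference the abstract reports, although a real analysis would also exclude stop codons and work per amino acid family.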
Affiliation(s)
- Suruchi Gupta
  - Plant Biotechnology Division, CSIR-Indian Institute of Integrative Medicine, Jammu, 180001, India
- Ravail Singh
  - Plant Biotechnology Division, CSIR-Indian Institute of Integrative Medicine, Jammu, 180001, India
  - Academy of Scientific and Innovative Research (AcSIR), Jammu, 180001, India
  - DZMB Senckenberg am Meer, Wilhelmshaven, Germany
2
Bacterial Symbionts of Tsetse Flies: Relationships and Functional Interactions Between Tsetse Flies and Their Symbionts. Results Probl Cell Differ 2021; 69:497-536. [PMID: 33263885 DOI: 10.1007/978-3-030-51849-3_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2023]
Abstract
Tsetse flies (Glossina spp.) act as the sole vectors of the African trypanosome species that cause Human African Trypanosomiasis (HAT or African Sleeping Sickness) and Nagana in animals. These flies have undergone a variety of specializations during their evolution including an exclusive diet consisting solely of vertebrate blood for both sexes as well as an obligate viviparous reproductive biology. Alongside these adaptations, Glossina species have developed intricate relationships with specific microbes ranging from mutualistic to parasitic. These relationships provide fundamental support required to sustain the specializations associated with tsetse's biology. This chapter provides an overview on the knowledge to date regarding the biology behind these relationships and focuses primarily on four bacterial species that are consistently associated with Glossina species. Here their interactions with the host are reviewed at the morphological, biochemical and genetic levels. This includes: the obligate symbiont Wigglesworthia, which is found in all tsetse species and is essential for nutritional supplementation to the blood-specific diet, immune system maturation and facilitation of viviparous reproduction; the commensal symbiont Sodalis, which is a frequently associated symbiont optimized for survival within the fly via nutritional adaptation, vertical transmission through mating and may alter vectorial capacity of Glossina for trypanosomes; the parasitic symbiont Wolbachia, which can manipulate Glossina via cytoplasmic incompatibility and shows unique interactions at the genetic level via horizontal transmission of its genetic material into the genome in two Glossina species; finally, knowledge on recently observed relations between Spiroplasma and Glossina is explored and potential interactions are discussed based on knowledge of interactions between this bacterial Genera and other insect species. 
These flies have a simple microbiome relative to that of other insects. However, these relationships are deep, well-studied and provide a window into the complexity and function of host/symbiont interactions in an important disease vector.
3
Assis R. Lineage-Specific Expression Divergence in Grasses Is Associated with Male Reproduction, Host-Pathogen Defense, and Domestication. Genome Biol Evol 2019; 11:207-219. [PMID: 30398650 PMCID: PMC6331041 DOI: 10.1093/gbe/evy245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 11/03/2018] [Indexed: 02/02/2023] Open
Abstract
Poaceae (grasses) is an agriculturally important and widely distributed family of plants with extraordinary phenotypic diversity, much of which was generated under recent lineage-specific evolution. Yet, little is known about the genes and functional modules involved in the lineage-specific divergence of grasses. Here, I address this question on a genome-wide scale by applying a novel branch-based statistic of lineage-specific expression divergence, LED, to RNA-seq data from nine tissues of the wild grass Brachypodium distachyon and its domesticated relatives Oryza sativa japonica (rice) and Sorghum bicolor (sorghum). I find that LED is generally smallest in B. distachyon and largest in O. sativa japonica, which underwent domestication earlier than S. bicolor, supporting the hypothesis that domestication may increase the rate of lineage-specific expression divergence in grasses. Moreover, in all three species, LED is positively correlated with protein-coding sequence divergence and tissue specificity, and negatively correlated with network connectivity. Further analysis reveals that genes with large LED are often primarily expressed in anther, implicating lineage-specific expression divergence in the evolution of male reproductive phenotypes. Gene ontology enrichment analysis also identifies an overrepresentation of terms related to male reproduction in the two domesticated grasses, as well as to those involved in host-pathogen defense in all three species. Last, examinations of genes with the largest LED reveal that their lineage-specific expression divergence may have contributed to antimicrobial functions in B. distachyon, to enhanced adaptation and yield during domestication in O. sativa japonica, and to defense against a widespread and devastating fungal pathogen in S. bicolor. 
Together, these findings suggest that lineage-specific expression divergence in grasses may increase under domestication and preferentially target rapidly evolving genes involved in male reproduction, host-pathogen defense, and the origin of domesticated phenotypes.
Affiliation(s)
- Raquel Assis
  - Department of Biology, Pennsylvania State University, University Park
4
Alleman A, Hertweck KL, Kambhampati S. Random Genetic Drift and Selective Pressures Shaping the Blattabacterium Genome. Sci Rep 2018; 8:13427. [PMID: 30194350 PMCID: PMC6128925 DOI: 10.1038/s41598-018-31796-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/12/2018] [Accepted: 08/21/2018] [Indexed: 01/30/2023] Open
Abstract
Estimates suggest that at least half of all extant insect genera harbor obligate bacterial mutualists. Whereas an endosymbiotic relationship imparts many benefits upon host and symbiont alike, the intracellular lifestyle has profound effects on the bacterial genome. The obligate endosymbiont genome is a product of opposing forces: genes important to host survival are maintained through physiological constraint, contrasted by the fixation of deleterious mutations and genome erosion through random genetic drift. The obligate cockroach endosymbiont, Blattabacterium - providing nutritional augmentation to its host in the form of amino acid synthesis - displays radical genome alterations when compared to its most recent free-living relative Flavobacterium. To date, eight Blattabacterium genomes have been published, affording an unparalleled opportunity to examine the direction and magnitude of selective forces acting upon this group of symbionts. Here, we find that the Blattabacterium genome is experiencing a 10-fold increase in selection rate compared to Flavobacteria. Additionally, the proportion of selection events is largely negative in direction, with only a handful of loci exhibiting signatures of positive selection. These findings suggest that the Blattabacterium genome will continue to erode, potentially resulting in an endosymbiont with an even further reduced genome, as seen in other insect groups such as Hemiptera.
Affiliation(s)
- Austin Alleman
  - Department of Biology, University of Texas at Tyler, 3900 University Blvd., Tyler, Texas, 75799, United States
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Johannes von Müller Weg 6, Mainz, 55128, Germany
- Kate L Hertweck
  - Department of Biology, University of Texas at Tyler, 3900 University Blvd., Tyler, Texas, 75799, United States
- Srini Kambhampati
  - Department of Biology, University of Texas at Tyler, 3900 University Blvd., Tyler, Texas, 75799, United States
5
The Impact of Selection at the Amino Acid Level on the Usage of Synonymous Codons. G3: Genes, Genomes, Genetics 2017; 7:967-981. [PMID: 28122952 PMCID: PMC5345726 DOI: 10.1534/g3.116.038125] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 02/06/2023]
Abstract
There are two main forces that affect usage of synonymous codons: directional mutational pressure and selection. The effectiveness of protein translation is usually considered as the main selectional factor. However, biased codon usage can also be a byproduct of a general selection at the amino acid level interacting with nucleotide replacements. To evaluate the validity and strength of such an effect, we superimposed >3.5 billion unrestricted mutational processes on the selection of nonsynonymous substitutions based on the differences in physicochemical properties of the coded amino acids. Using a modified evolutionary optimization algorithm, we determined the conditions in which the effect on the relative codon usage is maximized. We found that the effect is enhanced by mutational processes generating more adenine and thymine than guanine and cytosine, as well as more purines than pyrimidines. Interestingly, this effect is observed only under an unrestricted model of nucleotide substitution, and disappears when the mutational process is time-reversible. Comparison of the simulation results with data for real protein coding sequences indicates that the impact of selection at the amino acid level on synonymous codon usage cannot be neglected. Furthermore, it can considerably interfere, especially in AT-rich genomes, with other selections on codon usage, e.g., translational efficiency. It may also lead to difficulties in the recognition of other effects influencing codon bias, and an overestimation of protein coding sequences whose codon usage is subjected to adaptational selection.
6
Mondo SJ, Salvioli A, Bonfante P, Morton JB, Pawlowska TE. Nondegenerative Evolution in Ancient Heritable Bacterial Endosymbionts of Fungi. Mol Biol Evol 2016; 33:2216-31. [DOI: 10.1093/molbev/msw086] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
7
Roque E, Fares MA, Yenush L, Rochina MC, Wen J, Mysore KS, Gómez-Mena C, Beltrán JP, Cañas LA. Evolution by gene duplication of Medicago truncatula PISTILLATA-like transcription factors. Journal of Experimental Botany 2016; 67:1805-1817. [PMID: 26773809 PMCID: PMC4783364 DOI: 10.1093/jxb/erv571] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 05/29/2023]
Abstract
PISTILLATA (PI) is a member of the B-function MADS-box gene family, which controls the identity of both petals and stamens in Arabidopsis thaliana. In Medicago truncatula (Mt), there are two PI-like paralogs, known as MtPI and MtNGL9. These genes differ in their expression patterns, but it is not known whether their functions have also diverged. Describing the evolution of certain duplicated genes, such as transcription factors, remains a challenge owing to the complex expression patterns and functional divergence between the gene copies. Here, we report a number of functional studies, including analyses of gene expression, protein-protein interactions, and reverse genetic approaches designed to demonstrate the respective contributions of each M. truncatula PI-like paralog to the B-function in this species. Also, we have integrated molecular evolution approaches to determine the mode of evolution of Mt PI-like genes after duplication. Our results demonstrate that MtPI functions as a master regulator of B-function in M. truncatula, maintaining the overall ancestral function, while MtNGL9 does not seem to have a role in this regard, suggesting that the pseudogenization could be the functional evolutionary fate for this gene. However, we provide evidence that purifying selection is the primary evolutionary force acting on this paralog, pinpointing the conservation of its biochemical function and, alternatively, the acquisition of a new role for this gene.
Affiliation(s)
- Edelín Roque
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- Mario A Fares
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- Lynne Yenush
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- Mari Cruz Rochina
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- Jiangqi Wen
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
- Kirankumar S Mysore
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
- Concepción Gómez-Mena
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- José Pío Beltrán
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
- Luis A Cañas
  - Instituto de Biología Molecular y Celular de Plantas Consejo Superior de Investigaciones Científicas & Universidad Politécnica de Valencia (CSIC-UPV), Ciudad Politécnica de la Innovación, Edf. 8E, C/ Ingeniero Fausto Elio s/n, E-46011 Valencia, Spain
8
Wang K, Yu S, Ji X, Lakner C, Griffing A, Thorne JL. Roles of solvent accessibility and gene expression in modeling protein sequence evolution. Evol Bioinform Online 2015; 11:85-96. [PMID: 25987828 PMCID: PMC4415675 DOI: 10.4137/ebo.s22911] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/14/2014] [Revised: 02/04/2015] [Accepted: 02/09/2015] [Indexed: 11/05/2022] Open
Abstract
Models of protein evolution tend to ignore functional constraints, although structural constraints are sometimes incorporated. Here we propose a probabilistic framework for codon substitution that evaluates joint effects of relative solvent accessibility (RSA), a structural constraint; and gene expression, a functional constraint. First, we explore the relationship between RSA and codon usage at the genomic scale as well as at the individual gene scale. Motivated by these results, we construct our framework by determining how probable is an amino acid, given RSA and gene expression, and then evaluating the relative probability of observing a codon compared to other synonymous codons. We come to the biologically plausible conclusion that both RSA and gene expression are related to amino acid frequencies, but, among synonymous codons, the relative probability of a particular codon is more closely related to gene expression than RSA. To illustrate the potential applications of our framework, we propose a new codon substitution model. Using this model, we obtain estimates of 2Ns, the product of effective population size N, and relative fitness difference of allele s. For a training data set consisting of human proteins with known structures and expression data, 2Ns is estimated separately for synonymous and nonsynonymous substitutions in each protein. We then contrast the patterns of synonymous and nonsynonymous 2Ns estimates across proteins while also taking gene expression levels of the proteins into account. We conclude that our 2Ns estimates are too concentrated around 0, and we discuss potential explanations for this lack of variability.
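The two-step construction described in the abstract (first the amino acid, then the codon among its synonyms) can be read as a factorization of the codon probability. The notation below is an assumption introduced for illustration and is not taken from the paper: c is a codon, a the amino acid it encodes, r the relative solvent accessibility, e the expression level, and S(a) the set of codons synonymous for a.

```latex
\[
P(c \mid r, e) \;=\; P(a \mid r, e)\; P(c \mid a, r, e),
\qquad
\sum_{c' \in \mathcal{S}(a)} P(c' \mid a, r, e) \;=\; 1 .
\]
```

Under this reading, the abstract's conclusion is that the first factor depends on both r and e, while the second factor depends more strongly on e than on r.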
Affiliation(s)
- Kuangyu Wang
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Shuhui Yu
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - College of Life Science, Chongqing University, Chongqing, China
- Xiang Ji
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Clemens Lakner
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Alexander Griffing
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Jeffrey L Thorne
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
9
Iriarte A, Baraibar JD, Diana L, Castro-Sowinski S, Romero H, Musto H. Trends in amino acid usage across the class Mollicutes. J Biomol Struct Dyn 2014; 32:65-74. [DOI: 10.1080/07391102.2012.748636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
10
Iriarte A, Baraibar JD, Romero H, Castro-Sowinski S, Musto H. Evolution of optimal codon choices in the family Enterobacteriaceae. Microbiology 2013; 159:555-564. [PMID: 23288542 DOI: 10.1099/mic.0.061952-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
The Enterobacteriaceae are a large family of Proteobacteria that include many well-known prokaryotic genera, such as Escherichia, Yersinia and Salmonella. The main ideas of synonymous codon usage (CU) evolution and translational selection have been deeply influenced by studies with these bacterial groups. In this work we report the analysis of the CU pattern of completely sequenced bacterial genomes that belong to the Enterobacteriaceae. The effect of selection in translation acting at the levels of speed and accuracy, and phylogenetic trends within this group are described. Preferred (optimal) codons were identified. The evolutionary dynamics of these codons were studied and following a Bayesian approach these preferences were traced back to the common ancestor of the family. We found that there is some level of variation in selection among the analysed micro-organisms that is probably associated with lineage-specific trends. The codon bias was largely conserved across the evolutionary time of the family in highly expressed genes and protein conserved regions, suggesting a major role of negative selection. In this sense, the results support the idea that the extant CU bias is finely tuned over the ancestral well-conserved pool of tRNAs.
Affiliation(s)
- Andrés Iriarte
  - Área Genética, Depto. de Genética y Mejora Animal, Facultad de Veterinaria (UDELAR), Av. A. Lasplaces 1550, CP 11600, Montevideo, Uruguay
  - Laboratorio de Evolución, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
- Juan Diego Baraibar
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
- Héctor Romero
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
- Susana Castro-Sowinski
  - Sección Bioquímica y Biología Molecular, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
- Héctor Musto
  - Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias (UDELAR), Iguá 4225, 11400 Montevideo, Uruguay
11
Aoi MC, Rourke BC. Interspecific and intragenic differences in codon usage bias among vertebrate myosin heavy-chain genes. J Mol Evol 2011; 73:74-93. [PMID: 21915654 DOI: 10.1007/s00239-011-9457-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/24/2010] [Accepted: 08/19/2011] [Indexed: 01/13/2023]
Abstract
Synonymous codon usage bias is a broadly observed phenomenon in bacteria, plants, and invertebrates and may result from selection. However, the role of selective pressures in shaping codon bias is still controversial in vertebrates, particularly for mammals. The myosin heavy-chain (MyHC) gene family comprises multiple isoforms of the major force-producing contractile protein in cardiac and skeletal muscles. Slow and fast genes are tandemly arrayed on separate chromosomes, and have distinct patterns of functionality and expression in muscle. We analyze both full-length MyHC genes (~5400 bp) and a larger collection of partial sequences at the 3' end (~500 bp). The MyHC isoforms are an interesting system in which to study codon usage bias because of their length, expression, and critical importance to organismal mobility. Codon bias and GC content differs among MyHC genes with regards to functional type, isoform, and position within the gene. Codon bias even varies by isoform within a species. We find evidence in favor of both chromosomal influences on nucleotide composition and selection against nonsense errors (SANE) acting on codon usage in MyHC genes. Intragenic variation in codon bias and elongation rate is significant, with a strong trend for increasing codon bias and elongation rate towards the 3' end of the gene, although the trend is dependent upon the degeneracy class of the codons. Therefore, patterns of codon usage in MyHC genes are consistent with models supporting SANE as a major force shaping codon usage.
Affiliation(s)
- Mikio C Aoi
  - Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
12
Stewart FJ, Sharma AK, Bryant JA, Eppley JM, DeLong EF. Community transcriptomics reveals universal patterns of protein sequence conservation in natural microbial communities. Genome Biol 2011; 12:R26. [PMID: 21426537 PMCID: PMC3129676 DOI: 10.1186/gb-2011-12-3-r26] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 01/05/2011] [Revised: 02/28/2011] [Accepted: 03/22/2011] [Indexed: 12/02/2022] Open
Abstract
Background: Combined metagenomic and metatranscriptomic datasets make it possible to study the molecular evolution of diverse microbial species recovered from their native habitats. The link between gene expression level and sequence conservation was examined using shotgun pyrosequencing of microbial community DNA and RNA from diverse marine environments, and from forest soil.
Results: Across all samples, expressed genes with transcripts in the RNA sample were significantly more conserved than non-expressed gene sets relative to best matches in reference databases. This discrepancy, observed for many diverse individual genomes and across entire communities, coincided with a shift in amino acid usage between these gene fractions. Expressed genes trended toward GC-enriched amino acids, consistent with a hypothesis of higher levels of functional constraint in this gene pool. Highly expressed genes were significantly more likely to fall within an orthologous gene set shared between closely related taxa (core genes). However, non-core genes, when expressed above the level of detection, were, on average, significantly more highly expressed than core genes based on transcript abundance normalized to gene abundance. Finally, expressed genes showed broad similarities in function across samples, being relatively enriched in genes of energy metabolism and underrepresented by genes of cell growth.
Conclusions: These patterns support the hypothesis, predicated on studies of model organisms, that gene expression level is a primary correlate of evolutionary rate across diverse microbial taxa from natural environments. Despite their complexity, meta-omic datasets can reveal broad evolutionary patterns across taxonomically, functionally, and environmentally diverse communities.
Affiliation(s)
- Frank J Stewart
  - School of Biology, Georgia Institute of Technology, Ford ES&T Building, Rm 1242, 311 Ferst Drive, Atlanta, GA 30332, USA
13
Misawa K, Kikuno RF. Relationship between amino acid composition and gene expression in the mouse genome. BMC Res Notes 2011; 4:20. [PMID: 21272306 PMCID: PMC3038927 DOI: 10.1186/1756-0500-4-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/25/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, which is based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals.
FINDINGS: We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of variation of expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference.
CONCLUSION: These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.
Affiliation(s)
- Kazuharu Misawa
  - Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan
14
Selected codon usage bias in members of the class Mollicutes. Gene 2010; 473:110-8. [PMID: 21147204 DOI: 10.1016/j.gene.2010.11.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/16/2010] [Revised: 11/20/2010] [Accepted: 11/22/2010] [Indexed: 11/24/2022]
Abstract
Mollicutes are parasitic microorganisms mainly characterized by small cell sizes, reduced genomes and great A and T mutational bias. We analyzed the codon usage patterns of the completely sequenced genomes of bacteria that belong to this class. We found that for many organisms not only mutational bias but also selection has a major effect on codon usage. Through a comparative perspective and based on three widely used criteria we were able to classify Mollicutes according to the effect of selection on codon usage. We found conserved optimal codons in many species and study the tRNA gene pool in each genome. Previous results are reinforced by the fact that, when selection is operative, the putative optimal codons found match the respective cognate tRNA. Finally, we trace selection effect backwards to the common ancestor of the class and estimate the phylogenetic inertia associated with this character. We discuss the possible scenarios that explain the observed evolutionary patterns.
15
Söllner J, Heinzel A, Summer G, Fechete R, Stipkovits L, Szathmary S, Mayer B. Concept and application of a computational vaccinology workflow. Immunome Res 2010; 6 Suppl 2:S7. [PMID: 21067549 PMCID: PMC2981879 DOI: 10.1186/1745-7580-6-s2-s7] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The last years have seen a renaissance of the vaccine area, driven by clinical needs in infectious diseases but also chronic diseases such as cancer and autoimmune disorders. Equally important are technological improvements involving nano-scale delivery platforms as well as third generation adjuvants. In parallel immunoinformatics routines have reached essential maturity for supporting central aspects in vaccinology going beyond prediction of antigenic determinants. On this basis computational vaccinology has emerged as a discipline aimed at ab-initio rational vaccine design.Here we present a computational workflow for implementing computational vaccinology covering aspects from vaccine target identification to functional characterization and epitope selection supported by a Systems Biology assessment of central aspects in host-pathogen interaction. We exemplify the procedures for Epstein Barr Virus (EBV), a clinically relevant pathogen causing chronic infection and suspected of triggering malignancies and autoimmune disorders. RESULTS We introduce pBone/pView as a computational workflow supporting design and execution of immunoinformatics workflow modules, additionally involving aspects of results visualization, knowledge sharing and re-use. Specific elements of the workflow involve identification of vaccine targets in the realm of a Systems Biology assessment of host-pathogen interaction for identifying functionally relevant targets, as well as various methodologies for delineating B- and T-cell epitopes with particular emphasis on broad coverage of viral isolates as well as MHC alleles.Applying the workflow on EBV specifically proposes sequences from the viral proteins LMP2, EBNA2 and BALF4 as vaccine targets holding specific B- and T-cell epitopes promising broad strain and allele coverage. 
CONCLUSION Based on advancements in the experimental assessment of genomes, transcriptomes and proteomes for both pathogen and (human) host, the fundaments for rational design of vaccines have been laid out. In parallel, immunoinformatics modules have been designed and successfully applied for supporting specific aspects in vaccine design. Joining these advancements, further complemented by novel vaccine formulation and delivery aspects, has paved the way for implementing computational vaccinology for rational vaccine design tackling presently unmet vaccine challenges.
Affiliation(s)
- Johannes Söllner
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
| | - Andreas Heinzel
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- University of Applied Sciences, Softwarepark 11, 4232 Hagenberg, Austria
| | - Georg Summer
- University of Applied Sciences, Softwarepark 11, 4232 Hagenberg, Austria
| | - Raul Fechete
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
| | | | - Susan Szathmary
- Galenbio Kft., Erdőszél köz 21, 1037 Budapest, Hungary and GalenBio, Inc., 5922 Farnsworth Ct, Carlsbad, CA 92008, USA
| | - Bernd Mayer
- emergentec biodevelopment GmbH, Rathausstrasse 5/3, 1010 Vienna, Austria
- Institute for Theoretical Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
|
16
|
Davis JJ, Olsen GJ. Characterizing the native codon usages of a genome: an axis projection approach. Mol Biol Evol 2010; 28:211-21. [PMID: 20679093 PMCID: PMC3002238 DOI: 10.1093/molbev/msq185] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023]
Abstract
Codon usage can provide insights into the nature of the genes in a genome. Genes that are “native” to a genome (have not been recently acquired by horizontal transfer) range in codon usage from a low-bias “typical” usage to a more biased “high-expression” usage characteristic of genes encoding abundant proteins. Genes that differ from these native codon usages are candidates for foreign genes that have been recently acquired by horizontal gene transfer. In this study, we present a method for characterizing the codon usages of native genes—both typical and highly expressed—within a genome. Each gene is evaluated relative to a half line (or axis) in a 59D space of codon usage. The axis begins at the modal codon usage, the usage that matches the largest number of genes in the genome, and it passes through a point representing the codon usage of a set of genes with expression-related bias. A gene whose codon usage matches (does not significantly differ from) a point on this axis is a candidate native gene, and the location of its projection onto the axis provides a general estimate of its expression level. A gene that differs significantly from all points on the axis is a candidate foreign gene. This automated approach offers significant improvements over existing methods. We illustrate this by analyzing the genomes of Pseudomonas aeruginosa PAO1 and Bacillus anthracis A0248, which can be difficult to analyze with commonly used methods due to their biased base compositions. Finally, we use this approach to measure the proportion of candidate foreign genes in 923 bacterial and archaeal genomes. The organisms with the most homogeneous genomes (containing the fewest candidate foreign genes) are mostly endosymbionts and parasites, though with exceptions that include Pelagibacter ubique and Beutenbergia cavernae. 
The organisms with the most heterogeneous genomes (containing the most candidate foreign genes) include members of the genera Bacteroides, Corynebacterium, Desulfotalea, Neisseria, Xylella, and Thermobaculum.
Affiliation(s)
- James J Davis
- Department of Microbiology, University of Illinois at Urbana-Champaign
|
17
|
Belda E, Moya A, Bentley S, Silva FJ. Mobile genetic element proliferation and gene inactivation impact over the genome structure and metabolic capabilities of Sodalis glossinidius, the secondary endosymbiont of tsetse flies. BMC Genomics 2010; 11:449. [PMID: 20649993 PMCID: PMC3091646 DOI: 10.1186/1471-2164-11-449] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 02/11/2010] [Accepted: 07/22/2010] [Indexed: 02/07/2023]
Abstract
BACKGROUND Genome reduction is a common evolutionary process in symbiotic and pathogenic bacteria. This process has been extensively characterized in bacterial endosymbionts of insects, where primary mutualistic bacteria represent the most extreme cases of genome reduction, a consequence of a massive process of gene inactivation and loss during their evolution from free-living ancestors. Sodalis glossinidius, the secondary endosymbiont of tsetse flies, contains one of the few complete genomes of bacteria at the very beginning of the symbiotic association, allowing evaluation of the relative impact of mobile genetic element proliferation and gene inactivation over the structure and functional capabilities of this bacterial endosymbiont during the transition to a host-dependent lifestyle.
RESULTS A detailed characterization of mobile genetic elements and pseudogenes reveals a massive presence of different types of prophage elements together with five different families of IS elements that have proliferated across the genome of Sodalis glossinidius at different levels. In addition, a detailed survey of intergenic regions allowed the characterization of 1501 pseudogenes, a much higher number than the 972 pseudogenes described in the original annotation. Pseudogene structure reveals a minor impact of mobile genetic element proliferation in the process of gene inactivation, with most pseudogenes originating from multiple frameshift mutations and premature stop codons. The comparison of metabolic profiles of Sodalis glossinidius and the tsetse fly primary endosymbiont Wigglesworthia glossinidia based on their whole gene and pseudogene repertoires revealed a novel case of pathway inactivation, the arginine biosynthesis, in Sodalis glossinidius together with a possible case of metabolic complementation with Wigglesworthia glossinidia for thiamine biosynthesis.
CONCLUSIONS The complete re-analysis of the genome sequence of Sodalis glossinidius reveals novel insights into the evolutionary transition from a free-living ancestor to a host-dependent lifestyle, with a massive proliferation of mobile genetic elements, mainly of phage origin, although with minor impact on the process of gene inactivation that is taking place in this bacterial genome. The metabolic analysis of the whole endosymbiotic consortium of tsetse flies has revealed a possible phenomenon of metabolic complementation between primary and secondary endosymbionts that can help to explain the co-existence of both bacterial endosymbionts in the context of the tsetse host.
Affiliation(s)
- Eugeni Belda
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València. Apartat 22085, València E-46071, Spain
| | - Andrés Moya
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València. Apartat 22085, València E-46071, Spain
- CIBER en Epidemiología y Salud Pública (CIBEResp), Barcelona, Spain
- Unidad Mixta de Investigación de Genómica y Salud (Centro Superior de Investigación en Salud Pública, CSISP/Institut Cavanilles, Universitat de València), Spain
| | | | - Francisco J Silva
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València. Apartat 22085, València E-46071, Spain
- CIBER en Epidemiología y Salud Pública (CIBEResp), Barcelona, Spain
- Unidad Mixta de Investigación de Genómica y Salud (Centro Superior de Investigación en Salud Pública, CSISP/Institut Cavanilles, Universitat de València), Spain
|
18
|
Supek F, Škunca N, Repar J, Vlahoviček K, Šmuc T. Translational selection is ubiquitous in prokaryotes. PLoS Genet 2010; 6:e1001004. [PMID: 20585573 PMCID: PMC2891978 DOI: 10.1371/journal.pgen.1001004] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Received: 12/11/2009] [Accepted: 05/26/2010] [Indexed: 11/29/2022]
Abstract
Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome—between 5% and 33%, depending on genome size—while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of codon-optimized genes, the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl–tRNA synthetases. 
Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.
Synonymous codons are not equally common in genomes. The main causes of unequal codon usage are varying nucleotide substitution patterns, as manifested in the wide range of genomic nucleotide compositions. However, since the first E. coli and yeast genes were sequenced, it became evident that there was also a bias towards codons that can be translated to protein faster and more accurately. This bias was stronger in highly expressed genes, and its driving force was termed translational selection. Researchers sought for effects of translational selection in microbial genomes as they became available, employing a flurry of mathematical approaches which sometimes led to contradictory conclusions. We introduce a sensitive and accurate machine learning-based methodology and find that highly expressed genes have a recognizable codon usage pattern in almost every bacterial and archaeal genome analyzed, even after accounting for large differences in background nucleotide composition. We also show that the gene functional category has a great bearing on whether that gene is subject to translational selection. Since presence of codon optimizations can be used as a purely sequence-derived proxy for expression levels, we can delineate “adaptomes” by relating predicted gene activity to organisms' phenotypes, which we demonstrate on genomes of temperature-resistant Bacteria and Archaea.
Affiliation(s)
- Fran Supek
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Nives Škunca
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Jelena Repar
- Division of Molecular Biology, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Kristian Vlahoviček
- Division of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Department of Informatics, University of Oslo, Oslo, Norway
| | - Tomislav Šmuc
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
|
19
|
Abstract
Most genomes are heterogeneous in codon usage, so a codon usage study should start by defining the codon usage that is typical to the genome. Although this is commonly taken to be the genomewide average, we propose that the mode (the codon usage that matches the most genes) provides a more useful approximation of the typical codon usage of a genome. We provide a method for estimating the modal codon usage, which utilizes a continuous approximation to the number of matching genes and a simplex optimization. In a survey of bacterial and archaeal genomes, as many as 20% more of the genes in a given genome match the modal codon usage than the average codon usage. We use the mode to examine the evolution of the multireplicon genomes of Agrobacterium tumefaciens C58 and Borrelia burgdorferi B31. In A. tumefaciens, the circular and linear chromosomes are characterized by a common "chromosome-like" codon usage, whereas both plasmids share a distinct "plasmid-like" codon usage. In B. burgdorferi, in addition to different codon-usage biases on the leading and lagging strands of DNA replication found by McInerney (McInerney JO. 1998. Replicational and transcriptional selection on codon usage in Borrelia burgdorferi. Proc Natl Acad Sci USA. 95:10698-10703), we also detect a codon-usage similarity between linear plasmid lp38 and the leading strand of the chromosome and a high similarity among the cp32 family of plasmids.
Affiliation(s)
- James J Davis
- Department of Microbiology, University of Illinois at Urbana-Champaign, IL, USA
|
20
|
Hu J, Blanchard JL. Environmental Sequence Data from the Sargasso Sea Reveal That the Characteristics of Genome Reduction in Prochlorococcus Are Not a Harbinger for an Escalation in Genetic Drift. Mol Biol Evol 2009. [DOI: 10.1093/molbev/msn299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
|
21
|
Larson MA, Bressani R, Sayood K, Corn JE, Berger JM, Griep MA, Hinrichs SH. Hyperthermophilic Aquifex aeolicus initiates primer synthesis on a limited set of trinucleotides comprised of cytosines and guanines. Nucleic Acids Res 2008; 36:5260-9. [PMID: 18684998 PMCID: PMC2532735 DOI: 10.1093/nar/gkn461] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
The placement of the extreme thermophile Aquifex aeolicus in the bacterial phylogenetic tree has evoked much controversy. We investigated whether adaptations for growth at high temperatures would alter a key functional component of the replication machinery, specifically DnaG primase. Although the structure of bacterial primases is conserved, the trinucleotide initiation specificity for A. aeolicus was hypothesized to differ from other microbes as an adaptation to a geothermal milieu. To determine the full range of A. aeolicus primase activity, two oligonucleotides were designed that comprised all potential trinucleotide initiation sequences. One of the screening templates supported primer synthesis and the lengths of the resulting primers were used to predict possible initiation trinucleotides. Use of trinucleotide-specific templates demonstrated that the preferred initiation trinucleotide sequence for A. aeolicus primase was 5′-d(CCC)-3′. Two other sequences, 5′-d(GCC)-3′ and 5′-d(CGC)-3′, were also capable of supporting initiation, but to a much lesser degree. None of these trinucleotides were known to be recognition sequences used by other microbial primases. These results suggest that the initiation specificity of A. aeolicus primase may represent an adaptation to a thermophilic environment.
Affiliation(s)
- Marilynn A Larson
- Department of Microbiology and Pathology, University of Nebraska Medical Center, Omaha, NE 68198-6495, USA
|
22
|
Zhou T, Drummond DA, Wilke CO. Contact density affects protein evolutionary rate from bacteria to animals. J Mol Evol 2008; 66:395-404. [PMID: 18379715 DOI: 10.1007/s00239-008-9094-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 11/09/2007] [Revised: 02/16/2008] [Accepted: 02/25/2008] [Indexed: 12/29/2022]
Abstract
The density of contacts or the fraction of buried sites in a protein structure is thought to be related to a protein's designability, and genes encoding more designable proteins should evolve faster than other genes. Several recent studies have tested this hypothesis but have found conflicting results. Here, we investigate how a gene's evolutionary rate is affected by its protein's contact density, considering the four species Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. We find for all four species that contact density correlates positively with evolutionary rate, and that these correlations do not seem to be confounded by gene expression level. The strength of this signal, however, varies widely among species. We also study the effect of contact density on domain evolution in multidomain proteins and find that a domain's contact density influences the domain's evolutionary rate. Within the same protein, a domain with higher contact density tends to evolve faster than a domain with lower contact density. Our study provides evidence that contact density can increase evolutionary rates, and that it acts similarly on the level of entire proteins and of individual protein domains.
Affiliation(s)
- Tong Zhou
- Center for Computational Biology and Bioinformatics, Section of Integrative Biology, University of Texas at Austin, Austin, TX 78731, USA
|
23
|
Fuglsang A. Impact of bias discrepancy and amino acid usage on estimates of the effective number of codons used in a gene, and a test for selection on codon usage. Gene 2007; 410:82-8. [PMID: 18248919 DOI: 10.1016/j.gene.2007.12.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 05/03/2007] [Revised: 10/22/2007] [Accepted: 12/03/2007] [Indexed: 11/26/2022]
Abstract
The effective number of codons (Nc) used in a gene is one of the most commonly used measures of synonymous codon usage bias, owing much of its popularity to the fact that it is species independent and that simulation studies have shown that it is less dependent on gene length than other measures. In this paper I provide a clear and practically meaningful definition of bias discrepancy (BD; when the degree of codon bias varies within a degeneracy class). Moreover, I evaluate the impact of BD and amino acid usage on estimates of Nc. It is shown that both factors have a significant effect on accuracy and precision. Both amino acid usage and BD influence accuracy considerably, especially in short genes. Finally, I demonstrate how the definition of bias discrepancy can be applied to investigate whether codon usage is influenced by selection, and I discuss this test in relation to the incongruous literature that exists for Buchnera sp. APS and Borrelia burgdorferi.
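For readers unfamiliar with Nc, the sketch below shows one common way to compute it, following Wright's classical formulation (codon homozygosity averaged within each degeneracy class). It is an illustration only and is not taken from the paper above: the function name `effective_number_of_codons`, the handling of missing degeneracy classes, and the cap at 61 are assumptions of this sketch, and it does not implement the bias discrepancy analysis described in the abstract.

```python
from collections import Counter, defaultdict

# Standard genetic code, built in TCAG order (assumption: nuclear/bacterial code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def effective_number_of_codons(seq):
    """Illustrative Nc estimate for an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    counts = defaultdict(Counter)
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)  # codons with ambiguous bases are skipped
        if aa is not None and aa != "*":
            counts[aa][codon] += 1

    # Homozygosity F = (n * sum(p_i^2) - 1) / (n - 1), grouped by degeneracy.
    f_by_class = defaultdict(list)
    for aa, k in DEGENERACY.items():
        if k == 1:  # Met and Trp carry no synonymous information
            continue
        n = sum(counts[aa].values())
        if n < 2:  # F is undefined for fewer than two observations
            continue
        s = sum((c / n) ** 2 for c in counts[aa].values())
        f_by_class[k].append((n * s - 1) / (n - 1))

    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items()}
    # If isoleucine (the only 3-fold class) is absent, average F2 and F4.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2

    # Class sizes: nine 2-fold, one 3-fold, five 4-fold, three 6-fold amino acids.
    nc = 2.0
    for k, weight in {2: 9, 3: 1, 4: 5, 6: 3}.items():
        if k not in mean_f:
            raise ValueError(f"no usable amino acids in degeneracy class {k}")
        if mean_f[k] <= 0:  # no detectable bias in this class
            return 61.0
        nc += weight / mean_f[k]
    return min(nc, 61.0)
```

Under this definition a gene that uses a single codon for every amino acid gives Nc = 20, and the estimate is capped at 61 when usage is effectively uniform. Short genes leave few observations per amino acid, which is one reason estimates become noisy there.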
Affiliation(s)
- Anders Fuglsang
- University of Copenhagen, Faculty of Pharmaceutical Sciences, 2 Universitetsparken, Copenhagen O, Denmark.
|
24
|
Froula JL, Francino MP. Selection against spurious promoter motifs correlates with translational efficiency across bacteria. PLoS One 2007; 2:e745. [PMID: 17710145 PMCID: PMC1939733 DOI: 10.1371/journal.pone.0000745] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 05/16/2007] [Accepted: 07/13/2007] [Indexed: 11/19/2022]
Abstract
Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the −10 promoter motifs that bind the σ70 subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of −10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, −10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also confirms previous results indicating that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.
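The notion of motif over- or under-representation used in this abstract can be illustrated with a simple observed/expected ratio. The sketch below is a deliberately simplified assumption, not the authors' procedure: it counts exact matches to a single motif string (the canonical −10 consensus TATAAT is used only as an example) and derives the expectation from single-base frequencies, whereas a real analysis would more likely score degenerate motifs and use a richer background model. All function names are hypothetical.

```python
from collections import Counter


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def expected_count(seq, motif):
    """Expected occurrences if bases were drawn independently
    with the base frequencies observed in seq."""
    n = len(seq)
    freqs = Counter(seq)
    p = 1.0
    for base in motif:
        p *= freqs[base] / n
    return p * (n - len(motif) + 1)


def obs_exp_ratio(seq, motif):
    """Ratio > 1 suggests over-representation, < 1 under-representation."""
    expected = expected_count(seq, motif)
    if expected == 0:
        return float("nan")
    return count_overlapping(seq, motif) / expected


# Example: compare a (hypothetical) regulatory and coding sequence.
# ratio_reg = obs_exp_ratio(regulatory_seq, "TATAAT")
# ratio_cds = obs_exp_ratio(coding_seq, "TATAAT")
```

Comparing the ratio across region classes (regulatory, coding, nonfunctional) within one genome is the kind of contrast the abstract describes; only one strand is scanned here.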
Affiliation(s)
- Jeffrey L. Froula
- Evolutionary Genomics Program, DOE Joint Genome Institute, Walnut Creek, California, United States of America
| | - M. Pilar Francino
- Evolutionary Genomics Program, DOE Joint Genome Institute, Walnut Creek, California, United States of America
|
25
|
Choi JK, Kim SC, Seo J, Kim S, Bhak J. Impact of transcriptional properties on essentiality and evolutionary rate. Genetics 2007; 175:199-206. [PMID: 17057246 PMCID: PMC1775009 DOI: 10.1534/genetics.106.066027] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 09/19/2006] [Accepted: 10/14/2006] [Indexed: 11/18/2022]
Abstract
We characterized general transcriptional activity and variability of eukaryotic genes from global expression profiles of human, mouse, rat, fly, plants, and yeast. The variability shows a higher degree of divergence between distant species, implying that it is more closely related to phenotypic evolution, than the activity. More specifically, we show that transcriptional variability should be a true indicator of evolutionary rate. If we rule out the effect of translational selection, which seems to operate only in yeast, the apparent slow evolution of highly expressed genes should be attributed to their low variability. Meanwhile, rapidly evolving genes may acquire a high level of transcriptional variability and contribute to phenotypic variations. Essentiality also seems to be correlated with the variability, not the activity. We show that indispensable or highly interactive proteins tend to be present in high abundance to maintain a low variability. Our results challenge the current theory that highly expressed genes are essential and evolve slowly. Transcriptional variability, rather than transcriptional activity, might be a common indicator of essentiality and evolutionary rate, contributing to the correlation between the two variables.
|
26
|
Huerta AM, Francino MP, Morett E, Collado-Vides J. Selection for unequal densities of sigma70 promoter-like signals in different regions of large bacterial genomes. PLoS Genet 2006; 2:e185. [PMID: 17096598 PMCID: PMC1635534 DOI: 10.1371/journal.pgen.0020185] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 08/12/2005] [Accepted: 09/12/2006] [Indexed: 11/18/2022]
Abstract
The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that could be recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to see the generality of this pattern, we have analyzed 43 additional genomes belonging to most established bacterial phyla. Differential densities between regulatory and nonregulatory regions are detectable in most of the analyzed genomes, with the exception of those that have evolved toward extreme genome reduction. Thus, presence of this pattern follows that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is an outcome of the process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential distribution of promoter-like signals between regulatory and nonregulatory regions detected in large bacterial genomes confers a significant, although small, fitness advantage. 
This study paves the way for further identification of the specific types of selective constraints that affect the organization of regulatory regions and the overall distribution of promoter-like signals through more detailed comparative analyses among closely related bacterial genomes.
The most important step in the regulation of genetic expression is the initiation of transcription. This process is accomplished by the association or specific binding of RNA polymerase to particular sequence segments present in the DNA, the promoters. Promoters are located in the upstream regions of the transcribed genes. The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. For a long time, the canonical picture of a σ70 promoter has been a 60 base pair region defined by the transcription start-point (+1) and two conserved hexanucleotide sequences centered 10 and 35 base pairs upstream from the +1. The authors have shown that in Escherichia coli, promoters exist in clusters, as a series of overlapping potentially competing RNAP interaction sites. The E. coli regulatory regions contain high densities of these promoter-like signals, in contrast to coding regions and regions located between convergently transcribed genes. They report that the differential densities between regulatory and nonregulatory regions are detectable in most eubacterial genomes, with the exception of those that have experienced severe genome degradation and size reduction. This suggests that the presence of this pattern in large bacterial genomes confers a significant, although small, fitness advantage.
Affiliation(s)
- Araceli M Huerta
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, México.
|
27
|
Das S, Paul S, Bag SK, Dutta C. Analysis of Nanoarchaeum equitans genome and proteome composition: indications for hyperthermophilic and parasitic adaptation. BMC Genomics 2006; 7:186. [PMID: 16869956 PMCID: PMC1574309 DOI: 10.1186/1471-2164-7-186] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 05/30/2006] [Accepted: 07/25/2006] [Indexed: 11/24/2022]
Abstract
Background Nanoarchaeum equitans, the only known hyperthermophilic archaeon exhibiting parasitic life style, has raised some new questions about the evolution of the Archaea and provided a model of choice to study the genome landmarks correlated with thermo-parasitic adaptation. In this context, we have analyzed the genome and proteome composition of N. equitans and compared the same with those of other mesophiles, hyperthermophiles and obligatory host-associated organisms. Results Analysis of nucleotide, codon and amino acid usage patterns in N. equitans indicates the presence of distinct selective constraints, probably due to its adaptation to a thermo-parasitic life-style. Among the conspicuous characteristics featuring its hyperthermophilic adaptation are overrepresentation of purine bases in protein coding sequences, higher GC-content in tRNA/rRNA sequences, distinct synonymous codon usage, enhanced usage of aromatic and positively charged residues, and decreased frequencies of polar uncharged residues, as compared to those in mesophilic organisms. Positively charged amino acid residues are relatively abundant in the encoded gene-products of N. equitans and other hyperthermophiles, which is reflected in their isoelectric point distribution. Pairwise comparison of 105 orthologous protein sequences shows a strong bias towards replacement of uncharged polar residues of mesophilic proteins by Lys/Arg, Tyr and some hydrophobic residues in their Nanoarchaeal orthologs. The traits potentially attributable to the symbiotic/parasitic life-style of the organism include the presence of apparently weak translational selection in synonymous codon usage and a marked heterogeneity in membrane-associated proteins, which may be important for N. equitans to interact with the host and hence, may help the organism to adapt to the strictly host-associated life style. Despite being strictly host-dependent, N. equitans follows cost minimization hypothesis. 
Conclusion The present study reveals that the genome and proteome composition of N. equitans are marked with the signatures of dual adaptation – one to high temperature and the other to obligatory parasitism. While the analysis of nucleotide/amino acid preferences in N. equitans offers an insight into the molecular strategies taken by the archaeon for thermo-parasitic adaptation, the comparative study of the compositional characteristics of mesophiles, hyperthermophiles and obligatory host-associated organisms demonstrates the generality of such strategies in the microbial world.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Sandip Paul
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Sumit K Bag
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Chitra Dutta
- Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata–700032, India
- Human Genetics & Genomics Division, Indian Institute of Chemical Biology, Kolkata–700032, India
|
28
|
Bodilis J, Barray S. Molecular evolution of the major outer-membrane protein gene (oprF) of Pseudomonas. MICROBIOLOGY-SGM 2006; 152:1075-1088. [PMID: 16549671 DOI: 10.1099/mic.0.28656-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
The major outer-membrane protein of Pseudomonas, OprF, is multifunctional. It is a non-specific porin that plays a role in maintenance of cell shape, in growth in a low-osmolarity environment, and in adhesion to various supports or molecules. OprF has been studied extensively for its utility as a vaccine component, its role in antimicrobial drug resistance, and its porin function. The authors have previously shown important differences between the OprF and 16S rDNA phylogenies: Pseudomonas fluorescens isolates split into two quite separate clusters, probably according to their ecological niche. In this study, the evolutionary history of the oprF gene was investigated further. The study of G+C content at the third codon position, synonymous codon usage (codon adaptation index, CAI) and genomic context showed no evidence of horizontal transfer or gene duplication. Similarly, a robust likelihood test of incongruence showed no significant incongruence between the oprF phylogeny and the species phylogeny. In addition, the ratio of nonsynonymous mutations to synonymous mutations (K(a)/K(s)) is high between the different clusters, especially between the two clusters containing P. fluorescens isolates, highlighting important modifications in evolutionary constraints during the history of the oprF gene. Since OprF is known as a pleiotropic protein, modifications in evolutionary constraints could have resulted from variations in cryptic functions, correlated with the ecological fingerprint. Finally, relaxed constraints and/or episodic positive evolution, especially for some P. fluorescens strains, could have led to a phylogeny reconstruction artifact.
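One of the measures named in this abstract, G+C content at the third codon position (GC3), can be computed directly from a coding sequence. The sketch below is only an illustration and is not the authors' pipeline: the function name `gc3_content` and the example sequence are hypothetical, and the code assumes an in-frame sequence made up of A, C, G and T.

```python
def gc3_content(cds: str) -> float:
    """Fraction of codons whose third base is G or C (GC3)."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    third_bases = seq[2:usable:3]         # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy sequence: ATG GCC AAG TTC TAA
example = "ATGGCCAAGTTCTAA"
print(gc3_content(example))
```

A gene whose GC3 differs sharply from the genome-wide average is one common indicator of possible horizontal transfer, which is the kind of signal the abstract reports as absent for oprF.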
Affiliation(s)
- Josselin Bodilis
- LMDF (Laboratoire de Microbiologie Du Froid), UPRES 2123, ABISS (Atelier de Biologie, Informatique, Statistique et Sociolinguistique), Université de Rouen, 76821 Mont Saint Aignan, France
- Sylvie Barray
- LMDF (Laboratoire de Microbiologie Du Froid), UPRES 2123, ABISS (Atelier de Biologie, Informatique, Statistique et Sociolinguistique), Université de Rouen, 76821 Mont Saint Aignan, France
|
29
|
Banerjee T, Ghosh TC. Gene expression level shapes the amino acid usages in Prochlorococcus marinus MED4. J Biomol Struct Dyn 2006; 23:547-54. [PMID: 16494504 DOI: 10.1080/07391102.2006.10507079] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
Abstract
Prochlorococcus species are the first example of free-living bacteria with a reduced genome. Codon and amino acid usage bias of Prochlorococcus marinus MED4 was investigated using all protein-coding genes at least 100 amino acids long. Correspondence analysis of relative synonymous codon usage (RSCU) values shows no influence of translational selection on codon usage variation among the genes of this organism. However, amino acid usage differed markedly between highly and lowly expressed genes; in particular, GC-rich amino acids occurred significantly more often in highly expressed genes than in lowly expressed genes. Comparative analysis of the homologous genes of Synechococcus sp. WH8102 and Prochlorococcus marinus MED4 shows that amino acid conservation is significantly higher in highly expressed genes than in lowly expressed genes. Based on these results, we concluded that conservation of ancestral GC-rich amino acids in the highly expressed genes is the major source of variation in amino acid usage in the organism.
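For readers unfamiliar with the RSCU measure used in this abstract: the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch under stated assumptions, not the authors' code: it uses the standard genetic code, skips stop codons and single-codon amino acids (Met, Trp), and the names `CODON_TABLE` and `rscu` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU per codon: observed count * family size / family total."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if len(family) < 2 or total == 0:
            continue  # no synonymous choice, or amino acid absent
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Hypothetical toy input: four leucine codons, CTG three times and CTA once.
toy = rscu("CTGCTGCTGCTA")
print(toy["CTG"], toy["CTA"], toy["TTA"])
```

An RSCU of 1 means a codon is used exactly as often as expected under equal usage; values above 1 indicate preferred codons. Correspondence analysis, as in the abstract, is then run on the gene-by-codon matrix of such values.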
Affiliation(s)
- T Banerjee
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
30
|
Sällström B, Arnaout RA, Davids W, Bjelkmar P, Andersson SGE. Protein evolutionary rates correlate with expression independently of synonymous substitutions in Helicobacter pylori. J Mol Evol 2006; 62:600-14. [PMID: 16586017 DOI: 10.1007/s00239-005-0104-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/25/2005] [Accepted: 12/20/2005] [Indexed: 11/29/2022]
Abstract
In free-living microorganisms, such as Escherichia coli and Saccharomyces cerevisiae, both synonymous and nonsynonymous substitution frequencies correlate with expression levels. Here, we have tested the hypothesis that the correlation between amino acid substitution rates and expression is a by-product of selection for codon bias and translational efficiency in highly expressed genes. To this end, we have examined the correlation between protein evolutionary rates and expression in the human gastric pathogen Helicobacter pylori, where the absence of selection on synonymous sites enables the two types of substitutions to be uncoupled. The results revealed a statistically significant negative correlation between expression levels and nonsynonymous substitutions in both H. pylori and E. coli. We also found that neighboring genes located on the same, but not on opposite strands, evolve at significantly more similar rates than random gene pairs, as expected by co-expression of genes located in the same operon. However, the two species differ in that synonymous substitutions show a strand-specific pattern in E. coli, whereas the weak similarity in synonymous substitutions for neighbors in H. pylori is independent of gene orientation. These results suggest a direct influence of expression levels on nonsynonymous substitution frequencies independent of codon bias and selective constraints on synonymous sites.
Affiliation(s)
- Björn Sällström
- Program of Molecular Evolution, Department of Evolution, Genomics and Systematics, Evolutionary Biology Center, Uppsala University, 752 36 Uppsala, Sweden
|
31
|
Das S, Paul S, Dutta C. Evolutionary constraints on codon and amino acid usage in two strains of human pathogenic actinobacteria Tropheryma whipplei. J Mol Evol 2006; 62:645-58. [PMID: 16557339 DOI: 10.1007/s00239-005-0164-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 07/04/2005] [Accepted: 12/20/2005] [Indexed: 12/13/2022]
Abstract
The factors governing codon and amino acid usages in the predicted protein-coding sequences of Tropheryma whipplei TW08/27 and Twist genomes have been analyzed. Multivariate analysis identifies the replicational-transcriptional selection coupled with DNA strand-specific asymmetric mutational bias as a major driving force behind the significant interstrand variations in synonymous codon usage patterns in T. whipplei genes, while a residual intrastrand synonymous codon bias is imparted by a selection force operating at the level of translation. The strand-specific mutational pressure has little influence on the amino acid usage, for which the mean hydropathy level and aromaticity are the major sources of variation, both having nearly equal impact. In spite of the intracellular lifestyle, the amino acid usage in highly expressed gene products of T. whipplei follows the cost-minimization hypothesis. The products of the highly expressed genes of these relatively A + T-rich actinobacteria prefer to use the residues encoded by GC-rich codons, probably due to greater conservation of a GC-rich ancestral state in the highly expressed genes, as suggested by the lower values of the rate of nonsynonymous divergences between orthologous sequences of highly expressed genes from the two strains of T. whipplei. Both the genomes under study are characterized by the presence of two distinct groups of membrane-associated genes, products of which exhibit significant differences in primary and potential secondary structures as well as in the propensity of protein disorder.
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Centre, Indian Institute of Chemical Biology, 4 Raja S. C. Mullick Road, Kolkata 700 032, India
|
32
|
Huerta AM, Collado-Vides J, Francino MP. Proceedings of the SMBE Tri-National Young Investigators' Workshop 2005. Positional conservation of clusters of overlapping promoter-like sequences in enterobacterial genomes. Mol Biol Evol 2006; 23:997-1010. [PMID: 16547149 DOI: 10.1093/molbev/msk004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/22/2023] Open
Abstract
The selective mechanisms operating in regulatory regions of bacterial genomes are poorly understood. We have previously shown that, in most bacterial genomes, regulatory regions contain high densities of sigma70 promoter-like signals that are significantly above the densities detected in nonregulatory genomic regions. In order to investigate the molecular evolutionary forces that operate in bacterial regulatory regions and how they affect the observed redundancy of promoter-like signals, we have undertaken a comparative analysis across the completely sequenced genomes of enteric gamma-proteobacteria. This analysis detects significant positional conservation of promoter-like signal clusters across enterics, sometimes in spite of strong primary sequence divergence. This suggests that the conservation of the nature and exact position of specific nucleotides is not necessarily the priority of selection for maintaining the transcriptional function in these bacteria. We have further characterized the structural conservation of the regulatory regions of dnaQ and crp across all enterics. These two regions differ in essentiality and mode of regulation, the regulation of crp being more complex and involving interactions with several transcription factors. This results in substantially different modes of evolution, with the dnaQ region appearing to evolve under stronger purifying selection and the crp region showing the likely effects of stabilizing selection for a complex pattern of gene expression. The higher flexibility of the crp region is consistent with the observed lower conservation of global regulators in evolution. Patterns of regulatory evolution are also found to be markedly different in endosymbiotic bacteria, in a manner consistent with regulatory regions suffering some level of degradation, as has been observed for many other characters in these genomes.
Therefore, the mode of evolution of bacterial regulatory regions appears to be highly dependent on both the lifestyle of the bacterium and the specific regulatory requirements of different genes. In fact, in many bacteria, the mode of evolution of genes requiring significant physiological adaptability in expression levels may follow patterns similar to those operating in the more complex regulatory regions of eukaryotic genomes.
Affiliation(s)
- Araceli M Huerta
- Evolutionary Genomics Department, Lawrence Berkeley National Laboratory, Walnut Creek, CA, USA.
|
33
|
Degnan PH, Lazarus AB, Wernegreen JJ. Genome sequence of Blochmannia pennsylvanicus indicates parallel evolutionary trends among bacterial mutualists of insects. Genome Res 2005; 15:1023-33. [PMID: 16077009 PMCID: PMC1182215 DOI: 10.1101/gr.3771305] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.9] [Indexed: 11/24/2022]
Abstract
The distinct lifestyle of obligately intracellular bacteria can alter fundamental forces that drive and constrain genome change. In this study, sequencing the 792-kb genome of Blochmannia pennsylvanicus, an obligate endosymbiont of Camponotus pennsylvanicus, enabled us to trace evolutionary changes that occurred in the context of a bacterial-ant association. Comparison to the genome of Blochmannia floridanus reveals differential loss of genes involved in cofactor biosynthesis, the composition and structure of the cell wall and membrane, gene regulation, and DNA replication. However, the two Blochmannia species show complete conservation in the order and strand orientation of shared genes. This finding of extreme stasis in genome architecture, also reported previously for the aphid endosymbiont Buchnera, suggests that genome stability characterizes long-term bacterial mutualists of insects and constrains their evolutionary potential. Genome-wide analyses of protein divergences reveal 10- to 50-fold faster amino acid substitution rates in Blochmannia compared to related bacteria. Despite these varying features of genome evolution, a striking correlation in the relative divergences of proteins indicates parallel functional constraints on gene functions across ecologically distinct bacterial groups. Furthermore, the increased rates of amino acid substitution and gene loss in Blochmannia have occurred in a lineage-specific fashion, which may reflect life history differences of their ant hosts.
Affiliation(s)
- Patrick H Degnan
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts 02543, USA
|
34
|
Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH. Why highly expressed proteins evolve slowly. Proc Natl Acad Sci U S A 2005; 102:14338-43. [PMID: 16176987 PMCID: PMC1242296 DOI: 10.1073/pnas.0504070102] [Citation(s) in RCA: 605] [Impact Index Per Article: 30.3] [Indexed: 11/18/2022] Open
Abstract
Much recent work has explored molecular and population-genetic constraints on the rate of protein sequence evolution. The best predictor of evolutionary rate is expression level, for reasons that have remained unexplained. Here, we hypothesize that selection to reduce the burden of protein misfolding will favor protein sequences with increased robustness to translational missense errors. Pressure for translational robustness increases with expression level and constrains sequence evolution. Using several sequenced yeast genomes, global expression and protein abundance data, and sets of paralogs traceable to an ancient whole-genome duplication in yeast, we rule out several confounding effects and show that expression level explains roughly half the variation in Saccharomyces cerevisiae protein evolutionary rates. We examine causes for expression's dominant role and find that genome-wide tests favor the translational robustness explanation over existing hypotheses that invoke constraints on function or translational efficiency. Our results suggest that proteins evolve at rates largely unrelated to their functions and can explain why highly expressed proteins evolve slowly across the tree of life.
Affiliation(s)
- D Allan Drummond
- Program in Computation and Neural Systems and Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125-4100, USA.
|
35
|
Sun J, Chen M, Xu J, Luo J. Relationships among stop codon usage bias, its context, isochores, and gene expression level in various eukaryotes. J Mol Evol 2005; 61:437-44. [PMID: 16170455 DOI: 10.1007/s00239-004-0277-3] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 09/09/2004] [Accepted: 01/25/2005] [Indexed: 11/25/2022]
Abstract
It is well known that stop codons play a critical role in the process of protein synthesis. However, little effort has been made to investigate whether stop codon usage exhibits biases, such as widely seen for synonymous codon usage. Here we systematically investigate stop codon usage bias in various eukaryotes as well as its relationships with its context, GC3 content, gene expression level, and secondary structure. The results show that there is a strong bias for stop codon usage in different eukaryotes, i.e., UAA is overrepresented in the lower eukaryotes, UGA is overrepresented in the higher eukaryotes, and UAG is least used in all eukaryotes. Different conserved patterns for each stop codon in different eukaryotic classes are found based on information content and logo analysis. GC3 contents increase with increasing complexity of organisms. Secondary structure prediction revealed that UAA is generally associated with loop structures, whereas UGA is more uniformly present in loop and stem structures, i.e., UGA is less biased toward having a particular structure. The stop codon usage bias, however, shows no significant relationship with GC3 content and gene expression level in individual eukaryotes. The results indicate that genomic complexity and GC3 content might contribute to stop codon usage bias in different eukaryotes. Our results indicate that stop codons, like synonymous codons, exhibit biases in usage. Additional work will be needed to understand the causes of these biases and their relationship to the mechanism of protein termination.
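The basic quantity behind this abstract, the relative frequency of UAA, UAG and UGA as terminators, can be tallied as in the sketch below. This is an assumed, simplified illustration rather than the authors' method: the function name `stop_codon_usage` and the toy sequences are hypothetical, and each input is taken to be a complete coding sequence whose last three bases are its stop codon.

```python
from collections import Counter

STOP_CODONS = ("UAA", "UAG", "UGA")


def stop_codon_usage(cds_list):
    """Relative frequency of each terminal stop codon across coding sequences."""
    counts = Counter()
    for cds in cds_list:
        last = cds.upper().replace("T", "U")[-3:]  # accept DNA or RNA alphabet
        if last in STOP_CODONS:                    # skip sequences without a stop
            counts[last] += 1
    total = sum(counts.values())
    return {stop: (counts[stop] / total if total else 0.0) for stop in STOP_CODONS}


# Hypothetical toy gene set.
genes = ["ATGAAATAA", "ATGCCCTGA", "ATGGGGTAA", "ATGTTTTAG"]
print(stop_codon_usage(genes))
```

Comparing such frequency tables between genomes is the simplest form of the bias described in the abstract; the context, GC3 and secondary-structure analyses would need additional data not sketched here.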
Affiliation(s)
- Jingchun Sun
- School of Life Sciences & Technology, Shanghai Jiaotong University, Shanghai 200240, China
|
36
|
Baldridge GD, Burkhardt N, Herron MJ, Kurtti TJ, Munderloh UG. Analysis of fluorescent protein expression in transformants of Rickettsia monacensis, an obligate intracellular tick symbiont. Appl Environ Microbiol 2005; 71:2095-105. [PMID: 15812043 PMCID: PMC1082560 DOI: 10.1128/aem.71.4.2095-2105.2005] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Indexed: 11/20/2022] Open
Abstract
We developed and applied transposon-based transformation vectors for molecular manipulation and analysis of spotted fever group rickettsiae, which are obligate intracellular bacteria that infect ticks and, in some cases, mammals. Using the Epicentre EZ::TN transposon system, we designed transposons for simultaneous expression of a reporter gene and a chloramphenicol acetyltransferase (CAT) resistance marker. Transposomes (transposon-transposase complexes) were electroporated into Rickettsia monacensis, a rickettsial symbiont isolated from the tick Ixodes ricinus. Each transposon contained an expression cassette consisting of the rickettsial ompA promoter and a green fluorescent protein (GFP) reporter gene (GFPuv) or the ompB promoter and a red fluorescent protein reporter gene (DsRed2), followed by the ompA transcription terminator and a second ompA promoter CAT gene cassette. Selection with chloramphenicol gave rise to rickettsial populations with chromosomally integrated single-copy transposons as determined by PCR, Southern blotting, and sequence analysis. Reverse transcription-PCR and Northern blots demonstrated transcription of all three genes. GFPuv transformant rickettsiae exhibited strong fluorescence in individual cells, but DsRed2 transformants did not. Western blots confirmed expression of GFPuv in R. monacensis and in Escherichia coli, but DsRed2 was expressed only in E. coli. The DsRed2 gene, but not the GFPuv gene, contains many GC-rich amino acid codons that are rare in the preferred codon suite of rickettsiae, possibly explaining the failure to express DsRed2 protein in R. monacensis. We demonstrated that our vectors provide a means to study rickettsia-host cell interactions by visualizing GFPuv-fluorescent R. monacensis associated with actin tails in tick host cells.
Affiliation(s)
- Gerald D Baldridge
- Department of Entomology, University of Minnesota, 1980 Folwell Ave., St. Paul, MN 55108, USA.
|
37
|
Aksoy S, Rio RVM. Interactions among multiple genomes: tsetse, its symbionts and trypanosomes. INSECT BIOCHEMISTRY AND MOLECULAR BIOLOGY 2005; 35:691-8. [PMID: 15894186 DOI: 10.1016/j.ibmb.2005.02.012] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Accepted: 02/11/2005] [Indexed: 05/02/2023]
Abstract
Insect-borne diseases exact a high public health burden and have a devastating impact on livestock and agriculture. To date, control has proved to be exceedingly difficult. One such disease that has plagued sub-Saharan Africa is caused by the protozoan African trypanosomes (Trypanosoma species) and transmitted by tsetse flies (Diptera: Glossinidae). This presentation describes the biology of the tsetse fly and its interactions with trypanosomes as well as its symbionts. Tsetse can harbor up to three distinct microbial symbionts, including two enterics (Wigglesworthia glossinidia and Sodalis glossinidius) as well as facultative Wolbachia infections, which influence host physiology. Recent investigations into the genome of the obligate symbiont Wigglesworthia have revealed characteristics indicative of its long co-evolutionary history with the tsetse host species. Comparative analysis of the commensal-like Sodalis with free-living enterics provides examples of adaptations to the host environment (physiology and ecology), reflecting genomic tailoring events during the process of transitioning into a symbiotic lifestyle. From an applied perspective, the extensive knowledge accumulated on the genomic and developmental biology of the symbionts coupled with our ability to both express foreign genes in these microbes in vitro and repopulate tsetse midguts with these engineered microbes now provides a means to interfere with the host physiological traits which contribute to vector competence promising a novel tool for disease management.
Affiliation(s)
- Serap Aksoy
- Department of Epidemiology and Public Health, Yale University School of Medicine, 60 College St., 606 LEPH, New Haven, CT 06510, USA.
|
38
|
Schaber J, Rispe C, Wernegreen J, Buness A, Delmotte F, Silva FJ, Moya A. Gene expression levels influence amino acid usage and evolutionary rates in endosymbiotic bacteria. Gene 2005; 352:109-117. [PMID: 15935576 DOI: 10.1016/j.gene.2005.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 09/21/2004] [Revised: 01/25/2005] [Accepted: 04/01/2005] [Indexed: 02/07/2023]
Abstract
Most endosymbiotic bacteria have extremely reduced genomes, accelerated evolutionary rates, and strong AT base compositional bias thought to reflect reduced efficacy of selection and increased mutational pressure. Here, we present a comparative study of evolutionary forces shaping five fully sequenced bacterial endosymbionts of insects. The results of this study were three-fold: (i) Stronger conservation of high expression genes at not just nonsynonymous, but also synonymous, sites. (ii) Variation in amino acid usage strongly correlates with GC content and expression level of genes. This pattern is largely explained by greater conservation of high expression genes, leading to their higher GC content. However, we also found indication of selection favoring GC-rich amino acids that contrasts with former studies. (iii) Although the specific nutritional requirements of the insect host are known to affect gene content of endosymbionts, we found no detectable influence on substitution rates, amino acid usage, or codon usage of bacterial genes involved in host nutrition.
Affiliation(s)
- Jörg Schaber
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de Valencia, A.C. 22085, 46071 Valencia, Spain.
|
39
|
Sharp PM, Bailes E, Grocock RJ, Peden JF, Sockett RE. Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res 2005; 33:1141-53. [PMID: 15728743 PMCID: PMC549432 DOI: 10.1093/nar/gki242] [Citation(s) in RCA: 299] [Impact Index Per Article: 15.0] [Received: 12/15/2004] [Revised: 01/10/2005] [Accepted: 01/23/2005] [Indexed: 12/21/2022] Open
Abstract
Among bacteria, many species have synonymous codon usage patterns that have been influenced by natural selection for those codons that are translated more accurately and/or efficiently. However, in other species selection appears to have been ineffective. Here, we introduce a population genetics-based model for quantifying the extent to which selection has been effective. The approach is applied to 80 phylogenetically diverse bacterial species for which whole genome sequences are available. The strength of selected codon usage bias, S, is found to vary substantially among species; in 30% of the genomes examined, there was no significant evidence that selection had been effective. Values of S are highly positively correlated with both the number of rRNA operons and the number of tRNA genes. These results are consistent with the hypothesis that species exposed to selection for rapid growth have more rRNA operons, more tRNA genes and more strongly selected codon usage bias. For example, Clostridium perfringens, the species with the highest value of S, can have a generation time as short as 7 min.
Affiliation(s)
- Paul M Sharp
- Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham NG7 2UH, UK.
|
40
|
Banerjee T, Basak S, Gupta SK, Ghosh TC. Evolutionary forces in shaping the codon and amino acid usages in Blochmannia floridanus. J Biomol Struct Dyn 2005; 22:13-23. [PMID: 15214801 DOI: 10.1080/07391102.2004.10506976] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 10/28/2022]
Abstract
Endosymbiotic relationships have a great effect on ecological systems. Codon and amino acid usage bias of the endosymbiotic bacterium Blochmannia floridanus (whose host is the ant Camponotus floridanus) was investigated using experimentally known genes of this organism. Correspondence analysis of RSCU values shows that there is a single major explanatory axis, which is linked to strand-specific mutational biases. The majority of the genes tend to be concentrated on the leading strand, which may be an adaptive property related to the replication mechanism. Amino acid usage differed markedly between highly and lowly expressed genes in this organism; in particular, GC-rich amino acids occurred significantly more often in highly expressed genes than in lowly expressed genes. Comparative analyses of the orthologous genes of Escherichia coli and Blochmannia floridanus show that highly expressed genes are significantly more conserved than lowly expressed genes. Based on these results, we concluded that strand-specific mutational bias strongly shapes codon usage in this organism. Replicational-transcriptional selection can be inferred from the presence of the majority of highly expressed genes on the leading strand. Conservation of ancestral GC-rich amino acids in the highly expressed genes is the major source of variation in amino acid usage in the organism. Hydrophobicity of the gene products is the second major source of variation in amino acid usage among the genes of this organism.
Affiliation(s)
- T Banerjee
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
41
|
Herbeck JT, Degnan PH, Wernegreen JJ. Nonhomogeneous model of sequence evolution indicates independent origins of primary endosymbionts within the enterobacteriales (gamma-Proteobacteria). Mol Biol Evol 2004; 22:520-32. [PMID: 15525700 DOI: 10.1093/molbev/msi036] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.5] [Indexed: 12/25/2022] Open
Abstract
Standard methods of phylogenetic reconstruction are based on models that assume homogeneity of nucleotide composition among taxa. However, this assumption is often violated in biological data sets. In this study, we examine possible effects of nucleotide heterogeneity among lineages on the phylogenetic reconstruction of a bacterial group that spans a wide range of genomic nucleotide contents: obligately endosymbiotic bacteria and free-living or commensal species in the gamma-Proteobacteria. We focus on AT-rich primary endosymbionts to better understand the origins of obligately intracellular lifestyles. Previous phylogenetic analyses of this bacterial group point to the importance of accounting for base compositional variation in estimating relationships, particularly between endosymbiotic and free-living taxa. Here, we develop an approach to compare susceptibility of various phylogenetic reconstruction methods to the effects of nucleotide heterogeneity. First, we identify candidate trees of gamma-Proteobacteria groEL and 16S rRNA using approaches that assume homogeneous and stationary base composition, including Bayesian, maximum likelihood, parsimony, and distance methods. We then create permutations of the resulting candidate trees by varying the placement of the AT-rich endosymbiont Buchnera. These permutations are evaluated under the nonhomogeneous and nonstationary maximum likelihood model of Galtier and Gouy, which allows equilibrium base content to vary among examined lineages. Our results show that commonly used phylogenetic methods produce incongruent trees of the Enterobacteriales, and that the placement of Buchnera is especially unstable. However, under a nonhomogeneous model, various groEL and 16S rRNA phylogenies that separate Buchnera from other AT-rich endosymbionts (Blochmannia and Wigglesworthia) have consistently and significantly higher likelihood scores. 
Blochmannia and Wigglesworthia appear to have evolved from secondary endosymbionts, and represent an origin of primary endosymbiosis that is independent from Buchnera. This application of a nonhomogeneous model offers a computationally feasible way to test specific phylogenetic hypotheses for taxa with heterogeneous and nonstationary base composition.
Affiliation(s)
- Joshua T Herbeck
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, USA.
|