201
Puigbò P, Romeu A, Garcia-Vallvé S. HEG-DB: a database of predicted highly expressed genes in prokaryotic complete genomes under translational selection. Nucleic Acids Res 2007; 36:D524-7. [PMID: 17933767 PMCID: PMC2238906 DOI: 10.1093/nar/gkm831] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
The highly expressed genes database (HEG-DB) is a genomic database that includes the prediction of which genes are highly expressed in prokaryotic complete genomes under strong translational selection. The current version of the database contains general features for almost 200 genomes under translational selection, including the correspondence analysis of the relative synonymous codon usage for all genes, and the analysis of their highly expressed genes. For each genome, the database contains functional and positional information about the predicted group of highly expressed genes. This information can also be accessed using a search engine. Among other statistical parameters, the database also provides the Codon Adaptation Index (CAI) for all of the genes using the codon usage of the highly expressed genes as a reference set. The 'Pathway Tools Omics Viewer' from the BioCyc database enables the metabolic capabilities of each genome to be explored, particularly those related to the group of highly expressed genes. The HEG-DB is freely available at http://genomes.urv.cat/HEG-DB.
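The Codon Adaptation Index mentioned in this abstract can be illustrated with a short sketch. It follows the commonly cited formulation (relative adaptiveness of each codon within its synonymous family, then a geometric mean over the gene) and is not taken from the HEG-DB code. The function names, the toy reference set, and the 0.5 pseudocount for unseen codons are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons (a trailing partial codon is dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """Weight of each codon = its count / count of the most used synonymous codon.

    Met, Trp and stop codons are skipped. Codons absent from the reference
    set get a pseudocount of 0.5 (an assumption of this sketch).
    """
    counts = Counter(c for g in reference_genes for c in codons(g) if c in CODON_TABLE)
    weights = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "*MW":
            continue
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        best = max(counts[c] for c in family)
        if best > 0:  # families never seen in the reference set are ignored
            weights[codon] = max(counts[codon], 0.5) / best
    return weights


def cai(gene, weights):
    """Geometric mean of the codon weights over the scored codons of a gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set standing in for a predicted group of highly expressed genes.
reference = ["".join(["CTG"] * 3 + ["CTA"] + ["AAA"] * 2 + ["AAG"])]
weights = relative_adaptiveness(reference)
print(cai("CTGAAA", weights), cai("CTAAAG", weights))
```

A gene built only from the preferred codons of the reference set scores 1, and the score falls as rarer synonymous codons are used.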
Affiliation(s)
- Pere Puigbò
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel·li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain.
202
Vladimirov NV, Likhoshvai VA, Matushkin YG. Correlation of codon biases and potential secondary structures with mRNA translation efficiency in unicellular organisms. Mol Biol 2007. [DOI: 10.1134/s0026893307050184] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
203
Fosmids of novel marine Planctomycetes from the Namibian and Oregon coast upwelling systems and their cross-comparison with planctomycete genomes. ISME JOURNAL 2007; 1:419-35. [DOI: 10.1038/ismej.2007.63] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Indexed: 12/30/2022]
204
Konstantinidis K, Tebbe A, Klein C, Scheffer B, Aivaliotis M, Bisle B, Falb M, Pfeiffer F, Siedler F, Oesterhelt D. Genome-wide proteomics of Natronomonas pharaonis. J Proteome Res 2007; 6:185-93. [PMID: 17203963 DOI: 10.1021/pr060352q] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The aerobic, haloalkaliphilic archaeon Natronomonas pharaonis is able to survive in salt-saturated lakes of pH 11. According to genome analysis, the theoretical proteome consists of 2843 proteins. To reach further conclusions about its cellular physiology, the cytosolic protein inventory of Nmn. pharaonis has been analyzed using MS/MS on an ESI-Q-TOF mass spectrometer coupled on-line with a nanoLC system. The efficiency of this shotgun approach is illustrated by the identification of 929 proteins of which 886 are soluble proteins representing 41% of the cytosolic proteome. Cell lysis under denaturing conditions in water with subsequent separation by SDS-PAGE prior to nanoLC-MS/MS resulted in identification of 700 proteins. The same number (but a different subset) of proteins was identified upon cell lysis under native conditions followed by size fractionation (retaining protein complexes) prior to SDS-PAGE. Additional size fractionation reduced sample complexity and increased identification reliability. The set of identified proteins covers about 60% of the cytosolic proteins involved in metabolism and genetic information processing. Many of the identified proteins illustrate the high genetic variability among the halophilic archaea.
Affiliation(s)
- Kosta Konstantinidis
- Department of Membrane Biochemistry, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
205
Willenbrock H, Friis C, Friis AS, Ussery DW. An environmental signature for 323 microbial genomes based on codon adaptation indices. Genome Biol 2007; 7:R114. [PMID: 17156429 PMCID: PMC1794427 DOI: 10.1186/gb-2006-7-12-r114] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 07/28/2006] [Revised: 09/20/2006] [Accepted: 12/07/2006] [Indexed: 11/23/2022] Open
Abstract
The correlation of two methods for estimating codon adaptation indices applied to more than 300 bacterial species shows that codon usage preference provides an environmental signature by which it is possible to group bacteria according to their lifestyle. BACKGROUND Codon adaptation indices (CAIs) represent an evolutionary strategy to modulate gene expression and have widely been used to predict potentially highly expressed genes within microbial genomes. Here, we evaluate and compare two very different methods for estimating CAI values, one corresponding to translational codon usage bias and the second obtained mathematically by searching for the most dominant codon bias. RESULTS The level of correlation between these two CAI methods is a simple and intuitive measure of the degree of translational bias in an organism, and from this we confirm that fast replicating bacteria are more likely to have a dominant translational codon usage bias than are slow replicating bacteria, and that this translational codon usage bias may be used for prediction of highly expressed genes. By analyzing more than 300 bacterial genomes, as well as five fungal genomes, we show that codon usage preference provides an environmental signature by which it is possible to group bacteria according to their lifestyle, for instance soil bacteria and soil symbionts, spore formers, enteric bacteria, aquatic bacteria, and intercellular and extracellular pathogens. CONCLUSION The results and the approach described here may be used to acquire new knowledge regarding species lifestyle and to elucidate relationships between organisms that are far apart evolutionarily.
Affiliation(s)
- Hanni Willenbrock
- Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark, DK-2800 Lyngby, Denmark
- Carsten Friis
- Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark, DK-2800 Lyngby, Denmark
- Agnieszka S Friis
- Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark, DK-2800 Lyngby, Denmark
- David W Ussery
- Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark, DK-2800 Lyngby, Denmark
206
Puigbò P, Guzmán E, Romeu A, Garcia-Vallvé S. OPTIMIZER: a web server for optimizing the codon usage of DNA sequences. Nucleic Acids Res 2007; 35:W126-31. [PMID: 17439967 PMCID: PMC1933141 DOI: 10.1093/nar/gkm219] [Citation(s) in RCA: 417] [Impact Index Per Article: 23.2] [Indexed: 12/03/2022] Open
Abstract
OPTIMIZER is an on-line application that optimizes the codon usage of a gene to increase its expression level. Three methods of optimization are available: the ‘one amino acid–one codon’ method, a guided random method based on a Monte Carlo algorithm, and a new method designed to maximize the optimization with the fewest changes in the query sequence. One of the main features of OPTIMIZER is that it makes it possible to optimize a DNA sequence using pre-computed codon usage tables from a predicted group of highly expressed genes from more than 150 prokaryotic species under strong translational selection. These groups of highly expressed genes have been predicted using a new iterative algorithm. In addition, users can use, as a reference set, a pre-computed table containing the mean codon usage of ribosomal protein genes and, as a novelty, the tRNA gene-copy numbers. OPTIMIZER is accessible free of charge at http://genomes.urv.es/OPTIMIZER.
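The 'one amino acid–one codon' idea named in this abstract can be sketched as follows: every codon is replaced by the most frequent synonymous codon in a reference usage table. This is an illustrative reading of the method's name, not OPTIMIZER's actual implementation. The function names and the toy usage counts are hypothetical, and leaving stop codons and unrecognized triplets unchanged is a choice made for this sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(usage):
    """Map each amino acid to its most frequent codon in `usage` (codon -> count).

    Ties, including families missing from `usage`, resolve to the first
    codon in table order.
    """
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or usage.get(codon, 0) > usage.get(best[aa], 0):
            best[aa] = codon
    return best


def optimize_one_codon(seq, usage):
    """Recode `seq` so each amino acid always uses its preferred codon."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    best = preferred_codons(usage)
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)
        # Stop codons and unrecognized triplets are kept as they are.
        out.append(codon if aa is None or aa == "*" else best[aa])
    return "".join(out)


def translate(seq):
    """Translate a coding sequence with the standard code (helper for checking)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Hypothetical codon counts standing in for a highly expressed reference set.
toy_usage = {"CTG": 50, "CTA": 5, "AAA": 30, "AAG": 10, "GGC": 20, "GGT": 25}
print(optimize_one_codon("CTAAAGGGCTAA", toy_usage))
```

Because only synonymous substitutions are made, the encoded protein is unchanged; the trade-off is that all codon diversity within each amino acid is removed, which is one reason the abstract also lists a guided random method and a minimal-change method.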
Affiliation(s)
- Pere Puigbò
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel·li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain and Institut Català de la Salut, Àrea Bàsica de Salut, Tarragona 2, Spain
- Eduard Guzmán
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel·li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain and Institut Català de la Salut, Àrea Bàsica de Salut, Tarragona 2, Spain
- Antoni Romeu
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel·li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain and Institut Català de la Salut, Àrea Bàsica de Salut, Tarragona 2, Spain
- Santiago Garcia-Vallvé
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel·li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain and Institut Català de la Salut, Àrea Bàsica de Salut, Tarragona 2, Spain
- *To whom correspondence should be addressed. +34 977558778; +34 977558232
207
Kalia VC, Lal S, Cheema S. Insight in to the phylogeny of polyhydroxyalkanoate biosynthesis: Horizontal gene transfer. Gene 2007; 389:19-26. [PMID: 17113245 DOI: 10.1016/j.gene.2006.09.010] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Received: 02/08/2006] [Revised: 06/29/2006] [Accepted: 09/25/2006] [Indexed: 11/26/2022]
Abstract
Polyhydroxyalkanoates (PHAs) are gaining importance worldwide because of their structural diversity and close analogy to plastics. Their biodegradability makes them extremely desirable substitutes for synthetic plastics. PHAs are produced in organisms under certain stress conditions. Here, we investigated 253 sequenced genomes (complete and unfinished) for the diversity and phylogenetics of PHA biosynthesis. Discrepancies in the phylogenetic trees for the phaA, phaB and phaC genes of PHA biosynthesis have led to the suggestion that horizontal gene transfer (HGT) may be a major contributor to its evolution. Twenty-four organisms belonging to diverse taxa were found to be involved in HGT. Among these, Bacillus cereus ATCC 14579 and Xanthomonas axonopodis pv. citri str. 306 seem to have acquired all three genes through HGT events and have not been characterized so far as PHA producers. This study also revealed certain potential organisms such as Streptomyces coelicolor A3(2), Staphylococcus epidermidis ATCC 12228, Brucella suis 1330, Burkholderia sp., DSMZ 9242 and Leptospira interrogans serovar lai str. 56601, which can be transformed into novel PHA producers through recombinant DNA technology.
Affiliation(s)
- Vipin C Kalia
- Environmental Biotechnology, Institute of Genomics and Integrative Biology (CSIR), Delhi, India.
208
Revisiting the directional mutation pressure theory: The analysis of a particular genomic structure in Leishmania major. Gene 2006; 385:28-40. [DOI: 10.1016/j.gene.2006.04.031] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/30/2005] [Accepted: 04/04/2006] [Indexed: 11/20/2022]
209
Carbone A. Computational prediction of genomic functional cores specific to different microbes. J Mol Evol 2006; 63:733-46. [PMID: 17103060 DOI: 10.1007/s00239-005-0250-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 10/20/2005] [Accepted: 07/10/2006] [Indexed: 10/23/2022]
Abstract
Computational and experimental studies have tried to characterize a universal core of genes representing the minimal set of functional needs for an organism. Based on the increasing number of available complete genomes, comparative genomics has concluded that the universal core contains < 50 genes. In contrast, experiments suggest a much larger set of essential genes (certainly more than several hundred, even under the most restrictive hypotheses) that is dependent on the biological complexity and environmental specificity of the organism. Highly biased genes, which are generally also the most expressed in translationally biased organisms, tend to be overrepresented in the class of genes deemed to be essential for any given bacterial species. This association is far from perfect; nevertheless, it allows us to propose a new computational method to detect, to a certain extent, ubiquitous genes, nonorthologous genes, environment-specific genes, genes involved in the stress response, and genes with no identified function but highly likely to be essential for the cell. Most of these groups of genes cannot be identified with previously attempted computational and experimental approaches. The large variety of life-styles and the unusually detectable functional signals characterizing translationally biased organisms suggest using them as reference organisms to infer essentiality in other microbial species. The case of small parasitic genomes is discussed. Data issued by the analysis are compared with previous computational and experimental studies. Results are discussed both on methodological and biological grounds.
Affiliation(s)
- Alessandra Carbone
- Génomique Analytique, Université Pierre et Marie Curie-Paris 6, INSERM U511, 91, Bd de l'Hôpital, 75013, Paris, France.
210
Guan Q, Zheng W, Tang S, Liu X, Zinkel RA, Tsui KW, Yandell BS, Culbertson MR. Impact of nonsense-mediated mRNA decay on the global expression profile of budding yeast. PLoS Genet 2006; 2:e203. [PMID: 17166056 PMCID: PMC1657058 DOI: 10.1371/journal.pgen.0020203] [Citation(s) in RCA: 102] [Impact Index Per Article: 5.4] [Received: 05/23/2006] [Accepted: 10/18/2006] [Indexed: 11/19/2022] Open
Abstract
Nonsense-mediated mRNA decay (NMD) is a eukaryotic mechanism of RNA surveillance that selectively eliminates aberrant transcripts coding for potentially deleterious proteins. NMD also functions in the normal repertoire of gene expression. In Saccharomyces cerevisiae, hundreds of endogenous RNA Polymerase II transcripts achieve steady-state levels that depend on NMD. For some, the decay rate is directly influenced by NMD (direct targets). For others, abundance is NMD-sensitive but without any effect on the decay rate (indirect targets). To distinguish between direct and indirect targets, total RNA from wild-type (Nmd+) and mutant (Nmd−) strains was probed with high-density arrays across a 1-h time window following transcription inhibition. Statistical models were developed to describe the kinetics of RNA decay. 45% ± 5% of RNAs targeted by NMD were predicted to be direct targets with altered decay rates in Nmd− strains. Parallel experiments using conventional methods were conducted to empirically test predictions from the global experiment. The results show that the global assay reliably distinguished direct versus indirect targets. Different types of targets were investigated, including transcripts containing adjacent, disabled open reading frames, upstream open reading frames, and those prone to out-of-frame initiation of translation. Known targeting mechanisms fail to account for all of the direct targets of NMD, suggesting that additional targeting mechanisms remain to be elucidated. 30% of the protein-coding targets of NMD fell into two broadly defined functional themes: those affecting chromosome structure and behavior and those affecting cell surface dynamics. Overall, the results provide a preview for how expression profiles in multi-cellular eukaryotes might be impacted by NMD. 
Furthermore, the methods for analyzing decay rates on a global scale offer a blueprint for new ways to study mRNA decay pathways in any organism where cultured cell lines are available.

Genes determine the structure of proteins through transcription and translation in which an RNA copy of the gene is made (mRNA) and then translated to make the protein. Cellular protein levels reflect the relative rates of mRNA synthesis and degradation, which are subject to multiple layers of controls. Mechanisms also exist to ensure the quality of each mRNA. One quality control mechanism called nonsense-mediated mRNA decay (NMD) triggers the rapid degradation of mRNAs containing coding errors that would otherwise lead to the production of non-functional or potentially deleterious proteins. NMD occurs in yeasts, plants, flies, worms, mice, and humans. In humans, NMD affects the etiology of genetic disorders by affecting the expression of genes that carry disease-causing mutations. Besides quality assurance, NMD plays another role in gene expression by controlling the abundance of hundreds of normal mRNAs that are devoid of coding errors. In this paper, the authors used DNA arrays to monitor the relative decay rates of all mRNAs in budding yeast and found a subset where decay rates were dependent on NMD. Many of the corresponding proteins perform related functional roles affecting both the structure and behavior of chromosomes and the structure and integrity of the cell surface.
Affiliation(s)
- Qiaoning Guan
- Laboratories of Genetics and Molecular Biology, University of Wisconsin, Madison, Wisconsin, United States of America
- Wei Zheng
- Laboratories of Genetics and Molecular Biology, University of Wisconsin, Madison, Wisconsin, United States of America
- Shijie Tang
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Xiaosong Liu
- Department of Physics, University of Wisconsin, Madison, Wisconsin, United States of America
- Robert A Zinkel
- Laboratories of Genetics and Molecular Biology, University of Wisconsin, Madison, Wisconsin, United States of America
- Kam-Wah Tsui
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Brian S Yandell
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Horticulture, University of Wisconsin, Madison, Wisconsin, United States of America
- Michael R Culbertson
- Laboratories of Genetics and Molecular Biology, University of Wisconsin, Madison, Wisconsin, United States of America
- * To whom correspondence should be addressed. E-mail:
211
Bergman NH, Anderson EC, Swenson EE, Niemeyer MM, Miyoshi AD, Hanna PC. Transcriptional profiling of the Bacillus anthracis life cycle in vitro and an implied model for regulation of spore formation. J Bacteriol 2006; 188:6092-100. [PMID: 16923876 PMCID: PMC1595399 DOI: 10.1128/jb.00723-06] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.1] [Received: 05/19/2006] [Accepted: 06/26/2006] [Indexed: 11/20/2022] Open
Abstract
The life cycle of Bacillus anthracis includes both vegetative and endospore morphologies which alternate based on nutrient availability, and there is considerable evidence indicating that the ability of this organism to cause anthrax depends on its ability to progress through this life cycle in a regulated manner. Here we report the use of a custom B. anthracis GeneChip in defining the gene expression patterns that occur throughout the entire life cycle in vitro. Nearly 5,000 genes were expressed in five distinct waves of transcription as the bacteria progressed from germination through sporulation, and we identified a specific set of functions represented within each wave. We also used these data to define the temporal expression of the spore proteome, and in doing so we have demonstrated that much of the spore's protein content is not synthesized de novo during sporulation but rather is packaged from preexisting stocks. We explored several potential mechanisms by which the cell could control which proteins are packaged into the developing spore, and our analyses were most consistent with a model in which B. anthracis regulates the composition of the spore proteome based on protein stability. This study is by far the most comprehensive survey yet of the B. anthracis life cycle and serves as a useful resource in defining the growth-phase-dependent expression patterns of each gene. Additionally, the data and accompanying bioinformatics analyses suggest a model for sporulation that has broad implications for B. anthracis biology and offer new possibilities for microbial forensics and detection.
Affiliation(s)
- Nicholas H Bergman
- Department of Microbiology and Immunology, Bioinformatics Program, University of Michigan Medical School, 6605H Medical Sciences Bldg II, 1150 W. Medical Center Dr., Ann Arbor, MI 48109-0620, USA.
212
Uno R, Nakayama Y, Tomita M. Over-representation of Chi sequences caused by di-codon increase in Escherichia coli K-12. Gene 2006; 380:30-7. [PMID: 16854534 DOI: 10.1016/j.gene.2006.05.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 12/28/2005] [Revised: 04/20/2006] [Accepted: 05/09/2006] [Indexed: 11/17/2022]
Abstract
Chi sequences (5'-GCTGGTGG-3') are cis-acting 8 bp sequence elements that enhance homologous recombination promoted by the RecBCD pathway in Escherichia coli. The genome of E. coli K-12 MG1655 contains 1009 Chi sequences and this frequency far exceeds the expected value for occurrence of an 8 bp sequence in a genome of this size. It is generally thought that the over-representation of Chi sequences indicates that they have been selected for during evolution because of their function in recombination. The genes from three E. coli strains (K-12, O157 and CFT) were classified into three categories (island, match to other E. coli, and backbone). Island genes have a different base composition and codon usage in comparison with those in the backbone genes, therefore they were relatively new and not yet adapted to the base composition patterns and codon usage typical of the recipient genome. The over-representation of Chi sequences was examined by comparing Chi frequencies and codon frequencies between island and backbone genes. The difference in the CTGGTG di-codon frequency between the backbone and island genes was correlated with the frequency of Chi sequences which were translated in the Leu-Val (-G/CTG/GTG/G-) reading frame in the K-12 strain. These results suggest that the main reading frame of Chi sequences increased as a result of the di-codon CTG-GTG increasing under a genome-wide pressure for adapting to the codon usage and base composition of the E. coli K-12 strain, and that the RecBCD recombinase might adjust its recognition sequence to a frequently occurring oligomer such as G-CTG-GTG-G.
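The claim that 1009 Chi sequences far exceed the expected value can be made concrete with a back-of-the-envelope calculation. The sketch below assumes independent bases, counts both strands, and treats the genome as linear. The genome length of roughly 4.64 Mb and the uniform base frequencies are assumptions introduced here, not figures from the abstract, and the abstract does not state whether the 1009 count covers one or both strands.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_probability(motif, base_freqs):
    """Probability of the motif at one position, assuming independent bases."""
    p = 1.0
    for base in motif:
        p *= base_freqs[base]
    return p


def expected_occurrences(motif, genome_length, base_freqs, both_strands=True):
    """Expected number of motif hits in a random sequence of the given length."""
    positions = genome_length - len(motif) + 1
    expected = positions * motif_probability(motif, base_freqs)
    rc = reverse_complement(motif)
    if both_strands and rc != motif:
        expected += positions * motif_probability(rc, base_freqs)
    return expected


CHI = "GCTGGTGG"
GENOME_LENGTH = 4_639_675  # assumed size for E. coli K-12 MG1655; not given in the abstract
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # simplifying assumption

expected = expected_occurrences(CHI, GENOME_LENGTH, UNIFORM)
print(f"expected ~{expected:.0f}, observed 1009, ratio ~{1009 / expected:.1f}")
```

Under these assumptions the observed count is several-fold above the random expectation, which is the over-representation the abstract sets out to explain through di-codon usage rather than selection on recombination alone.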
Affiliation(s)
- Reina Uno
- Institute for Advanced Biosciences, Keio University, Tsuruoka, 997-0014, Japan.
213
Couturier E, Rocha EPC. Replication-associated gene dosage effects shape the genomes of fast-growing bacteria but only for transcription and translation genes. Mol Microbiol 2006; 59:1506-18. [PMID: 16468991 DOI: 10.1111/j.1365-2958.2006.05046.x] [Citation(s) in RCA: 150] [Impact Index Per Article: 7.9] [Indexed: 11/29/2022]
Abstract
The bidirectional replication of bacterial genomes leads to transient gene dosage effects. Here, we show that such effects shape the chromosome organisation of fast-growing bacteria and that they correlate strongly with maximal growth rate. Surprisingly the predicted maximal number of replication rounds shows little if any phylogenetic inertia, suggesting that it is a very labile trait. Yet, a combination of theoretical and statistical analyses predicts that dozens of replication forks may be simultaneously present in the cells of certain species. This suggests a strikingly efficient management of the replication apparatus, of replication fork arrests and of chromosome segregation in such cells. Gene dosage effects strongly constrain the position of genes involved in translation and transcription, but not other highly expressed genes. The relative proximity of the former genes to the origin of replication follows the regulatory dependencies observed under exponential growth, as the bias is stronger for RNA polymerase, then rDNA, then ribosomal proteins and tDNA. Within tDNAs we find that only the positions of the previously proposed 'ubiquitous' tRNA, which translate the most frequent codons in highly expressed genes, show strong signs of selection for gene dosage effects. Finally, we provide evidence for selection acting upon genome organisation to take advantage of gene dosage effects by identifying a positive correlation between genome stability and the number of simultaneous replication rounds. We also show that gene dosage effects can explain the over-representation of highly expressed genes in the largest replichore of genomes containing more than one chromosome. Together, these results demonstrate that replication-associated gene dosage is an important determinant of chromosome organisation and dynamics, especially among fast-growing bacteria.
Affiliation(s)
- Etienne Couturier
- Atelier de Bioinformatique, Université Pierre et Marie Curie, 12, Rue Cuvier, 75005 Paris, France
214
Gilchrist MA, Wagner A. A model of protein translation including codon bias, nonsense errors, and ribosome recycling. J Theor Biol 2006; 239:417-34. [PMID: 16171830 DOI: 10.1016/j.jtbi.2005.08.007] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Received: 01/18/2005] [Revised: 08/05/2005] [Accepted: 08/08/2005] [Indexed: 11/15/2022]
Abstract
We present and analyse a model of protein translation at the scale of an individual messenger RNA (mRNA) transcript. The model we develop is unique in that it incorporates the phenomena of ribosome recycling and nonsense errors. The model conceptualizes translation as a probabilistic wave of ribosome occupancy traveling down a heterogeneous medium, the mRNA transcript. Our results show that the heterogeneity of the codon translation rates along the mRNA results in short-scale spikes and dips in the wave. Nonsense errors attenuate this wave on a longer scale while ribosome recycling reinforces it. We find that the combination of nonsense errors and codon usage bias can have a large effect on the probability that a ribosome will completely translate a transcript. We also elucidate how these forces interact with ribosome recycling to determine the overall translation rate of an mRNA transcript. We derive a simple cost function for nonsense errors using our model and apply this function to the yeast (Saccharomyces cerevisiae) genome. Using this function we are able to detect position-dependent selection on codon bias which correlates with gene expression levels as predicted a priori. These results indirectly validate our underlying model assumptions and confirm that nonsense errors can play an important role in shaping codon usage bias.
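The interaction between nonsense errors and codon choice described in this abstract can be illustrated with a deliberately simplified calculation. The sketch assumes that each codon carries an independent, fixed probability of a nonsense error, so the chance of finishing a transcript is the product of the per-codon survival probabilities. This is not the authors' model (it ignores ribosome recycling and translation rates), and the function name and error probabilities are hypothetical.

```python
import math


def completion_probability(seq, nonsense_prob):
    """Probability that a ribosome reaches the end of `seq` without a nonsense error.

    `nonsense_prob` maps each codon to an assumed per-codon error probability.
    Logs are summed to stay numerically stable on long transcripts.
    """
    seq = seq.upper()
    log_p = 0.0
    for i in range(0, len(seq) - len(seq) % 3, 3):
        log_p += math.log1p(-nonsense_prob[seq[i:i + 3]])
    return math.exp(log_p)


# Hypothetical error rates: the rarer synonymous codon is assumed to be
# translated more slowly and therefore to be more error prone.
toy_rates = {"CTG": 0.001, "CTA": 0.004}

preferred = "CTG" * 100   # 100 leucines, preferred codon
rare = "CTA" * 100        # same protein, rare codon

print(completion_probability(preferred, toy_rates))
print(completion_probability(rare, toy_rates))
```

Even small per-codon differences compound over the length of a transcript, which gives an intuition for why the cost of a nonsense error, and hence selection on codon bias, would depend on position along the gene.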
Affiliation(s)
- Michael A Gilchrist
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 37996, USA.
215
Wu ZL, Bartleson CJ, Ham AJL, Guengerich FP. Heterologous expression, purification, and properties of human cytochrome P450 27C1. Arch Biochem Biophys 2006; 445:138-46. [PMID: 16360114 DOI: 10.1016/j.abb.2005.11.002] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Received: 10/27/2005] [Revised: 11/04/2005] [Accepted: 11/05/2005] [Indexed: 11/18/2022]
Abstract
Cytochrome P450 (P450) 27C1 is one of the "orphan" P450 enzymes without a known biological function. A human P450 27C1 cDNA with a nucleotide sequence modified for Escherichia coli usage was prepared and modified at the N-terminus, based on the expected mitochondrial localization. A derivative with residues 3-60 deleted was expressed at a level of 1350 nmol/L of E. coli culture and had the characteristic P450 spectra. The identity of the expressed protein was confirmed by mass spectrometry of proteolytic fragments. The purified P450 was in the low-spin iron state, and the spin equilibrium was not perturbed by any of the potential substrates vitamin D3, 1α- or 25-hydroxy vitamin D3, or cholesterol. P450s 27A1 and 27B1 are known to catalyze the 25-hydroxylation of vitamin D3 and the 1α-hydroxylation of 25-hydroxy vitamin D3, respectively. In the presence of recombinant human adrenodoxin and adrenodoxin reductase, recombinant P450 27C1 did not catalyze the oxidation of vitamin D3, 1α- or 25-hydroxy vitamin D3, or cholesterol at detectable rates. P450 27C1 mRNA was determined to be expressed in liver, kidney, pancreas, and several other human tissues.
Affiliation(s)
- Zhong-Liu Wu
- Department of Biochemistry and Center in Molecular Toxicology, Vanderbilt University School of Medicine, Nashville, TN 37232-0146, USA
216
Wu G, Bashir-Bello N, Freeland SJ. The Synthetic Gene Designer: a flexible web platform to explore sequence manipulation for heterologous expression. Protein Expr Purif 2005; 47:441-5. [PMID: 16376569 DOI: 10.1016/j.pep.2005.10.020] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Received: 09/30/2005] [Revised: 10/13/2005] [Accepted: 10/13/2005] [Indexed: 11/15/2022]
Abstract
"Codon optimization" is a general approach to improving heterologous expression where genes are moved from their native genomes into alternatives that exhibit different patterns of codon usage. However, despite reports of successful manipulations and the existence of stand-alone codon optimization software packages or commercial services that offer to redesign genes, the scientific community lacks any systematic understanding of what exactly it means to optimize codon usage. Thus we present a bona fide web application, the "Synthetic Gene Designer," which contrasts with existing software by providing a centralized, free, and transparent platform for the broader scientific community to develop knowledge about synthetic gene design. Consistent with this goal, our software is associated with a moderated e-forum that promotes discussion of synthetic gene design and offers technical support. In addition, the Synthetic Gene Designer presents enhanced functionality over existing software options: for example, it enables users to work with non-standard genetic codes, with user-defined patterns of codon usage and an expanded range of methods for codon optimization. The Synthetic Gene Designer, together with on-line tutorials and the forum, is available at .
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland at Baltimore County, 21250, USA.
|
217
|
Reva ON, Tümmler B. Differentiation of regions with atypical oligonucleotide composition in bacterial genomes. BMC Bioinformatics 2005; 6:251. [PMID: 16225667 PMCID: PMC1274298 DOI: 10.1186/1471-2105-6-251] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 06/07/2005] [Accepted: 10/14/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Complete sequencing of bacterial genomes has become a common technique of present day microbiology. Thereafter, data mining in the complete sequence is an essential step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes.
RESULTS: A total of 163 bacterial genomes of eubacteria and archaea published in the NCBI database were analyzed. Local OU patterns exhibit substantial intrachromosomal variation in bacteria. Loci with alternative OU patterns were parts of horizontally acquired gene islands or ancient regions such as genes for ribosomal proteins and RNAs. OU statistical parameters, such as local pattern deviation (D), pattern skew (PS) and OU variance (OUV) enabled the detection and visualization of gene islands of different functional classes.
CONCLUSION: A set of approaches has been designed for the statistical analysis of nucleotide sequences of bacterial genomes. These methods are useful for the visualization and differentiation of regions with atypical oligonucleotide composition prior to or accompanying gene annotation.
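The general idea of comparing local and genome-wide oligonucleotide usage can be illustrated with a short sketch. This is not the D, PS or OUV statistic of the paper, whose formulas the abstract does not give; it swaps in a plain total variation distance between tetranucleotide frequencies, and the function names, window size and step are hypothetical choices made only for illustration.

```python
from collections import Counter


def kmer_freqs(seq, k=4):
    """Relative frequencies of overlapping k-mers in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def window_deviation(genome, window=5000, step=2500, k=4):
    """Return (start, distance) pairs for sliding windows.

    distance is the total variation distance (0..1) between the
    k-mer frequencies of the window and of the whole sequence.
    This is an illustrative stand-in, not the published D statistic.
    """
    genome = genome.upper()
    global_f = kmer_freqs(genome, k)
    results = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        local_f = kmer_freqs(genome[start:start + window], k)
        keys = set(global_f) | set(local_f)
        dist = sum(abs(local_f.get(x, 0.0) - global_f.get(x, 0.0)) for x in keys) / 2
        results.append((start, dist))
    return results


# Toy sequence: a poly-G "island" embedded in a repetitive background.
toy = "ACGT" * 100 + "G" * 100 + "ACGT" * 100
scores = window_deviation(toy, window=100, step=100)
most_atypical_start = max(scores, key=lambda t: t[1])[0]
```

On the toy sequence the window covering the poly-G stretch receives the largest distance, which is the kind of signal one would then inspect as a candidate atypical region.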
Affiliation(s)
- Oleg N Reva
- Klinische Forschergruppe, OE6711, Medizinische Hochschule Hannover, Carl-Neuberg-Strasse 1, D-30625 Hannover, Germany
- Danylo Zabolotny Institute of Microbiology and Virology of the National Academy of Science of Ukraine, Dep. of Antibiotics, 154 Zabolotnogo Str., D03680, Kyiv GSP, Ukraine
- Burkhard Tümmler
- Klinische Forschergruppe, OE6711, Medizinische Hochschule Hannover, Carl-Neuberg-Strasse 1, D-30625 Hannover, Germany
|
218
|
Carbone A, Madden R. Insights on the evolution of metabolic networks of unicellular translationally biased organisms from transcriptomic data and sequence analysis. J Mol Evol 2005; 61:456-69. [PMID: 16187158 DOI: 10.1007/s00239-004-0317-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/09/2004] [Accepted: 04/20/2005] [Indexed: 11/27/2022]
Abstract
Codon bias is related to metabolic functions in translationally biased organisms, and two facts are argued about. First, genes with high codon bias describe in meaningful ways the metabolic characteristics of the organism; important metabolic pathways corresponding to crucial characteristics of the lifestyle of an organism, such as photosynthesis, nitrification, anaerobic versus aerobic respiration, sulfate reduction, methanogenesis, and others, happen to involve especially biased genes. Second, gene transcriptional levels of sets of experiments representing a significant variation of biological conditions strikingly confirm, in the case of Saccharomyces cerevisiae, that metabolic preferences are detectable by purely statistical analysis: the high metabolic activity of yeast during fermentation is encoded in the high bias of enzymes involved in the associated pathways, suggesting that this genome was affected by a strong evolutionary pressure that favored a predominantly fermentative metabolism of yeast in the wild. The ensemble of metabolic pathways involving enzymes with high codon bias is rather well defined and remains consistent across many species, even those that have not been considered as translationally biased, such as Helicobacter pylori, for instance, reveal some weak form of translational bias for this genome. We provide numerical evidence, supported by experimental data, of these facts and conclude that the metabolic networks of translationally biased genomes, observable today as projections of eons of evolutionary pressure, can be analyzed numerically and predictions of the role of specific pathways during evolution can be derived. The new concepts of Comparative Pathway Index, used to compare organisms with respect to their metabolic networks, and Evolutionary Pathway Index, used to detect evolutionarily meaningful bias in the genetic code from transcriptional data, are introduced.
Affiliation(s)
- Alessandra Carbone
- Génomique Analytique, Université Pierre et Marie Curie, INSERM U511, 91 Bd de l'Hôpital, 75013 Paris, France.
|
219
|
Grote A, Hiller K, Scheer M, Münch R, Nörtemann B, Hempel DC, Jahn D. JCat: a novel tool to adapt codon usage of a target gene to its potential expression host. Nucleic Acids Res 2005; 33:W526-31. [PMID: 15980527 PMCID: PMC1160137 DOI: 10.1093/nar/gki376] [Citation(s) in RCA: 1064] [Impact Index Per Article: 53.2] [Indexed: 12/04/2022]
Abstract
A novel method for the adaptation of target gene codon usage to most sequenced prokaryotes and selected eukaryotic gene expression hosts was developed to improve heterologous protein production. In contrast to existing tools, JCat (Java Codon Adaptation Tool) does not require the manual definition of highly expressed genes and is, therefore, a very rapid and easy method. Further options of JCat for codon adaptation include the avoidance of unwanted cleavage sites for restriction enzymes and Rho-independent transcription terminators. The output of JCat is both graphically and as Codon Adaptation Index (CAI) values given for the pasted sequence and the newly adapted sequence. Additionally, a list of genes in FASTA-format can be uploaded to calculate CAI values. In one example, all genes of the genome of Caenorhabditis elegans were adapted to Escherichia coli codon usage and further optimized to avoid commonly used restriction sites. In a second example, the Pseudomonas aeruginosa exbD gene codon usage was adapted to E. coli codon usage with parallel avoidance of the same restriction sites. For both, the degree of introduced changes was documented and evaluated. JCat is integrated into the PRODORIC database that hosts all required information on the various organisms to fulfill the requested calculations. JCat is freely accessible at .
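For readers unfamiliar with the Codon Adaptation Index mentioned above, the following is a minimal sketch of the classic Sharp and Li formulation: each codon gets a relative adaptiveness equal to its count divided by the count of the most frequent synonymous codon in a reference set, and the CAI of a gene is the geometric mean of those weights. It is not JCat's implementation. The function names are hypothetical, the 0.5 pseudo-count for unseen codons is an assumed convention, and the toy reference set is made up for the example.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split a coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """Weights w = count / max synonymous count, from a reference gene set."""
    counts = Counter(c for g in reference_genes for c in codons(g) if c in CODON_TABLE)
    by_aa = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        by_aa[aa].append(codon)
    weights = {}
    for aa, synonyms in by_aa.items():
        if aa == "*" or len(synonyms) == 1:
            continue  # stop codons, Met and Trp carry no usage information
        best = max(counts[c] for c in synonyms)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in synonyms:
            # Assumed convention: unseen codons get a pseudo-count of 0.5.
            weights[c] = max(counts[c], 0.5) / best
    return weights


def cai(gene, weights):
    """Geometric mean of codon weights; codons without a weight are skipped."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set (hypothetical): GCT x3, GCC x1, AAA x2, AAG x1.
w = relative_adaptiveness(["GCTGCTGCTGCCAAAAAAAAG"])
cai_preferred = cai("GCTAAA", w)
cai_mixed = cai("GCCAAG", w)
```

A gene built only from the most frequent codons of the reference set scores 1.0, while one using less frequent synonymous codons scores lower.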
Affiliation(s)
- Andreas Grote
- Institut für Mikrobiologie, Spielmannstraße 7, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Institut für Bioverfahrenstechnik, Gaußstraße 17, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Karsten Hiller
- Institut für Mikrobiologie, Spielmannstraße 7, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Maurice Scheer
- Institut für Mikrobiologie, Spielmannstraße 7, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Fachbereich für Informatik, Am Exer 2, Fachhochschule Wolfenbüttel, D-38302 Wolfenbüttel, Germany
- Richard Münch
- Institut für Mikrobiologie, Spielmannstraße 7, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Bernd Nörtemann
- Institut für Bioverfahrenstechnik, Gaußstraße 17, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Dietmar C. Hempel
- Institut für Bioverfahrenstechnik, Gaußstraße 17, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- Dieter Jahn
- Institut für Mikrobiologie, Spielmannstraße 7, Technische Universität Braunschweig, D-38106 Braunschweig, Germany
- To whom correspondence should be addressed. Tel: +49 531 391 5801; Fax: +49 531 391 5854;
|
220
|
Abstract
The expression of functional proteins in heterologous hosts is a cornerstone of modern biotechnology. Unfortunately, proteins are often difficult to express outside their original context. They might contain codons that are rarely used in the desired host, come from organisms that use non-canonical code or contain expression-limiting regulatory elements within their coding sequence. Improvements in the speed and cost of gene synthesis have facilitated the complete redesign of entire gene sequences to maximize the likelihood of high protein expression. Redesign strategies are discussed here, including modification of translation initiation regions, alteration of mRNA structural elements and use of different codon biases.
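The simplest of the codon-bias strategies alluded to above, picking the host's most frequent codon for every amino acid, can be sketched as follows. This is only an illustration of that one idea and ignores initiation regions and mRNA structure; the function name and the tiny usage table are hypothetical, and the counts are invented rather than taken from any real organism.

```python
def back_translate(protein, usage):
    """Back-translate a protein using the most frequent codon per amino acid.

    usage maps one-letter amino acid codes to {codon: count} dictionaries
    for the intended expression host.
    """
    dna = []
    for aa in protein.upper():
        if aa not in usage:
            raise KeyError(f"no codon usage data for amino acid {aa!r}")
        codon_counts = usage[aa]
        dna.append(max(codon_counts, key=codon_counts.get))
    return "".join(dna)


# Hypothetical, made-up usage counts for three amino acids.
TOY_USAGE = {
    "M": {"ATG": 10},
    "K": {"AAA": 7, "AAG": 3},
    "L": {"CTG": 9, "TTA": 1},
}

designed = back_translate("MKL", TOY_USAGE)
```

Real redesign tools typically balance this choice against other constraints, such as avoiding restriction sites or strong secondary structure near the start codon.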
|
221
|
Carbone A, Képès F, Zinovyev A. Codon bias signatures, organization of microorganisms in codon space, and lifestyle. Mol Biol Evol 2004; 22:547-61. [PMID: 15537809 DOI: 10.1093/molbev/msi040] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
Abstract
New and simple numerical criteria based on a codon adaptation index are applied to the complete genomic sequences of 80 Eubacteria and 16 Archaea, to infer weak and strong genome tendencies toward content bias, translational bias, and strand bias. These criteria can be applied to all microbial genomes, even those for which little biological information is known, and a codon bias signature, that is the collection of strong biases displayed by a genome, can be automatically derived. A codon bias space, where genomes are identified by their preferred codons, is proposed as a novel formal framework to interpret genomic relationships. Principal component analysis confirms that although GC content has a dominant effect on codon bias space, thermophilic and mesophilic species can be identified and separated by codon preferences. Two more examples concerning lifestyle are studied with linear discriminant analysis: suitable separating functions characterized by sets of preferred codons are provided to discriminate: translationally biased (hyper)thermophiles from mesophiles, and organisms with different respiratory characteristics, aerobic, anaerobic, facultative aerobic and facultative anaerobic. These results suggest that codon bias space might reflect the geometry of a prokaryotic "physiology space." Evolutionary perspectives are noted, numerical criteria and distances among organisms are validated on known cases, and various results and predictions are discussed both on methodological and biological grounds.
Affiliation(s)
- A Carbone
- Génomique Analytique, Université Pierre et Marie Curie, INSERM U511, 91, Bd de l'Hôpital, 75013 Paris, France.
|
222
|
Jordan IK, Mariño-Ramírez L, Wolf YI, Koonin EV. Conservation and coevolution in the scale-free human gene coexpression network. Mol Biol Evol 2004; 21:2058-70. [PMID: 15282333 DOI: 10.1093/molbev/msh222] [Citation(s) in RCA: 142] [Impact Index Per Article: 6.8] [Indexed: 12/13/2022]
Abstract
The role of natural selection in biology is well appreciated. Recently, however, a critical role for physical principles of network self-organization in biological systems has been revealed. Here, we employ a systems level view of genome-scale sequence and expression data to examine the interplay between these two sources of order, natural selection and physical self-organization, in the evolution of human gene regulation. The topology of a human gene coexpression network, derived from tissue-specific expression profiles, shows scale-free properties that imply evolutionary self-organization via preferential node attachment. Genes with numerous coexpressed partners (the hubs of the coexpression network) evolve more slowly on average than genes with fewer coexpressed partners, and genes that are coexpressed show similar rates of evolution. Thus, the strength of selective constraints on gene sequences is affected by the topology of the gene coexpression network. This connection is strong for the coding regions and 3' untranslated regions (UTRs), but the 5' UTRs appear to evolve under a different regime. Surprisingly, we found no connection between the rate of gene sequence divergence and the extent of gene expression profile divergence between human and mouse. This suggests that distinct modes of natural selection might govern sequence versus expression divergence, and we propose a model, based on rapid, adaptation-driven divergence and convergent evolution of gene expression patterns, for how natural selection could influence gene expression divergence.
Affiliation(s)
- I King Jordan
- National Center for Biotechnology Information, National Institutes of Health Bethesda, Maryland, USA
|
223
|
Davis JC, Petrov DA. Preferential duplication of conserved proteins in eukaryotic genomes. PLoS Biol 2004; 2:E55. [PMID: 15024414 PMCID: PMC368158 DOI: 10.1371/journal.pbio.0020055] [Citation(s) in RCA: 124] [Impact Index Per Article: 5.9] [Received: 10/29/2003] [Accepted: 12/18/2003] [Indexed: 11/18/2022]
Abstract
A central goal in genome biology is to understand the origin and maintenance of genic diversity. Over evolutionary time, each gene's contribution to the genic content of an organism depends not only on its probability of long-term survival, but also on its propensity to generate duplicates that are themselves capable of long-term survival. In this study we investigate which types of genes are likely to generate functional and persistent duplicates. We demonstrate that genes that have generated duplicates in the C. elegans and S. cerevisiae genomes were 25%-50% more constrained prior to duplication than the genes that failed to leave duplicates. We further show that conserved genes have been consistently prolific in generating duplicates for hundreds of millions of years in these two species. These findings reveal one way in which gene duplication shapes the content of eukaryotic genomes. Our finding that the set of duplicate genes is biased has important implications for genome-scale studies.
Affiliation(s)
- Jerel C Davis
- Department of Biological Sciences, Stanford University, Stanford, California, USA.
|
224
|
Friberg M, von Rohr P, Gonnet G. Limitations of codon adaptation index and other coding DNA-based features for prediction of protein expression in Saccharomyces cerevisiae. Yeast 2004; 21:1083-93. [DOI: 10.1002/yea.1150] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
|