1
Eckshtain-Levi N, Shkedy D, Gershovits M, Da Silva GM, Tamir-Ariel D, Walcott R, Pupko T, Burdman S. Insights from the Genome Sequence of Acidovorax citrulli M6, a Group I Strain of the Causal Agent of Bacterial Fruit Blotch of Cucurbits. Front Microbiol 2016; 7:430. [PMID: 27092114; PMCID: PMC4821854; DOI: 10.3389/fmicb.2016.00430] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/10/2015] [Accepted: 03/17/2016] [Indexed: 11/13/2022] Open
Abstract
Acidovorax citrulli is a seedborne bacterium that causes bacterial fruit blotch of cucurbit plants including watermelon and melon. A. citrulli strains can be divided into two major groups based on DNA fingerprint analyses and biochemical properties. Group I strains have been generally isolated from non-watermelon cucurbits, while group II strains are closely associated with watermelon. In the present study, we report the genome sequence of M6, a group I model A. citrulli strain, isolated from melon. We used comparative genome analysis to investigate differences between the genome of strain M6 and the genome of the group II model strain AAC00-1. The draft genome sequence of A. citrulli M6 harbors 139 contigs, with an overall approximate size of 4.85 Mb. The genome of M6 is ∼500 Kb shorter than that of strain AAC00-1. Comparative analysis revealed that this size difference is mainly explained by eight fragments, ranging from ∼35 to 120 Kb and distributed throughout the AAC00-1 genome, which are absent in the M6 genome. In agreement with this finding, while AAC00-1 was found to possess 532 open reading frames (ORFs) that are absent in strain M6, only 123 ORFs in M6 were absent in AAC00-1. Most of these M6 ORFs encode hypothetical proteins and most of them were also detected in two group I strains that were recently sequenced, tw6 and pslb65. Further analyses by PCR assays and coverage analyses with other A. citrulli strains support the notion that some of these fragments or significant portions of them are discriminative between groups I and II strains of A. citrulli. Moreover, GC content, effective number of codons values, and cluster-of-orthologs analyses indicate that these fragments were introduced into group II strains by horizontal gene transfer events. Our study reports the genome sequence of a model group I strain of A. citrulli, one of the most important pathogens of cucurbits. It also provides the first comprehensive comparison at the genomic level between the two major groups of strains of this pathogen.
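The abstract above cites GC content as one line of evidence that the AAC00-1-specific fragments were acquired horizontally. As a rough illustration of that kind of compositional check, the following Python sketch computes GC content in fixed windows and flags windows that deviate from the whole-sequence value. It is not the authors' pipeline: the function names, window size, step, and threshold are all hypothetical choices made for this example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """GC content of each window, as (start_offset, gc_fraction) pairs."""
    last_start = max(len(seq) - window, 0)
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


def atypical_windows(seq: str, window: int, step: int,
                     threshold: float) -> list[tuple[int, float]]:
    """Windows whose GC content differs from the whole-sequence GC content
    by more than `threshold` (an arbitrary, illustrative cut-off)."""
    overall = gc_content(seq)
    return [
        (start, gc)
        for start, gc in windowed_gc(seq, window, step)
        if abs(gc - overall) > threshold
    ]


if __name__ == "__main__":
    # Toy sequence: two balanced segments followed by a GC-only segment.
    toy = "GCATGCAT" * 2 + "GGGGCCCC"
    print(atypical_windows(toy, window=8, step=8, threshold=0.25))
```

On real genomes the window would be on the order of kilobases, and deviation in GC content alone is only suggestive; the abstract combines it with codon-usage and ortholog analyses.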
Affiliation(s)
- Noam Eckshtain-Levi
- Department of Plant Pathology and Microbiology and the Otto Warburg Center for Agricultural Biotechnology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Dafna Shkedy
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Michael Gershovits
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Dafna Tamir-Ariel
- Department of Plant Pathology and Microbiology and the Otto Warburg Center for Agricultural Biotechnology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Ron Walcott
- Department of Plant Pathology, The University of Georgia, Athens, GA, USA
- Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Saul Burdman
- Department of Plant Pathology and Microbiology and the Otto Warburg Center for Agricultural Biotechnology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
2
Danchin A. Bacteria as computers making computers. FEMS Microbiol Rev 2009; 33:3-26. [PMID: 19016882; PMCID: PMC2704931; DOI: 10.1111/j.1574-6976.2008.00137.x] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.8] [Received: 05/15/2008] [Revised: 09/20/2008] [Accepted: 09/21/2008] [Indexed: 12/13/2022] Open
Abstract
Various efforts to integrate biological knowledge into networks of interactions have produced a lively microbial systems biology. Putting molecular biology and computer sciences in perspective, we review another trend in systems biology, in which recursivity and information replace the usual concepts of differential equations, feedback and feedforward loops and the like. Noting that the processes of gene expression separate the genome from the cell machinery, we analyse the role of the separation between machine and program in computers. However, computers do not make computers. For cells to make cells requires a specific organization of the genetic program, which we investigate using available knowledge. Microbial genomes are organized into a paleome (the name emphasizes the role of the corresponding functions from the time of the origin of life), comprising a constructor and a replicator, and a cenome (emphasizing community-relevant genes), made up of genes that permit life in a particular context. The cell duplication process supposes rejuvenation of the machine and replication of the program. The paleome also possesses genes that enable information to accumulate in a ratchet-like process down the generations. The systems biology must include the dynamics of information creation in its future developments.
Affiliation(s)
- Antoine Danchin
- Génétique des Génomes Bactériens, Institut Pasteur, Paris, France.
3
PCR-based Gene Synthesis, Molecular Cloning, High Level Expression, Purification, and Characterization of Novel Antimicrobial Peptide, Brevinin-2R, in Escherichia coli. Appl Biochem Biotechnol 2007; 149:109-18. [DOI: 10.1007/s12010-007-8024-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 07/13/2007] [Accepted: 08/08/2007] [Indexed: 12/21/2022]
4
5
Bailly-Bechet M, Danchin A, Iqbal M, Marsili M, Vergassola M. Codon usage domains over bacterial chromosomes. PLoS Comput Biol 2006; 2:e37. [PMID: 16683018; PMCID: PMC1447655; DOI: 10.1371/journal.pcbi.0020037] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Received: 11/21/2005] [Accepted: 03/13/2006] [Indexed: 11/19/2022] Open
Abstract
The geography of codon bias distributions over prokaryotic genomes and its impact upon chromosomal organization are analyzed. To this end, we introduce a clustering method based on information theory, specifically designed to cluster genes according to their codon usage, and apply it to the coding sequences of Escherichia coli and Bacillus subtilis. One of the clusters identified in each of the organisms is found to be related to expression levels, as expected, but other groups feature an over-representation of genes belonging to different functional groups, namely horizontally transferred genes, motility, and intermediary metabolism. Furthermore, we show that genes with a similar bias tend to be close to each other on the chromosome and organized in coherent domains, more extended than operons, demonstrating a role of translation in structuring bacterial chromosomes. It is argued that a sizeable contribution to this effect comes from the dynamical compartmentalization induced by the recycling of tRNAs, leading to gene expression rates dependent on their genomic and expression context.
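The abstract does not spell out the clustering method, so the following Python sketch only illustrates the kind of input such a method could work on: a per-gene codon frequency distribution, plus an information-theoretic distance between two such distributions. The Jensen-Shannon divergence is swapped in here as a generic example and is not claimed to be the authors' measure; the function names are hypothetical, and the sketch uses raw codon frequencies rather than usage relative to synonymous codons.

```python
from collections import Counter
from math import log2


def codon_frequencies(cds: str) -> dict[str, float]:
    """Relative frequency of each codon in a coding sequence.

    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def jensen_shannon(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2) between two codon distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    with no codon in common.
    """
    divergence = 0.0
    for codon in set(p) | set(q):
        pk, qk = p.get(codon, 0.0), q.get(codon, 0.0)
        mk = (pk + qk) / 2  # mixture distribution
        if pk > 0:
            divergence += 0.5 * pk * log2(pk / mk)
        if qk > 0:
            divergence += 0.5 * qk * log2(qk / mk)
    return divergence


if __name__ == "__main__":
    gene_a = codon_frequencies("ATGAAAAAAGGC")
    gene_b = codon_frequencies("ATGAAGAAGGGT")
    print(round(jensen_shannon(gene_a, gene_b), 3))
```

A pairwise distance of this kind could feed any standard clustering routine; whether the resulting clusters resemble those reported in the paper would depend on the details of the authors' method.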
Affiliation(s)
- Marc Bailly-Bechet
- CNRS URA 2171, Institut Pasteur, Unité Génétique in silico, Paris, France
- Antoine Danchin
- CNRS URA 2171, Institut Pasteur, Unité Génétique des Génomes Bactériens, Paris, France
- Mudassar Iqbal
- Abdus Salam International Center for Theoretical Physics, Trieste, Italy
- Computing Laboratory, University of Kent, Canterbury, Kent, United Kingdom
- Matteo Marsili
- Abdus Salam International Center for Theoretical Physics, Trieste, Italy
- Massimo Vergassola
- CNRS URA 2171, Institut Pasteur, Unité Génétique in silico, Paris, France
- * To whom correspondence should be addressed. E-mail:
6
Pascal G, Médigue C, Danchin A. Persistent biases in the amino acid composition of prokaryotic proteins. Bioessays 2006; 28:726-38. [PMID: 16850406; DOI: 10.1002/bies.20431] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins' amino acid distribution. Integral Inner Membrane Proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism's biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G + C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less-significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species-specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction.
Affiliation(s)
- Géraldine Pascal
- Genoscope/CNRS UMR 8030, Atelier de Génomique Comparative, Evry, France
7
Ochman H, Lerat E, Daubin V. Examining bacterial species under the specter of gene transfer and exchange. Proc Natl Acad Sci U S A 2005; 102 Suppl 1:6595-9. [PMID: 15851673; PMCID: PMC1131874; DOI: 10.1073/pnas.0502035102] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.3] [Indexed: 11/18/2022] Open
Abstract
Even in lieu of a dependable species concept for asexual organisms, the classification of bacteria into discrete taxonomic units is considered to be obstructed by the potential for lateral gene transfer (LGT) among lineages at virtually all phylogenetic levels. In most bacterial genomes, large proportions of genes are introduced by LGT, as indicated by their compositional features and/or phylogenetic distributions, and there is also clear evidence of LGT between very distantly related organisms. By adopting a whole-genome approach, which examined the history of every gene in numerous bacterial genomes, we show that LGT does not hamper phylogenetic reconstruction at many of the shallower taxonomic levels. Despite the high levels of gene acquisition, the only taxonomic group for which appreciable amounts of homologous recombination were detected was within bacterial species. Taken as a whole, the results derived from the analysis of complete gene inventories support several of the current means to recognize and define bacterial species.
Affiliation(s)
- Howard Ochman
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, 85721, USA.
8
Abstract
The levels of cellular organization in living organisms are the results of a variety of selection pressures. We have investigated here the final outcome of this integrated selective process in proteins of the best known microbial models Escherichia coli, Bacillus subtilis, and Methanococcus jannaschii, supposed to have undergone separate evolution for more than 1 billion years. Using multivariate analysis methods, including correspondence analysis, we studied the overall amino acid composition of all proteins making a proteome. Starting from and further developing previous results that had pointed out some general forces driving the amino acid composition of the proteomes of these model bacteria, we explored the correlations existing between the structure and functions of the proteins forming a proteome and their amino acid composition. The electric charge of amino acids measured against hydrophobicity creates a highly homogeneous cluster, made exclusively of proteins that are core components of the cytoplasmic membrane of the cell (integral inner membrane proteins). A second bias is imposed by the G+C content of the genome, indicating that protein functions are so robust with respect to amino acid changes that they can accommodate a large shift in the nucleotide content of the genome. A remarkable role of aromatic amino acids was uncovered. Expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution.
Affiliation(s)
- Géraldine Pascal
- Genoscope/CNRS UMR 8030, Atelier de Génomique Comparative, Evry, France.
9
10
Kanaya S, Kinouchi M, Abe T, Kudo Y, Yamada Y, Nishi T, Mori H, Ikemura T. Analysis of codon usage diversity of bacterial genes with a self-organizing map (SOM): characterization of horizontally transferred genes with emphasis on the E. coli O157 genome. Gene 2001; 276:89-99. [PMID: 11591475; DOI: 10.1016/s0378-1119(01)00673-4] [Citation(s) in RCA: 107] [Impact Index Per Article: 4.7] [Indexed: 10/18/2022]
Abstract
With increases in the amounts of available DNA sequence data, it has become increasingly important to develop tools for comprehensive systematic analysis and comparison of species-specific characteristics of protein-coding sequences for a wide variety of genomes. In the present study, we used a novel neural-network algorithm, a self-organizing map (SOM), to efficiently and comprehensively analyze codon usage in approximately 60,000 genes from 29 bacterial species simultaneously. This SOM makes it possible to cluster and visualize genes of individual species separately at a much higher resolution than can be obtained with principal component analysis. The organization of the SOM can be explained by the genome G+C% and tRNA compositions of the individual species. We used SOM to examine codon usage heterogeneity in the E. coli O157 genome, which contains 'O157-unique segments' (O-islands), and showed that SOM is a powerful tool for characterization of horizontally transferred genes.
Affiliation(s)
- S Kanaya
- Department of Bio-System Engineering, Faculty of Engineering, Yamagata University, Yonezawa, 992-8510, Yamagata-ken, Japan
11
Abstract
Our approach to predicting gene expression levels relates to codon usage differences among gene classes. In prokaryotic genomes, genes that deviate strongly in codon usage from the average gene but are sufficiently similar in codon usage to ribosomal protein genes, to translation and transcription processing factors, and to chaperone-degradation proteins are predicted to be highly expressed (PHX). By these criteria, PHX genes in most prokaryotic genomes include those encoding ribosomal proteins, translation and transcription processing factors, and chaperone proteins and genes of principal energy metabolism. In particular, for the fast-growing species Escherichia coli, Vibrio cholerae, Bacillus subtilis, and Haemophilus influenzae, major glycolysis and tricarboxylic acid cycle genes are PHX. In Synechocystis, prime genes of photosynthesis are PHX, and in methanogens, PHX genes include those essential for methanogenesis. Overall, the three protein families (ribosomal proteins, protein synthesis factors, and chaperone complexes) are needed at many stages of the life cycle, and apparently bacteria have evolved codon usage to maintain appropriate growth, stability, and plasticity. A new interpretation of the capacity of Deinococcus radiodurans for resistance to high doses of ionizing radiation is based on an excess of PHX chaperone-degradation genes and detoxification genes. Expression levels of selected classes of genes, including those for flagella, electron transport, detoxification, histidine kinases, and others, are analyzed. Flagellar PHX genes are conspicuous among spirochete genomes. PHX genes are positively correlated with strong Shine-Dalgarno signal sequences. Specific regulatory proteins, e.g., two-component sensor proteins, are rarely PHX. Genes involved in pathways for the synthesis of vitamins record low predicted expression levels. Several distinctive PHX genes of the available complete prokaryotic genomes are highlighted. Relationships of PHX genes with stoichiometry, multifunctionality, and operon structures are discussed. Our methodology may be used as a complement to experimental expression analysis.
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, California 94305-2125, USA.
12
Kanaya S, Yamada Y, Kudo Y, Ikemura T. Studies of codon usage and tRNA genes of 18 unicellular organisms and quantification of Bacillus subtilis tRNAs: gene expression level and species-specific diversity of codon usage based on multivariate analysis. Gene 1999; 238:143-55. [PMID: 10570992; DOI: 10.1016/s0378-1119(99)00225-5] [Citation(s) in RCA: 295] [Impact Index Per Article: 11.8] [Indexed: 11/20/2022]
Abstract
We examined codon usage in Bacillus subtilis genes by multivariate analysis, quantified its cellular levels of individual tRNAs, and found a clear constraint of tRNA contents on synonymous codon choice. Individual tRNA levels were proportional to the copy number of the respective tRNA genes. This indicates that the tRNA gene copy number is an important factor in determining cellular tRNA levels, as is also the case in Escherichia coli and the yeast Saccharomyces cerevisiae. Codon usage in 18 unicellular organisms whose genomes have been sequenced completely was analyzed and compared with the composition of tRNA genes. The 18 organisms are as follows: yeast S. cerevisiae, Aquifex aeolicus, Archaeoglobus fulgidus, B. subtilis, Borrelia burgdorferi, Chlamydia trachomatis, E. coli, Haemophilus influenzae, Helicobacter pylori, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Mycobacterium tuberculosis, Mycoplasma genitalium, Mycoplasma pneumoniae, Pyrococcus horikoshii, Rickettsia prowazekii, Synechocystis sp., and Treponema pallidum. Codons preferred in highly expressed genes were related to the codons optimal for the translation process, which were predicted by the composition of isoaccepting tRNA genes. Genes with specific codon usage are discussed in connection with their evolutionary origins and functions. The origin and terminus of replication could be predicted on the basis of codon usage when the usage was analyzed relative to the transcription direction of individual genes.
Affiliation(s)
- S Kanaya
- Department of Electrical and Information Engineering, Faculty of Engineering, Yamagata University, Yonezawa, Yamagata-ken, Japan
13
Abstract
The availability of the complete sequence of Escherichia coli strain MG1655 provides the first opportunity to assess the overall impact of horizontal genetic transfer on the evolution of bacterial genomes. We found that 755 of 4,288 ORFs (547.8 kb) have been introduced into the E. coli genome in at least 234 lateral transfer events since this species diverged from the Salmonella lineage 100 million years (Myr) ago. The average age of introduced genes was 14.4 Myr, yielding a rate of transfer of 16 kb/Myr/lineage since divergence. Although most of the acquired genes subsequently were deleted, the sequences that have persisted (approximately 18% of the current chromosome) have conferred properties permitting E. coli to explore otherwise unreachable ecological niches.
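Some of the figures quoted in this abstract can be cross-checked with simple arithmetic. The Python sketch below uses only the numbers stated above; the variable names are illustrative, and reading the "approximately 18%" figure as the share of ORFs is an assumption made here, not something the abstract states.

```python
# Figures taken from the abstract above.
transferred_orfs = 755     # ORFs attributed to lateral transfer
total_orfs = 4288          # ORFs in the E. coli MG1655 genome
transferred_kb = 547.8     # total length of the transferred ORFs, in kb
min_events = 234           # "at least 234" lateral transfer events

# Share of ORFs attributed to transfer (assumed to correspond to the
# "approximately 18%" figure).
orf_fraction = transferred_orfs / total_orfs

# Average length of a transferred ORF, in kb.
mean_kb_per_orf = transferred_kb / transferred_orfs

# Because 234 is a lower bound on the number of events, this is an
# upper bound on the average number of ORFs gained per event.
max_orfs_per_event = transferred_orfs / min_events

print(f"ORF fraction:        {orf_fraction:.3f}")
print(f"Mean kb per ORF:     {mean_kb_per_orf:.2f}")
print(f"Max ORFs per event:  {max_orfs_per_event:.2f}")
```

The 16 kb/Myr/lineage rate is not reproduced here, since the abstract indicates that it also accounts for acquired genes that were later deleted, and the data needed for that estimate are not given.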
Affiliation(s)
- J G Lawrence
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
14
Affiliation(s)
- A Danchin
- Régulation de l'Expression Génétique, Institut Pasteur, Paris, France.