151
Abstract
Predicted highly expressed (PHX) genes are compared for 16 gamma-proteobacteria and their similarities and differences are interpreted with respect to known or predicted physiological characteristics of the organisms. Predicted highly expressed genes often reflect the organism's predominant lifestyle, habitat, nutrition sources and metabolic propensities. This technique allows prediction of the principal metabolic activities of microorganisms operating in their natural habitats. Among our findings is an unusually high number of PHX enzymes acting in cell wall biosynthesis, amino acid biosynthesis and replication in the ant endosymbiont Blochmannia floridanus. We ascribe the abundance of these PHX genes to specific aspects of the relationship between the bacterium and its host. Xanthomonas campestris is unique in having a very high number of PHX genes acting in flagellum biosynthesis, which may play a special role in its pathogenicity. Shewanella oneidensis possesses three protein complexes, all of which can function as complex I in the respiratory chain, but only the Na(+)-transporting NADH:ubiquinone oxidoreductase nqr-2 operon is PHX. The PHX genes of Vibrio parahaemolyticus are consistent with the microorganism's adaptation to extremely fast growth rates. Comparative analysis of PHX genes from complex environmental genomic sequences as well as from uncultured pathogenic microbes can provide a novel, useful tool to predict global flux of matter and key intermediates.
Affiliation(s)
- Jan Mrázek
- Department of Mathematics, Stanford University, CA 94305, USA
152
153
van Passel MWJ, Bart A, Luyf ACM, van Kampen AHC, van der Ende A. Compositional discordance between prokaryotic plasmids and host chromosomes. BMC Genomics 2006; 7:26. [PMID: 16480495 PMCID: PMC1382213 DOI: 10.1186/1471-2164-7-26] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 07/04/2005] [Accepted: 02/15/2006] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: Most plasmids depend on the host replication machinery and possess partitioning genes. These properties confine plasmids to a limited range of hosts, yielding a close and presumably stable relationship between plasmid and host. Hence, it is anticipated that due to amelioration the dinucleotide composition of plasmids is similar to that of the genome of their hosts. However, plasmids are also thought to play a major role in horizontal gene transfer and thus are frequently exchanged between hosts, suggesting dinucleotide composition dissimilarity between plasmid and host genome. We compared the dinucleotide composition of a large collection of plasmids with that of their host genomes to shed more light on this enigma.
RESULTS: The dinucleotide frequency, coined the genome signature, facilitates the identification of putative horizontally transferred DNA in complete genome sequences, since it was found to be typical for a certain genome, and similar between related species. By comparison of the genome signature of 230 plasmid sequences with that of the genome of each respective host, we found that in general the genome signature of plasmids is dissimilar from that of their host genome.
CONCLUSION: Our results show that the genome signature of plasmids does not resemble that of their host genome. This indicates either absence of amelioration or a less stable relationship between plasmids and their host. We propose an indiscriminate lifestyle for plasmids preserving the genome signature discordance between these episomes and host chromosomes.
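A comparison of this kind can be sketched in a few lines of Python. The abstract does not give a formula, so the sketch assumes a commonly used definition of the genome signature: the dinucleotide relative abundance rho_xy = f_xy / (f_x * f_y), compared between two sequences by the mean absolute difference over all 16 dinucleotides. The function names are hypothetical, and only one strand is counted for brevity.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def genome_signature(seq):
    """Dinucleotide relative abundances rho_xy = f_xy / (f_x * f_y).

    Assumed definition (not stated in the abstract); single strand only.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    signature = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        observed = di[x + y] / n_di
        signature[x + y] = observed / expected if expected > 0 else 0.0
    return signature

def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two signatures (16 dinucleotides)."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

With a plasmid and its host chromosome as inputs, a small `signature_distance` would indicate similar composition and a large one the kind of discordance the abstract reports.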
Affiliation(s)
- Mark WJ van Passel
- Academic Medical Center, Department of Medical Microbiology, Amsterdam, The Netherlands
- Aldert Bart
- Academic Medical Center, Department of Medical Microbiology, Amsterdam, The Netherlands
- Angela CM Luyf
- Academic Medical Center, Bioinformatics Laboratory, Amsterdam, The Netherlands
- Arie van der Ende
- Academic Medical Center, Department of Medical Microbiology, Amsterdam, The Netherlands
154
Pride DT, Wassenaar TM, Ghose C, Blaser MJ. Evidence of host-virus co-evolution in tetranucleotide usage patterns of bacteriophages and eukaryotic viruses. BMC Genomics 2006; 7:8. [PMID: 16417644 PMCID: PMC1360066 DOI: 10.1186/1471-2164-7-8] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.6] [Received: 09/27/2005] [Accepted: 01/18/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Virus taxonomy is based on morphologic characteristics, as there are no widely used non-phenotypic measures for comparison among virus families. We examined whether there is phylogenetic signal in virus nucleotide usage patterns that can be used to determine ancestral relationships. The well-studied model of tail morphology in bacteriophage classification was used for comparison with nucleotide usage patterns. Tetranucleotide usage deviation (TUD) patterns were chosen since they have previously been shown to contain phylogenetic signal similar to that of 16S rRNA.
RESULTS: We found that bacteriophages have unique TUD patterns, representing genomic signatures that are relatively conserved among those with similar host range. Analysis of TUD-based phylogeny indicates that host influences are important in bacteriophage evolution, and phylogenies containing both phages and their hosts support their co-evolution. TUD-based phylogeny of eukaryotic viruses indicates that they cluster largely based on nucleic acid type and genome size. Similarities between eukaryotic virus phylogenies based on TUD and gene content substantiate the TUD methodology.
CONCLUSION: Differences between phenotypic and TUD analysis may provide clues to virus ancestry not previously inferred. As such, TUD analysis provides a complementary approach to morphology-based systems in analysis of virus evolution.
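To make the idea of a TUD pattern concrete, the following Python sketch computes, for each of the 256 tetranucleotides, the ratio of its observed count to the count expected from the base composition alone. The abstract does not spell out how the expectation is obtained; a zero-order Markov model (product of single-base frequencies) is assumed here, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def tetranucleotide_usage_deviation(seq):
    """Observed/expected ratio for all 256 tetranucleotides.

    Assumption: the expected count comes from a zero-order Markov model,
    i.e. the product of the four single-base frequencies.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in BASES)
    total_bases = sum(base_counts.values())
    if total_bases == 0:
        raise ValueError("sequence contains no A/C/G/T")
    base_freq = {b: base_counts[b] / total_bases for b in BASES}

    # Count overlapping 4-mers, skipping windows with ambiguous bases.
    observed = Counter()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if set(word) <= set(BASES):
            observed[word] += 1
    n_words = sum(observed.values())

    deviation = {}
    for letters in product(BASES, repeat=4):
        word = "".join(letters)
        expected = n_words
        for b in word:
            expected *= base_freq[b]
        deviation[word] = observed[word] / expected if expected > 0 else 0.0
    return deviation
```

Two such 256-value profiles (for example a phage and a candidate host) could then be compared with any vector distance or correlation measure.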
Affiliation(s)
- David T Pride
- Department of Medicine, Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Trudy M Wassenaar
- Molecular Microbiology and Genomics Consultants, Zotzenheim, Germany
- Chandrabali Ghose
- Department of Medicine, Division of Infectious Diseases, Harvard Medical School, Boston, MA, USA
- Martin J Blaser
- Departments of Medicine and Microbiology, New York University School of Medicine and VA Medical Center, New York, NY, USA
155
Abe T, Sugawara H, Kinouchi M, Kanaya S, Ikemura T. Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples. DNA Res 2006; 12:281-90. [PMID: 16769690 DOI: 10.1093/dnares/dsi015] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022] Open
Abstract
A self-organizing map (SOM) was developed as a novel bioinformatics strategy for phylogenetic classification of sequence fragments obtained from pooled genome samples of uncultured microbes in environmental and clinical samples. This phylogenetic classification was possible without either orthologous sequence sets or sequence alignments. We first constructed SOMs for tetranucleotide frequencies in 210,000 5 kb sequence fragments obtained from 1502 prokaryotes for which at least 10 kb of genomic sequence has been deposited in public DNA databases. The sequences could be classified primarily according to phylogenetic groups without information regarding the species. We used the SOM method to classify sequence fragments derived from environmental samples of the Sargasso Sea and of an acidophilic biofilm growing in acid mine drainage. Phylogenetic diversity of the environmental sequences was effectively visualized on a single map. Sequences that were derived from a single genome but cloned independently could be reassociated in silico. G + C% has been used for a long period as a fundamental parameter for phylogenetic classification of microbes, but the G + C% is apparently too simple a parameter to differentiate a wide variety of known species. Oligonucleotide frequency can be used to distinguish the species because oligonucleotide frequencies vary significantly among their genomes.
Affiliation(s)
- Takashi Abe
- Center for Information Biology, National Institute of Genetics, The Graduate University for Advanced Studies (Sokendai), Mishima, Shizuoka, Japan.
156
Abe T, Sugawara H, Kanaya S, Kinouchi M, Ikemura T. Self-Organizing Map (SOM) unveils and visualizes hidden sequence characteristics of a wide range of eukaryote genomes. Gene 2005; 365:27-34. [PMID: 16364569 DOI: 10.1016/j.gene.2005.09.040] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 05/02/2005] [Revised: 08/08/2005] [Accepted: 09/07/2005] [Indexed: 11/17/2022]
Abstract
Novel tools are needed for comprehensive comparisons of interspecies characteristics of massive amounts of genomic sequences currently available. An unsupervised neural network algorithm, Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM, on the basis of batch-learning SOM, for genome informatics making the learning process and resulting map independent of the order of data input. We generated the SOMs for tri- and tetranucleotide frequencies in 10- and 100-kb sequence fragments from 38 eukaryotes for which almost complete genome sequences are available. SOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in the genomic sequences, permitting species-specific classification of the sequences without any information regarding the species. We also generated the SOM for tetranucleotide frequencies in 1-kb sequence fragments from the human genome and found sequences for four functional categories (5' and 3' UTRs, CDSs and introns) were classified primarily according to the categories. Because the classification and visualization power is very high, SOM is an efficient and powerful tool for extracting a wide range of genome information.
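The order-independence of batch learning can be illustrated with a deliberately small Python sketch. It is not the authors' implementation: the map here is one-dimensional rather than two-dimensional, the initial weights are spread linearly between the per-dimension minimum and maximum instead of being derived from the data's principal components, and the neighbourhood schedule and function names are hypothetical. The key point it shows is that every vector is assigned to its best-matching node before any weight changes, so the result does not depend on input order.

```python
import math

def best_matching_unit(weights, vector):
    """Index of the node whose weight vector is closest to `vector`."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - x) ** 2 for w, x in zip(weights[j], vector)),
    )

def batch_som(data, n_nodes, n_epochs=20, sigma0=None):
    """Toy 1-D batch-learning SOM (illustrative assumptions throughout)."""
    dim = len(data[0])
    lo = [min(v[d] for v in data) for d in range(dim)]
    hi = [max(v[d] for v in data) for d in range(dim)]
    # Deterministic initialisation: nodes spread linearly from lo to hi.
    weights = [
        [lo[d] + (hi[d] - lo[d]) * (j / (n_nodes - 1) if n_nodes > 1 else 0.5)
         for d in range(dim)]
        for j in range(n_nodes)
    ]
    sigma0 = sigma0 or max(n_nodes / 2, 1.0)
    for epoch in range(n_epochs):
        # Neighbourhood width shrinks linearly, with a floor of 0.5.
        sigma = max(sigma0 * (1 - epoch / n_epochs), 0.5)
        # Step 1: assign all vectors using the weights fixed for this epoch.
        bmus = [best_matching_unit(weights, v) for v in data]
        # Step 2: each node becomes a neighbourhood-weighted mean of the data.
        new_weights = []
        for j in range(n_nodes):
            numerator = [0.0] * dim
            denominator = 0.0
            for v, b in zip(data, bmus):
                h = math.exp(-((j - b) ** 2) / (2 * sigma ** 2))
                denominator += h
                for d in range(dim):
                    numerator[d] += h * v[d]
            if denominator > 0:
                new_weights.append([x / denominator for x in numerator])
            else:
                new_weights.append(weights[j])  # keep old weight if isolated
        weights = new_weights
    return weights
```

In the setting of the abstract, each input vector would be the tri- or tetranucleotide frequency profile of one sequence fragment.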
Affiliation(s)
- Takashi Abe
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, and The Graduate University for Advanced Studies (Sokendai), Mishima, Shizuoka 411-8540, Japan
157
Coenye T, Vandamme P. Overrepresentation of immunostimulatory CpG motifs in Burkholderia genomes. J Cyst Fibros 2005; 4:193-6. [PMID: 15963770 DOI: 10.1016/j.jcf.2005.03.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/17/2005] [Revised: 02/02/2005] [Accepted: 03/23/2005] [Indexed: 10/25/2022]
Abstract
Pulmonary infections with Burkholderia cepacia complex organisms contribute significantly to morbidity and mortality in patients with cystic fibrosis (CF), partially due to the intense inflammatory response of the host to the presence of bacteria and their byproducts. In the present study we show that Burkholderia genomes contain a large number of immunostimulatory CpG motifs. This is mainly because of their large genome size. This suggests that DNA from Burkholderia sp. has the potential to cause significant inflammatory response. Whether this contributes significantly to the airway inflammation often observed in infected CF patients remains to be determined.
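As a rough illustration of how such motifs might be counted, the Python sketch below tallies hexamers of the form purine-purine-CG-pyrimidine-pyrimidine (RRCGYY) and also reports a density per kilobase, so that the effect of genome size can be separated from motif frequency. The abstract does not state which motif definition was used; RRCGYY is an assumption taken from the classical description of immunostimulatory CpG hexamers, and the function names are hypothetical.

```python
import re

# Assumed motif: two purines, CG, two pyrimidines (RRCGYY).
CPG_MOTIF = re.compile(r"[AG][AG]CG[CT][CT]")

def count_cpg_motifs(seq):
    """Number of RRCGYY hexamers on the given strand."""
    return len(CPG_MOTIF.findall(seq.upper()))

def cpg_motif_density(seq):
    """RRCGYY hexamers per kilobase of sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return 1000.0 * count_cpg_motifs(seq) / len(seq)
```

Because the reverse complement of any RRCGYY hexamer is again an RRCGYY hexamer, counting one strand is sufficient under this motif definition.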
Affiliation(s)
- Tom Coenye
- Laboratorium voor Farmaceutische Microbiologie, Universiteit Gent, Belgium.
158
Collyn F, Guy L, Marceau M, Simonet M, Roten CAH. Describing ancient horizontal gene transfers at the nucleotide and gene levels by comparative pathogenicity island genometrics. Bioinformatics 2005; 22:1072-9. [PMID: 16303795 DOI: 10.1093/bioinformatics/bti793] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Lateral gene transfer is a major mechanism contributing to bacterial genome dynamics and pathovar emergence via pathogenicity island (PAI) spreading. However, since few of these genomic exchanges are experimentally reproducible, it is difficult to establish evolutionary scenarios for the successive PAI transmissions between bacterial genera. Methods initially developed at the gene and/or nucleotide level for genomics, i.e. comparisons of concatenated sequences, ortholog frequency, gene order or dinucleotide usage, were combined and applied here to homologous PAIs: we call this approach comparative PAI genometrics.
RESULTS: YAPI, a Yersinia PAI, and related islands were compared to measure evolutionary relationships between related modules. Through use of our genometric approach designed for tracking codon usage adaptation and gene phylogeny, an ancient inter-genus PAI transfer was oriented for the first time by characterizing the genomic environment in which the ancestral island emerged and its subsequent transfers to other bacterial genera.
Affiliation(s)
- F Collyn
- Inserm E0364--Université de Lille II, Faculté de Médecine Henri Warembourg, Institut Pasteur de Lille, 1 rue du Pr Calmette, F-59021 Lille, France
159
Wu G, Culley DE, Zhang W. Predicted highly expressed genes in the genomes of Streptomyces coelicolor and Streptomyces avermitilis and the implications for their metabolism. MICROBIOLOGY-SGM 2005; 151:2175-2187. [PMID: 16000708 DOI: 10.1099/mic.0.27833-0] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.1] [Indexed: 11/18/2022]
Abstract
Highly expressed genes in bacteria often have a stronger codon bias than genes expressed at lower levels, due to translational selection. In this study, a comparative analysis of predicted highly expressed (PHX) genes in the Streptomyces coelicolor and Streptomyces avermitilis genomes was performed using the codon adaptation index (CAI) as a numerical estimator of gene expression level. Although it has been suggested that there is little heterogeneity in codon usage in G+C-rich bacteria, considerable heterogeneity was found among genes in these two G+C-rich Streptomyces genomes. Using ribosomal protein genes as references, approximately 10% of the genes were predicted to be PHX genes using a CAI cutoff value of greater than 0.78 and 0.75 in S. coelicolor and S. avermitilis, respectively. The PHX genes showed good agreement with the experimental data on expression levels obtained from proteomic analysis by previous workers. Among 724 and 730 PHX genes identified from S. coelicolor and S. avermitilis, 368 are orthologue genes present in both genomes, which were mostly 'housekeeping' genes involved in cell growth. In addition, 61 orthologous gene pairs with unknown functions were identified as PHX. Only one polyketide synthase gene from each Streptomyces genome was predicted as PHX. Nevertheless, several key genes responsible for producing precursors for secondary metabolites, such as crotonyl-CoA reductase and propionyl-CoA carboxylase, and genes necessary for initiation of secondary metabolism, such as adenosylmethionine synthetase, were among the PHX genes in the two Streptomyces species. The PHX genes exclusive to each genome, and what they imply regarding cellular metabolism, are also discussed.
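The CAI-based prediction described here can be sketched as follows. The sketch assumes the usual formulation of the codon adaptation index: each codon's relative adaptiveness w is its count in the reference set (here, ribosomal protein genes) divided by the count of the most frequent synonymous codon, and a gene's CAI is the geometric mean of w over its codons. The 0.5 pseudocount for unseen codons is a common convention assumed here, and all function names are hypothetical; the cutoff must be supplied by the caller.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons(seq):
    """Split an in-frame coding sequence into codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def relative_adaptiveness(reference_genes):
    """w value per codon from a reference set of highly expressed genes."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    w = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:  # skip stops, Met and Trp
            continue
        best = max(counts[c] for c in family)
        if best == 0:  # amino acid absent from the reference set
            continue
        for c in family:
            # Assumed convention: unseen codons get a pseudocount of 0.5.
            w[c] = max(counts[c], 0.5) / best
    return w

def cai(gene, w):
    """Geometric mean of w over the gene's codons that have a w value."""
    logs = [math.log(w[c]) for c in codons(gene) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

def predicted_highly_expressed(genes, w, cutoff):
    """Names of genes (dict name -> sequence) whose CAI exceeds `cutoff`."""
    return [name for name, seq in genes.items() if cai(seq, w) > cutoff]
```

With the cutoffs quoted in the abstract, `predicted_highly_expressed(genes, w, 0.78)` would correspond to the S. coelicolor setting, given a `w` table built from that genome's ribosomal protein genes.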
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
- David E Culley
- Microbiology Department, Pacific Northwest National Laboratory, 902 Battelle Boulevard, PO Box 999, Mail Stop P7-50, Richland, WA 99352, USA
- Weiwen Zhang
- Microbiology Department, Pacific Northwest National Laboratory, 902 Battelle Boulevard, PO Box 999, Mail Stop P7-50, Richland, WA 99352, USA
160
Abstract
The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.
Affiliation(s)
- Kevin Chen
- *To whom correspondence should be addressed. E-mail: (KC), (LP)
- Lior Pachter
- *To whom correspondence should be addressed. E-mail: (KC), (LP)
161
Chang CH, Hsieh LC, Chen TY, Chen HD, Luo L, Lee HC. Shannon information in complete genomes. J Bioinform Comput Biol 2005; 3:587-608. [PMID: 16108085 DOI: 10.1142/s0219720005001181] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 09/03/2004] [Revised: 11/11/2004] [Accepted: 11/12/2004] [Indexed: 11/18/2022]
Abstract
The Shannon information in the genomes of all completely sequenced prokaryotes and eukaryotes is measured for word lengths of two to ten letters. It is found that, in a scale-dependent way, the Shannon information in complete genomes is much greater than that in matching random sequences, thousands of times greater in the case of short words. Furthermore, with the exception of the 14 chromosomes of Plasmodium falciparum, the Shannon information in all available complete genomes belongs to a universality class given by an extremely simple formula. The data are consistent with a model for genome growth composed of two main ingredients: random segmental duplications that increase the Shannon information in a scale-independent way, and random point mutations that preferentially reduce the larger-scale Shannon information. The inference drawn from the present study is that the large-scale and coarse-grained growth of genomes was selectively neutral, which suggests an independent corroboration of Kimura's neutral theory of evolution.
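A minimal Python sketch of a word-based information measure is given below. The abstract does not give the exact definition used in the paper; the sketch assumes that the Shannon information of a sequence at word length k is the gap between the maximum possible entropy of the k-mer distribution (2k bits for four equiprobable bases) and the observed entropy. Function names are hypothetical.

```python
import math
from collections import Counter

BASES = set("ACGT")

def kmer_entropy(seq, k):
    """Shannon entropy (bits) of the overlapping k-mer distribution."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= BASES
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid k-mers in sequence")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def shannon_information(seq, k):
    """Assumed definition: maximum entropy (2k bits) minus observed entropy."""
    return 2 * k - kmer_entropy(seq, k)
```

Comparing `shannon_information` for a genome with the value for a shuffled sequence of the same base composition would mirror the genome-versus-random comparison described in the abstract.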
Affiliation(s)
- Chang-Heng Chang
- Department of Physics and National Central University, Chungli, Taiwan, ROC
162
Calteau A, Gouy M, Perrière G. Horizontal transfer of two operons coding for hydrogenases between bacteria and archaea. J Mol Evol 2005; 60:557-65. [PMID: 15983865 DOI: 10.1007/s00239-004-0094-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 03/24/2004] [Accepted: 11/19/2004] [Indexed: 11/27/2022]
Abstract
Using a phylogenetic approach, we discovered three putative horizontal transfers between bacterial and archaeal species involving large clusters of genes. One transfer involves an operon of 13 genes, called mbx, which probably was transferred into the genome of Thermotoga maritima from a species belonging or close to the Pyrococcus genus. The two others involved an operon of six genes, called ech, transferred independently to the genomes of Thermoanaerobacter tengcongensis and Desulfovibrio gigas from a species belonging or close to the Methanosarcina genus. All these transfers affected operons coding for multisubunit membrane-bound (NiFe) hydrogenases involved in the energy metabolism of the donor genomes. The functionality of the transferred operons has not been experimentally demonstrated for T. maritima, whereas in D. gigas and T. tengcongensis the encoded multisubunit hydrogenase could have a role in energy conservation. This report adds several cases to the horizontal gene transfers among hydrogenases already described.
Affiliation(s)
- Alexandra Calteau
- Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard--Lyon 1, Villeurbanne, France
163
Fertil B, Massin M, Lespinats S, Devic C, Dumee P, Giron A. GENSTYLE: exploration and analysis of DNA sequences with genomic signature. Nucleic Acids Res 2005; 33:W512-5. [PMID: 15980524 PMCID: PMC1160249 DOI: 10.1093/nar/gki489] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open
Abstract
GENSTYLE (http://Genstyle.imed.jussieu.fr) is a workspace designed for the characterization and classification of nucleotide sequences. Based on the genomic signature paradigm, GENSTYLE focuses on oligonucleotide frequencies in DNA sequences. Users can select sequences of interest in the GENSTYLE companion database, where the whole set of GenBank sequences is grouped per species, or upload their own sequences to work with. Tools for the exploration and analysis of signatures allow (i) identification of the origin of DNA segments (detection of rare species or species for which technical problems prevent fast characterization, such as micro-organisms with slow growth), (ii) analysis of the homogeneity of a genome and isolation of areas with novel functionality (horizontal transfers, for example), and (iii) molecular phylogeny and taxonomy.
Affiliation(s)
- Bernard Fertil
- INSERM U. 678, 91 boulevard de l'Hôpital, 75634 Paris, France.
164
Stenøien HK, Stephan W. Global mRNA stability is not associated with levels of gene expression in Drosophila melanogaster but shows a negative correlation with codon bias. J Mol Evol 2005; 61:306-14. [PMID: 16044249 DOI: 10.1007/s00239-004-0271-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 09/02/2004] [Accepted: 03/16/2005] [Indexed: 11/26/2022]
Abstract
A multitude of factors contribute to the regulation of gene expression in living cells. The relationship between codon usage bias and gene expression has been extensively studied, and it has been shown that codon bias may have adaptive significance in many unicellular and multicellular organisms. Given the central role of mRNA in post-transcriptional regulation, we hypothesize that mRNA stability is another important factor associated either with positive or negative regulation of gene expression. We have conducted genome-wide studies of the association between gene expression (measured as transcript abundance in public EST databases), mRNA stability, codon bias, GC content, and gene length in Drosophila melanogaster. To remove potential bias of gene length inherently present in EST libraries, gene expression is measured as normalized transcript abundance. It is demonstrated that codon bias and GC content in second codon position are positively associated with transcript abundance. Gene length is negatively associated with transcript abundance. The stability of thermodynamically predicted mRNA secondary structures is not associated with transcript abundance, but there is a negative correlation between mRNA stability and codon bias. This finding does not support the hypothesis that codon bias has evolved as an indirect consequence of selection favoring thermodynamically stable mRNA molecules.
Affiliation(s)
- Hans K Stenøien
- Plant Ecology/Department of Ecology and Evolution, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden
165
Karlin S, Brocchieri L, Campbell A, Cyert M, Mrázek J. Genomic and proteomic comparisons between bacterial and archaeal genomes and related comparisons with the yeast and fly genomes. Proc Natl Acad Sci U S A 2005; 102:7309-14. [PMID: 15883367 PMCID: PMC1129125 DOI: 10.1073/pnas.0502314102] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022] Open
Abstract
Bacterial, archaeal, yeast, and fly genomes are compared with respect to predicted highly expressed (PHX) genes and several genomic properties. There is a striking difference in the status of PHX ribosomal protein (RP) genes: archaeal genomes generally encode more RP genes and fewer PHX RPs than bacterial genomes. The increase in RPs in archaea and eukaryotes compared with that in bacteria may reflect a more complex set of interactions in archaea and eukaryotes in regulating translation, e.g., differences in structure requiring scaffolding of longer rRNA molecules, expanded interactions with the chaperone machinery, and, in eukaryotes, interactions with endoplasmic reticulum components. The yeast genome is similar to fast-growing bacteria in PHX genes but also features several cytoskeletal genes, including actin and tropomyosin, and several signal transduction regulatory proteins from the 14-3-3 family. The most highly ranked PHX genes of Drosophila encode cytoskeletal and exoskeletal proteins. We found that a microorganism's preference for anaerobic metabolism correlates with a number of PHX enzymes of the glycolysis pathway that well exceeds the number of PHX enzymes acting in the tricarboxylic acid cycle. Conversely, if the number of PHX enzymes of the tricarboxylic acid cycle well exceeds the number of PHX enzymes of glycolysis, an aerobic metabolism is preferred. Where the numbers are approximately commensurate, a facultative growth behavior prevails.
Affiliation(s)
- Samuel Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
166
O'Malley MA, Boucher Y. Paradigm change in evolutionary microbiology. STUDIES IN HISTORY AND PHILOSOPHY OF BIOLOGICAL AND BIOMEDICAL SCIENCES 2005; 36:183-208. [PMID: 16120264 DOI: 10.1016/j.shpsc.2004.12.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 04/22/2004] [Revised: 07/19/2004] [Indexed: 05/04/2023]
Abstract
Thomas Kuhn had little to say about scientific change in biological science, and biologists are ambivalent about how applicable his framework is for their disciplines. We apply Kuhn's account of paradigm change to evolutionary microbiology, where key Darwinian tenets are being challenged by two decades of findings from molecular phylogenetics. The chief culprit is lateral gene transfer, which undermines the role of vertical descent and the representation of evolutionary history as a tree of life. To assess Kuhn's relevance to this controversy, we add a social analysis of the scientists involved to the historical and philosophical debates. We conclude that while Kuhn's account may capture aspects of the pattern (or outcome) of an episode of scientific change, he has little to say about how the process of generating new understandings is occurring in evolutionary microbiology. Once Kuhn's application is limited to that of an initial investigative probe into how scientific problem-solving occurs, his disciplinary scope becomes broader.
Affiliation(s)
- Maureen A O'Malley
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada B3H 1X5.
167
Saunders NJ, Boonmee P, Peden JF, Jarvis SA. Inter-species horizontal transfer resulting in core-genome and niche-adaptive variation within Helicobacter pylori. BMC Genomics 2005; 6:9. [PMID: 15676066 PMCID: PMC549213 DOI: 10.1186/1471-2164-6-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Received: 06/11/2004] [Accepted: 01/27/2005] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Horizontal gene transfer is central to evolution in most bacterial species. The detection of exchanged regions is often based upon analysis of compositional characteristics and their comparison to the organism as a whole. In this study we describe a new methodology combining aspects of established signature analysis with textual analysis approaches. This approach has been used to analyze the two available genome sequences of H. pylori.
RESULTS: This gene-by-gene analysis reveals a wide range of genes related to both virulence behaviour and the strain differences that have been relatively recently acquired from other sequence backgrounds. These frequently involve single genes or small numbers of genes that are not associated with transposases or bacteriophage genes, nor with inverted repeats typically used as markers for horizontal transfer. In addition, clear examples of horizontal exchange in genes associated with 'core' metabolic functions were identified, supported by differences between the sequenced strains, including: ftsK, xerD and polA. In some cases it was possible to determine which strain represented the 'parent' and 'altered' states for insertion-deletion events. Different signature component lengths showed different sensitivities for the detection of some horizontally transferred genes, which may reflect different amelioration rates of sequence components.
CONCLUSION: New implementations of signature analysis that can be applied on a gene-by-gene basis for the identification of horizontally acquired sequences are described. These findings highlight the central role of the availability of homologous substrates in evolution mediated by horizontal exchange, and suggest that some components of the supposedly stable 'core genome' may actually be favoured targets for integration of foreign sequences because of their degree of conservation.
Affiliation(s)
- Nigel J Saunders
- Bacterial Pathogenesis and Functional Genomics Group, The Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK
- Prawit Boonmee
- Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
- John F Peden
- Oxford University Bioinformatics Centre, Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK
- Stephen A Jarvis
- Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
168
Dufraigne C, Fertil B, Lespinats S, Giron A, Deschavanne P. Detection and characterization of horizontal transfers in prokaryotes using genomic signature. Nucleic Acids Res 2005; 33:e6. [PMID: 15653627 PMCID: PMC546175 DOI: 10.1093/nar/gni004] [Citation(s) in RCA: 111] [Impact Index Per Article: 5.6] [Indexed: 11/18/2022] Open
Abstract
Horizontal DNA transfer is an important factor in evolution and contributes to biological diversity. Unfortunately, the location and length of horizontal transfers (HTs) are known for very few species. The usage of short oligonucleotides in a sequence (the so-called genomic signature) has been shown to be species-specific even in DNA fragments as short as 1 kb. The genomic signature is therefore proposed as a tool to detect HTs. Since DNA transfers originate from species with a signature different from that of the recipient species, the analysis of local variations of signature along the recipient genome may allow exogenous DNA to be detected. The strategy consists of (i) scanning the genome with a sliding window and calculating the corresponding local signature, (ii) evaluating its deviation from the signature of the whole genome and (iii) looking for similar signatures in a database of genomic signatures. A total of 22 prokaryote genomes are analyzed in this way. Atypical regions make up ∼6% of each genome on average. Most of the claimed HTs as well as new ones are detected. The origin of putative DNA transfers is searched for among ∼12 000 species. Donor species are proposed and sometimes strongly suggested, based on similarity of signatures. Among the species studied, Bacillus subtilis, Haemophilus influenzae and Escherichia coli have been investigated by many authors and provide the opportunity for a thorough comparison of most of the bioinformatics methods used to detect HTs.
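Steps (i) and (ii) of the strategy described in this abstract can be illustrated with a minimal Python sketch. The word length (4), the window and step sizes, and the Euclidean distance are assumptions chosen for illustration; the abstract does not state the parameters or the distance measure the authors used, and step (iii), the database lookup, is omitted.

```python
from collections import Counter
from itertools import product
import math


def signature(seq, k=4):
    """Return the k-mer frequency vector of seq (overlapping words over ACGT)."""
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)  # ignores words with ambiguous bases
    return [counts[w] / total if total else 0.0 for w in words]


def window_deviations(genome, window=5000, step=2500, k=4):
    """Yield (start, distance) for each sliding window.

    distance is the Euclidean distance between the local signature and the
    whole-genome signature (an assumed measure, not necessarily the paper's).
    """
    reference = signature(genome, k)
    for start in range(0, len(genome) - window + 1, step):
        local = signature(genome[start:start + window], k)
        yield start, math.dist(local, reference)
```

Windows with the largest distances would be the candidates for atypical regions; choosing a cut-off is left open here because the abstract gives none.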
Affiliation(s)
- Patrick Deschavanne
- To whom correspondence should be addressed. Tel: 33 1 44 27 77 12; Fax: +33 1 43 26 38 30;
169
Wang Y, Rocha EPC, Leung FCC, Danchin A. Cytosine methylation is not the major factor inducing CpG dinucleotide deficiency in bacterial genomes. J Mol Evol 2004; 58:692-700. [PMID: 15461426 DOI: 10.1007/s00239-004-2591-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
Abstract
CpG dinucleotide deficiency has been found in viruses, mitochondria, prokaryotes, and eukaryotes. The consensus explanation is that it is due to deamination of methylated cytosines, as established for vertebrates and plants. However, we still do not know whether C5 cytosine methylation is also the major cause of CpG deficiency in bacteria. By combining annotation and experimental data identifying the presence of C5 cytosine methyltransferases with analysis of CpG relative abundance in 67 bacterial species, we found that CpG relative abundance in most bacterial genomes that have cytosine C5 methyltransferases tends to be in the normal range (observed/expected values between 0.82 and 1.21). In contrast, many bacterial species likely to be lacking C5 cytosine methylation showed CpG deficiency. Furthermore, when comparing genomes with one another, TpG and CpA relative abundances were found to be independent of CpG relative abundance. This contrasted with intragenome analyses, where C3pG1 relative abundance (the subscripts refer to the position of a nucleotide in a codon) was found to be generally positively correlated with T3pG1 relative abundances when plotted against GC content in protein coding sequences (CDSs). This suggests the existence of alternative mechanisms contributing to CpG deficiency in bacteria.
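The observed/expected value mentioned in this abstract can be sketched as a dinucleotide relative abundance, here taken to be the dinucleotide frequency divided by the product of the two mononucleotide frequencies. That single-strand formula is an assumption for illustration; the abstract does not say whether the authors used a strand-symmetrized variant. The function name is hypothetical.

```python
def relative_abundance(seq, dinuc="CG"):
    """Observed/expected ratio of a dinucleotide: f_xy / (f_x * f_y).

    Single-strand version, assumed for illustration. Returns NaN when either
    base is absent from the sequence.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    f_x = seq.count(dinuc[0]) / n
    f_y = seq.count(dinuc[1]) / n
    # Count overlapping occurrences explicitly so that words such as "AA" work.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    f_xy = observed / (n - 1)
    if f_x == 0 or f_y == 0:
        return float("nan")
    return f_xy / (f_x * f_y)
```

Under the range quoted in the abstract, a value below 0.82 would count as CpG deficiency and a value above 1.21 as CpG excess.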
Affiliation(s)
- Yong Wang
- Department of Zoology, University of Hong Kong, Pokfulam, Hong Kong SAR, China
170
TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences. BMC Bioinformatics 2004; 5:163. [PMID: 15507136 PMCID: PMC529438 DOI: 10.1186/1471-2105-5-163] [Citation(s) in RCA: 276] [Impact Index Per Article: 13.1] [Received: 08/17/2004] [Accepted: 10/26/2004] [Indexed: 11/29/2022] Open
Abstract
Background: In the emerging field of environmental genomics, direct cloning and sequencing of genomic fragments from complex microbial communities has proven to be a valuable source of new enzymes, expanding the knowledge of basic biological processes. The central problem of this so-called metagenome approach is that the cloned fragments often lack suitable phylogenetic marker genes, rendering the identification of clones that are likely to originate from the same genome difficult or impossible. In such cases, the analysis of intrinsic DNA-signatures like tetranucleotide frequencies can provide valuable hints on fragment affiliation. With this application in mind, the TETRA web-service and the TETRA stand-alone program have been developed, both of which automate the task of comparative tetranucleotide frequency analysis.
Results: TETRA provides a statistical analysis of tetranucleotide usage patterns in genomic fragments, either via a web-service or a stand-alone program. With respect to discriminatory power, such an analysis outperforms the assignment of genomic fragments based on the (G+C)-content, which is a widely-used sequence-based measure for assessing fragment relatedness. While the web-service is restricted to the calculation of correlation coefficients between tetranucleotide usage patterns of submitted DNA sequences, the stand-alone program generates a much more detailed output, comprising all raw data and graphical plots. The stand-alone program is controlled via a graphical user interface and can batch-process a multitude of sequences. Furthermore, it comes with pre-computed tetranucleotide usage patterns for 166 prokaryote chromosomes, providing a useful reference dataset and source for data-mining.
Conclusions: Up to now, the analysis of skewed oligonucleotide distributions within DNA sequences is not a commonly used tool within metagenomics. With the TETRA web-service and stand-alone program, the method is now accessible in an easy to use manner for a broad audience. This will hopefully facilitate the interrelation of genomic fragments from metagenome libraries, ultimately leading to new insights into the genetic potentials of yet uncultured microorganisms.
171
Teeling H, Meyerdierks A, Bauer M, Amann R, Glöckner FO. Application of tetranucleotide frequencies for the assignment of genomic fragments. Environ Microbiol 2004; 6:938-47. [PMID: 15305919 DOI: 10.1111/j.1462-2920.2004.00624.x] [Citation(s) in RCA: 251] [Impact Index Per Article: 12.0] [Indexed: 12/16/2022]
Abstract
A basic problem of the metagenomic approach in microbial ecology is the assignment of genomic fragments to a certain species or taxonomic group, when suitable marker genes are absent. Currently, the (G + C)-content together with phylogenetic information and codon adaptation for functional genes is mostly used to assess the relationship of different fragments. These methods, however, can produce ambiguous results. In order to evaluate sequence-based methods for fragment identification, we extensively compared (G + C)-contents and tetranucleotide usage patterns of 9054 fosmid-sized genomic fragments generated in silico from 118 completely sequenced bacterial genomes (40 982 931 fragment pairs were compared in total). The results of this systematic study show that the discriminatory power of correlations of tetranucleotide-derived z-scores is by far superior to that of differences in (G + C)-content and provides reasonable assignment probabilities when applied to metagenome libraries of small diversity. Using six fully sequenced fosmid inserts from a metagenomic analysis of microbial consortia mediating the anaerobic oxidation of methane (AOM), we demonstrate that discrimination based on tetranucleotide-derived z-score correlations was consistent with corresponding data from 16S ribosomal RNA sequence analysis and allowed us to discriminate between fosmid inserts that were indistinguishable with respect to their (G + C)-contents.
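The comparison of tetranucleotide-derived z-scores described in this abstract might be sketched as follows. The expected count of each tetranucleotide is estimated here from its two overlapping trinucleotides and the central dinucleotide (a maximal-order Markov model), with a matching variance approximation; this formulation is an assumption, since the abstract does not give the formula. Counting on both strands, which such analyses often do, is left out to keep the sketch short.

```python
from collections import Counter
from itertools import product
import math


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def tetra_zscores(seq):
    """Return 256 z-scores, one per tetranucleotide, in a fixed ACGT order."""
    seq = seq.upper()
    c2, c3, c4 = (kmer_counts(seq, k) for k in (2, 3, 4))
    scores = []
    for word in ("".join(p) for p in product("ACGT", repeat=4)):
        left, right, mid = c3[word[:3]], c3[word[1:]], c2[word[1:3]]
        if mid == 0:
            scores.append(0.0)  # word cannot occur; treat as uninformative
            continue
        expected = left * right / mid
        variance = expected * (mid - left) * (mid - right) / mid ** 2
        scores.append((c4[word] - expected) / math.sqrt(variance)
                      if variance > 0 else 0.0)
    return scores


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0
```

Two fragments would then be compared with `pearson(tetra_zscores(a), tetra_zscores(b))`, a higher coefficient suggesting more similar tetranucleotide usage.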
Affiliation(s)
- Hanno Teeling
- Department of Molecular Ecology, Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
172
Touchon M, Arneodo A, d'Aubenton-Carafa Y, Thermes C. Transcription-coupled and splicing-coupled strand asymmetries in eukaryotic genomes. Nucleic Acids Res 2004; 32:4969-78. [PMID: 15388799 PMCID: PMC521644 DOI: 10.1093/nar/gkh823] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open
Abstract
Under no-strand bias conditions, each genomic DNA strand should present equimolarities of A and T and of G and C. Deviations from these rules are attributed to asymmetric properties intrinsic to DNA mutation-repair processes. In bacteria, strand biases are associated with replication or transcription. In eukaryotes, recent studies demonstrate that human genes present transcription-coupled biases that might reflect transcription-coupled repair processes. Here, we study strand asymmetries in intron sequences of evolutionarily distant eukaryotes, and show that two superimposed intron biases can be distinguished. (i) Biases that are maximum at intron extremities and decrease over large distances to zero values in internal regions, possibly reflecting interactions between pre-mRNA and splicing machinery; these extend over approximately 0.5 kb in mammals and Arabidopsis thaliana, and over 1 kb in Caenorhabditis elegans and Drosophila melanogaster. (ii) Biases that are constant along introns, possibly associated with transcription. Strikingly, in C. elegans, these latter biases extend over intergenic regions that separate co-oriented genes. When appropriately examined, all genomes present transcription-coupled excess of T over A in the coding strand. In contrast, GC skews are either positive (mammals, plants) or negative (invertebrates). These results suggest that transcription-coupled asymmetries result from mutation-repair mechanisms that differ between vertebrates and invertebrates.
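The deviations from A = T and G = C equimolarity discussed in this abstract are commonly summarized as skews. The sketch below assumes the usual normalized forms, (T − A)/(T + A) and (G − C)/(G + C), computed on one strand; the abstract itself does not define the measures, so both the formulas and the function name are illustrative.

```python
def skews(seq):
    """Return (TA skew, GC skew) for one strand of a sequence.

    TA skew = (T - A) / (T + A); GC skew = (G - C) / (G + C).
    Both are 0.0 under the no-strand-bias expectation, and a skew is
    reported as 0.0 when neither of its two bases occurs.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    ta = (t - a) / (t + a) if (t + a) else 0.0
    gc = (g - c) / (g + c) if (g + c) else 0.0
    return ta, gc
```

A positive TA skew on the coding strand corresponds to the excess of T over A reported in the abstract.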
Affiliation(s)
- Marie Touchon
- Centre de Génétique Moléculaire (CNRS), Allée de la Terrasse, 91198 Gif-sur-Yvette, France
173
van Passel MWJ, Bart A, Waaijer RJA, Luyf ACM, van Kampen AHC, van der Ende A. An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes. Nucleic Acids Res 2004; 32:e114. [PMID: 15304543 PMCID: PMC514399 DOI: 10.1093/nar/gnh115] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022] Open
Abstract
In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized by, among other features, atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that out of 27 of the small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N. meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest-amplification-cloning-sequencing scheme.
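An in silico digestion of the kind described in this abstract can be sketched in a few lines. The example uses XbaI, which recognizes TCTAGA and cuts after the first T, and applies the >300 bp and <5 kb size filter quoted above. Treating the genome as a linear molecule and scanning only one enzyme are simplifications assumed here; the function names are hypothetical.

```python
def digest(seq, site="TCTAGA", cut_offset=1):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for XbaI, which cuts T^CTAGA). Returns fragments in order.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def small_fragments(fragments, min_len=300, max_len=5000):
    """Keep fragments longer than min_len and shorter than max_len (in bp)."""
    return [f for f in fragments if min_len < len(f) < max_len]
```

The fragments that pass the filter would then be checked for anomalous composition, for example by GC content, which this sketch does not attempt.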
Affiliation(s)
- M W J van Passel
- Department of Medical Microbiology, Academic Medical Center, Amsterdam, The Netherlands
174
Abstract
Tracing the history of molecular changes in coronaviruses using phylogenetic methods can provide powerful insights into the patterns of modification to sequences that underlie alteration to selective pressure and molecular function in the SARS-CoV (severe acute respiratory syndrome coronavirus) genome. The topology and branch lengths of the phylogenetic relationships among the family Coronaviridae, including SARS-CoV, have been estimated using the replicase polyprotein. The spike protein fragments S1 (involved in receptor-binding) and S2 (involved in membrane fusion) have been found to have different mutation rates. Fragment S1 can be further divided into two regions (S1A, which comprises approximately the first 400 nucleotides, and S1B, comprising the next 280) that also show different rates of mutation. The phylogeny presented on the basis of S1B shows that SARS-CoV is closely related to MHV (murine hepatitis virus), which is known to bind the murine receptor CEACAM1. The predicted structure, accessibility and mutation rate of the S1B region are also presented. Because anti-SARS drugs based on S2 heptads have short half-lives and are difficult to manufacture, our findings suggest that the S1B region might be of interest for anti-SARS drug discovery.
Affiliation(s)
- Pietro Liò
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK.
175
Monteiro-Vitorello CB, Camargo LEA, Van Sluys MA, Kitajima JP, Truffi D, do Amaral AM, Harakava R, de Oliveira JCF, Wood D, de Oliveira MC, Miyaki C, Takita MA, da Silva ACR, Furlan LR, Carraro DM, Camarotte G, Almeida NF, Carrer H, Coutinho LL, El-Dorry HA, Ferro MIT, Gagliardi PR, Giglioti E, Goldman MHS, Goldman GH, Kimura ET, Ferro ES, Kuramae EE, Lemos EGM, Lemos MVF, Mauro SMZ, Machado MA, Marino CL, Menck CF, Nunes LR, Oliveira RC, Pereira GG, Siqueira W, de Souza AA, Tsai SM, Zanca AS, Simpson AJG, Brumbley SM, Setúbal JC. The genome sequence of the gram-positive sugarcane pathogen Leifsonia xyli subsp. xyli. Mol Plant Microbe Interact 2004; 17:827-836. [PMID: 15305603 DOI: 10.1094/mpmi.2004.17.8.827] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.0] [Indexed: 05/24/2023]
Abstract
The genome sequence of Leifsonia xyli subsp. xyli, which causes ratoon stunting disease and affects sugarcane worldwide, was determined. The single circular chromosome of Leifsonia xyli subsp. xyli CTCB07 was 2.6 Mb in length with a GC content of 68% and 2,044 predicted open reading frames. The analysis also revealed 307 predicted pseudogenes, which is more than any bacterial plant pathogen sequenced to date. Many of these pseudogenes, if functional, would likely be involved in the degradation of plant heteropolysaccharides, uptake of free sugars, and synthesis of amino acids. Although L. xyli subsp. xyli has only been identified colonizing the xylem vessels of sugarcane, the numbers of predicted regulatory genes and sugar transporters are similar to those in free-living organisms. Some of the predicted pathogenicity genes appear to have been acquired by lateral transfer and include genes for cellulase, pectinase, wilt-inducing protein, lysozyme, and desaturase. The presence of the latter may contribute to stunting, since it is likely involved in the synthesis of abscisic acid, a hormone that arrests growth. Our findings are consistent with the nutritionally fastidious behavior exhibited by L. xyli subsp. xyli and suggest an ongoing adaptation to the restricted ecological niche it inhabits.
Affiliation(s)
- Claudia B Monteiro-Vitorello
- Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Av. Pádua Dias, 11, 13418-900, Piracicaba, SP, Brazil
176
Abstract
In this review, we focus on a group of mobile genetic elements designated pathogenicity islands (PAI). These elements play a pivotal role in the virulence of bacterial pathogens of humans and are also essential for virulence in pathogens of animals and plants. Characteristic molecular features of PAI of important human pathogens and their role in pathogenesis are described. The availability of a large number of genome sequences of pathogenic bacteria and their benign relatives currently offers a unique opportunity for the identification of novel pathogen-specific genomic islands. However, this knowledge has to be complemented by improved model systems for the analysis of virulence functions of bacterial pathogens. PAI apparently have been acquired during the speciation of pathogens from their nonpathogenic or environmental ancestors. The acquisition of PAI not only is an ancient evolutionary event that led to the appearance of bacterial pathogens on a timescale of millions of years but also may represent a mechanism that contributes to the appearance of new pathogens within a human life span. The acquisition of knowledge about PAI, their structure, their mobility, and the pathogenicity factors they encode not only is helpful in gaining a better understanding of bacterial evolution and interactions of pathogens with eukaryotic host cells but also may have important practical implications such as providing delivery systems for vaccination, tools for cell biology, and tools for the development of new strategies for therapy of bacterial infections.
Affiliation(s)
- Herbert Schmidt
- Institut für Medizinische Mikrobiologie und Hygiene, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany.
177
Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH. Codon usage between genomes is constrained by genome-wide mutational processes. Proc Natl Acad Sci U S A 2004; 101:3480-5. [PMID: 14990797 PMCID: PMC373487 DOI: 10.1073/pnas.0307827100] [Citation(s) in RCA: 230] [Impact Index Per Article: 11.0] [Indexed: 11/18/2022] Open
Abstract
Analysis of genome-wide codon bias shows that only two parameters effectively differentiate the genome-wide codon bias of 100 eubacterial and archaeal organisms. The first parameter correlates with genome GC content, and the second parameter correlates with context-dependent nucleotide bias. Both of these parameters may be calculated from intergenic sequences. Therefore, genome-wide codon bias in eubacteria and archaea may be predicted from intergenic sequences that are not translated. When these two parameters are calculated for genes from nonmammalian eukaryotic organisms, genes from the same organism again have similar values, and genome-wide codon bias may also be predicted from intergenic sequences. In mammals, genes from the same organism are similar only in the second parameter, because GC content varies widely among isochores. Our results suggest that, in general, genome-wide codon bias is determined primarily by mutational processes that act throughout the genome, and only secondarily by selective forces acting on translated sequences.
Affiliation(s)
- Swaine L Chen
- Department of Developmental Biology, Stanford University School of Medicine, Beckman Center, B300, Stanford, CA 94304, USA.
178
179
Coenye T, Vandamme P. Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiology (Reading) 2003; 149:3507-3517. [PMID: 14663083 DOI: 10.1099/mic.0.26515-0] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022] Open
Abstract
The availability of an ever increasing number of complete genome sequences of diverse prokaryotic taxa has led to the introduction of novel approaches to infer phylogenetic relationships among bacteria. In the present study the sequences of the 16S rRNA gene and nine housekeeping genes were compared with the fraction of shared putative orthologous protein-encoding genes, conservation of gene order, dinucleotide relative abundance and codon usage among 11 genomes of species belonging to the lactic acid bacteria. In general there is a good correlation between the results obtained with various approaches, although it is clear that there is a stronger phylogenetic signal in some datasets than in others, and that different parameters have different taxonomic resolutions. It appears that trees based on different kinds of information derived from whole-genome sequencing projects do not provide much additional information about the phylogenetic relationships among bacterial taxa compared to more traditional alignment-based methods. Nevertheless, it is expected that the study of these novel forms of information will have its value in taxonomy, to determine which genes are shared, when genes or sets of genes were lost in evolutionary history, to detect the presence of horizontally transferred genes and/or confirm or enhance the phylogenetic signal derived from traditional methods. Although these conclusions are based on a relatively small dataset, they are largely in agreement with other studies and it is anticipated that similar trends will be observed when comparing other genomes.
Affiliation(s)
- Tom Coenye
- Laboratorium voor Microbiologie, Universiteit Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
- Peter Vandamme
- Laboratorium voor Microbiologie, Universiteit Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
180
Miller ES, Heidelberg JF, Eisen JA, Nelson WC, Durkin AS, Ciecko A, Feldblyum TV, White O, Paulsen IT, Nierman WC, Lee J, Szczypinski B, Fraser CM. Complete genome sequence of the broad-host-range vibriophage KVP40: comparative genomics of a T4-related bacteriophage. J Bacteriol 2003; 185:5220-33. [PMID: 12923095 PMCID: PMC180978 DOI: 10.1128/jb.185.17.5220-5233.2003] [Citation(s) in RCA: 181] [Impact Index Per Article: 8.2] [Received: 02/10/2003] [Accepted: 04/30/2003] [Indexed: 11/20/2022] Open
Abstract
The complete genome sequence of the T4-like, broad-host-range vibriophage KVP40 has been determined. The genome sequence is 244,835 bp, with an overall G+C content of 42.6%. It encodes 386 putative protein-encoding open reading frames (CDSs), 30 tRNAs, 33 T4-like late promoters, and 57 potential rho-independent terminators. Overall, 92.1% of the KVP40 genome is coding, with an average CDS size of 587 bp. While 65% of the CDSs were unique to KVP40 and had no known function, the genome sequence and organization show specific regions of extensive conservation with phage T4. At least 99 KVP40 CDSs have homologs in the T4 genome (Blast alignments of 45 to 68% amino acid similarity). The shared CDSs represent 36% of all T4 CDSs but only 26% of those from KVP40. There is extensive representation of the DNA replication, recombination, and repair enzymes as well as the viral capsid and tail structural genes. KVP40 lacks several T4 enzymes involved in host DNA degradation, appears not to synthesize the modified cytosine (hydroxymethyl glucose) present in T-even phages, and lacks group I introns. KVP40 likely utilizes the T4-type sigma-55 late transcription apparatus, but features of early- or middle-mode transcription were not identified. There are 26 CDSs that have no viral homolog, and many did not necessarily originate from Vibrio spp., suggesting an even broader host range for KVP40. From these latter CDSs, an NAD salvage pathway was inferred that appears to be unique among bacteriophages. Features of the KVP40 genome that distinguish it from T4 are presented, as well as those, such as the replication and virion gene clusters, that are substantially conserved.
Affiliation(s)
- Eric S Miller
- Department of Microbiology, North Carolina State University, Raleigh, NC 27695-7615, USA
181
Tu Q, Ding D. Detecting pathogenicity islands and anomalous gene clusters by iterative discriminant analysis. FEMS Microbiol Lett 2003; 221:269-75. [PMID: 12725938 DOI: 10.1016/s0378-1097(03)00204-0] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.5] [Indexed: 01/12/2023] Open
Abstract
We present a simple method to detect pathogenicity islands and anomalous gene clusters in bacterial genomes. The method uses iterative discriminant analysis to define genomic regions that deviate most from the rest of the genome in three compositional criteria: G+C content, dinucleotide frequency and codon usage. Using this method, we identify many virulence-related gene islands, e.g. encoding protein secretion systems, adhesins, toxins, and other anomalous gene clusters, such as prophages. The program and the whole dataset, including the catalogs of genes in the detected anomalous segments, are publicly available at http://compbio.sibsnet.org/projects/pai-ida/. This program can be used in searching for virulence-related factors in newly sequenced bacterial genomes.
Affiliation(s)
- Qiang Tu
- Key Laboratory of Proteomics, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China.
182
Jansen R, Bussemaker HJ, Gerstein M. Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models. Nucleic Acids Res 2003; 31:2242-51. [PMID: 12682375 PMCID: PMC153734 DOI: 10.1093/nar/gkg306] [Citation(s) in RCA: 103] [Impact Index Per Article: 4.7] [Received: 08/05/2002] [Revised: 01/23/2003] [Accepted: 02/18/2003] [Indexed: 02/03/2023] Open
Abstract
Highly expressed genes in many bacteria and small eukaryotes often have a strong compositional bias, in terms of codon usage. Two widely used numerical indices, the codon adaptation index (CAI) and the codon usage, use this bias to predict the expression level of genes. When these indices were first introduced, they were based on fairly simple assumptions about which genes are most highly expressed: the CAI was originally based on the codon composition of a set of only 24 highly expressed genes, and the codon usage on assumptions about which functional classes of genes are highly expressed in fast-growing bacteria. Given the recent advent of genome-wide expression data, we should be able to improve on these assumptions. Here, we measure, in yeast, the degree to which consideration of the current genome-wide expression data sets improves the performance of both numerical indices. Indeed, we find that by changing the parameterization of each model its correlation with actual expression levels can be somewhat improved, although both indices are fairly insensitive to the exact way they are parameterized. This insensitivity indicates a consistent codon bias amongst highly expressed genes. We also attempt direct linear regression of codon composition against genome-wide expression levels (and protein abundance data). This has some similarity with the CAI formalism and yields an alternative model for the prediction of expression levels based on the coding sequences of genes. More information is available at http://bioinfo.mbb.yale.edu/expression/codons.
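The codon adaptation index discussed in this abstract can be illustrated with a minimal sketch: each codon receives a relative adaptiveness w (its count in a reference set of highly expressed genes divided by the count of the most frequent synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. The standard genetic code, the exclusion of Met, Trp and stop codons, and the skipping of codons absent from the reference set are assumptions made here for brevity; they are not details taken from the abstract.

```python
from collections import Counter, defaultdict
from itertools import product
import math

# Standard genetic code in TCAG order (an assumption; other codes exist).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split a coding sequence into complete codons, dropping a partial tail."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference_cdss):
    """Compute w for every codon seen in the reference gene set."""
    counts = Counter(c for cds in reference_cdss for c in codons(cds)
                     if c in CODON_TO_AA)
    best = defaultdict(int)  # highest codon count per amino acid
    for codon, n in counts.items():
        aa = CODON_TO_AA[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TO_AA[codon]] for codon, n in counts.items()}


def cai(cds, w):
    """Geometric mean of w over the codons of cds (NaN if none are scorable)."""
    logs = [math.log(w[c]) for c in codons(cds)
            if c in w and CODON_TO_AA[c] not in "MW*"]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

Re-parameterizing the index, as the abstract describes, would amount to building `w` from a different reference set, for example genes ranked by measured expression.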
Affiliation(s)
- Ronald Jansen
- Department of Molecular Biophysics and Biochemistry, 266 Whitney Avenue, Yale University, PO Box 208114, New Haven, CT 06520, USA
183
Georgi LL, Wang Y, Reighard GL, Mao L, Wing RA, Abbott AG. Comparison of peach and Arabidopsis genomic sequences: fragmentary conservation of gene neighborhoods. Genome 2003; 46:268-76. [PMID: 12723043 DOI: 10.1139/g03-004] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]
Abstract
We examined the degree of conservation of gene order in two plant species, Prunus persica (peach) and Arabidopsis thaliana (thale cress), whose lineages diverged more than 90 million years ago. In the three peach genomic regions studied, segments with a gene order congruent with A. thaliana were short (two to three genes in length); and for any peach region, corresponding segments were found in diverse locations in the A. thaliana genome. At the gene level and lower, the A. thaliana sequence was enormously useful for identifying likely coding regions in peach sequences and in determining their intron-exon structure. The peach BAC sequence data reported here contained a BLAST-detectable putative coding sequence an average of every 7 kb, and the peach introns identified in this study were, on average, almost twice the length of the corresponding introns in A. thaliana.
Affiliation(s)
- Laura L Georgi
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, U.S.A.
184
Coenye T, Vandamme P. Simple sequence repeats and compositional bias in the bipartite Ralstonia solanacearum GMI1000 genome. BMC Genomics 2003; 4:10. [PMID: 12697060 PMCID: PMC153513 DOI: 10.1186/1471-2164-4-10] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Received: 12/23/2002] [Accepted: 03/17/2003] [Indexed: 11/10/2022] Open
Abstract
Background: Ralstonia solanacearum is an important plant pathogen. The genome of R. solanacearum GMI1000 is organised into two replicons (a 3.7-Mb chromosome and a 2.1-Mb megaplasmid) and this bipartite genome structure is characteristic of most R. solanacearum strains. To determine whether the megaplasmid was acquired via recent horizontal gene transfer or is part of an ancestral single chromosome, we compared the abundance, distribution and composition of simple sequence repeats (SSRs) between both replicons and also compared the respective compositional biases.
Results: Our data show that both replicons are very similar with respect to distribution and composition of SSRs and presence of compositional biases. Minor variations in SSR and compositional biases observed may be attributable to minor differences in gene expression and regulation of gene expression or can be attributed to the small sample numbers observed.
Conclusions: The observed similarities indicate that both replicons have shared a similar evolutionary history and thus suggest that the megaplasmid was not recently acquired from other organisms by lateral gene transfer but is a part of an ancestral R. solanacearum chromosome.
Affiliation(s)
- Tom Coenye
- Laboratorium voor Microbiologie, Ghent University, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
- Peter Vandamme
- Laboratorium voor Microbiologie, Ghent University, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
|
185
|
Abstract
Changes in technology in the past decade have had such an impact on the way that molecular evolution research is done that it is difficult now to imagine working in a world without genomics or the Internet. In 1992, GenBank was less than a hundredth of its current size and was updated every three months on a huge spool of tape. Homology searches took 30 minutes and rarely found a hit. Now it is difficult to find sequences with only a few homologs to use as examples for teaching bioinformatics. For molecular evolution researchers, the genomics revolution has showered us with raw data and the information revolution has given us the wherewithal to analyze it. In broad terms, the most significant outcome from these changes has been our newfound ability to examine the evolution of genomes as a whole, enabling us to infer genome-wide evolutionary patterns and to identify subsets of genes whose evolution has been in some way atypical.
Affiliation(s)
- Kenneth H Wolfe
- Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin 2, Ireland.
|
186
|
Abstract
It is probable that, increasingly, genome investigations are going to be based on statistical formalization. This review summarizes the state of the art and the potential of using statistics in microbial genome analysis. First, I focus on recent advances in functional genomics, such as finding genes and operons, identifying gene conversion events, detecting DNA replication origins and analysing regulatory sites. Then I describe how to use phylogenetic methods in genome analysis and methods for genome-wide scanning for positively selected amino acids. I conclude with speculations on the future course of genome statistical modeling.
Affiliation(s)
- Pietro Liò
- Department of Zoology, University of Cambridge, UK.
|
187
|
Pride DT, Meinersmann RJ, Wassenaar TM, Blaser MJ. Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res 2003; 13:145-58. [PMID: 12566393 PMCID: PMC420360 DOI: 10.1101/gr.335003] [Citation(s) in RCA: 176] [Impact Index Per Article: 8.0] [Indexed: 11/25/2022]
Abstract
We compared nucleotide usage pattern conservation for related prokaryotes by examining the representation of DNA tetranucleotide combinations in 27 representative microbial genomes. For each of the organisms studied, tetranucleotide usage departures from expectations (TUD) were shared between related organisms using both Markov chain analysis and a zero-order Markov method. Individual strains, multiple chromosomes, plasmids, and bacteriophages share TUDs within a species. TUDs varied between coding and noncoding DNA. Grouping prokaryotes based on TUD profiles resulted in relationships with important differences from those based on 16S rRNA phylogenies, which may reflect unequal rates of evolution of nucleotide usage patterns following divergence of particular organisms from a common ancestor. By both symmetrical tree distance and likelihood analysis, phylogenetic trees based on TUD profiles demonstrate a level of congruence with 16S rRNA trees similar to that of both RpoA and RecA trees. Congruence of these trees indicates that there exists phylogenetic signal in TUD patterns, most prominent in coding region DNA. Because relationships demonstrated in TUD-based analyses utilize whole genomes, they should be considered complementary to phylogenies based on single genetic elements, such as 16S rRNA.
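The zero-order Markov comparison mentioned in this abstract can be sketched as follows. This is an illustrative version only: it assumes the expected count of a tetranucleotide is the product of its single-base frequencies times the number of tetranucleotide positions, and the function name and the observed/expected ratio output are choices made for the example. The paper's exact TUD normalization is not given in the abstract.

```python
from collections import Counter
from itertools import product

def tetranucleotide_departures(seq):
    """Observed/expected ratio per tetranucleotide under a zero-order model.

    Expected count = (number of 4-mers) * product of base frequencies.
    This normalization is an assumption for illustration.
    """
    seq = seq.upper()
    bases = "ACGT"
    base_counts = Counter(b for b in seq if b in bases)
    total_bases = sum(base_counts.values())
    if total_bases == 0:
        return {}
    base_freq = {b: base_counts[b] / total_bases for b in bases}

    # Count overlapping 4-mers made only of A, C, G, T.
    kmer_counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if all(c in bases for c in kmer):
            kmer_counts[kmer] += 1
    total_kmers = sum(kmer_counts.values())

    ratios = {}
    for kmer in map("".join, product(bases, repeat=4)):
        expected = total_kmers
        for c in kmer:
            expected *= base_freq[c]
        if expected > 0:
            ratios[kmer] = kmer_counts[kmer] / expected
    return ratios
```

Ratios well above or below 1 mark over- or under-represented tetranucleotides; profiles from two genomes could then be compared with any distance measure.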
MESH Headings
- Chromosome Mapping/methods
- Chromosome Mapping/statistics & numerical data
- Chromosomes, Archaeal/genetics
- Chromosomes, Bacterial/genetics
- Cluster Analysis
- DNA, Archaeal/genetics
- DNA, Bacterial/genetics
- Gene Transfer, Horizontal/genetics
- Genome, Archaeal
- Genome, Bacterial
- Gram-Negative Bacteria/genetics
- Gram-Positive Bacteria/genetics
- Microsatellite Repeats/genetics
- Phylogeny
- Plasmids/genetics
- RNA, Archaeal/genetics
- RNA, Bacterial/genetics
- RNA, Ribosomal, 16S/genetics
- Spirochaeta/genetics
Affiliation(s)
- David T Pride
- Department of Microbiology and Immunology, Vanderbilt University, Nashville, Tennessee 37235, USA.
|
188
|
Tosato V, Gjuracic K, Vlahovicek K, Pongor S, Danchin A, Bruschi CV. The DNA secondary structure of the Bacillus subtilis genome. FEMS Microbiol Lett 2003; 218:23-30. [PMID: 12583893 DOI: 10.1111/j.1574-6968.2003.tb11493.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Abstract
The entire genomic DNA sequence of the Gram-positive bacterium Bacillus subtilis reported in the SubtiList database has been subjected in this work to a complete bioinformatic analysis of the potential formation of secondary DNA structures such as hairpins and bending. The most significant of these structures have been mapped with respect to their genomic location and compared to those structures already known to have a physiological role, such as the rho-independent transcription terminators. The distribution of these structures along the bacterial chromosome shows two major features: (i) the concentration of the most curved DNA in the intergenic regions rather than within the ORFs, and (ii) a decreasing gradient of large hairpins from the origin towards the terC end of chromosomal DNA replication. Given the increasing biological relevance of secondary DNA structures, these findings should facilitate further studies on the evolution, dynamics and expression of the genetic information stored in bacterial genomes.
Affiliation(s)
- Valentina Tosato
- Microbiology Group, International Centre for Genetic Engineering and Biotechnology, AREA Science Park, Padriciano 99, 34012, Trieste, Italy
|
189
|
Abstract
After an illustrious history as one of the primary tools that established the foundations of molecular biology, bacteriophage research is now undergoing a renaissance in which the primary focus is on the phages themselves rather than the molecular mechanisms that they explain. Studies of the evolution of phages and their role in natural ecosystems are flourishing. Practical questions, such as how to use phages to combat human diseases that are caused by bacteria, how to eradicate phage pests in the food industry and what role they have in the causation of human diseases, are receiving increased attention. Phages are also useful in the deeper exploration of basic molecular and biophysical questions.
Affiliation(s)
- Allan Campbell
- Department of Biological Sciences, Stanford University, Stanford, California 94305, USA.
|
190
|
Huang Q, Beharav A, Li Y, Kirzhner V, Nevo E. Mosaic microecological differential stress causes adaptive microsatellite divergence in wild barley, Hordeum spontaneum, at Neve Yaar, Israel. Genome 2002; 45:1216-29. [PMID: 12502268 DOI: 10.1139/g02-073] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
Genetic diversity at 38 microsatellite (short sequence repeats (SSRs)) loci was studied in a sample of 54 plants representing a natural population of wild barley, Hordeum spontaneum, at the Neve Yaar microsite in Israel. Wild barley at the microsite was organized in a mosaic pattern over an area of 3180 m² in the open Tabor oak forest, which was subdivided into four microniches: (i) sun-rock (11 genotypes), (ii) sun-soil (18 genotypes), (iii) shade-soil (11 genotypes), and (iv) shade-rock (14 genotypes). Fifty-four genotypes were tested for ecological-genetic microniche correlates. Analysis of 36 loci showed that allele distributions at SSR loci were nonrandom but structured by ecological stresses (climatic and edaphic). Sixteen (45.7%) of 35 polymorphic loci varied significantly (p < 0.05) in allele frequencies among the microniches. Significant genetic divergence and diversity were found among the four subpopulations. The soil and shade subpopulations showed higher genetic diversities at SSR loci than the rock and sun subpopulations, and the lowest genetic diversity was observed in the sun-rock subpopulation, in contrast with the previous allozyme and RAPD studies. On average, of 36 loci, 88.75% of the total genetic diversity exists within the four microniches, while 11.25% exists between the microniches. In a permutation test, G(ST) was lower for 4999 out of 5000 randomized data sets (p < 0.001) when compared with real data (0.1125). The highest genetic distance was between shade-soil and sun-rock (D = 0.222). Our results suggest that diversifying natural selection may act upon some regulatory regions, resulting in adaptive SSR divergence. Fixation of some loci (GMS61, GMS1, and EBMAC824) at a specific microniche seems to suggest directional selection. The pattern of other SSR loci suggests the operation of balancing selection. SSRs may be either direct targets of selection or markers of selected haplotypes (selective sweep).
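The within/between partition and the permutation result reported above are arithmetically consistent with the usual definition of G(ST). The abstract does not spell out that definition, so the form below is assumed; the symbols H_T (total diversity) and H_S (mean within-microniche diversity) are introduced here for illustration only.

```latex
G_{ST} = \frac{H_T - H_S}{H_T} = 0.1125
\quad\Longrightarrow\quad
\frac{H_S}{H_T} = 1 - G_{ST} = 0.8875 \;(88.75\%\ \text{within microniches}),
\qquad
\frac{5000 - 4999}{5000} = 0.0002 < 0.001 .
```

The last term is simply the fraction of the 5000 randomized data sets whose G(ST) was not lower than the observed value.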
Affiliation(s)
- Qingyang Huang
- Institute of Evolution, University of Haifa, Haifa, 31905, Israel
|
191
|
Abstract
The compositional bias of the G+C, di- and tetranucleotide contents in the 6 181 862 bp Pseudomonas putida KT2440 genome was analysed in sliding windows of 4000 bp in steps of 1000 bp. The genome has a low GC skew (mean 0.066) between the leading and lagging strand. The values of GC contents (mean 61.6%) and of dinucleotide relative abundance exhibit skewed Gaussian distributions. The variance of tetranucleotide frequencies, which increases linearly with increasing GC content, shows two overlapping Gaussian distributions of genome sections with low (minor fraction) or high variance (major fraction). Eighty per cent of the chromosome shares similar GC contents and oligonucleotide bias, but 105 islands of 4000 bp or more show atypical GC contents and/or oligonucleotide signature. Almost all islands provide added value to the metabolic proficiency of P. putida as a saprophytic omnivore. Major features are the uptake and degradation of organic chemicals, ion transport and the synthesis and secretion of secondary metabolites. Other islands endow P. putida with determinants of resistance and defence or with constituents and appendages of the cell wall. A total of 29 islands carry the signature of mobile elements such as phage, transposons, insertion sequence (IS) elements and group II introns, indicating recent acquisition by horizontal gene transfer. The largest gene carries the most unusual sequence that encodes a multirepeat threonine-rich surface adhesion protein. Among the housekeeping genes, only genes of the translational apparatus were located in segments with an atypical signature, suggesting that the synthesis of ribosomal proteins is uncoupled from the rapidly changing translational demands of the cell by the separate utilization of tRNA pools.
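The sliding-window scan described in this abstract (4000 bp windows, 1000 bp steps) can be sketched as below. This is a minimal illustration, not the authors' code: the function name is hypothetical, and GC skew is assumed to be the common per-window quantity (G - C) / (G + C), which the abstract does not define.

```python
def sliding_gc(seq, window=4000, step=1000):
    """Return a list of (start, gc_content, gc_skew), one per full window.

    gc_skew = (G - C) / (G + C) is the commonly used definition,
    assumed here for illustration.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results
```

Windows whose GC content falls far from the genome-wide mean would be candidates for the atypical islands the abstract refers to; the cutoff for "atypical" is not stated there and is left open here.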
Affiliation(s)
- Christian Weinel
- Klinische Forschergruppe, OE 6710, Medizinische Hochschule Hannover, Carl-Neuberg-Str 1, D-30623 Hannover, Germany.
|
192
|
Li YC, Korol AB, Fahima T, Beiles A, Nevo E. Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol Ecol 2002; 11:2453-65. [PMID: 12453231 DOI: 10.1046/j.1365-294x.2002.01643.x] [Citation(s) in RCA: 614] [Impact Index Per Article: 26.7] [Indexed: 11/20/2022]
Abstract
Microsatellites, or tandem simple sequence repeats (SSR), are abundant across genomes and show high levels of polymorphism. SSR genetic and evolutionary mechanisms remain controversial. Here we attempt to summarize the available data related to SSR distribution in coding and noncoding regions of genomes and SSR functional importance. Numerous lines of evidence demonstrate that SSR genomic distribution is nonrandom. Random expansions or contractions appear to be selected against for at least part of SSR loci, presumably because of their effect on chromatin organization, regulation of gene activity, recombination, DNA replication, cell cycle, mismatch repair system, etc. This review also discusses the role of two putative mutational mechanisms, replication slippage and recombination, and their interaction in SSR variation.
Affiliation(s)
- You-Chun Li
- Institute of Evolution, University of Haifa, Haifa 31905, Israel
|
193
|
Palacios C, Wernegreen JJ. A strong effect of AT mutational bias on amino acid usage in Buchnera is mitigated at high-expression genes. Mol Biol Evol 2002; 19:1575-84. [PMID: 12200484 DOI: 10.1093/oxfordjournals.molbev.a004219] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
Abstract
The advent of full genome sequences provides exceptionally rich data sets to explore molecular and evolutionary mechanisms that shape divergence among and within genomes. In this study, we use multivariate analysis to determine the processes driving genome-wide patterns of amino acid usage in the obligate endosymbiont Buchnera and its close free-living relative Escherichia coli. In the AT-rich Buchnera genome, the primary source of variation in amino acid usage differentiates high- and low-expression genes. Amino acids of high-expression Buchnera genes are generally less aromatic and use relatively GC-rich codons, suggesting that selection against aromatic amino acids and against amino acids with AT-rich codons is stronger in high-expression genes. Selection to maintain hydrophobic amino acids in integral membrane proteins is a primary factor driving protein evolution in E. coli but is a secondary factor in Buchnera. In E. coli, gene expression is a secondary force driving amino acid usage, and a correlation with tRNA abundance suggests that translational selection contributes to this effect. Although this and previous studies demonstrate that AT mutational bias and genetic drift influence amino acid usage in Buchnera, this genome-wide analysis argues that selection is sufficient to affect the amino acid content of proteins with different expression and hydropathy levels.
Affiliation(s)
- Carmen Palacios
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts 02543, USA
|
194
|
Raeymaekers L, Wuytack E, Willems I, Michiels CW, Wuytack F. Expression of a P-type Ca(2+)-transport ATPase in Bacillus subtilis during sporulation. Cell Calcium 2002; 32:93. [PMID: 12161109 DOI: 10.1016/s0143-4160(02)00125-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.7] [Indexed: 10/27/2022]
Abstract
The open reading frame designated yloB in the genomic sequence of Bacillus subtilis encodes a putative protein that is most similar to the typically eukaryotic type IIA family of P-type ion-motive ATPases, including the endo(sarco)plasmic reticulum (SERCA) and PMR1 Ca(2+)-transporters, located respectively in the endo(sarco)plasmic reticulum and the Golgi apparatus. The overall amino acid sequence is more similar to that of the Pmr1s than to the SERCAs, whereas the inverse is seen for the 10 amino acids that form the two Ca(2+)-binding sites in SERCA. Sporulating but not vegetative B. subtilis cells express the predicted protein, as shown by Western blotting and by the formation of a Ca(2+)-dependent phosphorylated intermediate. Half-maximal activation of phosphointermediate formation occurred at 2.5 µM Ca(2+). Insertion mutation of the yloB gene did not affect the growth of vegetative cells, did not prevent the formation of viable spores, and did not significantly affect 45Ca accumulation during sporulation. However, spores from knockouts were less resistant to heat and showed a slower rate of germination. It is concluded that the P-type Ca(2+)-transport ATPase from B. subtilis is not essential for survival, but assists in the formation of resistant spores. The evolutionary relationship of the transporter to the eukaryotic P-type Ca(2+)-transport ATPases is discussed.
Affiliation(s)
- L Raeymaekers
- Laboratorium voor Fysiologie, K.U. Leuven, Campus Gasthuisberg O/N, B3000 Leuven, Belgium.
|
195
|
Li YC, Röder MS, Fahima T, Kirzhner VM, Beiles A, Korol AB, Nevo E. Climatic effects on microsatellite diversity in wild emmer wheat (Triticum dicoccoides) at the Yehudiyya microsite, Israel. Heredity (Edinb) 2002; 89:127-32. [PMID: 12136415 DOI: 10.1038/sj.hdy.6800115] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Received: 11/14/2000] [Accepted: 05/01/2002] [Indexed: 11/09/2022]
Abstract
Microsatellite (SSR) diversity at 28 loci comprising seven types of tandem dinucleotide repeated motifs was analyzed in 105 individual plants of wild emmer wheat, Triticum dicoccoides, from a microsite in Yehudiyya, northeast of the Sea of Galilee, Israel. The study area was less than 1000 m² and involved 12 paired plots distributed in a mosaic pattern. Each experiment involved very close (a few meters apart), but sharply divergent, microclimatic niches in the open park forest of Tabor oak: (1) sun, between trees, and (2) shade, under tree canopy. Significant microclimatic divergence characterized many loci displaying asymmetric and non-random distribution of repeat numbers. Niche-specific and niche-unique alleles and linkage disequilibria were found in the two sub-populations. Microsatellite diversity at both single- and two-locus levels is affected by microclimatic environment. The evidence reflects effects of ecological stresses and natural selection on SSR diversity, resulting presumably in adaptive structures.
Affiliation(s)
- Y-C Li
- Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel
|
196
|
Jain R, Rivera MC, Moore JE, Lake JA. Horizontal gene transfer in microbial genome evolution. Theor Popul Biol 2002; 61:489-95. [PMID: 12167368 DOI: 10.1006/tpbi.2002.1596] [Citation(s) in RCA: 135] [Impact Index Per Article: 5.9] [Indexed: 11/22/2022]
Abstract
Horizontal gene transfer is the collective name for processes that permit the exchange of DNA among organisms of different species. Only recently has it been recognized as a significant contribution to inter-organismal gene exchange. Traditionally, it was thought that microorganisms evolved clonally, passing genes from mother to daughter cells with little or no exchange of DNA among diverse species. Studies of microbial genomes, however, have shown that genomes contain genes that are closely related to a number of different prokaryotes, sometimes to phylogenetically very distantly related ones (Doolittle et al., 1990, J. Mol. Evol. 31, 383-388; Karlin et al., 1997, J. Bacteriol. 179, 3899-3913; Karlin et al., 1998, Annu. Rev. Genet. 32, 185-225; Lawrence and Ochman, 1998, Proc. Natl. Acad. Sci. USA 95, 9413-9417; Rivera et al., 1998, Proc. Natl. Acad. Sci. USA 95, 6239-6244; Campbell, 2000, Theor. Popul. Biol. 57, 71-77; Doolittle, 2000, Sci. Am. 282, 90-95; Ochman and Jones, 2000, EMBO J. 19, 6637-6643; Boucher et al., 2001, Curr. Opin. Microbiol. 4, 285-289; Wang et al., 2001, Mol. Biol. Evol. 18, 792-800). Whereas prokaryotic and eukaryotic evolution was once reconstructed from a single 16S ribosomal RNA (rRNA) gene, the analysis of complete genomes is beginning to yield a different picture of microbial evolution, one that is wrought with the lateral movement of genes across vast phylogenetic distances (Lane et al., 1988, Methods Enzymol. 167, 138-144; Lake and Rivera, 1996, Proc. Natl. Acad. Sci. USA 91, 2880-2881; Lake et al., 1999, Science 283, 2027-2028).
Affiliation(s)
- Ravi Jain
- Molecular Biology Institute, University of California, Los Angeles 90095, USA
|
197
|
Wong GKS, Wang J, Tao L, Tan J, Zhang J, Passey DA, Yu J. Compositional gradients in Gramineae genes. Genome Res 2002; 12:851-6. [PMID: 12045139 PMCID: PMC1383739 DOI: 10.1101/gr.189102] [Citation(s) in RCA: 129] [Impact Index Per Article: 5.6] [Received: 02/14/2002] [Accepted: 04/03/2002] [Indexed: 11/24/2022]
Abstract
In this study, we describe a property of Gramineae genes, and perhaps all monocot genes, that is not observed in eudicot genes. Along the direction of transcription, beginning at the junction of the 5'-UTR and the coding region, there are gradients in GC content, codon usage, and amino-acid usage. The magnitudes of these gradients are large enough to hinder the annotation of the rice genome and to confound the detection of protein homologies across the monocot-eudicot divide.
Affiliation(s)
- Gane Ka-Shu Wong
- Hangzhou Genomics Institute, Institute of Bioinformatics of Zhejiang University, Key Laboratory of Bioinformatics of Zhejiang Province, Hangzhou 310007, China.
|
198
|
Campbell AM. Preferential orientation of natural lambdoid prophages and bacterial chromosome organization. Theor Popul Biol 2002; 61:503-7. [PMID: 12167370 DOI: 10.1006/tpbi.2002.1604] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
All known lambdoid prophages of Escherichia coli have the same orientation with respect to direction of chromosomal replication. This includes 12 prophages that are replicated in one direction and five in the other. Among candidate explanations, the most amenable to experimental study is an effect on dif site function in assuring chromosomal segregation. This is but one of numerous examples of strand bias in the E. coli genome, all of which may interact with one another.
Affiliation(s)
- Allan M Campbell
- Department of Biological Sciences, Stanford University, California 94305-5020, USA
|
199
|
Lin K, Kuang Y, Joseph JS, Kolatkar PR. Conserved codon composition of ribosomal protein coding genes in Escherichia coli, Mycobacterium tuberculosis and Saccharomyces cerevisiae: lessons from supervised machine learning in functional genomics. Nucleic Acids Res 2002; 30:2599-607. [PMID: 12034849 PMCID: PMC117187 DOI: 10.1093/nar/30.11.2599] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.0] [Received: 11/09/2001] [Revised: 03/15/2002] [Accepted: 03/27/2002] [Indexed: 11/14/2022]
Abstract
Genomics projects have resulted in a flood of sequence data. Functional annotation currently relies almost exclusively on inter-species sequence comparison and is restricted in cases of limited data from related species and widely divergent sequences with no known homologs. Here, we demonstrate that codon composition, a fusion of codon usage bias and amino acid composition signals, can accurately discriminate, in the absence of sequence homology information, cytoplasmic ribosomal protein genes from all other genes of known function in Saccharomyces cerevisiae, Escherichia coli and Mycobacterium tuberculosis using an implementation of support vector machines, SVM(light). Analysis of these codon composition signals is instructive in determining features that confer individuality to ribosomal protein genes. Each of the sets of positively charged, negatively charged and small hydrophobic residues, as well as codon bias, contribute to their distinctive codon composition profile. The representation of all these signals is sensitively detected, combined and augmented by the SVMs to perform an accurate classification. Of special mention is an obvious outlier, yeast gene RPL22B, highly homologous to RPL22A but employing very different codon usage, perhaps indicating a non-ribosomal function. Finally, we propose that codon composition be used in combination with other attributes in gene/protein classification by supervised machine learning algorithms.
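A codon composition feature of the kind this abstract refers to can be illustrated with a short sketch. It assumes the feature is simply the relative frequency of each of the 64 codons in an in-frame coding sequence; the names `CODONS` and `codon_composition` are hypothetical, and the classifier step (SVM(light) in the paper) is not reproduced.

```python
from itertools import product

# All 64 codons in a fixed (alphabetical) order; the ordering is arbitrary.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def codon_composition(cds):
    """Return 64 relative codon frequencies for an in-frame coding sequence.

    Codons containing characters other than A, C, G, T are skipped.
    """
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in counts:
            counts[codon] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]
```

Each gene then becomes one 64-element vector that could be passed, with a ribosomal/non-ribosomal label, to any supervised classifier.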
Affiliation(s)
- Kui Lin
- IMCB-BIC, Institute of Molecular and Cell Biology, 30 Medical Drive, 117609 Singapore
|
200
|
Abstract
Our thesis is that the DNA composition and structure of genomes are selected in part by mutation bias (GC pressure) and in part by ecology. To illustrate this point, we compare and contrast the oligonucleotide composition and the mosaic structure in 36 complete genomes and in 27 long genomic sequences from archaea and eubacteria. We report the following findings: (1) High-GC-content genomes show a large underrepresentation of short distances between G(n) and C(n) homopolymers with respect to distances between A(n) and T(n) homopolymers; we discuss selection versus mutation bias hypotheses. (2) The oligonucleotide compositions of the genomes of Neisseria (meningitidis and gonorrhoeae), Helicobacter pylori and Rhodobacter capsulatus are more biased than the other sequenced genomes. (3) The genomes of free-living species or nonchronic pathogens show more mosaic-like structure than genomes of chronic pathogens or intracellular symbionts. (4) Genome mosaicity of intracellular parasites has a maximum corresponding to the average gene length; in the genomes of free-living and nonchronic pathogens the maximum occurs at larger length scales. This suggests that free-living species can incorporate large pieces of DNA from the environment, whereas for intracellular parasites there are recombination events between homologous genes. We discuss the consequences in terms of evolution of genome size. (5) Intracellular symbionts and obligate pathogens show a small, but nonzero, amount of chromosome mosaicity, suggesting that recombination events occur in these species.
Affiliation(s)
- Pietro Liò
- Department of Zoology, University of Cambridge, United Kingdom.
|