1
Zhao H, Jia RZ, Zhang YL, Zhu YJ, Zeng HC, Kong H, McCafferty H, Guo AP, Peng M. Geographical and Genetic Divergence Among Papaya ringspot virus Populations Within Hainan Province, China. Phytopathology 2016; 106:937-944. [PMID: 27070425 DOI: 10.1094/phyto-05-15-0111-r] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]
Abstract
Papaya ringspot virus (PRSV) severely affects the global papaya industry. Transgenic papaya has been proven to have effective resistance to PRSV isolates from Hawaii, Thailand, Taiwan, and other countries. However, those transgenic cultivars failed to show resistance to Hainan Island isolates. Some 76 PRSV samples, representative of all traditional papaya planting areas across five cities (Wen Chang, n = 13; Cheng Mai, n = 14; Chang Jiang, n = 11; Le Dong, n = 25; and San Ya, n = 13) within Hainan Province, were investigated. Results revealed three genetic diversity groups (Hainan I, II, and III) that correlated with geographical distribution. Frequent mutations among PRSV isolates from Hainan were also observed. The high genetic divergence in PRSV isolates from Hainan is likely to be the cause of the failure of genetically modified papaya that targets sequence-specific virus.
Affiliation(s)
- Hui Zhao
- Rui Zong Jia
- Yu-Liang Zhang
- Yun Judy Zhu
- Hui-Cai Zeng
- Hua Kong
- Heather McCafferty
- An-Ping Guo
- Ming Peng
- First author: College of Agriculture, Hainan University, Haikou, Hainan, China 570228; first, second, third, fourth, fifth, sixth, eighth, and ninth authors: Key Laboratory of Biology and Genetic Resources of Tropical Crops, Ministry of Agriculture, P.R. China, and Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agriculture Sciences, Haikou, Hainan, China 571101; and second, fourth, and seventh authors: Hawaii Agriculture Research Center, Waipahu 96797
2
Yanokura E, Oki K, Makino H, Modesto M, Pot B, Mattarelli P, Biavati B, Watanabe K. Subspeciation of Bifidobacterium longum by multilocus approaches and amplified fragment length polymorphism: Description of B. longum subsp. suillum subsp. nov., isolated from the faeces of piglets. Syst Appl Microbiol 2015; 38:305-14. [PMID: 26007614 DOI: 10.1016/j.syapm.2015.05.001] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 02/25/2015] [Revised: 04/30/2015] [Accepted: 05/06/2015] [Indexed: 11/28/2022]
Abstract
The species Bifidobacterium longum is currently divided into three subspecies, B. longum subsp. longum, B. longum subsp. infantis and B. longum subsp. suis. This classification was based on an assessment of accumulated information on the species' phenotypic and genotypic features. The three subspecies of B. longum were investigated using genotypic identification [amplified-fragment length polymorphism (AFLP), multilocus sequence analysis (MLSA) and multilocus sequence typing (MLST)]. By using the AFLP and the MLSA methods, we allocated 25 strains of B. longum into three major clusters corresponding to the three subspecies; the cluster comprising the strains of B. longum subsp. suis was further divided into two subclusters differentiable by the ability to produce urease. By using the MLST method, the 25 strains of B. longum were divided into eight groups: four major groups corresponding to the results obtained by AFLP and MLSA, plus four minor disparate groups. The results of AFLP, MLSA and MLST analyses were consistent and revealed a novel subspeciation of B. longum, which comprised three known subspecies and a novel subspecies of urease-negative B. longum, for which the name B. longum subsp. suillum subsp. nov. is proposed, with type strain Su 851(T)=DSM 28597(T)=JCM 19995(T).
Affiliation(s)
- Emiko Yanokura
- Yakult Central Institute, 5-11 Izumi, Kunitachi, Tokyo 186-8650, Japan
- Kaihei Oki
- Yakult Honsha European Research Center for Microbiology ESV, Technologiepark 4, 9052 Zwijnaarde, Belgium
- Hiroshi Makino
- Yakult Central Institute, 5-11 Izumi, Kunitachi, Tokyo 186-8650, Japan
- Monica Modesto
- Department of Agricultural Sciences, University of Bologna, Viale Fanin 42, 40127 Bologna, Italy
- Bruno Pot
- Lactic acid Bacteria and Mucosal Immunity Team, Institut Pasteur de Lille, Rue Prof. Calmette, F-59019 Lille Cedex, France; Center for Infection and Immunity of Lille, F-59019 Lille, France; Université Lille Nord de France, F-59019 Lille, France; CNRS, UMR 8204, F-59019 Lille, France
- Paola Mattarelli
- Department of Agricultural Sciences, University of Bologna, Viale Fanin 42, 40127 Bologna, Italy
- Bruno Biavati
- Department of Agricultural Sciences, University of Bologna, Viale Fanin 42, 40127 Bologna, Italy
- Koichi Watanabe
- Department of Animal Science and Technology, National Taiwan University, No. 50, Lane 155, Sec 3, Keelung Rd., Taipei 10673, Taiwan, ROC.
3
Zuo G, Li Q, Hao B. On K-peptide length in composition vector phylogeny of prokaryotes. Comput Biol Chem 2014; 53 Pt A:166-73. [PMID: 25205031 DOI: 10.1016/j.compbiolchem.2014.08.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Accepted: 07/11/2014] [Indexed: 11/25/2022]
Abstract
Using an enlarged alphabet of K-tuples is the way to carry out alignment-free comparison of genomes in the composition vector (CV) approach to prokaryotic phylogeny. We summarize the known aspects concerning the choice of K and examine the results of using CVs with subtraction of a statistical background for K=3-9 and using raw CVs without subtraction for K=1-12. The criterion for evaluation consists in direct comparison with taxonomy. For prokaryotes the best performances are obtained for K=5 and 6 with subtraction and for K=11, 12 or even more without subtraction. In general, CVs with subtractions are slightly better and less CPU consuming, but CVs without subtraction may provide complementary information.
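The idea of comparing genomes through K-tuple composition can be illustrated with a minimal Python sketch. It builds only the raw composition vector (no background subtraction) for a single sequence, and it compares two vectors with a cosine-based dissimilarity of the form (1 - C)/2. That distance is an assumption chosen for illustration, since the abstract does not state which distance is used; the function names `composition_vector` and `cv_distance` are hypothetical.

```python
from collections import Counter
from math import sqrt


def composition_vector(sequence: str, k: int) -> dict[str, float]:
    """Raw K-tuple frequency vector of one sequence (no background subtraction)."""
    if k < 1 or len(sequence) < k:
        raise ValueError("k must be between 1 and the sequence length")
    # Slide a window of width k over the sequence and count each K-tuple.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cv_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine-based dissimilarity (1 - C) / 2 between two composition vectors.

    Assumed form, for illustration only. For raw (non-negative) vectors the
    value lies between 0 (same direction) and 0.5 (no shared K-tuple).
    """
    dot = sum(v * b.get(kmer, 0.0) for kmer, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    if norm == 0.0:
        raise ValueError("composition vectors must be non-empty")
    return (1.0 - dot / norm) / 2.0
```

Larger K makes the vectors sparser, which is one way to read the abstract's observation that raw vectors need a longer K than background-subtracted ones.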
Affiliation(s)
- Guanghong Zuo
- T-Life Research Center, Fudan University, Shanghai 200433, China
- Qiang Li
- CAS-MPG Partner Institute for Computational Biology, Shanghai 200032, China
- Bailin Hao
- T-Life Research Center, Fudan University, Shanghai 200433, China.
4
Killer J, Havlík J, Vlková E, Rada V, Pechar R, Benada O, Kopečný J, Kofroňová O, Sechovcová H. Lactobacillus rodentium sp. nov., from the digestive tract of wild rodents. Int J Syst Evol Microbiol 2014; 64:1526-1533. [DOI: 10.1099/ijs.0.054924-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
Three strains of regular, long, Gram-stain-positive bacterial rods were isolated using TPY, M.R.S. and Rogosa agar under anaerobic conditions from the digestive tract of wild mice (Mus musculus). All 16S rRNA gene sequences of these isolates were most similar to sequences of Lactobacillus gasseri ATCC 33323T and Lactobacillus johnsonii ATCC 33200T (97.3 % and 97.2 % sequence similarities, respectively). The novel strains shared 99.2–99.6 % 16S rRNA gene sequence similarities. Type strains of L. gasseri and L. johnsonii were also most related to the newly isolated strains according to rpoA (83.9–84.0 % similarities), pheS (84.6–87.8 %), atpA (86.2–87.7 %), hsp60 (89.4–90.4 %) and tuf (92.7–93.6 %) gene sequence similarities. Phylogenetic studies based on 16S rRNA, hsp60, rpoA, atpA and pheS gene sequences, other genotypic and many phenotypic characteristics (results of API 50 CHL, Rapid ID 32A and API ZYM biochemical tests; cellular fatty acid profiles; cellular polar lipid profiles; end products of glucose fermentation) showed that these bacterial strains represent a novel species within the genus Lactobacillus. The name Lactobacillus rodentium sp. nov. is proposed to accommodate this group of new isolates. The type strain is MYMRS/TLU1T (= DSM 24759T = CCM 7945T).
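Sequence similarity percentages such as those quoted above are typically derived from a pairwise alignment. The sketch below shows one simple way to compute percent identity from two already-aligned sequences in Python. It assumes that columns containing a gap in either sequence are skipped; the abstract does not say which software or gap convention the authors used, so both that convention and the function name `percent_identity` are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        # Assumption: a column with a gap in either sequence is not scored.
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

Other conventions (for example, counting gap columns as mismatches) give lower values for the same alignment, so reported similarities are comparable only when the convention is the same.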
Affiliation(s)
- J. Killer
- Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Microbiology, Nutrition and Dietetics, Kamýcká 129, Prague 6 – Suchdol 165 21, Czech Republic
- Institute of Animal Physiology and Genetics v.v.i., Academy of Sciences of the Czech Republic, Vídeňská 1083, Prague 4 – Krč 142 20, Czech Republic
- J. Havlík
- Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Microbiology, Nutrition and Dietetics, Kamýcká 129, Prague 6 – Suchdol 165 21, Czech Republic
- E. Vlková
- Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Microbiology, Nutrition and Dietetics, Kamýcká 129, Prague 6 – Suchdol 165 21, Czech Republic
- V. Rada
- Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Microbiology, Nutrition and Dietetics, Kamýcká 129, Prague 6 – Suchdol 165 21, Czech Republic
- R. Pechar
- Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Microbiology, Nutrition and Dietetics, Kamýcká 129, Prague 6 – Suchdol 165 21, Czech Republic
- O. Benada
- Department of Biology, Faculty of Science, J. E. Purkyně University in Ústí nad Labem, Za Válcovnou 1000/8, 400 96 Ústí nad Labem, Czech Republic
- Laboratory of Molecular Structure Characterization, Institute of Microbiology, v.v.i., Academy of Sciences of the Czech Republic, Vídeňská 1083, Prague 4 – Krč 142 20, Czech Republic
- J. Kopečný
- Institute of Animal Physiology and Genetics v.v.i., Academy of Sciences of the Czech Republic, Vídeňská 1083, Prague 4 – Krč 142 20, Czech Republic
- O. Kofroňová
- Laboratory of Molecular Structure Characterization, Institute of Microbiology, v.v.i., Academy of Sciences of the Czech Republic, Vídeňská 1083, Prague 4 – Krč 142 20, Czech Republic
- H. Sechovcová
- Institute of Animal Physiology and Genetics v.v.i., Academy of Sciences of the Czech Republic, Vídeňská 1083, Prague 4 – Krč 142 20, Czech Republic
5
Reclassification of Bifidobacterium stercoris Kim et al. 2010 as a later heterotypic synonym of Bifidobacterium adolescentis. Int J Syst Evol Microbiol 2013; 63:4350-4353. [DOI: 10.1099/ijs.0.054957-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022]
Abstract
The taxonomic position of Bifidobacterium stercoris Eg1T (= JCM 15918T) based on comparative 16S rRNA gene and hsp60 sequence analyses was found to be controversial, as the strain showed high similarity to the type strain of Bifidobacterium adolescentis, CCUG 18363T. Therefore, the relationship between the two species was investigated by a taxonomic study that included, in addition to re-evaluation of the 16S rRNA gene sequence, determination of DNA–DNA binding and multilocus sequence analysis (MLSA) of housekeeping genes encoding the DNA-directed RNA polymerase B subunit (rpoC), putative xylulose-5-phosphate/fructose-6-phosphate phosphoketolase (xfp), elongation factor EF-G (fusA), 50S ribosomal protein L2 (rplB) and DNA gyrase B subunit (gyrB). Comparative 16S rRNA gene sequence analysis showed relatively high similarity (98.9 %) between B. stercoris KCTC 5756T and B. adolescentis ATCC 15703T. MLSA revealed close relatedness between B. stercoris KCTC 5756T and B. adolescentis CCUG 18363T, with 99.3–100 % similarity between the rpoC, xfp, fusA, rplB and gyrB gene sequences. In addition, relatively high dnaJ1 gene sequence similarity of 97.7 % was found between the strains. Similar phenotypes and a high DNA–DNA binding value (78.9 %) confirmed that B. stercoris and B. adolescentis are synonymous. Based on these results, it is proposed that the species Bifidobacterium stercoris Kim et al. 2010 should be reclassified as a later heterotypic synonym of Bifidobacterium adolescentis Reuter 1963 (Approved Lists 1980).
6
Prokaryotic phylogenies inferred from whole-genome sequence and annotation data. BioMed Research International 2013; 2013:409062. [PMID: 24073404 PMCID: PMC3773407 DOI: 10.1155/2013/409062] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/15/2013] [Revised: 06/26/2013] [Accepted: 07/22/2013] [Indexed: 11/25/2022]
Abstract
Phylogenetic trees are used to represent the evolutionary relationship among various groups of species. In this paper, a novel method for inferring prokaryotic phylogenies using multiple genomic information is proposed. The method is called CGCPhy and based on the distance matrix of orthologous gene clusters between whole-genome pairs. CGCPhy comprises four main steps. First, orthologous genes are determined by sequence similarity, genomic function, and genomic structure information. Second, genes involving potential HGT events are eliminated, since such genes are considered to be the highly conserved genes across different species and the genes located on fragments with abnormal genome barcode. Third, we calculate the distance of the orthologous gene clusters between each genome pair in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across different species. CGCPhy has been examined on different datasets from 617 complete single-chromosome prokaryotic genomes and achieved applicative accuracies on different species sets in agreement with Bergey's taxonomy in quartet topologies. Simulation results show that CGCPhy achieves high average accuracy and has a low standard deviation on different datasets, so it has an applicative potential for phylogenetic analysis.
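The final step named in the abstract, neighbor joining on a distance matrix, can be illustrated by its core selection rule. The Python sketch below computes the standard neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - Σ_k d(i, k) - Σ_k d(j, k) and returns the pair with the smallest value. It covers only one selection step, not the full tree construction, and it is not CGCPhy's own code; the function name `nj_pick_pair` and the example matrix are illustrative assumptions.

```python
from itertools import combinations


def nj_pick_pair(dist: list[list[float]]) -> tuple[tuple[int, int], float]:
    """Return the pair (i, j) minimising the neighbor-joining Q criterion, and its Q."""
    n = len(dist)
    if n < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    # Row sums give the total distance from each taxon to all others.
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = (0, 1), float("inf")
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if q < best_q:  # strict comparison keeps the first pair on ties
            best_pair, best_q = (i, j), q
    return best_pair, best_q


# Small symmetric example matrix for five hypothetical genomes.
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pick_pair(example))  # ((0, 1), -50)
```

A full neighbor-joining run repeats this step: the chosen pair is merged into a new node, the distances to that node are recomputed, and the process continues until the tree is resolved. Note that the closest pair in raw distance (genomes 3 and 4, distance 3) is not the pair selected, because the criterion also accounts for how far each taxon is from all the others.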
7
Abstract
We have developed a semi-automatic methodology to reconstruct the phylogenetic species tree in Protozoa, integrating different phylogenetic algorithms and programs, and demonstrating the utility of a supermatrix approach to construct phylogenomics-based trees using 31 universal orthologs (UO). The species tree obtained was formed by three major clades that were related to three groups of data: i) species containing at least 80% of UO (25/31) in the concatenated multiple alignment or supermatrix, called C1; ii) species containing 50%–79% (15–24/31) of UO, called C2; and iii) species containing less than 50% (1–14/31) of UO, called C3. C1 was composed only of protozoan species, C2 was composed of species related to Protozoa, and C3 was composed of some species of C1 (Protozoa) and C2 (related to Protozoa). Our phylogenomics-based methodology using a supermatrix approach proved to be reliable with protozoan genome data and using at least 25 UO, suggesting that (a) the more UO used the better, and (b) using the entire UO sequence or just a conserved block of it for the supermatrix produced similar phylogenomic trees.
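The three data groups described in the abstract are defined by how many of the 31 universal orthologs a species contributes to the supermatrix. A minimal Python sketch of that grouping, using the count boundaries given in the abstract (25–31, 15–24, 1–14), might look as follows; the function name `uo_group` and the constant names are hypothetical.

```python
TOTAL_UO = 31      # universal orthologs in the supermatrix (from the abstract)
C1_MIN_COUNT = 25  # "at least 80% of UO (25/31)"
C2_MIN_COUNT = 15  # "15-24/31"


def uo_group(n_present: int) -> str:
    """Assign a species to group C1, C2 or C3 from its number of UO present."""
    if not 1 <= n_present <= TOTAL_UO:
        raise ValueError(f"expected a count between 1 and {TOTAL_UO}")
    if n_present >= C1_MIN_COUNT:
        return "C1"
    if n_present >= C2_MIN_COUNT:
        return "C2"
    return "C3"
```

The sketch compares counts rather than percentages, which avoids rounding questions at the boundaries (25/31 is about 80.6%).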
8
Analyses of bifidobacterial prophage-like sequences. Antonie Van Leeuwenhoek 2010; 98:39-50. [DOI: 10.1007/s10482-010-9426-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 02/02/2010] [Accepted: 03/03/2010] [Indexed: 10/19/2022]
9
Lin GN, Cai Z, Lin G, Chakraborty S, Xu D. ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets. BMC Bioinformatics 2009; 10 Suppl 1:S5. [PMID: 19208152 PMCID: PMC2648732 DOI: 10.1186/1471-2105-10-s1-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Abstract
Background: With the increasing availability of whole genome sequences, it is becoming more and more important to use complete genome sequences for inferring species phylogenies. We developed a new tool ComPhy, 'Composite Distance Phylogeny', based on a composite distance matrix calculated from the comparison of complete gene sets between genome pairs to produce a prokaryotic phylogeny. Results: The composite distance between two genomes is defined by three components: Gene Dispersion Distance (GDD), Genome Breakpoint Distance (GBD) and Gene Content Distance (GCD). GDD quantifies the dispersion of orthologous genes along the genomic coordinates from one genome to another; GBD measures the shared breakpoints between two genomes; GCD measures the level of shared orthologs between two genomes. The phylogenetic tree is constructed from the composite distance matrix using a neighbor joining method. We tested our method on 9 datasets from 398 completely sequenced prokaryotic genomes. We have achieved above 90% agreement in quartet topologies between the tree created by our method and the tree from the Bergey's taxonomy. In comparison to several other phylogenetic analysis methods, our method showed consistently better performance. Conclusion: ComPhy is a fast and robust tool for genome-wide inference of evolutionary relationship among genomes. It can be downloaded from .
Affiliation(s)
- Guan Ning Lin
- Digital Biology Laboratory, Informatics Institute, Computer Science Department and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
10
Exploring the diversity of the bifidobacterial population in the human intestinal tract. Appl Environ Microbiol 2009; 75:1534-45. [PMID: 19168652 DOI: 10.1128/aem.02216-08] [Citation(s) in RCA: 230] [Impact Index Per Article: 15.3] [Indexed: 11/20/2022]
Abstract
Although the health-promoting roles of bifidobacteria are widely accepted, the diversity of bifidobacteria among the human intestinal microbiota is still poorly understood. We performed a census of bifidobacterial populations from human intestinal mucosal and fecal samples by plating them on selective medium, coupled with molecular analysis of selected rRNA gene sequences (16S rRNA gene and internally transcribed spacer [ITS] 16S-23S spacer sequences) of isolated colonies. A total of 900 isolates were collected, of which 704 were shown to belong to bifidobacteria. Analyses showed that the culturable bifidobacterial population from intestinal and fecal samples includes six main phylogenetic taxa, i.e., Bifidobacterium longum, Bifidobacterium pseudocatenulatum, Bifidobacterium adolescentis, Bifidobacterium pseudolongum, Bifidobacterium breve, and Bifidobacterium bifidum, and two species mostly detected in fecal samples, i.e., Bifidobacterium dentium and Bifidobacterium animalis subsp. lactis. Analysis of bifidobacterial distribution based on age of the subject revealed that certain identified bifidobacterial species were exclusively present in the adult human gut microbiota whereas others were found to be widely distributed. We encountered significant intersubject variability and composition differences between fecal and mucosa-adherent bifidobacterial communities. In contrast, a modest diversification of bifidobacterial populations was noticed between different intestinal regions within the same individual (intrasubject variability). Notably, a small number of bifidobacterial isolates were shown to display a wide ecological distribution, thus suggesting that they possess a broad colonization capacity.
11
Almeida FC, Leszczyniecka M, Fisher PB, DeSalle R. Examining Ancient Inter-domain Horizontal Gene Transfer. Evol Bioinform Online 2008. [DOI: 10.1177/117693430800400002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]
Abstract
Details of the genomic changes that occurred in the ancestors of Eukarya, Archaea and Bacteria are elusive. Ancient interdomain horizontal gene transfer (IDHGT) amongst the ancestors of these three domains has been difficult to detect and analyze because of the extreme degree of divergence of genes in these three domains and because most evidence for such events is poorly supported. In addition, many researchers have suggested that the prevalence of IDHGT events early in the evolution of life would most likely obscure the patterns of divergence of major groups of organisms, let alone allow the tracking of horizontal transfer at this level. In order to approach this problem, we mined the E. coli genome for genes with distinct paralogs. Using the 1,268 E. coli K-12 genes with 40% or higher similarity level to a paralog elsewhere in the E. coli genome, we detected 95 genes found exclusively in Bacteria and Archaea and 86 genes found in Bacteria and Eukarya. These genes form the basis for our analysis of IDHGT. We also applied a newly developed statistical test (the node height test) to examine the robustness of these inferences and to corroborate the phylogenetically identified cases of ancient IDHGT. Our results suggest that ancient interdomain HGT is restricted to special cases, mostly involving symbiosis in eukaryotes and specific adaptations in prokaryotes. Only three genes in the Bacteria + Eukarya class (Deoxyxylulose-5-phosphate synthase (DXPS), fructose 1,6-phosphate aldolase class II protein and glucosamine-6-phosphate deaminase) and three genes in the Bacteria + Archaea class (ABC-type Fe3+-siderophore transport system, ferrous iron transport protein B, and dipeptide transport protein) showed evidence of ancient IDHGT. However, we conclude that robust estimates of IDHGT will be very difficult to obtain due to the methodological limitations and the extreme sequence saturation of the genes suspected of being involved in IDHGT.
Affiliation(s)
- Francisca C. Almeida
- Department of Biology, New York University, New York, NY
- Sackler Institute for Comparative Genomics, American Museum of Natural History, 79th Street @ Central Park West, New York 10024, U.S.A
- Magdalena Leszczyniecka
- Departments of Pathology, Urology and Neurosurgery, Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, College of Physicians and Surgeons, New York, U.S.A
- Paul B. Fisher
- Departments of Pathology, Urology and Neurosurgery, Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, College of Physicians and Surgeons, New York, U.S.A
- Rob DeSalle
- Department of Biology, New York University, New York, NY
- Sackler Institute for Comparative Genomics, American Museum of Natural History, 79th Street @ Central Park West, New York 10024, U.S.A
12
Gao L, Qi J, Sun J, Hao B. Prokaryote phylogeny meets taxonomy: an exhaustive comparison of composition vector trees with systematic bacteriology. Sci China C Life Sci 2008; 50:587-99. [PMID: 17879055 DOI: 10.1007/s11427-007-0084-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 04/16/2007] [Accepted: 07/21/2007] [Indexed: 10/22/2022]
Abstract
We perform an exhaustive, taxon by taxon, comparison of the branchings in the composition vector trees (CVTrees) inferred from 432 prokaryotic genomes available on 31 December 2006, with the bacteriologists' taxonomy, primarily the latest online Outline of the Bergey's Manual of Systematic Bacteriology. The CVTree phylogeny agrees very well with the Bergey's taxonomy in the majority of fine branchings and overall structures. At the same time, most of the differences between the trees and the Manual have been known to biologists to some extent and may hint at taxonomic revisions. Instead of demonstrating the overwhelming agreement, this paper puts emphasis on the biological implications of the differences.
Affiliation(s)
- Lei Gao
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100080, China
13
Brown DR, Whitcomb RF, Bradbury JM. Revised minimal standards for description of new species of the class Mollicutes (division Tenericutes). Int J Syst Evol Microbiol 2008; 57:2703-2719. [PMID: 17978244 DOI: 10.1099/ijs.0.64722-0] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Indexed: 11/18/2022]
Abstract
Minimal standards for novel species of the class Mollicutes (trivial term, mollicutes), last published in 1995, require revision. The International Committee on Systematics of Prokaryotes Subcommittee on the Taxonomy of Mollicutes proposes herein revised standards that reflect recent advances in molecular systematics and the species concept for prokaryotes. The mandatory requirements are: (i) deposition of the type strain into two recognized culture collections, preferably located in different countries; (ii) deposition of the 16S rRNA gene sequence into a public database, and a phylogenetic analysis of the relationships among the 16S rRNA gene sequences of the novel species and its neighbours; (iii) deposition of antiserum against the type strain into a recognized collection; (iv) demonstration, by using the combination of 16S rRNA gene sequence analyses, serological analyses and supplementary phenotypic data, that the type strain differs significantly from all previously named species; and (v) assignment to an order, a family and a genus in the class, with an appropriate specific epithet. The 16S rRNA gene sequence provides the primary basis for assignment to hierarchical rank, and may also constitute evidence of species novelty, but serological and supplementary phenotypic data must be presented to substantiate this. Serological methods have been documented to be congruent with DNA-DNA hybridization data and with 16S rRNA gene placements. The novel species must be tested serologically to the greatest extent that the investigators deem feasible against all neighbouring species whose 16S rRNA gene sequences show >0.94 similarity. The investigator is responsible for justifying which characters are most meaningful for assignment to the part of the mollicute phylogenetic tree in which a novel species is located, and for providing the means by which novel species can be identified by other investigators. 
The publication of the description should appear in a journal having wide circulation. If the journal is not the International Journal of Systematic and Evolutionary Microbiology, copies of the publication must be submitted to that journal so that the name may be considered for inclusion in a Validation List as required by the International Code of Bacteriological Nomenclature (the Bacteriological Code). Updated informal descriptions of the class Mollicutes and some of its constituent higher taxa are available as supplementary material in IJSEM Online.
Affiliation(s)
- Daniel R Brown
- Department of Infectious Diseases and Pathology, College of Veterinary Medicine, University of Florida, Gainesville, FL 32610-0880, USA
| | - Robert F Whitcomb
- Collaborator, Vegetable Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Beltsville, MD 20705, USA
| | - Janet M Bradbury
- Department of Veterinary Pathology, University of Liverpool, Leahurst, Neston, CH64 7TE, UK
|
14
|
Ventura M, Canchaya C, Casale AD, Dellaglio F, Neviani E, Fitzgerald GF, van Sinderen D. Analysis of bifidobacterial evolution using a multilocus approach. Int J Syst Evol Microbiol 2007; 56:2783-2792. [PMID: 17158978 DOI: 10.1099/ijs.0.64233-0] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.2] [Indexed: 11/18/2022]
Abstract
Bifidobacteria represent one of the most numerous groups of bacteria found in the gastrointestinal tract of humans and animals. In man, gastrointestinal bifidobacteria are associated with health effects and for this reason they are often used as functional ingredients in food and pharmaceutical products. Such applications may benefit from or require a clear and reliable bifidobacterial species identification. The increasing number of available bacterial genome sequences has provided a large amount of housekeeping gene sequences that can be used both for identification of bifidobacterial species as well as for understanding bifidobacterial evolution. In order to assess their relative positions in the evolutionary process, fragments from seven conserved genes, clpC, dnaB, dnaG, dnaJ1, purF, rpoC and xfp, were sequenced from each of the currently described type strains of the genus Bifidobacterium. The results demonstrate that the concatenation of these seven gene sequences for phylogenetic purposes allows a significant increase in the discriminatory power between taxa.
Affiliation(s)
- Marco Ventura
- Department of Genetics, Anthropology and Evolution, University of Parma, Parco Area delle Scienze 11a, 43100 Parma, Italy
- Alimentary Pharmabiotic Centre and Department of Microbiology, Bioscience Institute, National University of Ireland, Western Road, Cork, Ireland
| | - Carlos Canchaya
- Alimentary Pharmabiotic Centre and Department of Microbiology, Bioscience Institute, National University of Ireland, Western Road, Cork, Ireland
| | | | - Franco Dellaglio
- Dipartimento Scientifico e Tecnologico, University of Verona, Italy
| | - Erasmo Neviani
- Department of Genetics, Anthropology and Evolution, University of Parma, Parco Area delle Scienze 11a, 43100 Parma, Italy
| | - Gerald F Fitzgerald
- Alimentary Pharmabiotic Centre and Department of Microbiology, Bioscience Institute, National University of Ireland, Western Road, Cork, Ireland
| | - Douwe van Sinderen
- Alimentary Pharmabiotic Centre and Department of Microbiology, Bioscience Institute, National University of Ireland, Western Road, Cork, Ireland
|
15
|
Dutilh BE, van Noort V, van der Heijden RTJM, Boekhout T, Snel B, Huynen MA. Assessment of phylogenomic and orthology approaches for phylogenetic inference. Bioinformatics 2007; 23:815-24. [PMID: 17237036 DOI: 10.1093/bioinformatics/btm015] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION Phylogenomics integrates the vast amount of phylogenetic information contained in complete genome sequences, and is rapidly becoming the standard for reliably inferring species phylogenies. There are, however, fundamental differences between the ways in which phylogenomic approaches like gene content, superalignment, superdistance and supertree integrate the phylogenetic information from separate orthologous groups. Furthermore, they all depend on the method by which the orthologous groups are initially determined. Here, we systematically compare these four phylogenomic approaches, in parallel with three approaches for large-scale orthology determination: pairwise orthology, cluster orthology and tree-based orthology. RESULTS Including various phylogenetic methods, we apply a total of 54 fully automated phylogenomic procedures to the fungi, the eukaryotic clade with the largest number of sequenced genomes, for which we retrieved a golden standard phylogeny from the literature. Phylogenomic trees based on gene content show, relative to the other methods, a bias in the tree topology that parallels convergence in lifestyle among the species compared, indicating convergence in gene content. CONCLUSIONS Complete genomes are no guarantee for good or even consistent phylogenies. However, the large amounts of data in genomes enable us to carefully select the data most suitable for phylogenomic inference. In terms of performance, the superalignment approach, combined with restrictive orthology, is the most successful in recovering a fungal phylogeny that agrees with current taxonomic views, and allows us to obtain a high-resolution phylogeny. We provide solid support for what has grown to be a common practice in phylogenomics during its advance in recent years. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- B E Dutilh
- Center for Molecular and Biomolecular Informatics/Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands.
|
16
|
Macario AJL, Brocchieri L, Shenoy AR, Conway de Macario E. Evolution of a Protein-Folding Machine: Genomic and Evolutionary Analyses Reveal Three Lineages of the Archaeal hsp70(dnaK) Gene. J Mol Evol 2006; 63:74-86. [PMID: 16788741 DOI: 10.1007/s00239-005-6207-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 08/30/2005] [Accepted: 03/14/2006] [Indexed: 11/27/2022]
Abstract
The stress chaperone protein Hsp70 (DnaK) (abbreviated DnaK) and its co-chaperones Hsp40(DnaJ) (or DnaJ) and GrpE are universal in bacteria and eukaryotes but occur only in some archaea clustered in the order 5'-grpE-dnaK-dnaJ-3' in a locus termed Locus I. Three structural varieties of Locus I, termed Types I, II, and III, were identified, respectively, in Methanosarcinales, in Thermoplasmatales and Methanothermobacter thermoautotrophicus, and in Halobacteriales. These Locus I types corresponded to three groups identified by phylogenetic trees of archaeal DnaK proteins including the same archaeal subdivisions. These archaeal DnaK groups were not significantly interrelated, clustering instead with DnaKs from three bacterial lineages, Methanosarcinales with Firmicutes, Thermoplasmatales and M. thermoautotrophicus with Thermotoga, and Halobacteriales with Actinobacteria, suggesting that the three archaeal types of Locus I were acquired by independent events of lateral gene transfer. These associations, however, lacked strong bootstrap support and were sensitive to dataset choice and tree-reconstruction method. Structural features of dnaK loci in bacteria revealed that Methanosarcinales and Firmicutes shared a similar structure, also common to most other bacterial groups. Structural differences were observed instead in Thermotoga compared to Thermoplasmatales and M. thermoautotrophicus, and in Actinobacteria compared to Halobacteriales. It was also found that the association between the DnaK sequences from Halobacteriales and Actinobacteria likely reflects common biases in their amino acid compositions. Although the loci structural features and the DnaK trees suggested the possibility of lateral gene transfer between Firmicutes and Methanosarcinales, the similarity between the archaeal and the ancestral bacterial loci favors the more parsimonious hypothesis that all archaeal sequences originated from a unique prokaryotic ancestor.
Affiliation(s)
- Alberto J L Macario
- Division of Molecular Medicine, Wadsworth Center, Room B-749, New York State Department of Health, Empire State Plaza, P.O. Box 509, Albany, NY 12201-0509, USA
|
17
|
Glasner ME, Fayazmanesh N, Chiang RA, Sakai A, Jacobson MP, Gerlt JA, Babbitt PC. Evolution of structure and function in the o-succinylbenzoate synthase/N-acylamino acid racemase family of the enolase superfamily. J Mol Biol 2006; 360:228-50. [PMID: 16740275 DOI: 10.1016/j.jmb.2006.04.055] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Received: 02/28/2006] [Revised: 04/22/2006] [Accepted: 04/25/2006] [Indexed: 11/30/2022]
Abstract
Understanding how proteins evolve to provide both exquisite specificity and proficient activity is a fundamental problem in biology that has implications for protein function prediction and protein engineering. To study this problem, we analyzed the evolution of structure and function in the o-succinylbenzoate synthase/N-acylamino acid racemase (OSBS/NAAAR) family, part of the mechanistically diverse enolase superfamily. Although all characterized members of the family catalyze the OSBS reaction, this family is extraordinarily divergent, with some members sharing <15% identity. In addition, a member of this family, Amycolatopsis OSBS/NAAAR, is promiscuous, catalyzing both dehydration and racemization. Although the OSBS/NAAAR family appears to have a single evolutionary origin, no sequence or structural motifs unique to this family could be identified; all residues conserved in the family are also found in enolase superfamily members that have different functions. Based on their species distribution, several uncharacterized proteins similar to Amycolatopsis OSBS/NAAAR appear to have been transmitted by lateral gene transfer. Like Amycolatopsis OSBS/NAAAR, these might have additional or alternative functions to OSBS because many are from organisms lacking the pathway in which OSBS is an intermediate. In addition to functional differences, the OSBS/NAAAR family exhibits surprising structural variations, including large differences in orientation between the two domains. These results offer several insights into protein evolution. First, orthologous proteins can exhibit significant structural variation, and specificity can be maintained with little conservation of ligand-contacting residues. Second, the discovery of a set of proteins similar to Amycolatopsis OSBS/NAAAR supports the hypothesis that new protein functions evolve through promiscuous intermediates. Finally, a combination of evolutionary, structural, and sequence analyses identified characteristics that might prime proteins, such as Amycolatopsis OSBS/NAAAR, for the evolution of new activities.
Affiliation(s)
- Margaret E Glasner
- Department of Biopharmaceutical Sciences, University of California, San Francisco, CA 94143, USA
|
18
|
Arvestad L. Efficient Methods for Estimating Amino Acid Replacement Rates. J Mol Evol 2006; 62:663-73. [PMID: 16752207 DOI: 10.1007/s00239-004-0113-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 04/11/2004] [Accepted: 01/17/2006] [Indexed: 11/30/2022]
Abstract
Replacement rate matrices describe the process of evolution at one position in a protein and are used in many applications where proteins are studied with an evolutionary perspective. Several general matrices have been suggested and have proved to be good approximations of the real process. However, there are data for which general matrices are inappropriate, for example, special protein families, certain lineages in the tree of life, or particular parts of proteins. Analysis of such data could benefit from adaption of a data-specific rate matrix. This paper suggests two new methods for estimating replacement rate matrices from independent pairwise protein sequence alignments and also carefully studies Müller-Vingron's resolvent method. Comprehensive tests on synthetic datasets show that both new methods perform better than the resolvent method in a variety of settings. The best method is furthermore demonstrated to be robust on small datasets as well as practical on very large datasets of real data. Neither short nor divergent sequence pairs have to be discarded, making the method economical with data. A generalization to multialignment data is suggested and used in a test on protein-domain family phylogenies, where it is shown that the method offers family-specific rate matrices that often have a significantly better likelihood than a general matrix.
Affiliation(s)
- Lars Arvestad
- Stockholm Bioinformatics Center, Albanova University Center, Royal Institute of Technology (KTH), SE-100 44, Stockholm, Sweden.
|
19
|
Abstract
Exponentially accumulating genetic molecular data were supposed to bring us closer to resolving one of the most fundamental issues in biology: the reconstruction of the tree of life. This tree should encompass the evolutionary history of all living creatures on earth and trace back a few billions of years to the most ancient microbial ancestor. Ironically, this abundance of data only blurs our traditional beliefs and seems to make this goal harder to achieve than initially thought. This is largely due to lateral gene transfer, the passage of genetic material between organisms not through lineal descent. Evolution in light of lateral transfer tangles the traditional universal tree of life, turning it into a network of relationships. Lateral transfer is a significant factor in microbial evolution and is the mechanism of antibiotic resistance spread in bacteria species. In this paper we survey current methods designed to cope with lateral transfer in conjunction with vertical inheritance. We distinguish between phylogenetic-based methods and sequence-based methods and illuminate the advantages and disadvantages of each. Finally, we sketch a new statistically rigorous approach aimed at identifying lateral transfer between two genomes.
Affiliation(s)
- Sagi Snir
- Institute of Evolution, University of Haifa, 31905 Haifa, Israel and Department of Computer Science, Netanya Academic College
|
20
|
Susko E, Leigh J, Doolittle WF, Bapteste E. Visualizing and assessing phylogenetic congruence of core gene sets: a case study of the gamma-proteobacteria. Mol Biol Evol 2006; 23:1019-30. [PMID: 16495350 DOI: 10.1093/molbev/msj113] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]
Abstract
Here, we address a much-debated topic: is there or is there not an organismal tree of gamma-proteobacteria that can be unambiguously inferred from a core of shared genes? We apply several recently developed analytical methods to this problem, for the first time. Our heat map analyses of P values and of bootstrap bipartitions show the presence of conflicting phylogenetic signals among these core genes. Our synthesis reconstruction suggests that at least 10% of these genes have been laterally transferred during the divergence of the gamma-proteobacteria, and that for most of the rest, there is too little phylogenetic signal to permit firm conclusions about the mode of inheritance. Although there is clearly a central tendency in this data set (it is far from random), lateral gene transfers cannot be ruled out. Instead of an organismal tree, we propose that these core genes could be used to define a more subtle and partially reticulated pattern of relationships.
Affiliation(s)
- E Susko
- Genome Atlantic, Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
|
21
|
Abstract
Genome trees are a means to capture the overwhelming amount of phylogenetic information that is present in genomes. Different formalisms have been introduced to reconstruct genome trees on the basis of various aspects of the genome. On the basis of these aspects, we separate genome trees into five classes: (a) alignment-free trees based on statistic properties of the genome, (b) gene content trees based on the presence and absence of genes, (c) trees based on chromosomal gene order, (d) trees based on average sequence similarity, and (e) phylogenomics-based genome trees. Despite their recent development, genome tree methods have already had some impact on the phylogenetic classification of bacterial species. However, their main impact so far has been on our understanding of the nature of genome evolution and the role of horizontal gene transfer therein. An ideal genome tree method should be capable of using all gene families, including those containing paralogs, in a phylogenomics framework capitalizing on existing methods in conventional phylogenetic reconstruction. We expect such sophisticated methods to help us resolve the branching order between the main bacterial phyla.
Affiliation(s)
- Berend Snel
- Center for Molecular and Biomolecular Informatics, Nijmegen, The Netherlands.
|
22
|
Xue H, Ng SK, Tong KL, Wong JTF. Congruence of evidence for a Methanopyrus-proximal root of life based on transfer RNA and aminoacyl-tRNA synthetase genes. Gene 2005; 360:120-30. [PMID: 16153784 DOI: 10.1016/j.gene.2005.06.027] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 03/08/2005] [Revised: 05/07/2005] [Accepted: 06/03/2005] [Indexed: 11/19/2022]
Abstract
Among 60 organisms, the intraspecies genetic distances between tRNAs cognate for different amino acids, between the initiator and elongator tRNAs for Met, and between potentially paralogous pairs of aminoacyl-tRNA synthetases are found to be at a minimum within the Methanopyrus kandleri genome. These results indicate an exact congruence between the evidence from tRNA and aminoacyl-tRNA synthetase genes locating the root of life closest to this organism.
Affiliation(s)
- Hong Xue
- Department of Biochemistry and Applied Genomics Laboratory, Hong Kong University of Science and Technology, Hong Kong, PR China
|
23
|
Ida T, Kugimiya M, Kogure M, Takahashi R, Tokuyama T. Phylogenetic relationships among ammonia-oxidizing bacteria as revealed by gene sequences of glyceraldehyde 3-phosphate dehydrogenase and phosphoglycerate kinase. J Biosci Bioeng 2005; 99:569-76. [PMID: 16233833 DOI: 10.1263/jbb.99.569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/12/2004] [Accepted: 03/11/2005] [Indexed: 11/17/2022]
Abstract
The three previously recognized genera of 'Nitrosolobus', Nitrosospira and 'Nitrosovibrio' were combined into one genus, Nitrosospira, on the basis of 16S rDNA sequence similarities. However, this classification has been controversial for some time, since the marked differences in their shapes suggest that they are not closely related. In this study, the phylogenetic analyses of the three groups using two genotypical markers, glyceraldehyde-3-phosphate dehydrogenase (GAP, gap), and 3-phosphoglycerate kinase (PGK, pgk), were performed. In the phylogenetic tree inferred from gap and pgk, the three genera appeared as clearly separated clusters. This is the first study of markers that are able to reveal the precise phylogenetic relationship among 'Nitrosolobus', Nitrosospira and 'Nitrosovibrio'.
Affiliation(s)
- Takeshi Ida
- College of Bioresource Sciences, Nihon University, 1866 Kameino, Fujisawa, Kanagawa 252-8510, Japan
|
24
|
MacLeod D, Charlebois RL, Doolittle F, Bapteste E. Deduction of probable events of lateral gene transfer through comparison of phylogenetic trees by recursive consolidation and rearrangement. BMC Evol Biol 2005; 5:27. [PMID: 15819979 PMCID: PMC1087482 DOI: 10.1186/1471-2148-5-27] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.7] [Received: 11/26/2004] [Accepted: 04/08/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND When organismal phylogenies based on sequences of single marker genes are poorly resolved, a logical approach is to add more markers, on the assumption that weak but congruent phylogenetic signal will be reinforced in such multigene trees. Such approaches are valid only when the several markers indeed have identical phylogenies, an issue which many multigene methods (such as the use of concatenated gene sequences or the assembly of supertrees) do not directly address. Indeed, even when the true history is a mixture of vertical descent for some genes and lateral gene transfer (LGT) for others, such methods produce unique topologies. RESULTS We have developed software that aims to extract evidence for vertical and lateral inheritance from a set of gene trees compared against an arbitrary reference tree. This evidence is then displayed as a synthesis showing support over the tree for vertical inheritance, overlaid with explicit lateral gene transfer (LGT) events inferred to have occurred over the history of the tree. Like splits-tree methods, one can thus identify nodes at which conflict occurs. Additionally one can make reasonable inferences about vertical and lateral signal, assigning putative donors and recipients. CONCLUSION A tool such as ours can serve to explore the reticulated dimensionality of molecular evolution, by dissecting vertical and lateral inheritance at high resolution. By this, we mean that individual nodes can be examined not only for congruence, but also for coherence in light of LGT. We assert that our tools will facilitate the comparison of phylogenetic trees, and the interpretation of conflicting data.
Affiliation(s)
- Dave MacLeod
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
| | - Robert L Charlebois
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
| | - Ford Doolittle
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
| | - Eric Bapteste
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
|
25
|
Ait Tayeb L, Ageron E, Grimont F, Grimont PAD. Molecular phylogeny of the genus Pseudomonas based on rpoB sequences and application for the identification of isolates. Res Microbiol 2005; 156:763-73. [PMID: 15950132 DOI: 10.1016/j.resmic.2005.02.009] [Citation(s) in RCA: 202] [Impact Index Per Article: 10.6] [Received: 01/05/2005] [Revised: 02/16/2005] [Accepted: 02/25/2005] [Indexed: 11/22/2022]
Abstract
Phylogenetic relationships within the genus Pseudomonas were examined by comparing partial (about 1000 nucleotides) rpoB gene sequences. A total of 186 strains belonging to 75 species of Pseudomonas sensu stricto and related species were studied. The phylogenetic resolution of the rpoB tree was approximately three times higher than that of the rrs tree. Ribogroups published earlier correlated well with rpoB sequence clusters. The rpoB sequence database generated by this study was used for identification. A total of 89 isolates (79.5%) were identified to a named species, while 16 isolates (14.3%) corresponded to unnamed species, and 7 isolates (6.2%) had uncertain affiliation. rpoB sequencing is now being used for routine identification of Pseudomonas isolates in our laboratory.
Affiliation(s)
- Lyneda Ait Tayeb
- Unité de Biodiversité des Bactéries Pathogènes Emergentes, INSERM Unit 389, Institut Pasteur, 28 rue du Dr. Roux, 75724 Paris Cedex 15, France
|
26
|
Abstract
Horizontal gene transfer (HGT) plays a critical role in evolution across all domains of life with important biological and medical implications. I propose a simple class of stochastic models to examine HGT using multiple orthologous gene alignments. The models function in a hierarchical phylogenetic framework. The top level of the hierarchy is based on a random walk process in "tree space" that allows for the development of a joint probabilistic distribution over multiple gene trees and an unknown, but estimable species tree. I consider two general forms of random walks. The first form is derived from the subtree prune and regraft (SPR) operator that mirrors the observed effects that HGT has on inferred trees. The second form is based on walks over complete graphs and offers numerically tractable solutions for an increasing number of taxa. The bottom level of the hierarchy utilizes standard phylogenetic models to reconstruct gene trees given multiple gene alignments conditional on the random walk process. I develop a well-mixing Markov chain Monte Carlo algorithm to fit the models in a Bayesian framework. I demonstrate the flexibility of these stochastic models to test competing ideas about HGT by examining the complexity hypothesis. Using 144 orthologous gene alignments from six prokaryotes previously collected and analyzed, Bayesian model selection finds support for (1) the SPR model over the alternative form, (2) the 16S rRNA reconstruction as the most likely species tree, and (3) increased HGT of operational genes compared to informational genes.
Affiliation(s)
- Marc A Suchard
- Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, 90095-1766, USA.
|
27
|
Charlebois RL, Doolittle WF. Computing prokaryotic gene ubiquity: rescuing the core from extinction. Genome Res 2005; 14:2469-77. [PMID: 15574825 PMCID: PMC534671 DOI: 10.1101/gr.3024704] [Citation(s) in RCA: 137] [Impact Index Per Article: 7.2] [Indexed: 11/25/2022]
Abstract
The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly less than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it renders the problem of definition more problematic. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
Affiliation(s)
- Robert L Charlebois
- Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 1X5, Canada
|
28
|
Devulder G, de Montclos MP, Flandrois JP. A multigene approach to phylogenetic analysis using the genus Mycobacterium as a model. Int J Syst Evol Microbiol 2005; 55:293-302. [PMID: 15653890 DOI: 10.1099/ijs.0.63222-0] [Citation(s) in RCA: 197] [Impact Index Per Article: 10.4] [Indexed: 11/18/2022]
Abstract
Advances in DNA sequencing and the increasing number of sequences available in databases have greatly enhanced the bacterial identification process. Several species within the genus Mycobacterium cause serious human and animal diseases. In order to assess their relative positions in the evolutionary process, four gene fragments, from the 16S rRNA (564 bp), hsp65 (420 bp), rpoB (396 bp) and sod (408 bp) genes, were sequenced from 97 strains, including all available type strains of the genus Mycobacterium. The results demonstrate that, in this case, the concatenation of different genes allows significant increases in the power of discrimination and the robustness of the phylogenetic tree. The sequential and/or combined use of sequences of several genes makes it possible to refine the phylogenetic approach and provides a molecular basis for accurate species identification.
Affiliation(s)
- G Devulder
- UMR CNRS 5558, Laboratoire de bactériologie, Faculté de Médecine Lyon-Sud, BP 12, 69921 Oullins Cedex, France
| | | | - J P Flandrois
- Laboratoire de bactériologie, CHU Lyon Sud, 69495 Pierre-Bénite Cedex, France
- UMR CNRS 5558, Laboratoire de bactériologie, Faculté de Médecine Lyon-Sud, BP 12, 69921 Oullins Cedex, France
|
29
|
RIATA-HGT: A Fast and Accurate Heuristic for Reconstructing Horizontal Gene Transfer. LECTURE NOTES IN COMPUTER SCIENCE 2005. [DOI: 10.1007/11533719_11] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
|
30
|
Coenye T, Vandamme P. Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiology (Reading) 2003; 149:3507-3517. [PMID: 14663083 DOI: 10.1099/mic.0.26515-0] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022]
Abstract
The availability of an ever increasing number of complete genome sequences of diverse prokaryotic taxa has led to the introduction of novel approaches to infer phylogenetic relationships among bacteria. In the present study the sequences of the 16S rRNA gene and nine housekeeping genes were compared with the fraction of shared putative orthologous protein-encoding genes, conservation of gene order, dinucleotide relative abundance and codon usage among 11 genomes of species belonging to the lactic acid bacteria. In general there is a good correlation between the results obtained with various approaches, although it is clear that there is a stronger phylogenetic signal in some datasets than in others, and that different parameters have different taxonomic resolutions. It appears that trees based on different kinds of information derived from whole-genome sequencing projects do not provide much additional information about the phylogenetic relationships among bacterial taxa compared to more traditional alignment-based methods. Nevertheless, it is expected that the study of these novel forms of information will have its value in taxonomy, to determine which genes are shared, when genes or sets of genes were lost in evolutionary history, to detect the presence of horizontally transferred genes and/or confirm or enhance the phylogenetic signal derived from traditional methods. Although these conclusions are based on a relatively small dataset, they are largely in agreement with other studies and it is anticipated that similar trends will be observed when comparing other genomes.
Affiliation(s)
- Tom Coenye
- Laboratorium voor Microbiologie, Universiteit Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
- Peter Vandamme
- Laboratorium voor Microbiologie, Universiteit Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium
|
31
|
Chattopadhyay S, Chakrabarti J. Temporal changes in phosphoglycerate kinase coding sequences: a quantitative measure. J Comput Biol 2003; 10:83-93. [PMID: 12676052 DOI: 10.1089/106652703763255688] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
The ratio of the average of the square of the number of the nucleotides to that of the random sequence of the same strand bias is proposed as a quantitative measure of evolution in some coding DNA sequences. Applying this measure to the phosphoglycerate kinase gene we observe a monotonic rise of the ratio with evolution. We present an interpretation of this data on some bacteria.
Affiliation(s)
- Sujay Chattopadhyay
- Department of Theoretical Physics, Indian Association for the Cultivation of Science, Calcutta 700 032,
|
32
|
Bankier AT, Spriggs HF, Fartmann B, Konfortov BA, Madera M, Vogel C, Teichmann SA, Ivens A, Dear PH. Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum. Genome Res 2003; 13:1787-99. [PMID: 12869580 PMCID: PMC403770 DOI: 10.1101/gr.1555203] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.9] [Received: 02/28/2003] [Accepted: 05/19/2003] [Indexed: 11/24/2022]
Abstract
The apicomplexan Cryptosporidium parvum is one of the most prevalent protozoan parasites of humans. We report the physical mapping of the genome of the Iowa isolate, sequencing and analysis of chromosome 6, and approximately 0.9 Mbp of sequence sampled from the remainder of the genome. To construct a robust physical map, we devised a novel and general strategy, enabling accurate placement of clones regardless of clone artefacts. Analysis reveals a compact genome, unusually rich in membrane proteins. As in Plasmodium falciparum, the mean size of the predicted proteins is larger than that in other sequenced eukaryotes. We find several predicted proteins of interest as potential therapeutic targets, including one exhibiting similarity to the chloroquine resistance protein of Plasmodium. Coding sequence analysis argues against the conventional phylogenetic position of Cryptosporidium and supports an earlier suggestion that this genus arose from an early branching within the Apicomplexa. In agreement with this, we find no significant synteny and surprisingly little protein similarity with Plasmodium. Finally, we find two unusual and abundant repeats throughout the genome. Among sequenced genomes, one motif is abundant only in C. parvum, whereas the other is shared with (but has previously gone unnoticed in) all known genomes of the Coccidia and Haemosporida. These motifs appear to be unique in their structure, distribution and sequences.
Affiliation(s)
- Alan T Bankier
- Medical Research Council (MRC) Laboratory of Molecular Biology, Cambridge CB2 2QH, UK
|
33
|
Brooks DJ, Fresco JR, Lesk AM, Singh M. Evolution of amino acid frequencies in proteins over deep time: inferred order of introduction of amino acids into the genetic code. Mol Biol Evol 2003; 19:1645-55. [PMID: 12270892 DOI: 10.1093/oxfordjournals.molbev.a003988] [Citation(s) in RCA: 129] [Impact Index Per Article: 6.1] [Indexed: 11/14/2022]
Abstract
To understand more fully how amino acid composition of proteins has changed over the course of evolution, a method has been developed for estimating the composition of proteins in an ancestral genome. Estimates are based upon the composition of conserved residues in descendant sequences and empirical knowledge of the relative probability of conservation of various amino acids. Simulations are used to model and correct for errors in the estimates. The method was used to infer the amino acid composition of a large protein set in the Last Universal Ancestor (LUA) of all extant species. Relative to the modern protein set, LUA proteins were found to be generally richer in those amino acids that are believed to have been most abundant in the prebiotic environment and poorer in those amino acids that are believed to have been unavailable or scarce. It is proposed that the inferred amino acid composition of proteins in the LUA probably reflects historical events in the establishment of the genetic code.
Affiliation(s)
- Dawn J Brooks
- Department of Molecular Biology, Princeton University, New Jersey 08544, USA
|
34
|
Affiliation(s)
- James R Brown
- Bioinformatics Division, GlaxoSmithKline, 1250 South Collegeville Road, UP1345 Collegeville, Pennsylvania 19426, USA.
|
35
|
Doolittle WF, Boucher Y, Nesbø CL, Douady CJ, Andersson JO, Roger AJ. How big is the iceberg of which organellar genes in nuclear genomes are but the tip? Philos Trans R Soc Lond B Biol Sci 2003; 358:39-57; discussion 57-8. [PMID: 12594917 PMCID: PMC1693099 DOI: 10.1098/rstb.2002.1185] [Citation(s) in RCA: 144] [Impact Index Per Article: 6.9] [Indexed: 11/12/2022]
Abstract
As more and more complete bacterial and archaeal genome sequences become available, the role of lateral gene transfer (LGT) in shaping them becomes more and more clear. Over the long term, it may be the dominant force, affecting most genes in most prokaryotes. We review the history of LGT, suggesting reasons why its prevalence and impact were so long dismissed. We discuss various methods purporting to measure the extent of LGT, and evidence for and against the notion that there is a core of never-exchanged genes shared by all genomes, from which we can deduce the "true" organismal tree. We also consider evidence for, and implications of, LGT between prokaryotes and phagocytic eukaryotes.
Affiliation(s)
- W F Doolittle
- Genome Atlantic, Dalhousie University, 5850 College Street, Halifax, Nova Scotia B3H 1X5, Canada.
|
36
|
Abstract
Structural analyses on a small number of protein families have shown that residues in protein interfaces are more conserved than average amino acid residues. This is also true of other ligand-binding and active site residues. This raises the question whether protein interactions place additional constraints on sequence divergence beyond this general background of functional restrictions on all different types of proteins. In order to investigate this, the sequence identities of Saccharomyces cerevisiae (SC) proteins to their Schizosaccharomyces pombe (SP) orthologues were used as a measure of sequence divergence. The SC proteins were divided into those in stable complexes, those that participate in transient interactions and the remaining proteins. All types of proteins can undergo extensive divergence: all three sequence identity distributions range from less than 20 to over 90%. However, overall, protein interactions do place additional constraints on sequence divergence and the distributions differ significantly: proteins not known to be involved in interactions have an average sequence identity of 38% while this value is 46% for proteins in stable complexes. Proteins that have transient interactions are intermediate between the two, with an average sequence identity of 41%. This trend is independent of whether the proteins are involved in informational functions (transcription, translation and replication) or not and of protein dispensability.
Affiliation(s)
- Sarah A Teichmann
- MRC Laboratory of Molecular Biology, Hills Road, CB2 2QH, Cambridge, UK.
|
37
|
Abstract
Genome comparisons indicate that horizontal gene transfer and differential gene loss are major evolutionary phenomena that, at least in prokaryotes, involve a large fraction, if not the majority, of genes. The extent of these events casts doubt on the feasibility of constructing a 'Tree of Life', because the trees for different genes often tell different stories. However, alternative approaches to tree construction that attempt to determine tree topology on the basis of comparisons of complete gene sets seem to reveal a phylogenetic signal that supports the three-domain evolutionary scenario and suggests the possibility of delineation of previously undetected major clades of prokaryotes. If the validity of these whole-genome approaches to tree building is confirmed by analyses of numerous new genomes, which are currently being sequenced at an increasing rate, it would seem that the concept of a universal 'species' tree is still appropriate. However, this tree should be reinterpreted as a prevailing trend in the evolution of genome-scale gene sets rather than as a complete picture of evolution.
Affiliation(s)
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
38
|
Daubin V, Gouy M, Perrière G. A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Res 2002; 12:1080-90. [PMID: 12097345 PMCID: PMC186629 DOI: 10.1101/gr.187002] [Citation(s) in RCA: 241] [Impact Index Per Article: 11.0] [Indexed: 11/25/2022]
Abstract
It has been claimed that complete genome sequences would clarify phylogenetic relationships between organisms, but up to now, no satisfying approach has been proposed to use efficiently these data. For instance, if the coding of presence or absence of genes in complete genomes gives interesting results, it does not take into account the phylogenetic information contained in sequences and ignores hidden paralogies by using a BLAST reciprocal best hit definition of orthology. In addition, concatenation of sequences of different genes as well as building of consensus trees only consider the few genes that are shared among all organisms. Here we present an attempt to use a supertree method to build the phylogenetic tree of 45 organisms, with special focus on bacterial phylogeny. This led us to perform a phylogenetic study of congruence of tree topologies, which allows the identification of a core of genes supporting similar species phylogeny. We then used this core of genes to infer a tree. This phylogeny presents several differences with the rRNA phylogeny, notably for the position of hyperthermophilic bacteria.
Affiliation(s)
- Vincent Daubin
- Laboratoire de Biométrie et Biologie Evolutive, Unité Mixte de Recherche Centre National de la Recherche Scientifique, Université Claude Bernard - Lyon 1, 69622 Villeurbanne Cedex, France
|
39
|
Yanai I, Wolf YI, Koonin EV. Evolution of gene fusions: horizontal transfer versus independent events. Genome Biol 2002; 3:research0024. [PMID: 12049665 PMCID: PMC115226 DOI: 10.1186/gb-2002-3-5-research0024] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.8] [Received: 11/12/2001] [Revised: 02/07/2002] [Accepted: 03/26/2002] [Indexed: 12/21/2022]
Abstract
BACKGROUND Gene fusions can be used as tools for functional prediction and also as evolutionary markers. Fused genes often show a scattered phyletic distribution, which suggests a role for processes other than vertical inheritance in their evolution. RESULTS The evolutionary history of gene fusions was studied by phylogenetic analysis of the domains in the fused proteins and the orthologous domains that form stand-alone proteins. Clustering of fusion components from phylogenetically distant species was construed as evidence of dissemination of the fused genes by horizontal transfer. Of the 51 examined gene fusions that are represented in at least two of the three primary kingdoms (Bacteria, Archaea and Eukaryota), 31 were most probably disseminated by cross-kingdom horizontal gene transfer, whereas 14 appeared to have evolved independently in different kingdoms and two were probably inherited from the common ancestor of modern life forms. On many occasions, the evolutionary scenario also involves one or more secondary fissions of the fusion gene. For approximately half of the fusions, stand-alone forms of the fusion components are encoded by juxtaposed genes, which are known or predicted to belong to the same operon in some of the prokaryotic genomes. This indicates that evolution of gene fusions often, if not always, involves an intermediate stage, during which the future fusion components exist as juxtaposed and co-regulated, but still distinct, genes within operons. CONCLUSION These findings suggest a major role for horizontal transfer of gene fusions in the evolution of protein-domain architectures, but also indicate that independent fusions of the same pair of domains in distant species is not uncommon, which suggests positive selection for the multidomain architectures.
MESH Headings
- DNA, Archaeal/genetics
- DNA, Bacterial/genetics
- DNA, Fungal/genetics
- Databases, Genetic
- Evolution, Molecular
- Gene Transfer, Horizontal/genetics
- Genes, Archaeal/genetics
- Genes, Bacterial/genetics
- Genes, Fungal/genetics
- Genome
- Genome, Bacterial
- Genome, Fungal
- Phylogeny
- Recombination, Genetic/genetics
- Sequence Homology, Nucleic Acid
Affiliation(s)
- Itai Yanai
- Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
40
|
Snel B, Bork P, Huynen MA. Genomes in flux: the evolution of archaeal and proteobacterial gene content. Genome Res 2002; 12:17-25. [PMID: 11779827 DOI: 10.1101/gr.176501] [Citation(s) in RCA: 272] [Impact Index Per Article: 12.4] [Indexed: 11/25/2022]
Abstract
In the course of evolution, genomes are shaped by processes like gene loss, gene duplication, horizontal gene transfer, and gene genesis (the de novo origin of genes). Here we reconstruct the gene content of ancestral Archaea and Proteobacteria and quantify the processes connecting them to their present day representatives based on the distribution of genes in completely sequenced genomes. We estimate that the ancestor of the Proteobacteria contained around 2500 genes, and the ancestor of the Archaea around 2050 genes. Although it is necessary to invoke horizontal gene transfer to explain the content of present day genomes, gene loss, gene genesis, and simple vertical inheritance are quantitatively the most dominant processes in shaping the genome. Together they result in a turnover of gene content such that even the lineage leading from the ancestor of the Proteobacteria to the relatively large genome of Escherichia coli has lost at least 950 genes. Gene loss, unlike the other processes, correlates fairly well with time. This clock-like behavior suggests that gene loss is under negative selection, while the processes that add genes are under positive selection.
Affiliation(s)
- Berend Snel
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany.
|
41
|
Wolf YI, Rogozin IB, Grishin NV, Tatusov RL, Koonin EV. Genome trees constructed using five different approaches suggest new major bacterial clades. BMC Evol Biol 2001; 1:8. [PMID: 11734060 PMCID: PMC60490 DOI: 10.1186/1471-2148-1-8] [Citation(s) in RCA: 234] [Impact Index Per Article: 10.2] [Received: 09/20/2001] [Accepted: 10/23/2001] [Indexed: 12/04/2022]
Abstract
BACKGROUND The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. RESULTS Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. The latter group also appeared to join the low-GC Gram-positive bacteria at a deeper tree node. These new groupings of bacteria were supported by the analysis of alternative topologies in the concatenated ribosomal protein tree using the Kishino-Hasegawa test and by a census of the topologies of 132 individual groups of orthologous proteins. Additionally, the results of this analysis put into question the sister-group relationship between the two major archaeal groups, Euryarchaeota and Crenarchaeota, and suggest instead that Euryarchaeota might be a paraphyletic group with respect to Crenarchaeota. CONCLUSIONS We conclude that, the extensive horizontal gene flow and lineage-specific gene loss notwithstanding, extension of phylogenetic analysis to the genome scale has the potential of uncovering deep evolutionary relationships between prokaryotic lineages.
Affiliation(s)
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Nick V Grishin
- Howard Hughes Medical Institute and Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390-9050, USA
- Roman L Tatusov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
42
|
Brown JR, Douady CJ, Italia MJ, Marshall WE, Stanhope MJ. Universal trees based on large combined protein sequence data sets. Nat Genet 2001; 28:281-5. [PMID: 11431701 DOI: 10.1038/90129] [Citation(s) in RCA: 249] [Impact Index Per Article: 10.8] [Indexed: 11/08/2022]
Abstract
Universal trees of life based on small-subunit (SSU) ribosomal RNA (rRNA) support the separate mono/holophyly of the domains Archaea (archaebacteria), Bacteria (eubacteria) and Eucarya (eukaryotes) and the placement of extreme thermophiles at the base of the Bacteria. The concept of universal tree reconstruction recently has been upset by protein trees that show intermixing of species from different domains. Such tree topologies have been attributed to either extensive horizontal gene transfer or degradation of phylogenetic signals because of saturation for amino acid substitutions. Here we use large combined alignments of 23 orthologous proteins conserved across 45 species from all domains to construct highly robust universal trees. Although individual protein trees are variable in their support of domain integrity, trees based on combined protein data sets strongly support separate monophyletic domains. Within the Bacteria, we placed spirochaetes as the earliest derived bacterial group. However, elimination from the combined protein alignment of nine protein data sets, which were likely candidates for horizontal gene transfer, resulted in trees showing thermophiles as the earliest evolved bacterial lineage. Thus, combined protein universal trees are highly congruent with SSU rRNA trees in their strong support for the separate monophyly of domains as well as the early evolution of thermophilic Bacteria.
Affiliation(s)
- J R Brown
- Anti-Microbial Bioinformatics Group, GlaxoSmithKline, 1250 South Collegeville Road, UP1345 P.O. Box 5089, Collegeville, Pennsylvania 19426-0989, USA.
|
43
|
Abstract
Conflicting results often accompany phylogenetic analyses of RNA, DNA, or protein sequences across diverse species. Causes contributing to these conflicts relate to ambiguities in identifying homologous characters of alignments, sensitivity of tree-making methods to unequal evolutionary rates, biases in species sampling, unrecognized paralogy, functional differentiation, loss of phylogenetic informational content due to long branches or fast evolution, and difficulties with the assumptions and approximations used to infer phylogenetic relationships. Attempts to surmount these conflicts by averaging over many proteins are problematic due to inherent biases of selected families, lack of signal in others, and events of lateral transfer, fusion, and/or chimerism. The process of assessing reliability of the results using the bootstrap method is strewn with obstacles because of lack of independence and inhomogeneity in the molecular data. Problems inherent to the three major procedures for developing phylogenetic trees--parsimony, likelihood, distance--are reviewed. Special attention is given to the problem of inferring evolutionary distances from patterns of similarity among sequences. The difficulties encountered by methods of phylogenetic reconstructions based on the analysis of divergent sequence families make new methods based on the analysis of complete genomes reasonable alternatives. Several of these are considered, including the signature sequences of Gupta and associates, the study of genome profiles, and the genomic signature set forth by Karlin and colleagues.
Affiliation(s)
- L Brocchieri
- Department of Mathematics, Stanford University, Stanford, California 94305-2125, USA
|
44
|
Grishin NV, Wolf YI, Koonin EV. From complete genomes to measures of substitution rate variability within and between proteins. Genome Res 2000; 10:991-1000. [PMID: 10899148 PMCID: PMC310923 DOI: 10.1101/gr.10.7.991] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.8] [Indexed: 11/24/2022]
Abstract
Accumulation of complete genome sequences of diverse organisms creates new possibilities for evolutionary inferences from whole-genome comparisons. In the present study, we analyze the distributions of substitution rates among proteins encoded in 19 complete genomes (the interprotein rate distribution). To estimate these rates, it is necessary to employ another fundamental distribution, that of the substitution rates among sites in proteins (the intraprotein distribution). Using two independent approaches, we show that intraprotein substitution rate variability appears to be significantly greater than generally accepted. This yields more realistic estimates of evolutionary distances from amino-acid sequences, which is critical for evolutionary-tree construction. We demonstrate that the interprotein rate distributions inferred from the genome-to-genome comparisons are similar to each other and can be approximated by a single distribution with a long exponential shoulder. This suggests that a generalized version of the molecular clock hypothesis may be valid on genome scale. We also use the scaling parameter of the obtained interprotein rate distribution to construct a rooted whole-genome phylogeny. The topology of the resulting tree is largely compatible with those of global rRNA-based trees and trees produced by other approaches to genome-wide comparison.
Affiliation(s)
- N V Grishin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA.
|
45
|
Koonin EV, Aravind L, Kondrashov AS. The impact of comparative genomics on our understanding of evolution. Cell 2000; 101:573-6. [PMID: 10892642 DOI: 10.1016/s0092-8674(00)80867-3] [Citation(s) in RCA: 215] [Impact Index Per Article: 9.0] [Indexed: 10/26/2022]
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
46
|
Lin J, Gerstein M. Whole-genome trees based on the occurrence of folds and orthologs: implications for comparing genomes on different levels. Genome Res 2000; 10:808-18. [PMID: 10854412 PMCID: PMC310900 DOI: 10.1101/gr.10.6.808] [Citation(s) in RCA: 121] [Impact Index Per Article: 5.0] [Received: 10/21/1999] [Accepted: 04/05/2000] [Indexed: 11/25/2022]
Abstract
We built whole-genome trees based on the presence or absence of particular molecular features, either orthologs or folds, in the genomes of a number of recently sequenced microorganisms. To put these genomic trees into perspective, we compared them to the traditional ribosomal phylogeny and also to trees based on the sequence similarity of individual orthologous proteins. We found that our genomic trees based on the overall occurrence of orthologs did not agree well with the traditional tree. This discrepancy, however, vanished when one restricted the tree to proteins involved in transcription and translation, not including problematic proteins involved in metabolism. Protein folds unite superficially unrelated sequence families and represent a most fundamental molecular unit described by genomes. We found that our genomic occurrence tree based on folds agreed fairly well with the traditional ribosomal phylogeny. Surprisingly, despite this overall agreement, certain classes of folds, particularly all-beta ones, had a somewhat different phylogenetic distribution. We also compared our occurrence trees to whole-genome clusters based on the composition of amino acids and di-nucleotides. Finally, we analyzed some technical aspects of genomic trees, e.g., comparing parsimony versus distance-based approaches and examining the effects of increasing numbers of organisms. Additional information (e.g. clickable trees) is available from http://bioinfo.mbb.yale.edu/genome/trees.
Affiliation(s)
- J Lin
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520 USA
|
47
|
Worning P, Jensen LJ, Nelson KE, Brunak S, Ussery DW. Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima. Nucleic Acids Res 2000; 28:706-9. [PMID: 10637321 PMCID: PMC102551 DOI: 10.1093/nar/28.3.706] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022]
Abstract
The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, which brings independent evidence for the lateral gene transfer in the genome of T. maritima. The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Analysis of 24 complete genomic DNA sequences shows different periodicity patterns for organisms of different origin. The typical genomic periodicity for Bacteria is 11 bp whilst it is 10 bp for Archaea. Eukaryotes have more complex spectra but the dominant period in the yeast Saccharomyces cerevisiae is 10.2 bp. These periodicities are most likely reflective of differences in chromatin structure.
Affiliation(s)
- P Worning
- Center for Biological Sequence Analysis, Department of Biotechnology, Building 208, The Technical University of Denmark
|
48
|
Affiliation(s)
- S C Morris
- Department of Earth Sciences, University of Cambridge, United Kingdom
|
49
|
Abstract
Genomics is changing the landscape of modern biology. The impact is far-reaching because it provides both the most economical means of acquiring large amounts of information and because it has forced the creation of new technologies to exploit this information. Five of the six genomes published in the year from August 1998 to August 1999 were human pathogens, all of which are highly host-adapted. Four of these are obligate intracellular pathogens and the study of these genomes is providing novel insights into the intricacies of pathogen-host interactions and co-evolution. These genomes are also significant because they mark the beginning of an important trend in the sequencing of closely related genomes, including the sequencing of more than one strain from a single pathogenic species. As comparative genomics truly comes of age, the ability to compare the genomes of pathogenic and non-pathogenic organisms will hopefully provide insight into what makes certain bacterial strains and species pathogens.
Affiliation(s)
- D Field
- Molecular Infectious Diseases Group, University Department of Paediatrics, Institute of Molecular Medicine, John Radcliffe Hospital, Oxford, OX3 9DS, UK.
|
50
|
Abstract
Cytologically, prokaryotes appear simpler and thus evolutionarily 'older' than eukaryotes. In terms of RNA processing, however, prokaryotes are sophisticated and eukaryotes, which retain many features of an RNA-world, appear primitive. The last universal common ancestor may have been mesophilic and could have had many features of the eukaryote genome, but its cytology is unknown.
Affiliation(s)
- D Penny
- Institute of Molecular BioSciences, Massey University, PO Box 11 222, New Zealand.
|