1
Wijaya AJ, Anžel A, Richard H, Hattab G. Current state and future prospects of Horizontal Gene Transfer detection. NAR Genom Bioinform 2025; 7:lqaf005. [PMID: 39935761 PMCID: PMC11811736 DOI: 10.1093/nargab/lqaf005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 12/26/2024] [Accepted: 02/04/2025] [Indexed: 02/13/2025]
Abstract
Artificial intelligence (AI) has been shown to be beneficial in a wide range of bioinformatics applications. Horizontal Gene Transfer (HGT) is a driving force of evolutionary changes in prokaryotes. It is widely recognized that it contributes to the emergence of antimicrobial resistance (AMR), which poses a particularly serious threat to public health. Many computational approaches have been developed to study and detect HGT. However, the application of AI in this field has not been investigated. In this work, we conducted a review to provide information on the current trend of existing computational approaches for detecting HGT and to decipher the use of AI in this field. Here, we show a growing interest in HGT detection, characterized by a surge in the number of computational approaches, including AI-based approaches, in recent years. We organize existing computational approaches into a hierarchical structure of computational groups based on their computational methods and show how each computational group evolved. We make recommendations and discuss the challenges of HGT detection in general and the adoption of AI in particular. Moreover, we provide future directions for the field of HGT detection.
Affiliation(s)
- Andre Jatmiko Wijaya
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Arnimallee 14, 14195 Berlin, Germany
  - Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Aleksandar Anžel
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Hugues Richard
  - Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Georges Hattab
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Arnimallee 14, 14195 Berlin, Germany
2
Microbial Genetics and Evolution. Microorganisms 2022; 10:microorganisms10071274. [PMID: 35888993 PMCID: PMC9315481 DOI: 10.3390/microorganisms10071274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 06/17/2022] [Accepted: 06/21/2022] [Indexed: 01/27/2023]
3
Matriano DM, Alegado RA, Conaco C. Detection of horizontal gene transfer in the genome of the choanoflagellate Salpingoeca rosetta. Sci Rep 2021; 11:5993. [PMID: 33727612 PMCID: PMC7971027 DOI: 10.1038/s41598-021-85259-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/03/2020] [Accepted: 02/28/2021] [Indexed: 01/31/2023]
Abstract
Horizontal gene transfer (HGT), the movement of heritable materials between distantly related organisms, is crucial in eukaryotic evolution. However, the scale of HGT in choanoflagellates, the closest unicellular relatives of metazoans, and its possible roles in the evolution of animal multicellularity remain unexplored. We identified at least 175 candidate HGTs in the genome of the colonial choanoflagellate Salpingoeca rosetta using sequence-based tests. The majority of these were orthologous to genes in bacterial and microalgal lineages, yet displayed genomic features consistent with the rest of the S. rosetta genome, which is evidence of ancient acquisition events. Putative functions include enzymes involved in amino acid and carbohydrate metabolism, cell signaling, and the synthesis of extracellular matrix components. Functions of candidate HGTs may have contributed to the ability of choanoflagellates to assimilate novel metabolites, thereby supporting adaptation, survival in diverse ecological niches, and response to external cues that are possibly critical in the evolution of multicellularity in choanoflagellates.
Affiliation(s)
- Danielle M Matriano
  - Marine Science Institute, University of the Philippines, Diliman, Quezon City, Philippines
- Rosanna A Alegado
  - Department of Oceanography, Hawai'i Sea Grant, Daniel K. Inouye Center for Microbial Oceanography: Research and Education, University of Hawai'i at Manoa, Honolulu, USA
- Cecilia Conaco
  - Marine Science Institute, University of the Philippines, Diliman, Quezon City, Philippines
4
Hossain MG, Mahmud MM, Nazir KHMNH, Ueda K. PreS1 Mutations Alter the Large HBsAg Antigenicity of a Hepatitis B Virus Strain Isolated in Bangladesh. Int J Mol Sci 2020; 21:ijms21020546. [PMID: 31952213 PMCID: PMC7014173 DOI: 10.3390/ijms21020546] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/12/2019] [Revised: 01/04/2020] [Accepted: 01/13/2020] [Indexed: 02/07/2023]
Abstract
Mutations in the hepatitis B virus (HBV) genome can potentially lead to vaccination failure, diagnostic escape, and disease progression. However, there are no reports on viral gene expression and large hepatitis B surface antigen (HBsAg) antigenicity alterations due to mutations in HBV isolated from a Bangladeshi population. Here, we sequenced the full genome of the HBV isolated from a clinically infected patient in Bangladesh. The open reading frames (ORFs) (P, S, C, and X) of the isolated HBV strain were successfully amplified and cloned into a mammalian expression vector. The HBV isolate was identified as genotype C (sub-genotype C2), serotype adr, and evolutionarily related to strains isolated in Indonesia, Malaysia, and China. Clinically significant mutations, such as preS1 C2964A, reverse transcriptase domain I91L, and small HBsAg N3S, were identified. The viral P, S, C, and X genes were expressed in HEK-293T and HepG2 cells by transient transfection with a native subcellular distribution pattern analyzed by immunofluorescence assay. Western blotting of large HBsAg using preS1 antibody showed no staining, and preS1 ELISA showed a significant reduction in reactivity due to amino acid mutations. This mutated preS1 sequence has been identified in several Asian countries. To our knowledge, this is the first report investigating changes in large HBsAg antigenicity due to preS1 mutations.
Affiliation(s)
- Md. Golzar Hossain
  - Division of Virology, Department of Microbiology and Immunology, Graduate School of Medicine, Osaka University, Osaka 565-0871, Japan
  - Department of Microbiology and Hygiene, Bangladesh Agricultural University, Mymensingh 2202, Bangladesh
  - Correspondence: (M.G.H.); (K.U.)
- Md. Muket Mahmud
  - Department of Microbiology and Hygiene, Bangladesh Agricultural University, Mymensingh 2202, Bangladesh
- K. H. M. Nazmul Hussain Nazir
  - Department of Microbiology and Hygiene, Bangladesh Agricultural University, Mymensingh 2202, Bangladesh
- Keiji Ueda
  - Division of Virology, Department of Microbiology and Immunology, Graduate School of Medicine, Osaka University, Osaka 565-0871, Japan
  - Correspondence: (M.G.H.); (K.U.)
5
Bernard G, Chan CX, Chan YB, Chua XY, Cong Y, Hogan JM, Maetschke SR, Ragan MA. Alignment-free inference of hierarchical and reticulate phylogenomic relationships. Brief Bioinform 2019; 20:426-435. [PMID: 28673025 PMCID: PMC6433738 DOI: 10.1093/bib/bbx067] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 02/08/2017] [Revised: 05/04/2017] [Indexed: 11/22/2022]
Abstract
We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed.
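The k-mer-based, alignment-free comparison reviewed in the abstract above (Bernard et al.) can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the function names `kmer_profile` and `jaccard_distance`, the use of a Jaccard distance over k-mer sets, and the default `k = 3` are assumptions chosen for illustration, and published alignment-free methods use a range of other statistics.

```python
from collections import Counter


def kmer_profile(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(seq_a: str, seq_b: str, k: int = 3) -> float:
    """Illustrative alignment-free distance: 1 - |A ∩ B| / |A ∪ B|,
    where A and B are the sets of k-mers present in each sequence."""
    kmers_a = set(kmer_profile(seq_a, k))
    kmers_b = set(kmer_profile(seq_b, k))
    union = kmers_a | kmers_b
    if not union:  # both sequences shorter than k
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    print(jaccard_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

A pairwise matrix of such distances could then feed a distance-based tree or network method, which is one way the hierarchical and reticulate signals mentioned in the abstract might be explored.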
6
Prabha R, Singh DP. Cyanobacterial phylogenetic analysis based on phylogenomics approaches render evolutionary diversification and adaptation: an overview of representative orders. 3 Biotech 2019; 9:87. [PMID: 30800598 DOI: 10.1007/s13205-019-1635-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/01/2018] [Accepted: 02/11/2019] [Indexed: 12/12/2022]
Abstract
Phylogenetic studies based on a definite set of marker genes usually reconstruct evolutionary relationships among the prokaryotic species. Based on specific target sequences, such studies represent variations and allow identification of similarities or dissimilarities in organisms. With the advent of completely sequenced genomes and accumulation of information on whole prokaryotic genomes, phylogenetic reconstructions should be considered more reliable if they are ideally based on entire genomes to resolve phylogenetic interest. We applied phylogenomics approaches taking into account completely sequenced cyanobacterial genomes to reconstruct underlying species that represented major taxonomic classes and belonged to distinctly different habitats (freshwater, marine, soils, and rocks). We did not rely on describing the phylogeny of all representative classes of cyanobacterial species on the basis of a single ribosomal gene, the 16S rDNA gene. In contrast, we analyzed combined molecular marker and phylogenomics approaches (genome alignment, gene content and gene order, composition vector and protein domain content) for accurately inferring the phylogenetic relationships of species. We have shown that this approach reflects the impact of evolution on the organisms and connects it with ecological adaptation in cyanobacteria in different habitats. Analysis revealed that the members from marine habitats occupy a different profile than those from freshwater. The impact of GC content and genomic repetitiveness on the diversification of cyanobacterial species and their possible role in adaptation was also reflected. Members occupying similar habitats cover more evolutionary distance together and also evolve various strategies for adaptation and survival, either through genomic repetitiveness, preferences for genes of particular functions, or modified GC content. Genomes undergo different changes for their adaptation in diverse habitats.
Affiliation(s)
- Ratna Prabha
  - ICAR-National Bureau of Agriculturally Important Microorganisms, Kushmaur, Maunath Bhanjan, 275101 India
  - Department of Biotechnology, Mewar University, Gangrar, Chittorgarh, Rajasthan India
- Dhananjaya P Singh
  - ICAR-National Bureau of Agriculturally Important Microorganisms, Kushmaur, Maunath Bhanjan, 275101 India
7
Puigbò P, Wolf YI, Koonin EV. Genome-Wide Comparative Analysis of Phylogenetic Trees: The Prokaryotic Forest of Life. Methods Mol Biol 2019; 1910:241-269. [PMID: 31278667 DOI: 10.1007/978-1-4939-9074-0_8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the boot-split distance (BSD) method is introduced as an extension of the previously developed split distance (SD) method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting treelike and netlike evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the applications methods used to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
Affiliation(s)
- Pere Puigbò
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
  - Division of Genetics and Physiology, Department of Biology, University of Turku, Turku, Finland
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
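The tree comparison described in the Puigbò et al. abstract builds on a split distance between trees. The Python sketch below shows only a plain, unweighted split distance under stated assumptions: trees are given as nested tuples of leaf names over the same leaf set, the distance is normalized as the fraction of non-trivial splits not shared by both trees, and the names `leaf_set`, `splits` and `split_distance` are hypothetical. The bootstrap weighting that defines the BSD method is not reproduced here.

```python
def leaf_set(node) -> frozenset:
    """Return the set of leaf names under a node (a leaf is a str)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree) -> set:
    """Collect the non-trivial bipartitions induced by internal branches.
    Each split is stored as the side that contains a fixed reference leaf,
    so the same bipartition always has the same representation."""
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node) -> frozenset:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        if len(clade) > 1 and len(other) > 1:  # skip trivial splits
            found.add(clade if reference in clade else other)
        return clade

    visit(tree)
    return found


def split_distance(tree_a, tree_b) -> float:
    """Fraction of splits that appear in only one of the two trees."""
    splits_a, splits_b = splits(tree_a), splits(tree_b)
    total = len(splits_a) + len(splits_b)
    if total == 0:
        return 0.0
    return len(splits_a ^ splits_b) / total


if __name__ == "__main__":
    # Toy five-taxon trees, invented for this example.
    t1 = ((("A", "B"), "C"), ("D", "E"))
    t2 = ((("A", "C"), "B"), ("D", "E"))
    print(split_distance(t1, t2))
```

A bootstrap-aware variant would need per-branch support values attached to each split, which this nested-tuple representation deliberately leaves out.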
8
Madagascar ground gecko genome analysis characterizes asymmetric fates of duplicated genes. BMC Biol 2018; 16:40. [PMID: 29661185 PMCID: PMC5901865 DOI: 10.1186/s12915-018-0509-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 01/15/2018] [Accepted: 03/22/2018] [Indexed: 11/13/2022]
Abstract
Background Conventionally, comparison among amniotes – birds, mammals, and reptiles – has often been approached through analyses of mammals and, for comparison, birds. However, birds are morphologically and physiologically derived and, moreover, some parts of their genomes are recognized as difficult to sequence and/or assemble and are thus missing in genome assemblies. Therefore, sequencing the genomes of reptiles would aid comparative studies on amniotes by providing more comprehensive coverage to help understand the molecular mechanisms underpinning evolutionary changes. Results Herein, we present the whole genome sequences of the Madagascar ground gecko (Paroedura picta), a promising study system especially in developmental biology, and used it to identify changes in gene repertoire across amniotes. The genome-wide analysis of the Madagascar ground gecko allowed us to reconstruct a comprehensive set of gene phylogenies comprising 13,043 ortholog groups from diverse amniotes. Our study revealed 469 genes retained by some reptiles but absent from available genome-wide sequence data of both mammals and birds. Importantly, these genes, herein collectively designated as ‘elusive’ genes, exhibited high nucleotide substitution rates and uneven intra-genomic distribution. Furthermore, the genomic regions flanking these elusive genes exhibited distinct characteristics that tended to be associated with increased gene density, repeat element density, and GC content. Conclusion This highly continuous and nearly complete genome assembly of the Madagascar ground gecko will facilitate the use of this species as an experimental animal in diverse fields of biology. Gene repertoire comparisons across amniotes further demonstrated that the fate of a duplicated gene can be affected by the intrinsic properties of its genomic location, which can persist for hundreds of millions of years. 
Electronic supplementary material The online version of this article (10.1186/s12915-018-0509-4) contains supplementary material, which is available to authorized users.
9
Plazzi F, Puccio G, Passamonti M. Burrowers from the Past: Mitochondrial Signatures of Ordovician Bivalve Infaunalization. Genome Biol Evol 2017; 9:956-967. [PMID: 28338965 PMCID: PMC5393379 DOI: 10.1093/gbe/evx051] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Accepted: 03/08/2017] [Indexed: 12/20/2022]
Abstract
Bivalves and gastropods are the two largest classes of extant molluscs. Despite sharing a huge number of features, they do not share a key ecological one: gastropods are essentially epibenthic, whereas most bivalves are infaunal. However, this is not the ancestral bivalve condition; Cambrian forms were surface crawlers, and only during the Ordovician did a fundamental infaunalization process take place, leading to bivalves as we currently know them. This major ecological shift is linked to exposure to different redox environments (hypoxic or anoxic) and to the Lower Devonian oxygenation event. We investigated selective signatures on bivalve and gastropod mitochondrial genomes with respect to a time-calibrated mitochondrial phylogeny by means of dN/dS ratios. We were able to detect 1) a major signal of directional selection between the Ordovician and the Lower Devonian for bivalve mitochondrial Complex I, and 2) an overall higher directional selective pressure on bivalve Complex V with respect to gastropods. These and other minor dN/dS patterns and timings are discussed, showing that the Ordovician infaunalization event left heavy traces in bivalve mitochondrial genomes.
Affiliation(s)
- Federico Plazzi
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, Italy
- Guglielmo Puccio
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, Italy
- Marco Passamonti
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, Italy
10
Plazzi F, Puccio G, Passamonti M. Comparative Large-Scale Mitogenomics Evidences Clade-Specific Evolutionary Trends in Mitochondrial DNAs of Bivalvia. Genome Biol Evol 2016; 8:2544-64. [PMID: 27503296 PMCID: PMC5010914 DOI: 10.1093/gbe/evw187] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Accepted: 07/31/2016] [Indexed: 12/28/2022]
Abstract
Although the number of complete bivalve mitochondrial genomes keeps growing, an assessment of the general features of these genomes in a phylogenetic framework is still lacking, even though bivalve mitochondrial genomes are unusual in several respects. In this work, we constructed a dataset of one hundred mitochondrial genomes of bivalves to perform the first systematic comparative mitogenomic analysis, developing a phylogenetic background to scaffold the evolutionary history of the class' mitochondrial genomes. Highly conserved domains were identified in all protein coding genes; however, four genes (namely, atp6, nad2, nad4L, and nad6) were found to be very divergent in many respects, notwithstanding the overall purifying selection working on those genomes. Moreover, the atp8 gene was newly annotated in 20 mitochondrial genomes, where it was previously declared as lacking or only signaled. Supernumerary mitochondrial proteins were compared, but it was possible to find homologies only among closely related species. The rearrangement rate on the molecule is too high to be used as a phylogenetic marker, but here we demonstrate for the first time in mollusks that there is a correlation between rearrangement rates and evolutionary rates. We also developed a new index (HERMES) to estimate the amount of mitochondrial evolution. Many genomic features are phylogenetically congruent and this allowed us to highlight three main phases in bivalve history: the origin, the branching of palaeoheterodonts, and the second radiation leading to the present-day biodiversity.
Affiliation(s)
- Federico Plazzi
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, via Selmi, 3 - 40126 Bologna, Italy
- Guglielmo Puccio
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, via Selmi, 3 - 40126 Bologna, Italy
- Marco Passamonti
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, via Selmi, 3 - 40126 Bologna, Italy
11
DeBlasio DF, Wisecaver JH. SICLE: a high-throughput tool for extracting evolutionary relationships from phylogenetic trees. PeerJ 2016; 4:e2359. [PMID: 27635331 PMCID: PMC5012314 DOI: 10.7717/peerj.2359] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/18/2016] [Accepted: 07/23/2016] [Indexed: 11/25/2022]
Abstract
We present the phylogeny analysis software SICLE (Sister Clade Extractor), an easy-to-use, high-throughput tool to describe the nearest neighbors to a node of interest in a phylogenetic tree as well as the support value for the relationship. The application is a command line utility that can be embedded into a phylogenetic analysis pipeline or can be used as a subroutine within another C++ program. As a test case, we applied this new tool to the published phylome of Salinibacter ruber, a species of halophilic Bacteroidetes, identifying 13 unique sister relationships to S. ruber across the 4,589 gene phylogenies. S. ruber grouped with bacteria, most often other Bacteroidetes, in the majority of phylogenies, but 91 phylogenies showed a branch-supported sister association between S. ruber and Archaea, an evolutionarily intriguing relationship indicative of horizontal gene transfer. This test case demonstrates how SICLE makes it possible to summarize the phylogenetic information produced by automated phylogenetic pipelines to rapidly identify and quantify the possible evolutionary relationships that merit further investigation. SICLE is available for free for noncommercial use at http://eebweb.arizona.edu/sicle/.
Affiliation(s)
- Dan F DeBlasio
  - Department of Computer Science, University of Arizona, Tucson, AZ, United States
- Jennifer H Wisecaver
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, United States
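The core idea in the SICLE abstract, reporting the sister group of a node of interest, can be sketched in a few lines of Python. This is not the SICLE implementation (which the abstract describes as a C++ command line utility that also reports support values). The nested-tuple tree encoding, the function names `leaf_set` and `sister_of`, and the toy taxon labels are all assumptions made for illustration, and the sketch treats the tree as rooted.

```python
def leaf_set(node) -> frozenset:
    """Return the set of leaf names under a node (a leaf is a str)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def sister_of(tree, target: str):
    """Return the leaves of the clade(s) sharing a parent with `target`,
    or None if `target` is not a leaf of the tree."""
    if isinstance(tree, str):
        return None
    for child in tree:
        if child == target:
            # Every other child of this parent belongs to the sister group.
            siblings = [c for c in tree if c is not child]
            return frozenset().union(*(leaf_set(s) for s in siblings))
        found = sister_of(child, target)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    # Hypothetical gene tree: the query groups with an archaeal sequence.
    gene_tree = ((("S_ruber", "Archaeon_1"), "Bacteroidetes_1"), "Proteobacterium_1")
    print(sorted(sister_of(gene_tree, "S_ruber")))
```

Running such a function over thousands of gene trees and tallying the taxonomic label of each sister group would give a summary similar in spirit to the one the abstract reports for S. ruber, although real trees would normally be parsed from Newick files and filtered by branch support first.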
12
Nguyen M, Ekstrom A, Li X, Yin Y. HGT-Finder: A New Tool for Horizontal Gene Transfer Finding and Application to Aspergillus genomes. Toxins (Basel) 2015; 7:4035-53. [PMID: 26473921 PMCID: PMC4626719 DOI: 10.3390/toxins7104035] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 08/15/2015] [Revised: 09/22/2015] [Accepted: 09/23/2015] [Indexed: 11/16/2022]
Abstract
Horizontal gene transfer (HGT) is a fast-track mechanism that allows genetically unrelated organisms to exchange genes for rapid environmental adaptation. We developed a new phyletic distribution-based software, HGT-Finder, which implements a novel bioinformatics algorithm to calculate a horizontal transfer index and a probability value for each query gene. Applying this new tool to the Aspergillus fumigatus, Aspergillus flavus, and Aspergillus nidulans genomes, we found 273, 542, and 715 horizontally transferred genes (HTGs), respectively. HTGs have shorter length, higher guanine-cytosine (GC) content, and relaxed selection pressure. Metabolic process and secondary metabolism functions are significantly enriched in HTGs. Gene clustering analysis showed that 61%, 41% and 74% of HTGs in the three genomes form physically linked gene clusters (HTGCs). Overlapping manually curated secondary metabolite gene clusters (SMGCs) with HTGCs showed that 9 of the 33 A. fumigatus SMGCs and 31 of the 65 A. nidulans SMGCs share genes with HTGCs, and that HTGs are significantly enriched in SMGCs. Our genome-wide analysis thus presents very strong evidence to support the hypothesis that HGT has played a very critical role in the evolution of SMGCs. The program is freely available at http://cys.bios.niu.edu/HGTFinder/HGTFinder.tar.gz.
Affiliation(s)
- Marcus Nguyen
  - Department of Computer Science, Northern Illinois University, DeKalb, IL 60115-2857, USA
- Alex Ekstrom
  - Department of Computer Science, Northern Illinois University, DeKalb, IL 60115-2857, USA
- Xueqiong Li
  - Department of Biological Sciences, Northern Illinois University, Montgomery Hall 325A, DeKalb, IL 60115-2857, USA
  - College of Life Sciences, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, Inner Mongolia, China
- Yanbin Yin
  - Department of Biological Sciences, Northern Illinois University, Montgomery Hall 325A, DeKalb, IL 60115-2857, USA
13
Kumar S, Krabberød AK, Neumann RS, Michalickova K, Zhao S, Zhang X, Shalchian-Tabrizi K. BIR Pipeline for Preparation of Phylogenomic Data. Evol Bioinform Online 2015; 11:79-83. [PMID: 25987827 PMCID: PMC4412416 DOI: 10.4137/ebo.s10189] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/25/2014] [Revised: 12/03/2014] [Accepted: 12/08/2014] [Indexed: 11/05/2022]
Abstract
SUMMARY We present a pipeline named BIR (Blast, Identify and Realign) developed for phylogenomic analyses. BIR is intended for the identification of gene sequences applicable for phylogenomic inference. The pipeline allows users to apply their own manually curated sequence alignments (seed) in search for homologous genes in sequence databases and available genomes. BIR automatically adds the identified sequences from these databases to the seed alignments and reconstructs a phylogenetic tree from each. The BIR pipeline is an efficient tool for the identification of orthologous gene copies because it expands user-defined sequence alignments and conducts massively parallel phylogenetic reconstruction. The application is also particularly useful for large-scale sequencing projects that require management of a large number of single-gene alignments for gene comparison, functional annotation, and evolutionary analyses. AVAILABILITY The BIR user manual is available at http://www.bioportal.no/ and can be accessed through Lifeportal at https://lifeportal.uio.no. Access is free but requires a user account registration using the link "Register for BIR access" from the Lifeportal homepage.
Affiliation(s)
- Surendra Kumar
  - Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), Department of Biosciences, University of Oslo, Norway
- Anders K Krabberød
  - Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), Department of Biosciences, University of Oslo, Norway
- Ralf S Neumann
  - Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), Department of Biosciences, University of Oslo, Norway
- Sen Zhao
  - Genome Biology Group, Department of Cancer Prevention, Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
- Xiaoli Zhang
  - Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), Department of Biosciences, University of Oslo, Norway
- Kamran Shalchian-Tabrizi
  - Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), Department of Biosciences, University of Oslo, Norway
14
Green RE, Braun EL, Armstrong J, Earl D, Nguyen N, Hickey G, Vandewege MW, St John JA, Capella-Gutiérrez S, Castoe TA, Kern C, Fujita MK, Opazo JC, Jurka J, Kojima KK, Caballero J, Hubley RM, Smit AF, Platt RN, Lavoie CA, Ramakodi MP, Finger JW, Suh A, Isberg SR, Miles L, Chong AY, Jaratlerdsiri W, Gongora J, Moran C, Iriarte A, McCormack J, Burgess SC, Edwards SV, Lyons E, Williams C, Breen M, Howard JT, Gresham CR, Peterson DG, Schmitz J, Pollock DD, Haussler D, Triplett EW, Zhang G, Irie N, Jarvis ED, Brochu CA, Schmidt CJ, McCarthy FM, Faircloth BC, Hoffmann FG, Glenn TC, Gabaldón T, Paten B, Ray DA. Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science 2014; 346:1254449. [PMID: 25504731 PMCID: PMC4386873 DOI: 10.1126/science.1254449] [Citation(s) in RCA: 242] [Impact Index Per Article: 22.0] [Indexed: 12/28/2022]
Abstract
To provide context for the diversification of archosaurs--the group that includes crocodilians, dinosaurs, and birds--we generated draft genomes of three crocodilians: Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the comparatively rapid evolution is derived in birds. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs, thereby providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs.
Affiliation(s)
- Richard E Green
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA.
| | - Edward L Braun
- Department of Biology and Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Joel Armstrong
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Dent Earl
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Ngan Nguyen
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Glenn Hickey
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Michael W Vandewege
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
| | - John A St John
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Salvador Capella-Gutiérrez
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, 08003 Barcelona, Spain. Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Todd A Castoe
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA. Department of Biology, University of Texas, Arlington, TX 76019, USA
| | - Colin Kern
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19717, USA
| | - Matthew K Fujita
- Department of Biology, University of Texas, Arlington, TX 76019, USA
| | - Juan C Opazo
- Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
| | - Jerzy Jurka
- Genetic Information Research Institute, Mountain View, CA 94043, USA
| | - Kenji K Kojima
- Genetic Information Research Institute, Mountain View, CA 94043, USA
| | | | | | - Arian F Smit
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Roy N Platt
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
| | - Christine A Lavoie
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
| | - Meganathan P Ramakodi
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
| | - John W Finger
- Department of Environmental Health Science, University of Georgia, Athens, GA 30602, USA
| | - Alexander Suh
- Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany. Department of Evolutionary Biology (EBC), Uppsala University, SE-752 36 Uppsala, Sweden
| | - Sally R Isberg
- Porosus Pty. Ltd., Palmerston, NT 0831, Australia. Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia. Centre for Crocodile Research, Noonamah, NT 0837, Australia
| | - Lee Miles
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Amanda Y Chong
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | | | - Jaime Gongora
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Christopher Moran
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Andrés Iriarte
- Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
| | - John McCormack
- Moore Laboratory of Zoology, Occidental College, Los Angeles, CA 90041, USA
| | - Shane C Burgess
- College of Agriculture and Life Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Eric Lyons
- School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Christina Williams
- Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, NC 27607, USA
| | - Matthew Breen
- Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, NC 27607, USA
| | - Jason T Howard
- Howard Hughes Medical Institute, Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
| | - Cathy R Gresham
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
| | - Daniel G Peterson
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA. Department of Plant and Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA
| | - Jürgen Schmitz
- Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany
| | - David D Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA. Howard Hughes Medical Institute, Bethesda, MD 20814, USA
| | - Eric W Triplett
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
| | - Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen, China. Center for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Naoki Irie
- Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, Japan
| | - Erich D Jarvis
- Howard Hughes Medical Institute, Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
| | - Christopher A Brochu
- Department of Earth and Environmental Sciences, University of Iowa, Iowa City, IA 52242, USA
| | - Carl J Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, DE 19717, USA
| | - Fiona M McCarthy
- School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ 85721, USA
| | - Brant C Faircloth
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90019, USA. Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Federico G Hoffmann
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
| | - Travis C Glenn
- Department of Environmental Health Science, University of Georgia, Athens, GA 30602, USA
| | - Toni Gabaldón
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation, 08003 Barcelona, Spain. Universitat Pompeu Fabra, 08003 Barcelona, Spain. Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
| | - Benedict Paten
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
| | - David A Ray
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA. Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA.
|
15
|
Duarte M, Jauregui R, Vilchez-Vargas R, Junca H, Pieper DH. AromaDeg, a novel database for phylogenomics of aerobic bacterial degradation of aromatics. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau118. [PMID: 25468931 PMCID: PMC4250580 DOI: 10.1093/database/bau118] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
Abstract
Understanding prokaryotic transformation of recalcitrant pollutants and the in-situ metabolic nets requires the integration of massive amounts of biological data. Decades of biochemical studies together with novel next-generation sequencing data have exponentially increased information on aerobic aromatic degradation pathways. However, the majority of protein sequences in public databases have not been experimentally characterized and homology-based methods are still the most routinely used approach to assign protein function, allowing the propagation of misannotations. AromaDeg is a web-based resource targeting aerobic degradation of aromatics that comprises recently updated (September 2013) and manually curated databases constructed based on a phylogenomic approach. Grounded in phylogenetic analyses of protein sequences of key catabolic protein families and of proteins of documented function, AromaDeg allows query and data mining of novel genomic, metagenomic or metatranscriptomic data sets. Essentially, each query sequence that matches a given protein family of AromaDeg is associated with a specific cluster of a given phylogenetic tree, and further functional annotation and/or substrate specificity may be inferred from the neighboring cluster members with experimentally validated function. This allows a detailed characterization of individual protein superfamilies as well as high-throughput functional classifications. Thus, AromaDeg addresses the deficiencies of homology-based protein function prediction, combining phylogenetic tree construction and integration of experimental data to obtain more accurate annotations of new biological data related to aerobic aromatic biodegradation pathways. In the future, we will pursue the expansion of AromaDeg to other enzyme families involved in aromatic degradation, as well as its regular update. Database URL: http://aromadeg.siona.helmholtz-hzi.de
Affiliation(s)
- Márcia Duarte
- Microbial Interactions and Processes Research Group, HZI-Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany, Research Group Microbial Ecology, Metabolism, Genomics and Evolution of Communities of Environmental Microorganisms, CorpoGen. Carrera 5 No. 66A-35, Bogotá, Colombia and Faculty of Basic and Applied Sciences, Universidad Militar Nueva Granada-UMNG, Campus Cajicá, Bogotá DC, Colombia
| | - Ruy Jauregui
- Microbial Interactions and Processes Research Group, HZI-Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany, Research Group Microbial Ecology, Metabolism, Genomics and Evolution of Communities of Environmental Microorganisms, CorpoGen. Carrera 5 No. 66A-35, Bogotá, Colombia and Faculty of Basic and Applied Sciences, Universidad Militar Nueva Granada-UMNG, Campus Cajicá, Bogotá DC, Colombia
| | - Ramiro Vilchez-Vargas
- Microbial Interactions and Processes Research Group, HZI-Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany, Research Group Microbial Ecology, Metabolism, Genomics and Evolution of Communities of Environmental Microorganisms, CorpoGen. Carrera 5 No. 66A-35, Bogotá, Colombia and Faculty of Basic and Applied Sciences, Universidad Militar Nueva Granada-UMNG, Campus Cajicá, Bogotá DC, Colombia
| | - Howard Junca
- Microbial Interactions and Processes Research Group, HZI-Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany, Research Group Microbial Ecology, Metabolism, Genomics and Evolution of Communities of Environmental Microorganisms, CorpoGen. Carrera 5 No. 66A-35, Bogotá, Colombia and Faculty of Basic and Applied Sciences, Universidad Militar Nueva Granada-UMNG, Campus Cajicá, Bogotá DC, Colombia
| | - Dietmar H Pieper
- Microbial Interactions and Processes Research Group, HZI-Helmholtz Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany, Research Group Microbial Ecology, Metabolism, Genomics and Evolution of Communities of Environmental Microorganisms, CorpoGen. Carrera 5 No. 66A-35, Bogotá, Colombia and Faculty of Basic and Applied Sciences, Universidad Militar Nueva Granada-UMNG, Campus Cajicá, Bogotá DC, Colombia
|
16
|
Kannan S, Rogozin IB, Koonin EV. MitoCOGs: clusters of orthologous genes from mitochondria and implications for the evolution of eukaryotes. BMC Evol Biol 2014; 14:237. [PMID: 25421434 PMCID: PMC4256733 DOI: 10.1186/s12862-014-0237-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2014] [Accepted: 11/07/2014] [Indexed: 01/19/2023] Open
Abstract
Background Mitochondria are ubiquitous membranous organelles of eukaryotic cells that evolved from an alpha-proteobacterial endosymbiont and possess a small genome that encompasses from 3 to 106 genes. Accumulation of thousands of mitochondrial genomes from diverse groups of eukaryotes provides an opportunity for a comprehensive reconstruction of the evolution of the mitochondrial gene repertoire. Results Clusters of orthologous mitochondrial protein-coding genes (MitoCOGs) were constructed from all available mitochondrial genomes and complemented with nuclear orthologs of mitochondrial genes. With minimal exceptions, the mitochondrial gene complements of eukaryotes are subsets of the superset of 66 genes found in jakobids. Reconstruction of the evolution of mitochondrial genomes indicates that the mitochondrial gene set of the last common ancestor of the extant eukaryotes was slightly larger than that of jakobids. This superset of mitochondrial genes likely represents an intermediate stage following the loss and transfer to the nucleus of most of the endosymbiont genes early in eukaryote evolution. Subsequent evolution in different lineages involved largely parallel transfer of ancestral endosymbiont genes to the nuclear genome. The intron density in nuclear orthologs of mitochondrial genes typically is nearly the same as in the rest of the genes in the respective genomes. However, in land plants, the intron density in nuclear orthologs of mitochondrial genes is almost 1.5-fold lower than the genomic mean, suggestive of ongoing transfer of functional genes from mitochondria to the nucleus. Conclusions The MitoCOGs are expected to become an important resource for the study of mitochondrial evolution. The nearly complete superset of mitochondrial genes in jakobids likely represents an intermediate stage in the evolution of eukaryotes after the initial, extensive loss and transfer of the endosymbiont genes. 
In addition, the bacterial multi-subunit RNA polymerase that is encoded in the jakobid mitochondrial genomes was replaced by a single-subunit phage-type RNA polymerase in the rest of the eukaryotes. These results are best compatible with the rooting of the eukaryotic tree between jakobids and the rest of the eukaryotes. The land plants are the only eukaryotic branch in which the gene transfer from the mitochondrial to the nuclear genome appears to be an active, ongoing process. Electronic supplementary material The online version of this article (doi:10.1186/s12862-014-0237-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sivakumar Kannan
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
| | - Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
|
17
|
Heydari M, Marashi SA, Tusserkani R, Sadeghi M. Reconstruction of phylogenetic trees of prokaryotes using maximal common intervals. Biosystems 2014; 124:86-94. [PMID: 25195150 DOI: 10.1016/j.biosystems.2014.09.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2012] [Revised: 08/13/2014] [Accepted: 09/01/2014] [Indexed: 11/15/2022]
Abstract
One of the fundamental problems in bioinformatics is phylogenetic tree reconstruction, which can be used for classifying living organisms into different taxonomic clades. The classical approach to this problem is based on a marker such as 16S ribosomal RNA. Since evolutionary events like genomic rearrangements are not included in reconstructions of phylogenetic trees based on single genes, much effort has been made to find other characteristics for phylogenetic reconstruction in recent years. With the increasing availability of completely sequenced genomes, gene order can be considered as a new solution for this problem. In the present work, we applied maximal common intervals (MCIs) in two or more genomes to infer their distance and to reconstruct their evolutionary relationship. Additionally, measures based on uncommon segments (UCS's), i.e., those genomic segments which are not detected as part of any of the MCIs, are also used for phylogenetic tree reconstruction. We applied these two types of measures for reconstructing the phylogenetic tree of 63 prokaryotes with known COG (clusters of orthologous groups) families. Similarity between the MCI-based (resp. UCS-based) reconstructed phylogenetic trees and the phylogenetic tree obtained from NCBI taxonomy browser is as high as 93.1% (resp. 94.9%). We show that in the case of this diverse dataset of prokaryotes, tree reconstruction based on MCI and UCS outperforms most of the currently available methods based on gene orders, including breakpoint distance and DCJ. We additionally tested our new measures on a dataset of 13 closely-related bacteria from the genus Prochlorococcus. In this case, distances like rearrangement distance, breakpoint distance and DCJ proved to be useful, while our new measures are still appropriate for phylogenetic reconstruction.
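The breakpoint distance and common-interval ideas mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes two unsigned, linear gene orders over the same duplicate-free set of genes. The function names and the toy genomes are hypothetical, and this is not the authors' MCI or UCS implementation.

```python
def adjacencies(order):
    """Unordered pairs of neighbouring genes in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def breakpoint_distance(a, b):
    """Number of adjacencies of `a` that are not preserved in `b`."""
    if set(a) != set(b) or len(set(a)) != len(a) or len(b) != len(a):
        raise ValueError("gene orders must be duplicate-free and share the same genes")
    return len(adjacencies(a) - adjacencies(b))


def common_intervals(a, b):
    """Gene sets (size >= 2) that are contiguous in both orders (naive O(n^2) scan)."""
    position_in_b = {gene: i for i, gene in enumerate(b)}
    found = []
    for start in range(len(a)):
        lo = hi = position_in_b[a[start]]
        for end in range(start + 1, len(a)):
            p = position_in_b[a[end]]
            lo, hi = min(lo, p), max(hi, p)
            # a[start..end] is contiguous in b when its positions span exactly its length
            if hi - lo == end - start:
                found.append(frozenset(a[start:end + 1]))
    return found


# Toy example (hypothetical gene names): one swap of g3 and g4.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g2", "g4", "g3", "g5"]
print(breakpoint_distance(genome_a, genome_b))
print(len(common_intervals(genome_a, genome_b)))
```

Pairwise distances of this kind could then be fed into any distance-based tree-building method. How the paper restricts intervals to maximal ones and handles genomes with differing gene content is not covered here.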
Affiliation(s)
- Mahdi Heydari
- Department of Algorithms and Computation, College of Engineering, University of Tehran, Tehran, Iran
| | - Sayed-Amir Marashi
- Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran; School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
| | - Ruzbeh Tusserkani
- School of Mathematics, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
| | - Mehdi Sadeghi
- National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
|
18
|
Zhu Q, Kosoy M, Dittmar K. HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers. BMC Genomics 2014; 15:717. [PMID: 25159222 PMCID: PMC4155097 DOI: 10.1186/1471-2164-15-717] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2014] [Accepted: 08/20/2014] [Indexed: 11/23/2022] Open
Abstract
Background First pass methods based on BLAST match are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events. Results A new computational method of rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and exhibits robust performance compared to conventional BLAST-based approaches. This method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources.
Conclusions HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.
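The idea of scoring BLAST hit distributions across predefined evolutionary categories can be sketched roughly as follows. This is only an illustration under stated assumptions: the category names (`self`, `close`, `distal`), the fixed cutoffs, the taxon labels, and the function names are all hypothetical, whereas the abstract describes statistically determined thresholds. It is not the HGTector pipeline or its API.

```python
from collections import defaultdict


def category_weights(hits, category_of):
    """Sum BLAST bit scores per evolutionary category for one query gene.

    `hits` is a list of (taxon, bit_score) pairs; taxa missing from
    `category_of` are treated as distal (an assumption of this sketch).
    """
    weights = defaultdict(float)
    for taxon, bit_score in hits:
        weights[category_of.get(taxon, "distal")] += bit_score
    return {name: weights[name] for name in ("self", "close", "distal")}


def is_putative_hgt(weights, close_cutoff, distal_cutoff):
    """Flag a gene with few close-relative hits but substantial distal hits."""
    return weights["close"] < close_cutoff and weights["distal"] >= distal_cutoff


# Hypothetical category assignment and hits for two genes.
category_of = {"query_genus": "self", "sister_genus": "close"}
vertical_gene = [("query_genus", 500.0), ("sister_genus", 450.0), ("far_taxon", 90.0)]
atypical_gene = [("query_genus", 500.0), ("sister_genus", 20.0), ("far_taxon", 400.0)]

for hits in (vertical_gene, atypical_gene):
    w = category_weights(hits, category_of)
    print(w, is_putative_hgt(w, close_cutoff=100.0, distal_cutoff=200.0))
```

In practice the cutoffs would be derived from the genome-wide distribution of weights rather than fixed by hand, which is the part of the method this sketch leaves out.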
Affiliation(s)
- Qiyun Zhu
- Department of Biological Sciences, University at Buffalo, State University of New York, 109 Cooke Hall, Buffalo, NY 14260, USA.
|
19
|
The impact of automated filtering of BLAST-determined homologs in the phylogenetic detection of horizontal gene transfer from a transcriptome assembly. Mol Phylogenet Evol 2014; 71:184-92. [DOI: 10.1016/j.ympev.2013.11.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2013] [Revised: 10/09/2013] [Accepted: 11/25/2013] [Indexed: 12/24/2022]
|
20
|
Horner DS, Pesole G. Phylogenetic analyses: a brief introduction to methods and their application. Expert Rev Mol Diagn 2004; 4:339-50. [PMID: 15137901 DOI: 10.1586/14737159.4.3.339] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Phylogenetic analysis of molecular sequence data plays an increasingly important role in clinical medicine, both in the emerging field of molecular epidemiology and in the rational design of new therapeutic agents. The aims of this review are to introduce some of the methods used to construct phylogenetic trees, to illustrate some of the pitfalls that can introduce artifactual results and to speculate on the long-term importance of this area of computational biology in clinical medicine.
Affiliation(s)
- David S Horner
- Department of Biomolecular Sciences and Biotechnology, University of Milan, Via Celoria 26, 20133 Milano, Italy.
|
21
|
Abstract
A major goal of many evolutionary analyses is to determine the true evolutionary history of an organism. Molecular methods that rely on the phylogenetic signal generated by a few to a handful of loci can be used to approximate the evolution of the entire organism but fall short of providing a global, genome-wide, perspective on evolutionary processes. Indeed, individual genes in a genome may have different evolutionary histories. Therefore, it is informative to analyze the number and kind of phylogenetic topologies found within an orthologous set of genes across a genome. Here we present PhyBin: a flexible program for clustering gene trees based on topological structure. PhyBin can generate bins of topologies corresponding to exactly identical trees or can utilize Robinson-Foulds distance matrices to generate clusters of similar trees, using a user-defined threshold. Additionally, PhyBin allows the user to adjust for potential noise in the dataset (as may be produced when comparing very closely related organisms) by pre-processing trees to collapse very short branches or those nodes not meeting a defined bootstrap threshold. As a test case, we generated individual trees based on an orthologous gene set from 10 Wolbachia species across four different supergroups (A–D) and utilized PhyBin to categorize the complete set of topologies produced from this dataset. Using this approach, we were able to show that although a single topology generally dominated the analysis, confirming the separation of the supergroups, many genes supported alternative evolutionary histories. Because PhyBin’s output provides the user with lists of gene trees in each topological cluster, it can be used to explore potential reasons for discrepancies between phylogenies including homoplasies, long-branch attraction, or horizontal gene transfer events.
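To make the binning idea concrete, here is a minimal Python sketch of the Robinson-Foulds distance between unrooted gene trees and of grouping trees with identical topology. It assumes trees are given as nested tuples of leaf names over the same leaf set; the function names and the toy trees are hypothetical, and this is not PhyBin's own code.

```python
from collections import defaultdict


def leaf_set(tree):
    """All leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Non-trivial splits of the leaf set induced by internal edges (unrooted view)."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # each split is stored as the side without this leaf
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if anchor not in below else all_leaves - below
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return frozenset(splits)


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in exactly one tree."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


def bin_by_topology(named_trees):
    """Group tree names whose split sets are identical."""
    bins = defaultdict(list)
    for name, tree in named_trees.items():
        bins[bipartitions(tree)].append(name)
    return list(bins.values())


# Toy gene trees (hypothetical): gene1 and gene2 differ only in rooting.
gene_trees = {
    "gene1": (("A", "B"), ("C", "D")),
    "gene2": ("A", ("B", ("C", "D"))),
    "gene3": (("A", "C"), ("B", "D")),
}
print(rf_distance(gene_trees["gene1"], gene_trees["gene3"]))
print(bin_by_topology(gene_trees))
```

Clustering with a user-defined threshold, as the abstract describes, would replace the exact-match binning with a clustering step on the pairwise distance matrix.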
Affiliation(s)
- Ryan R Newton
- School of Informatics and Computing, Indiana University, Bloomington, IN, United States
|
22
|
Romance of the three domains: how cladistics transformed the classification of cellular organisms. Protein Cell 2013; 4:664-76. [PMID: 23873078 PMCID: PMC4875529 DOI: 10.1007/s13238-013-3050-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2013] [Accepted: 07/01/2013] [Indexed: 11/23/2022] Open
Abstract
Cladistics is a biological philosophy that uses genealogical relationships among species and an inferred sequence of divergence as the basis of classification. This review critically surveys the chronological development of biological classification from Aristotle through our postgenomic era with a central focus on cladistics. In 1957, Julian Huxley coined cladogenesis to denote splitting from subspeciation. In 1960, the English translation of Willi Hennig’s 1950 work, Systematic Phylogenetics, was published, which received strong opposition from pheneticists, such as numerical taxonomists Peter Sneath and Robert Sokal, and evolutionary taxonomist Ernst Mayr, and sparked acrimonious debates in 1960–1980. In 1977–1990, Carl Woese pioneered the use of small subunit rRNA gene sequences to delimit the three domains of cellular life and established major prokaryotic phyla. Cladistics has since dominated taxonomy. Despite being compatible with modern microbiological observations, i.e. organisms with unusual phenotypes, restricted expression of characteristics and occasionally being uncultivable, increasing recognition of the pervasiveness and abundance of horizontal gene transfer has challenged the relevance and validity of cladistics. The mosaic nature of eukaryotic and prokaryotic genomes was also gradually discovered. In the mid-2000s, high-throughput and whole-genome sequencing became routine and complex genealogies of organisms have led to the proposal of a reticulated web of life. While genomics only indirectly leads to understanding of functional adaptations to ecological niches, computational modeling of entire organisms is underway and the gap between genomics and phenetics may soon be bridged. Controversies are not expected to settle as taxonomic classifications shall remain subjective to serve the human scientist, not the classified.
|
23
|
Jiménez-Guri E, Huerta-Cepas J, Cozzuto L, Wotton KR, Kang H, Himmelbauer H, Roma G, Gabaldón T, Jaeger J. Comparative transcriptomics of early dipteran development. BMC Genomics 2013; 14:123. [PMID: 23432914 PMCID: PMC3616871 DOI: 10.1186/1471-2164-14-123] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2012] [Accepted: 02/19/2013] [Indexed: 12/24/2022] Open
Abstract
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies).
Affiliation(s)
- Eva Jiménez-Guri
- EMBL/CRG Research Unit in Systems Biology, Centre de Regulació Genòmica (CRG), and Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
24
|
Silva LL, Marcet-Houben M, Nahum LA, Zerlotini A, Gabaldón T, Oliveira G. The Schistosoma mansoni phylome: using evolutionary genomics to gain insight into a parasite's biology. BMC Genomics 2012; 13:617. [PMID: 23148687 PMCID: PMC3534613 DOI: 10.1186/1471-2164-13-617] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2012] [Accepted: 10/22/2012] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Schistosoma mansoni is one of the causative agents of schistosomiasis, a neglected tropical disease that affects about 237 million people worldwide. Despite recent efforts, we still lack a general understanding of the relevant host-parasite interactions, and the possible treatments are limited by the emergence of resistant strains and the absence of a vaccine. The S. mansoni genome has been completely sequenced and is still under continuous annotation. Nevertheless, more than 45% of the encoded proteins remain without experimental characterization or even functional prediction. To improve our knowledge regarding the biology of this parasite, we conducted a proteome-wide evolutionary analysis to provide a broad view of S. mansoni proteome evolution and to improve its functional annotation. RESULTS Using a phylogenomic approach, we reconstructed the S. mansoni phylome, which comprises the evolutionary histories of all parasite proteins and their homologs across 12 other organisms. The analysis of a total of 7,964 phylogenies allowed a deeper understanding of genomic complexity and evolutionary adaptations to a parasitic lifestyle. In particular, the identification of lineage-specific gene duplications pointed to the diversification of several protein families that are relevant for host-parasite interaction, including proteases, tetraspanins, fucosyltransferases, venom allergen-like proteins, and tegumental-allergen-like proteins. In addition to the evolutionary knowledge, the phylome data enabled us to automatically re-annotate 3,451 proteins through a phylogeny-based approach rather than through sequence similarity searches alone. To allow further exploitation of this valuable data, all information has been made available at PhylomeDB (http://www.phylomedb.org). CONCLUSIONS In this study, we used an evolutionary approach to assess S. mansoni parasite biology, improve genome/proteome functional annotation, and provide insights into host-parasite interactions. Taking advantage of a proteome-wide perspective rather than focusing on individual proteins, we identified that this parasite has experienced specific gene duplication events, particularly affecting genes that are potentially related to the parasitic lifestyle. These innovations may be related to the mechanisms that protect S. mansoni against host immune responses and may be important adaptations for parasite survival in a potentially hostile environment. Continuing this work, a comparative analysis involving genomic, transcriptomic, and proteomic data from other helminth parasites, other parasites, and vectors will supply more information regarding the parasite's biology as well as host-parasite interactions.
Affiliation(s)
- Larissa Lopes Silva
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais – UFMG, Belo Horizonte, MG, Brazil
- Marina Marcet-Houben
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Dr. Aiguader, 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Laila Alves Nahum
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Faculdade Infórium de Tecnologia, Belo Horizonte, MG, 30130-180, Brazil
- Adhemar Zerlotini
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
- Laboratório Multiusuário de Bioinformática, Embrapa Informática Agropecuária, Campinas, São Paulo, Brazil
- Toni Gabaldón
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Dr. Aiguader, 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Guilherme Oliveira
- Grupo de Genômica e Biologia Computacional, Centro de Pesquisas René Rachou. Instituto Nacional de Ciência e Tecnologia em Doenças Tropicais. Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, MG, 30190-002, Brazil
- Centro de Excelência em Bioinformática, Fundação Oswaldo Cruz – FIOCRUZ, Belo Horizonte, MG, Brazil
|
25
|
Koonin EV, Wolf YI. Evolution of microbes and viruses: a paradigm shift in evolutionary biology? Front Cell Infect Microbiol 2012; 2:119. [PMID: 22993722 PMCID: PMC3440604 DOI: 10.3389/fcimb.2012.00119] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.6] [Received: 08/08/2012] [Accepted: 08/27/2012] [Indexed: 01/21/2023]
Abstract
When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria had been futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
26
|
Hadrys H, Simon S, Kaune B, Schmitt O, Schöner A, Jakob W, Schierwater B. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate. PLoS One 2012; 7:e34682. [PMID: 22685537 PMCID: PMC3369913 DOI: 10.1371/journal.pone.0034682] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/18/2012] [Accepted: 03/08/2012] [Indexed: 01/10/2023]
Abstract
Among gene families it is the Hox genes, and among metazoan animals it is the insects (Hexapoda), that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect-specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success, with insects showing the highest rate of homeobox sequence evolution.
Affiliation(s)
- Heike Hadrys
- ITZ, Division of Ecology and Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany.
|
27
|
Puigbò P, Wolf YI, Koonin EV. Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life. Methods Mol Biol 2012; 856:53-79. [PMID: 22399455 PMCID: PMC3842619 DOI: 10.1007/978-1-61779-585-5_3] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/24/2022]
Abstract
Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
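The tree-comparison idea behind the abstract above can be illustrated with a plain, unweighted split distance. This is a minimal sketch only: it assumes trees are given as nested tuples of leaf names, the function names are hypothetical, and it does not reproduce the bootstrap weighting that defines the BSD method, whose exact formula is not given in the abstract.

```python
def splits(tree):
    """Collect the non-trivial bipartitions (splits) of a nested-tuple tree."""
    found = set()

    def leaves_of(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaves_of(child) for child in node))

    all_leaves = leaves_of(tree)

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        # A split is informative only if both sides hold at least two leaves.
        if len(clade) >= 2 and len(other) >= 2:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def split_distance(tree_a, tree_b):
    """Fraction of splits that occur in only one of the two trees."""
    sa, sb = splits(tree_a), splits(tree_b)
    total = len(sa) + len(sb)
    if total == 0:
        return 0.0
    return len(sa ^ sb) / total


tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(split_distance(tree_1, tree_2))
```

A bootstrap-aware variant would weight each split by its support instead of counting every split equally; that weighting is left out here because the abstract does not specify it.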
Affiliation(s)
- Pere Puigbò
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|
28
|
Cohen O, Pupko T. Inference of gain and loss events from phyletic patterns using stochastic mapping and maximum parsimony: a simulation study. Genome Biol Evol 2011; 3:1265-75. [PMID: 21971516 PMCID: PMC3215202 DOI: 10.1093/gbe/evr101] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Accepted: 09/27/2011] [Indexed: 12/26/2022]
Abstract
Bacterial evolution is characterized by frequent gain and loss events of gene families. These events can be inferred from phyletic pattern data, a compact representation of the gene family repertoire across multiple genomes. The maximum parsimony paradigm is a classical and prevalent approach for the detection of gene family gains and losses mapped on specific branches. We and others have previously developed probabilistic models that aim to account for the stochastic dynamics of gain and loss. These models are a critical component of a methodology termed stochastic mapping, in which probabilities and expectations of gain and loss events are estimated for each branch of an underlying phylogenetic tree. In this work, we present a phyletic pattern simulator in which the gain and loss dynamics are assumed to follow a continuous-time Markov chain along the tree. Various models and options are implemented to make the simulation software useful for a large number of studies in which binary (presence/absence) data are analyzed. Using this simulation software, we compared the ability of the maximum parsimony and the stochastic mapping approaches to accurately detect gain and loss events along the tree. Our simulations cover a large array of evolutionary scenarios in terms of the propensities for gene family gains and losses and the variability of these propensities among gene families. Although both methods achieve relatively low false positive rates in all simulation schemes, stochastic mapping outperforms maximum parsimony in terms of true positive rates. We further studied the factors that influence the performance of both methods. We find, for example, that the accuracy of maximum parsimony inference is substantially reduced when the goal is to map gain and loss events along internal branches of the phylogenetic tree. Furthermore, the accuracy of stochastic mapping is reduced with smaller data sets (limited number of gene families) due to unreliable estimation of branch lengths. Our simulator and simulation results are additionally relevant for the analysis of other types of binary-coded data, such as the existence of homologous restriction sites, gaps, and introns, to name a few. Both the simulation software and the inference methodology are freely available at a user-friendly server: http://gloome.tau.ac.il/.
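To make the simulation idea in this abstract concrete, the sketch below evolves one presence/absence character along a tree under a two-state continuous-time Markov chain. It is an illustration under stated assumptions, not the authors' simulator: a single constant gain rate and loss rate are assumed, the tree encoding (`(label_or_children, branch_length)`) and all function names are hypothetical, and none of the rate-variation options mentioned in the abstract are modelled.

```python
import math
import random


def transition_probs(gain, loss, t):
    """2x2 matrix P[i][j] = probability of state j after time t, starting in i.

    State 0 = gene family absent, state 1 = present. Requires gain + loss > 0.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    pi_present = gain / total   # stationary probability of presence
    pi_absent = loss / total    # stationary probability of absence
    return [
        [pi_absent + pi_present * decay, pi_present * (1.0 - decay)],
        [pi_absent * (1.0 - decay), pi_present + pi_absent * decay],
    ]


def simulate_pattern(node, state, gain, loss, rng, out=None):
    """Evolve the character down the tree; return {leaf name: 0 or 1}."""
    if out is None:
        out = {}
    label, length = node
    probs = transition_probs(gain, loss, length)
    new_state = 1 if rng.random() < probs[state][1] else 0
    if isinstance(label, str):
        out[label] = new_state          # leaf: record the observed state
    else:
        for child in label:             # internal node: recurse into children
            simulate_pattern(child, new_state, gain, loss, rng, out)
    return out


# Hypothetical three-taxon tree with branch lengths.
toy_tree = ([([("A", 0.2), ("B", 0.2)], 0.1), ("C", 0.3)], 0.0)
pattern = simulate_pattern(toy_tree, 1, gain=0.5, loss=1.0, rng=random.Random(1))
print(pattern)
```

Repeating the call for many independent gene families yields a presence/absence matrix of the kind on which parsimony and stochastic mapping could then be compared.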
Affiliation(s)
- Ofir Cohen
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- National Evolutionary Synthesis Center, Durham, North Carolina
|
29
|
Abstract
The species is a fundamental unit of biological organization, but its relevance for Bacteria and Archaea is still hotly debated. Even more controversial is whether the deeper branches of the ribosomal RNA-derived phylogenetic tree, such as the phyla, have ecological importance. Here, we discuss the ecological coherence of high bacterial taxa in the light of genome analyses and present examples of niche differentiation between deeply diverging groups in terrestrial and aquatic systems. The ecological relevance of high bacterial taxa has implications for bacterial taxonomy, evolution and ecology.
|
30
|
Marcet-Houben M, Gabaldón T. TreeKO: a duplication-aware algorithm for the comparison of phylogenetic trees. Nucleic Acids Res 2011; 39:e66. [PMID: 21335609 PMCID: PMC3105381 DOI: 10.1093/nar/gkr087] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 12/03/2022]
Abstract
Comparisons of tree topologies provide relevant information in evolutionary studies. Most existing methods share the drawback of requiring a complete and exact mapping of terminal nodes between the compared trees. This severely limits the scope of genome-wide analyses, since trees containing duplications are pruned arbitrarily or discarded. To overcome this, we have developed treeKO, an algorithm that enables the comparison of tree topologies, even in the presence of duplication and loss events. To do so, treeKO recursively splits gene trees into pruned trees containing only orthologs and then computes a distance based on the combined analyses of all pruned tree comparisons. In addition, treeKO implements the possibility of computing phylome support values, and reconciliation-based measures such as the number of inferred duplication and loss events.
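The recursive splitting step described in this abstract can be sketched as follows. This is a simplified illustration rather than the treeKO implementation: it assumes that every internal node is already labelled as a speciation ("S") or duplication ("D"), that trees are binary nested tuples, and the function name is hypothetical. The distance computation over the resulting pruned trees is not shown.

```python
from itertools import product


def ortholog_subtrees(node):
    """Split a gene tree into pruned trees that contain no duplication nodes.

    A leaf is a string; an internal node is ("S" | "D", left, right).
    """
    if isinstance(node, str):
        return [node]
    kind, left, right = node
    lefts = ortholog_subtrees(left)
    rights = ortholog_subtrees(right)
    if kind == "D":
        # A duplication separates paralogous lineages: keep each side on its own.
        return lefts + rights
    # A speciation keeps both sides: combine every pruned left with every right.
    return [("S", l, r) for l, r in product(lefts, rights)]


# Hypothetical gene tree: one duplication in the human lineage.
gene_tree = ("S", ("D", "human_1", "human_2"), "mouse_1")
for pruned in ortholog_subtrees(gene_tree):
    print(pruned)
```

Each pruned tree has at most one sequence per duplicated lineage, so it can be compared with a reference tree by a standard topological distance.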
|
31
|
Cohen O, Gophna U, Pupko T. The Complexity Hypothesis Revisited: Connectivity Rather Than Function Constitutes a Barrier to Horizontal Gene Transfer. Mol Biol Evol 2010; 28:1481-9. [DOI: 10.1093/molbev/msq333] [Citation(s) in RCA: 146] [Impact Index Per Article: 9.7] [Indexed: 11/13/2022]
|
32
|
Huerta-Cepas J, Capella-Gutierrez S, Pryszcz LP, Denisov I, Kormes D, Marcet-Houben M, Gabaldón T. PhylomeDB v3.0: an expanding repository of genome-wide collections of trees, alignments and phylogeny-based orthology and paralogy predictions. Nucleic Acids Res 2010; 39:D556-60. [PMID: 21075798 PMCID: PMC3013701 DOI: 10.1093/nar/gkq1109] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Indexed: 11/13/2022]
Abstract
The growing availability of complete genomic sequences from diverse species has brought about the need to scale up phylogenomic analyses, including the reconstruction of large collections of phylogenetic trees. Here, we present the third version of PhylomeDB (http://phylomeDB.org), a public database for genome-wide collections of gene phylogenies (phylomes). Currently, PhylomeDB is the largest phylogenetic repository and hosts 17 phylomes, comprising 416,093 trees and 165,840 alignments. It is also a major source for phylogeny-based orthology and paralogy predictions, covering about 5 million proteins in 717 fully sequenced genomes. For each protein-coding gene in a seed genome, the database provides original and processed alignments, phylogenetic trees derived from various methods and phylogeny-based predictions of orthology and paralogy relationships. The new version of PhylomeDB has been extended with novel data access and visualization features, including the possibility of programmatic access. Available seed species include model organisms such as human, yeast, Escherichia coli or Arabidopsis thaliana, but also alternative model species such as the human pathogen Candida albicans, or the pea aphid Acyrtosiphon pisum. Finally, PhylomeDB is currently being used by several genome sequencing projects that couple the genome annotation process with the reconstruction of the corresponding phylome, a strategy that provides relevant evolutionary insights.
Affiliation(s)
- Jaime Huerta-Cepas
- Bioinformatics and Genomics Programme, Centre de Regulació Genòmica, 08003 Barcelona, Spain
|
33
|
Hendrickson RC, Wang C, Hatcher EL, Lefkowitz EJ. Orthopoxvirus genome evolution: the role of gene loss. Viruses 2010; 2:1933-1967. [PMID: 21994715 PMCID: PMC3185746 DOI: 10.3390/v2091933] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Received: 07/14/2010] [Revised: 08/25/2010] [Accepted: 09/01/2010] [Indexed: 12/26/2022]
Abstract
Poxviruses are highly successful pathogens, known to infect a variety of hosts. The family Poxviridae includes Variola virus, the causative agent of smallpox, which has been eradicated as a public health threat but could potentially reemerge as a bioterrorist threat. The risk scenario includes other animal poxviruses and genetically engineered manipulations of poxviruses. Studies of orthologous gene sets have established the evolutionary relationships of members within the Poxviridae family. It is not clear, however, how variations between family members arose in the past, an important issue in understanding how these viruses may vary and possibly produce future threats. Using a newly developed poxvirus-specific tool, we predicted accurate gene sets for viruses with completely sequenced genomes in the genus Orthopoxvirus. Employing sensitive sequence comparison techniques together with comparison of syntenic gene maps, we established the relationships between all viral gene sets. These techniques allowed us to unambiguously identify the gene loss/gain events that have occurred over the course of orthopoxvirus evolution. It is clear that no existing Orthopoxvirus species has acquired protein-coding genes unique to that species. All genes found in existing species are also present in members of the species Cowpox virus, and cowpox virus strains contain every gene present in any other orthopoxvirus strain. These results support a theory of reductive evolution in which the reduction in size of the core gene set of a putative ancestral virus played a critical role in speciation and in confining any newly emerging virus species to a particular environmental (host or tissue) niche.
Affiliation(s)
- Robert Curtis Hendrickson
- Department of Microbiology, University of Alabama at Birmingham, BBRB 276/11, 845 19th St S, Birmingham, AL 35222, USA
- Chunlin Wang
- Stanford Genome Technology Center, Stanford University, 855 California Ave, Palo Alto, CA 94304, USA
- Eneida L. Hatcher
- Department of Microbiology, University of Alabama at Birmingham, BBRB 276/11, 845 19th St S, Birmingham, AL 35222, USA
- Elliot J. Lefkowitz
- Department of Microbiology, University of Alabama at Birmingham, BBRB 276/11, 845 19th St S, Birmingham, AL 35222, USA
|
34
|
Characterization of novel Brucella strains originating from wild native rodent species in North Queensland, Australia. Appl Environ Microbiol 2010; 76:5837-45. [PMID: 20639360 DOI: 10.1128/aem.00620-10] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.3] [Indexed: 11/20/2022]
Abstract
We report on the characterization of a group of seven novel Brucella strains isolated in 1964 from three native rodent species in North Queensland, Australia, during a survey of wild animals. The strains were initially reported to be Brucella suis biovar 3 on the basis of microbiological test results. Our results indicated that the rodent strains had microbiological traits distinct from those of B. suis biovar 3 and all other Brucella spp. To reinvestigate these rodent strains, we sequenced the 16S rRNA, recA, and rpoB genes and nine housekeeping genes and also performed multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA). The rodent strains have a unique 16S rRNA gene sequence compared to the sequences of the classical Brucella spp. Sequence analysis of the recA, rpoB, and nine housekeeping genes reveals that the rodent strains are genetically identical to each other at these loci and divergent from any of the currently described Brucella sequence types. However, all seven of the rodent strains do exhibit distinctive allelic MLVA profiles, although none demonstrated an amplicon for VNTR 07, whereas the other Brucella spp. did. Phylogenetic analysis of the MLVA data reveals that the rodent strains form a distinct clade separate from the classical Brucella spp. Furthermore, whole-genome sequence comparison using the maximal unique exact matches index (MUMi) demonstrated a high degree of relatedness of one of the seven rodent Brucella strains (strain NF 2653) to another Australian rodent Brucella strain (strain 83-13). Our findings strongly suggest that this group of Brucella strains isolated from wild Australian rodents defines a new species in the Brucella genus.
|
35
|
Detwiler JT, Criscione CD. An infectious topic in reticulate evolution: introgression and hybridization in animal parasites. Genes (Basel) 2010; 1:102-23. [PMID: 24710013 PMCID: PMC3960858 DOI: 10.3390/genes1010102] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Received: 04/29/2010] [Revised: 06/07/2010] [Accepted: 06/07/2010] [Indexed: 02/08/2023]
Abstract
Little attention has been given to the role that introgression and hybridization have played in the evolution of parasites. Most studies are host-centric and ask if the hybrid of a free-living species is more or less susceptible to parasite infection. Here we focus on what is known about how introgression and hybridization have influenced the evolution of protozoan and helminth parasites of animals. There are reports of genome or gene introgression from distantly related taxa into apicomplexans and filarial nematodes. Most common are genetic based reports of potential hybridization among congeneric taxa, but in several cases, more work is needed to definitively conclude current hybridization. In the medically important Trypanosoma it is clear that some clonal lineages are the product of past hybridization events. Similarly, strong evidence exists for current hybridization in human helminths such as Schistosoma and Ascaris. There remain topics that warrant further examination such as the potential hybrid origin of polyploid platyhelminths. Furthermore, little work has investigated the phenotype or fitness, and even less the epidemiological significance of hybrid parasites.
Affiliation(s)
- Jillian T Detwiler
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, TX 77843, USA.
- Charles D Criscione
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, TX 77843, USA.
|
36
|
Bohlin L, Göransson U, Alsmark C, Wedén C, Backlund A. Natural products in modern life science. Phytochemistry Reviews: Proceedings of the Phytochemical Society of Europe 2010; 9:279-301. [PMID: 20700376 PMCID: PMC2912726 DOI: 10.1007/s11101-009-9160-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 10/11/2009] [Accepted: 11/17/2009] [Indexed: 05/02/2023]
Abstract
With a realistic threat against biodiversity in rain forests and in the sea, a sustainable use of natural products is becoming more and more important. Basic research directed at different organisms in Nature could reveal unexpected insights into fundamental biological mechanisms but also new pharmaceutical or biotechnological possibilities of more immediate use. Many different strategies have been used in prospecting the biodiversity of Earth in the search for novel structure-activity relationships, which has resulted in important discoveries in drug development. However, we believe that the development of multidisciplinary incentives will be necessary for a future successful exploration of Nature. With this aim, one way would be a modernization and renewal of a venerable, proven interdisciplinary science, Pharmacognosy, which represents an integrated way of studying biological systems. This has been demonstrated based on an explanatory model where the different parts of the model are explained by our ongoing research: anti-inflammatory natural products have been discovered based on ethnopharmacological observations; marine sponges in cold water have yielded substances with ecological impact; a combined strategy of ecology and chemistry has revealed new insights into the biodiversity of fungi; in-depth studies of cyclic peptides (cyclotides) have created new possibilities for engineering of bioactive peptides; development of new strategies using phylogeny and chemography has resulted in new possibilities for navigating chemical and biological space; and using bioinformatic tools for understanding lateral gene transfer could provide potential drug targets. A multidisciplinary subject like Pharmacognosy, one of several scientific disciplines bridging biology and chemistry with medicine, has a strategic position for studies of complex scientific questions based on observations in Nature. Furthermore, natural product research based on intriguing scientific questions in Nature can be of value in attracting young students to modern life science.
Affiliation(s)
- Lars Bohlin
- Division of Pharmacognosy, Department of Medicinal Chemistry, Biomedical Centre, Uppsala University, Box 574, 751 23 Uppsala, Sweden
- Ulf Göransson
- Division of Pharmacognosy, Department of Medicinal Chemistry, Biomedical Centre, Uppsala University, Box 574, 751 23 Uppsala, Sweden
- Cecilia Alsmark
- Division of Pharmacognosy, Department of Medicinal Chemistry, Biomedical Centre, Uppsala University, Box 574, 751 23 Uppsala, Sweden
- Christina Wedén
- Division of Pharmacognosy, Department of Medicinal Chemistry, Biomedical Centre, Uppsala University, Box 574, 751 23 Uppsala, Sweden
- Anders Backlund
- Division of Pharmacognosy, Department of Medicinal Chemistry, Biomedical Centre, Uppsala University, Box 574, 751 23 Uppsala, Sweden
|
37
|
Hartman AL, Norais C, Badger JH, Delmas S, Haldenby S, Madupu R, Robinson J, Khouri H, Ren Q, Lowe TM, Maupin-Furlow J, Pohlschroder M, Daniels C, Pfeiffer F, Allers T, Eisen JA. The complete genome sequence of Haloferax volcanii DS2, a model archaeon. PLoS One 2010; 5:e9605. [PMID: 20333302 PMCID: PMC2841640 DOI: 10.1371/journal.pone.0009605] [Citation(s) in RCA: 204] [Impact Index Per Article: 13.6] [Received: 07/24/2009] [Accepted: 02/11/2010] [Indexed: 11/24/2022]
Abstract
BACKGROUND Haloferax volcanii is an easily culturable moderate halophile that grows on simple defined media, is readily transformable, and has a relatively stable genome. This, in combination with its biochemical and genetic tractability, has made Hfx. volcanii a key model organism, not only for the study of halophilicity, but also for archaeal biology in general. METHODOLOGY/PRINCIPAL FINDINGS We report here the sequencing and analysis of the genome of Hfx. volcanii DS2, the type strain of this species. The genome contains a main 2.848 Mb chromosome, three smaller chromosomes pHV1, 3, 4 (85, 438, 636 kb, respectively) and the pHV2 plasmid (6.4 kb). CONCLUSIONS/SIGNIFICANCE The completed genome sequence, presented here, provides an invaluable tool for further in vivo and in vitro studies of Hfx. volcanii.
Affiliation(s)
- Amber L. Hartman
- Department of Biology, Johns Hopkins University, Baltimore, Maryland, United States of America
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
- UC Davis Genome Center, University of California Davis, Davis, California, United States of America
- Cédric Norais
- Institut de Génétique et Microbiologie, Université Paris-Sud, Paris, France
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Jonathan H. Badger
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
- Stéphane Delmas
- Institute of Genetics, University of Nottingham, Nottingham, United Kingdom
- Sam Haldenby
- Institute of Genetics, University of Nottingham, Nottingham, United Kingdom
| | - Ramana Madupu
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
| | - Jeffrey Robinson
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
| | - Hoda Khouri
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
| | - Qinghu Ren
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
| | - Todd M. Lowe
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Julie Maupin-Furlow
- Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, United States of America
| | - Mecky Pohlschroder
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Charles Daniels
- Department of Microbiology, Ohio State University, Columbus, Ohio, United States of America
| | - Friedhelm Pfeiffer
- Department of Membrane Biochemistry, Max-Planck-Institute of Biochemistry, Martinsried, Germany
| | - Thorsten Allers
- Institute of Genetics, University of Nottingham, Nottingham, United Kingdom
| | - Jonathan A. Eisen
- The Institute for Genomic Research (J. Craig Venter Institute), Rockville, Maryland, United States of America
- UC Davis Genome Center, University of California Davis, Davis, California, United States of America
- Department of Medical Microbiology and Immunology, University of California Davis, Davis, California, United States of America
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
| |
Collapse
|
38
|
Prediction of horizontal gene transfers in eukaryotes: approaches and challenges. Biochem Soc Trans 2009; 37:792-5. [PMID: 19614596 DOI: 10.1042/bst0370792] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022]
Abstract
HGT (horizontal gene transfer) is recognized as an important force in bacterial evolution. Now that many eukaryotic genomes have been sequenced, it has become possible to carry out studies of HGT in eukaryotes. The present review compares the different approaches that exist for identifying HGT genes and assesses them in the context of studying eukaryotic evolution. The metabolic evolution resource metaTIGER is then described, with discussion of its application in identification of HGT in eukaryotes.
|
39
|
Han Y, Burnette JM, Wessler SR. TARGeT: a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences. Nucleic Acids Res 2009; 37:e78. [PMID: 19429695 PMCID: PMC2699529 DOI: 10.1093/nar/gkp295] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 03/07/2009] [Revised: 04/15/2009] [Accepted: 04/15/2009] [Indexed: 11/23/2022] Open
Abstract
Gene families compose a large proportion of eukaryotic genomes. The rapidly expanding genomic sequence database provides a good opportunity to study gene family evolution and function. However, most gene family identification programs are restricted to searching protein databases where data are often lagging behind the genomic sequence data. Here, we report a user-friendly web-based pipeline, named TARGeT (Tree Analysis of Related Genes and Transposons), which uses either a DNA or amino acid 'seed' query to: (i) automatically identify and retrieve gene family homologs from a genomic database, (ii) characterize gene structure and (iii) perform phylogenetic analysis. Due to its high speed, TARGeT is also able to characterize very large gene families, including transposable elements (TEs). We evaluated TARGeT using well-annotated datasets, including the ascorbate peroxidase gene family of rice, maize and sorghum and several TE families in rice. In all cases, TARGeT rapidly recapitulated the known homologs and predicted new ones. We also demonstrated that TARGeT outperforms similar pipelines and has functionality that is not offered elsewhere.
Affiliation(s)
- Susan R. Wessler
- Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
|
40
|
Abstract
Interest in understanding the transition from prevertebrates to vertebrates at the molecular level has resulted in accumulating genomic and transcriptomic sequence data for the earliest groups of extant vertebrates, namely, hagfishes (Myxiniformes) and lampreys (Petromyzontiformes). Molecular phylogenetic studies on species phylogeny have revealed the monophyly of cyclostomes and the deep divergence between hagfishes and lampreys (more than 400 million years). In parallel, recent molecular phylogenetic studies have shed light on the complex evolution of the cyclostome genome. This consists of whole genome duplications, shared at least partly with gnathostomes (jawed vertebrates), and cyclostome lineage-specific secondary modifications of the genome, such as gene gains and losses. Therefore, the analysis of cyclostome genomes requires caution in distinguishing between orthology and paralogy in gene molecular phylogeny at the gene family scale, as well as between apomorphic and plesiomorphic genomic traits in larger-scale analyses. In this review, we propose possible ways of improving the resolvability of these evolutionary events, and discuss probable scenarios for cyclostome genome evolution, with special emphasis on the hypothesis that two-round (2R) genome duplication events occurred before the divergence between cyclostomes and gnathostomes, and therefore that a post-2R state is a genomic synapomorphy for all extant vertebrates.
Affiliation(s)
- Shigehiro Kuraku
- Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, Universitätsstrasse 10, 78457 Konstanz, Germany.
|
41
|
Whitaker JW, McConkey GA, Westhead DR. The transferome of metabolic genes explored: analysis of the horizontal transfer of enzyme encoding genes in unicellular eukaryotes. Genome Biol 2009; 10:R36. [PMID: 19368726 PMCID: PMC2688927 DOI: 10.1186/gb-2009-10-4-r36] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Received: 12/18/2008] [Revised: 04/06/2009] [Accepted: 04/15/2009] [Indexed: 12/02/2022] Open
Abstract
Metabolic network analysis in multiple eukaryotes identifies how horizontal and endosymbiotic gene transfer of metabolic enzyme-encoding genes leads to functional gene gain during evolution. Background: Metabolic networks are responsible for many essential cellular processes, and exhibit a high level of evolutionary conservation from bacteria to eukaryotes. If genes encoding metabolic enzymes are horizontally transferred and are advantageous, they are likely to become fixed. Horizontal gene transfer (HGT) has played a key role in prokaryotic evolution and its importance in eukaryotes is increasingly evident. High levels of endosymbiotic gene transfer (EGT) accompanied the establishment of plastids and mitochondria, and more recent events have allowed further acquisition of bacterial genes. Here, we present the first comprehensive multi-species analysis of E/HGT of genes encoding metabolic enzymes from bacteria to unicellular eukaryotes. Results: The phylogenetic trees of 2,257 metabolic enzymes were used to make E/HGT assertions in ten groups of unicellular eukaryotes, revealing the sources and metabolic processes of the transferred genes. Analyses revealed a preference for enzymes encoded by genes gained through horizontal and endosymbiotic transfers to be connected in the metabolic network. Enrichment in particular functional classes was particularly revealing: alongside plastid related processes and carbohydrate metabolism, this highlighted a number of pathways in eukaryotic parasites that are rich in enzymes encoded by transferred genes, and potentially key to pathogenicity. The plant parasites Phytophthora were discovered to have a potential pathway for lipopolysaccharide biosynthesis of E/HGT origin not seen before in eukaryotes outside the Plantae. Conclusions: The number of enzymes encoded by genes gained through E/HGT has been established, providing insight into functional gain during the evolution of unicellular eukaryotes. In eukaryotic parasites, genes encoding enzymes that have been gained through horizontal transfer may be attractive drug targets if they are part of processes not present in the host, or are significantly diverged from equivalent host enzymes.
Affiliation(s)
- John W Whitaker
- Institute of Molecular and Cellular Biology, University of Leeds, Leeds, West Yorkshire, LS2 9JT, UK
|
42
|
Abstract
A universal Tree of Life has been a longstanding goal of the biosciences. The most common Tree of Life, based on the small subunit rRNA gene, may or may not represent the phylogenetic history of microorganisms. The horizontal transfer of genes from one taxon to another provides a means by which each gene may tell of an independent history. When complete genomes became available, the extent to which horizontal gene transfer (HGT) has occurred became more evident. When using genomic data to study the Tree of Life, one can use any of the four broad approaches: (i) build lots of individual gene trees ("phylogenomics"), (ii) concatenate genes together for an analysis yielding one "supergene" tree, (iii) form a single tree based on the "gene content" within genomes using either orthologs or homologs, or (iv) investigate the order of genes within genomes to discern some aspects of microbial evolution. The application of whole genome tree building has suggested that there is a core tree, that such a core tree can be investigated using these varied methods, and that the results are largely similar to those of the rRNA universal Tree of Life. Some of the most interesting features of the rRNA tree, such as early diverging hyperthermophilic lineages are still uncertain, but remain a possibility. Genomic trees and geologic evidence together suggest that the vertical descent of genes and the horizontal transfer of genes between genetically similar lineages ultimately results in a core Tree of Life with at least some lineages that have phenotypic characteristics recognizable for billions of years.
Affiliation(s)
- Christopher H House
- Department of Geosciences and Pennsylvania State Astrobiology Research Center, Pennsylvania State University, University Park, PA, USA
|
43
|
Alsmark UC, Sicheritz-Ponten T, Foster PG, Hirt RP, Embley TM. Horizontal gene transfer in eukaryotic parasites: a case study of Entamoeba histolytica and Trichomonas vaginalis. Methods Mol Biol 2009; 532:489-500. [PMID: 19271203 DOI: 10.1007/978-1-60327-853-9_28] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022]
Abstract
Over the past few years it has become apparent that horizontal gene transfer (HGT) has played an important role in the evolution of pathogenic prokaryotes. What is less clear is the exact role that HGT has played in shaping the metabolism of eukaryotic organisms. The main problems are the reliable inference of HGT on a genomic scale as well as the functional assignment of genes in these poorly studied organisms. We have screened the completed genomes of the protists Entamoeba histolytica and Trichomonas vaginalis for cases of HGT from prokaryotes. Using a fast primary screen followed by a conservative phylogenetic approach, we found 68 and 153 recent cases of HGT in the respective organisms. The majority of transferred genes that fall into functional categories code for enzymes involved in metabolism. We found a broad range of prokaryotic lineages represented among the donors, but organisms that share similar environmental niches with E. histolytica and T. vaginalis, such as the gut and the vaginal mucosa, dominate.
Affiliation(s)
- U Cecilia Alsmark
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, UK
|
44
|
Podar M, Anderson I, Makarova KS, Elkins JG, Ivanova N, Wall MA, Lykidis A, Mavromatis K, Sun H, Hudson ME, Chen W, Deciu C, Hutchison D, Eads JR, Anderson A, Fernandes F, Szeto E, Lapidus A, Kyrpides NC, Saier MH, Richardson PM, Rachel R, Huber H, Eisen JA, Koonin EV, Keller M, Stetter KO. A genomic analysis of the archaeal system Ignicoccus hospitalis-Nanoarchaeum equitans. Genome Biol 2008; 9:R158. [PMID: 19000309 PMCID: PMC2614490 DOI: 10.1186/gb-2008-9-11-r158] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Received: 09/05/2008] [Revised: 10/21/2008] [Accepted: 11/10/2008] [Indexed: 01/03/2023] Open
Abstract
Sequencing of the complete genome of Ignicoccus hospitalis gives insight into its association with another species of Archaea, Nanoarchaeum equitans. Background: The relationship between the hyperthermophiles Ignicoccus hospitalis and Nanoarchaeum equitans is the only known example of a specific association between two species of Archaea. Little is known about the mechanisms that enable this relationship. Results: We sequenced the complete genome of I. hospitalis and found it to be the smallest among independent, free-living organisms. A comparative genomic reconstruction suggests that the I. hospitalis lineage has lost most of the genes associated with a heterotrophic metabolism that is characteristic of most of the Crenarchaeota. A streamlined genome is also suggested by a low frequency of paralogs and fragmentation of many operons. However, this process appears to be partially balanced by lateral gene transfer from archaeal and bacterial sources. Conclusions: A combination of genomic and cellular features suggests highly efficient adaptation to the low energy yield of sulfur-hydrogen respiration and efficient inorganic carbon and nitrogen assimilation. Evidence of lateral gene exchange between N. equitans and I. hospitalis indicates that the relationship has impacted both genomes. This association is the simplest symbiotic system known to date and a unique model for studying mechanisms of interspecific relationships at the genomic and metabolic levels.
Affiliation(s)
- Mircea Podar
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN 37831, USA.
|
45
|
Koonin EV, Wolf YI. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res 2008; 36:6688-719. [PMID: 18948295 PMCID: PMC2588523 DOI: 10.1093/nar/gkn668] [Citation(s) in RCA: 480] [Impact Index Per Article: 28.2] [Indexed: 01/04/2023] Open
Abstract
The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
46
|
Podell S, Gaasterland T, Allen EE. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm. BMC Bioinformatics 2008; 9:419. [PMID: 18840280 PMCID: PMC2573894 DOI: 10.1186/1471-2105-9-419] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 05/14/2008] [Accepted: 10/07/2008] [Indexed: 01/30/2023] Open
Abstract
Background: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not. Description: The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results organized into a web-searchable relational database, called the DarkHorse HGT Candidate Resource. Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether or not sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence. Conclusion: The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.
Affiliation(s)
- Sheila Podell
- Marine Biology Research Division, Scripps Institution of Oceanography University of California at San Diego, La Jolla, CA 92093 USA.
|
47
|
Glansdorff N, Xu Y, Labedan B. The last universal common ancestor: emergence, constitution and genetic legacy of an elusive forerunner. Biol Direct 2008; 3:29. [PMID: 18613974 PMCID: PMC2478661 DOI: 10.1186/1745-6150-3-29] [Citation(s) in RCA: 186] [Impact Index Per Article: 10.9] [Received: 06/24/2008] [Accepted: 07/09/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Since the reclassification of all life forms in three Domains (Archaea, Bacteria, Eukarya), the identity of their alleged forerunner (Last Universal Common Ancestor or LUCA) has been the subject of extensive controversies: progenote or already complex organism, prokaryote or protoeukaryote, thermophile or mesophile, product of a protracted progression from simple replicators to complex cells or born in the cradle of "catalytically closed" entities? We present a critical survey of the topic and suggest a scenario. RESULTS LUCA does not appear to have been a simple, primitive, hyperthermophilic prokaryote but rather a complex community of protoeukaryotes with an RNA genome, adapted to a broad range of moderate temperatures, genetically redundant, morphologically and metabolically diverse. LUCA's genetic redundancy predicts loss of paralogous gene copies in divergent lineages to be a significant source of phylogenetic anomalies, i.e. instances where a protein tree departs from the SSU-rRNA genealogy; consequently, horizontal gene transfer may not have the rampant character assumed by many. Examining membrane lipids suggests LUCA had sn1,2 ester fatty acid lipids from which Archaea emerged from the outset as thermophilic by "thermoreduction," with a new type of membrane, composed of sn2,3 ether isoprenoid lipids; this occurred without major enzymatic reconversion. Bacteria emerged by reductive evolution from LUCA and some lineages further acquired extreme thermophily by convergent evolution. This scenario is compatible with the hypothesis that the RNA to DNA transition resulted from different viral invasions as proposed by Forterre. Beyond the controversy opposing "replication first" to "metabolism first", the predictive arguments of theories on "catalytic closure" or "compositional heredity" heavily weigh in favour of LUCA's ancestors having emerged as complex, self-replicating entities from which a genetic code arose under natural selection. CONCLUSION Life was born complex and the LUCA displayed that heritage. It had the "body" of a mesophilic eukaryote well before maturing by endosymbiosis into an organism adapted to an atmosphere rich in oxygen. Abundant indications suggest reductive evolution of this complex and heterogeneous entity towards the "prokaryotic" Domains Archaea and Bacteria. The word "prokaryote" should be abandoned because it is epistemologically unsound. REVIEWERS This article was reviewed by Anthony Poole, Patrick Forterre, and Nicolas Galtier.
Affiliation(s)
- Nicolas Glansdorff
- JM Wiame Research Institute for Microbiology and Vrije Universiteit Brussel, 1 ave E. Gryzon, B-1070 Brussels, Belgium.
|
48
|
Levasseur A, Pontarotti P, Poch O, Thompson JD. Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008; 4:121-37. [PMID: 19204813 PMCID: PMC2614184 DOI: 10.4137/ebo.s597] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022] Open
Abstract
The recent availability of the complete genome sequences of a large number of model organisms, together with the immense amount of data being produced by the new high-throughput technologies, means that we can now begin comparative analyses to understand the mechanisms involved in the evolution of the genome and their consequences in the study of biological systems. Phylogenetic approaches provide a unique conceptual framework for performing comparative analyses of all this data, for propagating information between different systems and for predicting or inferring new knowledge. As a result, phylogeny-based inference systems are now playing an increasingly important role in most areas of high throughput genomics, including studies of promoters (phylogenetic footprinting), interactomes (based on the presence and degree of conservation of interacting proteins), and in comparisons of transcriptomes or proteomes (phylogenetic proximity and co-regulation/co-expression). Here we review the recent developments aimed at making automatic, reliable phylogeny-based inference feasible in large-scale projects. We also discuss how evolutionary concepts and phylogeny-based inference strategies are now being exploited in order to understand the evolution and function of biological systems. Such advances will be fundamental for the success of the emerging disciplines of systems biology and synthetic biology, and will have wide-reaching effects in applied fields such as biotechnology, medicine and pharmacology.
Affiliation(s)
- Anthony Levasseur
- Phylogenomics Laboratory, EA 3781 Evolution Biologique, Université de Provence, 13331 Marseille, France
|
49
|
Lima WC, Paquola AC, Varani AM, Van Sluys MA, Menck CF. Laterally transferred genomic islands in Xanthomonadales related to pathogenicity and primary metabolism. FEMS Microbiol Lett 2008; 281:87-97. [DOI: 10.1111/j.1574-6968.2008.01083.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Indexed: 11/27/2022] Open
|
50
|
Fisher KM. Bayesian reconstruction of ancestral expression of the LEA gene families reveals propagule-derived desiccation tolerance in resurrection plants. Am J Bot 2008; 95:506-515. [PMID: 21632376 DOI: 10.3732/ajb.95.4.506] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/30/2023]
Abstract
Desiccation tolerance is a complex trait that is broadly but infrequently present throughout the evolutionary tree of life. Desiccation tolerance has played a significant role in land plant evolution, in both the vegetative and reproductive life history stages. In the land plants, the late embryogenesis abundant (LEA) gene families are involved in both abiotic stress tolerance and the development of reproductive propagules. They are also a major component of vegetative desiccation tolerance. Phylogenies were estimated for four families of LEA genes from Arabidopsis, Physcomitrella, and the desiccation tolerant plants Tortula ruralis, Craterostigma plantagineum, and Xerophyta humilis. Microarray expression data from Arabidopsis and a subset of the Physcomitrella LEAs were used to estimate ancestral expression patterns in the LEA families and to evaluate alternative hypotheses for the origins of vegetative desiccation tolerance in the flowering plants. The results contradict the idea that vegetative desiccation tolerance in the resurrection angiosperms Craterostigma and Xerophyta arose through the co-option of genes exclusively related to stress tolerance, and support the propagule-derived origin of vegetative desiccation tolerance in the resurrection plants.
Affiliation(s)
- Kirsten M Fisher
- National Evolutionary Synthesis Center, 2024 West Main Street Suite A200, Durham, North Carolina 27705 USA
|