1
|
Forterre P. The Last Universal Common Ancestor of Ribosome-Encoding Organisms: Portrait of LUCA. J Mol Evol 2024; 92:550-583. [PMID: 39158619 DOI: 10.1007/s00239-024-10186-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 06/25/2024] [Indexed: 08/20/2024]
Abstract
The existence of LUCA in the distant past is the logical consequence of the binary mechanism of cell division. The biosphere in which LUCA and contemporaries were living was the product of a long cellular evolution from the origin of life to the second age of the RNA world. A parsimonious scenario suggests that the molecular fabric of LUCA was much simpler than those of modern organisms, explaining why the evolutionary tempo was faster at the time of LUCA than it was during the diversification of the three domains. Although LUCA was possibly equipped with an RNA genome and most likely lacked an ATP synthase, it was already able to perform basic metabolic functions and to produce efficient proteins. However, the proteome of LUCA and its inferred metabolism remain to be properly explored by in-depth phylogenomic analyses and updated datasets. LUCA was probably a mesophile or a moderate thermophile since phylogenetic analyses indicate that it lacked reverse gyrase, an enzyme systematically present in all hyperthermophiles. The debate about the position of Eukarya in the tree of life, either sister group to Archaea or descendants of Archaea, has important implications for drawing the portrait of LUCA. In the second alternative, one can a priori exclude the presence of specific eukaryotic features in LUCA. In contrast, if Archaea and Eukarya are sister groups, some eukaryotic features, such as the spliceosome, might have been present in LUCA and later lost in Archaea and Bacteria. The nature of the LUCA virome is another matter of debate. I suggest here that DNA viruses only originated during the diversification of the three domains from an RNA-based LUCA to explain the odd distribution pattern of DNA viruses in the tree of life.
|
2
|
Krupovic M, Dolja VV, Koonin EV. The virome of the last eukaryotic common ancestor and eukaryogenesis. Nat Microbiol 2023; 8:1008-1017. [PMID: 37127702 PMCID: PMC11130978 DOI: 10.1038/s41564-023-01378-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/06/2022] [Accepted: 03/29/2023] [Indexed: 05/03/2023]
Abstract
All extant eukaryotes descend from the last eukaryotic common ancestor (LECA), which is thought to have featured complex cellular organization. To gain insight into LECA biology and eukaryogenesis (the origin of the eukaryotic cell, which remains poorly understood), we reconstructed the LECA virus repertoire. We compiled an inventory of eukaryotic hosts of all major virus taxa and reconstructed the LECA virome by inferring the origins of these groups of viruses. The origin of the LECA virome can be traced back to a small set of bacterial, not archaeal, viruses. This provenance of the LECA virome is probably due to the bacterial origin of eukaryotic membranes, which is most compatible with two endosymbiosis events in a syntrophic model of eukaryogenesis. In the first endosymbiosis, a bacterial host engulfed an Asgard archaeon, preventing archaeal viruses from entry owing to a lack of archaeal virus receptors on the external membranes.
Affiliation(s)
- Mart Krupovic
  - Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
- Valerian V Dolja
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
|
3
|
Prondzinsky P, Toyoda S, McGlynn SE. The methanogen core and pangenome: conservation and variability across biology's growth temperature extremes. DNA Res 2023; 30:dsac048. [PMID: 36454681 PMCID: PMC9886072 DOI: 10.1093/dnares/dsac048] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/22/2022] [Revised: 11/09/2022] [Accepted: 11/29/2022] [Indexed: 12/05/2022] Open
Abstract
Temperature is a key variable in biological processes. However, a complete understanding of biological temperature adaptation is lacking, in part because of the unique constraints among different evolutionary lineages and physiological groups. Here we compared the genomes of cultivated psychrotolerant and thermotolerant methanogens, which are physiologically related and span growth temperatures from -2.5°C to 122°C. Despite being phylogenetically distributed amongst three phyla in the archaea, the genomic core of cultivated methanogens comprises about one-third of a given genome, while the genome fraction shared by any two organisms decreases with increasing phylogenetic distance between them. Increased methanogenic growth temperature is associated with reduced genome size, and thermotolerant organisms, which are distributed across the archaeal tree, have larger core genome fractions, suggesting that genome size is governed by temperature rather than phylogeny. Thermotolerant methanogens are enriched in metal and other transporters, and psychrotolerant methanogens are enriched in proteins related to structure and motility. Observed amino acid compositional differences between temperature groups include proteome charge, polarity and unfolding entropy. Our results suggest that in the methanogens, shared physiology maintains a large, conserved genomic core even across large phylogenetic distances and biology's temperature extremes.
Affiliation(s)
- Paula Prondzinsky
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, 152-8550 Tokyo, Japan
  - Department of Chemical Science and Engineering, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, 226-8503 Yokohama, Japan
- Sakae Toyoda
  - Department of Chemical Science and Engineering, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, 226-8503 Yokohama, Japan
- Shawn Erin McGlynn
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, 152-8550 Tokyo, Japan
  - Center for Sustainable Resource Science, RIKEN, 2-1 Hirosawa, Wako, 351-0198 Saitama, Japan
  - Blue Marble Space Institute of Science, Seattle, WA 98154, USA
|
4
|
Affiliation(s)
- Hugo Menet
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
- Vincent Daubin
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
  - * E-mail: (VD); (ET)
- Eric Tannier
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, Villeurbanne, France
  - Inria, centre de recherche de Lyon, Villeurbanne, France
  - * E-mail: (VD); (ET)
|
5
|
Csűrös M. Gain-loss-duplication models for copy number evolution on a phylogeny: Exact algorithms for computing the likelihood and its gradient. Theor Popul Biol 2022; 145:80-94. [DOI: 10.1016/j.tpb.2022.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Revised: 03/07/2022] [Accepted: 03/10/2022] [Indexed: 10/18/2022]
|
6
|
Genomic Insights into the Ecological Role and Evolution of a Novel Thermoplasmata Order, "Candidatus Sysuiplasmatales". Appl Environ Microbiol 2021; 87:e0106521. [PMID: 34524897 DOI: 10.1128/aem.01065-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
Abstract
Recent omics studies have provided invaluable insights into the metabolic potential, adaptation, and evolution of novel archaeal lineages from a variety of extreme environments. We utilized a genome-resolved metagenomic approach to recover eight medium- to high-quality metagenome-assembled genomes (MAGs) that likely represent a new order ("Candidatus Sysuiplasmatales") in the class Thermoplasmata from mine tailings and acid mine drainage (AMD) sediments sampled from two copper mines in South China. 16S rRNA gene-based analyses revealed a narrow habitat range for these uncultured archaea limited to AMD and hot spring-related environments. Metabolic reconstruction indicated a facultatively anaerobic heterotrophic lifestyle. This may allow the archaea to adapt to oxygen fluctuations and is thus in marked contrast to the majority of lineages in the domain Archaea, which typically show obligately anaerobic metabolisms. Notably, "Ca. Sysuiplasmatales" could conserve energy through degradation of fatty acids, amino acid metabolism, and oxidation of reduced inorganic sulfur compounds (RISCs), suggesting that they may contribute to acid generation in the extreme mine environments. Unlike the closely related orders Methanomassiliicoccales and "Candidatus Gimiplasmatales," "Ca. Sysuiplasmatales" lacks the capacity to perform methanogenesis and carbon fixation. Ancestral state reconstruction indicated that "Ca. Sysuiplasmatales," the closely related orders Methanomassiliicoccales and "Ca. Gimiplasmatales," and the orders SG8-5 and RBG-16-68-12 originated from a facultatively anaerobic ancestor capable of carbon fixation via the bacterial-type H4F Wood-Ljungdahl pathway (WLP). Their metabolic divergence might be attributed to different evolutionary paths. IMPORTANCE A wide array of archaea populate Earth's extreme environments; therefore, they may play important roles in mediating biogeochemical processes such as iron and sulfur cycling. 
However, our knowledge of archaeal biology and evolution is still limited, since the majority of the archaeal diversity is uncultured. For instance, most order-level lineages except Thermoplasmatales, Aciduliprofundales, and Methanomassiliicoccales within Thermoplasmata do not have cultured representatives. Here, we report the discovery and genomic characterization of a novel order, "Ca. Sysuiplasmatales," within Thermoplasmata in extremely acidic mine environments. "Ca. Sysuiplasmatales" are inferred to be facultatively anaerobic heterotrophs and likely contribute to acid generation through the oxidation of RISCs. The physiological divergence between "Ca. Sysuiplasmatales" and closely related Thermoplasmata lineages may be attributed to different evolutionary paths. These results expand our knowledge of archaea in the extreme mine ecosystem.
|
7
|
Psomopoulos FE, van Helden J, Médigue C, Chasapi A, Ouzounis CA. Ancestral state reconstruction of metabolic pathways across pangenome ensembles. Microb Genom 2021; 6. [PMID: 32924924 PMCID: PMC7725326 DOI: 10.1099/mgen.0.000429] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022] Open
Abstract
As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes. Herein, we describe PathTrace, an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways, pivotal representations of key functional modules of cellular function. The algorithm is implemented as a five-step process through which pathways are represented as fuzzy vectors, where each enzyme is associated with a taxonomic conservation value derived from the phylogenetic profile of its protein sequence. The method is evaluated with a selected benchmark set of pathways against collections of genome sequences from key data resources. By deploying a pangenome-driven approach for pathway sets, we demonstrate that the inferred patterns are largely insensitive to noise, as opposed to gene-content reconstruction methods. In addition, the resulting reconstructions are closely correlated with the evolutionary distance of the taxa under study, suggesting that a diligent selection of target pangenomes is essential for maintaining cohesiveness of the method and consistency of the inference, serving as an internal control for an arbitrary selection of queries. 
The PathTrace method is a first step towards the large-scale analysis of metabolic pathway evolution and our deeper understanding of functional relationships reflected in emerging pangenome collections.
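The parsimony-based ancestral reconstruction mentioned in this abstract can be illustrated with a generic sketch. The code below is not PathTrace, which according to the abstract operates on fuzzy pathway vectors in a five-step process. It is only the classic Fitch bottom-up pass for a single binary presence/absence character, and the four-leaf tree and enzyme states are hypothetical values chosen for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair (left, right) of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_up(tree: Tree, leaf_states: Dict[str, int]) -> Tuple[Set[int], int]:
    """Bottom-up Fitch pass.

    Returns the set of equally parsimonious states at the root of `tree`
    and the minimum number of state changes needed below it.
    """
    if isinstance(tree, str):
        # Leaf: the observed state is the only candidate, no change needed.
        return {leaf_states[tree]}, 0

    left_set, left_cost = fitch_up(tree[0], leaf_states)
    right_set, right_cost = fitch_up(tree[1], leaf_states)

    common = left_set & right_set
    if common:
        # Children agree on at least one state: keep it, no extra change.
        return common, left_cost + right_cost
    # Children disagree: keep all candidates and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: presence (1) or absence (0) of one enzyme in four taxa.
toy_tree: Tree = ((("A", "B"), "C"), "D")
enzyme_presence = {"A": 1, "B": 1, "C": 0, "D": 1}

root_states, changes = fitch_up(toy_tree, enzyme_presence)
print(root_states, changes)
```

In this toy case the enzyme is inferred as present at the root with a single loss on the branch leading to C. A full reconstruction would add the top-down pass that assigns one state to every internal node.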
Affiliation(s)
- Fotis E Psomopoulos
  - Institute of Applied Biosciences (INAB), Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece
- Jacques van Helden
  - Lab. Technological Advances for Genomics & Clinics (TAGC), Université d'Aix-Marseille (AMU), INSERM Unit U1090, 163, Avenue de Luminy, 13288 Marseille cedex 09, France
- Claudine Médigue
  - UMR 8030, CNRS, Université Evry-Val-d'Essonne, CEA, Institut de Biologie François Jacob - Genoscope, Laboratoire d'Analyses Bioinformatiques pour la Génomique et le Métabolisme, Evry, France
- Anastasia Chasapi
  - Biological Computation & Process Laboratory (BCPL), Chemical Process & Energy Resources Institute (CPERI), Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece
- Christos A Ouzounis
  - Biological Computation & Process Laboratory (BCPL), Chemical Process & Energy Resources Institute (CPERI), Center for Research & Technology Hellas (CERTH), GR-57001 Thessalonica, Greece
|
8
|
Fukunaga T, Iwasaki W. Mirage: estimation of ancestral gene-copy numbers by considering different evolutionary patterns among gene families. Bioinformatics Advances 2021; 1:vbab014. [PMID: 36700099 PMCID: PMC9710636 DOI: 10.1093/bioadv/vbab014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2021] [Revised: 07/22/2021] [Accepted: 07/28/2021] [Indexed: 01/28/2023]
Abstract
Motivation: Reconstruction of gene copy number evolution is an essential approach for understanding how complex biological systems have been organized. Although various models have been proposed for gene copy number evolution, existing evolutionary models have not appropriately addressed the fact that different gene families can have very different gene gain/loss rates. Results: In this study, we developed Mirage (MIxtuRe model for Ancestral Genome Estimation), which allows different gene families to have flexible gene gain/loss rates. Mirage can use three models for formulating heterogeneous evolution among gene families: the discretized Γ model, probability distribution-free model and pattern mixture (PM) model. Simulation analysis showed that Mirage can accurately estimate heterogeneous gene gain/loss rates and reconstruct gene-content evolutionary history. Application to empirical datasets demonstrated that the PM model fits genome data from various taxonomic groups better than the other heterogeneous models. Using Mirage, we revealed that metabolic function-related gene families displayed frequent gene gains and losses in all taxa investigated. Availability and implementation: The source code of Mirage is freely available at https://github.com/fukunagatsu/Mirage. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Tsukasa Fukunaga
  - Waseda Institute for Advanced Study, Waseda University, Tokyo 1690051, Japan
  - Department of Computer Science, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 1130032, Japan
  - To whom correspondence should be addressed.
- Wataru Iwasaki
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 2770882, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 1130032, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 2770882, Japan
  - Atmosphere and Ocean Research Institute, The University of Tokyo, Chiba 2770882, Japan
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo 1130032, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo 1130032, Japan
  - To whom correspondence should be addressed.
|
9
|
Abstract
DPANN is known as a highly diverse, globally widespread, and mostly ectosymbiotic archaeal superphylum. However, this group of archaea was overlooked for a long time, and few in-depth studies have been reported. In this investigation, 41 metagenome-assembled genomes (MAGs) belonging to the DPANN superphylum were recovered (18 MAGs had average nucleotide identity [ANI] values of <95% and a percentage of conserved proteins [POCP] of >50%, while 14 MAGs showed a POCP of <50%), which were analyzed comparatively with 515 other published DPANN genomes. Mismatches to known 16S rRNA gene primers were identified among 16S rRNA genes of DPANN archaea. Numbers of gene families lost (mostly related to energy and amino acid metabolism) were over three times greater than those gained in the evolution of DPANN archaea. Lateral gene transfer (LGT; ∼45.5% was cross-domain) has facilitated niche adaptation of the DPANN archaea, ensuring a delicate equilibrium of streamlined genomes with efficient niche-adaptive strategies. For instance, LGT-derived cytochrome bd ubiquinol oxidase and arginine deiminase in the genomes of “Candidatus Micrarchaeota” could help them better adapt to aerobic acidic mine drainage habitats. In addition, most DPANN archaea acquired enzymes for biosynthesis of extracellular polymeric substances (EPS) and transketolase/transaldolase for the pentose phosphate pathway from Bacteria. IMPORTANCE The domain Archaea is a key research model for gaining insights into the origin and evolution of life, as well as the relevant biogeochemical processes. The discovery of nanosized DPANN archaea has overthrown many aspects of microbiology. However, the DPANN superphylum still contains vast genetic novelty and diversity that need to be explored. A comprehensive comparative genomic analysis of the DPANN superphylum was performed in this study, with an attempt to illuminate its metabolic potential, ecological distribution and evolutionary history. 
Many interphylum differences within the DPANN superphylum were found. For example, Altiarchaeota had the biggest genome among DPANN phyla, possessing many pathways missing in other phyla, such as formaldehyde assimilation and the Wood-Ljungdahl pathway. In addition, LGT acted as an important force to provide DPANN archaeal genetic flexibility that permitted the occupation of diverse niches. This study has advanced our understanding of the diversity and genome evolution of archaea.
|
10
|
Devos DP. Reconciling Asgardarchaeota Phylogenetic Proximity to Eukaryotes and Planctomycetes Cellular Features in the Evolution of Life. Mol Biol Evol 2021; 38:3531-3542. [PMID: 34229349 PMCID: PMC8382908 DOI: 10.1093/molbev/msab186] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/22/2022] Open
Abstract
The relationship between the three domains of life—Archaea, Bacteria, and Eukarya—is one of Biology’s greatest mysteries. Current favored models imply two ancestral domains, Bacteria and Archaea, with eukaryotes originating within Archaea. This type of models has been supported by the recent description of the Asgardarchaeota, the closest prokaryotic relatives of eukaryotes. However, there are many problems associated with any scenarios implying that eukaryotes originated from within the Archaea, including genome mosaicism, phylogenies, the cellular organization of the Archaea, and their ancestral character. By contrast, all models of eukaryogenesis fail to consider two relevant discoveries: the detection of membrane coat proteins, and of phagocytosis-related processes in Planctomycetes, which are among the bacteria with the most developed endomembrane system. Consideration of these often overlooked features and others found in Planctomycetes and related bacteria suggest an evolutionary model based on a single ancestral domain. In this model, the proximity of Asgard and eukaryotes is not rejected but instead, Asgard are considered as diverging away from a common ancestor instead of on the way toward the eukaryotic ancestor. This model based on a single ancestral domain solves most of the ambiguities associated with the ones based on two ancestral domains. The single-domain model is better suited to explain the origin and evolution of all three domains of life, blurring the distinctions between them. Support for this model as well as the opportunities that it presents not only for reinterpreting previous results, but also for planning future experiments, are explored.
Affiliation(s)
- Damien P Devos
  - Centro Andaluz de Biología del Desarrollo (CABD) - CSIC, Junta de Andalucía, Universidad Pablo de Olavide, Carretera de Utrera Km 1, Seville, 41013, Spain
|
11
|
Banerjee R, Chaudhari NM, Lahiri A, Gautam A, Bhowmik D, Dutta C, Chattopadhyay S, Huson DH, Paul S. Interplay of Various Evolutionary Modes in Genome Diversification and Adaptive Evolution of the Family Sulfolobaceae. Front Microbiol 2021; 12:639995. [PMID: 34248865 PMCID: PMC8267890 DOI: 10.3389/fmicb.2021.639995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Accepted: 05/06/2021] [Indexed: 11/21/2022] Open
Abstract
The Sulfolobaceae family, comprising diverse thermoacidophilic and aerobic sulfur-metabolizing Archaea from various geographical locations, offers an ideal opportunity to infer the evolutionary dynamics across its members. Comparative pan-genomics coupled with evolutionary analyses has revealed asymmetric genome evolution within the Sulfolobaceae family. The trend of genome streamlining followed by periods of differential gene gains resulted in an overall genome expansion in some species of this family, whereas there was reduction in others. Among the core genes, both Sulfolobus islandicus and Saccharolobus solfataricus showed a considerable fraction of positively selected genes and also higher frequencies of gene acquisition. In contrast, Sulfolobus acidocaldarius genomes experienced a substantial amount of gene loss and strong purifying selection, as manifested by relatively smaller genome size and higher genome conservation. Central carbohydrate metabolism and sulfur metabolism coevolved with the genome diversification pattern of this archaeal family. The autotrophic CO2 fixation with three significant positively selected enzymes from S. islandicus and S. solfataricus was found to be more imperative than heterotrophic CO2 fixation for Sulfolobaceae. Overall, our analysis provides an insight into the interplay of various genomic adaptation strategies including gene gain-loss, mutation, and selection influencing genome diversification of Sulfolobaceae at various taxonomic levels and geographical locations.
Affiliation(s)
- Rachana Banerjee
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
- Narendrakumar M. Chaudhari
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
- Abhishake Lahiri
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad- 201002, India
- Anupam Gautam
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research, Kolkata, India
- Debaleena Bhowmik
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad- 201002, India
- Chitra Dutta
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
- Sujay Chattopadhyay
  - JIS Institute of Advanced Studies and Research, JIS University, Kolkata, India
- Daniel H. Huson
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
  - Cluster of Excellence: Controlling Microbes to Fight Infection, Tübingen, Germany
- Sandip Paul
  - Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad- 201002, India
|
12
|
Abram K, Udaondo Z, Bleker C, Wanchai V, Wassenaar TM, Robeson MS, Ussery DW. Mash-based analyses of Escherichia coli genomes reveal 14 distinct phylogroups. Commun Biol 2021; 4:117. [PMID: 33500552 PMCID: PMC7838162 DOI: 10.1038/s42003-020-01626-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 02/12/2020] [Accepted: 12/21/2020] [Indexed: 01/30/2023] Open
Abstract
In this study, more than one hundred thousand Escherichia coli and Shigella genomes were examined and classified. This is, to our knowledge, the largest E. coli genome dataset analyzed to date. A Mash-based analysis of a cleaned set of 10,667 E. coli genomes from GenBank revealed 14 distinct phylogroups. A representative genome or medoid identified for each phylogroup was used as a proxy to classify 95,525 unassembled genomes from the Sequence Read Archive (SRA). We find that most of the sequenced E. coli genomes belong to four phylogroups (A, C, B1 and E2(O157)). Authenticity of the 14 phylogroups is supported by several different lines of evidence: phylogroup-specific core genes, a phylogenetic tree constructed with 2613 single copy core genes, and differences in the rates of gene gain/loss/duplication. The methodology used in this work is able to reproduce known phylogroups, as well as to identify previously uncharacterized phylogroups in E. coli species.
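For readers unfamiliar with Mash-style comparison, the sketch below shows the general idea: a MinHash bottom-s sketch of k-mers gives an estimated Jaccard index j, which is converted to a distance with the Mash formula D = -(1/k) ln(2j / (1 + j)). This is an assumption-laden illustration, not the pipeline used in the study. SHA-1 stands in for the hash function of the Mash tool, reverse-complement (canonical) k-mers are ignored, the clamp to [0, 1] is a choice made here, and the sequences and parameters are hypothetical.

```python
import hashlib
import math
from typing import List


def bottom_sketch(seq: str, k: int, size: int) -> List[int]:
    """Return the `size` smallest distinct k-mer hashes of `seq`."""
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.sha1(seq[i:i + k].encode("ascii")).digest()
        hashes.add(int.from_bytes(digest[:8], "big"))
    return sorted(hashes)[:size]


def jaccard_estimate(a: List[int], b: List[int], size: int) -> float:
    """Estimate the Jaccard index from two bottom sketches."""
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j: float, k: int) -> float:
    """Convert a Jaccard estimate to a Mash-style distance, clamped to [0, 1]."""
    if j <= 0.0:
        return 1.0
    d = -math.log(2.0 * j / (1.0 + j)) / k
    return min(1.0, max(0.0, d))


# Hypothetical toy sequences; real genomes use far larger k and sketch sizes.
K, SIZE = 4, 100
s1 = bottom_sketch("ACGTACGTTGCAAGCTTAGC", K, SIZE)
s2 = bottom_sketch("AAAAAAAAAA", K, SIZE)
s3 = bottom_sketch("CCCCCCCCCC", K, SIZE)

same = mash_distance(jaccard_estimate(s1, s1, SIZE), K)
disjoint = mash_distance(jaccard_estimate(s2, s3, SIZE), K)
print(same, disjoint)
```

Pairwise distances of this kind can then be clustered, and a medoid (the member with the smallest total distance to the others in its cluster) picked as the representative genome, which is the role the abstract assigns to medoids when classifying unassembled read sets.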
Affiliation(s)
- Kaleb Abram
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA
- Zulema Udaondo
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA
- Carissa Bleker
  - The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee, 37996, USA
  - Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, 37996, USA
- Visanu Wanchai
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA
- Trudy M Wassenaar
  - Molecular Microbiology and Genomics Consultants, 55576, Zotzenheim, Germany
- Michael S Robeson
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA
- David W Ussery
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA
|
13
|
Zwaenepoel A, Van de Peer Y. Model-Based Detection of Whole-Genome Duplications in a Phylogeny. Mol Biol Evol 2020; 37:2734-2746. [PMID: 32359154 DOI: 10.1093/molbev/msaa111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023] Open
Abstract
Ancient whole-genome duplications (WGDs) leave signatures in comparative genomic data sets that can be harnessed to detect these events of presumed evolutionary importance. Current statistical approaches for the detection of ancient WGDs in a phylogenetic context have two main drawbacks. The first is that unwarranted restrictive assumptions on the "background" gene duplication and loss rates make inferences unreliable in the face of model violations. The second is that most methods can only be used to examine a limited set of a priori selected WGD hypotheses and cannot be used to discover WGDs in a phylogeny. In this study, we develop an approach for WGD inference using gene count data that seeks to overcome both issues. We employ a phylogenetic birth-death model that includes WGD in a flexible hierarchical Bayesian approach and use reversible-jump Markov chain Monte Carlo to perform Bayesian inference of branch-specific duplication, loss, and WGD retention rates across the space of WGD configurations. We evaluate the proposed method using simulations, apply it to data sets from flowering plants, and discuss the statistical intricacies of model-based WGD inference.
Affiliation(s)
- Arthur Zwaenepoel
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent, Belgium
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent, Belgium
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
|
14
|
Hunnicutt KE, Tiley GP, Williams RC, Larsen PA, Blanco MB, Rasoloarison RM, Campbell CR, Zhu K, Weisrock DW, Matsunami H, Yoder AD. Comparative Genomic Analysis of the Pheromone Receptor Class 1 Family (V1R) Reveals Extreme Complexity in Mouse Lemurs (Genus, Microcebus) and a Chromosomal Hotspot across Mammals. Genome Biol Evol 2020; 12:3562-3579. [PMID: 31555816 PMCID: PMC6944220 DOI: 10.1093/gbe/evz200] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 09/08/2019] [Indexed: 12/14/2022] Open
Abstract
Sensory gene families are of special interest for both what they can tell us about molecular evolution and what they imply as mediators of social communication. The vomeronasal type-1 receptors (V1Rs) have often been hypothesized as playing a fundamental role in driving or maintaining species boundaries given their likely function as mediators of intraspecific mate choice, particularly in nocturnal mammals. Here, we employ a comparative genomic approach for revealing patterns of V1R evolution within primates, with a special focus on the small-bodied nocturnal mouse and dwarf lemurs of Madagascar (genera Microcebus and Cheirogaleus, respectively). By doubling the existing genomic resources for strepsirrhine primates (i.e. the lemurs and lorises), we find that the highly speciose and morphologically cryptic mouse lemurs have experienced an elaborate proliferation of V1Rs that we argue is functionally related to their capacity for rapid lineage diversification. Contrary to a previous study that found equivalent degrees of V1R diversity in diurnal and nocturnal lemurs, our study finds a strong correlation between nocturnality and V1R elaboration, with nocturnal lemurs showing elaborate V1R repertoires and diurnal lemurs showing less diverse repertoires. Recognized subfamilies among V1Rs show unique signatures of diversifying positive selection, as might be expected if they have each evolved to respond to specific stimuli. Furthermore, a detailed syntenic comparison of mouse lemurs with mouse (genus Mus) and other mammalian outgroups shows that orthologous mammalian subfamilies, predicted to be of ancient origin, tend to cluster in a densely populated region across syntenic chromosomes that we refer to as a V1R "hotspot."
Affiliation(s)
- Kelsie E Hunnicutt
- Department of Biology, Duke University, Durham, North Carolina
- Department of Biological Sciences, University of Denver, Denver, Colorado
- George P Tiley
- Department of Biology, Duke University, Durham, North Carolina
- Rachel C Williams
- Department of Biology, Duke University, Durham, North Carolina
- Duke Lemur Center, Duke University, Durham, North Carolina
- Peter A Larsen
- Department of Biology, Duke University, Durham, North Carolina
- Department of Veterinary and Biomedical Sciences, University of Minnesota, Saint Paul, Minnesota
- Rodin M Rasoloarison
- Behavioral Ecology and Sociobiology Unit, German Primate Centre, Göttingen, Germany
- Département de Biologie Animale, Université d’Antananarivo, Antananarivo, Madagascar
- C Ryan Campbell
- Department of Biology, Duke University, Durham, North Carolina
- Kevin Zhu
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina
- David W Weisrock
- Department of Biology, University of Kentucky, Lexington, Kentucky
- Hiroaki Matsunami
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina
- Department of Neurobiology, Duke Institute for Brain Sciences, Duke University Medical Center, Durham, North Carolina
- Anne D Yoder
- Department of Biology, Duke University, Durham, North Carolina
|
15
|
Armaleo D, Müller O, Lutzoni F, Andrésson ÓS, Blanc G, Bode HB, Collart FR, Dal Grande F, Dietrich F, Grigoriev IV, Joneson S, Kuo A, Larsen PE, Logsdon JM, Lopez D, Martin F, May SP, McDonald TR, Merchant SS, Miao V, Morin E, Oono R, Pellegrini M, Rubinstein N, Sanchez-Puerta MV, Savelkoul E, Schmitt I, Slot JC, Soanes D, Szövényi P, Talbot NJ, Veneault-Fourrey C, Xavier BB. The lichen symbiosis re-viewed through the genomes of Cladonia grayi and its algal partner Asterochloris glomerata. BMC Genomics 2019; 20:605. [PMID: 31337355 PMCID: PMC6652019 DOI: 10.1186/s12864-019-5629-x] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Received: 08/29/2018] [Accepted: 03/20/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Lichens, encompassing 20,000 known species, are symbioses between specialized fungi (mycobionts), mostly ascomycetes, and unicellular green algae or cyanobacteria (photobionts). Here we describe the first parallel genomic analysis of the mycobiont Cladonia grayi and of its green algal photobiont Asterochloris glomerata. We focus on genes/predicted proteins of potential symbiotic significance, sought by surveying proteins differentially activated during early stages of mycobiont and photobiont interaction in coculture, expanded or contracted protein families, and proteins with differential rates of evolution.
RESULTS: A) In coculture, the fungus upregulated small secreted proteins, membrane transport proteins, signal transduction components, extracellular hydrolases and, notably, a ribitol transporter and an ammonium transporter, and the alga activated DNA metabolism, signal transduction, and expression of flagellar components. B) Expanded fungal protein families include heterokaryon incompatibility proteins, polyketide synthases, and a unique set of G-protein α subunit paralogs. Expanded algal protein families include carbohydrate active enzymes and a specific subclass of cytoplasmic carbonic anhydrases. The alga also appears to have acquired by horizontal gene transfer from prokaryotes novel archaeal ATPases and Desiccation-Related Proteins. Expanded in both symbionts are signal transduction components, ankyrin domain proteins and transcription factors involved in chromatin remodeling and stress responses. The fungal transportome is contracted, as are algal nitrate assimilation genes. C) In the mycobiont, slow-evolving proteins were enriched for components involved in protein translation, translocation and sorting.
CONCLUSIONS: The surveyed genes affect stress resistance, signaling, genome reprogramming, nutritional and structural interactions. The alga carries many genes likely transferred horizontally through viruses, yet we found no evidence of inter-symbiont gene transfer. The presence in the photobiont of meiosis-specific genes supports the notion that sexual reproduction occurs in Asterochloris while they are free-living, a phenomenon with implications for the adaptability of lichens and the persistent autonomy of the symbionts. The diversity of the genes affecting the symbiosis suggests that lichens evolved by accretion of many scattered regulatory and structural changes rather than through introduction of a few key innovations. This predicts that paths to lichenization were variable in different phyla, which is consistent with the emerging consensus that ascolichens could have had a few independent origins.
Affiliation(s)
- Olaf Müller
- Department of Biology, Duke University, Durham, USA
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, USA
- Ólafur S. Andrésson
- Faculty of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Guillaume Blanc
- Aix Marseille University, Université de Toulon, CNRS, IRD, MIO UM 110, 13288 Marseille, France
- Helge B. Bode
- Molekulare Biotechnologie, Fachbereich Biowissenschaften & Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Frankfurt am Main, Germany
- Frank R. Collart
- Argonne National Laboratory, Biosciences Division, Argonne, & Department of Bioengineering, University of Illinois at Chicago, Chicago, USA
- Francesco Dal Grande
- Senckenberg Biodiversity and Climate Research Center (SBiK-F), Frankfurt am Main, Germany
- Fred Dietrich
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, USA
- Igor V. Grigoriev
- US Department of Energy Joint Genome Institute, Walnut Creek, USA
- Department of Plant and Microbial Biology, University of California – Berkeley, Berkeley, USA
- Suzanne Joneson
- Department of Biology, Duke University, Durham, USA
- College of General Studies, University of Wisconsin - Milwaukee at Waukesha, Waukesha, USA
- Alan Kuo
- US Department of Energy Joint Genome Institute, Walnut Creek, USA
- Peter E. Larsen
- Argonne National Laboratory, Biosciences Division, Argonne, & Department of Bioengineering, University of Illinois at Chicago, Chicago, USA
- Francis Martin
- INRA, Université de Lorraine, Interactions Arbres-Microorganismes, INRA-Nancy, Champenoux, France
- Susan P. May
- Department of Biology, Duke University, Durham, USA
- Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, USA
- Tami R. McDonald
- Department of Biology, Duke University, Durham, USA
- Department of Biology, St. Catherine University, St. Paul, USA
- Sabeeha S. Merchant
- Department of Plant and Microbial Biology, University of California – Berkeley, Berkeley, USA
- Department of Molecular and Cell Biology, University of California – Berkeley, Berkeley, USA
- Vivian Miao
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, Canada
- Emmanuelle Morin
- INRA, Université de Lorraine, Interactions Arbres-Microorganismes, INRA-Nancy, Champenoux, France
- Ryoko Oono
- Department of Ecology, Evolution, and Marine Biology, University of California - Santa Barbara, Santa Barbara, USA
- Matteo Pellegrini
- Department of Molecular, Cell, and Developmental Biology, and DOE Institute for Genomics and Proteomics, University of California, Los Angeles, USA
- Nimrod Rubinstein
- National Evolutionary Synthesis Center, Durham, USA
- Calico Life Sciences LLC, South San Francisco, USA
- Imke Schmitt
- Senckenberg Biodiversity and Climate Research Center (SBiK-F), Frankfurt am Main, Germany
- Institute of Ecology, Evolution and Diversity, Fachbereich Biowissenschaften, Goethe University Frankfurt, Frankfurt am Main, Germany
- Jason C. Slot
- College of Food, Agricultural, and Environmental Sciences, Department of Plant Pathology, The Ohio State University, Columbus, USA
- Darren Soanes
- College of Life & Environmental Sciences, University of Exeter, Exeter, UK
- Péter Szövényi
- Department of Systematic and Evolutionary Botany, University of Zurich, Zurich, Switzerland
- Claire Veneault-Fourrey
- INRA, Université de Lorraine, Interactions Arbres-Microorganismes, INRA-Nancy, Champenoux, France
- Université de Lorraine, INRA, Interactions Arbres-Microorganismes, Faculté des Sciences et Technologies, Vandoeuvre les Nancy Cedex, France
- Basil B. Xavier
- Faculty of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Laboratory of Medical Microbiology, Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
|
16
|
Zwaenepoel A, Van de Peer Y. Inference of Ancient Whole-Genome Duplications and the Evolution of Gene Duplication and Loss Rates. Mol Biol Evol 2019; 36:1384-1404. [PMID: 31004147 DOI: 10.1093/molbev/msz088] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/12/2022] Open
Abstract
Gene tree-species tree reconciliation methods have been employed for studying ancient whole-genome duplication (WGD) events across the eukaryotic tree of life. Most approaches have relied on using maximum likelihood trees and the maximum parsimony reconciliation thereof to count duplication events on specific branches of interest in a reference species tree. Such approaches do not account for uncertainty in the gene tree and reconciliation, or do so only heuristically. The effects of these simplifications on the inference of ancient WGDs are unclear. In particular, the effects of variation in gene duplication and loss rates across the species tree have not been considered. Here, we developed a full probabilistic approach for phylogenomic reconciliation-based WGD inference, accounting for both gene tree and reconciliation uncertainty using a method based on the principle of amalgamated likelihood estimation. The model and methods are implemented in a maximum likelihood and Bayesian setting and account for variation of duplication and loss rates across the species tree, using methods inspired by phylogenetic divergence time estimation. We applied our newly developed framework to ancient WGDs in land plants and investigated the effects of duplication and loss rate variation on reconciliation and gene count based assessment of these earlier proposed WGDs.
Affiliation(s)
- Arthur Zwaenepoel
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Center for Plant Systems Biology, VIB, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent, Belgium
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Center for Plant Systems Biology, VIB, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent, Belgium
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
|
17
|
Abstract
The nucleocytoplasmic large DNA viruses (NCLDVs) are a monophyletic group of diverse eukaryotic viruses that reproduce primarily in the cytoplasm of the infected cells and include the largest viruses currently known: the giant mimiviruses, pandoraviruses, and pithoviruses. With virions measuring up to 1.5 μm and genomes of up to 2.5 Mb, the giant viruses break the now-outdated definition of a virus and extend deep into the genome size range typical of bacteria and archaea. Additionally, giant viruses encode multiple proteins that are universal among cellular life forms, particularly components of the translation system, the signature cellular molecular machinery. These findings triggered hypotheses on the origin of giant viruses from cells, likely of an extinct fourth domain of cellular life, via reductive evolution. However, phylogenomic analyses reveal a different picture, namely multiple origins of giant viruses from smaller NCLDVs via acquisition of multiple genes from the eukaryotic hosts and bacteria, along with gene duplication. Thus, with regard to their origin, the giant viruses do not appear to qualitatively differ from the rest of the virosphere. However, the evolutionary forces that led to the emergence of virus gigantism remain enigmatic.
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Natalya Yutin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
18
|
Genome size evolution in the Archaea. Emerg Top Life Sci 2018; 2:595-605. [PMID: 33525826 PMCID: PMC7289037 DOI: 10.1042/etls20180021] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 07/05/2018] [Revised: 09/26/2018] [Accepted: 09/28/2018] [Indexed: 11/17/2022]
Abstract
What determines variation in genome size, gene content and genetic diversity at the broadest scales across the tree of life? Much of the existing work contrasts eukaryotes with prokaryotes, the latter represented mainly by Bacteria. But any general theory of genome evolution must also account for the Archaea, a diverse and ecologically important group of prokaryotes that represent one of the primary domains of cellular life. Here, we survey the extant diversity of Bacteria and Archaea, and ask whether the general principles of genome evolution deduced from the study of Bacteria and eukaryotes also apply to the archaeal domain. Although Bacteria and Archaea share a common prokaryotic genome architecture, the extant diversity of Bacteria appears to be much higher than that of Archaea. Compared with Archaea, Bacteria also show much greater genome-level specialisation to specific ecological niches, including parasitism and endosymbiosis. The reasons for these differences in long-term diversification rates are unclear, but might be related to fundamental differences in informational processing machineries and cell biological features that may favour archaeal diversification in harsher or more energy-limited environments. Finally, phylogenomic analyses suggest that the first Archaea were anaerobic autotrophs that evolved on the early Earth.
|
19
|
Davidson P, Eutsey R, Redler B, Hiller NL, Laub MT, Durand D. Flexibility and constraint: Evolutionary remodeling of the sporulation initiation pathway in Firmicutes. PLoS Genet 2018; 14:e1007470. [PMID: 30212463 PMCID: PMC6136694 DOI: 10.1371/journal.pgen.1007470] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/01/2017] [Accepted: 06/04/2018] [Indexed: 12/16/2022] Open
Abstract
The evolution of signal transduction pathways is constrained by the requirements of signal fidelity, yet flexibility is necessary to allow pathway remodeling in response to environmental challenges. A detailed understanding of how flexibility and constraint shape bacterial two component signaling systems is emerging, but how new signal transduction architectures arise remains unclear. Here, we investigate pathway remodeling using the Firmicute sporulation initiation (Spo0) pathway as a model. The present-day Spo0 pathways in Bacilli and Clostridia share common ancestry, but possess different architectures. In Clostridium acetobutylicum, sensor kinases directly phosphorylate Spo0A, the master regulator of sporulation. In Bacillus subtilis, Spo0A is activated via a four-protein phosphorelay. The current view favors an ancestral direct phosphorylation architecture, with the phosphorelay emerging in the Bacillar lineage. Our results reject this hypothesis. Our analysis of 84 broadly distributed Firmicute genomes predicts phosphorelays in numerous Clostridia, contrary to the expectation that the Spo0 phosphorelay is unique to Bacilli. Our experimental verification of a functional Spo0 phosphorelay encoded by Desulfotomaculum acetoxidans (Class Clostridia) further supports functional phosphorelays in Clostridia, which strongly suggests that the ancestral Spo0 pathway was a phosphorelay. Cross complementation assays between Bacillar and Clostridial phosphorelays demonstrate conservation of interaction specificity since their divergence over 2.7 BYA. Further, the distribution of direct phosphorylation Spo0 pathways is patchy, suggesting multiple, independent instances of remodeling from phosphorelay to direct phosphorylation. We provide evidence that these transitions are likely the result of changes in sporulation kinase specificity or acquisition of a sensor kinase with specificity for Spo0A, which is remarkably conserved in both architectures. We conclude that flexible encoding of interaction specificity, a phenotype that is only intermittently essential, and the recruitment of kinases to recognize novel environmental signals resulted in a consistent and repeated pattern of remodeling of the Spo0 pathway.
Author summary
Survival in a changing world requires signal transduction circuitry that can evolve to sense and respond to new environmental challenges. The Firmicute sporulation initiation (Spo0) pathway is a compelling example of a pathway with a circuit diagram that has changed over the course of evolution. In Clostridium acetobutylicum, a sensor kinase directly activates the master regulator of sporulation, Spo0A. In Bacillus subtilis, Spo0A is activated indirectly via a four-protein phosphorelay. These early observations suggested that the ancestral Spo0A was directly phosphorylated by a kinase in the earliest spore-former and that the Spo0 phosphorelay arose later in Bacilli via gain of additional proteins and interactions. Our analysis, based on a much larger set of genomes, surprisingly reveals phosphorelays, not only in Bacilli, but in many Clostridia. These findings support a model wherein sporulation was initiated by a Spo0 phosphorelay in the ancestral spore-former and the direct phosphorylation Spo0 pathways, which are observed in distinct sets of Clostridial taxa, are the result of convergent, reductive evolution. Further, our evidence suggests that these remodeling events were mediated by changes in kinase specificity, implicating flexible pathway remodeling, potentially combined with the recruitment of kinases, in Spo0 pathway evolution.
Affiliation(s)
- Philip Davidson
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Rory Eutsey
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Brendan Redler
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- N. Luisa Hiller
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Center of Excellence in Biofilm Research, Allegheny Health Network, Pittsburgh, Pennsylvania, United States of America
- Michael T. Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Dannie Durand
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
|
20
|
Updating the genomic taxonomy and epidemiology of Campylobacter hyointestinalis. Sci Rep 2018; 8:2393. [PMID: 29403020 PMCID: PMC5799301 DOI: 10.1038/s41598-018-20889-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/03/2017] [Accepted: 01/25/2018] [Indexed: 12/24/2022] Open
Abstract
Campylobacter hyointestinalis is a member of an emerging group of zoonotic Campylobacter spp. that are increasingly identified in both gastric and non-gastric disease in humans. Here, we discovered C. hyointestinalis in three separate classes of New Zealand ruminant livestock: cattle, sheep and deer. To investigate the relevance of these findings we performed a systematic literature review on global C. hyointestinalis epidemiology and used comparative genomics to better understand and classify members of the species. We found that C. hyointestinalis subspecies hyointestinalis has an open pangenome, with accessory gene contents involved in many essential processes such as metabolism, virulence and defence. We observed that horizontal gene transfer is likely to have played an overwhelming role in species diversification, favouring a public-goods-like mechanism of gene ‘acquisition and resampling’ over a tree-of-life-like vertical inheritance model of evolution. As a result, simplistic gene-based inferences of taxonomy by similarity are likely to be misleading. Such genomic plasticity will also mean that local evolutionary histories likely influence key species characteristics, such as host-association and virulence. This may help explain geographical differences in reported C. hyointestinalis epidemiology and limits what characteristics may be generalised, requiring further genomic studies of C. hyointestinalis in areas where it causes disease.
|
21
|
|
22
|
Brito PH, Chevreux B, Serra CR, Schyns G, Henriques AO, Pereira-Leal JB. Genetic Competence Drives Genome Diversity in Bacillus subtilis. Genome Biol Evol 2018; 10:108-124. [PMID: 29272410 PMCID: PMC5765554 DOI: 10.1093/gbe/evx270] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Accepted: 12/19/2017] [Indexed: 12/18/2022] Open
Abstract
Prokaryote genomes are the result of a dynamic flux of genes, with increases achieved via horizontal gene transfer and reductions occurring through gene loss. The ecological and selective forces that drive this genomic flexibility vary across species. Bacillus subtilis is a naturally competent bacterium that occupies various environments, including plant-associated, soil, and marine niches, and the gut of both invertebrates and vertebrates. Here, we quantify the genomic diversity of B. subtilis and infer the genome dynamics that explain the high genetic and phenotypic diversity observed. Phylogenomic and comparative genomic analyses of 42 B. subtilis genomes uncover a remarkable genome diversity that translates into a core genome of 1,659 genes and an asymptotic pangenome growth rate of 57 new genes per new genome added. This diversity is due to a large proportion of low-frequency genes that are acquired from closely related species. We find no gene-loss bias among wild isolates, which explains why the cloud genome, 43% of the species pangenome, represents only a small proportion of each genome. We show that B. subtilis can acquire xenologous copies of core genes that propagate laterally among strains within a niche. While not excluding the contributions of other mechanisms, our results strongly suggest a process of gene acquisition that is largely driven by competence, where the long-term maintenance of acquired genes depends on local and global fitness effects. This competence-driven genomic diversity provides B. subtilis with its generalist character, enabling it to occupy a wide range of ecological niches and cycle through them.
Affiliation(s)
- Patrícia H Brito
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Nova Medical School, Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Portugal
- Bastien Chevreux
- DSM Nutritional Products, Ltd., 60 Westview street, Lexington MA, USA
- Cláudia R Serra
- Instituto de Tecnologia Química e Biológica, Oeiras, Portugal
- Ghislain Schyns
- DSM Nutritional Products, Ltd., 60 Westview street, Lexington MA, USA
- José B Pereira-Leal
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Ophiomics—Precision Medicine, Lisbon, Portugal
|
23
|
Schulz F, Yutin N, Ivanova NN, Ortega DR, Lee TK, Vierheilig J, Daims H, Horn M, Wagner M, Jensen GJ, Kyrpides NC, Koonin EV, Woyke T. Giant viruses with an expanded complement of translation system components. Science 2017; 356:82-85. [PMID: 28386012 DOI: 10.1126/science.aal4657] [Citation(s) in RCA: 171] [Impact Index Per Article: 21.4] [Received: 11/23/2016] [Revised: 01/18/2017] [Accepted: 03/15/2017] [Indexed: 12/24/2022]
Abstract
The discovery of giant viruses blurred the sharp division between viruses and cellular life. Giant virus genomes encode proteins considered as signatures of cellular organisms, particularly translation system components, prompting hypotheses that these viruses derived from a fourth domain of cellular life. Here we report the discovery of a group of giant viruses (Klosneuviruses) in metagenomic data. Compared with other giant viruses, the Klosneuviruses encode an expanded translation machinery, including aminoacyl transfer RNA synthetases with specificities for all 20 amino acids. Notwithstanding the prevalence of translation system components, comprehensive phylogenomic analysis of these genes indicates that Klosneuviruses did not evolve from a cellular ancestor but rather are derived from a much smaller virus through extensive gain of host genes.
Affiliation(s)
- Frederik Schulz
- Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Natalya Yutin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Natalia N Ivanova
- Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Davi R Ortega
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125, USA
- Tae Kwon Lee
- Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, "Chemistry Meets Microbiology" Research Network, University of Vienna, 1090 Vienna, Austria
- Julia Vierheilig
- Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, "Chemistry Meets Microbiology" Research Network, University of Vienna, 1090 Vienna, Austria
- Holger Daims
- Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, "Chemistry Meets Microbiology" Research Network, University of Vienna, 1090 Vienna, Austria
- Matthias Horn
- Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, "Chemistry Meets Microbiology" Research Network, University of Vienna, 1090 Vienna, Austria
- Michael Wagner
- Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, "Chemistry Meets Microbiology" Research Network, University of Vienna, 1090 Vienna, Austria
- Grant J Jensen
- Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA 91125, USA
- Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
- Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Tanja Woyke
- Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
|
24
|
Williams TA, Szöllősi GJ, Spang A, Foster PG, Heaps SE, Boussau B, Ettema TJG, Embley TM. Integrative modeling of gene and genome evolution roots the archaeal tree of life. Proc Natl Acad Sci U S A 2017; 114:E4602-E4611. [PMID: 28533395 PMCID: PMC5468678 DOI: 10.1073/pnas.1618463114] [Citation(s) in RCA: 152] [Impact Index Per Article: 19.0] [Indexed: 11/18/2022] Open
Abstract
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood-Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
Affiliation(s)
- Tom A Williams
- School of Earth Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom;
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Gergely J Szöllősi
- MTA-ELTE Lendület Evolutionary Genomics Research Group, 1117 Budapest, Hungary
| | - Anja Spang
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123 Uppsala, Sweden
| | - Peter G Foster
- Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom
| | - Sarah E Heaps
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
| | - Bastien Boussau
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR5558, F-69622 Villeurbanne, France
| | - Thijs J G Ettema
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123 Uppsala, Sweden
| | - T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
|
25
|
Nagy LG, Szöllősi G. Fungal Phylogeny in the Age of Genomics: Insights Into Phylogenetic Inference From Genome-Scale Datasets. ADVANCES IN GENETICS 2017; 100:49-72. [DOI: 10.1016/bs.adgen.2017.09.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 01/26/2023]
|
26
|
Arguments Reinforcing the Three-Domain View of Diversified Cellular Life. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2016; 2016:1851865. [PMID: 28050162 PMCID: PMC5165138 DOI: 10.1155/2016/1851865] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 08/16/2016] [Revised: 10/18/2016] [Accepted: 11/03/2016] [Indexed: 11/18/2022]
Abstract
The archaeal ancestor scenario (AAS) for the origin of eukaryotes implies the emergence of a new kind of organism from the fusion of ancestral archaeal and bacterial cells. Equipped with this “chimeric” molecular arsenal, the resulting cell would gradually accumulate unique genes and develop the complex molecular machineries and cellular compartments that are hallmarks of modern eukaryotes. In this regard, proteins related to phagocytosis and cell movement should be present in the archaeal ancestor, thus identifying the recently described candidate archaeal phylum “Lokiarchaeota” as resembling a possible candidate ancestor of eukaryotes. Despite its appeal, AAS seems incompatible with the genomic, molecular, and biochemical differences that exist between Archaea and Eukarya. In particular, the distribution of conserved protein domain structures in the proteomes of cellular organisms and viruses appears hard to reconcile with the AAS. In addition, concerns related to taxon and character sampling, presupposing bacterial outgroups in phylogenies, and nonuniform effects of protein domain structure rearrangement and gain/loss in concatenated alignments of protein sequences cast further doubt on AAS-supporting phylogenies. Here, we evaluate AAS against the traditional “three-domain” world of cellular organisms and propose that the discovery of Lokiarchaeota could be better reconciled under the latter view, especially in light of several additional biological and technical considerations.
|
27
|
Koonin EV. Horizontal gene transfer: essentiality and evolvability in prokaryotes, and roles in evolutionary transitions. F1000Res 2016; 5. [PMID: 27508073 PMCID: PMC4962295 DOI: 10.12688/f1000research.8737.1] [Citation(s) in RCA: 96] [Impact Index Per Article: 10.7] [Accepted: 07/18/2016] [Indexed: 01/01/2023] Open
Abstract
The wide spread of gene exchange and loss in the prokaryotic world has prompted the concept of ‘lateral genomics’ to the point of an outright denial of the relevance of phylogenetic trees for evolution. However, the pronounced congruence of the topologies of numerous gene trees, particularly those for (nearly) universal genes, translates into the notion of a statistical tree of life (STOL), which reflects a central trend of vertical evolution. The STOL can be employed as a framework for reconstruction of the evolutionary processes in the prokaryotic world. Quantitatively, however, horizontal gene transfer (HGT) dominates microbial evolution, with the rate of gene gain and loss being comparable to the rate of point mutations and much greater than the duplication rate. Theoretical models of evolution suggest that HGT is essential for the survival of microbial populations that otherwise deteriorate due to the Muller’s ratchet effect. Apparently, at least some bacteria and archaea evolved dedicated vehicles for gene transfer that evolved from selfish elements such as plasmids and viruses. Recent phylogenomic analyses suggest that episodes of massive HGT were pivotal for the emergence of major groups of organisms such as multiple archaeal phyla as well as eukaryotes. Similar analyses appear to indicate that, in addition to donating hundreds of genes to the emerging eukaryotic lineage, mitochondrial endosymbiosis severely curtailed HGT. These results shed new light on the routes of evolutionary transitions, but caution is due given the inherent uncertainty of deep phylogenies.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
28
|
Abstract
A headline on the front page of the New York Times for November 3, 1977, read "Scientists Discover a Way of Life That Predates Higher Organisms". The accompanying article described a spectacular claim by Carl Woese and George Fox to have discovered a third form of life, a new 'domain' that we now call Archaea. It's not that these microbes were unknown before, nor was it the case that their peculiarities had gone completely unnoticed. Indeed, Ralph Wolfe, in the same department at the University of Illinois as Woese, had already discovered how it was that methanogens (uniquely on the planet) make methane, and the bizarre adaptations that allow extremely halophilic archaea (then called halobacteria) and thermoacidophiles to live in the extreme environments where they do were already under investigation in many labs. But what Woese and Fox had found was that these organisms were related to each other not just in their 'extremophily' but also phylogenetically. And, most surprisingly, they were only remotely related to the rest of the prokaryotes, which we now call the domain Bacteria (Figure 1).
Affiliation(s)
- Laura Eme
- Department of Biochemistry and Molecular Biology, Dalhousie University, P.O. Box 15000, Halifax, Nova Scotia B3H 4R2, Canada
| | - W Ford Doolittle
- Department of Biochemistry and Molecular Biology, Dalhousie University, P.O. Box 15000, Halifax, Nova Scotia B3H 4R2, Canada.
|
29
|
Koonin EV. Origin of eukaryotes from within archaea, archaeal eukaryome and bursts of gene gain: eukaryogenesis just made easier? Philos Trans R Soc Lond B Biol Sci 2016; 370:20140333. [PMID: 26323764 PMCID: PMC4571572 DOI: 10.1098/rstb.2014.0333] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Indexed: 12/11/2022] Open
Abstract
The origin of eukaryotes is a fundamental, forbidding evolutionary puzzle. Comparative genomic analysis clearly shows that the last eukaryotic common ancestor (LECA) possessed most of the signature complex features of modern eukaryotic cells, in particular the mitochondria, the endomembrane system including the nucleus, an advanced cytoskeleton and the ubiquitin network. Numerous duplications of ancestral genes, e.g. DNA polymerases, RNA polymerases and proteasome subunits, also can be traced back to the LECA. Thus, the LECA was not a primitive organism and its emergence must have resulted from extensive evolution towards cellular complexity. However, the scenario of eukaryogenesis, and in particular the relationship between endosymbiosis and the origin of eukaryotes, is far from being clear. Four recent developments provide new clues to the likely routes of eukaryogenesis. First, evolutionary reconstructions suggest complex ancestors for most of the major groups of archaea, with the subsequent evolution dominated by gene loss. Second, homologues of signature eukaryotic proteins, such as actin and tubulin that form the core of the cytoskeleton or the ubiquitin system, have been detected in diverse archaea. The discovery of this ‘dispersed eukaryome’ implies that the archaeal ancestor of eukaryotes was a complex cell that might have been capable of a primitive form of phagocytosis and thus conducive to endosymbiont capture. Third, phylogenomic analyses converge on the origin of most eukaryotic genes of archaeal descent from within the archaeal evolutionary tree, specifically, the TACK superphylum. Fourth, evidence has been presented that the origin of the major archaeal phyla involved massive acquisition of bacterial genes. Taken together, these findings make the symbiogenetic scenario for the origin of eukaryotes considerably more plausible and the origin of the organizational complexity of eukaryotic cells more readily explainable than they appeared until recently.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
30
|
Mariscal C, Doolittle WF. Eukaryotes first: how could that be? Philos Trans R Soc Lond B Biol Sci 2016; 370:20140322. [PMID: 26323754 DOI: 10.1098/rstb.2014.0322] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022] Open
Abstract
In the half century since the formulation of the prokaryote : eukaryote dichotomy, many authors have proposed that the former evolved from something resembling the latter, in defiance of common (and possibly common sense) views. In such 'eukaryotes first' (EF) scenarios, the last universal common ancestor is imagined to have possessed significantly many of the complex characteristics of contemporary eukaryotes, as relics of an earlier 'progenotic' period or RNA world. Bacteria and Archaea thus must have lost these complex features secondarily, through 'streamlining'. If the canonical three-domain tree in which Archaea and Eukarya are sisters is accepted, EF entails that Bacteria and Archaea are convergently prokaryotic. We ask what this means and how it might be tested.
Affiliation(s)
- Carlos Mariscal
- Departments of Philosophy, Dalhousie University, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2; Biochemistry and Molecular Biology, Dalhousie University, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2
| | - W Ford Doolittle
- Biochemistry and Molecular Biology, Dalhousie University, PO Box 15000, Halifax, Nova Scotia, Canada B3H 4R2
|
31
|
Vakirlis N, Sarilar V, Drillon G, Fleiss A, Agier N, Meyniel JP, Blanpain L, Carbone A, Devillers H, Dubois K, Gillet-Markowska A, Graziani S, Huu-Vang N, Poirel M, Reisser C, Schott J, Schacherer J, Lafontaine I, Llorente B, Neuvéglise C, Fischer G. Reconstruction of ancestral chromosome architecture and gene repertoire reveals principles of genome evolution in a model yeast genus. Genome Res 2016; 26:918-32. [PMID: 27247244 PMCID: PMC4937564 DOI: 10.1101/gr.204420.116] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 01/14/2016] [Accepted: 04/28/2016] [Indexed: 12/22/2022]
Abstract
Reconstructing genome history is complex but necessary to reveal quantitative principles governing genome evolution. Such reconstruction requires recapitulating into a single evolutionary framework the evolution of genome architecture and gene repertoire. Here, we reconstructed the genome history of the genus Lachancea that appeared to cover a continuous evolutionary range from closely related to more diverged yeast species. Our approach integrated the generation of a high-quality genome data set; the development of AnChro, a new algorithm for reconstructing ancestral genome architecture; and a comprehensive analysis of gene repertoire evolution. We found that the ancestral genome of the genus Lachancea contained eight chromosomes and about 5173 protein-coding genes. Moreover, we characterized 24 horizontal gene transfers and 159 putative gene creation events that punctuated species diversification. We retraced all chromosomal rearrangements, including gene losses, gene duplications, chromosomal inversions and translocations at single gene resolution. Gene duplications outnumbered losses and balanced rearrangements with 1503, 929, and 423 events, respectively. Gene content variations between extant species are mainly driven by differential gene losses, while gene duplications remained globally constant in all lineages. Remarkably, we discovered that balanced chromosomal rearrangements could be responsible for up to 14% of all gene losses by disrupting genes at their breakpoints. Finally, we found that nonsynonymous substitutions reached fixation at a coordinated pace with chromosomal inversions, translocations, and duplications, but not deletions. Overall, we provide a granular view of genome evolution within an entire eukaryotic genus, linking gene content, chromosome rearrangements, and protein divergence into a single evolutionary framework.
Affiliation(s)
- Nikolaos Vakirlis
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Véronique Sarilar
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Guénola Drillon
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Aubin Fleiss
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Nicolas Agier
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Jean-Philippe Meyniel
- ISoft, Route de l'Orme, Parc "Les Algorithmes" Bâtiment Euclide, 91190 Saint-Aubin, France
| | - Lou Blanpain
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Hugo Devillers
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Kenny Dubois
- CRCM, CNRS, UMR7258, Inserm, U1068; Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, F-13009, Marseille, France
| | - Alexandre Gillet-Markowska
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Stéphane Graziani
- ISoft, Route de l'Orme, Parc "Les Algorithmes" Bâtiment Euclide, 91190 Saint-Aubin, France
| | - Nguyen Huu-Vang
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Marion Poirel
- ISoft, Route de l'Orme, Parc "Les Algorithmes" Bâtiment Euclide, 91190 Saint-Aubin, France
| | - Cyrielle Reisser
- Department of Genetics, Genomics and Microbiology, University of Strasbourg/CNRS, UMR 7156, 67083 Strasbourg, France
| | - Jonathan Schott
- CRCM, CNRS, UMR7258, Inserm, U1068; Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, F-13009, Marseille, France
| | - Joseph Schacherer
- Department of Genetics, Genomics and Microbiology, University of Strasbourg/CNRS, UMR 7156, 67083 Strasbourg, France
| | - Ingrid Lafontaine
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
| | - Bertrand Llorente
- CRCM, CNRS, UMR7258, Inserm, U1068; Institut Paoli-Calmettes, Aix-Marseille Université, UM 105, F-13009, Marseille, France
| | - Cécile Neuvéglise
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Gilles Fischer
- Sorbonne Universités, UPMC Univ. Paris 06, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
|
32
|
O'Malley MA, Wideman JG, Ruiz-Trillo I. Losing Complexity: The Role of Simplification in Macroevolution. Trends Ecol Evol 2016; 31:608-621. [PMID: 27212432 DOI: 10.1016/j.tree.2016.04.004] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 02/02/2016] [Revised: 04/18/2016] [Accepted: 04/19/2016] [Indexed: 10/21/2022]
Abstract
Macroevolutionary patterns can be produced by combinations of diverse and even oppositional dynamics. A growing body of data indicates that secondary simplifications of molecular and cellular structures are common. Some major diversifications in eukaryotes have occurred because of loss and minimalisation; numerous episodes in prokaryote evolution have likewise been driven by the reduction of structure. After examining a range of examples of secondary simplification and its consequences across the tree of life, we address how macroevolutionary explanations might incorporate simplification as well as complexification, and adaptive as well as nonadaptive dynamics.
Affiliation(s)
- Maureen A O'Malley
- UMR5164, University of Bordeaux, 146 Rue Léo Saignat, Bordeaux 33076, France.
| | | | - Iñaki Ruiz-Trillo
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain; Departament de Genètica, Universitat de Barcelona, 08028 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats, Pg Lluis Companys 23, 08010 Barcelona, Spain
|
33
|
Tiley GP, Ané C, Burleigh JG. Evaluating and Characterizing Ancient Whole-Genome Duplications in Plants with Gene Count Data. Genome Biol Evol 2016; 8:1023-37. [PMID: 26988251 PMCID: PMC4860690 DOI: 10.1093/gbe/evw058] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 12/16/2022] Open
Abstract
Whole-genome duplications (WGDs) have helped shape the genomes of land plants, and recent evidence suggests that the genomes of all angiosperms have experienced at least two ancient WGDs. In plants, WGDs often are followed by rapid fractionation, in which many homeologous gene copies are lost. Thus, it can be extremely difficult to identify, let alone characterize, ancient WGDs. In this study, we use a new maximum likelihood estimator to test for evidence of ancient WGDs in land plants and estimate the fraction of new gene copies that are retained following a WGD using gene count data, that is, the number of gene copies in gene families. We identified evidence of many putative ancient WGDs in land plants and found that the genome fractionation rates vary tremendously among ancient WGDs. Analyses of WGDs within Brassicales also indicate that background gene duplication and loss rates vary across land plants, and different gene families have different probabilities of being retained following a WGD. Although our analyses are largely robust to errors in duplication and loss rates and the choice of priors, simulations indicate that this method can have trouble detecting multiple WGDs that occur on the same branch, especially when the gene retention rates for ancient WGDs are very low. They also suggest that we should carefully evaluate evidence for some ancient plant WGD hypotheses.
Affiliation(s)
| | - Cécile Ané
- Department of Statistics, University of Wisconsin-Madison; Department of Botany, University of Wisconsin-Madison
|
34
|
Zamani-Dahaj SA, Okasha M, Kosakowski J, Higgs PG. Estimating the Frequency of Horizontal Gene Transfer Using Phylogenetic Models of Gene Gain and Loss. Mol Biol Evol 2016; 33:1843-57. [DOI: 10.1093/molbev/msw062] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022] Open
|
35
|
Tekaia F. Inferring Orthologs: Open Questions and Perspectives. GENOMICS INSIGHTS 2016; 9:17-28. [PMID: 26966373 PMCID: PMC4778853 DOI: 10.4137/gei.s37925] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 11/18/2015] [Revised: 12/30/2015] [Accepted: 01/02/2016] [Indexed: 01/25/2023]
Abstract
With the increasing number of sequenced genomes and their comparisons, the detection of orthologs is crucial for reliable functional annotation and evolutionary analyses of genes and species. Yet, the dynamic remodeling of genome content through gain, loss, transfer of genes, and segmental and whole-genome duplication hinders reliable orthology detection. Moreover, the lack of direct functional evidence and the questionable quality of some available genome sequences and annotations present additional difficulties for assessing orthology. This article reviews the existing computational methods and their potential accuracy in the high-throughput era of genome sequencing and anticipates open questions in terms of methodology, reliability, and computation. Appropriate taxon sampling, together with a combination of methods based on similarity, phylogeny, synteny, and evolutionary knowledge that may help detect speciation events, appears to be the most accurate strategy. This review also raises perspectives on the potential determination of orthology throughout the whole species phylogeny.
Affiliation(s)
- Fredj Tekaia
- Institut Pasteur, Unit of Structural Microbiology, CNRS URA 3528 and University Paris Diderot, Sorbonne Paris Cité, Paris, France
|
36
|
Zhao J, Teufel AI, Liberles DA, Liu L. A generalized birth and death process for modeling the fates of gene duplication. BMC Evol Biol 2015; 15:275. [PMID: 26643106 PMCID: PMC4672517 DOI: 10.1186/s12862-015-0539-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/04/2015] [Accepted: 11/10/2015] [Indexed: 01/15/2023] Open
Abstract
Background: Accurately estimating the timing and mode of gene duplications along the evolutionary history of species can provide invaluable information about the underlying mechanisms by which the genomes of organisms evolved and genes with novel functions arose. Mechanistic models have previously been introduced that allow for probabilistic inference of the evolutionary mechanism for duplicate gene retention based upon the average rate of loss over time of the duplicate. However, there is currently no probabilistic model embedded in a birth-death modeling framework that can take into account the effects of different evolutionary mechanisms of gene retention when analyzing gene family data. Results: In this study, we describe a generalized birth-death process for modeling the fates of gene duplication. Use of mechanistic models in a phylogenetic framework requires an age-dependent birth-death process. Starting with a single population corresponding to the lineage of a phylogenetic tree and with an assumption of a clock that starts ticking for each duplicate at its birth, an age-dependent birth-death process is developed by extending the results from the time-dependent birth-death process. The implementation of such models in a full phylogenetic framework is expected to enable large-scale probabilistic analysis of duplicates in comparative genomic studies. Conclusions: We develop an age-dependent birth-death model for understanding the mechanisms of gene retention, which allows a gene loss rate dependent on each duplication event. Simulation results indicate that different mechanisms of gene retention produce distinct likelihood functions, which can be used with genomic data to quantitatively distinguish those mechanisms.
Affiliation(s)
- Jing Zhao
- Department of Statistics, University of Georgia, 101 Cedar Street, Athens, GA, 30602, USA.
| | - Ashley I Teufel
- Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA; Center for Computational Genetics and Genomics and Department of Biology, Temple University, Philadelphia, PA, 19122, USA.
| | - David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA; Center for Computational Genetics and Genomics and Department of Biology, Temple University, Philadelphia, PA, 19122, USA.
| | - Liang Liu
- Department of Statistics, University of Georgia, 101 Cedar Street, Athens, GA, 30602, USA; Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA.
|
37
|
Jeong H, Sung S, Kwon T, Seo M, Caetano-Anollés K, Choi SH, Cho S, Nasir A, Kim H. HGTree: database of horizontally transferred genes determined by tree reconciliation. Nucleic Acids Res 2015; 44:D610-9. [PMID: 26578597 PMCID: PMC4702880 DOI: 10.1093/nar/gkv1245] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 08/11/2015] [Accepted: 11/01/2015] [Indexed: 01/13/2023] Open
Abstract
The HGTree database provides putative genome-wide horizontal gene transfer (HGT) information for 2472 completely sequenced prokaryotic genomes. This task is accomplished by reconstructing approximate maximum likelihood phylogenetic trees for each orthologous gene and corresponding 16S rRNA reference species sets and then reconciling the two trees under a parsimony framework. The tree reconciliation method is generally considered to be a reliable way to detect HGT events, but its practical use has remained limited because the method is computationally intensive and conceptually challenging. In this regard, HGTree (http://hgtree.snu.ac.kr) represents a useful addition to the biological community and enables quick and easy retrieval of information for HGT-acquired genes to better understand microbial taxonomy and evolution. The database is freely available and can be easily scaled and updated to keep pace with the rapid rise in genomic information.
Affiliation(s)
- Hyeonsoo Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Kwan-ak Gu, Seoul, 151-741, Republic of Korea; Department of Animal Sciences, University of Illinois, Urbana, IL 61801, USA
| | - Samsun Sung
- C&K genomics, Main Bldg. #514, SNU Research Park, Seoul 151-919, Republic of Korea
| | - Taehyung Kwon
- Department of Agricultural Biotechnology, Seoul National University, Seoul 151-742, Republic of Korea
| | - Minseok Seo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Kwan-ak Gu, Seoul, 151-741, Republic of Korea
| | | | - Sang Ho Choi
- National Research Laboratory of Molecular Microbiology and Toxicology, Department of Agricultural Biotechnology, Center for Food Safety and Toxicology, Seoul National University, Seoul 151-921, Republic of Korea
| | - Seoae Cho
- C&K genomics, Main Bldg. #514, SNU Research Park, Seoul 151-919, Republic of Korea
| | - Arshan Nasir
- Department of Biosciences, COMSATS Institute of Information Technology, Park Road, Chak Shahzad, Islamabad 45550, Pakistan
| | - Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Kwan-ak Gu, Seoul, 151-741, Republic of Korea; Department of Agricultural Biotechnology, Seoul National University, Seoul 151-742, Republic of Korea
|
38
|
Abstract
The origin of eukaryotes is one of the hardest problems in evolutionary biology and sometimes raises the ominous specter of irreducible complexity. Reconstruction of the gene repertoire of the last eukaryotic common ancestor (LECA) has revealed a highly complex organism with a variety of advanced features but no detectable evolutionary intermediates to explain their origin. Recently, however, genome analysis of diverse archaea led to the discovery of apparent ancestral versions of several signature eukaryotic systems, such as the actin cytoskeleton and the ubiquitin network, that are scattered among archaea. These findings inspired the hypothesis that the archaeal ancestor of eukaryotes was an unusually complex form with an elaborate intracellular organization. The latest striking discovery made by deep metagenomic sequencing vindicates this hypothesis by showing that in phylogenetic trees eukaryotes fall within a newly identified archaeal group, the Lokiarchaeota, which combine several eukaryotic signatures previously identified in different archaea. The discovery of complex archaea that are the closest living relatives of eukaryotes is most compatible with the symbiogenetic scenario for eukaryogenesis.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
|
39
|
Fullmer MS, Soucy SM, Gogarten JP. The pan-genome as a shared genomic resource: mutual cheating, cooperation and the black queen hypothesis. Front Microbiol 2015; 6:728. [PMID: 26284032 PMCID: PMC4523029 DOI: 10.3389/fmicb.2015.00728] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 04/27/2015] [Accepted: 07/03/2015] [Indexed: 11/13/2022] Open
Affiliation(s)
- Matthew S Fullmer
- Department of Molecular and Cell Biology, University of Connecticut Storrs, CT, USA
| | - Shannon M Soucy
- Department of Molecular and Cell Biology, University of Connecticut Storrs, CT, USA
| | - Johann Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut Storrs, CT, USA; Institute for Systems Genomics, University of Connecticut Storrs, CT, USA
|
40
|
Abstract
Biologists used to draw schematic “universal” trees of life as metaphors illustrating the history of life. It is indeed a priori possible to construct an organismal tree connecting the three major domains of ribosome-encoding organisms: Archaea, Bacteria and Eukarya, since they originated by cell division from LUCA. Several universal trees based on ribosomal RNA sequence comparisons proposed at the end of the last century are still widely used, although some of their main features have been challenged by subsequent analyses. Several authors have proposed to replace the traditional universal tree with a ring of life, whereas others have proposed more recently to include viruses as new domains. These proposals are misleading, suggesting that endosymbiosis can modify the shape of a tree or that viruses originated from the last universal common ancestor (LUCA). I propose here an updated version of Woese’s universal tree that includes several rootings for each domain and internal branching within domains that are supported by recent phylogenomic analyses of domain-specific proteins. The tree is rooted between Bacteria and Arkarya, a new name proposed for the clade grouping Archaea and Eukarya. A consensus version, in which each of the three domains is unrooted, and a version in which eukaryotes emerged within archaea are also presented. This last scenario assumes the transformation of a modern domain into another, a controversial evolutionary pathway. Viruses are not indicated in these trees but are intrinsically present because they infect the tree from its roots to its leaves. Finally, I present a detailed tree of the domain Archaea, proposing the sub-phylum neo-Euryarchaeota for the monophyletic group of euryarchaeota containing DNA gyrase. These trees, which will be easily updated as new data become available, could be useful to discuss controversial scenarios regarding early life evolution.
Affiliation(s)
- Patrick Forterre
- Unité de Biologie Moléculaire du Gène chez les Extrêmophiles, Département de Microbiologie, Institut Pasteur , Paris, France ; Institut de Biologie Intégrative de la cellule, Université Paris-Saclay , Paris, France
41
Lei W, Fang W, Lin Q, Zhou X, Chen X. Characterization of a non-classical MHC class II gene in the vulnerable Chinese egret (Egretta eulophotes). Immunogenetics 2015; 67:463-72. [PMID: 26033691 DOI: 10.1007/s00251-015-0846-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/16/2015] [Accepted: 05/16/2015] [Indexed: 11/28/2022]
Abstract
Genes of the major histocompatibility complex (MHC) are valuable markers of adaptive genetic variation in evolutionary ecology research, yet the non-classical MHC genes remain largely unstudied in wild vertebrates. In this study, we have characterized the non-classical MHC class II gene, Egeu-DAB4, in the vulnerable Chinese egret (Ciconiiformes, Ardeidae, Egretta eulophotes). Gene expression analyses showed that the Egeu-DAB4 gene had a restricted tissue expression pattern, being expressed in seven examined tissues including the liver, heart, kidney, esophagus, stomach, gallbladder, and intestine, but not in muscle. With respect to polymorphism, only one allele of exon 2 was obtained from Egeu-DAB4 using asymmetric PCR, indicating that Egeu-DAB4 is genetically monomorphic in exon 2. Comparative analyses showed that Egeu-DAB4 had an unusual sequence, with amino acid differences suggesting that its function may differ from those of classical MHC genes. The Egeu-DAB4 gene was found in only 30.56-36.56 % of examined Chinese egret individuals. Phylogenetic analysis showed a closer relationship between Egeu-DAB4 and the DAB2 genes in nine other ardeid species. These new findings provide a foundation for further studies to clarify the immunogenetics of non-classical MHC class II genes in the vulnerable Chinese egret and other ciconiiform birds.
Affiliation(s)
- Wei Lei
- Key Laboratory of Ministry of Education for Coast and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361102, People's Republic of China,
42
Abstract
One of the most fundamental questions in evolutionary biology is the origin of the lineage leading to eukaryotes. Recent phylogenomic analyses have indicated an emergence of eukaryotes from within the radiation of modern Archaea and specifically from a group comprising Thaumarchaeota/"Aigarchaeota" (candidate phylum)/Crenarchaeota/Korarchaeota (TACK). Despite their major implications, these studies were all based on the reconstruction of universal trees and left the exact placement of eukaryotes with respect to the TACK lineage unclear. Here we have applied an original two-step approach that involves the separate analysis of markers shared between Archaea and eukaryotes and between Archaea and Bacteria. This strategy allowed us to use a larger number of markers and greater taxonomic coverage, obtain high-quality alignments, and alleviate tree reconstruction artifacts potentially introduced when analyzing the three domains simultaneously. Our results robustly indicate a sister relationship of eukaryotes with the TACK superphylum that is strongly associated with a distinct root of the Archaea that lies within the Euryarchaeota, challenging the traditional topology of the archaeal tree. Therefore, if we are to embrace an archaeal origin for eukaryotes, our view of the evolution of the third domain of life will have to be profoundly reconsidered, as will many areas of investigation aimed at inferring ancestral characteristics of early life and Earth.
43
Abstract
Horizontal or Lateral Gene Transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages. Computational identification of HGT events relies upon the investigation of sequence composition or evolutionary history of genes. Sequence composition-based ("parametric") methods search for deviations from the genomic average, whereas evolutionary history-based ("phylogenetic") approaches identify genes whose evolutionary history significantly differs from that of the host species. The evaluation and benchmarking of HGT inference methods typically rely upon simulated genomes, for which the true history is known. On real data, different methods tend to infer different HGT events, and as a result it can be difficult to ascertain all but simple and clear-cut HGT events.
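The composition-based ("parametric") idea described in this abstract can be illustrated with a minimal sketch. It is not a method from the cited work: the function names, the use of G+C content as the only compositional feature, the unweighted per-gene mean as a stand-in for the genomic average, and the z-score cutoff of 2.0 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes: dict[str, str], z_cutoff: float = 2.0) -> list[str]:
    """Return names of genes whose GC content lies more than z_cutoff
    population standard deviations from the mean over all genes.

    Assumption: the unweighted mean of per-gene GC values stands in for
    the genomic average; real tools use richer signatures and models.
    """
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        return []  # every gene has the same composition, nothing stands out
    return sorted(
        name for name, value in gc.items() if abs(value - mu) / sigma > z_cutoff
    )


# Hypothetical toy genome: nine genes at 50% GC and one at 100% GC.
toy_genes = {f"gene{i}": "ATGC" * 5 for i in range(9)}
toy_genes["foreign"] = "GGCC" * 5
candidates = flag_atypical_genes(toy_genes)
```

On the toy input only the GC-rich gene is flagged as a candidate. A flag of this kind marks a gene as compositionally atypical and is no proof of transfer, which matches the abstract's remark that different methods tend to infer different events on real data.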
Affiliation(s)
- Nives Škunca
- ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Zurich, Switzerland
- Christophe Dessimoz
- University College London, London, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
44
Koonin EV. The Turbulent Network Dynamics of Microbial Evolution and the Statistical Tree of Life. J Mol Evol 2015; 80:244-50. [PMID: 25894542 PMCID: PMC4472940 DOI: 10.1007/s00239-015-9679-7] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 03/18/2015] [Accepted: 04/08/2015] [Indexed: 11/05/2024]
Abstract
The wide spread and high rate of gene exchange and loss in the prokaryotic world translate into “network genomics”. The rates of gene gain and loss are comparable with the rate of point mutations but are substantially greater than the duplication rate. Thus, evolution of prokaryotes is primarily shaped by gene gain and loss. These processes are essential to prevent mutational meltdown of microbial populations by stopping Muller’s ratchet and appear to trigger emergence of major novel clades by opening up new ecological niches. At least some bacteria and archaea seem to have evolved dedicated devices for gene transfer. Despite the dominance of gene gain and loss, evolution of genes is intrinsically tree-like. The significant coherence between the topologies of numerous gene trees, particularly those for (nearly) universal genes, is compatible with the concept of a statistical tree of life, which forms the framework for reconstruction of the evolutionary processes in the prokaryotic world.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA,
45
Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales. Life (Basel) 2015; 5:818-40. [PMID: 25764277 PMCID: PMC4390880 DOI: 10.3390/life5010818] [Citation(s) in RCA: 161] [Impact Index Per Article: 16.1] [Received: 01/12/2015] [Revised: 02/25/2015] [Accepted: 02/28/2015] [Indexed: 11/18/2022]
Abstract
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced, superclusters that unite two or more arCOGs and more completely reflect gene family evolution than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs also are a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci that are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea.
46
Atkinson GC. The evolutionary and functional diversity of classical and lesser-known cytoplasmic and organellar translational GTPases across the tree of life. BMC Genomics 2015; 16:78. [PMID: 25756599 PMCID: PMC4342817 DOI: 10.1186/s12864-015-1289-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 10/28/2014] [Accepted: 01/27/2015] [Indexed: 12/24/2022]
Abstract
BACKGROUND The ribosome translates mRNA to protein with the aid of a number of accessory protein factors. Translational GTPases (trGTPases) are an integral part of the 'core set' of essential translational factors, and are some of the most conserved proteins across life. This study takes advantage of the wealth of available genomic data, along with novel functional information that has come to light for a number of trGTPases to address the full evolutionary and functional diversity of this superfamily across all domains of life. RESULTS Through sensitive sequence searching combined with phylogenetic analysis, 57 distinct subfamilies of trGTPases are identified: 14 bacterial, 7 archaeal and 35 eukaryotic (of which 21 are known or predicted to be organellar). The results uncover the functional evolution of trGTPases from before the last common ancestor of life on earth to the current day. CONCLUSIONS While some trGTPases are universal, others are limited to certain taxa, suggesting lineage-specific translational control mechanisms that exist on a base of core factors. These lineage-specific features may give organisms the ability to tune their translation machinery to respond to their environment. Only a fraction of the diversity of the trGTPase superfamily has been subjected to experimental analyses; this comprehensive classification brings to light novel and overlooked translation factors that are worthy of further investigation.
47
Richards VP, Palmer SR, Pavinski Bitar PD, Qin X, Weinstock GM, Highlander SK, Town CD, Burne RA, Stanhope MJ. Phylogenomics and the dynamic genome evolution of the genus Streptococcus. Genome Biol Evol 2015; 6:741-53. [PMID: 24625962 PMCID: PMC4007547 DOI: 10.1093/gbe/evu048] [Citation(s) in RCA: 103] [Impact Index Per Article: 10.3] [Indexed: 01/01/2023]
Abstract
The genus Streptococcus comprises important pathogens that have a severe impact on human health and are responsible for substantial economic losses to agriculture. Here, we utilize 46 Streptococcus genome sequences (44 species), including eight species sequenced here, to provide the first genomic level insight into the evolutionary history and genetic basis underlying the functional diversity of all major groups of this genus. Gene gain/loss analysis revealed a dynamic pattern of genome evolution characterized by an initial period of gene gain followed by a period of loss, as the major groups within the genus diversified. This was followed by a period of genome expansion associated with the origins of the present extant species. The pattern is concordant with an emerging view that genomes evolve through a dynamic process of expansion and streamlining. A large proportion of the pan-genome has experienced lateral gene transfer (LGT) with causative factors, such as relatedness and shared environment, operating over different evolutionary scales. Multiple gene ontology terms were significantly enriched for each group, and mapping terms onto the phylogeny showed that those corresponding to genes born on branches leading to the major groups represented approximately one-fifth of those enriched. Furthermore, despite the extensive LGT, several biochemical characteristics have been retained since group formation, suggesting genomic cohesiveness through time, and that these characteristics may be fundamental to each group. For example, proteolysis: mitis group; urea metabolism: salivarius group; carbohydrate metabolism: pyogenic group; and transcription regulation: bovis group.
Affiliation(s)
- Vincent P Richards
- Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University
48
Petitjean C, Deschamps P, López-García P, Moreira D, Brochier-Armanet C. Extending the conserved phylogenetic core of archaea disentangles the evolution of the third domain of life. Mol Biol Evol 2015; 32:1242-54. [PMID: 25660375 DOI: 10.1093/molbev/msv015] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022]
Abstract
Initial studies of the archaeal phylogeny relied mainly on the analysis of the RNA component of the small subunit of the ribosome (SSU rRNA). The resulting phylogenies have provided interesting but partial information on the evolutionary history of the third domain of life because SSU rRNA sequences do not contain enough phylogenetic signal to resolve all nodes of the archaeal tree. Thus, many relationships, and especially the most ancient ones, remained elusive. Moreover, SSU rRNA phylogenies can be heavily biased by tree reconstruction artifacts. The sequencing of complete genomes allows using a variety of protein markers as an alternative to SSU rRNA. Taking advantage of the recent burst of archaeal complete genome sequences, we have carried out an in-depth phylogenomic analysis of this domain. We have identified 200 new protein families that, in addition to the ribosomal proteins and the subunits of the RNA polymerase, form a conserved phylogenetic core of archaeal genes. The accurate analysis of these markers combined with desaturation approaches sheds new light on the evolutionary history of Archaea and reveals that several relationships recovered in recent analyses are likely the consequence of tree reconstruction artifacts. Among others, we resolve a number of important relationships, such as those among methanogens Class I, and we propose the definition of two new superclasses within the Euryarchaeota: Methanomada and Diaforarchaea.
Affiliation(s)
- Céline Petitjean
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- Philippe Deschamps
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- David Moreira
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- Céline Brochier-Armanet
- Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
49
Luo H. Evolutionary origin of a streamlined marine bacterioplankton lineage. ISME JOURNAL 2014; 9:1423-33. [PMID: 25431989 DOI: 10.1038/ismej.2014.227] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 08/11/2014] [Revised: 10/23/2014] [Accepted: 10/30/2014] [Indexed: 12/31/2022]
Abstract
Planktonic bacterial lineages with streamlined genomes are prevalent in the ocean. The base composition of their DNA is often highly biased towards low G+C content, a possible source of systematic error in phylogenetic reconstruction. A total of 228 orthologous protein families were sampled that are shared among major lineages of Alphaproteobacteria, including the marine free-living SAR11 clade and the obligate endosymbiotic Rickettsiales. These two ecologically distinct lineages share genome sizes of <1.5 Mbp and genomic G+C content of <30%. Statistical analyses showed that only 28 protein families are composition-homogeneous, whereas the other 200 families significantly violate the composition-homogeneous assumption included in most phylogenetic methods. RAxML analysis based on the concatenation of 24 ribosomal proteins that fall into the heterogeneous protein category clustered the SAR11 and Rickettsiales lineages at the base of the Alphaproteobacteria tree, whereas that based on the concatenation of 28 homogeneous proteins (including 19 ribosomal proteins) disassociated the lineages and placed SAR11 at the base of the non-endosymbiotic lineages. When the two data sets were concatenated, only a model that accounted for compositional bias yielded a tree identical to the tree built with composition-homogeneous proteins. Ancestral genome analysis suggests that the first evolved SAR11 cell had a small genome streamlined from its ancestor by a factor of two and coinciding with an ecological transition, followed by further gradual streamlining towards the extant SAR11 populations.
Affiliation(s)
- Haiwei Luo
- Simon F. S. Li Marine Science Laboratory, School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
50
Stolzer M, Wasserman L, Durand D. Robustness of birth-death and gain models for inferring evolutionary events. BMC Genomics 2014; 15 Suppl 6:S9. [PMID: 25572914 PMCID: PMC4239551 DOI: 10.1186/1471-2164-15-s6-s9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
Background Phylogenetic birth-death models are opening a new window on the processes of genome evolution in studies of the evolution of gene and protein families, protein-protein interaction networks, microRNAs, and copy number variation. Given a species tree and a set of genomic characters in present-day species, the birth-death approach estimates the most likely rates required to explain the observed data and returns the expected ancestral character states and the history of character state changes. Achieving a balance between model complexity and generalizability is a fundamental challenge in the application of birth-death models. While more parameters promise greater accuracy and more biologically realistic models, increasing model complexity can lead to overfitting and a heavy computational cost. Results Here we present a systematic, empirical investigation of these tradeoffs, using protein domain families in six metazoan genomes as a case study. We compared models of increasing complexity, implemented in the Count program, with respect to model fit, robustness, and stability. In addition, we used a bootstrapping procedure to assess estimator variability. The results show that the most complex model, which allows for both branch-specific and family-specific rate variation, achieves the best fit, without overfitting. Variance remains low with increasing complexity, except for family-specific loss rates. This variance is reduced when the number of discrete rate categories is increased. Model choice is of greatest concern when different models lead to fundamentally different outcomes. To investigate the extent to which model choice influences biological interpretation, ancestral states and expected events were inferred under each model. Disturbingly, the different models not only resulted in quantitatively different histories, but predicted qualitatively different patterns of domain family turnover and genome expansion and reduction. 
Conclusions The work presented here evaluates model choice for genomic birth-death models in a systematic way and presents the first use of bootstrapping to assess estimator variance in birth-death models. We find that a model incorporating both lineage and family rate variation yields more accurate estimators without sacrificing generality. Our results indicate that model choice can lead to fundamentally different evolutionary conclusions, emphasizing the importance of more biologically realistic and complex models.