1
Harish A. Protein structures unravel the signatures and patterns of deep time evolution. QRB DISCOVERY 2024; 5:e3. [PMID: 38616890 PMCID: PMC11016368 DOI: 10.1017/qrd.2024.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 11/13/2023] [Accepted: 12/12/2023] [Indexed: 04/16/2024] Open
Abstract
The formulation and testing of hypotheses using 'big biology data' often lie at the interface of computational biology and structural biology. The Protein Data Bank (PDB), which was established about 50 years ago, catalogs three-dimensional (3D) shapes of organic macromolecules and showcases a structural view of biology. The comparative analysis of the structures of homologs, particularly of proteins, from different species has significantly improved the in-depth analyses of molecular and cell biological questions. In addition, computational tools that were developed to analyze the 'protein universe' are providing the means for efficient resolution of longstanding debates in cell and molecular evolution. In celebrating the golden jubilee of the PDB, much has been written about the transformative impact of PDB on a broad range of fields of scientific inquiry and how structural biology transformed the study of the fundamental processes of life. Yet, the transforming influence of PDB on one field of inquiry of fundamental interest, the reconstruction of the distant biological past, has gone almost unnoticed. Here, I discuss the recent advances to highlight how insights and tools of structural biology are bearing on the data required for the empirical resolution of vigorously debated and apparently contradicting hypotheses in evolutionary biology. Specifically, I show that evolutionary characters defined by protein structure are superior compared to conventional sequence characters for reliable, data-driven resolution of competing hypotheses about the origins of the major clades of life and evolutionary relationship among those clades. Since the better quality data unequivocally support two primary domains of life, it is imperative that the primary classification of life be revised accordingly.
2
Moody ERR, Mahendrarajah TA, Dombrowski N, Clark JW, Petitjean C, Offre P, Szöllősi GJ, Spang A, Williams TA. An estimate of the deepest branches of the tree of life from ancient vertically-evolving genes. eLife 2022; 11:66695. [PMID: 35190025 PMCID: PMC8890751 DOI: 10.7554/elife.66695] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 01/19/2021] [Accepted: 02/07/2022] [Indexed: 11/30/2022] Open
Abstract
Core gene phylogenies provide a window into early evolution, but different gene sets and analytical methods have yielded substantially different views of the tree of life. Trees inferred from a small set of universal core genes have typically supported a long branch separating the archaeal and bacterial domains. By contrast, recent analyses of a broader set of non-ribosomal genes have suggested that Archaea may be less divergent from Bacteria, and that estimates of inter-domain distance are inflated due to accelerated evolution of ribosomal proteins along the inter-domain branch. Resolving this debate is key to determining the diversity of the archaeal and bacterial domains, the shape of the tree of life, and our understanding of the early course of cellular evolution. Here, we investigate the evolutionary history of the marker genes key to the debate. We show that estimates of a reduced Archaea-Bacteria (AB) branch length result from inter-domain gene transfers and hidden paralogy in the expanded marker gene set. By contrast, analysis of a broad range of manually curated marker gene datasets from an evenly sampled set of 700 Archaea and Bacteria reveals that current methods likely underestimate the AB branch length due to substitutional saturation and poor model fit; that the best-performing phylogenetic markers tend to support longer inter-domain branch lengths; and that the AB branch lengths of ribosomal and non-ribosomal marker genes are statistically indistinguishable. Furthermore, our phylogeny inferred from the 27 highest-ranked marker genes recovers a clade of DPANN at the base of the Archaea and places the bacterial Candidate Phyla Radiation (CPR) within Bacteria as the sister group to the Chloroflexota.
Affiliation(s)
- Edmund R R Moody
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Tara A Mahendrarajah
  - Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, Den Burg, Netherlands
- Nina Dombrowski
  - Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, Den Burg, Netherlands
- James W Clark
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Celine Petitjean
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Pierre Offre
  - Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, Den Burg, Netherlands
- Gergely J Szöllősi
  - Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
- Anja Spang
  - Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, Den Burg, Netherlands
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
3
Williams TA, Schrempf D, Szöllősi GJ, Cox CJ, Foster PG, Embley TM. Inferring the deep past from molecular data. Genome Biol Evol 2021; 13:6192802. [PMID: 33772552 PMCID: PMC8175050 DOI: 10.1093/gbe/evab067] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Accepted: 03/22/2021] [Indexed: 12/17/2022] Open
Abstract
There is an expectation that analyses of molecular sequences might be able to distinguish between alternative hypotheses for ancient relationships, but the phylogenetic methods used and types of data analyzed are of critical importance in any attempt to recover historical signal. Here, we discuss some common issues that can influence the topology of trees obtained when using overly simple models to analyze molecular data that often display complicated patterns of sequence heterogeneity. To illustrate our discussion, we have used three examples of inferred relationships which have changed radically as models and methods of analysis have improved. In two of these examples, the sister-group relationship between thermophilic Thermus and mesophilic Deinococcus, and the position of long-branch Microsporidia among eukaryotes, we show that recovering what is now generally considered to be the correct tree is critically dependent on the fit between model and data. In the third example, the position of eukaryotes in the tree of life, the hypothesis that is currently supported by the best available methods is fundamentally different from the classical view of relationships between major cellular domains. Since heterogeneity appears to be pervasive and varied among all molecular sequence data, and even the best available models can still struggle to deal with some problems, the issues we discuss are generally relevant to phylogenetic analyses. It remains essential to maintain a critical attitude to all trees as hypotheses of relationship that may change with more data and better methods.
Affiliation(s)
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
- Dominik Schrempf
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Gergely J Szöllősi
  - Dept. of Biological Physics, Eötvös Loránd University, 1117 Budapest, Hungary
  - MTA-ELTE "Lendület" Evolutionary Genomics Research Group, 1117 Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, 1121 Budapest, Hungary
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Gambelas, 8005-319 Faro, Portugal
- Peter G Foster
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom
- T Martin Embley
  - Biosciences Institute, Centre for Bacterial Cell Biology, Newcastle University, Newcastle upon Tyne NE2 4AX, United Kingdom
4
Gouy M, Tannier E, Comte N, Parsons DP. Seaview Version 5: A Multiplatform Software for Multiple Sequence Alignment, Molecular Phylogenetic Analyses, and Tree Reconciliation. Methods Mol Biol 2021; 2231:241-260. [PMID: 33289897 DOI: 10.1007/978-1-0716-1036-7_15] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Indexed: 05/13/2023]
Abstract
We present Seaview version 5, a multiplatform program to perform multiple alignment and phylogenetic tree building from molecular sequence data. Seaview provides network access to sequence databases, alignment with arbitrary algorithm, parsimony, distance and maximum likelihood tree building with PhyML, and display, printing, and copy-to-clipboard or to SVG files of rooted or unrooted, binary or multifurcating phylogenetic trees. While Seaview is primarily a program providing a graphical user interface to guide the user into performing desired analyses, Seaview possesses also a command-line mode adequate for user-provided scripts. Seaview version 5 introduces the ability to reconcile a gene tree with a reference species tree and use this reconciliation to root and rearrange the gene tree. Seaview is freely available at http://doua.prabi.fr/software/seaview.
Affiliation(s)
- Manolo Gouy
  - Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, Villeurbanne, France
- Eric Tannier
  - Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, Villeurbanne, France
  - INRIA Grenoble-Rhône-Alpes, Montbonnot, France
5
Williams TA, Cox CJ, Foster PG, Szöllősi GJ, Embley TM. Phylogenomics provides robust support for a two-domains tree of life. Nat Ecol Evol 2020; 4:138-147. [PMID: 31819234 PMCID: PMC6942926 DOI: 10.1038/s41559-019-1040-x] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Received: 07/23/2019] [Accepted: 10/15/2019] [Indexed: 11/09/2022]
Abstract
Hypotheses about the origin of eukaryotic cells are classically framed within the context of a universal 'tree of life' based on conserved core genes. Vigorous ongoing debate about eukaryote origins is based on assertions that the topology of the tree of life depends on the taxa included and the choice and quality of genomic data analysed. Here we have reanalysed the evidence underpinning those claims and apply more data to the question by using supertree and coalescent methods to interrogate >3,000 gene families in archaea and eukaryotes. We find that eukaryotes consistently originate from within the archaea in a two-domains tree when due consideration is given to the fit between model and data. Our analyses support a close relationship between eukaryotes and Asgard archaea and identify the Heimdallarchaeota as the current best candidate for the closest archaeal relatives of the eukaryotic nuclear lineage.
Affiliation(s)
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Cymon J Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
- Peter G Foster
  - Department of Life Sciences, Natural History Museum, London, UK
- Gergely J Szöllősi
  - MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary
  - Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
  - Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany, Hungary
- T Martin Embley
  - Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, UK
6
Start Codon Recognition in Eukaryotic and Archaeal Translation Initiation: A Common Structural Core. Int J Mol Sci 2019; 20:ijms20040939. [PMID: 30795538 PMCID: PMC6412873 DOI: 10.3390/ijms20040939] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/05/2018] [Revised: 02/11/2019] [Accepted: 02/13/2019] [Indexed: 01/12/2023] Open
Abstract
Understanding molecular mechanisms of ribosomal translation sheds light on the emergence and evolution of protein synthesis in the three domains of life. Universally, ribosomal translation is described in three steps: initiation, elongation and termination. During initiation, a macromolecular complex assembled around the small ribosomal subunit selects the start codon on the mRNA and defines the open reading frame. In this review, we focus on the comparison of start codon selection mechanisms in eukaryotes and archaea. Eukaryotic translation initiation is a very complicated process, involving many initiation factors. The most widespread mechanism for the discovery of the start codon is the scanning of the mRNA by a pre-initiation complex until the first AUG codon in a correct context is found. In archaea, long-range scanning does not occur because of the presence of Shine-Dalgarno (SD) sequences or of short 5′ untranslated regions. However, archaeal and eukaryotic translation initiations have three initiation factors in common: e/aIF1, e/aIF1A and e/aIF2 are directly involved in the selection of the start codon. Therefore, the idea that these archaeal and eukaryotic factors fulfill similar functions within a common structural ribosomal core complex has emerged. A divergence between eukaryotic and archaeal factors allowed for the adaptation to the long-range scanning process versus the SD mediated prepositioning of the ribosome.
7
Harish A. What is an archaeon and are the Archaea really unique? PeerJ 2018; 6:e5770. [PMID: 30357005 PMCID: PMC6196074 DOI: 10.7717/peerj.5770] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/15/2018] [Accepted: 09/05/2018] [Indexed: 12/05/2022] Open
Abstract
The recognition of the group Archaea as a major branch of the tree of life (ToL) prompted a new view of the evolution of biodiversity. The genomic representation of archaeal biodiversity has since significantly increased. In addition, advances in phylogenetic modeling of multi-locus datasets have resolved many recalcitrant branches of the ToL. Despite the technical advances and an expanded taxonomic representation, two important aspects of the origins and evolution of the Archaea remain controversial, even as we celebrate the 40th anniversary of the monumental discovery. These issues concern (i) the uniqueness (monophyly) of the Archaea, and (ii) the evolutionary relationships of the Archaea to the Bacteria and the Eukarya; both of these are relevant to the deep structure of the ToL. To explore the causes for this persistent ambiguity, I examine multiple datasets and different phylogenetic approaches that support contradicting conclusions. I find that the uncertainty is primarily due to a scarcity of information in standard datasets (universal core-genes datasets) to reliably resolve the conflicts. These conflicts can be resolved efficiently by comparing patterns of variation in the distribution of functional genomic signatures, which are less diffused unlike patterns of primary sequence variation. Relatively lower heterogeneity in distribution patterns minimizes uncertainties and supports statistically robust phylogenetic inferences, especially of the earliest divergences of life. This case study further highlights the limitations of primary sequence data in resolving difficult phylogenetic problems, and raises questions about evolutionary inferences drawn from the analyses of sequence alignments of a small set of core genes. 
In particular, the findings of this study corroborate the growing consensus that reversible substitution mutations may not be optimal phylogenetic markers for resolving early divergences in the ToL, nor for determining the polarity of evolutionary transitions across the ToL.
Affiliation(s)
- Ajith Harish
  - Department of Cell and Molecular Biology, Program in Molecular Biology, Uppsala University, Uppsala, Sweden
8
Cherlin S, Heaps SE, Nye TMW, Boys RJ, Williams TA, Embley TM. The Effect of Nonreversibility on Inferring Rooted Phylogenies. Mol Biol Evol 2018; 35:984-1002. [PMID: 29149300 PMCID: PMC5889004 DOI: 10.1093/molbev/msx294] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/15/2022] Open
Abstract
Most phylogenetic models assume that the evolutionary process is stationary and reversible. In addition to being biologically improbable, these assumptions also impair inference by generating models under which the likelihood does not depend on the position of the root. Consequently, the root of the tree cannot be inferred as part of the analysis. Yet identifying the root position is a key component of phylogenetic inference because it provides a point of reference for polarizing ancestor-descendant relationships and therefore interpreting the tree. In this paper, we investigate the effect of relaxing the unrealistic reversibility assumption and allowing the position of the root to be another unknown. We propose two hierarchical models that are centered on a reversible model but perturbed to allow nonreversibility. The models differ in the degree of structure imposed on the perturbations. The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods for which software is provided. We illustrate the performance of the two nonreversible models in analyses of simulated data using two types of topological priors. We then apply the models to a real biological data set, the radiation of polyploid yeasts, for which there is robust biological opinion about the root position. Finally, we apply the models to a second biological alignment for which the rooted tree is controversial: the ribosomal tree of life. We compare the two nonreversible models and conclude that both are useful in inferring the position of the root from real biological data.
Affiliation(s)
- Svetlana Cherlin
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sarah E Heaps
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom M W Nye
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Richard J Boys
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- T Martin Embley
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
9
Zhou Z, Liu Y, Li M, Gu JD. Two or three domains: a new view of tree of life in the genomics era. Appl Microbiol Biotechnol 2018; 102:3049-3058. [PMID: 29484479 DOI: 10.1007/s00253-018-8831-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/29/2017] [Revised: 01/30/2018] [Accepted: 02/01/2018] [Indexed: 12/26/2022]
Abstract
The deep phylogenetic topology of tree of life is in the center of a long-time dispute. The Woeseian three-domain tree theory, with the Eukarya evolving as a sister clade to Archaea, competes with the two-domain tree theory (the eocyte tree), with the Eukarya branched within Archaea. Revealed by the ongoing debate over the last three decades, sophisticated and proper phylogenetic methods should necessarily be paid with more emphasis, especially these are focusing on the compositional heterogeneity of sites and lineages, and the heterotachy issue. The newly emerging archaeal lineages with numerous eukaryotic-like features, such as membrane trafficking and cellular compartmentalization, are phylogenetically the closest to eukaryotes currently. These findings highlight the evolutionary history from an ancient archaeon to a more complex archaeon with protoeukaryotic-like features and complex cellular structures, thus providing clues to understand eukaryogenesis process. The increasing repertoire of precise genomic contents provides great advantages on understanding the deep phylogeny of tree of life and ancient evolutionary events on Eukarya branching process.
Affiliation(s)
- Zhichao Zhou
  - Institute for Advanced Study, Shenzhen University, Shenzhen, 518060, People's Republic of China
  - Laboratory of Environmental Microbiology and Toxicology, School of Biological Sciences, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, Hong Kong, People's Republic of China
- Yang Liu
  - Institute for Advanced Study, Shenzhen University, Shenzhen, 518060, People's Republic of China
  - Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province, College of Optoelectronic Engineering, Shenzhen University, Shenzhen, 518060, People's Republic of China
- Meng Li
  - Institute for Advanced Study, Shenzhen University, Shenzhen, 518060, People's Republic of China
- Ji-Dong Gu
  - Laboratory of Environmental Microbiology and Toxicology, School of Biological Sciences, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, Hong Kong, People's Republic of China
10
Eme L, Spang A, Lombard J, Stairs CW, Ettema TJG. Archaea and the origin of eukaryotes. Nat Rev Microbiol 2017; 15:711-723. [DOI: 10.1038/nrmicro.2017.133] [Citation(s) in RCA: 284] [Impact Index Per Article: 35.5] [Indexed: 12/21/2022]
11
Akanni WA, Siu-Ting K, Creevey CJ, McInerney JO, Wilkinson M, Foster PG, Pisani D. Horizontal gene flow from Eubacteria to Archaebacteria and what it means for our understanding of eukaryogenesis. Philos Trans R Soc Lond B Biol Sci 2016; 370:20140337. [PMID: 26323767 DOI: 10.1098/rstb.2014.0337] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 01/29/2023] Open
Abstract
The origin of the eukaryotic cell is considered one of the major evolutionary transitions in the history of life. Current evidence strongly supports a scenario of eukaryotic origin in which two prokaryotes, an archaebacterial host and an α-proteobacterium (the free-living ancestor of the mitochondrion), entered a stable symbiotic relationship. The establishment of this relationship was associated with a process of chimerization, whereby a large number of genes from the α-proteobacterial symbiont were transferred to the host nucleus. A general framework allowing the conceptualization of eukaryogenesis from a genomic perspective has long been lacking. Recent studies suggest that the origins of several archaebacterial phyla were coincident with massive imports of eubacterial genes. Although this does not indicate that these phyla originated through the same process that led to the origin of Eukaryota, it suggests that Archaebacteria might have had a general propensity to integrate into their genomes large amounts of eubacterial DNA. We suggest that this propensity provides a framework in which eukaryogenesis can be understood and studied in the light of archaebacterial ecology. We applied a recently developed supertree method to a genomic dataset composed of 392 eubacterial and 51 archaebacterial genera to test whether large numbers of genes flowing from Eubacteria are indeed coincident with the origin of major archaebacterial clades. In addition, we identified two potential large-scale transfers of uncertain directionality at the base of the archaebacterial tree. Our results are consistent with previous findings and seem to indicate that eubacterial gene imports (particularly from δ-Proteobacteria, Clostridia and Actinobacteria) were an important factor in archaebacterial history. 
Archaebacteria seem to have long relied on Eubacteria as a source of genetic diversity, and while the precise mechanism that allowed these imports is unknown, we suggest that our results support the view that processes comparable to those through which eukaryotes emerged might have been common in archaebacterial history.
Affiliation(s)
- Wasiu A Akanni
  - School of Biological Sciences and School of Earth Sciences, University of Bristol, Life Sciences Building, Bristol BS8 1TG, UK
  - Department of Biology, National University of Ireland, Maynooth, Co. Kildare, Ireland
  - Department of Life Science, The Natural History Museum, London SW7 5BD, UK
- Karen Siu-Ting
  - School of Biological Sciences and School of Earth Sciences, University of Bristol, Life Sciences Building, Bristol BS8 1TG, UK
  - Department of Biology, National University of Ireland, Maynooth, Co. Kildare, Ireland
  - Department of Life Science, The Natural History Museum, London SW7 5BD, UK
  - Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Ceredigion SY23 3FG, UK
- Christopher J Creevey
  - Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Ceredigion SY23 3FG, UK
- James O McInerney
  - Department of Biology, National University of Ireland, Maynooth, Co. Kildare, Ireland
  - Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Mark Wilkinson
  - Department of Life Science, The Natural History Museum, London SW7 5BD, UK
- Peter G Foster
  - Department of Life Science, The Natural History Museum, London SW7 5BD, UK
- Davide Pisani
  - School of Biological Sciences and School of Earth Sciences, University of Bristol, Life Sciences Building, Bristol BS8 1TG, UK
12
Abstract
The origin of the eukaryotes is a fundamental scientific question that for over 30 years has generated a spirited debate between the competing Archaea (or three domains) tree and the eocyte tree. As eukaryotes ourselves, humans have a personal interest in our origins. Eukaryotes contain their defining organelle, the nucleus, after which they are named. They have a complex evolutionary history, over time acquiring multiple organelles, including mitochondria, chloroplasts, smooth and rough endoplasmic reticula, and other organelles all of which may hint at their origins. It is the evolutionary history of the nucleus and their other organelles that have intrigued molecular evolutionists, myself included, for the past 30 years and which continues to hold our interest as increasingly compelling evidence favours the eocyte tree. As with any orthodoxy, it takes time to embrace new concepts and techniques.
Affiliation(s)
- James A Lake
  - MCDB Biology and Human Genetics, University of California, 232 Boyer Hall, Los Angeles, CA 90095, USA
13
Daubin V, Szöllősi GJ. Horizontal Gene Transfer and the History of Life. Cold Spring Harb Perspect Biol 2016; 8:a018036. [PMID: 26801681 DOI: 10.1101/cshperspect.a018036] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Indexed: 12/20/2022]
Abstract
Microbes acquire DNA from a variety of sources. The last decades, which have seen the development of genome sequencing, have revealed that horizontal gene transfer has been a major evolutionary force that has constantly reshaped genomes throughout evolution. However, because the history of life must ultimately be deduced from gene phylogenies, the lack of methods to account for horizontal gene transfer has thrown into confusion the very concept of the tree of life. As a result, many questions remain open, but emerging methodological developments promise to use information conveyed by horizontal gene transfer that remains unexploited today.
Affiliation(s)
- Vincent Daubin
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, 69000 Lyon, France
  - Centre National de la Recherche Scientifique, Unité Mixte de Recherche 5558, Université Lyon 1, 69622 Villeurbanne, France
14
Kurland CG, Harish A. The phylogenomics of protein structures: The backstory. Biochimie 2015; 119:284-302. [DOI: 10.1016/j.biochi.2015.07.027] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 01/11/2015] [Accepted: 07/28/2015] [Indexed: 12/11/2022]
15
Williams TA, Heaps SE, Cherlin S, Nye TMW, Boys RJ, Embley TM. New substitution models for rooting phylogenetic trees. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140336. [PMID: 26323766 PMCID: PMC4571574 DOI: 10.1098/rstb.2014.0336] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Accepted: 06/04/2015] [Indexed: 12/23/2022] Open
Abstract
The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.
Affiliation(s)
- Tom A Williams
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Sarah E Heaps
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Svetlana Cherlin
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Tom M W Nye
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Richard J Boys
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- T Martin Embley
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
Collapse
|
16
|
Abstract
One of the most fundamental questions in evolutionary biology is the origin of the lineage leading to eukaryotes. Recent phylogenomic analyses have indicated an emergence of eukaryotes from within the radiation of modern Archaea and specifically from a group comprising Thaumarchaeota/"Aigarchaeota" (candidate phylum)/Crenarchaeota/Korarchaeota (TACK). Despite their major implications, these studies were all based on the reconstruction of universal trees and left the exact placement of eukaryotes with respect to the TACK lineage unclear. Here we have applied an original two-step approach that involves the separate analysis of markers shared between Archaea and eukaryotes and between Archaea and Bacteria. This strategy allowed us to use a larger number of markers and greater taxonomic coverage, obtain high-quality alignments, and alleviate tree reconstruction artifacts potentially introduced when analyzing the three domains simultaneously. Our results robustly indicate a sister relationship of eukaryotes with the TACK superphylum that is strongly associated with a distinct root of the Archaea that lies within the Euryarchaeota, challenging the traditional topology of the archaeal tree. Therefore, if we are to embrace an archaeal origin for eukaryotes, our view of the evolution of the third domain of life will have to be profoundly reconsidered, as will many areas of investigation aimed at inferring ancestral characteristics of early life and Earth.
|
17
|
A Guide to Phylogenetic Reconstruction Using Heterogeneous Models—A Case Study from the Root of the Placental Mammal Tree. COMPUTATION 2015. [DOI: 10.3390/computation3020177] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/12/2023]
|
18
|
Katz LA, Grant JR. Taxon-Rich Phylogenomic Analyses Resolve the Eukaryotic Tree of Life and Reveal the Power of Subsampling by Sites. Syst Biol 2014; 64:406-15. [DOI: 10.1093/sysbio/syu126] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 03/09/2014] [Accepted: 12/15/2014] [Indexed: 01/14/2023] Open
Affiliation(s)
- Laura A. Katz
- Department of Biological Sciences, Smith College, Northampton, MA 01063, USA; Program in Organismic and Evolutionary Biology, UMass-Amherst, Amherst, MA 01003, USA
- Jessica R. Grant
- Department of Biological Sciences, Smith College, Northampton, MA 01063, USA; Program in Organismic and Evolutionary Biology, UMass-Amherst, Amherst, MA 01003, USA
|
19
|
The case for an early biological origin of DNA. J Mol Evol 2014; 79:204-12. [PMID: 25425102 PMCID: PMC4247479 DOI: 10.1007/s00239-014-9656-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 10/20/2014] [Accepted: 11/18/2014] [Indexed: 11/16/2022]
Abstract
All life generates deoxyribonucleotides, the building blocks of DNA, via ribonucleotide reductases (RNRs). The complexity of this reaction suggests it did not evolve until well after the advent of templated protein synthesis, which in turn suggests DNA evolved later than both RNA and templated protein synthesis. However, deoxyribonucleotides may have first been synthesised via an alternative, chemically simpler route—the reversal of the deoxyriboaldolase (DERA) step in deoxyribonucleotide salvage. In light of recent work demonstrating that this reaction can drive synthesis of deoxyribonucleosides, we consider what pressures early adoption of this pathway would have placed on cell metabolism. This in turn provides a rationale for the replacement of DERA-dependent DNA production by RNR-dependent production.
|
20
|
McInerney JO, O'Connell MJ, Pisani D. The hybrid nature of the Eukaryota and a consilient view of life on Earth. Nat Rev Microbiol 2014; 12:449-55. [DOI: 10.1038/nrmicro3271] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.7] [Indexed: 02/07/2023]
|
21
|
Williams TA, Foster PG, Cox CJ, Embley TM. An archaeal origin of eukaryotes supports only two primary domains of life. Nature 2014; 504:231-6. [PMID: 24336283 DOI: 10.1038/nature12779] [Citation(s) in RCA: 316] [Impact Index Per Article: 28.7] [Received: 06/17/2013] [Accepted: 10/14/2013] [Indexed: 02/07/2023]
Abstract
The discovery of the Archaea and the proposal of the three-domains 'universal' tree, based on ribosomal RNA and core genes mainly involved in protein translation, catalysed new ideas for cellular evolution and eukaryotic origins. However, accumulating evidence suggests that the three-domains tree may be incorrect: evolutionary trees made using newer methods place eukaryotic core genes within the Archaea, supporting hypotheses in which an archaeon participated in eukaryotic origins by founding the host lineage for the mitochondrial endosymbiont. These results provide support for only two primary domains of life--Archaea and Bacteria--because eukaryotes arose through partnership between them.
Affiliation(s)
- Tom A Williams
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, UK
- Peter G Foster
- Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
- Cymon J Cox
- Centro de Ciências do Mar, Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
- T Martin Embley
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, UK
|
22
|
Lasek-Nesselquist E, Gogarten JP. The effects of model choice and mitigating bias on the ribosomal tree of life. Mol Phylogenet Evol 2013; 69:17-38. [PMID: 23707703 DOI: 10.1016/j.ympev.2013.05.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 01/25/2013] [Revised: 04/26/2013] [Accepted: 05/08/2013] [Indexed: 01/03/2023]
Abstract
Deep-level relationships within Bacteria, Archaea, and Eukarya as well as the relationships of these three domains to each other require resolution. The ribosomal machinery, universal to all cellular life, represents a protein repertoire resistant to horizontal gene transfer, which provides a largely congruent signal necessary for reconstructing a tree suitable as a backbone for life's reticulate history. Here, we generate a ribosomal tree of life from a robust taxonomic sampling of Bacteria, Archaea, and Eukarya to elucidate deep-level intra-domain and inter-domain relationships. Lack of phylogenetic information and systematic errors caused by inadequate models (that cannot account for substitution rate or compositional heterogeneities) or improper model selection compound conflicting phylogenetic signals from HGT and/or paralogy. Thus, we tested several models of varying sophistication on three different datasets, performed removal of fast-evolving or long-branched Archaea and Eukarya, and employed three different strategies to remove compositional heterogeneity to examine their effects on the topological outcome. Our results support a two-domain topology for the tree of life, where Eukarya emerges from within Archaea as sister to a Korarchaeota/Thaumarchaeota (KT) or Crenarchaeota/KT clade for all models under all or at least one of the strategies employed. Taxonomic manipulation allows single-matrix and certain mixture models to vacillate between two-domain and three-domain phylogenies. We find that models vary in their ability to resolve different areas of the tree of life, which does not necessarily correlate with model complexity. For example, both single-matrix and some mixture models recover monophyletic Crenarchaeota and Euryarchaeota archaeal phyla. In contrast, the most sophisticated model recovers a paraphyletic Euryarchaeota but detects two large clades that comprise the Bacteria, which were recovered separately but never together in the other models. 
Overall, models recovered consistent topologies despite dataset modifications due to the removal of compositional bias, which reflects either ineffective bias reduction or robust datasets that allow models to overcome reconstruction artifacts. We recommend a comparative approach for evolutionary models to identify model weaknesses as well as consensus relationships.
|
23
|
Abstract
The traditional bacterial rooting of the three superkingdoms in sequence-based gene trees is inconsistent with new phylogenetic reconstructions based on genome content of compact protein domains. We find that protein domains at the level of the SCOP superfamily (SF) from sequenced genomes implement with maximum parsimony fully resolved rooted trees. Such genome content trees identify archaea and bacteria (akaryotes) as sister clades that diverge from an akaryote common ancestor, LACA. Several eukaryote sister clades diverge from a eukaryote common ancestor, LECA. LACA and LECA descend in parallel from the most recent universal common ancestor (MRUCA), which is not a bacterium. Rather, MRUCA presents 75% of the unique SFs encoded by extant genomes of the three superkingdoms, each encoding a proteome that partially overlaps all others. This alone implies that the common ancestor to the superkingdoms was very complex. Such ancestral complexity is confirmed by phylogenetic reconstructions. In addition, the divergence of proteomes from the complex ancestor in each superkingdom is both reductive in numbers of unique SFs as well as cumulative in the abundance of surviving SFs. These data suggest that the common ancestor was not the first cell lineage and that modern global phylogeny is the crown of a "recently" re-rooted tree. We suggest that a bottlenecked survivor of an environmental collapse, which preceded the flourishing of the modern crown, seeded the current phylogenetic tree.
|
24
|
Brindefalk B, Dessailly BH, Yeats C, Orengo C, Werner F, Poole AM. Evolutionary history of the TBP-domain superfamily. Nucleic Acids Res 2013; 41:2832-45. [PMID: 23376926 PMCID: PMC3597702 DOI: 10.1093/nar/gkt045] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/17/2022] Open
Abstract
The TATA binding protein (TBP) is an essential transcription initiation factor in Archaea and Eucarya. Bacteria lack TBP, and instead use sigma factors for transcription initiation. TBP has a symmetric structure comprising two repeated TBP domains. Using sequence, structural and phylogenetic analyses, we examine the distribution and evolutionary history of the TBP domain, a member of the helix-grip fold family. Our analyses reveal a broader distribution than for TBP, with TBP-domains being present across all three domains of life. In contrast to TBP, all other characterized examples of the TBP domain are present as single copies, primarily within multidomain proteins. The presence of the TBP domain in the ubiquitous DNA glycosylases suggests that this fold traces back to the ancestor of all three domains of life. The TBP domain is also found in RNase HIII, and phylogenetic analyses show that RNase HIII has evolved from bacterial RNase HII via TBP-domain fusion. Finally, our comparative genomic screens confirm and extend earlier reports of proteins consisting of a single TBP domain among some Archaea. These monopartite TBP-domain proteins suggest that this domain is functional in its own right, and that the TBP domain could have first evolved as an independent protein, which was later recruited in different contexts.
Affiliation(s)
- Björn Brindefalk
- Department of Botany, Stockholm University, 106 91 Stockholm, Sweden
|
25
|
Williams TA, Foster PG, Nye TMW, Cox CJ, Embley TM. A congruent phylogenomic signal places eukaryotes within the Archaea. Proc Biol Sci 2012; 279:4870-9. [PMID: 23097517 PMCID: PMC3497233 DOI: 10.1098/rspb.2012.1795] [Citation(s) in RCA: 109] [Impact Index Per Article: 8.4] [Indexed: 12/24/2022] Open
Abstract
Determining the relationships among the major groups of cellular life is important for understanding the evolution of biological diversity, but is difficult given the enormous time spans involved. In the textbook ‘three domains’ tree based on informational genes, eukaryotes and Archaea share a common ancestor to the exclusion of Bacteria. However, some phylogenetic analyses of the same data have placed eukaryotes within the Archaea, as the nearest relatives of different archaeal lineages. We compared the support for these competing hypotheses using sophisticated phylogenetic methods and an improved sampling of archaeal biodiversity. We also employed both new and existing tests of phylogenetic congruence to explore the level of uncertainty and conflict in the data. Our analyses suggested that much of the observed incongruence is weakly supported or associated with poorly fitting evolutionary models. All of our phylogenetic analyses, whether on small subunit and large subunit ribosomal RNA or concatenated protein-coding genes, recovered a monophyletic group containing eukaryotes and the TACK archaeal superphylum comprising the Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota. Hence, while our results provide no support for the iconic three-domain tree of life, they are consistent with an extended eocyte hypothesis whereby vital components of the eukaryotic nuclear lineage originated from within the archaeal radiation.
Affiliation(s)
- Tom A Williams
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, UK
|
26
|
The falsifiability of the models for the origin of eukaryotes. Curr Genet 2011; 57:367-90. [DOI: 10.1007/s00294-011-0357-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 03/14/2011] [Revised: 09/29/2011] [Accepted: 09/30/2011] [Indexed: 01/13/2023]
|
27
|
Kelly S, Wickstead B, Gull K. Archaeal phylogenomics provides evidence in support of a methanogenic origin of the Archaea and a thaumarchaeal origin for the eukaryotes. Proc Biol Sci 2011; 278:1009-18. [PMID: 20880885 PMCID: PMC3049024 DOI: 10.1098/rspb.2010.1427] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.5] [Received: 07/06/2010] [Accepted: 09/06/2010] [Indexed: 11/12/2022] Open
Abstract
We have developed a machine-learning approach to identify 3537 discrete orthologue protein sequence groups distributed across all available archaeal genomes. We show that treating these orthologue groups as binary detection/non-detection data is sufficient to capture the majority of archaeal phylogeny. We subsequently use the sequence data from these groups to infer a method and substitution-model-independent phylogeny. By holding this phylogeny constrained and interrogating the intersection of this large dataset with both the Eukarya and the Bacteria using Bayesian and maximum-likelihood approaches, we propose and provide evidence for a methanogenic origin of the Archaea. By the same criteria, we also provide evidence in support of an origin for Eukarya either within or as sisters to the Thaumarchaea.
Affiliation(s)
- S Kelly
- Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK.
|
28
|
Desmond E, Brochier-Armanet C, Forterre P, Gribaldo S. On the last common ancestor and early evolution of eukaryotes: reconstructing the history of mitochondrial ribosomes. Res Microbiol 2011; 162:53-70. [DOI: 10.1016/j.resmic.2010.10.004] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Received: 09/10/2010] [Accepted: 10/04/2010] [Indexed: 12/31/2022]
|
29
|
Fournier GP, Dick AA, Williams D, Gogarten JP. Evolution of the Archaea: emerging views on origins and phylogeny. Res Microbiol 2010; 162:92-8. [PMID: 21034818 DOI: 10.1016/j.resmic.2010.09.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 04/13/2010] [Accepted: 09/10/2010] [Indexed: 01/04/2023]
Abstract
Of the three domains of life, the Archaea are the most recently discovered and, from the perspective of systematics, perhaps the least understood. More than three decades after their discovery, there is still no overwhelming consensus as to their phylogenetic status, with diverse evidence supporting in varying degrees their monophyly, paraphyly, or even polyphyly. As a further complication, their evolutionary history is inextricably linked to the origin of Eukarya, one of the most challenging problems in evolutionary biology. This exclusive relationship between the eukaryal nucleocytoplasm and the Archaea is further supported by a new methodology for rooting the ribosomal Tree of Life based on amino acid composition. Novel approaches such as utilizing horizontal gene transfers as synchronizing events and branch length analysis of deep paralogs will help to clarify temporal relationships between these lineages, and may prove useful in evaluating the numerous conflicting hypotheses related to the evolution of the Archaea and Eukarya.
Affiliation(s)
- Gregory P Fournier
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
|
30
|
Gribaldo S, Poole AM, Daubin V, Forterre P, Brochier-Armanet C. The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse? Nat Rev Microbiol 2010; 8:743-52. [PMID: 20844558 DOI: 10.1038/nrmicro2426] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.6] [Indexed: 11/09/2022]
Abstract
The origin of eukaryotes and their evolutionary relationship with the Archaea is a major biological question and the subject of intense debate. In the context of the classical view of the universal tree of life, the Archaea and the Eukarya have a common ancestor, the nature of which remains undetermined. Alternative views propose instead that the Eukarya evolved directly from a bona fide archaeal lineage. Several recent large-scale phylogenomic studies using an array of approaches are divided in supporting either one or the other scenario, despite analysing largely overlapping data sets of universal genes. We examine the reasons for such a lack of consensus and consider how alternative approaches may enable progress in answering this fascinating and as-yet-unresolved question.
|
31
|
Foster PG, Cox CJ, Embley TM. The primary divisions of life: a phylogenomic approach employing composition-heterogeneous methods. Philos Trans R Soc Lond B Biol Sci 2009; 364:2197-207. [PMID: 19571240 PMCID: PMC2873002 DOI: 10.1098/rstb.2009.0034] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Indexed: 12/23/2022] Open
Abstract
The three-domains tree, which depicts eukaryotes and archaebacteria as monophyletic sister groups, is the dominant model for early eukaryotic evolution. By contrast, the 'eocyte hypothesis', where eukaryotes are proposed to have originated from within the archaebacteria as sister to the Crenarchaeota (also called the eocytes), has been largely neglected in the literature. We have investigated support for these two competing hypotheses from molecular sequence data using methods that attempt to accommodate the across-site compositional heterogeneity and across-tree compositional and rate matrix heterogeneity that are manifest features of these data. When ribosomal RNA genes were analysed using standard methods that do not adequately model these kinds of heterogeneity, the three-domains tree was supported. However, this support was eroded or lost when composition-heterogeneous models were used, with concomitant increase in support for the eocyte tree for eukaryotic origins. Analysis of combined amino acid sequences from 41 protein-coding genes supported the eocyte tree, whether or not composition-heterogeneous models were used. The possible effects of substitutional saturation of our data were examined using simulation; these results suggested that saturation is delayed by among-site rate variation in the sequences, and that phylogenetic signal for ancient relationships is plausibly present in these data.
Affiliation(s)
- Peter G. Foster
- Department of Zoology, Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Cymon J. Cox
- Centro de Ciências do Mar, Universidade do Algarve, Campus de Gambelas, 8005-139 Faro, Portugal
- T. Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle NE2 4HH, UK
|
32
|
|
33
|
Abstract
The origin of the eukaryotic genetic apparatus is thought to be central to understanding the evolution of the eukaryotic cell. Disagreement about the source of the relevant genes has spawned competing hypotheses for the origins of the eukaryote nuclear lineage. The iconic rooted 3-domains tree of life shows eukaryotes and archaebacteria as separate groups that share a common ancestor to the exclusion of eubacteria. By contrast, the eocyte hypothesis has eukaryotes originating within the archaebacteria and sharing a common ancestor with a particular group called the Crenarchaeota or eocytes. Here, we have investigated the relative support for each hypothesis from analysis of 53 genes spanning the 3 domains, including essential components of the eukaryotic nucleic acid replication, transcription, and translation apparatus. As an important component of our analysis, we investigated the fit between model and data with respect to composition. Compositional heterogeneity is a pervasive problem for reconstruction of ancient relationships, which, if ignored, can produce an incorrect tree with strong support. To mitigate its effects, we used phylogenetic models that allow for changing nucleotide or amino acid compositions over the tree and data. Our analyses favor a topology that supports the eocyte hypothesis rather than archaebacterial monophyly and the 3-domains tree of life.
|
34
|
Spagna JC, Gillespie RG. More data, fewer shifts: Molecular insights into the evolution of the spinning apparatus in non-orb-weaving spiders. Mol Phylogenet Evol 2008; 46:347-68. [DOI: 10.1016/j.ympev.2007.08.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 05/04/2007] [Revised: 07/31/2007] [Accepted: 08/08/2007] [Indexed: 10/22/2022]
|
35
|
|
36
|
Abstract
Numerous scenarios explain the origin of the eukaryote cell by fusion or endosymbiosis between an archaeon and a bacterium (and sometimes a third partner). We evaluate these hypotheses using the following three criteria. Can the data be explained by the null hypothesis that new features arise sequentially along a stem lineage? Second, hypotheses involving an archaeon and a bacterium should undergo standard phylogenetic tests of gene distribution. Third, accounting for past events by processes observed in modern cells is preferable to postulating unknown processes that have never been observed. For example, there are many eukaryote examples of bacteria as endosymbionts or endoparasites, but none known in archaea. Strictly post-hoc hypotheses that ignore this third criterion should be avoided. Applying these three criteria significantly narrows the number of plausible hypotheses. Given current knowledge, our conclusion is that the eukaryote lineage must have diverged from an ancestor of archaea well prior to the origin of the mitochondrion. Significantly, the absence of ancestrally amitochondriate eukaryotes (archezoa) among extant eukaryotes is neither evidence for an archaeal host for the ancestor of mitochondria, nor evidence against a eukaryotic host.
Affiliation(s)
- Anthony M Poole
- Department of Molecular Biology and Functional Genomics, Stockholm University, Sweden.
|
37
|
Abstract
The idea that some eukaryotes primitively lacked mitochondria and were true intermediates in the prokaryote-to-eukaryote transition was an exciting prospect. It spawned major advances in understanding anaerobic and parasitic eukaryotes and those with previously overlooked mitochondria. But the evolutionary gap between prokaryotes and eukaryotes is now deeper, and the nature of the host that acquired the mitochondrion more obscure, than ever before.
Affiliation(s)
- T Martin Embley
- School of Biology, The Devonshire Building, University of Newcastle upon Tyne, Newcastle NE1 7RU, UK.
|
38
|
|
39
|
Poole AM, Logan DT. Modern mRNA proofreading and repair: clues that the last universal common ancestor possessed an RNA genome? Mol Biol Evol 2005; 22:1444-55. [PMID: 15774424 PMCID: PMC7107533 DOI: 10.1093/molbev/msi132] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.3] [Indexed: 01/01/2023] Open
Abstract
RNA repair has now been demonstrated to be a genuine biological process and appears to be present in all three domains of life. In this article, we consider what this might mean for the transition from an early RNA-dominated world to modern cells possessing genetically encoded proteins and DNA. There are significant gaps in our understanding of how the modern protein-DNA world could have evolved from a simpler system, and it is currently uncertain whether DNA genomes evolved once or twice. Against this backdrop, the discovery of RNA repair in modern cells is timely food for thought and brings us conceptually one step closer to understanding how RNA genomes were replaced by DNA genomes. We have examined the available literature on multisubunit RNA polymerase structure and function and conclude that a strong case can be made that the Last Universal Common Ancestor (LUCA) possessed a repair-competent RNA polymerase, which would have been capable of acting on an RNA genome. However, while this lends credibility to the proposal that the LUCA had an RNA genome, the alternative, that LUCA had a DNA genome, cannot be completely ruled out.
Affiliation(s)
- Anthony M Poole
- Department of Molecular Biology and Functional Genomics, Stockholm University, Stockholm, Sweden.
|
40
|
Anderson FE, Swofford DL. Should we be worried about long-branch attraction in real data sets? Investigations using metazoan 18S rDNA. Mol Phylogenet Evol 2004; 33:440-51. [PMID: 15336677 DOI: 10.1016/j.ympev.2004.06.015] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.3] [Received: 02/04/2004] [Revised: 06/01/2004] [Indexed: 11/22/2022]
Abstract
Although long-branch attraction (LBA) is frequently cited as the cause of anomalous phylogenetic groupings, few examples of LBA involving real sequence data are known. We have found several cases of probable LBA by analyzing subsamples from an alignment of 18S rDNA sequences for 133 metazoans. In one example, maximum parsimony analysis of sequences from two rotifers, a ctenophore, and a polychaete annelid resulted in strong support for a tree grouping two "long-branch taxa" (a rotifer and the ctenophore). Maximum-likelihood analysis of the same sequences yielded strong support for a more biologically reasonable "rotifer monophyly" tree. Attempts to break up long branches for problematic subsamples through increased taxon sampling reduced, but did not eliminate, LBA problems. Exhaustive analyses of all quartets for a subset of 50 sequences were performed in order to compare the performance of maximum likelihood, equal-weights parsimony, and two additional variants of parsimony; these methods do differ substantially in their rates of failure to recover trees consistent with well established, but highly unresolved phylogenies. Power analyses using simulations suggest that some incorrect inferences by maximum parsimony are due to statistical inconsistency and that when estimates of central branch lengths for certain quartets are very low, maximum-likelihood analyses have difficulty recovering accepted phylogenies even with large amounts of data. These examples demonstrate that LBA problems can occur in real data sets, and they provide an opportunity to investigate causes of incorrect inferences.
Affiliation(s)
- Frank E Anderson
- Department of Zoology and Center for Systematic Biology, Southern Illinois University, Carbondale, IL 62901, USA
|
41
|
Cejchan PA. LUCA, or just a conserved Archaeon?: Comments on Xue et al. (2003). Gene 2004; 333:47-50. [PMID: 15177679 DOI: 10.1016/j.gene.2004.02.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Received: 06/27/2003] [Revised: 09/24/2003] [Accepted: 02/05/2004] [Indexed: 11/24/2022]
Abstract
In their recent paper, Xue et al. used an unusual technique of rooting the universal phylogenetic tree, which resulted in positioning of the last universal common ancestor within Archaea. The present paper brings some criticisms on the methods and results achieved.
Affiliation(s)
- Peter A Cejchan
- Laboratory of Paleobiology and Paleoecology, IG ASCR, Rozvojova 135, Prague CZ-16502, Czech Republic.
|
42
|
|
43
|
Abstract
The phylogeny and timescale of life are becoming better understood as the analysis of genomic data from model organisms continues to grow. As a result, discoveries are being made about the early history of life and the origin and development of complex multicellular life. This emerging comparative framework and the emphasis on historical patterns is helping to bridge barriers among organism-based research communities.
Affiliation(s)
- S Blair Hedges
- NASA Astrobiology Institute and Department of Biology, 208 Mueller Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
|
44
|
Jain R, Rivera MC, Moore JE, Lake JA. Horizontal gene transfer in microbial genome evolution. Theor Popul Biol 2002; 61:489-95. [PMID: 12167368 DOI: 10.1006/tpbi.2002.1596] [Citation(s) in RCA: 135] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Horizontal gene transfer is the collective name for processes that permit the exchange of DNA among organisms of different species. Only recently has it been recognized as a significant contribution to inter-organismal gene exchange. Traditionally, it was thought that microorganisms evolved clonally, passing genes from mother to daughter cells with little or no exchange of DNA among diverse species. Studies of microbial genomes, however, have shown that genomes contain genes that are closely related to a number of different prokaryotes, sometimes to phylogenetically very distantly related ones (Doolittle et al., 1990, J. Mol. Evol. 31, 383-388; Karlin et al., 1997, J. Bacteriol. 179, 3899-3913; Karlin et al., 1998, Annu. Rev. Genet. 32, 185-225; Lawrence and Ochman, 1998, Proc. Natl. Acad. Sci. USA 95, 9413-9417; Rivera et al., 1998, Proc. Natl. Acad. Sci. USA 95, 6239-6244; Campbell, 2000, Theor. Popul. Biol. 57, 71-77; Doolittle, 2000, Sci. Am. 282, 90-95; Ochman and Jones, 2000, EMBO J. 19, 6637-6643; Boucher et al., 2001, Curr. Opin. Microbiol. 4, 285-289; Wang et al., 2001, Mol. Biol. Evol. 18, 792-800). Whereas prokaryotic and eukaryotic evolution was once reconstructed from a single 16S ribosomal RNA (rRNA) gene, the analysis of complete genomes is beginning to yield a different picture of microbial evolution, one that is wrought with the lateral movement of genes across vast phylogenetic distances (Lane et al., 1988, Methods Enzymol. 167, 138-144; Lake and Rivera, 1996, Proc. Natl. Acad. Sci. USA 91, 2880-2881; Lake et al., 1999, Science 283, 2027-2028).
Affiliation(s)
- Ravi Jain
- Molecular Biology Institute, University of California, Los Angeles 90095, USA
|
45
|
Abstract
Archaea, members of the third domain of life, are bacterial-looking prokaryotes that harbour many unique genotypic and phenotypic properties, testifying to their peculiar evolutionary status. The archaeal ancestor was probably a hyperthermophilic anaerobe. Two archaeal phyla are presently recognized, the Euryarchaeota and the Crenarchaeota. Methanogenesis was the main invention that occurred in the euryarchaeal phylum and is now shared by several archaeal groups. Adaptation to aerobic conditions occurred several times independently in both Euryarchaeota and Crenarchaeota. Recently, many new groups of Archaea that have not yet been cultured have been detected by PCR amplification of 16S ribosomal RNA from environmental samples. The phenotypic and genotypic characterization of these new groups is now a top priority for further studies on archaeal evolution.
Affiliation(s)
- Patrick Forterre
- Institut de Génétique et Microbiologie, UMR 8621 CNRS, Bat 409, Université Paris-Sud, 91405 Orsay Cedex, France.
|
46
|
Buckley TR, Cunningham CW. The effects of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support. Mol Biol Evol 2002; 19:394-405. [PMID: 11919280 DOI: 10.1093/oxfordjournals.molbev.a004094] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
The use of parameter-rich substitution models in molecular phylogenetics has been criticized on the basis that these models can cause a reduction both in accuracy and in the ability to discriminate among competing topologies. We have explored the relationship between nucleotide substitution model complexity and nonparametric bootstrap support under maximum likelihood (ML) for six data sets for which the true relationships are known with a high degree of certainty. We also performed equally weighted maximum parsimony analyses in order to assess the effects of ignoring branch length information during tree selection. We observed that maximum parsimony gave the lowest mean estimate of bootstrap support for the correct set of nodes relative to the ML models for every data set except one. For several data sets, we established that the exact distribution used to model among-site rate variation was critical for a successful phylogenetic analysis. Site-specific rate models were shown to perform very poorly relative to gamma and invariable sites models for several of the data sets most likely because of the gross underestimation of branch lengths. The invariable sites model also performed poorly for several data sets where this model had a poor fit to the data, suggesting that addition of the gamma distribution can be critical. Estimates of bootstrap support for the correct nodes often increased under gamma and invariable sites models relative to equal rates models. Our observations are contrary to the prediction that such models cause reduced confidence in phylogenetic hypotheses. Our results raise several issues regarding the process of model selection, and we briefly discuss model selection uncertainty and the role of sensitivity analyses in molecular phylogenetics.
Affiliation(s)
- Thomas R Buckley
- Department of Biology, Duke University, Durham, North Carolina, USA.
|
47
|
Podani J, Oltvai ZN, Jeong H, Tombor B, Barabási AL, Szathmáry E. Comparable system-level organization of Archaea and Eukaryotes. Nat Genet 2001; 29:54-6. [PMID: 11528391 DOI: 10.1038/ng708] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
A central and long-standing issue in evolutionary theory is the origin of the biological variation upon which natural selection acts. Some hypotheses suggest that evolutionary change represents an adaptation to the surrounding environment within the constraints of an organism's innate characteristics. Elucidation of the origin and evolutionary relationship of species has been complemented by nucleotide sequence and gene content analyses, with profound implications for recognizing life's major domains. Understanding of evolutionary relationships may be further expanded by comparing systemic higher-level organization among species. Here we employ multivariate analyses to evaluate the biochemical reaction pathways characterizing 43 species. Comparison of the information transfer pathways of Archaea and Eukaryotes indicates a close relationship between these domains. In addition, whereas eukaryotic metabolic enzymes are primarily of bacterial origin, the pathway-level organization of archaeal and eukaryotic metabolic networks is more closely related. Our analyses therefore suggest that during the symbiotic evolution of eukaryotes, incorporation of bacterial metabolic enzymes into the proto-archaeal proteome was constrained by the host's pre-existing metabolic architecture.
Affiliation(s)
- J Podani
- Institute for Advanced Study, Collegium Budapest, H-1014 Budapest, Hungary
|
48
|
Koski LB, Golding GB. The closest BLAST hit is often not the nearest neighbor. J Mol Evol 2001; 52:540-2. [PMID: 11443357 DOI: 10.1007/s002390010184] [Citation(s) in RCA: 322] [Impact Index Per Article: 13.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2001] [Accepted: 02/20/2001] [Indexed: 10/25/2022]
Abstract
It is well known that basing phylogenetic reconstructions on uncorrected genetic distances can lead to errors in their reconstruction. Nevertheless, it is often common practice to report simply the most similar BLAST (Altschul et al. 1997) hit in genomic reports that discuss many genes (Ruepp et al. 2000; Freiberg et al. 1997). This is because BLAST hits can provide a rapid, efficient, and concise analysis of many genes at once. These hits are often interpreted to imply that the gene is most closely related to the gene or protein in the databases that returned the closest BLAST hit. Though these two may coincide, for many genes, particularly genes with few homologs, they may not be the same. There are a number of circumstances that can account for such limitations in accuracy (Eisen 2000). We stress here that genes appearing to be the most similar based on BLAST hits are often not each other's closest relatives phylogenetically. The extent to which this occurs depends on the availability of close relatives present in the databases. As an example we have chosen the analysis of the genomes of a crenarchaeote species, Aeropyrum pernix, an organism with few close relatives fully sequenced, and Escherichia coli, an organism whose closest relative, Salmonella typhimurium, is completely sequenced.
|