1
Harish A. Protein structures unravel the signatures and patterns of deep time evolution. QRB DISCOVERY 2024; 5:e3. [PMID: 38616890 PMCID: PMC11016368 DOI: 10.1017/qrd.2024.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 11/13/2023] [Accepted: 12/12/2023] [Indexed: 04/16/2024] Open
Abstract
The formulation and testing of hypotheses using 'big biology data' often lie at the interface of computational biology and structural biology. The Protein Data Bank (PDB), which was established about 50 years ago, catalogs three-dimensional (3D) shapes of organic macromolecules and showcases a structural view of biology. The comparative analysis of the structures of homologs, particularly of proteins, from different species has significantly improved the in-depth analyses of molecular and cell biological questions. In addition, computational tools that were developed to analyze the 'protein universe' are providing the means for efficient resolution of longstanding debates in cell and molecular evolution. In celebrating the golden jubilee of the PDB, much has been written about the transformative impact of the PDB on a broad range of fields of scientific inquiry and how structural biology transformed the study of the fundamental processes of life. Yet, the transforming influence of the PDB on one field of inquiry of fundamental interest, the reconstruction of the distant biological past, has gone almost unnoticed. Here, I discuss the recent advances to highlight how insights and tools of structural biology are bearing on the data required for the empirical resolution of vigorously debated and apparently contradicting hypotheses in evolutionary biology. Specifically, I show that evolutionary characters defined by protein structure are superior to conventional sequence characters for reliable, data-driven resolution of competing hypotheses about the origins of the major clades of life and the evolutionary relationships among those clades. Since the better-quality data unequivocally support two primary domains of life, it is imperative that the primary classification of life be revised accordingly.
2
Abstract
The rebuttal of the prokaryote-eukaryote dichotomy and the elaboration of the three domains concept by Carl Woese and colleagues has been a breakthrough in biology. With the methodologies available at that time, they showed that a single molecule, the 16S ribosomal RNA, could reveal the global organization of the living world. Later on, mining archaeal genomes led to major discoveries in archaeal molecular biology, providing a third model for comparative molecular biology. These analyses revealed the strong eukaryal flavor of the basic molecular fabric of Archaea and support rooting the universal tree between Bacteria and Arcarya (the clade grouping Archaea and Eukarya). However, in contradiction with this conclusion, it remains to be understood why the archaeal and bacterial mobilomes are so similar to each other and so different from the eukaryal one. In recent years, the number of recognized archaeal lineages (phyla?) has exploded. The archaeal nomenclature is now in turmoil, and debates about the nature of the last universal common ancestor, the last archaeal common ancestor, and the topology of the tree of life are still going on. Interestingly, the expansion of the archaeal eukaryome, especially in the Asgard archaea, has provided new opportunities to study eukaryogenesis. In recent years, the application to Archaea of the new methodologies described in the various chapters of this book has opened exciting avenues to study the molecular biology and the physiology of these fascinating microorganisms.
Affiliation(s)
- Patrick Forterre
- Institut Pasteur, 25 rue du Docteur Roux, 75015, Paris, France.
- Institute for Integrative Biology of the Cell, Université Paris-Saclay, Gif-sur-Yvette, France.
3
Mughal F, Nasir A, Caetano-Anollés G. The origin and evolution of viruses inferred from fold family structure. Arch Virol 2020; 165:2177-2191. [PMID: 32748179 PMCID: PMC7398281 DOI: 10.1007/s00705-020-04724-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/02/2020] [Accepted: 05/30/2020] [Indexed: 12/16/2022]
Abstract
The canonical frameworks of viral evolution describe viruses as cellular predecessors, reduced forms of cells, or entities that escaped cellular control. The discovery of giant viruses has changed these standard paradigms. Their genetic, proteomic and structural complexities resemble those of cells, prompting a redefinition and reclassification of viruses. In a previous genome-wide analysis of the evolution of structural domains in proteomes, with domains defined at the fold superfamily level, we found the origins of viruses intertwined with those of ancient cells. Here, we extend these data-driven analyses to the study of fold families, confirming the co-evolution of viruses and ancient cells and the genetic ability of viruses to foster molecular innovation. The results support our suggestion that viruses arose by genomic reduction from ancient cells and validate a co-evolutionary ‘symbiogenic’ model of viral origins.
Affiliation(s)
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Arshan Nasir
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM, USA
- Department of Biosciences, COMSATS University Islamabad, Islamabad, Pakistan
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA.
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA.
4
Boguszewska K, Szewczuk M, Kaźmierczak-Barańska J, Karwowski BT. The Similarities between Human Mitochondria and Bacteria in the Context of Structure, Genome, and Base Excision Repair System. Molecules 2020; 25:E2857. [PMID: 32575813 PMCID: PMC7356350 DOI: 10.3390/molecules25122857] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 05/19/2020] [Revised: 06/17/2020] [Accepted: 06/19/2020] [Indexed: 02/06/2023] Open
Abstract
Mitochondria emerged from bacterial ancestors during endosymbiosis and are crucial for cellular processes such as energy production and homeostasis, stress responses, cell survival, and more. They are the site of aerobic respiration and adenosine triphosphate (ATP) production in eukaryotes. However, oxidative phosphorylation (OXPHOS) is also the source of reactive oxygen species (ROS), which are both important and dangerous for the cell. Human mitochondria contain mitochondrial DNA (mtDNA), and its integrity may be endangered by the action of ROS. Fortunately, human mitochondria have repair mechanisms that protect mtDNA and repair lesions that may contribute to the occurrence of mutations. Mutagenesis of the mitochondrial genome may manifest in the form of pathological states such as mitochondrial, neurodegenerative, and/or cardiovascular diseases, premature aging, and cancer. The review describes the mitochondrial structure, genome, and the main mitochondrial repair mechanism for oxidative lesions, base excision repair (BER), in the context of common features between human mitochondria and bacteria. The authors present a holistic view of the similarities of mitochondria and bacteria to show that bacteria may be an interesting experimental model for studying mitochondrial diseases, especially those where the mechanism of DNA repair is impaired.
Affiliation(s)
- Bolesław T. Karwowski
- DNA Damage Laboratory of Food Science Department, Faculty of Pharmacy, Medical University of Lodz, ul. Muszynskiego 1, 90-151 Lodz, Poland; (K.B.); (M.S.); (J.K.-B.)
5
Abstract
Background: Locating the root node of the "tree of life" (ToL) is one of the hardest problems in phylogenetics, given the time depth. The root node, or the universal common ancestor (UCA), groups descendants into organismal clades/domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA, and that Asgard archaea are sister to other archaea. The other 2D-ToL proposes that eukaryotes emerged from within archaea and places Asgard archaea as sister to eukaryotes. Williams et al. (Nature Ecol. Evol. 4: 138-147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: The poor resolution of the archaea in their analysis, despite employing amino acid alignments from thousands of proteins and the best-fitting substitution models, contradicts their conclusions. We argue that they overlooked important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data. Which 2D-ToL is better supported depends on which kind of molecular features are better for resolving common ancestors at the roots of clades: protein domains or their component amino acids. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. Clarifications: It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. We show that protein structural domains support more reliable phylogenetic reconstructions of deep-diverging clades in the ToL. Accordingly, Eukaryotes and Akaryotes are better supported clades in a 2D-ToL.
Affiliation(s)
- David Morrison
- Department of Organismal Biology, Systematic Biology, Uppsala University, Uppsala, 752 36, Sweden
6
Williams TA, Cox CJ, Foster PG, Szöllősi GJ, Embley TM. Phylogenomics provides robust support for a two-domains tree of life. Nat Ecol Evol 2020; 4:138-147. [PMID: 31819234 PMCID: PMC6942926 DOI: 10.1038/s41559-019-1040-x] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Received: 07/23/2019] [Accepted: 10/15/2019] [Indexed: 11/09/2022]
Abstract
Hypotheses about the origin of eukaryotic cells are classically framed within the context of a universal 'tree of life' based on conserved core genes. Vigorous ongoing debate about eukaryote origins is based on assertions that the topology of the tree of life depends on the taxa included and the choice and quality of genomic data analysed. Here we have reanalysed the evidence underpinning those claims and apply more data to the question by using supertree and coalescent methods to interrogate >3,000 gene families in archaea and eukaryotes. We find that eukaryotes consistently originate from within the archaea in a two-domains tree when due consideration is given to the fit between model and data. Our analyses support a close relationship between eukaryotes and Asgard archaea and identify the Heimdallarchaeota as the current best candidate for the closest archaeal relatives of the eukaryotic nuclear lineage.
Affiliation(s)
- Tom A Williams
- School of Biological Sciences, University of Bristol, Bristol, UK.
- Cymon J Cox
- Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
- Peter G Foster
- Department of Life Sciences, Natural History Museum, London, UK
- Gergely J Szöllősi
- MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary
- Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
- Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany, Hungary
- T Martin Embley
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, UK.
7
Caetano-Anollés D, Nasir A, Kim KM, Caetano-Anollés G. Testing Empirical Support for Evolutionary Models that Root the Tree of Life. J Mol Evol 2019; 87:131-142. [PMID: 30887086 PMCID: PMC6443624 DOI: 10.1007/s00239-019-09891-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/10/2018] [Accepted: 03/06/2019] [Indexed: 12/12/2022]
Abstract
Trees of life (ToLs) can only be rooted with direct methods that seek optimization of character state information in ingroup taxa. This involves optimizing phylogenetic tree, model and data in an exercise of reciprocal illumination. Rooted ToLs have been built from a census of protein structural domains in proteomes using two kinds of models. Fully-reversible models use standard-ordered (additive) characters and Wagner parsimony to generate unrooted trees of proteomes that are then rooted with Weston's generality criterion. Non-reversible models directly build rooted trees with unordered characters and asymmetric stepmatrices of transformation costs that penalize gain over loss of domains. Here, we test the empirical support for the evolutionary models with character state reconstruction methods using two published proteomic datasets. We show that the reversible models match reconstructed frequencies of character change and are faithful to the distribution of serial homologies in trees. In contrast, the non-reversible models go counter to trends in the data they must explain, attracting organisms with large proteomes to the base of the rooted trees while violating the triangle inequality of distances. This can lead to serious reconstruction inconsistencies that show model inadequacy. Our study highlights the aprioristic perils of disposing of countering evidence in natural history reconstruction.
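As a rough illustration of the triangle-inequality criterion mentioned in this abstract, the Python sketch below checks whether a stepmatrix of transformation costs satisfies cost(i→k) ≤ cost(i→j) + cost(j→k) for every triple of character states. The function name and both three-state matrices are hypothetical; they are not taken from the paper or its datasets and only show the kind of check involved.

```python
from itertools import permutations


def triangle_violations(cost):
    """Return all (i, j, k) state triples where the direct cost i->k
    exceeds the cost of the indirect path i->j->k."""
    n = len(cost)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if cost[i][k] > cost[i][j] + cost[j][k]
    ]


# Hypothetical stepmatrices: rows are "from" states, columns are "to" states.
# Ordered (additive) costs: moving between states costs their distance apart.
ordered = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]

# Hypothetical asymmetric costs in which one gain (0 -> 2) is priced far
# above the two-step route through state 1.
asymmetric = [
    [0, 1, 5],
    [1, 0, 1],
    [1, 1, 0],
]

print(triangle_violations(ordered))     # []
print(triangle_violations(asymmetric))  # [(0, 1, 2)]
```

A non-empty result means some transformation is cheaper when routed through an intermediate state, which is the sort of internal inconsistency the abstract describes for the non-reversible models.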
Affiliation(s)
- Derek Caetano-Anollés
- Department of Evolutionary Genetics, Max-Planck-Institut für Evolutionsbiologie, Plön, Germany.
- Arshan Nasir
- Department of Biosciences, COMSATS University, Islamabad, 45550, Pakistan
- Kyung Mo Kim
- Division of Polar Life Sciences, Korea Polar Research Institute, Incheon, Republic of Korea
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, and Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
8
Start Codon Recognition in Eukaryotic and Archaeal Translation Initiation: A Common Structural Core. Int J Mol Sci 2019; 20:ijms20040939. [PMID: 30795538 PMCID: PMC6412873 DOI: 10.3390/ijms20040939] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/05/2018] [Revised: 02/11/2019] [Accepted: 02/13/2019] [Indexed: 01/12/2023] Open
Abstract
Understanding molecular mechanisms of ribosomal translation sheds light on the emergence and evolution of protein synthesis in the three domains of life. Universally, ribosomal translation is described in three steps: initiation, elongation and termination. During initiation, a macromolecular complex assembled around the small ribosomal subunit selects the start codon on the mRNA and defines the open reading frame. In this review, we focus on the comparison of start codon selection mechanisms in eukaryotes and archaea. Eukaryotic translation initiation is a very complicated process, involving many initiation factors. The most widespread mechanism for the discovery of the start codon is the scanning of the mRNA by a pre-initiation complex until the first AUG codon in a correct context is found. In archaea, long-range scanning does not occur because of the presence of Shine-Dalgarno (SD) sequences or of short 5′ untranslated regions. However, archaeal and eukaryotic translation initiations have three initiation factors in common: e/aIF1, e/aIF1A and e/aIF2 are directly involved in the selection of the start codon. Therefore, the idea that these archaeal and eukaryotic factors fulfill similar functions within a common structural ribosomal core complex has emerged. A divergence between eukaryotic and archaeal factors allowed for the adaptation to the long-range scanning process versus the SD mediated prepositioning of the ribosome.
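The scanning mechanism described in this abstract can be sketched very loosely in Python as a 5′→3′ search for the first AUG triplet. This is a deliberate simplification: it ignores the "correct context" requirement, the initiation factors, and the ribosome itself, and the function name and example sequences are hypothetical.

```python
def first_aug(mrna: str) -> int:
    """Return the 0-based position of the first AUG found when reading
    the sequence 5'->3', or -1 if there is none.

    DNA-style input (T instead of U) is accepted and converted to RNA.
    """
    rna = mrna.upper().replace("T", "U")
    return rna.find("AUG")


# Hypothetical transcript: a 6-nucleotide leader followed by a start codon.
print(first_aug("GGCACCAUGGCUAUGUAA"))  # 6
print(first_aug("GGCC"))                # -1
```

A more faithful model would also score the nucleotides flanking each candidate AUG before accepting it as the start codon.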
9
Harish A. What is an archaeon and are the Archaea really unique? PeerJ 2018; 6:e5770. [PMID: 30357005 PMCID: PMC6196074 DOI: 10.7717/peerj.5770] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/15/2018] [Accepted: 09/05/2018] [Indexed: 12/05/2022] Open
Abstract
The recognition of the group Archaea as a major branch of the tree of life (ToL) prompted a new view of the evolution of biodiversity. The genomic representation of archaeal biodiversity has since significantly increased. In addition, advances in phylogenetic modeling of multi-locus datasets have resolved many recalcitrant branches of the ToL. Despite the technical advances and an expanded taxonomic representation, two important aspects of the origins and evolution of the Archaea remain controversial, even as we celebrate the 40th anniversary of the monumental discovery. These issues concern (i) the uniqueness (monophyly) of the Archaea, and (ii) the evolutionary relationships of the Archaea to the Bacteria and the Eukarya; both of these are relevant to the deep structure of the ToL. To explore the causes of this persistent ambiguity, I examine multiple datasets and different phylogenetic approaches that support contradicting conclusions. I find that the uncertainty is primarily due to a scarcity of information in standard datasets (universal core-genes datasets) to reliably resolve the conflicts. These conflicts can be resolved efficiently by comparing patterns of variation in the distribution of functional genomic signatures, which, unlike patterns of primary sequence variation, are less diffuse. Relatively lower heterogeneity in distribution patterns minimizes uncertainties and supports statistically robust phylogenetic inferences, especially of the earliest divergences of life. This case study further highlights the limitations of primary sequence data in resolving difficult phylogenetic problems, and raises questions about evolutionary inferences drawn from the analyses of sequence alignments of a small set of core genes. In particular, the findings of this study corroborate the growing consensus that reversible substitution mutations may not be optimal phylogenetic markers for resolving early divergences in the ToL, nor for determining the polarity of evolutionary transitions across the ToL.
Affiliation(s)
- Ajith Harish
- Department of Cell and Molecular Biology, Program in Molecular Biology, Uppsala University, Uppsala, Sweden
10
Di Giulio M. On Earth, there would be a number of fundamental kinds of primary cells – cellular domains – greater than or equal to four. J Theor Biol 2018; 443:10-17. [DOI: 10.1016/j.jtbi.2018.01.025] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/09/2017] [Revised: 01/10/2018] [Accepted: 01/19/2018] [Indexed: 11/15/2022]
11
Harish A, Kurland CG. Mitochondria are not captive bacteria. J Theor Biol 2017; 434:88-98. [PMID: 28754286 DOI: 10.1016/j.jtbi.2017.07.011] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 04/01/2017] [Revised: 07/10/2017] [Accepted: 07/14/2017] [Indexed: 10/19/2022]
Abstract
Lynn Sagan's conjecture (1967) that three of the fundamental organelles observed in eukaryote cells, specifically mitochondria, plastids and flagella, were once free-living primitive (prokaryotic) cells was accepted after considerable opposition. Even though the idea was swiftly refuted for the specific case of origins of flagella in eukaryotes, the symbiosis model in general was accepted for decades as a realistic hypothesis to describe the endosymbiotic origins of eukaryotes. However, a systematic analysis of the origins of the mitochondrial proteome based on empirical genome evolution models now indicates that 97% of modern mitochondrial protein domains, as well as their homologues in bacteria and archaea, were present in the universal common ancestor (UCA) of the modern tree of life (ToL). These protein domains are universal modular building blocks of modern genes and genomes, each of which is identified by a unique tertiary structure and a specific biochemical function as well as a characteristic sequence profile. Further, phylogeny reconstructed from genome-scale evolution models reveals that Eukaryotes and Akaryotes (archaea and bacteria) descend independently from the UCA. That is to say, Eukaryotes and Akaryotes are both primordial lineages that evolved in parallel. Finally, there is no indication of massive inter-lineage exchange of coding sequences during the descent of the two lineages. Accordingly, we suggest that the evolution of the mitochondrial proteome was autogenic (endogenic) and not endosymbiotic (exogenic).
Affiliation(s)
- Ajith Harish
- Department of Cell and Molecular Biology, Section of Structural and Molecular Biology, Uppsala University, Uppsala, Sweden.
- Charles G Kurland
- Department of Biology, Section of Microbial Ecology, Lund University, Lund, Sweden.
12
Harish A, Kurland CG. Empirical genome evolution models root the tree of life. Biochimie 2017; 138:137-155. [DOI: 10.1016/j.biochi.2017.04.014] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/26/2016] [Accepted: 04/25/2017] [Indexed: 01/05/2023]