1
Mughal F, Caetano-Anollés G. Evolution of intrinsic disorder in the structural domains of viral and cellular proteomes. Sci Rep 2025; 15:2878. [PMID: 39843714 PMCID: PMC11754631 DOI: 10.1038/s41598-025-86045-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Accepted: 01/07/2025] [Indexed: 01/24/2025] Open
Abstract
Intrinsically disordered regions are flexible regions that complement the typical structured regions of proteins. Little is known however about their evolution. Here we leverage a comparative and evolutionary genomics approach to analyze intrinsic disorder in the structural domains of thousands of proteomes. Our analysis revealed that viral and cellular proteomes employ similar strategies to increase disorder but achieve different goals. Viral proteomes evolve disorder for economy of genomic material and multifunctionality. On the other hand, cellular proteomes evolve disorder to advance functionality with increasing genomic complexity. Remarkably, phylogenomic analysis of intrinsic disorder showed that ancient domains were ordered and that disorder evolved as a benefit acquired later in evolution. Evolutionary chronologies of domains indexed with disorder levels and distributions across Archaea, Bacteria, Eukarya and viruses revealed six evolutionary phases, the oldest two harboring only ordered and moderate disorder domains. A biphasic spectrum of disorder versus proteome makeup captured the dichotomy in the evolutionary trajectories of viral and cellular ancestors, one following reductive evolution driven by viral spread of molecular wealth and the other following expansive evolutionary trends to advance functionality through massive domain-forming co-option of disordered loop regions.
Affiliation(s)
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL, 61801, USA
2
Stonik VA, Makarieva TN, Shubina LK, Guzii AG, Ivanchina NV. Structure Diversity and Properties of Some Bola-like Natural Products. Mar Drugs 2024; 23:3. [PMID: 39852505 PMCID: PMC11767167 DOI: 10.3390/md23010003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2024] [Revised: 12/16/2024] [Accepted: 12/23/2024] [Indexed: 01/26/2025] Open
Abstract
In their shapes, molecules of some bipolar metabolites resemble the so-called bola, a hunting weapon of South American inhabitants consisting of two heavy balls connected to each other by a long flexible cord. Herein, we discuss the structures and properties of these natural products (bola-like compounds or bolaamphiphiles), containing two polar terminal fragments and a non-polar chain (or chains) between them, from archaea, bacteria, and marine invertebrates. Additional modifications of the core compounds of this class, for example, interchain and intrachain cyclization, hydroxylation, methylation, etc., expand the number of known metabolites of this type and account for their great structural variety. Isolating such complex compounds individually is problematic, since they usually exist as mixtures of regioisomers and stereoisomers that are very difficult to separate. The main approaches to the study of their structures combine various methods of HPLC/MS or GC/MS, 2D-NMR experiments, and organic synthesis. The recent identification of new enzymes taking part in their biosynthesis and metabolism has made it possible to understand molecular aspects of their origin and some features of their evolution over geological time. The promising properties of these metabolites, such as their ability to self-assemble and stabilize biological or artificial membranes, and their biological activities attract additional attention to them.
Affiliation(s)
- Valentin A. Stonik
- G.B. Elyakov Pacific Institute of Bioorganic Chemistry, Far Eastern Branch, Russian Academy of Sciences, Pr. 100-let Vladivostoku 159, 690022 Vladivostok, Russia; (T.N.M.); (L.K.S.); (A.G.G.); (N.V.I.)
3
Caetano-Anollés G, Mughal F, Aziz MF, Caetano-Anollés K. Tracing the birth and intrinsic disorder of loops and domains in protein evolution. Biophys Rev 2024; 16:723-735. [PMID: 39830125 PMCID: PMC11735766 DOI: 10.1007/s12551-024-01251-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/11/2024] [Accepted: 10/29/2024] [Indexed: 01/22/2025] Open
Abstract
Protein loops and structural domains are building blocks of molecular structure. They hold evolutionary memory and are largely responsible for the many functions and processes that drive the living world. Here, we briefly review two decades of phylogenomic data-driven research focusing on the emergence and evolution of these elemental architects of protein structure. Phylogenetic trees of domains reconstructed from the proteomes of organisms belonging to all three superkingdoms and viruses were used to build chronological timelines describing the origin of each domain and its embedded loops at different levels of structural abstraction. These timelines consistently recovered six distinct evolutionary phases and a most parsimonious evolutionary progression of cellular life. The timelines also traced the birth of domain structures from loops, which allowed their growth to be modeled ab initio with AlphaFold2. Accretion decreased the disorder of the growing molecules, suggesting disorder is molecular size-dependent. A phylogenomic survey of disorder revealed that loops and domains evolved differently. Loops were highly disordered, disorder increased early in evolution, and ordered and moderately disordered structures were derived. Gradual replacement of loops with α-helix and β-strand bracing structures over time paved the way for the dominance of more disordered loop types. In contrast, ancient domains were ordered, with disorder evolving as a benefit acquired later in evolution. These evolutionary patterns explain inverse correlations between disorder and sequence length of loops and domains. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- M. Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Callout Biotech, Albuquerque, NM 87112 USA
4
Volovik MV, Batishchev OV. Viral fingerprints of the ion channel evolution: compromise of complexity and function. J Biomol Struct Dyn 2024:1-20. [PMID: 39365745 DOI: 10.1080/07391102.2024.2411523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Accepted: 04/29/2024] [Indexed: 10/06/2024]
Abstract
Evolution from precellular supramolecular assemblies to the cellular world originated from the ability to form a barrier between the interior of the cell and the outer environment. This step resulted from the possibility of forming a membrane, which protects the cell like the wall of a castle. However, every castle needs gates for trading, i.e., in the case of the cell, for the controlled exchange of substances. These 'gates' should have a mechanism of opening and closing, guards, entry rules, and so on. Different structures are known to make membranes permeable to various substances, from ions to macromolecules. They include amphipathic peptides, their assemblies, sophisticated membrane channels with numerous transmembrane domains, etc. As it evolved, the cellular world preserved and selected many variants, which have finally provided both prokaryotes and eukaryotes with highly selective and regulated ion channels. However, various simpler variants of ion channels are found in viruses. Although the origin of viruses is still under debate, they have evolved in parallel with the cellular forms of life. Whether they are an initial form of enveloped organisms, reduced protocells, or escaped parts of protocells, viruses might be fingerprints of the evolutionary steps of cellular structures like ion channels. Therefore, viroporins may provide us with necessary information about the selection between high functionality and less complex structure in supporting all the requirements for controlled membrane permeability. In this review, we try to elucidate these compromises and show a possible path for the evolution of ion channels, from peptides to complex multi-subunit structures, based on viral examples.
Affiliation(s)
- Marta V Volovik
- Laboratory of Bioelectrochemistry, A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Moscow, Russia
- Oleg V Batishchev
- Laboratory of Bioelectrochemistry, A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Moscow, Russia
5
Delaye L. The Unfinished Reconstructed Nature of the Last Universal Common Ancestor. J Mol Evol 2024; 92:584-592. [PMID: 39026043 PMCID: PMC11458799 DOI: 10.1007/s00239-024-10187-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 07/01/2024] [Indexed: 07/20/2024]
Abstract
The ultimate consequence of Darwin's theory of common descent is that all life on earth descends from a common ancestor. Biochemistry and molecular biology now provide sufficient evidence of shared ancestry of all extant life forms. However, the nature of the Last Universal Common Ancestor (LUCA) has been a topic of much debate over the years. This review offers a historical perspective on different attempts to infer LUCA's nature, exploring the debate surrounding its complexity. We further examine how different methodologies identify sets of ancient proteins that exhibit only partial overlap. For example, different bioinformatic approaches have identified distinct protein subunits of the ATP synthetase as potentially inherited from LUCA. Additionally, we discuss how detailed molecular evolutionary analysis of reverse gyrase has modified previous inferences about a hyperthermophilic LUCA based mainly on automatic bioinformatic pipelines. We conclude by emphasizing the importance of developing a database dedicated to studying genes and proteins traceable back to LUCA and earlier stages of cellular evolution. Such a database would house the most ancient genes on earth.
Affiliation(s)
- Luis Delaye
- Departamento de Ingeniería Genética, Cinvestav Unidad Irapuato, Km 9.6 Libramiento Norte Carretera Irapuato-León CP. 36824, Irapuato, Gto., Mexico.
6
Ledford SM, Meredith LK. Volatile Organic Compound Metabolism on Early Earth. J Mol Evol 2024; 92:605-617. [PMID: 39017923 PMCID: PMC11458752 DOI: 10.1007/s00239-024-10184-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 06/10/2024] [Indexed: 07/18/2024]
Abstract
Biogenic volatile organic compounds (VOCs) constitute a significant portion of gas-phase metabolites in modern ecosystems and have unique roles in moderating atmospheric oxidative capacity, solar radiation balance, and aerosol formation. It has been theorized that VOCs may account for observed geological and evolutionary phenomena during the Archaean, but the direct contribution of biology to early non-methane VOC cycling remains unexplored. Here, we provide an assessment of all potential VOCs metabolized by the last universal common ancestor (LUCA). We identify enzyme functions linked to LUCA orthologous protein groups across eight literature sources and estimate the volatility of all associated substrates to identify ancient volatile metabolites. We hone in on volatile metabolites with confirmed modern emissions that exist in conserved metabolic pathways and produce a curated list of the most likely LUCA VOCs. We introduce volatile organic metabolites associated with early life and discuss their potential influence on early carbon cycling and atmospheric chemistry.
Affiliation(s)
- S Marshall Ledford
- Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ, 85721, USA.
- Laura K Meredith
- School of Natural Resources and the Environment, University of Arizona, Tucson, AZ, 85721, USA
- BIO5 Institute, University of Arizona, Tucson, AZ, 85721, USA
7
Caetano-Anollés K, Aziz MF, Mughal F, Caetano-Anollés G. On Protein Loops, Prior Molecular States and Common Ancestors of Life. J Mol Evol 2024; 92:624-646. [PMID: 38652291 PMCID: PMC11458777 DOI: 10.1007/s00239-024-10167-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 03/22/2024] [Indexed: 04/25/2024]
Abstract
The principle of continuity demands the existence of prior molecular states and common ancestors responsible for extant macromolecular structure. Here, we focus on the emergence and evolution of loop prototypes - the elemental architects of protein domain structure. Phylogenomic reconstruction spanning superkingdoms and viruses generated an evolutionary chronology of prototypes with six distinct evolutionary phases defining a most parsimonious evolutionary progression of cellular life. Each phase was marked by strategic prototype accumulation shaping the structures and functions of common ancestors. The last universal common ancestor (LUCA) of cells and viruses and the last universal cellular ancestor (LUCellA) defined stem lines that were structurally and functionally complex. The evolutionary saga highlighted transformative forces. LUCA lacked biosynthetic ribosomal machinery, while the pivotal LUCellA lacked essential DNA biosynthesis and modern transcription. Early proteins therefore relied on RNA for genetic information storage but appeared initially decoupled from it, hinting at transformative shifts of genetic processing. Urancestral loop types suggest advanced folding designs were present at an early evolutionary stage. An exploration of loop geometric properties revealed gradual replacement of prototypes with α-helix and β-strand bracing structures over time, paving the way for the dominance of other loop types. AlphaFold2-generated atomic models of prototype accretion described patterns of fold emergence. Our findings favor a 'processual' model of evolving stem lines aligned with Woese's vision of a communal world. This model prompts discussing the 'problem of ancestors' and the challenges that lie ahead for research in taxonomy, evolution and complexity.
Affiliation(s)
- Kelsey Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Callout Biotech, Albuquerque, NM, 87112, USA
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
8
Kailing F, Lieberman J, Wang J, Turner JL, Goldman AD. Evolution of Cellular Organization Along the First Branches of the Tree of Life. J Mol Evol 2024; 92:618-623. [PMID: 39020132 PMCID: PMC11458647 DOI: 10.1007/s00239-024-10188-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 07/06/2024] [Indexed: 07/19/2024]
Abstract
Current evidence suggests that some form of cellular organization arose well before the time of the last universal common ancestor (LUCA). Standard phylogenetic analyses have shown that several protein families associated with membrane translocation, membrane transport, and membrane bioenergetics were very likely present in the proteome of the LUCA. Despite these cellular systems emerging prior to the LUCA, extant archaea, bacteria, and eukaryotes have significant differences in cellular infrastructure and the molecular functions that support it, leading some researchers to argue that true cellularity did not evolve until after the LUCA. Here, we use recently reconstructed minimal proteomes of the LUCA as well as the last archaeal common ancestor (LACA) and the last bacterial common ancestor (LBCA) to characterize the evolution of cellular systems along the first branches of the tree of life. We find that a broad set of functions associated with cellular organization were already present by the time of the LUCA. The functional repertoires of the LACA and LBCA related to cellular organization nearly doubled along each branch following the divergence of the LUCA. These evolutionary trends created the foundation for similarities and differences in cellular organization between the taxonomic domains that are still observed today.
Affiliation(s)
- Freya Kailing
- Department of Biology, Oberlin College, Oberlin, OH, USA
- Joshua Wang
- Department of Biology, Oberlin College, Oberlin, OH, USA
- Aaron D Goldman
- Department of Biology, Oberlin College, Oberlin, OH, USA
- Blue Marble Space Institute of Science, Seattle, WA, USA
9
Caetano-Anollés G. Are Viruses Taxonomic Units? A Protein Domain and Loop-Centric Phylogenomic Assessment. Viruses 2024; 16:1061. [PMID: 39066224 PMCID: PMC11281659 DOI: 10.3390/v16071061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 06/26/2024] [Accepted: 06/27/2024] [Indexed: 07/28/2024] Open
Abstract
Virus taxonomy uses a Linnaean-like subsumption hierarchy to classify viruses into taxonomic units at species and higher rank levels. Virus species are considered monophyletic groups of mobile genetic elements (MGEs) often delimited by the phylogenetic analysis of aligned genomic or metagenomic sequences. Taxonomic units are assumed to be independent organizational, functional and evolutionary units that follow a 'natural history' rationale. Here, I use phylogenomic and other arguments to show that viruses are not self-standing genetically-driven systems acting as evolutionary units. Instead, they are crucial components of holobionts, which are units of biological organization that dynamically integrate the genetic, epigenetic, physiological and functional properties of their co-evolving members. Remarkably, phylogenomic analyses show that viruses share protein domains and loops with cells throughout history via massive processes of reticulate evolution, helping spread evolutionary innovations across a wider taxonomic spectrum. Thus, viruses are not merely MGEs or microbes. Instead, their genomes and proteomes conduct cellularly integrated processes akin to those cataloged by the GO Consortium. This prompts the generation of compositional hierarchies that replace the 'is-a-kind-of' logic with an 'is-a-part-of' logic to better describe the mereology of integrated cellular and viral makeup. My analysis demands a new paradigm that integrates virus taxonomy into a modern evolutionarily centered taxonomy of organisms.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
10
Romei M, Carpentier M, Chomilier J, Lecointre G. Origins and Functional Significance of Eukaryotic Protein Folds. J Mol Evol 2023; 91:854-864. [PMID: 38060007 DOI: 10.1007/s00239-023-10136-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/27/2023] [Accepted: 10/03/2023] [Indexed: 12/08/2023]
Abstract
Folds are the architecture and topology of a protein domain. Categories of folds are very few compared to the astronomical number of sequences. Eukaryotes have more protein folds than Archaea and Bacteria. These folds are of two types: shared with Archaea and/or Bacteria on the one hand and specific to eukaryotic clades on the other. The first kind of folds is inherited from the first endosymbiosis and confirms the mixed origin of eukaryotes. In a dataset of 1073 folds whose presence or absence has been evidenced among 210 species equally distributed in the three super-kingdoms, we have identified 28 eukaryotic folds unambiguously inherited from Bacteria and 40 eukaryotic folds unambiguously inherited from Archaea. Compared to previous studies, the repartition of informational function is higher than expected for folds originating from Bacteria and as high as expected for folds inherited from Archaea. The second type of folds is specifically eukaryotic and associated with an increase of new folds within eukaryotes distributed in particular clades. Reconstructed ancestral states coupled with dating of each node on the tree of life provided fold appearance rates. The rate is on average twice as high within Eukaryota as within Bacteria or Archaea. The highest rates are found in the origins of eukaryotes, holozoans, metazoans, metazoans stricto sensu, and vertebrates: the roots of these clades correspond to bursts of fold evolution. We could correlate the functions of some of the fold synapomorphies within eukaryotes with significant evolutionary events. Among them, we find evidence for the rise of multicellularity, adaptive immune system, or virus folds which could be linked to an ecological shift made by tetrapods.
Affiliation(s)
- Martin Romei
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
- Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- Jacques Chomilier
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
- Guillaume Lecointre
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
11
Mughal F, Caetano-Anollés G. Evolution of Intrinsic Disorder in Protein Loops. Life (Basel) 2023; 13:2055. [PMID: 37895436 PMCID: PMC10608553 DOI: 10.3390/life13102055] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/13/2023] [Revised: 10/08/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] Open
Abstract
Intrinsic disorder accounts for the flexibility of protein loops, molecular building blocks that are largely responsible for the processes and molecular functions of the living world. While loops likely represent early structural forms that served as intermediates in the emergence of protein structural domains, their origin and evolution remain poorly understood. Here, we conduct a phylogenomic survey of disorder in loop prototypes sourced from the ArchDB classification. Tracing prototypes associated with protein fold families along an evolutionary chronology revealed that ancient prototypes tended to be more disordered than their derived counterparts, with ordered prototypes developing later in evolution. This highlights the central evolutionary role of disorder and flexibility. While mean disorder increased with time, a minority of ordered prototypes exist that emerged early in evolutionary history, possibly driven by the need to preserve specific molecular functions. We also revealed the percolation of evolutionary constraints from higher to lower levels of organization. Percolation resulted in trade-offs between flexibility and rigidity that impacted prototype structure and geometry. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function, as well as insights into the evolutionary role of intrinsic disorder in loops and their contribution to protein structure and function.
Affiliation(s)
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
12
Caetano-Anollés G, Claverie JM, Nasir A. A critical analysis of the current state of virus taxonomy. Front Microbiol 2023; 14:1240993. [PMID: 37601376 PMCID: PMC10435761 DOI: 10.3389/fmicb.2023.1240993] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/15/2023] [Accepted: 07/20/2023] [Indexed: 08/22/2023] Open
Abstract
Taxonomical classification has preceded evolutionary understanding. For that reason, taxonomy has become a battleground fueled by knowledge gaps, technical limitations, and a priorism. Here we assess the current state of the challenging field, focusing on fallacies that are common in viral classification. We emphasize that viruses are crucial contributors to the genomic and functional makeup of holobionts, organismal communities that behave as units of biological organization. Consequently, viruses cannot be considered taxonomic units because they challenge crucial concepts of organismality and individuality. Instead, they should be considered processes that integrate virions and their hosts into life cycles. Viruses harbor phylogenetic signatures of genetic transfer that compromise monophyly and the validity of deep taxonomic ranks. A focus on building phylogenetic networks using alignment-free methodologies and molecular structure can help mitigate the impasse, at least in part. Finally, structural phylogenomic analysis challenges the polyphyletic scenario of multiple viral origins adopted by virus taxonomy, defeating a polyphyletic origin and supporting instead an ancient cellular origin of viruses. We therefore prompt abandoning deep ranks and urgently reevaluating the validity of taxonomic units and principles of virus classification.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and C.R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Jean-Michel Claverie
- Structural and Genomic Information Laboratory (UMR7256), Mediterranean Institute of Microbiology (FR3479), IM2B, IOM, Aix Marseille University, CNRS, Marseille, France
13
Romei M, Sapriel G, Imbert P, Jamay T, Chomilier J, Lecointre G, Carpentier M. Protein folds as synapomorphies of the tree of life. Evolution 2022; 76:1706-1719. [PMID: 35765784 PMCID: PMC9541633 DOI: 10.1111/evo.14550] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/06/2021] [Revised: 05/17/2022] [Accepted: 05/31/2022] [Indexed: 01/22/2023]
Abstract
Several studies showed that the distribution of folds (topology of protein secondary structures) in proteomes may be a global proxy to build phylogeny. If so, some folds should be synapomorphies (derived characters exclusively shared among taxa). However, previous studies used methods that did not allow synapomorphy identification, which requires congruence analysis of folds as individual characters. Here, we map SCOP folds onto a sample of 210 species across the tree of life (TOL). Congruence is assessed using the retention index of each fold for the TOL, and principal component analysis for deeper branches. Using a bicluster mapping approach, we define synapomorphic blocks of folds (SBF) sharing similar presence/absence patterns. Among the 1232 folds, 20% are universally present in our TOL, whereas 54% are reliable synapomorphies. Similar results are obtained with the CATH and ECOD databases. Eukaryotes are characterized by a large number of them, and several SBFs clearly support nested eukaryotic clades (divergence times from 1100 to 380 mya). Although clearly separated, the three superkingdoms reveal a strong mosaic pattern. This pattern is consistent with the dual origin of eukaryotes and witnesses secondary endosymbiosis in their photosynthetic clades. Our study unveils direct analysis of fold synapomorphies as key characters to unravel the evolutionary history of species.
Affiliation(s)
- Martin Romei
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France
- Guillaume Sapriel
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- UFR des sciences de la santé, Université Versailles‐St‐Quentin, Versailles, France
- Pierre Imbert
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- Théo Jamay
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- Guillaume Lecointre
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
- Mathilde Carpentier
- Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France
14
|
Crapitto AJ, Campbell A, Harris AJ, Goldman AD. A consensus view of the proteome of the last universal common ancestor. Ecol Evol 2022; 12:e8930. [PMID: 35784055 PMCID: PMC9165204 DOI: 10.1002/ece3.8930] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/29/2022] [Revised: 04/11/2022] [Accepted: 04/14/2022] [Indexed: 12/30/2022] Open
Abstract
The availability of genomic and proteomic data from across the tree of life has made it possible to infer features of the genome and proteome of the last universal common ancestor (LUCA). A number of studies have done so, each using its own set of methods and bioinformatics databases. Here, we compare predictions across eight such studies and measure both their agreement with one another and with the consensus predictions among them. We find that some LUCA genome studies show a strong agreement with the consensus predictions of the others, but that no individual study shares a high or even moderate degree of similarity with any other individual study. From these observations, we conclude that the consensus among studies provides a more accurate depiction of the core proteome of the LUCA and its functional repertoire. The set of consensus LUCA protein family predictions between all of these studies portrays a LUCA genome that, at minimum, encoded functions related to protein synthesis, amino acid metabolism, nucleotide metabolism, and the use of common, nucleotide-derived organic cofactors.
Affiliation(s)
- Amy Campbell
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- AJ Harris
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Aaron D. Goldman
- Department of Biology, Oberlin College, Oberlin, Ohio, USA
- Blue Marble Space Institute of Science, Seattle, Washington, USA
|
15
|
The Legend of ATP: From Origin of Life to Precision Medicine. Metabolites 2022; 12:metabo12050461. [PMID: 35629965 PMCID: PMC9148104 DOI: 10.3390/metabo12050461] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/10/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 02/05/2023] Open
Abstract
Adenosine triphosphate (ATP) may be the most important biological small molecule. Since it was discovered in 1929, ATP has been regarded as life’s energy reservoir. However, this compound means more to life. Its legend starts at the dawn of life and lasts to this day. ATP must be the basic component of ancient ribozymes and may facilitate the origin of structured proteins. In the existing organisms, ATP continues to construct ribonucleic acid (RNA) and work as a protein cofactor. ATP also functions as a biological hydrotrope, which may keep macromolecules soluble in the primitive environment and can regulate phase separation in modern cells. These functions are involved in the pathogenesis of aging-related diseases and breast cancer, providing clues to discovering anti-aging agents and precision medicine tactics for breast cancer.
|
16
|
Scaling laws in enzyme function reveal a new kind of biochemical universality. Proc Natl Acad Sci U S A 2022; 119:2106655119. [PMID: 35217602 PMCID: PMC8892295 DOI: 10.1073/pnas.2106655119] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Accepted: 12/18/2021] [Indexed: 11/21/2022] Open
Abstract
Known examples of life all share the same core biochemistry going back to the last universal common ancestor (LUCA), but whether this feature is universal to other examples, including at the origin of life or alien life, is unknown. We show how a physics-inspired statistical approach identifies universal scaling laws across biochemical reactions that are not defined by common chemical components but instead, as macroscale patterns in the reaction functions used by life. The identified scaling relations can be used to predict statistical features of LUCA, and network analyses reveal some of the functional principles that underlie them. They are, therefore, prime candidates for developing new theory on the “laws of life” that might apply to all possible biochemistries. All life on Earth is unified by its use of a shared set of component chemical compounds and reactions, providing a detailed model for universal biochemistry. However, this notion of universality is specific to known biochemistry and does not allow quantitative predictions about examples not yet observed. Here, we introduce a more generalizable concept of biochemical universality that is more akin to the kind of universality found in physics. Using annotated genomic datasets including an ensemble of 11,955 metagenomes, 1,282 archaea, 11,759 bacteria, and 200 eukaryotic taxa, we show how enzyme functions form universality classes with common scaling behavior in their relative abundances across the datasets. We verify that these scaling laws are not explained by the presence of compounds, reactions, and enzyme functions shared across known examples of life. We demonstrate how these scaling laws can be used as a tool for inferring properties of ancient life by comparing their predictions with a consensus model for the last universal common ancestor (LUCA). We also illustrate how network analyses shed light on the functional principles underlying the observed scaling behaviors. 
Together, our results establish the existence of a new kind of biochemical universality, independent of the details of life on Earth’s component chemistry, with implications for guiding our search for missing biochemical diversity on Earth or for biochemistries that might deviate from the exact chemical makeup of life as we know it, such as at the origins of life, in alien environments, or in the design of synthetic life.
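The abstract above refers to scaling laws in the abundance of enzyme functions across datasets. As a minimal sketch of how a scaling exponent can be estimated, the code below fits a power law y ≈ a·x^b by ordinary least squares on log-transformed values. The power-law form, the fitting procedure, the function name `fit_power_law`, and the synthetic numbers are all assumptions made for illustration; the abstract does not state how the authors fitted their data.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of log(y) = log(a) + b * log(x).
    Returns (b, a) so that y is approximated by a * x**b.
    All inputs must be strictly positive."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (w - mean_y) for u, w in zip(log_x, log_y))
    exponent = sxy / sxx                       # slope in log-log space
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor

# Synthetic example: x is a hypothetical total count of enzyme functions per
# genome, y a hypothetical count within one enzyme class, built to follow
# y = 0.5 * x**0.8 exactly so the fit can be checked by eye.
total_functions = [200, 400, 800, 1600]
class_functions = [0.5 * v ** 0.8 for v in total_functions]

exponent, prefactor = fit_power_law(total_functions, class_functions)
print(f"exponent ~ {exponent:.3f}, prefactor ~ {prefactor:.3f}")
```

Under this reading, an exponent below 1 would indicate sublinear scaling of the class with total enzyme functions, and an exponent above 1 superlinear scaling. Real data would also call for uncertainty estimates, which this sketch omits.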
|
17
|
Fried SD, Fujishima K, Makarov M, Cherepashuk I, Hlouchova K. Peptides before and during the nucleotide world: an origins story emphasizing cooperation between proteins and nucleic acids. J R Soc Interface 2022; 19:20210641. [PMID: 35135297 PMCID: PMC8833103 DOI: 10.1098/rsif.2021.0641] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 08/08/2021] [Accepted: 01/05/2022] [Indexed: 12/14/2022] Open
Abstract
Recent developments in Origins of Life research have focused on substantiating the narrative of an abiotic emergence of nucleic acids from organic molecules of low molecular weight, a paradigm that typically sidelines the roles of peptides. Nevertheless, the simple synthesis of amino acids, the facile nature of their activation and condensation, their ability to recognize metals and cofactors and their remarkable capacity to self-assemble make peptides (and their analogues) favourable candidates for one of the earliest functional polymers. In this mini-review, we explore the ramifications of this hypothesis. Diverse lines of research in molecular biology, bioinformatics, geochemistry, biophysics and astrobiology provide clues about the progression and early evolution of proteins, and lend credence to the idea that early peptides served many central prebiotic roles before they were encodable by a polynucleotide template, in a putative 'peptide-polynucleotide stage'. For example, early peptides and mini-proteins could have served as catalysts, compartments and structural hubs. In sum, we shed light on the role of early peptides and small proteins before and during the nucleotide world, in which nascent life fully grasped the potential of primordial proteins, and which has left an imprint on the idiosyncratic properties of extant proteins.
Affiliation(s)
- Stephen D. Fried
- Department of Chemistry, Johns Hopkins University, Baltimore, MD 21212, USA
- Department of Biophysics, Johns Hopkins University, Baltimore, MD 21212, USA
- Kosuke Fujishima
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo 1528550, Japan
- Graduate School of Media and Governance, Keio University, Fujisawa 2520882, Japan
- Mikhail Makarov
- Department of Cell Biology, Faculty of Science, Charles University, BIOCEV, Prague 12800, Czech Republic
- Ivan Cherepashuk
- Department of Cell Biology, Faculty of Science, Charles University, BIOCEV, Prague 12800, Czech Republic
- Klara Hlouchova
- Department of Cell Biology, Faculty of Science, Charles University, BIOCEV, Prague 12800, Czech Republic
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague 16610, Czech Republic
|
18
|
Caetano-Anollés G, Aziz MF, Mughal F, Caetano-Anollés D. Tracing protein and proteome history with chronologies and networks: folding recapitulates evolution. Expert Rev Proteomics 2021; 18:863-880. [PMID: 34628994 DOI: 10.1080/14789450.2021.1992277] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/24/2023]
Abstract
INTRODUCTION: While the origin and evolution of proteins remain mysterious, advances in evolutionary genomics and systems biology are facilitating the historical exploration of the structure, function and organization of proteins and proteomes. Molecular chronologies are series of time events describing the history of biological systems and subsystems and the rise of biological innovations. Together with time-varying networks, these chronologies provide a window into the past. AREAS COVERED: Here, we review molecular chronologies and networks built with modern methods of phylogeny reconstruction. We discuss how chronologies of structural domain families uncover the explosive emergence of metabolism, the late rise of translation, the co-evolution of ribosomal proteins and rRNA, and the late development of the ribosomal exit tunnel; events that coincided with a tendency to shorten folding time. Evolving networks described the early emergence of domains and a late 'big bang' of domain combinations. EXPERT OPINION: Two processes, folding and recruitment, appear central to the evolutionary progression. The former increases protein persistence. The latter fosters diversity. Chronologically, protein evolution mirrors folding by combining supersecondary structures into domains, developing translation machinery to facilitate folding speed and stability, and enhancing structural complexity by establishing long-distance interactions in novel structural and architectural designs.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- M Fayez Aziz
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Derek Caetano-Anollés
- Data Science Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
|
19
|
Caetano-Anollés G. The Compressed Vocabulary of Microbial Life. Front Microbiol 2021; 12:655990. [PMID: 34305827 PMCID: PMC8292947 DOI: 10.3389/fmicb.2021.655990] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/19/2021] [Accepted: 04/27/2021] [Indexed: 12/22/2022] Open
Abstract
Communication is an undisputed central activity of life that requires an evolving molecular language. It conveys meaning through messages and vocabularies. Here, I explore the existence of a growing vocabulary in the molecules and molecular functions of the microbial world. There are clear correspondences between the lexicon, syntax, semantics, and pragmatics of language organization and the module, structure, function, and fitness paradigms of molecular biology. These correspondences are constrained by universal laws and engineering principles. Macromolecular structure, for example, follows quantitative linguistic patterns arising from statistical laws that are likely universal, including Zipf's law, a special case of the scale-free distribution, Heaps' law, which describes the sublinear growth typical of economies of scale, and the Menzerath-Altmann law, which imposes size-dependent patterns of decreasing returns. Trade-off solutions between principles of economy, flexibility, and robustness define a "triangle of persistence" describing the impact of the environment on a biological system. The pragmatic landscape of the triangle interfaces with the syntax and semantics of molecular languages, which together with comparative and evolutionary genomic data can explain global patterns of diversification of cellular life. The vocabularies of proteins (proteomes) and functions (functionomes) revealed a significant universal lexical core supporting a universal common ancestor, an ancestral evolutionary link between Bacteria and Eukarya, and distinct reductive evolutionary strategies of language compression in Archaea and Bacteria. A "causal" word cloud strategy inspired by the dependency grammar paradigm used in catenae unfolded the evolution of lexical units associated with Gene Ontology terms at different levels of ontological abstraction.
While Archaea holds the smallest, oldest, and most homogeneous vocabulary of all superkingdoms, Bacteria heterogeneously apportions a more complex vocabulary, and Eukarya pushes functional innovation through mechanisms of flexibility and robustness.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, and C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL, United States
|
20
|
Abstract
Domains are the structural, functional and evolutionary units of proteins. They combine to form multidomain proteins. The evolutionary history of this molecular combinatorics has been studied with phylogenomic methods. Here, we construct networks of domain organization and explore their evolution. A time series of networks revealed two ancient waves of structural novelty arising from ancient 'p-loop' and 'winged helix' domains and a massive 'big bang' of domain organization. The evolutionary recruitment of domains was highly modular, hierarchical and ongoing. Domain rearrangements elicited non-random and scale-free network structure. Comparative analyses of preferential attachment, randomness and modularity showed yin-and-yang complementary transition and biphasic patterns along the structural chronology. Remarkably, the evolving networks highlighted a central evolutionary role of cofactor-supporting structures of non-ribosomal peptide synthesis pathways, likely crucial to the early development of the genetic code. Some highly modular domains featured dual response regulation in two-component signal transduction systems with DNA-binding activity linked to transcriptional regulation of responses to environmental change. Interestingly, hub domains across the evolving networks shared the historical role of DNA binding and editing, an ancient protein function in molecular evolution. Our investigation unfolds historical source-sink patterns of evolutionary recruitment that further our understanding of protein architectures and functions.
|
21
|
Nasir A, Mughal F, Caetano-Anollés G. The tree of life describes a tripartite cellular world. Bioessays 2021; 43:e2000343. [PMID: 33837594 DOI: 10.1002/bies.202000343] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/25/2020] [Revised: 03/11/2021] [Accepted: 03/15/2021] [Indexed: 12/28/2022]
Abstract
The canonical view of a 3-domain (3D) tree of life was recently challenged by the discovery of Asgardarchaeota encoding eukaryote signature proteins (ESPs), which were treated as missing links of a 2-domain (2D) tree. Here we revisit the debate. We discuss methodological limitations of building trees with alignment-dependent approaches, which often fail to satisfactorily address the problem of "gaps." In addition, most phylogenies are reconstructed unrooted, neglecting the power of direct rooting methods. Alignment-free methodologies lift most difficulties but require employing realistic evolutionary models. We argue that the discoveries of Asgards and ESPs, by themselves, do not rule out the 3D tree, which is strongly supported by comparative and evolutionary genomic analyses and vast genomic and biochemical superkingdom distinctions. Given uncertainties of retrodiction and interpretation difficulties, we conclude that the 3D view has not been falsified but instead has been strengthened by genomic analyses. In turn, the objections to the 2D model have not been lifted. The debate remains open. Also see the video abstract here: https://youtu.be/-6TBN0bubI8.
Affiliation(s)
- Arshan Nasir
- Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Fizza Mughal
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Gustavo Caetano-Anollés
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
|
22
|
Harris AJ, Goldman AD. The very early evolution of protein translocation across membranes. PLoS Comput Biol 2021; 17:e1008623. [PMID: 33684113 PMCID: PMC7987157 DOI: 10.1371/journal.pcbi.1008623] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/07/2020] [Revised: 03/23/2021] [Accepted: 12/10/2020] [Indexed: 11/18/2022] Open
Abstract
In this study, we used a computational approach to investigate the early evolutionary history of a system of proteins that, together, embed and translocate other proteins across cell membranes. Cell membranes comprise the basis for cellularity, which is an ancient, fundamental organizing principle shared by all organisms and a key innovation in the evolution of life on Earth. Two related requirements for cellularity are that organisms are able to both embed proteins into membranes and translocate proteins across membranes. One system that accomplishes these tasks is the signal recognition particle (SRP) system, in which the core protein components are the paralogs, FtsY and Ffh. Complementary to the SRP system is the Sec translocation channel, in which the primary channel-forming protein is SecY. We performed phylogenetic analyses that strongly supported prior inferences that FtsY, Ffh, and SecY were all present by the time of the last universal common ancestor of life, the LUCA, and that the ancestor of FtsY and Ffh existed before the LUCA. Further, we combined ancestral sequence reconstruction and protein structure and function prediction to show that the LUCA had an SRP system and Sec translocation channel that were similar to those of extant organisms. We also show that the ancestor of Ffh and FtsY that predated the LUCA was more similar to FtsY than Ffh but could still have comprised a rudimentary protein translocation system on its own. Duplication of the ancestor of FtsY and Ffh facilitated the specialization of FtsY as a membrane bound receptor and Ffh as a cytoplasmic protein that could bind nascent proteins with specific membrane-targeting signal sequences. Finally, we analyzed amino acid frequencies in our ancestral sequence reconstructions to infer that the ancestral Ffh/FtsY protein likely arose prior to or just after the completion of the canonical genetic code. 
Taken together, our results offer a window into the very early evolutionary history of cellularity.
Affiliation(s)
- AJ Harris
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Department of Biology, Oberlin College and Conservatory, K123 Science Center, Oberlin, Ohio, United States of America
- Aaron David Goldman
- Department of Biology, Oberlin College and Conservatory, K123 Science Center, Oberlin, Ohio, United States of America
- Blue Marble Space Institute of Science, Seattle, Washington, United States of America
|
23
|
Kitagawa T, Nishio T, Yoshikawa Y, Umezawa N, Higuchi T, Shew CY, Kenmotsu T, Yoshikawa K. Effects of Structural Isomers of Spermine on the Higher-Order Structure of DNA and Gene Expression. Int J Mol Sci 2021; 22:ijms22052355. [PMID: 33652986 PMCID: PMC7956460 DOI: 10.3390/ijms22052355] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/08/2021] [Revised: 02/19/2021] [Accepted: 02/23/2021] [Indexed: 11/16/2022] Open
Abstract
Polyamines are involved in various biological functions, including cell proliferation, differentiation, and gene regulation. Recently, it was found that polyamines exhibit biphasic effects on gene expression: promotion at low concentrations and inhibition at high concentrations. Here, we compared the effects of three naturally occurring tetravalent polyamines, spermine (SPM), thermospermine (TSPM), and N4-aminopropylspermidine (BSPD). Single-DNA observation with fluorescence microscopy, together with measurements by atomic force microscopy, revealed that these polyamines induce shrinkage and then compaction of DNA molecules at low and high concentrations, respectively. We also evaluated the effects of these polyamine isomers on gene expression activity using a cell-free luciferase assay. Interestingly, the potency of their effects on DNA conformation and on the inhibition of gene expression activity is highest for TSPM among the spermine isomers. A numerical evaluation of the strength of the interaction of these polyamines with negatively charged double-strand DNA revealed that this ordering of the potency corresponds to the order of the strength of the attractive interaction between phosphate groups of DNA and positively charged amino groups of the polyamines.
Affiliation(s)
- Tomoki Kitagawa
- Graduate School of Life and Medical Sciences, Doshisha University, Kyoto 610-0394, Japan
- Takashi Nishio
- Graduate School of Life and Medical Sciences, Doshisha University, Kyoto 610-0394, Japan
- Yuko Yoshikawa
- Graduate School of Life and Medical Sciences, Doshisha University, Kyoto 610-0394, Japan
- Naoki Umezawa
- Graduate School of Pharmaceutical Sciences, Nagoya City University, Nagoya 467-8603, Japan
- Tsunehiko Higuchi
- Graduate School of Pharmaceutical Sciences, Nagoya City University, Nagoya 467-8603, Japan
- Chwen-Yang Shew
- Doctoral Program in Chemistry, The Graduate Center of the City University of New York, New York, NY 10016, USA
- Department of Chemistry, College of Staten Island, Staten Island, New York, NY 10314, USA
- Takahiro Kenmotsu
- Graduate School of Life and Medical Sciences, Doshisha University, Kyoto 610-0394, Japan
- Correspondence: (T.K.); (K.Y.)
- Kenichi Yoshikawa
- Graduate School of Life and Medical Sciences, Doshisha University, Kyoto 610-0394, Japan
- Center for Integrative Medicine and Physics, Institute for Advanced Study, Kyoto University, Kyoto 606-8501, Japan
- Correspondence: (T.K.); (K.Y.)
|
24
|
Goldman AD, Kacar B. Cofactors are Remnants of Life's Origin and Early Evolution. J Mol Evol 2021; 89:127-133. [PMID: 33547911 PMCID: PMC7982383 DOI: 10.1007/s00239-020-09988-4] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 10/17/2020] [Accepted: 12/21/2020] [Indexed: 12/22/2022]
Abstract
The RNA World is one of the most widely accepted hypotheses explaining the origin of the genetic system used by all organisms today. It proposes that the tripartite system of DNA, RNA, and proteins was preceded by one consisting solely of RNA, which both stored genetic information and performed the molecular functions encoded by that genetic information. Current research into a potential RNA World revolves around the catalytic properties of RNA-based enzymes, or ribozymes. Well before the discovery of ribozymes, Harold White proposed that evidence for a precursor RNA world could be found within modern proteins in the form of coenzymes, the majority of which contain nucleobases or nucleoside moieties, such as Coenzyme A and S-adenosyl methionine, or are themselves nucleotides, such as ATP and NADH (a dinucleotide). These coenzymes, White suggested, had been the catalytic active sites of ancient ribozymes, which transitioned to their current forms after the surrounding ribozyme scaffolds had been replaced by protein apoenzymes during the evolution of translation. Since its proposal four decades ago, this groundbreaking hypothesis has garnered support from several different research disciplines and motivated similar hypotheses about other classes of cofactors, most notably iron-sulfur cluster cofactors as remnants of the geochemical setting of the origin of life. Evidence from prebiotic geochemistry, ribozyme biochemistry, and evolutionary biology increasingly supports these hypotheses. Certain coenzymes and cofactors may bridge modern biology with the past and can thus provide insights into the elusive and poorly recorded period of the origin and early evolution of life.
Affiliation(s)
- Aaron D Goldman
- Department of Biology, Oberlin College and Conservatory, Oberlin, OH, 44074, USA
- Blue Marble Space Institute of Science, Seattle, WA, 98154, USA
- Betul Kacar
- Blue Marble Space Institute of Science, Seattle, WA, 98154, USA
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA
- Lunar and Planetary Laboratory and Department of Astronomy, University of Arizona, Tucson, AZ, 85721, USA
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, 152-8550, Japan
|
25
|
Mughal F, Nasir A, Caetano-Anollés G. The origin and evolution of viruses inferred from fold family structure. Arch Virol 2020; 165:2177-2191. [PMID: 32748179 PMCID: PMC7398281 DOI: 10.1007/s00705-020-04724-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/02/2020] [Accepted: 05/30/2020] [Indexed: 12/16/2022]
Abstract
The canonical frameworks of viral evolution describe viruses as cellular predecessors, reduced forms of cells, or entities that escaped cellular control. The discovery of giant viruses has changed these standard paradigms. Their genetic, proteomic and structural complexities resemble those of cells, prompting a redefinition and reclassification of viruses. In a previous genome-wide analysis of the evolution of structural domains in proteomes, with domains defined at the fold superfamily level, we found the origins of viruses intertwined with those of ancient cells. Here, we extend these data-driven analyses to the study of fold families confirming the co-evolution of viruses and ancient cells and the genetic ability of viruses to foster molecular innovation. The results support our suggestion that viruses arose by genomic reduction from ancient cells and validate a co-evolutionary ‘symbiogenic’ model of viral origins.
Affiliation(s)
- Fizza Mughal
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Arshan Nasir
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM, USA
- Department of Biosciences, COMSATS University Islamabad, Islamabad, Pakistan
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
26
|
Chu XY, Zhang HY. Cofactors as Molecular Fossils To Trace the Origin and Evolution of Proteins. Chembiochem 2020; 21:3161-3168. [PMID: 32515532 DOI: 10.1002/cbic.202000027] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/19/2020] [Revised: 06/03/2020] [Indexed: 12/16/2022]
Abstract
Due to their early origin and extreme conservation, cofactors are valuable molecular fossils for tracing the origin and evolution of proteins. First, as the order of protein folds binding with cofactors roughly coincides with protein-fold chronology, cofactors are considered to have facilitated the origin of primitive proteins by selecting them from pools of random amino acid sequences. Second, in the subsequent evolution of proteins, cofactors still played an important role. More interestingly, as metallic cofactors evolved with geochemical variations, some geochemical events left imprints in the chronology of protein architecture; this provides further evidence supporting the coevolution of biochemistry and geochemistry. In this paper, we attempt to review the molecular fossils used in tracing the origin and evolution of proteins, with a special focus on cofactors.
Affiliation(s)
- Xin-Yi Chu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Hong-Yu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
|
27
|
Liberles DA, Chang B, Geiler-Samerotte K, Goldman A, Hey J, Kaçar B, Meyer M, Murphy W, Posada D, Storfer A. Emerging Frontiers in the Study of Molecular Evolution. J Mol Evol 2020; 88:211-226. [PMID: 32060574 PMCID: PMC7386396 DOI: 10.1007/s00239-020-09932-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Abstract
A collection of the editors of Journal of Molecular Evolution have gotten together to pose a set of key challenges and future directions for the field of molecular evolution. Topics include challenges and new directions in prebiotic chemistry and the RNA world, reconstruction of early cellular genomes and proteins, macromolecular and functional evolution, evolutionary cell biology, genome evolution, molecular evolutionary ecology, viral phylodynamics, theoretical population genomics, somatic cell molecular evolution, and directed evolution. While our effort is not meant to be exhaustive, it reflects research questions and problems in the field of molecular evolution that are exciting to our editors.
Affiliation(s)
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA.
- Belinda Chang
- Department of Ecology and Evolutionary Biology and Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, M5S 3G5, Canada
- Kerry Geiler-Samerotte
- Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University, Tempe, AZ, 85287, USA
- Aaron Goldman
- Department of Biology, Oberlin College and Conservatory, K123 Science Center, 119 Woodland Street, Oberlin, OH, 44074, USA
- Jody Hey
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Betül Kaçar
- Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
- Michelle Meyer
- Department of Biology, Boston College, Chestnut Hill, MA, 02467, USA
- William Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, 77843, USA
- David Posada
- Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain
- Andrew Storfer
- School of Biological Sciences, Washington State University, Pullman, WA, 99164, USA
28
Bokhari RH, Amirjan N, Jeong H, Kim KM, Caetano-Anollés G, Nasir A. Bacterial Origin and Reductive Evolution of the CPR Group. Genome Biol Evol 2020; 12:103-121. [PMID: 32031619 PMCID: PMC7093835 DOI: 10.1093/gbe/evaa024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Accepted: 01/31/2020] [Indexed: 12/24/2022] Open
Abstract
The candidate phyla radiation (CPR) is a proposed subdivision within the bacterial domain comprising several candidate phyla. CPR organisms are united by small genome and physical sizes, lack several metabolic enzymes, and populate deep branches within the bacterial subtree of life. These features raise intriguing questions regarding their origin and mode of evolution. In this study, we performed a comparative and phylogenomic analysis to investigate CPR origin and evolution. Unlike previous gene/protein sequence-based reports of CPR evolution, we used protein domain superfamilies classified by protein structure databases to resolve the evolutionary relationships of CPR with non-CPR bacteria, Archaea, Eukarya, and viruses. Across all supergroups, CPR shared maximum superfamilies with non-CPR bacteria and were placed as deep branching bacteria in most phylogenomic trees. CPR contributed 1.22% of new superfamilies to bacteria including the ribosomal protein L19e and encoded four core superfamilies that are likely involved in cell-to-cell interaction and establishing episymbiotic lifestyles. Although CPR and non-CPR bacterial proteomes gained common superfamilies over the course of evolution, CPR and Archaea had more common losses. These losses mostly involved metabolic superfamilies. In fact, phylogenies built from only metabolic protein superfamilies separated CPR and non-CPR bacteria. These findings indicate that CPR are bacterial organisms that have probably evolved in an Archaea-like manner via the early loss of metabolic functions. We also discovered that phylogenies built from metabolic and informational superfamilies gave contrasting views of the groupings among Archaea, Bacteria, and Eukarya, which add to the current debate on the evolutionary relationships among superkingdoms.
Affiliation(s)
- Nooreen Amirjan
- Department of Biosciences, COMSATS University Islamabad, Pakistan
- Hyeonsoo Jeong
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA
- Kyung Mo Kim
- Division of Polar Life Sciences, Korea Polar Research Institute, Incheon, Republic of Korea
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana
- Arshan Nasir
- Department of Biosciences, COMSATS University Islamabad, Pakistan
- Theoretical Biology & Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico
29
Mughal F, Caetano-Anollés G. MANET 3.0: Hierarchy and modularity in evolving metabolic networks. PLoS One 2019; 14:e0224201. [PMID: 31648227 PMCID: PMC6812854 DOI: 10.1371/journal.pone.0224201] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/14/2019] [Accepted: 10/08/2019] [Indexed: 11/30/2022] Open
Abstract
Enzyme recruitment is a fundamental evolutionary driver of modern metabolism. We see evidence of recruitment at work in the metabolic Molecular Ancestry Networks (MANET) database, an online resource that integrates data from KEGG, SCOP and structural phylogenomic reconstruction. The database, which was introduced in 2006, traces the deep history of the structural domains of enzymes in metabolic pathways. Here we release version 3.0 of MANET, which updates data from KEGG and SCOP, links enzyme and PDB information with PDBsum, and traces evolutionary information of domains defined at fold family level of SCOP classification in metabolic subnetwork diagrams. Compared to SCOP folds used in the previous versions, fold families are cohesive units of functional similarity that are highly conserved at sequence level and offer a 10-fold increase of data entries. We surveyed enzymatic, functional and catalytic site distributions among superkingdoms showing that ancient enzymatic innovations followed a biphasic temporal pattern of diversification typical of module innovation. We grouped enzymatic activities of MANET into a hierarchical system of subnetworks and mesonetworks matching KEGG classification. The evolutionary growth of these modules of metabolic activity was studied using bipartite networks and their one-mode projections at enzyme, subnetwork and mesonetwork levels of organization. Evolving metabolic networks revealed patterns of enzyme sharing that transcended mesonetwork boundaries and supported the patchwork model of metabolic evolution. We also explored the scale-freeness, randomness and small-world properties of evolving networks as possible organizing principles of network growth and diversification. The network structure shows an increase in hierarchical modularity and scale-free behavior as metabolic networks unfold in evolutionary time. Remarkably, this evolutionary constraint on structure was stronger at lower levels of metabolic organization. 
Evolving metabolic structure reveals a 'principle of granularity', an evolutionary increase of the cohesiveness of lower-level parts of a hierarchical system. MANET is available at http://manet.illinois.edu.
Affiliation(s)
- Fizza Mughal
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Gustavo Caetano-Anollés
- Illinois Informatics Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
30
Solis AD. Reduced alphabet of prebiotic amino acids optimally encodes the conformational space of diverse extant protein folds. BMC Evol Biol 2019; 19:158. [PMID: 31362700 PMCID: PMC6668081 DOI: 10.1186/s12862-019-1464-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/12/2018] [Accepted: 06/19/2019] [Indexed: 11/10/2022] Open
Abstract
Background: There is wide agreement that only a subset of the twenty standard amino acids existed prebiotically in sufficient concentrations to form functional polypeptides. We ask how this subset, postulated as {A,D,E,G,I,L,P,S,T,V}, could have formed structures stable enough to found metabolic pathways. Inspired by alphabet reduction experiments, we undertook a computational analysis to measure the structural coding behavior of sequences simplified by reduced alphabets. We sought to discern characteristics of the prebiotic set that would endow it with unique properties relevant to structure, stability, and folding.
Results: Drawing on a large dataset of single-domain proteins, we employed an information-theoretic measure to assess how well the prebiotic amino acid set preserves fold information against all other possible ten-amino acid sets. An extensive virtual mutagenesis procedure revealed that the prebiotic set excellently preserves sequence-dependent information regarding both backbone conformation and tertiary contact matrix of proteins. We observed that information retention is fold-class dependent: the prebiotic set sufficiently encodes the structure space of α/β and α + β folds, and to a lesser extent, of all-α and all-β folds. The prebiotic set appeared insufficient to encode the small proteins. Assessing how well the prebiotic set discriminates native vs. incorrect sequence-structure matches, we found that α/β and α + β folds exhibit more pronounced energy gaps with the prebiotic set than with nearly all alternatives.
Conclusions: The prebiotic set optimally encodes local backbone structures that appear in the folded environment and near-optimally encodes the tertiary contact matrix of extant proteins. The fold-class-specific patterns observed from our structural analysis confirm the postulated timeline of fold appearance in proteogenesis derived from proteomic sequence analyses. Polypeptides arising in a prebiotic environment will likely form α/β and α + β-like folds if any at all. We infer that the progressive expansion of the alphabet allowed the increased conformational stability and functional specificity of later folds, including all-α, all-β, and small proteins. Our results suggest that prebiotic sequences are amenable to mutations that significantly lower native conformational energies and increase discrimination amidst incorrect folds. This property may have assisted the genesis of functional proto-enzymes prior to the expansion of the full amino acid alphabet.
Affiliation(s)
- Armando D Solis
- Biological Sciences Department, New York City College of Technology (City Tech), The City University of New York (CUNY), 285 Jay Street, Brooklyn, NY, 11201, USA.
31
Identification of functional signatures in the metabolism of the three cellular domains of life. PLoS One 2019; 14:e0217083. [PMID: 31136618 PMCID: PMC6538242 DOI: 10.1371/journal.pone.0217083] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/14/2019] [Accepted: 05/04/2019] [Indexed: 11/19/2022] Open
Abstract
In order to identify common and specific enzymatic activities associated with the metabolism of the three cellular domains of life, the conservation and variations between the enzyme contents of Bacteria, Archaea, and Eukarya organisms were evaluated. To this end, the content of enzymes belonging to a particular pathway and their abundance and distribution in 1507 organisms that have been annotated and deposited in the KEGG database were assessed. In addition, we evaluated the consecutive enzymatic reaction pairs obtained from metabolic pathway reactions and transformed into sequences of enzymatic reactions, with catalytic activities encoded in the Enzyme Commission numbers, which are linked by a substrate. Both analyses are complementary: the first considers individual reactions associated with each organism and metabolic map, and the second evaluates the functional associations between pairs of consecutive reactions. From these comparisons, we found a set of five enzymatic reactions that were widely distributed in all the organisms and considered here as universal to Bacteria, Archaea, and Eukarya; whereas 132 pairs out of 3151 reactions were identified as significant, only 5 of them were found to be widely distributed in all the taxonomic divisions. However, these universal reactions are not widely distributed along the metabolic maps, suggesting their dispensability to all metabolic processes. Finally, we found that universal reactions are also associated with ancestral domains, such as those related to phosphorus-containing groups with a phosphate group as acceptor or those related to the ribulose-phosphate binding barrel, triosephosphate isomerase, and D-ribose-5-phosphate isomerase (RpiA) lid domain, among others. Therefore, we consider that this analysis provides clues about the functional constraints associated with the repertoire of enzymatic functions per organism.
32
Staley JT, Caetano-Anollés G. Archaea-First and the Co-Evolutionary Diversification of Domains of Life. Bioessays 2018; 40:e1800036. [PMID: 29944192 DOI: 10.1002/bies.201800036] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 02/19/2018] [Revised: 05/12/2018] [Indexed: 12/13/2022]
Abstract
The origins and evolution of the Archaea, Bacteria, and Eukarya remain controversial. Phylogenomic-wide studies of molecular features that are evolutionarily conserved, such as protein structural domains, suggest Archaea is the first domain of life to diversify from a stem line of descent. This line embodies the last universal common ancestor of cellular life. Here, we propose that ancestors of Euryarchaeota co-evolved with those of Bacteria prior to the diversification of Eukarya. This co-evolutionary scenario is supported by comparative genomic and phylogenomic analyses of the distributions of fold families of domains in the proteomes of free-living organisms, which show horizontal gene recruitments and informational process homologies. It also benefits from the molecular study of cell physiologies responsible for membrane phospholipids, methanogenesis, methane oxidation, cell division, gas vesicles, and the cell wall. Our theory, however, challenges popular cell-fusion and two-domains-of-life scenarios derived from sequence analysis, demanding phylogenetic reconciliation. Also see the video abstract here: https://youtu.be/9yVWn_Q9faY.
Affiliation(s)
- James T Staley
- Department of Microbiology and Astrobiology Program, University of Washington, Seattle, WA, 98195, USA
- Gustavo Caetano-Anollés
- Department of Crop Sciences, C. R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
33
Li X, Meng D, Li J, Yin H, Liu H, Liu X, Cheng C, Xiao Y, Liu Z, Yan M. Response of soil microbial communities and microbial interactions to long-term heavy metal contamination. Environ Pollut 2017; 231:908-917. [PMID: 28886536 DOI: 10.1016/j.envpol.2017.08.057] [Citation(s) in RCA: 255] [Impact Index Per Article: 31.9] [Received: 03/08/2017] [Revised: 08/14/2017] [Accepted: 08/14/2017] [Indexed: 05/07/2023]
Abstract
Due to the persistence of metals in the ecosystem and their threat to all living organisms, the effects of heavy metals on soil microbial communities have been widely studied. However, little is known about the interactions among microorganisms in heavy metal-contaminated soils. In the present study, microbial communities in non- (CON), moderately (CL) and severely (CH) contaminated soils were investigated through high-throughput Illumina sequencing of 16S rRNA gene amplicons, and networks were constructed to show the interactions among microbes. Results showed that heavy metal contamination significantly affected microbial community composition but not microbial diversity. Bacteria showed varied responses to heavy metals. Bacteria that positively correlated with Cd, e.g. Acidobacteria_Gp and Proteobacteria_thiobacillus, had more links between nodes and more positive interactions among microbes in CL- and CH-networks, while bacteria that negatively correlated with Cd, e.g. Longilinea, Gp2 and Gp4, had fewer network links and more negative interactions in CL- and CH-networks. Unlike bacteria, members of the archaeal domain, i.e. phyla Crenarchaeota and Euryarchaeota, class Thermoprotei and order Thermoplasmatales, showed only positive correlation with Cd and had more network interactions in CH-networks. The present study indicated that (i) the microbial community composition, as well as network interactions, shifted to strengthen the adaptability of microorganisms to heavy metal contamination, and (ii) archaea were resistant to heavy metal contamination and may contribute to the adaptation to heavy metals. It was proposed that the contribution might be achieved either by improving environmental conditions or by cooperative interactions.
Affiliation(s)
- Xiaoqi Li
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Delong Meng
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China; School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Juan Li
- College of Agronomy, Hunan Agricultural University, Changsha 410128, China
- Huaqun Yin
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Hongwei Liu
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Xueduan Liu
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Cheng Cheng
- School of Life Science, Hunan University of Science and Technology, Yuhu District, Xiangtan, Hunan Province 411201, China
- Yunhua Xiao
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Zhenghua Liu
- School of Minerals Processing and Bioengineering, Central South University, Changsha 410083, China; Key Laboratory of Biometallurgy, Ministry of Education, Changsha 410083, China
- Mingli Yan
- School of Life Science, Hunan University of Science and Technology, Yuhu District, Xiangtan, Hunan Province 411201, China.
34
Laurie J, Chattopadhyay AK, Flower DR. Protein lipograms. J Theor Biol 2017; 430:109-116. [PMID: 28716385 DOI: 10.1016/j.jtbi.2017.07.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2017] [Revised: 06/30/2017] [Accepted: 07/12/2017] [Indexed: 11/20/2022]
Abstract
Linguistic analysis of protein sequences is an underexploited technique. Here, we capitalize on the concept of the lipogram to characterize sequences at the proteome level. A lipogram is a literary composition which omits one or more letters. A protein lipogram likewise omits one or more types of amino acid. In this article, we establish a usable terminology for the decomposition of a sequence collection in terms of the lipogram. Next, we characterize UniRef50 using a lipogram decomposition. At the global level, protein lipograms exhibit power-law properties. A clear correlation with metabolic cost is seen. Finally, we use the lipogram construction to assign proteomes to the four branches of the tree-of-life: archaea, bacteria, eukaryotes and viruses. We conclude from this pilot study that the lipogram demonstrates considerable potential as an additional tool for sequence analysis and proteome classification.
Affiliation(s)
- Jason Laurie
- School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK; Systems Analytics Research Institute, Aston University, Birmingham B4 7ET, UK
- Amit K Chattopadhyay
- School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK; Systems Analytics Research Institute, Aston University, Birmingham B4 7ET, UK
- Darren R Flower
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK.
35
Koç I, Caetano-Anollés G. The natural history of molecular functions inferred from an extensive phylogenomic analysis of gene ontology data. PLoS One 2017; 12:e0176129. [PMID: 28467492 PMCID: PMC5414959 DOI: 10.1371/journal.pone.0176129] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/02/2016] [Accepted: 04/05/2017] [Indexed: 11/18/2022] Open
Abstract
The origin and natural history of molecular functions hold the key to the emergence of cellular organization and modern biochemistry. Here we use a genomic census of Gene Ontology (GO) terms to reconstruct phylogenies at the three highest (1, 2 and 3) and the lowest (terminal) levels of the hierarchy of molecular functions, which reflect the broadest and the most specific GO definitions, respectively. These phylogenies define evolutionary timelines of functional innovation. We analyzed 249 free-living organisms comprising the three superkingdoms of life, Archaea, Bacteria, and Eukarya. Phylogenies indicate catalytic, binding and transport functions were the oldest, suggesting a 'metabolism-first' origin scenario for biochemistry. Metabolism made use of increasingly complicated organic chemistry. Primordial features of ancient molecular functions and functional recruitments were further distilled by studying the oldest child terms of the oldest level 1 GO definitions. Network analyses showed the existence of an hourglass pattern of enzyme recruitment in the molecular functions of the directed acyclic graph of molecular functions. Older high-level molecular functions were thoroughly recruited at younger lower levels, while very young high-level functions were used throughout the timeline. This pattern repeated in every one of the three mappings, which gave a criss-cross pattern. The timelines and their mappings were remarkable. They revealed the progressive evolutionary development of functional toolkits, starting with the early rise of metabolic activities, followed chronologically by the rise of macromolecular biosynthesis, the establishment of controlled interactions with the environment and self, adaptation to oxygen, and enzyme coordinated regulation, and ending with the rise of structural and cellular complexity. This historical account holds important clues for dissection of the emergence of biocomplexity and life.
Affiliation(s)
- Ibrahim Koç
- Molecular Biology and Genetics, Gebze Technical University, Kocaeli, Turkey
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
36
Remnants of an Ancient Metabolism without Phosphate. Cell 2017; 168:1126-1134.e9. [PMID: 28262353 DOI: 10.1016/j.cell.2017.02.001] [Citation(s) in RCA: 127] [Impact Index Per Article: 15.9] [Received: 09/12/2016] [Revised: 12/16/2016] [Accepted: 01/31/2017] [Indexed: 11/23/2022]
Abstract
Phosphate is essential for all living systems, serving as a building block of genetic and metabolic machinery. However, it is unclear how phosphate could have assumed these central roles on primordial Earth, given its poor geochemical accessibility. We used systems biology approaches to explore the alternative hypothesis that a protometabolism could have emerged prior to the incorporation of phosphate. Surprisingly, we identified a cryptic phosphate-independent core metabolism producible from simple prebiotic compounds. This network is predicted to support the biosynthesis of a broad category of key biomolecules. Its enrichment for enzymes utilizing iron-sulfur clusters, and the fact that thermodynamic bottlenecks are more readily overcome by thioester rather than phosphate couplings, suggest that this network may constitute a "metabolic fossil" of an early phosphate-free nonenzymatic biochemistry. Our results corroborate and expand previous proposals that a putative thioester-based metabolism could have predated the incorporation of phosphate and an RNA-based genetic system.
37
Staley JT, Fuerst JA. Ancient, highly conserved proteins from a LUCA with complex cell biology provide evidence in support of the nuclear compartment commonality (NuCom) hypothesis. Res Microbiol 2017; 168:395-412. [PMID: 28111289 DOI: 10.1016/j.resmic.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/24/2016] [Revised: 01/08/2017] [Accepted: 01/09/2017] [Indexed: 12/23/2022]
Abstract
The nuclear compartment commonality (NuCom) hypothesis posits a complex last universal common ancestor (LUCA) with membranous compartments including a nuclear membrane. Such a LUCA then evolved to produce two nucleated lineages of the tree of life: the Planctomycetes-Verrucomicrobia-Chlamydia superphylum (PVC) within the Bacteria, and the Eukarya. We propose that a group of ancient essential protokaryotic signature proteins (PSPs) originating in LUCA were incorporated into ancestors of PVC Bacteria and Eukarya. Tubulins, ubiquitin system enzymes and sterol-synthesizing enzymes are consistent with early origins of these features shared between the PVC superphylum and Eukarya.
Affiliation(s)
- James T Staley
- Department of Microbiology and Astrobiology Program, University of Washington, Seattle 98195, USA
- John A Fuerst
- School of Chemistry and Molecular Biosciences, University of Queensland, St. Lucia, Queensland 4072, Australia.
38
Arguments Reinforcing the Three-Domain View of Diversified Cellular Life. Archaea 2016; 2016:1851865. [PMID: 28050162 PMCID: PMC5165138 DOI: 10.1155/2016/1851865] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 08/16/2016] [Revised: 10/18/2016] [Accepted: 11/03/2016] [Indexed: 11/18/2022]
Abstract
The archaeal ancestor scenario (AAS) for the origin of eukaryotes implies the emergence of a new kind of organism from the fusion of ancestral archaeal and bacterial cells. Equipped with this “chimeric” molecular arsenal, the resulting cell would gradually accumulate unique genes and develop the complex molecular machineries and cellular compartments that are hallmarks of modern eukaryotes. In this regard, proteins related to phagocytosis and cell movement should be present in the archaeal ancestor, thus identifying the recently described candidate archaeal phylum “Lokiarchaeota” as resembling a possible candidate ancestor of eukaryotes. Despite its appeal, AAS seems incompatible with the genomic, molecular, and biochemical differences that exist between Archaea and Eukarya. In particular, the distribution of conserved protein domain structures in the proteomes of cellular organisms and viruses appears hard to reconcile with the AAS. In addition, concerns related to taxon and character sampling, presupposing bacterial outgroups in phylogenies, and nonuniform effects of protein domain structure rearrangement and gain/loss in concatenated alignments of protein sequences cast further doubt on AAS-supporting phylogenies. Here, we evaluate AAS against the traditional “three-domain” world of cellular organisms and propose that the discovery of Lokiarchaeota could be better reconciled under the latter view, especially in light of several additional biological and technical considerations.
39
A Dynamic Model for the Evolution of Protein Structure. J Mol Evol 2016; 82:230-43. [PMID: 27146880 DOI: 10.1007/s00239-016-9740-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/31/2015] [Accepted: 04/12/2016] [Indexed: 10/21/2022]
Abstract
Domains are folded structures and evolutionary building blocks of protein molecules. Their three-dimensional atomic conformations, which define biological functions, can be coarse-grained into levels of a hierarchy. Here we build global dynamical models for the evolution of domains at fold and fold superfamily (FSF) levels. We fit the models with data from phylogenomic trees of domain structures and evaluate the distributions of the resulting parameters and their implications. The trees were inferred from a census of domain structures in hundreds of genomes from all three superkingdoms of life. The models used birth-death differential equations with the global abundances of structures as state variables, with one set of equations for folds and another for FSFs. Only the transitions present in the tree are assumed possible. Each fold or FSF diversifies in variants, eventually producing a new fold or FSF. The parameters specify rates of generation of variants and of new folds or FSFs. The equations were solved for the parameters by simplifying the trees to a comb-like topology, treating branches as emerging directly from a trunk. We found that the rate constants for folds and FSFs evolved similarly. These parameters showed a sharp transient change at about 1.5 Gyrs ago. This time coincides with a period in which domains massively combined in proteins and their arrangements distributed in novel lineages during the rise of organismal diversification. Our simulations suggest that exploration of protein structure space occurs through coarse-grained discoveries that undergo fine-grained elaboration.
40
The TIM Barrel Architecture Facilitated the Early Evolution of Protein-Mediated Metabolism. J Mol Evol 2016; 82:17-26. [PMID: 26733481 PMCID: PMC4709378 DOI: 10.1007/s00239-015-9722-8] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Received: 07/06/2015] [Accepted: 11/11/2015] [Indexed: 12/30/2022]
Abstract
The triosephosphate isomerase (TIM) barrel protein fold is a structurally repetitive architecture that is present in approximately 10% of all enzymes. It is generally assumed that this ubiquity in modern proteomes reflects an essential historical role in early protein-mediated metabolism. Here, we provide quantitative and comparative analyses to support several hypotheses about the early importance of the TIM barrel architecture. An information theoretical analysis of protein structures supports the hypothesis that the TIM barrel architecture could arise more easily by duplication and recombination compared to other mixed α/β structures. We show that TIM barrel enzymes corresponding to the most taxonomically broad superfamilies also have the broadest range of functions, often aided by metal and nucleotide-derived cofactors that are thought to reflect an earlier stage of metabolic evolution. By comparison to other putatively ancient protein architectures, we find that the functional diversity of TIM barrel proteins cannot be explained simply by their antiquity. Instead, the breadth of TIM barrel functions can be explained, in part, by the incorporation of a broad range of cofactors, a trend that does not appear to be shared by proteins in general. These results support the hypothesis that the simple and functionally general TIM barrel architecture may have arisen early in the evolution of protein biosynthesis and provided an ideal scaffold to facilitate the metabolic transition from ribozymes, peptides, and geochemical catalysts to modern protein enzymes.
|
41
|
Sintes E, De Corte D, Ouillon N, Herndl GJ. Macroecological patterns of archaeal ammonia oxidizers in the Atlantic Ocean. Mol Ecol 2015; 24:4931-42. [PMID: 26336038 PMCID: PMC4950044 DOI: 10.1111/mec.13365] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 10/21/2014] [Revised: 07/29/2015] [Accepted: 08/21/2015] [Indexed: 12/16/2022]
Abstract
Macroecological patterns are found in animals and plants, but also in micro-organisms. Macroecological and biogeographic distribution patterns in marine Archaea, however, have not been studied yet. Ammonia-oxidizing Archaea (AOA) show a bipolar distribution (i.e. similar communities in the northernmost and the southernmost locations, separated by distinct communities in the tropical and gyral regions) throughout the Atlantic, detectable from epipelagic to upper bathypelagic layers (<2000 m depth). This tentatively suggests an influence of the epipelagic conditions of organic matter production on bathypelagic AOA communities. The AOA communities below 2000 m depth showed a less pronounced biogeographic distribution pattern than the upper 2000 m water column. Overall, AOA in the surface and deep Atlantic waters exhibit distance-decay relationships and follow the Rapoport rule in a similar way as bacterial communities and macroorganisms. This indicates a major role of environmental conditions in shaping the community composition and assembly (species sorting) and no, or only weak limits for dispersal in the oceanic thaumarchaeal communities. However, there is indication of a different strength of these relationships between AOA and Bacteria, linked to the intrinsic differences between these two domains.
Affiliation(s)
- Eva Sintes
- Department of Limnology and Bio‐Oceanography, Center of Ecology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Daniele De Corte
- Department of Limnology and Bio‐Oceanography, Center of Ecology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Natascha Ouillon
- Department of Limnology and Bio‐Oceanography, Center of Ecology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Gerhard J. Herndl
- Department of Limnology and Bio‐Oceanography, Center of Ecology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Department of Biological Oceanography, Royal Netherlands Institute for Sea Research, PO Box 59, 1790 Den Burg, The Netherlands
|
42
|
Nasir A, Caetano-Anollés G. A phylogenomic data-driven exploration of viral origins and evolution. Sci Adv 2015; 1:e1500527. [PMID: 26601271 PMCID: PMC4643759 DOI: 10.1126/sciadv.1500527] [Citation(s) in RCA: 124] [Impact Index Per Article: 12.4] [Received: 04/27/2015] [Accepted: 06/30/2015] [Indexed: 05/05/2023]
Abstract
The origin of viruses remains mysterious because of their diverse and patchy molecular and functional makeup. Although numerous hypotheses have attempted to explain viral origins, none is backed by substantive data. We take full advantage of the wealth of available protein structural and functional data to explore the evolution of the proteomic makeup of thousands of cells and viruses. Despite the extremely reduced nature of viral proteomes, we established an ancient origin of the "viral supergroup" and the existence of widespread episodes of horizontal transfer of genetic information. Viruses harboring different replicon types and infecting distantly related hosts shared many metabolic and informational protein structural domains of ancient origin that were also widespread in cellular proteomes. Phylogenomic analysis uncovered a universal tree of life and revealed that modern viruses reduced from multiple ancient cells that harbored segmented RNA genomes and coexisted with the ancestors of modern cells. The model for the origin and evolution of viruses and cells is backed by strong genomic and structural evidence and can be reconciled with existing models of viral evolution if one considers viruses to have originated from ancient cells and not from modern counterparts.
|
43
|
Shahzad K, Mittenthal JE, Caetano-Anollés G. The organization of domains in proteins obeys Menzerath-Altmann's law of language. BMC Syst Biol 2015; 9:44. [PMID: 26260760 PMCID: PMC4531524 DOI: 10.1186/s12918-015-0192-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 02/25/2015] [Accepted: 07/30/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND The combination of domains in multidomain proteins enhances their function and structure but lengthens the molecules and increases their cost at cellular level. METHODS The dependence of domain length on the number of domains a protein holds was surveyed for a set of 60 proteomes representing free-living organisms from all kingdoms of life. Distributions were fitted using non-linear functions and fitted parameters interpreted with a formulation of decreasing returns. RESULTS We find that domain length decreases with increasing number of domains in proteins, following the Menzerath-Altmann (MA) law of language. Highly significant negative correlations exist for the set of proteomes examined. Mathematically, the MA law expresses as a power law relationship that unfolds when molecular persistence P is a function of domain accretion. P holds two terms, one reflecting the matter-energy cost of adding domains and extending their length, the other reflecting how domain length and number impinges on information and biophysics. The pattern of diminishing returns can therefore be explained as a frustrated interplay between the strategies of economy, flexibility and robustness, matching previously observed trade-offs in the domain makeup of proteomes. Proteomes of Archaea, Fungi and to a lesser degree Plants show the largest push towards molecular economy, each at their own economic stratum. Fungi increase domain size in single domain proteins while reinforcing the pattern of diminishing returns. In contrast, Metazoa, and to lesser degrees Protista and Bacteria, relax economy. Metazoa achieves maximum flexibility and robustness by harboring compact molecules and complex domain organization, offering a new functional vocabulary for molecular biology. CONCLUSIONS The tendency of parts to decrease their size when systems enlarge is universal for language and music, and now for parts of macromolecules, extending the MA law to natural systems.
Affiliation(s)
- Jay E Mittenthal
- Department of Cell and Developmental Biology, Urbana, IL, 61801, USA.
- Gustavo Caetano-Anollés
- Illinois Informatics Institute, Urbana, IL, 61801, USA; Department of Crop Sciences, Evolutionary Bioinformatics Laboratory, University of Illinois, 332 NSRC, Urbana, IL, 61801, USA.
|
44
|
Caetano-Anollés G, Caetano-Anollés D. Computing the origin and evolution of the ribosome from its structure - Uncovering processes of macromolecular accretion benefiting synthetic biology. Comput Struct Biotechnol J 2015; 13:427-47. [PMID: 27096056 PMCID: PMC4823900 DOI: 10.1016/j.csbj.2015.07.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 04/09/2015] [Revised: 07/16/2015] [Accepted: 07/19/2015] [Indexed: 12/11/2022] Open
Abstract
Accretion occurs pervasively in nature at widely different timeframes. The process also manifests in the evolution of macromolecules. Here we review recent computational and structural biology studies of evolutionary accretion that make use of the ideographic (historical, retrodictive) and nomothetic (universal, predictive) scientific frameworks. Computational studies uncover explicit timelines of accretion of structural parts in molecular repertoires and molecules. Phylogenetic trees of protein structural domains and proteomes and their molecular functions were built from a genomic census of millions of encoded proteins and associated terminal Gene Ontology terms. Trees reveal a ‘metabolic-first’ origin of proteins, the late development of translation, and a patchwork distribution of proteins in biological networks mediated by molecular recruitment. Similarly, the natural history of ancient RNA molecules inferred from trees of molecular substructures built from a census of molecular features shows patchwork-like accretion patterns. Ideographic analyses of ribosomal history uncover the early appearance of structures supporting mRNA decoding and tRNA translocation, the coevolution of ribosomal proteins and RNA, and a first evolutionary transition that brings ribosomal subunits together into a processive protein biosynthetic complex. Nomothetic structural biology studies of tertiary interactions and ancient insertions in rRNA complement these findings, once concentric layering assumptions are removed. Patterns of coaxial helical stacking reveal a frustrated dynamics of outward and inward ribosomal growth possibly mediated by structural grafting. The early rise of the ribosomal ‘turnstile’ suggests an evolutionary transition in natural biological computation. Results make explicit the need to understand processes of molecular growth and information transfer of macromolecules.
Affiliation(s)
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, 1101 W. Peabody Drive, Urbana, IL 61801, USA; C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
- Derek Caetano-Anollés
- C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
|
45
|
Nasir A, Kim KM, Caetano-Anollés G. Lokiarchaeota: eukaryote-like missing links from microbial dark matter? Trends Microbiol 2015; 23:448-50. [PMID: 26112912 DOI: 10.1016/j.tim.2015.06.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 06/05/2015] [Accepted: 06/10/2015] [Indexed: 11/25/2022]
Abstract
Identification and genome sequencing of novel organismal groups can reduce the gap between the sequenced minority and the unexplored majority. The recent discovery of phylum Lokiarchaeota promises understanding of biological history. Here we inquire if Lokiarchaeota truly represent ancient eukaryotic ancestors or just microbial dark matter of expanding archaeal diversity.
Affiliation(s)
- Arshan Nasir
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan
- Kyung Mo Kim
- Microbial Resource Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA.
|
46
|
Abstract
The concept of the minimal cell has fascinated scientists for a long time, from both fundamental and applied points of view. This broad concept encompasses extreme reductions of genomes, the last universal common ancestor (LUCA), the creation of semiartificial cells, and the design of protocells and chassis cells. Here we review these different areas of research and identify common and complementary aspects of each one. We focus on systems biology, a discipline that is greatly facilitating the classical top-down and bottom-up approaches toward minimal cells. In addition, we also review the so-called middle-out approach and its contributions to the field with mathematical and computational models. Owing to the advances in genomics technologies, much of the work in this area has been centered on minimal genomes, or rather minimal gene sets, required to sustain life. Nevertheless, a fundamental expansion has been taking place in the last few years wherein the minimal gene set is viewed as a backbone of a more complex system. Complementing genomics, progress is being made in understanding the system-wide properties at the levels of the transcriptome, proteome, and metabolome. Network modeling approaches are enabling the integration of these different omics data sets toward an understanding of the complex molecular pathways connecting genotype to phenotype. We review key concepts central to the mapping and modeling of this complexity, which is at the heart of research on minimal cells. Finally, we discuss the distinction between minimizing the number of cellular components and minimizing cellular complexity, toward an improved understanding and utilization of minimal and simpler cells.
|
47
|
Nasir A, Sun FJ, Kim KM, Caetano-Anollés G. Untangling the origin of viruses and their impact on cellular evolution. Ann N Y Acad Sci 2015; 1341:61-74. [PMID: 25758413 DOI: 10.1111/nyas.12735] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
The origin and evolution of viruses remain mysterious. Here, we focus on the distribution of viral replicons in host organisms, their morphological features, and the evolution of highly conserved protein and nucleic acid structures. The apparent inability of RNA viral replicons to infect contemporary akaryotic species suggests an early origin of RNA viruses and their subsequent loss in akaryotes. A census of virion morphotypes reveals that advanced forms were unique to viruses infecting a specific supergroup, while simpler forms were observed in viruses infecting organisms in all forms of cellular life. Results hint toward an ancient origin of viruses from an ancestral virus harboring either filamentous or spherical virions. Finally, phylogenetic trees built from protein domain and tRNA structures in thousands of genomes suggest that viruses evolved via reductive evolution from ancient cells. The analysis presents a complete account of the evolutionary history of cells and viruses and identifies viruses as crucial agents influencing cellular evolution.
Affiliation(s)
- Arshan Nasir
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences and Illinois Informatics Institute, University of Illinois, Urbana, Illinois
|
48
|
Kenyon LJ, Sabree ZL. Obligate insect endosymbionts exhibit increased ortholog length variation and loss of large accessory proteins concurrent with genome shrinkage. Genome Biol Evol 2015; 6:763-75. [PMID: 24671745 PMCID: PMC4007534 DOI: 10.1093/gbe/evu055] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022] Open
Abstract
Extreme genome reduction has been observed in obligate intracellular insect mutualists and is an assumed consequence of fixed, long-term host isolation. Rapid accumulation of mutations and pseudogenization of genes no longer vital for an intracellular lifestyle, followed by deletion of many genes, are factors that lead to genome reduction. Size reductions in individual genes due to small-scale deletions have also been implicated in contributing to overall genome shrinkage. Conserved protein functional domains are expected to exhibit low tolerance for mutations and therefore remain relatively unchanged throughout protein length reduction while nondomain regions, presumably under less selective pressures, would shorten. This hypothesis was tested using orthologous protein sets from the Flavobacteriaceae (phylum: Bacteroidetes) and Enterobacteriaceae (subphylum: Gammaproteobacteria) families, each of which includes some of the smallest known genomes. Upon examination of protein, functional domain, and nondomain region lengths, we found that proteins were not uniformly shrinking with genome reduction, but instead increased length variability and variability was observed in both the functional domain and nondomain regions. Additionally, as complete gene loss also contributes to overall genome shrinkage, we found that the largest proteins in the proteomes of nonhost-restricted bacteroidetial and gammaproteobacterial species often were inferred to be involved in secondary metabolic processes, extracellular sensing, or of unknown function. These proteins were absent in the proteomes of obligate insect endosymbionts. Therefore, loss of genes encoding large proteins not required for host-restricted lifestyles in obligate endosymbiont proteomes likely contributes to extreme genome reduction to a greater degree than gene shrinkage.
Affiliation(s)
- Laura J Kenyon
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University
|
49
|
The place of RNA in the origin and early evolution of the genetic machinery. Life (Basel) 2014; 4:1050-91. [PMID: 25532530 PMCID: PMC4284482 DOI: 10.3390/life4041050] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 10/24/2014] [Revised: 12/02/2014] [Accepted: 12/09/2014] [Indexed: 11/17/2022] Open
Abstract
The extant genetic machinery revolves around three interrelated polymers: RNA, DNA and proteins. Two evolutionary views approach this vital connection from opposite perspectives. The RNA World theory posits that life began in a cold prebiotic broth of monomers with the de novo emergence of replicating RNA as functionally self-contained polymer and that subsequent evolution is characterized by RNA → DNA memory takeover and ribozyme → enzyme catalyst takeover. The FeS World theory posits that life began as an autotrophic metabolism in hot volcanic-hydrothermal fluids and evolved with organic products turning into ligands for transition metal catalysts thereby eliciting feedback and feed-forward effects. In this latter context it is posited that the three polymers of the genetic machinery essentially coevolved from monomers through oligomers to polymers, operating functionally first as ligands for ligand-accelerated transition metal catalysis with later addition of base stacking and base pairing, whereby the functional dichotomy between hereditary DNA with stability on geologic time scales and transient, catalytic RNA with stability on metabolic time scales existed since the dawn of the genetic machinery. Both approaches are assessed comparatively for chemical soundness.
|
50
|
de Lorenzo V, Sekowska A, Danchin A. Chemical reactivity drives spatiotemporal organisation of bacterial metabolism. FEMS Microbiol Rev 2014; 39:96-119. [PMID: 25227915 DOI: 10.1111/1574-6976.12089] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/20/2023] Open
Abstract
In this review, we examine how bacterial metabolism is shaped by chemical constraints acting on the material and dynamic layout of enzymatic networks and beyond. These are moulded not only for optimisation of given metabolic objectives (e.g. synthesis of a particular amino acid or nucleotide) but also for curbing the detrimental reactivity of chemical intermediates. Besides substrate channelling, toxicity is avoided by barriers to free diffusion (i.e. compartments) that separate otherwise incompatible reactions, along with ways for distinguishing damaging vs. harmless molecules. On the other hand, enzymes age and their operating lifetime must be tuned to upstream and downstream reactions. This time dependence of metabolic pathways creates time-linked information, learning and memory. These features suggest that the physical structure of existing biosystems, from operon assemblies to multicellular development may ultimately stem from the need to restrain chemical damage and limit the waste inherent to basic metabolic functions. This provides a new twist of our comprehension of fundamental biological processes in live systems as well as practical take-home lessons for the forward DNA-based engineering of novel biological objects.
Affiliation(s)
- Víctor de Lorenzo
- Systems Biology Program, Centro Nacional de Biotecnología CSIC, Cantoblanco-Madrid, Spain
- Agnieszka Sekowska
- AMAbiotics SAS, Institut du Cerveau et de la Moëlle Épinière, Hôpital de la Pitié-Salpêtrière, Paris, France
- Antoine Danchin
- AMAbiotics SAS, Institut du Cerveau et de la Moëlle Épinière, Hôpital de la Pitié-Salpêtrière, Paris, France
|