1
Koska S, Leljak-Levanić D, Malenica N, Bigović Villi K, Futo M, Čorak N, Jagić M, Ivanić A, Tušar A, Kasalo N, Domazet-Lošo M, Vlahoviček K, Domazet-Lošo T. Developmental phylotranscriptomics in grapevine suggests an ancestral role of somatic embryogenesis. Commun Biol 2025; 8:265. [PMID: 39972184 PMCID: PMC11839975 DOI: 10.1038/s42003-025-07712-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 02/10/2025] [Indexed: 02/21/2025] Open
Abstract
The zygotic embryogenesis of Arabidopsis, which is initiated by gamete fusion, shows hourglass-shaped ontogeny-phylogeny correlations at the transcriptome level. However, many plants are capable of yielding a fully viable next generation by somatic embryogenesis, a comparable developmental process that usually starts with the embryogenic induction of a diploid somatic cell. To explore the correspondence between ontogeny and phylogeny in this alternative developmental route in plants, here we develop a highly efficient model of somatic embryogenesis in grapevine (Vitis vinifera) and sequence its developmental transcriptomes. By combining the evolutionary properties of grapevine genes with their expression values, recovered from early induction to the formation of juvenile plants, we find a strongly supported hourglass-shaped developmental trajectory. However, in contrast to zygotic embryogenesis in Arabidopsis, where the torpedo stage is the most evolutionarily inert, in the somatic embryogenesis of grapevine, the heart stage expresses the most evolutionarily conserved transcriptome. This represents a surprising finding because it suggests a better evolutionary system-level analogy between animal development and plant somatic embryogenesis than zygotic embryogenesis. We conclude that macroevolutionary logic is deeply hardwired in plant ontogeny and that somatic embryogenesis is likely a primordial embryogenic program in plants.
Affiliation(s)
- Sara Koska
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Dunja Leljak-Levanić
- Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Nenad Malenica
- Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Kian Bigović Villi
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Momir Futo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- School of Medicine, Catholic University of Croatia, Zagreb, Croatia
- Nina Čorak
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Mateja Jagić
- Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Ariana Ivanić
- Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Anja Tušar
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Niko Kasalo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Mirjana Domazet-Lošo
- Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
- Kristian Vlahoviček
- Bioinformatics Group, Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Tomislav Domazet-Lošo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- School of Medicine, Catholic University of Croatia, Zagreb, Croatia
2
Xia S, Chen J, Arsala D, Emerson JJ, Long M. Functional innovation through new genes as a general evolutionary process. Nat Genet 2025; 57:295-309. [PMID: 39875578 DOI: 10.1038/s41588-024-02059-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 12/15/2024] [Indexed: 01/30/2025]
Abstract
In the past decade, our understanding of how new genes originate in diverse organisms has advanced substantially, and more than a dozen molecular mechanisms for generating initial gene structures were identified, in addition to gene duplication. These new genes have been found to integrate into and modify pre-existing gene networks primarily through mutation and selection, revealing new patterns and rules with stable origination rates across various organisms. This progress has challenged the prevailing belief that new proteins evolve from pre-existing genes, as new genes may arise de novo from noncoding DNA sequences in many organisms, with high rates observed in flowering plants. New genes have important roles in phenotypic and functional evolution across diverse biological processes and structures, with detectable fitness effects of sexual conflict genes that can shape species divergence. Such knowledge of new genes can be of translational value in agriculture and medicine.
Affiliation(s)
- Shengqian Xia
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Jianhai Chen
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Deanna Arsala
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- J J Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
3
Roginski P, Grandchamp A, Quignot C, Lopes A. De Novo Emerged Gene Search in Eukaryotes with DENSE. Genome Biol Evol 2024; 16:evae159. [PMID: 39212967 PMCID: PMC11363675 DOI: 10.1093/gbe/evae159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/07/2024] [Indexed: 09/04/2024] Open
Abstract
The discovery of de novo emerged genes, originating from previously noncoding DNA regions, challenges traditional views of species evolution. Indeed, the hypothesis of neutrally evolving sequences giving rise to functional proteins is highly unlikely. This conundrum has sparked numerous studies to quantify and characterize these genes, aiming to understand their functional roles and contributions to genome evolution. Yet, no fully automated pipeline for their identification is available. Therefore, we introduce DENSE (DE Novo emerged gene SEarch), an automated Nextflow pipeline based on two distinct steps: detection of taxonomically restricted genes (TRGs) through phylostratigraphy, and filtering of TRGs for de novo emerged genes via genome comparisons and synteny search. DENSE is available as a user-friendly command-line tool, while the second step is accessible through a web server upon providing a list of TRGs. Highly flexible, DENSE provides various strategy and parameter combinations, enabling users to adapt to specific configurations or define their own strategy through a rational framework, facilitating protocol communication, and study interoperability. We apply DENSE to seven model organisms, exploring the impact of its strategies and parameters on de novo gene predictions. This thorough analysis across species with different evolutionary rates reveals useful metrics for users to define input datasets, identify favorable/unfavorable conditions for de novo gene detection, and control potential biases in genome annotations. Additionally, predictions made for the seven model organisms are compiled into a requestable database, which we hope will serve as a reference for de novo emerged gene lists generated with specific criteria combinations.
Affiliation(s)
- Paul Roginski
- Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Anna Grandchamp
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Chloé Quignot
- Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
- Anne Lopes
- Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, 91198 Gif-sur-Yvette, France
4
Domazet-Lošo M, Široki T, Šimičević K, Domazet-Lošo T. Macroevolutionary dynamics of gene family gain and loss along multicellular eukaryotic lineages. Nat Commun 2024; 15:2663. [PMID: 38531970 DOI: 10.1038/s41467-024-47017-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024] Open
Abstract
The gain and loss of genes fluctuate over evolutionary time in major eukaryotic clades. However, the full profile of these macroevolutionary trajectories is still missing. To give a more inclusive view on the changes in genome complexity across the tree of life, here we recovered the evolutionary dynamics of gene family gain and loss ranging from the ancestor of cellular organisms to 352 eukaryotic species. We show that in all considered lineages the gene family content follows a common evolutionary pattern, where the number of gene families reaches the highest value at a major evolutionary and ecological transition, and then gradually decreases towards extant organisms. This supports theoretical predictions and suggests that the genome complexity is often decoupled from commonly perceived organismal complexity. We conclude that simplification by gene family loss is a dominant force in Phanerozoic genomes of various lineages, probably underpinned by intense ecological specializations and functional outsourcing.
Affiliation(s)
- Mirjana Domazet-Lošo
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Tin Široki
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Korina Šimičević
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Tomislav Domazet-Lošo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia
- School of Medicine, Catholic University of Croatia, Ilica 242, HR-10000, Zagreb, Croatia
5
Grandchamp A, Kühl L, Lebherz M, Brüggemann K, Parsch J, Bornberg-Bauer E. Population genomics reveals mechanisms and dynamics of de novo expressed open reading frame emergence in Drosophila melanogaster. Genome Res 2023; 33:872-890. [PMID: 37442576 PMCID: PMC10519401 DOI: 10.1101/gr.277482.122] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/15/2022] [Accepted: 06/06/2023] [Indexed: 07/15/2023]
Abstract
Novel genes are essential for evolutionary innovations and differ substantially even between closely related species. Recently, multiple studies across many taxa showed that some novel genes arise de novo, that is, from previously noncoding DNA. To characterize the underlying mutations that allowed de novo gene emergence and their order of occurrence, homologous regions must be detected within noncoding sequences in closely related sister genomes. So far, most studies do not detect noncoding homologs of de novo genes because of incomplete assemblies and annotations, and long evolutionary distances separating genomes. Here, we overcome these issues by searching for de novo expressed open reading frames (neORFs), the not-yet fixed precursors of de novo genes that emerged within a single species. We sequenced and assembled genomes with long-read technology and the corresponding transcriptomes from inbred lines of Drosophila melanogaster, derived from seven geographically diverse populations. We found line-specific neORFs in abundance but few neORFs shared by lines, suggesting a rapid turnover. Gain and loss of transcription is more frequent than the creation of ORFs, for example, by forming new start and stop codons. Consequently, the gain of ORFs becomes rate limiting and is frequently the initial step in neORFs emergence. Furthermore, transposable elements (TEs) are major drivers for intragenomic duplications of neORFs, yet TE insertions are less important for the emergence of neORFs. However, highly mutable genomic regions around TEs provide new features that enable gene birth. In conclusion, neORFs have a high birth-death rate, are rapidly purged, but surviving neORFs spread neutrally through populations and within genomes.
Affiliation(s)
- Anna Grandchamp
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Lucas Kühl
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Marie Lebherz
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Kathrin Brüggemann
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- John Parsch
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, 82152 Munich, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Max Planck Institute for Biology Tübingen, Department of Protein Evolution, 72076 Tübingen, Germany
6
Barrera-Redondo J, Lotharukpong JS, Drost HG, Coelho SM. Uncovering gene-family founder events during major evolutionary transitions in animals, plants and fungi using GenEra. Genome Biol 2023; 24:54. [PMID: 36964572 PMCID: PMC10037820 DOI: 10.1186/s13059-023-02895-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 03/10/2023] [Indexed: 03/26/2023] Open
Abstract
We present GenEra (https://github.com/josuebarrera/GenEra), a DIAMOND-fueled gene-family founder inference framework that addresses previously raised limitations and biases in genomic phylostratigraphy, such as homology detection failure. GenEra also reduces computational time from several months to a few days for any genome of interest. We analyze the emergence of taxonomically restricted gene families during major evolutionary transitions in plants, animals, and fungi. Our results indicate that the impact of homology detection failure on inferred patterns of gene emergence is lineage-dependent, suggesting that plants are more prone to evolve novelty through the emergence of new genes compared to animals and fungi.
Affiliation(s)
- Josué Barrera-Redondo
- Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Jaruwatana Sodai Lotharukpong
- Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Hajk-Georg Drost
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Susana M Coelho
- Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
7
Langthasa J, Mishra S, U M, Kalal R, Bhat R. Mutations in a set of ancient matrisomal glycoprotein genes across neoplasia predispose to disruption of morphogenetic transduction. Computational and Systems Oncology 2022. [DOI: 10.1002/cso2.1042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022] Open
Affiliation(s)
- Jimpi Langthasa
- Department of Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bengaluru, India
- Satyarthi Mishra
- Centre for Nano Science and Engineering, Indian Institute of Science, Bengaluru, India
- Monica U
- Department of Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bengaluru, India
- Ronak Kalal
- Department of Zoology, University College of Science, Mohanlal Sukhadia University, Udaipur, India
- Ramray Bhat
- Department of Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bengaluru, India
- Centre for BioSystems Science and Engineering, Indian Institute of Science, Bengaluru, India
8
Ma S, Skarica M, Li Q, Xu C, Risgaard RD, Tebbenkamp AT, Mato-Blanco X, Kovner R, Krsnik Ž, de Martin X, Luria V, Martí-Pérez X, Liang D, Karger A, Schmidt DK, Gomez-Sanchez Z, Qi C, Gobeske KT, Pochareddy S, Debnath A, Hottman CJ, Spurrier J, Teo L, Boghdadi AG, Homman-Ludiye J, Ely JJ, Daadi EW, Mi D, Daadi M, Marín O, Hof PR, Rasin MR, Bourne J, Sherwood CC, Santpere G, Girgenti MJ, Strittmatter SM, Sousa AM, Sestan N. Molecular and cellular evolution of the primate dorsolateral prefrontal cortex. Science 2022; 377:eabo7257. [PMID: 36007006 PMCID: PMC9614553 DOI: 10.1126/science.abo7257] [Citation(s) in RCA: 122] [Impact Index Per Article: 40.7] [Indexed: 12/12/2022]
Abstract
The granular dorsolateral prefrontal cortex (dlPFC) is an evolutionary specialization of primates that is centrally involved in cognition. We assessed more than 600,000 single-nucleus transcriptomes from adult human, chimpanzee, macaque, and marmoset dlPFC. Although most cell subtypes defined transcriptomically are conserved, we detected several that exist only in a subset of species as well as substantial species-specific molecular differences across homologous neuronal, glial, and non-neural subtypes. The latter are exemplified by human-specific switching between expression of the neuropeptide somatostatin and tyrosine hydroxylase, the rate-limiting enzyme in dopamine production in certain interneurons. The above molecular differences are also illustrated by expression of the neuropsychiatric risk gene FOXP2, which is human-specific in microglia and primate-specific in layer 4 granular neurons. We generated a comprehensive survey of the dlPFC cellular repertoire and its shared and divergent features in anthropoid primates.
Affiliation(s)
- Shaojie Ma
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Mario Skarica
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Qian Li
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Chuan Xu
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Ryan D. Risgaard
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Medical Scientist Training Program, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Xoel Mato-Blanco
- Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Rothem Kovner
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Željka Krsnik
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Croatian Institute for Brain Research, School of Medicine, University of Zagreb, 10000 Zagreb, Croatia
- Xabier de Martin
- Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Victor Luria
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Xavier Martí-Pérez
- Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Dan Liang
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA, USA
- Danielle K. Schmidt
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Zachary Gomez-Sanchez
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Cai Qi
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Kevin T. Gobeske
- Division of Neurocritical Care and Emergency Neurology, Department of Neurology, Yale School of Medicine, New Haven, CT 06510, USA
- Sirisha Pochareddy
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Ashwin Debnath
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Cade J. Hottman
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Joshua Spurrier
- Program in Cellular Neuroscience, Neurodegeneration and Repair, Department of Neurology, Yale School of Medicine, New Haven, CT 06536, USA
- Leon Teo
- Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
- Anthony G. Boghdadi
- Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
- Jihane Homman-Ludiye
- Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
- John J. Ely
- MAEBIOS, Alamogordo, NM 88310, USA
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, USA
- Etienne W. Daadi
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
- Da Mi
- Tsinghua-Peking Center for Life Sciences, IDG/McGovern Institute for Brain Research, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Marcel Daadi
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
- Department of Cell Systems & Anatomy, Radiology, Long School of Medicine, UT Health San Antonio
- NeoNeuron LLC, Palo Alto, CA 94306, USA
- Oscar Marín
- Centre for Developmental Neurobiology, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE1 1UL, UK
- MRC Centre for Neurodevelopmental Disorders, King’s College London, London SE1 1UL, UK
- Patrick R. Hof
- Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mladen-Roko Rasin
- Department of Neuroscience and Cell Biology, Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ 08854, USA
- James Bourne
- Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
- Chet C. Sherwood
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, USA
- Gabriel Santpere
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Matthew J. Girgenti
- Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA
- National Center for PTSD, US Department of Veterans Affairs, White River Junction, VT, USA
- Stephen M. Strittmatter
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Program in Cellular Neuroscience, Neurodegeneration and Repair, Department of Neurology, Yale School of Medicine, New Haven, CT 06536, USA
- Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- André M.M. Sousa
- Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
- Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA
- Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Departments of Genetics and Comparative Medicine, Program in Cellular Neuroscience, Neurodegeneration and Repair, and Yale Child Study Center, Yale School of Medicine, New Haven, CT 06510, USA
9
Eicholt LA, Aubel M, Berk K, Bornberg‐Bauer E, Lange A. Heterologous expression of naturally evolved putative de novo proteins with chaperones. Protein Sci 2022; 31:e4371. [PMID: 35900020 PMCID: PMC9278007 DOI: 10.1002/pro.4371] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/27/2022] [Revised: 05/03/2022] [Accepted: 05/14/2022] [Indexed: 11/23/2022]
Abstract
Over the past decade, evidence has accumulated that new protein-coding genes can emerge de novo from previously non-coding DNA. Most studies have focused on large scale computational predictions of de novo protein-coding genes across a wide range of organisms. In contrast, experimental data concerning the folding and function of de novo proteins are scarce. This might be due to difficulties in handling de novo proteins in vitro, as most are short and predicted to be disordered. Here, we propose a guideline for the effective expression of eukaryotic de novo proteins in Escherichia coli. We used 11 sequences from Drosophila melanogaster and 10 from Homo sapiens, that are predicted de novo proteins from former studies, for heterologous expression. The candidate de novo proteins have varying secondary structure and disorder content. Using multiple combinations of purification tags, E. coli expression strains, and chaperone systems, we were able to increase the number of solubly expressed putative de novo proteins from 30% to 62%. Our findings indicate that the best combination for expressing putative de novo proteins in E. coli is a GST-tag with T7 Express cells and co-expressed chaperones. We found that, overall, proteins with higher predicted disorder were easier to express. STATEMENT: Today, we know that proteins do not only evolve by duplication and divergence of existing proteins but also arise from previously non-coding DNA. These proteins are called de novo proteins. Their properties are still poorly understood and their experimental analysis faces major obstacles. Here, we aim to present a starting point for soluble expression of de novo proteins with the help of chaperones and thereby enable further characterization.
Affiliation(s)
- Lars A. Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Margaux Aubel
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Katrin Berk
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Erich Bornberg‐Bauer
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Max Planck‐Institute for Biology Tuebingen, Tübingen, Germany
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
10
Li F, Rane RV, Luria V, Xiong Z, Chen J, Li Z, Catullo RA, Griffin PC, Schiffer M, Pearce S, Lee SF, McElroy K, Stocker A, Shirriffs J, Cockerell F, Coppin C, Sgrò CM, Karger A, Cain JW, Weber JA, Santpere G, Kirschner MW, Hoffmann AA, Oakeshott JG, Zhang G. Phylogenomic analyses of the genus Drosophila reveals genomic signals of climate adaptation. Mol Ecol Resour 2022; 22:1559-1581. [PMID: 34839580 PMCID: PMC9299920 DOI: 10.1111/1755-0998.13561] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/09/2020] [Accepted: 11/10/2021] [Indexed: 01/13/2023]
Abstract
Many Drosophila species differ widely in their distributions and climate niches, making them excellent subjects for evolutionary genomic studies. Here, we have developed a database of high-quality assemblies for 46 Drosophila species and one closely related Zaprionus. Fifteen of the genomes were newly sequenced, and 20 were improved with additional sequencing. New or improved annotations were generated for all 47 species, assisted by new transcriptomes for 19. Phylogenomic analyses of these data resolved several previously ambiguous relationships, especially in the melanogaster species group. However, it also revealed significant phylogenetic incongruence among genes, mainly in the form of incomplete lineage sorting in the subgenus Sophophora but also including asymmetric introgression in the subgenus Drosophila. Using the phylogeny as a framework and taking into account these incongruences, we then screened the data for genome-wide signals of adaptation to different climatic niches. First, phylostratigraphy revealed relatively high rates of recent novel gene gain in three temperate pseudoobscura and five desert-adapted cactophilic mulleri subgroup species. Second, we found differing ratios of nonsynonymous to synonymous substitutions in several hundred orthologues between climate generalists and specialists, with trends for significantly higher ratios for those in tropical and lower ratios for those in temperate-continental specialists respectively than those in the climate generalists. Finally, resequencing natural populations of 13 species revealed tropics-restricted species generally had smaller population sizes, lower genome diversity and more deleterious mutations than the more widespread species. We conclude that adaptation to different climates in the genus Drosophila has been associated with large-scale and multifaceted genomic changes.
Affiliation(s)
- Fang Li
- BGI‐Shenzhen, Shenzhen, China
- Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rahul V. Rane
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Victor Luria
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Zijun Xiong
- BGI‐Shenzhen, Shenzhen, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Renee A. Catullo
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Division of Ecology and Evolution, Centre for Biodiversity Analysis, The Australian National University, Acton, ACT, Australia
- Philippa C. Griffin
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Michele Schiffer
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Daintree Rainforest Observatory, James Cook University, Cape Tribulation, Qld, Australia
- Stephen Pearce
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Siu Fai Lee
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
- Kerensa McElroy
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Ann Stocker
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Jennifer Shirriffs
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Fiona Cockerell
- School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Chris Coppin
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Carla M. Sgrò
- School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Amir Karger
- IT ‐ Research Computing, Harvard Medical School, Boston, Massachusetts, USA
- John W. Cain
- Department of Mathematics, Harvard University, Cambridge, Massachusetts, USA
- Jessica A. Weber
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Gabriel Santpere
- Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Marc W. Kirschner
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Ary A. Hoffmann
- Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- John G. Oakeshott
- Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
- Guojie Zhang
- BGI‐Shenzhen, Shenzhen, China
- Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
11
A Thermodynamic Model for Water Activity and Redox Potential in Evolution and Development. J Mol Evol 2022; 90:182-199. [DOI: 10.1007/s00239-022-10051-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2021] [Accepted: 02/22/2022] [Indexed: 10/18/2022]
12
Merényi Z, Virágh M, Gluck-Thaler E, Slot JC, Kiss B, Varga T, Geösel A, Hegedüs B, Bálint B, Nagy LG. Gene age shapes the transcriptional landscape of sexual morphogenesis in mushroom forming fungi (Agaricomycetes). eLife 2022; 11:71348. [PMID: 35156613 PMCID: PMC8893723 DOI: 10.7554/elife.71348] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/17/2021] [Accepted: 02/11/2022] [Indexed: 11/13/2022] Open
Abstract
Multicellularity has been one of the most important innovations in the history of life. The role of gene regulatory changes in driving transitions to multicellularity is being increasingly recognized; however, factors influencing gene expression patterns are poorly known in many clades. Here, we compared the developmental transcriptomes of complex multicellular fruiting bodies of eight Agaricomycetes and Cryptococcus neoformans, a closely related human pathogen with a simple morphology. In-depth analysis in Pleurotus ostreatus revealed that allele-specific expression, natural antisense transcripts, and developmental gene expression, but not RNA editing or a ‘developmental hourglass,’ act in concert to shape its transcriptome during fruiting body development. We found that transcriptional patterns of genes strongly depend on their evolutionary ages. Young genes showed more developmental and allele-specific expression variation, possibly because of weaker evolutionary constraint, suggestive of nonadaptive expression variance in fruiting bodies. These results prompted us to define a set of conserved genes specifically regulated only during complex morphogenesis by excluding young genes and accounting for deeply conserved ones shared with species showing simple sexual development. Analysis of the resulting gene set revealed evolutionary and functional associations with complex multicellularity, which allowed us to speculate they are involved in complex multicellular morphogenesis of mushroom fruiting bodies.
Affiliation(s)
- Zsolt Merényi
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Máté Virágh
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Emile Gluck-Thaler
- Department of Biology, University of Pennsylvania, Philadelphia, United States
- Jason C Slot
- Department of Plant Pathology, Ohio State University, Columbus, United States
- Brigitta Kiss
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Torda Varga
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- András Geösel
- Department of Vegetable and Mushroom Growing, Hungarian University of Agriculture and Life Sciences, Budapest, Hungary
- Botond Hegedüs
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- Balázs Bálint
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
- László G Nagy
- Synthetic and Systems Biology Unit, Biological Research Center, Szeged, Hungary
13
Xu X, Zhang QY, Chu XY, Quan Y, Lv BM, Zhang HY. Facilitating Antiviral Drug Discovery Using Genetic and Evolutionary Knowledge. Viruses 2021; 13:v13112117. [PMID: 34834924 PMCID: PMC8626054 DOI: 10.3390/v13112117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/26/2021] [Revised: 10/19/2021] [Accepted: 10/19/2021] [Indexed: 12/15/2022] Open
Abstract
Over the course of human history, billions of people worldwide have been infected by various viruses. Despite rapid progress in the development of biomedical techniques, it is still a significant challenge to find promising new antiviral targets and drugs. In the past, antiviral drugs mainly targeted viral proteins when they were used as part of treatment strategies. Since the virus mutation rate is much faster than that of the host, such drugs feature drug resistance and narrow-spectrum antiviral problems. Therefore, the targeting of host molecules has gradually become an important area of research for the development of antiviral drugs. In recent years, rapid advances in high-throughput sequencing techniques have enabled numerous genetic studies (such as genome-wide association studies (GWAS), clustered regularly interspaced short palindromic repeats (CRISPR) screening, etc.) for human diseases, providing valuable genetic and evolutionary resources. Furthermore, it has been revealed that successful drug targets exhibit similar genetic and evolutionary features, which are of great value in identifying promising drug targets and discovering new drugs. Considering these developments, in this article the authors propose a host-targeted antiviral drug discovery strategy based on knowledge of genetics and evolution. We first comprehensively summarized the genetic, subcellular location, and evolutionary features of the human genes that have been successfully used as antiviral targets. Next, the summarized features were used to screen novel druggable antiviral targets and to find potential antiviral drugs, in an attempt to promote the discovery of new antiviral drugs.
Affiliation(s)
- Qing-Ye Zhang
- Correspondence: (Q.-Y.Z.); (H.-Y.Z.); Tel.: +86-27-8728-0877 (H.-Y.Z.)
- Hong-Yu Zhang
- Correspondence: (Q.-Y.Z.); (H.-Y.Z.); Tel.: +86-27-8728-0877 (H.-Y.Z.)
14
Prensner JR, Enache OM, Luria V, Krug K, Clauser KR, Dempster JM, Karger A, Wang L, Stumbraite K, Wang VM, Botta G, Lyons NJ, Goodale A, Kalani Z, Fritchman B, Brown A, Alan D, Green T, Yang X, Jaffe JD, Roth JA, Piccioni F, Kirschner MW, Ji Z, Root DE, Golub TR. Noncanonical open reading frames encode functional proteins essential for cancer cell survival. Nat Biotechnol 2021; 39:697-704. [PMID: 33510483 PMCID: PMC8195866 DOI: 10.1038/s41587-020-00806-2] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Received: 02/18/2020] [Accepted: 12/16/2020] [Indexed: 01/30/2023]
Abstract
Although genomic analyses predict many noncanonical open reading frames (ORFs) in the human genome, it is unclear whether they encode biologically active proteins. Here we experimentally interrogated 553 candidates selected from noncanonical ORF datasets. Of these, 57 induced viability defects when knocked out in human cancer cell lines. Following ectopic expression, 257 showed evidence of protein expression and 401 induced gene expression changes. Clustered regularly interspaced short palindromic repeat (CRISPR) tiling and start codon mutagenesis indicated that their biological effects required translation as opposed to RNA-mediated effects. We found that one of these ORFs, G029442, renamed glycine-rich extracellular protein-1 (GREP1), encodes a secreted protein highly expressed in breast cancer, and its knockout in 263 cancer cell lines showed preferential essentiality in breast cancer-derived lines. The secretome of GREP1-expressing cells has an increased abundance of the oncogenic cytokine GDF15, and GDF15 supplementation mitigated the growth-inhibitory effect of GREP1 knockout. Our experiments suggest that noncanonical ORFs can express biologically active proteins that are potential therapeutic targets.
Affiliation(s)
- John R. Prensner
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115
- Oana M. Enache
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Victor Luria
- Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Karsten Krug
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Karl R. Clauser
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA, USA, 02115
- Li Wang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Vickie M. Wang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Ginevra Botta
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Amy Goodale
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Zohra Kalani
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Adam Brown
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Douglas Alan
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Thomas Green
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Xiaoping Yang
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Jacob D. Jaffe
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Inzen Therapeutics, Cambridge, MA, 02139, USA
- Federica Piccioni
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Merck Research Laboratories, Boston, MA, 02115, USA
- Marc W. Kirschner
- Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
- Zhe Ji
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611; Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL 60628
- David E. Root
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- Todd R. Golub
- Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115; Corresponding author. Address correspondence to: Todd R. Golub, MD, Chief Scientific Officer, Broad Institute of Harvard and MIT, Room 4013, 415 Main Street, Cambridge, MA, 02142, Phone: 617-714-7050
15
Lange A, Patel PH, Heames B, Damry AM, Saenger T, Jackson CJ, Findlay GD, Bornberg-Bauer E. Structural and functional characterization of a putative de novo gene in Drosophila. Nat Commun 2021; 12:1667. [PMID: 33712569 PMCID: PMC7954818 DOI: 10.1038/s41467-021-21667-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 01/09/2020] [Accepted: 02/03/2021] [Indexed: 11/26/2022] Open
Abstract
Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard's orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard's structure appears to have been maintained with only minor changes over millions of years.
Affiliation(s)
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA, USA
- Brennen Heames
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Adam M Damry
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Thorsten Saenger
- Department of Pediatric Kidney, Liver and Metabolic Diseases, Hannover Medical School, Hannover, Germany
- Colin J Jackson
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
16
Natsidis P, Kapli P, Schiffer PH, Telford MJ. Systematic errors in orthology inference and their effects on evolutionary analyses. iScience 2021; 24:102110. [PMID: 33659875 PMCID: PMC7892920 DOI: 10.1016/j.isci.2021.102110] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/23/2020] [Revised: 01/03/2021] [Accepted: 01/21/2021] [Indexed: 01/13/2023] Open
Abstract
The availability of complete sets of genes from many organisms makes it possible to identify genes unique to (or lost from) certain clades. This information is used to reconstruct phylogenetic trees; identify genes involved in the evolution of clade specific novelties; and for phylostratigraphy (identifying ages of genes in a given species). These investigations rely on accurately predicted orthologs. Here we use simulation to produce sets of orthologs that experience no gains or losses. We show that errors in identifying orthologs increase with higher rates of evolution. We use the predicted sets of orthologs, with errors, to reconstruct phylogenetic trees; to count gains and losses; and for phylostratigraphy. Our simulated data, containing information only from errors in orthology prediction, closely recapitulate findings from empirical data. We suggest published downstream analyses must be informed to a large extent by errors in orthology prediction that mimic expected patterns of gene evolution.
- Presence of shared orthologs across species is used for evolutionary analyses
- We simulated realistic sets of orthologs with no gains or losses
- Errors predicting shared orthologs correlate with phylogenetic relationships
- Presence/absence datasets based on errors recapitulate findings from empirical data
Affiliation(s)
- Paschalis Natsidis
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Ecology, University College London, London WC1E 6BT, UK
- Paschalia Kapli
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Ecology, University College London, London WC1E 6BT, UK
- Philipp H Schiffer
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Ecology, University College London, London WC1E 6BT, UK
- Maximilian J Telford
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Ecology, University College London, London WC1E 6BT, UK
17
Uncovering de novo gene birth in yeast using deep transcriptomics. Nat Commun 2021; 12:604. [PMID: 33504782 PMCID: PMC7841160 DOI: 10.1038/s41467-021-20911-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 12/17/2019] [Accepted: 01/04/2021] [Indexed: 01/30/2023] Open
Abstract
De novo gene origination has been recently established as an important mechanism for the formation of new genes. In organisms with a large genome, intergenic and intronic regions provide plenty of raw material for new transcriptional events to occur, but little is known about how de novo transcripts originate in more densely-packed genomes. Here, we identify 213 de novo originated transcripts in Saccharomyces cerevisiae using deep transcriptomics and genomic synteny information from multiple yeast species grown in two different conditions. We find that about half of the de novo transcripts are expressed from regions which already harbor other genes in the opposite orientation; these transcripts show similar expression changes in response to stress as their overlapping counterparts, and some appear to translate small proteins. Thus, a large fraction of de novo genes in yeast are likely to co-evolve with already existing genes.
18
James JE, Willis SM, Nelson PG, Weibel C, Kosinski LJ, Masel J. Universal and taxon-specific trends in protein sequences as a function of age. eLife 2021; 10:e57347. [PMID: 33416492 PMCID: PMC7819706 DOI: 10.7554/elife.57347] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/28/2020] [Accepted: 01/05/2021] [Indexed: 01/12/2023] Open
Abstract
Extant protein-coding sequences span a huge range of ages, from those that emerged only recently to those present in the last universal common ancestor. Because evolution has had less time to act on young sequences, there might be 'phylostratigraphy' trends in any properties that evolve slowly with age. A long-term reduction in hydrophobicity and hydrophobic clustering was found in previous, taxonomically restricted studies. Here we perform integrated phylostratigraphy across 435 fully sequenced species, using sensitive HMM methods to detect protein domain homology. We find that the reduction in hydrophobic clustering is universal across lineages. However, only young animal domains have a tendency to have higher structural disorder. Among ancient domains, trends in amino acid composition reflect the order of recruitment into the genetic code, suggesting that the composition of the contemporary descendants of ancient sequences reflects amino acid availability during the earliest stages of life, when these sequences first emerged.
Affiliation(s)
- Jennifer E James
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Sara M Willis
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Paul G Nelson
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Catherine Weibel
- Department of Physics, University of Arizona, Tucson, United States
- Department of Mathematics, University of Arizona, Tucson, United States
- Luke J Kosinski
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
- Joanna Masel
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
19
Futo M, Opašić L, Koska S, Čorak N, Široki T, Ravikumar V, Thorsell A, Lenuzzi M, Kifer D, Domazet-Lošo M, Vlahoviček K, Mijakovic I, Domazet-Lošo T. Embryo-Like Features in Developing Bacillus subtilis Biofilms. Mol Biol Evol 2021; 38:31-47. [PMID: 32871001 PMCID: PMC7783165 DOI: 10.1093/molbev/msaa217] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/11/2022] Open
Abstract
Correspondence between evolution and development has been discussed for more than two centuries. Recent work reveals that phylogeny-ontogeny correlations are indeed present in developmental transcriptomes of eukaryotic clades with complex multicellularity. Nevertheless, it has been largely ignored that the pervasive presence of phylogeny-ontogeny correlations is a hallmark of development in eukaryotes. This perspective opens a possibility to look for similar parallelisms in biological settings where developmental logic and multicellular complexity are more obscure. For instance, it has been increasingly recognized that multicellular behavior underlies biofilm formation in bacteria. However, it remains unclear whether bacterial biofilm growth shares some basic principles with development in complex eukaryotes. Here we show that the ontogeny of growing Bacillus subtilis biofilms recapitulates phylogeny at the expression level. Using time-resolved transcriptome and proteome profiles, we found that biofilm ontogeny correlates with the evolutionary measures, in a way that evolutionary younger and more diverged genes were increasingly expressed toward later timepoints of biofilm growth. Molecular and morphological signatures also revealed that biofilm growth is highly regulated and organized into discrete ontogenetic stages, analogous to those of eukaryotic embryos. Together, this suggests that biofilm formation in Bacillus is a bona fide developmental process comparable to organismal development in animals, plants, and fungi. Given that most cells on Earth reside in the form of biofilms and that biofilms represent the oldest known fossils, we anticipate that the widely adopted vision of the first life as a single-cell and free-living organism needs rethinking.
Affiliation(s)
- Momir Futo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Luka Opašić
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Department for Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Sara Koska
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Nina Čorak
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Tin Široki
- Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
- Vaishnavi Ravikumar
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby, Denmark
- Annika Thorsell
- Proteomics Core Facility, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Maša Lenuzzi
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Domagoj Kifer
- Faculty of Pharmacy and Biochemistry, University of Zagreb, Zagreb, Croatia
- Mirjana Domazet-Lošo
- Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
- Kristian Vlahoviček
- Bioinformatics Group, Division of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- School of Biosciences, University of Skövde, Skövde, Sweden
- Ivan Mijakovic
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby, Denmark
- Systems and Synthetic Biology Division, Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Tomislav Domazet-Lošo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Catholic University of Croatia, Zagreb, Croatia
20
Dowling D, Schmitz JF, Bornberg-Bauer E. Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage. Genome Biol Evol 2020; 12:2183-2195. [PMID: 33210146 PMCID: PMC7674706 DOI: 10.1093/gbe/evaa194] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Accepted: 09/12/2020] [Indexed: 12/12/2022] Open
Abstract
In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity, which have been proposed to play a role in survival of de novo genes, remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
Affiliation(s)
- Daniel Dowling
- Institute for Evolution and Biodiversity, University of Münster, Germany
- Jonathan F Schmitz
- Institute for Evolution and Biodiversity, University of Münster, Germany
21
Weisman CM, Murray AW, Eddy SR. Many, but not all, lineage-specific genes can be explained by homology detection failure. PLoS Biol 2020; 18:e3000862. [PMID: 33137085 PMCID: PMC7660931 DOI: 10.1371/journal.pbio.3000862] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 02/27/2020] [Revised: 11/12/2020] [Accepted: 09/21/2020] [Indexed: 12/21/2022] Open
Abstract
Genes for which homologs can be detected only in a limited group of evolutionarily related species, called “lineage-specific genes,” are pervasive: Essentially every lineage has them, and they often comprise a sizable fraction of the group’s total genes. Lineage-specific genes are often interpreted as “novel” genes, representing genetic novelty born anew within that lineage. Here, we develop a simple method to test an alternative null hypothesis: that lineage-specific genes do have homologs outside of the lineage that, even while evolving at a constant rate in a novelty-free manner, have merely become undetectable by search algorithms used to infer homology. We show that this null hypothesis is sufficient to explain the lack of detected homologs of a large number of lineage-specific genes in fungi and insects. However, we also find that a minority of lineage-specific genes in both clades are not well explained by this novelty-free model. The method provides a simple way of identifying which lineage-specific genes call for special explanations beyond homology detection failure, highlighting them as interesting candidates for further study. Lineage-specific gene families may arise from evolutionary innovations such as de novo gene origination, or may simply mean that a similarity search program failed to identify more distant homologs. A new computational method for modeling the expected decay of similarity search scores with evolutionary distance allows distinction between the two explanations.
Affiliation(s)
- Caroline M. Weisman
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Andrew W. Murray
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Sean R. Eddy
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts, United States of America
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America
22
Osuna-Cruz CM, Bilcke G, Vancaester E, De Decker S, Bones AM, Winge P, Poulsen N, Bulankova P, Verhelst B, Audoor S, Belisova D, Pargana A, Russo M, Stock F, Cirri E, Brembu T, Pohnert G, Piganeau G, Ferrante MI, Mock T, Sterck L, Sabbe K, De Veylder L, Vyverman W, Vandepoele K. The Seminavis robusta genome provides insights into the evolutionary adaptations of benthic diatoms. Nat Commun 2020; 11:3320. [PMID: 32620776 PMCID: PMC7335047 DOI: 10.1038/s41467-020-17191-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 02/11/2020] [Accepted: 06/12/2020] [Indexed: 12/15/2022] Open
Abstract
Benthic diatoms are the main primary producers in shallow freshwater and coastal environments, fulfilling important ecological functions such as nutrient cycling and sediment stabilization. However, little is known about their evolutionary adaptations to these highly structured but heterogeneous environments. Here, we report a reference genome for the marine biofilm-forming diatom Seminavis robusta, showing that gene family expansions are responsible for a quarter of all 36,254 protein-coding genes. Tandem duplications play a key role in extending the repertoire of specific gene functions, including light and oxygen sensing, which are probably central for its adaptation to benthic habitats. Genes differentially expressed during interactions with bacteria are strongly conserved in other benthic diatoms while many species-specific genes are strongly upregulated during sexual reproduction. Combined with re-sequencing data from 48 strains, our results offer insights into the genetic diversity and gene functions in benthic diatoms.
Affiliation(s)
- Cristina Maria Osuna-Cruz
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
- Gust Bilcke
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000, Ghent, Belgium
- Emmelien Vancaester
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
- Sam De Decker
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Atle M Bones
  - Cell Molecular Biology and Genomics Group, Department of Biology, Norwegian University of Science and Technology, 7491, Trondheim, Norway
- Per Winge
  - Cell Molecular Biology and Genomics Group, Department of Biology, Norwegian University of Science and Technology, 7491, Trondheim, Norway
- Nicole Poulsen
  - B CUBE Center for Molecular Bioengineering, Technical University of Dresden, Tatzberg 41, 01307, Dresden, Germany
- Petra Bulankova
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Bram Verhelst
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Sien Audoor
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Darja Belisova
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Aikaterini Pargana
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Monia Russo
  - Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, Italy
- Frederike Stock
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Emilio Cirri
  - Friedrich Schiller University Jena, Institute of Inorganic and Analytical Chemistry, Lessingstrasse 8, 07745, Jena, Germany
- Tore Brembu
  - Cell Molecular Biology and Genomics Group, Department of Biology, Norwegian University of Science and Technology, 7491, Trondheim, Norway
- Georg Pohnert
  - Friedrich Schiller University Jena, Institute of Inorganic and Analytical Chemistry, Lessingstrasse 8, 07745, Jena, Germany
- Gwenael Piganeau
  - Sorbonne Université, CNRS, UMR 7232 Biologie Intégrative des Organismes Marins BIOM, Observatoire Océanologique, F-66650, Banyuls-sur-Mer, France
- Thomas Mock
  - School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
- Lieven Sterck
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Koen Sabbe
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Lieven De Veylder
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Wim Vyverman
  - Protistology and Aquatic Ecology, Department of Biology, Ghent University, 9000, Ghent, Belgium
- Klaas Vandepoele
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052, Ghent, Belgium
23
Wang J, Zhang L, Lian S, Qin Z, Zhu X, Dai X, Huang Z, Ke C, Zhou Z, Wei J, Liu P, Hu N, Zeng Q, Dong B, Dong Y, Kong D, Zhang Z, Liu S, Xia Y, Li Y, Zhao L, Xing Q, Huang X, Hu X, Bao Z, Wang S. Evolutionary transcriptomics of metazoan biphasic life cycle supports a single intercalation origin of metazoan larvae. Nat Ecol Evol 2020; 4:725-736. [PMID: 32203475 DOI: 10.1038/s41559-020-1138-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 04/22/2019] [Accepted: 02/06/2020] [Indexed: 12/16/2022]
Abstract
The transient larva-bearing biphasic life cycle is the hallmark of many metazoan phyla, but how metazoan larvae originated remains a major enigma in animal evolution. There are two hypotheses for larval origin. The 'larva-first' hypothesis suggests that the first metazoans were similar to extant larvae, with later evolution of the adult-added biphasic life cycle; the 'adult-first' hypothesis suggests that the first metazoans were adult forms, with the biphasic life cycle arising later via larval intercalation. Here, we investigate the evolutionary origin of primary larvae by conducting ontogenetic transcriptome profiling for Mollusca, the largest marine phylum, which is characterized by a trochophore larval stage and highly variable adult forms. We reveal that trochophore larvae exhibit rapid transcriptome evolution with extraordinary incorporation of novel genes (potentially contributing to adult shell evolution), and that cell signalling/communication genes (for example, caveolin and innexin) are probably crucial for larval evolution. Transcriptome age analysis of eight metazoan species reveals the wide presence of young larval transcriptomes in both trochozoans and other major metazoan lineages, therefore arguing against the prevailing larva-first hypothesis. Our findings support an adult-first evolutionary scenario with a single metazoan larval intercalation, and suggest that the first appearance of proto-larva probably occurred after the divergence of direct-developing Ctenophora from a metazoan ancestor.
Affiliation(s)
- Jing Wang
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Lingling Zhang
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Shanshan Lian
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Zhenkui Qin
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Xuan Zhu
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Xiaoting Dai
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Zekun Huang
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
- Caihuan Ke
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
- Zunchun Zhou
  - Liaoning Key Lab of Marine Fishery Molecular Biology, Liaoning Ocean and Fisheries Science Research Institute, Dalian, China
- Jiankai Wei
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Pingping Liu
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Naina Hu
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Qifan Zeng
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Bo Dong
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Ying Dong
  - Liaoning Key Lab of Marine Fishery Molecular Biology, Liaoning Ocean and Fisheries Science Research Institute, Dalian, China
- Dexu Kong
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Zhifeng Zhang
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Sinuo Liu
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Yu Xia
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Yangping Li
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Liang Zhao
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Qiang Xing
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Xiaoting Huang
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Xiaoli Hu
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Zhenmin Bao
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
- Shi Wang
  - MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Laboratory for Marine Biology and Biotechnology, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
  - The Sars-Fang Centre, Ocean University of China, Qingdao, China
24
Vakirlis N, Carvunis AR, McLysaght A. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes. eLife 2020; 9:e53500. [PMID: 32066524 PMCID: PMC7028367 DOI: 10.7554/elife.53500] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 11/11/2019] [Accepted: 01/07/2020] [Indexed: 12/20/2022] [Open access]
Abstract
The origin of 'orphan' genes, species-specific sequences that lack detectable homologues, has remained mysterious since the dawn of the genomic era. There are two dominant explanations for orphan genes: complete sequence divergence from ancestral genes, such that homologues are not readily detectable; and de novo emergence from ancestral non-genic sequences, such that homologues genuinely do not exist. The relative contribution of the two processes remains unknown. Here, we harness the special circumstance of conserved synteny to estimate the contribution of complete divergence to the pool of orphan genes. By separately comparing yeast, fly and human genes to related taxa using conservative criteria, we find that complete divergence accounts, on average, for at most a third of eukaryotic orphan and taxonomically restricted genes. We observe that complete divergence occurs at a stable rate within a phylum but at different rates between phyla, and is frequently associated with gene shortening akin to pseudogenization.
Affiliation(s)
- Nikolaos Vakirlis
  - Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Anne-Ruxandra Carvunis
  - Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, United States
- Aoife McLysaght
  - Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
25
Xie C, Bekpen C, Künzel S, Keshavarz M, Krebs-Wheaton R, Skrabar N, Ullrich KK, Tautz D. A de novo evolved gene in the house mouse regulates female pregnancy cycles. eLife 2019; 8:e44392. [PMID: 31436535 PMCID: PMC6760900 DOI: 10.7554/elife.44392] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/13/2018] [Accepted: 08/21/2019] [Indexed: 12/16/2022] [Open access]
Abstract
The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.
Different species have specific genes that set them apart from other species. Yet exactly how these species-specific genes originate is not fully known. The traditional view is that existing old genes are duplicated to make a ‘spare’ copy, which can change through mutations into a new gene with a new role gradually over time. Despite there being lots of evidence supporting this theory, not all new genes found in recent years can be traced back to older genes. This led to an alternative view – that recently evolved genes can also appear ‘de novo’, and come from regions of random DNA sequences that did not previously code for a protein. So far, the possibility of genes forming de novo during evolution has largely been supported by comparing and analyzing the genomes of related species. However, very little is known about the biological role these de novo genes play. Now, Xie et al. have generated a list of recently evolved de novo mouse genes, and carried out a detailed analysis of one de novo gene expressed in females at the time when embryos implant into the uterus wall. To study the role of this gene, Xie et al. created a strain of knock-out mice that have a defunct version of the protein coded by the gene. Loss of this protein caused female mice to have their second litter after a shorter period of time and increased the likelihood that female mice would terminate their newborn pups. This suggests that this newly discovered de novo gene is involved in regulating the female reproductive cycles of mice. Further analysis showed that this de novo gene counteracts the action of an older gene that promotes the implantation of embryos. This gene has therefore likely evolved due to the benefit it offers mothers, as it protects them from experiencing the increased physiological stress caused by a premature second pregnancy. These findings support the idea that genes which have evolved de novo can have an essential biological purpose despite coming from random DNA sequences. This establishes that de novo evolution of genes is the second major mechanism of how new genes with significant biological roles can form in the genome.
Affiliation(s)
- Chen Xie
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Cemalettin Bekpen
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Sven Künzel
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Maryam Keshavarz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Rebecca Krebs-Wheaton
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Neva Skrabar
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Kristian Karsten Ullrich
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Diethard Tautz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
26
Hilgers L, Hartmann S, Hofreiter M, von Rintelen T. Novel Genes, Ancient Genes, and Gene Co-Option Contributed to the Genetic Basis of the Radula, a Molluscan Innovation. Mol Biol Evol 2018; 35:1638-1652. [PMID: 29672732 PMCID: PMC5995198 DOI: 10.1093/molbev/msy052] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 12/25/2022] [Open access]
Abstract
The radula is the central foraging organ and apomorphy of the Mollusca. However, in contrast to other innovations, including the mollusk shell, genetic underpinnings of radula formation remain virtually unknown. Here, we present the first radula formative tissue transcriptome using the viviparous freshwater snail Tylomelania sarasinorum and compare it to foot tissue and the shell-building mantle of the same species. We combine differential expression, functional enrichment, and phylostratigraphic analyses to identify both specific and shared genetic underpinnings of the three tissues as well as their dominant functions and evolutionary origins. Gene expression of radula formative tissue is very distinct, but nevertheless more similar to mantle than to foot. Generally, the genetic bases of both radula and shell formation were shaped by novel orchestration of preexisting genes and continuous evolution of novel genes. A significantly increased proportion of radula-specific genes originated since the origin of stem-mollusks, indicating that novel genes were especially important for radula evolution. Genes with radula-specific expression in our study are frequently also expressed during the formation of other lophotrochozoan hard structures, like chaetae (hes1, arx), spicules (gbx), and shells of mollusks (gbx, heph) and brachiopods (heph), suggesting gene co-option for hard structure formation. Finally, a Lophotrochozoa-specific chitin synthase with a myosin motor domain (CS-MD), which is expressed during mollusk and brachiopod shell formation, had radula-specific expression in our study. CS-MD potentially facilitated the construction of complex chitinous structures and points at the potential of molecular novelties to promote the evolution of different morphological innovations.
Affiliation(s)
- Leon Hilgers (corresponding author)
  - Museum für Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
  - Adaptive Evolutionary Genomics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Stefanie Hartmann
  - Adaptive Evolutionary Genomics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Michael Hofreiter
  - Adaptive Evolutionary Genomics Department, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Thomas von Rintelen
  - Museum für Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
27
Affiliation(s)
- Stephen Branden Van Oss
  - Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
- Anne-Ruxandra Carvunis
  - Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
28
Zhang L, Tan Y, Fan S, Zhang X, Zhang Z. Phylostratigraphic analysis of gene co-expression network reveals the evolution of functional modules for ovarian cancer. Sci Rep 2019; 9:2623. [PMID: 30796309 PMCID: PMC6384884 DOI: 10.1038/s41598-019-40023-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/29/2018] [Accepted: 01/23/2019] [Indexed: 01/06/2023] [Open access]
Abstract
Ovarian cancer (OV) is an extremely lethal disease. However, the evolutionary machineries of OV are still largely unknown. Here, we used a method that combines phylostratigraphy information with gene co-expression networks to extensively study the evolutionary compositions of OV. The present co-expression network construction yielded 18,549 nodes and 114,985 edges based on 307 OV expression samples obtained from the Genome Data Analysis Centers database. A total of 20 modules were identified as OV-related clusters. The human genome sequences were divided into 19 phylostrata (PS); the majority (67.45%) of OV genes were already present in the eukaryotic ancestor. There were two strong peaks of the emergence of OV genes screened by hypergeometric test: the evolution of the multicellular metazoan organisms (PS5 and PS6, P value = 0.002) and the emergence of bony fish (PS11 and PS12, P value = 0.009). Hence, the origin of OV is far earlier than its emergence. The integrated analysis of the topology of OV modules and the phylogenetic data revealed an evolutionary pattern of OV in human, namely, OV modules have arisen step by step during the evolution of the respective lineages. New genes have evolved and become locked into a pathway, where more and more biological pathways are fixed into OV modules by recruiting new genes during human evolution.
Affiliation(s)
- Luoyan Zhang
  - Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
- Yi Tan
  - Qilu Cell Therapy Technology Co., Ltd, Jinan, 250000, Shandong, China
- Shoujin Fan
  - Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
- Xuejie Zhang
  - Key Lab of Plant Stress Research, College of Life Science, Shandong Normal University, Jinan, 250014, Shandong, China
- Zhen Zhang
  - Laboratory for Molecular Immunology, Institute of Basic Medicine, Shandong Academy of Medical Sciences, Jinan, 250062, Shandong, China
29
Vakirlis N, Hebert AS, Opulente DA, Achaz G, Hittinger CT, Fischer G, Coon JJ, Lafontaine I. A Molecular Portrait of De Novo Genes in Yeasts. Mol Biol Evol 2018; 35:631-645. [PMID: 29220506 DOI: 10.1093/molbev/msx315] [Citation(s) in RCA: 82] [Impact Index Per Article: 13.7] [Indexed: 12/13/2022] [Open access]
Abstract
New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
Affiliation(s)
- Nikolaos Vakirlis
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative UMR7238, 75005 Paris, France
- Alex S Hebert
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
- Dana A Opulente
  - Laboratory of Genetics, Genome Center of Wisconsin, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI
- Guillaume Achaz
  - Atelier de BioInformatique, ISyEB UMR7205 Muséum National d'Histoire Naturelle, Paris, France
  - SMILE Group, CIRB UMR7241, Collège de France, Paris, France
- Chris Todd Hittinger
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
  - Laboratory of Genetics, Genome Center of Wisconsin, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI
- Gilles Fischer
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative UMR7238, 75005 Paris, France
- Joshua J Coon
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI
  - DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI
  - Morgridge Institute for Research, Madison, WI
- Ingrid Lafontaine
  - Atelier de BioInformatique, ISyEB UMR7205 Muséum National d'Histoire Naturelle, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, Institut de Biologie Physico-Chimique, Physiologie Membranaire et Moléculaire du Chloroplaste UMR7141, 75005 Paris, France
30
Lu GA, Zhao Y, Yang H, Lan A, Shi S, Liufu Z, Huang Y, Tang T, Xu J, Shen X, Wu CI. Death of new microRNA genes in Drosophila via gradual loss of fitness advantages. Genome Res 2018; 28:1309-1318. [PMID: 30049791 PMCID: PMC6120634 DOI: 10.1101/gr.233809.117] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 12/19/2017] [Accepted: 07/20/2018] [Indexed: 01/23/2023]
Abstract
The prevalence of de novo coding genes is controversial due to length and coding constraints. Noncoding genes, especially small ones, are freer to evolve de novo by comparison. The best examples are microRNAs (miRNAs), a large class of regulatory molecules ∼22 nt in length. Here, we study six de novo miRNAs in Drosophila, which, like most new genes, are testis-specific. We ask how and why de novo genes die because gene death must be sufficiently frequent to balance the many new births. By knocking out each miRNA gene, we analyzed their contributions to the nine components of male fitness (sperm production, length, and competitiveness, among others). To our surprise, the knockout mutants often perform better than the wild type in some components, and slightly worse in others. When two of the younger miRNAs are assayed in long-term laboratory populations, their total fitness contributions are found to be essentially zero. These results collectively suggest that adaptive de novo genes die regularly, not due to the loss of functionality, but due to the canceling out of positive and negative fitness effects, which may be characterized as "quasi-neutrality." Since de novo genes often emerge adaptively and become lost later, they reveal ongoing period-specific adaptations, reminiscent of the "Red-Queen" metaphor for long-term evolution.
Affiliation(s)
- Guang-An Lu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yixin Zhao
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Hao Yang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Ao Lan
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Suhua Shi
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Zhongqi Liufu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yumei Huang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Tian Tang
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Jin Xu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, California 94305, USA
- Xu Shen
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Chung-I Wu
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA
31
Moyers BA, Zhang J. Toward Reducing Phylostratigraphic Errors and Biases. Genome Biol Evol 2018; 10:2037-2048. [PMID: 30060201 PMCID: PMC6105108 DOI: 10.1093/gbe/evy161] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Accepted: 07/28/2018] [Indexed: 01/03/2023] Open
Abstract
Phylostratigraphy is a method for estimating gene age, usually applied to large numbers of genes in order to detect nonrandom age-distributions of gene properties that could shed light on mechanisms of gene origination and evolution. However, phylostratigraphy underestimates gene age with a nonnegligible probability. The underestimation is severer for genes with certain properties, creating spurious age distributions of these properties and those correlated with these properties. Here we explore three strategies to reduce phylostratigraphic error/bias. First, we test several alternative homology detection methods (PSIBLAST, HMMER, PHMMER, OMA, and GLAM2Scan) in phylostratigraphy, but fail to find any that noticeably outperforms the commonly used BLASTP. Second, using machine learning, we look for predictors of error-prone genes to exclude from phylostratigraphy, but cannot identify reliable predictors. Finally, we remove from phylostratigraphic analysis genes exhibiting errors in simulation, which by definition minimizes error/bias if the simulation is sufficiently realistic. Using this last approach, we show that some previously reported phylostratigraphic trends (e.g., younger proteins tend to evolve more rapidly and be shorter) disappear or even reverse, reconfirming the necessity of controlling phylostratigraphic error/bias. Taken together, our analyses demonstrate that phylostratigraphic errors/biases are refractory to several potential solutions but can be controlled at least partially by the exclusion of error-prone genes identified via realistic simulations. These results are expected to stimulate the judicious use of error-aware phylostratigraphy and reevaluation of previous phylostratigraphic findings.
Affiliation(s)
- Bryan A Moyers
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan
32
Abstract
De novo genes are very important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. Here, we identified 84 de novo genes in S. cerevisiae S288C since the divergence with their sister groups. Transcriptome and ribosome profiling data revealed at least 8 (10%) and 28 (33%) de novo genes being expressed and translated only under specific conditions, respectively. DNA microarray data, based on 2-fold change, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that some factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we also provide evidence suggesting the possible parallel origin of a de novo gene between S. cerevisiae and Saccharomyces paradoxus. Together, our study provides several new insights into the origin and spread of de novo genes.
Emergence of de novo genes has occurred in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we have searched for de novo genes from Saccharomyces cerevisiae S288C using rigorous methods, which reduced the effects of bad annotation and genomic gaps on the identification of de novo genes. Through this analysis, we have found 84 new genes originating de novo from previously noncoding regions, 87% of which are very likely involved in various biological processes. We noticed that 10% and 33% of de novo genes were only expressed and translated under specific conditions, therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene.
33
Abstract
Phylostratigraphy, originally designed for gene age estimation by BLAST-based protein homology searches of sequenced genomes, has been widely used for studying patterns and inferring mechanisms of gene origination and evolution. We previously showed by computer simulation that phylostratigraphy underestimates gene age for a nonnegligible fraction of genes and that the underestimation is severer for genes with certain properties such as fast evolution and short protein sequences. Consequently, many previously reported age distributions of gene properties may have been methodological artifacts rather than biological realities. Domazet-Lošo and colleagues recently argued that our simulations were flawed and that phylostratigraphic bias does not impact inferences about gene emergence and evolution. Here we discuss conceptual difficulties of phylostratigraphy, identify numerous problems in Domazet-Lošo et al.’s argument, reconfirm phylostratigraphic error using simulations suggested by Domazet-Lošo and colleagues, and demonstrate that a phylostratigraphic trend claimed to be robust to error disappears when genes likely to be error-resistant are analyzed. We conclude that extreme caution is needed in interpreting phylostratigraphic results because of the inherent biases of the method and that reanalysis using genes exhibiting no error in realistic simulations may help reduce spurious findings.
Affiliation(s)
- Bryan A Moyers
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan
34
Lei L, Steffen JG, Osborne EJ, Toomajian C. Plant organ evolution revealed by phylotranscriptomics in Arabidopsis thaliana. Sci Rep 2017; 7:7567. [PMID: 28790409 PMCID: PMC5548721 DOI: 10.1038/s41598-017-07866-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/21/2017] [Accepted: 07/04/2017] [Indexed: 11/18/2022] Open
Abstract
The evolution of phenotypes occurs through changes both in protein sequence and gene expression levels. Though much of plant morphological evolution can be explained by changes in gene expression, examining its evolution has challenges. To gain a new perspective on organ evolution in plants, we applied a phylotranscriptomics approach. We combined a phylostratigraphic approach with gene expression based on the strand-specific RNA-seq data from seedling, floral bud, and root of 19 Arabidopsis thaliana accessions to examine the age and sequence divergence of transcriptomes from these organs and how they adapted over time. Our results indicate that, among the sense and antisense transcriptomes of these organs, the sense transcriptomes of seedlings are the evolutionarily oldest across all accessions and are the most conserved in amino acid sequence for most accessions. In contrast, among the sense transcriptomes from these same organs, those from floral bud are evolutionarily youngest and least conserved in sequence for most accessions. Different organs have adaptive peaks at different stages in their evolutionary history; however, all three show a common adaptive signal from the Magnoliophyta to Brassicale stage. Our research highlights how phylotranscriptomic analyses can be used to trace organ evolution in the deep history of plant species.
Affiliation(s)
- Li Lei
  - Kansas State University, Department of Plant Pathology, Manhattan, KS, 66506, USA
- Joshua G Steffen
  - Colby-Sawyer College, Natural Sciences Department, New London, NH, 03257, USA
- Edward J Osborne
  - University of Utah, Department of Biology, Salt Lake City, UT, 84111, USA
35
Abstract
The phenomenon of de novo gene birth from junk DNA is surprising, because random polypeptides are expected to be toxic. There are two conflicting views about how de novo gene birth is nevertheless possible: the continuum hypothesis invokes a gradual gene birth process, while the preadaptation hypothesis predicts that young genes will show extreme levels of gene-like traits. We show that intrinsic structural disorder conforms to the predictions of the preadaptation hypothesis and falsifies the continuum hypothesis, with all genes having higher levels than translated junk DNA, but young genes having the highest level of all. Results are robust to homology detection bias, to the non-independence of multiple members of the same gene family, and to the false positive annotation of protein-coding genes.
36
Schmitz JF, Bornberg-Bauer E. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA. F1000Res 2017; 6:57. [PMID: 28163910 PMCID: PMC5247788 DOI: 10.12688/f1000research.10079.1] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Accepted: 01/17/2017] [Indexed: 12/31/2022] Open
Abstract
Over the last few years, there has been an increasing amount of evidence for the de novo emergence of protein-coding genes, i.e. out of non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data with very short divergence times which could help precisely pin down the location of origin of a de novo gene. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as the detailed study of the emergence of functional de novo protein-coding genes.
Affiliation(s)
- Jonathan F Schmitz
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
37
High expression of new genes in trochophore enlightening the ontogeny and evolution of trochozoans. Sci Rep 2016; 6:34664. [PMID: 27698463 PMCID: PMC5048140 DOI: 10.1038/srep34664] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 02/23/2016] [Accepted: 09/19/2016] [Indexed: 11/08/2022] Open
Abstract
Animals with trochophore larvae belong to Trochozoa, one of the main branches of Bilateria. In addition to exhibiting spiral cleavage and early cell fate determination, trochozoans typically undergo indirect development, which contributes to the most unique characteristics of their ontogeny. The indirect development of trochozoans has provoked discussion regarding the origin and evolution of marine larvae and is interesting from the perspective of phylogeny-ontogeny correspondence. While these phylo-onto correlations have an hourglass shape in Deuterostomia, Ecdysozoa, plants and even fungi, they have seldom been studied in Trochozoa, and even Lophotrochozoa. Here, we compared the ontogenetic transcriptomes of the Pacific oyster, Crassostrea gigas (Bivalvia, Mollusca), the Pacific abalone, Haliotis discus hannai (Gastropoda, Mollusca), and the sand worm Perinereis aibuhitensis (Polychaeta, Annelida) using several complementary phylotranscriptomic methods to examine their evolutionary trajectories. The results revealed the late trochophore stage as the phylotypic phase. However, this basic pattern is accompanied with increased use of new genes in the trochophore stages which marks specific adaptations of the larval body plans.