1. Lee U, Mozeika SM, Zhao L. A Synergistic, Cultivator Model of De Novo Gene Origination. Genome Biol Evol 2024; 16:evae103. [PMID: 38748819 PMCID: PMC11152449 DOI: 10.1093/gbe/evae103] [Accepted: 05/12/2024] [Indexed: 06/07/2024] [Open Access]
Abstract
The origin and fixation of evolutionarily young genes is a fundamental question in evolutionary biology. However, understanding the origins of newly evolved genes arising de novo from noncoding genomic sequences is challenging. This is partly due to the low likelihood that several neutral or nearly neutral mutations fix prior to the appearance of an important novel molecular function. This issue is particularly exacerbated in large effective population sizes where the effect of drift is small. To address this problem, we propose a regulation-focused, cultivator model for de novo gene evolution. This cultivator-focused model posits that each step in a novel variant's evolutionary trajectory is driven by well-defined, selectively advantageous functions for the cultivator genes, rather than solely by the de novo genes, emphasizing the critical role of genome organization in the evolution of new genes.
Affiliation(s)
- UnJin Lee
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Shawn M Mozeika
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
2. Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024; 16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Accepted: 05/10/2024] [Indexed: 06/06/2024] [Open Access]
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) with comparative approaches to gene duplicates. We found that de novo genes evolve faster than gene duplicates in the intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution align with changes in amino acid composition over time as well. We also revealed higher positive charges but smaller molecular weights for de novo proteins than for duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
Affiliation(s)
- Jianhai Chen
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Qingrong Li
- Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Shengqian Xia
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dong Wang
- Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
3. Andjus S, Szachnowski U, Vogt N, Gioftsidi S, Hatin I, Cornu D, Papadopoulos C, Lopes A, Namy O, Wery M, Morillon A. Pervasive translation of Xrn1-sensitive unstable long noncoding RNAs in yeast. RNA 2024; 30:662-679. [PMID: 38443115 PMCID: PMC11098462 DOI: 10.1261/rna.079903.123] [Received: 11/29/2023] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Despite being predicted to lack coding potential, cytoplasmic long noncoding (lnc)RNAs can associate with ribosomes. However, the landscape and biological relevance of lncRNA translation remain poorly studied. In yeast, cytoplasmic Xrn1-sensitive unstable transcripts (XUTs) are targeted by nonsense-mediated mRNA decay (NMD), suggesting a translation-dependent degradation process. Here, we report that XUTs are pervasively translated, which impacts their decay. We show that XUTs globally accumulate upon translation elongation inhibition, but not when initial ribosome loading is impaired. Ribo-seq confirmed ribosomes binding to XUTs and identified ribosome-associated 5'-proximal small ORFs. Mechanistically, the NMD-sensitivity of XUTs mainly depends on the 3'-untranslated region length. Finally, we show that the peptide resulting from the translation of an NMD-sensitive XUT reporter exists in NMD-competent cells. Our work highlights the role of translation in the posttranscriptional metabolism of XUTs. We propose that XUT-derived peptides could be exposed to natural selection, while NMD restricts XUT levels.
Affiliation(s)
- Sara Andjus
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, PSL University, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
- Ugo Szachnowski
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
- Nicolas Vogt
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
- Stamatia Gioftsidi
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
- Isabelle Hatin
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- David Cornu
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Chris Papadopoulos
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Anne Lopes
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Namy
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Maxime Wery
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
- Antonin Morillon
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Sorbonne Université, CNRS UMR3244, F-75248 Paris Cedex 05, France
4. Linnenbrink M, Breton G, Misra P, Pfeifle C, Dutheil JY, Tautz D. Experimental Evaluation of a Direct Fitness Effect of the De Novo Evolved Mouse Gene Pldi. Genome Biol Evol 2024; 16:evae084. [PMID: 38742287 DOI: 10.1093/gbe/evae084] [Accepted: 04/16/2024] [Indexed: 05/16/2024] [Open Access]
Abstract
De novo evolved genes emerge from random parts of noncoding sequences and have, therefore, no homologs from which a function could be inferred. While expression analysis and knockout experiments can provide insights into the function, they do not directly test whether the gene is beneficial for its carrier. Here, we have used a seminatural environment experiment to test the fitness of the previously identified de novo evolved mouse gene Pldi, which has been implicated in sperm differentiation. We used a knockout mouse strain for this gene and competed it against its parental wildtype strain for several generations of free reproduction. We found that the knockout (ko) allele frequency decreased consistently across three replicates of the experiment. Using an approximate Bayesian computation framework that simulated the data under a demographic scenario mimicking the experiment's demography, we could estimate a selection coefficient ranging between 0.21 and 0.61 for the wildtype allele compared to the ko allele in males, under various models. This implies a relatively strong selective advantage, which would fix the new gene in fewer than a hundred generations after its emergence.
Affiliation(s)
- Miriam Linnenbrink
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Max Planck Institute for Biological Intelligence, 82152 Martinsried, Germany
- Gwenna Breton
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Clinical Genomics Gothenburg, Science for Life Laboratory, Sahlgrenska Academy, University of Gothenburg, and Center for Medical Genomics, Department of Clinical Genetic and Genomics, Sahlgrenska University Hospital, Sweden
- Pallavi Misra
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Present address: Laboratory Corporation of America (LabCorp), Westborough, MA 01581, USA
- Christine Pfeifle
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Julien Y Dutheil
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Diethard Tautz
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Biology, 24306 Plön, Germany
5. Lee U, Li C, Langer CB, Svetec N, Zhao L. Comparative Single Cell Analysis of Transcriptional Bursting Reveals the Role of Genome Organization on de novo Transcript Origination. bioRxiv 2024:2024.04.29.591771. [PMID: 38746255 PMCID: PMC11092510 DOI: 10.1101/2024.04.29.591771] [Indexed: 05/16/2024]
Abstract
Spermatogenesis is a key developmental process underlying the origination of newly evolved genes. However, rapid cell type-specific transcriptomic divergence of the Drosophila germline has posed a significant technical barrier for comparative single-cell RNA-sequencing (scRNA-Seq) studies. By quantifying a surprisingly strong correlation between species- and cell type-specific divergence in three closely related Drosophila species, we apply a simple statistical procedure to identify a core set of 198 genes that are highly predictive of cell type identity while remaining robust to species-specific differences that span 25-30 million years of evolution. We then utilize cell type classifications based on the 198-gene set to show how transcriptional divergence in cell type increases throughout spermatogenic developmental time, contrasting with traditional hourglass models of whole-organism development. With these cross-species cell type classifications, we then investigate the influence of genome organization on the molecular evolution of spermatogenesis vis-a-vis transcriptional bursting. We first demonstrate how mechanistic control of pre-meiotic transcription is achieved by altering transcriptional burst size while post-meiotic control is exerted via altered bursting frequency. We then report how global differences in autosomal vs. X chromosomal transcription likely arise in a developmental stage preceding full testis organogenesis by showing evolutionarily conserved decreases in X-linked transcription bursting kinetics in all examined somatic and germline cell types. Finally, we provide evidence supporting the cultivator model of de novo gene origination by demonstrating how the appearance of newly evolved testis-specific transcripts potentially provides short-range regulation of the transcriptional bursting properties of neighboring genes during key stages of spermatogenesis.
6. Aubel M, Buchel F, Heames B, Jones A, Honc O, Bornberg-Bauer E, Hlouchova K. High-throughput Selection of Human de novo-emerged sORFs with High Folding Potential. Genome Biol Evol 2024; 16:evae069. [PMID: 38597156 PMCID: PMC11024478 DOI: 10.1093/gbe/evae069] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 03/23/2024] [Indexed: 04/11/2024] [Open Access]
Abstract
De novo genes emerge from previously noncoding stretches of the genome. Their encoded de novo proteins are generally expected to be similar to random sequences and, accordingly, with no stable tertiary fold and high predicted disorder. However, structural properties of de novo proteins and whether they differ during the stages of emergence and fixation have not been studied in depth and rely heavily on predictions. Here we generated a library of short human putative de novo proteins of varying lengths and ages and sorted the candidates according to their structural compactness and disorder propensity. Using Förster resonance energy transfer combined with Fluorescence-activated cell sorting, we were able to screen the library for most compact protein structures, as well as most elongated and flexible structures. We find that compact de novo proteins are on average slightly shorter and contain lower predicted disorder than less compact ones. The predicted structures for most and least compact de novo proteins correspond to expectations in that they contain more secondary structure content or higher disorder content, respectively. Our experiments indicate that older de novo proteins have higher compactness and structural propensity compared with young ones. We discuss possible evolutionary scenarios and their implications underlying the age-dependencies of compactness and structural content of putative de novo proteins.
Affiliation(s)
- Margaux Aubel
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Filip Buchel
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
- Department of Biochemistry, Faculty of Science, Charles University, Prague, Czech Republic
- Brennen Heames
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Alun Jones
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Ondrej Honc
- Imaging Methods Core Facility, BIOCEV, Prague, Czech Republic
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Department of Protein Evolution, Max Planck-Institute for Biology Tuebingen, Tuebingen, Germany
- Klara Hlouchova
- Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
7. Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Indexed: 04/04/2024]
Abstract
Genome organization is intricately tied to regulating genes and associated cell fate decisions. In this study, we examine the positioning and functional significance of human genes, grouped by their evolutionary age, within the 3D organization of the genome. We reveal that genes of different evolutionary origin have distinct positioning relationships with both domains and loop anchors, and remarkably consistent relationships with boundaries across cell types. While the functional associations of each group of genes are primarily cell type-specific, such associations of conserved genes maintain greater stability across 3D genomic features and disease than recently evolved genes. Furthermore, the expression of these genes across various tissues follows an evolutionary progression, such that RNA levels increase from young genes to ancient genes. Thus, the distinct relationships of gene evolutionary age, function, and positioning within 3D genomic features contribute to tissue-specific gene regulation in development and disease.
Affiliation(s)
- Katherine Fleck
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Victor Luria
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Nitanta Garag
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA 02115
- Trevor Hunter
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Daniel Marten
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- William Phu
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Kee-Myoung Nam
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510
- Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Anne H. O’Donnell-Luria
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Pediatrics, Harvard Medical School, Boston, MA 02115
- Jelena Erceg
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030
8. Liu X, Xiao C, Xu X, Zhang J, Mo F, Chen JY, Delihas N, Zhang L, An NA, Li CY. Origin of functional de novo genes in humans from "hopeful monsters". Wiley Interdiscip Rev RNA 2024; 15:e1845. [PMID: 38605485 DOI: 10.1002/wrna.1845] [Received: 08/24/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/13/2024]
Abstract
For a long time, it was believed that new genes arise only from modifications of preexisting genes, but the discovery of de novo protein-coding genes that originated from noncoding DNA regions demonstrates the existence of a "motherless" origination process for new genes. However, the features, distributions, expression profiles, and origin modes of these genes in humans seem to support the notion that their origin is not a purely "motherless" process; rather, these genes arise preferentially from genomic regions encoding preexisting precursors with gene-like features. In such a case, the gene loci are typically not brand new. In this short review, we will summarize the definition and features of human de novo genes and clarify their process of origination from ancestral non-coding genomic regions. In addition, we define the favored precursors, or "hopeful monsters," for the origin of de novo genes and present a discussion of the functional significance of these young genes in brain development and tumorigenesis in humans. This article is categorized under: RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.
Affiliation(s)
- Xiaoge Liu
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Chunfu Xiao
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Xinwei Xu
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Jie Zhang
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Fan Mo
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Jia-Yu Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center (ChemBIC), Nanjing University, Nanjing, China
- Nicholas Delihas
- Department of Microbiology and Immunology, Renaissance School of Medicine, Stony Brook University, Stony Brook, New York, USA
- Li Zhang
- Chinese Institute for Brain Research, Beijing, China
- Ni A An
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Chuan-Yun Li
- State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Chinese Institute for Brain Research, Beijing, China
- Southwest United Graduate School, Kunming, China
9. Grandchamp A, Czuppon P, Bornberg-Bauer E. Quantification and modeling of turnover dynamics of de novo transcripts in Drosophila melanogaster. Nucleic Acids Res 2024; 52:274-287. [PMID: 38000384 PMCID: PMC10783523 DOI: 10.1093/nar/gkad1079] [Received: 07/14/2023] [Revised: 10/13/2023] [Accepted: 10/28/2023] [Indexed: 11/26/2023] [Open Access]
Abstract
Most of the transcribed eukaryotic genomes are composed of non-coding transcripts. Among these transcripts, some are newly transcribed when compared to outgroups and are referred to as de novo transcripts. De novo transcripts have been shown to play a major role in genomic innovations. However, little is known about the rates at which de novo transcripts are gained and lost in individuals of the same species. Here, we address this gap and estimate the de novo transcript turnover rate with an evolutionary model. We use DNA long reads and RNA short reads from seven geographically remote samples of inbred individuals of Drosophila melanogaster to detect de novo transcripts that are gained on a short evolutionary time scale. Overall, each sampled individual contains around 2500 unspliced de novo transcripts, with most of them being sample specific. We estimate that around 0.15 transcripts are gained per year, and that each gained transcript is lost at a rate around 5 × 10⁻⁵ per year. This high turnover of transcripts suggests frequent exploration of new genomic sequences within species. These rate estimates are essential to comprehend the process and timescale of de novo gene birth.
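As a rough reading aid for the two rate estimates quoted in this abstract, the sketch below works through a generic constant-gain, per-transcript-loss model, dN/dt = g - l·N. This is an assumption made here for illustration and is not necessarily the evolutionary model used by the authors; the function names `equilibrium_count` and `expected_count` are hypothetical. Under that assumption the steady-state count is g / l, which for the quoted values (0.15 gained per year, loss rate 5 × 10⁻⁵ per year) is about 3000 transcripts, the same order of magnitude as the roughly 2500 transcripts reported per sampled individual.

```python
import math

# Rate estimates quoted in the abstract (per year).
GAIN = 0.15   # de novo transcripts gained per year
LOSS = 5e-5   # per-transcript loss rate per year


def equilibrium_count(gain_per_year: float, loss_rate_per_year: float) -> float:
    """Steady state of dN/dt = gain - loss * N (illustrative model, assumed here)."""
    return gain_per_year / loss_rate_per_year


def expected_count(t_years: float, gain_per_year: float,
                   loss_rate_per_year: float, n0: float = 0.0) -> float:
    """Closed-form solution N(t) of the same linear model, starting from n0 transcripts."""
    n_star = equilibrium_count(gain_per_year, loss_rate_per_year)
    return n_star + (n0 - n_star) * math.exp(-loss_rate_per_year * t_years)


if __name__ == "__main__":
    # Time for the count to move halfway toward equilibrium in this model.
    half_time = math.log(2) / LOSS
    print("equilibrium count:", equilibrium_count(GAIN, LOSS))
    print("half-time to equilibrium (years):", half_time)
    print("count at half-time:", expected_count(half_time, GAIN, LOSS))
```

The linear form was chosen only because it has a closed-form solution; any inference about real transcript turnover should rely on the model described in the paper itself.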
Affiliation(s)
- Anna Grandchamp
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Peter Czuppon
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Department of Protein Evolution, Max Planck Institute for Biology, Tübingen, Germany
10. Frumkin I, Laub MT. Selection of a de novo gene that can promote survival of Escherichia coli by modulating protein homeostasis pathways. Nat Ecol Evol 2023; 7:2067-2079. [PMID: 37945946 PMCID: PMC10697842 DOI: 10.1038/s41559-023-02224-4] [Received: 02/11/2023] [Accepted: 09/12/2023] [Indexed: 11/12/2023]
Abstract
Cellular novelty can emerge when non-functional loci become functional genes in a process termed de novo gene birth. But how proteins with random amino acid sequences beneficially integrate into existing cellular pathways remains poorly understood. We screened ~10⁸ genes, generated from random nucleotide sequences and devoid of homology to natural genes, for their ability to rescue growth arrest of Escherichia coli cells producing the ribonuclease toxin MazF. We identified ~2,000 genes that could promote growth, probably by reducing transcription from the promoter driving toxin expression. Additionally, one random protein, named Random antitoxin of MazF (RamF), modulated protein homeostasis by interacting with chaperones, leading to MazF proteolysis and a consequent loss of its toxicity. Finally, we demonstrate that random proteins can improve during evolution by identifying beneficial mutations that turned RamF into a more efficient inhibitor. Our work provides a mechanistic basis for how de novo gene birth can produce functional proteins that effectively benefit cells evolving under stress.
Affiliation(s)
- Idan Frumkin
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Michael T Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Howard Hughes Medical Institute, Cambridge, MA, USA
11. Mani S, Tlusty T. Gene birth in a model of non-genic adaptation. BMC Biol 2023; 21:257. [PMID: 37957718 PMCID: PMC10644530 DOI: 10.1186/s12915-023-01745-5] [Received: 09/18/2022] [Accepted: 10/24/2023] [Indexed: 11/15/2023] [Open Access]
Abstract
BACKGROUND: Over evolutionary timescales, genomic loci can switch between functional and non-functional states through processes such as pseudogenization and de novo gene birth. Particularly, de novo gene birth is a widespread process, and many examples continue to be discovered across diverse evolutionary lineages. However, the general mechanisms that lead to functionalization are poorly understood, and estimated rates of de novo gene birth remain contentious. Here, we address this problem within a model that takes into account mutations and structural variation, allowing us to estimate the likelihood of emergence of new functions at non-functional loci. RESULTS: Assuming biologically reasonable mutation rates and mutational effects, we find that functionalization of non-genic loci requires the realization of strict conditions. This is in line with the observation that most de novo genes are localized to the vicinity of established genes. Our model also provides an explanation for the empirical observation that emerging proto-genes are often lost despite showing signs of adaptation. CONCLUSIONS: Our work elucidates the properties of non-genic loci that make them fertile for adaptation, and our results offer mechanistic insights into the process of de novo gene birth.
Affiliation(s)
- Somya Mani
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Republic of Korea
- Tsvi Tlusty
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Republic of Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
12. Chen R, Xiao N, Lu Y, Tao T, Huang Q, Wang S, Wang Z, Chuan M, Bu Q, Lu Z, Wang H, Su Y, Ji Y, Ding J, Gharib A, Liu H, Zhou Y, Tang S, Liang G, Zhang H, Yi C, Zheng X, Cheng Z, Xu Y, Li P, Xu C, Huang J, Li A, Yang Z. A de novo evolved gene contributes to rice grain shape difference between indica and japonica. Nat Commun 2023; 14:5906. [PMID: 37737275 PMCID: PMC10516980 DOI: 10.1038/s41467-023-41669-w] [Received: 01/07/2023] [Accepted: 09/13/2023] [Indexed: 09/23/2023] [Open Access]
Abstract
The role of de novo evolved genes from non-coding sequences in regulating morphological differentiation between species/subspecies remains largely unknown. Here, we show that a rice de novo gene GSE9 contributes to grain shape difference between indica/xian and japonica/geng varieties. GSE9 evolved from a previously non-coding region of wild rice Oryza rufipogon through the acquisition of a start codon. This gene is inherited by most japonica varieties, while the original sequence (absence of start codon, gse9) is present in the majority of indica varieties. Knockout of GSE9 in japonica varieties leads to slender grains, whereas introgression into an indica background results in round grains. Population evolutionary analyses reveal that gse9 and GSE9 are derived from wild rice Or-I and Or-III groups, respectively. Our findings uncover that the de novo GSE9 gene contributes to the genetic and morphological divergence between indica and japonica subspecies, and provide a target for precise manipulation of rice grain shape.
Affiliation(s)
- Rujia Chen
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Ning Xiao
- Institute of Agricultural Sciences for Lixiahe Region in Jiangsu, Yangzhou, 225009, China
| | - Yue Lu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Tianyun Tao
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Qianfeng Huang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Shuting Wang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Zhichao Wang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Mingli Chuan
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Qing Bu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Zhou Lu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Hanyao Wang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Yanze Su
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Yi Ji
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Jianheng Ding
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
| | - Ahmed Gharib
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Rice Department, Field Crops Research Institute, ARC, Sakha, Kafr El-Sheikh, 33717, Egypt
| | - Huixin Liu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Yong Zhou
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Shuzhu Tang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Guohua Liang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Honggen Zhang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Chuandeng Yi
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Xiaoming Zheng
- National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Zhukuan Cheng
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Yang Xu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Pengcheng Li
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Chenwu Xu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| | - Jinling Huang
- Department of Biology, East Carolina University, Greenville, NC, 27858, USA
- State Key Laboratory of Crop Stress Adaptation and Improvement, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, Kaifeng, 475004, China
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
| | - Aihong Li
- Institute of Agricultural Sciences for Lixiahe Region in Jiangsu, Yangzhou, 225009, China
| | - Zefeng Yang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agriculture College of Yangzhou University, Yangzhou, 225009, China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009, China
| |
|
13
|
Liang X, Heath LS. Towards understanding paleoclimate impacts on primate de novo genes. G3 (Bethesda) 2023; 13:jkad135. [PMID: 37313728; PMCID: PMC10468307; DOI: 10.1093/g3journal/jkad135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/31/2023] [Accepted: 06/08/2023] [Indexed: 06/15/2023]
Abstract
De novo genes are genes that emerge as new genes in some species, such as primate de novo genes that emerge in certain primate species. Over the past decade, a great deal of research has been conducted regarding their emergence, origins, functions, and various attributes in different species, some of which has involved estimating the ages of de novo genes. However, limited by the number of species available for whole-genome sequencing, relatively few studies have focused specifically on the emergence time of primate de novo genes. Among those, even fewer investigate the association between primate gene emergence and environmental factors, such as paleoclimate (ancient climate) conditions. This study investigates the relationship between paleoclimate and human gene emergence at primate species divergence. Based on 32 available primate genome sequences, this study has revealed possible associations between temperature changes and the emergence of de novo primate genes. Overall, the findings are that de novo genes tended to emerge in the most recent 13 million years (MY), during a period of continued cooling, which is consistent with past findings. Furthermore, in the context of an overall trend of cooling temperature, new primate genes were more likely to emerge during local warming periods, where the warm temperature more closely resembled the environmental condition that preceded the cooling trend. Results also indicate that both primate de novo genes and human cancer-associated genes have later origins in comparison to random human genes. Future studies can examine in greater depth human de novo gene emergence from an environmental perspective, as well as species divergence from a gene emergence perspective.
Affiliation(s)
- Xiao Liang
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Lenwood S Heath
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| |
|
14
|
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. G3 (Bethesda) 2023; 13:jkad122. [PMID: 37259569; PMCID: PMC10411569; DOI: 10.1093/g3journal/jkad122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/02/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of preexisting genes, putative de novo genes (genes born out of ancestrally nongenic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of 3 female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in 3 species: our focal species, Drosophila melanogaster, and 2 closely related species, Drosophila simulans and Drosophila yakuba, with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes, located in sequence annotated as intergenic. Consistent with the literature, these genes tend to be short, single exon, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and both sexes. The relatively small number of intergenic candidate genes discovered here is similar to that observed in the accessory gland, but substantially fewer than that observed in the testis.
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - David J Begun
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| |
|
15
|
Kakino K, Mon H, Ebihara T, Hino M, Masuda A, Lee JM, Kusakabe T. Comprehensive Transcriptome Analysis in the Testis of the Silkworm, Bombyx mori. Insects 2023; 14:684. [PMID: 37623394; PMCID: PMC10455414; DOI: 10.3390/insects14080684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 07/26/2023] [Accepted: 08/01/2023] [Indexed: 08/26/2023]
Abstract
Spermatogenesis is an important process in reproduction and is conserved across species, but in Bombyx mori, it shows peculiarities, such as the maintenance of spermatogonia by apical cells and fertilization by dimorphic spermatozoa. In this study, we attempted to characterize the genes expressed in the testis of B. mori, focusing on aspects of expression patterns and gene function by transcriptome comparisons between different tissues, internal testis regions, and Drosophila melanogaster. The transcriptome analysis of 12 tissues of B. mori, including those of testis, revealed the widespread gene expression of 20,962 genes and 1705 testis-specific genes. A comparative analysis of the stem region (SR) and differentiated regions (DR) of the testis revealed 4554 and 3980 specific-enriched genes, respectively. In addition, comparisons with D. melanogaster testis transcriptome revealed homologs of 1204 SR and 389 DR specific-enriched genes that were similarly expressed in equivalent regions of Drosophila testis. Moreover, gene ontology (GO) enrichment analysis was performed for SR-specific enriched genes and DR-specific enriched genes, and the GO terms of several biological processes were enriched, confirming previous findings. This study advances our understanding of spermatogenesis in B. mori and provides an important basis for future research, filling a knowledge gap between fly and mammalian studies.
Affiliation(s)
- Kohei Kakino
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Hiroaki Mon
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Takeru Ebihara
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Masato Hino
- Laboratory of Sanitary Entomology, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Akitsu Masuda
- Laboratory of Creative Science for Insect Industries, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Jae Man Lee
- Laboratory of Creative Science for Insect Industries, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| | - Takahiro Kusakabe
- Laboratory of Insect Genome Science, Kyushu University Graduate School of Bioresource and Bioenvironmental Sciences, Motooka 744, Nishi-ku, Fukuoka 819-0395, Japan
| |
|
16
|
Athanasouli M, Akduman N, Röseler W, Theam P, Rödelsperger C. Thousands of Pristionchus pacificus orphan genes were integrated into developmental networks that respond to diverse environmental microbiota. PLoS Genet 2023; 19:e1010832. [PMID: 37399201; DOI: 10.1371/journal.pgen.1010832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 06/15/2023] [Indexed: 07/05/2023] Open
Abstract
Adaptation of organisms to environmental change may be facilitated by the creation of new genes. New genes without homologs in other lineages are known as taxonomically-restricted orphan genes and may result from divergence or de novo formation. Previously, we have extensively characterized the evolution and origin of such orphan genes in the nematode model organism Pristionchus pacificus. Here, we employ large-scale transcriptomics to establish potential functional associations and to measure the degree of transcriptional plasticity among orphan genes. Specifically, we analyzed 24 RNA-seq samples from adult P. pacificus worms raised on 24 different monoxenic bacterial cultures. Based on coexpression analysis, we identified 28 large modules that harbor 3,727 diplogastrid-specific orphan genes and that respond dynamically to different bacteria. These coexpression modules have distinct regulatory architecture and also exhibit differential expression patterns across development, suggesting a link between bacterial response networks and development. Phylostratigraphy revealed a considerably high number of family- and even species-specific orphan genes in certain coexpression modules. This suggests that new genes are not attached randomly to existing cellular networks and that integration can happen very fast. Integrative analysis of protein domains, gene expression and ortholog data facilitated the assignment of biological labels for 22 coexpression modules, with one of the largest, fast-evolving modules being associated with spermatogenesis. In summary, this work presents the first functional annotation for thousands of P. pacificus orphan genes and reveals insights into their integration into environmentally responsive gene networks.
Affiliation(s)
- Marina Athanasouli
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Nermin Akduman
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Waltraud Röseler
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Penghieng Theam
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Christian Rödelsperger
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| |
|
17
|
Li J, Lee CR. The role of gene presence-absence variations on genetic incompatibility in Asian rice. New Phytol 2023; 239:778-791. [PMID: 37194454; PMCID: PMC7615310; DOI: 10.1111/nph.18969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 04/18/2023] [Indexed: 05/18/2023]
Abstract
Genetic incompatibilities are widespread between species. However, it remains unclear whether they all originated after population divergence as suggested by the Bateson-Dobzhansky-Muller model, and if not, what is their prevalence and distribution within populations. The gene presence-absence variations (PAVs) provide an opportunity for investigating gene-gene incompatibility. Here, we searched for the repulsion of coexistence between gene PAVs to identify the negative interaction of gene functions separately in two Oryza sativa subspecies. Many PAVs are involved in subspecies-specific negative epistasis and segregate at low-to-intermediate frequencies in focal subspecies but at low or high frequencies in the other subspecies. Incompatible PAVs are enriched in two functional groups, defense response and protein phosphorylation, which are associated with plant immunity and consistent with autoimmunity being a known mechanism of hybrid incompatibility in plants. Genes in the two enriched functional groups are older and seldom directly interact with each other. Instead, they interact with other younger gene PAVs with diverse functions. Our results illustrate the landscape of genetic incompatibility at gene PAVs in rice, where many incompatible pairs have already segregated as polymorphisms within subspecies, and many are novel negative interactions between older defense-related genes and younger genes with diverse functions.
Affiliation(s)
- Juan Li
- Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei 106319, Taiwan
- Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Swiss Institute for Bioinformatics, 1015 Lausanne, Switzerland
| | - Cheng-Ruei Lee
- Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei 106319, Taiwan
- Institute of Plant Biology, National Taiwan University, Taipei 106319, Taiwan
| |
|
18
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. bioRxiv 2023:2023.03.13.532420. [PMID: 37425675; PMCID: PMC10326970; DOI: 10.1101/2023.03.13.532420] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/11/2023]
Abstract
Although previously thought to be unlikely, recent studies have shown that de novo gene origination from previously non-genic sequences is a relatively common mechanism for gene innovation in many species and taxa. These young genes provide a unique set of candidates to study the structural and functional origination of proteins. However, our understanding of their protein structures and how these structures originate and evolve is still limited, due to a lack of systematic studies. Here, we combined high-quality base-level whole genome alignments, bioinformatic analysis, and computational structure modeling to study the origination, evolution, and protein structure of lineage-specific de novo genes. We identified 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. We found a gradual shift in sequence composition, evolutionary rates, and expression patterns with their gene ages, which indicates possible gradual shifts or adaptations of their functions. Surprisingly, we found little overall protein structural change for de novo genes in the Drosophilinae lineage. Using AlphaFold2, ESMFold, and molecular dynamics, we identified a number of de novo gene candidates with protein products that are potentially well-folded, many of which are more likely to contain transmembrane and signal proteins compared to other annotated protein-coding genes. Using ancestral sequence reconstruction, we found that most potentially well-folded proteins are often born folded. Interestingly, we observed one case where disordered ancestral proteins become ordered within a relatively short evolutionary time. Single-cell RNA-seq analysis in testis showed that although most de novo genes are enriched in spermatocytes, several young de novo genes are biased in the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| |
|
19
|
Grandchamp A, Kühl L, Lebherz M, Brüggemann K, Parsch J, Bornberg-Bauer E. Population genomics reveals mechanisms and dynamics of de novo expressed open reading frame emergence in Drosophila melanogaster. Genome Res 2023; 33:872-890. [PMID: 37442576; PMCID: PMC10519401; DOI: 10.1101/gr.277482.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 06/06/2023] [Indexed: 07/15/2023]
Abstract
Novel genes are essential for evolutionary innovations and differ substantially even between closely related species. Recently, multiple studies across many taxa showed that some novel genes arise de novo, that is, from previously noncoding DNA. To characterize the underlying mutations that allowed de novo gene emergence and their order of occurrence, homologous regions must be detected within noncoding sequences in closely related sister genomes. So far, most studies do not detect noncoding homologs of de novo genes because of incomplete assemblies and annotations, and long evolutionary distances separating genomes. Here, we overcome these issues by searching for de novo expressed open reading frames (neORFs), the not-yet fixed precursors of de novo genes that emerged within a single species. We sequenced and assembled genomes with long-read technology and the corresponding transcriptomes from inbred lines of Drosophila melanogaster, derived from seven geographically diverse populations. We found line-specific neORFs in abundance but few neORFs shared by lines, suggesting a rapid turnover. Gain and loss of transcription are more frequent than the creation of ORFs, for example, by forming new start and stop codons. Consequently, the gain of ORFs becomes rate limiting and is frequently the initial step in neORF emergence. Furthermore, transposable elements (TEs) are major drivers for intragenomic duplications of neORFs, yet TE insertions are less important for the emergence of neORFs. However, highly mutable genomic regions around TEs provide new features that enable gene birth. In conclusion, neORFs have a high birth-death rate and are rapidly purged, but surviving neORFs spread neutrally through populations and within genomes.
Affiliation(s)
- Anna Grandchamp
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
| | - Lucas Kühl
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
| | - Marie Lebherz
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
| | - Kathrin Brüggemann
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
| | - John Parsch
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, 82152 Munich, Germany
| | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Max Planck Institute for Biology Tübingen, Department of Protein Evolution, 72076 Tübingen, Germany
| |
|
20
|
Cridland JM, Contino CE, Begun DJ. Selection and geography shape male reproductive tract transcriptomes in Drosophila melanogaster. Genetics 2023; 224:iyad034. [PMID: 36869688; PMCID: PMC10474930; DOI: 10.1093/genetics/iyad034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 01/25/2023] [Accepted: 02/20/2023] [Indexed: 03/05/2023] Open
Abstract
Transcriptome analysis of several animal clades suggests that male reproductive tract gene expression evolves quickly. However, the factors influencing the abundance and distribution of within-species variation, the ultimate source of interspecific divergence, are poorly known. Drosophila melanogaster, an ancestrally African species that has recently spread throughout the world and colonized the Americas in the last roughly 100 years, exhibits phenotypic and genetic latitudinal clines on multiple continents, consistent with a role for spatially varying selection in shaping its biology. Nevertheless, geographic expression variation in the Americas is poorly described, as is its relationship to African expression variation. Here, we investigate these issues through the analysis of two male reproductive tissue transcriptomes [testis and accessory gland (AG)] in samples from Maine (USA), Panama, and Zambia. We find dramatic differences between these tissues in differential expression between Maine and Panama, with the accessory glands exhibiting abundant expression differentiation and the testis exhibiting very little. Latitudinal expression differentiation appears to be influenced by the selection of Panama expression phenotypes. While the testis shows little latitudinal expression differentiation, it exhibits much greater differentiation than the accessory gland in Zambia vs American population comparisons. Expression differentiation for both tissues is non-randomly distributed across the genome on a chromosome arm scale. Interspecific expression divergence between D. melanogaster and D. simulans is discordant with rates of differentiation between D. melanogaster populations. Strongly heterogeneous expression differentiation across tissues and timescales suggests a complex evolutionary process involving major temporal changes in the way selection influences expression evolution in these organs.
Affiliation(s)
- Julie M Cridland
- Department of Evolution and Ecology, University of California-Davis, Davis, CA 95616, USA
| | - Colin E Contino
- Department of Evolution and Ecology, University of California-Davis, Davis, CA 95616, USA
| | - David J Begun
- Department of Evolution and Ecology, University of California-Davis, Davis, CA 95616, USA
| |
|
21
|
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. bioRxiv 2023:2023.05.03.539262. [PMID: 37205537 PMCID: PMC10187257 DOI: 10.1101/2023.05.03.539262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of pre-existing genes, putative de novo genes (genes born out of previously non-genic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of three female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in three species: our focal species, D. melanogaster, and two closely related species, D. simulans and D. yakuba, with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes, which, consistent with the literature, tend to be short, simple, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and both sexes. The relatively small number of candidate genes discovered here is similar to that observed in the accessory gland, but substantially fewer than that observed in the testis.
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - David J Begun
- Department of Evolution and Ecology, University of California, Davis CA 95616
| |
|
22
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
23
|
Takeda T, Shirai K, Kim YW, Higuchi-Takeuchi M, Shimizu M, Kondo T, Ushijima T, Matsushita T, Shinozaki K, Hanada K. A de novo gene originating from the mitochondria controls floral transition in Arabidopsis thaliana. Plant Mol Biol 2023; 111:189-203. [PMID: 36306001 DOI: 10.1007/s11103-022-01320-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 10/09/2022] [Indexed: 06/16/2023]
Abstract
De novo genes created in the plant mitochondrial genome have frequently been transferred into the nuclear genome via intergenomic gene transfer events. Therefore, plant mitochondria might be a source of de novo genes in the nuclear genome. However, the functions of de novo genes originating from mitochondria and their evolutionary fate remain unclear. Here, we revealed that an Arabidopsis thaliana-specific small coding gene derived from the mitochondrial genome regulates floral transition. We previously identified 49 candidate de novo genes that induce abnormal morphological changes on overexpression. We focused on a candidate gene derived from the mitochondrial genome (sORF2146) that encodes 66 amino acids. Comparative genomic analyses indicated that the mitochondrial sORF2146 emerged in the Brassica lineage as a de novo gene. The nuclear sORF2146 emerged following an intergenomic gene transfer event in A. thaliana after the divergence between Arabidopsis and Capsella. Although the nuclear and mitochondrial sORF2146 sequences are the same in A. thaliana, only the nuclear sORF2146 is transcribed. The nuclear sORF2146 product is localized in mitochondria, which may be associated with the pseudogenization of the mitochondrial sORF2146. To functionally characterize the nuclear sORF2146, we performed a transcriptomic analysis of transgenic plants overexpressing the nuclear sORF2146. Flowering transition-related genes were highly regulated in the transgenic plants. Subsequent phenotypic analyses demonstrated that the overexpression and knockdown of sORF2146 in transgenic plants resulted in delayed and early flowering, respectively. These findings suggest that a lineage-specific de novo gene derived from mitochondria has an important regulatory effect on floral transition.
Affiliation(s)
- Tomoyuki Takeda
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-Shi, Fukuoka, 820-8502, Japan
| | - Kazumasa Shirai
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-Shi, Fukuoka, 820-8502, Japan
| | - You-Wang Kim
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-Shi, Fukuoka, 820-8502, Japan
| | | | - Minami Shimizu
- RIKEN Center for Sustainable Resource Science, Yokohama-Shi, Kanagawa, 230-0045, Japan
| | - Takayuki Kondo
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-Shi, Fukuoka, 820-8502, Japan
| | - Tomokazu Ushijima
- Department of Agricultural Science and Technology, Faculty of Agriculture, Setsunan University, Osaka, Japan
| | - Tomonao Matsushita
- Department of Botany, Graduate School of Science, Kyoto University, Kyoto, Japan
| | - Kazuo Shinozaki
- RIKEN Center for Sustainable Resource Science, Yokohama-Shi, Kanagawa, 230-0045, Japan
| | - Kousuke Hanada
- Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-Shi, Fukuoka, 820-8502, Japan
| |
|
24
|
An NA, Zhang J, Mo F, Luan X, Tian L, Shen QS, Li X, Li C, Zhou F, Zhang B, Ji M, Qi J, Zhou WZ, Ding W, Chen JY, Yu J, Zhang L, Shu S, Hu B, Li CY. De novo genes with an lncRNA origin encode unique human brain developmental functionality. Nat Ecol Evol 2023; 7:264-278. [PMID: 36593289 PMCID: PMC9911349 DOI: 10.1038/s41559-022-01925-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 11/26/2021] [Accepted: 10/04/2022] [Indexed: 01/03/2023]
Abstract
Human de novo genes can originate from neutral long non-coding RNA (lncRNA) loci and are evolutionarily significant in general, yet how and why this all-or-nothing transition to functionality happens remains unclear. Here, in 74 human/hominoid-specific de novo genes, we identified distinctive U1 elements and RNA splice-related sequences accounting for RNA nuclear export, differentiating mRNAs from lncRNAs, and driving the origin of de novo genes from lncRNA loci. The polymorphic sites facilitating the lncRNA-mRNA conversion through regulating nuclear export are selectively constrained, maintaining a boundary that differentiates mRNAs from lncRNAs. The functional new genes actively passing through it thus showed a mode of pre-adaptive origin, in that they acquire functions along with the achievement of their coding potential. As a proof of concept, we verified the regulations of splicing and U1 recognition on the nuclear export efficiency of one of these genes, the ENSG00000205704, in human neural progenitor cells. Notably, knock-out or over-expression of this gene in human embryonic stem cells accelerates or delays the neuronal maturation of cortical organoids, respectively. The transgenic mice with ectopically expressed ENSG00000205704 showed enlarged brains with cortical expansion. We thus demonstrate the key roles of nuclear export in de novo gene origin. These newly originated genes should reflect the novel uniqueness of human brain development.
Affiliation(s)
- Ni A. An
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jie Zhang
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Fan Mo
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Xuke Luan
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Lu Tian
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Qing Sunny Shen
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Xiangshang Li
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Chunqiong Li
- Chinese Institute for Brain Research, Beijing, China
| | - Fanqi Zhou
- State Key Laboratory of Medical Molecular Biology, Key Laboratory of RNA Regulation and Hematopoiesis, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, CAMS and Peking Union Medical College, Beijing, China
| | - Boya Zhang
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Mingjun Ji
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jianhuan Qi
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Wei-Zhen Zhou
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Wanqiu Ding
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jia-Yu Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center (ChemBIC), Nanjing University, Nanjing, China
| | - Jia Yu
- State Key Laboratory of Medical Molecular Biology, Key Laboratory of RNA Regulation and Hematopoiesis, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, CAMS and Peking Union Medical College, Beijing, China
| | - Li Zhang
- Chinese Institute for Brain Research, Beijing, China
| | - Shaokun Shu
- Peking University International Cancer Institute, Beijing, China
| | - Baoyang Hu
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Chuan-Yun Li
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| |
|
25
|
Rabbani M, Zheng X, Manske GL, Vargo A, Shami AN, Li JZ, Hammoud SS. Decoding the Spermatogenesis Program: New Insights from Transcriptomic Analyses. Annu Rev Genet 2022; 56:339-368. [PMID: 36070560 PMCID: PMC10722372 DOI: 10.1146/annurev-genet-080320-040045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
Abstract
Spermatogenesis is a complex differentiation process coordinated spatiotemporally across and along seminiferous tubules. Cellular heterogeneity has made it challenging to obtain stage-specific molecular profiles of germ and somatic cells using bulk transcriptomic analyses. This has limited our ability to understand regulation of spermatogenesis and to integrate knowledge from model organisms to humans. The recent advancement of single-cell RNA-sequencing (scRNA-seq) technologies provides insights into the cell type diversity and molecular signatures in the testis. Fine-grained cell atlases of the testis contain both known and novel cell types and define the functional states along the germ cell developmental trajectory in many species. These atlases provide a reference system for integrated interspecies comparisons to discover mechanistic parallels and to enable future studies. Despite recent advances, we currently lack high-resolution data to probe germ cell-somatic cell interactions in the tissue environment, but the use of highly multiplexed spatial analysis technologies has begun to resolve this problem. Taken together, recent single-cell studies provide an improved understanding of gametogenesis to examine underlying causes of infertility and enable the development of new therapeutic interventions.
Affiliation(s)
- Mashiat Rabbani
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
| | - Xianing Zheng
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
| | - Gabe L Manske
- Cellular and Molecular Biology Graduate Program, University of Michigan, Ann Arbor, Michigan, USA
| | - Alexander Vargo
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
| | - Adrienne N Shami
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
| | - Jun Z Li
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Saher Sue Hammoud
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, Michigan, USA
- Department of Urology, University of Michigan, Ann Arbor, Michigan, USA
- Cellular and Molecular Biology Graduate Program, University of Michigan, Ann Arbor, Michigan, USA
| |
|
26
|
Translation and natural selection of micropeptides from long non-canonical RNAs. Nat Commun 2022; 13:6515. [PMID: 36316320 PMCID: PMC9622821 DOI: 10.1038/s41467-022-34094-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/26/2022] [Accepted: 10/13/2022] [Indexed: 12/25/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are transcripts longer than 200 nucleotides but lacking canonical coding sequences. Apparently unable to produce peptides, lncRNA function seems to rely only on RNA expression, sequence and structure. Here, we exhaustively detect in-vivo translation of small open reading frames (small ORFs) within lncRNAs using Ribosomal profiling during Drosophila melanogaster embryogenesis. We show that around 30% of lncRNAs contain small ORFs engaged by ribosomes, leading to regulated translation of 100 to 300 micropeptides. We identify lncRNA features that favour translation, such as cistronicity, Kozak sequences, and conservation. For the latter, we develop a bioinformatics pipeline to detect small ORF homologues, and reveal evidence of natural selection favouring the conservation of micropeptide sequence and function across evolution. Our results expand the repertoire of lncRNA biochemical functions, and suggest that lncRNAs give rise to novel coding genes throughout evolution. Since most lncRNAs contain small ORFs with as yet unknown translation potential, we propose to rename them "long non-canonical RNAs".
|
27
|
Moutinho AF, Eyre-Walker A, Dutheil JY. Strong evidence for the adaptive walk model of gene evolution in Drosophila and Arabidopsis. PLoS Biol 2022; 20:e3001775. [PMID: 36099311 PMCID: PMC9470001 DOI: 10.1371/journal.pbio.3001775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 08/01/2022] [Indexed: 11/19/2022] Open
Abstract
Understanding the dynamics of species adaptation to their environments has long been a central focus of the study of evolution. Theories of adaptation propose that populations evolve by “walking” in a fitness landscape. This “adaptive walk” is characterised by a pattern of diminishing returns, where populations further away from their fitness optimum take larger steps than those closer to their optimal conditions. Hence, we expect young genes to evolve faster and experience mutations with stronger fitness effects than older genes because they are further away from their fitness optimum. Testing this hypothesis, however, constitutes an arduous task. Young genes are small, encode proteins with a higher degree of intrinsic disorder, are expressed at lower levels, and are involved in species-specific adaptations. Since all these factors lead to increased protein evolutionary rates, they could be masking the effect of gene age. While controlling for these factors, we used population genomic data sets of Arabidopsis and Drosophila and estimated the rate of adaptive substitutions across genes from different phylostrata. We found that a gene’s evolutionary age significantly impacts the molecular rate of adaptation. Moreover, we observed that substitutions in young genes tend to have larger physicochemical effects. Our study, therefore, provides strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale. This study uses population genomic datasets from Arabidopsis and Drosophila to show that young genes adapt faster and are subject to mutations of larger fitness effects, providing strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale.
Affiliation(s)
- Ana Filipa Moutinho
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
| | - Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
| | - Julien Y. Dutheil
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Unité Mixte de Recherche 5554 Institut des Sciences de l’Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
| |
|
28
|
Song H, Guo Z, Zhang X, Sui J. De novo genes in Arachis hypogaea cv. Tifrunner: systematic identification, molecular evolution, and potential contributions to cultivated peanut. Plant J 2022; 111:1081-1095. [PMID: 35748398 DOI: 10.1111/tpj.15875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2021] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
De novo genes are derived from non-coding sequences, and they can play essential roles in organisms. Cultivated peanut (Arachis hypogaea) is a major oil and protein crop derived from a cross between Arachis duranensis and Arachis ipaensis. However, few de novo genes have been documented in Arachis. Here, we identified 381 de novo genes in A. hypogaea cv. Tifrunner based on comparison with five closely related Arachis species. There are distinct differences in gene expression patterns and gene structures between conserved and de novo genes. The identified de novo genes originated from ancestral sequence regions associated with metabolic and biosynthetic processes, and they were subsequently integrated into existing regulatory networks. De novo paralogs and homoeologs were identified in A. hypogaea cv. Tifrunner. De novo paralogs and homoeologs with conserved expression have mismatching cis-acting elements under normal growth conditions. De novo genes potentially have pluripotent functions in responses to biotic stresses as well as in growth and development based on quantitative trait locus data. This work provides a foundation for future research examining gene birth processes and gene function in Arachis and related taxa.
Affiliation(s)
- Hui Song
- Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
| | - Zhonglong Guo
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, School of Life Sciences and School of Advanced Agricultural Sciences, Peking University, Beijing, China
| | - Xiaojun Zhang
- College of Agronomy, Qingdao Agricultural University, Qingdao, China
| | - Jiongming Sui
- College of Agronomy, Qingdao Agricultural University, Qingdao, China
| |
|
29
|
Chenevert M, Miller B, Karkoutli A, Rusnak A, Lott SE, Atallah J. The early embryonic transcriptome of a Hawaiian Drosophila picture-wing fly shows evidence of altered gene expression and novel gene evolution. J Exp Zool B Mol Dev Evol 2022; 338:277-291. [PMID: 35322942 DOI: 10.1002/jez.b.23129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Revised: 01/14/2022] [Accepted: 02/13/2022] [Indexed: 06/14/2023]
Abstract
A massive adaptive radiation on the Hawaiian archipelago has produced approximately one-quarter of the fly species in the family Drosophilidae. The Hawaiian Drosophila clade has long been recognized as a model system for the study of both the ecology of island endemics and the evolution of developmental mechanisms, but relatively few genomic and transcriptomic datasets are available for this group. We present here a differential expression analysis of the transcriptional profiles of two highly conserved embryonic stages in the Hawaiian picture-wing fly Drosophila grimshawi. When we compared our results to previously published datasets across the family Drosophilidae, we identified cases of both gains and losses of gene representation in D. grimshawi, including an apparent delay in Hox gene activation. We also found a high expression of unannotated genes. Most transcripts of unannotated genes with open reading frames do not have identified homologs in non-Hawaiian Drosophila species, although the vast majority have sequence matches in genomes of other Hawaiian picture-wing flies. Some of these unannotated genes may have arisen from noncoding sequence in the ancestor of Hawaiian flies or during the evolution of the clade. Our results suggest that both the modified use of ancestral genes and the evolution of new ones may occur in rapid radiations.
Affiliation(s)
- Madeline Chenevert
- Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
- Hayward Genetics Center, Tulane University School of Medicine, New Orleans, Louisiana, USA
| | - Bronwyn Miller
- Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
| | - Ahmad Karkoutli
- Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
- LSUHSC School of Medicine, New Orleans, Louisiana, USA
| | - Anna Rusnak
- Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
- Center for Biomedical Engineering, Brown University, Box A-2, Arnold Lab, Providence, Rhode Island, USA
| | - Susan E Lott
- Department of Evolution & Ecology, University of California-Davis, Davis, California, USA
| | - Joel Atallah
- Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
| |
|
30
|
Prabh N, Rödelsperger C. Multiple Pristionchus pacificus genomes reveal distinct evolutionary dynamics between de novo candidates and duplicated genes. Genome Res 2022; 32:1315-1327. [PMID: 35618417 PMCID: PMC9341508 DOI: 10.1101/gr.276431.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 05/20/2022] [Indexed: 01/03/2023]
Abstract
The birth of new genes is a major molecular innovation driving phenotypic diversity across all domains of life. Although repurposing of existing protein-coding material by duplication is considered the main process of new gene formation, recent studies have discovered thousands of transcriptionally active sequences as a rich source of new genes. However, differential loss rates have to be assumed to reconcile the high birth rates of these incipient de novo genes with the dominance of ancient gene families in individual genomes. Here, we test this rapid turnover hypothesis in the context of the nematode model organism Pristionchus pacificus. We extended the existing species-level phylogenomic framework by sequencing the genomes of six divergent P. pacificus strains. We used these data to study the evolutionary dynamics of different age classes and categories of origin at a population level. Contrasting de novo candidates with new families that arose by duplication and divergence from known genes, we find that de novo candidates are typically shorter, show less expression, and are overrepresented on the sex chromosome. Although the contribution of de novo candidates increases toward young age classes, multiple comparisons within the same age class showed significantly higher attrition in de novo candidates than in known genes. Similarly, young genes remain under weak evolutionary constraints with de novo candidates representing the fastest evolving subcategory. Altogether, this study provides empirical evidence for the rapid turnover hypothesis and highlights the importance of the evolutionary timescale when quantifying the contribution of different mechanisms toward new gene formation.
|
31
|
Smith ML, Vanderpool D, Hahn MW. Using all gene families vastly expands data available for phylogenomic inference. Mol Biol Evol 2022; 39:6596367. [PMID: 35642314 PMCID: PMC9178227 DOI: 10.1093/molbev/msac112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022] Open
Abstract
Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.
Affiliation(s)
- Megan L Smith
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
| | - Dan Vanderpool
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
| | - Matthew W Hahn
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
| |
|
32
|
Lee BY, Kim J, Lee J. Intraspecific de novo gene birth revealed by presence-absence variant genes in Caenorhabditis elegans. NAR Genom Bioinform 2022; 4:lqac031. [PMID: 35464238 PMCID: PMC9022459 DOI: 10.1093/nargab/lqac031] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/29/2021] [Revised: 03/30/2022] [Accepted: 04/13/2022] [Indexed: 12/24/2022] Open
Abstract
Genes embed their evolutionary history in the form of various alleles. Presence-absence variants (PAVs) are extreme cases of such alleles, where a gene present in one haplotype does not exist in another. Because PAVs may result from either birth or death of a gene, PAV genes and their alternative alleles, if available, can represent a basis for rapid intraspecific gene evolution. Using long-read sequencing technologies, this study traced the possible evolution of PAV genes in the PD1074 and CB4856 C. elegans strains as well as their alternative alleles in 14 other wild strains. We updated the CB4856 genome by filling 18 gaps and identified 46 genes and 7,460 isoforms from both strains not annotated previously. We verified 328 PAV genes, out of which 46 were C. elegans-specific. Among these possible newly born genes, 12 had alternative alleles in other wild strains; in particular, the alternative alleles of three genes showed signatures of active transposons. Alternative alleles of three other genes showed another type of signature reflected in accumulation of small insertions or deletions. Research on gene evolution using both species-specific PAV genes and their alternative alleles may provide new insights into the process of gene evolution.
Affiliation(s)
- Bo Yun Lee
- Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
| | - Jun Kim
- Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
| | - Junho Lee
- Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
|
33
|
Hurtado J, Almeida FC, Belliard SA, Revale S, Hasson E. Research gaps and new insights in the evolution of Drosophila seminal fluid proteins. Insect Mol Biol 2022; 31:139-158. [PMID: 34747062 DOI: 10.1111/imb.12746] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/28/2021] [Revised: 09/20/2021] [Accepted: 10/25/2021] [Indexed: 06/13/2023]
Abstract
While the striking effects of seminal fluid proteins (SFPs) on females are fairly conserved among Diptera, most SFPs lack detectable homologues among the SFP repertoires of phylogenetically distant species. How such a rapidly changing proteome conserves functions across taxa is a fascinating question. However, this and other pivotal aspects of SFPs' evolution remain elusive because discoveries on these proteins have been mainly restricted to the model Drosophila melanogaster. Here, we provide an overview of the current knowledge on the inter-specific divergence of the SFP repertoire in Drosophila and compile the increasing amount of relevant genomic information from multiple species. Capitalizing on the accumulated knowledge in D. melanogaster, we present novel sets of high-confidence SFP candidates and transcription factors presumptively involved in regulating the expression of SFPs. We also address open questions by performing comparative genomic analyses that failed to support the existence of many conserved SFPs shared by most dipterans and indicated that gene co-option is the most frequent mechanism accounting for the origin of Drosophila SFP-coding genes. We hope our update establishes a starting point to integrate further data and thus widen the understanding of the intricate evolution of these proteins.
Affiliation(s)
- Juan Hurtado
- Departamento de Ecología, Genética y Evolución, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), CABA, Argentina
- Instituto de Ecología, Genética y Evolución de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), CABA, Argentina
| | - Francisca Cunha Almeida
- Departamento de Ecología, Genética y Evolución, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), CABA, Argentina
- Instituto de Ecología, Genética y Evolución de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), CABA, Argentina
| | - Silvina Anahí Belliard
- Laboratorio de Insectos de Importancia Agronómica, IGEAF (INTA), GV-IABIMO (CONICET), Buenos Aires, Argentina
| | - Santiago Revale
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Esteban Hasson
- Departamento de Ecología, Genética y Evolución, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), CABA, Argentina
- Instituto de Ecología, Genética y Evolución de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), CABA, Argentina
|
34
|
Kopania EEK, Larson EL, Callahan C, Keeble S, Good JM. Molecular Evolution across Mouse Spermatogenesis. Mol Biol Evol 2022; 39:6517785. [PMID: 35099536 PMCID: PMC8844503 DOI: 10.1093/molbev/msac023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022] Open
Abstract
Genes involved in spermatogenesis tend to evolve rapidly, but we lack a clear understanding of how protein sequences and patterns of gene expression evolve across this complex developmental process. We used fluorescence-activated cell sorting (FACS) to generate expression data for early (meiotic) and late (postmeiotic) cell types across 13 inbred strains of mice (Mus) spanning ∼7 My of evolution. We used these comparative developmental data to investigate the evolution of lineage-specific expression, protein-coding sequences, and expression levels. We found increased lineage specificity and more rapid protein-coding and expression divergence during late spermatogenesis, suggesting that signatures of rapid testis molecular evolution are punctuated across sperm development. Despite strong overall developmental parallels in these components of molecular evolution, protein and expression divergences were only weakly correlated across genes. We detected more rapid protein evolution on the X chromosome relative to the autosomes, whereas X-linked gene expression tended to be relatively more conserved likely reflecting chromosome-specific regulatory constraints. Using allele-specific FACS expression data from crosses between four strains, we found that the relative contributions of different regulatory mechanisms also differed between cell types. Genes showing cis-regulatory changes were more common late in spermatogenesis, and tended to be associated with larger differences in expression levels and greater expression divergence between species. In contrast, genes with trans-acting changes were more common early and tended to be more conserved across species. Our findings advance understanding of gene evolution across spermatogenesis and underscore the fundamental importance of developmental context in molecular evolutionary studies.
Affiliation(s)
- Emily E K Kopania
- Division of Biological Sciences, University of Montana, Missoula, MT, 59812, USA
| | - Erica L Larson
- Department of Biological Sciences, University of Denver, Denver, CO, 80208, USA
| | - Colin Callahan
- Division of Biological Sciences, University of Montana, Missoula, MT, 59812, USA
| | - Sara Keeble
- Division of Biological Sciences, University of Montana, Missoula, MT, 59812, USA
| | - Jeffrey M Good
- Division of Biological Sciences, University of Montana, Missoula, MT, 59812, USA
|
35
|
Cridland JM, Majane AC, Zhao L, Begun DJ. Population biology of accessory gland-expressed de novo genes in Drosophila melanogaster. Genetics 2022; 220:iyab207. [PMID: 34791207 PMCID: PMC8733444 DOI: 10.1093/genetics/iyab207] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/14/2021] [Accepted: 11/08/2021] [Indexed: 12/20/2022] Open
Abstract
Early work on de novo gene discovery in Drosophila was consistent with the idea that many such genes have male-biased patterns of expression, including a large number expressed in the testis. However, there has been little formal analysis of variation in the abundance and properties of de novo genes expressed in different tissues. Here, we investigate the population biology of recently evolved de novo genes expressed in the Drosophila melanogaster accessory gland, a somatic male tissue that plays an important role in male and female fertility and the post mating response of females, using the same collection of inbred lines used previously to identify testis-expressed de novo genes, thus allowing for direct cross tissue comparisons of these genes in two tissues of male reproduction. Using RNA-seq data, we identify candidate de novo genes located in annotated intergenic and intronic sequence and determine the properties of these genes including chromosomal location, expression, abundance, and coding capacity. Generally, we find major differences between the tissues in terms of gene abundance and expression, though other properties such as transcript length and chromosomal distribution are more similar. We also explore differences between regulatory mechanisms of de novo genes in the two tissues and how such differences may interact with selection to produce differences in D. melanogaster de novo genes expressed in the two tissues.
Affiliation(s)
- Julie M Cridland
- Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
| | - Alex C Majane
- Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - David J Begun
- Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
|
36
|
Li J, Singh U, Bhandary P, Campbell J, Arendsee Z, Seetharam AS, Wurtele ES. Foster thy young: enhanced prediction of orphan genes in assembled genomes. Nucleic Acids Res 2021; 50:e37. [PMID: 34928390 PMCID: PMC9023268 DOI: 10.1093/nar/gkab1238] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/27/2021] [Revised: 10/22/2021] [Accepted: 12/02/2021] [Indexed: 02/06/2023] Open
Abstract
Proteins encoded by newly emerged genes ('orphan genes') share no sequence similarity with proteins in any other species. They provide organisms with a reservoir of genetic elements to quickly respond to changing selection pressures. Here, we systematically assess the ability of five gene prediction pipelines to accurately predict genes in genomes according to phylostratal origin. BRAKER and MAKER are existing, popular ab initio tools that infer gene structures by machine learning. Direct Inference is an evidence-based pipeline we developed to predict gene structures from alignments of RNA-Seq data. The BIND pipeline integrates ab initio predictions of BRAKER and Direct Inference; MIND combines Direct Inference and MAKER predictions. We use highly curated Arabidopsis and yeast annotations as gold-standard benchmarks, and cross-validate in rice. Each pipeline under-predicts orphan genes (as few as 11 percent, under one prediction scenario). Increasing RNA-Seq diversity greatly improves prediction efficacy. The combined methods (BIND and MIND) yield the best predictions overall, with BIND identifying 68% of annotated orphan genes and 99% of ancient genes and giving the highest sensitivity score regardless of dataset in Arabidopsis. We provide a lightweight, flexible, reproducible, and well-documented solution to improve gene prediction.
Affiliation(s)
- Jing Li
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA 50014, USA
| | - Urminder Singh
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
| | - Priyanka Bhandary
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
| | - Jacqueline Campbell
- Corn Insects and Crop Genetics Research Unit, US Department of Agriculture, Agricultural Research Service, Ames, IA 50014, USA
| | - Zebulun Arendsee
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
| | - Arun S Seetharam
- Genome Informatics Facility, Iowa State University, Ames, IA 50014, USA
| | - Eve Syrkin Wurtele
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA 50014, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
|
37
|
Cherezov RO, Vorontsova JE, Simonova OB. The Phenomenon of Evolutionary “De Novo Generation” of Genes. Russ J Dev Biol 2021. [DOI: 10.1134/s1062360421060035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
38
|
Papadopoulos C, Callebaut I, Gelly JC, Hatin I, Namy O, Renard M, Lespinet O, Lopes A. Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution. Genome Res 2021; 31:2303-2315. [PMID: 34810219 PMCID: PMC8647833 DOI: 10.1101/gr.275638.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/13/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
Affiliation(s)
- Chris Papadopoulos
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Isabelle Callebaut
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
| | - Jean-Christophe Gelly
- Université de Paris, Biologie Intégrée du Globule Rouge, UMR_S1134, BIGR, INSERM, F-75015 Paris, France
- Laboratoire d'Excellence GR-Ex, 75015 Paris, France
- Institut National de la Transfusion Sanguine, F-75015 Paris, France
| | - Isabelle Hatin
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Olivier Namy
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Maxime Renard
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Olivier Lespinet
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Anne Lopes
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
|
39
|
Castro JF, Tautz D. The Effects of Sequence Length and Composition of Random Sequence Peptides on the Growth of E. coli Cells. Genes (Basel) 2021; 12:1913. [PMID: 34946861 PMCID: PMC8702183 DOI: 10.3390/genes12121913] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2021] [Revised: 11/22/2021] [Accepted: 11/26/2021] [Indexed: 12/21/2022] Open
Abstract
We study the potential for the de novo evolution of genes from random nucleotide sequences using libraries of E. coli expressing random sequence peptides. We assess the effects of such peptides on cell growth by monitoring frequency changes in individual clones in a complex library through four serial passages. Using a new analysis pipeline that allows the tracing of peptides of all lengths, we find that over half of the peptides have consistent effects on cell growth. Across nine different experiments, around 16% of clones increase in frequency and 36% decrease, with some variation between individual experiments. Shorter peptides (8-20 residues) are more likely to increase in frequency, whereas longer ones are more likely to decrease. GC content, amino acid composition, intrinsic disorder, and aggregation propensity show slightly different patterns between peptide groups. Sequences that increase in frequency tend to be more disordered with lower aggregation propensity. This coincides with the observation that young genes with more disordered structures are better tolerated in genomes. Our data indicate that random sequences can be a source of evolutionary innovation, since a large fraction of them are well tolerated by the cells or can provide a growth advantage.
Affiliation(s)
| | - Diethard Tautz
- Max Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306 Plön, Germany
|
40
|
Watson AK, Lopez P, Bapteste E. Hundreds of out-of-frame remodelled gene families in the E. coli pangenome. Mol Biol Evol 2021; 39:6430988. [PMID: 34792602 PMCID: PMC8788219 DOI: 10.1093/molbev/msab329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/01/2023] Open
Abstract
All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and recent work has begun to elucidate the processes by which de novo gene formation can occur. A special case of de novo gene formation, overprinting, describes the origin of new genes from noncoding alternative reading frames of existing open reading frames (ORFs). We argue that additionally, out-of-frame gene fission/fusion events of alternative reading frames of ORFs and out-of-frame lateral gene transfers could contribute to the origin of new gene families. To demonstrate this, we developed an original pattern-search in sequence similarity networks, enhancing the use of these graphs, commonly used to detect in-frame remodeled genes. We applied this approach to gene families in 524 complete genomes of Escherichia coli. We identified 767 gene families whose evolutionary history likely included at least one out-of-frame remodeling event. These genes with out-of-frame components represent ∼2.5% of all genes in the E. coli pangenome, suggesting that alternative reading frames of existing ORFs can contribute to a significant proportion of de novo genes in bacteria.
Affiliation(s)
- Andrew K Watson
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
| | - Philippe Lopez
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
| | - Eric Bapteste
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
|
41
|
Lyu Y, Liufu Z, Xiao J, Tang T. A Rapid Evolving microRNA Cluster Rewires Its Target Regulatory Networks in Drosophila. Front Genet 2021; 12:760530. [PMID: 34777478 PMCID: PMC8581666 DOI: 10.3389/fgene.2021.760530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2021] [Accepted: 10/11/2021] [Indexed: 11/13/2022] Open
Abstract
New miRNAs are evolutionarily important, but their functional evolution remains unclear. Here we report that the evolution of a microRNA cluster, mir-972C, rewires its downstream regulatory networks in Drosophila. Genomic analysis reveals that mir-972C originated in the common ancestor of Drosophila where it comprises six old miRNAs. It has subsequently recruited six new members in the melanogaster subgroup after evolving for at least 50 million years. Both the young and the old mir-972C members evolved rapidly in seed and non-seed regions. Combining target prediction and cell transfection experiments, we found that the seed and non-seed changes in individual mir-972C members cause extensive target divergence among D. melanogaster, D. simulans, and D. virilis, consistent with the functional evolution of mir-972C reported recently. Intriguingly, the target pool of the cluster as a whole remains relatively conserved. Our results suggest that clustering of young and old miRNAs broadens the target repertoires by acquiring new targets without losing many old ones. This may facilitate the establishment of new miRNAs in existing regulatory networks.
Affiliation(s)
- Yang Lyu
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Zhongqi Liufu
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Juan Xiao
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Tian Tang
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
|
42
|
Jin G, Ma PF, Wu X, Gu L, Long M, Zhang C, Li DZ. New Genes Interacted with Recent Whole Genome Duplicates in the Fast Stem Growth of Bamboos. Mol Biol Evol 2021; 38:5752-5768. [PMID: 34581782 PMCID: PMC8662795 DOI: 10.1093/molbev/msab288] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/12/2022] Open
Abstract
As drivers of evolutionary innovations, new genes allow organisms to explore new niches. However, clear examples of this process remain scarce. Bamboos, the unique grass lineage diversifying into the forest, have evolved with a key innovation of fast growth of woody stem, reaching up to 1 m/day. Here, we identify 1,622 bamboo-specific orphan genes that appeared in the past 46 million years, 19 of which evolved from noncoding ancestral sequences with the entire de novo origination process reconstructed. The new genes evolved gradually in exon−intron structure, protein length, expression specificity, and evolutionary constraint. These new genes, whether or not from de novo origination, are dominantly expressed in the rapidly developing shoots, and make transcriptomes of shoots the youngest among various bamboo tissues, rather than reproductive tissue as in other plants. Additionally, the particularity of bamboo shoots has also been shaped by recent whole-genome duplicates (WGDs), which evolved divergent expression patterns from ancestral states. New genes and WGDs have been evolutionarily recruited into coexpression networks to underline the fast-growing trait of bamboo shoots. Our study highlights the importance of interactions between new genes and genome duplicates in generating morphological innovation.
Affiliation(s)
- Guihua Jin
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
| | - Peng-Fei Ma
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
| | - Xiaopei Wu
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
| | - Lianfeng Gu
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
| | - Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, 60637, USA
| | - Chengjun Zhang
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
| | - De-Zhu Li
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
|
43
|
Rivard EL, Ludwig AG, Patel PH, Grandchamp A, Arnold SE, Berger A, Scott EM, Kelly BJ, Mascha GC, Bornberg-Bauer E, Findlay GD. A putative de novo evolved gene required for spermatid chromatin condensation in Drosophila melanogaster. PLoS Genet 2021; 17:e1009787. [PMID: 34478447 PMCID: PMC8445463 DOI: 10.1371/journal.pgen.1009787] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/29/2021] [Revised: 09/16/2021] [Accepted: 08/19/2021] [Indexed: 02/07/2023] Open
Abstract
Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas, required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas. The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.
Affiliation(s)
- Emily L. Rivard
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | - Andrew G. Ludwig
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | - Prajal H. Patel
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | | | - Sarah E. Arnold
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | | | - Emilie M. Scott
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | - Brendan J. Kelly
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | - Grace C. Mascha
- College of the Holy Cross, Worcester, Massachusetts, United States of America
| | - Erich Bornberg-Bauer
- University of Münster, Münster, Germany
- Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Geoffrey D. Findlay
- College of the Holy Cross, Worcester, Massachusetts, United States of America
|
44
|
Yates TB, Feng K, Zhang J, Singan V, Jawdy SS, Ranjan P, Abraham PE, Barry K, Lipzen A, Pan C, Schmutz J, Chen JG, Tuskan GA, Muchero W. The Ancient Salicoid Genome Duplication Event: A Platform for Reconstruction of De Novo Gene Evolution in Populus trichocarpa. Genome Biol Evol 2021; 13:evab198. [PMID: 34469536 PMCID: PMC8445398 DOI: 10.1093/gbe/evab198] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 08/22/2021] [Indexed: 12/13/2022] Open
Abstract
Orphan genes are characteristic genomic features that have no detectable homology to genes in any other species and represent an important attribute of genome evolution as sources of novel genetic functions. Here, we identified 445 genes specific to Populus trichocarpa. Of these, we performed deeper reconstruction of 13 orphan genes to provide evidence of de novo gene evolution. Populus and its sister genus Salix are particularly well suited for the study of orphan gene evolution because of the Salicoid whole-genome duplication event, which resulted in highly syntenic sister chromosomal segments across the Salicaceae. We leveraged this genomic feature to reconstruct de novo gene evolution from intergenera, interspecies, and intragenomic perspectives by comparing the syntenic regions within the P. trichocarpa reference, then P. deltoides, and finally Salix purpurea. Furthermore, we demonstrated that 86.5% of the putative orphan genes had evidence of transcription. We also utilized the Populus genome-wide association mapping panel, a collection of 1,084 undomesticated P. trichocarpa genotypes, to further determine putative regulatory networks of orphan genes using expression quantitative trait loci (eQTL) mapping. Functional enrichment of these eQTL subnetworks identified common biological themes associated with orphan genes such as response to stress and defense response. We also identify a putative cis-element for a de novo gene and leverage conserved synteny to describe evolution of a putative transcription factor binding site. Overall, 45% of orphan genes were captured in trans-eQTL networks.
Affiliation(s)
- Timothy B Yates
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Kai Feng
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Jin Zhang
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Vasanth Singan
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Sara S Jawdy
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Priya Ranjan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Paul E Abraham
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Kerrie Barry
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Anna Lipzen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Chongle Pan
- School of Computer Science and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, Oklahoma, USA
| | - Jeremy Schmutz
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
| | - Jin-Gui Chen
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Gerald A Tuskan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Wellington Muchero
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| |
|
45
|
Abstract
Because gene expression is important for evolutionary adaptation, its misregulation is an important cause of maladaptation. A misregulated gene can be incorrectly silent ("off") when a transcription factor (TF) that is required for its activation does not bind its regulatory region. Conversely, a misregulated gene can be incorrectly active ("on") when a TF not normally involved in its activation binds its regulatory region, a phenomenon also known as regulatory crosstalk. DNA mutations that destroy or create TF binding sites on DNA are an important source of misregulation and crosstalk. Although misregulation reduces fitness in an environment to which an organism is well-adapted, it may become adaptive in a new environment. Here, I derive simple yet general mathematical expressions that delimit the conditions under which misregulation can be adaptive. These expressions depend on the strength of selection against misregulation, on the fraction of DNA sequence space filled with TF binding sites, and on the fraction of genes that must be expressed for optimal adaptation. I then use empirical data from RNA sequencing, protein-binding microarrays, and genome evolution, together with population genetic simulations, to ask when these conditions are likely to be met. I show that they can be met under realistic circumstances, but these circumstances may vary among organisms and environments. My analysis provides a framework in which improved theory and data collection can help us demonstrate the role of misregulation in adaptation. It also shows that misregulation, like DNA mutation, is one of life's many imperfections that can help propel Darwinian evolution.
Affiliation(s)
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, CH-8057, Switzerland
- The Santa Fe Institute, Santa Fe, NM 87501, USA
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
46
|
Jin L, Tang Q, Hu S, Chen Z, Zhou X, Zeng B, Wang Y, He M, Li Y, Gui L, Shen L, Long K, Ma J, Wang X, Chen Z, Jiang Y, Tang G, Zhu L, Liu F, Zhang B, Huang Z, Li G, Li D, Gladyshev VN, Yin J, Gu Y, Li X, Li M. A pig BodyMap transcriptome reveals diverse tissue physiologies and evolutionary dynamics of transcription. Nat Commun 2021; 12:3715. [PMID: 34140474 PMCID: PMC8211698 DOI: 10.1038/s41467-021-23560-8] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 11/03/2020] [Accepted: 05/04/2021] [Indexed: 12/13/2022]
Abstract
A comprehensive transcriptomic survey of pigs can provide a mechanistic understanding of tissue specialization processes underlying economically valuable traits and accelerate their use as a biomedical model. Here we characterize four transcript types (lncRNAs, TUCPs, miRNAs, and circRNAs) and protein-coding genes in 31 adult pig tissues and two cell lines. We uncover the transcriptomic variability among 47 skeletal muscles, and six adipose depots linked to their different origins, metabolism, cell composition, physical activity, and mitochondrial pathways. We perform comparative analysis of the transcriptomes of seven tissues from pigs and nine other vertebrates to reveal that evolutionary divergence in transcription potentially contributes to lineage-specific biology. Long-range promoter–enhancer interaction analysis in subcutaneous adipose tissues across species suggests evolutionarily stable transcription patterns likely attributable to redundant enhancers buffering gene expression patterns against perturbations, thereby conferring robustness during speciation. This study can facilitate adoption of the pig as a biomedical model for human biology and disease and uncovers the molecular bases of valuable traits. A comprehensive transcriptomic survey of the pig could enable mechanistic understanding of tissue specialization and accelerate its use as a biomedical model. Here the authors characterize four distinct transcript types in 31 adult pig tissues to dissect their distinct structural and transcriptional features and uncover transcriptomic variability related to tissue physiology.
Affiliation(s)
- Long Jin
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Qianzi Tang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China.
| | - Silu Hu
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Zhongxu Chen
- Department of Life Science, Tcuni Inc., Chengdu, Sichuan, China
| | - Xuming Zhou
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Bo Zeng
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Yuhao Wang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Mengnan He
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Yan Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Lixuan Gui
- Department of Life Science, Tcuni Inc., Chengdu, Sichuan, China
| | - Linyuan Shen
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Keren Long
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Jideng Ma
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Xun Wang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Zhengli Chen
- Key Laboratory of Animal Disease and Human Health of Sichuan Province, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Yanzhi Jiang
- College of Life Science, Sichuan Agricultural University, Ya'an, Sichuan, China
| | - Guoqing Tang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Li Zhu
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Fei Liu
- Information and Educational Technology Center, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Bo Zhang
- Ya'an Digital Economy Operation Company, Ya'an, Sichuan, China
| | - Zhiqing Huang
- Institute of Animal Nutrition, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Guisen Li
- Renal Department and Nephrology Institute, Sichuan Provincial People's Hospital, Chengdu, Sichuan, China
| | - Diyan Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Vadim N Gladyshev
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Jingdong Yin
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Yiren Gu
- Animal Breeding and Genetics Key Laboratory of Sichuan Province, Sichuan Animal Science Academy, Chengdu, Sichuan, China
| | - Xuewei Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China
| | - Mingzhou Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China.
| |
|
47
|
Hata T, Takada N, Hayakawa C, Kazama M, Uchikoba T, Tachikawa M, Matsuo M, Satoh S, Obokata J. De novo activated transcription of inserted foreign coding sequences is inheritable in the plant genome. PLoS One 2021; 16:e0252674. [PMID: 34111139 PMCID: PMC8191969 DOI: 10.1371/journal.pone.0252674] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 05/19/2021] [Indexed: 01/16/2023]
Abstract
The manner in which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome is poorly understood. To examine such processes of gene evolution, we performed an artificial evolutionary experiment in Arabidopsis thaliana. As a model of gene-birth events, we introduced a promoterless coding sequence of the firefly luciferase (LUC) gene and established 386 T2-generation transgenic lines. Among them, we determined the individual LUC insertion loci in 76 lines and found that one-third of them were transcribed de novo even in the intergenic or inherently unexpressed regions. In the transcribed lines, transcription-related chromatin marks were detected across the newly activated transcribed regions. These results agreed with our previous findings in A. thaliana cultured cells under a similar experimental scheme. A comparison of the results of the T2-plant and cultured cell experiments revealed that the de novo-activated transcription concomitant with local chromatin remodelling was inheritable. During one-generation inheritance, it seems likely that the transcription activities of the LUC inserts trapped by the endogenous genes/transcripts became stronger, while those of de novo transcription in the intergenic/untranscribed regions became weaker. These findings may offer a clue for the elucidation of the mechanism by which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome.
Affiliation(s)
- Takayuki Hata
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Naoto Takada
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Chihiro Hayakawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mei Kazama
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Tomohiro Uchikoba
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Makoto Tachikawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mitsuhiro Matsuo
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Soichirou Satoh
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Junichi Obokata
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| |
|
48
|
A comparative genomic approach using mouse and fruit fly data to discover genes involved in testis function in hymenopterans with a focus on Nasonia vitripennis. BMC Ecol Evol 2021; 21:90. [PMID: 34011283 PMCID: PMC8132408 DOI: 10.1186/s12862-021-01825-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/28/2020] [Accepted: 05/12/2021] [Indexed: 11/18/2022]
Abstract
Background: Spermatogenesis appears to be a relatively well-conserved process even among distantly related animal taxa such as invertebrates and vertebrates. Although Hymenopterans share many characteristics with other organisms, their complex haplodiploid reproduction system is still relatively unknown. However, they serve as a complementary insect model to Drosophila for studying functional male fertility. In this study, we used a comparative method combining taxonomic, phenotypic data and gene expression to identify candidate genes that could play a significant role in spermatogenesis in hymenopterans. Results: Of the 546 mouse genes predominantly or exclusively expressed in the mouse testes, 36% had at least one ortholog in the fruit fly. Of these genes, 68% had at least one ortholog in one of the six hymenopteran species we examined. Based on their gene expression profiles in fruit fly testes, 71 of these genes were hypothesized to play a marked role in testis function. Forty-three of these 71 genes had an ortholog in at least one of the six hymenopteran species examined, and their enriched GO terms were related to the G2/M transition or to cilium organization, assembly, or movement. Second, of the 379 genes putatively involved in male fertility in Drosophila, 224 had at least one ortholog in each of the six Hymenoptera species. Finally, we showed that 199 of these genes were expressed in early pupal testis in Nasonia vitripennis; 86 exhibited a high level of expression, and 54 displayed modulated expression during meiosis. Conclusions: In this study combining phylogenetic and experimental approaches, we highlighted genes that may have a major role in gametogenesis in hymenopterans, an essential prerequisite for further research on the functional importance of these genes.
|
49
|
Witt E, Svetec N, Benjamin S, Zhao L. Transcription Factors Drive Opposite Relationships between Gene Age and Tissue Specificity in Male and Female Drosophila Gonads. Mol Biol Evol 2021; 38:2104-2115. [PMID: 33481021 PMCID: PMC8097261 DOI: 10.1093/molbev/msab011] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/21/2022]
Abstract
Evolutionarily young genes are usually preferentially expressed in the testis across species. Although it is known that older genes are generally more broadly expressed than younger genes, the properties that shaped this pattern are unknown. Older genes may gain expression across other tissues uniformly, or faster in certain tissues than others. Using Drosophila gene expression data, we confirmed previous findings that younger genes are disproportionately testis biased and older genes are disproportionately ovary biased. We found that the relationship between gene age and expression is stronger in the ovary than any other tissue and weakest in testis. We performed ATAC-seq on Drosophila testis and found that although genes of all ages are more likely to have open promoter chromatin in testis than in ovary, promoter chromatin alone does not explain the ovary bias of older genes. Instead, we found that upstream transcription factor (TF) expression is highly predictive of gene expression in ovary but not in testis. In the ovary, TF expression is more predictive of gene expression than open promoter chromatin, whereas testis gene expression is similarly influenced by both TF expression and open promoter chromatin. We propose that the testis is uniquely able to express younger genes controlled by relatively few TFs, whereas older genes with more TF partners are broadly expressed with peak expression most likely in the ovary. The testis allows widespread baseline expression that is relatively unresponsive to regulatory changes, whereas the ovary transcriptome is more responsive to trans-regulation and has a higher ceiling for gene expression.
Affiliation(s)
- Evan Witt
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| |
|
50
|
Lange A, Patel PH, Heames B, Damry AM, Saenger T, Jackson CJ, Findlay GD, Bornberg-Bauer E. Structural and functional characterization of a putative de novo gene in Drosophila. Nat Commun 2021; 12:1667. [PMID: 33712569 PMCID: PMC7954818 DOI: 10.1038/s41467-021-21667-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/09/2020] [Accepted: 02/03/2021] [Indexed: 11/26/2022]
Abstract
Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard's orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard's structure appears to have been maintained with only minor changes over millions of years.
Affiliation(s)
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA, USA
| | - Brennen Heames
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Adam M Damry
- Research School of Chemistry, ANU College of Science, Canberra, Australia
| | - Thorsten Saenger
- Department of Pediatric Kidney, Liver and Metabolic Diseases, Hannover Medical School, Hannover, Germany
| | - Colin J Jackson
- Research School of Chemistry, ANU College of Science, Canberra, Australia
| | | | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
| |
|