1
Glaser-Schmitt A, Lebherz M, Saydam E, Bornberg-Bauer E, Parsch J. Expression of De Novo Open Reading Frames in Natural Populations of Drosophila melanogaster. Journal of Experimental Zoology. Part B, Molecular and Developmental Evolution 2025. [PMID: 40231390 DOI: 10.1002/jez.b.23297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2025] [Revised: 03/14/2025] [Accepted: 04/03/2025] [Indexed: 04/16/2025]
Abstract
De novo genes, which originate from noncoding DNA, are known to have a high rate of turnover over short evolutionary timescales, such as within a species. Thus, their expression is often lineage- or genetic background-specific. However, little is known about their levels and breadth of expression as populations of a species diverge. In this study, we utilized publicly available RNA-seq data to examine the expression of newly evolved open reading frames (neORFs) in comparison to non- and protein-coding genes in Drosophila melanogaster populations from the derived species range in Europe and the ancestral range in sub-Saharan Africa. Our datasets included two adult tissue types as well as whole bodies at two temperatures for both sexes and three larval/prepupal developmental stages in a single tissue and sex, which allowed us to examine neORF expression and divergence across multiple sample types as well as sex and population. We detected a relatively large proportion (approximately 50%) of annotated neORFs as expressed in the population samples, with neORFs often showing greater expression divergence between populations than non- or protein-coding genes. However, differential expression of neORFs between populations tended to occur in a sample type-specific manner. On the other hand, neORFs displayed less sex-biased expression than the other two gene classes, with the majority of sex-biased neORFs detected in whole bodies, which may be attributable to the presence of the gonads. We also found that neORFs shared among multiple lines in the original set of inbred lines in which they were first detected were more likely to be both expressed and differentially expressed in the new population samples, suggesting that neORFs at a higher frequency (i.e. present in more individuals) within a species are more likely to be functional.
Affiliation(s)
- Amanda Glaser-Schmitt
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Munich, Bavaria, Germany
- Marie Lebherz
- Institute for Evolution and Biodiversity, University of Münster, Münster, North Rhine-Westphalia, Germany
- Ezgi Saydam
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Munich, Bavaria, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, North Rhine-Westphalia, Germany
- John Parsch
- Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Munich, Bavaria, Germany
2
Wang C, Feng B, Ding Y, Liu Q, Xia Y, Zheng X, Lian X, Wang X, Hou N, Wang L, Zhang H, Feng J, Tan B. Identification, characterization and expression analysis of lineage-specific genes within 'Zhongyoutao 14' peach (Prunus persica). Gene 2025; 941:149234. [PMID: 39814190 DOI: 10.1016/j.gene.2025.149234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 12/25/2024] [Accepted: 01/07/2025] [Indexed: 01/18/2025]
Abstract
BACKGROUND With the development of sequencing technology and the rapid increase in the number of sequenced genomes, lineage-specific genes (LSGs) have been identified and characterized across various species. Like conserved functional genes, LSGs play a crucial role in biological evolution and development. However, the understanding of LSGs remains limited. This study aims to identify significant gene expression profiles of LSGs in peach, which may contribute to the development of specific tissues and important traits. METHODS Seven peach genomes and 341 genomes from exogenous species were used in this study. First, the coding sequences of CN14 peach were compared with other genomes to discover LSGs. Next, the LSGs of CN14 peach were compared with other peach genomes to identify peach-specific genes (PSGs) and orphan genes. PSGs and orphan genes with tissue-specific expression were then identified using transcriptome data. In addition, genes specifically expressed in stem that might respond to GA3 treatment were identified using RT-qPCR. RESULTS A total of 74 PSGs and 91 orphan genes were identified. The PSGs and orphan genes had fewer exons, shorter gene lengths and lower molecular weights than evolutionarily conserved genes (ECGs). Some of these PSGs and orphan genes showed clear tissue-specific expression patterns in stem, fruit and flower. Three PSGs and three orphan genes were identified within the QTLs associated with temperature-sensitive semi-dwarf (TSSD), maturity date (Md), and red flesh around stone (Rfas). Three PSGs and seven orphan genes responded to GA3; these genes might play an important role in peach stem development. CONCLUSION The identification and characterization of PSGs and orphan genes not only provide valuable peach-specific genetic resources, but also might contribute to peach-specific biological processes.
Affiliation(s)
- Caijuan Wang
- College of Horticulture, Henan Agricultural University, Zhengzhou, China
- Beibei Feng
- College of Horticulture, Henan Agricultural University, Zhengzhou, China
- Yejun Ding
- College of Horticulture, Henan Agricultural University, Zhengzhou, China
- Qinqi Liu
- College of Horticulture, Henan Agricultural University, Zhengzhou, China
- Yukai Xia
- College of Horticulture, Henan Agricultural University, Zhengzhou, China
- Xianbo Zheng
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Xiaodong Lian
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Xiaobei Wang
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Nan Hou
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Lei Wang
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Haipeng Zhang
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Jiancan Feng
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
- Bin Tan
- College of Horticulture, Henan Agricultural University, Zhengzhou, China; Henan Engineering and Technology Center for Peach Germplasm Innovation and Utilization, Zhengzhou, China; International Joint Laboratory of Henan Horticultural Crop Biology, Zhengzhou, China
3
Ma Y, Zhai Q, Liu Z, Liu W. Genome-wide identification and characterization of alfalfa-specific genes in drought stress tolerance. Plant Physiology and Biochemistry: PPB 2025; 220:109474. [PMID: 39799784 DOI: 10.1016/j.plaphy.2025.109474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 12/24/2024] [Accepted: 01/02/2025] [Indexed: 01/15/2025]
Abstract
Alfalfa (Medicago sativa L.) is a prominent and distinct species within the pasture germplasm innovation industry. However, drought poses a substantial constraint on the yield and distribution of alfalfa by adversely affecting its growth. Although lineage-specific genes are instrumental in modulating plant responses to stress, their role in mediating alfalfa's tolerance to drought stress has yet to be elucidated. In this study, a total of 199 alfalfa-specific genes (ASGs) and 3054 legume-specific genes (LSGs) were identified in alfalfa. Compared with evolutionarily conserved genes, ASGs have shorter sequences and fewer or no introns. Many ASGs can be induced by various abiotic stresses, and the capability of MsASG166 to enhance drought resistance has been substantiated through transgenic research in both yeast and Arabidopsis thaliana. The RNA-Seq and WGCNA analyses revealed that DREB2A and MADS are pivotal genes in the molecular mechanisms through which MsASG166 positively modulates plant drought resistance. This study marks the first identification of lineage-specific genes in alfalfa and the first examination of the molecular roles of the MsASG166 gene in drought stress responses. The findings offer valuable genetic resources for the development of novel, genetically engineered alfalfa germplasm with enhanced drought tolerance.
Affiliation(s)
- Yitong Ma
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, Gansu 730020, China
- Qingyan Zhai
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, Gansu 730020, China
- Zhipeng Liu
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, Gansu 730020, China
- Wenxian Liu
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, Gansu 730020, China
4
Pereira AB, Marano M, Bathala R, Zaragoza RA, Neira A, Samano A, Owoyemi A, Casola C. Orphan genes are not a distinct biological entity. Bioessays 2025; 47:e2400146. [PMID: 39491810 DOI: 10.1002/bies.202400146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2024] [Revised: 10/06/2024] [Accepted: 10/11/2024] [Indexed: 11/05/2024]
Abstract
The genome sequencing revolution has revealed that all species possess a large number of unique genes critical for trait variation, adaptation, and evolutionary innovation. One widely used approach to identify such genes consists of detecting protein-coding sequences with no homology in other genomes, termed orphan genes. These genes have been extensively studied, under the assumption that they represent valid proxies for species-specific genes. Here, we critically evaluate taxonomic, phylogenetic, and sequence evolution evidence showing that orphan genes belong to a range of evolutionary ages and thus cannot be assigned to a single lineage. Furthermore, we show that the processes generating orphan genes are substantially more diverse than generally thought and include horizontal gene transfer, transposable element domestication, and overprinting. Thus, orphan genes represent a heterogeneous collection of genes rather than a single biological entity, making them unsuitable as a subject for meaningful investigation of gene evolution and phenotypic innovation.
Affiliation(s)
- Andres Barboza Pereira
- Interdisciplinary Graduate Program in Genetics & Genomics, Texas A&M University, College Station, Texas, USA
- Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Matthew Marano
- Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Ramya Bathala
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas, USA
- Andres Neira
- School of Pharmacy, Texas A&M University, College Station, Texas, USA
- Alex Samano
- Department of Biology, Texas A&M University, College Station, Texas, USA
- Adekola Owoyemi
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, Texas, USA
- Claudio Casola
- Interdisciplinary Graduate Program in Genetics & Genomics, Texas A&M University, College Station, Texas, USA
- Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, Texas, USA
5
Guay SY, Patel PH, Thomalla JM, McDermott KL, O'Toole JM, Arnold SE, Obrycki SJ, Wolfner MF, Findlay GD. An orphan gene is essential for efficient sperm entry into eggs in Drosophila melanogaster. bioRxiv: the preprint server for biology 2024:2024.08.08.607187. [PMID: 39149251 PMCID: PMC11326263 DOI: 10.1101/2024.08.08.607187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
While spermatogenesis has been extensively characterized in the Drosophila melanogaster model system, very little is known about the genes required for fly sperm entry into eggs. We identified a lineage-specific gene, which we named katherine johnson (kj), that is required for efficient fertilization. Males that do not express kj produce and transfer sperm that are stored normally in females, but sperm from these males enter eggs with severely reduced efficiency. Using a tagged transgenic rescue construct, we observed that the KJ protein localizes around the edge of the nucleus at various stages of spermatogenesis but is undetectable in mature sperm. These data suggest that kj exerts an effect on sperm development, the loss of which results in reduced fertilization ability. Interestingly, KJ protein lacks detectable sequence similarity to any other known protein, suggesting that kj could be a lineage-specific orphan gene. While previous bioinformatic analyses indicated that kj was restricted to the melanogaster group of Drosophila, we identified putative orthologs with conserved synteny, male-biased expression, and predicted protein features across the genus, as well as likely instances of gene loss in some lineages. Thus, kj was likely present in the Drosophila common ancestor and subsequently evolved an essential role in fertility in D. melanogaster. Our results demonstrate a new aspect of male reproduction that has been shaped by a lineage-specific gene and provide a molecular foothold for further investigating the mechanism of sperm entry into eggs in Drosophila.
Affiliation(s)
- Sara Y Guay
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Jonathon M Thomalla
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
- Kerry L McDermott
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Jillian M O'Toole
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Sarah E Arnold
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Sarah J Obrycki
- Department of Biology, College of the Holy Cross, Worcester, MA 01610
- Mariana F Wolfner
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
6
Zhao L, Svetec N, Begun DJ. De Novo Genes. Annu Rev Genet 2024; 58:211-232. [PMID: 39088850 PMCID: PMC12051474 DOI: 10.1146/annurev-genet-111523-102413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2024]
Abstract
Although the majority of annotated new genes in a given genome appear to have arisen from duplication-related mechanisms, recent studies have shown that genes can also originate de novo from ancestrally nongenic sequences. Investigating de novo-originated genes offers rich opportunities to understand the origin and functions of new genes, their regulatory mechanisms, and the associated evolutionary processes. Such studies have uncovered unexpected and intriguing facets of gene origination, offering novel perspectives on the complexity of the genome and gene evolution. In this review, we provide an overview of the research progress in this field, highlight recent advancements, identify key technical and conceptual challenges, and underscore critical questions that remain to be addressed.
Affiliation(s)
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- David J Begun
- Department of Evolution and Ecology, University of California, Davis, California, USA
7
Middendorf L, Ravi Iyengar B, Eicholt LA. Sequence, Structure, and Functional Space of Drosophila De Novo Proteins. Genome Biol Evol 2024; 16:evae176. [PMID: 39212966 PMCID: PMC11363682 DOI: 10.1093/gbe/evae176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/29/2024] [Indexed: 09/04/2024] Open
Abstract
During de novo emergence, new protein-coding genes emerge from previously nongenic sequences. The de novo proteins they encode are dissimilar in composition and predicted biochemical properties to conserved proteins. However, functional de novo proteins do exist. Both the identification of functional de novo proteins and their structural characterization are experimentally laborious. To identify functional and structured de novo proteins in silico, we applied recently developed machine-learning-based tools and found that most de novo proteins are indeed different from conserved proteins in both their structure and sequence. However, some de novo proteins are predicted to adopt known protein folds, participate in cellular reactions, and form biomolecular condensates. Apart from broadening our understanding of de novo protein evolution, our study also provides a large set of testable hypotheses for focused experimental studies on the structure and function of de novo proteins in Drosophila.
Affiliation(s)
- Lasse Middendorf
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Bharat Ravi Iyengar
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
- Lars A Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstrasse 1, 48149 Muenster, Germany
8
Domazet-Lošo M, Široki T, Šimičević K, Domazet-Lošo T. Macroevolutionary dynamics of gene family gain and loss along multicellular eukaryotic lineages. Nat Commun 2024; 15:2663. [PMID: 38531970 DOI: 10.1038/s41467-024-47017-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024] Open
Abstract
The gain and loss of genes fluctuate over evolutionary time in major eukaryotic clades. However, the full profile of these macroevolutionary trajectories is still missing. To give a more inclusive view on the changes in genome complexity across the tree of life, here we recovered the evolutionary dynamics of gene family gain and loss ranging from the ancestor of cellular organisms to 352 eukaryotic species. We show that in all considered lineages the gene family content follows a common evolutionary pattern, where the number of gene families reaches the highest value at a major evolutionary and ecological transition, and then gradually decreases towards extant organisms. This supports theoretical predictions and suggests that the genome complexity is often decoupled from commonly perceived organismal complexity. We conclude that simplification by gene family loss is a dominant force in Phanerozoic genomes of various lineages, probably underpinned by intense ecological specializations and functional outsourcing.
Affiliation(s)
- Mirjana Domazet-Lošo
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Tin Široki
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Korina Šimičević
- Department of Applied Computing, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000, Zagreb, Croatia
- Tomislav Domazet-Lošo
- Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia
- School of Medicine, Catholic University of Croatia, Ilica 242, HR-10000, Zagreb, Croatia
9
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. Nat Commun 2024; 15:810. [PMID: 38280868 PMCID: PMC10821953 DOI: 10.1038/s41467-024-45028-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 02/23/2023] [Accepted: 01/09/2024] [Indexed: 01/29/2024] Open
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with gene age. Surprisingly, we find little overall protein structural change in candidates from the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates were born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
10
Rahi ML, Mather PB, de Bello Cioffi M, Ezaz T, Hurwood DA. Genomic Basis of Freshwater Adaptation in the Palaemonid Prawn Genus Macrobrachium: Convergent Evolution Following Multiple Independent Colonization Events. J Mol Evol 2023; 91:976-989. [PMID: 38010517 DOI: 10.1007/s00239-023-10149-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 11/14/2023] [Indexed: 11/29/2023]
Abstract
Adaptation to different salinity environments can enhance morphological and genomic divergence between related aquatic taxa. Species of prawns in the genus Macrobrachium naturally inhabit different osmotic niches and possess distinctive lifecycle traits associated with salinity tolerance. This study was conducted to investigate the patterns of adaptive genomic divergence during freshwater colonization in 34 Macrobrachium species collected from four continents: Australia, Asia, North America and South America. The genotyping-by-sequencing (GBS) technique identified 5018 loci containing 82,636 single nucleotide polymorphisms (SNPs) that were used to reconstruct a phylogenomic tree. An additional phylogeny was reconstructed based on 43 candidate genes previously identified as being potentially associated with freshwater adaptation. Comparison of the two phylogenetic trees revealed contrasting topologies. The GBS tree indicated multiple independent continent-specific invasions into freshwater by Macrobrachium lineages following common marine ancestry, as species with abbreviated larval development (ALD), i.e., species having a full freshwater life history, appeared reciprocally monophyletic within each continent. In contrast, the candidate gene tree showed convergent evolution for all ALD species worldwide, forming a single, well-supported clade. This latter pattern is likely the result of common evolutionary pressures selecting key mutations favored in continental freshwater habitats. Results suggest that following multiple independent invasions into continental freshwaters at different evolutionary timescales, Macrobrachium taxa experienced adaptive genomic divergence and, in particular, convergence in the same genomic regions with parallel shifts in specific conserved phenotypic traits, such as the evolution of larger eggs with abbreviated larval development.
Affiliation(s)
- Md Lifat Rahi
- Fisheries and Marine Resource Technology Discipline, Khulna University, Khulna, Bangladesh
- Peter B Mather
- Faculty of Science, Queensland University of Technology (QUT), Brisbane, QLD, 4001, Australia
- Marcelo de Bello Cioffi
- Department of Genetics and Evolution, Federal University of Sao Carlos, São Carlos, SP, Brazil
- Tariq Ezaz
- Institute for Applied Ecology (IAE), University of Canberra (UC), Canberra, ACT, 2617, Australia
- David A Hurwood
- Faculty of Science, Queensland University of Technology (QUT), Brisbane, QLD, 4001, Australia
11
Liang X, Heath LS. Towards understanding paleoclimate impacts on primate de novo genes. G3 (Bethesda, Md.) 2023; 13:jkad135. [PMID: 37313728 PMCID: PMC10468307 DOI: 10.1093/g3journal/jkad135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/31/2023] [Accepted: 06/08/2023] [Indexed: 06/15/2023]
Abstract
De novo genes are genes that newly emerge in some species, such as primate de novo genes that emerge in certain primate species. Over the past decade, a great deal of research has been conducted on their emergence, origins, functions, and various attributes in different species, some of which has involved estimating the ages of de novo genes. However, limited by the number of species available for whole-genome sequencing, relatively few studies have focused specifically on the emergence time of primate de novo genes. Among those, even fewer have investigated the association between primate gene emergence and environmental factors, such as paleoclimate (ancient climate) conditions. This study investigates the relationship between paleoclimate and human gene emergence at primate species divergence. Based on 32 available primate genome sequences, this study reveals possible associations between temperature changes and the emergence of de novo primate genes. Overall, de novo genes tended to emerge in the most recent 13 MY, while temperatures continued to cool, which is consistent with past findings. Furthermore, in the context of an overall trend of cooling temperature, new primate genes were more likely to emerge during local warming periods, where the warm temperature more closely resembled the environmental condition that preceded the cooling trend. Results also indicate that both primate de novo genes and human cancer-associated genes have later origins in comparison to random human genes. Future studies can examine in greater depth human de novo gene emergence from an environmental perspective, as well as species divergence from a gene emergence perspective.
Affiliation(s)
- Xiao Liang
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Lenwood S Heath
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
12
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. G3 (Bethesda, Md.) 2023; 13:jkad122. [PMID: 37259569 PMCID: PMC10411569 DOI: 10.1093/g3journal/jkad122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/02/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of preexisting genes, putative de novo genes (genes born out of ancestrally nongenic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of 3 female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in 3 species: our focal species, Drosophila melanogaster, and 2 closely related species, Drosophila simulans and Drosophila yakuba, with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes located in sequence annotated as intergenic. Consistent with the literature, these genes tend to be short, single-exon, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and in both sexes. The relatively small number of intergenic candidate genes discovered here is similar to that observed in the accessory gland, but substantially smaller than that observed in the testis.
Collapse
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - David J Begun
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| |
|
13
|
Zhao Y, Huang S, Zhang Y, Tan C, Feng H. Role of Brassica orphan gene BrLFM on leafy head formation in Chinese cabbage (Brassica rapa). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:170. [PMID: 37420138 DOI: 10.1007/s00122-023-04411-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2023] [Accepted: 06/22/2023] [Indexed: 07/09/2023]
Abstract
Brassica orphan gene BrLFM, identified by two allelic mutants, was involved in leafy head formation in Chinese cabbage. Leafy head formation is a unique agronomic trait of Chinese cabbage that determines its yield and quality. In our previous study, an EMS mutagenesis Chinese cabbage mutant library was constructed using the heading Chinese cabbage double haploid (DH) line FT as the wild-type. Here, we screened two extremely similar leafy head deficiency mutants, lfm-1 and lfm-2, with geotropic growth leaves from the library to investigate the gene(s) related to leafy head formation. Reciprocal crossing results showed that these two mutants were allelic. We utilized lfm-1 to identify the mutant gene(s). Genetic analysis showed that the mutated trait was controlled by a single nuclear gene, Brlfm. Mutmap analysis showed that Brlfm was located on chromosome A05, and that either BraA05g012440.3C or BraA05g021450.3C was the candidate gene. Kompetitive allele-specific PCR analysis eliminated BraA05g012440.3C from the candidates. Sanger sequencing identified an SNP from G to A at the 271st nucleotide on BraA05g021450.3C. The sequencing of lfm-2 detected another non-synonymous SNP (G to A) located at the 266th nucleotide on BraA05g021450.3C, which verified its function in leafy head formation. A BLAST search of BraA05g021450.3C against a database showed that it is a Brassica orphan gene encoding an unknown 13.74 kDa protein, named BrLFM. Subcellular localization showed that BrLFM was located in the nucleus. These findings reveal that BrLFM is involved in leafy head formation in Chinese cabbage.
Affiliation(s)
- Yonghui Zhao
- College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
| | - Shengnan Huang
- College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
| | - Yun Zhang
- College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
| | - Chong Tan
- College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
| | - Hui Feng
- College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China.
| |
|
14
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.13.532420. [PMID: 37425675 PMCID: PMC10326970 DOI: 10.1101/2023.03.13.532420] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/11/2023]
Abstract
Although previously thought to be unlikely, recent studies have shown that de novo gene origination from previously non-genic sequences is a relatively common mechanism for gene innovation in many species and taxa. These young genes provide a unique set of candidates to study the structural and functional origination of proteins. However, our understanding of their protein structures and how these structures originate and evolve is still limited, due to a lack of systematic studies. Here, we combined high-quality base-level whole genome alignments, bioinformatic analysis, and computational structure modeling to study the origination, evolution, and protein structure of lineage-specific de novo genes. We identified 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. We found a gradual shift in sequence composition, evolutionary rates, and expression patterns with their gene ages, which indicates possible gradual shifts or adaptations of their functions. Surprisingly, we found little overall protein structural change for de novo genes in the Drosophilinae lineage. Using Alphafold2, ESMFold, and molecular dynamics, we identified a number of de novo gene candidates with protein products that are potentially well-folded, many of which are more likely to contain transmembrane and signal proteins compared to other annotated protein-coding genes. Using ancestral sequence reconstruction, we found that most potentially well-folded proteins are often born folded. Interestingly, we observed one case where disordered ancestral proteins become ordered within a relatively short evolutionary time. Single-cell RNA-seq analysis in testis showed that although most de novo genes are enriched in spermatocytes, several young de novo genes are biased in the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| |
|
15
|
Fakhar AZ, Liu J, Pajerowska-Mukhtar KM, Mukhtar MS. The Lost and Found: Unraveling the Functions of Orphan Genes. J Dev Biol 2023; 11:27. [PMID: 37367481 PMCID: PMC10299390 DOI: 10.3390/jdb11020027] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/08/2023] [Revised: 05/19/2023] [Accepted: 05/26/2023] [Indexed: 06/28/2023] Open
Abstract
Orphan Genes (OGs) are a mysterious class of genes that have recently gained significant attention. Despite lacking a clear evolutionary history, they are found in nearly all living organisms, from bacteria to humans, and they play important roles in diverse biological processes. The discovery of OGs was first made through comparative genomics followed by the identification of unique genes across different species. OGs tend to be more prevalent in species with larger genomes, such as plants and animals, and their evolutionary origins remain unclear but potentially arise from gene duplication, horizontal gene transfer (HGT), or de novo origination. Although their precise function is not well understood, OGs have been implicated in crucial biological processes such as development, metabolism, and stress responses. To better understand their significance, researchers are using a variety of approaches, including transcriptomics, functional genomics, and molecular biology. This review offers a comprehensive overview of the current knowledge of OGs in all domains of life, highlighting the possible role of dark transcriptomics in their evolution. More research is needed to fully comprehend the role of OGs in biology and their impact on various biological processes.
Affiliation(s)
- M. Shahid Mukhtar
- Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL 35294, USA
| |
|
16
|
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.03.539262. [PMID: 37205537 PMCID: PMC10187257 DOI: 10.1101/2023.05.03.539262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of pre-existing genes, putative de novo genes (genes born out of previously non-genic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of three female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in three species (our focal species, D. melanogaster, and two closely related species, D. simulans and D. yakuba), with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes, which, consistent with the literature, tend to be short, simple, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and both sexes. The relatively small number of candidate genes discovered here is similar to that observed in the accessory gland, but substantially fewer than that observed in the testis.
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - David J Begun
- Department of Evolution and Ecology, University of California, Davis CA 95616
| |
|
17
|
Ezoe A, Iuchi S, Sakurai T, Aso Y, Tokunaga H, Vu AT, Utsumi Y, Takahashi S, Tanaka M, Ishida J, Ishitani M, Seki M. Fully sequencing the cassava full-length cDNA library reveals unannotated transcript structures and alternative splicing events in regions with a high density of single nucleotide variations, insertions-deletions, and heterozygous sequences. PLANT MOLECULAR BIOLOGY 2023; 112:33-45. [PMID: 37014509 DOI: 10.1007/s11103-023-01346-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 02/27/2023] [Indexed: 05/09/2023]
Abstract
The primary transcript structure provides critical insights into protein diversity, transcriptional modification, and functions. Cassava transcript structures are highly diverse because of alternative splicing (AS) events and high heterozygosity. To precisely determine and characterize transcript structures, fully sequencing cloned transcripts is the most reliable method. However, cassava annotations were mainly determined according to fragmentation-based sequencing analyses (e.g., EST and short-read RNA-seq). In this study, we sequenced the cassava full-length cDNA library, which included rare transcripts. We obtained 8,628 non-redundant fully sequenced transcripts and detected 615 unannotated AS events and 421 unannotated loci. The different protein sequences resulting from the unannotated AS events tended to have diverse functional domains, implying that unannotated AS contributes to the truncation of functional domains. The unannotated loci tended to be derived from orphan genes, implying that the loci may be associated with cassava-specific traits. Unexpectedly, individual cassava transcripts were more likely to have multiple AS events than Arabidopsis transcripts, suggestive of the regulated interactions between cassava splicing-related complexes. We also observed that the unannotated loci and/or AS events were commonly in regions with abundant single nucleotide variations, insertions-deletions, and heterozygous sequences. These findings reflect the utility of completely sequenced FLcDNA clones for overcoming cassava-specific annotation-related problems to elucidate transcript structures. Our work provides researchers with transcript structural details that are useful for annotating highly diverse and unique transcripts and alternative splicing events.
Affiliation(s)
- Akihiro Ezoe
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
| | - Satoshi Iuchi
- Experimental Plant Division, RIKEN BioResource Research Center, Tsukuba, Ibaraki, 305-0074, Japan
| | - Tetsuya Sakurai
- Multidisciplinary Science Cluster, Interdisciplinary Science Unit, Kochi University, Nankoku, Kochi, 783-8502, Japan
| | - Yukie Aso
- Experimental Plant Division, RIKEN BioResource Research Center, Tsukuba, Ibaraki, 305-0074, Japan
| | - Hiroki Tokunaga
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Tropical Agriculture Research Front, Japan International Research Center for Agricultural Sciences, Ishigaki, Okinawa, 907-0002, Japan
| | - Anh Thu Vu
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
| | - Yoshinori Utsumi
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
| | - Satoshi Takahashi
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Maho Tanaka
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Junko Ishida
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Manabu Ishitani
- International Center for Tropical Agriculture (CIAT), Km 17, Recta Cali-Palmira Apartado Aéreo 6713, Cali, Colombia
| | - Motoaki Seki
- Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan.
- Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan.
- Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka-cho, Totsuka-ku, Yokohama, Kanagawa, 244-0813, Japan.
| |
|
18
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
19
|
Li J, Shen J, Wang R, Chen Y, Zhang T, Wang H, Guo C, Qi J. The nearly complete assembly of the Cercis chinensis genome and Fabaceae phylogenomic studies provide insights into new gene evolution. PLANT COMMUNICATIONS 2023; 4:100422. [PMID: 35957520 PMCID: PMC9860166 DOI: 10.1016/j.xplc.2022.100422] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/01/2022] [Revised: 08/02/2022] [Accepted: 08/05/2022] [Indexed: 05/27/2023]
Abstract
Fabaceae is a large family of angiosperms with high biodiversity that contains a variety of economically important crops and model plants for the study of biological nitrogen fixation. Polyploidization events have been extensively studied in some Fabaceae plants, but the occurrence of new genes is still concealed, owing to a lack of genomic information on certain species of the basal clade of Fabaceae. Cercis chinensis (Cercidoideae) is one such species; it diverged earliest from Fabaceae and is essential for phylogenomic studies and new gene predictions in Fabaceae. To facilitate genomic studies on Fabaceae, we performed genome sequencing of C. chinensis and obtained a 352.84 Mb genome, which was further assembled into seven pseudochromosomes with 30 612 predicted protein-coding genes. Compared with other legume genomes, that of C. chinensis exhibits no lineage-specific polyploidization event. Further phylogenomic analyses of 22 legumes and 11 other angiosperms revealed that many gene families are lineage specific before and after the diversification of Fabaceae. Among them, dozens of genes are candidates for new genes that have evolved from intergenic regions and are thus regarded as de novo-originated genes. They differ significantly from established genes in coding sequence length, exon number, guanine-cytosine content, and expression patterns among tissues. Functional analysis revealed that many new genes are related to asparagine metabolism. This study represents an important advance in understanding the evolutionary pattern of new genes in legumes and provides a valuable resource for plant phylogenomic studies.
Affiliation(s)
- Jinglong Li
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
| | - Jingting Shen
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
| | - Rui Wang
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
| | - Yamao Chen
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
| | - Taikui Zhang
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China
| | - Haifeng Wang
- College of Agriculture, Guangxi University, Nanning 530004, China
| | - Chunce Guo
- Jiangxi Provincial Key Laboratory for Bamboo Germplasm Resources and Utilization, Forestry College, Jiangxi Agricultural University, Nanchang 330045, China
| | - Ji Qi
- State Key Laboratory of Genetic Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200433, China.
| |
|
20
|
Ma J, Jiang Y, Pei W, Wu M, Ma Q, Liu J, Song J, Jia B, Liu S, Wu J, Zhang J, Yu J. Expressed genes and their new alleles identification during fibre elongation reveal the genetic factors underlying improvements of fibre length in cotton. PLANT BIOTECHNOLOGY JOURNAL 2022; 20:1940-1955. [PMID: 35718938 PMCID: PMC9491459 DOI: 10.1111/pbi.13874] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/28/2021] [Revised: 05/29/2022] [Accepted: 06/11/2022] [Indexed: 05/27/2023]
Abstract
Interspecific breeding in cotton takes advantage of genetic recombination among desirable genes from different parental lines. However, the expression new alleles (ENAs) from crossovers within genic regions and their significance in fibre length (FL) improvement are currently not understood. Here, we generated resequencing genomes of 191 interspecific backcross inbred lines derived from CRI36 (Gossypium hirsutum) × Hai7124 (Gossypium barbadense) and 277 dynamic fibre transcriptomes to identify the ENAs and extremely expressed genes (eGenes) potentially influencing FL, and uncovered the dynamic regulatory network of fibre elongation. Of 35 420 eGenes in developing fibres, 10 366 ENAs were identified and preferentially distributed in chromosomes subtelomeric regions. In total, 1056-1255 ENAs showed transgressive expression in fibres at 5-15 dpa (days post-anthesis) of some BILs, 520 of which were located in FL-quantitative trait locus (QTLs) and GhFLA9 (recombination allele) was identified with a larger effect for FL than GhFLA9 of CRI36 allele. Using ENAs as a type of markers, we identified three novel FL-QTLs. Additionally, 456 extremely eGenes were identified that were preferentially distributed in recombination hotspots. Importantly, 34 of them were significantly associated with FL. Gene expression quantitative trait locus analysis identified 1286, 1089 and 1059 eGenes that were colocalized with the FL trait at 5, 10 and 15 dpa, respectively. Finally, we verified the Ghir_D10G011050 gene linked to fibre elongation by the CRISPR-cas9 system. This study provides the first glimpse into the occurrence, distribution and expression of the developing fibres genes (especially ENAs) in an introgression population, and their possible biological significance in FL.
Affiliation(s)
- Jianjiang Ma
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
| | - Yafei Jiang
- Novogene Bioinformatics Institute, Beijing, China
| | - Wenfeng Pei
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Man Wu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Qifeng Ma
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Ji Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Jikun Song
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Bing Jia
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Shang Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
| | - Jianyong Wu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
| | - Jinfa Zhang
- Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, New Mexico, USA
| | - Jiwen Yu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory of Cotton Genetic Improvement, Ministry of Agriculture, Anyang, China
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, China
| |
|
21
|
Moutinho AF, Eyre-Walker A, Dutheil JY. Strong evidence for the adaptive walk model of gene evolution in Drosophila and Arabidopsis. PLoS Biol 2022; 20:e3001775. [PMID: 36099311 PMCID: PMC9470001 DOI: 10.1371/journal.pbio.3001775] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/03/2021] [Accepted: 08/01/2022] [Indexed: 11/19/2022] Open
Abstract
Understanding the dynamics of species adaptation to their environments has long been a central focus of the study of evolution. Theories of adaptation propose that populations evolve by “walking” in a fitness landscape. This “adaptive walk” is characterised by a pattern of diminishing returns, where populations further away from their fitness optimum take larger steps than those closer to their optimal conditions. Hence, we expect young genes to evolve faster and experience mutations with stronger fitness effects than older genes because they are further away from their fitness optimum. Testing this hypothesis, however, constitutes an arduous task. Young genes are small, encode proteins with a higher degree of intrinsic disorder, are expressed at lower levels, and are involved in species-specific adaptations. Since all these factors lead to increased protein evolutionary rates, they could be masking the effect of gene age. While controlling for these factors, we used population genomic data sets of Arabidopsis and Drosophila and estimated the rate of adaptive substitutions across genes from different phylostrata. We found that a gene’s evolutionary age significantly impacts the molecular rate of adaptation. Moreover, we observed that substitutions in young genes tend to have larger physicochemical effects. Our study, therefore, provides strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale. This study uses population genomic datasets from Arabidopsis and Drosophila to show that young genes adapt faster and are subject to mutations of larger fitness effects, providing strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale.
Affiliation(s)
- Ana Filipa Moutinho
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
| | - Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
| | - Julien Y. Dutheil
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Unité Mixte de Recherche 5554 Institut des Sciences de l’Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
| |
|
22
|
Jiang M, Li X, Dong X, Zu Y, Zhan Z, Piao Z, Lang H. Research Advances and Prospects of Orphan Genes in Plants. FRONTIERS IN PLANT SCIENCE 2022; 13:947129. [PMID: 35874010 PMCID: PMC9305701 DOI: 10.3389/fpls.2022.947129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 06/23/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are defined as genes having no sequence similarity with genes present in other lineages. OGs are regarded as playing a key role in the development of lineage-specific adaptations and can also serve as a constant source of evolutionary novelty. These genes are often related to various stress responses, species-specific traits, and special expression regulation, and they also participate in primary substance metabolism. The advancement in sequencing tools and genome analysis methods has made the identification and characterization of OGs comparatively easier. In the study of OG functions in plants, significant progress has been made. We review recent advances in the fast-evolving characteristics, expression modulation, and functional analysis of OGs with a focus on their role in plant biology. We also emphasize current challenges and adoptable strategies, and discuss possible future directions for the functional study of OGs.
Affiliation(s)
- Mingliang Jiang
- School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
| | - Xiaonan Li
- College of Horticulture, Shenyang Agricultural University, Shenyang, China
| | - Xiangshu Dong
- School of Agriculture, Yunnan University, Kunming, China
| | - Ye Zu
- College of Horticulture, Shenyang Agricultural University, Shenyang, China
| | - Zongxiang Zhan
- College of Horticulture, Shenyang Agricultural University, Shenyang, China
| | - Zhongyun Piao
- College of Horticulture, Shenyang Agricultural University, Shenyang, China
| | - Hong Lang
- School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
| |
|
23
|
Prabh N, Rödelsperger C. Multiple Pristionchus pacificus genomes reveal distinct evolutionary dynamics between de novo candidates and duplicated genes. Genome Res 2022; 32:1315-1327. [PMID: 35618417 PMCID: PMC9341508 DOI: 10.1101/gr.276431.121] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/24/2021] [Accepted: 05/20/2022] [Indexed: 01/03/2023]
Abstract
The birth of new genes is a major molecular innovation driving phenotypic diversity across all domains of life. Although repurposing of existing protein-coding material by duplication is considered the main process of new gene formation, recent studies have discovered thousands of transcriptionally active sequences as a rich source of new genes. However, differential loss rates have to be assumed to reconcile the high birth rates of these incipient de novo genes with the dominance of ancient gene families in individual genomes. Here, we test this rapid turnover hypothesis in the context of the nematode model organism Pristionchus pacificus We extended the existing species-level phylogenomic framework by sequencing the genomes of six divergent P. pacificus strains. We used these data to study the evolutionary dynamics of different age classes and categories of origin at a population level. Contrasting de novo candidates with new families that arose by duplication and divergence from known genes, we find that de novo candidates are typically shorter, show less expression, and are overrepresented on the sex chromosome. Although the contribution of de novo candidates increases toward young age classes, multiple comparisons within the same age class showed significantly higher attrition in de novo candidates than in known genes. Similarly, young genes remain under weak evolutionary constraints with de novo candidates representing the fastest evolving subcategory. Altogether, this study provides empirical evidence for the rapid turnover hypothesis and highlights the importance of the evolutionary timescale when quantifying the contribution of different mechanisms toward new gene formation.
Affiliation(s)
- Neel Prabh
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
- Christian Rödelsperger
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
|
24
|
Raxwal VK, Singh S, Agarwal M, Riha K. Transcriptional and post-transcriptional regulation of young genes in plants. BMC Biol 2022; 20:134. [PMID: 35676681 PMCID: PMC9178820 DOI: 10.1186/s12915-022-01339-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 12/03/2022] Open
Abstract
Background: New genes continuously emerge from non-coding DNA or by diverging from existing genes, but most of them are rapidly lost and only a few become fixed within the population. We hypothesized that young genes are subject to transcriptional and post-transcriptional regulation to limit their expression and minimize their exposure to purifying selection.
Results: We performed a protein-based homology search across the tree of life to determine the evolutionary age of protein-coding genes present in the rice genome. We found that young genes in rice have relatively low expression levels, which can be attributed to distal enhancers, and closed chromatin conformation at their transcription start sites (TSS). The chromatin in TSS regions can be re-modeled in response to abiotic stress, indicating conditional expression of young genes. Furthermore, transcripts of young genes in Arabidopsis tend to be targeted by nonsense-mediated RNA decay, presenting another layer of regulation limiting their expression.
Conclusions: These data suggest that transcriptional and post-transcriptional mechanisms contribute to the conditional expression of young genes, which may alleviate purging selection while providing an opportunity for phenotypic exposure and functionalization.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01339-7.
Affiliation(s)
- Vivek Kumar Raxwal
  - Department of Botany, University of Delhi, Delhi, 110007, India
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
- Somya Singh
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Manu Agarwal
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Karel Riha
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
|
25
|
Li F, Rane RV, Luria V, Xiong Z, Chen J, Li Z, Catullo RA, Griffin PC, Schiffer M, Pearce S, Lee SF, McElroy K, Stocker A, Shirriffs J, Cockerell F, Coppin C, Sgrò CM, Karger A, Cain JW, Weber JA, Santpere G, Kirschner MW, Hoffmann AA, Oakeshott JG, Zhang G. Phylogenomic analyses of the genus Drosophila reveals genomic signals of climate adaptation. Mol Ecol Resour 2022; 22:1559-1581. [PMID: 34839580 PMCID: PMC9299920 DOI: 10.1111/1755-0998.13561] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/09/2020] [Accepted: 11/10/2021] [Indexed: 01/13/2023]
Abstract
Many Drosophila species differ widely in their distributions and climate niches, making them excellent subjects for evolutionary genomic studies. Here, we have developed a database of high-quality assemblies for 46 Drosophila species and one closely related Zaprionus. Fifteen of the genomes were newly sequenced, and 20 were improved with additional sequencing. New or improved annotations were generated for all 47 species, assisted by new transcriptomes for 19. Phylogenomic analyses of these data resolved several previously ambiguous relationships, especially in the melanogaster species group. However, it also revealed significant phylogenetic incongruence among genes, mainly in the form of incomplete lineage sorting in the subgenus Sophophora but also including asymmetric introgression in the subgenus Drosophila. Using the phylogeny as a framework and taking into account these incongruences, we then screened the data for genome-wide signals of adaptation to different climatic niches. First, phylostratigraphy revealed relatively high rates of recent novel gene gain in three temperate pseudoobscura and five desert-adapted cactophilic mulleri subgroup species. Second, we found differing ratios of nonsynonymous to synonymous substitutions in several hundred orthologues between climate generalists and specialists, with trends for significantly higher ratios for those in tropical and lower ratios for those in temperate-continental specialists respectively than those in the climate generalists. Finally, resequencing natural populations of 13 species revealed tropics-restricted species generally had smaller population sizes, lower genome diversity and more deleterious mutations than the more widespread species. We conclude that adaptation to different climates in the genus Drosophila has been associated with large-scale and multifaceted genomic changes.
Affiliation(s)
- Fang Li
  - BGI-Shenzhen, Shenzhen, China
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rahul V. Rane
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Victor Luria
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Zijun Xiong
  - BGI-Shenzhen, Shenzhen, China
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Renee A. Catullo
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
  - Division of Ecology and Evolution, Centre for Biodiversity Analysis, The Australian National University, Acton, ACT, Australia
- Philippa C. Griffin
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Michele Schiffer
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
  - Daintree Rainforest Observatory, James Cook University, Cape Tribulation, Qld, Australia
- Stephen Pearce
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Siu Fai Lee
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
  - Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
- Kerensa McElroy
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Ann Stocker
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Jennifer Shirriffs
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- Fiona Cockerell
  - School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Chris Coppin
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
- Carla M. Sgrò
  - School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Amir Karger
  - IT - Research Computing, Harvard Medical School, Boston, Massachusetts, USA
- John W. Cain
  - Department of Mathematics, Harvard University, Cambridge, Massachusetts, USA
- Jessica A. Weber
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Gabriel Santpere
  - Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Marc W. Kirschner
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Ary A. Hoffmann
  - Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
- John G. Oakeshott
  - Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
  - Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
- Guojie Zhang
  - BGI-Shenzhen, Shenzhen, China
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
26
|
Gueno J, Borg M, Bourdareau S, Cossard G, Godfroy O, Lipinska A, Tirichine L, Cock J, Coelho S. Chromatin landscape associated with sexual differentiation in a UV sex determination system. Nucleic Acids Res 2022; 50:3307-3322. [PMID: 35253891 PMCID: PMC8989524 DOI: 10.1093/nar/gkac145] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/06/2021] [Revised: 02/15/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022] Open
Abstract
In many eukaryotes, such as dioicous mosses and many algae, sex is determined by UV sex chromosomes and is expressed during the haploid phase of the life cycle. In these species, the male and female developmental programs are initiated by the presence of the U- or V-specific regions of the sex chromosomes but, as in XY and ZW systems, sexual differentiation is largely driven by autosomal sex-biased gene expression. The mechanisms underlying the regulation of sex-biased expression of genes during sexual differentiation remain elusive. Here, we investigated the extent and nature of epigenomic changes associated with UV sexual differentiation in the brown alga Ectocarpus, a model UV system. Six histone modifications were quantified in near-isogenic lines, leading to the identification of 16 chromatin signatures across the genome. Chromatin signatures correlated with levels of gene expression and histone PTMs changes in males versus females occurred preferentially at genes involved in sex-specific pathways. Despite the absence of chromosome scale dosage compensation and the fact that UV sex chromosomes recombine across most of their length, the chromatin landscape of these chromosomes was remarkably different to that of autosomes. Hotspots of evolutionary young genes in the pseudoautosomal regions appear to drive the exceptional chromatin features of UV sex chromosomes.
Affiliation(s)
- Josselin Gueno
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
- Michael Borg
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Simon Bourdareau
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
- Guillaume Cossard
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
- Olivier Godfroy
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
- Agnieszka Lipinska
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Leila Tirichine
  - Nantes Universite, CNRS, US2B, UMR 6286, F-44000, Nantes, France
- J Mark Cock
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
- Susana M Coelho
  - Sorbonne Université, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS 90074, F-29688 Roscoff, France
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
|
27
|
Heinen T, Xie C, Keshavarz M, Stappert D, Künzel S, Tautz D. Evolution of a New Testis-Specific Functional Promoter Within the Highly Conserved Map2k7 Gene of the Mouse. Front Genet 2022; 12:812139. [PMID: 35069705 PMCID: PMC8766832 DOI: 10.3389/fgene.2021.812139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 12/08/2021] [Indexed: 12/03/2022] Open
Abstract
Map2k7 (synonym Mkk7) is a conserved regulatory kinase gene and a central component of the JNK signaling cascade with key functions during cellular differentiation. It shows complex transcription patterns, and different transcript isoforms are known in the mouse (Mus musculus). We have previously identified a newly evolved testis-specific transcript for the Map2k7 gene in the subspecies M. m. domesticus. Here, we identify the new promoter that drives this transcript and find that it codes for an open reading frame (ORF) of 50 amino acids. The new promoter was gained in the stem lineage of closely related mouse species but was secondarily lost in the subspecies M. m. musculus and M. m. castaneus. A single mutation can be correlated with its transcriptional activity in M. m. domesticus, and cell culture assays demonstrate the capability of this mutation to drive expression. A mouse knockout line in which the promoter region of the new transcript is deleted reveals a functional contribution of the newly evolved promoter to sperm motility and the spermatid transcriptome. Our data show that a new functional transcript (and possibly protein) can evolve within an otherwise highly conserved gene, supporting the notion of regulatory changes contributing to the emergence of evolutionary novelties.
Affiliation(s)
- Chen Xie
  - Max-Planck Institute for Evolutionary Biology, Plön, Germany
- Maryam Keshavarz
  - Max-Planck Institute for Evolutionary Biology, Plön, Germany
  - Deutsches Zentrum für Neurodegenerative Erkrankungen e. V. (DZNE), Bonn, Germany
- Dominik Stappert
  - Deutsches Zentrum für Neurodegenerative Erkrankungen e. V. (DZNE), Bonn, Germany
- Sven Künzel
  - Max-Planck Institute for Evolutionary Biology, Plön, Germany
- Diethard Tautz
  - Max-Planck Institute for Evolutionary Biology, Plön, Germany
|
28
|
Cherezov RO, Vorontsova JE, Simonova OB. The Phenomenon of Evolutionary “De Novo Generation” of Genes. Russ J Dev Biol 2021. [DOI: 10.1134/s1062360421060035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
29
|
Papadopoulos C, Callebaut I, Gelly JC, Hatin I, Namy O, Renard M, Lespinet O, Lopes A. Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution. Genome Res 2021; 31:2303-2315. [PMID: 34810219 PMCID: PMC8647833 DOI: 10.1101/gr.275638.121] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/13/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
Affiliation(s)
- Chris Papadopoulos
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
- Jean-Christophe Gelly
  - Université de Paris, Biologie Intégrée du Globule Rouge, UMR_S1134, BIGR, INSERM, F-75015 Paris, France
  - Laboratoire d'Excellence GR-Ex, 75015 Paris, France
  - Institut National de la Transfusion Sanguine, F-75015 Paris, France
- Isabelle Hatin
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Namy
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Maxime Renard
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Lespinet
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Anne Lopes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
|
30
|
Castro JF, Tautz D. The Effects of Sequence Length and Composition of Random Sequence Peptides on the Growth of E. coli Cells. Genes (Basel) 2021; 12:1913. [PMID: 34946861 PMCID: PMC8702183 DOI: 10.3390/genes12121913] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/17/2021] [Revised: 11/22/2021] [Accepted: 11/26/2021] [Indexed: 12/21/2022] Open
Abstract
We study the potential for the de novo evolution of genes from random nucleotide sequences using libraries of E. coli expressing random sequence peptides. We assess the effects of such peptides on cell growth by monitoring frequency changes in individual clones in a complex library through four serial passages. Using a new analysis pipeline that allows the tracing of peptides of all lengths, we find that over half of the peptides have consistent effects on cell growth. Across nine different experiments, around 16% of clones increase in frequency and 36% decrease, with some variation between individual experiments. Shorter peptides (8-20 residues), are more likely to increase in frequency, longer ones are more likely to decrease. GC content, amino acid composition, intrinsic disorder, and aggregation propensity show slightly different patterns between peptide groups. Sequences that increase in frequency tend to be more disordered with lower aggregation propensity. This coincides with the observation that young genes with more disordered structures are better tolerated in genomes. Our data indicate that random sequences can be a source of evolutionary innovation, since a large fraction of them are well tolerated by the cells or can provide a growth advantage.
Affiliation(s)
- Diethard Tautz
  - Max Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306 Plön, Germany
|
31
|
Watson AK, Lopez P, Bapteste E. Hundreds of out-of-frame remodelled gene families in the E. coli pangenome. Mol Biol Evol 2021; 39:6430988. [PMID: 34792602 PMCID: PMC8788219 DOI: 10.1093/molbev/msab329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023] Open
Abstract
All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and recent work has begun to elucidate the processes by which de novo gene formation can occur. A special case of de novo gene formation, overprinting, describes the origin of new genes from noncoding alternative reading frames of existing open reading frames (ORFs). We argue that additionally, out-of-frame gene fission/fusion events of alternative reading frames of ORFs and out-of-frame lateral gene transfers could contribute to the origin of new gene families. To demonstrate this, we developed an original pattern-search in sequence similarity networks, enhancing the use of these graphs, commonly used to detect in-frame remodeled genes. We applied this approach to gene families in 524 complete genomes of Escherichia coli. We identified 767 gene families whose evolutionary history likely included at least one out-of-frame remodeling event. These genes with out-of-frame components represent ∼2.5% of all genes in the E. coli pangenome, suggesting that alternative reading frames of existing ORFs can contribute to a significant proportion of de novo genes in bacteria.
Affiliation(s)
- Andrew K Watson
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Philippe Lopez
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Eric Bapteste
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
|
32
|
Zhuang X, Cheng CHC. Propagation of a De Novo Gene under Natural Selection: Antifreeze Glycoprotein Genes and Their Evolutionary History in Codfishes. Genes (Basel) 2021; 12:1777. [PMID: 34828383 PMCID: PMC8622921 DOI: 10.3390/genes12111777] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/23/2021] [Revised: 11/08/2021] [Accepted: 11/08/2021] [Indexed: 11/16/2022] Open
Abstract
The de novo birth of functional genes from non-coding DNA as an important contributor to new gene formation is increasingly supported by evidence from diverse eukaryotic lineages. However, many uncertainties remain, including how the incipient de novo genes would continue to evolve and the molecular mechanisms underlying their evolutionary trajectory. Here we address these questions by investigating evolutionary history of the de novo antifreeze glycoprotein (AFGP) gene and gene family in gadid (codfish) lineages. We examined AFGP phenotype on a phylogenetic framework encompassing a broad sampling of gadids from freezing and non-freezing habitats. In three select species representing different AFGP-bearing clades, we analyzed all AFGP gene family members and the broader scale AFGP genomic regions in detail. Codon usage analyses suggest that motif duplication produced the intragenic AFGP tripeptide coding repeats, and rapid sequence divergence post-duplication stabilized the recombination-prone long repetitive coding region. Genomic loci analyses support AFGP originated once from a single ancestral genomic origin, and shed light on how the de novo gene proliferated into a gene family. Results also show the processes of gene duplication and gene loss are distinctive in separate clades, and both genotype and phenotype are commensurate with differential local selective pressures.
Affiliation(s)
- Xuan Zhuang
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR 72701, USA
- C.-H. Christina Cheng
  - Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, IL 61801, USA
|
33
|
Jin G, Ma PF, Wu X, Gu L, Long M, Zhang C, Li DZ. New Genes Interacted with Recent Whole Genome Duplicates in the Fast Stem Growth of Bamboos. Mol Biol Evol 2021; 38:5752-5768. [PMID: 34581782 PMCID: PMC8662795 DOI: 10.1093/molbev/msab288] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/12/2022] Open
Abstract
As drivers of evolutionary innovations, new genes allow organisms to explore new niches. However, clear examples of this process remain scarce. Bamboos, the unique grass lineage diversifying into the forest, have evolved with a key innovation of fast growth of woody stem, reaching up to 1 m/day. Here, we identify 1,622 bamboo-specific orphan genes that appeared in recent 46 million years, and 19 of them evolved from noncoding ancestral sequences with entire de novo origination process reconstructed. The new genes evolved gradually in exon−intron structure, protein length, expression specificity, and evolutionary constraint. These new genes, whether or not from de novo origination, are dominantly expressed in the rapidly developing shoots, and make transcriptomes of shoots the youngest among various bamboo tissues, rather than reproductive tissue in other plants. Additionally, the particularity of bamboo shoots has also been shaped by recent whole-genome duplicates (WGDs), which evolved divergent expression patterns from ancestral states. New genes and WGDs have been evolutionarily recruited into coexpression networks to underline fast-growing trait of bamboo shoot. Our study highlights the importance of interactions between new genes and genome duplicates in generating morphological innovation.
Affiliation(s)
- Guihua Jin
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Peng-Fei Ma
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Xiaopei Wu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Lianfeng Gu
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, 60637, USA
- Chengjun Zhang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
|
34
|
Wang YW, Hess J, Slot JC, Pringle A. De Novo Gene Birth, Horizontal Gene Transfer, and Gene Duplication as Sources of New Gene Families Associated with the Origin of Symbiosis in Amanita. Genome Biol Evol 2021; 12:2168-2182. [PMID: 32926145 PMCID: PMC7674699 DOI: 10.1093/gbe/evaa193] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 09/08/2020] [Indexed: 12/24/2022] Open
Abstract
By introducing novel capacities and functions, new genes and gene families may play a crucial role in ecological transitions. Mechanisms generating new gene families include de novo gene birth, horizontal gene transfer, and neofunctionalization following a duplication event. The ectomycorrhizal (ECM) symbiosis is a ubiquitous mutualism and the association has evolved repeatedly and independently many times among the fungi, but the evolutionary dynamics enabling its emergence remain elusive. We developed a phylogenetic workflow to first understand if gene families unique to ECM Amanita fungi and absent from closely related asymbiotic species are functionally relevant to the symbiosis, and then to systematically infer their origins. We identified 109 gene families unique to ECM Amanita species. Genes belonging to unique gene families are under strong purifying selection and are upregulated during symbiosis, compared with genes of conserved or orphan gene families. The origins of seven of the unique gene families are strongly supported as either de novo gene birth (two gene families), horizontal gene transfer (four), or gene duplication (one). An additional 34 families appear new because of their selective retention within symbiotic species. Among the 109 unique gene families, the most upregulated gene in symbiotic cultures encodes a 1-aminocyclopropane-1-carboxylate deaminase, an enzyme capable of downregulating the synthesis of the plant hormone ethylene, a common negative regulator of plant-microbial mutualisms.
Affiliation(s)
- Yen-Wen Wang
  - Departments of Botany and Bacteriology, University of Wisconsin-Madison
- Jaqueline Hess
  - Department of Soil Ecology, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Jason C Slot
  - Department of Plant Pathology, The Ohio State University
- Anne Pringle
  - Departments of Botany and Bacteriology, University of Wisconsin-Madison
|
35
|
Rivard EL, Ludwig AG, Patel PH, Grandchamp A, Arnold SE, Berger A, Scott EM, Kelly BJ, Mascha GC, Bornberg-Bauer E, Findlay GD. A putative de novo evolved gene required for spermatid chromatin condensation in Drosophila melanogaster. PLoS Genet 2021; 17:e1009787. [PMID: 34478447 PMCID: PMC8445463 DOI: 10.1371/journal.pgen.1009787] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 06/29/2021] [Revised: 09/16/2021] [Accepted: 08/19/2021] [Indexed: 02/07/2023] Open
Abstract
Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas, required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas. The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.
Affiliation(s)
- Emily L. Rivard
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Andrew G. Ludwig
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Prajal H. Patel
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Sarah E. Arnold
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Emilie M. Scott
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Brendan J. Kelly
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Grace C. Mascha
- College of the Holy Cross, Worcester, Massachusetts, United States of America
- Erich Bornberg-Bauer
- University of Münster, Münster, Germany
- Max Planck Institute for Developmental Biology, Tübingen, Germany
- Geoffrey D. Findlay
- College of the Holy Cross, Worcester, Massachusetts, United States of America
36
Li J, Singh U, Arendsee Z, Wurtele ES. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. Front Genet 2021; 12:722981. [PMID: 34484307 PMCID: PMC8415361 DOI: 10.3389/fgene.2021.722981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/09/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
The "dark transcriptome" can be considered the multitude of sequences that are transcribed but not annotated as genes. We evaluated expression of 6,692 annotated genes and 29,354 unannotated open reading frames (ORFs) in the Saccharomyces cerevisiae genome across diverse environmental, genetic and developmental conditions (3,457 RNA-Seq samples). Over 30% of the highly transcribed ORFs have translation evidence. Phylostratigraphic analysis infers most of these transcribed ORFs would encode species-specific proteins ("orphan-ORFs"); hundreds have mean expression comparable to annotated genes. These data reveal unannotated ORFs most likely to be protein-coding genes. We partitioned a co-expression matrix by Markov Chain Clustering; the resultant clusters contain 2,468 orphan-ORFs. We provide the aggregated RNA-Seq yeast data with extensive metadata as a project in MetaOmGraph (MOG), a tool designed for interactive analysis and visualization. This approach enables reuse of public RNA-Seq data for exploratory discovery, providing a rich context for experimentalists to make novel, experimentally testable hypotheses about candidate genes.
Affiliation(s)
- Jing Li
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Urminder Singh
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Zebulun Arendsee
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Eve Syrkin Wurtele
- Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
- Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
37
Ma D, Ding Q, Guo Z, Zhao Z, Wei L, Li Y, Song S, Zheng HL. Identification, characterization and expression analysis of lineage-specific genes within mangrove species Aegiceras corniculatum. Mol Genet Genomics 2021; 296:1235-1247. [PMID: 34363105 DOI: 10.1007/s00438-021-01810-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/06/2021] [Accepted: 07/22/2021] [Indexed: 11/25/2022]
Abstract
Lineage-specific genes (LSGs) are genes that have no recognizable homology to any sequences in other species; they are important drivers of new functions and phenotypic changes and facilitate species adaptation to the environment. Aegiceras corniculatum is one of the major mangrove plant species adapted to waterlogging and saline conditions, and the exploration of aegiceras-specific genes (ASGs) is important to reveal its adaptation to the harsh environment. Here, we performed a systematic analysis of ASGs, focusing on their sequence characterization, origination and expression patterns. Our results reveal that there are 4823 ASGs in the genome, approximately 11.84% of all protein-coding genes. A high proportion (45.78%) of ASGs originate from gene duplication, and the time of gene duplication of ASGs is consistent with the timing of two whole-genome duplication (WGD) events that occurred in A. corniculatum, and also coincides with a short period of global warming during the Paleocene-Eocene Thermal Maximum (PETM, 55.5 million years ago). Gene structure analysis showed that ASGs have shorter protein lengths, fewer exons, and higher isoelectric points. Expression pattern analysis showed that ASGs had low levels of expression and more tissue-specific expression. Weighted gene co-expression network analysis (WGCNA) revealed that 86 ASGs co-expressed gene modules were primarily involved in pathways related to adversity stress, including plant hormone signal transduction, phenylpropanoid biosynthesis, photosynthesis, peroxisome and pentose phosphate pathway. This study provides a comprehensive analysis of the characteristics and potential functions of ASGs and identifies key candidate genes, which will contribute to the subsequent further investigation of the adaptation of A. corniculatum to intertidal coastal wetland habitats.
Affiliation(s)
- Dongna Ma
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Qiansu Ding
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zejun Guo
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zhizhu Zhao
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Liufeng Wei
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Yiying Li
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Institute of Applied Ecology, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Shiwei Song
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Hai-Lei Zheng
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China.
38
Witt E, Svetec N, Benjamin S, Zhao L. Transcription Factors Drive Opposite Relationships between Gene Age and Tissue Specificity in Male and Female Drosophila Gonads. Mol Biol Evol 2021; 38:2104-2115. [PMID: 33481021 PMCID: PMC8097261 DOI: 10.1093/molbev/msab011] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/21/2022]
Abstract
Evolutionarily young genes are usually preferentially expressed in the testis across species. Although it is known that older genes are generally more broadly expressed than younger genes, the properties that shaped this pattern are unknown. Older genes may gain expression across other tissues uniformly, or faster in certain tissues than others. Using Drosophila gene expression data, we confirmed previous findings that younger genes are disproportionately testis biased and older genes are disproportionately ovary biased. We found that the relationship between gene age and expression is stronger in the ovary than any other tissue and weakest in testis. We performed ATAC-seq on Drosophila testis and found that although genes of all ages are more likely to have open promoter chromatin in testis than in ovary, promoter chromatin alone does not explain the ovary bias of older genes. Instead, we found that upstream transcription factor (TF) expression is highly predictive of gene expression in ovary but not in testis. In the ovary, TF expression is more predictive of gene expression than open promoter chromatin, whereas testis gene expression is similarly influenced by both TF expression and open promoter chromatin. We propose that the testis is uniquely able to express younger genes controlled by relatively few TFs, whereas older genes with more TF partners are broadly expressed with peak expression most likely in the ovary. The testis allows widespread baseline expression that is relatively unresponsive to regulatory changes, whereas the ovary transcriptome is more responsive to trans-regulation and has a higher ceiling for gene expression.
Affiliation(s)
- Evan Witt
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
39
Hill T, Rosales-Stephens HL, Unckless RL. Rapid divergence of the male reproductive proteins in the Drosophila dunni group and implications for postmating incompatibilities between species. G3 (Bethesda) 2021; 11:jkab050. [PMID: 33599779 PMCID: PMC8759818 DOI: 10.1093/g3journal/jkab050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2020] [Accepted: 02/17/2021] [Indexed: 11/17/2022]
Abstract
Proteins involved in post-copulatory interactions between males and females are among the fastest evolving genes in many species, usually attributed to their involvement in reproductive conflict. As a result, these proteins are thought to often be involved in the formation of postmating-prezygotic incompatibilities between species. The Drosophila dunni subgroup consists of a dozen recently diverged species found across the Caribbean islands with varying levels of hybrid incompatibility. We performed experimental crosses between species in the dunni group and saw some evidence of hybrid incompatibilities. We also found evidence of reduced survival following hybrid mating, likely due to postmating-prezygotic incompatibilities. We assessed rates of evolution between these species' genomes and found evidence of rapid evolution and divergence of some reproductive proteins, specifically the seminal fluid proteins. This work suggests the rapid evolution of seminal fluid proteins may be associated with postmating-prezygotic isolation, which acts as a barrier for gene flow between even the most closely related species.
Affiliation(s)
- Tom Hill
- The Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, USA
- Robert L Unckless
- The Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, USA
40
Zhao Y, Lu GA, Yang H, Lin P, Liufu Z, Tang T, Xu J. Run or Die in the Evolution of New MicroRNAs-Testing the Red Queen Hypothesis on De Novo New Genes. Mol Biol Evol 2021; 38:1544-1553. [PMID: 33306129 PMCID: PMC8042761 DOI: 10.1093/molbev/msaa317] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 02/06/2023]
Abstract
The Red Queen hypothesis depicts evolution as the continual struggle to adapt. According to this hypothesis, new genes, especially those originating from nongenic sequences (i.e., de novo genes), are eliminated unless they evolve continually in adaptation to a changing environment. Here, we analyze two Drosophila de novo miRNAs that are expressed in a testis-specific manner with very high rates of evolution in their DNA sequence. We knocked out these miRNAs in two sibling species and investigated their contributions to different fitness components. We observed that the fitness contributions of miR-975 in Drosophila simulans seem positive, in contrast to its neutral contributions in D. melanogaster, whereas miR-983 appears to have negative contributions in both species, as the fitness of the knockout mutant increases. As predicted by the Red Queen hypothesis, the fitness difference of these de novo miRNAs indicates their different fates.
Affiliation(s)
- Yixin Zhao
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Guang-An Lu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Hao Yang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Pei Lin
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Zhongqi Liufu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Tian Tang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Jin Xu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
41
Lange A, Patel PH, Heames B, Damry AM, Saenger T, Jackson CJ, Findlay GD, Bornberg-Bauer E. Structural and functional characterization of a putative de novo gene in Drosophila. Nat Commun 2021; 12:1667. [PMID: 33712569 PMCID: PMC7954818 DOI: 10.1038/s41467-021-21667-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 01/09/2020] [Accepted: 02/03/2021] [Indexed: 11/26/2022]
Abstract
Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard's orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard's structure appears to have been maintained with only minor changes over millions of years.
Affiliation(s)
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA, USA
- Brennen Heames
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Adam M Damry
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Thorsten Saenger
- Department of Pediatric Kidney, Liver and Metabolic Diseases, Hannover Medical School, Hannover, Germany
- Colin J Jackson
- Research School of Chemistry, ANU College of Science, Canberra, Australia
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
42
Uncovering de novo gene birth in yeast using deep transcriptomics. Nat Commun 2021; 12:604. [PMID: 33504782 PMCID: PMC7841160 DOI: 10.1038/s41467-021-20911-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 12/17/2019] [Accepted: 01/04/2021] [Indexed: 01/30/2023]
Abstract
De novo gene origination has been recently established as an important mechanism for the formation of new genes. In organisms with a large genome, intergenic and intronic regions provide plenty of raw material for new transcriptional events to occur, but little is known about how de novo transcripts originate in more densely-packed genomes. Here, we identify 213 de novo originated transcripts in Saccharomyces cerevisiae using deep transcriptomics and genomic synteny information from multiple yeast species grown in two different conditions. We find that about half of the de novo transcripts are expressed from regions which already harbor other genes in the opposite orientation; these transcripts show similar expression changes in response to stress as their overlapping counterparts, and some appear to translate small proteins. Thus, a large fraction of de novo genes in yeast are likely to co-evolve with already existing genes.
43
Rahi ML, Mather PB, Hurwood DA. Do plasticity in gene expression and physiological responses in Palaemonid prawns facilitate adaptive response to different osmotic challenges? Comp Biochem Physiol A Mol Integr Physiol 2021; 251:110810. [DOI: 10.1016/j.cbpa.2020.110810] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/16/2020] [Revised: 09/25/2020] [Accepted: 09/25/2020] [Indexed: 12/20/2022]
44
Xu YC, Guo YL. Less Is More, Natural Loss-of-Function Mutation Is a Strategy for Adaptation. Plant Commun 2020; 1:100103. [PMID: 33367264 PMCID: PMC7743898 DOI: 10.1016/j.xplc.2020.100103] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 05/05/2020] [Revised: 07/08/2020] [Accepted: 08/12/2020] [Indexed: 05/12/2023]
Abstract
Gene gain and loss are crucial factors that shape the evolutionary success of diverse organisms. In the past two decades, more attention has been paid to the significance of gene gain through gene duplication or de novo genes. However, gene loss through natural loss-of-function (LoF) mutations, which is prevalent in the genomes of diverse organisms, has been largely ignored. With the development of sequencing techniques, many genomes have been sequenced across diverse species and can be used to study the evolutionary patterns of gene loss. In this review, we summarize recent advances in research on various aspects of LoF mutations, including their identification, evolutionary dynamics in natural populations, and functional effects. In particular, we discuss how LoF mutations can provide insights into the minimum gene set (or the essential gene set) of an organism. Furthermore, we emphasize their potential impact on adaptation. At the genome level, although most LoF mutations are neutral or deleterious, at least some of them are under positive selection and may contribute to biodiversity and adaptation. Overall, we highlight the importance of natural LoF mutations as a robust framework for understanding biological questions in general.
Affiliation(s)
- Yong-Chao Xu
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Ya-Long Guo
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
45
Dowling D, Schmitz JF, Bornberg-Bauer E. Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage. Genome Biol Evol 2020; 12:2183-2195. [PMID: 33210146 PMCID: PMC7674706 DOI: 10.1093/gbe/evaa194] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Accepted: 09/12/2020] [Indexed: 12/12/2022]
Abstract
In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity, which have been proposed to play a role in survival of de novo genes, remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
Affiliation(s)
- Daniel Dowling
- Institute for Evolution and Biodiversity, University of Münster, Germany
- Jonathan F Schmitz
- Institute for Evolution and Biodiversity, University of Münster, Germany
46
Weisman CM, Murray AW, Eddy SR. Many, but not all, lineage-specific genes can be explained by homology detection failure. PLoS Biol 2020; 18:e3000862. [PMID: 33137085 PMCID: PMC7660931 DOI: 10.1371/journal.pbio.3000862] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 02/27/2020] [Revised: 11/12/2020] [Accepted: 09/21/2020] [Indexed: 12/21/2022]
Abstract
Genes for which homologs can be detected only in a limited group of evolutionarily related species, called “lineage-specific genes,” are pervasive: Essentially every lineage has them, and they often comprise a sizable fraction of the group’s total genes. Lineage-specific genes are often interpreted as “novel” genes, representing genetic novelty born anew within that lineage. Here, we develop a simple method to test an alternative null hypothesis: that lineage-specific genes do have homologs outside of the lineage that, even while evolving at a constant rate in a novelty-free manner, have merely become undetectable by search algorithms used to infer homology. We show that this null hypothesis is sufficient to explain the lack of detected homologs of a large number of lineage-specific genes in fungi and insects. However, we also find that a minority of lineage-specific genes in both clades are not well explained by this novelty-free model. The method provides a simple way of identifying which lineage-specific genes call for special explanations beyond homology detection failure, highlighting them as interesting candidates for further study. Lineage-specific gene families may arise from evolutionary innovations such as de novo gene origination, or may simply mean that a similarity search program failed to identify more distant homologs. A new computational method for modeling the expected decay of similarity search scores with evolutionary distance allows distinction between the two explanations.
Affiliation(s)
- Caroline M. Weisman
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Andrew W. Murray
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Sean R. Eddy
- Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts, United States of America
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America
47
Shi T, Rahmani RS, Gugger PF, Wang M, Li H, Zhang Y, Li Z, Wang Q, Van de Peer Y, Marchal K, Chen J. Distinct Expression and Methylation Patterns for Genes with Different Fates following a Single Whole-Genome Duplication in Flowering Plants. Mol Biol Evol 2020; 37:2394-2413. [PMID: 32343808 PMCID: PMC7403625 DOI: 10.1093/molbev/msaa105] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Indexed: 12/12/2022]
Abstract
For most sequenced flowering plants, multiple whole-genome duplications (WGDs) are found. Duplicated genes following WGD often have different fates: they can quickly disappear again, be retained for long(er) periods, or subsequently undergo small-scale duplications. However, how different expression, epigenetic regulation, and functional constraints are associated with these different gene fates following a WGD still requires further investigation, because successive WGDs in angiosperms complicate the gene trajectories. In this study, we investigate lotus (Nelumbo nucifera), an angiosperm with a single WGD at the K-Pg boundary. Based on improved intraspecific-synteny identification by a chromosome-level assembly, transcriptome, and bisulfite sequencing, we explore not only the fundamental distinctions in genomic features, expression, and methylation patterns of genes with different fates after a WGD but also the factors that shape post-WGD expression divergence and expression bias between duplicates. We found that after a WGD genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degrees of protein-protein interactions and protein lengths and the lowest methylation in gene flanking regions. For those long-retained duplicate pairs, the degree of expression divergence correlates with their sequence divergence, degree in protein-protein interactions, and expression level, whereas their biases in expression level reflecting subgenome dominance are associated with the bias of subgenome fractionation. Overall, our study on the paleopolyploid nature of lotus highlights the impact of different functional constraints on gene fate and duplicate divergence following a single WGD in plants.
Affiliation(s)
- Tao Shi
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- Razgar Seyed Rahmani
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Paul F Gugger
- Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, MD
- Muhua Wang
- School of Marine Sciences, Sun Yat-sen University, Guangzhou, China
- Hui Li
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- University of Chinese Academy of Sciences, Beijing, China
- Yue Zhang
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhizhong Li
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- University of Chinese Academy of Sciences, Beijing, China
- Qingfeng Wang
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, China
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Centre for Plant Systems Biology, VIB, Ghent, Belgium
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- College of Horticulture, Nanjing Agricultural University, Nanjing, China
- Kathleen Marchal
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Department of Information Technology, IDLab, IMEC, Ghent University, Ghent, Belgium
- Jinming Chen
- CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
48
Evolution of novel genes in three-spined stickleback populations. Heredity (Edinb) 2020; 125:50-59. [PMID: 32499660 PMCID: PMC7413265 DOI: 10.1038/s41437-020-0319-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/24/2019] [Revised: 04/27/2020] [Accepted: 04/30/2020] [Indexed: 12/22/2022]
Abstract
Eukaryotic genomes frequently acquire new protein-coding genes which may significantly impact an organism’s fitness. Novel genes can be created, for example, by duplication of large genomic regions or de novo, from previously non-coding DNA. Either way, creation of a novel transcript is an essential early step during novel gene emergence. Most studies on the gain-and-loss dynamics of novel genes so far have compared genomes between species, constraining analyses to genes that have remained fixed over long time scales. However, the importance of novel genes for rapid adaptation among populations has recently been shown. Therefore, since little is known about the evolutionary dynamics of transcripts across natural populations, we here study transcriptomes from several tissues and nine geographically distinct populations of an ecological model species, the three-spined stickleback. Our findings suggest that novel genes typically start out as transcripts with low expression and high tissue specificity. Early expression regulation appears to be mediated by gene-body methylation. Although most new and narrowly expressed genes are rapidly lost, those that survive and subsequently spread through populations tend to gain broader and higher expression levels. The properties of the encoded proteins, such as disorder and aggregation propensity, hardly change. Correspondingly, young novel genes are not preferentially under positive selection but older novel genes more often overlap with FST outlier regions. Taken together, expression of the surviving novel genes is rapidly regulated, probably via epigenetic mechanisms, while structural properties of encoded proteins are non-debilitating and might only change much later.
|
49
|
Pascual-Carreras E, Marin-Barba M, Herrera-Úbeda C, Font-Martín D, Eckelt K, de Sousa N, García-Fernández J, Saló E, Adell T. Planarian cell number depends on blitzschnell, a novel gene family that balances cell proliferation and cell death. Development 2020; 147:dev.184044. [PMID: 32122990 DOI: 10.1242/dev.184044] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/20/2019] [Accepted: 02/19/2020] [Indexed: 01/14/2023]
Abstract
Control of cell number is crucial to define body size during animal development and to restrict tumoral transformation. The cell number is determined by the balance between cell proliferation and cell death. Although many genes are known to regulate those processes, the molecular mechanisms underlying the relationship between cell number and body size remain poorly understood. This relationship can be better understood by studying planarians, flatworms that continuously change their body size according to nutrient availability. We identified a novel gene family, blitzschnell (bls), that consists of de novo and taxonomically restricted genes that control cell proliferation:cell death ratio. Their silencing promotes faster regeneration and increases cell number during homeostasis. Importantly, this increase in cell number leads to an increase in body size only in a nutrient-rich environment; in starved planarians, silencing results in a decrease in cell size and cell accumulation that ultimately produces overgrowths. bls expression is downregulated after feeding and is related to activity of the insulin/Akt/mTOR network, suggesting that the bls family evolved in planarians as an additional mechanism for restricting cell number in nutrient-fluctuating environments.
Affiliation(s)
- Eudald Pascual-Carreras
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Marta Marin-Barba
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
| | - Carlos Herrera-Úbeda
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Daniel Font-Martín
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Kay Eckelt
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Nidia de Sousa
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Jordi García-Fernández
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Emili Saló
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| | - Teresa Adell
- Department of Genetics, Microbiology and Statistics and Institute of Biomedicine, Universitat de Barcelona, Barcelona 08028, Catalunya, Spain; Institut de Biomedicina de la Universitat de Barcelona (IBUB), Universitat de Barcelona, Barcelona 08028, Catalunya, Spain
| |
|
50
|
Heames B, Schmitz J, Bornberg-Bauer E. A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila. J Mol Evol 2020; 88:382-398. [PMID: 32253450 PMCID: PMC7162840 DOI: 10.1007/s00239-020-09939-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 08/14/2019] [Accepted: 03/13/2020] [Indexed: 12/13/2022]
Abstract
Orphan genes, lacking detectable homologs in outgroup species, typically represent 10-30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7-39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes.
Affiliation(s)
- Brennen Heames
- Institute for Evolution and Biodiversity, 48149, Münster, Germany
| | - Jonathan Schmitz
- Institute for Evolution and Biodiversity, 48149, Münster, Germany
|