1
Petroll R, Papareddy RK, Krela R, Laigle A, Rivière Q, Bišova K, Mozgová I, Borg M. The Expansion and Diversification of Epigenetic Regulatory Networks Underpins Major Transitions in the Evolution of Land Plants. Mol Biol Evol 2025; 42:msaf064. [PMID: 40127687 PMCID: PMC11982613 DOI: 10.1093/molbev/msaf064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Revised: 02/26/2025] [Accepted: 03/05/2025] [Indexed: 03/26/2025] Open
Abstract
Epigenetic silencing is essential for regulating gene expression and cellular diversity in eukaryotes. While DNA and H3K9 methylation silence transposable elements (TEs), H3K27me3 marks deposited by the Polycomb repressive complex 2 (PRC2) silence varying proportions of TEs and genes across different lineages. Despite the major developmental role epigenetic silencing plays in multicellular eukaryotes, little is known about how epigenetic regulatory networks were shaped over evolutionary time. Here, we analyze epigenomes from diverse species across the green lineage to infer the chronological epigenetic recruitment of genes during land plant evolution. We first reveal the nature of plant heterochromatin in the unicellular chlorophyte microalga Chlorella sorokiniana and identify several genes marked with H3K27me3, highlighting the deep origin of PRC2-regulated genes in the green lineage. By incorporating genomic phylostratigraphy, we show how genes of differing evolutionary age occupy distinct epigenetic states in plants. While young genes tend to be silenced by H3K9 methylation, genes that emerged in land plants are preferentially marked with H3K27me3, some of which form part of a common network of PRC2-repressed genes across distantly related species. Finally, we analyze the potential recruitment of PRC2 to plant H3K27me3 domains and identify conserved DNA-binding sites of ancient transcription factor families known to interact with PRC2. Our findings shed light on the conservation and potential origin of epigenetic regulatory networks in the green lineage, while also providing insight into the evolutionary dynamics and molecular triggers that underlie the adaptation and elaboration of epigenetic regulation, laying the groundwork for its future consideration in other eukaryotic lineages.
Affiliation(s)
- Romy Petroll
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Tübingen, Germany
- Ranjith K Papareddy
  - Gregor Mendel Institute for Molecular Plant Biology, Vienna Biocenter, Vienna, Austria
- Rafal Krela
  - Biology Centre CAS, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- Alice Laigle
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Tübingen, Germany
- Quentin Rivière
  - Biology Centre CAS, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- Kateřina Bišova
  - Institute of Microbiology CAS, Centre Algatech, Třeboň, Czech Republic
- Iva Mozgová
  - Biology Centre CAS, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- Michael Borg
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Tübingen, Germany
2
Sivakumar P, Pandey S, Ramesha A, Davda JN, Singh A, Kumar C, Gala H, Subbiah V, Adicherla H, Dhawan J, Aravind L, Siddiqi I. Sporophyte-directed gametogenesis in Arabidopsis. Nat Plants 2025; 11:398-409. [PMID: 40087543 DOI: 10.1038/s41477-025-01932-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 01/30/2025] [Indexed: 03/17/2025]
Abstract
Plants alternate between diploid sporophyte and haploid gametophyte generations. In mosses, which retain features of ancestral land plants, the gametophyte is dominant and has an independent existence. However, in flowering plants the gametophyte has undergone evolutionary reduction to just a few cells enclosed within the sporophyte. The gametophyte is thought to retain genetic control of its development even after reduction. Here we show that male gametophyte development in Arabidopsis, long considered to be autonomous, is also under genetic control of the sporophyte via a repressive mechanism that includes large-scale regulation of protein turnover. We identify an Arabidopsis gene SHUKR as an inhibitor of male gametic gene expression. SHUKR is unrelated to proteins of known function and acts sporophytically in meiosis to control gametophyte development by negatively regulating expression of a large set of genes specific to postmeiotic gametogenesis. This control emerged late in evolution as SHUKR homologues are found only in eudicots. We show that SHUKR is rapidly evolving under positive selection, suggesting that variation in control of protein turnover during male gametogenesis has played an important role in evolution within eudicots.
Affiliation(s)
- Prakash Sivakumar
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Saurabh Pandey
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - databaum GmbH, Hamburg, Germany
- A Ramesha
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Seri-Biotech Research Laboratory, Central Silk Board, Bangalore, India
- Aparna Singh
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Department of Botany, MMV, Banaras Hindu University, Varanasi, India
- Chandan Kumar
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - University of Texas at Austin, Austin, TX, USA
- Hardik Gala
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
- Jyotsna Dhawan
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
- L Aravind
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Imran Siddiqi
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
3
Xia S, Chen J, Arsala D, Emerson JJ, Long M. Functional innovation through new genes as a general evolutionary process. Nat Genet 2025; 57:295-309. [PMID: 39875578 DOI: 10.1038/s41588-024-02059-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 12/15/2024] [Indexed: 01/30/2025]
Abstract
In the past decade, our understanding of how new genes originate in diverse organisms has advanced substantially, and more than a dozen molecular mechanisms for generating initial gene structures were identified, in addition to gene duplication. These new genes have been found to integrate into and modify pre-existing gene networks primarily through mutation and selection, revealing new patterns and rules with stable origination rates across various organisms. This progress has challenged the prevailing belief that new proteins evolve from pre-existing genes, as new genes may arise de novo from noncoding DNA sequences in many organisms, with high rates observed in flowering plants. New genes have important roles in phenotypic and functional evolution across diverse biological processes and structures, with detectable fitness effects of sexual conflict genes that can shape species divergence. Such knowledge of new genes can be of translational value in agriculture and medicine.
Affiliation(s)
- Shengqian Xia
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Jianhai Chen
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- Deanna Arsala
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
- J J Emerson
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
4
Zhao Q, Zheng Y, Li Y, Shi L, Zhang J, Ma D, You M. An Orphan Gene Enhances Male Reproductive Success in Plutella xylostella. Mol Biol Evol 2024; 41:msae142. [PMID: 38990889 PMCID: PMC11290247 DOI: 10.1093/molbev/msae142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 06/28/2024] [Accepted: 07/05/2024] [Indexed: 07/13/2024] Open
Abstract
Plutella xylostella exhibits exceptional reproductive ability, yet the genetic basis underlying this high reproductive capacity remains unknown. Here, we demonstrate that an orphan gene, lushu, which encodes a sperm protein, plays a crucial role in male reproductive success. Lushu is located on the Z chromosome and is prevalent across different P. xylostella populations worldwide. We subsequently generated lushu mutants using a transgenic CRISPR/Cas9 system. Knockout of lushu results in reduced male mating efficiency and accelerated death in adult males. Furthermore, our findings highlight that the deficiency of lushu reduces the transfer of sperm from males to females, potentially hindering sperm competition. Additionally, the knockout of lushu results in disrupted gene expression in energy-related pathways and elevated insulin levels in adult males. Our findings reveal that male reproductive performance has evolved through the birth of a newly evolved, lineage-specific gene with enormous potential for fecundity success. These insights hold valuable implications for identifying targets for genetic control, particularly in relation to species-specific traits that are pivotal in determining high levels of fecundity.
Affiliation(s)
- Qian Zhao
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
- Yahong Zheng
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yiying Li
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Lingping Shi
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Jing Zhang
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
- Dongna Ma
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Minsheng You
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Joint International Research Laboratory of Ecological Pest Control, Ministry of Education, Fuzhou 350002, China
5
Zhao Z, Ma D. Genome-Wide Identification, Characterization and Function Analysis of Lineage-Specific Genes in the Tea Plant Camellia sinensis. Front Genet 2021; 12:770570. [PMID: 34858483 PMCID: PMC8631334 DOI: 10.3389/fgene.2021.770570] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/04/2021] [Accepted: 10/14/2021] [Indexed: 11/22/2022] Open
Abstract
Genes that have no homologous sequences in other species are called lineage-specific genes (LSGs); they are common in living organisms and have an important role in the generation of new functions, adaptive evolution and phenotypic alteration of species. Camellia sinensis var. sinensis (CSS) is one of the most widely distributed cultivars for quality green tea production. The rich catechins in tea have antioxidant, free radical elimination, fat loss and cancer prevention potential. To further understand the evolution and utilize the function of LSGs in tea, we used a comparative genomics approach to identify Camellia-specific genes (CSGs). Our results reveal that 1701 CSGs are specific to CSS, accounting for 3.37% of all protein-coding genes. The majority of CSGs (57.08%) were generated by gene duplication, and the timing of duplication coincides with that of two whole-genome duplication (WGD) events in the CSS genome. Gene structure analysis revealed that CSGs have shorter gene lengths, fewer exons, higher GC content and higher isoelectric points. Gene expression analysis showed that CSGs had more tissue-specific expression compared to evolutionarily conserved genes (ECs). Weighted gene co-expression network analysis (WGCNA) showed that 18 CSGs are mainly associated with catechin synthesis-related pathways, including phenylalanine biosynthesis, biosynthesis of amino acids, pentose phosphate pathway, photosynthesis and carbon metabolism. In addition, we found that the expression of three CSGs (CSS0030246, CSS0002298, and CSS0030939) was significantly down-regulated in response to both types of stress (salt and drought). Our study is the first to systematically identify LSGs in CSS and comprehensively analyze the features and potential functions of CSGs. We also identified key candidate genes, which will provide valuable assistance for further studies on catechin synthesis and provide a molecular basis for the excavation of excellent germplasm resources.
Affiliation(s)
- Zhizhu Zhao
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Dongna Ma
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
6
Williams JH. Consequences of whole genome duplication for 2n pollen performance. Plant Reprod 2021; 34:321-334. [PMID: 34302535 DOI: 10.1007/s00497-021-00426-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/17/2021] [Accepted: 07/17/2021] [Indexed: 06/13/2023]
Abstract
The vegetative cell of the angiosperm male gametophyte (pollen) functions as a free-living, single-celled organism that both produces and transports sperm to egg. Whole-genome duplication (WGD) should have strong effects on pollen because of the haploid to diploid transition and because of both genetic and epigenetic effects on cell-level phenotypes. To disentangle historical effects of WGD on pollen performance, studies can compare 1n pollen from diploids to neo-2n pollen from diploids and synthetic autotetraploids to older 2n pollen from established neo-autotetraploids. WGD doubles both gene number and bulk nuclear DNA mass, and a substantial proportion of diploid and autotetraploid heterozygosity can be transmitted to 2n pollen. Relative to 1n pollen, 2n pollen can exhibit heterosis due to higher gene dosage, higher heterozygosity and new allelic interactions. Doubled genome size also has consequences for gene regulation and expression as well as epigenetic effects on cell architecture. Pollen volume doubling is a universal effect of WGD, whereas an increase in aperture number is common among taxa with simultaneous microsporogenesis and pored apertures, mostly eudicots. WGD instantly affects numerous evolved compromises among mature pollen functional traits and these are rapidly shaped by highly diverse tissue interactions and pollen competitive environments in the early post-WGD generations. 2n pollen phenotypes generally incur higher performance costs, and the degree to which these are met or evolve by scaling up provisioning and metabolic vigor needs further study.
Affiliation(s)
- Joseph H Williams
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, 37996, USA
7
Jin G, Ma PF, Wu X, Gu L, Long M, Zhang C, Li DZ. New Genes Interacted with Recent Whole Genome Duplicates in the Fast Stem Growth of Bamboos. Mol Biol Evol 2021; 38:5752-5768. [PMID: 34581782 PMCID: PMC8662795 DOI: 10.1093/molbev/msab288] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/12/2022] Open
Abstract
As drivers of evolutionary innovations, new genes allow organisms to explore new niches. However, clear examples of this process remain scarce. Bamboos, the unique grass lineage diversifying into the forest, have evolved with a key innovation of fast growth of the woody stem, reaching up to 1 m/day. Here, we identify 1,622 bamboo-specific orphan genes that appeared in the last 46 million years, and 19 of them evolved from noncoding ancestral sequences with the entire de novo origination process reconstructed. The new genes evolved gradually in exon−intron structure, protein length, expression specificity, and evolutionary constraint. These new genes, whether or not from de novo origination, are dominantly expressed in the rapidly developing shoots, and make transcriptomes of shoots the youngest among various bamboo tissues, rather than reproductive tissue as in other plants. Additionally, the particularity of bamboo shoots has also been shaped by recent whole-genome duplicates (WGDs), which evolved divergent expression patterns from ancestral states. New genes and WGDs have been evolutionarily recruited into coexpression networks to underlie the fast-growing trait of bamboo shoots. Our study highlights the importance of interactions between new genes and genome duplicates in generating morphological innovation.
Affiliation(s)
- Guihua Jin
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Peng-Fei Ma
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Xiaopei Wu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Lianfeng Gu
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, 60637, USA
- Chengjun Zhang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
8
Assis R. No Expression Divergence despite Transcriptional Interference between Nested Protein-Coding Genes in Mammals. Genes (Basel) 2021; 12:1381. [PMID: 34573363 PMCID: PMC8467205 DOI: 10.3390/genes12091381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 01/05/2023] Open
Abstract
Nested protein-coding genes accumulated throughout metazoan evolution, with early analyses of human and Drosophila microarray data indicating that this phenomenon was simply due to the presence of large introns. However, a recent study employing RNA-seq data uncovered evidence of transcriptional interference driving rapid expression divergence between Drosophila nested genes, illustrating that accurate expression estimation of overlapping genes can enhance detection of their relationships. Hence, here I apply an analogous approach to strand-specific RNA-seq data from human and mouse to revisit the role of transcriptional interference in the evolution of mammalian nested genes. A genomic survey reveals that whereas mammalian nested genes indeed accrued over evolutionary time, they are retained at lower frequencies than in Drosophila. Though several properties of mammalian nested genes align with observations in Drosophila and with expectations under transcriptional interference, contrary to both, their expression divergence is not statistically different from that between unnested genes, and also does not increase after nesting. Together, these results support the hypothesis that lower selection efficiencies limit rates of gene expression evolution in mammals, leading to their reliance on immediate eradication of deleterious nested genes to avoid transcriptional interference.
Affiliation(s)
- Raquel Assis
  - Department of Electrical Engineering and Computer Science, Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431, USA
9
Ma D, Ding Q, Guo Z, Zhao Z, Wei L, Li Y, Song S, Zheng HL. Identification, characterization and expression analysis of lineage-specific genes within mangrove species Aegiceras corniculatum. Mol Genet Genomics 2021; 296:1235-1247. [PMID: 34363105 DOI: 10.1007/s00438-021-01810-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/06/2021] [Accepted: 07/22/2021] [Indexed: 11/25/2022]
Abstract
Lineage-specific genes (LSGs) are genes that have no recognizable homology to any sequences in other species and are important drivers of new functions, phenotypic changes, and species adaptation to the environment. Aegiceras corniculatum is one of the major mangrove plant species adapted to waterlogging and saline conditions, and the exploration of Aegiceras-specific genes (ASGs) is important to reveal its adaptation to the harsh environment. Here, we performed a systematic analysis of ASGs, focusing on their sequence characterization, origination and expression patterns. Our results reveal that there are 4823 ASGs in the genome, approximately 11.84% of all protein-coding genes. A high proportion (45.78%) of ASGs originate from gene duplication, and the time of gene duplication of ASGs is consistent with the timing of two whole-genome duplication (WGD) events that occurred in A. corniculatum, and also coincides with a short period of global warming during the Paleocene-Eocene Thermal Maximum (PETM, 55.5 million years ago). Gene structure analysis showed that ASGs have shorter protein lengths, fewer exons, and higher isoelectric points. Expression pattern analysis showed that ASGs had low levels of expression and more tissue-specific expression. Weighted gene co-expression network analysis (WGCNA) revealed that 86 ASGs co-expressed gene modules were primarily involved in pathways related to adversity stress, including plant hormone signal transduction, phenylpropanoid biosynthesis, photosynthesis, peroxisome and pentose phosphate pathway. This study provides a comprehensive analysis of the characteristics and potential functions of ASGs and identifies key candidate genes, which will contribute to further investigation of the adaptation of A. corniculatum to intertidal coastal wetland habitats.
Affiliation(s)
- Dongna Ma
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Qiansu Ding
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zejun Guo
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zhizhu Zhao
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Liufeng Wei
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Yiying Li
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Institute of Applied Ecology, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Shiwei Song
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Hai-Lei Zheng
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
10
Hata T, Takada N, Hayakawa C, Kazama M, Uchikoba T, Tachikawa M, Matsuo M, Satoh S, Obokata J. De novo activated transcription of inserted foreign coding sequences is inheritable in the plant genome. PLoS One 2021; 16:e0252674. [PMID: 34111139 PMCID: PMC8191969 DOI: 10.1371/journal.pone.0252674] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 05/19/2021] [Indexed: 01/16/2023] Open
Abstract
The manner in which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome is poorly understood. To examine such processes of gene evolution, we performed an artificial evolutionary experiment in Arabidopsis thaliana. As a model of gene-birth events, we introduced a promoterless coding sequence of the firefly luciferase (LUC) gene and established 386 T2-generation transgenic lines. Among them, we determined the individual LUC insertion loci in 76 lines and found that one-third of them were transcribed de novo even in the intergenic or inherently unexpressed regions. In the transcribed lines, transcription-related chromatin marks were detected across the newly activated transcribed regions. These results agreed with our previous findings in A. thaliana cultured cells under a similar experimental scheme. A comparison of the results of the T2-plant and cultured cell experiments revealed that the de novo-activated transcription concomitant with local chromatin remodelling was inheritable. During one-generation inheritance, it seems likely that the transcription activities of the LUC inserts trapped by the endogenous genes/transcripts became stronger, while those of de novo transcription in the intergenic/untranscribed regions became weaker. These findings may offer a clue for the elucidation of the mechanism by which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome.
Affiliation(s)
- Takayuki Hata
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
  - Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
- Naoto Takada
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Chihiro Hayakawa
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Mei Kazama
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Tomohiro Uchikoba
  - Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Makoto Tachikawa
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Mitsuhiro Matsuo
  - Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
- Soichirou Satoh
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
  - Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Junichi Obokata
  - Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
11
DeGiorgio M, Assis R. Learning Retention Mechanisms and Evolutionary Parameters of Duplicate Genes from Their Expression Data. Mol Biol Evol 2021; 38:1209-1224. [PMID: 33045078 PMCID: PMC7947822 DOI: 10.1093/molbev/msaa267] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022] Open
Abstract
Learning about the roles that duplicate genes play in the origins of novel phenotypes requires an understanding of how their functions evolve. A previous method for achieving this goal, CDROM, employs gene expression distances as proxies for functional divergence and then classifies the evolutionary mechanisms retaining duplicate genes from comparisons of these distances in a decision tree framework. However, CDROM does not account for stochastic shifts in gene expression or leverage advances in contemporary statistical learning for performing classification, nor is it capable of predicting the parameters driving duplicate gene evolution. Thus, here we develop CLOUD, a multi-layer neural network built on a model of gene expression evolution that can both classify duplicate gene retention mechanisms and predict their underlying evolutionary parameters. We show that not only is the CLOUD classifier substantially more powerful and accurate than CDROM, but that it also yields accurate parameter predictions, enabling a better understanding of the specific forces driving the evolution and long-term retention of duplicate genes. Further, application of the CLOUD classifier and predictor to empirical data from Drosophila recapitulates many previous findings about gene duplication in this lineage, showing that new functions often emerge rapidly and asymmetrically in younger duplicate gene copies, and that functional divergence is driven by strong natural selection. Hence, CLOUD represents a major advancement in classifying retention mechanisms and predicting evolutionary parameters of duplicate genes, thereby highlighting the utility of incorporating sophisticated statistical learning techniques to address long-standing questions about evolution after gene duplication.
Affiliation(s)
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431
- Raquel Assis
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431
|
12
|
Evolution of novel genes in three-spined stickleback populations. Heredity (Edinb) 2020; 125:50-59. [PMID: 32499660 PMCID: PMC7413265 DOI: 10.1038/s41437-020-0319-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/24/2019] [Revised: 04/27/2020] [Accepted: 04/30/2020] [Indexed: 12/22/2022] Open
Abstract
Eukaryotic genomes frequently acquire new protein-coding genes which may significantly impact an organism’s fitness. Novel genes can be created, for example, by duplication of large genomic regions or de novo, from previously non-coding DNA. Either way, creation of a novel transcript is an essential early step during novel gene emergence. Most studies on the gain-and-loss dynamics of novel genes so far have compared genomes between species, constraining analyses to genes that have remained fixed over long time scales. However, the importance of novel genes for rapid adaptation among populations has recently been shown. Therefore, since little is known about the evolutionary dynamics of transcripts across natural populations, we here study transcriptomes from several tissues and nine geographically distinct populations of an ecological model species, the three-spined stickleback. Our findings suggest that novel genes typically start out as transcripts with low expression and high tissue specificity. Early expression regulation appears to be mediated by gene-body methylation. Although most new and narrowly expressed genes are rapidly lost, those that survive and subsequently spread through populations tend to gain broader and higher expression levels. The properties of the encoded proteins, such as disorder and aggregation propensity, hardly change. Correspondingly, young novel genes are not preferentially under positive selection but older novel genes more often overlap with FST outlier regions. Taken together, expression of the surviving novel genes is rapidly regulated, probably via epigenetic mechanisms, while structural properties of encoded proteins are non-debilitating and might only change much later.
|
13
|
Kobayashi CR, Castillo-González C, Survotseva Y, Canal E, Nelson ADL, Shippen DE. Recent emergence and extinction of the protection of telomeres 1c gene in Arabidopsis thaliana. Plant Cell Rep 2019; 38:1081-1097. [PMID: 31134349 PMCID: PMC6708462 DOI: 10.1007/s00299-019-02427-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/17/2019] [Accepted: 03/27/2019] [Indexed: 05/20/2023]
Abstract
Duplicate POT1 genes must rapidly diverge or be inactivated. Protection of telomeres 1 (POT1) encodes a conserved telomere binding protein implicated in both chromosome end protection and telomere length maintenance. Most organisms harbor a single POT1 gene, but in the few lineages where the POT1 family has expanded, the duplicate genes have diversified. Arabidopsis thaliana bears three POT1-like loci, POT1a, POT1b and POT1c. POT1a retains the ancestral function of telomerase regulation, while POT1b is implicated in chromosome end protection. Here we examine the function and evolution of the third POT1 paralog, POT1c. POT1c is a new gene, unique to A. thaliana, and was derived from a duplication event involving the POT1a locus and a neighboring gene encoding ribosomal protein S17. The duplicate S17 locus (dS17) is highly conserved across A. thaliana accessions, while POT1c is highly divergent, harboring multiple deletions within the gene body and two transposable elements within the promoter. The POT1c locus is transcribed at very low to non-detectable levels under standard growth conditions. In addition, no discernable molecular or developmental defects are associated with plants bearing a CRISPR mutation in the POT1c locus. However, forced expression of POT1c leads to decreased telomerase enzyme activity and shortened telomeres. Evolutionary reconstruction indicates that transposons invaded the POT1c promoter soon after the locus was formed, permanently silencing the gene. Altogether, these findings argue that POT1 dosage is critically important for viability and duplicate gene copies are retained only upon functional divergence.
Affiliation(s)
- Callie R Kobayashi
  - Biochemistry and Biophysics, Texas A&M University, College Station, Texas, USA
- Yulia Survotseva
  - Yale Center for Molecular Discovery, Yale University, New Haven, Connecticut, USA
- Elijah Canal
  - Biochemistry and Biophysics, Texas A&M University, College Station, Texas, USA
- Andrew D L Nelson
  - The School of Plant Sciences, University of Arizona, Tucson, Arizona, USA
- Dorothy E Shippen
  - Biochemistry and Biophysics, Texas A&M University, College Station, Texas, USA
|
14
|
Identification, characterization and expression analysis of lineage-specific genes within Triticeae. Genomics 2019; 112:1343-1350. [PMID: 31401233 DOI: 10.1016/j.ygeno.2019.08.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/05/2019] [Revised: 08/04/2019] [Accepted: 08/07/2019] [Indexed: 12/11/2022]
Abstract
Lineage-specific genes (LSGs) are a set of genes in a given taxon without significant sequence similarity to genes and intergenic sequences of other taxa and are functional. The tribe Triticeae mainly includes species of different ploidy levels, such as staple food crops wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.). This study is aimed at mining and characterizing the Triticeae-specific genes (TSGs) using expressed sequence data of wheat. A total of 3812 TSGs was identified and they were generally characterized by smaller size, fewer exons, shorter open reading frames and lower expression levels. Most TSGs were expressed with tissue preference and many of them were predominantly expressed in reproduction related tissues, especially in young stamen. Nearly one third of the TSGs were stress-responsive and inducible under abiotic and/or biotic stresses. A co-expression-based annotation supported the relevance of some TSGs with reproduction and stress responses, indicating their potential economic importance.
|
15
|
Jiang X, Assis R. Rapid functional divergence after small-scale gene duplication in grasses. BMC Evol Biol 2019; 19:97. [PMID: 31046675 PMCID: PMC6498639 DOI: 10.1186/s12862-019-1415-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/05/2018] [Accepted: 03/31/2019] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Gene duplication has played an important role in the evolution and domestication of flowering plants. Yet little is known about how plant duplicate genes evolve and are retained over long timescales, particularly those arising from small-scale duplication (SSD) rather than whole-genome duplication (WGD) events. RESULTS We address this question in the Poaceae (grass) family by analyzing gene expression data from nine tissues of Brachypodium distachyon, Oryza sativa japonica (rice), and Sorghum bicolor (sorghum). Consistent with theoretical predictions, expression profiles of most grass genes are conserved after SSD, suggesting that functional conservation is the primary outcome of SSD in grasses. However, we also uncover support for widespread functional divergence, much of which occurs asymmetrically via the process of neofunctionalization. Moreover, neofunctionalization preferentially targets younger (child) duplicate gene copies, is associated with RNA-mediated duplication, and occurs quickly after duplication. Further analysis reveals that functional divergence of SSD-derived genes is positively correlated with both sequence divergence and tissue specificity in all three grass species, and particularly with anther expression in B. distachyon. CONCLUSIONS Our results suggest that SSD-derived grass genes often undergo rapid functional divergence that may be driven by natural selection on male-specific phenotypes. These observations are consistent with those in several animal species, suggesting that duplicate genes take similar evolutionary trajectories in plants and animals.
Affiliation(s)
- Xueyuan Jiang
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Raquel Assis
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
|
16
|
Rapid evolution of protein diversity by de novo origination in Oryza. Nat Ecol Evol 2019; 3:679-690. [PMID: 30858588 DOI: 10.1038/s41559-019-0822-5] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 05/08/2018] [Accepted: 01/23/2019] [Indexed: 12/22/2022]
Abstract
New protein-coding genes that arise de novo from non-coding DNA sequences contribute to protein diversity. However, de novo gene origination is challenging to study as it requires high-quality reference genomes for closely related species, evidence for ancestral non-coding sequences, and transcription and translation of the new genes. High-quality genomes of 13 closely related Oryza species provide unprecedented opportunities to understand de novo origination events. Here, we identify a large number of young de novo genes with discernible recent ancestral non-coding sequences and evidence of translation. Using pipelines examining the synteny relationship between genomes and reciprocal-best whole-genome alignments, we detected at least 175 de novo open reading frames in the focal species O. sativa subspecies japonica, which were all detected in RNA sequencing-based transcriptomes. Mass spectrometry-based targeted proteomics and ribosomal profiling show translational evidence for 57% of the de novo genes. In recent divergence of Oryza, an average of 51.5 de novo genes per million years were generated and retained. We observed evolutionary patterns in which excess indels and early transcription were favoured in origination with a stepwise formation of gene structure. These data reveal that de novo genes contribute to the rapid evolution of protein diversity under positive selection.
|
17
|
Wang R, Li M, Wu X, Wang J. The Gene Structure and Expression Level Changes of the GH3 Gene Family in Brassica napus Relative to Its Diploid Ancestors. Genes (Basel) 2019; 10:genes10010058. [PMID: 30658516 PMCID: PMC6356818 DOI: 10.3390/genes10010058] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/03/2018] [Revised: 01/10/2019] [Accepted: 01/15/2019] [Indexed: 02/07/2023] Open
Abstract
The GH3 gene family plays a vital role in the phytohormone-related growth and developmental processes. The effects of allopolyploidization on GH3 gene structures and expression levels have not been reported. In this study, a total of 38, 25, and 66 GH3 genes were identified in Brassica rapa (ArAr), Brassica oleracea (CoCo), and Brassica napus (AnAnCnCn), respectively. BnaGH3 genes were unevenly distributed on chromosomes with 39 on An and 27 on Cn, in which six BnaGH3 genes may appear as new genes. The whole genome triplication allowed the GH3 gene family to expand in diploid ancestors, and allopolyploidization made the GH3 gene family re-expand in B. napus. For most BnaGH3 genes, the exon-intron compositions were similar to diploid ancestors, while the cis-element distributions were obviously different from its ancestors. After allopolyploidization, the expression patterns of GH3 genes from ancestor species changed greatly in B. napus, and the orthologous gene pairs between An/Ar and Cn/Co had diverged expression patterns across four tissues. Our study provides a comprehensive analysis of the GH3 gene family in B. napus, and these results could contribute to identifying genes with vital roles in phytohormone-related growth and developmental processes.
Affiliation(s)
- Ruihua Wang
  - State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Mengdi Li
  - State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Xiaoming Wu
  - Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, Wuhan 430072, China
- Jianbo Wang
  - State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan 430072, China
|
18
|
Banerjee S, Chakraborty S. Protein intrinsic disorder negatively associates with gene age in different eukaryotic lineages. Mol Biosyst 2018; 13:2044-2055. [PMID: 28783193 DOI: 10.1039/c7mb00230k] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
The emergence of new protein-coding genes in a specific lineage or species provides raw materials for evolutionary adaptations. Until recently, the biology of new genes emerging particularly from non-genic sequences remained unexplored. Although the new genes are subjected to variable selection pressure and face rapid deletion, some of them become functional and are retained in the gene pool. To acquire functional novelties, new genes often get integrated into the pre-existing ancestral networks. However, the mechanism by which young proteins acquire novel interactions remains unanswered till date. Since structural orientation contributes hugely to the mode of proteins' physical interactions, in this regard, we put forward an interesting question - Do new genes encode proteins with stable folds? Addressing the question, we demonstrated that the intrinsic disorder inversely correlates with the evolutionary gene ages - i.e. young proteins are richer in intrinsic disorder than the ancient ones. We further noted that young proteins, which are initially poorly connected hubs, prefer to be structurally more disordered than well-connected ancient proteins. The phenomenon strikingly defies the usual trend of well-connected proteins being highly disordered in structure. We justified that structural disorder might help poorly connected young proteins to undergo promiscuous interactions, which provides the foundation for novel protein interactions. The study focuses on the evolutionary perspectives of young proteins in the light of structural adaptations.
Affiliation(s)
- Sanghita Banerjee
  - Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
|
19
|
Wang B, Regulski M, Tseng E, Olson A, Goodwin S, McCombie WR, Ware D. A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing. Genome Res 2018; 28:921-932. [PMID: 29712755 PMCID: PMC5991521 DOI: 10.1101/gr.227462.117] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 07/11/2017] [Accepted: 04/12/2018] [Indexed: 12/15/2022]
Abstract
Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species.
Affiliation(s)
- Bo Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Michael Regulski
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Andrew Olson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
  - USDA ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, New York 14853, USA
|
20
|
Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 2018; 50:285-296. [DOI: 10.1038/s41588-018-0040-0] [Citation(s) in RCA: 289] [Impact Index Per Article: 41.3] [Received: 10/29/2017] [Accepted: 12/18/2017] [Indexed: 11/08/2022]
|
21
|
Huang J, Vendramin S, Shi L, McGinnis KM. Construction and Optimization of a Large Gene Coexpression Network in Maize Using RNA-Seq Data. Plant Physiol 2017; 175:568-583. [PMID: 28768814 PMCID: PMC5580776 DOI: 10.1104/pp.17.00825] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 06/19/2017] [Accepted: 07/31/2017] [Indexed: 05/22/2023]
Abstract
With the emergence of massively parallel sequencing, genomewide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a gene coexpression network (GCN) can be constructed and used for gene function prediction, candidate gene selection, and improving understanding of regulatory pathways. Several GCN studies have been done in maize (Zea mays), mostly using microarray datasets. To build an optimal GCN from plant materials RNA-Seq data, parameters for expression data normalization and network inference were evaluated. A comprehensive evaluation of these two parameters and a ranked aggregation strategy on network performance, using libraries from 1266 maize samples, were conducted. Three normalization methods and 10 inference methods, including six correlation and four mutual information methods, were tested. The three normalization methods had very similar performance. For network inference, correlation methods performed better than mutual information methods at some genes. Increasing sample size also had a positive effect on GCN. Aggregating single networks together resulted in improved performance compared to single networks.
Affiliation(s)
- Ji Huang
  - Department of Biological Science, Florida State University, Tallahassee, Florida 32306
- Stefania Vendramin
  - Department of Biological Science, Florida State University, Tallahassee, Florida 32306
- Lizhen Shi
  - Department of Computer Science, Florida State University, Tallahassee, Florida 32306
- Karen M McGinnis
  - Department of Biological Science, Florida State University, Tallahassee, Florida 32306
|
22
|
Schmitz JF, Bornberg-Bauer E. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA. F1000Res 2017; 6:57. [PMID: 28163910 PMCID: PMC5247788 DOI: 10.12688/f1000research.10079.1] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Accepted: 01/17/2017] [Indexed: 12/31/2022] Open
Abstract
Over the last few years, there has been an increasing amount of evidence for the de novo emergence of protein-coding genes, i.e. out of non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data with very short divergence times which could help precisely pin down the location of origin of a de novo gene. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as the detailed study of the emergence of functional de novo protein-coding genes.
Affiliation(s)
- Jonathan F Schmitz
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
|
23
|
Peters B, Casey J, Aidley J, Zohrab S, Borg M, Twell D, Brownfield L. A Conserved cis-Regulatory Module Determines Germline Fate through Activation of the Transcription Factor DUO1 Promoter. Plant Physiol 2017; 173:280-293. [PMID: 27624837 PMCID: PMC5210719 DOI: 10.1104/pp.16.01192] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/29/2016] [Accepted: 09/07/2016] [Indexed: 05/07/2023]
Abstract
The development of the male germline within pollen relies upon the activation of numerous target genes by the transcription factor DUO POLLEN1 (DUO1). The expression of DUO1 is restricted to the male germline and is first detected shortly after the asymmetric division that segregates the germ cell lineage. Transcriptional regulation is critical in controlling DUO1 expression, since transcriptional and translational fusions show similar expression patterns. Here, we identify key promoter sequences required for the germline-specific regulation of DUO1 transcription. Combining promoter deletion analyses with phylogenetic footprinting in eudicots and in Arabidopsis accessions, we identify a cis-regulatory module, Regulatory region of DUO1 (ROD1), which replicates the expression pattern of DUO1 in Arabidopsis (Arabidopsis thaliana). We show that ROD1 from the legume Medicago truncatula directs male germline-specific expression in Arabidopsis, demonstrating conservation of DUO1 regulation among eudicots. ROD1 contains several short conserved cis-regulatory elements, including three copies of the motif DNGTGGV, required for germline expression and tandem repeats of the motif YAACYGY, which enhance DUO1 transcription in a positive feedback loop. We conclude that a cis-regulatory module conserved in eudicots directs the spatial and temporal expression of the transcription factor DUO1 to specify male germline fate and sperm cell differentiation.
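The two motifs quoted in the abstract, DNGTGGV and YAACYGY, are written in IUPAC degenerate nucleotide notation. As a minimal sketch of how such motifs can be located in a sequence, the example below expands each IUPAC code into a regular-expression character class. The function name `find_motif` and the test sequences are hypothetical and are not taken from the paper; only the given strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def find_motif(sequence, motif):
    """Return 0-based start positions of all matches, including overlaps.

    Only the strand given in `sequence` is searched; scanning the reverse
    complement would need a second pass.
    """
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets overlapping occurrences be reported separately.
    pattern = re.compile(f"(?=({body}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]

# Made-up sequences, used purely to exercise the function.
print(find_motif("TTAGGTGGCAA", "DNGTGGV"))
print(find_motif("CAACCGTAACTGC", "YAACYGY"))
```

The lookahead matters for tandem repeats such as those described for YAACYGY, where adjacent copies may share a base.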
Affiliation(s)
- Benjamin Peters
  - Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand
- Jonathan Casey
  - Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand
- Jack Aidley
  - Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom
- Stuart Zohrab
  - Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand
- Michael Borg
  - Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom
- David Twell
  - Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom
- Lynette Brownfield
  - Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand
|
24
|
Wang J, Tao F, Marowsky NC, Fan C. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes. Plant Physiol 2016; 172:427-40. [PMID: 27485883 PMCID: PMC5074645 DOI: 10.1104/pp.16.01177] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 07/27/2016] [Accepted: 08/01/2016] [Indexed: 05/02/2023]
Abstract
Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes.
Affiliation(s)
- Jun Wang
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Feng Tao
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Nicholas C Marowsky
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Chuanzhu Fan
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
|
26
|
Li ZW, Chen X, Wu Q, Hagmann J, Han TS, Zou YP, Ge S, Guo YL. On the Origin of De Novo Genes in Arabidopsis thaliana Populations. Genome Biol Evol 2016; 8:2190-202. [PMID: 27401176 PMCID: PMC4987118 DOI: 10.1093/gbe/evw164] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 12/23/2022] Open
Abstract
De novo genes, which originate from ancestral nongenic sequences, are one of the most important sources of protein-coding genes. This origination process is crucial for the adaptation of organisms. However, how de novo genes arise and become fixed in a population or species remains largely unknown. Here, we identified 782 de novo genes from the model plant Arabidopsis thaliana and divided them into three types based on the availability of translational evidence, transcriptional evidence, and neither transcriptional nor translational evidence for their origin. Importantly, by integrating multiple types of omics data, including data from genomes, epigenomes, transcriptomes, and translatomes, we found that epigenetic modifications (DNA methylation and histone modification) play an important role in the origination process of de novo genes. Intriguingly, using the transcriptomes and methylomes from the same population of 84 accessions, we found that de novo genes that are transcribed in approximately half of the total accessions within the population are highly methylated, with lower levels of transcription than those transcribed at other frequencies within the population. We hypothesized that, during the origin of de novo gene alleles, those neutralized to low expression states via DNA methylation have relatively high probabilities of spreading and becoming fixed in a population. Our results highlight the process underlying the origin of de novo genes at the population level, as well as the importance of DNA methylation in this process.
Affiliation(s)
- Zi-Wen Li: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Xi Chen: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Qiong Wu: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Jörg Hagmann: Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Ting-Shen Han: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Yu-Pan Zou: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Song Ge: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Ya-Long Guo: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
|
27
|
Wang J, Yu Y, Tao F, Zhang J, Copetti D, Kudrna D, Talag J, Lee S, Wing RA, Fan C. DNA methylation changes facilitated evolution of genes derived from Mutator-like transposable elements. Genome Biol 2016; 17:92. [PMID: 27154274 PMCID: PMC4858842 DOI: 10.1186/s13059-016-0954-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/01/2015] [Accepted: 04/14/2016] [Indexed: 01/17/2023] Open
Abstract
Background: Mutator-like transposable elements, a class of DNA transposons, exist pervasively in both prokaryotic and eukaryotic genomes, with more than 10,000 copies identified in the rice genome. These elements can capture ectopic genomic sequences that lead to the formation of new gene structures. Here, based on whole-genome comparative analyses, we comprehensively investigated processes and mechanisms of the evolution of putative genes derived from Mutator-like transposable elements in ten Oryza species and the outgroup Leersia perrieri, bridging ~20 million years of evolutionary history. Results: Our analysis identified thousands of putative genes in each of the Oryza species, a large proportion of which have evidence of expression and contain chimeric structures. Consistent with previous reports, we observe that the putative Mutator-like transposable element-derived genes are generally GC-rich and mainly derive from GC-rich parental sequences. Furthermore, we determine that Mutator-like transposable elements capture parental sequences preferentially from genomic regions with low methylation levels and high recombination rates. We explicitly show that methylation levels in the internal and terminal inverted repeat regions of these elements, which might be directed by the 24-nucleotide small RNA-mediated pathway, are different and change dynamically over evolutionary time. Lastly, we demonstrate that putative genes derived from Mutator-like transposable elements tend to be expressed in mature pollen, which has undergone de-methylation programming, thereby providing a permissive expression environment for newly formed/transposable element-derived genes. Conclusions: Our results suggest that DNA methylation may be a primary mechanism to facilitate the origination, survival, and regulation of genes derived from Mutator-like transposable elements, thus contributing to the evolution of gene innovation and novelty in plant genomes.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-0954-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jun Wang: Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
- Yeisoo Yu: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Feng Tao: Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
- Jianwei Zhang: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Dario Copetti: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Dave Kudrna: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Jayson Talag: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Seunghee Lee: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Rod A Wing: Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA; T.T. Chang Genetics Resources Center, International Rice Research Institute, Los Baños, Laguna, 4031, Philippines
- Chuanzhu Fan: Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
|
28
|
Rutley N, Twell D. A decade of pollen transcriptomics. Plant Reprod 2015; 28:73-89. [PMID: 25761645 PMCID: PMC4432081 DOI: 10.1007/s00497-015-0261-7] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 01/09/2015] [Accepted: 02/24/2015] [Indexed: 05/19/2023]
Abstract
Key message: Overview of pollen transcriptome studies. Pollen development is driven by gene expression, and knowledge of the molecular events underlying this process has undergone a quantum leap in the last decade through studies of the transcriptome. Here, we outline historical evidence for male haploid gene expression and review the wealth of pollen transcriptome data now available. Knowledge of the transcriptional capacity of pollen has progressed from genetic studies to the direct analysis of RNA and from gene-by-gene studies to analyses on a genomic scale. Microarray and/or RNA-seq data can now be accessed for all phases and cell types of developing pollen encompassing 10 different angiosperms. These growing resources have accelerated research and will undoubtedly inspire new directions and the application of system-based research into the mechanisms that govern the development, function and evolution of angiosperm pollen.
Affiliation(s)
- Nicholas Rutley: Department of Biology, University of Leicester, Leicester, LE1 7RH UK
- David Twell: Department of Biology, University of Leicester, Leicester, LE1 7RH UK
|
29
|
Venton D. Highlight: The "Out of Pollen" Hypothesis--In Plants, Animals New Genes Originate in Male Sex Tissue. Genome Biol Evol 2014; 6:2849-50. [PMID: 25352255 PMCID: PMC4224350 DOI: 10.1093/gbe/evu224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|