1
Arora S, Hamid F, Kumar S. Fusion transcripts in plants: hidden layer of transcriptome complexity. TRENDS IN PLANT SCIENCE 2025; 30:229-231. [PMID: 39753389 DOI: 10.1016/j.tplants.2024.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2024] [Revised: 12/04/2024] [Accepted: 12/06/2024] [Indexed: 03/08/2025]
Abstract
In the realm of genetic information, fusion transcripts contribute to the intricate complexity of the transcriptome across various organisms. Recently, Cong et al. investigated these RNAs in rice, maize, soybean, and arabidopsis (Arabidopsis thaliana), revealing conserved characteristics. These findings enhance our understanding of the functional roles and evolutionary significance of these fusion transcripts.
Affiliation(s)
- Simran Arora
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Fiza Hamid
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Shailesh Kumar
  - Bioinformatics Laboratory, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
2
Chitkara P, Singh A, Gangwar R, Bhardwaj R, Zahra S, Arora S, Hamid F, Arya A, Sahu N, Chakraborty S, Ramesh M, Kumar S. The landscape of fusion transcripts in plants: a new insight into genome complexity. BMC PLANT BIOLOGY 2024; 24:1162. [PMID: 39627690 PMCID: PMC11616359 DOI: 10.1186/s12870-024-05900-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 11/29/2024] [Indexed: 12/06/2024]
Abstract
BACKGROUND Fusion transcripts (FTs), generated by the fusion of genes at the DNA level or by RNA-level splicing events, significantly contribute to transcriptome diversity. FTs are usually considered unique features of neoplasia and serve as biomarkers and therapeutic targets for multiple cancers. The latest findings show the presence of FTs in normal human physiology. Several discrete reports have mentioned the presence of fusion transcripts in planta, with important roles in stress responses, morphological alterations, or traits (e.g. seed size). RESULTS In this study, we identified 169,197 fusion transcripts in 2795 transcriptome datasets of Arabidopsis thaliana, Cicer arietinum, and Oryza sativa by using a combination of tools, and confirmed the translational activity of 150 fusion transcripts through proteomic datasets. Analysis of the FT junction sequences and their association with epigenetic factors, as revealed by ChIP-Seq datasets, demonstrated an organised process of fusion formation at the DNA level. We investigated the possible impact of three-dimensional chromatin conformation on intra-chromosomal fusion events by leveraging the Hi-C datasets with the incidence of fusion transcripts. We further utilised the long-read RNA-Seq datasets to validate the most frequently recurring fusion transcripts in each plant species, followed by further authentication through RT-PCR and Sanger sequencing. CONCLUSIONS Our findings suggest that a significant portion of fusion events may be attributed to alternative splicing during transcription, accounting for numerous fusion events without a proportional increase in the number of RNA pairs. Even non-nuclear DNA transcripts from mitochondria and chloroplasts can participate in intra- and inter-chromosomal fusion formation. Genes in close spatial proximity are more prone to undergoing fusion formation, especially in intra-chromosomal FTs. Most of the fusion transcripts may not undergo translation and may instead serve as long non-coding RNAs. The low validation rate of FTs in plants indicated that the fusion transcripts are expressed at very low levels, as is the case in humans. FTs often originate from parental genes involved in essential biological processes, suggesting their relevance across diverse tissues and stress conditions. This study presents a comprehensive repository of fusion transcripts, offering valuable insights into their roles in vital physiological processes and stress responses.
Affiliation(s)
- Pragya Chitkara
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Ajeet Singh
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
  - Baylor College of Medicine, Houston, TX, USA
- Rashmi Gangwar
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Rohan Bhardwaj
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
  - Technical University of Munich, Freising, Germany
- Shafaque Zahra
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
- Simran Arora
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Fiza Hamid
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Ajay Arya
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Namrata Sahu
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Srija Chakraborty
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
  - University of Nottingham, Sutton Bonington Campus, Loughborough, UK
- Madhulika Ramesh
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Shailesh Kumar
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
3
Arya A, Arora S, Hamid F, Kumar S. PFusionDB: a comprehensive database of plant-specific fusion transcripts. 3 Biotech 2024; 14:282. [PMID: 39479298 PMCID: PMC11519250 DOI: 10.1007/s13205-024-04132-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 10/20/2024] [Indexed: 11/02/2024]
Abstract
Fusion transcripts (FTs) are well-known cancer biomarkers but remain relatively understudied in plants. Here, we developed PFusionDB (www.nipgr.ac.in/PFusionDB), a novel plant-specific fusion-transcript database. It is a comprehensive repository of 80,170, 39,108, 83,330, and 11,500 unique fusions detected in 1280, 637, 697, and 181 RNA-Seq samples of Arabidopsis thaliana, Oryza sativa japonica, Oryza sativa indica, and Cicer arietinum, respectively. Of these, a total of 76,599 (Arabidopsis thaliana), 35,480 (Oryza sativa japonica), 72,099 (Oryza sativa indica), and 9524 (Cicer arietinum) fusion transcripts are non-recurrent, i.e., found in only one sample. Identification of FTs was performed by using a total of five tools, viz. EricScript-Plants, STAR-Fusion, TrinityFusion, SQUID, and MapSplice. At PFusionDB, the available fundamental details of fusion events include the parental genes, junction sequence, expression levels of fusion transcripts, breakpoint coordinates, strand information, tissue type, treatment information, fusion type, PFusionDB ID, and Sequence Read Archive (SRA) ID. Further, two search modules, 'Simple Search' and 'Advanced Search', along with a 'Browse' option for data download, are provided for the ease of users. Three distinct modules, viz. 'BLASTN', 'SW Align', and 'Mapping', are also available for efficient query sequence mapping and alignment to FTs. PFusionDB serves as a crucial resource for delving into the intricate world of fusion transcripts in plants, providing researchers with a foundation for further exploration and analysis. Database URL: www.nipgr.ac.in/PFusionDB. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-024-04132-1.
Affiliation(s)
- Ajay Arya
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067 India
- Simran Arora
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067 India
- Fiza Hamid
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067 India
- Shailesh Kumar
  - Bioinformatics Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067 India
4
Zhou Y, Zhang C, Zhang L, Ye Q, Liu N, Wang M, Long G, Fan W, Long M, Wing RA. Gene fusion as an important mechanism to generate new genes in the genus Oryza. Genome Biol 2022; 23:130. [PMID: 35706016 PMCID: PMC9199173 DOI: 10.1186/s13059-022-02696-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/01/2021] [Accepted: 05/30/2022] [Indexed: 11/16/2022]
Abstract
BACKGROUND Events of gene fusion have been reported in several organisms. However, the general role of gene fusion as part of new gene origination remains unknown. RESULTS We conduct genome-wide interrogations of four Oryza genomes by designing and implementing novel pipelines to detect fusion genes. Based on the phylogeny of ten plant species, we detect 310 fusion genes across four Oryza species. The estimated rate of origination of fusion genes in the Oryza genus is as high as 63 fusion genes per species per million years, which is fixed at 16 fusion genes per species per million years and much higher than that in flies. By RNA sequencing analysis, we find more than 44% of the fusion genes are expressed and 90% of gene pairs show strong signals of purifying selection. Further analysis of CRISPR/Cas9 knockout lines indicates that newly formed fusion genes regulate phenotype traits including seed germination, shoot length and root length, suggesting the functional significance of these genes. CONCLUSIONS We detect new fusion genes that may drive phenotype evolution in Oryza. This study provides novel insights into the genome evolution of Oryza.
Affiliation(s)
- Yanli Zhou
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Chengjun Zhang
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
- Li Zhang
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
  - Chinese Institute for Brain Research (CIBR), Beijing, 102206, China
- Qiannan Ye
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Ningyawen Liu
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Muhua Wang
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
  - State Key Laboratory for Biocontrol, School of Marine Sciences, Sun Yat-sen University, Zhuhai, 519000, China
- Guangqiang Long
  - Key Laboratory of Medicinal Plant Biology of Yunnan Province, Yunnan Agricultural University, Kunming, Yunnan, 650201, China
- Wei Fan
  - Key Laboratory of Medicinal Plant Biology of Yunnan Province, Yunnan Agricultural University, Kunming, Yunnan, 650201, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
- Rod A Wing
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
  - Center for Desert Agriculture, King Abdullah University of Science & Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
5
Simon M, Durand S, Ricou A, Vrielynck N, Mayjonade B, Gouzy J, Boyer R, Roux F, Camilleri C, Budar F. APOK3, a pollen killer antidote in Arabidopsis thaliana. Genetics 2022; 221:6603116. [PMID: 35666201 DOI: 10.1093/genetics/iyac089] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/28/2022] [Accepted: 05/27/2022] [Indexed: 11/14/2022]
Abstract
The principles of heredity state that the two alleles carried by a heterozygote are equally transmitted to the progeny. However, genomic regions that escape this rule have been reported in many organisms. This is notably the case for genetic loci referred to as gamete killers, where one allele enhances its transmission by causing the death of the gametes that do not carry it. Gamete killers are of great interest, particularly for understanding mechanisms of evolution and speciation. Although gamete killers are common in plants, only a few, all in rice, have so far been resolved to their causal genes. Here, we studied a pollen killer found in hybrids between two accessions of Arabidopsis thaliana. Exploring natural variation, we observed this pollen killer in many crosses within the species. Genetic analyses revealed that three genetically linked elements are necessary for pollen killer activity. Using mutants, we showed that this pollen killer works according to a poison-antidote model, where the poison kills pollen grains not producing the antidote. We identified the gene encoding the antidote, a chimeric protein targeted to mitochondria. De novo genomic sequencing in twelve natural variants with different behaviors regarding the pollen killer revealed a hypervariable locus, with important structural variations particularly in killer genotypes, where the antidote gene recently underwent duplications. Our results strongly suggest that the gene has newly evolved within A. thaliana. Finally, we identified polymorphisms in the protein sequence that are related to its antidote activity.
Affiliation(s)
- Matthieu Simon
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
- Stéphanie Durand
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
- Anthony Ricou
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
- Nathalie Vrielynck
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
- Jérôme Gouzy
  - LIPME, Université de Toulouse, INRAE, CNRS, 31326 Castanet-Tolosan, France
- Roxane Boyer
  - INRAE, GeT-PlaGe, Genotoul, 31326 Castanet-Tolosan, France (doi: 10.15454/1.5572370921303193E12)
- Fabrice Roux
  - LIPME, Université de Toulouse, INRAE, CNRS, 31326 Castanet-Tolosan, France
- Christine Camilleri
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
- Françoise Budar
  - Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), 78000, Versailles, France
6
Genetic Dissection of Seed Dormancy using Chromosome Segment Substitution Lines in Rice (Oryza sativa L.). Int J Mol Sci 2020; 21:ijms21041344. [PMID: 32079255 PMCID: PMC7072991 DOI: 10.3390/ijms21041344] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/14/2020] [Revised: 02/08/2020] [Accepted: 02/14/2020] [Indexed: 01/26/2023]
Abstract
Timing of germination determines whether a new plant life cycle can be initiated; therefore, appropriate dormancy and rapid germination under diverse environmental conditions are the most important features for a seed. However, the genetic architecture of seed dormancy and germination behavior remains largely elusive. In the present study, a linkage analysis for seed dormancy and germination behavior was conducted using a set of 146 chromosome segment substitution lines (CSSLs), of which each carries a single or a few chromosomal segments of Nipponbare (NIP) in the background of Zhenshan 97 (ZS97). A total of 36 quantitative trait loci (QTLs) for six germination parameters were identified. Among them, qDOM3.1 was validated as a major QTL for seed dormancy in a segregation population derived from the qDOM3.1 near-isogenic line, and further delimited into a genomic region of 90 kb on chromosome 3. Based on genetic analysis and gene expression profiles, the candidate genes were restricted to eight genes, of which four were responsive to the addition of abscisic acid (ABA). Among them, LOC_Os03g01540 was involved in the ABA signaling pathway to regulate seed dormancy. The results will facilitate cloning the major QTLs and understanding the genetic architecture for seed dormancy and germination in rice and other crops.
7
Rapid evolution of protein diversity by de novo origination in Oryza. Nat Ecol Evol 2019; 3:679-690. [PMID: 30858588 DOI: 10.1038/s41559-019-0822-5] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 05/08/2018] [Accepted: 01/23/2019] [Indexed: 12/22/2022]
Abstract
New protein-coding genes that arise de novo from non-coding DNA sequences contribute to protein diversity. However, de novo gene origination is challenging to study as it requires high-quality reference genomes for closely related species, evidence for ancestral non-coding sequences, and transcription and translation of the new genes. High-quality genomes of 13 closely related Oryza species provide unprecedented opportunities to understand de novo origination events. Here, we identify a large number of young de novo genes with discernible recent ancestral non-coding sequences and evidence of translation. Using pipelines examining the synteny relationship between genomes and reciprocal-best whole-genome alignments, we detected at least 175 de novo open reading frames in the focal species O. sativa subspecies japonica, which were all detected in RNA sequencing-based transcriptomes. Mass spectrometry-based targeted proteomics and ribosomal profiling show translational evidence for 57% of the de novo genes. In recent divergence of Oryza, an average of 51.5 de novo genes per million years were generated and retained. We observed evolutionary patterns in which excess indels and early transcription were favoured in origination with a stepwise formation of gene structure. These data reveal that de novo genes contribute to the rapid evolution of protein diversity under positive selection.
8
Johnson C, Conrad LJ, Patel R, Anderson S, Li C, Pereira A, Sundaresan V. Reproductive Long Intergenic Noncoding RNAs Exhibit Male Gamete Specificity and Polycomb Repressive Complex 2-Mediated Repression. PLANT PHYSIOLOGY 2018; 177:1198-1217. [PMID: 29844229 PMCID: PMC6053002 DOI: 10.1104/pp.17.01269] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/19/2017] [Accepted: 05/15/2018] [Indexed: 05/22/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been characterized extensively in animals and are involved in several processes, including homeobox gene expression and X-chromosome inactivation. In comparison, there has been much less detailed characterization of plant lncRNAs, and the number of distinct lncRNAs encoded in plant genomes and their regulation by developmental and epigenetic mechanisms remain largely unknown. Here, we analyzed transcriptome data from Asian rice (Oryza sativa) and identified 6,309 long intergenic noncoding RNAs (lincRNAs), focusing on their expression in reproductive tissues and organs. Most O. sativa lincRNAs were expressed in a highly tissue-specific manner, with an unexpectedly high fraction specifically expressed in male gametes. Mutation of a component of the Polycomb Repressive Complex2 (PRC2) resulted in derepression of another large class of lincRNAs, whose expression is correlated with H3K27 trimethylation in developing panicles. Overlap with the sperm cell-specific lincRNAs suggests that epigenetic repression of lincRNAs in the panicles was partially relieved in the male germline. Expression of a subset of lincRNAs also showed modulation by drought in reproductive tissues. Comparison with other cereal genomes showed that the lincRNAs generally have low levels of conservation at both the sequence and structural levels. Use of a novelty detection support vector machine model enabled the detection of nucleotide sequence and structural homology in ∼10% and ∼4% of the lincRNAs in genomes of purple false brome (Brachypodium distachyon) and maize (Zea mays), respectively. This is the first study to report on a large number of lncRNAs that are targets of repression by PRC2 rather than mediating regulation via PRC2. That the vast majority of the lincRNAs reported here do not overlap with those of other rice studies indicates that these are a significant addition to the known lincRNAs in rice.
Affiliation(s)
- Cameron Johnson
  - Plant Biology Department, University of California, Davis, California 95616
- Liza J Conrad
  - Plant Biology Department, University of California, Davis, California 95616
- Ravi Patel
  - Plant Biology Department, University of California, Davis, California 95616
- Sarah Anderson
  - Plant Biology Department, University of California, Davis, California 95616
- Chenxin Li
  - Plant Biology Department, University of California, Davis, California 95616
- Andy Pereira
  - Departments of Crop, Soil, and Environmental Sciences and Plant Pathology, University of Arkansas, Fayetteville, Arkansas 72701
9
Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 2018; 50:285-296. [DOI: 10.1038/s41588-018-0040-0] [Citation(s) in RCA: 289] [Impact Index Per Article: 41.3] [Received: 10/29/2017] [Accepted: 12/18/2017] [Indexed: 11/08/2022]
10
Emergence of a Novel Chimeric Gene Underlying Grain Number in Rice. Genetics 2016; 205:993-1002. [PMID: 27986805 DOI: 10.1534/genetics.116.188201] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/02/2016] [Accepted: 12/08/2016] [Indexed: 02/05/2023]
Abstract
Grain number is an important factor in determining grain production of rice (Oryza sativa L.). The molecular genetic basis for grain number is complex. Discovering new genes involved in regulating rice grain number increases our knowledge regarding its molecular mechanisms and aids breeding programs. Here, we identified GRAINS NUMBER 2 (GN2), a novel gene that is responsible for rice grain number, from "Yuanjiang" common wild rice (O. rufipogon Griff.). Transgenic plants overexpressing GN2 showed lower grain number, reduced plant height, and later heading date than control plants. Interestingly, GN2 arose through the insertion of a 1094-bp sequence from LOC_Os02g45150 into the third exon of LOC_Os02g56630, and the inserted sequence recruited its nearby sequence to generate the chimeric GN2. The gene structure and expression pattern of GN2 were distinct from those of LOC_Os02g45150 and LOC_Os02g56630. Sequence analysis showed that GN2 may be generated in the natural population of Yuanjiang common wild rice. In this study, we identified a novel functional chimeric gene and also provided information regarding the molecular mechanisms regulating rice grain number.
11
Wang J, Tao F, Marowsky NC, Fan C. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes. PLANT PHYSIOLOGY 2016; 172:427-40. [PMID: 27485883 PMCID: PMC5074645 DOI: 10.1104/pp.16.01177] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 07/27/2016] [Accepted: 08/01/2016] [Indexed: 05/02/2023]
Abstract
Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they are more likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes.
Affiliation(s)
- Jun Wang
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Feng Tao
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Nicholas C Marowsky
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
- Chuanzhu Fan
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202
12
Wang J, Tao F, Marowsky NC, Fan C. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes. PLANT PHYSIOLOGY 2016. [PMID: 27485883 DOI: 10.1104/pp.16.01177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2023]
13
Wang J, Yu Y, Tao F, Zhang J, Copetti D, Kudrna D, Talag J, Lee S, Wing RA, Fan C. DNA methylation changes facilitated evolution of genes derived from Mutator-like transposable elements. Genome Biol 2016; 17:92. [PMID: 27154274 PMCID: PMC4858842 DOI: 10.1186/s13059-016-0954-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/01/2015] [Accepted: 04/14/2016] [Indexed: 01/17/2023]
Abstract
Background Mutator-like transposable elements, a class of DNA transposons, exist pervasively in both prokaryotic and eukaryotic genomes, with more than 10,000 copies identified in the rice genome. These elements can capture ectopic genomic sequences that lead to the formation of new gene structures. Here, based on whole-genome comparative analyses, we comprehensively investigated processes and mechanisms of the evolution of putative genes derived from Mutator-like transposable elements in ten Oryza species and the outgroup Leersia perieri, bridging ~20 million years of evolutionary history. Results Our analysis identified thousands of putative genes in each of the Oryza species, a large proportion of which have evidence of expression and contain chimeric structures. Consistent with previous reports, we observe that the putative Mutator-like transposable element-derived genes are generally GC-rich and mainly derive from GC-rich parental sequences. Furthermore, we determine that Mutator-like transposable elements capture parental sequences preferentially from genomic regions with low methylation levels and high recombination rates. We explicitly show that methylation levels in the internal and terminated inverted repeat regions of these elements, which might be directed by the 24-nucleotide small RNA-mediated pathway, are different and change dynamically over evolutionary time. Lastly, we demonstrate that putative genes derived from Mutator-like transposable elements tend to be expressed in mature pollen, which have undergone de-methylation programming, thereby providing a permissive expression environment for newly formed/transposable element-derived genes. Conclusions Our results suggest that DNA methylation may be a primary mechanism to facilitate the origination, survival, and regulation of genes derived from Mutator-like transposable elements, thus contributing to the evolution of gene innovation and novelty in plant genomes. 
Electronic supplementary material The online version of this article (doi:10.1186/s13059-016-0954-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jun Wang
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
- Yeisoo Yu
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Feng Tao
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
- Jianwei Zhang
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Dario Copetti
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Dave Kudrna
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Jayson Talag
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Seunghee Lee
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
- Rod A Wing
- Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA; T.T. Chang Genetics Resources Center, International Rice Research Institute, Los Baños, Laguna, 4031, Philippines
- Chuanzhu Fan
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI, 48202, USA
14
Kim K, Lee SC, Lee J, Yu Y, Yang K, Choi BS, Koh HJ, Waminal NE, Choi HI, Kim NH, Jang W, Park HS, Lee J, Lee HO, Joh HJ, Lee HJ, Park JY, Perumal S, Jayakodi M, Lee YS, Kim B, Copetti D, Kim S, Kim S, Lim KB, Kim YD, Lee J, Cho KS, Park BS, Wing RA, Yang TJ. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci Rep 2015; 5:15655. [PMID: 26506948 PMCID: PMC4623524 DOI: 10.1038/srep15655] [Citation(s) in RCA: 149] [Impact Index Per Article: 14.9] [Received: 07/07/2015] [Accepted: 09/30/2015] [Indexed: 12/15/2022] Open
Abstract
Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis.
Affiliation(s)
- Kyunghee Kim
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea; Phyzen Genome Institute, 501-1, Gwanak Century Tower, Kwanak-gu, Seoul, 151-836, Republic of Korea
- Sang-Choon Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Junki Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Yeisoo Yu
- Phyzen Genome Institute, 501-1, Gwanak Century Tower, Kwanak-gu, Seoul, 151-836, Republic of Korea; Arizona Genomics Institute, School of Plant Sciences, The University of Arizona, Tucson, Arizona, 85721, USA
- Kiwoung Yang
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea; Department of Horticulture, Sunchon National University, Suncheon, 540-950, Republic of Korea
- Beom-Soon Choi
- Phyzen Genome Institute, 501-1, Gwanak Century Tower, Kwanak-gu, Seoul, 151-836, Republic of Korea
- Hee-Jong Koh
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Nomar Espinosa Waminal
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Hong-Il Choi
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Nam-Hoon Kim
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Woojong Jang
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Hyun-Seung Park
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Jonghoon Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Hyun Oh Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea; Phyzen Genome Institute, 501-1, Gwanak Century Tower, Kwanak-gu, Seoul, 151-836, Republic of Korea
- Ho Jun Joh
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Hyeon Ju Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Jee Young Park
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Sampath Perumal
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Murukarthick Jayakodi
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Yun Sun Lee
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Backki Kim
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Dario Copetti
- Arizona Genomics Institute, School of Plant Sciences, The University of Arizona, Tucson, Arizona, 85721, USA
- Soonok Kim
- Biological and Genetic Resources Assessment Division, National Institute of Biological Resources, Incheon, 404-170, Republic of Korea
- Sunggil Kim
- Department of Plant Biotechnology, Biotechnology Research Institute, Chonnam National University, Gwangju, 500-757, Republic of Korea
- Ki-Byung Lim
- Department of Horticultural Science, Kyungpook National University, Daegu, 702-701, Republic of Korea
- Young-Dong Kim
- Department of Life Science, Hallym University, Chuncheon, Kangwon-do, 200-702, Republic of Korea
- Jungho Lee
- Green Plant Institute, #2-202 Biovalley, 89 Seoho-ro, Kwonseon-gu, Suwon, Republic of Korea
- Kwang-Su Cho
- Highland Agriculture Research Institute, National Institute of Crop Science, Rural Development Administration, Pyeongchang-gun, Kangwon-do, 232-955, Republic of Korea
- Beom-Seok Park
- Department of Agricultural Biotechnology, National Academy of Agricultural Science, Rural Development Administration, Jeonju, 560-500, Republic of Korea
- Rod A Wing
- Arizona Genomics Institute, School of Plant Sciences, The University of Arizona, Tucson, Arizona, 85721, USA
- Tae-Jin Yang
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea