1
Fang C, Yang M, Tang Y, Zhang L, Zhao H, Ni H, Chen Q, Meng F, Jiang J. Dynamics of cis-regulatory sequences and transcriptional divergence of duplicated genes in soybean. Proc Natl Acad Sci U S A 2023; 120:e2303836120. [PMID: 37871213 PMCID: PMC10622917 DOI: 10.1073/pnas.2303836120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 09/19/2023] [Indexed: 10/25/2023] Open
Abstract
Transcriptional divergence of duplicated genes after whole genome duplication (WGD) has been described in many plant lineages and is often associated with subgenome dominance, a genome-wide mechanism. However, it is unknown what underlies the transcriptional divergence of duplicated genes in polyploid species that lack subgenome dominance. Soybean is a paleotetraploid with a WGD that occurred 5 to 13 Mya. Approximately 50% of the duplicated genes retained from this WGD exhibit transcriptional divergence. We developed accessible chromatin region (ACR) datasets from leaf, flower, and seed tissues using MNase-hypersensitivity sequencing. We validated enhancer function of several ACRs associated with known genes using CRISPR/Cas9-mediated genome editing. The ACR datasets were used to examine and correlate the transcriptional patterns of 17,111 pairs of duplicated genes in different tissues. We demonstrate that ACR dynamics are correlated with divergence of both expression level and tissue specificity of individual gene pairs. Gain or loss of flanking ACRs and mutation of cis-regulatory elements (CREs) within the ACRs can change the balance of the expression level and/or tissue specificity of the duplicated genes. Analysis of DNA sequences associated with ACRs revealed that the extensive sequence rearrangement after the WGD reshaped the CRE landscape, which appears to play a key role in the transcriptional divergence of duplicated genes in soybean. This may represent a general mechanism for transcriptional divergence of duplicated genes in polyploids that lack subgenome dominance.
Affiliation(s)
- Chao Fang
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824
- Mingyu Yang
  - Northeast Institute of Geography and Agroecology, Key Laboratory of Soybean Molecular Design Breeding, Chinese Academy of Sciences, Harbin 150081, China
  - Key Laboratory of Soybean Biology in Chinese Ministry of Education, Northeast Agricultural University, Harbin 150030, China
- Yuecheng Tang
  - Northeast Institute of Geography and Agroecology, Key Laboratory of Soybean Molecular Design Breeding, Chinese Academy of Sciences, Harbin 150081, China
  - Key Laboratory of Soybean Biology in Chinese Ministry of Education, Northeast Agricultural University, Harbin 150030, China
- Ling Zhang
  - Agro-Biotechnology Research Institute, Jilin Academy of Agricultural Sciences, Changchun 130033, China
- Hainan Zhao
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824
- Hejia Ni
  - Key Laboratory of Soybean Biology in Chinese Ministry of Education, Northeast Agricultural University, Harbin 150030, China
- Qingshan Chen
  - Key Laboratory of Soybean Biology in Chinese Ministry of Education, Northeast Agricultural University, Harbin 150030, China
- Fanli Meng
  - Northeast Institute of Geography and Agroecology, Key Laboratory of Soybean Molecular Design Breeding, Chinese Academy of Sciences, Harbin 150081, China
- Jiming Jiang
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824
  - Michigan State University AgBioResearch, East Lansing, MI 48824
2
Song H, Wang Q, Zhang Z, Lin K, Pang E. Identification of clade-wide putative cis-regulatory elements from conserved non-coding sequences in Cucurbitaceae genomes. Horticulture Research 2023; 10:uhad038. [PMID: 37799630 PMCID: PMC10548412 DOI: 10.1093/hr/uhad038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 02/20/2023] [Indexed: 10/07/2023]
Abstract
Cis-regulatory elements regulate gene expression and play an essential role in the development and physiology of organisms. Many conserved non-coding sequences (CNSs) function as cis-regulatory elements. They control the development of various lineages. However, predicting clade-wide cis-regulatory elements across several closely related species remains challenging. Based on the relationship between CNSs and cis-regulatory elements, we present a computational approach that predicts the clade-wide putative cis-regulatory elements in 12 Cucurbitaceae genomes. Using 12-way whole-genome alignment, we first obtained 632 112 CNSs in Cucurbitaceae. Next, we identified 16 552 Cucurbitaceae-wide cis-regulatory elements based on collinearity among all 12 Cucurbitaceae plants. Furthermore, we predicted 3 271 potential regulatory pairs in the cucumber genome, of which 98 were verified using integrative RNA sequencing and ChIP sequencing datasets from samples collected during various fruit development stages. The CNSs, Cucurbitaceae-wide cis-regulatory elements, and their target genes are accessible at http://cmb.bnu.edu.cn/cisRCNEs_cucurbit/. These elements are valuable resources for functionally annotating CNSs and their regulatory roles in Cucurbitaceae genomes.
Affiliation(s)
- Hongtao Song
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Qi Wang
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Zhonghua Zhang
  - College of Horticulture, Qingdao Agricultural University, Qingdao 266109, China
- Kui Lin
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Erli Pang
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
3
Li T, Yin L, Stoll CE, Lisch D, Zhao M. Conserved noncoding sequences and de novo Mutator insertion alleles are imprinted in maize. Plant Physiology 2023; 191:299-316. [PMID: 36173333 PMCID: PMC9806621 DOI: 10.1093/plphys/kiac459] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/21/2022] [Accepted: 08/30/2022] [Indexed: 05/20/2023]
Abstract
Genomic imprinting is an epigenetic phenomenon in which differential allele expression occurs in a parent-of-origin-dependent manner. Imprinting in plants is tightly linked to transposable elements (TEs), and it has been hypothesized that genomic imprinting may be a consequence of demethylation of TEs. Here, we performed high-throughput sequencing of ribonucleic acids from four maize (Zea mays) endosperms that segregated newly silenced Mutator (Mu) transposons and identified 110 paternally expressed imprinted genes (PEGs) and 139 maternally expressed imprinted genes (MEGs). Additionally, two potentially novel paternally suppressed MEGs are associated with de novo Mu insertions. In addition, we find evidence for parent-of-origin effects on expression of 407 conserved noncoding sequences (CNSs) in maize endosperm. The imprinted CNSs are largely localized within genic regions and near genes, but the imprinting status of the CNSs is largely independent of that of their associated genes. Both imprinted CNSs and PEGs have been subject to relaxed selection. However, our data suggest that although MEGs were already subject to a higher mutation rate prior to their being imprinted, imprinting may be the cause of the relaxed selection of PEGs. In addition, although DNA methylation is lower in the maternal alleles of both the maternally and paternally expressed CNSs (mat and pat CNSs), the difference between the two alleles in H3K27me3 levels was only observed in pat CNSs. Together, our findings point to the importance of both transposons and CNSs in genomic imprinting in maize.
Affiliation(s)
- Tong Li
  - Department of Biology, Miami University, Oxford, Ohio 45056, USA
  - State Key Laboratory of Plant Physiology and Biochemistry, National Maize Improvement Center, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, P.R. China
- Liangwei Yin
  - Department of Biology, Miami University, Oxford, Ohio 45056, USA
- Claire E Stoll
  - Department of Biology, Miami University, Oxford, Ohio 45056, USA
- Damon Lisch
  - Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana 47907, USA
- Meixia Zhao
  - Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, Florida 32611, USA
4
Zhou X, Zhu T, Fang W, Yu R, He Z, Chen D. Systematic annotation of conservation states provides insights into regulatory regions in rice. J Genet Genomics 2022; 49:1127-1137. [PMID: 35470092 DOI: 10.1016/j.jgg.2022.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/18/2021] [Revised: 04/08/2022] [Accepted: 04/12/2022] [Indexed: 01/14/2023]
Abstract
Plant genomes contain a large fraction of noncoding sequences. The discovery and annotation of conserved noncoding sequences (CNSs) in plants is an ongoing challenge. Here we report the application of comparative genomics to systematically identify CNSs in 50 well-annotated Gramineae genomes using rice (Oryza sativa) as the reference. We conduct multiple-way whole-genome alignments to the rice genome. The rice genome is annotated as 20 conservation states (CSs) at single-nucleotide resolution using a multivariate hidden Markov model (ConsHMM) based on the multiple-genome alignments. Different states show distinct enrichments for various genomic features, and the conservation scores of CSs are highly correlated with the level of associated chromatin accessibility. We find that at least 33.5% of the rice genome is highly under selection, with more than 70% of the sequence lying outside of coding regions. A catalog of 855,366 regulatory CNSs is generated, and they significantly overlapped with putative active regulatory elements such as promoters, enhancers, and transcription factor binding sites. Collectively, our study provides a resource for elucidating functional noncoding regions of the rice genome and an evolutionary aspect of regulatory sequences in higher plants.
Affiliation(s)
- Xinkai Zhou
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Tao Zhu
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Wen Fang
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Ranran Yu
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Zhaohui He
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
- Dijun Chen
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, Jiangsu 210023, China
5
Pereira WJ, Knaack S, Chakraborty S, Conde D, Folk RA, Triozzi PM, Balmant KM, Dervinis C, Schmidt HW, Ané J, Roy S, Kirst M. Functional and comparative genomics reveals conserved noncoding sequences in the nitrogen-fixing clade. The New Phytologist 2022; 234:634-649. [PMID: 35092309 PMCID: PMC9302667 DOI: 10.1111/nph.18006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Accepted: 01/16/2022] [Indexed: 06/14/2023]
Abstract
Nitrogen is one of the most inaccessible plant nutrients, but certain species have overcome this limitation by establishing symbiotic interactions with nitrogen-fixing bacteria in the root nodule. This root-nodule symbiosis (RNS) is restricted to species within a single clade of angiosperms, suggesting a critical, but undetermined, evolutionary event at the base of this clade. To identify putative regulatory sequences implicated in the evolution of RNS, we evaluated the genomes of 25 species capable of nodulation and identified 3091 conserved noncoding sequences (CNS) in the nitrogen-fixing clade (NFC). We show that the chromatin accessibility of 452 CNS correlates significantly with the regulation of genes responding to lipochitooligosaccharides in Medicago truncatula. These included 38 CNS in proximity to 19 known genes involved in RNS. Five such regions are upstream of MtCRE1, Cytokinin Response Element 1, required to activate a suite of downstream transcription factors necessary for nodulation in M. truncatula. Genetic complementation of an Mtcre1 mutant showed a significant decrease of nodulation in the absence of the five CNS, when they are driving the expression of a functional copy of MtCRE1. CNS identified in the NFC may harbor elements required for the regulation of genes controlling RNS in M. truncatula.
Affiliation(s)
- Wendell J. Pereira
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Sara Knaack
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Daniel Conde
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Ryan A. Folk
  - Department of Biological Sciences, Mississippi State University, Starkville, MS 39762, USA
- Paolo M. Triozzi
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Kelly M. Balmant
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Christopher Dervinis
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Henry W. Schmidt
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
- Jean-Michel Ané
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA
- Sushmita Roy
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, USA
- Matias Kirst
  - School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32611, USA
  - Genetics Institute, University of Florida, Gainesville, FL 32611, USA
6
Horvath R, Josephs EB, Pesquet E, Stinchcombe JR, Wright SI, Scofield D, Slotte T. Selection on Accessible Chromatin Regions in Capsella grandiflora. Mol Biol Evol 2021; 38:5563-5575. [PMID: 34498072 PMCID: PMC8662636 DOI: 10.1093/molbev/msab270] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022] Open
Abstract
Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.
Affiliation(s)
- Robert Horvath
  - Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Emily B Josephs
  - Department of Plant Biology, Michigan State University, Lansing, MI, USA
- Edouard Pesquet
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- John R Stinchcombe
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Stephen I Wright
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Douglas Scofield
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
- Tanja Slotte
  - Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
7
Song B, Buckler ES, Wang H, Wu Y, Rees E, Kellogg EA, Gates DJ, Khaipho-Burch M, Bradbury PJ, Ross-Ibarra J, Hufford MB, Romay MC. Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize. Genome Res 2021; 31:1245-1257. [PMID: 34045362 PMCID: PMC8256870 DOI: 10.1101/gr.266528.120] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 05/27/2020] [Accepted: 05/21/2021] [Indexed: 01/16/2023]
Abstract
Thousands of species will be sequenced in the next few years; however, understanding how their genomes work, without an unlimited budget, requires both molecular and novel evolutionary approaches. We developed a sensitive sequence alignment pipeline to identify conserved noncoding sequences (CNSs) in the Andropogoneae tribe (multiple crop species descended from a common ancestor ∼18 million years ago). The Andropogoneae share similar physiology while being tremendously genomically diverse, harboring a broad range of ploidy levels, structural variation, and transposons. These contribute to the potential of Andropogoneae as a powerful system for studying CNSs and are factors we leverage to understand the function of maize CNSs. We found that 86% of CNSs were comprised of annotated features, including introns, UTRs, putative cis-regulatory elements, chromatin loop anchors, noncoding RNA (ncRNA) genes, and several transposable element superfamilies. CNSs were enriched in active regions of DNA replication in the early S phase of the mitotic cell cycle and showed different DNA methylation ratios compared to the genome-wide background. More than half of putative cis-regulatory sequences (identified via other methods) overlapped with CNSs detected in this study. Variants in CNSs were associated with gene expression levels, and CNS absence contributed to loss of gene expression. Furthermore, the evolution of CNSs was associated with the functional diversification of duplicated genes in the context of maize subgenomes. Our results provide a quantitative understanding of the molecular processes governing the evolution of CNSs in maize.
Affiliation(s)
- Baoxing Song
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
- Edward S Buckler
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Hai Wang
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - National Maize Improvement Center, Key Laboratory of Crop Heterosis and Utilization, Joint Laboratory for International Cooperation in Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Yaoyao Wu
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Evan Rees
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Daniel J Gates
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
- Merritt Khaipho-Burch
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Peter J Bradbury
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Jeffrey Ross-Ibarra
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
  - Center for Population Biology and Genome Center, University of California Davis, Davis, California 95616, USA
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa 50011, USA
- M Cinta Romay
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
8
Javadi SM, Shobbar ZS, Ebrahimi A, Shahbazi M. New insights on key genes involved in drought stress response of barley: gene networks reconstruction, hub, and promoter analysis. J Genet Eng Biotechnol 2021; 19:2. [PMID: 33409810 PMCID: PMC7788114 DOI: 10.1186/s43141-020-00104-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/13/2020] [Accepted: 12/14/2020] [Indexed: 12/16/2022]
Abstract
Background: Barley (Hordeum vulgare L.) is one of the most important cereals worldwide. Although this crop is drought-tolerant, water deficiency negatively affects its growth and production. To detect key genes involved in drought tolerance in barley, a reconstruction of the related gene network and discovery of the hub genes would help. Here, drought-responsive genes in barley were collected through analysis of the available microarray datasets (fold change ≤ −5 or ≥ 5, adjusted p value ≤ 0.05). Protein-protein interaction (PPI) networks were reconstructed.
Results: The hub genes were identified by Cytoscape software using three cytoHubba algorithms (Degree, Closeness, and MNC), leading to the identification of 17 and 16 non-redundant genes at vegetative and reproductive stages, respectively. These genes include several transcription factors, such as HvVp1, HvERF4, HvFUS3, HvCBF6, DRF1.3, HvNAC6, HvCO5, and HvWRKY42, which belong to the AP2, NAC, Zinc-finger, and WRKY families. In addition, the expression pattern of four hub genes was compared between the two studied cultivars, i.e., “Yousef” (drought-tolerant) and “Morocco” (susceptible). The results of real-time PCR revealed that the expression patterns corresponded well with those determined by the microarray. Also, promoter analysis revealed that some TF families, including AP2, NAC, Trihelix, MYB, and one modular (composed of two HD-ZIP TFs), had a binding site in 85% of promoters of the drought-responsive genes and of the hub genes in barley.
Conclusions: The identified hub genes, especially those from the AP2 and NAC families, might be among key TFs that regulate drought-stress response in barley and are suggested as promising candidate genes for further functional analysis.
Affiliation(s)
- Seyedeh Mehri Javadi
  - Department of Biotechnology and Plant Breeding, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Zahra-Sadat Shobbar
  - Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Asa Ebrahimi
  - Department of Biotechnology and Plant Breeding, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Maryam Shahbazi
  - Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran
9
Springer N, de León N, Grotewold E. Challenges of Translating Gene Regulatory Information into Agronomic Improvements. Trends in Plant Science 2019; 24:1075-1082. [PMID: 31377174 DOI: 10.1016/j.tplants.2019.07.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/27/2019] [Revised: 06/26/2019] [Accepted: 07/05/2019] [Indexed: 06/10/2023]
Abstract
Improvement of agricultural species has exploited the genetic variation responsible for complex quantitative traits. Much of the functional variation is regulatory, in cis-regulatory elements and trans-acting factors that ultimately contribute to gene expression differences. However, the identification of gene regulatory network components that, when modulated, will increase plant productivity or resilience, is challenging, yet essential to provide increased predictive power for genome engineering approaches that are likely to benefit useful traits. Here, we discuss the opportunities and limitations of using data obtained from gene coexpression, transcription factor binding, and genome-wide association mapping analyses to predict regulatory interactions that impact crop improvement. It is apparent that a combination of information from these data types is necessary for the reliable identification and utilization of important regulatory interactions that underlie complex agronomic traits.
Affiliation(s)
- Nathan Springer
  - Department of Plant and Microbial Biology, University of Minnesota, St Paul, MN 55108, USA
- Natalia de León
  - Department of Agronomy, University of Wisconsin, Madison, WI 53706, USA
- Erich Grotewold
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
10
pDHS-ELM: computational predictor for plant DNase I hypersensitive sites based on extreme learning machines. Mol Genet Genomics 2018; 293:1035-1049. [DOI: 10.1007/s00438-018-1436-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/06/2017] [Accepted: 03/27/2018] [Indexed: 10/17/2022]
11
Conserved noncoding sequences conserve biological networks and influence genome evolution. Heredity (Edinb) 2018; 120:437-451. [PMID: 29396421 DOI: 10.1038/s41437-018-0055-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/10/2017] [Revised: 12/14/2017] [Accepted: 01/08/2018] [Indexed: 01/24/2023] Open
Abstract
Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.
12
Liang P, Saqib HSA, Zhang X, Zhang L, Tang H. Single-Base Resolution Map of Evolutionary Constraints and Annotation of Conserved Elements across Major Grass Genomes. Genome Biol Evol 2018; 10:473-488. [PMID: 29378032 PMCID: PMC5798027 DOI: 10.1093/gbe/evy006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Accepted: 01/08/2018] [Indexed: 12/20/2022] Open
Abstract
Conserved noncoding sequences (CNSs) are evolutionarily conserved DNA sequences that do not encode proteins but may have potential regulatory roles in gene expression. CNS in crop genomes could be linked to many important agronomic traits and ecological adaptations. Compared with the relatively mature exon annotation protocols, efficient methods are lacking to predict the location of noncoding sequences in the plant genomes. We implemented a computational pipeline that is tailored to the comparisons of plant genomes, yielding a large number of conserved sequences using rice genome as the reference. In this study, we used 17 published grass genomes, along with five monocot genomes as well as the basal angiosperm genome of Amborella trichopoda. Genome alignments among these genomes suggest that at least 12.05% of the rice genome appears to be evolving under constraints in the Poaceae lineage, with close to half of the evolutionarily constrained sequences located outside protein-coding regions. We found evidence for purifying selection acting on the conserved sequences by analyzing segregating SNPs within the rice population. Furthermore, we found that known functional motifs were significantly enriched within CNS, with many motifs associated with the preferred binding of ubiquitous transcription factors. The conserved elements that we have curated are accessible through our public database and the JBrowse server. In-depth functional annotations and evolutionary dynamics of the identified conserved sequences provide a solid foundation for studying gene regulation, genome evolution, as well as to inform gene isolation for cereal biologists.
Affiliation(s)
- Pingping Liang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, China
| | - Hafiz Sohaib Ahmed Saqib
- Institute of Applied Ecology, Fujian Agriculture and Forestry University, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xingtan Zhang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Liangsheng Zhang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Haibao Tang
- Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Center for Genomics and Biotechnology, Ministry of Education; Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
|
13
|
Lai X, Behera S, Liang Z, Lu Y, Deogun JS, Schnable JC. STAG-CNS: An Order-Aware Conserved Noncoding Sequences Discovery Tool for Arbitrary Numbers of Species. Mol Plant 2017; 10:990-999. [PMID: 28602693 DOI: 10.1016/j.molp.2017.05.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/28/2017] [Revised: 05/24/2017] [Accepted: 05/30/2017] [Indexed: 06/07/2023]
Abstract
One method for identifying noncoding regulatory regions of a genome is to quantify rates of divergence between related species, as functional sequence will generally diverge more slowly. Most approaches to identifying these conserved noncoding sequences (CNSs) based on alignment have had relatively large minimum sequence lengths (≥15 bp) compared with the average length of known transcription factor binding sites. To circumvent this constraint, STAG-CNS, which can simultaneously integrate the data from the promoters of conserved orthologous genes in three or more species, was developed. Using the data from up to six grass species made it possible to identify conserved sequences as short as 9 bp with false discovery rate ≤0.05. These CNSs exhibit greater overlap with open chromatin regions identified using DNase I hypersensitivity assays, and are enriched in the promoters of genes involved in transcriptional regulation. STAG-CNS was further employed to characterize loss of conserved noncoding sequences associated with retained duplicate genes from the ancient maize polyploidy. Genes with fewer retained CNSs show lower overall expression, although this bias is more apparent in samples of complex organ systems containing many cell types, suggesting that CNS loss may correspond to a reduced number of expression contexts rather than lower expression levels across the entire ancestral expression domain.
Affiliation(s)
- Xianjun Lai
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA; Maize Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Sairam Behera
- Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
| | - Zhikai Liang
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
| | - Yanli Lu
- Maize Research Institute, Sichuan Agricultural University, Chengdu 611130, China
| | - Jitender S Deogun
- Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA.
| | - James C Schnable
- Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA.
|
14
|
You Q, Yan H, Liu Y, Yi X, Zhang K, Xu W, Su Z. A systemic identification approach for primary transcription start site of Arabidopsis miRNAs from multidimensional omics data. Funct Integr Genomics 2016; 17:353-363. [PMID: 28032247 DOI: 10.1007/s10142-016-0541-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/23/2016] [Revised: 12/13/2016] [Accepted: 12/19/2016] [Indexed: 01/08/2023]
Abstract
The 22-nucleotide non-coding microRNAs (miRNAs) are mostly transcribed by RNA polymerase II and are similar to protein-coding genes. Unlike the clear process from stem-loop precursors to mature miRNAs, the primary transcriptional regulation of miRNA, especially in plants, still needs to be further clarified, including the original transcription start site, functional cis-elements and primary transcript structures. Due to several well-characterized transcription signals in the promoter region, we proposed a systemic approach integrating multidimensional "omics" (including genomics, transcriptomics, and epigenomics) data to improve the genome-wide identification of primary miRNA transcripts. Here, we used the model plant Arabidopsis thaliana to improve the ability to identify candidate promoter locations in intergenic miRNAs and to determine rules for identifying primary transcription start sites of miRNAs by integrating high-throughput omics data, such as the DNase I hypersensitive sites, chromatin immunoprecipitation-sequencing of polymerase II and H3K4me3, as well as high throughput transcriptomic data. As a result, 93% of refined primary transcripts could be confirmed by the primer pairs from a previous study. Cis-element and secondary structure analyses also supported the feasibility of our results. This work will contribute to the primary transcriptional regulatory analysis of miRNAs, and the conserved regulatory pattern may be a suitable miRNA characteristic in other plant species.
Affiliation(s)
- Qi You
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Hengyu Yan
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Yue Liu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Xin Yi
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Kang Zhang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Wenying Xu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
| | - Zhen Su
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.
|
15
|
Control of seed dormancy in Arabidopsis by a cis-acting noncoding antisense transcript. Proc Natl Acad Sci U S A 2016; 113:E7846-E7855. [PMID: 27856735 DOI: 10.1073/pnas.1608827113] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Indexed: 12/22/2022] Open
Abstract
Seed dormancy is one of the most crucial process transitions in a plant's life cycle. Its timing is tightly controlled by the expression level of the Delay of Germination 1 gene (DOG1). DOG1 is the major quantitative trait locus for seed dormancy in Arabidopsis and has been shown to control dormancy in many other plant species. This is reflected by the evolutionary conservation of the functional short alternatively polyadenylated form of the DOG1 mRNA. Notably, the 3' region of DOG1, including the last exon that is not included in this transcript isoform, shows a high level of conservation at the DNA level, but the encoded polypeptide is poorly conserved. Here, we demonstrate that this region of DOG1 contains a promoter for the transcription of a noncoding antisense RNA, asDOG1, that is 5' capped, polyadenylated, and relatively stable. This promoter is autonomous and asDOG1 has an expression profile that is different from known DOG1 transcripts. Using several approaches we show that asDOG1 strongly suppresses DOG1 expression during seed maturation in cis, but is unable to do so in trans. Therefore, the negative regulation of seed dormancy by asDOG1 in cis results in allele-specific suppression of DOG1 expression and promotes germination. Given the evolutionary conservation of the asDOG1 promoter, we propose that this cis-constrained noncoding RNA-mediated mechanism limiting the duration of seed dormancy functions across the Brassicaceae.
|
16
|
A Genomic Analysis of Factors Driving lincRNA Diversification: Lessons from Plants. G3 (Bethesda) 2016; 6:2881-91. [PMID: 27440919 PMCID: PMC5015945 DOI: 10.1534/g3.116.030338] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 12/17/2022]
Abstract
Transcriptomic analyses from across eukaryotes indicate that most of the genome is transcribed at some point in the developmental trajectory of an organism. One class of these transcripts is termed long intergenic noncoding RNAs (lincRNAs). Recently, attention has focused on understanding the evolutionary dynamics of lincRNAs, particularly their conservation within genomes. Here, we take a comparative genomic and phylogenetic approach to uncover factors influencing lincRNA emergence and persistence in the plant family Brassicaceae, to which Arabidopsis thaliana belongs. We searched 10 genomes across the family for evidence of > 5000 lincRNA loci from A. thaliana. From loci conserved in the genomes of multiple species, we built alignments and inferred phylogeny. We then used gene tree/species tree reconciliation to examine the duplication history and timing of emergence of these loci. Emergence of lincRNA loci appears to be linked to local duplication events, but, surprisingly, not whole genome duplication events (WGD), or transposable elements. Interestingly, WGD events are associated with the loss of loci for species having undergone relatively recent polyploidy. Lastly, we identify 1180 loci of the 6480 previously annotated A. thaliana lincRNAs (18%) with elevated levels of conservation. These conserved lincRNAs show higher expression, and are enriched for stress-responsiveness and cis-regulatory motifs known as conserved noncoding sequences (CNSs). These data highlight potential functional pathways and suggest that CNSs may regulate neighboring genes at both the genomic and transcriptomic level. In sum, we provide insight into processes that may influence lincRNA diversification by providing an evolutionary context for previously annotated lincRNAs.
|
17
|
Hoffmann RD, Palmgren M. Purifying selection acts on coding and non-coding sequences of paralogous genes in Arabidopsis thaliana. BMC Genomics 2016; 17:456. [PMID: 27296049 PMCID: PMC4906602 DOI: 10.1186/s12864-016-2803-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/25/2015] [Accepted: 05/27/2016] [Indexed: 01/13/2023] Open
Abstract
Background: Whole-genome duplications in the ancestors of many diverse species provided the genetic material for evolutionary novelty. Several models explain the retention of paralogous genes. However, how these models are reflected in the evolution of coding and non-coding sequences of paralogous genes is unknown. Results: Here, we analyzed the coding and non-coding sequences of paralogous genes in Arabidopsis thaliana and compared these sequences with those of orthologous genes in Arabidopsis lyrata. Paralogs with lower expression than their duplicate had more nonsynonymous substitutions, were more likely to fractionate, and exhibited less similar expression patterns with their orthologs in the other species. Also, lower-expressed genes had greater tissue specificity. Orthologous conserved non-coding sequences in the promoters, introns, and 3′ untranslated regions were less abundant at lower-expressed genes compared to their higher-expressed paralogs. A gene ontology (GO) term enrichment analysis showed that paralogs with similar expression levels were enriched in GO terms related to ribosomes, whereas paralogs with different expression levels were enriched in terms associated with stress responses. Conclusions: Loss of conserved non-coding sequences in one gene of a paralogous gene pair correlates with reduced expression levels that are more tissue specific. Together with increased mutation rates in the coding sequences, this suggests that similar forces of purifying selection act on coding and non-coding sequences. We propose that coding and non-coding sequences evolve concurrently following gene duplication.
Affiliation(s)
- Robert D Hoffmann
- Center for Membrane Pumps in Cells and Disease - PUMPKIN, Danish National Research Foundation, Department of Plant and Environmental Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark.
| | - Michael Palmgren
- Center for Membrane Pumps in Cells and Disease - PUMPKIN, Danish National Research Foundation, Department of Plant and Environmental Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark
|
18
|
CNMS: The preferred genic markers for comparative genomic, molecular phylogenetic, functional genetic diversity and differential gene regulatory expression analyses in chickpea. J Biosci 2015; 40:579-92. [PMID: 26333404 DOI: 10.1007/s12038-015-9545-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
The intra/inter-genomic comparative mapping-based phylogenetic footprinting identified 5 paralogous and 656 orthologous genome-wide CNMS markers in the upstream sequences of chickpea genes. These CNMS markers revealed a high degree of gene-based syntenic relationship between chickpea and Medicago genomes while minimum between chickpea and Vitis genomes. The time of divergence and duplication estimated using CNMS markers highlight the expected phylogenetic relationships between chickpea and six dicot (legume) species as well as occurrence of ancient genome (approximately 53 Mya) with small-scale recent segmental (approximately 10 Mya) duplication events in chickpea. A wider level of functional molecular diversity (14 to 88 percent) and admixed population genetic structure was detected among desi, kabuli and wild genotypes by genic CNMS markers at a genome-wide scale suggesting their utility in large-scale genetic analysis in chickpea. The subfunctionalization at the cis-regulatory element region and TFBS (transcription factor binding site) motif levels in the upstream sequences of CNMS marker-associated orthologous genes than the paralogues was predominant. Functional constraint might have considerable effect on these CNMS-containing regulatory elements controlling consistent orthologous gene expression in dicots. A rapid subfunctionalization based on divergent differential expression of paralogous CNMS marker-associated genes particularly those that underwent recent small-scale segmental duplication events in chickpea was apparent. The differential regulation of expression and subfunctionalization potential of Ultra CNMS marker-associated genes suggest their utility in deciphering the complex gene regulatory function as well as identification and targeted mapping of potential genes/QTLs governing vital agronomic traits in chickpea. The gene-based CNMS markers with desirable inherent genetic attributes like higher degree of comparative genome mapping, functional genetic diversity and differential gene regulatory expression potential can significantly propel the genomics-assisted chickpea crop improvement.
|
19
|
Bajaj D, Saxena MS, Kujur A, Das S, Badoni S, Tripathi S, Upadhyaya HD, Gowda CLL, Sharma S, Singh S, Tyagi AK, Parida SK. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea. J Exp Bot 2015; 66:1271-90. [PMID: 25504138 PMCID: PMC4339591 DOI: 10.1093/jxb/eru478] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 05/22/2023]
Abstract
Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea.
Affiliation(s)
- Deepak Bajaj
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Maneesha S Saxena
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Alice Kujur
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Shouvik Das
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Saurabh Badoni
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Shailesh Tripathi
- Division of Genetics, Indian Agricultural Research Institute (IARI), New Delhi 110012, India
| | - Hari D Upadhyaya
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, Telangana, India
| | - C L L Gowda
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, Telangana, India
| | - Shivali Sharma
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, Telangana, India
| | - Sube Singh
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, Telangana, India
| | - Akhilesh K Tyagi
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
| | - Swarup K Parida
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110067, India
|
20
|
de los Reyes BG, Mohanty B, Yun SJ, Park MR, Lee DY. Upstream regulatory architecture of rice genes: summarizing the baseline towards genus-wide comparative analysis of regulatory networks and allele mining. Rice (N Y) 2015; 8:14. [PMID: 25844119 PMCID: PMC4385054 DOI: 10.1186/s12284-015-0041-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/02/2014] [Accepted: 01/12/2015] [Indexed: 05/23/2023]
Abstract
Dissecting the upstream regulatory architecture of rice genes and their cognate regulator proteins is at the core of network biology and its applications to comparative functional genomics. With the rapidly advancing comparative genomics resources in the genus Oryza, a reference genome annotation that defines the various cis-elements and trans-acting factors that interface each gene locus with various intrinsic and extrinsic signals for growth, development, reproduction and adaptation must be established to facilitate the understanding of phenotypic variation in the context of regulatory networks. Such information is also important to establish the foundation for mining non-coding sequence variation that defines novel alleles and epialleles across the enormous phenotypic diversity represented in rice germplasm. This review presents a synthesis of the state of knowledge and consensus trends regarding the various cis-acting and trans-acting components that define spatio-temporal regulation of rice genes based on representative examples from both foundational studies in other model and non-model plants, and more recent studies in rice. The goal is to summarize the baseline for systematic upstream sequence annotation of the rapidly advancing genome sequence resources in Oryza in preparation for genus-wide functional genomics. Perspectives on the potential applications of such information for gene discovery, network engineering and genomics-enabled rice breeding are also discussed.
Affiliation(s)
| | - Bijayalaxmi Mohanty
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, 117576 Singapore
| | - Song Joong Yun
- Department of Crop Science and Institute of Agricultural Science and Technology, Chonbuk National University, Chonju, 561-756 Korea
| | - Myoung-Ryoul Park
- School of Biology and Ecology, University of Maine, Orono, ME 04469 USA
| | - Dong-Yup Lee
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, 117576 Singapore
|
21
|
Berke L, Snel B. The histone modification H3K27me3 is retained after gene duplication and correlates with conserved noncoding sequences in Arabidopsis. Genome Biol Evol 2014; 6:572-9. [PMID: 24567304 PMCID: PMC3971591 DOI: 10.1093/gbe/evu040] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/24/2022] Open
Abstract
The histone modification H3K27me3 is involved in repression of transcription and plays a crucial role in developmental transitions in both animals and plants. It is deposited by PRC2 (Polycomb repressive complex 2), a conserved protein complex. In Arabidopsis thaliana, H3K27me3 is found at 15% of all genes. These tend to encode transcription factors and other regulators important for development. However, it is not known how PRC2 is recruited to target loci nor how this set of target genes arose during Arabidopsis evolution. To resolve the latter, we integrated A. thaliana gene families with five independent genome-wide H3K27me3 data sets. Gene families were either significantly enriched or depleted of H3K27me3, showing a strong impact of shared ancestry to H3K27me3 distribution. To quantify this, we performed ancestral state reconstruction of H3K27me3 on phylogenetic trees of gene families. The set of H3K27me3-marked genes changed less than expected by chance, suggesting that H3K27me3 was retained after gene duplication. This retention suggests that the PRC2-recruiting signal could be encoded in the DNA and also conserved among certain duplicated genes. Indeed, H3K27me3-marked genes were overrepresented among paralogs sharing conserved noncoding sequences (CNSs) that are enriched with transcription factor binding sites. The association of upstream CNSs with H3K27me3-marked genes represents the first genome-wide connection between H3K27me3 and potential regulatory elements in plants. Thus, we propose that CNSs likely function as part of the PRC2 recruitment in plants.
Affiliation(s)
- Lidija Berke
- Theoretical Biology and Bioinformatics, Department of Biology, Faculty of Science, Utrecht University, The Netherlands
|
22
|
Mewalal R, Mizrachi E, Mansfield SD, Myburg AA. Cell wall-related proteins of unknown function: missing links in plant cell wall development. Plant Cell Physiol 2014; 55:1031-43. [PMID: 24683037 DOI: 10.1093/pcp/pcu050] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 05/18/2023]
Abstract
Lignocellulosic biomass is an important feedstock for the pulp and paper industry as well as emerging biofuel and biomaterial industries. However, the recalcitrance of the secondary cell wall to chemical or enzymatic degradation remains a major hurdle for efficient extraction of economically important biopolymers such as cellulose. It has been estimated that approximately 10-15% of about 27,000 protein-coding genes in the Arabidopsis genome are dedicated to cell wall development; however, only about 130 Arabidopsis genes thus far have experimental evidence validating cell wall function. While many genes have been implicated through co-expression analysis with known genes, a large number are broadly classified as proteins of unknown function (PUFs). Recently the functionality of some of these unknown proteins in cell wall development has been revealed using reverse genetic approaches. Given the large number of cell wall-related PUFs, how do we approach and subsequently prioritize the investigation of such unknown genes that may be essential to or influence plant cell wall development and structure? Here, we address the aforementioned question in two parts; we first identify the different kinds of PUFs based on known and predicted features such as protein domains. Knowledge of inherent features of PUFs may allow for functional inference and a concomitant link to biological context. Secondly, we discuss omics-based technologies and approaches that are helping identify and prioritize cell wall-related PUFs by functional association. In this way, hypothesis-driven experiments can be designed for functional elucidation of many proteins that remain missing links in our understanding of plant cell wall biosynthesis.
Affiliation(s)
- Ritesh Mewalal
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Hatfield, Pretoria, 0028, South Africa
| | - Eshchar Mizrachi
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Hatfield, Pretoria, 0028, South Africa
| | - Shawn D Mansfield
- Department of Wood Science, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - Alexander A Myburg
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Hatfield, Pretoria, 0028, South Africa
|
23
|
A MITE transposon insertion is associated with differential methylation at the maize flowering time QTL Vgt1. G3 (Bethesda) 2014; 4:805-12. [PMID: 24607887 PMCID: PMC4025479 DOI: 10.1534/g3.114.010686] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Indexed: 12/14/2022]
Abstract
One of the major quantitative trait loci for flowering time in maize, the Vegetative to generative transition 1 (Vgt1) locus, corresponds to an upstream (70 kb) noncoding regulatory element of ZmRap2.7, a repressor of flowering. At Vgt1, a miniature transposon (MITE) insertion into a conserved noncoding sequence was previously found to be highly associated with early flowering in independent studies. Because cytosine methylation is known to be associated with transposons and to influence gene expression, we aimed to investigate how DNA methylation patterns in wild-type and mutant Vgt1 correlate with ZmRap2.7 expression. The methylation state at Vgt1 was assayed in leaf samples of maize inbred and F1 hybrid samples, and at the syntenic region in sorghum. The Vgt1-linked conserved noncoding sequence was very scarcely methylated both in maize and sorghum. However, in the early maize Vgt1 allele, the region immediately flanking the highly methylated MITE insertion was significantly more methylated and showed features of methylation spreading. Allele-specific expression assays revealed that the presence of the MITE and its heavy methylation appear to be linked to altered ZmRap2.7 transcription. Although not providing proof of causative connection, our results associate transposon-linked differential methylation with allelic state and gene expression at a major flowering time quantitative trait locus in maize.
|
24
|
Downs GS, Liseron-Monfils C, Lukens LN. Regulatory motifs identified from a maize developmental coexpression network. Genome 2014; 57:181-4. [DOI: 10.1139/gen-2013-0177] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]
Abstract
Transcriptional control is an important determinant of plant development, and distinct modules of coordinated genes characterize the maize developmental transcriptome. Upstream regulatory sequences are often the primary factors that control gene expression pattern and abundance. Here, we identify 244 regulatory motifs that are significantly enriched within 24 gene expression modules previously constructed from transcript abundances of 34 876 Zea mays (maize) gene models from embryogenesis to senescence. Within modules, we identify motifs that have not been characterized. In addition, we identify motifs similar to experimentally verified motifs, and the functions of these motifs overlap with predicted module functions. This work demonstrates the power of transcript-level coexpression modules to identify both variants of known regulatory motifs and novel motifs that control a species’ developmental transcriptome.
Affiliation(s)
- Gregory S. Downs
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Christophe Liseron-Monfils
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Lewis N. Lukens
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
|
25
|
Subramaniam S, Wang X, Freeling M, Pires JC. The fate of Arabidopsis thaliana homeologous CNSs and their motifs in the Paleohexaploid Brassica rapa. Genome Biol Evol 2013; 5:646-60. [PMID: 23493633 PMCID: PMC3641636 DOI: 10.1093/gbe/evt035] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022] Open
Abstract
Following polyploidy, duplicate genes are often deleted, and if they are not, then duplicate regulatory regions are sometimes lost. By what mechanism is this loss and what is the chance that such a loss removes function? To explore these questions, we followed individual Arabidopsis thaliana–A. thaliana conserved noncoding sequences (CNSs) into the Brassica ancestor, through a paleohexaploidy and into Brassica rapa. Thus, a single Brassicaceae CNS has six potential orthologous positions in B. rapa; a single Arabidopsis CNS has three potential homeologous positions. We reasoned that a CNS, if present on a singlet Brassica gene, would be unlikely to lose function compared with a more redundant CNS, and this is the case. Redundant CNSs go nondetectable often. Using this logic, each mechanism of CNS loss was assigned a metric of functionality. By definition, proved deletions do not function as sequence. Our results indicated that CNSs that go nondetectable by base substitution or large insertion are almost certainly still functional (redundancy does not matter much to their detectability frequency), whereas those lost by inferred deletion or indels are approximately 75% likely to be nonfunctional. Overall, an average nondetectable, once-redundant CNS more than 30 bp in length has a 72% chance of being nonfunctional, and that makes sense because 97% of them sort to a molecular mechanism with “deletion” in its description, but base substitutions do cause loss. Similarly, proved-functional G-boxes go undetectable by deletion 82% of the time. Fractionation mutagenesis is a procedure that uses polyploidy as a mutagenic agent to genetically alter RNA expression profiles, and then to construct testable hypotheses as to the function of the lost regulatory site. We show fractionation mutagenesis to be a “deletion machine” in the Brassica lineage.
|
26
|
De Clercq I, Vermeirssen V, Van Aken O, Vandepoele K, Murcha MW, Law SR, Inzé A, Ng S, Ivanova A, Rombaut D, van de Cotte B, Jaspers P, Van de Peer Y, Kangasjärvi J, Whelan J, Van Breusegem F. The membrane-bound NAC transcription factor ANAC013 functions in mitochondrial retrograde regulation of the oxidative stress response in Arabidopsis. THE PLANT CELL 2013; 25:3472-90. [PMID: 24045019 PMCID: PMC3809544 DOI: 10.1105/tpc.113.117168] [Citation(s) in RCA: 246] [Impact Index Per Article: 22.4] [Received: 08/06/2013] [Revised: 08/06/2013] [Accepted: 08/26/2013] [Indexed: 05/18/2023]
Abstract
Upon disturbance of their function by stress, mitochondria can signal to the nucleus to steer the expression of responsive genes. This mitochondria-to-nucleus communication is often referred to as mitochondrial retrograde regulation (MRR). Although reactive oxygen species and calcium are likely candidate signaling molecules for MRR, the protein signaling components in plants remain largely unknown. Through meta-analysis of transcriptome data, we detected a set of genes that are common and robust targets of MRR and used them as a bait to identify its transcriptional regulators. In the upstream regions of these mitochondrial dysfunction stimulon (MDS) genes, we found a cis-regulatory element, the mitochondrial dysfunction motif (MDM), which is necessary and sufficient for gene expression under various mitochondrial perturbation conditions. Yeast one-hybrid analysis and electrophoretic mobility shift assays revealed that the transmembrane domain-containing no apical meristem/Arabidopsis transcription activation factor/cup-shaped cotyledon transcription factors (ANAC013, ANAC016, ANAC017, ANAC053, and ANAC078) bound to the MDM cis-regulatory element. We demonstrate that ANAC013 mediates MRR-induced expression of the MDS genes by direct interaction with the MDM cis-regulatory element and triggers increased oxidative stress tolerance. In conclusion, we characterized ANAC013 as a regulator of MRR upon stress in Arabidopsis thaliana.
Affiliation(s)
- Inge De Clercq
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Vanessa Vermeirssen
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Olivier Van Aken
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Klaas Vandepoele
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Monika W. Murcha
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Simon R. Law
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Annelies Inzé
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Sophia Ng
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Aneta Ivanova
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Debbie Rombaut
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Brigitte van de Cotte
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Pinja Jaspers
- Plant Biology, Department of Biological and Environmental Sciences, University of Helsinki, FI-00014 Helsinki, Finland
- Yves Van de Peer
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- Jaakko Kangasjärvi
- Plant Biology, Department of Biological and Environmental Sciences, University of Helsinki, FI-00014 Helsinki, Finland
- James Whelan
- Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley 6009, Western Australia, Australia
- Department of Botany, School of Life Science, La Trobe University, Bundoora, Victoria 3086, Australia
- Frank Van Breusegem
- Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
|
27
|
Haudry A, Platts AE, Vello E, Hoen DR, Leclercq M, Williamson RJ, Forczek E, Joly-Lopez Z, Steffen JG, Hazzouri KM, Dewar K, Stinchcombe JR, Schoen DJ, Wang X, Schmutz J, Town CD, Edger PP, Pires JC, Schumaker KS, Jarvis DE, Mandáková T, Lysak MA, van den Bergh E, Schranz ME, Harrison PM, Moses AM, Bureau TE, Wright SI, Blanchette M. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat Genet 2013; 45:891-8. [PMID: 23817568 DOI: 10.1038/ng.2684] [Citation(s) in RCA: 211] [Impact Index Per Article: 19.2] [Received: 10/13/2012] [Accepted: 06/04/2013] [Indexed: 12/17/2022]
Abstract
Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species.
Affiliation(s)
- Annabelle Haudry
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
28
|
Hupalo D, Kern AD. Conservation and functional element discovery in 20 angiosperm plant genomes. Mol Biol Evol 2013; 30:1729-44. [PMID: 23640124 DOI: 10.1093/molbev/mst082] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 12/11/2022] Open
Abstract
Here, we describe the construction of a phylogenetically deep, whole-genome alignment of 20 flowering plants, along with an analysis of plant genome conservation. Each included angiosperm genome was aligned to a reference genome, Arabidopsis thaliana, using the LASTZ/MULTIZ paradigm and tools from the University of California-Santa Cruz Genome Browser source code. In addition to the multiple alignment, we created a local genome browser displaying multiple tracks of newly generated genome annotation, as well as annotation sourced from published data of other research groups. An investigation into A. thaliana gene features present in the aligned A. lyrata genome revealed better conservation of start codons, stop codons, and splice sites within our alignments (51% of features from A. thaliana conserved without interruption in A. lyrata) when compared with previous publicly available plant pairwise alignments (34% of features conserved). The detailed view of conservation across angiosperms revealed not only high coding-sequence conservation but also a large set of previously uncharacterized intergenic conservation. From this, we annotated the collection of conserved features, revealing dozens of putative noncoding RNAs, including some with recorded small RNA expression. Comparing conservation between kingdoms revealed a faster decay of vertebrate genome features when compared with angiosperm genomes. Finally, conserved sequences were searched for folding RNA features, including but not limited to noncoding RNA (ncRNA) genes. Among these, we highlight a double hairpin in the 5'-untranslated region (5'-UTR) of the PRIN2 gene and a putative ncRNA with homology targeting the LAF3 protein.
Affiliation(s)
- Daniel Hupalo
- Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire, USA.
|
29
|
Abstract
For decades, transposable elements have been known to produce a wide variety of changes in plant gene expression and function. This has led to the idea that transposable element activity has played a key part in adaptive plant evolution. This Review describes the kinds of changes that transposable elements can cause, discusses evidence that those changes have contributed to plant evolution and suggests future strategies for determining the extent to which these changes have in fact contributed to plant adaptation and evolution. Recent advances in genomics and phenomics for a range of plant species, particularly crops, have begun to allow the systematic assessment of these questions.
Affiliation(s)
- Damon Lisch
- Department of Plant and Microbial Biology, UC Berkeley, Berkeley, California 94720, USA.
|
30
|
Turco G, Schnable JC, Pedersen B, Freeling M. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses. FRONTIERS IN PLANT SCIENCE 2013; 4:170. [PMID: 23874343 PMCID: PMC3708275 DOI: 10.3389/fpls.2013.00170] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 04/04/2013] [Accepted: 05/13/2013] [Indexed: 05/07/2023]
Abstract
Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize.
Affiliation(s)
- James C. Schnable
- *Correspondence: James C. Schnable and Michael Freeling, Department of Plant and Microbial Biology, University of California, 111 Koshland Hall, Berkeley, CA 94720, USA
- Michael Freeling
- *Correspondence: James C. Schnable and Michael Freeling, Department of Plant and Microbial Biology, University of California, 111 Koshland Hall, Berkeley, CA 94720, USA
|
31
|
|
32
|
Kritsas K, Wuest SE, Hupalo D, Kern AD, Wicker T, Grossniklaus U. Computational analysis and characterization of UCE-like elements (ULEs) in plant genomes. Genome Res 2012; 22:2455-66. [PMID: 22987666 PMCID: PMC3514675 DOI: 10.1101/gr.129346.111] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
Ultraconserved elements (UCEs), stretches of DNA that are identical between distantly related species, are enigmatic genomic features whose function is not well understood. First identified and characterized in mammals, UCEs have been proposed to play important roles in gene regulation, RNA processing, and maintaining genome integrity. However, because all of these functions can tolerate some sequence variation, their ultraconserved and ultraselected nature is not explained. We investigated whether there are highly conserved DNA elements without genic function in distantly related plant genomes. We compared the genomes of Arabidopsis thaliana and Vitis vinifera; species that diverged ∼115 million years ago (Mya). We identified 36 highly conserved elements with at least 85% similarity that are longer than 55 bp. Interestingly, these elements exhibit properties similar to mammalian UCEs, such that we named them UCE-like elements (ULEs). ULEs are located in intergenic or intronic regions and are depleted from segmental duplications. Like UCEs, ULEs are under strong purifying selection, suggesting a functional role for these elements. As their mammalian counterparts, ULEs show a sharp drop of A+T content at their borders and are enriched close to genes encoding transcription factors and genes involved in development, the latter showing preferential expression in undifferentiated tissues. By comparing the genomes of Brachypodium distachyon and Oryza sativa, species that diverged ∼50 Mya, we identified a different set of ULEs with similar properties in monocots. The identification of ULEs in plant genomes offers new opportunities to study their possible roles in genome function, integrity, and regulation.
Affiliation(s)
- Konstantinos Kritsas
- Institute of Plant Biology & Zürich-Basel Plant Science Center, University Zürich, CH-8008 Zürich, Switzerland
|
33
|
Conserved non-coding regulatory signatures in Arabidopsis co-expressed gene modules. PLoS One 2012; 7:e45041. [PMID: 23024789 PMCID: PMC3443200 DOI: 10.1371/journal.pone.0045041] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/21/2012] [Accepted: 08/11/2012] [Indexed: 11/24/2022] Open
Abstract
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome.
|
34
|
D'Hont A, Denoeud F, Aury JM, Baurens FC, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M, Da Silva C, Jabbari K, Cardi C, Poulain J, Souquet M, Labadie K, Jourda C, Lengellé J, Rodier-Goud M, Alberti A, Bernard M, Correa M, Ayyampalayam S, Mckain MR, Leebens-Mack J, Burgess D, Freeling M, Mbéguié-A-Mbéguié D, Chabannes M, Wicker T, Panaud O, Barbosa J, Hribova E, Heslop-Harrison P, Habas R, Rivallan R, Francois P, Poiron C, Kilian A, Burthia D, Jenny C, Bakry F, Brown S, Guignon V, Kema G, Dita M, Waalwijk C, Joseph S, Dievart A, Jaillon O, Leclercq J, Argout X, Lyons E, Almeida A, Jeridi M, Dolezel J, Roux N, Risterucci AM, Weissenbach J, Ruiz M, Glaszmann JC, Quétier F, Yahiaoui N, Wincker P. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012; 488:213-7. [PMID: 22801500 DOI: 10.1038/nature11241] [Citation(s) in RCA: 603] [Impact Index Per Article: 50.3] [Received: 02/10/2012] [Accepted: 05/18/2012] [Indexed: 01/17/2023]
Abstract
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
Affiliation(s)
- Angélique D'Hont
- Centre de coopération Internationale en Recherche Agronomique pour le Développement, UMR AGAP, F-34398 Montpellier, France.
|
35
|
Parizot B, Roberts I, Raes J, Beeckman T, De Smet I. In silico analyses of pericycle cell populations reinforce their relation with associated vasculature in Arabidopsis. Philos Trans R Soc Lond B Biol Sci 2012; 367:1479-88. [PMID: 22527390 PMCID: PMC3321678 DOI: 10.1098/rstb.2011.0227] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022] Open
Abstract
In Arabidopsis, lateral root initiation occurs in a subset of pericycle cells at the xylem pole that will divide asymmetrically to give rise to a new lateral root organ. While lateral roots never develop at the phloem pole, it is unclear how the interaction with xylem and phloem poles determines the distinct pericycle identities with different competences. Nevertheless, pericycle cells at these poles are marked by differences in size, by ultrastructural features and by specific proteins and gene expression. Here, we provide transcriptional evidence that pericycle cells are intimately associated with their vascular tissue instead of being a separate concentric layer. This has implications for the identification of cell- and tissue-specific promoters that are necessary to drive and/or alter gene expression locally, avoiding pleiotropic effects. We were able to identify a small set of genes that display specific expression in the phloem or xylem pole pericycle cells, and we were able to identify motifs that are likely to drive expression in either one of those tissues.
Affiliation(s)
- Boris Parizot
- Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Genetics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Ianto Roberts
- Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Genetics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Jeroen Raes
- Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Genetics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Tom Beeckman
- Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Ghent, Belgium
- Department of Plant Biotechnology and Genetics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Ive De Smet
- Division of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough LE12 5RD, UK
|
36
|
Duan J, Wu J, Liu Y, Xiao J, Zhao G, Gu Y, Jia J, Kong X. New cis-regulatory elements in the Rht-D1b locus region of wheat. Funct Integr Genomics 2012; 12:489-500. [PMID: 22592657 DOI: 10.1007/s10142-012-0283-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 02/24/2012] [Revised: 04/09/2012] [Accepted: 04/10/2012] [Indexed: 01/02/2023]
|
37
|
Abstract
Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
|
38
|
Freeling M, Woodhouse MR, Subramaniam S, Turco G, Lisch D, Schnable JC. Fractionation mutagenesis and similar consequences of mechanisms removing dispensable or less-expressed DNA in plants. CURRENT OPINION IN PLANT BIOLOGY 2012; 15:131-9. [PMID: 22341793 DOI: 10.1016/j.pbi.2012.01.015] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Received: 10/25/2011] [Revised: 12/07/2011] [Accepted: 01/21/2012] [Indexed: 05/06/2023]
Abstract
Unlike in mammals, plants rapidly delete functionless, nonrepetitive DNA from their genomes. Following paleopolyploidies, duplicate genes are deleted by intrachromosomal recombination. This may explain how flowering plants have survived multiple whole genome duplications. Genes are disproportionately lost from one parental subgenome, the subgenome that is less expressed in the polyploid. The origin of this unbalanced expression between genomes remains unknown. The consequences of the tradeoffs between transposon repression and gene expression represent one potential explanation of genome dominance. If so, the same mechanisms may act in heterosis: genome dominance is like inbreeding depression. Regulatory DNA deletion following polyploidy combined with abundant RNA-seq expression datasets are being used to generate testable hypotheses regarding the function of specific cis-regulatory sequences.
Affiliation(s)
- Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
|
39
|
Vaughn JN, Ellingson SR, Mignone F, von Arnim A. Known and novel post-transcriptional regulatory sequences are conserved across plant families. RNA (NEW YORK, N.Y.) 2012; 18:368-84. [PMID: 22237150 PMCID: PMC3285926 DOI: 10.1261/rna.031179.111] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 05/03/2023]
Abstract
The sequence elements that mediate post-transcriptional gene regulation often reside in the 5' and 3' untranslated regions (UTRs) of mRNAs. Using six different families of dicotyledonous plants, we developed a comparative transcriptomics pipeline for the identification and annotation of deeply conserved regulatory sequences in the 5' and 3' UTRs. Our approach was robust to confounding effects of poor UTR alignability and rampant paralogy in plants. In the 3' UTR, motifs resembling PUMILIO-binding sites form a prominent group of conserved motifs. Additionally, Expansins, one of the few plant mRNA families known to be localized to specific subcellular sites, possess a core conserved RCCCGC motif. In the 5' UTR, one major subset of motifs consists of purine-rich repeats. A distinct and substantial fraction possesses upstream AUG start codons. Half of the AUG containing motifs reveal hidden protein-coding potential in the 5' UTR, while the other half point to a peptide-independent function related to translation. Among the former, we added four novel peptides to the small catalog of conserved-peptide uORFs. Among the latter, our case studies document patterns of uORF evolution that include gain and loss of uORFs, switches in uORF reading frame, and switches in uORF length and position. In summary, nearly three hundred post-transcriptional elements show evidence of purifying selection across the eudicot branch of flowering plants, indicating a regulatory function spanning at least 70 million years. Some of these sequences have experimental precedent, but many are novel and encourage further exploration.
Affiliation(s)
- Justin N. Vaughn
- Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, Tennessee 37996, USA
- Sally R. Ellingson
- Graduate School of Genome Science and Technology, The University of Tennessee, Knoxville, Tennessee 37996, USA
- Flavio Mignone
- Dipartimento di Chimica Strutturale e Stereochimica Inorganica, Università degli Studi di Milano, 20133 Milano, Italy
- Albrecht von Arnim
- Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, Tennessee 37996, USA
- Graduate School of Genome Science and Technology, The University of Tennessee, Knoxville, Tennessee 37996, USA
- Corresponding author.
|
40
|
Spangler JB, Subramaniam S, Freeling M, Feltus FA. Evidence of function for conserved noncoding sequences in Arabidopsis thaliana. THE NEW PHYTOLOGIST 2012; 193:241-252. [PMID: 21955124 DOI: 10.1111/j.1469-8137.2011.03916.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 05/09/2023]
Abstract
• Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event.
Affiliation(s)
- Jacob B Spangler
- Department of Genetics & Biochemistry, Clemson University, Clemson, SC 29634, USA
- Sabarinath Subramaniam
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94720, USA
- Michael Freeling
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94720, USA
- F Alex Feltus
- Department of Genetics & Biochemistry, Clemson University, Clemson, SC 29634, USA
|
41
|
Tang H, Lyons E. Unleashing the genome of Brassica rapa. FRONTIERS IN PLANT SCIENCE 2012; 3:172. [PMID: 22866056 PMCID: PMC3408644 DOI: 10.3389/fpls.2012.00172] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/31/2012] [Accepted: 07/12/2012] [Indexed: 05/06/2023]
Abstract
The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica's genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with A. thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from A. thaliana is used to find duplicated orthologs in B. rapa. These TOC1 genes are further analyzed to identify conserved non-coding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each "cookbook style" analysis includes a step-by-step walk-through with links to CoGe to quickly reproduce each step of the analytical process.
Affiliation(s)
- Eric Lyons
- iPlant Collaborative, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- *Correspondence: Eric Lyons, iPlant Collaborative, School of Plant Sciences, University of Arizona, Keating Bioresearch Building, 1657 E. Helen St. Tucson, AZ 85745, USA.
|
42
|
Woodhouse MR, Tang H, Freeling M. Different gene families in Arabidopsis thaliana transposed in different epochs and at different frequencies throughout the rosids. THE PLANT CELL 2011; 23:4241-53. [PMID: 22180627 PMCID: PMC3269863 DOI: 10.1105/tpc.111.093567] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 05/05/2023]
Abstract
Certain types of gene families, such as those encoding most families of transcription factors, maintain their chromosomal syntenic positions throughout angiosperm evolutionary time. Other nonsyntenic gene families are prone to deletion, tandem duplication, and transposition. Here, we describe the chromosomal positional history of all genes in Arabidopsis thaliana throughout the rosid superorder. We introduce a public database where researchers can look up the positional history of their favorite A. thaliana gene or gene family. Finally, we show that specific gene families transposed at specific points in evolutionary time, particularly after whole-genome duplication events in the Brassicales, and suggest that genes in mobile gene families are under different selection pressure than syntenic genes.
Affiliation(s)
- Margaret R Woodhouse
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA.
|
43
|
Gaut B, Yang L, Takuno S, Eguiarte LE. The Patterns and Causes of Variation in Plant Nucleotide Substitution Rates. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2011. [DOI: 10.1146/annurev-ecolsys-102710-145119] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.8] [Indexed: 11/09/2022]
Affiliation(s)
- Brandon Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Liang Yang
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Shohei Takuno
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Luis E. Eguiarte
- Instituto de Ecología, Universidad Nacional Autónoma de México, CP 04510 Mexico City, Mexico
|
44
|
Zhang W, Wu Y, Schnable JC, Zeng Z, Freeling M, Crawford GE, Jiang J. High-resolution mapping of open chromatin in the rice genome. Genome Res 2011; 22:151-62. [PMID: 22110044 DOI: 10.1101/gr.131342.111] [Citation(s) in RCA: 175] [Impact Index Per Article: 13.5] [Indexed: 11/24/2022]
Abstract
Gene expression is controlled by the complex interaction of transcription factors binding to promoters and other regulatory DNA elements. One common characteristic of the genomic regions associated with regulatory proteins is a pronounced sensitivity to DNase I digestion. We generated genome-wide high-resolution maps of DNase I hypersensitive (DH) sites from both seedling and callus tissues of rice (Oryza sativa). Approximately 25% of the DH sites from both tissues were found in putative promoters, indicating that the vast majority of the gene regulatory elements in rice are not located in promoter regions. We found 58% more DH sites in the callus than in the seedling. For DH sites detected in both the seedling and callus, 31% displayed significantly different levels of DNase I sensitivity within the two tissues. Genes that are differentially expressed in the seedling and callus were frequently associated with DH sites in both tissues. The DNA sequences contained within the DH sites were hypomethylated, consistent with what is known about active gene regulatory elements. Interestingly, tissue-specific DH sites located in the promoters showed a higher level of DNA methylation than the average DNA methylation level of all the DH sites located in the promoters. A distinct elevation of H3K27me3 was associated with intergenic DH sites. These results suggest that epigenetic modifications play a role in the dynamic changes of the numbers and DNase I sensitivity of DH sites during development.
Affiliation(s)
- Wenli Zhang
- Department of Horticulture, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
45
|
Reineke AR, Bornberg-Bauer E, Gu J. Evolutionary divergence and limits of conserved non-coding sequence detection in plant genomes. Nucleic Acids Res 2011; 39:6029-43. [PMID: 21470961 PMCID: PMC3152334 DOI: 10.1093/nar/gkr179] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 06/25/2010] [Revised: 02/22/2011] [Accepted: 03/15/2011] [Indexed: 12/17/2022] Open
Abstract
The discovery of regulatory motifs embedded in upstream regions of plants is a particularly challenging bioinformatics task. Previous studies have shown that motifs in plants are short compared with those found in vertebrates. Furthermore, plant genomes have undergone several diversification mechanisms such as genome duplication events which impact the evolution of regulatory motifs. In this article, a systematic phylogenomic comparison of upstream regions is conducted to further identify features of the plant regulatory genomes, the component of genomes regulating gene expression, to enable future de novo discoveries. The findings highlight differences in upstream region properties between major plant groups and the effects of divergence times and duplication events. First, clear differences in upstream region evolution can be detected between monocots and dicots, thus suggesting that a separation of these groups should be made when searching for novel regulatory motifs, particularly since universal motifs such as the TATA box are rare. Second, investigating the decay rate of significantly aligned regions suggests that a divergence time of ~100 mya sets a limit for reliable conserved non-coding sequence (CNS) detection. Insights presented here will set a framework to help identify embedded motifs of functional relevance by understanding the limits of bioinformatics detection for CNSs.
Affiliation(s)
- Jenny Gu
- Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, 48149, Münster, Germany
|
46
|
Abstract
Advances in sequencing technology have led to the availability of complete genome sequences of many different plant species. In order to make sense of this deluge of information, functional genomics efforts have been intensified on many fronts. With improvements in plant transformation technologies, T-DNA and/or transposon-based gene and enhancer-tagged populations in various crop species are being developed to augment functional annotation of genes and also to help clone important genes. State-of-the-art cloning and sequencing technologies, which would help identify T-DNA or transposon junction sequences in large genomes, have also been initiated. This chapter gives a brief history of enhancer trapping and then proceeds to describe gene and enhancer tagging in plants. The significance of reporter gene fusion populations in plant genomics, especially in important cereal crops, is discussed.
|
47
|
Abstract
As our ability to generate sequencing data continues to increase, data analysis is replacing data generation as the rate-limiting step in genomics studies. Here we provide a guide to genomic data visualization tools that facilitate analysis tasks by enabling researchers to explore, interpret and manipulate their data, and in some cases perform on-the-fly computations. We will discuss graphical methods designed for the analysis of de novo sequencing assemblies and read alignments, genome browsing, and comparative genomics, highlighting the strengths and limitations of these approaches and the challenges ahead.
|
48
|
Abstract
While once almost synonymous, there is an increasing gap between the expanding definition of what constitutes a gene and the conservative and narrowly defined terms code or coding, which for a long time, almost exclusively constituted the open reading frame. Much confusion results from this disparity, especially in light of the plethora of noncoding RNAs (more correctly termed "non-protein-coding RNAs") that usually are encoded and transcribed by their own genes. A simple solution would be to adopt Ed Trifonov's less constrained definition of a code as any sequence pattern that can have a biological function. Such consideration favors not only a more complex view of the gene as an entity composed of many more or less conserved subgenic modules, but also a concept of modular evolution of genes and entire genomes.
Affiliation(s)
- Jürgen Brosius
- Institute of Experimental Pathology (ZMBE), University of Münster, Münster, Germany.
|
49
|
Priest HD, Filichkin SA, Mockler TC. Cis-regulatory elements in plant cell signaling. CURRENT OPINION IN PLANT BIOLOGY 2009; 12:643-649. [PMID: 19717332 DOI: 10.1016/j.pbi.2009.07.016] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.3] [Received: 04/28/2009] [Revised: 06/30/2009] [Accepted: 07/21/2009] [Indexed: 05/26/2023]
Abstract
Plant cell signaling pathways are in part dependent on transcriptional regulatory networks comprising circuits of transcription factors (TFs) and regulatory DNA elements that control the expression of target genes. Here, we describe experimental and bioinformatic approaches for identifying potential cis-regulatory elements. We also discuss recent integrative genomics studies aimed at elucidating the functions of cis-regulatory elements in aspects of plant biology, including the circadian clock, interactions with the environment, stress responses, and regulation of growth and development by phytohormones. Finally, we discuss emerging technologies and approaches that offer great potential for accelerating the discovery and functional characterization of cis-elements and interacting TFs--which will help realize the promise of systems biology.
Affiliation(s)
- Henry D Priest
- Department of Botany and Plant Pathology and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR 97331, USA
|
50
|
Contrasting evolutionary dynamics between angiosperm and mammalian genomes. Trends Ecol Evol 2009; 24:572-82. [PMID: 19665255 DOI: 10.1016/j.tree.2009.04.010] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Received: 11/18/2008] [Revised: 04/06/2009] [Accepted: 04/22/2009] [Indexed: 12/23/2022]
Abstract
Continuing advances in genomics are revealing substantial differences between genomes of major eukaryotic lineages. Because most data (in terms of depth and phylogenetic breadth) are available for angiosperms and mammals, we explore differences between these groups and show that angiosperms have less highly compartmentalized and more diverse genomes than mammals. In considering the causes of these differences, four mechanisms are highlighted: polyploidy, recombination, retrotransposition and genome silencing, which have different modes and time scales of activity. Angiosperm genomes are evolutionarily more dynamic and labile, whereas mammalian genomes are more stable at both the sequence and chromosome level. We suggest that fundamentally different life strategies and development feedback on the genome exist, influencing dynamics and evolutionary trajectories at all levels from the gene to the genome.
|