1
Song H, Wang Q, Zhang Z, Lin K, Pang E. Identification of clade-wide putative cis-regulatory elements from conserved non-coding sequences in Cucurbitaceae genomes. Horticulture Research 2023; 10:uhad038. [PMID: 37799630 PMCID: PMC10548412 DOI: 10.1093/hr/uhad038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 02/20/2023] [Indexed: 10/07/2023]
Abstract
Cis-regulatory elements regulate gene expression and play an essential role in the development and physiology of organisms. Many conserved non-coding sequences (CNSs) function as cis-regulatory elements. They control the development of various lineages. However, predicting clade-wide cis-regulatory elements across several closely related species remains challenging. Based on the relationship between CNSs and cis-regulatory elements, we present a computational approach that predicts the clade-wide putative cis-regulatory elements in 12 Cucurbitaceae genomes. Using 12-way whole-genome alignment, we first obtained 632 112 CNSs in Cucurbitaceae. Next, we identified 16 552 Cucurbitaceae-wide cis-regulatory elements based on collinearity among all 12 Cucurbitaceae plants. Furthermore, we predicted 3 271 potential regulatory pairs in the cucumber genome, of which 98 were verified using integrative RNA sequencing and ChIP sequencing datasets from samples collected during various fruit development stages. The CNSs, Cucurbitaceae-wide cis-regulatory elements, and their target genes are accessible at http://cmb.bnu.edu.cn/cisRCNEs_cucurbit/. These elements are valuable resources for functionally annotating CNSs and their regulatory roles in Cucurbitaceae genomes.
Affiliation(s)
- Hongtao Song
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Qi Wang
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Zhonghua Zhang
- College of Horticulture, Qingdao Agricultural University, Qingdao 266109, China
- Kui Lin
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Erli Pang
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
2
Chau JH, Rahfeldt WA, Olmstead RG. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics. Applications in Plant Sciences 2018; 6:e1032. [PMID: 29732262 PMCID: PMC5895190 DOI: 10.1002/aps3.1032] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 09/06/2017] [Accepted: 12/22/2017] [Indexed: 05/21/2023]
Abstract
PREMISE OF THE STUDY: Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. METHODS: We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. RESULTS: The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. DISCUSSION: General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
Affiliation(s)
- John H. Chau
- Department of Biology and Burke Museum, University of Washington, Box 351800, Seattle, Washington 98195, USA
- Centre for Ecological Genomics and Wildlife Conservation, Department of Zoology, University of Johannesburg, P.O. Box 524, Auckland Park 2006, South Africa
- Wolfgang A. Rahfeldt
- Department of Biology and Burke Museum, University of Washington, Box 351800, Seattle, Washington 98195, USA
- Richard G. Olmstead
- Department of Biology and Burke Museum, University of Washington, Box 351800, Seattle, Washington 98195, USA
3
Abstract
Phylogenomics aims at reconstructing the evolutionary histories of organisms taking into account whole genomes or large fractions of genomes. The abundance of genomic data for an enormous variety of organisms has enabled phylogenomic inference of many groups, and this has motivated the development of many computer programs implementing the associated methods. This chapter surveys phylogenetic concepts and methods aimed at both gene tree and species tree reconstruction while also addressing common pitfalls, providing references to relevant computer programs. A practical phylogenomic analysis example including bacterial genomes is presented at the end of the chapter.
Affiliation(s)
- José S L Patané
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil
- Joaquim Martins
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil
- João C Setubal
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, Av. Prof. Lineu Prestes 748, São Paulo, SP, 05508-000, Brazil.
4
Yata VK, Thapa A, Mattaparthi VSK. Structural insight into the binding interactions of modeled structure of Arabidopsis thaliana urease with urea: an in silico study. J Biomol Struct Dyn 2014; 33:845-51. [PMID: 24738549 DOI: 10.1080/07391102.2014.915765] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Abstract
Urease (EC 3.5.1.5, urea amidohydrolase) catalyzes the hydrolysis of urea to ammonia and carbon dioxide. Urease is abundant in plants and plays a significant role in nitrogen recycling from urea. However, little is known about the structure and function of the urease from Arabidopsis thaliana, the model system of choice for research in plant biology. In this study, a three-dimensional structural model of A. thaliana urease was constructed using a computer-aided molecular modeling technique. The characteristic structural features of the modeled structure were then studied using atomistic molecular dynamics simulation. The modeled structure was observed to be stable, and the regions spanning residues 50-80 and 500-700 were observed to be significantly flexible. From the docking studies, we detected the possible binding interactions of modeled urease with urea. Residues Ala399, Ile675, Thr398, and Thr679 of A. thaliana urease were observed to be significantly involved in binding the substrate urea. We also compared these results with docking studies of ureases from other sources, such as Canavalia ensiformis, Helicobacter pylori, and Bacillus pasteurii. In addition, we carried out mutation analysis to find the highly mutable amino acid residues of modeled A. thaliana urease, and observed Met485, Tyr510, Ser786, Val426, and Lys765 to be highly mutable. These results are significant for mutagenesis analysis. As a whole, this study expounds the salient structural features as well as the binding interactions of the modeled structure of A. thaliana urease.
Affiliation(s)
- Vinod Kumar Yata
- Department of Biotechnology, Dr B.R. Ambedkar National Institute of Technology Jalandhar, Jalandhar 144 011, India
5
Wu J, Kong X, Shi C, Gu Y, Jin C, Gao L, Jia J. Dynamic evolution of rht-1 homologous regions in grass genomes. PLoS One 2013; 8:e75544. [PMID: 24086561 PMCID: PMC3782514 DOI: 10.1371/journal.pone.0075544] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/14/2013] [Accepted: 08/18/2013] [Indexed: 11/18/2022]
Abstract
Hexaploid bread wheat contains three subgenomes (A, B, and D), and its well-characterized ancestral genomes exist at the diploid and tetraploid levels, making wheat a good model species for studying evolutionary genomic dynamics. Here, we performed intra- and inter-species comparative analyses of wheat and related grass genomes to examine the dynamics of homologous regions surrounding Rht-1, a well-known "green revolution" gene. Our results showed that the divergence of the A genomes in the Rht-1 region between the diploid and tetraploid species is greater than that between the tetraploid and hexaploid wheat. The divergence of the D genome between diploid and hexaploid wheat is lower than that of the A genome, suggesting that the D genome diverged later than the others. The divergence among the A, B and D subgenomes was larger than that among different ploidy levels for each subgenome, which mainly resulted from structural variation caused by insertions, and perhaps deletions, of repetitive sequences. Meanwhile, repetitive sequences caused further genome expansion after the divergence of the three subgenomes. However, several conserved non-coding sequences were found to be shared among the three subgenomes of wheat, suggesting that they may have played an important role in maintaining the homology of the three subgenomes. This is a pilot study on evolutionary dynamics across wheat ploidy levels, subgenomes and differently related grasses. Our results provide new insights into the evolutionary dynamics of the Rht-1 region at the sequence level as well as the evolution of wheat during the polyploidization process.
Affiliation(s)
- Jing Wu
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, the National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, the Chinese Academy of Agricultural Sciences, Beijing, China
- Xiuying Kong
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, the National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, the Chinese Academy of Agricultural Sciences, Beijing, China
- Chao Shi
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species in Southwest China, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
- Yongqiang Gu
- United States Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Albany, California, United States of America
- Cuiyun Jin
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, the National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, the Chinese Academy of Agricultural Sciences, Beijing, China
- Lizhi Gao
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species in Southwest China, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
- * E-mail: (JJ); (LG)
- Jizeng Jia
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, the National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, the Chinese Academy of Agricultural Sciences, Beijing, China
- * E-mail: (JJ); (LG)
6
Hupalo D, Kern AD. Conservation and functional element discovery in 20 angiosperm plant genomes. Mol Biol Evol 2013; 30:1729-44. [PMID: 23640124 DOI: 10.1093/molbev/mst082] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 12/11/2022]
Abstract
Here, we describe the construction of a phylogenetically deep, whole-genome alignment of 20 flowering plants, along with an analysis of plant genome conservation. Each included angiosperm genome was aligned to a reference genome, Arabidopsis thaliana, using the LASTZ/MULTIZ paradigm and tools from the University of California-Santa Cruz Genome Browser source code. In addition to the multiple alignment, we created a local genome browser displaying multiple tracks of newly generated genome annotation, as well as annotation sourced from published data of other research groups. An investigation into A. thaliana gene features present in the aligned A. lyrata genome revealed better conservation of start codons, stop codons, and splice sites within our alignments (51% of features from A. thaliana conserved without interruption in A. lyrata) when compared with previous publicly available plant pairwise alignments (34% of features conserved). The detailed view of conservation across angiosperms revealed not only high coding-sequence conservation but also a large set of previously uncharacterized intergenic conservation. From this, we annotated the collection of conserved features, revealing dozens of putative noncoding RNAs, including some with recorded small RNA expression. Comparing conservation between kingdoms revealed a faster decay of vertebrate genome features when compared with angiosperm genomes. Finally, conserved sequences were searched for folding RNA features, including but not limited to noncoding RNA (ncRNA) genes. Among these, we highlight a double hairpin in the 5'-untranslated region (5'-UTR) of the PRIN2 gene and a putative ncRNA with homology targeting the LAF3 protein.
Affiliation(s)
- Daniel Hupalo
- Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire, USA.
7
Ryu T, Seridi L, Ravasi T. The evolution of ultraconserved elements with different phylogenetic origins. BMC Evol Biol 2012; 12:236. [PMID: 23217155 PMCID: PMC3556307 DOI: 10.1186/1471-2148-12-236] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 04/26/2012] [Accepted: 11/09/2012] [Indexed: 11/10/2022]
Abstract
Background: Ultraconserved elements of DNA have been identified in vertebrate and invertebrate genomes. These elements have been found to have diverse functions, including enhancer activities in developmental processes. The evolutionary origins and functional roles of these elements in cellular systems, however, have not yet been determined. Results: Here, we identified a wide range of ultraconserved elements common to distant species, from primitive aquatic organisms to terrestrial species with complicated body systems, including some novel elements conserved in fruit fly and human. In addition to a well-known association with developmental genes, these DNA elements have a strong association with genes implicated in essential cell functions, such as epigenetic regulation, apoptosis, detoxification, innate immunity, and sensory reception. Interestingly, we observed that ultraconserved elements clustered by sequence similarity. Furthermore, species composition and flanking genes of clusters showed lineage-specific patterns. Ultraconserved elements are highly enriched with binding sites to developmental transcription factors regardless of how they cluster. Conclusion: We identified large numbers of ultraconserved elements across distant species. Specific classes of these conserved elements seem to have been generated before the divergence of taxa and fixed during the process of evolution. Our findings indicate that these ultraconserved elements are not the exclusive property of higher modern eukaryotes, but rather transmitted from their metazoan ancestors.
Affiliation(s)
- Taewoo Ryu
- Integrative Systems Biology Lab, Division of Biological and Environmental Sciences & Engineering, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia.
8
Kritsas K, Wuest SE, Hupalo D, Kern AD, Wicker T, Grossniklaus U. Computational analysis and characterization of UCE-like elements (ULEs) in plant genomes. Genome Res 2012; 22:2455-66. [PMID: 22987666 PMCID: PMC3514675 DOI: 10.1101/gr.129346.111] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
Ultraconserved elements (UCEs), stretches of DNA that are identical between distantly related species, are enigmatic genomic features whose function is not well understood. First identified and characterized in mammals, UCEs have been proposed to play important roles in gene regulation, RNA processing, and maintaining genome integrity. However, because all of these functions can tolerate some sequence variation, their ultraconserved and ultraselected nature is not explained. We investigated whether there are highly conserved DNA elements without genic function in distantly related plant genomes. We compared the genomes of Arabidopsis thaliana and Vitis vinifera, species that diverged ∼115 million years ago (Mya). We identified 36 highly conserved elements with at least 85% similarity that are longer than 55 bp. Interestingly, these elements exhibit properties similar to mammalian UCEs, such that we named them UCE-like elements (ULEs). ULEs are located in intergenic or intronic regions and are depleted from segmental duplications. Like UCEs, ULEs are under strong purifying selection, suggesting a functional role for these elements. Like their mammalian counterparts, ULEs show a sharp drop of A+T content at their borders and are enriched close to genes encoding transcription factors and genes involved in development, the latter showing preferential expression in undifferentiated tissues. By comparing the genomes of Brachypodium distachyon and Oryza sativa, species that diverged ∼50 Mya, we identified a different set of ULEs with similar properties in monocots. The identification of ULEs in plant genomes offers new opportunities to study their possible roles in genome function, integrity, and regulation.
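The two thresholds quoted in this abstract (at least 85% similarity, longer than 55 bp) can be illustrated with a small Python sketch. It is not the authors' pipeline: it assumes two sequences that have already been pairwise aligned (gaps written as `-`), it treats "similarity" as simple percent identity over alignment columns, and the names `percent_identity` and `is_ule_candidate` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # A column counts as a match only if the bases agree and it is not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return matches / len(aligned_a)


def is_ule_candidate(aligned_a: str, aligned_b: str,
                     min_len: int = 55, min_identity: float = 0.85) -> bool:
    """Apply the length (> min_len) and identity (>= min_identity) cut-offs."""
    return (len(aligned_a) > min_len
            and percent_identity(aligned_a, aligned_b) >= min_identity)
```

The defaults mirror the numbers in the abstract; the study's additional filters (no genic function, depletion from segmental duplications) are outside the scope of this sketch.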
Affiliation(s)
- Konstantinos Kritsas
- Institute of Plant Biology & Zürich-Basel Plant Science Center, University Zürich, CH-8008 Zürich, Switzerland
9
Abstract
Ultraconserved elements (UCEs) are DNA sequences that are 100% identical (no base substitutions, insertions, or deletions) and located in syntenic positions in at least two genomes. Although hundreds of UCEs have been found in animal genomes, little is known about the incidence of ultraconservation in plant genomes. Using an alignment-free information-retrieval approach, we have comprehensively identified all long identical multispecies elements (LIMEs), which include both syntenic and nonsyntenic regions, of at least 100 identical base pairs shared by at least two genomes. Among six animal genomes, we found the previously known syntenic UCEs as well as previously undescribed nonsyntenic elements. In contrast, among six plant genomes, we only found nonsyntenic LIMEs. LIMEs can also be classified as either simple (repetitive) or complex (nonrepetitive), they may occur in multiple copies in a genome, and they are often spread across multiple chromosomes. Although complex LIMEs were found in both animal and plant genomes, they differed significantly in their composition and copy number. Further analyses of plant LIMEs revealed their functional diversity, encompassing elements found near rRNA and enzyme-coding genes, as well as those found in transposons and noncoding DNA. We conclude that despite the common presence of LIMEs in both animal and plant lineages, the evolutionary processes involved in the creation and maintenance of these elements differ in the two groups and are likely attributable to several mechanisms, including transfer of genetic material from organellar to nuclear genomes, de novo sequence manufacturing, and purifying selection.
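As a rough illustration of what an alignment-free search for identical shared segments involves, the Python sketch below collects every fixed-length substring present in both of two sequences. It is a simplification and not the information-retrieval method used in the study: the function name `shared_kmers` is hypothetical, it handles only two sequences on one strand, and it neither extends matches to their maximal length nor checks synteny. The abstract's threshold of at least 100 identical base pairs would correspond to `k = 100`.

```python
def shared_kmers(seq_a: str, seq_b: str, k: int) -> set[str]:
    """Return every length-k substring that occurs in both sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Index all k-mers of the first sequence, then probe with the second.
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b
```

For example, `shared_kmers("ACGTACGT", "TTACGTAA", 5)` returns the set containing `"ACGTA"` and `"TACGT"`. Holding every k-mer in memory is only practical for short sequences; whole-genome comparisons need a more compact index.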
10
Lü ZR, Seo E, Yan L, Yin SJ, Si YX, Qian GY, Park YD, Yang JM. High-Throughput Integrated Analyses for the Tyrosinase-Induced Melanogenesis: Microarray, Proteomics and Interactomics Studies. J Biomol Struct Dyn 2010; 28:259-76. [DOI: 10.1080/07391102.2010.10507358] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/07/2023]
11
12
Liu Z, Xu Y, Wu L, Zhang S. Evolution of galanin receptor genes: insights from the deuterostome genomes. J Biomol Struct Dyn 2010; 28:97-106. [PMID: 20476798 DOI: 10.1080/07391102.2010.10507346] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 01/24/2023]
Abstract
Galanin exerts its biological activities through three different G protein-coupled receptors, Galr1, Galr2 and Galr3. To obtain insights into the evolution of Galrs, we searched the genomes of deuterostomes using an extensive BLAST survey and phylogenetic analyses. Galr2 and Galr3 share similar genomic structures, and most of them are composed of 2 exons and 1 intron. However, most Galr1 genes are composed of 3 exons and 2 introns. We did not detect typical Galr genes in the genomic databases of invertebrate deuterostomes, but we identified three Galr1/Alstr homologs and two Galr1/Gpr151 homologs in amphioxus, two Galr1/Gpr151 homologs in sea squirt, and one Galr1/Gpr151 homolog in sea urchin. It is highly possible that the Galr genes in vertebrates evolved from the homologous Galr1/Alstr/Gpr151 genes in invertebrate deuterostomes. We also propose that Galr3 genes were the products of Galr2 duplication during evolution, while Galr2 genes may have evolved from Galr1.
Affiliation(s)
- Z Liu
- Key Laboratory of Marine Genetics and Breeding (Ocean University of China), Ministry of Education, Qingdao 266003, China.
13
Anbazhagan P, Purushottam M, Kiran Kumar HB, Mukherjee O, Jain S, Sowdhamini R. Phylogenetic Analysis and Selection Pressures of 5-HT Receptors in Human and Non-human Primates: Receptor of an Ancient Neurotransmitter. J Biomol Struct Dyn 2010; 27:581-98. [DOI: 10.1080/07391102.2010.10508573] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 10/28/2022]
14
Freeling M, Subramaniam S. Conserved noncoding sequences (CNSs) in higher plants. Current Opinion in Plant Biology 2009; 12:126-32. [PMID: 19249238 DOI: 10.1016/j.pbi.2009.01.005] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 10/17/2008] [Revised: 01/22/2009] [Accepted: 01/22/2009] [Indexed: 05/09/2023]
Abstract
Plant conserved noncoding sequences (CNSs), a specific category of phylogenetic footprint, have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and to occupy a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.
Affiliation(s)
- Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.