1
Contreras-Moreira B, Saraf S, Naamati G, Casas AM, Amberkar SS, Flicek P, Jones AR, Dyer S. GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation. Genome Biol 2023; 24:223. [PMID: 37798615 PMCID: PMC10552430 DOI: 10.1186/s13059-023-03071-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 09/21/2023] [Indexed: 10/07/2023] Open
Abstract
Crop pangenomes made from individual cultivar assemblies promise easy access to conserved genes, but genome content variability and inconsistent identifiers hamper their exploration. To address this, we define pangenes, which summarize a species' coding potential and link back to original annotations. The protocol get_pangenes performs whole genome alignments (WGA) to call syntenic gene models based on coordinate overlaps. A benchmark with small and large plant genomes shows that pangenes recapitulate phylogeny-based orthologies and produce complete soft-core gene sets. Moreover, WGAs support lift-over and help confirm gene presence-absence variation. Source code and documentation: https://github.com/Ensembl/plant-scripts
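The idea of calling matching gene models "based on coordinate overlaps" can be sketched as follows. This is an illustration only, not the get_pangenes implementation: the function names, the 1-based inclusive coordinates, and the 0.5 overlap threshold are all assumptions chosen for the example.

```python
def overlap_bp(start_a, end_a, start_b, end_b):
    """Shared bases between two 1-based, inclusive intervals on the same sequence."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


def is_overlapping_pair(mapped, annotated, min_fraction=0.5):
    """Decide whether a gene mapped through an alignment matches an annotated gene.

    mapped, annotated: (start, end) tuples in the same coordinate system.
    min_fraction: hypothetical threshold on the overlap relative to the shorter gene.
    """
    shared = overlap_bp(mapped[0], mapped[1], annotated[0], annotated[1])
    shorter = min(mapped[1] - mapped[0] + 1, annotated[1] - annotated[0] + 1)
    return shared / shorter >= min_fraction


# A gene lifted over to positions 100-200 compared against two annotated models.
print(is_overlapping_pair((100, 200), (150, 250)))  # 51 of 101 bases shared
print(is_overlapping_pair((100, 200), (190, 400)))  # 11 of 101 bases shared
```

Measuring the overlap against the shorter of the two genes is one possible design choice; it keeps a short gene nested inside a long one from being rejected.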
Affiliation(s)
- Bruno Contreras-Moreira
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
  - Estación Experimental Aula Dei-CSIC, 50059, Zaragoza, Spain
- Shradha Saraf
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Guy Naamati
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ana M Casas
  - Estación Experimental Aula Dei-CSIC, 50059, Zaragoza, Spain
- Sandeep S Amberkar
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Andrew R Jones
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Sarah Dyer
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
2
Kim J, Shujaat M, Tayara H. iProm-Zea: A two-layer model to identify plant promoters and their types using convolutional neural network. Genomics 2022; 114:110384. [PMID: 35533969 DOI: 10.1016/j.ygeno.2022.110384] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/14/2022] [Revised: 04/18/2022] [Accepted: 05/02/2022] [Indexed: 01/14/2023]
Abstract
A promoter is a short DNA sequence near the start codon, responsible for initiating the transcription of a specific gene in the genome. The accurate recognition of promoters is important for achieving a better understanding of transcriptional regulation. Because of their importance in the process of biological transcriptional regulation, there is an urgent need to develop in silico tools to identify promoters and their types in a timely and accurate manner. A number of prediction methods have been developed in this regard; however, almost all of them are merely used for identifying promoters and their strength or sigma types. The TATA box region in TATA promoters influences the post-transcriptional processes; therefore, in the current study, we developed a two-layer predictor called "iProm-Zea" using a convolutional neural network (CNN) to identify TATA and TATA-less promoters. The first layer can be used to identify a given DNA sequence as a promoter or non-promoter. The second layer can be used to identify whether the recognized promoter is a TATA promoter. To find an optimal feature encoding scheme and model, we employed four feature encoding schemes on different machine learning and CNN algorithms, and based on the evaluation results, we selected a one-hot encoding scheme and a CNN model for iProm-Zea. The 5-fold cross-validation testing results demonstrated that the constructed predictor showed great potential for identifying promoters and classifying them as TATA and TATA-less promoters. Furthermore, we performed cross-species analysis of iProm-Zea to evaluate its performance in other species. Moreover, to make it easier for other experimental scientists to obtain the results they need, we established a freely accessible and user-friendly web server at http://nsclbio.jbnu.ac.kr/tools/iProm-Zea/.
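The one-hot encoding scheme mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the A/C/G/T column order, the all-zero row for ambiguous bases such as N, and the function name `one_hot` are assumptions made for the example.

```python
BASES = "ACGT"  # assumed column order; the paper's own ordering is not stated here


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside A/C/G/T (for example N) are encoded as all zeros.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


for base, row in zip("TATAN", one_hot("TATAN")):
    print(base, row)
```

A matrix of this shape (sequence length by 4) is the kind of input a convolutional layer can scan for short motifs.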
3
Tay Fernandez C. Making a Pangenome Using the Iterative Mapping Approach. Methods Mol Biol 2022; 2443:259-271. [PMID: 35037211 DOI: 10.1007/978-1-0716-2067-0_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Pangenomes have replaced single reference genomes as genetic references, as they contain a better scope of the diversity found in a single species. This protocol outlines the iterative mapping approach to constructing a pangenome, including how to check the raw data, align the data to a reference, assemble the data, and remove potential contaminants from the final assembly.
4
Sousa B, Lopes J, Leal A, Martins M, Soares C, Azenha M, Fidalgo F, Teixeira J. Specific glutathione-S-transferases ensure an efficient detoxification of diclofenac in Solanum lycopersicum L. plants. Plant Physiol Biochem 2021; 168:263-271. [PMID: 34666279 DOI: 10.1016/j.plaphy.2021.10.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/17/2021] [Revised: 09/21/2021] [Accepted: 10/14/2021] [Indexed: 06/13/2023]
Abstract
Diclofenac (DCF) is a very common pharmaceutical that, due to its high use and low removal rate, is considered a prominent contaminant in surface and groundwater worldwide. In this study, Solanum lycopersicum L. cv. Micro-Tom (tomato) was used to disclose the role of glutathione (GSH)-related enzymes, as GSH conjugation with DCF is a well-reported detoxification mechanism in mammals and some plant species. To achieve this, S. lycopersicum plants were exposed to 0.5 and 5 mg L⁻¹ of DCF for 5 weeks under a semi-hydroponic experiment. The results obtained here point towards an efficient DCF detoxification mechanism that prevents DCF bioaccumulation in fruits, minimizing any concerns for human health. Although a systemic response seems to be present in response to DCF, the current data also show that its detoxification is mostly a root-specific process. Furthermore, it appears that GSH-mediated DCF detoxification is the main mechanism activated, as glutathione-S-transferase (GST) activity was greatly enhanced in roots of tomato plants treated with 5 mg L⁻¹ DCF, accompanied by increased glutathione reductase activity, responsible for GSH regeneration. By applying a targeted gene expression analysis, we provide evidence, for the first time, that SlGSTF4 and SlGSTF5 genes, coding for GSTs from the phi class, were the main players driving the conjugation of this contaminant. In this sense, and even though tomato plants appear to be somewhat tolerant to DCF exposure, research on GST activity can prove to be instrumental in remediating DCF-contaminated environments and improving plant growth under such conditions.
Affiliation(s)
- Bruno Sousa
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Jorge Lopes
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- André Leal
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Maria Martins
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Cristiano Soares
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Manuel Azenha
  - CIQ-UP, Chemistry and Biochemistry Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Fernanda Fidalgo
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
- Jorge Teixeira
  - GreenUPorto - Sustainable Agrifood Production Research Centre and Inov4Agro, Biology Department, Faculty of Sciences of University of Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
5
Shen Y, Chen LL, Gao J. CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes. Genomics Proteomics Bioinformatics 2021; 19:860-871. [PMID: 33662624 PMCID: PMC9170768 DOI: 10.1016/j.gpb.2020.06.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/20/2020] [Revised: 05/17/2020] [Accepted: 10/28/2020] [Indexed: 11/01/2022]
Abstract
Chromatin accessibility is a highly informative structural feature for understanding gene transcription regulation, because it indicates the degree to which nuclear macromolecules such as proteins and RNAs can access chromosomal DNA. Studies have shown that chromatin accessibility is highly dynamic during stress response, stimulus response, and developmental transition. Moreover, physical access to chromosomal DNA in eukaryotes is highly cell-specific. Therefore, current technologies such as DNase-seq, ATAC-seq, and FAIRE-seq reveal only a portion of the open chromatin regions (OCRs) present in a given species. Thus, the genome-wide distribution of OCRs remains unknown. In this study, we developed a bioinformatics tool called CharPlant for the de novo prediction of OCRs in plant genomes. To develop this tool, we constructed a three-layer convolutional neural network (CNN) and subsequently trained the CNN using DNase-seq and ATAC-seq datasets of four plant species. The model simultaneously learns the sequence motifs and regulatory logics, which are jointly used to determine DNA accessibility. All of these steps are integrated into CharPlant, which can be run using a simple command line. The results of data analysis using CharPlant in this study demonstrate its prediction power and computational efficiency. To our knowledge, CharPlant is the first de novo prediction tool that can identify potential OCRs in the whole genome. The source code of CharPlant and supporting files are freely available from https://github.com/Yin-Shen/CharPlant.
Affiliation(s)
- Yin Shen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Ling-Ling Chen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Junxiang Gao
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
6
Wang J, Orlov YL, Li X, Zhou Y, Liu Y, Yuan C, Chen M. In situ dissecting the evolution of gene duplication with different histone modification patterns based on high-throughput data analysis in Arabidopsis thaliana. PeerJ 2021; 9:e10426. [PMID: 33505781 PMCID: PMC7792519 DOI: 10.7717/peerj.10426] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/15/2020] [Accepted: 11/03/2020] [Indexed: 02/04/2023] Open
Abstract
Background: Genetic regulation is known to contribute to the divergent expression of duplicate genes; however, little is known about how epigenetic modifications regulate the expression of duplicate genes in plants.
Methods: The histone modification (HM) profile patterns of different modes of gene duplication, including whole genome duplication, proximal duplication, tandem duplication and transposed duplication, were characterized based on ChIP-chip or ChIP-seq datasets. In this study, 10 distinct HM marks including H2Bub, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me2, H3K27me1, H3K27me3, H3K36me3 and H3K14ac were analyzed. Moreover, the features of gene duplication with different HM patterns were characterized based on 88 RNA-seq datasets of Arabidopsis thaliana.
Results: This study showed that duplicate genes in Arabidopsis have a more similar HM pattern than single-copy genes in both their promoters and protein-coding regions. The evolution of HM marks is found to be coupled with coding sequence divergence and expression divergence after gene duplication. We found that functional selective constraints may be imposed on epigenetic evolution after gene duplication. Furthermore, duplicate genes with distinct functions have more divergence in histone modification compared with the ones with the same function, while higher expression divergence is found with mutations of chromatin modifiers. This study shows the role of epigenetic marks in regulating gene expression and functional divergence after gene duplication in plants based on sequencing data.
Affiliation(s)
- Jingjing Wang
  - Center for Stem Cell and Regenerative Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, P. R. China
  - Department of Bioinformatics, The State Key Laboratory of Plant Physiology and Biochemistry, Institute of Plant Science, College of Life Sciences, Zhejiang University, Hangzhou, P. R. China
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, P. R. China
  - Zhejiang Provincial Key Lab for Tissue Engineering and Regenerative Medicine, Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Zhejiang University, Hangzhou, P. R. China
- Yuriy L Orlov
  - The Digital Health Institute, I.M Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russia
  - Novosibirsk State University, Novosibirsk, Russia
  - Agrarian and Technological Institute, Peoples' Friendship University of Russia (RUDN), Moscow, Russia
- Xue Li
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, P. R. China
  - Institute of Hematology, Zhejiang University, Hangzhou, P. R. China
- Yincong Zhou
  - Department of Bioinformatics, The State Key Laboratory of Plant Physiology and Biochemistry, Institute of Plant Science, College of Life Sciences, Zhejiang University, Hangzhou, P. R. China
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, P. R. China
- Yongjing Liu
  - Department of Bioinformatics, The State Key Laboratory of Plant Physiology and Biochemistry, Institute of Plant Science, College of Life Sciences, Zhejiang University, Hangzhou, P. R. China
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, P. R. China
- Chunhui Yuan
  - Zhejiang Provincial Key Lab for Tissue Engineering and Regenerative Medicine, Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Zhejiang University, Hangzhou, P. R. China
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, P. R. China
- Ming Chen
  - Center for Stem Cell and Regenerative Medicine, The First Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, P. R. China
  - Department of Bioinformatics, The State Key Laboratory of Plant Physiology and Biochemistry, Institute of Plant Science, College of Life Sciences, Zhejiang University, Hangzhou, P. R. China
  - James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou, P. R. China
7
Eskier D, Karakülah G. In Silico Identification of Stress-Associated Transposable Elements in Arabidopsis thaliana Using Public Transcriptome Data. Methods Mol Biol 2021; 2250:15-30. [PMID: 33900589 DOI: 10.1007/978-1-0716-1134-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Transposable elements (TEs) have been associated with stress response in many plants, making them a key target of study. However, the high variability, genomic repeat-heavy nature, and widely noncoding character of TEs have made them difficult to study using non-specialized methods, whether experimental or computational. In this chapter, we introduce two computational workflows to analyze transposable elements using publicly available transcriptome data. In the first of these methods, we identify TEs that show differential expression under salt stress, using sample transcriptome libraries that include noncoding transcripts. In the second, we identify protein-coding genes with differential expression under the same conditions, and determine which TEs are enriched in the promoter regions of these stress-related genes.
Affiliation(s)
- Doğa Eskier
  - İzmir International Biomedicine and Genome Institute (iBG-İzmir), Dokuz Eylül University, İnciralti, İzmir, Turkey
  - İzmir Biomedicine and Genome Center (IBG), İnciralti, İzmir, Turkey
- Gökhan Karakülah
  - İzmir International Biomedicine and Genome Institute (iBG-İzmir), Dokuz Eylül University, İnciralti, İzmir, Turkey
  - İzmir Biomedicine and Genome Center (IBG), İnciralti, İzmir, Turkey
8
de Assis R, Baba VY, Cintra LA, Gonçalves LSA, Rodrigues R, Vanzela ALL. Genome relationships and LTR-retrotransposon diversity in three cultivated Capsicum L. (Solanaceae) species. BMC Genomics 2020; 21:237. [PMID: 32183698 PMCID: PMC7076952 DOI: 10.1186/s12864-020-6618-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/06/2019] [Accepted: 02/24/2020] [Indexed: 01/08/2023] Open
Abstract
Background: Plant genomes are rich in repetitive sequences, and transposable elements (TEs) are the most accumulated of them. This mobile fraction can be distinguished as Class I (retrotransposons) and Class II (transposons). Retrotransposons that are transposed using an intermediate RNA and that accumulate in a “copy-and-paste” manner were screened in three genomes of peppers (Solanaceae). The present study aimed to understand the genome relationships among Capsicum annuum, C. chinense, and C. baccatum, based on a comparative analysis of the function, diversity and chromosome distribution of TE lineages in the Capsicum karyotypes. Due to the great commercial importance of pepper in natura, as a spice or as an ornamental plant, these genomes have been widely sequenced, and all of the assemblies are available in the SolGenomics group. These sequences were used to compare all repetitive fractions from a cytogenomic point of view.
Results: The qualification and quantification of LTR-retrotransposon (LTR-RT) families were contrasted with molecular cytogenetic data, and the results showed a strong genome similarity between C. annuum and C. chinense as compared to C. baccatum. The Gypsy superfamily is more abundant than Copia, especially for Tekay/Del lineage members, including a high representation in C. annuum and C. chinense. On the other hand, C. baccatum accumulates more Athila/Tat sequences. The FISH results showed retrotransposons differentially scattered along chromosomes, except for CRM lineage sequences, which mainly have a proximal accumulation associated with heterochromatin bands.
Conclusions: The results confirm a close genomic relationship between C. annuum and C. chinense in comparison to C. baccatum. Centromeric GC-rich bands may be associated with the accumulation regions of CRM elements, whereas terminal and subterminal AT- and GC-rich bands do not correspond to the accumulation of the retrotransposons in the three Capsicum species tested.
Affiliation(s)
- Rafael de Assis
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Viviane Yumi Baba
  - Departamento de Agronomia, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Leonardo Adabo Cintra
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Rosana Rodrigues
  - Laboratório de Melhoramento Genético Vegetal, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Rio de Janeiro, 28013-602, Brazil
- André Luís Laforga Vanzela
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
9
Weitemier K, Straub SC, Fishbein M, Bailey CD, Cronn RC, Liston A. A draft genome and transcriptome of common milkweed (Asclepias syriaca) as resources for evolutionary, ecological, and molecular studies in milkweeds and Apocynaceae. PeerJ 2019; 7:e7649. [PMID: 31579586 PMCID: PMC6756140 DOI: 10.7717/peerj.7649] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/23/2019] [Accepted: 08/09/2019] [Indexed: 02/06/2023] Open
Abstract
Milkweeds (Asclepias) are used in wide-ranging studies including floral development, pollination biology, plant-insect interactions and co-evolution, secondary metabolite chemistry, and rapid diversification. We present a transcriptome and draft nuclear genome assembly of the common milkweed, Asclepias syriaca. This reconstruction of the nuclear genome is augmented by linkage group information, adding to existing chloroplast and mitochondrial genomic resources for this member of the Apocynaceae subfamily Asclepiadoideae. The genome was sequenced to 80.4× depth and the draft assembly contains 54,266 scaffolds ≥1 kbp, with N50 = 3,415 bp, representing 37% (156.6 Mbp) of the estimated 420 Mbp genome. A total of 14,474 protein-coding genes were identified based on transcript evidence, closely related proteins, and ab initio models, and 95% of genes were annotated. A large proportion of gene space is represented in the assembly, with 96.7% of Asclepias transcripts, 88.4% of transcripts from the related genus Calotropis, and 90.6% of proteins from Coffea mapping to the assembly. Scaffolds covering 75 Mbp of the Asclepias assembly formed 11 linkage groups. Comparisons of these groups with pseudochromosomes in Coffea found that six chromosomes show consistent stability in gene content, while one may have a long history of fragmentation and rearrangement. The progesterone 5β-reductase gene family, a key component of cardenolide production, is likely reduced in Asclepias relative to other Apocynaceae. The genome and transcriptome of common milkweed provide a rich resource for future studies of the ecology and evolution of a charismatic plant family.
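The N50 statistic quoted in this abstract is the length of the scaffold at which the cumulative size of scaffolds, sorted from longest to shortest, first reaches half of the total assembly size. A minimal sketch follows; the function name `n50` and the toy scaffold lengths are illustrative and not taken from the paper. The last lines recheck the reported assembly fraction from the two figures given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Toy example: total is 20, and the two longest scaffolds (6 + 5) pass the halfway mark.
print(n50([2, 3, 4, 5, 6]))

# Assembly fraction reported in the abstract: 156.6 Mbp of an estimated 420 Mbp genome.
print(round(100 * 156.6 / 420))
```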
Affiliation(s)
- Kevin Weitemier
  - Department of Fisheries and Wildlife, Oregon State University, Corvallis, OR, USA
- Mark Fishbein
  - Department of Plant Biology, Ecology, and Evolution, Oklahoma State University, Stillwater, OK, USA
- C. Donovan Bailey
  - Department of Biology, New Mexico State University, Las Cruces, NM, USA
- Richard C. Cronn
  - Pacific Northwest Research Station, USDA Forest Service, Corvallis, OR, USA
- Aaron Liston
  - Department of Botany & Plant Pathology, Oregon State University, Corvallis, OR, USA
10
Guigon I, Legrand S, Berthelot JF, Bini S, Lanselle D, Benmounah M, Touzet H. miRkwood: a tool for the reliable identification of microRNAs in plant genomes. BMC Genomics 2019; 20:532. [PMID: 31253093 PMCID: PMC6599362 DOI: 10.1186/s12864-019-5913-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/02/2019] [Accepted: 06/18/2019] [Indexed: 12/21/2022] Open
Abstract
Background: MicroRNAs (miRNAs) play crucial roles in post-transcriptional regulation of eukaryotic gene expression and are involved in many aspects of plant development. Although several prediction tools are available for metazoan genomes, the number of tools dedicated to plants is relatively limited.
Results: Here, we present miRkwood, a user-friendly tool for the identification of miRNAs in plant genomes using small RNA sequencing data. Deep-sequencing data of Argonaute-associated small RNAs showed that miRkwood is able to identify a large diversity of plant miRNAs and limits false positive predictions. Moreover, it outperforms current tools such as ShortStack and, contrary to ShortStack, provides a quality score allowing users to rank miRNA predictions.
Conclusion: miRkwood is a very efficient tool for the annotation of miRNAs in plant genomes. It is available as a web server, as a standalone version, as a docker image and as a Galaxy tool: http://bioinfo.cristal.univ-lille.fr/mirkwood.
Affiliation(s)
- Sylvain Legrand
  - University of Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, F-59000 Lille, France
- Sébastien Bini
  - University of Lille, CNRS, INRIA, UMR 9189 - CRIStAL, F-59000 Lille, France
- Delphine Lanselle
  - University of Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, F-59000 Lille, France
- Hélène Touzet
  - University of Lille, CNRS, INRIA, UMR 9189 - CRIStAL, F-59000 Lille, France
12
Song QA, Catlin NS, Brad Barbazuk W, Li S. Computational analysis of alternative splicing in plant genomes. Gene 2019; 685:186-195. [PMID: 30321657 DOI: 10.1016/j.gene.2018.10.026] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/10/2017] [Revised: 09/16/2018] [Accepted: 10/11/2018] [Indexed: 12/11/2022]
Abstract
Computational analyses play crucial roles in characterizing splicing isoforms in plant genomes. In this review, we provide a survey of computational tools used in recently published, genome-scale splicing analyses in plants. We summarize the commonly used software and pipelines for read mapping, isoform reconstruction, isoform quantification, and differential expression analysis. We also discuss methods for analyzing long reads and the strategies to combine long and short reads in identifying splicing isoforms. We review several tools for characterizing local splicing events, splicing graphs, coding potential, and visualizing splicing isoforms. We further discuss the procedures for identifying conserved splicing isoforms across plant species. Finally, we discuss the outlook of integrating other genomic data with splicing analyses to identify regulatory mechanisms of alternative splicing (AS) on a genome-wide scale.
Affiliation(s)
- Qi A Song
  - Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
- Nathan S Catlin
  - Department of Biology, University of Florida, Gainesville, FL 32611, United States of America
- W Brad Barbazuk
  - Department of Biology, University of Florida, Gainesville, FL 32611, United States of America
  - Genetics Institute, University of Florida, Gainesville, FL 32611, United States of America
- Song Li
  - School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
13
Tamby JP, Brunaud V. FLAGdb++: A Bioinformatic Environment to Study and Compare Plant Genomes. Methods Mol Biol 2017; 1533:79-101. [PMID: 27987165 DOI: 10.1007/978-1-4939-6658-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Today, the growing knowledge and data accumulation on plant genomes do not make gene function inference a simple task. Because data of different types come from various sources, we need to integrate and analyze them to help biologists in this task. We created FLAGdb++ (http://tools.ips2.u-psud.fr/FLAGdb) to take up this challenge for a selection of plant genomes. In order to enrich gene function predictions, structural and functional annotations of the genomes are explored to generate meta-data and to compare them. Since data are numerous and complex, we focused on accessibility and visualization with an original and user-friendly interface. In this chapter we present the main tools of FLAGdb++ and a use-case to explore a gene family: structural and functional properties of this family and the search for orthologous genes in the other plant genomes.
|
14
|
Abstract
Gramene is an integrated informatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for economically important and research model crops, including wheat, potato, tomato, banana, grape, poplar, and Chlamydomonas. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) view a phylogenetic tree for a family of transcription factors, (2) explore genetic variation in the orthologues of a gene with a known trait association, and (3) upload, visualize, and privately share end user data in a new genome browser track. Moreover, this is the first publication describing Gramene's new web interface, intended to provide a simplified portal to the most complete and up-to-date set of plant genome and pathway annotations.
Affiliation(s)
| | - Joshua Stein
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11724, USA
| | - Ken Youens-Clark
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11724, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY, 11724, USA.
- USDA-ARS NEA Plant, Soil & Nutrition Laboratory Research Unit, Cornell University, Ithaca, NY, 14853, USA.
| |
|
15
|
Gul A, Ahad A, Akhtar S, Ahmad Z, Rashid B, Husnain T. Microarray: gateway to unravel the mystery of abiotic stresses in plants. Biotechnol Lett 2015; 38:527-43. [PMID: 26667130 DOI: 10.1007/s10529-015-2010-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/23/2015] [Accepted: 12/02/2015] [Indexed: 10/22/2022]
Abstract
Environmental factors such as drought, salinity, extreme temperature, ozone poisoning, and metal toxicity significantly affect crops. To study these factors and to design a possible remedy, gene expression in these crops must be quantified and compared at a high-throughput level. Microarrays provide a platform for studying the differential expression profiles of targeted genes. This technology can be applied to gene expression studies ranging from individual genes to the whole-genome level. It is now possible to quantify the differential expression of genes on a glass slide in a single experiment. This review documents recently published reports on the use of microarrays to identify genes, in different plant species, that play a role in different cellular networks under abiotic stresses. The regulation pattern of differentially expressed genes, individually or in groups, may help us to study different pathways and functions at the cellular and molecular level. These studies can provide much useful information to unravel the mystery of abiotic stresses in important crop plants.
Affiliation(s)
- Ambreen Gul
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan
| | - Ammara Ahad
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan
| | - Sidra Akhtar
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan
| | - Zarnab Ahmad
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan
| | - Bushra Rashid
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan.
| | - Tayyab Husnain
- Centre of Excellence in Molecular Biology, University of the Punjab Lahore, 87 W Canal Bank Road, Thokar Niaz Baig, Lahore, 53700, Pakistan
| |
|
16
|
Abstract
BACKGROUND The Tc1/mariner superfamily of transposable elements (TEs) is widespread in animal genomes. Mariner-like elements, which bear a DDD triad catalytic motif, have been identified in a wide range of flowering plant species. However, Tc1-like elements, which represent the founding member of the superfamily and bear a DD34E triad catalytic motif, are known only from unikonts (animals, fungi, and Entamoeba). RESULTS Here we report the identification of Tc1-like elements (TLEs) in plant genomes. These elements bear the four terminal nucleotides and the characteristic DD34E triad motif of the Tc1 element. The two TLE families (PpTc1, PpTc2) identified in the moss (Physcomitrella patens) genome contain highly similar copies. Multiple copies of PpTc1 are actively transcribed, and the transcripts encode intact full-length transposase coding sequences. TLEs are also found in angiosperm genome sequence databases of rice (Oryza sativa), dwarf birch (Betula nana), cabbage (Brassica rapa), hemp (Cannabis sativa), barley (Hordeum vulgare), lettuce (Lactuca sativa), poplar (Populus trichocarpa), pear (Pyrus x bretschneideri), and wheat (Triticum urartu). CONCLUSIONS This study extends the occurrence of TLEs to the plant phylum. The elements in the moss genome have amplified recently and may still be capable of transposition. TLEs are also present in angiosperm genomes, but are apparently much less abundant than in moss.
Affiliation(s)
- Yuan Liu
- Department of Biology, University of Toronto at Mississauga, 3359 Mississauga Road, L5L 1C6 Mississauga, ON, Canada ; Cell and Systems Biology, University of Toronto, Toronto, Canada
| | - Guojun Yang
- Department of Biology, University of Toronto at Mississauga, 3359 Mississauga Road, L5L 1C6 Mississauga, ON, Canada ; Cell and Systems Biology, University of Toronto, Toronto, Canada
| |
|
17
|
Asamizu E, Ichihara H, Nakaya A, Nakamura Y, Hirakawa H, Ishii T, Tamura T, Fukami-Kobayashi K, Nakajima Y, Tabata S. Plant Genome DataBase Japan (PGDBj): a portal website for the integration of plant genome-related databases. Plant Cell Physiol 2014; 55:e8. [PMID: 24363285 PMCID: PMC3894704 DOI: 10.1093/pcp/pct189] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 05/08/2023]
Abstract
The Plant Genome DataBase Japan (PGDBj, http://pgdbj.jp/?ln=en) is a portal website that aims to integrate plant genome-related information from databases (DBs) and the literature. The PGDBj comprises three component DBs and a cross-search engine, which provides a seamless search over the contents of the DBs. The three DBs are as follows. (i) The Ortholog DB, providing gene cluster information based on amino acid sequence similarity. Over 500,000 amino acid sequences of 20 Viridiplantae species were subjected to reciprocal BLAST searches and clustered. Sequences from plant genome DBs (e.g. TAIR10 and RAP-DB) were also included in the cluster with a direct link to the original DB. (ii) The Plant Resource DB, integrating the SABRE DB, which provides cDNA and genome sequence resources accumulated and maintained in the RIKEN BioResource Center and National BioResource Projects. (iii) The DNA Marker DB, providing manually or automatically curated information on DNA markers, quantitative trait loci and related linkage maps, from the literature and external DBs. As the PGDBj targets various plant species, including model plants, algae, and crops important as food, fodder and biofuel, researchers in the field of basic biology as well as a wide range of agronomic fields are encouraged to perform searches using DNA sequences, gene names, traits and phenotypes of interest. The PGDBj will return the search results from the component DBs and various types of linked external DBs.
Affiliation(s)
- Erika Asamizu
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Hisako Ichihara
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Akihiro Nakaya
- Center for Transdisciplinary Research, Niigata University, 1-757 Asahimachi-dori, Chuo-ku, Niigata, 951-8585 Japan
| | - Yasukazu Nakamura
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Hideki Hirakawa
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Takahiro Ishii
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Takuro Tamura
- LINE Co., Ltd., 5-201 Kandamatsunaga-cho, Tokyo, 101-0023 Japan
| | | | - Yukari Nakajima
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
| | - Satoshi Tabata
- Department of Plant Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba, 292-0818 Japan
- *Corresponding author: Fax: +81-438-52-3918; E-mail,
| |
|