1
Wang Y, Lecourieux F, Zhang R, Dai Z, Lecourieux D, Li S, Liang Z. Data Comparison and Software Design for Easy Selection and Application of CRISPR-based Genome Editing Systems in Plants. Genomics Proteomics Bioinformatics 2021; 19:937-948. [PMID: 34280549 PMCID: PMC9402788 DOI: 10.1016/j.gpb.2019.05.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/03/2019] [Revised: 04/29/2019] [Accepted: 05/31/2019] [Indexed: 11/18/2022]
Abstract
CRISPR-based genome editing systems have been successfully and effectively used in many organisms. However, only a few studies have reported the comparison between CRISPR/Cas9 and CRISPR/Cpf1 systems in the whole-genome applications. Although many web-based toolkits are available, there is still a shortage of comprehensive, user-friendly, and plant-specific CRISPR databases and desktop software. In this study, we identified and analyzed the similarities and differences between CRISPR/Cas9 and CRISPR/Cpf1 systems by considering the abundance of proto-spacer adjacent motif (PAM) sites, the effects of GC content, optimal proto-spacer length, potential universality within the plant kingdom, PAM-rich region (PARR) inhibiting ratio, and the effects of G-quadruplex (G-Q) structures. Using this information, we built a comprehensive CRISPR database (including 138 plant genome data sources, www.grapeworld.cn/pc/index.html), which provides search tools for the identification of CRISPR editing sites in both CRISPR/Cas9 and CRISPR/Cpf1 systems. We also developed a desktop software on the basis of the Perl/Tk tool, which facilitates and improves the detection and analysis of CRISPR editing sites at the whole-genome level on Linux and/or Windows platform. Therefore, this study provides helpful data and software for easy selection and application of CRISPR-based genome editing systems in plants.
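The PAM-site abundance comparison mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly cited motifs, NGG for SpCas9 and TTTV for Cpf1; the function name `count_pam_sites` and the dictionary keys are hypothetical, and the authors' database and Perl/Tk software may use different motifs or filters. The sketch only counts motif occurrences on both strands and does not check that a full proto-spacer fits next to each PAM.

```python
import re

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_pam_sites(seq: str) -> dict:
    """Count candidate PAM motifs on both strands (assumed motifs, see text)."""
    seq = seq.upper()
    counts = {"Cas9_NGG": 0, "Cpf1_TTTV": 0}
    for strand in (seq, revcomp(seq)):
        # Lookaheads let overlapping motifs be counted individually.
        counts["Cas9_NGG"] += len(re.findall(r"(?=[ACGT]GG)", strand))
        counts["Cpf1_TTTV"] += len(re.findall(r"(?=TTT[ACG])", strand))
    return counts


if __name__ == "__main__":
    print(count_pam_sites("TTTACGGAGG"))
```

Dividing each count by the sequence length in Mb would give a per-genome abundance figure that can be compared across species.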
Affiliation(s)
- Yi Wang
  - Beijing Key Laboratory of Grape Science and Enology, and CAS Key Laboratory of Plant Resources, Institute of Botany, the Innovative Academy of Seed Design, Chinese Academy of Science, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Fatma Lecourieux
  - Ecophysiology and Functional Genomics of Grapevine, Bordeaux Sciences Agro, INRAE, University of Bordeaux, Institute for Vine and Wine Sciences Bordeaux-Aquitaine, Villenave d'Ornon 33140, France
- Rui Zhang
  - College of Plant Protection, Shandong Agricultural University, Taian 271018, China
- Zhanwu Dai
  - Ecophysiology and Functional Genomics of Grapevine, Bordeaux Sciences Agro, INRAE, University of Bordeaux, Institute for Vine and Wine Sciences Bordeaux-Aquitaine, Villenave d'Ornon 33140, France
- David Lecourieux
  - Ecophysiology and Functional Genomics of Grapevine, Bordeaux Sciences Agro, INRAE, University of Bordeaux, Institute for Vine and Wine Sciences Bordeaux-Aquitaine, Villenave d'Ornon 33140, France
- Shaohua Li
  - University of Chinese Academy of Sciences, Beijing 100049, China.
- Zhenchang Liang
  - Beijing Key Laboratory of Grape Science and Enology, and CAS Key Laboratory of Plant Resources, Institute of Botany, the Innovative Academy of Seed Design, Chinese Academy of Science, Beijing 100093, China; Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan 430074, China.
2
Fazzi-Gomes P, Aguiar J, Cabral GF, Marques D, Palheta H, Moreira F, Rodrigues M, Cavalcante R, Souza J, Silva C, Hamoy I, Santos S. Genomic approach for conservation and the sustainable management of endangered species of the Amazon. PLoS One 2021; 16:e0240002. [PMID: 33626057 DOI: 10.1371/journal.pone.0240002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 12/10/2020] [Indexed: 11/19/2022] Open
Abstract
A broad panel of potentially amplifiable microsatellite loci and a multiplex system were developed for the Amazonian symbol fish species Arapaima gigas, which is currently in high danger of extinction due to the disorderly fishing exploitation. Several factors have contributed to the increase of this threat, among which we highlight the lack of genetic information about the structure and taxonomic status of the species, as well as the lack of accurate tools for evaluation of the effectivity of current management programs. Based on Arapaima gigas’ whole genome, available at the NCBI database (ID: 12404), a total of 95,098 unique perfect microsatellites were identified, including their proposed primers. From this panel, a multiplex system containing 12 tetranucleotide microsatellite markers was validated. These tools are valuable for research in as many areas as bioinformatics, ecology, genetics, evolution and comparative studies, since they are able to provide more accurate information for fishing management, conservation of wild populations and genetic management of aquaculture.
3
Fu C, Du J, Tian X, He Z, Fu L, Wang Y, Xu D, Xu X, Xia X, Zhang Y, Cao S. Rapid identification and characterization of genetic loci for defective kernel in bread wheat. BMC Plant Biol 2019; 19:483. [PMID: 31703630 PMCID: PMC6842267 DOI: 10.1186/s12870-019-2102-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND Wheat is a momentous crop and feeds billions of people in the world. The improvement of wheat yield is very important to ensure world food security. Normal development of grain is the essential guarantee for wheat yield formation. The genetic study of grain phenotype and identification of key genes for grain filling are of great significance upon dissecting the molecular mechanism of wheat grain morphogenesis and yield potential. RESULTS Here we identified a pair of defective kernel (Dek) isogenic lines, BL31 and BL33, with plump and shrunken mature grains, respectively, and constructed a genetic population from the BL31/BL33 cross. Ten chromosomes had higher frequency of polymorphic single nucleotide polymorphism (SNP) markers between BL31 and BL33 using Wheat660K chip. Totally 783 simple sequence repeat (SSR) markers were chosen from the above chromosomes and 15 of these were integrated into two linkage groups using the genetic population. Genetic mapping identified three QTL, QDek.caas-3BS.1, QDek.caas-3BS.2 and QDek.caas-4AL, explaining 14.78-18.17%, 16.61-21.83% and 19.08-28.19% of phenotypic variances, respectively. Additionally, five polymorphic SNPs from Wheat660K were successfully converted into cleaved amplified polymorphic sequence (CAPS) markers and enriched the target regions of the above QTL. Biochemical analyses revealed that BL33 has significantly higher grain sucrose contents at filling stages and lower mature grain starch contents than BL31, indicating that the Dek QTL may be involved in carbohydrate metabolism. As such, the candidate genes for each QTL were predicated according to International Wheat Genome Sequence Consortium (IWGSC) RefSeq v1.0. CONCLUSIONS Three major QTL for Dek were identified and their causal genes were predicted, laying a foundation to conduct fine mapping and dissect the regulatory mechanism underlying Dek trait in wheat.
Affiliation(s)
- Chao Fu
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Jiuyuan Du
  - Wheat Research Institute, Gansu Academy of Agricultural Sciences, Lanzhou, 730070, China
- Xiuling Tian
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Zhonghu He
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
  - International Maize and Wheat Improvement Center, 12 Zhongguancun South Street, Beijing, 100081, China
- Luping Fu
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Yue Wang
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Dengan Xu
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Xiaoting Xu
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Xianchun Xia
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Yan Zhang
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
- Shuanghe Cao
  - Institute of Crop Sciences, National Wheat Improvement Center, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
4
Wang X, Zhang Y, Qiao L, Chen B. Comparative analyses of simple sequence repeats (SSRs) in 23 mosquito species genomes: Identification, characterization and distribution (Diptera: Culicidae). Insect Sci 2019; 26:607-619. [PMID: 29484820 PMCID: PMC7379697 DOI: 10.1111/1744-7917.12577] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/02/2017] [Revised: 01/20/2018] [Accepted: 01/24/2018] [Indexed: 05/28/2023]
Abstract
Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species using Drosophila melanogaster as reference at the whole-genome level. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R² = 0.8992, P < 0.01), but the correlation in individual species varies in these mosquito species. In six types of SSR, mono- to trinucleotide SSRs are dominant with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare with 1.12%-4.22% and 3.76/Mb-40.23/Mb. The (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. The 10-20 bp length of SSRs are dominant with the number of 110 561 ± 93 482 and the frequency of 87.25% ± 5.73% on average, and the number and frequency decline with the increase of length. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). The mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). An average of 42.52% of total genes contains SSRs, and the preference for SSR occurrence in different gene subcategories are species-specific. The study provides useful insights into the SSR diversity, characteristics and distribution in 23 mosquito species of genomes.
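Genome-wide identification of perfect mono- to hexanucleotide SSRs, as summarized in this abstract, can be sketched with a regular-expression scan. This is only an illustration: the function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical and are not the thresholds or the tool used in the study.

```python
import re

# Hypothetical minimum repeat counts per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One motif copy followed by at least n - 1 further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ACAC".
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + "A" * 12 + "TT" + "AC" * 7 + "GT"
    for hit in find_perfect_ssrs(demo):
        print(hit)
```

Counting the hits per motif length and dividing by genome size in Mb would give the per-type percentages and densities of the kind reported above.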
Affiliation(s)
- Xiao‐Ting Wang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Yu‐Juan Zhang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Liang Qiao
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Bin Chen
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
5
Chen G, Zhang W, Fang J, Dong L. Identification of massive molecular markers in Echinochloa phyllopogon using a restriction-site associated DNA approach. Plant Divers 2017; 39:287-293. [PMID: 30159521 PMCID: PMC6112297 DOI: 10.1016/j.pld.2017.08.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/22/2017] [Revised: 08/23/2017] [Accepted: 08/28/2017] [Indexed: 06/08/2023]
Abstract
Echinochloa phyllopogon proliferation seriously threatens rice production worldwide. We combined a restriction-site associated DNA (RAD) approach with Illumina DNA sequencing for rapid and mass discovery of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers for E. phyllopogon. RAD tags were generated from the genomic DNA of two E. phyllopogon plants, and sequenced to produce 5197.7 Mb and 5242.9 Mb high quality sequences, respectively. The GC content of E. phyllopogon was 45.8%, which is high for monocots. In total, 4710 putative SSRs were identified in 4132 contigs, which permitted the design of PCR primers for E. phyllopogon. Most repeat motifs among the SSRs identified were dinucleotide (>82%), and most of these SSRs were four motif-repeats (>75%). The most frequent motif was AT, accounting for 36.3%-37.2%, followed by AG and AC. In total, 78 putative polymorphic SSR loci were found. A total of 49,179 SNPs were discovered between the two samples of E. phyllopogon, 67.1% of which were transversions and 32.9% were transitions. We used eight SSRs to study the genetic diversity of four E. phyllopogon populations collected from rice fields in China and all eight loci tested were polymorphic.
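The transition/transversion split reported for the SNPs follows the standard definition: a substitution within purines (A, G) or within pyrimidines (C, T) is a transition, and any purine-pyrimidine change is a transversion. A minimal Python sketch of that classification, with the hypothetical helper name `classify_snp`, might look like this:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    # Both alleles in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, alt, classify_snp(ref, alt))
```

Applying the function to every called SNP and tallying the two labels would give percentages comparable to those quoted in the abstract.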
Affiliation(s)
- Guoqi Chen
  - College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China
  - Key Laboratory of Integrated Pest Management on Crops in East China (Nanjing Agricultural University), Ministry of Agriculture, Nanjing 210095, China
- Wei Zhang
  - College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China
  - Key Laboratory of Integrated Pest Management on Crops in East China (Nanjing Agricultural University), Ministry of Agriculture, Nanjing 210095, China
- Jiapeng Fang
  - College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China
  - Key Laboratory of Integrated Pest Management on Crops in East China (Nanjing Agricultural University), Ministry of Agriculture, Nanjing 210095, China
- Liyao Dong
  - College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China
  - Key Laboratory of Integrated Pest Management on Crops in East China (Nanjing Agricultural University), Ministry of Agriculture, Nanjing 210095, China
6
Talukder SK, Saha MC. Toward Genomics-Based Breeding in C3 Cool-Season Perennial Grasses. Front Plant Sci 2017; 8:1317. [PMID: 28798766 PMCID: PMC5526908 DOI: 10.3389/fpls.2017.01317] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/19/2017] [Accepted: 07/12/2017] [Indexed: 05/13/2023]
Abstract
Most important food and feed crops in the world belong to the C3 grass family. The future of food security is highly reliant on achieving genetic gains of those grasses. Conventional breeding methods have already reached a plateau for improving major crops. Genomics tools and resources have opened an avenue to explore genome-wide variability and make use of the variation for enhancing genetic gains in breeding programs. Major C3 annual cereal breeding programs are well equipped with genomic tools; however, genomic research of C3 cool-season perennial grasses is lagging behind. In this review, we discuss the currently available genomics tools and approaches useful for C3 cool-season perennial grass breeding. Along with a general review, we emphasize the discussion focusing on forage grasses that were considered orphan and have little or no genetic information available. Transcriptome sequencing and genotype-by-sequencing technology for genome-wide marker detection using next-generation sequencing (NGS) are very promising as genomics tools. Most C3 cool-season perennial grass members have no prior genetic information; thus NGS technology will enhance collinear study with other C3 model grasses like Brachypodium and rice. Transcriptomics data can be used for identification of functional genes and molecular markers, i.e., polymorphism markers and simple sequence repeats (SSRs). Genome-wide association study with NGS-based markers will facilitate marker identification for marker-assisted selection. With limited genetic information, genomic selection holds great promise to breeders for attaining maximum genetic gain of the cool-season C3 perennial grasses. Application of all these tools can ensure better genetic gains, reduce length of selection cycles, and facilitate cultivar development to meet the future demand for food and fodder.
7
Li Z, Chen F, Huang C, Zheng W, Yu C, Cheng H, Zhou R. Genome-wide mapping and characterization of microsatellites in the swamp eel genome. Sci Rep 2017; 7:3157. [PMID: 28600492 PMCID: PMC5466649 DOI: 10.1038/s41598-017-03330-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 02/14/2017] [Accepted: 04/26/2017] [Indexed: 11/09/2022] Open
Abstract
We described genome-wide screening and characterization of microsatellites in the swamp eel genome. A total of 99,293 microsatellite loci were identified in the genome with an overall density of 179 microsatellites per megabase of genomic sequences. The dinucleotide microsatellites were the most abundant type representing 71% of the total microsatellite loci and the AC-rich motifs were the most recurrent in all repeat types. Microsatellite frequency decreased as numbers of repeat units increased, which was more obvious in long than short microsatellite motifs. Most of microsatellites were located in non-coding regions, whereas only approximately 1% of the microsatellites were detected in coding regions. Trinucleotide repeats were most abundant microsatellites in the coding regions, which represented amino acid repeats in proteins. There was a chromosome-biased distribution of microsatellites in non-coding regions, with the highest density of 203.95/Mb on chromosome 8 and the least on chromosome 7 (164.06/Mb). The most abundant dinucleotides (AC)n was mainly located on chromosome 8. Notably, genomic mapping showed that there was a chromosome-biased association of genomic distributions between microsatellites and transposon elements. Thus, the novel dataset of microsatellites in swamp eel provides a valuable resource for further studies on QTL-based selection breeding, genetic resource conservation and evolutionary genetics.
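The density figures in this abstract are simple ratios of locus counts to sequence length. As a worked illustration (the function names are hypothetical, and the sequence length below is only inferred from the two reported numbers, not stated in the abstract), 99,293 loci at 179 loci/Mb correspond to roughly 555 Mb of screened sequence:

```python
def ssr_density_per_mb(n_loci: int, sequence_bp: int) -> float:
    """Microsatellite loci per megabase of sequence."""
    return n_loci / (sequence_bp / 1_000_000)


def implied_sequence_mb(n_loci: int, density_per_mb: float) -> float:
    """Sequence length in Mb implied by a locus count and a reported density."""
    return n_loci / density_per_mb


if __name__ == "__main__":
    # Values taken from the abstract: 99,293 loci, 179 loci per Mb.
    print(round(implied_sequence_mb(99_293, 179), 1))  # 554.7
```

The same ratio applied per chromosome gives values of the kind quoted for chromosomes 7 and 8.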
Affiliation(s)
- Zhigang Li
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Feng Chen
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Chunhua Huang
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Weixin Zheng
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Chunlai Yu
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China
- Hanhua Cheng
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China.
- Rongjia Zhou
  - Hubei Key Laboratory of Cell Homeostasis, Laboratory of Molecular and Developmental Genetics, College of Life Sciences, Wuhan University, Wuhan, 430072, P. R. China.
8
Filho JAF, de Brito LS, Leão AP, Alves AA, Formighieri EF, Júnior MTS. In Silico Approach for Characterization and Comparison of Repeats in the Genomes of Oil and Date Palms. Bioinform Biol Insights 2017; 11:1177932217702388. [PMID: 28469420 PMCID: PMC5402704 DOI: 10.1177/1177932217702388] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/02/2016] [Accepted: 03/02/2017] [Indexed: 11/16/2022] Open
Abstract
Transposable elements (TEs) are mobile genetic elements present in almost all eukaryotic genomes. Due to their typical patterns of repetition, discovery, and characterization, they demand analysis by various bioinformatics software. Probably, as a result of the need for a complex analysis, many genomes publicly available do not have these elements annotated yet. In this study, a de novo and homology-based identification of TEs and microsatellites was performed using genomic data from 3 palm species: Elaeis oleifera (American oil palm, v.1, Embrapa, unpublished; v.8, Malaysian Palm Oil Board [MPOB], public), Elaeis guineensis (African oil palm, v.5, MPOB, public), and Phoenix dactylifera (date palm). The estimated total coverage of TEs was 50.96% (523 572 kb) and 42.31% (593 463 kb), 39.41% (605 015 kb), and 33.67% (187 361 kb), respectively. A total of 155 726 microsatellite loci were identified in the genomes of oil and date palms. This is the first detailed description of repeats in the genomes of oil and date palms. A relatively high diversity and abundance of TEs were found in the genomes, opening a range of further opportunities for applied research in these genera. The development of molecular markers (mainly simple sequence repeat), which may be immediately applied in breeding programs of those species to support the selection of superior genotypes and to enhance knowledge of the genetic structure of the breeding and natural populations, is the most notable opportunity.
Affiliation(s)
- Jaire Alves Ferreira Filho
  - Graduate Program in Plant Biotechnology, Federal University of Lavras (UFLA), Lavras, Brazil; Embrapa Agroenergia, Parque Estação Biológica (PqEB), Brasília, Brazil; Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Manoel Teixeira Souza Júnior
  - Graduate Program in Plant Biotechnology, Federal University of Lavras (UFLA), Lavras, Brazil; Embrapa Agroenergia, Parque Estação Biológica (PqEB), Brasília, Brazil
9
Evangelistella C, Valentini A, Ludovisi R, Firrincieli A, Fabbrini F, Scalabrin S, Cattonaro F, Morgante M, Mugnozza GS, Keurentjes JJB, Harfouche A. De novo assembly, functional annotation, and analysis of the giant reed (Arundo donax L.) leaf transcriptome provide tools for the development of a biofuel feedstock. Biotechnol Biofuels 2017; 10:138. [PMID: 28572841 PMCID: PMC5450047 DOI: 10.1186/s13068-017-0828-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/02/2016] [Accepted: 05/23/2017] [Indexed: 05/07/2023]
Abstract
BACKGROUND Arundo donax has attracted renewed interest as a potential candidate energy crop for use in biomass-to-liquid fuel conversion processes and biorefineries. This is due to its high productivity, adaptability to marginal land conditions, and suitability for biofuel and biomaterial production. Despite its importance, the genomic resources currently available for supporting the improvement of this species are still limited. RESULTS We used RNA sequencing (RNA-Seq) to de novo assemble and characterize the A. donax leaf transcriptome. The sequencing generated 1249 million clean reads that were assembled using single-k-mer and multi-k-mer approaches into 62,596 unique sequences (unitranscripts) with an N50 of 1134 bp. TransDecoder and Trinotate software suites were used to obtain putative coding sequences and annotate them by mapping to UniProtKB/Swiss-Prot and UniRef90 databases, searching for known transcripts, proteins, protein domains, and signal peptides. Furthermore, the unitranscripts were annotated by mapping them to the NCBI non-redundant, GO and KEGG pathway databases using Blast2GO. The transcriptome was also characterized by BLAST searches to investigate homologous transcripts of key genes involved in important metabolic pathways, such as lignin, cellulose, purine, and thiamine biosynthesis and carbon fixation. Moreover, a set of homologous transcripts of key genes involved in stomatal development and of genes coding for stress-associated proteins (SAPs) were identified. Additionally, 8364 simple sequence repeat (SSR) markers were identified and surveyed. SSRs appeared more abundant in non-coding regions (63.18%) than in coding regions (36.82%). This SSR dataset represents the first marker catalogue of A. donax. 53 SSRs (PolySSRs) were then predicted to be polymorphic between ecotype-specific assemblies, suggesting genetic variability in the studied ecotypes. CONCLUSIONS This study provides the first publicly available leaf transcriptome for the A. donax bioenergy crop. The functional annotation and characterization of the transcriptome will be highly useful for providing insight into the molecular mechanisms underlying its extreme adaptability. The identification of homologous transcripts involved in key metabolic pathways offers a platform for directing future efforts in genetic improvement of this species. Finally, the identified SSRs will facilitate the harnessing of untapped genetic diversity. This transcriptome should be of value to ongoing functional genomics and genetic studies in this crop of paramount economic importance.
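The N50 quoted for the assembly is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal Python sketch of that standard definition follows; the function name `n50` is hypothetical and the example lengths are made up, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))  # 5
```

In practice the lengths would be read from the assembled transcript FASTA file.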
Affiliation(s)
- Chiara Evangelistella
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
- Alessio Valentini
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
- Riccardo Ludovisi
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
- Andrea Firrincieli
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
- Francesco Fabbrini
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
  - Alasia Franco Vivai s.s., Strada Solerette, 5/A, 12038 Savigliano, Italy
- Simone Scalabrin
  - IGA Technology Services, Via J. Linussio, 51-Z.I.U, 33100 Udine, Italy
- Michele Morgante
  - Department of Agricultural and Environmental Sciences, University of Udine, Via delle Scienze, 206, 33100 Udine, Italy
  - Institute of Applied Genomics, Via J. Linussio, 51-Z.I.U, 33100 Udine, Italy
- Giuseppe Scarascia Mugnozza
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
- Joost J. B. Keurentjes
  - Laboratory of Genetics, Wageningen University, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
- Antoine Harfouche
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
10
Ranade SS, Lin YC, Van de Peer Y, García-Gil MR. Comparative in silico analysis of SSRs in coding regions of high confidence predicted genes in Norway spruce (Picea abies) and Loblolly pine (Pinus taeda). BMC Genet 2015; 16:149. [PMID: 26706685 PMCID: PMC4691297 DOI: 10.1186/s12863-015-0304-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/20/2015] [Accepted: 12/10/2015] [Indexed: 11/24/2022] Open
Abstract
Background Microsatellites or simple sequence repeats (SSRs) are DNA sequences consisting of 1–6 bp tandem repeat motifs present in the genome. SSRs are considered to be one of the most powerful tools in genetic studies. We carried out a comparative study of perfect SSR loci belonging to class I (≥20 bp) and class II (≥12 and <20 bp) types located in coding regions of high confidence genes in Picea abies and Pinus taeda. SSRLocator was used to retrieve SSRs from the full length CDS of predicted genes in both species. Results Trimers were the most abundant motifs in class I followed by hexamers in Picea abies, while trimers and hexamers were equally abundant in Pinus taeda class I SSRs. Hexamers were most frequent within class II SSRs followed by trimers, in both species. Although the frequency of genes containing SSRs was slightly higher in Pinus taeda, SSR counts per Mbp for class I was similar in both species (P-value = 0.22); while for class II SSRs, it was significantly higher in Picea abies (P-value = 0.00009). AT-rich motifs were higher in abundance than the GC-rich motifs, within class II SSRs in both the species (P-values = 10⁻⁹ and 0). With reference to class I SSRs, AT-rich and GC-rich motifs were detected with equal frequency in Pinus taeda (P-value = 0.24); while in Picea abies, GC-rich motifs were detected with higher frequency than the AT-rich motifs (P-value = 0.0005). Conclusions Our study gives a comparative overview of the genome SSRs composition based on high confidence genes in the two recently sequenced and economically important conifers and, also provides information on functional molecular markers that can be applied in genetic studies in Pinus and Picea species. Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0304-y) contains supplementary material, which is available to authorized users.
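The class I / class II split used in this abstract depends only on the total length of a perfect SSR. A small Python sketch of that rule, using the thresholds stated above (class I at 20 bp or longer, class II from 12 bp up to but not including 20 bp), could be written as follows; the function names `ssr_class` and `motif_name` are hypothetical and are not part of SSRLocator.

```python
# Conventional names for motif lengths of 1-6 bp.
MOTIF_NAMES = {1: "monomer", 2: "dimer", 3: "trimer",
               4: "tetramer", 5: "pentamer", 6: "hexamer"}


def ssr_class(length_bp: int):
    """Return 'class I', 'class II', or None for SSRs shorter than 12 bp."""
    if length_bp >= 20:
        return "class I"
    if length_bp >= 12:
        return "class II"
    return None


def motif_name(motif: str) -> str:
    """Name a repeat motif by its length (1-6 bp only)."""
    return MOTIF_NAMES[len(motif)]


if __name__ == "__main__":
    # (motif, repeat count) pairs chosen only for illustration.
    for motif, repeats in [("AAG", 7), ("AGCTGC", 2), ("AT", 5)]:
        total = len(motif) * repeats
        print(motif, motif_name(motif), total, ssr_class(total))
```

For example, a trimer repeated seven times spans 21 bp and falls in class I, while a hexamer repeated twice spans 12 bp and falls in class II.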
Affiliation(s)
- Sonali Sachin Ranade
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, SE-901 83, Umeå, Sweden.
- Yao-Cheng Lin
  - Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052, Ghent, Belgium.
- Yves Van de Peer
  - Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052, Ghent, Belgium; Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria, 0028, South Africa; Bioinformatics Institute Ghent, Ghent University, 9052, Ghent, Belgium.
- María Rosario García-Gil
  - Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, SE-901 83, Umeå, Sweden.