1
|
Shahzad K, Zhang M, Mubeen I, Zhang X, Guo L, Qi T, Feng J, Tang H, Qiao X, Wu J, Xing C. Integrative analyses of long and short-read RNA sequencing reveal the spliced isoform regulatory network of seedling growth dynamics in upland cotton. Funct Integr Genomics 2024; 24:156. [PMID: 39230785 DOI: 10.1007/s10142-024-01420-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 08/08/2024] [Accepted: 08/08/2024] [Indexed: 09/05/2024]
Abstract
The polyploid genome of cotton has significantly increased its transcript complexity. Recent advances in full-length transcript sequencing are now widely used to characterize the complete landscape of transcriptional events. Such studies in cotton can help us to explore the genetic mechanisms of cotton seedling growth. Through long-read single-molecule RNA sequencing, this study compared the transcriptomes of three yield-contrasting genotypes of upland cotton. Our analysis identified different numbers of spliced isoforms from 31,166, 28,716, and 28,713 genes in the SJ48, Z98, and DT8 cotton genotypes, respectively, most of which were novel compared to previous cotton reference transcriptomes, and showed significant differences in the number of exon structures and coding sequence length due to intron retention. Quantification of isoform expression revealed significant differences in expression in the root and leaf of each genotype. An array of key isoform target genes showed protein kinase or phosphorylation functions, and their protein interaction network contained most of the circadian oscillator proteins. Spliced isoforms from the GIGANTEA (GI) protein were differentially regulated in each genotype and might be expected to regulate translational activities, including the sequence and function of target proteins. In addition, these spliced isoforms generate diurnal expression profiles in cotton leaves, which may alter the transcriptional regulatory network of seedling growth. Silencing of the novel spliced GI isoform Gh_A02G0645_N17 significantly affected biomass traits, contributed to variable growth, and increased transcription of the early flowering pathway gene ELF in cotton. Our high-throughput hybrid sequencing results will be useful to dissect functional differences among spliced isoforms in the polyploid cotton genome.
Affiliation(s)
- Kashif Shahzad
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Meng Zhang
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Iqra Mubeen
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Xuexian Zhang
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Liping Guo
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Tingxiang Qi
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Juanjuan Feng
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Huini Tang
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Xiuqin Qiao
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China
| | - Jianyong Wu
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China.
| | - Chaozhu Xing
- State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Key Laboratory for Cotton Genetic Improvement, Ministry of Agriculture and Rural Affairs, 38 Huanghe Dadao, Anyang, 455000, Henan, China.
| |
|
2
|
Varsamis GD, Karafyllidis IG, Gilkes KM, Arranz U, Martin-Cuevas R, Calleja G, Wong J, Jessen HC, Dimitrakis P, Kolovos P, Sandaltzopoulos R. Quantum algorithm for de novo DNA sequence assembly based on quantum walks on graphs. Biosystems 2023; 233:105037. [PMID: 37734700 DOI: 10.1016/j.biosystems.2023.105037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 09/16/2023] [Accepted: 09/18/2023] [Indexed: 09/23/2023]
Abstract
De novo DNA sequence assembly is based on finding paths in overlap graphs, which is an NP-hard problem. We developed a quantum algorithm for de novo assembly based on quantum walks on graphs. The overlap graph is partitioned repeatedly into smaller graphs that form a hierarchical structure. We use quantum walks to find paths in low-rank graphs and a quantum algorithm that finds Hamiltonian paths in graphs of high hierarchical rank. We tested the partitioning quantum algorithm, as well as the quantum algorithm that finds Hamiltonian paths in high hierarchical rank, and confirmed their correct operation using Qiskit. We developed a custom simulation of quantum walks to search for paths in low-rank graphs. The approach described in this paper may serve as a basis for the development of efficient quantum algorithms that solve the de novo DNA assembly problem.
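To make the underlying problem concrete, the following is a minimal classical sketch of assembly as a Hamiltonian path search in an overlap graph. It is not the quantum-walk algorithm described in the abstract; the function names (`overlap`, `build_overlap_graph`, `hamiltonian_assembly`), the `min_len` threshold, and the toy reads are all illustrative assumptions. The brute-force search is exponential in the number of reads, which is the cost the NP-hardness remark refers to.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 2) -> int:
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_len=2):
    """Directed overlap graph: maps an edge (i, j) to the overlap length of reads[i] -> reads[j]."""
    graph = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap(a, b, min_len)
                if k:
                    graph[(i, j)] = k
    return graph


def hamiltonian_assembly(reads, min_len=2):
    """Try every read ordering and keep the shortest sequence spelled by a Hamiltonian path.

    Hypothetical toy routine: exponential time, suitable only for a handful of reads.
    Returns None when no ordering visits every read along overlap edges.
    """
    graph = build_overlap_graph(reads, min_len)
    best = None
    for order in permutations(range(len(reads))):
        steps = list(zip(order, order[1:]))
        if all(edge in graph for edge in steps):
            seq = reads[order[0]]
            for u, v in steps:
                seq += reads[v][graph[(u, v)]:]  # append only the non-overlapping tail
            if best is None or len(seq) < len(best):
                best = seq
    return best


if __name__ == "__main__":
    print(hamiltonian_assembly(["GTACC", "ATGCG", "GCGTA"]))
```

With the three toy reads above, only one ordering is connected by overlap edges, so the search returns a single merged sequence.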
Affiliation(s)
- G D Varsamis
- Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, 67100, Greece
| | - I G Karafyllidis
- Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, 67100, Greece; National Centre for Scientific Research Demokritos, Athens, 15342, Greece.
| | - K M Gilkes
- EY Global Innovation Quantum Computing Lab, USA
| | - U Arranz
- EY Global Innovation Quantum Computing Lab, Spain
| | | | - G Calleja
- EY Global Innovation Quantum Computing Lab, Spain
| | - J Wong
- EY Global Innovation Quantum Computing Lab, USA
| | - H C Jessen
- EY Global Innovation Quantum Computing Lab, Denmark
| | - P Dimitrakis
- National Centre for Scientific Research Demokritos, Athens, 15342, Greece
| | - P Kolovos
- Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, 68100, Greece
| | - R Sandaltzopoulos
- Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, 68100, Greece
| |
|
3
|
Bi Q, Zhao Y, Cui Y, Wang L. Genome survey sequencing and genetic background characterization of yellow horn based on next-generation sequencing. Mol Biol Rep 2019; 46:4303-4312. [PMID: 31115837 DOI: 10.1007/s11033-019-04884-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/05/2019] [Accepted: 05/15/2019] [Indexed: 11/29/2022]
Abstract
Yellowhorn (Xanthoceras sorbifolium Bunge) is an important woody oil tree species with high ornamental and medicinal value. Nevertheless, genomic information for yellowhorn is currently unavailable. Here, for the first time, we conducted a genome survey of two yellowhorn cultivars, Zhongshi 4 and Zhongshi 9, which differ distinctly in phenotype and drought resistance, to obtain genomic information by next-generation sequencing (NGS). In parallel, genome size was estimated using flow cytometry. The whole-genome survey of Zhongshi 4 and Zhongshi 9 generated 34.40 and 39.55 GB of sequence data, respectively. The estimated genome sizes of Zhongshi 4 and Zhongshi 9 were about 536.58 Mb and 569.52 Mb, close to the flow cytometry results. The heterozygosity rates were calculated to be 0.75% and 0.89%, and the repeat rates were 60.08% and 62.00%. The reads were assembled into 1,024,373 and 885,404 contigs with N50 lengths of 1005 bp and 1219 bp, respectively, which were further assembled into 714,369 and 686,128 scaffolds with scaffold N50 lengths of ~1963 bp and ~1938 bp and total lengths of 386,915 kb and 391,904 kb. These results indicated little difference in genome size and complexity between the cultivars. In addition, 63,137 and 65,271 high-quality genomic simple sequence repeat (SSR) markers were generated for Zhongshi 4 and Zhongshi 9, respectively. We suggest that a combination of Illumina and PacBio sequencing, assisted by Hi-C and matching assembly software, should be used for genome sequencing of one of the two yellowhorn cultivars. The results will help to design whole-genome sequencing strategies for yellowhorn and provide a large amount of gene resources for its further excavation and utilization.
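The contig and scaffold N50 values quoted above follow a standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch follows, assuming the lengths are already available as a list of integers; the function name `n50` and the example lengths are illustrative, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, for illustration only.
    print(n50([80, 70, 50, 40, 30, 20]))
```

A low N50 combined with a very large contig count, as in the survey above, usually indicates a fragmented draft, which is consistent with the authors' recommendation to add long reads and Hi-C.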
Affiliation(s)
- Quanxin Bi
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
| | - Yang Zhao
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
| | - Yifan Cui
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
| | - Libing Wang
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China.
| |
|
4
|
Moll KM, Zhou P, Ramaraj T, Fajardo D, Devitt NP, Sadowsky MJ, Stupar RM, Tiffin P, Miller JR, Young ND, Silverstein KAT, Mudge J. Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula. BMC Genomics 2017; 18:578. [PMID: 28778149 PMCID: PMC5545040 DOI: 10.1186/s12864-017-3971-4] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 05/04/2017] [Accepted: 07/31/2017] [Indexed: 12/16/2022] Open
Abstract
Background Third-generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Results Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Conclusions Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3971-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Karen M Moll
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA.,Montana State University, Center for Biofilm Engineering, Bozeman, MT, 59717, USA
| | - Peng Zhou
- Department of Plant Biology, University of Minnesota, Saint Paul, MN, USA
| | - Thiruvarangan Ramaraj
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
| | - Diego Fajardo
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
| | - Nicholas P Devitt
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
| | - Michael J Sadowsky
- Department of Soil, Water & Climate, Plant and Microbial Biology and BioTechnology Institute, University of Minnesota, St. Paul, MN, USA
| | - Robert M Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN, USA
| | - Peter Tiffin
- Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, USA
| | | | - Nevin D Young
- Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, USA
| | | | - Joann Mudge
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA.
| |
|
5
|
Wang Y, Wang Y, Li K, Song X, Chen J. Characterization and Comparative Expression Profiling of Browning Response in Medinilla formosana after Cutting. Front Plant Sci 2016; 7:1897. [PMID: 28066460 PMCID: PMC5178855 DOI: 10.3389/fpls.2016.01897] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/25/2016] [Accepted: 11/30/2016] [Indexed: 06/06/2023]
Abstract
Plant browning is a recalcitrant problem for in vitro culture and often leads to poor growth of explants and even failure of tissue culture. However, the molecular mechanisms underlying browning-induced physiological processes remain unclear. Medinilla is considered one of the most difficult genera for tissue culture owing to its severe browning. In the present study, intact aseptic plantlets of Medinilla formosana Hayata, previously obtained by ovary culture, were used to explore the characteristics and molecular mechanism of the browning response. Successive morphological and anatomical observations after cutting showed that the browning of M. formosana was not lethal but adaptive. De novo transcriptome and digital gene expression (DGE) profiling using Illumina high-throughput sequencing were then used to explore molecular regulation after cutting. About 7.5 million tags of the de novo transcriptome were obtained, and 58,073 unigenes were assembled and annotated. A total of 6,431 differentially expressed genes (DEGs) at three stages after cutting were identified, and the expression patterns of these browning-related genes were clustered and analyzed. A number of putative DEGs involved in signal transduction and secondary metabolism were studied in particular, and the potential roles of these cutting-responsive mRNAs in plant defense against diverse abiotic stresses are discussed. The DGE profiling data were also validated by quantitative RT-PCR analysis. The data obtained in this study provide an excellent resource for unraveling the molecular mechanisms of browning processes during in vitro tissue culture, and lay a foundation for future studies to inhibit and eliminate browning damage.
Affiliation(s)
- Yan Wang
- State Key Laboratory Breeding Base for Zhejiang Sustainable Pest and Disease Control, Ministry of Agriculture Key Laboratory of Biotechnology in Plant Protection, Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Yiting Wang
- State Key Laboratory Breeding Base for Zhejiang Sustainable Pest and Disease Control, Ministry of Agriculture Key Laboratory of Biotechnology in Plant Protection, Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Kunfeng Li
- Agriculture Experiment Station, Zhejiang University, Hangzhou, China
| | - Xijiao Song
- State Key Laboratory Breeding Base for Zhejiang Sustainable Pest and Disease Control, Ministry of Agriculture Key Laboratory of Biotechnology in Plant Protection, Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Jianping Chen
- State Key Laboratory Breeding Base for Zhejiang Sustainable Pest and Disease Control, Ministry of Agriculture Key Laboratory of Biotechnology in Plant Protection, Institute of Virology and Biotechnology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| |
|
6
|
Park YJ, Li X, Noh SJ, Kim JK, Lim SS, Park NI, Kim S, Kim YB, Kim YO, Lee SW, Arasu MV, Al-Dhabi NA, Park SU. Transcriptome and metabolome analysis in shoot and root of Valeriana fauriei. BMC Genomics 2016; 17:303. [PMID: 27107812 PMCID: PMC4842265 DOI: 10.1186/s12864-016-2616-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/11/2015] [Accepted: 04/13/2016] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Valeriana fauriei is commonly used in the treatment of cardiovascular diseases in many countries. Several constituents with various pharmacological properties are present in the roots of Valeriana species. Although V. fauriei has long been studied, further work has been limited by inadequate genomic information. Hence, the Illumina HiSeq 2500 system was used to obtain transcriptome data from the shoot and root of V. fauriei. RESULTS A total of 97,595 unigenes were obtained from 346,771,454 raw reads after preprocessing and assembly. Of these, 47,760 unigenes were annotated with UniProt BLAST hits and mapped to COG, GO and KEGG pathways. In addition, 70,013 and 88,827 transcripts were expressed in the root and shoot of V. fauriei, respectively. Among secondary metabolite biosynthesis pathways, terpenoid backbone and phenylpropanoid biosynthesis were the large groups in which transcripts were involved. To characterize the molecular basis of terpenoid, carotenoid, and phenylpropanoid biosynthesis, transcript levels were determined by qRT-PCR. Secondary metabolite contents were also measured using GC/MS and HPLC analysis to correlate gene expression with metabolite accumulation in the shoot and root of V. fauriei. CONCLUSIONS We have characterized the transcriptome of the shoot and root of V. fauriei using the Illumina HiSeq system. We have also demonstrated the expression of genes associated with secondary metabolism, such as terpenoid, carotenoid, and phenylpropanoid biosynthesis.
Affiliation(s)
- Yun Ji Park
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, 305-764 Korea
| | - Xiaohua Li
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, 305-764 Korea
| | - Seung Jae Noh
- Code Division, Insilicogen Inc., Suwon, Gyeonggi-do 441-813 Korea
| | - Jae Kwang Kim
- Division of Life Sciences and Bio-Resource and Environmental Center, Incheon National University, Yeonsu-gu, Incheon, 406-772 Korea
| | - Soon Sung Lim
- Department of Food and Nutrition and Institute of Natural Medicine, Hallym University, Chuncheon, 200-702 Korea
| | - Nam Il Park
- Department of Plant Science, Gangneung-Wonju National University, 7 Jukheon-gil, Gangneung-si, Gangwon-do 210-702 Korea
| | - Soonok Kim
- Biological and Genetic Resources Assessment Division, National Institute of Biological Resources, Incheon, 404-170 Korea
| | - Yeon Bok Kim
- Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science (NIHHS), Rural Development Administration (RDA), Bisanro 92, Eumseong, Chungbuk 369-873 Republic of Korea
| | - Young Ock Kim
- Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science (NIHHS), Rural Development Administration (RDA), Bisanro 92, Eumseong, Chungbuk 369-873 Republic of Korea
| | - Sang Won Lee
- Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science (NIHHS), Rural Development Administration (RDA), Bisanro 92, Eumseong, Chungbuk 369-873 Republic of Korea
| | - Mariadhas Valan Arasu
- Department of Botany and Microbiology, Addiriyah Chair for Environmental Studies, College of Science, King Saud University, P. O. Box 2455, Riyadh, 11451 Saudi Arabia
| | - Naif Abdullah Al-Dhabi
- Department of Botany and Microbiology, Addiriyah Chair for Environmental Studies, College of Science, King Saud University, P. O. Box 2455, Riyadh, 11451 Saudi Arabia
| | - Sang Un Park
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, 305-764 Korea
| |
|
7
|
Ma X, Tang Z, Qin J, Meng Y. The use of high-throughput sequencing methods for plant microRNA research. RNA Biol 2016; 12:709-19. [PMID: 26016494 DOI: 10.1080/15476286.2015.1053686] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 02/09/2023] Open
Abstract
MicroRNA (miRNA) acts as a critical regulator of gene expression at post-transcriptional and occasionally transcriptional levels in plants. Identification of reliable miRNA genes, monitoring of the transcription, processing and maturation of miRNAs, quantification of miRNA accumulation levels in specific biological samples, and validation of miRNA-target interactions form the basis for a thorough understanding of miRNA-mediated regulatory networks and the underlying mechanisms. Great progress has been achieved in sequencing technology. Based on its high sequencing depth and coverage, high-throughput sequencing (HTS, also called next-generation sequencing) technology provides an unprecedentedly efficient way to conduct genome-wide or transcriptome-wide studies. In this review, we introduce several HTS platform-based methods useful for plant miRNA research, including RNA-seq (RNA sequencing), RNA-PET-seq (paired end tag sequencing of RNAs), sRNA-seq (small RNA sequencing), dsRNA-seq (double-stranded RNA sequencing), ssRNA-seq (single-stranded RNA sequencing) and degradome-seq (degradome sequencing). In particular, we provide some special cases to illustrate the novel use of HTS methods for investigating the processing modes of miRNA precursors, identifying RNA editing sites on miRNA precursors, mature miRNAs and target transcripts, re-examining the current miRNA registries, and discovering novel miRNA species and novel miRNA-target interactions. In summary, we suggest that integrative use of the above-mentioned HTS methods could make studies on miRNAs more efficient.
Affiliation(s)
- Xiaoxia Ma
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, PR China
| | | | | | | |
|
8
|
Pauwels K, De Keersmaecker SC, De Schrijver A, du Jardin P, Roosens NH, Herman P. Next-generation sequencing as a tool for the molecular characterisation and risk assessment of genetically modified plants: Added value or not? Trends Food Sci Technol 2015. [DOI: 10.1016/j.tifs.2015.07.009] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 01/15/2023]
|
9
|
Abstract
SSR genotyping involves the use of simple sequence repeats (SSRs) as DNA markers. SSRs, also called microsatellites, are a type of repetitive DNA sequence ubiquitous in most plant genomes. SSRs contain repeats of a motif sequence 1-6 bp in length. Due to this structure SSRs frequently undergo mutations, mainly due to DNA polymerase errors, which involve the addition or subtraction of a repeat unit. Hence, SSR sequences are highly polymorphic and may be readily used for detection of allelic variation within populations. SSRs are present within both genic and nongenic regions and are occasionally transcribed, and hence may be identified in expressed sequence tags (ESTs) as well as more commonly in nongenic DNA sequences. SSR genotyping involves the design of DNA-based primers to amplify SSR sequences from extracted genomic DNA, followed by amplification of the SSR repeat region using polymerase chain reaction, and subsequent visualization of the resulting DNA products, usually using gel electrophoresis. These procedures are described in this chapter. SSRs have been one of the most favored molecular markers for plant genotyping in the last 20 years due to their high levels of polymorphism, wide distribution across most plant genomes, and ease of use and will continue to be a useful tool in many species for years to come.
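As a small illustration of the motif definition given above (tandem repeats of a 1-6 bp unit), the sketch below scans a DNA string for perfect SSRs with a regular expression. It is an assumption-laden toy, not the chapter's protocol: the function name `find_ssrs`, the default threshold of three copies, and the example sequence are hypothetical, and real marker-discovery tools usually apply motif-length-specific thresholds and tolerate imperfect repeats.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, copies) for perfect tandem repeats in `seq`.

    The motif is matched lazily, so a run such as ACACACAC is reported
    with the shortest repeating unit (AC) rather than ACAC.
    """
    # Group 1 captures a 1..max_motif bp unit; \1{n,} demands n further copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing a dinucleotide (AC) repeat.
    for start, motif, copies in find_ssrs("TTGACACACACAGG"):
        print(start, motif, copies)
```

The reported start position and repeat count are what a primer-design step would use to place flanking primers on either side of the repeat.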
Affiliation(s)
- Annaliese S Mason
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, 4072, Australia
| |
|
10
|
Ruperao P, Edwards D. Bioinformatics: identification of markers from next-generation sequence data. Methods Mol Biol 2015; 1245:29-47. [PMID: 25373747 DOI: 10.1007/978-1-4939-1966-6_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/03/2023]
Abstract
Next-generation sequencing (NGS) technology has dramatically revolutionized plant genomics. NGS technology combined with new software tools enables the discovery, validation, and assessment of genetic markers on a large scale. Among the different marker systems, simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) are the markers of choice for genetics and plant breeding. SSR markers have been a choice for large-scale characterization of germplasm collections, construction of genetic maps, and QTL identification. Similarly, SNPs are the most abundant genetic variations, occurring at high frequencies throughout the genomes of plant species. This chapter discusses various tools available for genome assembly and focuses mainly on SSR and SNP marker discovery.
Affiliation(s)
- Pradeep Ruperao
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
| | | |
|
11
|
Abstract
Most plant species are known to be either ancient or recent polyploids, containing more than one genome as a result of past interspecific hybridization events (allopolyploidy) and/or genome doubling (autopolyploidy). Genotyping in polyploid species offers a set of unique challenges. Most molecular marker methodologies are made more complex by polyploidy, as multilocus alleles are generally produced when a single locus is targeted. Genotyping by sequencing is also more challenging in polyploids, with problematic assemblies of duplicated regions and difficulties in distinguishing between inter- and intragenomic polymorphisms. Strategies for identifying and overcoming the challenges of polyploidy in plant genotyping are proposed.
Affiliation(s)
- Annaliese S Mason
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, 4072, Australia
| |
|
12
|
Patel DA, Zander M, Dalton-Morgan J, Batley J. Advances in plant genotyping: where the future will take us. Methods Mol Biol 2015; 1245:1-11. [PMID: 25373745 DOI: 10.1007/978-1-4939-1966-6_1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Genetic diversity between individuals can be tracked and monitored using a range of molecular markers. These markers can detect variation ranging in scale from a single base pair up to duplications and translocations of entire chromosomal regions. The genotyping of individuals allows the detection of this variation and it has been successfully applied in plant science for many years. The increasing amounts of sequence data able to be generated using next-generation sequencing (NGS) technologies have produced a vast expansion in the rate of discovery of polymorphisms, with single nucleotide polymorphisms (SNPs) predominating as the marker of choice. This increase in polymorphic marker resources through efficient discovery, coupled with the utility of SNPs, has enabled the shift to high-throughput genotyping assays and these methods are reviewed and discussed here, alongside the recent innovations allowing increased throughput.
Affiliation(s)
- Dhwani A Patel
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
| | | | | | | |
|
13
|
Cao W, Fu B, Wu K, Li N, Zhou Y, Gao Z, Lin M, Li G, Wu X, Ma Z, Jia H. Construction and characterization of three wheat bacterial artificial chromosome libraries. Int J Mol Sci 2014; 15:21896-912. [PMID: 25464379 PMCID: PMC4284684 DOI: 10.3390/ijms151221896] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/21/2014] [Revised: 11/22/2014] [Accepted: 11/24/2014] [Indexed: 11/29/2022] Open
Abstract
We have constructed three bacterial artificial chromosome (BAC) libraries: one of the wheat cultivar Triticum aestivum Wangshuibai and two of the T. monococcum germplasms TA2026 and TA2033. A total of 1,233,792, 170,880, and 263,040 clones were picked and arrayed in 384-well plates. On the basis of genome sizes of 16.8 Gb for hexaploid wheat and 5.6 Gb for diploid wheat, the three libraries represented 9.05-, 2.60-, and 3.71-fold coverage of the haploid genomes, respectively. An improved descending pooling system for BAC library screening was established. This improved strategy can save 80% of the time and 68% of the polymerase chain reactions (PCR) with the same success rate as the universal 6D pooling strategy.
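The fold-coverage figures above relate to clone number through the usual relation coverage = (number of clones × mean insert size) / genome size. The sketch below implements that relation and its inverse. The abstract does not report mean insert sizes, so the second function only shows what the quoted numbers would imply; reading the clone counts as three separate values (1,233,792, 170,880 and 263,040) is also an assumption, and the function names are hypothetical.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Fold coverage of a genome by a clone library."""
    return n_clones * mean_insert_bp / genome_bp


def implied_mean_insert(n_clones, coverage, genome_bp):
    """Mean insert size (bp) implied by a stated coverage; the inverse of library_coverage."""
    return coverage * genome_bp / n_clones


if __name__ == "__main__":
    # (clones, stated coverage, genome size in bp) as read from the abstract;
    # the split of the clone counts into three values is an assumption.
    libraries = {
        "Wangshuibai": (1_233_792, 9.05, 16.8e9),
        "TA2026": (170_880, 2.60, 5.6e9),
        "TA2033": (263_040, 3.71, 5.6e9),
    }
    for name, (clones, cov, genome) in libraries.items():
        insert_kb = implied_mean_insert(clones, cov, genome) / 1e3
        print(f"{name}: implied mean insert ~{insert_kb:.0f} kb")
```

Running the inverse calculation is a quick sanity check on reported library statistics: an implied insert size far outside the range typical for BAC vectors would suggest a transcription error in the clone count or coverage.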
Affiliation(s)
- Wenjin Cao
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Bisheng Fu
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Kun Wu
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Na Li
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Yan Zhou
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Zhongxia Gao
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Musen Lin
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Guoqiang Li
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Xinyi Wu
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Zhengqiang Ma
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
- Haiyan Jia
- College of Agricultural Sciences, Nanjing Agricultural University, Nanjing 210095, China.
|
14
|
Abstract
The demand for rapid and accurate diagnosis of plant diseases has risen in the last decade. On-site diagnosis of single or multiple pathogens using portable devices is the first step in this endeavour. Despite extensive attempts to develop portable devices for pathogen detection, current technologies are still restricted to detecting known pathogens with limited detection accuracy. Developing new detection techniques for rapid and accurate detection of multiple plant pathogens and their associated variants is essential. Recent single DNA sequencing technologies are a promising new avenue for developing future portable devices for plant pathogen detection. In this review, we detail the current progress in portable devices and technologies used for detecting plant pathogens, the current position of emerging sequencing technologies for analysis of plant genomics, and the future of portable devices for rapid pathogen diagnosis.
Affiliation(s)
- Amir Sanati Nezhad
- McGill University and Genome Quebec Innovation Centre, Department of Biomedical Engineering, McGill University, Montreal, Quebec, Canada.
|
15
|
Guo Q, Ma X, Wei S, Qiu D, Wilson IW, Wu P, Tang Q, Liu L, Dong S, Zu W. De novo transcriptome sequencing and digital gene expression analysis predict biosynthetic pathway of rhynchophylline and isorhynchophylline from Uncaria rhynchophylla, a non-model plant with potent anti-alzheimer's properties. BMC Genomics 2014; 15:676. [PMID: 25112168 PMCID: PMC4143583 DOI: 10.1186/1471-2164-15-676] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 08/31/2013] [Accepted: 08/04/2014] [Indexed: 12/02/2022]
Abstract
Background The major medicinal alkaloids isolated from Uncaria rhynchophylla (gouteng in Chinese) capsules are rhynchophylline (RIN) and isorhynchophylline (IRN). Extracts containing these terpene indole alkaloids (TIAs) can inhibit the formation and destabilize preformed fibrils of amyloid β protein (a pathological marker of Alzheimer's disease), and have been shown to improve the cognitive function of mice with Alzheimer-like symptoms. The biosynthetic pathways of RIN and IRN are largely unknown. Results In this study, RNA sequencing was performed on pooled Uncaria capsule RNA samples taken at three developmental stages that accumulate different amounts of RIN and IRN. More than 50 million high-quality reads from a cDNA library were generated and de novo assembled. Sequences for all of the known enzymes involved in TIA synthesis were identified. Additionally, 193 cytochrome P450 (CYP450), 280 methyltransferase and 144 isomerase genes were identified as potential candidates for enzymes involved in RIN and IRN synthesis. Digital gene expression (DGE) profile analysis was performed on the three capsule developmental stages and, based on genes whose expression profiles were consistent with RIN and IRN levels, four CYP450s, three methyltransferases and three isomerases were identified as the candidates most likely to be involved in the later steps of RIN and IRN biosynthesis. Conclusion A combination of de novo transcriptome assembly and DGE analysis was shown to be a powerful method for identifying genes encoding enzymes potentially involved in the biosynthesis of important secondary metabolites in a non-model plant. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the capsule extract from Uncaria, and provide information that may aid in metabolic engineering to increase yields of these important alkaloids.
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-676) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xiaojun Ma
- College of Agriculture, Northeast Agricultural University, Harbin 150030, China.
|
16
|
Ruperao P, Chan CKK, Azam S, Karafiátová M, Hayashi S, Čížková J, Saxena RK, Šimková H, Song C, Vrána J, Chitikineni A, Visendi P, Gaur PM, Millán T, Singh KB, Taran B, Wang J, Batley J, Doležel J, Varshney RK, Edwards D. A chromosomal genomics approach to assess and validate the desi and kabuli draft chickpea genome assemblies. PLANT BIOTECHNOLOGY JOURNAL 2014; 12:778-86. [PMID: 24702794 DOI: 10.1111/pbi.12182] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 10/02/2013] [Revised: 01/21/2014] [Accepted: 02/09/2014] [Indexed: 05/09/2023]
Abstract
With the expansion of next-generation sequencing technology and advanced bioinformatics, there has been a rapid growth of genome sequencing projects. However, while this technology enables the rapid and cost-effective assembly of draft genomes, the quality of these assemblies usually falls short of gold standard genome assemblies produced using the more traditional BAC by BAC and Sanger sequencing approaches. Assembly validation is often performed by the physical anchoring of genetically mapped markers, but this is prone to errors and the resolution is usually low, especially towards centromeric regions where recombination is limited. New approaches are required to validate reference genome assemblies. The ability to isolate individual chromosomes combined with next-generation sequencing permits the validation of genome assemblies at the chromosome level. We demonstrate this approach by the assessment of the recently published chickpea kabuli and desi genomes. While previous genetic analysis suggests that these genomes should be very similar, a comparison of their chromosome sizes and published assemblies highlights significant differences. Our chromosomal genomics analysis highlights short defined regions that appear to have been misassembled in the kabuli genome and identifies large-scale misassembly in the draft desi genome. The integration of chromosomal genomics tools within genome sequencing projects has the potential to significantly improve the construction and validation of genome assemblies. The approach could be applied both for new genome assemblies as well as published assemblies, and complements currently applied genome assembly strategies.
Affiliation(s)
- Pradeep Ruperao
- University of Queensland, St. Lucia, Queensland, Australia; Australian Centre for Plant Functional Genomics, University of Queensland, St. Lucia, Queensland, Australia; International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, Andhra Pradesh, India
|
17
|
Azam S, Rathore A, Shah TM, Telluri M, Amindala B, Ruperao P, Katta MAVSK, Varshney RK. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data. PLoS One 2014; 9:e101754. [PMID: 25003610 PMCID: PMC4086967 DOI: 10.1371/journal.pone.0101754] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 03/06/2014] [Accepted: 06/11/2014] [Indexed: 12/30/2022]
Abstract
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools with a graphical user interface for SNP discovery and their utilization in developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data, at a fast speed. The pipeline is very useful for members of the plant genetics and breeding community with no computational expertise who wish to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as standalone free software.
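The polymorphism information content (PIC) value mentioned above can be illustrated for the simplest case. The sketch below uses the standard textbook PIC formula for a biallelic marker; it is not taken from ISMU, and the function name `pic_biallelic` is hypothetical.

```python
def pic_biallelic(p):
    """PIC of a biallelic marker with allele frequencies p and 1 - p.

    Uses the standard formula PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return 1.0 - (p**2 + q**2) - 2.0 * p**2 * q**2


# A SNP with equally frequent alleles is the most informative biallelic case.
print(pic_biallelic(0.5))
```

For a biallelic SNP the value peaks at 0.375 when both alleles are equally frequent and falls to 0 for a monomorphic site.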
Affiliation(s)
- Sarwar Azam
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Abhishek Rathore
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Trushar M. Shah
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Mohan Telluri
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- BhanuPrakash Amindala
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Pradeep Ruperao
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- School of Agriculture and Food Sciences, University of Queensland, Brisbane, Australia
- Mohan A. V. S. K. Katta
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Rajeev K. Varshney
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
|
18
|
Sheth BP, Thaker VS. Plant systems biology: insights, advances and challenges. PLANTA 2014; 240:33-54. [PMID: 24671625 DOI: 10.1007/s00425-014-2059-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 12/23/2013] [Accepted: 03/06/2014] [Indexed: 05/20/2023]
Abstract
Plants, dwelling at the base of the biological food chain, are of fundamental significance in providing solutions to some of the most daunting ecological and environmental problems faced by our planet. The reductionist views of molecular biology provide only a partial understanding of plant phenotypes. Systems biology offers a comprehensive view of plant systems by employing a holistic approach that integrates molecular data at various hierarchical levels. In this review, we discuss the basics of systems biology, including the various 'omics' approaches and their integration, the modeling aspects and the tools needed for plant systems research. Particular emphasis is given to recent analytical advances, updated published examples of plant systems biology studies and future trends.
Affiliation(s)
- Bhavisha P Sheth
- Department of Biosciences, Centre for Advanced Studies in Plant Biotechnology and Genetic Engineering, Saurashtra University, Rajkot, 360005, Gujarat, India,
|
19
|
Abstract
Differences between plant genomes range from single nucleotide polymorphisms to large-scale duplications, deletions and rearrangements. The large polymorphisms are termed structural variants (SVs). SVs have received significant attention in human genetics and were found to be responsible for various chronic diseases. However, little effort has been directed towards understanding the role of SVs in plants. Many recent advances in plant genetics have resulted from improvements in high-resolution technologies for measuring SVs, including microarray-based techniques, and more recently, high-throughput DNA sequencing. In this review we describe recent reports of SV in plants and describe the genomic technologies currently used to measure these SVs.
|
20
|
Bohra A, Pandey MK, Jha UC, Singh B, Singh IP, Datta D, Chaturvedi SK, Nadarajan N, Varshney RK. Genomics-assisted breeding in four major pulse crops of developing countries: present status and prospects. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2014; 127:1263-91. [PMID: 24710822 PMCID: PMC4035543 DOI: 10.1007/s00122-014-2301-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 11/21/2013] [Accepted: 03/17/2014] [Indexed: 05/08/2023]
Abstract
KEY MESSAGE Given recent advances in pulse molecular biology, genomics-driven breeding has emerged as a promising approach to address the issues of limited genetic gain and low productivity in various pulse crops. The global population is continuously increasing and is expected to reach nine billion by 2050. This huge population pressure will lead to severe shortage of food, natural resources and arable land. Such an alarming situation is most likely to arise in developing countries due to increase in the proportion of people suffering from protein and micronutrient malnutrition. Pulses being a primary and affordable source of proteins and minerals play a key role in alleviating the protein calorie malnutrition, micronutrient deficiencies and other undernourishment-related issues. Additionally, pulses are a vital source of livelihood generation for millions of resource-poor farmers practising agriculture in the semi-arid and sub-tropical regions. Limited success achieved through conventional breeding so far in most of the pulse crops will not be enough to feed the ever increasing population. In this context, genomics-assisted breeding (GAB) holds promise in enhancing the genetic gains. Though pulses have long been considered as orphan crops, recent advances in the area of pulse genomics are noteworthy, e.g. discovery of genome-wide genetic markers, high-throughput genotyping and sequencing platforms, high-density genetic linkage/QTL maps and, more importantly, the availability of whole-genome sequence. With genome sequence in hand, there is a great scope to apply genome-wide methods for trait mapping using association studies and to choose desirable genotypes via genomic selection. It is anticipated that GAB will speed up the progress of genetic improvement of pulses, leading to the rapid development of cultivars with higher yield, enhanced stress tolerance and wider adaptability.
Affiliation(s)
- Abhishek Bohra
- Indian Institute of Pulses Research (IIPR), Kanpur, 208024 India
- Manish K. Pandey
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, 502324 India
- Uday C. Jha
- Indian Institute of Pulses Research (IIPR), Kanpur, 208024 India
- Balwant Singh
- National Research Centre on Plant Biotechnology (NRCPB), New Delhi, 110012 India
- Indra P. Singh
- Indian Institute of Pulses Research (IIPR), Kanpur, 208024 India
- Dibendu Datta
- Indian Institute of Pulses Research (IIPR), Kanpur, 208024 India
- N. Nadarajan
- Indian Institute of Pulses Research (IIPR), Kanpur, 208024 India
- Rajeev K. Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, 502324 India
- The University of Western Australia (UWA), Crawley, 6009 Australia
|
21
|
Rallapalli G, Kemen EM, Robert-Seilaniantz A, Segonzac C, Etherington GJ, Sohn KH, MacLean D, Jones JDG. EXPRSS: an Illumina based high-throughput expression-profiling method to reveal transcriptional dynamics. BMC Genomics 2014; 15:341. [PMID: 24884414 PMCID: PMC4035070 DOI: 10.1186/1471-2164-15-341] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 11/05/2013] [Accepted: 03/31/2014] [Indexed: 01/19/2023]
Abstract
Background Next Generation Sequencing technologies have facilitated differential gene expression analysis through RNA-seq and Tag-seq methods. RNA-seq has biases associated with transcript lengths, lacks uniform coverage of regions in mRNA and requires 10–20 times more reads than a typical Tag-seq. Most existing Tag-seq methods either have biases or are not high throughput owing to the use of restriction enzymes, enzymatic manipulation of the 5′ ends of mRNA, or RNA ligations. Results We have developed EXpression Profiling through Randomly Sheared cDNA tag Sequencing (EXPRSS), which employs acoustic waves to randomly shear cDNA and generate sequence tags at a relatively defined position (~150-200 bp) from the 3′ end of each mRNA. Implementation of the method was verified through comparative analysis of expression data generated from EXPRSS, NlaIII-DGE and Affymetrix microarray and through qPCR quantification of selected genes. EXPRSS is a strand-specific and restriction enzyme-independent tag sequencing method that does not require cDNA length-based data transformations. EXPRSS is highly reproducible and high-throughput, and it also reveals alternative polyadenylation and polyadenylated antisense transcripts. It is cost-effective using barcoded multiplexing, avoids the biases of existing SAGE and derivative methods and can reveal polyadenylation position from paired-end sequencing. Conclusions EXPRSS Tag-seq provides sensitive and reliable gene expression data and enables high-throughput expression profiling with relatively simple downstream analysis. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-341) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jonathan D G Jones
- The Sainsbury Laboratory, Norwich Research Park, Colney, Norwich, UK NR4 7UH.
|
22
|
Single-nucleotide polymorphism markers from de-novo assembly of the pomegranate transcriptome reveal germplasm genetic diversity. PLoS One 2014; 9:e88998. [PMID: 24558460 PMCID: PMC3928336 DOI: 10.1371/journal.pone.0088998] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 10/08/2013] [Accepted: 01/10/2014] [Indexed: 12/31/2022]
Abstract
Pomegranate is a valuable crop that is grown commercially in many parts of the world. Wild species have been reported from India, Turkmenistan and Socotra. Pomegranate fruit has a variety of health-beneficial qualities. However, despite this crop's importance, only moderate effort has been invested in studying its biochemical or physiological properties or in establishing genomic and genetic infrastructures. In this study, we reconstructed a transcriptome from two phenotypically different accessions using 454-GS-FLX Titanium technology. These data were used to explore the functional annotation of 45,187 fully annotated contigs. We further compiled a genetic-variation resource of 7,155 simple-sequence repeats (SSRs) and 6,500 single-nucleotide polymorphisms (SNPs). A subset of 480 SNPs was sampled to investigate the genetic structure of the broad pomegranate germplasm collection at the Agricultural Research Organization (ARO), which includes accessions from different geographical areas worldwide. This subset of SNPs was found to be polymorphic, with 10.7% of loci having a minor allele frequency (MAF) below 0.05. These SNPs were successfully used to classify the ARO pomegranate collection into two major groups of accessions: one from India, China and Iran, composed of mainly unknown country origin and which was more of an admixture than the other major group, composed of accessions mainly from the Mediterranean basin, Central Asia and California. This study establishes a high-throughput transcriptome and genetic-marker infrastructure. Moreover, it sheds new light on the genetic interrelations between pomegranate species worldwide and more accurately defines their genetic nature.
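The MAF threshold quoted above can be made concrete with a small sketch. It assumes diploid, biallelic SNP calls written as two-letter genotype strings; the function names and the toy genotypes are hypothetical and are not data from the study.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """MAF of one biallelic SNP from diploid genotype strings such as 'AG'."""
    alleles = Counter(allele for genotype in genotypes for allele in genotype)
    if len(alleles) < 2:
        return 0.0  # monomorphic in this sample
    return min(alleles.values()) / sum(alleles.values())


def fraction_low_maf(snp_genotypes, threshold=0.05):
    """Share of SNPs whose MAF falls below the threshold."""
    mafs = [minor_allele_frequency(calls) for calls in snp_genotypes.values()]
    return sum(maf < threshold for maf in mafs) / len(mafs)


# Toy panel (hypothetical): snp1 is rare, snp2 is common.
panel = {
    "snp1": ["AA"] * 19 + ["AG"],
    "snp2": ["AA", "AG", "GG", "AA"],
}
print(fraction_low_maf(panel))
```

In this toy panel one of the two SNPs has a MAF of 0.025, so half of the loci fall below the 0.05 threshold.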
|
23
|
Zhang S, Chen W, Xin L, Gao Z, Hou Y, Yu X, Zhang Z, Qu S. Genomic variants of genes associated with three horticultural traits in apple revealed by genome re-sequencing. HORTICULTURE RESEARCH 2014; 1:14045. [PMID: 26504548 PMCID: PMC4596325 DOI: 10.1038/hortres.2014.45] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/28/2014] [Revised: 06/06/2014] [Accepted: 07/25/2014] [Indexed: 05/06/2023]
Abstract
The apple (Malus × domestica Borkh.) cultivar 'Su Shuai' exhibits greater disease resistance, shorter internodes and lighter fruit flavor compared with its parents 'Golden Delicious' and 'Indo'. To obtain a comprehensive overview of the sequence variation in these three horticultural traits, the genomes of 'Su Shuai' and 'Indo' were resequenced using next-generation sequencing and compared to the genome of 'Golden Delicious'. A wide range of genetic variations were detected, including 2 454 406 and 18 749 349 single nucleotide polymorphisms (SNPs) and 59 547 and 50 143 structural variants (SVs) in the 'Indo' and 'Su Shuai' genomes, respectively. Among the SVs in 'Su Shuai', 17 genes related to disease resistance, 10 genes related to gibberellin (GA) and 19 genes associated with fruit flavor were identified. The expression patterns of eight of the SV genes were examined using reverse transcription-quantitative polymerase chain reaction (RT-qPCR). The results of this study illustrate the genomic variation in these cultivars and provide evidence for a genetic basis for the horticultural traits of disease resistance, short internodes and lighter flavor exhibited in these cultivars. These results provide a genetic basis for the phenotypic characteristics of 'Su Shuai' and, as such, these SVs could serve as gene-specific molecular markers in marker-assisted breeding of apples.
Affiliation(s)
- Shijie Zhang
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Weiping Chen
- College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China
- Lu Xin
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Zhihong Gao
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Yingjun Hou
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Xinyi Yu
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Zhen Zhang
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Shenchun Qu
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
|
24
|
Next generation characterisation of cereal genomes for marker discovery. BIOLOGY 2013; 2:1357-77. [PMID: 24833229 PMCID: PMC4009793 DOI: 10.3390/biology2041357] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/16/2013] [Revised: 10/29/2013] [Accepted: 11/08/2013] [Indexed: 12/30/2022]
Abstract
Cereal crops form the bulk of the world's food sources, and thus their importance cannot be overstated. Crop breeding programs increasingly rely on high-resolution molecular genetic markers to accelerate the breeding process. The development of these markers is hampered by the complexity of some of the major cereal crop genomes, as well as the time and cost required. In this review, we address current and future methods available for the characterisation of cereal genomes, with an emphasis on faster and more cost-effective approaches for genome sequencing and the development of markers for trait association and marker-assisted selection (MAS) in crop breeding programs.
|
25
|
Silva GG, Dutilh BE, Matthews TD, Elkins K, Schmieder R, Dinsdale EA, Edwards RA. Combining de novo and reference-guided assembly with scaffold_builder. SOURCE CODE FOR BIOLOGY AND MEDICINE 2013; 8:23. [PMID: 24267787 PMCID: PMC4177539 DOI: 10.1186/1751-0473-8-23] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Received: 03/22/2013] [Accepted: 09/24/2013] [Indexed: 01/20/2023]
Abstract
Genome sequencing has become routine; however, genome assembly remains a challenge despite the computational advances of the last decade. In particular, the abundance of repeat elements in genomes makes it difficult to assemble them into a single complete sequence. Identical repeats shorter than the average read length can generally be assembled without issue. However, longer repeats such as ribosomal RNA operons cannot be accurately assembled using existing tools. The application Scaffold_builder was designed to generate scaffolds (super contigs of sequences joined by N-bases) based on similarity to a closely related reference sequence. This is independent of mate-pair information and can be used complementarily for genome assembly, e.g. when mate-pairs are not available or have already been exploited. Scaffold_builder was evaluated using simulated pyrosequencing reads of the bacterial genomes Escherichia coli 042, Lactobacillus salivarius UCC118 and Salmonella enterica subsp. enterica serovar Typhi str. P-stx-12. Moreover, we sequenced two genomes from Salmonella enterica serovar Typhimurium LT2 G455 and Salmonella enterica serovar Typhimurium SDT1291 and show that Scaffold_builder decreases the number of contig sequences by 53% while more than doubling their average length. Scaffold_builder is written in Python and is available at http://edwards.sdsu.edu/scaffold_builder. A web-based implementation is additionally provided to allow users to submit a reference genome and a set of contigs to be scaffolded.
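The idea of a scaffold as contigs joined by N-bases can be sketched in a few lines. This is an illustration only, not Scaffold_builder's actual code: the function `build_scaffold`, its input format, and the assumption that contigs have already been placed on the reference without overlapping are all hypothetical.

```python
def build_scaffold(placements):
    """Join contigs in reference order, filling gaps between them with Ns.

    placements: list of (ref_start, contig_sequence) tuples, where ref_start
    is the 0-based reference position at which the contig aligns. Contigs
    are assumed not to overlap on the reference (an assumption of this sketch).
    """
    pieces = []
    cursor = None  # reference position just past the previous contig
    for ref_start, sequence in sorted(placements):
        if cursor is not None:
            gap = ref_start - cursor
            if gap < 0:
                raise ValueError("overlapping contigs are not handled here")
            pieces.append("N" * gap)
        pieces.append(sequence)
        cursor = ref_start + len(sequence)
    return "".join(pieces)


# Two toy contigs placed 6 bp apart on a hypothetical reference.
print(build_scaffold([(10, "ACGT"), (0, "TTTT")]))
```

A real tool also has to deal with overlapping or reverse-complemented contigs and with ambiguous placements, which this sketch deliberately leaves out.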
Affiliation(s)
- Genivaldo Gz Silva
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA.
|
26
|
Fresnedo-Ramírez J, Martínez-García PJ, Parfitt DE, Crisosto CH, Gradziel TM. Heterogeneity in the entire genome for three genotypes of peach [Prunus persica (L.) Batsch] as distinguished from sequence analysis of genomic variants. BMC Genomics 2013; 14:750. [PMID: 24182359 PMCID: PMC4046826 DOI: 10.1186/1471-2164-14-750] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/27/2013] [Accepted: 10/19/2013] [Indexed: 12/22/2022]
Abstract
Background Peach [Prunus persica (L.) Batsch] is an economically important fruit crop that has become a genetic-genomic model for all Prunus species in the family Rosaceae. A doubled haploid reference genome sequence length of 227.3 Mb, a narrow genetic base contrasted by a wide phenotypic variability, the generation of cultivars through hybridization with subsequent clonal propagation, and the current accessibility of many founder genotypes, as well as the pedigree of modern commercial cultivars make peach a model for the study of inter-cultivar genomic heterogeneity and its shaping by artificial selection. Results The quantitative genomic differences among the three genotypes studied as genomic variants, included small variants (SNPs and InDels) and structural variants (SV) (duplications, inversions and translocations). The heirloom cultivar 'Georgia Belle' and an almond by peach introgression breeding line 'F8,1-42' are more heterogeneous than is the modern cultivar 'Dr. Davis' when compared to the peach reference genome ('Lovell'). A pair-wise comparison of consensus genome sequences with 'Lovell' showed that 'F8,1-42' and 'Georgia Belle' were more divergent than were 'Dr. Davis' and 'Lovell'. Conclusions A novel application of emerging bioinformatics tools to the analysis of ongoing genome sequencing project outputs has led to the identification of a range of genomic variants. Results can be used to delineate the genomic and phenotypic differences among peach genotypes. For crops such as fruit trees, the availability of old cultivars, breeding selections and their pedigrees, make them suitable models for the study of genome shaping by artificial selection. The findings from the study of such genomic variants can then elucidate the control of pomological traits and the characterization of metabolic pathways, thus facilitating the development of protocols for the improvement of Prunus crops.
Electronic supplementary material The online version of this article (doi: 10.1186/1471-2164-14-750) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jonathan Fresnedo-Ramírez
- Department of Plant Sciences, University of California Davis, One Shields Ave, Davis, CA 95616, USA.
|
27
|
Shangguan L, Han J, Kayesh E, Sun X, Zhang C, Pervaiz T, Wen X, Fang J. Evaluation of genome sequencing quality in selected plant species using expressed sequence tags. PLoS One 2013; 8:e69890. [PMID: 23922843 PMCID: PMC3726750 DOI: 10.1371/journal.pone.0069890] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/10/2013] [Accepted: 06/14/2013] [Indexed: 02/02/2023]
Abstract
BACKGROUND With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing, enabling more and more plants to be subjected to genome sequencing. Despite this, the genome sequence qualities of multiple plants have not been evaluated. METHODOLOGY/PRINCIPAL FINDING Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence was represented by the ratio of chromosome size to genome size (or of scaffold size to genome size), which ranged from 55.31% to nearly 100%. The accuracy of a genome sequence was represented by the ratio of matched ESTs to selected ESTs, where 52.93% to 98.28% and 89.02% to 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analyses of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influence genome sequence assembly. CONCLUSION The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published.
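The two quality measures described above are simple ratios, sketched below as percentages. The function names and the example inputs are hypothetical; only the definitions (assembled size over genome size, matched ESTs over selected ESTs) come from the abstract.

```python
def integrity(assembled_bp, genome_bp):
    """Assembled chromosome (or scaffold) size as a percentage of genome size."""
    return 100.0 * assembled_bp / genome_bp


def accuracy(matched_ests, selected_ests):
    """Percentage of the selected ESTs that map to the assembly."""
    return 100.0 * matched_ests / selected_ests


# Hypothetical inputs chosen to reproduce two of the boundary values quoted above.
print(f"integrity: {integrity(553_100_000, 1_000_000_000):.2f}%")
print(f"accuracy:  {accuracy(9_828, 10_000):.2f}%")
```

Treating the two measures separately makes the trade-off visible: an assembly can anchor most of the genome to chromosomes yet still fail to recover a sizeable share of expressed sequences, or the reverse.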
Affiliation(s)
- Lingfei Shangguan
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Jian Han
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Emrul Kayesh
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Xin Sun
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Changqing Zhang
- College of Horticulture, Jinling Institute of Technology, Nanjing City, Jiangsu Province, China
- Tariq Pervaiz
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Xicheng Wen
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Jinggui Fang
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
28
Hirsch CN, Buell CR. Tapping the promise of genomics in species with complex, nonmodel genomes. Annu Rev Plant Biol 2013; 64:89-110. [PMID: 23451780 DOI: 10.1146/annurev-arplant-050312-120237] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 05/19/2023]
Abstract
Genomics is enabling a renaissance in all disciplines of plant biology. However, many plant genomes are complex and remain recalcitrant to current genomic technologies. The complexities of these nonmodel plant genomes are attributable to gene and genome duplication, heterozygosity, ploidy, and/or repetitive sequences. Methods are available to simplify the genome and reduce these barriers, including inbreeding and genome reduction, making these species amenable to current sequencing and assembly methods. Some, but not all, of the complexities in nonmodel genomes can be bypassed by sequencing the transcriptome rather than the genome. Additionally, comparative genomics approaches, which leverage phylogenetic relatedness, can aid in the interpretation of complex genomes. Although there are limitations in accessing complex nonmodel plant genomes using current sequencing technologies, genome manipulation and resourceful analyses can allow access to even the most recalcitrant plant genomes.
Affiliation(s)
- Candice N Hirsch
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
29
Edwards D, Batley J, Snowdon RJ. Accessing complex crop genomes with next-generation sequencing. Theor Appl Genet 2013; 126:1-11. [PMID: 22948437 DOI: 10.1007/s00122-012-1964-x] [Citation(s) in RCA: 135] [Impact Index Per Article: 11.3] [Received: 01/31/2012] [Accepted: 08/08/2012] [Indexed: 05/02/2023]
Abstract
Many important crop species have genomes originating from ancestral or recent polyploidisation events. Multiple homoeologous gene copies, chromosomal rearrangements and amplification of repetitive DNA within large and complex crop genomes can considerably complicate genome analysis and gene discovery by conventional, forward genetics approaches. On the other hand, ongoing technological advances in molecular genetics and genomics today offer unprecedented opportunities to analyse and access even more recalcitrant genomes. In this review, we describe next-generation sequencing and data analysis techniques that vastly improve our ability to dissect and mine genomes for causal genes underlying key traits and allelic variation of interest to breeders. We focus primarily on wheat and oilseed rape, two leading examples of major polyploid crop genomes whose size or complexity present different, significant challenges. In both cases, the latest DNA sequencing technologies, applied using quite different approaches, have enabled considerable progress towards unravelling the respective genomes. Our ability to discover the extent and distribution of genetic diversity in crop gene pools, and its relationship to yield and quality-related traits, is swiftly gathering momentum as DNA sequencing and the bioinformatic tools to deal with growing quantities of genomic data continue to develop. In the coming decade, genomic and transcriptomic sequencing, discovery and high-throughput screening of single nucleotide polymorphisms, presence-absence variations and other structural chromosomal variants in diverse germplasm collections will give detailed insight into the origins, domestication and available trait-relevant variation of polyploid crops, in the process facilitating novel approaches and possibilities for genomics-assisted breeding.
Affiliation(s)
- David Edwards
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD 4072, Australia
30
de Lima JC, Loss-Morais G, Margis R. MicroRNAs play critical roles during plant development and in response to abiotic stresses. Genet Mol Biol 2012; 35:1069-77. [PMID: 23412556 PMCID: PMC3571433 DOI: 10.1590/s1415-47572012000600023] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Indexed: 11/21/2022]
Abstract
MicroRNAs (miRNAs) have been identified as key molecules in regulatory networks. The fine-tuning role of miRNAs in addition to the regulatory role of transcription factors has shown that molecular events during development are tightly regulated. In addition, several miRNAs play crucial roles in the response to abiotic stress induced by drought, salinity, low temperatures, and metals such as aluminium. Interestingly, several miRNAs have overlapping roles with regard to development, stress responses, and nutrient homeostasis. Moreover, in response to the same abiotic stresses, different expression patterns for some conserved miRNA families among different plant species revealed different metabolic adjustments. The use of deep sequencing technologies for the characterisation of miRNA frequency and the identification of new miRNAs adds complexity to regulatory networks in plants. In this review, we consider the regulatory role of miRNAs in plant development and abiotic stresses, as well as the impact of deep sequencing technologies on the generation of miRNA data.
Affiliation(s)
- Júlio César de Lima
- Laboratório de Genomas e Populações de Plantas, Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil; Laboratório de Fisiologia Vegetal, Departamento de Botânica, Instituto de Biologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil; Programa de Pósgraduação em Genética e Biologia Molecular, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
31
Mutz KO, Heilkenbrinker A, Lönne M, Walter JG, Stahl F. Transcriptome analysis using next-generation sequencing. Curr Opin Biotechnol 2012; 24:22-30. [PMID: 23020966 DOI: 10.1016/j.copbio.2012.09.004] [Citation(s) in RCA: 303] [Impact Index Per Article: 23.3] [Received: 07/04/2012] [Revised: 09/03/2012] [Accepted: 09/04/2012] [Indexed: 12/16/2022]
Abstract
Up-to-date research in biology, biotechnology, and medicine requires fast genome and transcriptome analysis technologies for the investigation of cellular state, physiology, and activity. Here, microarray technology and next-generation sequencing of transcripts (RNA-Seq) are state of the art. Whereas microarray technology is limited with respect to the amount of RNA, the quantification of transcript levels and the sequence information, RNA-Seq provides nearly unlimited possibilities in modern bioanalysis. This chapter presents a detailed description of next-generation sequencing (NGS), describes the impact of this technology on transcriptome analysis and explains its possibilities to explore the modern RNA world.
Affiliation(s)
- Kai-Oliver Mutz
- Leibniz Universität Hannover, Institute for Technical Chemistry, Callinstrasse 5, 30167 Hannover, Germany
32
Why assembling plant genome sequences is so challenging. Biology (Basel) 2012; 1:439-59. [PMID: 24832233 PMCID: PMC4009782 DOI: 10.3390/biology1020439] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Received: 07/16/2012] [Revised: 09/05/2012] [Accepted: 09/06/2012] [Indexed: 12/16/2022]
Abstract
In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed.
33
Chang CH, Wang HI, Lu HC, Chen CE, Chen HH, Yeh HH, Tang CY. An efficient RNA interference screening strategy for gene functional analysis. BMC Genomics 2012; 13:491. [PMID: 22988976 PMCID: PMC3533828 DOI: 10.1186/1471-2164-13-491] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 02/01/2012] [Accepted: 09/11/2012] [Indexed: 01/14/2023]
Abstract
Background RNA interference (RNAi) is commonly applied in genome-scale gene functional screens. However, a one-on-one RNAi analysis that targets each gene is cost-ineffective and laborious. Previous studies have indicated that siRNAs can also affect RNAs that are near-perfectly complementary, and this phenomenon has been termed an off-target effect. This phenomenon implies that it is possible to silence several genes simultaneously with a carefully designed siRNA. Results We propose a strategy that is combined with a heuristic algorithm to design suitable siRNAs that can target multiple genes and a group testing method that would reduce the number of required RNAi experiments in a large-scale RNAi analysis. To verify the efficacy of our strategy, we used the Orchid expressed sequence tag data as a case study to screen the putative transcription factors that are involved in plant disease responses. According to our computation, 94 qualified siRNAs were sufficient to examine all of the predicted 229 transcription factors. In addition, among the 94 computer-designed siRNAs, an siRNA that targets both TF15 (a previously identified transcription factor that is involved in the plant disease-response pathway) and TF21 was introduced into orchids. The experimental results showed that this siRNA can simultaneously silence TF15 and TF21, and application of our strategy successfully confirmed that TF15 is involved in plant defense responses. Interestingly, our second-round analysis, which used an siRNA specific to TF21, indicated that TF21 is a previously unidentified transcription factor that is related to plant defense responses. Conclusions Our computational results showed that it is possible to screen all genes with fewer experiments than would be required for the traditional one-on-one RNAi screening. We also verified that our strategy is capable of identifying genes that are involved in a specific phenotype.
Affiliation(s)
- Chih-Hung Chang
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
34
Edwards D, Wilcox S, Barrero RA, Fleury D, Cavanagh CR, Forrest KL, Hayden MJ, Moolhuijzen P, Keeble-Gagnère G, Bellgard MI, Lorenc MT, Shang CA, Baumann U, Taylor JM, Morell MK, Langridge P, Appels R, Fitzgerald A. Bread matters: a national initiative to profile the genetic diversity of Australian wheat. Plant Biotechnol J 2012; 10:703-8. [PMID: 22681313 DOI: 10.1111/j.1467-7652.2012.00717.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 05/10/2023]
Abstract
The large and complex genome of wheat makes genetic and genomic analysis in this important species both expensive and resource intensive. The application of next-generation sequencing technologies is particularly resource intensive, with at least 17 Gbp of sequence data required to obtain minimal (1×) coverage of the genome. A similar volume of data would represent almost 40× coverage of the rice genome. Progress can be made through the establishment of consortia to produce shared genomic resources. Australian wheat genome researchers, working with Bioplatforms Australia, have collaborated in a national initiative to establish a genetic diversity dataset representing Australian wheat germplasm based on whole genome next-generation sequencing data. Here, we describe the establishment and validation of this resource which can provide a model for broader international initiatives for the analysis of large and complex genomes.
Affiliation(s)
- David Edwards
- Australian Centre for Plant Functional Genomics and University of Queensland, St. Lucia, Qld, Australia
35
Tollenaere R, Hayward A, Dalton-Morgan J, Campbell E, Lee JRM, Lorenc MT, Manoli S, Stiller J, Raman R, Raman H, Edwards D, Batley J. Identification and characterization of candidate Rlm4 blackleg resistance genes in Brassica napus using next-generation sequencing. Plant Biotechnol J 2012; 10:709-15. [PMID: 22726421 DOI: 10.1111/j.1467-7652.2012.00716.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/02/2023]
Abstract
A thorough understanding of the relationships between plants and pathogens is essential if we are to continue to meet the agricultural needs of the world's growing population. The identification of genes underlying important quantitative trait loci is extremely challenging in complex genomes such as Brassica napus (canola, oilseed rape or rapeseed). However, recent advances in next-generation sequencing (NGS) enable much quicker identification of candidate genes for traits of interest. Here, we demonstrate this with the identification of candidate disease resistance genes from B. napus for its most devastating fungal pathogen, Leptosphaeria maculans (blackleg fungus). These two species are locked in an evolutionary arms race whereby a gene-for-gene interaction confers either resistance or susceptibility in the plant depending on the genotype of the plant and pathogen. Preliminary analysis of the complete genome sequence of Brassica rapa, the diploid progenitor of B. napus, identified numerous candidate genes with disease resistance characteristics, several of which were clustered around a region syntenic with a major locus (Rlm4) for blackleg resistance on A7 of B. napus. Molecular analyses of the candidate genes using B. napus NGS data are presented, and the difficulties associated with identifying functional gene copies within the highly duplicated Brassica genome are discussed.
Affiliation(s)
- Reece Tollenaere
- Centre for Integrative Legume Research and School of Agriculture and Food Sciences, University of Queensland, Brisbane, Qld, Australia
36
Desgagné-Penix I, Farrow SC, Cram D, Nowak J, Facchini PJ. Integration of deep transcript and targeted metabolite profiles for eight cultivars of opium poppy. Plant Mol Biol 2012; 79:295-313. [PMID: 22527754 DOI: 10.1007/s11103-012-9913-2] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Received: 08/16/2011] [Accepted: 04/06/2012] [Indexed: 05/31/2023]
Abstract
Recent advances in DNA sequencing technology and analytical mass spectrometry are providing unprecedented opportunities to develop the functional genomics resources required to investigate complex biological processes in non-model plants. Opium poppy produces a wide variety of benzylisoquinoline alkaloids (BIAs), including the pharmaceutical compounds codeine, morphine, noscapine and papaverine. A functional genomics platform to identify novel BIA biosynthetic and regulatory genes in opium poppy has been established based on the differential metabolite profile of eight selected cultivars. Stem cDNA libraries from each of the eight opium poppy cultivars were subjected to 454 pyrosequencing and searchable expressed sequence tag databases were created from the assembled reads. These deep and integrated metabolite and transcript databases provide a nearly complete representation of the genetic and metabolic variances responsible for the differential occurrence of specific BIAs in each cultivar as demonstrated using the biochemically well-characterized pathway from tyrosine to morphine. Similar correlations between the occurrence of specific transcripts and alkaloids effectively reveal candidate genes encoding uncharacterized biosynthetic enzymes as shown using cytochromes P450 potentially involved in the formation of papaverine and noscapine.
Affiliation(s)
- Isabel Desgagné-Penix
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
37
Solieri L, Dakal TC, Giudici P. Next-generation sequencing and its potential impact on food microbial genomics. Ann Microbiol 2012. [DOI: 10.1007/s13213-012-0478-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 10/28/2022]
38
Garzón-Martínez GA, Zhu ZI, Landsman D, Barrero LS, Mariño-Ramírez L. The Physalis peruviana leaf transcriptome: assembly, annotation and gene model prediction. BMC Genomics 2012; 13:151. [PMID: 22533342 PMCID: PMC3488962 DOI: 10.1186/1471-2164-13-151] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 11/23/2011] [Accepted: 04/25/2012] [Indexed: 11/16/2022]
Abstract
Background Physalis peruviana, commonly known as Cape gooseberry, is a member of the Solanaceae family that has an increasing popularity due to its nutritional and medicinal values. A broad range of genomic tools is available for other Solanaceae, including tomato and potato. However, limited genomic resources are currently available for Cape gooseberry. Results We report the generation of a total of 652,614 P. peruviana Expressed Sequence Tags (ESTs), using 454 GS FLX Titanium technology. ESTs, with an average length of 371 bp, were obtained from a normalized leaf cDNA library prepared using a Colombian commercial variety. De novo assembling was performed to generate a collection of 24,014 isotigs and 110,921 singletons, with an average length of 1,638 bp and 354 bp, respectively. Functional annotation was performed using NCBI’s BLAST tools and Blast2GO, which identified putative functions for 21,191 assembled sequences, including gene families involved in all the major biological processes and molecular functions as well as defense response and amino acid metabolism pathways. Gene model predictions in P. peruviana were obtained by using the genomes of Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We predict 9,436 P. peruviana sequences with multiple-exon models and conserved intron positions with respect to the potato and tomato genomes. Additionally, to study species diversity we developed 5,971 SSR markers from assembled ESTs. Conclusions We present the first comprehensive analysis of the Physalis peruviana leaf transcriptome, which will provide valuable resources for development of genetic tools in the species. Assembled transcripts with gene models could serve as potential candidates for marker discovery with a variety of applications including: functional diversity, conservation and improvement to increase productivity and fruit quality. P. peruviana was estimated to be phylogenetically branched out before the divergence of five other Solanaceae family members, S. lycopersicum, S. tuberosum, Capsicum spp, S. melongena and Petunia spp.
Affiliation(s)
- Gina A Garzón-Martínez
- Plant Molecular Genetics Laboratory, Center of Biotechnology and Bioindustry (CBB), Colombian Corporation for Agricultural Research (CORPOICA), Bogota, Colombia
39
Hayward A, Mason AS, Dalton-Morgan J, Zander M, Edwards D, Batley J. SNP discovery and applications in Brassica napus. J Plant Biotechnol 2012. [DOI: 10.5010/jpb.2012.39.1.049] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022]
40
Martínez-Gómez P, Sánchez-Pérez R, Rubio M. Clarifying omics concepts, challenges, and opportunities for Prunus breeding in the postgenomic era. OMICS 2012; 16:268-83. [PMID: 22394278 DOI: 10.1089/omi.2011.0133] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
The recent sequencing of the complete genome of the peach, together with the availability of new high-throughput genome, transcriptome, proteome, and metabolome analysis technologies, offers new possibilities for Prunus breeders in what has been described as the postgenomic era. In this context, new biological challenges and opportunities for the application of these technologies in the development of efficient marker-assisted selection strategies in Prunus breeding include genome resequencing using DNA-Seq, the study of RNA regulation at transcriptional and posttranscriptional levels using tiling microarray and RNA-Seq, protein and metabolite identification and annotation, and standardization of phenotype evaluation. Additional biological opportunities include the high level of synteny among Prunus genomes. Finally, the existence of biases presents another important biological challenge in attaining knowledge from these new high-throughput omics disciplines. On the other hand, from the philosophical point of view, we are facing a revolution in the use of new high-throughput analysis techniques that may mean a scientific paradigm shift in Prunus genetics and genomics theories. The evaluation of scientific progress is another important question in this postgenomic context. Finally, the incommensurability of omics theories in the new high-throughput analysis context presents an additional philosophical challenge.
41
Hayward A, McLanders J, Campbell E, Edwards D, Batley J. Genomic advances will herald new insights into the Brassica: Leptosphaeria maculans pathosystem. Plant Biol (Stuttg) 2012; 14 Suppl 1:1-10. [PMID: 21973193 DOI: 10.1111/j.1438-8677.2011.00481.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 05/31/2023]
Abstract
The study of the relationship between plants and phytopathogenic fungi is one of the most rapidly moving fields in the plant sciences, the findings of which have contributed to the development of new strategies and technologies to protect crops. Plants employ sophisticated mechanisms to perceive and appropriately defend themselves against pathogens. A good example of plant and pathogen evolution is the gene-for-gene interaction between the fungal pathogen Leptosphaeria maculans, the causal agent of blackleg disease, and Brassica crops. This interaction has been studied at the genetic and physiological level due to its agro-economic importance. The newly available genome sequence for Brassica spp. and L. maculans will provide the resources to study the co-evolution of this plant and pathogen. Particularly, an understanding of the co-evolution of genes responsible for virulence and resistance will lead to improved plant protection strategies for Brassica canola and provide a model to understand plant-pathogen interactions in other major crops. This review summarises the research-to-date in the study of the Brassica-L. maculans gene-for-gene interaction, with a focus on the genetics of resistance in Brassica and the wealth of information to be gained from genome sequencing efforts.
Affiliation(s)
- A Hayward
- ARC Centre of Excellence for Integrative Legume Research and School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
42
Lai K, Berkman PJ, Lorenc MT, Duran C, Smits L, Manoli S, Stiller J, Edwards D. WheatGenome.info: an integrated database and portal for wheat genome information. Plant Cell Physiol 2012; 53:e2. [PMID: 22009731 DOI: 10.1093/pcp/pcr141] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 05/31/2023]
Abstract
Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
Affiliation(s)
- Kaitao Lai
- School of Agriculture and Food Sciences and Australian Centre for Plant Functional Genomics, University of Queensland, Brisbane, QLD 4072, Australia
43
Azam S, Thakur V, Ruperao P, Shah T, Balaji J, Amindala B, Farmer AD, Studholme DJ, May GD, Edwards D, Jones JDG, Varshney RK. Coverage-based consensus calling (CbCC) of short sequence reads and comparison of CbCC results to identify SNPs in chickpea (Cicer arietinum; Fabaceae), a crop species without a reference genome. Am J Bot 2012; 99:186-192. [PMID: 22301893 DOI: 10.3732/ajb.1100419] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 05/31/2023]
Abstract
PREMISE OF THE STUDY Next-generation sequencing (NGS) technologies are frequently used for resequencing and mining of single nucleotide polymorphisms (SNPs) by comparison to a reference genome. In crop species such as chickpea (Cicer arietinum) that lack a reference genome sequence, NGS-based SNP discovery is a challenge. Therefore, unlike probability-based statistical approaches for consensus calling and by comparison with a reference sequence, a coverage-based consensus calling (CbCC) approach was applied and two genotypes were compared for SNP identification. METHODS A CbCC approach is used in this study with four commonly used short read alignment tools (Maq, Bowtie, Novoalign, and SOAP2) and 15.7 and 22.1 million Illumina reads for chickpea genotypes ICC4958 and ICC1882, together with the chickpea transcriptome assembly (CaTA). KEY RESULTS A nonredundant set of 4543 SNPs was identified between two chickpea genotypes. Experimental validation of 224 randomly selected SNPs showed superiority of Maq among individual tools, as 50.0% of SNPs predicted by Maq were true SNPs. For combinations of two tools, greatest accuracy (55.7%) was reported for Maq and Bowtie, with a combination of Bowtie, Maq, and Novoalign identifying 61.5% true SNPs. SNP prediction accuracy generally increased with increasing read depth. CONCLUSIONS This study provides a benchmark comparison of tools as well as read depths for four commonly used tools for NGS SNP discovery in a crop species without a reference genome sequence. In addition, a large number of SNPs have been identified in chickpea that would be useful for molecular breeding.
Affiliation(s)
- Sarwar Azam
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics, Patancheru 502324, Andhra Pradesh, India
44
Berkman PJ, Skarshewski A, Manoli S, Lorenc MT, Stiller J, Smits L, Lai K, Campbell E, Kubaláková M, Simková H, Batley J, Doležel J, Hernandez P, Edwards D. Sequencing wheat chromosome arm 7BS delimits the 7BS/4AL translocation and reveals homoeologous gene conservation. Theor Appl Genet 2012; 124:423-432. [PMID: 22001910 DOI: 10.1007/s00122-011-1717-2] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Received: 04/19/2011] [Accepted: 09/27/2011] [Indexed: 05/29/2023]
Abstract
Complex Triticeae genomes pose a challenge to genome sequencing efforts due to their size and repetitive nature. Genome sequencing can reveal details of conservation and rearrangements between related genomes. We have applied Illumina second generation sequencing technology to sequence and assemble the low copy and unique regions of Triticum aestivum chromosome arm 7BS, followed by the construction of a syntenic build based on gene order in Brachypodium. We have delimited the position of a previously reported translocation between 7BS and 4AL with a resolution of one or a few genes and report approximately 13% of genes from 7BS having been translocated to 4AL. An additional 13 genes are found on 7BS which appear to have originated from 4AL. The gene content of the 7DS and 7BS syntenic builds indicates a total of ~77,000 genes in wheat. Within wheat syntenic regions, 7BS and 7DS share 740 genes and a common gene conservation rate of ~39% of the genes from the corresponding regions in Brachypodium, as well as a common rate of colinearity with Brachypodium of ~60%. Comparison of wheat homoeologues revealed ~84% of genes previously identified in 7DS have a homoeologue on 7BS or 4AL. The conservation rates we have identified among wheat homoeologues and with Brachypodium provide a benchmark of homoeologous gene conservation levels for future comparative genomic analysis. The syntenic build of 7BS is publicly available at http://www.wheatgenome.info.
Affiliation(s)
- Paul J Berkman
- School of Agriculture and Food Sciences and Australian Centre for Plant Functional Genomics, University of Queensland, Brisbane, QLD, 4072, Australia
|
45
|
Berkman PJ, Lai K, Lorenc MT, Edwards D. Next-generation sequencing applications for wheat crop improvement. Am J Bot 2012; 99:365-71. [PMID: 22268223 DOI: 10.3732/ajb.1100309] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Indexed: 05/10/2023]
Abstract
Bread wheat (Triticum aestivum; Poaceae) is a crop plant of great importance. It provides nearly 20% of the world's daily food supply measured by calorie intake, similar to that provided by rice. The yield of wheat has doubled over the last 40 years due to a combination of advanced agronomic practice and improved germplasm through selective breeding. More recently, yield growth has been less dramatic, and a significant improvement in wheat production will be required if demand from the growing human population is to be met. Next-generation sequencing (NGS) technologies are revolutionizing biology and can be applied to address critical issues in plant biology. These technologies can produce draft genome sequences at significantly reduced cost and in less time than traditional technologies. In addition, NGS technologies can be used to assess gene structure and expression and, importantly, to identify heritable genome variation underlying important agronomic traits. This review provides an overview of the wheat genome and NGS technologies, details some of the problems in applying NGS technology to wheat, and describes how NGS technologies are starting to impact wheat crop improvement.
Affiliation(s)
- Paul J Berkman
- University of Queensland, School of Agriculture and Food Sciences and Australian Centre for Plant Functional Genomics, Brisbane, QLD 4072, Australia
|
46
|
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants. Methods Mol Biol 2012; 862:13-22. [PMID: 22419485 DOI: 10.1007/978-1-61779-609-8_2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
Abstract
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
|
47
|
Abstract
Legumes are the third-largest family of angiosperms, the second-most-important crop family, and a key source of biological nitrogen in agriculture. Recently, the genome sequences of Glycine max (soybean), Medicago truncatula, and Lotus japonicus were substantially completed. Comparisons among legume genomes reveal a key role for duplication, especially a whole-genome duplication event approximately 58 Mya that is shared by most agriculturally important legumes. A second and more recent genome duplication occurred only in the lineage leading to soybean. Outcomes of genome duplication, including gene fractionation and sub- and neofunctionalization, have played key roles in shaping legume genomes and in the evolution of legume-specific traits. Analysis of legume genome sequences also enables the discovery of legume-specific gene families and provides a framework for genome-wide association mapping that will target phenotypes of special importance in legumes. Translating genomic resources from sequenced species to less studied but still important "orphan" legumes will enhance prospects for world food production.
Affiliation(s)
- Nevin D Young
- Department of Plant Pathology and Department of Plant Biology, University of Minnesota, St. Paul, MN 55108, USA.
|
48
|
Lee HC, Lai K, Lorenc MT, Imelfort M, Duran C, Edwards D. Bioinformatics tools and databases for analysis of next-generation sequence data. Brief Funct Genomics 2011; 11:12-24. [DOI: 10.1093/bfgp/elr037] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 01/24/2023] Open
|
49
|
Plantagora: modeling whole genome sequencing and assembly of plant genomes. PLoS One 2011; 6:e28436. [PMID: 22174807 PMCID: PMC3236183 DOI: 10.1371/journal.pone.0028436] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 06/08/2011] [Accepted: 11/08/2011] [Indexed: 01/17/2023] Open
Abstract
Background: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further.
|
50
|
Abeel T, Van Parys T, Saeys Y, Galagan J, Van de Peer Y. GenomeView: a next-generation genome browser. Nucleic Acids Res 2011; 40:e12. [PMID: 22102585 PMCID: PMC3258165 DOI: 10.1093/nar/gkr995] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.1] [Indexed: 12/18/2022] Open
Abstract
Due to ongoing advances in sequencing technologies, billions of nucleotide sequences are now produced on a daily basis. A major challenge is to visualize these data for further downstream analysis. To this end, we present GenomeView, a stand-alone genome browser specifically designed to visualize and manipulate a multitude of genomics data. GenomeView enables users to dynamically browse high volumes of aligned short-read data, with dynamic navigation and semantic zooming, from the whole genome level to the single nucleotide. At the same time, the tool enables visualization of whole genome alignments of dozens of genomes relative to a reference sequence. GenomeView is unique in its capability to interactively handle huge data sets consisting of tens of aligned genomes, thousands of annotation features and millions of mapped short reads both as viewer and editor. GenomeView is freely available as an open source software package.
Affiliation(s)
- Thomas Abeel
- Department of Plant Systems Biology, VIB, Technologiepark 927, 9052 Gent, Belgium.
|