351
LaPierre N, Egan R, Wang W, Wang Z. De novo Nanopore read quality improvement using deep learning. BMC Bioinformatics 2019; 20:552. [PMID: 31694525 PMCID: PMC6833143 DOI: 10.1186/s12859-019-3103-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/02/2019] [Accepted: 09/20/2019] [Indexed: 11/23/2022]
Abstract
Background: Long read sequencing technologies such as Oxford Nanopore can greatly decrease the complexity of de novo genome assembly and large structural variation identification. Currently Nanopore reads have high error rates, and the errors often cluster into low-quality segments within the reads. The limited sensitivity of existing read-based error correction methods can cause large-scale mis-assemblies in the assembled genomes, motivating further innovation in this area.
Results: Here we developed a Convolutional Neural Network (CNN) based method, called MiniScrub, for identification and subsequent “scrubbing” (removal) of low-quality Nanopore read segments to minimize their interference in the downstream assembly process. MiniScrub first generates read-to-read overlaps via MiniMap2, then encodes the overlaps into images, and finally builds CNN models to predict low-quality segments. Applying MiniScrub to real world control datasets under several different parameters, we show that it robustly improves read quality, and improves read error correction in the metagenome setting. Compared to raw reads, de novo genome assembly with scrubbed reads produces many fewer mis-assemblies and large indel errors.
Conclusions: MiniScrub is able to robustly improve read quality of Oxford Nanopore reads, especially in the metagenome setting, making it useful for downstream applications such as de novo assembly. We propose MiniScrub as a tool for preprocessing Nanopore reads for downstream analyses. MiniScrub is open-source software and is available at https://bitbucket.org/berkeleylab/jgi-miniscrub.
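The scrubbing step described in this abstract can be illustrated with a minimal Python sketch. It is not MiniScrub's implementation: the per-segment scores are assumed to come from some upstream predictor (in MiniScrub, the CNN), and the function name `scrub_read`, the fixed segment length, the 0.8 threshold, and the choice to split a read wherever a segment is removed are all hypothetical.

```python
from typing import List


def scrub_read(seq: str, seg_scores: List[float], seg_len: int,
               threshold: float = 0.8, min_len: int = 1) -> List[str]:
    """Drop low-scoring segments and return the retained pieces of a read.

    seg_scores[i] is an assumed quality score for seq[i*seg_len:(i+1)*seg_len].
    Segments scoring below `threshold` are removed, and the read is split
    at each removed segment. Pieces shorter than `min_len` are discarded.
    """
    pieces: List[str] = []
    current: List[str] = []
    for i, score in enumerate(seg_scores):
        segment = seq[i * seg_len:(i + 1) * seg_len]
        if score >= threshold:
            current.append(segment)      # keep extending the current piece
        else:
            if current:                  # close the piece before the bad segment
                pieces.append("".join(current))
            current = []
    if current:
        pieces.append("".join(current))
    return [p for p in pieces if len(p) >= min_len]


if __name__ == "__main__":
    read = "AAAACCCCGGGGTTTT"
    scores = [0.9, 0.5, 0.95, 0.9]       # made-up scores for four 4-bp segments
    print(scrub_read(read, scores, seg_len=4))
```

In this toy input the second segment falls below the threshold, so the read is split into the piece before it and the piece after it.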
Affiliation(s)
- Nathan LaPierre
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Rob Egan
  - Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Wei Wang
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Zhong Wang
  - Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
  - EGSB Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - School of Natural Sciences, University of California at Merced, Merced, CA, 95343, USA
352
Morisse P, Lecroq T, Lefebvre A. Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph. Bioinformatics 2019; 34:4213-4222. [PMID: 29955770 DOI: 10.1093/bioinformatics/bty521] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/22/2017] [Accepted: 06/27/2018] [Indexed: 12/31/2022]
Abstract
Motivation: The recent rise of long read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short read technologies allowed. However, these long reads are very noisy, reaching an error rate of around 10-15% for Pacific Biosciences, and up to 30% for Oxford Nanopore. The error correction problem has been tackled by either self-correcting the long reads, or using complementary short reads in a hybrid approach. However, even though sequencing technologies promise to lower the error rate of the long reads below 10%, it is still higher in practice, and correcting such noisy long reads remains an issue.
Results: We present HG-CoLoR, a hybrid error correction method that focuses on a seed-and-extend approach based on the alignment of the short reads to the long reads, followed by the traversal of a variable-order de Bruijn graph, built from the short reads. Our experiments show that HG-CoLoR manages to efficiently correct highly noisy long reads that display an error rate as high as 44%. When compared to other state-of-the-art long read error correction methods, our experiments also show that HG-CoLoR provides the best trade-off between runtime and quality of the results, and is the only method able to efficiently scale to eukaryotic genomes.
Availability and implementation: HG-CoLoR is implemented in C++, supported on Linux platforms and freely available at https://github.com/morispi/HG-CoLoR.
Supplementary information: Supplementary data are available at Bioinformatics online.
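The idea of linking seeds by traversing a de Bruijn graph built from short reads can be sketched as follows. This is a deliberately simplified illustration and not HG-CoLoR's algorithm: it uses a single fixed order `k` rather than a variable-order graph, a plain breadth-first search rather than the authors' traversal strategy, and the names `build_dbg`, `link_seeds`, and `max_steps` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, List, Optional, Set


def build_dbg(reads: List[str], k: int) -> Dict[str, Set[str]]:
    """Fixed-order de Bruijn graph: nodes are k-mers, edges join consecutive k-mers."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            graph[read[i:i + k]].add(read[i + 1:i + k + 1])
    return graph


def link_seeds(graph: Dict[str, Set[str]], source: str, target: str,
               max_steps: int = 50) -> Optional[str]:
    """Breadth-first search from one seed k-mer to another.

    Returns the sequence spelled by the first path found, or None if the
    target is not reachable within max_steps extensions.
    """
    queue = deque([(source, source)])
    seen = {source}
    while queue:
        node, spelled = queue.popleft()
        if node == target:
            return spelled
        if len(spelled) - len(source) >= max_steps:
            continue                      # path budget exhausted on this branch
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, spelled + nxt[-1]))
    return None


if __name__ == "__main__":
    short_reads = ["ACGTAC", "GTACGG"]    # toy short reads
    g = build_dbg(short_reads, k=3)
    print(link_seeds(g, "ACG", "TAC"))
```

In a hybrid corrector, the two k-mers would come from short-read alignments anchored on the long read, and the spelled path would replace the noisy stretch between them; here both are supplied by hand.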
353
Chen S, Qiu G, Yang M. SMRT sequencing of full-length transcriptome of seagrasses Zostera japonica. Sci Rep 2019; 9:14537. [PMID: 31601990 PMCID: PMC6787188 DOI: 10.1038/s41598-019-51176-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/15/2019] [Accepted: 09/25/2019] [Indexed: 11/20/2022]
Abstract
Seagrass meadows are among the four most productive marine ecosystems in the world. Zostera japonica (Z. japonica) is the most widely distributed species of seagrass in China. However, there is no reference genome or transcriptome available for Z. japonica, impeding progress in functional genomic and molecular ecology studies in this species. Temperature is the main factor that controls the distribution and growth of seagrass around the world, yet how seagrass responds to heat stress remains poorly understood due to the lack of genomic and transcriptomic data. In this study, we applied a combination of second- and third-generation sequencing technologies to sequence full-length transcriptomes of Z. japonica. In total, we obtained 58,134 uniform transcripts, which included 46,070 high-quality full-length transcript sequences. We identified 15,411 simple sequence repeats, 258 long non-coding RNAs and 28,038 open reading frames. Exposure to heat elicited a complex transcriptional response in genes involved in posttranslational modification, protein turnover and chaperones. Overall, our study provides the first large-scale full-length transcriptome in Zostera japonica, allowing for structural, functional and comparative genomics studies in this important seagrass species. Although previous studies have focused specifically on heat shock proteins, we found that examination of other heat stress related genes is important for studying response to heat stress in seagrass. This study provides a genetic resource for the discovery of genes related to heat stress tolerance in this species. Our transcriptome can be further utilized in future studies to understand the molecular adaptation to heat stress in Zostera japonica.
Affiliation(s)
- Siting Chen
  - Guangxi Key Lab of Mangrove Conservation and Utilization, Guangxi Mangrove Research Center, Guangxi Academy of Sciences, Beihai, Guangxi, 536007, China
- Guanglong Qiu
  - Guangxi Key Lab of Mangrove Conservation and Utilization, Guangxi Mangrove Research Center, Guangxi Academy of Sciences, Beihai, Guangxi, 536007, China
- Mingliu Yang
  - Guangxi Key Lab of Mangrove Conservation and Utilization, Guangxi Mangrove Research Center, Guangxi Academy of Sciences, Beihai, Guangxi, 536007, China
354
Zou J, Mao L, Qiu J, Wang M, Jia L, Wu D, He Z, Chen M, Shen Y, Shen E, Huang Y, Li R, Hu D, Shi L, Wang K, Zhu Q, Ye C, Bancroft I, King GJ, Meng J, Fan L. Genome-wide selection footprints and deleterious variations in young Asian allotetraploid rapeseed. Plant Biotechnology Journal 2019; 17:1998-2010. [PMID: 30947395 PMCID: PMC6737024 DOI: 10.1111/pbi.13115] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 11/22/2018] [Revised: 02/16/2019] [Accepted: 03/13/2019] [Indexed: 05/19/2023]
Abstract
Brassica napus (AACC, 2n = 38) is an important oilseed crop grown worldwide. However, little is known about the population evolution of this species, the genomic difference between its major genetic groups, such as European and Asian rapeseed, and the impacts of historical large-scale introgression events on this young tetraploid. In this study, we reported the de novo assembly of the genome sequences of an Asian rapeseed (B. napus), Ningyou 7, and its four progenitors and compared these genomes with other available genomic data from diverse European and Asian cultivars. Our results showed that Asian rapeseed originally derived from European rapeseed but subsequently significantly diverged, with rapid genome differentiation after hybridization and intensive local selective breeding. The first historical introgression of B. rapa dramatically broadened the allelic pool but decreased the deleterious variations of Asian rapeseed. The second historical introgression of the double-low traits of European rapeseed (canola) has reshaped Asian rapeseed into two groups (double-low and double-high), accompanied by an increase in genetic load in the double-low group. This study demonstrates distinctive genomic footprints and deleterious SNP (single nucleotide polymorphism) variants for local adaptation by recent intra- and interspecies introgression events and provides novel insights for understanding the rapid genome evolution of a young allopolyploid crop.
Affiliation(s)
- Jun Zou
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Lingfeng Mao
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Jie Qiu
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Meng Wang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Lei Jia
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Dongya Wu
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Zhesi He
  - Department of Biology, York University, Heslington, UK
- Meihong Chen
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Yifei Shen
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Enhui Shen
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Yongji Huang
  - Center for Genomics and Biotechnology, Haixia Institute of Science and Technology (HIST), Fujian Agriculture and Forestry University, Fuzhou, China
- Ruiyuan Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Dandan Hu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Lei Shi
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Kai Wang
  - Center for Genomics and Biotechnology, Haixia Institute of Science and Technology (HIST), Fujian Agriculture and Forestry University, Fuzhou, China
- Chuyu Ye
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Ian Bancroft
  - Department of Biology, York University, Heslington, UK
- Graham J. King
  - Southern Cross Plant Science, Southern Cross University, Lismore, NSW, Australia
- Jinling Meng
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Longjiang Fan
  - Institute of Crop Sciences & Institute of Bioinformatics, Zhejiang University, Hangzhou, China
355
Martin SL, Parent JS, Laforest M, Page E, Kreiner JM, James T. Population Genomic Approaches for Weed Science. Plants (Basel, Switzerland) 2019; 8:E354. [PMID: 31546893 PMCID: PMC6783936 DOI: 10.3390/plants8090354] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/16/2019] [Revised: 09/12/2019] [Accepted: 09/14/2019] [Indexed: 12/16/2022]
Abstract
Genomic approaches are opening avenues for understanding all aspects of biological life, especially as they begin to be applied to multiple individuals and populations. However, these approaches typically depend on the availability of a sequenced genome for the species of interest. While the number of genomes being sequenced is exploding, one group that has lagged behind are weeds. Although the power of genomic approaches for weed science has been recognized, what is needed to implement these approaches is unfamiliar to many weed scientists. In this review we attempt to address this problem by providing a primer on genome sequencing and provide examples of how genomics can help answer key questions in weed science such as: (1) Where do agricultural weeds come from; (2) what genes underlie herbicide resistance; and, more speculatively, (3) can we alter weed populations to make them easier to control? This review is intended as an introduction to orient weed scientists who are thinking about initiating genome sequencing projects to better understand weed populations, to highlight recent publications that illustrate the potential for these methods, and to provide direction to key tools and literature that will facilitate the development and execution of weed genomic projects.
Affiliation(s)
- Sara L Martin
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
- Jean-Sebastien Parent
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
- Martin Laforest
  - Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, QC J3B 3E6, Canada
- Eric Page
  - Harrow Research and Development Centre, Agriculture and Agri-Food Canada, Harrow, ON N0R 1G0, Canada
- Julia M Kreiner
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
- Tracey James
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
356
He Y, Luo X, Zhou B, Hu T, Meng X, Audano PA, Kronenberg ZN, Eichler EE, Jin J, Guo Y, Yang Y, Qi X, Su B. Long-read assembly of the Chinese rhesus macaque genome and identification of ape-specific structural variants. Nat Commun 2019; 10:4233. [PMID: 31530812 PMCID: PMC6749001 DOI: 10.1038/s41467-019-12174-w] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 03/25/2019] [Accepted: 08/27/2019] [Indexed: 12/20/2022]
Abstract
We present a high-quality de novo genome assembly (rheMacS) of the Chinese rhesus macaque (Macaca mulatta) using long-read sequencing and multiplatform scaffolding approaches. Compared to the current Indian rhesus macaque reference genome (rheMac8), rheMacS increases sequence contiguity 75-fold, closing 21,940 of the remaining assembly gaps (60.8 Mbp). We improve gene annotation by generating more than two million full-length transcripts from ten different tissues by long-read RNA sequencing. We sequence resolve 53,916 structural variants (96% novel) and identify 17,000 ape-specific structural variants (ASSVs) based on comparison to ape genomes. Many ASSVs map within ChIP-seq predicted enhancer regions where apes and macaque show diverged enhancer activity and gene expression. We further characterize a subset that may contribute to ape- or great-ape-specific phenotypic traits, including taillessness, brain volume expansion, improved manual dexterity, and large body size. The rheMacS genome assembly serves as an ideal reference for future biomedical and evolutionary studies. Comparative genomic analysis of human and primate relatives can reveal important biological and evolutionary insights. Here, the authors present a long-read assembly of the Chinese rhesus macaque genome and identify ape-specific structural variants.
Affiliation(s)
- Yaoxi He
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Xin Luo
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Bin Zhou
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Ting Hu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Xiaoyu Meng
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Peter A Audano
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Zev N Kronenberg
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Jie Jin
  - Nextomics Biosciences, Wuhan, 430000, China
- Yongbo Guo
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
  - Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing, 100101, China
- Yanan Yang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Xuebin Qi
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
- Bing Su
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
357
Gao Y, Xi F, Zhang H, Liu X, Wang H, Zhao L, Reddy AS, Gu L. Single-molecule Real-time (SMRT) Isoform Sequencing (Iso-Seq) in Plants: The Status of the Bioinformatics Tools to Unravel the Transcriptome Complexity. Curr Bioinform 2019. [DOI: 10.2174/1574893614666190204151746] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
Abstract
Background: The advent of Single-Molecule Real-time (SMRT) Isoform Sequencing (Iso-Seq) has paved the way to obtain longer full-length transcripts. This method was found to be much superior in identifying full-length splice variants and other post-transcriptional events as compared to Next Generation Sequencing (NGS)-based short read sequencing (RNA-Seq). Several different bioinformatics tools to analyze Iso-Seq data have been developed and some of them are still being refined to address different aspects of transcriptome complexity. However, a comprehensive summary of the available tools and their utility is still lacking.
Objective: Here, we summarized the existing Iso-Seq analysis tools and presented an integrated bioinformatics pipeline for Iso-Seq analysis, which overcomes the limitations of NGS and generates long contiguous Full-Length Non-Chimeric (FLNC) reads for the analysis of post-transcriptional events.
Results: In this review, we summarized recent applications of Iso-Seq in plants, which include improved genome annotations, identification of novel genes and lncRNAs, identification of full-length splice isoforms, and detection of novel Alternative Splicing (AS) and Alternative Polyadenylation (APA) events. In addition, we also discussed the bioinformatics pipeline for comprehensive Iso-Seq data analysis, including how to reduce the error rate in the reads and how to identify and quantify post-transcriptional events. Furthermore, the visualization approach of Iso-Seq was discussed as well. Finally, we discussed methods to combine Iso-Seq data with RNA-Seq for transcriptome quantification.
Conclusion: Overall, this review demonstrates that Iso-Seq is pivotal for analyzing transcriptome complexity and that this new method offers unprecedented opportunities to comprehensively understand transcript diversity.
Affiliation(s)
- Yubang Gao
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Feihu Xi
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Hangxiao Zhang
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Xuqing Liu
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Huiyuan Wang
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Liangzhen Zhao
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Anireddy S.N. Reddy
  - Department of Biology, Program in Molecular Plant Biology, Program in Cell and Molecular Biology, Colorado State University, Fort Collins, Colorado 80523, United States
- Lianfeng Gu
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
358
Wingfield BD, Fourie A, Simpson MC, Bushula-Njah VS, Aylward J, Barnes I, Coetzee MPA, Dreyer LL, Duong TA, Geiser DM, Roets F, Steenkamp ET, van der Nest MA, van Heerden CJ, Wingfield MJ. IMA Genome-F 11: Draft genome sequences of Fusarium xylarioides, Teratosphaeria gauchensis and T. zuluensis and genome annotation for Ceratocystis fimbriata. IMA Fungus 2019; 10:13. [PMID: 32355613 PMCID: PMC7184890 DOI: 10.1186/s43008-019-0013-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/08/2019] [Accepted: 07/01/2019] [Indexed: 01/21/2023]
Abstract
Draft genomes of the fungal species Fusarium xylarioides, Teratosphaeria gauchensis and T. zuluensis are presented. In addition an annotation of the genome of Ceratocystis fimbriata is presented. Overall these genomes provide a valuable resource for understanding the molecular processes underlying pathogenicity and potential management strategies of these economically important fungi.
Affiliation(s)
- Brenda D. Wingfield
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Arista Fourie
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Melissa C. Simpson
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Vuyiswa S. Bushula-Njah
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Janneke Aylward
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Irene Barnes
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Martin P. A. Coetzee
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Léanne L. Dreyer
  - Department of Botany and Zoology, Stellenbosch University, Private Bag X1, Matieland, 7602 South Africa
- Tuan A. Duong
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- David M. Geiser
  - Fusarium Research Center, Department of Plant Pathology and Environmental Microbiology, 121 Buckhout Lab, University Park, State College, PA 16802 USA
- Francois Roets
  - Department of Conservation Ecology and Entomology, Stellenbosch University, Private Bag X1, Matieland, 7602 South Africa
- E. T. Steenkamp
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
- Magriet A. van der Nest
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
  - Biotechnology Platform, Agricultural Research Council, Private Bag X05, Onderstepoort, 0002 South Africa
- Carel J. van Heerden
  - Central Analytical Facilities, Stellenbosch University, Private Bag X1, Matieland, 7602 South Africa
- Michael J. Wingfield
  - Department of Biochemistry, Genetics and Microbiology (BGM), Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private Bag X20, Hatfield, 0028 South Africa
359
Stroehlein AJ, Korhonen PK, Chong TM, Lim YL, Chan KG, Webster B, Rollinson D, Brindley PJ, Gasser RB, Young ND. High-quality Schistosoma haematobium genome achieved by single-molecule and long-range sequencing. Gigascience 2019; 8:giz108. [PMID: 31494670 PMCID: PMC6736295 DOI: 10.1093/gigascience/giz108] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 05/08/2019] [Revised: 06/25/2019] [Accepted: 08/10/2019] [Indexed: 01/30/2023]
Abstract
Background: Schistosoma haematobium causes urogenital schistosomiasis, a neglected tropical disease affecting >100 million people worldwide. Chronic infection with this parasitic trematode can lead to urogenital conditions including female genital schistosomiasis and bladder cancer. At the molecular level, little is known about this blood fluke and the pathogenesis of the disease that it causes. To support molecular studies of this carcinogenic worm, we reported a draft genome for S. haematobium in 2012. Although a useful resource, its utility has been somewhat limited by its fragmentation.
Findings: Here, we systematically enhanced the draft genome of S. haematobium using a single-molecule and long-range DNA-sequencing approach. We achieved a major improvement in the accuracy and contiguity of the genome assembly, making it superior or comparable to assemblies for other schistosome species. We transferred curated gene models to this assembly and, using enhanced gene annotation pipelines, inferred a gene set with as many or more complete gene models as those of other well-studied schistosomes. Using conserved, single-copy orthologs, we assessed the phylogenetic position of S. haematobium in relation to other parasitic flatworms for which draft genomes were available.
Conclusions: We report a substantially enhanced genomic resource that represents a solid foundation for molecular research on S. haematobium and is poised to better underpin population and functional genomic investigations and to accelerate the search for new disease interventions.
Collapse
Affiliation(s)
- Andreas J Stroehlein
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Corner Flemington Road and Park Drive, Parkville, VIC 3010, Australia
| | - Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Corner Flemington Road and Park Drive, Parkville, VIC 3010, Australia
| | - Teik Min Chong
- Institute of Biological Sciences, Faculty of Science, University of Malaya, 50603 Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
| | - Yan Lue Lim
- Institute of Biological Sciences, Faculty of Science, University of Malaya, 50603 Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
| | - Kok Gan Chan
- Institute of Biological Sciences, Faculty of Science, University of Malaya, 50603 Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur, Malaysia
| | - Bonnie Webster
- Parasites and Vectors Division, The Natural History Museum, Cromwell Rd, South Kensington, London SW7 5BD, UK
| | - David Rollinson
- Parasites and Vectors Division, The Natural History Museum, Cromwell Rd, South Kensington, London SW7 5BD, UK
| | - Paul J Brindley
- School of Medicine & Health Sciences, Department of Microbiology, Immunology & Tropical Medicine, George Washington University, 2300 Eye Street, NW, Suite 502, Washington, DC 20037, USA
| | - Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Corner Flemington Road and Park Drive, Parkville, VIC 3010, Australia
| | - Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Corner Flemington Road and Park Drive, Parkville, VIC 3010, Australia
| |
|
360
|
Bråte J, Fuss J, Jakobsen KS, Klaveness D. Draft genome assembly and transcriptome sequencing of the golden algae Hydrurus foetidus (Chrysophyceae). F1000Res 2019; 8:401. [DOI: 10.12688/f1000research.16734.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 08/27/2019] [Indexed: 11/20/2022] Open
Abstract
Hydrurus foetidus is a freshwater chrysophyte alga. It thrives in cold rivers in polar and high alpine regions. It has several morphological traits reminiscent of single-celled eukaryotes, but can also form macroscopic thalli. Despite its ability to produce polyunsaturated fatty acids, its life under cold conditions and its variable morphology, very little is known about its genome and transcriptome. Here, we present an extensive set of next-generation sequencing data, including genomic short reads from Illumina sequencing and long reads from Nanopore sequencing, as well as full-length cDNAs from PacBio IsoSeq sequencing and a small RNA dataset (smaller than 200 bp) sequenced with Illumina. The genome sequences were combined to produce an assembly consisting of 5069 contigs, with a total assembly size of 171 Mb and 77% BUSCO completeness. The new data generated here may contribute to a better understanding of the evolution and ecological roles of chrysophyte algae, and may help to resolve branching patterns at a larger phylogenetic scale.
|
361
|
Zhang J, Guan W, Huang C, Hu Y, Chen Y, Guo J, Zhou C, Chen R, Du B, Zhu L, Huanhan D, He G. Combining next-generation sequencing and single-molecule sequencing to explore brown plant hopper responses to contrasting genotypes of japonica rice. BMC Genomics 2019; 20:682. [PMID: 31464583 PMCID: PMC6716848 DOI: 10.1186/s12864-019-6049-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 04/30/2019] [Accepted: 08/20/2019] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND The brown plant hopper (BPH), Nilaparvata lugens, is one of the major pests of rice (Oryza sativa). Plant defenses against insect herbivores have been extensively studied, but our understanding of insect responses to host plants' resistance mechanisms is still limited. The purpose of this study is to characterize transcripts of BPH and reveal the responses of BPH insects to resistant rice at the transcriptional level using two advanced molecular techniques: next-generation sequencing (NGS) and single-molecule, real-time (SMRT) sequencing. RESULTS The current study obtained 24,891 collapsed isoforms of full-length transcripts, and 20,662 were mapped to known annotated genes, including 17,175 novel transcripts. The current study also identified 915 fusion genes, 1794 novel genes, 2435 long non-coding RNAs (lncRNAs), and 20,356 alternative splicing events. Moreover, analysis of differentially expressed genes (DEGs) revealed that genes involved in metabolic and cell proliferation processes were significantly enriched in up-regulated and down-regulated sets, respectively, in BPH fed on resistant rice relative to BPH fed on susceptible wild type rice. Furthermore, the FoxO signaling pathway was involved and genes related to BPH starvation response (Nlbmm), apoptosis and autophagy (caspase 8, ATG13, BNIP3 and IAP), active oxygen elimination (catalase, MSR, ferritin) and detoxification (GST, CarE) were up-regulated in BPH responses to resistant rice. CONCLUSIONS The current study provides the first demonstration of the full diversity and complexity of the BPH transcriptome, and indicates that BPH responses to rice resistance might be related to starvation stress responses, nutrient transformation, oxidative decomposition, and detoxification. These findings will facilitate further exploration of the molecular mechanisms of interaction between BPH insects and host rice.
Affiliation(s)
- Jing Zhang
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Wei Guan
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Chaomei Huang
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Yinxia Hu
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Yu Chen
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Jianping Guo
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Cong Zhou
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Rongzhi Chen
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Bo Du
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Lili Zhu
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Danax Huanhan
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| | - Guangcun He
- State Key Laboratory of Hybrid Rice, College of Life Sciences, Wuhan University, Wuhan, China
| |
|
362
|
Wang D, Chen X, Zhang X, Li J, Yi Y, Bian C, Shi Q, Lin H, Li S, Zhang Y, You X. Whole Genome Sequencing of the Giant Grouper (Epinephelus lanceolatus) and High-Throughput Screening of Putative Antimicrobial Peptide Genes. Mar Drugs 2019; 17:E503. [PMID: 31466296 PMCID: PMC6780625 DOI: 10.3390/md17090503] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/30/2019] [Revised: 08/20/2019] [Accepted: 08/26/2019] [Indexed: 12/25/2022] Open
Abstract
Giant groupers, the largest grouper type in the world, are of economic importance in marine aquaculture for their rapid growth. At the same time, bacterial and viral diseases have become the main threats to the grouper industry. Here, we report a high-quality genome of a giant grouper sequenced on the Illumina HiSeq X-Ten and PacBio Sequel platforms. A total of 254 putative antimicrobial peptide (AMP) genes were identified, which can be divided into 34 classes according to the annotation of the Antimicrobial Peptides Database (APD3). Their locations in pseudochromosomes were also determined. Thrombin-, lectin-, and scolopendin-derived putative AMPs were the three largest classes. In addition, expression of putative AMPs was measured using our transcriptome data. Two putative AMP genes (gapdh1 and gapdh2) were involved in glycolysis and had extremely high expression levels in giant grouper muscle. As it has been reported that AMPs inhibit the growth of a broad spectrum of microbes and participate in regulating innate and adaptive immune responses, the genome sequencing in this study provides a comprehensive catalog of putative AMPs of groupers, supporting antimicrobial research and aquaculture therapy. These genomic resources will be beneficial to further molecular breeding of this economically important fish.
Affiliation(s)
- Dengdong Wang
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory for Aquatic Economic Animals and Guangdong Provincial Engineering Technology Research Center for Healthy Breeding of Important Economic Fish, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, China
- Zhanjiang Bay Laboratory, Guangdong Research Center on Reproductive Control and Breeding Technology of Indigenous Valuable Fish Species, Fisheries College, Guangdong Ocean University, Zhanjiang 524088, China
| | - Xiyang Chen
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| | - Xinhui Zhang
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| | - Jia Li
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| | - Yunhai Yi
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| | - Chao Bian
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| | - Qiong Shi
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
- Laboratory of Aquatic Genomics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518060, China
| | - Haoran Lin
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory for Aquatic Economic Animals and Guangdong Provincial Engineering Technology Research Center for Healthy Breeding of Important Economic Fish, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, China
- Zhanjiang Bay Laboratory, Guangdong Research Center on Reproductive Control and Breeding Technology of Indigenous Valuable Fish Species, Fisheries College, Guangdong Ocean University, Zhanjiang 524088, China
| | - Shuisheng Li
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory for Aquatic Economic Animals and Guangdong Provincial Engineering Technology Research Center for Healthy Breeding of Important Economic Fish, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, China
- Zhanjiang Bay Laboratory, Guangdong Research Center on Reproductive Control and Breeding Technology of Indigenous Valuable Fish Species, Fisheries College, Guangdong Ocean University, Zhanjiang 524088, China
| | - Yong Zhang
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory for Aquatic Economic Animals and Guangdong Provincial Engineering Technology Research Center for Healthy Breeding of Important Economic Fish, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, China
- Zhanjiang Bay Laboratory, Guangdong Research Center on Reproductive Control and Breeding Technology of Indigenous Valuable Fish Species, Fisheries College, Guangdong Ocean University, Zhanjiang 524088, China
| | - Xinxin You
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
- Shenzhen Key Lab of Marine Genomics, Guangdong Provincial Key Lab of Molecular Breeding in Marine Economic Animals, BGI Academy of Marine Sciences, BGI Marine, BGI, Shenzhen 518083, China
| |
|
363
|
Feng S, Xu M, Liu F, Cui C, Zhou B. Reconstruction of the full-length transcriptome atlas using PacBio Iso-Seq provides insight into the alternative splicing in Gossypium australe. BMC PLANT BIOLOGY 2019; 19:365. [PMID: 31426739 PMCID: PMC6701088 DOI: 10.1186/s12870-019-1968-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 11/07/2018] [Accepted: 08/09/2019] [Indexed: 05/11/2023]
Abstract
BACKGROUND Gossypium australe F. Mueller (2n = 2x = 26, G2 genome) possesses valuable characteristics. For example, the delayed gland morphogenesis trait causes cottonseed protein and oil to be edible while retaining resistance to biotic stress. However, gene sequences and alternative splicing (AS) in G. australe remain poorly characterized, which hinders the exploration of species-specific biological morphogenesis. RESULTS Here, we report the first sequencing of the full-length transcriptome of the Australian wild cotton species, G. australe, using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq) from the pooled cDNA of ten tissues to identify transcript loci and splice isoforms. We reconstructed the G. australe full-length transcriptome and identified 25,246 genes, 86 pre-miRNAs and 1468 lncRNAs. Most genes (12,832, 50.83%) exhibited two or more isoforms, suggesting a high degree of transcriptome complexity in G. australe. A total of 31,448 AS events in five major types were found among the 9944 gene loci. Among these five major types, intron retention was the most frequent, accounting for 68.85% of AS events. A total of 29,718 polyadenylation sites were detected from 14,536 genes, 7900 of which have alternative polyadenylation (APA) sites. In addition, based on our AS event annotations, RNA-Seq short reads from germinating seeds showed that differential expression of these events occurred during seed germination. Ten randomly selected AS events were further confirmed by RT-PCR amplification in leaf and germinating seeds. CONCLUSIONS The reconstructed gene sequences and their AS in G. australe provide information for exploring beneficial characteristics in G. australe.
Affiliation(s)
- Shouli Feng
- State Key Laboratory of Crop Genetics & Germplasm Enhancement, MOE Hybrid Cotton R&D Engineering Research Center, Nanjing Agricultural University, Nanjing, 210095 Jiangsu People’s Republic of China
| | - Min Xu
- State Key Laboratory of Crop Genetics & Germplasm Enhancement, MOE Hybrid Cotton R&D Engineering Research Center, Nanjing Agricultural University, Nanjing, 210095 Jiangsu People’s Republic of China
| | - Fujie Liu
- State Key Laboratory of Crop Genetics & Germplasm Enhancement, MOE Hybrid Cotton R&D Engineering Research Center, Nanjing Agricultural University, Nanjing, 210095 Jiangsu People’s Republic of China
| | - Changjiang Cui
- State Key Laboratory of Crop Genetics & Germplasm Enhancement, MOE Hybrid Cotton R&D Engineering Research Center, Nanjing Agricultural University, Nanjing, 210095 Jiangsu People’s Republic of China
| | - Baoliang Zhou
- State Key Laboratory of Crop Genetics & Germplasm Enhancement, MOE Hybrid Cotton R&D Engineering Research Center, Nanjing Agricultural University, Nanjing, 210095 Jiangsu People’s Republic of China
| |
|
364
|
Hardwick SA, Joglekar A, Flicek P, Frankish A, Tilgner HU. Getting the Entire Message: Progress in Isoform Sequencing. Front Genet 2019; 10:709. [PMID: 31475029 PMCID: PMC6706457 DOI: 10.3389/fgene.2019.00709] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 01/31/2019] [Accepted: 07/04/2019] [Indexed: 01/31/2023] Open
Abstract
The advent of second-generation sequencing and its application to RNA sequencing have revolutionized the field of genomics by allowing quantification of gene expression, as well as the definition of transcription start/end sites, exons, splice sites and RNA editing sites. However, due to the sequencing of fragments of cDNAs, these methods have not given a reliable picture of complete RNA isoforms. Third-generation sequencing has filled this gap and allows end-to-end sequencing of entire RNA/cDNA molecules. This approach to transcriptomics has been a "niche" technology for a couple of years but now is becoming mainstream with many different applications. Here, we review the background and progress made to date in this rapidly growing field. We start by reviewing the progressive realization that alternative splicing is omnipresent. We then focus on long-noncoding RNA isoforms and the distinct combination patterns of exons in noncoding and coding genes. We consider the implications of the recent technologies of direct RNA sequencing and single-cell isoform RNA sequencing. Finally, we discuss the parameters that define the success of long-read RNA sequencing experiments and strategies commonly used to make the most of such data.
Affiliation(s)
- Simon A. Hardwick
- Brain and Mind Research Institute, Weill Cornell Medicine, NY, United States
- Garvan Institute of Medical Research, Sydney, NSW, Australia
| | - Anoushka Joglekar
- Brain and Mind Research Institute, Weill Cornell Medicine, NY, United States
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom
| | - Hagen U. Tilgner
- Brain and Mind Research Institute, Weill Cornell Medicine, NY, United States
| |
|
365
|
Wang J, Deng Y, Zhou Y, Liu D, Yu H, Zhou Y, Lv J, Ou L, Li X, Ma Y, Dai X, Liu F, Zou X, Ouyang B, Li F. Full-length mRNA sequencing and gene expression profiling reveal broad involvement of natural antisense transcript gene pairs in pepper development and response to stresses. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 99:763-783. [PMID: 31009127 DOI: 10.1111/tpj.14351] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/16/2019] [Revised: 03/18/2019] [Accepted: 03/27/2019] [Indexed: 06/09/2023]
Abstract
Pepper is an important vegetable with great economic value and unique biological features. In the past few years, significant progress has been made toward understanding the large and complex pepper genome; however, pepper functional genomics has not been well studied. To better understand pepper gene structure and gene regulation, we conducted full-length mRNA sequencing by PacBio sequencing and obtained 57 862 high-quality full-length mRNA sequences derived from 18 362 previously annotated and 5769 newly detected genes. New gene models were built that combined the full-length mRNA sequences and corrected approximately 500 fragmented gene models from previous annotations. Based on the full-length mRNA, we identified 4114 and 5880 pepper genes forming natural antisense transcript (NAT) genes in-cis and in-trans, respectively. Most of these genes accumulate small RNAs in their overlapping regions. By analyzing these NAT gene expression patterns in our transcriptome data, we identified many NAT pairs responsive to a variety of biological processes in pepper. Pepper formate dehydrogenase 1 (FDH1), which is required for R-gene-mediated disease resistance, may be regulated by nat-siRNAs and participate in a positive feedback loop in salicylic acid biosynthesis during resistance responses. Several cis-NAT pairs and subgroups of trans-NAT genes were responsive to pepper pericarp and placenta development, which may play roles in capsanthin and capsaicin biosynthesis. Using a comparative genomics approach, the evolutionary mechanisms of cis-NATs were investigated, and we found that an increase in intergenic sequences accounted for the loss of most cis-NATs, while transposon insertion contributed to the formation of most new cis-NATs. OPEN RESEARCH BADGES: This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data are available at http://bigd.big.ac.cn/gsa under accession number CRA001412.
Affiliation(s)
- Jubin Wang
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Yingtian Deng
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Yingjia Zhou
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Dan Liu
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Huiyang Yu
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Yuhong Zhou
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Junheng Lv
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Lijun Ou
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Xuefeng Li
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Yanqing Ma
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Xiongze Dai
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Feng Liu
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Xuexiao Zou
- Hunan Institute of Vegetable Research, Academy of Agricultural Sciences of Hunan Province, Changsha, HN, China
| | - Bo Ouyang
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| | - Feng Li
- Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, HB, China
| |
|
366
|
Kaposi Sarcoma-Associated Herpesvirus Glycoprotein H Is Indispensable for Infection of Epithelial, Endothelial, and Fibroblast Cell Types. J Virol 2019; 93:JVI.00630-19. [PMID: 31142670 DOI: 10.1128/jvi.00630-19] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/16/2019] [Accepted: 05/15/2019] [Indexed: 02/07/2023] Open
Abstract
Kaposi sarcoma-associated herpesvirus (KSHV) is an emerging pathogen and is the causative infectious agent of Kaposi sarcoma and two malignancies of B cell origin. To date, there is no licensed KSHV vaccine. Development of an effective vaccine against KSHV continues to be limited by a poor understanding of how the virus initiates acute primary infection in vivo in diverse human cell types. The role of glycoprotein H (gH) in herpesvirus entry mechanisms remains largely unresolved. To characterize the requirement for KSHV gH in the viral life cycle and in determination of cell tropism, we generated and characterized a mutant KSHV in which expression of gH was abrogated. Using a bacterial artificial chromosome containing a complete recombinant KSHV genome and recombinant DNA technology, we inserted stop codons into the gH coding region. We used electron microscopy to reveal that the gH-null mutant virus assembled and exited from cells normally, compared to wild-type virus. Using purified virions, we assessed infectivity of the gH-null mutant in diverse mammalian cell types in vitro. Unlike wild-type virus or a gH-containing revertant, the gH-null mutant was unable to infect any of the epithelial, endothelial, or fibroblast cell types tested. However, its ability to infect B cells was equivocal and remains to be investigated in vivo due to generally poor infectivity in vitro. Together, these results suggest that gH is critical for KSHV infection of highly permissive cell types, including epithelial, endothelial, and fibroblast cells. IMPORTANCE All homologues of herpesvirus gH studied to date have been implicated in playing an essential role in viral infection of diverse permissive cell types. However, the role of gH in the mechanism of KSHV infection remains largely unresolved. In this study, we generated a gH-null mutant KSHV and provided evidence that deficiency of gH expression did not affect viral particle assembly or egress. Using the gH-null mutant, we showed that gH was indispensable for KSHV infection of epithelial, endothelial, and fibroblast cells in vitro. This suggests that gH is an important target for the development of a KSHV prophylactic vaccine to prevent initial viral infection.
|
367
|
Mukherjee S, Cai Z, Mukherjee A, Longkumer I, Mech M, Vupru K, Khate K, Rajkhowa C, Mitra A, Guldbrandtsen B, Lund MS, Sahana G. Whole genome sequence and de novo assembly revealed genomic architecture of Indian Mithun (Bos frontalis). BMC Genomics 2019; 20:617. [PMID: 31357931 PMCID: PMC6664528 DOI: 10.1186/s12864-019-5980-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/18/2018] [Accepted: 07/16/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Mithun (Bos frontalis), also called gayal, is an endangered bovine species of the tribe Bovini with a 2n = 58, XX chromosome complement, reared in the tropical rain forest regions of India, China, Myanmar, Bhutan and Bangladesh. However, the origin of this species is still disputed and information on its genomic architecture is scanty so far. We expect that the availability of its whole-genome sequence data and assembly will help to resolve this problem and to generate much information, including the phylogenetic status of mithun. Recently, the first genome assembly of gayal, mithun of Chinese origin, was published. However, an improved reference genome assembly would still aid in understanding genetic variation in mithun populations reared under diverse geographical locations and in building a superior consensus assembly. We, therefore, performed deep sequencing of the genome of an adult female mithun from India, assembled and annotated its genome and performed extensive bioinformatic analyses to produce a superior de novo genome assembly of mithun. RESULTS We generated ≈300 Gigabyte (Gb) raw reads from whole-genome deep sequencing platforms and assembled the sequence data using a hybrid assembly strategy to create a high quality de novo assembly of mithun with 96% recovered as per BUSCO analysis. The final genome assembly has a total length of 3.0 Gb and contains 5,015 scaffolds with an N50 value of 1 Mb. Repeat sequences constitute around 43.66% of the assembly. The genomic alignments between mithun and cattle showed that their genomes, as expected, are highly conserved. Gene annotation identified 28,044 protein-coding genes present in the mithun genome. The gene orthologous groups of mithun showed a high degree of similarity in comparison with other species, while fewer mithun-specific coding sequences were found compared to those in cattle. CONCLUSION Here we present the first de novo draft genome assembly of Indian mithun, which has better coverage, is less fragmented and better annotated, and constitutes a reasonably complete assembly compared to the previously published gayal genome. This comprehensive assembly unravels the genomic architecture of mithun to a great extent and will provide a reference genome assembly to the research community to elucidate the evolutionary history of mithun across its distinct geographical locations.
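The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly. A minimal Python sketch of the usual definition follows; the function name `n50` and the toy lengths are illustrative assumptions, not data or code from the mithun study.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not real scaffold lengths):
# total = 300, and 100 + 80 = 180 reaches half of the total.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # prints 80
```

N50 depends only on the length distribution, so two assemblies with the same N50 can still differ in correctness; it is normally read alongside a completeness measure such as the BUSCO figure reported in the abstract.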
Affiliation(s)
- Sabyasachi Mukherjee
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Zexi Cai
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Anupama Mukherjee
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
- Present address: Dairy Cattle Breeding Division, ICAR-National Dairy Research Institute, Karnal, Haryana 132001 India
| | - Imsusosang Longkumer
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Moonmoon Mech
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Kezhavituo Vupru
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Kobu Khate
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Chandan Rajkhowa
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Abhijit Mitra
- Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland 797106 India
| | - Bernt Guldbrandtsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Goutam Sahana
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| |
|
368
|
Panthee S, Paudel A, Blom J, Hamamoto H, Sekimizu K. Complete Genome Sequence of Weissella hellenica 0916-4-2 and Its Comparative Genomic Analysis. Front Microbiol 2019; 10:1619. [PMID: 31396169 PMCID: PMC6667553 DOI: 10.3389/fmicb.2019.01619] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 04/09/2019] [Accepted: 07/01/2019] [Indexed: 12/21/2022] Open
Abstract
Weissella genus from Leuconostocaceae family forms a group of Gram-positive lactic acid bacteria (LAB) that mostly reside in fermented foods and some have been isolated from the environment and vertebrates including humans. Currently there are 23 recognized species, 16 complete and 37 draft genome assemblies for this genus. Weissella hellenica has been found in various sources and is characterized by their probiotic and bacteriocinogenic properties. Despite its widespread importance, little attention has been paid to genomic characterization of this species with the availability of draft assembly of two species in the public database so far. In this manuscript, we identified W. hellenica 0916-4-2 from fermented kimchi and completed its genome sequence. Comparative genomic analysis identified 88 core genes that had interspecies mean amino acid identity of more than 65%. Whole genome phylogenetic analysis showed that three W. hellenica strains clustered together and the strain 0916-4-2 was close to strain WiKim14. In silico analysis for the secondary metabolites biosynthetic gene cluster showed that Weissella are far less producers of secondary metabolites compared to other members of Leuconostocaceae. The availability of the complete genome of W. hellenica 0916-4-2 will facilitate further comparative genomic analysis of Weissella species, including studies of its biotechnological potential and improving the nutritional value of various food products.
Affiliation(s)
- Suresh Panthee
- Institute of Medical Mycology, Teikyo University, Hachioji, Japan
| | - Atmika Paudel
- Institute of Medical Mycology, Teikyo University, Hachioji, Japan
| | - Jochen Blom
- Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, Germany
| | - Hiroshi Hamamoto
- Institute of Medical Mycology, Teikyo University, Hachioji, Japan
| | - Kazuhisa Sekimizu
- Institute of Medical Mycology, Teikyo University, Hachioji, Japan; Genome Pharmaceuticals Institute, Bunkyōku, Japan
| |
|
369
|
Gao Y, Liu B, Wang Y, Xing Y. TideHunter: efficient and sensitive tandem repeat detection from noisy long-reads using seed-and-chain. Bioinformatics 2019; 35:i200-i207. [PMID: 31510677 PMCID: PMC6612900 DOI: 10.1093/bioinformatics/btz376] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencing technologies can produce long-reads up to tens of kilobases, but with high error rates. In order to reduce sequencing error, Rolling Circle Amplification (RCA) has been used to improve library preparation by amplifying circularized template molecules. Linear products of the RCA contain multiple tandem copies of the template molecule. By integrating additional in silico processing steps, these tandem sequences can be collapsed into a consensus sequence with a higher accuracy than the original raw reads. Existing pipelines that use alignment-based methods to discover the tandem repeat patterns in long-reads are either inefficient or lack sensitivity. RESULTS We present a novel tandem repeat detection and consensus calling tool, TideHunter, to efficiently discover tandem repeat patterns and generate high-quality consensus sequences from amplified tandemly repeated long-read sequencing data. TideHunter works with noisy long-reads (PacBio and ONT) at error rates of up to 20% and places no limit on the maximal repeat pattern size. We benchmarked TideHunter using simulated and real datasets with varying error rates and repeat pattern sizes. TideHunter is tens of times faster than state-of-the-art methods and has a higher sensitivity and accuracy. AVAILABILITY AND IMPLEMENTATION TideHunter is written in C; it is open source and available at https://github.com/yangao07/TideHunter.
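As a rough illustration of why tandem copies are detectable from seeds alone, the sketch below estimates a repeat period from the distances between consecutive occurrences of the same k-mer. It is a simplified stand-in written for this listing, not TideHunter's seed-and-chain algorithm; the function name `estimate_period` and the default `k` are assumptions.

```python
from collections import Counter
from typing import Optional


def estimate_period(read: str, k: int = 5) -> Optional[int]:
    """Guess the tandem repeat period of a read (illustrative only).

    For every k-mer, record the distance to its previous occurrence.
    In a tandemly repeated read, most such distances equal the repeat
    unit length, so the most frequent distance is returned.
    """
    last_seen = {}          # k-mer -> position of its latest occurrence
    distances = Counter()   # distance -> how often it was observed
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    if not distances:
        return None         # no k-mer occurred twice
    return distances.most_common(1)[0][0]


if __name__ == "__main__":
    # Five error-free copies of an 8 bp unit (hypothetical toy input).
    print(estimate_period("ACGTTGCA" * 5))
```

A real tool also has to chain seeds, tolerate indels, and build a consensus from the copies; this sketch covers only the period estimate.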
Affiliation(s)
- Yan Gao
- Department of Computer Science and Technology, Center for Bioinformatics Harbin Institute of Technology, Harbin, Heilongjiang, China
- Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Bo Liu
- Department of Computer Science and Technology, Center for Bioinformatics Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Yadong Wang
- Department of Computer Science and Technology, Center for Bioinformatics Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Yi Xing
- Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
| |
|
370
|
Borgers K, Ou JY, Zheng PX, Tiels P, Van Hecke A, Plets E, Michielsen G, Festjens N, Callewaert N, Lin YC. Reference genome and comparative genome analysis for the WHO reference strain for Mycobacterium bovis BCG Danish, the present tuberculosis vaccine. BMC Genomics 2019; 20:561. [PMID: 31286858 PMCID: PMC6615170 DOI: 10.1186/s12864-019-5909-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/25/2019] [Accepted: 06/17/2019] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND Mycobacterium bovis bacillus Calmette-Guérin (M. bovis BCG) is the only vaccine available against tuberculosis (TB). In an effort to standardize vaccine production, three substrains, i.e. BCG Danish 1331, Tokyo 172-1 and Russia BCG-1, were established as the WHO reference strains. Reference genomes exist for both BCG Tokyo 172-1 and Russia BCG-1, but not for BCG Danish. In this study, we set out to determine the completely assembled genome sequence for BCG Danish and to establish a workflow for genome characterization of engineering-derived vaccine candidate strains. RESULTS By combining second-generation (Illumina) and third-generation (PacBio) sequencing in an integrated genome analysis workflow for BCG, we could construct the completely assembled genome sequence of BCG Danish 1331 (07/270) (and of an engineered derivative that is studied as an improved vaccine candidate, a SapM KO), including the resolution of the analytically challenging long duplication regions. We report the presence of a DU1-like duplication in BCG Danish 1331, while this tandem duplication was previously thought to be exclusively restricted to BCG Pasteur. Furthermore, comparative genome analyses of publicly available data for BCG substrains showed the absence of a DU1 in certain BCG Pasteur substrains and the presence of a DU1-like duplication in some BCG China substrains. By integrating publicly available data, we provide an update to the genome features of the commonly used BCG strains. CONCLUSIONS We demonstrate how this analysis workflow enables the resolution of genome duplications and of the genome of engineered derivatives of the BCG Danish vaccine strain. The BCG Danish WHO reference genome will serve as a reference for future engineered strains and the established workflow can be used to enhance BCG vaccine standardization.
Affiliation(s)
- Katlyn Borgers
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Jheng-Yang Ou
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan, 74145 Taiwan
- Agricultural Biotechnology Research Center, Academia Sinica, Tainan, 74145 Taiwan
| | - Po-Xing Zheng
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan, 74145 Taiwan
- Agricultural Biotechnology Research Center, Academia Sinica, Tainan, 74145 Taiwan
| | - Petra Tiels
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Annelies Van Hecke
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Evelyn Plets
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Gitte Michielsen
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Nele Festjens
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Nico Callewaert
- VIB-UGhent Center for Medical Biotechnology, Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
- Department of Biochemistry and Microbiology, Ghent University; Technologiepark-Zwijnaarde 71, 9052 Ghent, Belgium
| | - Yao-Cheng Lin
- Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan, 74145 Taiwan
- Agricultural Biotechnology Research Center, Academia Sinica, Tainan, 74145 Taiwan
| |
|
371
|
Wan Y, Liu X, Zheng D, Wang Y, Chen H, Zhao X, Liang G, Yu D, Gan L. Systematic identification of intergenic long-noncoding RNAs in mouse retinas using full-length isoform sequencing. BMC Genomics 2019; 20:559. [PMID: 31286854 PMCID: PMC6615288 DOI: 10.1186/s12864-019-5903-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/17/2019] [Accepted: 06/12/2019] [Indexed: 02/06/2023] Open
Abstract
Background A large number of long noncoding RNAs (lncRNAs) have been identified in the mouse genome, and increasing evidence in the last decades has revealed their crucial roles in diverse biological processes. Nevertheless, the biological roles of lncRNAs in the mouse retina remain largely unknown due to the lack of a comprehensive annotation of lncRNAs expressed in the retina. Results In this study, we applied a long-read sequencing strategy to unravel the transcriptomes of developing mouse retinas and identified a total of 940 intergenic lncRNAs (lincRNAs) in embryonic and neonatal retinas, about 13% of which were transcribed from unannotated gene loci. Subsequent analysis revealed that the functions of lincRNAs expressed in mouse retinas were closely related to the physiological roles of this tissue, including 90 lincRNAs that were differentially expressed after the functional loss of key regulators of retinal ganglion cell (RGC) differentiation. In situ hybridization results demonstrated the enrichment of three lincRNAs adjacent to class IV POU-homeobox genes (linc-3a, linc-3b and linc-3c) in the ganglion cell layer and indicated that they were potentially RGC-specific. Conclusions In summary, this study systematically annotated the lincRNAs expressed in embryonic and neonatal mouse retinas and implied their crucial regulatory roles in retinal development such as RGC differentiation. Electronic supplementary material The online version of this article (10.1186/s12864-019-5903-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ying Wan
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory of Organ Development and Regeneration, Hangzhou Normal University, Hangzhou, China
| | - Xiaoyang Liu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory of Organ Development and Regeneration, Hangzhou Normal University, Hangzhou, China
| | | | - Yuying Wang
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory of Organ Development and Regeneration, Hangzhou Normal University, Hangzhou, China
| | - Huan Chen
- Key Laboratory of microbiological technology and Bioinformatics in Zhejiang Province, Hangzhou, China
| | - Xiaofeng Zhao
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory of Organ Development and Regeneration, Hangzhou Normal University, Hangzhou, China
| | - Guoqing Liang
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China
| | - Dongliang Yu
- College of Life and Environmental Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory of Organ Development and Regeneration, Hangzhou Normal University, Hangzhou, China
| | - Lin Gan
- Department of Ophthalmology and Flaum Eye Institute, University of Rochester, Rochester, NY, 14642, USA.
| |
|
372
|
Jiang F, Zhang J, Liu Q, Liu X, Wang H, He J, Kang L. Long-read direct RNA sequencing by 5'-Cap capturing reveals the impact of Piwi on the widespread exonization of transposable elements in locusts. RNA Biol 2019; 16:950-959. [PMID: 30982421 PMCID: PMC6546357 DOI: 10.1080/15476286.2019.1602437] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 02/14/2019] [Revised: 03/25/2019] [Accepted: 03/26/2019] [Indexed: 12/20/2022] Open
Abstract
The large genome of the migratory locust (Locusta migratoria) has accumulated a massive number of transposable elements (TEs), which show intrinsic transcriptional activities. Exonized TEs hamper the ability to precisely determine full-length RNA transcript sequences because they produce numerous highly similar fragments that are difficult to resolve using short-read sequencing technology. Here, we applied a 5'-Cap capturing method using Nanopore long-read direct RNA sequencing to characterize full-length transcripts in their native RNA form and to analyze the TE exonization pattern in the locust transcriptome. Our results revealed the widespread establishment of TE exonization and a substantial contribution of TEs to RNA splicing in the locust transcriptome. The results of the transcriptomic spectrum influenced by Piwi expression indicated that TE-derived sequences were the main targets of Piwi-mediated repression. Furthermore, our study showed that Piwi expression regulates the length of RNA transcripts containing TE-derived sequences, creating an alternative UTR usage. Overall, our results reveal the transcriptomic characteristics of TE exonization in species characterized by large and repetitive genomes.
Affiliation(s)
- Feng Jiang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Jie Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Qing Liu
- Sino-Danish College, University of Chinese Academy of Sciences, Beijing, China
| | - Xiang Liu
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Huimin Wang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Jing He
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Le Kang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| |
|
373
|
Boldogkői Z, Moldován N, Balázs Z, Snyder M, Tombácz D. Long-Read Sequencing – A Powerful Tool in Viral Transcriptome Research. Trends Microbiol 2019; 27:578-592. [DOI: 10.1016/j.tim.2019.01.010] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 10/25/2018] [Revised: 01/21/2019] [Accepted: 01/30/2019] [Indexed: 12/16/2022]
|
374
|
Firtina C, Bar-Joseph Z, Alkan C, Cicek AE. Hercules: a profile HMM-based hybrid error correction algorithm for long reads. Nucleic Acids Res 2018; 46:e125. [PMID: 30124947 PMCID: PMC6265270 DOI: 10.1093/nar/gky724] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/24/2018] [Accepted: 08/07/2018] [Indexed: 01/15/2023] Open
Abstract
Choosing whether to use second or third generation sequencing platforms can lead to trade-offs between accuracy and read length. Several types of studies require long and accurate reads. In such cases researchers often combine both technologies and the erroneous long reads are corrected using the short reads. Current approaches rely on various graph or alignment based techniques and do not take the error profile of the underlying technology into account. Efficient machine learning algorithms that address these shortcomings have the potential to achieve more accurate integration of these two technologies. We propose Hercules, the first machine learning-based long read error correction algorithm. Hercules models every long read as a profile Hidden Markov Model with respect to the underlying platform’s error profile. The algorithm learns a posterior transition/emission probability distribution for each long read to correct errors in these reads. We show on two DNA-seq BAC clones (CH17-157L1 and CH17-227A2) that Hercules-corrected reads have the highest mapping rate among all competing algorithms and have the highest accuracy when the breadth of coverage is high. On a large human CHM1 cell line WGS data set, Hercules is one of the few scalable algorithms; and among those, it achieves the highest accuracy.
Affiliation(s)
- Can Firtina
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - A Ercument Cicek
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey; Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
|
375
|
Abstract
Citrobacter rodentium strain DBS100 causes an infection of the intestines in mice. It provides an important model for human gastrointestinal pathogens, such as enteropathogenic and enterohemorrhagic Escherichia coli, which cause life-threatening infections. To identify the genetic determinants that are common across the enteropathogenic bacteria, we sequenced the DBS100 genome.
|
376
|
Chen Z, Lu X, Xuan Y, Tang F, Wang J, Shi D, Fu S, Ren J. Transcriptome analysis based on a combination of sequencing platforms provides insights into leaf pigmentation in Acer rubrum. BMC Plant Biol 2019; 19:240. [PMID: 31170934 PMCID: PMC6555730 DOI: 10.1186/s12870-019-1850-7] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/26/2018] [Accepted: 05/28/2019] [Indexed: 05/05/2023]
Abstract
BACKGROUND Red maple (Acer rubrum L.) is one of the most common and widespread trees with colorful leaves. We found a mutant with red, yellow, and green leaf phenotypes in different branches, which provided ideal materials with the same genetic relationship, and little interference from the environment, for the study of complex metabolic networks that underlie variations in the coloration of leaves. We applied a combination of NGS and SMRT sequencing to various red maple tissues. RESULTS A total of 125,448 unigenes were obtained, of which 46 and 69 were thought to be related to the synthesis of anthocyanins and carotenoids, respectively. In addition, 88 unigenes were presumed to be involved in the chlorophyll metabolic pathway. Based on a comprehensive analysis of the pigment gene expression network, the mechanisms of leaf color were investigated. The massive accumulation of Cy led to its higher content and proportion than other pigments, which caused the redness of leaves. Yellow coloration was the result of the complete decomposition of chlorophyll pigments, the unmasking of carotenoid pigments, and a slight accumulation of Cy. CONCLUSIONS This study provides a systematic analysis of color variations in the red maple. Moreover, mass sequence data obtained by deep sequencing will provide references for the controlled breeding of red maple.
Affiliation(s)
- Zhu Chen
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| | - Xiaoyu Lu
- College of Forestry and Landscape Architecture, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Yun Xuan
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| | - Fei Tang
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| | - Jingjing Wang
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| | - Dan Shi
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| | - Songling Fu
- College of Forestry and Landscape Architecture, Anhui Agricultural University, Hefei, 230036 Anhui China
| | - Jie Ren
- Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, Hefei, 230031 China
| |
|
377
|
|
378
|
Heydari M, Miclotte G, Van de Peer Y, Fostier J. Illumina error correction near highly repetitive DNA regions improves de novo genome assembly. BMC Bioinformatics 2019; 20:298. [PMID: 31159722 PMCID: PMC6545690 DOI: 10.1186/s12859-019-2906-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/15/2019] [Accepted: 05/17/2019] [Indexed: 11/10/2022] Open
Abstract
Background Several standalone error correction tools have been proposed to correct sequencing errors in Illumina data in order to facilitate de novo genome assembly. However, in a recent survey, we showed that state-of-the-art assemblers often did not benefit from this pre-correction step. We found that many error correction tools introduce new errors in reads that overlap highly repetitive DNA regions such as low-complexity patterns or short homopolymers, ultimately leading to a more fragmented assembly. Results We propose BrownieCorrector, an error correction tool for Illumina sequencing data that focuses on the correction of only those reads that overlap short DNA patterns that are highly repetitive in the genome. BrownieCorrector extracts all reads that contain such a pattern and clusters them into different groups using a community detection algorithm that takes into account both the sequence similarity between overlapping reads and their respective paired-end reads. Each cluster holds reads that originate from the same genomic region and hence each cluster can be corrected individually, thus providing a consistent correction for all reads within that cluster. Conclusions BrownieCorrector is benchmarked using six real Illumina datasets for different eukaryotic genomes. The prior use of BrownieCorrector improves assembly results over the use of uncorrected reads in all cases. In comparison with other error correction tools, BrownieCorrector leads to the best assembly results in most cases even though less than 2% of the reads within a dataset are corrected. Additionally, we investigate the impact of error correction on hybrid assembly where the corrected Illumina reads are supplemented with PacBio data. Our results confirm that BrownieCorrector improves the quality of hybrid genome assembly as well. BrownieCorrector is written in standard C++11 and released under GPL license. BrownieCorrector relies on multithreading to take advantage of multi-core/multi-CPU systems. The source code is available at https://github.com/biointec/browniecorrector. Electronic supplementary material The online version of this article (10.1186/s12859-019-2906-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mahdi Heydari
- Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium; Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
| | - Giles Miclotte
- Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium; Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
| | - Yves Van de Peer
- Bioinformatics Institute Ghent, Ghent, B-9052, Belgium; Center for Plant Systems Biology, VIB, Ghent, B-9052, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, B-9052, Belgium; Department of Genetics, Genome Research Institute, University of Pretoria, Pretoria, South Africa
| | - Jan Fostier
- Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium; Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
| |
|
379
|
Bi Q, Zhao Y, Du W, Lu Y, Gui L, Zheng Z, Yu H, Cui Y, Liu Z, Cui T, Cui D, Liu X, Li Y, Fan S, Hu X, Fu G, Ding J, Ruan C, Wang L. Pseudomolecule-level assembly of the Chinese oil tree yellowhorn (Xanthoceras sorbifolium) genome. Gigascience 2019; 8:giz070. [PMID: 31241154 PMCID: PMC6593361 DOI: 10.1093/gigascience/giz070] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 09/21/2018] [Revised: 03/02/2019] [Accepted: 05/22/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Yellowhorn (Xanthoceras sorbifolium) is a species of the Sapindaceae family native to China and is an oil tree that can withstand cold and drought conditions. A pseudomolecule-level genome assembly for this species will not only contribute to understanding the evolution of its genes and chromosomes but also bring yellowhorn breeding into the genomic era. FINDINGS Here, we generated 15 pseudomolecules of yellowhorn chromosomes, on which 97.04% of scaffolds were anchored, using the combined Illumina HiSeq, Pacific Biosciences Sequel, and Hi-C technologies. The length of the final yellowhorn genome assembly was 504.2 Mb with a contig N50 size of 1.04 Mb and a scaffold N50 size of 32.17 Mb. Genome annotation revealed that 68.67% of the yellowhorn genome was composed of repetitive elements. Gene modelling predicted 24,672 protein-coding genes. By comparing orthologous genes, the divergence time of yellowhorn and its close sister species longan (Dimocarpus longan) was estimated at ∼33.07 million years ago. Gene cluster and chromosome synteny analysis demonstrated that the yellowhorn genome shared a conserved genome structure with its ancestor in some chromosomes. CONCLUSIONS This genome assembly represents a high-quality reference genome for yellowhorn. Integrated genome annotations provide a valuable dataset for genetic and molecular research in this species. We did not detect whole-genome duplication in the genome. The yellowhorn genome carries syntenic blocks from ancient chromosomes. These data sources will enable this genome to serve as an initial platform for breeding better yellowhorn cultivars.
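The contig and scaffold N50 values quoted above follow a standard definition: the length of the sequence at which the running total of lengths, sorted longest first, reaches half of the assembly size. A minimal sketch of that calculation, using made-up lengths rather than the yellowhorn assembly data (the function name `n50` is an assumption):

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running total
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths: total 20, and 6 + 5 = 11 first reaches half of it.
    print(n50([2, 3, 4, 5, 6]))  # 5
```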
Affiliation(s)
- Quanxin Bi
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Key Laboratory of Biotechnology and Bioresources Utilization, State Ethnic Affairs Commission & Ministry of Education, Dalian Minzu University, Dalian 116600, China
| | - Yang Zhao
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Wei Du
- Key Laboratory of Biotechnology and Bioresources Utilization, State Ethnic Affairs Commission & Ministry of Education, Dalian Minzu University, Dalian 116600, China
| | - Ying Lu
- National Demonstration Center for Experimental Fisheries Science Education, Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Ministry of Education) and International Research Center for Marine Biosciences (Ministry of Science and Technology), Shanghai Ocean University, Shanghai 201306, China
| | - Lang Gui
- National Demonstration Center for Experimental Fisheries Science Education, Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Ministry of Education) and International Research Center for Marine Biosciences (Ministry of Science and Technology), Shanghai Ocean University, Shanghai 201306, China
| | - Zhimin Zheng
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, China
- Key Laboratory of Saline-alkali Vegetation Ecology Restoration (SAVER), Ministry of Education, Alkali Soil Natural Environmental Science Center (ASNESC), Northeast Forestry University, Harbin 150040, China
| | - Haiyan Yu
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Beijing ABT Biotechnology Co., Ltd., Beijing 102200, China
| | - Yifan Cui
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Zhi Liu
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, China
- Key Laboratory of Saline-alkali Vegetation Ecology Restoration (SAVER), Ministry of Education, Alkali Soil Natural Environmental Science Center (ASNESC), Northeast Forestry University, Harbin 150040, China
| | - Tianpeng Cui
- Zhangwu Deya yellowhorn Professional Cooperatives, Zhangwu 123200, China
| | - Deshi Cui
- Zhangwu Deya yellowhorn Professional Cooperatives, Zhangwu 123200, China
| | - Xiaojuan Liu
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Yingchao Li
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Siqi Fan
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Xiaoyu Hu
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Guanghui Fu
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Jian Ding
- Key Laboratory of Biotechnology and Bioresources Utilization, State Ethnic Affairs Commission & Ministry of Education, Dalian Minzu University, Dalian 116600, China
| | - Chengjiang Ruan
- Key Laboratory of Biotechnology and Bioresources Utilization, State Ethnic Affairs Commission & Ministry of Education, Dalian Minzu University, Dalian 116600, China
| | - Libing Wang
- State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| |
|
380
|
Zhao Y, Zhang Z, Li M, Luo J, Chen F, Gong Y, Li Y, Wei Y, Su Y, Kong L. Transcriptomic profiles of 33 opium poppy samples in different tissues, growth phases, and cultivars. Sci Data 2019; 6:66. [PMID: 31110243 PMCID: PMC6527585 DOI: 10.1038/s41597-019-0082-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/07/2019] [Accepted: 04/12/2019] [Indexed: 11/16/2022] Open
Abstract
Opium poppy is one of the most important medicinal plants and remains the only commercial resource of morphinan-based painkillers. However, little is known about the regulatory mechanisms involved in benzylisoquinoline alkaloid (BIA) biosynthesis in opium poppy. Herein, the full-length transcriptome dataset of opium poppy was constructed for the first time, accompanied by Illumina transcriptome data from 33 samples of different tissues, growth phases and cultivars. The long-read sequencing produced 902,140 raw reads with 55,114 high-quality transcripts, and short-read sequencing produced 1,923,679,864 clean reads with an average Q30 rate of 93%. The high-quality transcripts were subsequently quantified using the short reads, and the expression of each unigene among different samples was calculated as reads per kilobase per million mapped reads (RPKM). These data provide a foundation for opium poppy transcriptomic analysis, which may aid in capturing splice variants and some non-coding RNAs involved in the regulation of BIA biosynthesis. The data can also be used for genome assembly and annotation, which will aid the identification of new transcripts.
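For readers unfamiliar with the RPKM unit mentioned above, the sketch below applies the standard definition: read count divided by transcript length in kilobases and by the total number of mapped reads in millions. The function name and the example numbers are illustrative assumptions and are not taken from the dataset.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 10^9 / (transcript length in bp * total mapped reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical unigene: 500 reads on a 2,000 bp transcript in a
    # library with 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Because the value is normalized by both transcript length and library size, it can be compared across unigenes and across the samples in a study.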
Affiliation(s)
- Yucheng Zhao
- Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, No. 24 Tongjiaxiang, Nanjing, 210009, China
| | - Zhaoping Zhang
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Mingzhi Li
- Genepioneer Biotechnologies Co. Ltd., No. 9 Weidi Road, Qixia District, Nanjing, 210014, China
| | - Jun Luo
- Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, No. 24 Tongjiaxiang, Nanjing, 210009, China
| | - Fang Chen
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Yongfu Gong
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Yanrong Li
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Yujie Wei
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Yujie Su
- China Agriculture Research System (CARS-21), No. 234 Xinzhen Road, Huangyang town, Liangzhou District, Wuwei, Gansu, 733006, China
| | - Lingyi Kong
- Jiangsu Key Laboratory of Bioactive Natural Product Research and State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, No. 24 Tongjiaxiang, Nanjing, 210009, China.
|
381
|
Elbers JP, Rogers MF, Perelman PL, Proskuryakova AA, Serdyukova NA, Johnson WE, Horin P, Corander J, Murphy D, Burger PA. Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary. Mol Ecol Resour 2019; 19:1015-1026. [PMID: 30972949 PMCID: PMC6618069 DOI: 10.1111/1755-0998.13020] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 08/16/2018] [Revised: 03/24/2019] [Accepted: 03/25/2019] [Indexed: 12/22/2022]
Abstract
Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate‐pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi‐C and Dovetail Genomics Chicago libraries and long‐read sequencing, such as Pacific Biosciences and Oxford Nanopore, help span and resolve repetitive regions and therefore improve genome assemblies. One important livestock species of arid regions that does not have a high‐quality contiguous reference genome is the dromedary (Camelus dromedarius). Draft genomes exist but are highly fragmented, and a high‐quality reference genome is needed to understand adaptation to desert environments and artificial selection during domestication. Dromedaries are among the last livestock species to have been domesticated, and together with wild and domestic Bactrian camels, they are the only representatives of the Camelini tribe, which highlights their evolutionary significance. Here we describe our efforts to improve the North African dromedary genome. We used Chicago and Hi‐C sequencing libraries from Dovetail Genomics to resolve the order of previously assembled contigs, producing almost chromosome‐level scaffolds. Remaining gaps were filled with Pacific Biosciences long reads, and then scaffolds were comparatively mapped to chromosomes. Long reads added 99.32 Mbp to the total length of the new assembly. Dovetail Chicago and Hi‐C libraries increased the longest scaffold over 12‐fold, from 9.71 Mbp to 124.99 Mbp and the scaffold N50 over 50‐fold, from 1.48 Mbp to 75.02 Mbp. We demonstrate that Illumina de novo assemblies can be substantially upgraded by combining chromosome conformation capture and long‐read sequencing.
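The scaffold N50 reported above can be illustrated with a short Python sketch. It uses the common definition of N50 (the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly length). The function name and the toy lengths are hypothetical; only the four Mbp figures in the fold-change lines come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with hypothetical scaffold lengths (arbitrary units).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))

# Fold changes implied by the figures quoted in the abstract (Mbp).
longest_scaffold_fold = 124.99 / 9.71
scaffold_n50_fold = 75.02 / 1.48
print(round(longest_scaffold_fold, 2), round(scaffold_n50_fold, 2))
```

The two ratios are consistent with the "over 12-fold" and "over 50-fold" improvements stated in the abstract.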
Affiliation(s)
- Jean P Elbers
- Department of Integrative Biology and Evolution, Research Institute of Wildlife Ecology, Vetmeduni Vienna, Vienna, Austria
| | - Mark F Rogers
- Intelligent Systems Laboratory, University of Bristol, Bristol, UK
| | - Polina L Perelman
- Institute of Molecular and Cellular Biology, SB RAS and Novosibirsk State University, Novosibirsk, Russia
| | - Anastasia A Proskuryakova
- Institute of Molecular and Cellular Biology, SB RAS and Novosibirsk State University, Novosibirsk, Russia
| | - Natalia A Serdyukova
- Institute of Molecular and Cellular Biology, SB RAS and Novosibirsk State University, Novosibirsk, Russia
| | - Warren E Johnson
- The Walter Reed Biosystematics Unit, Smithsonian Institution, Museum Support Center MRC-534, Suitland, Maryland
| | - Petr Horin
- Department of Animal Genetics, Faculty of Veterinary Medicine, Ceitec VFU, RG Animal Immunogenomics, University of Veterinary and Pharmaceutical Sciences, Brno, Czech Republic
| | - Jukka Corander
- Department of Biostatistics, University of Oslo, Oslo, Norway.,Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
| | - David Murphy
- Bristol Medical School: Translational Health Sciences, Molecular Neuroendocrinology Research Group, University of Bristol, Bristol, UK
| | - Pamela A Burger
- Department of Integrative Biology and Evolution, Research Institute of Wildlife Ecology, Vetmeduni Vienna, Vienna, Austria
|
382
|
Li S, Cha SW, Heffner K, Hizal DB, Bowen MA, Chaerkady R, Cole RN, Tejwani V, Kaushik P, Henry M, Meleady P, Sharfstein ST, Betenbaugh MJ, Bafna V, Lewis NE. Proteogenomic Annotation of Chinese Hamsters Reveals Extensive Novel Translation Events and Endogenous Retroviral Elements. J Proteome Res 2019; 18:2433-2445. [PMID: 31020842 DOI: 10.1021/acs.jproteome.8b00935] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Abstract
A high-quality genome annotation greatly facilitates successful cell line engineering. Standard draft genome annotation pipelines are based largely on de novo gene prediction, homology, and RNA-Seq data. However, draft annotations can suffer from incorrect predictions of translated sequence, inaccurate splice isoforms, and missing genes. Here, we generated a draft annotation for the newly assembled Chinese hamster genome and used RNA-Seq, proteomics, and Ribo-Seq to experimentally annotate the genome. We identified 3529 new proteins compared to the hamster RefSeq protein annotation and 2256 novel translational events (e.g., alternative splices, mutations, and novel splices). Finally, we used this pipeline to identify the source of translated retroviruses contaminating recombinant products from Chinese hamster ovary (CHO) cell lines, including 119 type-C retroviruses, thus enabling future efforts to eliminate retroviruses to reduce the costs incurred with retroviral particle clearance. In summary, the improved annotation provides a more accurate resource for CHO cell line engineering, by facilitating the interpretation of omics data, defining of cellular pathways, and engineering of complex phenotypes.
Affiliation(s)
| | | | | | - Deniz Baycin Hizal
- Antibody Discovery and Protein Engineering , AstraZeneca , Gaithersburg , Maryland , United States
| | - Michael A Bowen
- Antibody Discovery and Protein Engineering , AstraZeneca , Gaithersburg , Maryland , United States
| | - Raghothama Chaerkady
- Antibody Discovery and Protein Engineering , AstraZeneca , Gaithersburg , Maryland , United States
| | | | - Vijay Tejwani
- Colleges of Nanoscale Science and Engineering , SUNY Polytechnic Institute , Albany , New York 12203 , United States
| | - Prashant Kaushik
- National Institute for Cellular Biotechnology , Dublin City University , Dublin 9, Ireland
| | - Michael Henry
- National Institute for Cellular Biotechnology , Dublin City University , Dublin 9, Ireland
| | - Paula Meleady
- National Institute for Cellular Biotechnology , Dublin City University , Dublin 9, Ireland
| | - Susan T Sharfstein
- Colleges of Nanoscale Science and Engineering , SUNY Polytechnic Institute , Albany , New York 12203 , United States
|
383
|
Babarinde IA, Li Y, Hutchins AP. Computational Methods for Mapping, Assembly and Quantification for Coding and Non-coding Transcripts. Comput Struct Biotechnol J 2019; 17:628-637. [PMID: 31193391 PMCID: PMC6526290 DOI: 10.1016/j.csbj.2019.04.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 02/25/2019] [Revised: 04/24/2019] [Accepted: 04/29/2019] [Indexed: 12/17/2022] Open
Abstract
The measurement of gene expression has long provided significant insight into biological functions. The development of high-throughput short-read sequencing technology has revealed transcriptional complexity at an unprecedented scale, and informed almost all areas of biology. However, as researchers have sought to gather more insights from the data, these new technologies have also increased the computational analysis burden. In this review, we describe typical computational pipelines for RNA-Seq analysis and discuss their strengths and weaknesses for the assembly, quantification and analysis of coding and non-coding RNAs. We also discuss the assembly of transposable elements into transcripts, and the difficulty these repetitive elements pose. In summary, RNA-Seq is a powerful technology that is likely to remain a key asset in the biologist's toolkit.
Affiliation(s)
| | | | - Andrew P. Hutchins
- Department of Biology, Southern University of Science and Technology, 1088 Xueyuan Lu, Shenzhen, China
|
384
|
Abstract
In this genome report, we describe the sequencing and annotation of the genome of the wine grape Carménère (clone 02, VCR-702). Long considered extinct, this old French wine grape variety is now cultivated mostly in Chile where it was imported in the 1850s just before the European phylloxera epidemic. Genomic DNA was sequenced using Single Molecule Real Time technology and assembled with FALCON-Unzip, a diploid-aware assembly pipeline. To optimize the contiguity and completeness of the assembly, we tested about a thousand combinations of assembly parameters, sequencing coverage, error correction and repeat masking methods. The final scaffolds provide a complete and phased representation of the diploid genome of this wine grape. Comparison of the two haplotypes revealed numerous heterozygous variants, including loss-of-function ones, some of which in genes associated with polyphenol biosynthesis. Comparisons with other publicly available grape genomes and transcriptomes showed the impact of structural variation on gene content differences between Carménère and other wine grape cultivars. Among the putative cultivar-specific genes, we identified genes potentially involved in aroma production and stress responses. The genome assembly of Carménère expands the representation of the genomic variability in grapes and will enable studies that aim to understand its distinctive organoleptic and agronomical features and assess its still elusive extant genetic variability. A genome browser for Carménère, its annotation, and an associated blast tool are available at http://cantulab.github.io/data.
|
385
|
Peng X, Liu H, Chen P, Tang F, Hu Y, Wang F, Pi Z, Zhao M, Chen N, Chen H, Zhang X, Yan X, Liu M, Fu X, Zhao G, Yao P, Wang L, Dai H, Li X, Xiong W, Xu W, Zheng H, Yu H, Shen S. A Chromosome-Scale Genome Assembly of Paper Mulberry (Broussonetia papyrifera) Provides New Insights into Its Forage and Papermaking Usage. MOLECULAR PLANT 2019; 12:661-677. [PMID: 30822525 DOI: 10.1016/j.molp.2019.01.021] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Received: 09/16/2018] [Revised: 01/18/2019] [Accepted: 01/20/2019] [Indexed: 05/21/2023]
Abstract
Paper mulberry (Broussonetia papyrifera) is a well-known woody tree historically used for Cai Lun papermaking, one of the four great inventions of ancient China. More recently, Paper mulberry has also been used as forage to address the shortage of feedstuff because of its digestible crude fiber and high protein contents. In this study, we obtained a chromosome-scale genome assembly for Paper mulberry using integrated approaches, including Illumina and PacBio sequencing platform as well as Hi-C, optical, and genetic maps. The assembled Paper mulberry genome consists of 386.83 Mb, which is close to the estimated size, and 99.25% (383.93 Mb) of the assembly was assigned to 13 pseudochromosomes. Comparative genomic analysis revealed the expansion and contraction in the flavonoid and lignin biosynthetic gene families, respectively, accounting for the enhanced flavonoid and decreased lignin biosynthesis in Paper mulberry. Moreover, the increased ratio of syringyl-lignin to guaiacyl-lignin in Paper mulberry underscores its suitability for use in medicine, forage, papermaking, and barkcloth making. We also identified the root-associated microbiota of Paper mulberry and found that Pseudomonas and Rhizobia were enriched in its roots and may provide the source of nitrogen for its stems and leaves via symbiotic nitrogen fixation. Collectively, these results suggest that Paper mulberry might have undergone adaptive evolution and recruited nitrogen-fixing microbes to promote growth by enhancing flavonoid production and altering lignin monomer composition. Our study provides significant insights into genetic basis of the usefulness of Paper mulberry in papermaking and barkcloth making, and as forage. These insights will facilitate further domestication and selection as well as industrial utilization of Paper mulberry worldwide.
Affiliation(s)
- Xianjun Peng
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Hui Liu
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Peilin Chen
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Feng Tang
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Yanmin Hu
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Fenfen Wang
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Zhi Pi
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Meiling Zhao
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Naizhi Chen
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Hui Chen
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Xiaokang Zhang
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Xueqing Yan
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China
| | - Min Liu
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Xiaojun Fu
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Guofeng Zhao
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Pu Yao
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Lili Wang
- Biomarker Technologies Corporation, Beijing 101300, China
| | - He Dai
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Xuming Li
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Wei Xiong
- Quick Green Bio-Tec Co., Ltd., Dalian 116600, China
| | - Wencai Xu
- Beijing Jonathan Science and Technology Development Co., Ltd., Beijing 101314, China
| | - Hongkun Zheng
- Biomarker Technologies Corporation, Beijing 101300, China
| | - Haiyan Yu
- Biomarker Technologies Corporation, Beijing 101300, China.
| | - Shihua Shen
- Key Laboratory of Plant Resources, Institute of Botany, The Chinese Academy of Sciences, Beijing 100093, China; ChuangGou Science & Technology Co. Ltd., Beijing 100049, China.
|
386
|
Villamor DEV, Ho T, Al Rwahnih M, Martin RR, Tzanetakis IE. High Throughput Sequencing For Plant Virus Detection and Discovery. PHYTOPATHOLOGY 2019; 109:716-725. [PMID: 30801236 DOI: 10.1094/phyto-07-18-0257-rvw] [Citation(s) in RCA: 180] [Impact Index Per Article: 30.0] [Indexed: 05/05/2023]
Abstract
Over the last decade, virologists have discovered an unprecedented number of viruses using high throughput sequencing (HTS), which has advanced our knowledge of the diversity of viruses in nature, particularly by unraveling the virome of many agricultural crops. However, these new virus discoveries have often widened the gaps in our understanding of virus biology, foremost among them the actual role of a new virus in disease, if any. Yet, when used critically in etiological studies, HTS is a powerful tool to establish disease causality between the virus and its host. Conversely, with globalization, movement of plant material is increasingly common and often a point of dispute between countries. HTS could potentially resolve these issues given its capacity to detect and discover. Although many pipelines are available for plant virus discovery, all share a common backbone. A description of the process of plant virus detection and discovery from HTS data is presented, providing a summary of the different pipelines available to scientists for use in their research.
Affiliation(s)
- D E V Villamor
- 1 Department of Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville, AR 72701
| | - T Ho
- 1 Department of Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville, AR 72701
| | - M Al Rwahnih
- 2 Department of Plant Pathology, University of California, Davis 95616; and
| | - R R Martin
- 3 Horticulture Crops Research Unit, U.S. Department of Agriculture-Agricultural Research Service, Corvallis, OR 97330
| | - I E Tzanetakis
- 1 Department of Plant Pathology, Division of Agriculture, University of Arkansas System, Fayetteville, AR 72701
|
387
|
Wang B, Kumar V, Olson A, Ware D. Reviving the Transcriptome Studies: An Insight Into the Emergence of Single-Molecule Transcriptome Sequencing. Front Genet 2019; 10:384. [PMID: 31105749 PMCID: PMC6498185 DOI: 10.3389/fgene.2019.00384] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 11/30/2018] [Accepted: 04/09/2019] [Indexed: 12/23/2022] Open
Abstract
Advances in transcriptomics have provided an exceptional opportunity to study functional implications of the genetic variability. Technologies such as RNA-Seq have emerged as state-of-the-art techniques for transcriptome analysis that take advantage of high-throughput next-generation sequencing. However, similar to their predecessors, these approaches continue to impose major challenges on full-length transcript structure identification, primarily due to inherent limitations of read length. With the development of single-molecule sequencing (SMS) from PacBio, a growing number of studies on the transcriptome of different organisms have been reported. SMS has emerged as advantageous for comprehensive genome annotation including identification of novel genes/isoforms, long non-coding RNAs and fusion transcripts. This approach can be used across a broad spectrum of species to better interpret the coding information of the genome, and facilitate the biological function study. We provide an overview of SMS platform and its diverse applications in various biological studies, and our perspective on the challenges associated with the transcriptome studies.
Affiliation(s)
- Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States.,USDA-ARS Robert W. Holley Center for Agriculture and Health, Ithaca, NY, United States
|
388
|
Unveiling novel targets of paclitaxel resistance by single molecule long-read RNA sequencing in breast cancer. Sci Rep 2019; 9:6032. [PMID: 30988345 PMCID: PMC6465246 DOI: 10.1038/s41598-019-42184-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/06/2018] [Accepted: 03/19/2019] [Indexed: 12/31/2022] Open
Abstract
RNA sequencing has become one of the most common technologies for studying transcriptomes in cancer, but its read length limits its application to alternative splicing (AS) events and novel isoforms. First, we applied single-molecule long-read RNA sequencing (Iso-seq) and de novo assembly with short-read RNA sequencing (RNA-seq) to both the wild type (231-WT) and the paclitaxel-resistant type (231-PTX) of the human breast cancer cell line MDA-MB-231. Together, the two sequencing technologies provide both accurate transcript sequences and deep transcript coverage. We then combined short-read and long-read RNA-seq to analyze alternative splicing events and novel isoforms. Finally, we selected BAK1 as a candidate target to verify our analysis. Our results imply that improved characterization of cancer genomic function may require single-molecule long-read RNA sequencing to obtain a deeper and more precise view at the transcriptional level.
|
389
|
Bråte J, Fuss J, Mehrota S, Jakobsen KS, Klaveness D. Draft genome assembly and transcriptome sequencing of the golden algae Hydrurus foetidus (Chrysophyceae). F1000Res 2019; 8:401. [PMID: 31632652 PMCID: PMC6784874 DOI: 10.12688/f1000research.16734.3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 10/03/2019] [Indexed: 01/01/2023] Open
Abstract
Hydrurus foetidus is a freshwater chrysophyte alga. It thrives in cold rivers in polar and high alpine regions. It has several morphological traits reminiscent of single-celled eukaryotes, but can also form macroscopic thalli. Despite its ability to produce polyunsaturated fatty acids, its life under cold conditions and its variable morphology, very little is known about its genome and transcriptome. Here, we present an extensive set of next-generation sequencing data, including genomic short reads from Illumina sequencing and long reads from Nanopore sequencing, as well as full-length cDNAs from PacBio IsoSeq sequencing and a small RNA dataset (smaller than 200 bp) sequenced with Illumina. The genome sequences were combined to produce an assembly consisting of 5069 contigs, with a total assembly size of 171 Mb and a 77% BUSCO completeness. The new data generated here may contribute to a better understanding of the evolution and ecological roles of chrysophyte algae, and may help resolve branching patterns at a larger phylogenetic scale.
Affiliation(s)
- Jon Bråte
- Section for Genetics and Evolutionary Biology (EVOGENE), Department of Biosciences, University of Oslo, Oslo, 0316, Norway
| | - Janina Fuss
- Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, Kiel, 24118, Germany
| | - Shruti Mehrota
- Section for Genetics and Evolutionary Biology (EVOGENE), Department of Biosciences, University of Oslo, Oslo, 0316, Norway.,Section for Aquatic Biology and Toxicology (AQUA), Department of Biosciences, University of Oslo, Oslo, 0316, Norway
| | - Kjetill S Jakobsen
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, 0316, Norway
| | - Dag Klaveness
- Section for Aquatic Biology and Toxicology (AQUA), Department of Biosciences, University of Oslo, Oslo, 0316, Norway
|
390
|
Zhang X, Li G, Jiang H, Li L, Ma J, Li H, Chen J. Full-length transcriptome analysis of Litopenaeus vannamei reveals transcript variants involved in the innate immune system. FISH & SHELLFISH IMMUNOLOGY 2019; 87:346-359. [PMID: 30677515 DOI: 10.1016/j.fsi.2019.01.023] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/28/2018] [Revised: 01/09/2019] [Accepted: 01/13/2019] [Indexed: 06/09/2023]
Abstract
To better understand the immune system of shrimp, this study combined PacBio isoform sequencing (Iso-Seq) and Illumina paired-end short-read sequencing to discover full-length immune-related molecules of the Pacific white shrimp, Litopenaeus vannamei. A total of 72,648 nonredundant full-length transcripts (unigenes) with an average length of 2545 bp were generated from five main tissues: the hepatopancreas, cardiac stomach, heart, muscle, and pyloric stomach. These unigenes exhibited a high annotation rate (62,164, 85.57%) when compared against the NR, NT, Swiss-Prot, Pfam, GO, KEGG and COG databases. A total of 7544 putative long noncoding RNAs (lncRNAs) were detected, and 1164 nonredundant full-length transcripts (449 UniTransModels) participated in alternative splicing (AS) events. Importantly, a total of 5279 nonredundant full-length unigenes involved in the innate immune system were successfully identified, covering 9 immune-related processes, 19 immune-related pathways and 10 other immune-related systems. We also found a wide range of transcript variants, which increased the number and functional complexity of immune molecules, for example toll-like receptors (TLRs) and interferon regulatory factors (IRFs). A total of 480 differentially expressed genes (DEGs) showed significantly higher or tissue-specific expression in the hepatopancreas compared with the other four tested tissues (FDR <0.05). Furthermore, the expression levels of six selected immune-related DEGs and putative IRFs were validated using real-time PCR, substantiating the reliability of the PacBio Iso-seq results. In conclusion, our results provide new long-read full-length transcript data and information for identifying immune-related genes, which are an invaluable transcriptomic resource and genomic reference, especially for further exploration of the innate immune and defense mechanisms of shrimp.
Affiliation(s)
- Xiujuan Zhang
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Guanyu Li
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Haiying Jiang
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Linmiao Li
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Jinge Ma
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Huiming Li
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China
| | - Jinping Chen
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, 510260, China.
|
391
|
Kuosmanen A, Norri T, Mäkinen V. Evaluating approaches to find exon chains based on long reads. Brief Bioinform 2019; 19:404-414. [PMID: 28069635 PMCID: PMC5952954 DOI: 10.1093/bib/bbw137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/19/2016] [Indexed: 11/25/2022] Open
Abstract
Transcript prediction can be modeled as a graph problem where exons are modeled as nodes and reads spanning two or more exons are modeled as exon chains. Pacific Biosciences third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity/precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long-read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy. Availability: The simulated data and in-house scripts used for this article are available at http://www.cs.helsinki.fi/group/gsa/exon-chains/exon-chains-bib.tar.bz2.
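The graph formulation described in the abstract can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the authors' method: exons are represented as hypothetical (start, end) tuples, arcs are taken from exon chains supported by short reads, and "projecting" a long read is reduced to checking that each consecutive exon pair in its chain is an existing arc.

```python
from collections import defaultdict


def build_splicing_graph(exon_chains):
    """Build a directed graph: exons are nodes, observed exon-to-exon
    junctions (consecutive exons in any chain) are arcs."""
    graph = defaultdict(set)
    for chain in exon_chains:
        for upstream, downstream in zip(chain, chain[1:]):
            graph[upstream].add(downstream)
    return graph


def chain_is_supported(graph, chain):
    """Return True if every consecutive exon pair in the chain is an arc."""
    return all(
        downstream in graph.get(upstream, ())
        for upstream, downstream in zip(chain, chain[1:])
    )


# Hypothetical exons given as (start, end) genomic coordinates.
exon_a, exon_b, exon_c = (100, 200), (300, 400), (500, 600)

# Arcs derived from hypothetical short-read exon chains.
short_read_graph = build_splicing_graph([[exon_a, exon_b], [exon_b, exon_c]])

# A long read spanning all three exons is consistent with the graph,
# whereas a chain skipping exon_b uses an arc the graph does not contain.
print(chain_is_supported(short_read_graph, [exon_a, exon_b, exon_c]))
print(chain_is_supported(short_read_graph, [exon_a, exon_c]))
```

A chain that introduces an unsupported arc is the kind of candidate that, per the abstract, may stem from a misalignment around a splice site and would add spurious arcs to the graph.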
Affiliation(s)
- Anna Kuosmanen
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| | - Tuukka Norri
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| | - Veli Mäkinen
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| |
|
392
|
Bao E, Xie F, Song C, Song D. FLAS: fast and high-throughput algorithm for PacBio long-read self-correction. Bioinformatics 2019; 35:3953-3960. [DOI: 10.1093/bioinformatics/btz206] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2018] [Revised: 02/18/2019] [Accepted: 03/19/2019] [Indexed: 02/02/2023] Open
Abstract
Motivation
The third generation PacBio long reads have greatly facilitated sequencing projects with very large read lengths, but they contain about 15% sequencing errors and need error correction. For the projects with long reads only, it is challenging to make correction with fast speed, and also challenging to correct a sufficient amount of read bases, i.e. to achieve high-throughput self-correction. MECAT is currently among the fastest self-correction algorithms, but its throughput is relatively small (Xiao et al., 2017).
Results
Here, we introduce FLAS, a wrapper algorithm of MECAT, to achieve high-throughput long-read self-correction while keeping MECAT’s fast speed. FLAS finds additional alignments from MECAT prealigned long reads to improve the correction throughput, and removes misalignments for accuracy. In addition, FLAS also uses the corrected long-read regions to correct the uncorrected ones to further improve the throughput. In our performance tests on Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana and human long reads, FLAS can achieve 22.0–50.6% larger throughput than MECAT. FLAS is 2–13× faster compared to the self-correction algorithms other than MECAT, and its throughput is also 9.8–281.8% larger. The FLAS corrected long reads can be assembled into contigs of 13.1–29.8% larger N50 sizes than MECAT.
Availability and implementation
The FLAS software can be downloaded for free from this site: https://github.com/baoe/flas.
Supplementary information
Supplementary data are available at Bioinformatics online.
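The N50 statistic used above to compare assemblies can be computed as follows. This is a generic sketch of the standard definition (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size). The contig lengths are made up and the function is not part of FLAS or MECAT.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum covers half the total.
        if 2 * running >= total:
            return length
    raise ValueError("need at least one contig")

# Hypothetical contig lengths: total is 290, half is 145,
# and 80 + 70 = 150 is the first running sum to reach it.
example_n50 = n50([80, 70, 50, 40, 30, 20])
```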
Affiliation(s)
- Ergude Bao
- Software Engineering Research Center, School of Software Engineering, Beijing Jiaotong University, Beijing, China
- Department of Botany and Plant Sciences, University of California, Riverside, CA, USA
| | - Fei Xie
- Beijing Lab of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| | - Changjin Song
- Software Engineering Research Center, School of Software Engineering, Beijing Jiaotong University, Beijing, China
| | - Dandan Song
- Beijing Lab of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| |
|
393
|
Zhao L, Zhang H, Kohnen MV, Prasad KVSK, Gu L, Reddy ASN. Analysis of Transcriptome and Epitranscriptome in Plants Using PacBio Iso-Seq and Nanopore-Based Direct RNA Sequencing. Front Genet 2019; 10:253. [PMID: 30949200 PMCID: PMC6438080 DOI: 10.3389/fgene.2019.00253] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2018] [Accepted: 03/06/2019] [Indexed: 12/18/2022] Open
Abstract
Nanopore sequencing from Oxford Nanopore Technologies (ONT) and Pacific BioSciences (PacBio) single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) are revolutionizing the way transcriptomes are analyzed. These methods offer many advantages over the most widely used high-throughput short-read RNA sequencing (RNA-Seq) approaches and allow a comprehensive analysis of transcriptomes in identifying full-length splice isoforms and several other post-transcriptional events. In addition, direct RNA-Seq provides valuable information about RNA modifications, which are lost during the PCR amplification step in other methods. Here, we present a comprehensive summary of important applications of these technologies in plants, including identification of complex alternative splicing (AS), full-length splice variants, fusion transcripts, and alternative polyadenylation (APA) events. Furthermore, we discuss the impact of the newly developed nanopore direct RNA-Seq in advancing epitranscriptome research in plants. Additionally, we summarize computational tools for identifying and quantifying full-length isoforms and other co/post-transcriptional events and discuss some of the limitations of these methods. Sequencing of transcriptomes using these new single-molecule long-read methods will unravel many aspects of transcriptome complexity in unprecedented ways as compared to previous short-read sequencing approaches. Analysis of plant transcriptomes with these new powerful methods that require minimum sample processing is likely to become the norm and is expected to uncover novel co/post-transcriptional gene regulatory mechanisms that control biological outcomes during plant development and in response to various stresses.
Affiliation(s)
- Liangzhen Zhao
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Hangxiao Zhang
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Markus V. Kohnen
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Kasavajhala V. S. K. Prasad
- Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, CO, United States
| | - Lianfeng Gu
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Anireddy S. N. Reddy
- Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, CO, United States
| |
|
394
|
Minio A, Massonnet M, Figueroa-Balderas R, Vondras AM, Blanco-Ulate B, Cantu D. Iso-Seq Allows Genome-Independent Transcriptome Profiling of Grape Berry Development. G3 (BETHESDA, MD.) 2019; 9:755-767. [PMID: 30642874 PMCID: PMC6404599 DOI: 10.1534/g3.118.201008] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/31/2018] [Accepted: 01/09/2019] [Indexed: 01/13/2023]
Abstract
Transcriptomics has been widely applied to study grape berry development. With few exceptions, transcriptomic studies in grape are performed using the available genome sequence, PN40024, as reference. However, differences in gene content among grape accessions, which contribute to phenotypic differences among cultivars, suggest that a single reference genome does not represent the species' entire gene space. Though whole genome assembly and annotation can reveal the relatively unique or "private" gene space of any particular cultivar, transcriptome reconstruction is a more rapid, less costly, and less computationally intensive strategy to accomplish the same goal. In this study, we used single molecule-real time sequencing (SMRT) to sequence full-length cDNA (Iso-Seq) and reconstruct the transcriptome of Cabernet Sauvignon berries during berry ripening. In addition, short reads from ripening berries were used to error-correct low-expression isoforms and to profile isoform expression. By comparing the annotated gene space of Cabernet Sauvignon to other grape cultivars, we demonstrate that the transcriptome reference built with Iso-Seq data represents most of the expressed genes in the grape berries and includes 1,501 cultivar-specific genes. Iso-Seq produced transcriptome profiles similar to those obtained after mapping on a complete genome reference. Together, these results justify the application of Iso-Seq to identify cultivar-specific genes and build a comprehensive reference for transcriptional profiling that circumvents the necessity of a genome reference with its associated costs and computational weight.
Affiliation(s)
- Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA
| | - Mélanie Massonnet
- Department of Viticulture and Enology, University of California Davis, Davis, CA
| | | | - Amanda M Vondras
- Department of Viticulture and Enology, University of California Davis, Davis, CA
| | | | - Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA
| |
|
395
|
Nielsen SKD, Koch TL, Hauser F, Garm A, Grimmelikhuijzen CJP. De novo transcriptome assembly of the cubomedusa Tripedalia cystophora, including the analysis of a set of genes involved in peptidergic neurotransmission. BMC Genomics 2019; 20:175. [PMID: 30836949 PMCID: PMC6402141 DOI: 10.1186/s12864-019-5514-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2018] [Accepted: 02/07/2019] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND The phyla Cnidaria, Placozoa, Ctenophora, and Porifera emerged before the split of proto- and deuterostome animals, about 600 million years ago. These early metazoans are interesting, because they can give us important information on the evolution of various tissues and organs, such as eyes and the nervous system. Generally, cnidarians have simple nervous systems, which use neuropeptides for their neurotransmission, but some cnidarian medusae belonging to the class Cubozoa (box jellyfishes) have advanced image-forming eyes, probably associated with a complex innervation. Here, we describe a new transcriptome database from the cubomedusa Tripedalia cystophora. RESULTS Based on the combined use of the Illumina and PacBio sequencing technologies, we produced a highly contiguous transcriptome database from T. cystophora. We then developed a software program to discover neuropeptide preprohormones in this database. This script enabled us to annotate seven novel T. cystophora neuropeptide preprohormone cDNAs: One coding for 19 copies of a peptide with the structure pQWLRGRFamide; one coding for six copies of a different RFamide peptide; one coding for six copies of pQPPGVWamide; one coding for eight different neuropeptide copies with the C-terminal LWamide sequence; one coding for thirteen copies of a peptide with the RPRAamide C-terminus; one coding for four copies of a peptide with the C-terminal GRYamide sequence; and one coding for seven copies of a cyclic peptide, of which the most frequent one has the sequence CTGQMCWFRamide. We could also identify orthologs of these seven preprohormones in the cubozoans Alatina alata, Carybdea xaymacana, Chironex fleckeri, and Chiropsalmus quadrumanus. 
Furthermore, using TBLASTN screening, we could annotate four bursicon-like glycoprotein hormone subunits, five opsins, and 52 other family-A G protein-coupled receptors (GPCRs), which also included two leucine-rich repeats containing G protein-coupled receptors (LGRs) in T. cystophora. The two LGRs are potential receptors for the glycoprotein hormones, while the other GPCRs are candidate receptors for the above-mentioned neuropeptides. CONCLUSIONS By combining Illumina and PacBio sequencing technologies, we have produced a new high-quality de novo transcriptome assembly from T. cystophora that should be a valuable resource for identifying the neuronal components that are involved in vision and other behaviors in cubomedusae.
Affiliation(s)
- Sofie K. D. Nielsen
- Section of Marine Biology, Department of Biology, University of Copenhagen, Universitetsparken 4, 2100 Copenhagen, Denmark
| | - Thomas L. Koch
- Section for Cell and Neurobiology, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen, Denmark
| | - Frank Hauser
- Section for Cell and Neurobiology, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen, Denmark
| | - Anders Garm
- Section of Marine Biology, Department of Biology, University of Copenhagen, Universitetsparken 4, 2100 Copenhagen, Denmark
| | - Cornelis J. P. Grimmelikhuijzen
- Section for Cell and Neurobiology, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen, Denmark
| |
|
396
|
Limasset A, Flot JF, Peterlongo P. Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs. Bioinformatics 2019; 36:1374-1381. [DOI: 10.1093/bioinformatics/btz102] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Revised: 01/07/2019] [Accepted: 02/18/2019] [Indexed: 12/25/2022] Open
Abstract
Motivation
Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large datasets or consider reads as mere suites of k-mers, without taking into account their full-length sequence information.
Results
We propose a new method to correct short reads using de Bruijn graphs and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.
Availability and implementation
The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.
Supplementary information
Supplementary data are available at Bioinformatics online.
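The first filtering step described in the Results above (discarding low-abundance k-mers before using the graph as a reference) can be sketched in a few lines. This is a simplified illustration, not Bcool's implementation: it builds an uncompacted de Bruijn graph, ignores reverse complements, and omits the unitig-abundance filter and the read-mapping step. The function names, reads, `k`, and the abundance threshold are all assumptions chosen for the example.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def solid_kmers(counts, min_abundance):
    """Keep k-mers seen at least min_abundance times; rarer ones are likely errors."""
    return {kmer for kmer, c in counts.items() if c >= min_abundance}

def de_bruijn_edges(kmers):
    """Each k-mer is an edge from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    return {(kmer[:-1], kmer[1:]) for kmer in kmers}

# Three identical reads plus one read carrying a substitution (T -> A).
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
counts = count_kmers(reads, k=4)
solid = solid_kmers(counts, min_abundance=2)
edges = de_bruijn_edges(solid)
```

Here the three k-mers introduced by the erroneous read each occur once and are dropped, so the cleaned graph contains only the path spelled by the correct sequence.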
Affiliation(s)
- Antoine Limasset
- Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium
| | - Jean-François Flot
- Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium
- Interuniversity Institute of Bioinformatics in Brussels – (IB)², Brussels, Belgium
| | | |
|
397
|
Song QA, Catlin NS, Brad Barbazuk W, Li S. Computational analysis of alternative splicing in plant genomes. Gene 2019; 685:186-195. [PMID: 30321657 DOI: 10.1016/j.gene.2018.10.026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2017] [Revised: 09/16/2018] [Accepted: 10/11/2018] [Indexed: 12/11/2022]
Abstract
Computational analyses play crucial roles in characterizing splicing isoforms in plant genomes. In this review, we provide a survey of computational tools used in recently published, genome-scale splicing analyses in plants. We summarize the commonly used software and pipelines for read mapping, isoform reconstruction, isoform quantification, and differential expression analysis. We also discuss methods for analyzing long reads and the strategies to combine long and short reads in identifying splicing isoforms. We review several tools for characterizing local splicing events, splicing graphs, coding potential, and visualizing splicing isoforms. We further discuss the procedures for identifying conserved splicing isoforms across plant species. Finally, we discuss the outlook of integrating other genomic data with splicing analyses to identify regulatory mechanisms of AS on genome-wide scale.
Affiliation(s)
- Qi A Song
- Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
| | - Nathan S Catlin
- Department of Biology, University of Florida, Gainesville, FL 32611, United States of America
| | - W Brad Barbazuk
- Department of Biology, University of Florida, Gainesville, FL 32611, United States of America; Genetics Institute, University of Florida, Gainesville, FL 32611, United States of America
| | - Song Li
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America.
| |
|
398
|
Fu S, Wang A, Au KF. A comparative evaluation of hybrid error correction methods for error-prone long reads. Genome Biol 2019; 20:26. [PMID: 30717772 PMCID: PMC6362602 DOI: 10.1186/s13059-018-1605-z] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2018] [Accepted: 12/05/2018] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Third-generation sequencing technologies have advanced the progress of the biological research by generating reads that are substantially longer than second-generation sequencing technologies. However, their notorious high error rate impedes straightforward data analysis and limits their application. A handful of error correction methods for these error-prone long reads have been developed to date. The output data quality is very important for downstream analysis, whereas computing resources could limit the utility of some computing-intense tools. There is a lack of standardized assessments for these long-read error-correction methods. RESULTS Here, we present a comparative performance assessment of ten state-of-the-art error-correction methods for long reads. We established a common set of benchmarks for performance assessment, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads: de novo assembly and resolving haplotype sequences. CONCLUSIONS Taking into account all of these metrics, we provide a suggestive guideline for method choice based on available data size, computing resources, and individual research goals.
Affiliation(s)
- Shuhua Fu
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA
| | - Anqi Wang
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA
| | - Kin Fai Au
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biostatistics, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA.
| |
|
399
|
Ma J, Wan D, Duan B, Bai X, Bai Q, Chen N, Ma T. Genome sequence and genetic transformation of a widely distributed and cultivated poplar. PLANT BIOTECHNOLOGY JOURNAL 2019; 17:451-460. [PMID: 30044051 PMCID: PMC6335071 DOI: 10.1111/pbi.12989] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2017] [Revised: 07/10/2018] [Accepted: 07/13/2018] [Indexed: 05/20/2023]
Abstract
Populus alba is widely distributed and cultivated in Europe and Asia. This species has been used for diverse studies. In this study, we assembled a de novo genome sequence of P. alba var. pyramidalis (= P. bolleana) and confirmed its high transformation efficiency and short transformation time by experiments. Through a process of hybrid genome assembly, a total of 464 M of the genome was assembled. Annotation analyses predicted 37 901 protein-coding genes. This genome is highly collinear to that of P. trichocarpa, with most genes having orthologs in the two species. We found a marked expansion of gene families related to histone and the hormone auxin but loss of disease resistance genes in P. alba if compared with the closely related P. trichocarpa. The genome sequence presented here represents a valuable resource for further molecular functional analyses of this species as a new tree model, poplar breeding practices and comparative genomic analyses across different poplars.
Affiliation(s)
- Jianchao Ma
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Dongshi Wan
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Bingbing Duan
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Xiaotao Bai
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Qiuxian Bai
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Ningning Chen
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
| | - Tao Ma
- State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & School of Life Sciences, Lanzhou University, Lanzhou, China
- Key Laboratory of Bio‐Resource and Eco‐Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
| |
|
400
|
Chebbi MA, Becking T, Moumen B, Giraud I, Gilbert C, Peccoud J, Cordaux R. The Genome of Armadillidium vulgare (Crustacea, Isopoda) Provides Insights into Sex Chromosome Evolution in the Context of Cytoplasmic Sex Determination. Mol Biol Evol 2019; 36:727-741. [DOI: 10.1093/molbev/msz010] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open
Affiliation(s)
- Mohamed Amine Chebbi
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Thomas Becking
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Bouziane Moumen
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Isabelle Giraud
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Clément Gilbert
- Laboratoire Evolution, Génomes, Comportement, Ecologie, CNRS Université Paris-Sud UMR 9191, IRD UMR 247, Gif sur Yvette, France
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Jean Peccoud
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| | - Richard Cordaux
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Université de Poitiers, UMR CNRS 7267, Poitiers, France
| |
|