101
Monat C, Padmarasu S, Lux T, Wicker T, Gundlach H, Himmelbach A, Ens J, Li C, Muehlbauer GJ, Schulman AH, Waugh R, Braumann I, Pozniak C, Scholz U, Mayer KFX, Spannagl M, Stein N, Mascher M. TRITEX: chromosome-scale sequence assembly of Triticeae genomes with open-source tools. Genome Biol 2019; 20:284. [PMID: 31849336 DOI: 10.1101/631648] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/13/2019] [Accepted: 11/25/2019] [Indexed: 05/29/2023]
Abstract
Chromosome-scale genome sequence assemblies underpin pan-genomic studies. Recent genome assembly efforts in the large-genome Triticeae crops wheat and barley have relied on the commercial closed-source assembly algorithm DeNovoMagic. We present TRITEX, an open-source computational workflow that combines paired-end, mate-pair, 10X Genomics linked-read with chromosome conformation capture sequencing data to construct sequence scaffolds with megabase-scale contiguity ordered into chromosomal pseudomolecules. We evaluate the performance of TRITEX on publicly available sequence data of tetraploid wild emmer and hexaploid bread wheat, and construct an improved annotated reference genome sequence assembly of the barley cultivar Morex as a community resource.
Affiliation(s)
- Cécile Monat
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Sudharsan Padmarasu
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Thomas Lux
  - PGSB - Plant Genome and Systems Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany
- Thomas Wicker
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Heidrun Gundlach
  - PGSB - Plant Genome and Systems Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany
- Axel Himmelbach
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Jennifer Ens
  - Department of Plant Sciences, University of Saskatchewan, Saskatoon, Canada
- Chengdao Li
  - Western Barley Genetics Alliance, School of Veterinary and Life Sciences (VLS), Murdoch University, Murdoch, WA, Australia
  - Hubei Collaborative Innovation Center for Grain Industry/School of Agriculture, Yangtze University, Jingzhou, China
- Gary J Muehlbauer
  - Department of Agronomy and Plant Genetics & Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN, USA
- Alan H Schulman
  - Green Technology, Natural Resources Institute (Luke), Viikki Plant Science Centre, and Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Robbie Waugh
  - The James Hutton Institute, Dundee, UK
  - School of Life Sciences, University of Dundee, Dundee, UK
- Curtis Pozniak
  - Department of Plant Sciences, University of Saskatchewan, Saskatoon, Canada
- Uwe Scholz
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Klaus F X Mayer
  - PGSB - Plant Genome and Systems Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany
  - School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Manuel Spannagl
  - PGSB - Plant Genome and Systems Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - Department of Crop Sciences, Center for Integrated Breeding Research (CiBreed), Georg-August-University Göttingen, Göttingen, Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
102
Monat C, Padmarasu S, Lux T, Wicker T, Gundlach H, Himmelbach A, Ens J, Li C, Muehlbauer GJ, Schulman AH, Waugh R, Braumann I, Pozniak C, Scholz U, Mayer KFX, Spannagl M, Stein N, Mascher M. TRITEX: chromosome-scale sequence assembly of Triticeae genomes with open-source tools. Genome Biol 2019; 20:284. [PMID: 31849336 PMCID: PMC6918601 DOI: 10.1186/s13059-019-1899-5] [Citation(s) in RCA: 140] [Impact Index Per Article: 23.3] [Received: 05/13/2019] [Accepted: 11/25/2019] [Indexed: 11/24/2022]
103
Grigoreva E, Ulianich P, Ben C, Gentzbittel L, Potokina E. First Insights into the Guar (Cyamopsis tetragonoloba (L.) Taub.) Genome of the ‘Vavilovskij 130’ Accession, Using Second and Third-Generation Sequencing Technologies. Russ J Genet 2019. [DOI: 10.1134/s102279541911005x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/14/2022]
104
Lanfear R, Schalamun M, Kainer D, Wang W, Schwessinger B. MinIONQC: fast and simple quality control for MinION sequencing data. Bioinformatics 2019; 35:523-525. [PMID: 30052755 PMCID: PMC6361240 DOI: 10.1093/bioinformatics/bty654] [Citation(s) in RCA: 111] [Impact Index Per Article: 18.5] [Received: 05/29/2018] [Accepted: 07/19/2018] [Indexed: 01/03/2023]
Abstract
Summary: MinIONQC provides rapid diagnostic plots and quality control data from one or more flowcells of sequencing data from Oxford Nanopore Technologies’ MinION instrument. It can be used to assist with the optimisation of extraction, library preparation, and sequencing protocols, to quickly and directly compare the data from many flowcells, and to provide publication-ready figures summarising sequencing data.
Availability and implementation: MinIONQC is implemented in R and released under an MIT license. It is available for all platforms from https://github.com/roblanf/minion_qc.
Affiliation(s)
- R Lanfear
  - The Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- M Schalamun
  - The Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- D Kainer
  - The Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- W Wang
  - The Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- B Schwessinger
  - The Research School of Biology, Australian National University, Canberra ACT 2601, Australia
105
Ge H, Lin K, Shen M, Wu S, Wang Y, Zhang Z, Wang Z, Zhang Y, Huang Z, Zhou C, Lin Q, Wu J, Liu L, Hu J, Huang Z, Zheng L. De novo assembly of a chromosome-level reference genome of red-spotted grouper (Epinephelus akaara) using nanopore sequencing and Hi-C. Mol Ecol Resour 2019; 19:1461-1469. [PMID: 31325912 PMCID: PMC6899872 DOI: 10.1111/1755-0998.13064] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 02/14/2019] [Revised: 06/06/2019] [Accepted: 06/07/2019] [Indexed: 01/02/2023]
Abstract
The red-spotted grouper Epinephelus akaara (E. akaara) is one of the most economically important marine fish in China, Japan and South-East Asia and is a threatened species. The species is also considered a good model for studies of sex inversion, development, genetic diversity and immunity. Despite its importance, molecular resources for E. akaara remain limited and no reference genome has been published to date. In this study, we constructed a chromosome-level reference genome of E. akaara by taking advantage of long-read single-molecule sequencing and de novo assembly by Oxford Nanopore Technology (ONT) and Hi-C. A red-spotted grouper genome of 1.135 Gb was assembled from a total of 106.29 Gb polished Nanopore sequence (GridION, ONT), equivalent to 96-fold genome coverage. The assembled genome represents 96.8% completeness (BUSCO) with a contig N50 length of 5.25 Mb and a longest contig of 25.75 Mb. The contigs were clustered and ordered onto 24 pseudochromosomes covering approximately 95.55% of the genome assembly with Hi-C data, with a scaffold N50 length of 46.03 Mb. The genome contained 43.02% repeat sequences and 5,480 noncoding RNAs. Furthermore, combined with several RNA-seq data sets, 23,808 (99.5%) genes were functionally annotated from a total of 23,923 predicted protein-coding sequences. The high-quality chromosome-level reference genome of E. akaara was assembled for the first time and will be a valuable resource for molecular breeding and functional genomics studies of red-spotted grouper in the future.
Affiliation(s)
- Hui Ge
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
  - Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture, Fisheries College, Jimei University, Xiamen, China
- Kebing Lin
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Mi Shen
  - Nextomics Biosciences Institute, Wuhan, China
- Shuiqing Wu
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Yilei Wang
  - Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture, Fisheries College, Jimei University, Xiamen, China
- Ziping Zhang
  - College of Animal Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
- Zhiyong Wang
  - Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture, Fisheries College, Jimei University, Xiamen, China
- Yong Zhang
  - Southern Laboratory of Ocean Science and Engineering (Guangdong, Zhuhai), Guangdong Provincial Key Laboratory for Aquatic Economic Animals, Sun Yat-Sen University, Guangzhou, China
- Zhen Huang
  - The Public Service Platform for Industrialization Development Technology of Marine Biological Medicine and Product of State Oceanic Administration, Fujian Normal University, Fuzhou, China
- Chen Zhou
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Qi Lin
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Jianshao Wu
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Lei Liu
  - Nextomics Biosciences Institute, Wuhan, China
- Jiang Hu
  - Nextomics Biosciences Institute, Wuhan, China
- Zhongchi Huang
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
- Leyun Zheng
  - Key Laboratory of Cultivation and High-value Utilization of Marine Organisms in Fujian Province, Fisheries Research Institute of Fujian, Xiamen, China
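The red-spotted grouper abstract above reports a contig N50 (5.25 Mb) and a scaffold N50 (46.03 Mb). As a reminder of what that statistic measures, the following is a minimal sketch of the usual N50 definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy contig lengths are hypothetical illustrations, not data or code from the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not values from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

The same function applies to contigs or scaffolds; only the input list changes, which is why a Hi-C scaffolding step can raise the N50 without altering the contigs themselves.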
106
Chetelat RT, Qin X, Tan M, Burkart-Waco D, Moritama Y, Huo X, Wills T, Pertuzé R. Introgression lines of Solanum sitiens, a wild nightshade of the Atacama Desert, in the genome of cultivated tomato. Plant J 2019; 100:836-850. [PMID: 31323151 DOI: 10.1111/tpj.14460] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/16/2019] [Accepted: 07/09/2019] [Indexed: 06/10/2023]
Abstract
The wild tomato relative Solanum sitiens is a xerophyte endemic to the Atacama Desert of Chile and a potential source of genes for tolerance to drought, salinity and low-temperature stresses. However, until recently, strong breeding barriers prevented its hybridization and introgression with cultivated tomato, Solanum lycopersicum L. We overcame these barriers using embryo rescue, bridging lines and allopolyploid hybrids, and synthesized a library of introgression lines (ILs) that captures the genome of S. sitiens in the background of cultivated tomato. The IL library consists of 56 overlapping introgressions that together represent about 93% of the S. sitiens genome: 65% in homozygous and 28% in heterozygous (segregating) ILs. The breakpoints of each segment and the gaps in genome coverage were mapped by single nucleotide polymorphism (SNP) genotyping using the SolCAP SNP array. Marker-assisted selection was used to backcross selected introgressions into tomato, to recover a uniform genetic background, to isolate recombinant sub-lines with shorter introgressions and to select homozygous genotypes. Each IL contains a single S. sitiens chromosome segment, defined by markers, in the genetic background of cv. NC 84173, a fresh market inbred line. Large differences were observed between the lines for both qualitative and quantitative morphological traits, suggesting that the ILs contain highly divergent allelic variation. Several loci contributing to unilateral incompatibility or hybrid necrosis were mapped with the lines. This IL population will facilitate studies of the S. sitiens genome and expands the range of genetic variation available for tomato breeding and research.
Affiliation(s)
- Roger T Chetelat
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Xiaoqiong Qin
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Meilian Tan
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Diana Burkart-Waco
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Yosuke Moritama
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Xiuwen Huo
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Tim Wills
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Ricardo Pertuzé
  - C. M. Rick Tomato Genetics Resource Center, Department of Plant Sciences, University of California, Davis, CA, 95616, USA
107
Alonge M, Soyk S, Ramakrishnan S, Wang X, Goodwin S, Sedlazeck FJ, Lippman ZB, Schatz MC. RaGOO: fast and accurate reference-guided scaffolding of draft genomes. Genome Biol 2019; 20:224. [PMID: 31661016 PMCID: PMC6816165 DOI: 10.1186/s13059-019-1829-6] [Citation(s) in RCA: 393] [Impact Index Per Article: 65.5] [Received: 02/01/2019] [Accepted: 09/19/2019] [Indexed: 01/10/2023]
Abstract
We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO.
Affiliation(s)
- Michael Alonge
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Sebastian Soyk
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Xingang Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Zachary B Lippman
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Cold Spring Harbor Laboratory, Howard Hughes Medical Institute, Cold Spring Harbor, NY, USA
- Michael C Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
108
Sessegolo C, Cruaud C, Da Silva C, Cologne A, Dubarry M, Derrien T, Lacroix V, Aury JM. Transcriptome profiling of mouse samples using nanopore sequencing of cDNA and RNA molecules. Sci Rep 2019; 9:14908. [PMID: 31624302 PMCID: PMC6797730 DOI: 10.1038/s41598-019-51470-9] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 03/20/2019] [Accepted: 09/28/2019] [Indexed: 01/27/2023]
Abstract
Our vision of DNA transcription and splicing has changed dramatically with the introduction of short-read sequencing. These high-throughput sequencing technologies promised to unravel the complexity of any transcriptome. Generally gene expression levels are well-captured using these technologies, but there are still remaining caveats due to the limited read length and the fact that RNA molecules had to be reverse transcribed before sequencing. Oxford Nanopore Technologies has recently launched a portable sequencer which offers the possibility of sequencing long reads and most importantly RNA molecules. Here we generated a full mouse transcriptome from brain and liver using the Oxford Nanopore device. As a comparison, we sequenced RNA (RNA-Seq) and cDNA (cDNA-Seq) molecules using both long and short reads technologies and tested the TeloPrime preparation kit, dedicated to the enrichment of full-length transcripts. Using spike-in data, we confirmed that expression levels are efficiently captured by cDNA-Seq using short reads. More importantly, Oxford Nanopore RNA-Seq tends to be more efficient, while cDNA-Seq appears to be more biased. We further show that the cDNA library preparation of the Nanopore protocol induces read truncation for transcripts containing internal runs of T's. This bias is marked for runs of at least 15 T's, but is already detectable for runs of at least 9 T's and therefore concerns more than 20% of expressed transcripts in mouse brain and liver. Finally, we outline that bioinformatics challenges remain ahead for quantifying at the transcript level, especially when reads are not full-length. Accurate quantification of repeat-associated genes such as processed pseudogenes also remains difficult, and we show that current mapping protocols which map reads to the genome largely over-estimate their expression, at the expense of their parent gene.
Affiliation(s)
- Camille Sessegolo
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
  - EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Corinne Cruaud
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Corinne Da Silva
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Audric Cologne
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
  - EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Marion Dubarry
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Thomas Derrien
  - Univ Rennes, CNRS, IGDR (Institut de génétique et développement de Rennes) - UMR 6290, F-35000, Rennes, France
- Vincent Lacroix
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
  - EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Jean-Marc Aury
  - Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
109
Li S, Jia S, Hou L, Nguyen H, Sato S, Holding D, Cahoon E, Zhang C, Clemente T, Yu B. Mapping of transgenic alleles in soybean using a nanopore-based sequencing strategy. J Exp Bot 2019; 70:3825-3833. [PMID: 31037287 PMCID: PMC6685662 DOI: 10.1093/jxb/erz202] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/23/2019] [Accepted: 04/15/2019] [Indexed: 05/23/2023]
Abstract
Transgenic technology was developed more than 40 years ago to introduce transgenes into various organisms, both to validate gene function and to add genetic variation. However, identifying the transgene insertion position is still challenging in organisms with complex genomes. Here, we report a nanopore-based method to map the insertion position of a Ds transposable element originating in maize in the soybean genome. In this method, an oligo probe is used to capture the DNA fragments containing the Ds element from pooled DNA samples of transgenic soybean plants. The Ds element-enriched DNAs are then sequenced using the MinION-based platform of Nanopore. This method allowed us to rapidly map the Ds insertion positions in 51 transgenic soybean lines through a single sequencing run. This strategy is high throughput, convenient, reliable, and cost-efficient. The transgenic allele mapping protocol can be easily translated to other eukaryotes with complex genomes.
Affiliation(s)
- Shengjun Li
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Qingdao Engineering Research Center of Biomass Resources and Environment, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
- Shangang Jia
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- Lili Hou
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- Hanh Nguyen
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- Shirley Sato
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- David Holding
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- Edgar Cahoon
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
- Chi Zhang
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
- Tom Clemente
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, USA
- Bin Yu
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, USA
110
Constructing a Reference Genome in a Single Lab: The Possibility to Use Oxford Nanopore Technology. Plants 2019; 8:270. [PMID: 31390788 PMCID: PMC6724115 DOI: 10.3390/plants8080270] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/12/2019] [Revised: 07/29/2019] [Accepted: 08/04/2019] [Indexed: 12/19/2022]
Abstract
Whole-genome sequencing (WGS) has become a crucial tool for understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages over other next-generation sequencing (NGS) platforms: it is relatively inexpensive and portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence is available in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes compared with animals or microorganisms, which makes complete genome sequencing difficult. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated by NGS, because it minimizes gaps and spans the repetitive sequences found in plant genomes. Here, we sequenced the genome of S. bicolor cv. BTx623 on the MinION platform and obtained 895,678 reads and 17.9 gigabases (Gb) of long-read sequence data (ca. 25× coverage of the reference). De novo assembly with two different tools produced 6124 contigs (covering 45.9%) from Canu and 2661 contigs (covering 50%) from Minimap and Miniasm with Racon; the assembled contigs were then mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform and a clue for determining the total sequencing scale needed for optimal coverage across various genome sizes.
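The "ca. 25× coverage" figure in the sorghum abstract above can be checked with simple arithmetic, assuming the reported 17.9 Gb refers to sequenced bases and taking the stated genome size of about 730 Mb. The sketch below is illustrative only; the function name `fold_coverage` is a hypothetical helper, not part of any tool named in the abstract.

```python
def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values quoted in the abstract: 17.9 Gb of reads, ~730 Mb genome.
sorghum_depth = fold_coverage(17.9e9, 730e6)
print(f"{sorghum_depth:.1f}x")  # roughly 24.5x, consistent with "ca. 25x"
```

Turning the relation around (required bases = target depth × genome size) gives a rough estimate of how much sequencing a genome of a different size would need, which is the planning use the abstract alludes to.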
111
Jung H, Winefield C, Bombarely A, Prentis P, Waterhouse P. Tools and Strategies for Long-Read Sequencing and De Novo Assembly of Plant Genomes. Trends Plant Sci 2019; 24:700-724. [PMID: 31208890 DOI: 10.1016/j.tplants.2019.05.003] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 01/06/2019] [Revised: 05/01/2019] [Accepted: 05/10/2019] [Indexed: 05/16/2023]
Abstract
The commercial release of third-generation sequencing technologies (TGSTs), giving long and ultra-long sequencing reads, has stimulated the development of new tools for assembling highly contiguous genome sequences with unprecedented accuracy across complex repeat regions. We survey here a wide range of emerging sequencing platforms and analytical tools for de novo assembly, provide background information for each of their steps, and discuss the spectrum of available options. Our decision tree recommends workflows for the generation of a high-quality genome assembly when used in combination with the specific needs and resources of a project.
Affiliation(s)
- Hyungtaek Jung
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia
- Christopher Winefield
  - Department of Wine, Food, and Molecular Biosciences, Lincoln University, 7647 Christchurch, New Zealand
- Aureliano Bombarely
  - Department of Bioscience, University of Milan, Milan 20133, Italy; School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Peter Prentis
  - School of Earth, Environmental, and Biological Sciences, Queensland University of Technology, Brisbane, QLD, 4001, Australia
- Peter Waterhouse
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia; School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
112
|
Ou S, Chen J, Jiang N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res 2019; 46:e126. [PMID: 30107434 PMCID: PMC6265445 DOI: 10.1093/nar/gky730] [Citation(s) in RCA: 323] [Impact Index Per Article: 53.8] [Received: 04/20/2018] [Accepted: 07/31/2018] [Indexed: 12/15/2022] Open
Abstract
Assembling a plant genome is challenging due to the abundance of repetitive sequences, yet no standard is available to evaluate the assembly of repeat space. LTR retrotransposons (LTR-RTs) are the predominant interspersed repeat that is poorly assembled in draft genomes. Here, we propose a reference-free genome metric called LTR Assembly Index (LAI) that evaluates assembly continuity using LTR-RTs. After correcting for LTR-RT amplification dynamics, we show that LAI is independent of genome size, genomic LTR-RT content, and gene space evaluation metrics (i.e., BUSCO and CEGMA). By comparing genomic sequences produced by various sequencing techniques, we reveal the significant gain of assembly continuity by using long-read-based techniques over short-read-based methods. Moreover, LAI can facilitate iterative assembly improvement with assembler selection and identify low-quality genomic regions. To apply LAI, intact LTR-RTs and total LTR-RTs should contribute at least 0.1% and 5% to the genome size, respectively. The LAI program is freely available on GitHub: https://github.com/oushujun/LTR_retriever.
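The applicability rule stated at the end of this abstract (intact LTR-RTs at least 0.1% and total LTR-RTs at least 5% of the genome size) can be expressed as a small check. This is only a sketch of that stated rule; the function name `lai_applicable` and the example numbers are hypothetical and are not part of the LTR_retriever software.

```python
def lai_applicable(genome_size_bp: int,
                   intact_ltr_bp: int,
                   total_ltr_bp: int,
                   min_intact_pct: float = 0.1,
                   min_total_pct: float = 5.0) -> bool:
    """Return True if a genome meets the LTR-RT content thresholds
    quoted in the abstract for applying LAI."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    intact_pct = 100.0 * intact_ltr_bp / genome_size_bp
    total_pct = 100.0 * total_ltr_bp / genome_size_bp
    return intact_pct >= min_intact_pct and total_pct >= min_total_pct


# Hypothetical 500 Mb assembly: 2 Mb intact (0.4%), 60 Mb total (12%).
print(lai_applicable(500_000_000, 2_000_000, 60_000_000))

# Same assembly with only 0.2 Mb intact LTR-RTs (0.04%): below threshold.
print(lai_applicable(500_000_000, 200_000, 60_000_000))
```

The thresholds are kept as keyword arguments so that the check stays tied to the values quoted above rather than hard-coded in the comparison.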
Affiliation(s)
- Shujun Ou
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA; Program in Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI 48824, USA
- Jinfeng Chen
  - Department of Plant Pathology and Microbiology, University of California, Riverside, CA 92507, USA
- Ning Jiang
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA; Program in Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI 48824, USA
113
Malmberg MM, Spangenberg GC, Daetwyler HD, Cogan NOI. Assessment of low-coverage nanopore long read sequencing for SNP genotyping in doubled haploid canola (Brassica napus L.). Sci Rep 2019; 9:8688. [PMID: 31213642 PMCID: PMC6582154 DOI: 10.1038/s41598-019-45131-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 01/31/2019] [Accepted: 05/28/2019] [Indexed: 11/16/2022] Open
Abstract
Despite the high accuracy of short read sequencing (SRS), there are still issues with attaining accurate single nucleotide polymorphism (SNP) genotypes at low sequencing coverage and in highly duplicated genomes due to misalignment. Long read sequencing (LRS) systems, including the Oxford Nanopore Technologies (ONT) MinION, have become popular options for de novo genome assembly and structural variant characterisation. The current high error rate often requires substantial post-sequencing correction and would appear to prevent the adoption of this system for SNP genotyping, but nanopore sequencing errors are largely random. The use of low-coverage ONT MinION sequencing for genotyping pre-validated SNP loci was examined in 9 canola doubled haploids. The MinION genotypes were compared to the Illumina sequences to determine the extent and nature of genotype discrepancies between the two systems. The significant increase in read length improved alignment to the genome, and the absence of classical SRS biases resulted in a more even representation of the genome. Sequencing errors are present, primarily in the form of heterozygous genotypes, which can be removed in completely homozygous backgrounds but require more advanced bioinformatics in heterozygous genomes. Developments in this technology are promising for routine genotyping in the future.
Affiliation(s)
- M M Malmberg
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- G C Spangenberg
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- H D Daetwyler
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- N O I Cogan
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
114
Jayakumar V, Sakakibara Y. Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data. Brief Bioinform 2019; 20:866-876. [PMID: 29112696 PMCID: PMC6585154 DOI: 10.1093/bib/bbx147] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 06/29/2017] [Revised: 09/22/2017] [Indexed: 12/20/2022] Open
Abstract
Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms.
115
Soyk S, Lemmon ZH, Sedlazeck FJ, Jiménez-Gómez JM, Alonge M, Hutton SF, Van Eck J, Schatz MC, Lippman ZB. Duplication of a domestication locus neutralized a cryptic variant that caused a breeding barrier in tomato. Nat Plants 2019; 5:471-479. [PMID: 31061537 DOI: 10.1038/s41477-019-0422-z] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 01/07/2019] [Accepted: 04/02/2019] [Indexed: 06/09/2023]
Abstract
Genome editing technologies are being widely adopted in plant breeding [1]. However, a looming challenge of engineering desirable genetic variation in diverse genotypes is poor predictability of phenotypic outcomes due to unforeseen interactions with pre-existing cryptic mutations [2-4]. In tomato, breeding with a classical MADS-box gene mutation that improves harvesting by eliminating fruit stem abscission frequently results in excessive inflorescence branching, flowering and reduced fertility due to interaction with a cryptic variant that causes partial mis-splicing in a homologous gene [5-8]. Here, we show that a recently evolved tandem duplication carrying the second-site variant achieves a threshold of functional transcripts to suppress branching, enabling breeders to neutralize negative epistasis on yield. By dissecting the dosage mechanisms by which this structural variant restored normal flowering and fertility, we devised strategies that use CRISPR-Cas9 genome editing to predictably improve harvesting. Our findings highlight the under-appreciated impact of epistasis in targeted trait breeding and underscore the need for a deeper characterization of cryptic variation to enable the full potential of genome editing in agriculture.
Affiliation(s)
- Sebastian Soyk
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- José M Jiménez-Gómez
  - Institut Jean-Pierre Bourgin, INRA, AgroParisTech, CNRS, Université Paris-Saclay, Versailles, France
- Michael Alonge
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Samuel F Hutton
  - Horticultural Sciences Department, University of Florida, Wimauma, FL, USA
- Joyce Van Eck
  - The Boyce Thompson Institute, Ithaca, NY, USA
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
- Michael C Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Department of Oncology, Johns Hopkins Medicine, Baltimore, MD, USA
- Zachary B Lippman
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
116
Slatko BE, Gardner AF, Ausubel FM. Overview of Next-Generation Sequencing Technologies. Curr Protoc Mol Biol 2019; 122:e59. [PMID: 29851291 DOI: 10.1002/cpmb.59] [Citation(s) in RCA: 453] [Impact Index Per Article: 75.5] [Indexed: 12/12/2022]
Abstract
High throughput DNA sequencing methodology (next generation sequencing; NGS) has rapidly evolved over the past 15 years and new methods are continually being commercialized. As the technology develops, so do increases in the number of corresponding applications for basic and applied science. The purpose of this review is to provide a compendium of NGS methodologies and associated applications. Each brief discussion is followed by web links to the manufacturer and/or web-based visualizations. Keyword searches, such as with Google, may also provide helpful internet links and information. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Frederick M Ausubel
  - Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts
117
Morrissey J, Stack JC, Valls R, Motamayor JC. Low-cost assembly of a cacao crop genome is able to resolve complex heterozygous bubbles. Hortic Res 2019; 6:44. [PMID: 30962937 PMCID: PMC6441652 DOI: 10.1038/s41438-019-0125-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/13/2018] [Revised: 10/08/2018] [Accepted: 01/02/2019] [Indexed: 06/09/2023]
Abstract
Cacao (Theobroma cacao) is a tropical tree that produces the essential raw material for chocolate. Because yields have been stagnant, land use has expanded to provide for increasing chocolate demand. Assembled genomes of key parents could modernize breeding programs in the remote and under-resourced locations where cacao is grown. The MinION, a long read sequencer that runs off of a laptop computer, has the potential to facilitate the assembly of the complex genomes of high-yielding F1 hybrids. Here, we validate the MinION's application to heterozygous crops by creating a de novo genome assembly of a key parent in breeding programs, the clone Pound 7. Our MinION-only assembly was 20% larger than the latest released cacao genome, with 10-fold greater contiguity, and the resolution of complex heterozygosity and repetitive elements. Polishing with Illumina short reads brought the predicted completeness of our assembly to similar levels to the previously released cacao genome assemblies. In contrast to previous cacao genome projects, our assembly required only a small scientific team and limited reagents. Our sequencing and assembly methods could easily be adopted by under-resourced breeding programs, speeding crop improvement in the developing world.
Affiliation(s)
- Joe Morrissey
  - Mars Chocolate, 13601 Old Cutler Road, Miami, FL 33158 USA
- Rebecca Valls
  - Mars Chocolate, 13601 Old Cutler Road, Miami, FL 33158 USA
118
Shin SC, Kim H, Lee JH, Kim HW, Park J, Choi BS, Lee SC, Kim JH, Lee H, Kim S. Nanopore sequencing reads improve assembly and gene annotation of the Parochlus steinenii genome. Sci Rep 2019; 9:5095. [PMID: 30911035 PMCID: PMC6434015 DOI: 10.1038/s41598-019-41549-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/06/2018] [Accepted: 03/08/2019] [Indexed: 02/07/2023] Open
Abstract
Parochlus steinenii is a winged midge from King George Island. It is cold-tolerant and endures the harsh Antarctic winter. Previously, we reported the genome of this midge, but the genome assembly with short reads had limited contig contiguity, which reduced the completeness of the genome assembly and the annotated gene sets. Recently, assembly contiguity has been increased using nanopore technology. A number of methods for enhancing the low base quality of the assembly have been reported, including long-read (e.g. Nanopolish) or short-read (e.g. Pilon) based methods. Based on these advances, we used nanopore technologies to upgrade the draft genome sequence of P. steinenii. The final assembled genome was 145,366,448 bases in length. The contig number decreased from 9,132 to 162, and the N50 contig size increased from 36,946 to 1,989,550 bases. The BUSCO completeness of the assembly increased from 87.8 to 98.7%. Improved assembly statistics helped predict more genes from the draft genome of P. steinenii. The completeness of the predicted gene model increased from 79.5 to 92.1%, but the numbers and types of the predicted repeats were similar to those observed in the short read assembly, with the exception of long interspersed nuclear elements. In the present study, we markedly improved the P. steinenii genome assembly statistics using nanopore sequencing, but found that genome polishing with high-quality reads was essential for improving genome annotation. The number of genes predicted and the lengths of the genes were greater than before, and nanopore technology readily improved genome information.
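The contig N50 reported in this abstract is a standard contiguity statistic: the length of the contig at which the running total, counted from the longest contig downward, first reaches half of the total assembly length. A minimal sketch of that definition follows; the function name `n50` and the toy contig lengths are illustrative and are not data from the cited study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L together cover
    at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids rounding issues with total / 2.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


# Toy assembly of 300 bases: 100 + 80 = 180 >= 150, so N50 is 80.
print(n50([100, 80, 60, 40, 20]))  # 80
```

Fewer, longer contigs raise this value, which is why the drop in contig number and the rise in N50 are reported together above.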
Affiliation(s)
- Seung Chul Shin
  - Unit of Polar Genomics, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
- Hyun Kim
  - Unit of Polar Genomics, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
- Jun Hyuck Lee
  - Unit of Polar Genomics, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
  - Department of Polar Sciences, University of Science and Technology, Incheon, 21990, Republic of Korea
- Han-Woo Kim
  - Unit of Polar Genomics, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
  - Department of Polar Sciences, University of Science and Technology, Incheon, 21990, Republic of Korea
- Joonho Park
  - Department of Fine Chemistry, Seoul National University of Science and Technology, Seoul, 01811, Republic of Korea
- Ji Hee Kim
  - Division of Life Sciences, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
- Hyoungseok Lee
  - Unit of Polar Genomics, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
  - Department of Polar Sciences, University of Science and Technology, Incheon, 21990, Republic of Korea
- Sanghee Kim
  - Division of Life Sciences, Korea Polar Research Institute (KOPRI), Incheon, 21990, Republic of Korea
119
Paajanen P, Kettleborough G, López-Girona E, Giolai M, Heavens D, Baker D, Lister A, Cugliandolo F, Wilde G, Hein I, Macaulay I, Bryan GJ, Clark MD. A critical comparison of technologies for a plant genome sequencing project. Gigascience 2019; 8:giy163. [PMID: 30624602 PMCID: PMC6423373 DOI: 10.1093/gigascience/giy163] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 05/05/2018] [Revised: 09/26/2018] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND: A high-quality genome sequence of any model organism is an essential starting point for genetic and other studies. Older clone-based methods are slow and expensive, whereas faster, cheaper short-read-only assemblies can be incomplete and highly fragmented, which minimizes their usefulness. The last few years have seen the introduction of many new technologies for genome assembly. These new technologies and associated new algorithms are typically benchmarked on microbial genomes or, if they scale appropriately, on larger (e.g., human) genomes. However, plant genomes can be much more repetitive and larger than the human genome, and plant biochemistry often makes obtaining high-quality DNA that is free from contaminants difficult. Reflecting their challenging nature, we observe that plant genome assembly statistics are typically poorer than for vertebrates. RESULTS: Here, we compare Illumina short read, Pacific Biosciences long read, 10x Genomics linked reads, Dovetail Hi-C, and BioNano Genomics optical maps, singly and combined, in producing high-quality long-range genome assemblies of the potato species Solanum verrucosum. We benchmark the assemblies for completeness and accuracy, as well as DNA, compute requirements, and sequencing costs. CONCLUSIONS: The field of genome sequencing and assembly is reaching maturity, and the differences we observe between assemblies are surprisingly small. We expect that our results will be helpful to other genome projects, and that these datasets will be used in benchmarking by assembly algorithm developers.
Affiliation(s)
- Pirita Paajanen
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- George Kettleborough
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Elena López-Girona
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
  - The New Zealand Institute for Plant & Food Research Limited, Palmerston North 4442, New Zealand
- Michael Giolai
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Darren Heavens
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- David Baker
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Ashleigh Lister
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Fiorella Cugliandolo
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Gail Wilde
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Ingo Hein
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Iain Macaulay
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Glenn J Bryan
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Matthew D Clark
  - Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Department of Life Sciences, Natural History Museum, Cromwell Road, London WC2 5BD, UK
120
Gabur I, Chawla HS, Snowdon RJ, Parkin IAP. Connecting genome structural variation with complex traits in crop plants. Theor Appl Genet 2019; 132:733-750. [PMID: 30448864 DOI: 10.1007/s00122-018-3233-0] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 08/15/2018] [Accepted: 11/07/2018] [Indexed: 05/05/2023]
Abstract
Structural genome variation is a major determinant of useful trait diversity. We describe how genome analysis methods are enabling discovery of trait-associated structural variants and their potential impact on breeding. As our understanding of complex crop genomes continues to grow, there is growing evidence that structural genome variation plays a major role in determining traits important for breeding and agriculture. Identifying the extent and impact of structural variants in crop genomes is becoming increasingly feasible with ongoing advances in the sophistication of genome sequencing technologies, particularly as it becomes easier to generate accurate long sequence reads on a genome-wide scale. In this article, we discuss the origins of structural genome variation in crops from ancient and recent genome duplication and polyploidization events and review high-throughput methods to assay such variants in crop populations in order to find associations with phenotypic traits. There is increasing evidence from such studies that gene presence-absence and copy number variation resulting from segmental chromosome exchanges may be at the heart of adaptive variation of crops to counter abiotic and biotic stress factors. We present examples from major crops that demonstrate the potential of pangenomic diversity as a key resource for future plant breeding for resilience and sustainability.
Affiliation(s)
- Iulian Gabur
  - Department of Plant Breeding, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Harmeet Singh Chawla
  - Department of Plant Breeding, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Rod J Snowdon
  - Department of Plant Breeding, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392, Giessen, Germany
- Isobel A P Parkin
  - Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK, S7N 0X2, Canada
121
Muthamilarasan M, Singh NK, Prasad M. Multi-omics approaches for strategic improvement of stress tolerance in underutilized crop species: A climate change perspective. Adv Genet 2019; 103:1-38. [PMID: 30904092 DOI: 10.1016/bs.adgen.2019.01.001] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 12/31/2022]
Abstract
For several decades, researchers have been working toward improving the "major" crops for better adaptability and tolerance to environmental stresses. However, little or no research attention has been given to neglected and underutilized crop species (NUCS), which hold the potential to ensure food and nutritional security among the ever-growing global population. NUCS are predominantly climate resilient, but their yield and quality are compromised due to selective breeding. In this context, the importance of omics technologies, namely genomics, transcriptomics, proteomics, phenomics and ionomics, in delineating the complex molecular machinery governing growth, development and stress responses of NUCS is underlined. However, insights gained through individual omics approaches will not be sufficient to address the research questions, whereas integrating these technologies could be an effective strategy to decipher the gene functions, genome structures, biological pathways, and metabolic and regulatory networks underlying complex traits. Given this, the chapter outlines the importance of NUCS in food and nutritional security and provides an overview of deploying omics approaches to study NUCS. The chapter also reviews the status of crop improvement programs in NUCS and suggests "integrating omics" to gain a better understanding of crops' responses to abiotic and biotic stresses.
Affiliation(s)
- Mehanathan Muthamilarasan
  - National Institute of Plant Genome Research, New Delhi, India; ICAR-National Research Centre on Plant Biotechnology, Pusa Campus, New Delhi, India
- Nagendra Kumar Singh
  - ICAR-National Research Centre on Plant Biotechnology, Pusa Campus, New Delhi, India
- Manoj Prasad
  - National Institute of Plant Genome Research, New Delhi, India
122
Wu M, Kostyun JL, Moyle LC. Genome Sequence of Jaltomata Addresses Rapid Reproductive Trait Evolution and Enhances Comparative Genomics in the Hyper-Diverse Solanaceae. Genome Biol Evol 2019; 11:335-349. [PMID: 30608583 PMCID: PMC6368146 DOI: 10.1093/gbe/evy274] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Accepted: 12/16/2018] [Indexed: 12/11/2022] Open
Abstract
Within the economically important plant family Solanaceae, Jaltomata is a rapidly evolving genus that has extensive diversity in flower size and shape, as well as fruit and nectar color, among its ∼80 species. Here, we report the whole-genome sequencing, assembly, and annotation, of one representative species (Jaltomata sinuosa) from this genus. Combining PacBio long reads (25×) and Illumina short reads (148×) achieved an assembly of ∼1.45 Gb, spanning ∼96% of the estimated genome. Ninety-six percent of curated single-copy orthologs in plants were detected in the assembly, supporting a high level of completeness of the genome. Similar to other Solanaceous species, repetitive elements made up a large fraction (∼80%) of the genome, with the most recently active element, Gypsy, expanding across the genome in the last 1–2 Myr. Computational gene prediction, in conjunction with a merged transcriptome data set from 11 tissues, identified 34,725 protein-coding genes. Comparative phylogenetic analyses with six other sequenced Solanaceae species determined that Jaltomata is most likely sister to Solanum, although a large fraction of gene trees supported a conflicting bipartition consistent with substantial introgression between Jaltomata and Capsicum after these species split. We also identified gene family dynamics specific to Jaltomata, including expansion of gene families potentially involved in novel reproductive trait development, and loss of gene families that accompanied the loss of self-incompatibility. This high-quality genome will facilitate studies of phenotypic diversification in this rapidly radiating group and provide a new point of comparison for broader analyses of genomic evolution across the Solanaceae.
Affiliation(s)
- Meng Wu
  - Department of Biology, Indiana University Bloomington
- Jamie L Kostyun
  - Department of Biology, Indiana University Bloomington; Department of Plant Biology, University of Vermont
123
Huang X, Xiao M, Xi J, He C, Zheng J, Chen H, Gao J, Zhang S, Wu W, Liang Y, Xie L, Yi K. De Novo Transcriptome Assembly of Agave H11648 by Illumina Sequencing and Identification of Cellulose Synthase Genes in Agave Species. Genes (Basel) 2019; 10:genes10020103. [PMID: 30704153 PMCID: PMC6409920 DOI: 10.3390/genes10020103] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/28/2018] [Revised: 01/19/2019] [Accepted: 01/28/2019] [Indexed: 12/21/2022] Open
Abstract
Agave plants are important crassulacean acid metabolism (CAM) plants with multiple agricultural uses, such as being used in tequila and fiber production. Agave hybrid H11648 ((A. amaniensis Trel. and Nowell × A. angustifolia Haw.) × A. amaniensis) is the main cultivated Agave species for fiber production in large tropical areas around the world. In this study, we conducted a transcriptome analysis of A. H11648. About 49.25 million clean reads were obtained by Illumina paired-end sequencing. De novo assembly produced 148,046 unigenes with more than 40% annotated in public databases, or matched homologs in model plants. More homologous gene pairs were found in Asparagus genome than in Arabidopsis or rice, which indicated a close evolutionary relationship between Asparagus and A. H11648. CAM-related gene families were also characterized as previously reported in A. americana. We further identified 12 cellulose synthase genes (CesA) in Asparagus genome and 38 CesA sequences from A. H11648, A. americana, A. deserti and A. tequilana. The full-length CesA genes were used as references for the cloning and assembly of their homologs in other Agave species. As a result, we obtained CesA1/3/4/5/7 genes with full-length coding region in the four Agave species. Phylogenetic and expression analysis revealed a conserved evolutionary pattern, which could not explain the distinct fiber traits in different Agave species. We inferred that transcriptional regulation might be responsible for Agave fiber development. This study represents the transcriptome of A. H11648, which would expand the number of Agave genes and benefit relevant studies of Agave fiber development.
Affiliation(s)
- Xing Huang
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Mei Xiao
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Jingen Xi
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Chunping He
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Jinlong Zheng
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Helong Chen
  - Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Jianming Gao
  - Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Shiqing Zhang
  - Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Weihuai Wu
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Yanqiong Liang
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Li Xie
  - Institute of Tropical Agriculture and Forestry, Hainan University, Haikou, Hainan 570228, China
- Kexian Yi
  - Environment and Plant Protection Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
124
Marchet C, Lecompte L, Silva CD, Cruaud C, Aury JM, Nicolas J, Peterlongo P. De novo clustering of long reads by gene from transcriptomics data. Nucleic Acids Res 2019; 47:e2. [PMID: 30260405 PMCID: PMC6326815 DOI: 10.1093/nar/gky834] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/06/2018] [Revised: 09/04/2018] [Accepted: 09/10/2018] [Indexed: 02/07/2023] Open
Abstract
Long-read sequencing currently provides sequences of several thousand base pairs. It is therefore possible to obtain complete transcripts, offering an unprecedented vision of the cellular transcriptome. However, the literature lacks tools for de novo clustering of such data, in particular for Oxford Nanopore Technologies reads, because of the inherent high error rate compared to short reads. Our goal is to process reads from whole transcriptome sequencing data accurately and without a reference genome in order to reliably group reads coming from the same gene. This de novo approach is therefore particularly suitable for non-model species, but can also serve as a useful pre-processing step to improve read mapping. Our contribution proposes both a new algorithm adapted to clustering of reads by gene and a practical, freely accessible tool that scales to the complete processing of eukaryotic transcriptomes. We sequenced a mouse RNA sample using the MinION device. This dataset is used to compare our solution to other algorithms used in the context of biological clustering. We demonstrate that it is the best approach for transcriptomics long reads. When a reference is available to enable mapping, we show that it stands as an alternative method that predicts complementary clusters.
Affiliation(s)
| | | | - Corinne Da Silva
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| | - Corinne Cruaud
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| | - Jean-Marc Aury
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| | | | | |
|
125
|
Bolger AM, Poorter H, Dumschott K, Bolger ME, Arend D, Osorio S, Gundlach H, Mayer KFX, Lange M, Scholz U, Usadel B. Computational aspects underlying genome to phenome analysis in plants. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 97:182-198. [PMID: 30500991 PMCID: PMC6849790 DOI: 10.1111/tpj.14179] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 07/24/2018] [Revised: 11/06/2018] [Accepted: 11/16/2018] [Indexed: 05/18/2023]
Abstract
Recent advances in genomics technologies have greatly accelerated the progress in both fundamental plant science and applied breeding research. Concurrently, high-throughput plant phenotyping is becoming widely adopted in the plant community, promising to alleviate the phenotypic bottleneck. While these technological breakthroughs are significantly accelerating quantitative trait locus (QTL) and causal gene identification, challenges to enable even more sophisticated analyses remain. In particular, care needs to be taken to standardize, describe and conduct experiments robustly while relying on plant physiology expertise. In this article, we review the state of the art regarding genome assembly and the future potential of pangenomics in plant research. We also describe the necessity of standardizing and describing phenotypic studies using the Minimum Information About a Plant Phenotyping Experiment (MIAPPE) standard to enable the reuse and integration of phenotypic data. In addition, we show how deep phenotypic data might yield novel trait-trait correlations and review how to link phenotypic data to genomic data. Finally, we provide perspectives on the golden future of machine learning and their potential in linking phenotypes to genomic features.
Affiliation(s)
- Anthony M. Bolger
- Institute for Biology I, BioSC, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
| | - Hendrik Poorter
- Forschungszentrum Jülich (FZJ), Institute of Bio- and Geosciences (IBG-2) Plant Sciences, Wilhelm-Johnen-Straße, 52428 Jülich, Germany
- Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia
| | - Kathryn Dumschott
- Institute for Biology I, BioSC, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
| | - Marie E. Bolger
- Forschungszentrum Jülich (FZJ), Institute of Bio- and Geosciences (IBG-2) Plant Sciences, Wilhelm-Johnen-Straße, 52428 Jülich, Germany
| | - Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466 Seeland, Germany
| | - Sonia Osorio
- Department of Molecular Biology and Biochemistry, Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora”, Universidad de Málaga-Consejo Superior de Investigaciones Científicas, Campus de Teatinos, 29071 Málaga, Spain
| | - Heidrun Gundlach
- Plant Genome and Systems Biology (PGSB), Helmholtz Zentrum München (HMGU), Ingolstädter Landstraße 1, 85764 Neuherberg, Germany
| | - Klaus F. X. Mayer
- Plant Genome and Systems Biology (PGSB), Helmholtz Zentrum München (HMGU), Ingolstädter Landstraße 1, 85764 Neuherberg, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466 Seeland, Germany
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466 Seeland, Germany
| | - Björn Usadel
- Institute for Biology I, BioSC, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
- Forschungszentrum Jülich (FZJ), Institute of Bio- and Geosciences (IBG-2) Plant Sciences, Wilhelm-Johnen-Straße, 52428 Jülich, Germany
| |
|
126
|
Schalamun M, Nagar R, Kainer D, Beavan E, Eccles D, Rathjen JP, Lanfear R, Schwessinger B. Harnessing the MinION: An example of how to establish long-read sequencing in a laboratory using challenging plant tissue from Eucalyptus pauciflora. Mol Ecol Resour 2019; 19:77-89. [PMID: 30118581 PMCID: PMC7380007 DOI: 10.1111/1755-0998.12938] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 04/19/2018] [Revised: 08/08/2018] [Accepted: 08/10/2018] [Indexed: 11/28/2022]
Abstract
Long-read sequencing technologies are transforming our ability to assemble highly complex genomes. Realizing their full potential is critically reliant on extracting high-quality, high-molecular-weight (HMW) DNA from the organisms of interest. This is especially the case for the portable MinION sequencer which enables all laboratories to undertake their own genome sequencing projects, due to its low entry cost and minimal spatial footprint. One challenge of the MinION is that each group has to independently establish effective protocols for using the instrument, which can be time-consuming and costly. Here, we present a workflow and protocols that enabled us to establish MinION sequencing in our own laboratories, based on optimizing DNA extraction from a challenging plant tissue as a case study. Following the workflow illustrated, we were able to reliably and repeatedly obtain >6.5 Gb of long-read sequencing data with a mean read length of 13 kb and an N50 of 26 kb. Our protocols are open source and can be performed in any laboratory without special equipment. We also illustrate some more elaborate workflows which can increase mean and average read lengths if this is desired. We envision that our workflow for establishing MinION sequencing, including the illustration of potential pitfalls and suggestions of how to adapt it to other tissue types, will be useful to others who plan to establish long-read sequencing in their own laboratories.
Affiliation(s)
- Miriam Schalamun
- Research School of Biology, The Australian National University, Acton, ACT, Australia
- Present address: University of Natural Resources and Life Sciences, Vienna, Austria
| | - Ramawatar Nagar
- Research School of Biology, The Australian National University, Acton, ACT, Australia
| | - David Kainer
- Research School of Biology, The Australian National University, Acton, ACT, Australia
| | - Eleanor Beavan
- Research School of Biology, The Australian National University, Acton, ACT, Australia
| | - David Eccles
- Malaghan Institute of Medical Research, Wellington, New Zealand
- Present address: Malaghan Institute of Medical Research, Wellington, New Zealand
| | - John P. Rathjen
- Research School of Biology, The Australian National University, Acton, ACT, Australia
| | - Robert Lanfear
- Research School of Biology, The Australian National University, Acton, ACT, Australia
| | | |
|
127
|
Song C, Liu Y, Song A, Dong G, Zhao H, Sun W, Ramakrishnan S, Wang Y, Wang S, Li T, Niu Y, Jiang J, Dong B, Xia Y, Chen S, Hu Z, Chen F, Chen S. The Chrysanthemum nankingense Genome Provides Insights into the Evolution and Diversification of Chrysanthemum Flowers and Medicinal Traits. MOLECULAR PLANT 2018; 11:1482-1491. [PMID: 30342096 DOI: 10.1016/j.molp.2018.10.003] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Received: 07/25/2018] [Revised: 09/25/2018] [Accepted: 10/10/2018] [Indexed: 05/21/2023]
Abstract
The Asteraceae (Compositae), a large plant family of approximately 24 000-35 000 species, accounts for ∼10% of all angiosperm species and contributes a lot to plant diversity. The most representative members of the Asteraceae are the economically important chrysanthemums (Chrysanthemum L.) that diversified through reticulate evolution. Biodiversity is typically created by multiple evolutionary mechanisms such as whole-genome duplication (WGD) or polyploidization and locally repetitive genome expansion. However, the lack of genomic data from chrysanthemum species has prevented an in-depth analysis of the evolutionary mechanisms involved in their diversification. Here, we used Oxford Nanopore long-read technology to sequence the diploid Chrysanthemum nankingense genome, which represents one of the progenitor genomes of domesticated chrysanthemums. Our analysis revealed that the evolution of the C. nankingense genome was driven by bursts of repetitive element expansion and WGD events including a recent WGD that distinguishes chrysanthemum from sunflower, which diverged from chrysanthemum approximately 38.8 million years ago. Variations of ornamental and medicinal traits in chrysanthemums are linked to the expansion of candidate gene families by duplication events including paralogous gene duplication. Collectively, our study of the assembled reference genome offers new knowledge and resources to dissect the history and pattern of evolution and diversification of chrysanthemum plants, and also to accelerate their breeding and improvement.
Affiliation(s)
- Chi Song
- Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Yifei Liu
- College of Pharmacy, Hubei University of Chinese Medicine, Wuhan 430065, China
| | - Aiping Song
- College of Horticulture, Nanjing Agricultural University, Key Laboratory of Landscape Agriculture, Ministry of Agriculture, Nanjing 210095, China
| | | | - Hongbo Zhao
- Department of Ornamental Horticulture, School of Landscape Architecture, Zhejiang Agriculture and Forestry University, Hangzhou 311300, China
| | - Wei Sun
- Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | | | - Ying Wang
- Wuhan Benagen Tech Solutions Company Limited, Wuhan 430070, China
| | - Shuaibin Wang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, The Chinese Academy of Sciences, Guangzhou 510650, China
| | - Tingzhao Li
- Amway (China) Botanical R&D Center, Wuxi 214115, China
| | - Yan Niu
- Wuhan Benagen Tech Solutions Company Limited, Wuhan 430070, China
| | - Jiafu Jiang
- College of Horticulture, Nanjing Agricultural University, Key Laboratory of Landscape Agriculture, Ministry of Agriculture, Nanjing 210095, China
| | - Bin Dong
- Department of Ornamental Horticulture, School of Landscape Architecture, Zhejiang Agriculture and Forestry University, Hangzhou 311300, China
| | - Ye Xia
- Wuhan Benagen Tech Solutions Company Limited, Wuhan 430070, China
| | - Sumei Chen
- College of Horticulture, Nanjing Agricultural University, Key Laboratory of Landscape Agriculture, Ministry of Agriculture, Nanjing 210095, China
| | - Zhigang Hu
- College of Pharmacy, Hubei University of Chinese Medicine, Wuhan 430065, China
| | - Fadi Chen
- College of Horticulture, Nanjing Agricultural University, Key Laboratory of Landscape Agriculture, Ministry of Agriculture, Nanjing 210095, China.
| | - Shilin Chen
- Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China.
| |
|
128
|
Yang J, Huang X. A new high-quality genome sequence in soybean. SCIENCE CHINA. LIFE SCIENCES 2018; 61:1604-1605. [PMID: 30474780 DOI: 10.1007/s11427-018-9431-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/16/2018] [Accepted: 10/22/2018] [Indexed: 10/27/2022]
Affiliation(s)
- Jun Yang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China.
| | - Xuehui Huang
- College of life sciences, Shanghai Normal University, Shanghai, 200234, China.
| |
|
129
|
Deschamps S, Zhang Y, Llaca V, Ye L, Sanyal A, King M, May G, Lin H. A chromosome-scale assembly of the sorghum genome using nanopore sequencing and optical mapping. Nat Commun 2018; 9:4844. [PMID: 30451840 PMCID: PMC6242865 DOI: 10.1038/s41467-018-07271-1] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 05/22/2018] [Accepted: 10/25/2018] [Indexed: 02/02/2023]
Abstract
Long-read sequencing technologies have greatly facilitated assemblies of large eukaryotic genomes. In this paper, Oxford Nanopore sequences generated on a MinION sequencer are combined with Bionano Genomics Direct Label and Stain (DLS) optical maps to generate a chromosome-scale de novo assembly of the repeat-rich Sorghum bicolor Tx430 genome. The final assembly consists of 29 scaffolds, encompassing in most cases entire chromosome arms. It has a scaffold N50 of 33.28 Mbps and covers 90% of the expected genome length. A sequence accuracy of 99.85% is obtained after aligning the assembly against Illumina Tx430 data and 99.6% of the 34,211 public gene models align to the assembly. Comparisons of Tx430 and BTx623 DLS maps against the public BTx623 v3.0.1 genome assembly suggest substantial discrepancies whose origin remains to be determined. In summary, this study demonstrates that informative assemblies of complex plant genomes can be generated by combining nanopore sequencing with DLS optical maps.
Affiliation(s)
- Stéphane Deschamps
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA.
| | - Yun Zhang
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 4010 Point Eden Way, Hayward, CA, 94545, USA
| | - Victor Llaca
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA
| | - Liang Ye
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA
| | - Abhijit Sanyal
- Corteva Agriscience™, Agriculture Division of DowDuPont™, The V-Acendas, Atria Block, 12th Floor, Plot No. 17, Hyderabad, 500081, Telangana, India
| | - Matthew King
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA
| | - Gregory May
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA
| | - Haining Lin
- Corteva Agriscience™, Agriculture Division of DowDuPont™, 8325 NW 62nd Avenue, Johnston, IA, 50131, USA.
| |
|
130
|
Lu FH, McKenzie N, Kettleborough G, Heavens D, Clark MD, Bevan MW. Independent assessment and improvement of wheat genome sequence assemblies using Fosill jumping libraries. Gigascience 2018; 7:4995264. [PMID: 29762659 PMCID: PMC5967450 DOI: 10.1093/gigascience/giy053] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/14/2017] [Accepted: 05/04/2018] [Indexed: 12/20/2022]
Abstract
Background: The accurate sequencing and assembly of very large, often polyploid, genomes remains a challenging task, limiting long-range sequence information and phased sequence variation for applications such as plant breeding. The 15-Gb hexaploid bread wheat (Triticum aestivum) genome has been particularly challenging to sequence, and several different approaches have recently generated long-range assemblies. Mapping and understanding the types of assembly errors are important for optimising future sequencing and assembly approaches and for comparative genomics. Results: Here we use a Fosill 38-kb jumping library to assess medium and longer-range order of different publicly available wheat genome assemblies. Modifications to the Fosill protocol generated longer Illumina sequences and enabled comprehensive genome coverage. Analyses of two independent Bacterial Artificial Chromosome (BAC)-based chromosome-scale assemblies, two independent Illumina whole genome shotgun assemblies, and a hybrid Single Molecule Real Time (SMRT-PacBio) and short read (Illumina) assembly were carried out. We revealed a surprising scale and variety of discrepancies using Fosill mate-pair mapping and validated several of each class. In addition, Fosill mate-pairs were used to scaffold a whole genome Illumina assembly, leading to a 3-fold increase in N50 values. Conclusions: Our analyses, using an independent means to validate different wheat genome assemblies, show that whole genome shotgun assemblies based solely on Illumina sequences are significantly more accurate by all measures compared to BAC-based chromosome-scale assemblies and hybrid SMRT-Illumina approaches. Although current whole genome assemblies are reasonably accurate and useful, additional improvements will be needed to generate complete assemblies of wheat genomes using open-source, computationally efficient, and cost-effective methods.
Affiliation(s)
- Fu-Hao Lu
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Neil McKenzie
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | | | - Darren Heavens
- The Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Matthew D Clark
- The Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Michael W Bevan
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| |
|
131
|
Belser C, Istace B, Denis E, Dubarry M, Baurens FC, Falentin C, Genete M, Berrabah W, Chèvre AM, Delourme R, Deniot G, Denoeud F, Duffé P, Engelen S, Lemainque A, Manzanares-Dauleux M, Martin G, Morice J, Noel B, Vekemans X, D'Hont A, Rousseau-Gueutin M, Barbe V, Cruaud C, Wincker P, Aury JM. Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps. NATURE PLANTS 2018; 4:879-887. [PMID: 30390080 DOI: 10.1038/s41477-018-0289-4] [Citation(s) in RCA: 228] [Impact Index Per Article: 32.6] [Received: 05/23/2018] [Accepted: 09/24/2018] [Indexed: 05/19/2023]
Abstract
Plant genomes are often characterized by a high level of repetitiveness and polyploid nature. Consequently, creating genome assemblies for plant genomes is challenging. The introduction of short-read technologies 10 years ago substantially increased the number of available plant genomes. Generally, these assemblies are incomplete and fragmented, and only a few are at the chromosome scale. Recently, Pacific Biosciences and Oxford Nanopore sequencing technologies were commercialized that can sequence long DNA fragments (kilobases to megabase) and, using efficient algorithms, provide high-quality assemblies in terms of contiguity and completeness of repetitive regions [1-4]. However, even though genome assemblies based on long reads exhibit high contig N50s (>1 Mb), these methods are still insufficient to decipher genome organization at the chromosome level. Here, we describe a strategy based on long reads (MinION or PromethION sequencers) and optical maps (Saphyr system) that can produce chromosome-level assemblies and demonstrate applicability by generating high-quality genome sequences for two new dicotyledon morphotypes, Brassica rapa Z1 (yellow sarson) and Brassica oleracea HDEM (broccoli), and one new monocotyledon, Musa schizocarpa (banana). All three assemblies show contig N50s of >5 Mb and contain scaffolds that represent entire chromosomes or chromosome arms.
Affiliation(s)
- Caroline Belser
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Benjamin Istace
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Erwan Denis
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Marion Dubarry
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Franc-Christophe Baurens
- CIRAD, UMR AGAP, Montpellier, France
- AGAP, Université Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Cyril Falentin
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - Mathieu Genete
- Université Lille, CNRS, UMR 8198-Evo-Eco-Paleo, Lille, France
| | - Wahiba Berrabah
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Anne-Marie Chèvre
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - Régine Delourme
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - Gwenaëlle Deniot
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - France Denoeud
- Génomique Métabolique, Genoscope, Institut de biologie François Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, Evry, France
| | - Philippe Duffé
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - Stefan Engelen
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Arnaud Lemainque
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | | | - Guillaume Martin
- CIRAD, UMR AGAP, Montpellier, France
- AGAP, Université Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Jérôme Morice
- IGEPP, INRA, Agrocampus Ouest, Université Rennes 1, BP35327, Le Rheu, France
| | - Benjamin Noel
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Xavier Vekemans
- Université Lille, CNRS, UMR 8198-Evo-Eco-Paleo, Lille, France
| | - Angélique D'Hont
- CIRAD, UMR AGAP, Montpellier, France
- AGAP, Université Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | | | - Valérie Barbe
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Corinne Cruaud
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut de biologie François Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, Evry, France
| | - Jean-Marc Aury
- Genoscope, Institut de biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France.
| |
|
132
|
Abstract
Understanding how crop plants evolved from their wild relatives and spread around the world can inform about the origins of agriculture. Here, we review how the rapid development of genomic resources and tools has made it possible to conduct genetic mapping and population genetic studies to unravel the molecular underpinnings of domestication and crop evolution in diverse crop species. We propose three future avenues for the study of crop evolution: establishment of high-quality reference genomes for crops and their wild relatives; genomic characterization of germplasm collections; and the adoption of novel methodologies such as archaeogenetics, epigenomics, and genome editing.
Affiliation(s)
- Mona Schreiber
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466, Seeland, Germany
| | - Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466, Seeland, Germany
| | - Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466, Seeland, Germany.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103, Leipzig, Germany.
| |
|
133
|
Thind AK, Wicker T, Müller T, Ackermann PM, Steuernagel B, Wulff BBH, Spannagl M, Twardziok SO, Felder M, Lux T, Mayer KFX, Keller B, Krattinger SG. Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome dynamics between two wheat cultivars. Genome Biol 2018; 19:104. [PMID: 30115097 PMCID: PMC6097286 DOI: 10.1186/s13059-018-1477-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 01/05/2018] [Accepted: 07/10/2018] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the dynamics of wheat genomes on a megabase scale. RESULTS: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes: the old landrace Chinese Spring and the elite Swiss spring wheat line 'CH Campala Lr22a'. Both chromosomes were assembled into megabase-sized scaffolds. There is a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations reveals four large indels of more than 100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the molecular mechanisms that caused these indels. Three of the large indels affect copy number of NLRs, a gene family involved in plant immunity. Analysis of SNP density reveals four haploblocks of 4, 8, 9 and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Gene content across the two chromosomes was highly conserved. Ninety-nine percent of the genic sequences were present in both genotypes and the fraction of unique genes ranged from 0.4 to 0.7%. CONCLUSIONS: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations and gene content. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.
Affiliation(s)
- Anupriya Kaur Thind
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
| | - Thomas Wicker
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
| | - Thomas Müller
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
| | - Patrick M Ackermann
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
| | | | | | | | | | | | - Thomas Lux
- Helmholtz Zentrum Munich, Munich, Germany
| | - Klaus F X Mayer
- Helmholtz Zentrum Munich, Munich, Germany
- School of Life Sciences, Technical University Munich, Munich, Germany
- College of Science, King Saud University, Riad, Kingdom of Saudi Arabia
| | - Beat Keller
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
| | - Simon G Krattinger
- Department of Plant and Microbial Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland.
- King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia.
| |
|
134
|
Shen Y, Liu J, Geng H, Zhang J, Liu Y, Zhang H, Xing S, Du J, Ma S, Tian Z. De novo assembly of a Chinese soybean genome. SCIENCE CHINA. LIFE SCIENCES 2018; 61:871-884. [PMID: 30062469 DOI: 10.1007/s11427-018-9360-0] [Citation(s) in RCA: 104] [Impact Index Per Article: 14.9] [Received: 06/15/2018] [Accepted: 07/05/2018] [Indexed: 10/28/2022]
Abstract
Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome for "Zhonghuang 13" by a combination of SMRT, Hi-C and optical mapping data. The assembled genome size is 1.025 Gb with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparisons between this genome and the previously reported reference genome (cv. Williams 82) uncovered more than 250,000 structure variations. A total of 52,051 protein coding genes and 36,429 transposable elements were annotated for this genome, and a gene co-expression network including 39,967 genes was also established. This high quality Chinese soybean genome and its sequence analysis will provide valuable information for soybean improvement in the future.
Affiliation(s)
- Yanting Shen
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100039, China
| | - Jing Liu
- Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
| | - Haiying Geng
- School of Life Sciences, University of Science and Technology of China, Hefei, 230027, China
| | - Jixiang Zhang
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100039, China
| | - Yucheng Liu
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100039, China
| | | | - Shilai Xing
- Berry Genomics Corporation, Beijing, 100015, China
| | - Jianchang Du
- Provincial Key Laboratory of Agrobiology, Institute of Crop Germplasm and Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China.
| | - Shisong Ma
- School of Life Sciences, University of Science and Technology of China, Hefei, 230027, China.
| | - Zhixi Tian
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.
- University of Chinese Academy of Sciences, Beijing, 100039, China.
|
135
|
|
136
|
An D, Li C, Zhou Y, Wu Y, Wang W. Genomes and Transcriptomes of Duckweeds. Front Chem 2018; 6:230. [PMID: 29974050 PMCID: PMC6019479 DOI: 10.3389/fchem.2018.00230] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 01/11/2018] [Accepted: 05/31/2018] [Indexed: 11/23/2022]
Abstract
Duckweeds (Lemnaceae family) are the smallest flowering plants and are adapted to the aquatic environment. They are regarded as a promising sustainable feedstock with the characteristics of high starch storage, fast propagation, and global distribution. The duckweed genome size varies 13-fold, ranging from 150 Mb in Spirodela polyrhiza to 1,881 Mb in Wolffia arrhiza. With the development of sequencing technology and bioinformatics, five duckweed genomes from the Spirodela and Lemna genera have been sequenced and assembled. The genome annotations reveal that they share similar protein orthologs, whereas the repeat contents could mainly explain the genome size difference. The gene families responsible for cell growth and expansion, lignin biosynthesis, and flowering are greatly contracted. However, the gene family of glutamate synthase has experienced expansion, indicating its significance in ammonia assimilation and nitrogen transport. The transcriptome has been comprehensively sequenced for the genera Spirodela, Landoltia, and Lemna, including various treatments such as abscisic acid, radiation, heavy metal, and starvation. The analysis of the underlying molecular mechanism and the regulatory network would accelerate their applications in the fields of bioenergy and phytoremediation. Comparative genomics has shown that duckweed genomes contain relatively low gene numbers and more contracted gene families, which may be in parallel with their highly reduced morphology with a simple leaf and primary roots. Still, we are waiting for the advancement of long-read sequencing technology to resolve the complex genomes and transcriptomes of the unsequenced Wolffiella and Wolffia, owing to their large genome sizes and the similarity in their polyploidy.
Affiliation(s)
- Dong An
- Department of Plant Sciences, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
| | - Changsheng Li
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
| | - Yong Zhou
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
| | - Yongrui Wu
- National Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
| | - Wenqin Wang
- Department of Plant Sciences, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
|
137
|
Li F, Harkess A. A guide to sequence your favorite plant genomes. Applications in Plant Sciences 2018; 6:e1030. [PMID: 29732260 PMCID: PMC5895188 DOI: 10.1002/aps3.1030] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 09/20/2017] [Accepted: 11/29/2017] [Indexed: 05/12/2023]
Abstract
With the rapid development of sequencing technology and the plummeting cost, assembling whole genomes from non-model plants will soon become routine for plant systematists and evolutionary biologists. Here we summarize and compare several of the latest genome sequencing and assembly approaches, offering a practical guide on how to approach a genome project. We also highlight certain precautions that need to be taken before investing time and money into a genome project.
Affiliation(s)
- Fay‐Wei Li
- Boyce Thompson Institute, Ithaca, New York 14853, USA
- Plant Biology Section, Cornell University, Ithaca, New York 14853, USA
| | - Alex Harkess
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
|
138
|
Krishnakumar R, Sinha A, Bird SW, Jayamohan H, Edwards HS, Schoeniger JS, Patel KD, Branda SS, Bartsch MS. Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias. Sci Rep 2018; 8:3159. [PMID: 29453452 PMCID: PMC5816649 DOI: 10.1038/s41598-018-21484-w] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 12/04/2017] [Accepted: 02/02/2018] [Indexed: 11/09/2022]
Abstract
Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
Affiliation(s)
| | - Anupama Sinha
- Systems Biology, Sandia National Laboratories, Livermore, CA, USA
| | - Sara W Bird
- Biotechnology and Bioengineering, Sandia National Laboratories, Livermore, CA, USA; uBiome, San Francisco, CA, USA
| | - Harikrishnan Jayamohan
- Advanced Systems Engineering & Deployment, Sandia National Laboratories, Livermore, CA, USA; Roche Molecular Systems, Pleasanton, CA, USA
| | - Harrison S Edwards
- Advanced Systems Engineering & Deployment, Sandia National Laboratories, Livermore, CA, USA; University of Toronto, Toronto, Canada
| | | | - Kamlesh D Patel
- Advanced Systems Engineering & Deployment, Sandia National Laboratories, Livermore, CA, USA
| | - Steven S Branda
- Biomass Science and Conversion Technology, Sandia National Laboratories, Livermore, CA, USA
| | - Michael S Bartsch
- Advanced Systems Engineering & Deployment, Sandia National Laboratories, Livermore, CA, USA.
|
139
|
Michael TP, Jupe F, Bemm F, Motley ST, Sandoval JP, Lanz C, Loudet O, Weigel D, Ecker JR. High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nat Commun 2018; 9:541. [PMID: 29416032 PMCID: PMC5803254 DOI: 10.1038/s41467-018-03016-2] [Citation(s) in RCA: 177] [Impact Index Per Article: 25.3] [Received: 06/23/2017] [Accepted: 01/11/2018] [Indexed: 12/17/2022]
Abstract
The handheld Oxford Nanopore MinION sequencer generates ultra-long reads with minimal cost and time requirements, which makes sequencing genomes at the bench feasible. Here, we sequence the gold standard Arabidopsis thaliana genome (KBS-Mac-74 accession) on the bench with the MinION sequencer, and assemble the genome using typical consumer computing hardware (4 Cores, 16 Gb RAM) into chromosome arms (62 contigs with an N50 length of 12.3 Mb). We validate the contiguity and quality of the assembly with two independent single-molecule technologies, Bionano optical genome maps and Pacific Biosciences Sequel sequencing. The new A. thaliana KBS-Mac-74 genome enables resolution of a quantitative trait locus that had previously been recalcitrant to a Sanger-based BAC sequencing approach. In summary, we demonstrate that even when the purpose is to understand complex structural variation at a single region of the genome, complete genome assembly is becoming the simplest way to achieve this goal.
Affiliation(s)
| | - Florian Jupe
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Monsanto Company, Creve Coeur, MO, 63141, USA
| | - Felix Bemm
- Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
| | | | - Justin P Sandoval
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
| | - Christa Lanz
- Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
| | - Olivier Loudet
- Institut Jean-Pierre Bourgin, INRA, AgroParisTech, CNRS, Université Paris-Saclay, 78000, Versailles, France
| | - Detlef Weigel
- Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
| | - Joseph R Ecker
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
|
140
|
Kyriakidou M, Tai HH, Anglin NL, Ellis D, Strömvik MV. Current Strategies of Polyploid Plant Genome Sequence Assembly. Frontiers in Plant Science 2018; 9:1660. [PMID: 30519250 PMCID: PMC6258962 DOI: 10.3389/fpls.2018.01660] [Citation(s) in RCA: 114] [Impact Index Per Article: 16.3] [Received: 04/13/2018] [Accepted: 10/25/2018] [Indexed: 05/14/2023]
Abstract
Polyploidy or duplication of an entire genome occurs in the majority of angiosperms. The understanding of polyploid genomes is important for the improvement of those crops, which humans rely on for sustenance and basic nutrition. As climate change continues to pose a potential threat to agricultural production, there will increasingly be a demand for plant cultivars that can resist biotic and abiotic stresses and also provide needed and improved nutrition. In the past decade, Next Generation Sequencing (NGS) has fundamentally changed the genomics landscape by providing tools for the exploration of polyploid genomes. Here, we review the challenges of the assembly of polyploid plant genomes, and also present recent advances in genomic resources and functional tools in molecular genetics and breeding. As genomes of diploid and less heterozygous progenitor species are increasingly available, we discuss the lack of complexity of these currently available reference genomes as they relate to polyploid crops. Finally, we review recent approaches of haplotyping by phasing and the impact of third generation technologies on polyploid plant genome assembly.
Affiliation(s)
- Maria Kyriakidou
- Department of Plant Science, McGill University, Montreal, QC, Canada
| | - Helen H. Tai
- Fredericton Research and Development Centre, Agriculture and Agri-Food Canada, Fredericton, NB, Canada
| | | | | | - Martina V. Strömvik
- Department of Plant Science, McGill University, Montreal, QC, Canada
- *Correspondence: Martina V. Strömvik
|
141
|
Li C, Lin F, An D, Wang W, Huang R. Genome Sequencing and Assembly by Long Reads in Plants. Genes (Basel) 2017; 9:E6. [PMID: 29283420 PMCID: PMC5793159 DOI: 10.3390/genes9010006] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 11/20/2017] [Revised: 12/18/2017] [Accepted: 12/18/2017] [Indexed: 11/17/2022]
Abstract
Plant genomes generated by Sanger and Next Generation Sequencing (NGS) have provided insight into species diversity and evolution. However, Sanger sequencing is limited in its applications due to high cost, labor intensity, and low throughput, while NGS reads are too short to resolve abundant repeats and polyploidy, leading to incomplete or ambiguous assemblies. The advent and improvement of long-read sequencing by Third Generation Sequencing (TGS) methods such as PacBio and Nanopore have shown promise in producing high-quality assemblies for complex genomes. Here, we review the development of sequencing, introducing the application as well as considerations of experimental design in TGS of plant genomes. We also introduce recent revolutionary scaffolding technologies including BioNano, Hi-C, and 10× Genomics. We expect that the informative guidance for genome sequencing and assembly by long reads will benefit the initiation of scientists' projects.
Affiliation(s)
- Changsheng Li
- College of Agronomy, Shenyang Agricultural University, 120 Dongling Road, Shenyang 110866, China.
| | - Feng Lin
- College of Bioscience and Biotechnology, Shenyang Agricultural University, 120 Dongling Road, Shenyang 110866, China.
| | - Dong An
- School of Agriculture and Biology, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, China.
| | - Wenqin Wang
- School of Agriculture and Biology, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai 200240, China.
| | - Ruidong Huang
- College of Agronomy, Shenyang Agricultural University, 120 Dongling Road, Shenyang 110866, China.
|
142
|
Hofmann NR. Nanopore Sequencing Comes to Plant Genomes. The Plant Cell 2017; 29:2677-2678. [PMID: 29114013 PMCID: PMC5728122 DOI: 10.1105/tpc.17.00863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
|