1
Dallaire X, Bouchard R, Hénault P, Ulmo-Diaz G, Normandeau E, Mérot C, Bernatchez L, Moore JS. Widespread Deviant Patterns of Heterozygosity in Whole-Genome Sequencing Due to Autopolyploidy, Repeated Elements, and Duplication. Genome Biol Evol 2023; 15:evad229. [PMID: 38085037 PMCID: PMC10752349 DOI: 10.1093/gbe/evad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/30/2023] [Indexed: 12/28/2023] Open
Abstract
Most population genomic tools rely on accurate single nucleotide polymorphism (SNP) calling and filtering to meet their underlying assumptions. However, genomic complexity, resulting from structural variants, paralogous sequences, and repetitive elements, presents significant challenges in assembling contiguous reference genomes. Consequently, short-read resequencing studies can encounter mismapping issues, leading to SNPs that deviate from Mendelian expected patterns of heterozygosity and allelic ratio. In this study, we employed the ngsParalog software to identify such deviant SNPs in whole-genome sequencing (WGS) data with low (1.5×) to intermediate (4.8×) coverage for four species: Arctic Char (Salvelinus alpinus), Lake Whitefish (Coregonus clupeaformis), Atlantic Salmon (Salmo salar), and the American Eel (Anguilla rostrata). The analyses revealed that deviant SNPs accounted for 22% to 62% of all SNPs in salmonid datasets and approximately 11% in the American Eel dataset. These deviant SNPs were particularly concentrated within repetitive elements and genomic regions that had recently undergone rediploidization in salmonids. Additionally, narrow peaks of elevated coverage were ubiquitous along all four reference genomes, encompassed most deviant SNPs, and could be partially associated with transposons and tandem repeats. Including these deviant SNPs in genomic analyses led to highly distorted site frequency spectra, underestimated pairwise FST values, and overestimated nucleotide diversity. Considering the widespread occurrence of deviant SNPs arising from a variety of sources, their important impact in estimating population parameters, and the availability of effective tools to identify them, we propose that excluding deviant SNPs from WGS datasets is required to improve genomic inferences for a wide range of taxa and sequencing depths.
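As a rough illustration of what "deviating from Mendelian expected patterns of heterozygosity" means, the sketch below compares observed heterozygosity at one SNP with the Hardy-Weinberg expectation. It is not the ngsParalog method used in the study: it assumes hard genotype calls are available, which is unrealistic at the low coverage described above, and the function name `heterozygosity_excess` is hypothetical.

```python
def heterozygosity_excess(n_ref_hom, n_het, n_alt_hom):
    """Return (observed het, expected het, F) for one biallelic SNP.

    Assumes confidently called genotypes (an assumption of this sketch,
    not of the study). F = 1 - H_obs / H_exp; a strongly negative F means
    an excess of heterozygotes, the pattern expected when reads from
    paralogous copies are mismapped onto a single locus.
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotyped individuals")
    # Reference allele frequency from genotype counts.
    p = (2 * n_ref_hom + n_het) / (2 * n)
    h_exp = 2 * p * (1 - p)   # Hardy-Weinberg expected heterozygosity
    h_obs = n_het / n         # observed heterozygosity
    if h_exp == 0:
        return h_obs, h_exp, 0.0  # monomorphic site: F is undefined, report 0
    return h_obs, h_exp, 1 - h_obs / h_exp


# A site in Hardy-Weinberg proportions versus one where every individual
# appears heterozygous.
print(heterozygosity_excess(25, 50, 25))
print(heterozygosity_excess(0, 100, 0))
```

A site where every sampled individual looks heterozygous gives F = -1, which a single Mendelian locus in a randomly mating population would essentially never produce.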
Affiliation(s)
- Xavier Dallaire
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
- Raphael Bouchard
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Philippe Hénault
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Gabriela Ulmo-Diaz
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Eric Normandeau
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
  - Plateforme de bio-informatique de l’IBIS, Université Laval, Québec, Canada
- Claire Mérot
  - CNRS, UMR 6553 ECOBIO, Université de Rennes, Rennes, France
- Louis Bernatchez
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Jean-Sébastien Moore
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - Centre d'Études Nordiques, Université Laval, Québec, Canada
  - Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
2
Adi TK, Fujie M, Satoh N, Ueki T. The acidic amino acid-rich C-terminal domain of VanabinX enhances reductase activity, attaining 1.3- to 1.7-fold vanadium reduction. Biochem Biophys Rep 2022; 32:101349. [PMID: 36147050 PMCID: PMC9486056 DOI: 10.1016/j.bbrep.2022.101349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 09/07/2022] [Accepted: 09/13/2022] [Indexed: 11/20/2022] Open
Abstract
Ascidians accumulate extremely high levels of vanadium (V) in their blood cells. Several V-related proteins, including V-binding proteins (vanabins), have been isolated from V-accumulating ascidians. In this study, to obtain a deeper understanding of vanabins, we performed de novo transcriptome analysis of blood cells from a V-rich ascidian, Ascidia sydneiensis samea, and constructed a database containing 8532 predicted proteins. We found a novel vanabin with a unique acidic amino acid–rich C-terminal domain, designated VanabinX, in the database and studied it in detail. Reverse-transcription polymerase chain reaction analysis revealed that VanabinX was detected in all adult tissues examined, and was most prominent in blood cells and muscle tissue. We prepared recombinant proteins and performed immobilized metal ion affinity chromatography and a NADPH-coupled V(V)-reductase assay. VanabinX bound to metal ions, with affinity in the order Cu(II) > Zn(II) > Co(II), but not to V(IV). VanabinX reduced V(V) to V(IV) at a rate of 0.170 μM per micromolar protein within 30 min. The C-terminal acidic domain enhanced the reduction of V(V) by Vanabin2 1.3-fold and by VanabinX itself 1.7-fold in trans mode. In summary, we constructed a protein database containing 8532 predicted proteins expressed in blood cells; among them, we discovered a novel vanabin, VanabinX, which enhances V reduction by vanabins.
A novel vanadium-binding protein was identified from a vanadium-rich ascidian. This protein, named VanabinX, does not bind strongly to V(IV). VanabinX can reduce V(V) to V(IV) in a NADPH/GR/GSH cascade. The acidic C-terminal domain of VanabinX enhances V(V) reduction by vanabins in trans mode.
3
Todesco M, Bercovich N, Kim A, Imerovski I, Owens GL, Dorado Ruiz Ó, Holalu SV, Madilao LL, Jahani M, Légaré JS, Blackman BK, Rieseberg LH. Genetic basis and dual adaptive role of floral pigmentation in sunflowers. eLife 2022; 11:72072. [PMID: 35040432 PMCID: PMC8765750 DOI: 10.7554/elife.72072] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 07/09/2021] [Accepted: 11/28/2021] [Indexed: 12/25/2022] Open
Abstract
Variation in floral displays, both between and within species, has been long known to be shaped by the mutualistic interactions that plants establish with their pollinators. However, increasing evidence suggests that abiotic selection pressures influence floral diversity as well. Here, we analyse the genetic and environmental factors that underlie patterns of floral pigmentation in wild sunflowers. While sunflower inflorescences appear invariably yellow to the human eye, they display extreme diversity for patterns of ultraviolet pigmentation, which are visible to most pollinators. We show that this diversity is largely controlled by cis-regulatory variation affecting a single MYB transcription factor, HaMYB111, through accumulation of ultraviolet (UV)-absorbing flavonol glycosides in ligules (the ‘petals’ of sunflower inflorescences). Different patterns of ultraviolet pigments in flowers are strongly correlated with pollinator preferences. Furthermore, variation for floral ultraviolet patterns is associated with environmental variables, especially relative humidity, across populations of wild sunflowers. Ligules with larger ultraviolet patterns, which are found in drier environments, show increased resistance to desiccation, suggesting a role in reducing water loss. The dual role of floral UV patterns in pollinator attraction and abiotic response reveals the complex adaptive balance underlying the evolution of floral traits.
Flowers are an important part of how many plants reproduce. Their distinctive colours, shapes and patterns attract specific pollinators, but they can also help to protect the plant from predators and environmental stresses. Many flowers contain pigments that absorb ultraviolet (UV) light to display distinct UV patterns that, although invisible to the human eye, are visible to most pollinators. For example, when seen in UV, sunflowers feature a ‘bullseye’ with a dark centre surrounded by a reflective outer ring. The sizes and thicknesses of these rings vary a lot within and between flower species, and so far, it has been unclear what causes this variation and how it affects the plants. To find out more, Todesco et al. studied the UV patterns in various wild sunflowers across North America by considering the ecology and molecular biology of different plants. This revealed great variation between the UV patterns of the different sunflower populations. Moreover, Todesco et al. found that a gene called HaMYB111 is responsible for the diverse UV patterns in the sunflowers. This gene controls how plants make chemicals called flavonols that absorb UV light. Flavonols also help to protect plants from damage caused by droughts and extreme temperatures. Todesco et al. showed that plants with larger bullseyes had more flavonols, attracted more pollinators, and were better at conserving water. Accordingly, these plants were found in drier locations. This study suggests that, at least in sunflowers, UV patterns help both to attract pollinators and to control water loss. These insights could help to improve pollination, and consequently yield, in cultivated plants, and to develop plants with better resistance to extreme weather. This work also highlights the importance of combining biology on small and large scales to understand complex processes, such as adaptation and evolution.
Affiliation(s)
- Marco Todesco
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Natalia Bercovich
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Amy Kim
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Ivana Imerovski
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Gregory L Owens
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
  - Department of Biology, University of Victoria
- Óscar Dorado Ruiz
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Lufiani L Madilao
  - Michael Smith Laboratory and Wine Research Centre, University of British Columbia
- Mojtaba Jahani
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Jean-Sébastien Légaré
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
- Loren H Rieseberg
  - Department of Botany and Biodiversity Research Centre, University of British Columbia
4
Rodrigues JA, Hsieh PH, Ruan D, Nishimura T, Sharma MK, Sharma R, Ye X, Nguyen ND, Nijjar S, Ronald PC, Fischer RL, Zilberman D. Divergence among rice cultivars reveals roles for transposition and epimutation in ongoing evolution of genomic imprinting. Proc Natl Acad Sci U S A 2021; 118:e2104445118. [PMID: 34272287 PMCID: PMC8307775 DOI: 10.1073/pnas.2104445118] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022] Open
Abstract
Parent-of-origin-dependent gene expression in mammals and flowering plants results from differing chromatin imprints (genomic imprinting) between maternally and paternally inherited alleles. Imprinted gene expression in the endosperm of seeds is associated with localized hypomethylation of maternally but not paternally inherited DNA, with certain small RNAs also displaying parent-of-origin-specific expression. To understand the evolution of imprinting mechanisms in Oryza sativa (rice), we analyzed imprinting divergence among four cultivars that span both japonica and indica subspecies: Nipponbare, Kitaake, 93-11, and IR64. Most imprinted genes are imprinted across cultivars and enriched for functions in chromatin and transcriptional regulation, development, and signaling. However, 4 to 11% of imprinted genes display divergent imprinting. Analyses of DNA methylation and small RNAs revealed that endosperm-specific 24-nt small RNA-producing loci show weak RNA-directed DNA methylation, frequently overlap genes, and are imprinted four times more often than genes. However, imprinting divergence most often correlated with local DNA methylation epimutations (9 of 17 assessable loci), which were largely stable within subspecies. Small insertion/deletion events and transposable element insertions accompanied 4 of the 9 locally epimutated loci and associated with imprinting divergence at another 4 of the remaining 8 loci. Correlating epigenetic and genetic variation occurred at key regulatory regions: the promoter and transcription start site of maternally biased genes, and the promoter and gene body of paternally biased genes. Our results reinforce models for the role of maternal-specific DNA hypomethylation in imprinting of both maternally and paternally biased genes, and highlight the role of transposition and epimutation in rice imprinting evolution.
Affiliation(s)
- Jessica A Rodrigues
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Ping-Hung Hsieh
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Deling Ruan
  - Department of Plant Pathology, University of California, Davis, CA 95616
- Toshiro Nishimura
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Manoj K Sharma
  - Department of Plant Pathology, University of California, Davis, CA 95616
- Rita Sharma
  - Department of Plant Pathology, University of California, Davis, CA 95616
- XinYi Ye
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Nicholas D Nguyen
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Sukhranjan Nijjar
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Pamela C Ronald
  - Department of Plant Pathology, University of California, Davis, CA 95616
  - The Genome Center, University of California, Davis, CA 95616
- Robert L Fischer
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Daniel Zilberman
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
  - Department of Cell and Developmental Biology, The John Innes Centre, Norwich NR4 7UH, United Kingdom
5
Gurdon C, Kozik A, Tao R, Poulev A, Armas I, Michelmore RW, Raskin I. Isolating an active and inactive CACTA transposon from lettuce color mutants and characterizing their family. Plant Physiol 2021; 186:929-944. [PMID: 33768232 PMCID: PMC8195511 DOI: 10.1093/plphys/kiab143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/22/2020] [Accepted: 03/02/2021] [Indexed: 06/01/2023]
Abstract
Dietary flavonoids play an important role in human nutrition and health. Flavonoid biosynthesis genes have recently been identified in lettuce (Lactuca sativa); however, few mutants have been characterized. We now report the causative mutations in Green Super Lettuce (GSL), a natural light green mutant derived from red cultivar NAR; and GSL-Dark Green (GSL-DG), an olive-green natural derivative of GSL. GSL harbors CACTA 1 (LsC1), a 3.9-kb active nonautonomous CACTA superfamily transposon inserted in the 5' untranslated region of anthocyanidin synthase (ANS), a gene coding for a key enzyme in anthocyanin biosynthesis. Both terminal inverted repeats (TIRs) of this transposon were intact, enabling somatic excision of the mobile element, which led to the restoration of ANS expression and the accumulation of red anthocyanins in sectors on otherwise green leaves. GSL-DG harbors CACTA 2 (LsC2), a 1.1-kb truncated copy of LsC1 that lacks one of the TIRs, rendering the transposon inactive. RNA-sequencing and reverse transcription quantitative PCR of NAR, GSL, and GSL-DG indicated the relative expression level of ANS was strongly influenced by the transposon insertions. Analysis of flavonoid content indicated leaf cyanidin levels correlated positively with ANS expression. Bioinformatic analysis of the cv Salinas lettuce reference genome led to the discovery and characterization of an LsC1 transposon family with a putative transposon copy number greater than 1,700. Homologs of tnpA and tnpD, the genes encoding two proteins necessary for activation of transposition of CACTA elements, were also identified in the lettuce genome.
Affiliation(s)
- Csanad Gurdon
  - Department of Plant Biology, Rutgers University, New Brunswick, New Jersey 08901-8520, USA
- Rong Tao
  - UC Davis Genome Center, Davis, California 95616, USA
- Alexander Poulev
  - Department of Plant Biology, Rutgers University, New Brunswick, New Jersey 08901-8520, USA
- Isabel Armas
  - Department of Plant Biology, Rutgers University, New Brunswick, New Jersey 08901-8520, USA
- Ilya Raskin
  - Department of Plant Biology, Rutgers University, New Brunswick, New Jersey 08901-8520, USA
6
Massive haplotypes underlie ecotypic differentiation in sunflowers. Nature 2020; 584:602-607. [PMID: 32641831 DOI: 10.1038/s41586-020-2467-6] [Citation(s) in RCA: 162] [Impact Index Per Article: 40.5] [Received: 09/28/2019] [Accepted: 04/16/2020] [Indexed: 12/22/2022]
Abstract
Species often include multiple ecotypes that are adapted to different environments. However, it is unclear how ecotypes arise and how their distinctive combinations of adaptive alleles are maintained despite hybridization with non-adapted populations. Here, by resequencing 1,506 wild sunflowers from 3 species (Helianthus annuus, Helianthus petiolaris and Helianthus argophyllus), we identify 37 large (1 to 100 Mbp in size), non-recombining haplotype blocks that are associated with numerous ecologically relevant traits, as well as soil and climate characteristics. Limited recombination in these haplotype blocks keeps adaptive alleles together, and these regions differentiate sunflower ecotypes. For example, haplotype blocks control a 77-day difference in flowering between ecotypes of the silverleaf sunflower H. argophyllus (probably through deletion of a homologue of FLOWERING LOCUS T (FT)), and are associated with seed size, flowering time and soil fertility in dune-adapted sunflowers. These haplotypes are highly divergent, frequently associated with structural variants and often appear to represent introgressions from other (possibly now-extinct) congeners. These results highlight a pervasive role of structural variation in ecotypic adaptation.
7
Ostevik KL, Samuk K, Rieseberg LH. Ancestral Reconstruction of Karyotypes Reveals an Exceptional Rate of Nonrandom Chromosomal Evolution in Sunflower. Genetics 2020; 214:1031-1045. [PMID: 32033968 PMCID: PMC7153943 DOI: 10.1534/genetics.120.303026] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/15/2019] [Accepted: 02/03/2020] [Indexed: 12/20/2022] Open
Abstract
Mapping the chromosomal rearrangements between species can inform our understanding of genome evolution, reproductive isolation, and speciation. Here, we present a novel algorithm for identifying regions of synteny in pairs of genetic maps, which is implemented in the accompanying R package syntR. The syntR algorithm performs as well as previous ad hoc methods while being systematic, repeatable, and applicable to mapping chromosomal rearrangements in any group of species. In addition, we present a systematic survey of chromosomal rearrangements in the annual sunflowers, which is a group known for extreme karyotypic diversity. We build high-density genetic maps for two subspecies of the prairie sunflower, Helianthus petiolaris ssp. petiolaris and H. petiolaris ssp. fallax. Using syntR, we identify blocks of synteny between these two subspecies and previously published high-density genetic maps. We reconstruct ancestral karyotypes for annual sunflowers using those synteny blocks and conservatively estimate that there have been 7.9 chromosomal rearrangements per million years, a high rate of chromosomal evolution. Although the rate of inversion is even higher than the rate of translocation in this group, we further find that every extant karyotype is distinguished by between one and three translocations involving only 8 of the 17 chromosomes. This nonrandom exchange suggests that specific chromosomes are prone to translocation and may thus contribute disproportionately to widespread hybrid sterility in sunflowers. These data deepen our understanding of chromosome evolution and confirm that Helianthus has an exceptional rate of chromosomal rearrangement that may facilitate similarly rapid diversification.
Affiliation(s)
- Kate L Ostevik
  - Department of Biology, Duke University, Durham, North Carolina 27701
  - Department of Botany, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Kieran Samuk
  - Department of Biology, Duke University, Durham, North Carolina 27701
- Loren H Rieseberg
  - Department of Botany, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
8
Hoang NV, Furtado A, Perlo V, Botha FC, Henry RJ. The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome. Front Genet 2019; 10:654. [PMID: 31396260 PMCID: PMC6664245 DOI: 10.3389/fgene.2019.00654] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/09/2018] [Accepted: 06/20/2019] [Indexed: 11/13/2022] Open
Abstract
Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a plant with a highly polyploid genome, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, many longer transcripts were removed, and many new, generally shorter transcripts were detected by normalization. For the same input cDNA and data yield, the normalized library recovered more total transcript isoforms and a larger number of predicted gene families and orthologous groups, resulting in a higher representation for the sugarcane transcriptome, compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range with more long transcripts above ∼1.25 kb and more transcript isoforms per gene family and gene ontology terms per transcript. A large proportion of the unique transcripts, comprising ∼52% of the normalized library, were expressed at a lower level than the unique transcripts from the non-normalized library, across three tissue types tested including leaf, stalk, and root. About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ∼80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementation of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.
Affiliation(s)
- Nam V. Hoang
  - College of Agriculture and Forestry, Hue University, Hue, Vietnam
- Agnelo Furtado
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
- Virginie Perlo
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
- Frederik C. Botha
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Robert J. Henry
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
9
Ichida H, Abe T. An improved and robust method to efficiently deplete repetitive elements from complex plant genomes. Plant Sci 2019; 280:455-460. [PMID: 30824026 DOI: 10.1016/j.plantsci.2018.10.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/10/2018] [Revised: 10/21/2018] [Accepted: 10/23/2018] [Indexed: 06/09/2023]
Abstract
Genome size and complexity often present major challenges to genome-based approaches in crop plants and other agricultural species. For instance, repetitive sequences comprise 80% to 90% of the genome of hexaploid wheat, which has a haploid genome size of approximately 17 Gb. In this study, we developed an improved design and procedure for short-read library preparation that uses a modified adapter and duplex-specific nuclease (DSN) for the efficient elimination of highly repeated sequence elements within genomes. The improved adapter, which has a hairpin-like form for stability, was constructed from truncated sequences adjacent to the original Illumina TruSeq adapter and can be converted to a full-length adapter structure during PCR amplification. Using the hairpin-structured adapter, we prepared randomly sheared genomic libraries from rice and diploid, tetraploid, and hexaploid wheat cultivars and evaluated the efficiency of DSN for the enzymatic depletion of repetitive elements. According to real-time quantitative PCR analysis, the relative abundances of 18S and 25S ribosomal DNA decreased respectively to 1.15% and 3.54% in rice and 1.70%-1.95% and 14.71%-20.01% in the three wheat cultivars. Whole-genome sequencing analysis of a diploid wheat cultivar, KU104-1, indicated that DSN treatment with the designed hairpin-structured adapter dramatically reduced highly repetitive elements, such as Ty1-Copia and Ty3-Gypsy retrotransposons and DNA transposons, within the genome, while sequencing reads derived from low-copy genes and protein coding sequences increased more than 50%. Our new procedure should be useful not only for wheat genomes but also for other agricultural plant species with relatively large and complex genomes.
Affiliation(s)
- Hiroyuki Ichida
  - RIKEN Nishina Center for Accelerator-Based Science, Saitama 351-0198, Japan
- Tomoko Abe
  - RIKEN Nishina Center for Accelerator-Based Science, Saitama 351-0198, Japan
10
Boone M, De Koker A, Callewaert N. Capturing the 'ome': the expanding molecular toolbox for RNA and DNA library construction. Nucleic Acids Res 2018; 46:2701-2721. [PMID: 29514322 PMCID: PMC5888575 DOI: 10.1093/nar/gky167] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/05/2017] [Revised: 02/05/2018] [Accepted: 02/23/2018] [Indexed: 12/14/2022] Open
Abstract
All sequencing experiments and most functional genomics screens rely on the generation of libraries to comprehensively capture pools of targeted sequences. In the past decade especially, driven by the progress in the field of massively parallel sequencing, numerous studies have comprehensively assessed the impact of particular manipulations on library complexity and quality, and characterized the activities and specificities of several key enzymes used in library construction. Fortunately, careful protocol design and reagent choice can substantially mitigate many of these biases, and enable reliable representation of sequences in libraries. This review aims to guide the reader through the vast expanse of literature on the subject to promote informed library generation, independent of the application.
Affiliation(s)
- Morgane Boone
  - Center for Medical Biotechnology, VIB, Zwijnaarde 9052, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Ghent 9000, Belgium
- Andries De Koker
  - Center for Medical Biotechnology, VIB, Zwijnaarde 9052, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Ghent 9000, Belgium
- Nico Callewaert
  - Center for Medical Biotechnology, VIB, Zwijnaarde 9052, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Ghent 9000, Belgium
11
Owens GL, Todesco M, Drummond EBM, Yeaman S, Rieseberg LH. A novel post hoc method for detecting index switching finds no evidence for increased switching on the Illumina HiSeq X. Mol Ecol Resour 2017; 18:169-175. [DOI: 10.1111/1755-0998.12713] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 05/25/2017] [Revised: 08/04/2017] [Accepted: 08/08/2017] [Indexed: 11/26/2022]
Affiliation(s)
- Gregory L. Owens
  - Department of Botany and Beaty Biodiversity Centre, University of British Columbia, Vancouver BC, Canada
- Marco Todesco
  - Department of Botany and Beaty Biodiversity Centre, University of British Columbia, Vancouver BC, Canada
- Emily B. M. Drummond
  - Department of Botany and Beaty Biodiversity Centre, University of British Columbia, Vancouver BC, Canada
- Sam Yeaman
  - Department of Biological Sciences, University of Calgary, Calgary AB, Canada
- Loren H. Rieseberg
  - Department of Botany and Beaty Biodiversity Centre, University of British Columbia, Vancouver BC, Canada
12
Moyers BT, Owens GL, Baute GJ, Rieseberg LH. The genetic architecture of UV floral patterning in sunflower. Ann Bot 2017; 120:39-50. [PMID: 28459939 PMCID: PMC5737206 DOI: 10.1093/aob/mcx038] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/16/2016] [Accepted: 03/14/2017] [Indexed: 06/07/2023]
Abstract
Background and Aims: The patterning of floral ultraviolet (UV) pigmentation varies both intra- and interspecifically in sunflowers and many other plant species, impacts pollinator attraction, and can be critical to reproductive success and crop yields. However, the genetic basis for variation in UV patterning is largely unknown. This study examines the genetic architecture for proportional and absolute size of the UV bullseye in Helianthus argophyllus, a close relative of the domesticated sunflower. Methods: A camera modified to capture UV light (320-380 nm) was used to phenotype floral UV patterning in an F2 mapping population, then quantitative trait loci (QTL) were identified using genotyping-by-sequencing and linkage mapping. The ability of these QTL to predict the UV patterning of natural population individuals was also assessed. Key Results: Proportional UV pigmentation is additively controlled by six moderate effect QTL that are predictive of this phenotype in natural populations. In contrast, UV bullseye size is controlled by a single large effect QTL that also controls flowerhead size and co-localizes with a major flowering time QTL in Helianthus. Conclusions: The co-localization of the UV bullseye size QTL, flowerhead size QTL and a previously known flowering time QTL may indicate a single highly pleiotropic locus or several closely linked loci, which could inhibit UV bullseye size from responding to selection without change in correlated characters. The genetic architecture of proportional UV pigmentation is relatively simple and different from that of UV bullseye size, and so should be able to respond to natural or artificial selection independently.
Affiliation(s)
- Brook T. Moyers
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Room 3529-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523, USA
- Gregory L. Owens
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Room 3529-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada
- Gregory J. Baute
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Room 3529-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada
- Loren H. Rieseberg
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Room 3529-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada
|
13
|
Reyes-Chin-Wo S, Wang Z, Yang X, Kozik A, Arikit S, Song C, Xia L, Froenicke L, Lavelle DO, Truco MJ, Xia R, Zhu S, Xu C, Xu H, Xu X, Cox K, Korf I, Meyers BC, Michelmore RW. Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce. Nat Commun 2017; 8:14953. [PMID: 28401891 PMCID: PMC5394340 DOI: 10.1038/ncomms14953] [Citation(s) in RCA: 217] [Impact Index Per Article: 31.0] [Received: 08/15/2016] [Accepted: 02/15/2017] [Indexed: 01/03/2023] Open
Abstract
Lettuce (Lactuca sativa) is a major crop and a member of the large, highly successful Compositae family of flowering plants. Here we present a reference assembly for the species and family. This was generated using whole-genome shotgun Illumina reads plus in vitro proximity ligation data to create large superscaffolds; it was validated genetically and superscaffolds were oriented in genetic bins ordered along nine chromosomal pseudomolecules. We identify several genomic features that may have contributed to the success of the family, including genes encoding Cycloidea-like transcription factors, kinases, enzymes involved in rubber biosynthesis and disease resistance proteins that are expanded in the genome. We characterize 21 novel microRNAs, one of which may trigger phasiRNAs from numerous kinase transcripts. We provide evidence for a whole-genome triplication event specific but basal to the Compositae. We detect 26% of the genome in triplicated regions containing 30% of all genes that are enriched for regulatory sequences and depleted for genes involved in defence.
Affiliation(s)
- Siwaret Arikit
  - Delaware Biotechnology Institute, University of Delaware, Newark, Delaware 19711, USA
- Chi Song
  - BGI Shenzhen, Shenzhen 518083, China
- Rui Xia
  - Donald Danforth Plant Science Center, 975 North Warson Road, St Louis, Missouri 63132, USA
- Huaqin Xu
  - UC Davis Genome Center, Davis, California 95616, USA
- Xun Xu
  - BGI Shenzhen, Shenzhen 518083, China
- Kyle Cox
  - UC Davis Genome Center, Davis, California 95616, USA
- Ian Korf
  - UC Davis Genome Center, Davis, California 95616, USA
  - Department of Molecular & Cellular Biology, UC Davis, California 95616, USA
- Blake C. Meyers
  - Delaware Biotechnology Institute, University of Delaware, Newark, Delaware 19711, USA
  - Donald Danforth Plant Science Center, 975 North Warson Road, St Louis, Missouri 63132, USA
- Richard W. Michelmore
  - UC Davis Genome Center, Davis, California 95616, USA
  - Department of Molecular & Cellular Biology, UC Davis, California 95616, USA
  - Department of Plant Sciences, UC Davis, California 95616, USA
  - Department of Medical Microbiology & Immunology, UC Davis, California 95616, USA
|
14
|
Sánchez-Martín J, Steuernagel B, Ghosh S, Herren G, Hurni S, Adamski N, Vrána J, Kubaláková M, Krattinger SG, Wicker T, Doležel J, Keller B, Wulff BBH. Rapid gene isolation in barley and wheat by mutant chromosome sequencing. Genome Biol 2016; 17:221. [PMID: 27795210 PMCID: PMC5087116 DOI: 10.1186/s13059-016-1082-1] [Citation(s) in RCA: 183] [Impact Index Per Article: 22.9] [Received: 05/16/2016] [Accepted: 10/10/2016] [Indexed: 11/18/2022] Open
Abstract
Identification of causal mutations in barley and wheat is hampered by their large genomes and suppressed recombination. To overcome these obstacles, we have developed MutChromSeq, a complexity reduction approach based on flow sorting and sequencing of mutant chromosomes, to identify induced mutations by comparison to parental chromosomes. We apply MutChromSeq to six mutants each of the barley Eceriferum-q gene and the wheat Pm2 genes. This approach unambiguously identified single candidate genes that were verified by Sanger sequencing of additional mutants. MutChromSeq enables reference-free forward genetics in barley and wheat, thus opening up their pan-genomes to functional genomics.
Affiliation(s)
- Javier Sánchez-Martín
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
- Sreya Ghosh
  - John Innes Centre, Norwich Research Park, Norwich, NR4 7UH UK
- Gerhard Herren
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
- Severine Hurni
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
- Nikolai Adamski
  - John Innes Centre, Norwich Research Park, Norwich, NR4 7UH UK
- Jan Vrána
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 31, Olomouc, CZ-78371 Czech Republic
- Marie Kubaláková
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 31, Olomouc, CZ-78371 Czech Republic
- Simon G. Krattinger
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
- Thomas Wicker
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
- Jaroslav Doležel
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 31, Olomouc, CZ-78371 Czech Republic
- Beat Keller
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstrasse 107, Zürich, CH-8008 Switzerland
|
15
|
Damerum A, Selmes SL, Biggi GF, Clarkson GJJ, Rothwell SD, Truco MJ, Michelmore RW, Hancock RD, Shellcock C, Chapman MA, Taylor G. Elucidating the genetic basis of antioxidant status in lettuce (Lactuca sativa). Hortic Res 2015; 2:15055. [PMID: 26640696 PMCID: PMC4660231 DOI: 10.1038/hortres.2015.55] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 06/16/2015] [Revised: 10/21/2015] [Accepted: 10/21/2015] [Indexed: 05/24/2023]
Abstract
A diet rich in phytonutrients from fruit and vegetables has been acknowledged to afford protection against a range of human diseases, but many of the most popular vegetables are low in phytonutrients. Wild relatives of crops may contain allelic variation for genes determining the concentrations of these beneficial phytonutrients, and therefore understanding the genetic basis of this variation is important for breeding efforts to enhance nutritional quality. In this study, lettuce recombinant inbred lines, generated from a cross between wild and cultivated lettuce (Lactuca serriola and Lactuca sativa, respectively), were analysed for antioxidant (AO) potential and important phytonutrients including carotenoids, chlorophyll and phenolic compounds. When grown in two environments, 96 quantitative trait loci (QTL) were identified for these nutritional traits: 4 for AO potential, 2 for carotenoid content, 3 for total chlorophyll content and 87 for individual phenolic compounds (two per compound on average). Most often, the L. serriola alleles conferred an increase in total AOs and metabolites. Candidate genes underlying these QTL were identified by BLASTn searches; in several cases, these had functions suggesting involvement in phytonutrient biosynthetic pathways. Analysis of a QTL on linkage group 3, which accounted for >30% of the variation in AO potential, revealed several candidate genes encoding multiple MYB transcription factors which regulate flavonoid biosynthesis and flavanone 3-hydroxylase, an enzyme involved in the biosynthesis of the flavonoids quercetin and kaempferol, which are known to have powerful AO activity. Follow-up quantitative RT-PCR of these candidates revealed that 5 out of 10 genes investigated were significantly differentially expressed between the wild and cultivated parents, providing further evidence of their potential involvement in determining the contrasting phenotypes. 
These results offer exciting opportunities to improve the nutritional content and health benefits of lettuce through marker-assisted breeding.
Affiliation(s)
- Annabelle Damerum
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
- Stacey L Selmes
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
- Gaia F Biggi
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
- Graham JJ Clarkson
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
  - Vitacress Limited, Lower Link Farm, St Mary Bourne, Andover, Hampshire SP11 6DB, UK
- Steve D Rothwell
  - Vitacress Limited, Lower Link Farm, St Mary Bourne, Andover, Hampshire SP11 6DB, UK
- Maria José Truco
  - The Genome Centre and the Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Richard W Michelmore
  - The Genome Centre and the Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Mark A Chapman
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
- Gail Taylor
  - Centre for Biological Sciences, University of Southampton, Life Sciences, University Road, Southampton SO17 1BJ, UK
|
16
|
Chung BY, Hardcastle TJ, Jones JD, Irigoyen N, Firth AE, Baulcombe DC, Brierley I. The use of duplex-specific nuclease in ribosome profiling and a user-friendly software package for Ribo-seq data analysis. RNA 2015; 21:1731-45. [PMID: 26286745 PMCID: PMC4574750 DOI: 10.1261/rna.052548.115] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 05/15/2015] [Accepted: 06/23/2015] [Indexed: 05/19/2023]
Abstract
Ribosome profiling is a technique that permits genome-wide, quantitative analysis of translation and has found broad application in recent years. Here we describe a modified profiling protocol and software package designed to benefit more broadly the translation community in terms of simplicity and utility. The protocol, applicable to diverse organisms, including organelles, is based largely on previously published profiling methodologies, but uses duplex-specific nuclease (DSN) as a convenient, species-independent way to reduce rRNA contamination. We show that DSN-based depletion compares favorably with other commonly used rRNA depletion strategies and introduces little bias. The profiling protocol typically produces high levels of triplet periodicity, facilitating the detection of coding sequences, including upstream, downstream, and overlapping open reading frames (ORFs) and an alternative ribosome conformation evident during termination of protein synthesis. In addition, we provide a software package that presents a set of methods for parsing ribosomal profiling data from multiple samples, aligning reads to coding sequences, inferring alternative ORFs, and plotting average and transcript-specific aspects of the data. Methods are also provided for extracting the data in a form suitable for differential analysis of translation and translational efficiency.
Affiliation(s)
- Betty Y Chung
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
- Thomas J Hardcastle
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
- Joshua D Jones
  - Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, United Kingdom
- Nerea Irigoyen
  - Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, United Kingdom
- Andrew E Firth
  - Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, United Kingdom
- David C Baulcombe
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
- Ian Brierley
  - Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, United Kingdom
|
17
|
Christopoulou M, McHale LK, Kozik A, Reyes-Chin Wo S, Wroblewski T, Michelmore RW. Dissection of Two Complex Clusters of Resistance Genes in Lettuce (Lactuca sativa). Mol Plant Microbe Interact 2015; 28:751-65. [PMID: 25650829 DOI: 10.1094/mpmi-06-14-0175-r] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 06/04/2023]
Abstract
Of the over 50 phenotypic resistance genes mapped in lettuce, 25 colocalize to three major resistance clusters (MRC) on chromosomes 1, 2, and 4. Similarly, the majority of candidate resistance genes encoding nucleotide binding-leucine rich repeat (NLR) proteins genetically colocalize with phenotypic resistance loci. MRC1 and MRC4 span over 66 and 63 Mb containing 84 and 21 NLR-encoding genes, respectively, as well as 765 and 627 genes that are not related to NLR genes. Forward and reverse genetic approaches were applied to dissect MRC1 and MRC4. Transgenic lines exhibiting silencing were selected using silencing of β-glucuronidase as a reporter. Silencing of two of five NLR-encoding gene families resulted in abrogation of nine of 14 tested resistance phenotypes mapping to these two regions. At MRC1, members of the coiled coil-NLR-encoding RGC1 gene family were implicated in host and nonhost resistance through requirement for Dm5/8- and Dm45-mediated resistance to downy mildew caused by Bremia lactucae as well as the hypersensitive response to effectors AvrB, AvrRpm1, and AvrRpt2 of the nonpathogen Pseudomonas syringae. At MRC4, RGC12 family members, which encode toll interleukin receptor-NLR proteins, were implicated in Dm4-, Dm7-, Dm11-, and Dm44-mediated resistance to B. lactucae. Lesions were identified in the sequence of a candidate gene within dm7 loss-of-resistance mutant lines, confirming that RGC12G confers Dm7.
Affiliation(s)
- Marilena Christopoulou
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
- Leah K McHale
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
- Alex Kozik
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
- Sebastian Reyes-Chin Wo
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
- Tadeusz Wroblewski
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
- Richard W Michelmore
  - Genome Center and Department of Plant Sciences, University of California-Davis, CA 95616, U.S.A
|
18
|
Ashrafi H, Hulse-Kemp AM, Wang F, Yang SS, Guan X, Jones DC, Matvienko M, Mockaitis K, Chen ZJ, Stelly DM, Van Deynze A. A Long-Read Transcriptome Assembly of Cotton (Gossypium hirsutum L.) and Intraspecific Single Nucleotide Polymorphism Discovery. Plant Genome 2015; 8:eplantgenome2014.10.0068. [PMID: 33228299 DOI: 10.3835/plantgenome2014.10.0068] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/21/2014] [Accepted: 02/17/2015] [Indexed: 06/11/2023]
Abstract
Upland cotton (Gossypium hirsutum L.) has a narrow germplasm base, which constrains marker development and hampers intraspecific breeding. A pressing need exists for high-throughput single nucleotide polymorphism (SNP) markers that can be readily applied to germplasm in breeding and breeding-related research programs. Despite progress made in developing new sequencing technologies during the past decade, the cost of sequencing remains substantial when one is dealing with numerous samples and large genomes. Several strategies have been proposed to lower the cost of sequencing for multiple genotypes of large-genome species like cotton, such as transcriptome sequencing and reduced-representation DNA sequencing. This paper reports the development of a transcriptome assembly of the inbred line Texas Marker-1 (TM-1), a genetic standard for cotton, its usefulness as a reference for RNA sequencing (RNA-seq)-based SNP identification, and the availability of transcriptome sequences of four other cotton cultivars. An assembly of TM-1 was made using Roche 454 transcriptome reads combined with an assembly of all available public expressed sequence tag (EST) sequences of TM-1. The TM-1 assembly consists of 72,450 contigs with a total of 70 million bp. Functional predictions of the transcripts were estimated by alignment to selected protein databases. Transcriptome sequences of the five lines, including TM-1, were obtained using an Illumina Genome Analyzer-II, and the short reads were mapped to the TM-1 assembly to discover SNPs among the five lines. We identified >14,000 unfiltered allelic SNPs, of which ∼3,700 SNPs were retained for assay development after applying several rigorous filters. This paper reports availability of the reference transcriptome assembly and shows its utility in developing intraspecific SNP markers in upland cotton.
Affiliation(s)
- Hamid Ashrafi
  - Univ. of California-Davis, Dep. of Plant Sciences and Seed Biotechnology Center, One Shields Ave., Davis, CA, 95616
- Fei Wang
  - Texas A&M Univ., Dep. of Soil and Crop Sciences, College Station, TX, 77843
- S Samuel Yang
  - Texas A&M Univ., Dep. of Soil and Crop Sciences, College Station, TX, 77843
- Xueying Guan
  - Institute for Cellular and Molecular Biology and Center for Computational Biology and Bioinformatics, The Univ. of Texas at Austin, Austin, TX, 78712
- Marta Matvienko
  - Univ. of California-Davis, Genome Center, One Shields Ave., Davis, CA, 95616
- Z Jeffrey Chen
  - Institute for Cellular and Molecular Biology and Center for Computational Biology and Bioinformatics, The Univ. of Texas at Austin, Austin, TX, 78712
- David M Stelly
  - Texas A&M Univ., Dep. of Soil and Crop Sciences, College Station, TX, 77843
- Allen Van Deynze
  - Univ. of California-Davis, Dep. of Plant Sciences and Seed Biotechnology Center, One Shields Ave., Davis, CA, 95616
|
19
|
Duplex-specific nuclease-mediated bioanalysis. Trends Biotechnol 2015; 33:180-8. [DOI: 10.1016/j.tibtech.2014.12.008] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 10/07/2014] [Revised: 12/22/2014] [Accepted: 12/30/2014] [Indexed: 12/21/2022]
|
20
|
Chown SL, Hodgins KA, Griffin PC, Oakeshott JG, Byrne M, Hoffmann AA. Biological invasions, climate change and genomics. Evol Appl 2015; 8:23-46. [PMID: 25667601 PMCID: PMC4310580 DOI: 10.1111/eva.12234] [Citation(s) in RCA: 126] [Impact Index Per Article: 14.0] [Received: 09/02/2014] [Accepted: 10/24/2014] [Indexed: 12/13/2022] Open
Abstract
The rate of biological invasions is expected to increase as the effects of climate change on biological communities become widespread. Climate change enhances habitat disturbance which facilitates the establishment of invasive species, which in turn provides opportunities for hybridization and introgression. These effects influence local biodiversity that can be tracked through genetic and genomic approaches. Metabarcoding and metagenomic approaches provide a way of monitoring some types of communities under climate change for the appearance of invasives. Introgression and hybridization can be followed by the analysis of entire genomes so that rapidly changing areas of the genome are identified and instances of genetic pollution monitored. Genomic markers enable accurate tracking of invasive species' geographic origin well beyond what was previously possible. New genomic tools are promoting fresh insights into classic questions about invading organisms under climate change, such as the role of genetic variation, local adaptation and climate pre-adaptation in successful invasions. These tools are providing managers with often more effective means to identify potential threats, improve surveillance and assess impacts on communities. We provide a framework for the application of genomic techniques within a management context and also indicate some important limitations in what can be achieved.
Affiliation(s)
- Steven L Chown
  - School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Kathryn A Hodgins
  - School of Biological Sciences, Monash University, Clayton, Vic., Australia
- Philippa C Griffin
  - Department of Genetics, Bio21 Institute, The University of Melbourne, Parkville, Vic., Australia
- John G Oakeshott
  - CSIRO Land and Water Flagship, Black Mountain Laboratories, Canberra, ACT, Australia
- Margaret Byrne
  - Science and Conservation Division, Department of Parks and Wildlife, Bentley Delivery Centre, Bentley, WA, Australia
- Ary A Hoffmann
  - Departments of Zoology and Genetics, Bio21 Institute, The University of Melbourne, Parkville, Vic., Australia
|
22
|
Hulse-Kemp AM, Ashrafi H, Zheng X, Wang F, Hoegenauer KA, Maeda ABV, Yang SS, Stoffel K, Matvienko M, Clemons K, Udall JA, Van Deynze A, Jones DC, Stelly DM. Development and bin mapping of gene-associated interspecific SNPs for cotton (Gossypium hirsutum L.) introgression breeding efforts. BMC Genomics 2014; 15:945. [PMID: 25359292 PMCID: PMC4298081 DOI: 10.1186/1471-2164-15-945] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 07/12/2014] [Accepted: 10/03/2014] [Indexed: 11/18/2022] Open
Abstract
Background: Cotton (Gossypium spp.) is the largest producer of natural fibers for textile and is an important crop worldwide. Crop production is comprised primarily of G. hirsutum L., an allotetraploid. However, elite cultivars express very small amounts of variation due to the species' monophyletic origin, domestication and further bottlenecks due to selection. Conversely, wild cotton species harbor extensive genetic diversity of prospective utility to improve many beneficial agronomic traits, fiber characteristics, and resistance to disease and drought. Introgression of traits from wild species can provide a natural way to incorporate advantageous traits through breeding to generate higher-producing cotton cultivars and more sustainable production systems. Interspecific introgression efforts by conventional methods are very time-consuming and costly, but can be expedited using marker-assisted selection. Results: Using transcriptome sequencing we have developed the first gene-associated single nucleotide polymorphism (SNP) markers for wild cotton species G. tomentosum, G. mustelinum, G. armourianum and G. longicalyx. Markers were also developed for a secondary cultivated species G. barbadense cv. 3–79. A total of 62,832 non-redundant SNP markers were developed from the five wild species which can be utilized for interspecific germplasm introgression into cultivated G. hirsutum and are directly associated with genes. Over 500 of the G. barbadense markers have been validated by whole-genome radiation hybrid mapping. Overall 1,060 SNPs from the five different species have been screened and shown to produce acceptable genotyping assays. Conclusions: This large set of 62,832 SNPs relative to cultivated G. hirsutum will allow for the first high-density mapping of genes from five wild species that affect traits of interest, including beneficial agronomic and fiber characteristics.
Upon mapping, the markers can be utilized for marker-assisted introgression of new germplasm into cultivated cotton and in subsequent breeding of agronomically adapted types, including cultivar development. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-945) contains supplementary material, which is available to authorized users.
Affiliation(s)
- David M Stelly
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, USA.
|
23
|
Song Y, Giske CG, Gille-Johnson P, Emanuelsson O, Lundeberg J, Gyarmati P. Nuclease-assisted suppression of human DNA background in sepsis. PLoS One 2014; 9:e103610. [PMID: 25076135 PMCID: PMC4116218 DOI: 10.1371/journal.pone.0103610] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 04/30/2014] [Accepted: 06/29/2014] [Indexed: 11/18/2022] Open
Abstract
Sepsis is a severe medical condition characterized by a systemic inflammatory response of the body caused by pathogenic microorganisms in the bloodstream. Blood or plasma is typically used for diagnosis, both containing large amounts of human DNA, greatly exceeding the DNA of microbial origin. In order to enrich bacterial DNA, we applied the C0t effect to reduce human DNA background: a model system was set up with human and Escherichia coli (E. coli) DNA to mimic the conditions of bloodstream infections; and this system was adapted to plasma and blood samples from septic patients. As a consequence of the C0t effect, abundant DNA hybridizes faster than rare DNA. Following denaturation and re-hybridization, the amount of abundant DNA can be decreased with the application of double strand specific nucleases, leaving the non-hybridized rare DNA intact. Our experiments show that human DNA concentration can be reduced approximately 100,000-fold without affecting the E. coli DNA concentration in a model system with similarly sized amplicons. With clinical samples, the human DNA background was decreased 100-fold, as bacterial genomes are approximately 1,000-fold smaller compared to the human genome. According to our results, background suppression can be a valuable tool to enrich rare DNA in clinical samples where a high amount of background DNA can be found.
Affiliation(s)
- Yajing Song
  - Royal Institute of Technology, Science for Life Laboratory, Stockholm, Sweden
- Christian G. Giske
  - Karolinska Institutet, Department of Microbiology, Tumor and Cell Biology, Stockholm, Sweden
  - Karolinska University Hospital, Department of Clinical Microbiology, Stockholm, Sweden
- Patrik Gille-Johnson
  - Division of Infectious Diseases, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
- Olof Emanuelsson
  - Royal Institute of Technology, Science for Life Laboratory, Stockholm, Sweden
- Joakim Lundeberg
  - Royal Institute of Technology, Science for Life Laboratory, Stockholm, Sweden
- Peter Gyarmati
  - Royal Institute of Technology, Science for Life Laboratory, Stockholm, Sweden
  - Karolinska Institutet, Department of Microbiology, Tumor and Cell Biology, Stockholm, Sweden
  - Karolinska University Hospital, Department of Clinical Microbiology, Stockholm, Sweden
|
24
|
van Gurp TP, McIntyre LM, Verhoeven KJF. Consistent errors in first strand cDNA due to random hexamer mispriming. PLoS One 2013; 8:e85583. [PMID: 24386481 PMCID: PMC3875578 DOI: 10.1371/journal.pone.0085583] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 06/18/2013] [Accepted: 11/28/2013] [Indexed: 11/19/2022] Open
Abstract
Priming of random hexamers in cDNA synthesis is known to show sequence bias. In addition, it has recently been suggested that mismatches in random hexamer priming could be a cause of mismatches between the original RNA fragment and the observed sequence reads. To explore random hexamer mispriming as a potential source of these errors, we analyzed two independently generated RNA-seq datasets of synthetic ERCC spikes for which the reference is known. First strand cDNA synthesized by random hexamer priming on RNA showed consistent position- and nucleotide-specific mismatch errors in the first seven nucleotides. The mismatch errors found in both datasets are consistent in distribution, and thermodynamically stable mismatches are more common. This strongly indicates that RNA-DNA mispriming of specific random hexamers causes these errors. Due to their consistency and specificity, mispriming errors can have profound implications for downstream applications if not dealt with properly.
Affiliation(s)
- Thomas P. van Gurp
  - Netherlands Institute of Ecology (NIOO-KNAW), Department of Terrestrial Ecology, Wageningen, The Netherlands
- Lauren M. McIntyre
  - Genetics Institute, University of Florida, Gainesville, Florida, United States of America
- Koen J. F. Verhoeven
  - Netherlands Institute of Ecology (NIOO-KNAW), Department of Terrestrial Ecology, Wageningen, The Netherlands
|
25
|
Hodgins KA, Lai Z, Oliveira LO, Still DW, Scascitelli M, Barker MS, Kane NC, Dempewolf H, Kozik A, Kesseli RV, Burke JM, Michelmore RW, Rieseberg LH. Genomics of Compositae crops: reference transcriptome assemblies and evidence of hybridization with wild relatives. Mol Ecol Resour 2013; 14:166-77. [DOI: 10.1111/1755-0998.12163] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 05/22/2013] [Revised: 08/14/2013] [Accepted: 08/15/2013] [Indexed: 11/30/2022]
Affiliation(s)
- Kathryn A. Hodgins
  - Department of Botany and Biodiversity Research Centre; University of British Columbia; Vancouver BC V6T 1Z4 Canada
- Zhao Lai
  - Department of Biology and Center for Genomics and Bioinformatics; Indiana University; Bloomington IN 47405 USA
- Luiz O. Oliveira
  - Departamento de Bioquímica e Biologia Molecular; Universidade Federal de Viçosa; 36570-000 Viçosa Brazil
- David W. Still
  - Department of Plant Sciences; Cal Poly Pomona; Pomona CA 91768 USA
- Moira Scascitelli
  - Department of Botany and Biodiversity Research Centre; University of British Columbia; Vancouver BC V6T 1Z4 Canada
- Michael S. Barker
  - Department of Ecology and Evolutionary Biology; University of Arizona; Tucson AZ 85721 USA
- Nolan C. Kane
  - Department of Ecology and Evolutionary Biology; University of Colorado Boulder; Boulder CO 80309 USA
- Hannes Dempewolf
  - Department of Botany and Biodiversity Research Centre; University of British Columbia; Vancouver BC V6T 1Z4 Canada
- Alex Kozik
  - The Genome Center; University of California; Davis CA 95616 USA
- John M. Burke
  - Department of Plant Biology; University of Georgia; Athens GA 30602 USA
- Richard W. Michelmore
  - The Genome Center; University of California; Davis CA 95616 USA
  - Departments of Plant Sciences, Molecular & Cellular Biology, and Medical Microbiology & Immunology; University of California; Davis CA 95616 USA
- Loren H. Rieseberg
  - Department of Botany and Biodiversity Research Centre; University of British Columbia; Vancouver BC V6T 1Z4 Canada
  - Department of Biology and Center for Genomics and Bioinformatics; Indiana University; Bloomington IN 47405 USA
|
26
|
Krasileva KV, Buffalo V, Bailey P, Pearce S, Ayling S, Tabbita F, Soria M, Wang S, Akhunov E, Uauy C, Dubcovsky J. Separating homeologs by phasing in the tetraploid wheat transcriptome. Genome Biol 2013; 14:R66. [PMID: 23800085 PMCID: PMC4053977 DOI: 10.1186/gb-2013-14-6-r66] [Citation(s) in RCA: 115] [Impact Index Per Article: 10.5] [Received: 05/25/2013] [Accepted: 06/25/2013] [Indexed: 11/10/2022] Open
Abstract
Background: The high level of identity among duplicated homoeologous genomes in tetraploid pasta wheat presents substantial challenges for de novo transcriptome assembly. To solve this problem, we develop a specialized bioinformatics workflow that optimizes transcriptome assembly and separation of merged homoeologs. To evaluate our strategy, we sequence and assemble the transcriptome of one of the diploid ancestors of pasta wheat, and compare both assemblies with a benchmark set of 13,472 full-length, non-redundant bread wheat cDNAs.
Results: A total of 489 million 100 bp paired-end reads from tetraploid wheat assemble into 140,118 contigs, including 96% of the benchmark cDNAs. We used a comparative genomics approach to annotate 66,633 open reading frames. The multiple k-mer assembly strategy increases the proportion of cDNAs assembled full-length in a single contig by 22% relative to the best single k-mer size. Homoeologs are separated using a post-assembly pipeline that includes polymorphism identification, phasing of SNPs, read sorting, and re-assembly of phased reads. Using a reference set of genes, we determine that 98.7% of SNPs analyzed are correctly separated by phasing.
Conclusions: Our study shows that de novo transcriptome assembly of tetraploid wheat benefits from multiple k-mer assembly strategies more than that of diploid wheat. Our results also demonstrate that phasing approaches originally designed for heterozygous diploid organisms can be used to separate the close homoeologous genomes of tetraploid wheat. The predicted tetraploid wheat proteome and gene models provide a valuable tool for the wheat research community and for those interested in comparative genomic studies.
|