1
The Use and Limitations of Exome Capture to Detect Novel Variation in the Hexaploid Wheat Genome. FRONTIERS IN PLANT SCIENCE 2022; 13:841855. [PMID: 35498663 PMCID: PMC9039655 DOI: 10.3389/fpls.2022.841855] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/22/2021] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
The bread wheat (Triticum aestivum) pangenome is a patchwork of variable regions, including translocations and introgressions from progenitors and wild relatives. Although a large number of these have been documented, it is likely that many more remain unknown. To map these variable regions and make them more traceable in breeding programs, wheat accessions need to be genotyped or sequenced. The wheat genome is large and complex and consequently, sequencing efforts are often targeted through exome capture. In this study, we employed exome capture prior to sequencing 12 wheat varieties: 10 elite T. aestivum cultivars and two T. aestivum landrace accessions. Sequence coverage across chromosomes was greater toward distal regions of chromosome arms and lower in centromeric regions, reflecting the capture probe distribution, which itself is determined by the known telomere to centromere gene gradient. Superimposed on this general pattern, numerous drops in sequence coverage were observed. Several of these corresponded with reported introgressions. Other drops in coverage could not be readily explained and may point to introgressions that have not, to date, been documented.
2
A Taxon-Wise Insight Into Rock Weathering and Nitrogen Fixation Functional Profiles of Proglacial Systems. Front Microbiol 2021; 12:627437. [PMID: 34621246 PMCID: PMC8491546 DOI: 10.3389/fmicb.2021.627437] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/09/2020] [Accepted: 08/05/2021] [Indexed: 11/13/2022]
Abstract
The Arctic environment is particularly affected by global warming, and a clear trend of the ice retreat is observed worldwide. In proglacial systems, the newly exposed terrain represents different environmental and nutrient conditions compared to later soil stages. Therefore, proglacial systems show several environmental gradients along the soil succession where microorganisms are active protagonists of the soil and carbon pool formation through nitrogen fixation and rock weathering. We studied the microbial succession of three Arctic proglacial systems located in Svalbard (Midtre Lovénbreen), Sweden (Storglaciären), and Greenland (foreland close to Kangerlussuaq). We analyzed 65 whole shotgun metagenomic soil samples for a total of more than 400 Gb of sequencing data. Microbial succession showed common trends typical of proglacial systems with increasing diversity observed along the forefield chronosequence. Microbial trends were explained by the distance from the ice edge in the Midtre Lovénbreen and Storglaciären forefields and by total nitrogen (TN) and total organic carbon (TOC) in the Greenland proglacial system. Furthermore, we focused specifically on genes associated with nitrogen fixation and biotic rock weathering processes, such as nitrogenase genes, obcA genes, and genes involved in cyanide and siderophore synthesis and transport. Whereas we confirmed the presence of these genes in known nitrogen-fixing and/or rock weathering organisms (e.g., Nostoc, Burkholderia), in this study, we also detected organisms that, even if often found in soil and proglacial systems, have never been related to nitrogen-fixing or rock weathering processes before (e.g., Fimbriiglobus, Streptomyces). 
The different genera showed different gene trends within and among the studied systems, indicating a community constituted by a plurality of organisms involved in nitrogen fixation and biotic rock weathering, and where the latter were driven by different organisms at different soil succession stages.
3
The role of gene flow and chromosomal instability in shaping the bread wheat genome. NATURE PLANTS 2021; 7:172-183. [PMID: 33526912 DOI: 10.1038/s41477-020-00845-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/03/2020] [Accepted: 12/18/2020] [Indexed: 05/02/2023]
Abstract
Bread wheat (Triticum aestivum) is one of the world's most important crops; however, a low level of genetic diversity within commercial breeding accessions can significantly limit breeding potential. In contrast, wheat relatives exhibit considerable genetic variation and so potentially provide a valuable source of novel alleles for use in breeding new cultivars. Historically, gene flow between wheat and its relatives may have contributed novel alleles to the bread wheat pangenome. To assess the contribution made by wheat relatives to genetic diversity in bread wheat, we used markers based on single nucleotide polymorphisms to compare bread wheat accessions, created in the past 150 years, with 45 related species. We show that many bread wheat accessions share near-identical haplotype blocks with close relatives of wheat's diploid and tetraploid progenitors, while some show evidence of introgressions from more distant species and structural variation between accessions. Hence, introgressions and chromosomal rearrangements appear to have made a major contribution to genetic diversity in cultivar collections. As gene flow from relatives to bread wheat is an ongoing process, we assess the impact that introgressions might have on future breeding strategies.
4
Segregation distortion: Utilizing simulated genotyping data to evaluate statistical methods. PLoS One 2020; 15:e0228951. [PMID: 32074141 PMCID: PMC7029859 DOI: 10.1371/journal.pone.0228951] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/08/2019] [Accepted: 01/26/2020] [Indexed: 11/18/2022]
Abstract
Segregation distortion is the phenomenon in which genotypes deviate from expected Mendelian ratios in the progeny of a cross between two varieties or species. There is not currently a widely used consensus for the appropriate statistical test, or more specifically the multiple testing correction procedure, used to detect segregation distortion for high-density single-nucleotide polymorphism (SNP) data. Here we examine the efficacy of various multiple testing procedures, including chi-square test with no correction for multiple testing, false-discovery rate correction and Bonferroni correction using an in-silico simulation of a biparental mapping population. We find that the false discovery rate correction best approximates the traditional p-value threshold of 0.05 for high-density marker data. We also utilize this simulation to test the effect of segregation distortion on the genetic mapping process, specifically on the formation of linkage groups during marker clustering. Only extreme segregation distortion was found to affect genetic mapping. In addition, we utilize replicate empirical mapping populations of wheat varieties Avalon and Cadenza to assess how often segregation distortion conforms to the same pattern between closely related wheat varieties.
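The three testing strategies named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation or pipeline. It assumes a population in which each marker is expected to segregate 1:1 (for example doubled haploid lines), uses Benjamini-Hochberg as one common false discovery rate procedure, and runs on invented allele counts. All function names are hypothetical.

```python
import math

def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def segregation_pvalue(n_a, n_b):
    """Chi-square test of one marker's allele counts against an assumed 1:1 ratio."""
    expected = (n_a + n_b) / 2.0
    stat = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    return chi2_pvalue_1df(stat)

def bonferroni(pvalues, alpha=0.05):
    """Flag markers whose p-value is below alpha divided by the number of tests."""
    m = len(pvalues)
    return [p < alpha / m for p in pvalues]

def benjamini_hochberg(pvalues, alpha=0.05):
    """Flag markers with the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value is <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Every marker ranked at or below the cutoff is flagged.
    flags = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            flags[idx] = True
    return flags

# Invented allele counts (parent A, parent B) for five markers.
counts = [(50, 50), (70, 30), (62, 38), (60, 40), (48, 52)]
pvals = [segregation_pvalue(a, b) for a, b in counts]

uncorrected = [p < 0.05 for p in pvals]
print(uncorrected)                # [False, True, True, True, False]
print(bonferroni(pvals))          # [False, True, False, False, False]
print(benjamini_hochberg(pvals))  # [False, True, True, False, False]
```

On these toy counts the uncorrected test flags three markers, Bonferroni flags one, and Benjamini-Hochberg flags two, which shows how the choice of correction changes the number of markers reported as distorted.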
5
CerealsDB-new tools for the analysis of the wheat genome: update 2020. Database (Oxford) 2020; 2020:baaa060. [PMID: 32754757 PMCID: PMC7402920 DOI: 10.1093/database/baaa060] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/27/2020] [Revised: 06/08/2020] [Accepted: 07/07/2020] [Indexed: 11/24/2022]
Abstract
CerealsDB (www.cerealsdb.uk.net) is an online repository of mainly hexaploid wheat (Triticum aestivum) single nucleotide polymorphisms (SNPs) and genotyping data. The CerealsDB website has been designed to enable wheat breeders and scientists to select the appropriate markers for research breeding tasks, such as marker-assisted selection. We report a large update of genotyping information for over 6000 wheat accessions and describe new webtools for exploring and visualizing the data. We also describe a new database of quantitative trait loci that links phenotypic traits to CerealsDB SNP markers and allelic scores for each of those markers. CerealsDB is an open-access website that hosts information on wheat SNPs considered useful for both plant breeders and research scientists. The latest CerealsDB database is available at https://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/indexNEW.php.
6
Developing a High-Throughput SNP-Based Marker System to Facilitate the Introgression of Traits From Aegilops Species Into Bread Wheat (Triticum aestivum). FRONTIERS IN PLANT SCIENCE 2019; 9:1993. [PMID: 30733728 PMCID: PMC6354564 DOI: 10.3389/fpls.2018.01993] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/11/2018] [Accepted: 12/21/2018] [Indexed: 06/09/2023]
Abstract
The genus Aegilops contains a diverse collection of wild species exhibiting variation in geographical distribution, ecological adaptation, ploidy and genome organization. Aegilops is the most closely related genus to Triticum which includes cultivated wheat, a globally important crop that has a limited gene pool for modern breeding. Aegilops species are a potential future resource for wheat breeding for traits, such as adaptation to different ecological conditions and pest and disease resistance. This study describes the development and application of the first high-throughput genotyping platform specifically designed for screening wheat relative species. The platform was used to screen multiple accessions representing all species in the genus Aegilops. Firstly, the data was demonstrated to be useful for screening diversity and examining relationships within and between Aegilops species. Secondly, markers able to characterize and track introgressions from Aegilops species in hexaploid wheat were identified and validated using two different approaches.
7
Conversion of array-based single nucleotide polymorphic markers for use in targeted genotyping by sequencing in hexaploid wheat (Triticum aestivum). PLANT BIOTECHNOLOGY JOURNAL 2018; 16:867-876. [PMID: 28913866 PMCID: PMC5866950 DOI: 10.1111/pbi.12834] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/03/2017] [Revised: 09/04/2017] [Accepted: 09/07/2017] [Indexed: 05/23/2023]
Abstract
Wheat breeders and academics alike use single nucleotide polymorphisms (SNPs) as molecular markers to characterize regions of interest within the hexaploid wheat genome. A number of SNP-based genotyping platforms are available, and their utility depends upon factors such as the available technologies, number of data points required, budgets and the technical expertise required. Unfortunately, markers can rarely be exchanged between existing and newly developed platforms, meaning that previously generated data cannot be compared, or combined, with more recently generated data sets. We predict that genotyping by sequencing will become the predominant genotyping technology within the next 5-10 years. With this in mind, to ensure that data generated from current genotyping platforms continues to be of use, we have designed and utilized SNP-based capture probes from several thousand existing and publicly available probes from Axiom® and KASP™ genotyping platforms. We have validated our capture probes in a targeted genotyping by sequencing protocol using 31 previously genotyped UK elite hexaploid wheat accessions. Data comparisons between targeted genotyping by sequencing, Axiom® array genotyping and KASP™ genotyping assays, identified a set of 3256 probes which reliably bring together targeted genotyping by sequencing data with the previously available marker data set. As such, these probes are likely to be of considerable value to the wheat community. The probe details, full probe sequences and a custom built analysis pipeline may be freely downloaded from the CerealsDB website (http://www.cerealsdb.uk.net/cerealgenomics/CerealsDB/sequence_capture.php).
8
Comparative study of biofilm formation on biocidal antifouling and fouling-release coatings using next-generation DNA sequencing. BIOFOULING 2018; 34:464-477. [PMID: 29745769 DOI: 10.1080/08927014.2018.1464152] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/25/2017] [Accepted: 04/09/2018] [Indexed: 06/08/2023]
Abstract
The bacterial and eukaryotic communities forming biofilms on six different antifouling coatings, three biocidal and three fouling-release, on boards statically submerged in a marine environment were studied using next-generation sequencing. Sequenced amplicons of bacterial 16S ribosomal DNA and eukaryotic ribosomal DNA internal transcribed spacer were assigned taxonomy by comparison to reference databases and relative abundances were calculated. Differences in species composition, bacterial and eukaryotic, and relative abundance were observed between the biofilms on the various coatings; the main difference was between coating type, biocidal compared to fouling-release. Species composition and relative abundance also changed through time. Thus, it was possible to group replicate samples by coating and time point, indicating that there are fundamental and reproducible differences in biofilms assemblages. The routine use of next-generation sequencing to assess biofilm formation will allow evaluation of the efficacy of various commercial coatings and the identification of targets for novel formulations.
9
Characterization of a Wheat Breeders' Array suitable for high-throughput SNP genotyping of global accessions of hexaploid bread wheat (Triticum aestivum). PLANT BIOTECHNOLOGY JOURNAL 2017; 15:390-401. [PMID: 27627182 PMCID: PMC5316916 DOI: 10.1111/pbi.12635] [Citation(s) in RCA: 103] [Impact Index Per Article: 14.7] [Received: 07/27/2016] [Revised: 09/02/2016] [Accepted: 09/09/2016] [Indexed: 05/18/2023]
Abstract
Targeted selection and inbreeding have resulted in a lack of genetic diversity in elite hexaploid bread wheat accessions. Reduced diversity can be a limiting factor in the breeding of high yielding varieties and crucially can mean reduced resilience in the face of changing climate and resource pressures. Recent technological advances have enabled the development of molecular markers for use in the assessment and utilization of genetic diversity in hexaploid wheat. Starting with a large collection of 819 571 previously characterized wheat markers, here we describe the identification of 35 143 single nucleotide polymorphism-based markers, which are highly suited to the genotyping of elite hexaploid wheat accessions. To assess their suitability, the markers have been validated using a commercial high-density Affymetrix Axiom® genotyping array (the Wheat Breeders' Array), in a high-throughput 384 microplate configuration, to characterize a diverse global collection of wheat accessions including landraces and elite lines derived from commercial breeding communities. We demonstrate that the Wheat Breeders' Array is also suitable for generating high-density genetic maps of previously uncharacterized populations and for characterizing novel genetic diversity produced by mutagenesis. To facilitate the use of the array by the wheat community, the markers, the associated sequence and the genotype information have been made available through the interactive web site 'CerealsDB'.
10
Abstract
A lack of genetic diversity between wheat breeding lines has been recognized as a significant block to future yield increases. Wheat breeding and prebreeding strategies are increasingly using material from wheat ancestors or wild relatives to reintroduce diversity. Where molecular markers are polymorphic between the host and introgressed material, they may be used to track the size and location of the introgressed material through generations of backcrossing. To generate markers for this purpose, sequence capture targeted resequencing was carried out for a range of wheat varieties, wheat relatives, and wheat progenitors. From these sequences, putative SNPs were identified and used to generate the Axiom® Wheat HD array. A selection of varieties representing a selection of elite wheat breeding material, progenitor species, and wild relatives were used to validate the array. The procedures used are described here in detail.
11
CerealsDB 3.0: expansion of resources and data integration. BMC Bioinformatics 2016; 17:256. [PMID: 27342803 PMCID: PMC4919907 DOI: 10.1186/s12859-016-1139-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 03/30/2016] [Accepted: 06/14/2016] [Indexed: 01/05/2023]
Abstract
BACKGROUND The increase in human populations around the world has put pressure on resources, and as a consequence food security has become an important challenge for the 21st century. Wheat (Triticum aestivum) is one of the most important crops in human and livestock diets, and the development of wheat varieties that produce higher yields, combined with increased resistance to pests and resilience to changes in climate, has meant that wheat breeding has become an important focus of scientific research. In an attempt to facilitate these improvements in wheat, plant breeders have employed molecular tools to help them identify genes for important agronomic traits that can be bred into new varieties. Modern molecular techniques have ensured that the rapid and inexpensive characterisation of SNP markers and their validation with modern genotyping methods has produced a valuable resource that can be used in marker assisted selection. CerealsDB was created as a means of quickly disseminating this information to breeders and researchers around the globe. DESCRIPTION CerealsDB version 3.0 is an online resource that contains a wide range of genomic datasets for wheat that will assist plant breeders and scientists to select the most appropriate markers for use in marker assisted selection. CerealsDB includes a database which currently contains in excess of a million putative varietal SNPs, of which several hundreds of thousands have been experimentally validated. In addition, CerealsDB also contains new data on functional SNPs predicted to have a major effect on protein function and we have constructed a web service to encourage data integration and high-throughput programmatic access. CONCLUSION CerealsDB is an open access website that hosts information on SNPs that are considered useful for both plant breeders and research scientists. 
The recent inclusion of web services designed to federate genomic data resources allows the information on CerealsDB to be more fully integrated with the WheatIS network and other biological databases.
12
High-density SNP genotyping array for hexaploid wheat and its secondary and tertiary gene pool. PLANT BIOTECHNOLOGY JOURNAL 2016; 14:1195-206. [PMID: 26466852 PMCID: PMC4950041 DOI: 10.1111/pbi.12485] [Citation(s) in RCA: 120] [Impact Index Per Article: 15.0] [Received: 07/07/2015] [Revised: 08/21/2015] [Accepted: 09/07/2015] [Indexed: 05/15/2023]
Abstract
In wheat, a lack of genetic diversity between breeding lines has been recognized as a significant block to future yield increases. Species belonging to bread wheat's secondary and tertiary gene pools harbour a much greater level of genetic variability, and are an important source of genes to broaden its genetic base. Introgression of novel genes from progenitors and related species has been widely employed to improve the agronomic characteristics of hexaploid wheat, but this approach has been hampered by a lack of markers that can be used to track introduced chromosome segments. Here, we describe the identification of a large number of single nucleotide polymorphisms that can be used to genotype hexaploid wheat and to identify and track introgressions from a variety of sources. We have validated these markers using an ultra-high-density Axiom® genotyping array to characterize a range of diploid, tetraploid and hexaploid wheat accessions and wheat relatives. To facilitate the use of these, both the markers and the associated sequence and genotype information have been made available through an interactive web site.
13
Transcriptomic analysis of the interactions between Agaricus bisporus and Lecanicillium fungicola. Fungal Genet Biol 2013; 55:67-76. [PMID: 23665188 DOI: 10.1016/j.fgb.2013.04.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/24/2012] [Revised: 04/22/2013] [Accepted: 04/24/2013] [Indexed: 11/24/2022]
Abstract
Agaricus bisporus is susceptible to a number of diseases, particularly those caused by fungi, with Lecanicillium fungicola being the most serious. Control of this disease is important for the security of crop production, however given the lack of knowledge about fungal-fungal interactions, such disease control is rather limited. Exploiting the recently released genome sequence of A. bisporus, here we report studies simultaneously investigating both the host and the pathogen, focussing on transcriptional changes associated with the cap spotting lesions typically seen in this interaction. Forward-suppressive subtractive hybridisation (SSH) analysis identified 68 A. bisporus unigenes induced during infection. Chitin deacetylase showed the strongest response, with almost 1000-fold up-regulation during infection, so was targeted for down-regulation by silencing to see if it was involved in defence against L. fungicola. Transgenic lines were made expressing hairpin RNAi constructs, however no changes in susceptibility to L. fungicola were observed. Amongst the other up-regulated genes there were none with readily apparent roles in resisting infection in this susceptible interaction. Reverse-SSH identified 72 unigenes from A. bisporus showing reduced expression, including two tyrosinases, several genes involved in nitrogen metabolism and a hydrophobin. The forward-SSH analysis of infected mushrooms also yielded 64 unigenes which were not of A. bisporus origin and thus derived from L. fungicola. An EST analysis of infection-mimicking conditions generated an additional 623 unigenes from L. fungicola including several oxidoreductases, cell wall degrading enzymes, ABC and MFS transporter proteins and various other genes believed to play roles in other pathosystems. 
Together, this analysis shows how both the pathogen and the host modify their gene expression during an infection-interaction, shedding some light on the disease process, although we note that some 40% of unigenes from both organisms encode hypothetical proteins with no ascribed function which highlights how much there is still to discover about this interaction.
14
Discovery and development of exome-based, co-dominant single nucleotide polymorphism markers in hexaploid wheat (Triticum aestivum L.). PLANT BIOTECHNOLOGY JOURNAL 2013; 11:279-95. [PMID: 23279710 DOI: 10.1111/pbi.12009] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Received: 04/20/2012] [Revised: 08/06/2012] [Accepted: 08/10/2012] [Indexed: 05/19/2023]
Abstract
Globally, wheat is the most widely grown crop and one of the three most important crops for human and livestock feed. However, the complex nature of the wheat genome has, until recently, resulted in a lack of single nucleotide polymorphism (SNP)-based molecular markers of practical use to wheat breeders. Recently, large numbers of SNP-based wheat markers have been made available via the use of next-generation sequencing combined with a variety of genotyping platforms. However, many of these markers and platforms have difficulty distinguishing between heterozygote and homozygote individuals and are therefore of limited use to wheat breeders carrying out commercial-scale breeding programmes. To identify exome-based co-dominant SNP-based assays, which are capable of distinguishing between heterozygotes and homozygotes, we have used targeted re-sequencing of the wheat exome to generate large amounts of genomic sequences from eight varieties. Using a bioinformatics approach, these sequences have been used to identify 95 266 putative single nucleotide polymorphisms, of which 10 251 were classified as being putatively co-dominant. Validation of a subset of these putative co-dominant markers confirmed that 96% were true polymorphisms and 65% were co-dominant SNP assays. The new co-dominant markers described here are capable of genotypic classification of a segregating locus in polyploid wheat and can be used on a variety of genotyping platforms; as such, they represent a powerful tool for wheat breeders. These markers and related information have been made publicly available on an interactive web-based database to facilitate their use on genotyping programmes worldwide.
15
Abstract
Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20 per cent of the calories consumed by humans. Major efforts are underway worldwide to increase wheat production by extending genetic diversity and analysing key traits, and genomic resources can accelerate progress. But so far the very large size and polyploid complexity of the bread wheat genome have been substantial barriers to genome analysis. Here we report the sequencing of its large, 17-gigabase-pair, hexaploid genome using 454 pyrosequencing, and comparison of this with the sequences of diploid ancestral and progenitor genomes. We identified between 94,000 and 96,000 genes, and assigned two-thirds to the three component genomes (A, B and D) of hexaploid wheat. High-resolution synteny maps identified many small disruptions to conserved gene order. We show that the hexaploid genome is highly dynamic, with significant loss of gene family members on polyploidization and domestication, and an abundance of gene fragments. Several classes of genes involved in energy harvesting, metabolism and growth are among expanded gene families that could be associated with crop productivity. Our analyses, coupled with the identification of extensive genetic variation, provide a resource for accelerating gene discovery and improving this major crop.
16
Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models. Hum Mutat 2012; 34:57-65. [PMID: 23033316 PMCID: PMC3558800 DOI: 10.1002/humu.22225] [Citation(s) in RCA: 843] [Impact Index Per Article: 70.3] [Received: 04/26/2012] [Accepted: 09/02/2012] [Indexed: 01/30/2023]
Abstract
The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.
17
CerealsDB 2.0: an integrated resource for plant breeders and scientists. BMC Bioinformatics 2012; 13:219. [PMID: 22943283 PMCID: PMC3447715 DOI: 10.1186/1471-2105-13-219] [Citation(s) in RCA: 124] [Impact Index Per Article: 10.3] [Received: 05/11/2012] [Accepted: 08/09/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Food security is an issue that has come under renewed scrutiny amidst concerns that substantial yield increases in cereal crops are required to feed the world's booming population. Wheat is of fundamental importance in this regard being one of the three most important crops for both human consumption and livestock feed; however, increase in crop yields have not kept pace with the demands of a growing world population. In order to address this issue, plant breeders require new molecular tools to help them identify genes for important agronomic traits that can be introduced into elite varieties. Studies of the genome using next-generation sequencing enable the identification of molecular markers such as single nucleotide polymorphisms that may be used by breeders to identify and follow genes when breeding new varieties. The development and application of next-generation sequencing technologies has made the characterisation of SNP markers in wheat relatively cheap and straightforward. There is a growing need for the widespread dissemination of this information to plant breeders. DESCRIPTION CerealsDB is an online resource containing a range of genomic datasets for wheat (Triticum aestivum) that will assist plant breeders and scientists to select the most appropriate markers for marker assisted selection. CerealsDB includes a database which currently contains in excess of 100,000 putative varietal SNPs, of which several thousand have been experimentally validated. In addition, CerealsDB contains databases for DArT markers and EST sequences, and links to a draft genome sequence for the wheat variety Chinese Spring. CONCLUSION CerealsDB is an open access website that is rapidly becoming an invaluable resource within the wheat research and plant breeding communities.
|
18
|
Abstract
Bread wheat, Triticum aestivum, is an allohexaploid composed of the three distinct ancestral genomes, A, B and D. The polyploid nature of the wheat genome together with its large size has limited our ability to generate the significant amount of sequence data required for whole genome studies. Even with the advent of next-generation sequencing technology, it is still relatively expensive to generate whole genome sequences for more than a few wheat genomes at any one time. To overcome this problem, we have developed a targeted-capture re-sequencing protocol based upon NimbleGen array technology to capture and characterize 56.5 Mb of genomic DNA with sequence similarity to over 100 000 transcripts from eight different UK allohexaploid wheat varieties. Using this procedure in conjunction with a carefully designed bioinformatic procedure, we have identified more than 500 000 putative single-nucleotide polymorphisms (SNPs). While 80% of these were variants between the homoeologous genomes, A, B and D, a significant number (20%) were putative varietal SNPs between the eight varieties studied. A small number of these latter polymorphisms were experimentally validated using KASPar technology and 94% proved to be genuine. The procedures described here to sequence a large proportion of the wheat genome, and the various SNPs identified should be of considerable use to the wider wheat community.
|
19
|
Preliminary results using the RAPD analysis to screen bloom populations of Emiliania huxleyi (Haptophyta). ACTA ACUST UNITED AC 2012. [DOI: 10.1080/00364827.1994.10413562] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 10/22/2022]
|
20
|
Transcript-specific, single-nucleotide polymorphism discovery and linkage analysis in hexaploid bread wheat (Triticum aestivum L.). Plant Biotechnology Journal 2011; 9:1086-99. [PMID: 21627760 DOI: 10.1111/j.1467-7652.2011.00628.x] [Citation(s) in RCA: 96] [Impact Index Per Article: 7.4] [Indexed: 05/17/2023]
Abstract
Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza.
|
21
|
Abstract
The application of DNA barcoding to dietary studies allows prey taxa to be identified in the absence of morphological evidence and permits a greater resolution of prey identity than is possible through direct examination of faecal material. For insectivorous bats, which typically eat a great diversity of prey and which chew and digest their prey thoroughly, DNA-based approaches to diet analysis may provide the only means of assessing the range and diversity of prey within faeces. Here, we investigated the effectiveness of DNA barcoding in determining the diets of bat species that specialize in eating different taxa of arthropod prey. We designed and tested a novel taxon-specific primer set and examined the performance of short barcode sequences in resolving prey species. We recovered prey DNA from all faecal samples and subsequent cloning and sequencing of PCR products, followed by a comparison of sequences to a reference database, provided species-level identifications for 149/207 (72%) clones. We detected a phylogenetically broad range of prey while completely avoiding detection of nontarget groups. In total, 37 unique prey taxa were identified from 15 faecal samples. A comparison of DNA data with parallel morphological analyses revealed a close correlation between the two methods. However, the sensitivity and taxonomic resolution of the DNA method were far superior. The methodology developed here provides new opportunities for the study of bat diets and will be of great benefit to the conservation of these ecologically important predators.
|
22
|
A genome-wide analysis of single nucleotide polymorphism diversity in the world's major cereal crops. Plant Biotechnology Journal 2009; 7:318-25. [PMID: 19386040 DOI: 10.1111/j.1467-7652.2009.00412.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 05/10/2023]
Abstract
Over 3.5 million expressed sequence tags from the major cereal taxa were used to electronically mine over 176,000 putative single nucleotide polymorphisms (SNPs). The density, distribution and degree of linkage between these SNPs were compared among the different taxa. The frequency of sequence polymorphism was lowest in diploid taxa (rice, barley and sorghum), intermediate in tetraploid maize and highest in allohexaploid wheat and octoploid sugarcane. SNPs were further categorized as either intravarietal (differences between gene family members and homoeologues) or varietal (differences between two varieties), and as either co-segregating or non-co-segregating with neighbouring polymorphisms. Varietal co-segregating SNPs represent the best candidates for molecular markers as they show variation between varieties and have a high probability of being validated, as sequencing errors are unlikely to co-segregate with one another. This elite class of SNPs was most abundant in barley and least abundant in wheat and rice. Despite the large number of observed sequence polymorphisms in allohexaploid wheat, only a fraction of those available are likely to make good molecular markers. In addition, we found that rice SNPs up to 10 kb apart were in linkage disequilibrium (LD), but that high levels of LD attributable to population structure confounded the tracking of LD over greater distances.
|
23
|
Multiplex single nucleotide polymorphism (SNP)-based genotyping in allohexaploid wheat using padlock probes. Plant Biotechnology Journal 2009; 7:375-390. [PMID: 19379286 DOI: 10.1111/j.1467-7652.2009.00413.x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 05/27/2023]
Abstract
Single nucleotide polymorphisms are the most common polymorphism in plant and animal genomes and, as such, are the logical choice for marker-assisted selection. However, many plants are also polyploid, and marker-assisted selection can be complicated by the presence of highly similar, but non-allelic, homoeologous sequences. Despite this, there is practical and academic demand for high-throughput genotyping in several polyploid crop species, such as allohexaploid wheat. In this paper, we present such a system, which utilizes public single nucleotide polymorphisms previously identified in both agronomically important genes and in randomly selected, mapped, expressed sequence tags developed by the wheat community. To achieve relatively high levels of multiplexing, we used non-amplified genomic DNA and padlock probe pairs, together with high annealing temperatures, to differentiate between similar sequences in the wheat genome. Our results suggest that padlock probes are capable of discriminating between homoeologous sequences and hence can be used to efficiently genotype wheat varieties.
|
24
|
Transcript profiles of long- and short-lived adults implicate protein synthesis in evolved differences in ageing in the nematode Strongyloides ratti. Mech Ageing Dev 2008; 130:167-72. [PMID: 19056418 DOI: 10.1016/j.mad.2008.11.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/21/2008] [Revised: 10/21/2008] [Accepted: 11/04/2008] [Indexed: 02/04/2023]
Abstract
The nematode Strongyloides ratti shows remarkable phenotypic plasticity in ageing, with parasitic adults living at least 80 times longer than free-living adults. Given that long- and short-lived adults are genetically identical, this plasticity is likely to be due to differences in gene expression. To understand how this inter-morph difference in longevity evolved, we compared gene expression in long- and short-lived adults. DNA microarray analysis of long- and short-lived adults identified 32 genes that were up-regulated in long-lived adults, and 96 genes up-regulated in short-lived adults. Strikingly, 38.5% of the genes expressed more in the short-lived morph are predicted to encode ribosomal proteins, compared with only 9% in the long-lived morph. Among the 32 longevity-associated genes there was very little enrichment of genes linked to cellular maintenance. Overall, we have therefore observed a negative correlation between expression of ribosomal protein genes and longevity in S. ratti. Interestingly, engineered reduction of expression of ribosomal protein genes increases lifespan in the free-living nematode Caenorhabditis elegans. Our study therefore suggests that differences in levels of protein synthesis could contribute to evolved differences in animal longevity.
|
25
|
Analysis of wheat SAGE tags reveals evidence for widespread antisense transcription. BMC Genomics 2008; 9:475. [PMID: 18847483 PMCID: PMC2584110 DOI: 10.1186/1471-2164-9-475] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 05/13/2008] [Accepted: 10/10/2008] [Indexed: 12/14/2022] Open
Abstract
Background Serial Analysis of Gene Expression (SAGE) is a powerful tool for genome-wide transcription studies. Unlike microarrays, it has the ability to detect novel forms of RNA such as alternatively spliced and antisense transcripts, without the need for prior knowledge of their existence. One limitation of using SAGE on an organism with a complex genome and lacking detailed sequence information, such as the hexaploid bread wheat Triticum aestivum, is accurate annotation of the tags generated. Without accurate annotation it is impossible to fully understand the dynamic processes involved in such complex polyploid organisms. Hence we have developed and utilised novel procedures to characterise, in detail, SAGE tags generated from the whole grain transcriptome of hexaploid wheat. Results Examination of 71,930 Long SAGE tags generated from six libraries derived from two wheat genotypes grown under two different conditions suggested that SAGE is a reliable and reproducible technique for use in studying the hexaploid wheat transcriptome. However, our results also showed that in poorly annotated and/or poorly sequenced genomes, such as hexaploid wheat, considerably more information can be extracted from SAGE data by carrying out a systematic analysis of both perfect and "fuzzy" (partially matched) tags. This detailed analysis of the SAGE data shows first that while there is evidence of alternative polyadenylation this appears to occur exclusively within the 3' untranslated regions. Secondly, we found no strong evidence for widespread alternative splicing in the developing wheat grain transcriptome. However, analysis of our SAGE data shows that antisense transcripts are probably widespread within the transcriptome and appear to be derived from numerous locations within the genome. 
Examination of antisense transcripts showing sequence similarity to the Puroindoline a and Puroindoline b genes suggests that such antisense transcripts might have a role in the regulation of gene expression. Conclusion Our results indicate that the detailed analysis of transcriptome data, such as SAGE tags, is essential to understand fully the factors that regulate gene expression and that such analysis of the wheat grain transcriptome reveals that antisense transcripts may be widespread and hence probably play a significant role in the regulation of gene expression during grain development.
|
26
|
Genes important in the parasitic life of the nematode Strongyloides ratti. Mol Biochem Parasitol 2007; 158:112-9. [PMID: 18234359 DOI: 10.1016/j.molbiopara.2007.11.016] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 01/11/2007] [Revised: 10/09/2007] [Accepted: 11/28/2007] [Indexed: 11/26/2022]
Abstract
Parasitic nematodes are important pathogens of humans and other animals. The genus Strongyloides has both a parasitic and a free-living adult generation. S. ratti infections of its rat host are negatively affected by the host immune response, such that a month after infection, worms are lost from the hosts. Here we have investigated the changes in parasite gene expression that occur as the anti-S. ratti immune pressure increases. Existing S. ratti expressed sequence tags were used to construct a microarray consisting of 2227 putative genes. This was probed with cDNA prepared from parasites subject to low or high immune pressures. There are significant changes in the gene expression of S. ratti when subject to different immune pressures. Most of the genes whose expression changes have no significant alignment to known genes. These data together with previous S. ratti EST data were then used to identify genes that we hypothesise are central to the parasitic life of S. ratti and, perhaps, other parasitic nematodes. These analyses have identified genes likely to play a key role in the parasitic life of S. ratti; these genes should be the priority for further investigation.
|
27
|
An expressed sequence tag analysis of the life-cycle of the parasitic nematode Strongyloides ratti. Mol Biochem Parasitol 2005; 142:32-46. [PMID: 15907559 DOI: 10.1016/j.molbiopara.2005.03.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Received: 01/24/2005] [Revised: 03/17/2005] [Accepted: 03/17/2005] [Indexed: 10/25/2022]
Abstract
14,761 expressed sequence tags (ESTs) were generated, representing five stages during the parasitic and free-living phases of the life-cycle of the parasitic nematode Strongyloides ratti. These ESTs formed 4152 clusters, of which 97% contained 10 or fewer ESTs and 66% were singletons. These 4152 clusters are likely to represent approximately 20% of S. ratti's genes. The clusters' consensus sequences were used to assign each cluster to one of three databases: (i) Caenorhabditis elegans and C. briggsae sequences; (ii) other nematode sequences; (iii) non-nematode sequences. This approach has identified putative nematode-specific genes that may be targets for developing approaches for parasitic nematode control. Approximately 25% of the clusters have no significant alignments and may therefore represent novel genes. The EST representation between the libraries was used to analyse stage-specific or -biased expression in silico. This showed that 81% of clusters are present in only one library and 12% are present in any two libraries, indicating substantial stage-specificity of gene expression. The 30 most abundantly expressed clusters were analysed in further detail. Many of these have significantly different parasitic- or free-living-specific or -biased expression. Many of the parasitic-specific genes are, as yet, uncharacterised: one of these represents 25% of all ESTs obtained from the parasitic stage.
|
28
|
Alteration of the embryo transcriptome of hexaploid winter wheat (Triticum aestivum cv. Mercia) during maturation and germination. Funct Integr Genomics 2005; 5:144-54. [PMID: 15714317 DOI: 10.1007/s10142-005-0137-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 11/05/2004] [Revised: 01/11/2005] [Accepted: 01/12/2005] [Indexed: 11/28/2022]
Abstract
Grain dormancy and germination are areas of biology that are of considerable interest to the cereal community. We have used a 9,155-feature wheat unigene cDNA microarray resource to investigate changes in the wheat embryo transcriptome during late grain development and maturation and during the first 48 h of postimbibition germination. In the embryo, 392 mRNAs accumulated by twofold or greater over the time course from 21 days postanthesis (dpa) to 40 dpa and on through 1 and 2 days postgermination. These included mRNAs encoding proteins involved in amino acid biosynthesis and metabolism, cell division and subsequent cell development, signal transduction, lipid metabolism, energy production, protein turnover, respiration, initiation of transcription, initiation of translation and ribosomal composition. A number of mRNAs encoding proteins of unknown function also accumulated over the time course. Conversely, 163 sequences showed decreases of twofold or greater over the time course. A small number of mRNAs also showed rapid accumulation specifically during the first 48 h of germination. We also examined alterations in the accumulation of transcripts encoding proteins involved in abscisic acid signalling. Thus, we describe changes in the level of transcripts encoding wheat Viviparous 1 (Vp1) and other interacting proteins. Interestingly, the transcript encoding wheat Viviparous-interacting protein 1 showed a pattern of accumulation that correlates inversely with germination. Our data suggest that the majority of the transcripts required for germination accumulate in the embryo prior to germination and we discuss the implications of these findings with regard to manipulation of germination in wheat.
|
29
|
Abstract
Grain development, germination and plant development under abiotic stresses are areas of biology that are of considerable interest to the cereal community. Within the Investigating Gene Function programme we have produced the resources required to investigate alterations in the transcriptome of hexaploid wheat during these developmental processes. We have single pass sequenced the cDNAs of between 700 and 1300 randomly picked clones from each of 35 cDNA libraries representing highly specific stages of grain and plant development. Annotated sequencing results have been stored in a publicly accessible, online database at http://www.cerealsdb.uk.net. Each of the tissue stages used has also been photographed in detail, resulting in a collection of high-quality micrograph images detailing wheat grain development. These images have been collated and annotated in order to produce a web site focused on wheat development (http://www.wheatbp.net/). We have also produced high-density microarrays of a publicly available wheat unigene set based on the 35 cDNA libraries and have completed a number of microarray experiments which validate their quality.
|
30
|
Genetic diversity within populations of cyanobacteria assessed by analysis of single filaments. Antonie Van Leeuwenhoek 2002; 81:197-202. [PMID: 12448718 DOI: 10.1023/a:1020510516829] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
We have developed a technique for determining the genetic structure of populations of filamentous cyanobacteria. The sequence diversity at specific gene loci is first characterised in a range of clonal cultures; subsequent analysis involves individual trichomes collected directly from natural populations. This technique has been used to examine the population genetic structure of Nodularia in the Baltic Sea and Planktothrix in Lake Zürich. For Nodularia, studies utilising four polymorphic loci reveal that even though there is a degree of linkage disequilibrium, horizontal transfer of genetic information has been sufficient to generate many of the possible allelic combinations. Analyses reveal both spatial and temporal variation in population genetic structure. Other studies of both Nodularia and Planktothrix have shown a correlation between particular alleles at the gvpC locus and the critical pressure of the gas vesicles that accumulate within the cell. We are now investigating how the natural selection of different gas vesicle phenotypes, imposed by changes in the depth of the upper mixed layer of the water column, affects the relative success of individual cyanobacteria possessing different gvpC alleles.
|
31
|
Allele-specific PCR shows that genetic exchange occurs among genetically diverse Nodularia (cyanobacteria) filaments in the Baltic Sea. Microbiology (Reading, England) 2000; 146 (Pt 11):2865-2875. [PMID: 11065365 DOI: 10.1099/00221287-146-11-2865] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Some cyanobacteria have been shown to exchange genetic information under laboratory conditions, but it has not been clear whether such genetic exchange occurs in the natural environment. To address this, a population genetic study was carried out on the filamentous diazotrophic cyanobacterium Nodularia in the Baltic Sea. Nodularia filaments were collected from 20 widely distributed sampling stations in the Baltic Sea during June and July 1998. Allele-specific PCR (AS-PCR) was used to characterize over 2000 filaments at three loci: a non-coding spacer between adjacent copies of the main structural gas vesicle gene gvpA (gvpA-IGS), the phycocyanin intergenic spacer (PC-IGS) and the rDNA internal transcribed spacer (rDNA-ITS). The three loci were all found to be polymorphic in the 1998 population: two alternative alleles were distinguished at the gvpA-IGS and PC-IGS loci, and three at the rDNA-ITS locus. All 12 possible combinations of alleles were found in the filaments studied, but some were much more common than others. The index of association (I_A) for all possible pairwise combinations of isolates was found to differ significantly from zero, which implies that there is some linkage disequilibrium between loci. The I_A values for 16 out of 20 individual sampling stations also differed significantly from zero: this shows that the observed linkage disequilibrium is not due to pooling data from genetically distinct subpopulations. Monte-Carlo simulations with random subsets of the data confirmed that some combinations showed significantly more linkage disequilibrium than expected by chance alone. It is concluded that genetic exchange occurs in the natural Nodularia population, but the frequency is not high enough for the loci to be in linkage equilibrium. The distribution of the 12 genotypes across the Baltic Sea was found to be non-random, but did not correlate with temperature, salinity or major nutrient concentrations. 
A significant relationship was found between the gene diversity among filaments at each station and the distance of the station from the centre of the sampling area: possible reasons for this trend are discussed.
|