1
Clouard C, Nettelblad C. Genotyping of SNPs in bread wheat at reduced cost from pooled experiments and imputation. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:26. [PMID: 38243086 PMCID: PMC10799138 DOI: 10.1007/s00122-023-04533-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 12/19/2023] [Indexed: 01/21/2024]
Abstract
KEY MESSAGE: Pooling and imputation are computational methods that can be combined to achieve cost-effective and accurate high-density genotyping of both common and rare variants, as demonstrated in a MAGIC wheat population. The plant breeding industry has shown growing interest in using genotype data of relevant markers to select new competitive varieties. Selection usually benefits from large amounts of marker data, so it is crucial to have data collection methods that are both cost-effective and reliable. Computational methods such as genotype imputation have been proposed in several plant science studies to address the cost challenge. Genotype imputation methods have, however, been used more frequently and investigated more extensively in human genetics research. The existing algorithms have shown lower accuracy when inferring the genotypes of variants that occur at low frequency, even though these rare variants can have great significance and impact in the genetic studies that underlie selection. In contrast, pooling is a technique that can efficiently identify low-frequency items in a population, and it has been used successfully to detect the samples that carry rare variants. In this study, we propose combining pooling and imputation, and we demonstrate the approach by simulating a hypothetical microarray for genotyping a population of recombinant inbred lines in a cost-effective and accurate manner, even for rare variants. We show that, with an adequate imputation model, it is feasible to predict individual genotypes accurately, at lower cost than sample-wise genotyping, and in a time-effective manner. Moreover, we provide code resources for reproducing the results presented in this study in the form of a containerized workflow.
Affiliation(s)
- Camille Clouard
  - Division of Scientific Computing, Department of Information Technology, Uppsala University, Lägerhyddsvägen 1, 75237, Uppsala, Sweden
- Carl Nettelblad
  - Division of Scientific Computing, Department of Information Technology, Uppsala University, Lägerhyddsvägen 1, 75237, Uppsala, Sweden
  - SciLifeLab, Science for Life Laboratory, Husargatan 3, 75237, Uppsala, Sweden
2
Niehoff T, Pook T, Gholami M, Beissinger T. Imputation of low-density marker chip data in plant breeding: Evaluation of methods based on sugar beet. THE PLANT GENOME 2022; 15:e20257. [PMID: 36258672 DOI: 10.1002/tpg2.20257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 08/02/2022] [Indexed: 06/16/2023]
Abstract
Low-density genotyping followed by imputation reduces genotyping costs while still providing high-density marker information. An increased marker density has the potential to improve the outcome of all applications that are based on genomic data. This study investigates techniques for imputing 1k to 20k genomic markers in plant breeding programs, using sugar beet (Beta vulgaris L. ssp. vulgaris) as an example crop for which these are realistic marker numbers in modern breeding applications. The generally accepted 'gold standard' for imputation, Beagle 5.1, was compared with the recently developed software AlphaPlantImpute2, which is designed specifically for plant breeding. For both Beagle 5.1 and AlphaPlantImpute2, the imputation strategy and the imputation parameters were optimized in this study. We found that the imputation accuracy of Beagle could be improved substantially (from 0.22 to 0.67) by tuning parameters, mainly by lowering the value of the effective population size parameter and increasing the number of iterations performed. Separating the phasing and imputation steps further improved accuracy when optimized parameters were used (from 0.67 to 0.82). We also found that the imputation accuracy of Beagle decreased when more low-density lines were included for imputation. AlphaPlantImpute2 produced very high accuracy without optimization (0.89) and was generally less responsive to optimization. Overall, AlphaPlantImpute2 performed relatively better for imputation, whereas Beagle was better for phasing. Combining both tools yielded the highest accuracies.
Affiliation(s)
- Tobias Niehoff
  - Animal Breeding and Genomics, Wageningen Univ. & Research, Postbox 338, 6700AH, Wageningen, The Netherlands
  - Dep. of Crop Sciences, Division of Plant Breeding Methodology, Univ. of Göttingen, Göttingen, 37075, Germany
- Torsten Pook
  - Animal Breeding and Genomics, Wageningen Univ. & Research, Postbox 338, 6700AH, Wageningen, The Netherlands
  - Dep. of Animal Sciences, Animal Breeding and Genetics Group, Univ. of Göttingen, Göttingen, 37075, Germany
  - Center for Integrated Breeding Research, Univ. of Göttingen, Göttingen, 37075, Germany
- Mahmood Gholami
  - RD-SBCE-BTA, KWS SAAT SE & Co. KGaA, Grimsehlstr. 31, Einbeck, 37574, Germany
- Timothy Beissinger
  - Dep. of Crop Sciences, Division of Plant Breeding Methodology, Univ. of Göttingen, Göttingen, 37075, Germany
  - Center for Integrated Breeding Research, Univ. of Göttingen, Göttingen, 37075, Germany