1
Araujo AC, Johnson JS, Graham JR, Howard J, Huang Y, Oliveira HR, Brito LF. Transgenerational epigenetic heritability for growth, body composition, and reproductive traits in Landrace pigs. Front Genet 2025; 15:1526473. [PMID: 39917178 PMCID: PMC11799271 DOI: 10.3389/fgene.2024.1526473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Accepted: 12/24/2024] [Indexed: 02/09/2025]
Abstract
Epigenetics is an important source of variation in complex traits that is not due to changes in DNA sequences and that depends on the environment to which individuals are exposed. Therefore, we aimed to estimate transgenerational epigenetic heritability, the percentage of resetting of epigenetic marks, and genetic parameters, and to predict breeding values using genetic and epigenetic models for growth, body composition, and reproductive traits in Landrace pigs using routinely recorded datasets. Birth and weaning weight, backfat thickness, total number of piglets born, and number of piglets born alive (BW, WW, BF, TNB, and NBA, respectively) were investigated. Models including epigenetic effects had a similar or better fit than solely genetic models. Including genomic information in epigenetic models resulted in large changes in the variance component estimates. Transgenerational epigenetic heritability estimates ranged from 0.042 (NBA) to 0.336 (BF). The reset coefficient estimates for epigenetic marks were between 80% and 90%. Heritability estimates for the direct additive and maternal genetic effects ranged from 0.040 (BW) to 0.502 (BF) and from 0.034 (BF) to 0.134 (BW), respectively. Repeatability of the reproductive traits ranged from 0.098 (NBA) to 0.148 (TNB). Prediction accuracies, bias, and dispersion of breeding values ranged from 0.199 (BW) to 0.443 (BF), from -0.080 (WW) to 0.034 (NBA), and from -0.134 (WW) to 0.131 (TNB), respectively, with no substantial differences between genetic and epigenetic models. Transgenerational epigenetic heritability estimates are moderate for growth and body composition and low for reproductive traits in North American Landrace pigs. Fitting epigenetic effects in genetic models did not impact the prediction of breeding values.
Affiliation(s)
- Andre C. Araujo
- Department of Animal Sciences, Purdue University, West Lafayette, IN, United States
- Jay S. Johnson
- Livestock Behavior Research Unit, USDA-ARS, West Lafayette, IN, United States
- Jason R. Graham
- Department of Animal Sciences, Purdue University, West Lafayette, IN, United States
- Jeremy Howard
- Smithfield Premium Genetics, Rose Hill, NC, United States
- Yijian Huang
- Smithfield Premium Genetics, Rose Hill, NC, United States
- Hinayah R. Oliveira
- Department of Animal Sciences, Purdue University, West Lafayette, IN, United States
- Luiz F. Brito
- Department of Animal Sciences, Purdue University, West Lafayette, IN, United States
2
Wang H, Li C, Li J, Zhang R, An X, Yuan C, Guo T, Yue Y. Genomic Selection for Weaning Weight in Alpine Merino Sheep Based on GWAS Prior Marker Information. Animals (Basel) 2024; 14:1904. [PMID: 38998016 PMCID: PMC11240623 DOI: 10.3390/ani14131904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/19/2024] [Accepted: 06/24/2024] [Indexed: 07/14/2024]
Abstract
This study aims to compare the accuracy of genomic estimated breeding values (GEBV) estimated using a genomic best linear unbiased prediction (GBLUP) method and GEBV estimates incorporating prior marker information from a genome-wide association study (GWAS) for the weaning weight trait in highland Merino sheep. The objective is to provide theoretical and technical support for improving the accuracy of genomic selection. The study used a population of 1007 highland Merino ewes, with the weaning weight at 3 months as the target trait. The population was randomly divided into two groups. The first group was used for GWAS analysis to identify significant markers, and the top 5%, top 10%, top 15%, and top 20% markers were selected as prior marker information. The second group was used to estimate genetic parameters and compare the accuracy of GEBV predictions using different prior marker information. The accuracy was obtained using a five-fold cross-validation. Finally, both groups were subjected to cross-validation. The study's findings revealed that the heritability of the weaning weight trait, as calculated using the GBLUP model, ranged from 0.122 to 0.394, with corresponding prediction accuracies falling between 0.075 and 0.228. By incorporating prior marker information from GWAS, the heritability was enhanced to a range of 0.125 to 0.407. The inclusion of the top 5% to top 20% significant SNPs from GWAS results as prior information into GS showed potential for improving the accuracy of predicting genomic breeding value.
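The marker-selection and cross-validation steps described in this abstract can be illustrated with a short sketch. It is not the authors' code: the function names `top_fraction` and `k_fold_ids`, the toy p-values, the input format, and the rounding rule for the number of markers kept are all assumptions made for illustration.

```python
import random


def top_fraction(p_values, fraction):
    """Return the SNP ids with the smallest GWAS p-values.

    p_values: dict mapping SNP id -> p-value (hypothetical input format).
    fraction: share of markers to keep, e.g. 0.05 for the top 5%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Assumption: round to the nearest whole marker, keep at least one.
    n_keep = max(1, round(len(p_values) * fraction))
    ranked = sorted(p_values, key=p_values.get)  # most significant first
    return ranked[:n_keep]


def k_fold_ids(ids, k=5, seed=0):
    """Randomly split animal ids into k disjoint validation folds."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Toy data: 20 SNPs with made-up p-values and 10 animals.
p_values = {f"snp{i}": (i + 1) / 100 for i in range(20)}
priors = {f: top_fraction(p_values, f) for f in (0.05, 0.10, 0.15, 0.20)}
folds = k_fold_ids(range(10))
```

Each entry of `priors` would then feed a separate prediction model, and each fold in `folds` would serve once as the validation set while the remaining folds are used for training.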
Affiliation(s)
- Haifeng Wang
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Chenglan Li
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Jianye Li
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Rui Zhang
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Xuejiao An
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Chao Yuan
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Tingting Guo
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Yaojing Yue
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
3
Ablondi M, Stocco G, Cortellari M, Carta A, Summer A, Negro A, Grande S, Crepaldi P, Cipolat-Gotet C, Biffani S. Microsatellite imputation using SNP data for parentage verification in four Italian sheep breeds. J Anim Breed Genet 2024; 141:278-290. [PMID: 38058229 DOI: 10.1111/jbg.12839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 11/16/2023] [Accepted: 11/17/2023] [Indexed: 12/08/2023]
Abstract
Microsatellite markers (MS) have been widely used for parentage verification in most livestock species over the past decades, mainly due to their high polymorphic information content. In the genomic era, the spread of genotype information as single-nucleotide polymorphisms (SNPs) has raised the question of how to use SNPs effectively for parentage testing as well. Despite the clear advantages of SNP panels in terms of cost, accuracy, and automation, the transition from MS to SNP markers for parentage verification is still very slow and, so far, only routinely applied in cattle. A major difficulty during this transition period is the need for SNP data for both parents and offspring, which in most cases is not yet feasible due to the genotyping cost. To overcome the unavailability of the same genotyping platform during the transition period, in this study we aimed to assess the feasibility of an MS imputation pipeline from SNPs in four native dairy sheep breeds: Comisana (N = 331), Massese (N = 210), Delle Langhe (N = 59) and Sarda (N = 1003). Those sheep were genotyped for 11 MS and with the Ovine SNP50 Bead Chip. Prior to imputation, quality control (QC) was performed, and SNPs located within a window of 2 Mb from each MS were selected. The core of the developed pipeline was made up of three steps: (a) storing both MS and SNP data in a Variant Call Format file, (b) masking MS information in a random sample of individuals (10%), (c) imputing masked MS based on non-missing individuals (90%) using an imputation program. The feasibility of the proposed methodology was also assessed across different training-testing split ratios, population sizes, and numbers of flanking SNPs, as well as within and among breeds. The accuracy of the MS imputation was assessed based on the genotype concordance as well as at the parentage verification level in a subset of animals for which the assigned parents' MS were available.
A total of 8 MS passed the QC, and 505 SNPs were located within the ±2 Mb window from each MS, with an average of 63 SNPs per MS. The results were encouraging: when excluding the worst imputed MS (OARAE129), and regardless of the analysis performed (within or across breeds), we achieved an overall concordance rate over 94% for all breeds. In addition, on average, the imputed offspring MS resulted in an equivalent parentage outcome in 94% of the cases when compared to verification using the original MS, highlighting both the feasibility and the eventual practical advantage of using this imputation pipeline.
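The masking and concordance steps of the pipeline (steps b and c, followed by the accuracy check) can be sketched as below. This is a minimal illustration rather than the published pipeline: the function names, the dictionary-based genotype format, and the toy allele sizes are assumptions, and the imputation program itself is replaced by a stand-in that copies the true genotypes back.

```python
import random


def mask_individuals(genotypes, fraction=0.10, seed=0):
    """Hide the microsatellite genotype of a random sample of animals.

    genotypes: dict mapping animal id -> (allele_1, allele_2).
    Returns the masked copy and the set of masked animal ids.
    """
    rng = random.Random(seed)
    ids = sorted(genotypes)
    n_mask = max(1, round(len(ids) * fraction))
    masked_ids = set(rng.sample(ids, n_mask))
    masked = {i: (None if i in masked_ids else gt)
              for i, gt in genotypes.items()}
    return masked, masked_ids


def concordance(true_gt, imputed_gt, masked_ids):
    """Share of masked animals whose imputed genotype matches the truth.

    Allele order within a genotype is ignored.
    """
    hits = sum(sorted(true_gt[i]) == sorted(imputed_gt[i])
               for i in masked_ids)
    return hits / len(masked_ids)


# Toy data: 20 animals, one microsatellite, alleles coded as fragment sizes.
truth = {f"animal{i:02d}": (120 + 2 * (i % 3), 126) for i in range(20)}
masked, masked_ids = mask_individuals(truth)

# Stand-in for the imputation program: copy the truth back with the allele
# order flipped, so the concordance is 1.0 by construction.
imputed = {i: (truth[i][1], truth[i][0]) for i in masked_ids}
rate = concordance(truth, imputed, masked_ids)
```

In a real run, `imputed` would come from the imputation software applied to the masked data, and `rate` would be compared with thresholds such as the 94% concordance reported in the abstract.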
Affiliation(s)
- Michela Ablondi
- Department of Veterinary Science, Università degli studi di Parma, Parma, Italy
- Giorgia Stocco
- Department of Veterinary Science, Università degli studi di Parma, Parma, Italy
- Matteo Cortellari
- Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy
- Antonello Carta
- Unità di Ricerca di Genetica e Biotecnologie, Agris Sardegna, Sassari, Italy
- Andrea Summer
- Department of Veterinary Science, Università degli studi di Parma, Parma, Italy
- Alessio Negro
- Ufficio Studi, Associazione Nazionale della Pastorizia, Rome, Italy
- Silverio Grande
- Ufficio Studi, Associazione Nazionale della Pastorizia, Rome, Italy
- Paola Crepaldi
- Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy
- Stefano Biffani
- Consiglio Nazionale delle Ricerche (CNR), Istituto di Biologia e Biotecnologia Agraria (IBBA), Milan, Italy
4
Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-Gónzalez J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. Mol Plant 2024; 17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]
Abstract
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when using Pearson's correlation as a metric) is an active research area, and accuracy is as yet far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep-learning algorithms could overcome the boundaries of current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
Affiliation(s)
- Admas Alemu
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Johanna Åstrand
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden; Lantmännen Lantbruk, Svalöv, Sweden
- Julio Isidro Y Sánchez
- Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Javier Fernández-Gónzalez
- Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Wuletaw Tadesse
- International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco
- Ramesh R Vetukuri
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Anders S Carlsson
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco, México 52640, Mexico
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Aakash Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
5
Li C, Li J, Wang H, Zhang R, An X, Yuan C, Guo T, Yue Y. Genomic Selection for Live Weight in the 14th Month in Alpine Merino Sheep Combining GWAS Information. Animals (Basel) 2023; 13:3516. [PMID: 38003134 PMCID: PMC10668700 DOI: 10.3390/ani13223516] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/29/2023] [Revised: 10/25/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023]
Abstract
Alpine Merino Sheep is a novel breed developed from Australian Merino Sheep as the sire breed and Gansu Alpine Fine-Wool Sheep as the dam breed. It lives all year in cold and arid alpine areas and has exceptional wool quality and meat performance. Body weight is an important economic trait of the Alpine Merino Sheep, but there is limited research on identifying the genes associated with live weight in the 14th month for improving the accuracy of the genomic prediction of this trait. Therefore, this study's sample comprised 1310 Alpine Merino Sheep ewes, and the Fine Wool Sheep 50K Panel was used for genome-wide association study (GWAS) analysis to identify candidate genes. Moreover, the trial population (1310 ewes) in this study was randomly divided into two groups. One group was used as the population for GWAS analysis and screened for the most significant top 5%, top 10%, top 15%, and top 20% SNPs to obtain prior marker information. The other group was used to estimate the genetic parameters based on the weight assigned by heritability combined with different prior marker information. The aim of this study was to compare the accuracy of genomic breeding value estimation when combined with prior marker information from GWAS analysis with that of the genomic best linear unbiased prediction (GBLUP) method for the breeding value of target traits. Finally, the accuracy was evaluated using the five-fold cross-validation method. This research provides theoretical and technical support to improve the accuracy of sheep genomic selection and better guide breeding. The GWAS analysis identified eight candidate genes, and the gene function query and literature search results suggested that FAM184B, NCAPG, MACF1, ANKRD44, DCAF16, FUK, LCORL, and SYN3 were candidate genes affecting live weight in the 14th month (WT), which regulated the growth of muscle and bone in sheep.
In the genomic selection analysis, the heritability of WT estimated with GBLUP was 0.335-0.374, and the accuracy after five-fold cross-validation was 0.154-0.190. After assigning different weights to the top 5%, top 10%, top 15%, and top 20% of the GWAS results as prior information to construct the G matrix, the accuracy for WT in the GBLUP model improved by 2.59-7.79%.
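The abstract does not state how the GWAS-based weights enter the G matrix. One common construction, assumed here purely for illustration, is a VanRaden-style weighted matrix G = Z D Z' / (2 Σ p_j (1 - p_j)), where Z holds centred genotypes, p_j is the allele frequency of marker j, and D is a diagonal matrix of marker weights scaled to average one. The function name `weighted_g`, the toy genotypes, and the weights are hypothetical.

```python
def weighted_g(genotypes, weights=None):
    """Build a weighted genomic relationship matrix from 0/1/2 genotypes.

    genotypes: list of rows (animals), each a list of allele counts per SNP.
    weights:   one non-negative weight per SNP; None means equal weights,
               which reduces to the unweighted VanRaden-style matrix.
    Assumes at least one SNP is polymorphic (otherwise the scale is zero).
    """
    n, m = len(genotypes), len(genotypes[0])
    if weights is None:
        weights = [1.0] * m
    mean_w = sum(weights) / m
    d = [w / mean_w for w in weights]  # weights rescaled to average 1

    # Allele frequency per SNP, estimated from the genotyped animals.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(pj * (1 - pj) for pj in p)

    # Centre genotypes by twice the allele frequency.
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]

    return [[sum(z[a][j] * d[j] * z[b][j] for j in range(m)) / scale
             for b in range(n)]
            for a in range(n)]


# Toy data: 4 animals x 3 SNPs; the first SNP is treated as a GWAS hit
# and given three times the weight of the others (made-up values).
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
g_plain = weighted_g(genotypes)
g_prior = weighted_g(genotypes, weights=[3.0, 1.0, 1.0])
```

Upweighting a marker increases the contribution of that locus to the relationships between animals, which is the general idea behind using top GWAS SNPs as prior information.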
Affiliation(s)
- Chenglan Li
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Jianye Li
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Haifeng Wang
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Rui Zhang
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Xuejiao An
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Chao Yuan
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Tingting Guo
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
- Yaojing Yue
- Key Laboratory of Animal Genetics and Breeding on Tibetan Plateau, Ministry of Agriculture and Rural Affairs, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (C.L.)
- Sheep Breeding Engineering Technology Research Center of Chinese Academy of Agricultural Sciences, Lanzhou 730050, China
6
Ahmad SF, Singh A, Deb CK, Panda S, Gaur GK, Dutt T, Mishra BP, Kumar A. Evaluation of imputation possibility from low-density SNP panel in composite Vrindavani cattle. Anim Genet 2023; 54:647-648. [PMID: 37336526 DOI: 10.1111/age.13339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/21/2023]
Affiliation(s)
- Akansha Singh
- ICAR-Indian Veterinary Research Institute, Bareilly, India
- Chandan Kumar Deb
- Computer Applications, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Triveni Dutt
- ICAR-Indian Veterinary Research Institute, Bareilly, India
- Amit Kumar
- ICAR-Indian Veterinary Research Institute, Bareilly, India
7
Araujo AC, Carneiro PLS, Oliveira HR, Lewis RM, Brito LF. SNP- and haplotype-based single-step genomic predictions for body weight, wool, and reproductive traits in North American Rambouillet sheep. J Anim Breed Genet 2023; 140:216-234. [PMID: 36408677 PMCID: PMC10099590 DOI: 10.1111/jbg.12748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 10/23/2022] [Indexed: 11/22/2022]
Abstract
Rambouillet sheep are commonly raised in extensive grazing systems in the US, mainly for wool and meat production. Genomic evaluations in US sheep breeds, including Rambouillet, are still incipient. Therefore, we aimed to evaluate the feasibility of performing genomic prediction of breeding values for various traits in Rambouillet sheep based on single nucleotide polymorphisms (SNP) or haplotypes (fitted as pseudo-SNP) under a single-step GBLUP approach. A total of 28,834 records for birth weight (BWT), 23,306 for postweaning weight (PWT), 5,832 for yearling weight (YWT), 9,880 for yearling fibre diameter (YFD), 11,872 for yearling greasy fleece weight (YGFW), and 15,984 for number of lambs born (NLB) were used in this study. Seven hundred forty-one individuals were genotyped using a moderate (50 K; n = 677) or high (600 K; n = 64) density SNP panel, in which 32 K SNP in common between the two SNP panels (after genotypic quality control) were used for further analyses. Single-step genomic predictions using SNP (H-BLUP) or haplotypes (HAP-BLUP) from blocks with different linkage disequilibrium (LD) thresholds (0.15, 0.35, 0.50, 0.65, and 0.80) were evaluated. We also considered different blending parameters when constructing the genomic relationship matrix used to predict the genomic-enhanced estimated breeding values (GEBV), with alpha equal to 0.95 or 0.50. The GEBV were compared to the estimated breeding values (EBV) obtained from traditional pedigree-based evaluations (A-BLUP). The mean theoretical accuracy ranged from 0.499 (A-BLUP for PWT) to 0.795 (HAP-BLUP using haplotypes from blocks with LD threshold of 0.35 and alpha equal to 0.95 for YFD). The prediction accuracies ranged from 0.143 (A-BLUP for PWT) to 0.330 (A-BLUP for YGFW) while the prediction bias ranged from -0.104 (H-BLUP for PWT) to 0.087 (HAP-BLUP using haplotypes from blocks with LD threshold of 0.15 and alpha equal to 0.95 for YGFW). 
The GEBV dispersion ranged from 0.428 (A-BLUP for PWT) to 1.035 (A-BLUP for YGFW). Similar results were observed for H-BLUP or HAP-BLUP, independently of the LD threshold to create the haplotypes, alpha value, or trait analysed. Using genomic information (fitting individual SNP or haplotypes) provided similar or higher prediction and theoretical accuracies and reduced the dispersion of the GEBV for body weight, wool, and reproductive traits in Rambouillet sheep. However, there were no clear improvements in the prediction bias when compared to pedigree-based predictions. The next step will be to enlarge the training populations for this breed to increase the benefits of genomic predictions.
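The abstract reports blending parameters (alpha of 0.95 or 0.50) without giving the formula. A convention often used in single-step evaluations, assumed here and not confirmed by the source, blends the genomic relationship matrix G with the pedigree relationship matrix of the genotyped animals, A22, as G_b = alpha * G + (1 - alpha) * A22. The function name `blend` and the 2 x 2 toy matrices are hypothetical.

```python
def blend(g, a22, alpha=0.95):
    """Blend a genomic (g) and a pedigree (a22) relationship matrix.

    Both inputs are square lists of lists for the same genotyped animals,
    in the same order. alpha is the weight placed on the genomic matrix.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be in [0, 1]")
    return [[alpha * gij + (1 - alpha) * aij
             for gij, aij in zip(g_row, a_row)]
            for g_row, a_row in zip(g, a22)]


# Toy matrices for two genotyped animals (made-up values).
g = [[1.10, 0.30], [0.30, 0.90]]
a22 = [[1.00, 0.25], [0.25, 1.00]]

g_095 = blend(g, a22, alpha=0.95)  # mostly genomic information
g_050 = blend(g, a22, alpha=0.50)  # equal weight on genomic and pedigree
```

A lower alpha pulls the blended matrix towards the pedigree relationships, which also helps keep the matrix invertible when G alone is singular.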
Affiliation(s)
- Andre C. Araujo
- Graduate Program in Animal Sciences, State University of Southwestern Bahia, Itapetinga, Bahia, Brazil
- Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
- Ronald M. Lewis
- Department of Animal Sciences, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
- Luiz F. Brito
- Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
8
Marina H, Pelayo R, Gutiérrez-Gil B, Suárez-Vega A, Esteban-Blanco C, Reverter A, Arranz JJ. Low-density SNP panel for efficient imputation and genomic selection of milk production and technological traits in dairy sheep. J Dairy Sci 2022; 105:8199-8217. [PMID: 36028350 DOI: 10.3168/jds.2021-21601] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/18/2021] [Accepted: 05/30/2022] [Indexed: 11/19/2022]
Abstract
The present study aimed to ascertain how different strategies for leveraging genomic information enhance the accuracy of estimated breeding values for milk and cheese-making traits and to evaluate the implementation of a low-density (LowD) SNP chip designed explicitly for that aim. Thus, milk samples from a total of 2,020 dairy ewes from 2 breeds (1,039 Spanish Assaf and 981 Churra) were collected and analyzed to determine 3 milk production and composition traits and 2 traits related to milk coagulation properties and cheese yield. The 2 studied populations were genotyped with a customized 50K Affymetrix SNP chip (Affymetrix Inc.) containing 55,627 SNP markers. The prediction accuracies were obtained using different multitrait methodologies, such as the BLUP model based on pedigree information, the genomic BLUP (GBLUP), and the BLUP at the SNP level (SNP-BLUP), which are based on genotypic data, and the single-step GBLUP (ssGBLUP), which combines both sources of information. All of these methods were analyzed by cross-validation, comparing predictions of the whole population with the test population sets. Additionally, we describe the design of a LowD SNP chip (3K) and its prediction accuracies through the different methods mentioned previously. Furthermore, the results obtained using the LowD SNP chip were compared with those based on the 50K SNP chip data sets. Finally, we conclude that implementing genomic selection through the ssGBLUP model in the current breeding programs would increase the accuracy of the estimated breeding values compared with the BLUP methodology in the Assaf (from 0.19 to 0.39) and Churra (from 0.27 to 0.44) dairy sheep populations. The LowD SNP chip is cost-effective and has proven to be an accurate tool for estimating genomic breeding values for milk and cheese-making traits, microsatellite imputation, and parentage verification. 
The results presented here suggest that the routine use of this LowD SNP chip could potentially increase the genetic gains of the breeding selection programs of the 2 Spanish dairy sheep breeds considered here.
Affiliation(s)
- H Marina
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
- R Pelayo
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
- B Gutiérrez-Gil
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
- A Suárez-Vega
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
- C Esteban-Blanco
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
- A Reverter
- CSIRO Agriculture & Food, 306 Carmody Rd., St. Lucia, Brisbane, QLD 4067, Australia
- J J Arranz
- Departamento de Producción Animal, Facultad de Veterinaria, Universidad de León, Campus de Vegazana s/n, León 24071, Spain
9
Ye W, Xu L, Li Y, Liu L, Ma Z, Sun D, Han B. Single Nucleotide Polymorphisms of ALDH18A1 and MAT2A Genes and Their Genetic Associations with Milk Production Traits of Chinese Holstein Cows. Genes (Basel) 2022; 13:1437. [PMID: 36011348 PMCID: PMC9407996 DOI: 10.3390/genes13081437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 07/16/2022] [Accepted: 08/10/2022] [Indexed: 11/16/2022]
Abstract
Our preliminary work had suggested two genes, aldehyde dehydrogenase 18 family member A1 (ALDH18A1) and methionine adenosyltransferase 2A (MAT2A), related to amino acid synthesis and metabolism as candidates affecting milk traits by analyzing the liver transcriptome and proteome of dairy cows at different lactation stages. In this study, the single nucleotide polymorphisms (SNPs) of ALDH18A1 and MAT2A genes were identified and their genetic effects and underlying causative mechanisms on milk production traits in dairy cattle were analyzed, with the aim of providing effective genetic information for the molecular breeding of dairy cows. By resequencing the entire coding and partial flanking regions of ALDH18A1 and MAT2A, we found eight SNPs located in ALDH18A1 and two in MAT2A. Single-SNP association analysis showed that most of the 10 SNPs of these two genes were significantly associated with the milk yield traits, 305-day milk yield, fat yield, and protein yield in the first and second lactations (corrected p ≤ 0.0488). Using Haploview 4.2, we found that the seven SNPs of ALDH18A1 formed two haplotype blocks; subsequently, the haplotype-based association analysis showed that both haplotypes were significantly associated with 305-day milk yield, fat yield, and protein yield (corrected p ≤ 0.014). Furthermore, by Jaspar and Genomatix software, we found that 26:g.17130318 C>A and 11:g.49472723G>C, respectively, in the 5′ flanking region of ALDH18A1 and MAT2A genes changed the transcription factor binding sites (TFBSs), which might regulate the expression of corresponding genes to affect the phenotypes of milk production traits. Therefore, these two SNPs were considered as potential functional mutations, but they also require further verification. In summary, ALDH18A1 and MAT2A were proved to probably have genetic effects on milk production traits, and their valuable SNPs might be used as candidate genetic markers for dairy cattle’s genomic selection (GS).
Affiliation(s)
- Wen Ye
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Engineering Laboratory for Animal Breeding, China Agricultural University, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Beijing 100193, China
- Lingna Xu
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Engineering Laboratory for Animal Breeding, China Agricultural University, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Beijing 100193, China
- Yanhua Li
- Beijing Dairy Cattle Center, Beijing 100192, China
- Lin Liu
- Beijing Dairy Cattle Center, Beijing 100192, China
- Zhu Ma
- Beijing Dairy Cattle Center, Beijing 100192, China
- Dongxiao Sun
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Engineering Laboratory for Animal Breeding, China Agricultural University, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Beijing 100193, China
- Bo Han
- Department of Animal Genetics and Breeding, College of Animal Science and Technology, National Engineering Laboratory for Animal Breeding, China Agricultural University, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Beijing 100193, China
|
10
|
Stolpovsky YA, Kuznetsov SB, Solodneva EV, Shumov ID. New Cattle Genotyping System Based on DNA Microarray Technology. RUSS J GENET+ 2022. [DOI: 10.1134/s1022795422080099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
11
|
Genotyping, the Usefulness of Imputation to Increase SNP Density, and Imputation Methods and Tools. Methods in Molecular Biology (Clifton, N.J.) 2022; 2467:113-138. [PMID: 35451774 DOI: 10.1007/978-1-0716-2205-6_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
Abstract
Imputation has become a standard practice in modern genetic research to increase genome coverage and improve the accuracy of genomic selection and genome-wide association studies, as a large number of samples can be genotyped at lower density (and lower cost) and imputed up to denser marker panels or to sequence level using information from a limited reference population. Most genotype imputation algorithms use information from relatives and population linkage disequilibrium. A number of imputation software packages were originally developed for human genetics and, more recently, for animal and plant genetics, taking into account pedigree information and very sparse SNP arrays or genotyping-by-sequencing data. In comparison to human populations, the population structures of farmed species and their limited effective sizes make it possible to accurately impute high-density genotypes or sequences from very low-density SNP panels and a limited set of reference individuals. Whatever the imputation method, the imputation accuracy, measured by the correct imputation rate or the correlation between true and imputed genotypes, increases with the relatedness of the individual to be imputed to its more densely genotyped ancestors and with its own genotype density. Increasing the imputation accuracy increases the genomic selection accuracy, whatever the genomic evaluation method. Given the marker densities, the most important factors affecting imputation accuracy are the size of the reference population and the relationship between individuals in the reference and target populations.
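The two accuracy measures named in this abstract (correct imputation rate and correlation between true and imputed genotypes) can be illustrated with a minimal sketch. The function name `imputation_accuracy`, the 0/1/2 genotype coding, and the toy genotype vectors are assumptions made for illustration; they are not taken from the chapter.

```python
import numpy as np


def imputation_accuracy(true_geno, imputed_geno):
    """Return (concordance rate, Pearson correlation) for genotypes coded 0/1/2.

    Hypothetical helper: the coding and the name are illustrative assumptions.
    """
    true_geno = np.asarray(true_geno, dtype=float)
    imputed_geno = np.asarray(imputed_geno, dtype=float)
    # Correct imputation rate: share of genotypes imputed exactly right.
    concordance = float(np.mean(true_geno == imputed_geno))
    # Correlation between true and imputed genotypes.
    correlation = float(np.corrcoef(true_geno, imputed_geno)[0, 1])
    return concordance, correlation


# Toy data (made up): 8 genotypes, 2 of which are imputed incorrectly.
true_geno = [0, 1, 2, 1, 0, 2, 1, 0]
imputed_geno = [0, 1, 2, 2, 0, 2, 1, 1]
concordance, correlation = imputation_accuracy(true_geno, imputed_geno)
print(f"concordance={concordance:.3f}, correlation={correlation:.3f}")
```

In this toy case six of the eight genotypes match, so the concordance rate is 0.75. The correlation is often preferred when rare alleles are involved, because the concordance rate can look high simply by imputing the common genotype.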
|
12
|
Yan X, Zhang T, Liu L, Yu Y, Yang G, Han Y, Gong G, Wang F, Zhang L, Liu H, Li W, Yan X, Mao H, Li Y, Du C, Li J, Zhang Y, Wang R, Lv Q, Wang Z, Zhang J, Liu Z, Wang Z, Su R. Accuracy of Genomic Selection for Important Economic Traits of Cashmere and Meat Goats Assessed by Simulation Study. Front Vet Sci 2022; 9:770539. [PMID: 35372544 PMCID: PMC8966406 DOI: 10.3389/fvets.2022.770539] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/04/2021] [Accepted: 01/24/2022] [Indexed: 11/13/2022] Open
Abstract
Genomic selection in plants and animals has become a standard tool for breeding because of the advantages of high accuracy and short generation intervals. Implementation of this technology is hindered by the high cost of genotyping and other factors. The aim of this study was to determine an optimal marker density panel and reference population size for genomic selection in goats, and to speculate on the number of QTLs that affect the important economic traits of goats. In addition, the effect of the buck population size in the reference population on the accuracy of the genomic estimated breeding value (GEBV) was examined. Based on previous genetic evaluation results for Inner Mongolia White Cashmere Goats, live body weight (LBW, h² = 0.11) and fiber diameter (FD, h² = 0.34) were chosen for genomic selection in this study. Reasonable genome parameters and generation transmission processes were set, and phenotypic and genotypic data for the two traits were simulated. Then, reference and validation populations of different sizes were selected from the progeny. The GEBVs were obtained by six methods: GBLUP (Genomic Best Linear Unbiased Prediction), ssGBLUP (Single Step Genomic Best Linear Unbiased Prediction), BayesA, BayesB, Bayesian ridge regression, and Bayesian LASSO. The correlation coefficient between the predicted and realized phenotypes from simulation was calculated and used as a measure of the accuracy of GEBV for each trait. The results showed that the medium marker density panel (45 K) could be used for genomic selection in goats while ensuring the accuracy of the GEBV. A reference population size of 1,500 achieved greater genetic progress in genomic selection for fiber diameter and live body weight than smaller reference populations. The accuracy of the GEBV for live body weight and fiber diameter was better when the number of QTLs was 100 and 50, respectively. Additionally, the accuracy of GEBV was good when the buck population size was up to 200. Meanwhile, the accuracy of the GEBV for the medium-heritability trait (FD) was higher than that for the low-heritability trait (LBW). These findings will provide theoretical guidance for genomic selection in goats using real data.
Affiliation(s)
- Xiaochun Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Tao Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Inner Mongolia Bigvet Co., Ltd., Hohhot, China
- Lichun Liu
- College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot, China
- Yongsheng Yu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Guang Yang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Yaqian Han
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Gao Gong
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Fenghong Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Lei Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Hongfu Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Wenze Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Xiaomin Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Haoyu Mao
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Yaming Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Chen Du
- Department of Obstetrics and Gynaecology, Inner Mongolia Medical University, Hohhot, China
- Jinquan Li
- Key Laboratory of Mutton Sheep Genetics and Breeding, Ministry of Agriculture, Hohhot, China
- Key Laboratory of Animal Genetics, Breeding and Reproduction in Inner Mongolia Autonomous Region, Hohhot, China
- Engineering Research Centre for Goat Genetics and Breeding, Inner Mongolia Autonomous Region, Hohhot, China
- Yanjun Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Ruijun Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Qi Lv
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhixin Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Jiaxin Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhihong Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Zhiying Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- *Correspondence: Zhiying Wang
- Rui Su
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- *Correspondence: Rui Su
|
13
|
Impact of Marker Pruning Strategies Based on Different Measurements of Marker Distance on Genomic Prediction in Dairy Cattle. Animals (Basel) 2021; 11:ani11071992. [PMID: 34359120 PMCID: PMC8300388 DOI: 10.3390/ani11071992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/06/2021] [Revised: 06/27/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022] Open
Abstract
Simple Summary: The usefulness of genomic prediction (GP) has been widely demonstrated in breeding analyses of livestock, plant, and aquatic populations. It is well known that ‘marker density’ is a critical factor affecting the accuracy of GP; however, how to properly measure ‘marker density’ in GP is yet to be determined. With population-level whole-genome sequence data or high-density single nucleotide polymorphism (SNP) data available, this question can be answered more convincingly. In this study, we investigated and discussed the impact of four ‘marker density’ measures that reflect genetic or physical distances between SNPs on the accuracy of GP in a German Holstein dairy cattle population. Our results showed that the degree of variation of the physical distance between adjacent SNPs had significant effects on the accuracy of GP, while the genetic distance between SNPs had no relationship with the accuracy of GP. Therefore, for studies based on high-density SNP data, the default strategy of pruning SNPs based on genetic distance is detrimental to heritability estimation and genomic prediction. The results extend the community's knowledge of ‘marker density’ and provide useful suggestions for the application of, and research on, genomic prediction.
Abstract: With the availability of high-density single-nucleotide polymorphism (SNP) data and the development of genotype imputation methods, high-density panel-based genomic prediction (GP) has become possible in livestock breeding. It is generally considered that the accuracy of the genomic estimated breeding value (GEBV) increases with marker density, yet studies have shown that the GEBV accuracy does not increase, or even decreases, when high-density panels are used. Therefore, in addition to the SNP number, other measurements of ‘marker density’ seem to affect the GEBV accuracy, and exploring the relationship between the GEBV accuracy and the measurements of ‘marker density’ based on high-density SNP or whole-genome sequence data is important for the field of GP. In this study, we constructed different SNP panels with certain SNP numbers (e.g., 1 k) by using the physical distance (PhyD), genetic distance (GenD) and random distance (RanD) between SNPs, respectively, based on the high-density SNP data of a German Holstein dairy cattle population. There are therefore three different panels at each SNP number level. These panels were used to construct GP models to predict fat percentage, milk yield and somatic cell score. Meanwhile, the mean ($\bar{d}$) and variance ($\sigma_d^2$) of the physical distance between SNPs and the mean ($\overline{r^2}$) and variance ($\sigma_{r^2}^2$) of the genetic distance between SNPs in each panel were used as marker density-related measurements, and their influence on the GEBV accuracy was investigated. At the same SNP number level, the $\bar{d}$ of all panels is basically the same, but $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ differ. Therefore, we only investigated the effects of $\sigma_d^2$, $\overline{r^2}$ and $\sigma_{r^2}^2$ on the GEBV accuracy. The results showed that, at a given SNP number level, the GEBV accuracy was negatively correlated with $\sigma_d^2$, but not with $\overline{r^2}$ or $\sigma_{r^2}^2$. Compared with GenD and RanD, the $\sigma_d^2$ of panels constructed by PhyD is smaller. The low- and moderate-density panels (< 50 k) constructed by RanD or GenD have large $\sigma_d^2$, which is not conducive to genomic prediction. The GEBV accuracy of the low- and moderate-density panels constructed by PhyD is 3.8~34.8% higher than that of the low- and moderate-density panels constructed by RanD and GenD. Panels with 20–30 k SNPs constructed by PhyD can achieve the same or slightly higher GEBV accuracy than high-density SNP panels for all three traits. In summary, the smaller the variation in physical distance between adjacent SNPs, the higher the GEBV accuracy. The low- and moderate-density panels constructed by physical distance are beneficial to genomic prediction, while pruning high-density SNP data based on genetic distance is detrimental to genomic prediction. The results provide suggestions for the development of SNP panels and for research on genomic prediction based on whole-genome sequence data.
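The observation that panels with the same SNP number share nearly the same mean spacing but differ in spacing variance can be reproduced with a small sketch. Everything here is an assumption for illustration: the helper `spacing_stats`, the 1 Mb toy chromosome, the 101-SNP panel size, and the uniform random placement used as a stand-in for a RanD-style panel are not taken from the study.

```python
import numpy as np


def spacing_stats(positions_bp):
    """Return (mean, variance) of the distance between adjacent SNPs.

    Hypothetical helper; positions are base-pair coordinates on one chromosome.
    """
    pos = np.sort(np.asarray(positions_bp, dtype=float))
    gaps = np.diff(pos)  # physical distance between neighbouring SNPs
    return float(gaps.mean()), float(gaps.var())


chrom_length = 1_000_000  # toy chromosome length in bp (assumption)
n_snps = 101              # same SNP number for both panels

# Evenly spaced panel: a stand-in for selection by physical distance.
even_panel = np.linspace(0, chrom_length, n_snps)

# Randomly placed panel with the same end points and SNP number.
rng = np.random.default_rng(0)
inner = rng.uniform(0, chrom_length, n_snps - 2)
random_panel = np.concatenate(([0.0, float(chrom_length)], inner))

even_mean, even_var = spacing_stats(even_panel)
rand_mean, rand_var = spacing_stats(random_panel)
print(f"even:   mean={even_mean:.1f} bp, variance={even_var:.3g}")
print(f"random: mean={rand_mean:.1f} bp, variance={rand_var:.3g}")
```

Both panels span the same region with the same number of SNPs, so the mean gap is identical (10,000 bp), while the variance of the gaps is close to zero only for the evenly spaced panel. This is why the mean spacing cannot distinguish the panels and the variance can.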
|
14
|
Rabier C, Grusea S. Prediction in high‐dimensional linear models and application to genomic selection under imperfect linkage disequilibrium. J R Stat Soc Ser C Appl Stat 2021. [DOI: 10.1111/rssc.12496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Charles‐Elie Rabier
- ISE‐M, UMR 5554, CNRS, IRD, Université de Montpellier, France
- IMAG, UMR 5149, CNRS, Université de Montpellier, France
- LIRMM, UMR 5506, CNRS, Université de Montpellier, France
- Simona Grusea
- Institut de Mathématiques de Toulouse, Université de Toulouse, INSA de Toulouse, France
|
15
|
Validation of the Prediction Accuracy for 13 Traits in Chinese Simmental Beef Cattle Using a Preselected Low-Density SNP Panel. Animals (Basel) 2021; 11:ani11071890. [PMID: 34202066 PMCID: PMC8300368 DOI: 10.3390/ani11071890] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/24/2021] [Revised: 06/08/2021] [Accepted: 06/15/2021] [Indexed: 11/17/2022] Open
Abstract
Simple Summary: To reduce breeding costs and promote the application of genomic selection (GS) in Chinese Simmental beef cattle, we developed a customized low-density single-nucleotide polymorphism (SNP) panel consisting of 30,684 SNPs. When comparing the predictive performance of the low-density SNP panel to that of the BovineHD Beadchip for 13 traits, we found that this ~30 K panel achieved moderate to high prediction accuracies for most traits, while reducing the prediction accuracies of six traits by 0.04–0.09 and decreasing the prediction accuracy of one trait by 0.2. For the remaining six traits, the use of the low-density SNP panel was associated with a slight increase in prediction accuracy. Our study suggests that the low-density SNP panel (~30 K) is a feasible and promising tool for cost-effective genomic prediction in Chinese Simmental beef cattle, which may provide breeding organizations with a cheaper option and greater returns on investment.
Abstract: Chinese Simmental beef cattle play a key role in the Chinese beef industry due to their great adaptability and marketability. To achieve efficient genetic gain at a low breeding cost, it is crucial to develop a customized cost-effective low-density SNP panel for this cattle population. Thirteen growth, carcass, and meat quality traits and BovineHD Beadchip genotypes of 1346 individuals were used to select trait-associated variants and variants contributing to great genetic variance. In addition, highly informative SNPs with high MAF in each 500 kb sliding window and in each genic region were also included separately. A low-density SNP panel consisting of 30,684 SNPs was developed, with an imputation accuracy of 97.4% when imputed to the 770 K level. Among 13 traits, the average prediction accuracy levels evaluated by genomic best linear unbiased prediction (GBLUP) and BayesA/B/Cπ were 0.22–0.47 and 0.18–0.60 for the ~30 K array and BovineHD Beadchip, respectively. Generally, the predictive performance of the ~30 K array was trait-dependent, with reduced prediction accuracies for seven traits. While differences in terms of prediction accuracy were observed among the 13 traits, the low-density SNP panel achieved moderate to high accuracies for most of the traits and even improved the accuracies for some traits.
|
16
|
Toro Ospina AM, Aguilar I, Vargas de Oliveira MH, Cruz Dos Santos Correia LE, Vercesi Filho AE, Albuquerque LG, de Vasconcelos Silva JAI. Assessing the accuracy of imputation in the Gyr breed using different SNP panels. Genome 2021; 64:893-899. [PMID: 34057850 DOI: 10.1139/gen-2020-0081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
The aim of this study was to evaluate the accuracy of imputation in a Gyr population using two medium-density panels (Bos taurus - Bos indicus) and to test whether the inclusion of the Nellore breed increases the imputation accuracy in the Gyr population. The database consisted of 289 Gyr females from Brazil genotyped with the GGP Bovine LDv4 chip containing 30 000 SNPs and 158 Gyr females from Colombia genotyped with the GGP indicus chip containing 35 000 SNPs. A customized chip was created that contained the information of 9109 SNPs (9K) to test the imputation accuracy in Gyr populations; 604 Nellore animals with information of LD SNPs tested in the scenarios were included in the reference population. Four scenarios were tested: LD9K_30KGIR, LD9K_35INDGIR, LD9K_30KGIR_NEL, and LD9K_35INDGIR_NEL. Principal component analysis (PCA) was computed for the genomic matrix and sample-specific imputation accuracies were calculated using Pearson's correlation (CS) and the concordance rate (CR) for imputed genotypes. The results of PCA of the Colombian and Brazilian Gyr populations demonstrated the genomic relationship between the two populations. The CS and CR ranged from 0.88 to 0.94 and from 0.93 to 0.96, respectively. Among the scenarios tested, the highest CS (0.94) was observed for the LD9K_30KGIR scenario. The present results highlight the importance of the choice of chip for imputation in the Gyr breed. However, the variation in SNPs may reduce the imputation accuracy even when the chip of the Bos indicus subspecies is used.
Affiliation(s)
- Ignacio Aguilar
- Instituto Nacional de Investigación Agropecuaria, INIA, Montevideo, Uruguay
- Lucia Galvão Albuquerque
- Faculdade de Ciências Agrárias e Veterinárias - Unesp, CEP 14.884-900, Jaboticabal, São Paulo, Brasil
|
17
|
Pirosanto Y, Laseca N, Valera M, Molina A, Moreno-Millán M, Bugno-Poniewierska M, Ross P, Azor P, Demyda-Peyrás S. Screening and detection of chromosomal copy number alterations in the domestic horse using SNP-array genotyping data. Anim Genet 2021; 52:431-439. [PMID: 34013628 DOI: 10.1111/age.13077] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/07/2020] [Revised: 03/25/2021] [Accepted: 04/22/2021] [Indexed: 12/27/2022]
Abstract
Chromosomal abnormalities are a common cause of infertility in horses. However, they are difficult to detect using automated methods. Here, we propose a simple methodology based on single nucleotide polymorphism (SNP)-array data that allows us to detect the main chromosomal abnormalities in horses in a single procedure. As proof of concept, we were able to detect chromosomal abnormalities in 33 out of 268 individuals, including monosomies, chimerisms, and male and female sex-reversions, by analyzing the raw signal intensity produced by an SNP array-based genotyping platform. We also demonstrated that the procedure is not affected by the SNP density of the array employed or by the inbreeding level of the individuals. Finally, the methodology proposed in this study could be performed in an open bioinformatic environment, thus permitting its integration as a flexible screening tool in diagnostic laboratories and genomic breeding programs.
Affiliation(s)
- Y Pirosanto
- Facultad de Ciencias Veterinarias, Universidad Nacional de La Plata, Calle 60 y 118 s/n, La Plata, 1900, Argentina; IGEVET (UNLP-CONICET LA PLATA), Facultad de Ciencias Veterinarias, UNLP, Calle 60 y 118 s/n, La Plata, 1900, Argentina
- N Laseca
- Laboratorio de Diagnóstico Genético Veterinario, Departamento de Genética, Universidad de Córdoba, CN IV KM 396, Edificio Gregor Mendel, Campus Rabanales, Córdoba, 14071, España
- M Valera
- Departamento de Agronomía, Escuela Técnica Superior de Ingeniería Agronómica, Universidad de Sevilla, Ctra. de Utrera km 1, Sevilla, 41013, España
- A Molina
- Laboratorio de Diagnóstico Genético Veterinario, Departamento de Genética, Universidad de Córdoba, CN IV KM 396, Edificio Gregor Mendel, Campus Rabanales, Córdoba, 14071, España
- M Moreno-Millán
- Laboratorio de Diagnóstico Genético Veterinario, Departamento de Genética, Universidad de Córdoba, CN IV KM 396, Edificio Gregor Mendel, Campus Rabanales, Córdoba, 14071, España
- M Bugno-Poniewierska
- Katedra Rozrodu, Anatomii i Genomiki Zwierząt Wydział Hodowli i Biologii Zwierząt, Uniwersytet Rolniczy im. Hugona Kołłątaja w Krakowie, al. Mickiewicza 24/28, Krakow, 30-059, Poland
- P Ross
- Department of Animal Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
- P Azor
- Asociación Nacional de Criadores de Caballos de Pura Raza Española (ANCCE), Edif. Indotorre · Avda. del Reino Unido 11, pl. 3ª 2, Sevilla, 41012, España
- S Demyda-Peyrás
- Facultad de Ciencias Veterinarias, Universidad Nacional de La Plata, Calle 60 y 118 s/n, La Plata, 1900, Argentina; IGEVET (UNLP-CONICET LA PLATA), Facultad de Ciencias Veterinarias, UNLP, Calle 60 y 118 s/n, La Plata, 1900, Argentina
|
18
|
Accuracy of Imputation of Microsatellite Markers from a 50K SNP Chip in Spanish Assaf Sheep. Animals (Basel) 2021; 11:ani11010086. [PMID: 33466430 PMCID: PMC7824810 DOI: 10.3390/ani11010086] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/24/2020] [Revised: 12/30/2020] [Accepted: 12/31/2020] [Indexed: 11/23/2022] Open
Abstract
Simple Summary: Parentage misassignments directly affect genetic gain in traditional breeding programs. The use of genetic markers facilitates parentage verification. In sheep, microsatellite markers and single nucleotide polymorphism (SNP) markers have been proposed by the International Society for Animal Genetics (ISAG) for parentage testing. Since the implementation of genomic selection, the microsatellite information used for parentage testing in previous generations is gradually being replaced by SNPs. However, parentage verifications should all be performed using the same technology. A strategy for transitioning from microsatellites to SNP markers, while avoiding extra genotyping costs, is the imputation of microsatellite alleles from SNP haplotypes. This study aims to identify the optimum approach, using a minimum number of SNPs to accurately impute microsatellite markers, and to develop a low-density SNP chip for parentage verification in the Assaf sheep breed. The imputation approach described here reached high accuracies using a low number of SNP markers, which supports the development of a low-density SNP chip that could avoid the problems of genotyping with both technologies and provide a cost-effective method for parentage testing. This study will help sheep breeders to perform parentage verification when different genotyping platforms have been used across generations.
Abstract: Transitioning from traditional to new genotyping technologies requires the development of bridging methodologies to avoid extra genotyping costs. This study aims to identify the optimum number of single nucleotide polymorphisms (SNPs) necessary to accurately impute microsatellite markers in order to develop a low-density SNP chip for parentage verification in the Assaf sheep breed. The accuracy of microsatellite marker imputation was assessed with three metrics: genotype concordance (C), genotype dosage (length r²), and allelic dosage (allelic r²), for all imputation scenarios tested (0.5–10 Mb microsatellite flanking SNP windows). The imputation accuracy for the three metrics, across all haplotype lengths tested, was higher than 0.90 (C), 0.80 (length r²), and 0.75 (allelic r²), indicating strong genotype concordance. The 2 Mb window provides the best accuracy for the imputation procedure and for the design of an affordable low-density SNP chip for parentage testing. We additionally evaluated imputation performance under two null models, naive (imputing the most common allele) and random (imputing by randomly selecting the allele), which in comparison showed weak genotype concordances (0.41 and 0.15, respectively). We therefore describe a precise methodology to impute multiallelic microsatellite genotypes from a low-density SNP chip in sheep and solve the problem of parentage verification when different genotyping platforms have been used across generations.
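The two null models mentioned in this abstract (naive: always impute the most common allele; random: impute a randomly selected allele) can be sketched as baselines against which an imputation method is compared. This is a simplified allele-level illustration, not the study's procedure: the allele sizes, the `concordance` helper, and the choice to draw random alleles uniformly from the distinct reference alleles are all assumptions.

```python
import random
from collections import Counter


def concordance(true_alleles, predicted_alleles):
    """Share of positions where the predicted allele equals the true allele."""
    matches = sum(t == p for t, p in zip(true_alleles, predicted_alleles))
    return matches / len(true_alleles)


# Made-up microsatellite allele sizes (bp) for illustration only.
reference = [152, 152, 154, 152, 158, 154, 152, 160]
target = [152, 154, 152, 152, 158, 152, 160, 152, 154, 152]

# Naive null model: always impute the most common reference allele.
most_common = Counter(reference).most_common(1)[0][0]
naive_pred = [most_common] * len(target)

# Random null model: impute an allele drawn uniformly from the distinct
# reference alleles (an assumption; the seed only makes the run repeatable).
rng = random.Random(42)
distinct = sorted(set(reference))
random_pred = [rng.choice(distinct) for _ in target]

naive_c = concordance(target, naive_pred)
random_c = concordance(target, random_pred)
print(f"naive concordance:  {naive_c:.2f}")
print(f"random concordance: {random_c:.2f}")
```

With these toy data the naive model matches 6 of the 10 target alleles, because 152 is the most common allele in both sets. A real imputation method is only useful to the extent that it clearly exceeds such baselines, which is the comparison the abstract reports.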
|
19
|
Development and validation of a RAD-Seq target-capture based genotyping assay for routine application in advanced black tiger shrimp (Penaeus monodon) breeding programs. BMC Genomics 2020; 21:541. [PMID: 32758142 PMCID: PMC7430818 DOI: 10.1186/s12864-020-06960-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/09/2020] [Accepted: 07/29/2020] [Indexed: 11/26/2022] Open
Abstract
Background: The development of genome-wide genotyping resources has provided terrestrial livestock and crop industries with the unique ability to accurately assess genomic relationships between individuals, uncover the genetic architecture of commercial traits, and identify superior individuals for selection based on their specific genetic profile. Utilising recent advancements in de-novo genome-wide genotyping technologies, it is now possible to provide aquaculture industries with these same important genotyping resources, even in the absence of existing genome assemblies. Here, we present the development of a genome-wide SNP assay for the Black Tiger shrimp (Penaeus monodon) through utilisation of a reduced-representation whole-genome genotyping approach (DArTseq).
Results: Based on a single reduced-representation library, 31,262 polymorphic SNPs were identified across 650 individuals obtained from Australian wild stocks and commercial aquaculture populations. After filtering to remove SNPs with low read depth, low MAF, low call rate, deviation from HWE, and non-Mendelian inheritance, 7542 high-quality SNPs were retained. From these, 4236 high-quality genome-wide loci were selected for baits-probe development and 4194 SNPs were included within a finalized target-capture genotype-by-sequence assay (DArTcap). This assay was designed for routine and cost effective commercial application in large scale breeding programs, and demonstrates higher confidence in genotype calls through increased call rate (from 80.2 ± 14.7 to 93.0% ± 3.5%), increased read depth (from 20.4 ± 15.6 to 80.0 ± 88.7), as well as a 3-fold reduction in cost over traditional genotype-by-sequencing approaches.
Conclusion: Importantly, this assay equips the P. monodon industry with the ability to simultaneously assign parentage of communally reared animals, undertake genomic relationship analysis, manage mate pairings between cryptic family lines, and undertake advanced studies of genome and trait architecture. Critically, this assay can be cost effectively applied as P. monodon breeding programs transition to undertaking genomic selection.
|
20
|
Wei C, Luo H, Zhao B, Tian K, Huang X, Wang Y, Fu X, Tian Y, Di J, Xu X, Wu W, Tulafu H, Yasen M, Zhang Y, Zhao W. The Effect of Integrating Genomic Information into Genetic Evaluations of Chinese Merino Sheep. Animals (Basel) 2020; 10:ani10040569. [PMID: 32231053 PMCID: PMC7222387 DOI: 10.3390/ani10040569] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/12/2020] [Revised: 03/24/2020] [Accepted: 03/24/2020] [Indexed: 01/06/2023] Open
Abstract
Simple Summary: Genetic improvement of wool production and quality traits in fine-wool sheep is an appealing option for enhancing the market value of wool products. We estimated genetic parameters and the accuracies of estimated breeding values for various wool production and quality traits in fine-wool sheep using pedigree-based best linear unbiased prediction (PBLUP) and single-step genomic best linear unbiased prediction (ssGBLUP) strategies. ssGBLUP performed slightly better than PBLUP for the studied traits. Therefore, the single-step genetic evaluation method could be successfully implemented in genomic evaluations of fine-wool sheep and the prediction of future breeding values in young Merino sheep as part of an early preselection strategy in the near future.
Abstract: Genomic evaluations are a method for improving the accuracy of breeding value estimation. This study aimed to compare estimates of genetic parameters and the accuracy of breeding values for wool traits in Merino sheep between pedigree-based best linear unbiased prediction (PBLUP) and single-step genomic best linear unbiased prediction (ssGBLUP) using Bayesian inference. Data were collected from 28,391 yearlings of Chinese Merino sheep (classified in 1992–2018) at the Xinjiang Gonaisi Fine Wool Sheep-Breeding Farm, China. Subjectively-assessed wool traits, namely, spinning count (SC), crimp definition (CRIM), oil (OIL), and body size (BS), and objectively-measured traits, namely, fleece length (FL), greasy fleece weight (GFW), mean fiber diameter (MFD), crimp number (CN), and body weight pre-shearing (BWPS), were analyzed. The estimates of heritability for wool traits were low to moderate. The largest h² values were observed for FL (0.277) and MFD (0.290) with ssGBLUP. The heritabilities estimated for wool traits with ssGBLUP were slightly higher than those obtained with PBLUP. The accuracies of breeding values were low to moderate, ranging from 0.362 to 0.573 for the whole population and from 0.318 to 0.676 for the genotyped subpopulation. The correlation between the estimated breeding values (EBVs) and genomic EBVs (GEBVs) ranged from 0.717 to 0.862 for the whole population, and the relative increase in accuracy when comparing EBVs with GEBVs ranged from 0.372% to 7.486% for these traits. However, in the genotyped population, the rank correlation between the estimates obtained with PBLUP and ssGBLUP was reduced to 0.525 to 0.769, with increases in average accuracy of 3.016% to 11.736% for the GEBVs in relation to the EBVs. Thus, genomic information could allow us to more accurately estimate the relationships between animals and improve estimates of heritability and the accuracy of breeding values by ssGBLUP.
Affiliation(s)
- Chen Wei
- College of Animal Science, Xinjiang Agricultural University, Urumqi 830052, China
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Hanpeng Luo
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Bingru Zhao
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Kechuan Tian
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Correspondence: (K.T.); (X.H.); (Y.W.); Tel.: +86-1590-900-1963 (K.T.); +86-1399-999-6861 (X.H.); +86-1580-159-5851 (Y.W.)
- Xixia Huang
- College of Animal Science, Xinjiang Agricultural University, Urumqi 830052, China
- Correspondence: (K.T.); (X.H.); (Y.W.); Tel.: +86-1590-900-1963 (K.T.); +86-1399-999-6861 (X.H.); +86-1580-159-5851 (Y.W.)
- Yachun Wang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Correspondence: (K.T.); (X.H.); (Y.W.); Tel.: +86-1590-900-1963 (K.T.); +86-1399-999-6861 (X.H.); +86-1580-159-5851 (Y.W.)
- Xuefeng Fu
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Yuezhen Tian
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Jiang Di
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Xinming Xu
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Weiwei Wu
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Hanikezi Tulafu
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Maerziya Yasen
- Key Laboratory of Genetics Breeding and Reproduction of Xinjiang Cashmere and Wool Sheep, Institute of Animal Science, Xinjiang Academy of Animal Science, Urumqi 830011, China
- Yajun Zhang
- Xinjiang Gonaisi Fine Wool Sheep-Breeding Farm, Ili Kazak Autonomous Prefecture 835800, China
- Wensheng Zhao
- Xinjiang Gonaisi Fine Wool Sheep-Breeding Farm, Ili Kazak Autonomous Prefecture 835800, China
|
21
|
Wu XL, Li H, Ferretti R, Simpson B, Walker J, Parham J, Mastro L, Qiu J, Schultz T, Tait RG, Bauck S. A unified local objective function for optimally selecting SNPs on arrays for agricultural genomics applications. Anim Genet 2020; 51:306-310. [PMID: 32004392 DOI: 10.1111/age.12916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/09/2020] [Indexed: 11/28/2022]
Abstract
Over the years, ad-hoc procedures were used for designing SNP arrays, but the procedures and strategies varied considerably case by case. Recently, a multiple-objective, local optimization (MOLO) algorithm was proposed to select SNPs for SNP arrays, which maximizes the adjusted SNP information (E score) under multiple constraints, e.g. on MAF, uniformness of SNP locations (U score), the inclusion of obligatory SNPs and the number and size of gaps. In the MOLO, each chromosome is split into equally spaced segments and local optima are selected as the SNPs having the highest adjusted E score within each segment, conditional on the presence of obligatory SNPs. The computation of the adjusted E score, however, is empirical, and it does not scale well between the uniformness of SNP locations and SNP informativeness. In addition, the MOLO objective function does not accommodate the selection of uniformly distributed SNPs. In the present study, we proposed a unified local function for optimally selecting SNPs, as an amendment to the MOLO algorithm. This new local function takes scalable weights between the uniformness and informativeness of SNPs, which allows the selection of SNPs under varied scenarios. The results showed that the weighting between the U and the E scores led to a higher imputation concordance rate than the U score or E score alone. The results from the evaluation of six commercial bovine SNP chips further confirmed this conclusion.
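The segment-wise selection described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' objective function: the abstract does not give the formulas for the U and E scores, so `u_score` (closeness to the segment midpoint) and `e_score` (scaled expected heterozygosity) are hypothetical stand-ins, and obligatory SNPs and gap constraints are not handled. The weight `w` plays the role of the scalable weighting between uniformness and informativeness.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    name: str
    pos: int    # base-pair position on the chromosome
    maf: float  # minor allele frequency, between 0 and 0.5


def e_score(maf: float) -> float:
    """Hypothetical informativeness score: heterozygosity scaled to [0, 1]."""
    return 4.0 * maf * (1.0 - maf)


def u_score(pos: int, seg_start: float, seg_len: float) -> float:
    """Hypothetical uniformness score: 1 at the segment midpoint, 0 at its edges."""
    half = seg_len / 2.0
    return 1.0 - abs(pos - (seg_start + half)) / half


def select_snps(snps: List[Snp], chrom_len: int, n_segments: int, w: float) -> List[Snp]:
    """Pick one SNP per equally spaced segment by maximizing w*U + (1-w)*E."""
    seg_len = chrom_len / n_segments
    bins: List[List[Snp]] = [[] for _ in range(n_segments)]
    for s in snps:
        # A SNP sitting exactly at the chromosome end goes into the last segment.
        k = min(int(s.pos // seg_len), n_segments - 1)
        bins[k].append(s)

    chosen = []
    for k, members in enumerate(bins):
        if not members:
            continue  # empty segment: nothing to select (a gap remains)
        start = k * seg_len
        chosen.append(max(
            members,
            key=lambda s: w * u_score(s.pos, start, seg_len) + (1.0 - w) * e_score(s.maf),
        ))
    return chosen


# Made-up example panel on a 200 bp "chromosome" split into two segments.
panel = [Snp("a", 10, 0.5), Snp("b", 50, 0.1), Snp("c", 150, 0.3), Snp("d", 190, 0.5)]
print([s.name for s in select_snps(panel, 200, 2, w=0.5)])  # ['b', 'c']
```

With `w = 1.0` the sketch selects purely on location and with `w = 0.0` purely on informativeness, which mirrors the two single-score extremes that the abstract compares against the weighted combination.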
Affiliation(s)
- X-L Wu
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA; Department of Animal Sciences, University of Wisconsin, Madison, WI, 53706, USA
- H Li
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA; Department of Animal Sciences, University of Wisconsin, Madison, WI, 53706, USA
- R Ferretti
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- B Simpson
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- J Walker
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- J Parham
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- L Mastro
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- J Qiu
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- T Schultz
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- R G Tait
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
- S Bauck
- Bioinformatics and Biostatistics, Neogen GeneSeek, Lincoln, NE, 68504, USA
|
22
|
Ballesta P, Bush D, Silva FF, Mora F. Genomic Predictions Using Low-Density SNP Markers, Pedigree and GWAS Information: A Case Study with the Non-Model Species Eucalyptus cladocalyx. Plants (Basel) 2020; 9:E99. [PMID: 31941085 PMCID: PMC7020392 DOI: 10.3390/plants9010099] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/25/2019] [Revised: 12/20/2019] [Accepted: 01/09/2020] [Indexed: 11/16/2022]
Abstract
High-throughput genotyping techniques have enabled large-scale genomic analysis to precisely predict complex traits in many plant species. However, not all species can be well represented in commercial SNP (single nucleotide polymorphism) arrays. In this study, a high-density SNP array (60 K) developed for commercial Eucalyptus was used to genotype a breeding population of Eucalyptus cladocalyx, yielding only ~3.9 K informative SNPs. Traditional Bayesian genomic models were investigated to predict flowering, stem quality and growth traits by considering the following effects: (i) polygenic background and all informative markers (GS model) and (ii) polygenic background, QTL-genotype effects (determined by GWAS) and SNP markers that were not associated with any trait (GSq model). The estimates of pedigree-based heritability and genomic heritability varied from 0.08 to 0.34 and 0.002 to 0.5, respectively, whereas the predictive ability varied from 0.19 (GS) to 0.45 (GSq). The GSq approach outperformed GS models in terms of predictive ability when the proportion of the variance explained by the significant marker-trait associations was higher than that explained by the polygenic background and non-significant markers. This approach can be particularly useful for plant/tree species poorly represented in high-density SNP arrays developed for economically important species, or when high-density marker panels are not available.
Affiliation(s)
- Paulina Ballesta
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
- David Bush
- CSIRO–Australian Tree Seed Centre, Acton 2601, Australia
- Fabyano Fonseca Silva
- Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Freddy Mora
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
|
23
|
O'Brien AC, Judge MM, Fair S, Berry DP. High imputation accuracy from informative low-to-medium density single nucleotide polymorphism genotypes is achievable in sheep. J Anim Sci 2019; 97:1550-1567. [PMID: 30722011 DOI: 10.1093/jas/skz043] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/28/2018] [Accepted: 01/30/2019] [Indexed: 12/29/2022] Open
Abstract
The objective of the present study was to quantify the accuracy of imputing medium-density single nucleotide polymorphism (SNP) genotypes from lower-density panels (384 to 12,000 SNPs) derived using alternative selection methods to select the most informative SNPs. Four different selection methods were used to select SNPs based on genomic characteristics (i.e., minor allele frequency (MAF) and linkage disequilibrium (LD)) within five sheep breeds (642 Belclare, 645 Charollais, 715 Suffolk, 440 Texel, and 620 Vendeen) separately. Selection methods evaluated included (i) random, (ii) splitting the genome into blocks of equal length and selecting SNPs within each block based on MAF and LD patterns, (iii) equidistant location while optimizing MAF, and (iv) a combination of MAF, distance from already selected SNPs, and weak LD with the SNP(s) already selected. All animals were genotyped on the Illumina OvineSNP50 Beadchip containing 51,135 SNPs, of which 44,040 remained after edits. Within each breed separately, the youngest 100 animals were assumed to represent the validation population; the remaining animals represented the reference population. Imputation was undertaken under three different conditions: (i) SNPs were selected within a given breed and imputed for all breeds individually, (ii) all breeds were collectively used to select SNPs and were included as the reference population, and (iii) the SNPs were selected for each breed separately and imputation was undertaken for all breeds but excluding from the reference population the breed from which the SNPs were selected. Regardless of SNP selection method, mean animal allele concordance rate improved at a diminishing rate while the variability in mean animal allele concordance rate reduced as the panel density increased. The SNP selection method impacted the accuracy of imputation, although the effect reduced as the density of the panel increased. Overall, the most accurate SNP selection method for panels with <9,000 SNPs was that based on MAF and LD pattern within genomic blocks. The mean animal allele concordance rate varied from 0.89 in Texel to 0.97 in Vendeen. Greater imputation accuracy was achieved when SNPs were selected and imputed within each breed individually compared with when SNPs were selected across all breeds and imputed using a multi-breed reference population. In all, results indicate that accurate genotype imputation to medium density is achievable with low-density genotype panels with at least 6,000 SNPs.
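The abstract reports results as a mean animal allele concordance rate without defining it. The sketch below uses one commonly used definition, which is an assumption here rather than something the abstract states: genotypes are coded as allele counts (0, 1, 2), and each SNP contributes two alleles, of which `2 - |true - imputed|` count as correctly imputed. The function names and data are hypothetical.

```python
from typing import Dict, List


def allele_concordance(true_geno: List[int], imputed_geno: List[int]) -> float:
    """Proportion of alleles imputed correctly for one animal (0/1/2 coding)."""
    if not true_geno or len(true_geno) != len(imputed_geno):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # Each SNP carries two alleles; a 0-vs-2 mismatch gets both alleles wrong.
    correct = sum(2 - abs(t - i) for t, i in zip(true_geno, imputed_geno))
    return correct / (2 * len(true_geno))


def mean_animal_concordance(
    true_by_animal: Dict[str, List[int]],
    imputed_by_animal: Dict[str, List[int]],
) -> float:
    """Average the per-animal allele concordance over all validation animals."""
    rates = [
        allele_concordance(true_by_animal[a], imputed_by_animal[a])
        for a in true_by_animal
    ]
    return sum(rates) / len(rates)


# Made-up validation animals with four masked SNPs each.
true_g = {"ewe1": [0, 1, 2, 2], "ewe2": [1, 1, 0, 2]}
imputed_g = {"ewe1": [0, 1, 1, 0], "ewe2": [1, 1, 0, 2]}
print(mean_animal_concordance(true_g, imputed_g))  # 0.8125
```

Averaging within animals first, then across animals, matches the "mean animal" wording; a per-SNP average would weight the same errors differently when animals have unequal numbers of called genotypes.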
Affiliation(s)
- Aine C O'Brien
- Animal and Grassland Research and Innovation Centre, Teagasc, Moorepark, Fermoy, Co. Cork, Ireland; Laboratory of Animal Reproduction, Department of Biological Sciences, Faculty of Science and Engineering, University of Limerick, Limerick, Ireland
- Michelle M Judge
- Animal and Grassland Research and Innovation Centre, Teagasc, Moorepark, Fermoy, Co. Cork, Ireland
- Sean Fair
- Laboratory of Animal Reproduction, Department of Biological Sciences, Faculty of Science and Engineering, University of Limerick, Limerick, Ireland
- Donagh P Berry
- Animal and Grassland Research and Innovation Centre, Teagasc, Moorepark, Fermoy, Co. Cork, Ireland
|
24
|
Bolormaa S, Chamberlain AJ, Khansefid M, Stothard P, Swan AA, Mason B, Prowse-Wilkins CP, Duijvesteijn N, Moghaddar N, van der Werf JH, Daetwyler HD, MacLeod IM. Accuracy of imputation to whole-genome sequence in sheep. Genet Sel Evol 2019; 51:1. [PMID: 30654735 PMCID: PMC6337865 DOI: 10.1186/s12711-018-0443-5] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/19/2018] [Accepted: 12/18/2018] [Indexed: 12/12/2022] Open
Abstract
Background: The use of whole-genome sequence (WGS) data for genomic prediction and association studies is highly desirable because the causal mutations should be present in the data. The sequencing of 935 sheep from a range of breeds provides the opportunity to impute sheep genotyped with single nucleotide polymorphism (SNP) arrays to WGS. This study evaluated the accuracy of imputation from SNP genotypes to WGS using this reference population of 935 sequenced sheep.
Results: The accuracy of imputation from the Ovine Infinium® HD BeadChip SNP (~500 k) to WGS was assessed for three target breeds: Merino, Poll Dorset and F1 Border Leicester × Merino. Imputation accuracy was highest for the Poll Dorset breed, although there were more Merino individuals in the sequenced reference population than Poll Dorset individuals. In addition, empirical imputation accuracies were higher (by up to 1.7%) when using larger multi-breed reference populations compared to using a smaller single-breed reference population. The mean accuracy of imputation across target breeds using the Minimac3 or the FImpute software was 0.94. The empirical imputation accuracy varied considerably across the genome; six chromosomes carried regions of one or more Mb with a mean imputation accuracy of <0.7. Imputation accuracy in five variant annotation classes ranged from 0.87 (missense) up to 0.94 (intronic variants), where lower accuracy corresponded to higher proportions of rare alleles. The imputation quality statistic reported from Minimac3 (R²) had a clear positive relationship with the empirical imputation accuracy. Therefore, by first discarding imputed variants with an R² below 0.4, the mean empirical accuracy across target breeds increased to 0.97. Although accuracy of genomic prediction was less affected by filtering on R² in a multi-breed population of sheep with imputed WGS, the genomic heritability clearly tended to be lower when using variants with an R² ≤ 0.4.
Conclusions: The mean imputation accuracy was high for all target breeds and was increased by combining smaller breed sets into a multi-breed reference. We found that the Minimac3 software imputation quality statistic (R²) was a useful indicator of empirical imputation accuracy, enabling removal of very poorly imputed variants before downstream analyses.
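The filtering step mentioned in this abstract (discarding imputed variants whose quality statistic falls below 0.4 before averaging empirical accuracy) can be sketched as follows. The record layout, field names (`r2`, `accuracy`), and numbers are hypothetical and chosen only for illustration; the sketch does not read any real Minimac3 output format.

```python
from typing import Dict, List

Variant = Dict[str, float]  # hypothetical record: {"r2": ..., "accuracy": ...}


def filter_by_r2(variants: List[Variant], threshold: float = 0.4) -> List[Variant]:
    """Keep variants whose imputation quality statistic is at least the threshold."""
    return [v for v in variants if v["r2"] >= threshold]


def mean_accuracy(variants: List[Variant]) -> float:
    """Average empirical imputation accuracy over a set of variants."""
    if not variants:
        raise ValueError("no variants left to average")
    return sum(v["accuracy"] for v in variants) / len(variants)


# Made-up variants: one poorly imputed, two well imputed.
variants = [
    {"r2": 0.2, "accuracy": 0.5},
    {"r2": 0.4, "accuracy": 0.9},
    {"r2": 0.9, "accuracy": 1.0},
]
kept = filter_by_r2(variants)
print(len(kept), mean_accuracy(variants), mean_accuracy(kept))
```

In this toy data the mean accuracy rises after filtering because the quality statistic tracks empirical accuracy, which is the relationship the abstract reports; with real data the gain depends on how many low-quality variants are present.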
Affiliation(s)
- Sunduimijid Bolormaa
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia; Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Amanda J Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia
- Majid Khansefid
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia; Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Paul Stothard
- Faculty of Agricultural, Life and Environmental Sciences, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- Andrew A Swan
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; Animal Genetics and Breeding Unit, University of New England, Armidale, NSW, 2351, Australia
- Brett Mason
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia
- Claire P Prowse-Wilkins
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia
- Naomi Duijvesteijn
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
- Nasir Moghaddar
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
- Julius H van der Werf
- Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
- Hans D Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia; Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3086, Australia
- Iona M MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Rd, Bundoora, VIC, 3083, Australia; Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
|
25
|
Aliloo H, Mrode R, Okeyo AM, Ni G, Goddard ME, Gibson JP. The feasibility of using low-density marker panels for genotype imputation and genomic prediction of crossbred dairy cattle of East Africa. J Dairy Sci 2018; 101:9108-9127. [PMID: 30077450 DOI: 10.3168/jds.2018-14621] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 02/22/2018] [Accepted: 05/26/2018] [Indexed: 11/19/2022]
Abstract
Cost-effective high-density (HD) genotypes of livestock species can be obtained by genotyping a proportion of the population using an HD panel and the remainder using a cheaper low-density panel, and then imputing the missing genotypes that are not directly assayed in the low-density panel. The efficacy of genotype imputation can largely be affected by the structure and history of the specific target population and it should be checked before incorporating imputation in routine genotyping practices. Here, we investigated the efficacy of imputation in crossbred dairy cattle populations of East Africa using 4 different commercial single nucleotide polymorphism (SNP) panels, 3 reference populations, and 3 imputation algorithms. We found that Minimac and a reference population, which included a mixture of crossbred and ancestral purebred animals, provided the highest imputation accuracy compared with other scenarios of imputation. The accuracies of imputation, measured as the correlation between real and imputed genotypes averaged across SNP, were around 0.76 and 0.94 for 7K and 40K SNP, respectively, when imputed up to a 770K panel. We also presented a method to maximize the imputation accuracy of low-density panels, which relies on the pairwise (co)variances between SNP and the minor allele frequency of SNP. The performance of the developed method was tested in a 5-fold cross-validation process where various densities of SNP were selected using the (co)variance method and also by alternative SNP selection methods and then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at almost all marker densities, with accuracies being up to 0.19 higher than the random selection of SNP. The accuracies of imputation from 7K and 40K panels selected using the (co)variance method were around 0.80 and 0.94, respectively. The presented method also achieved higher accuracy of genomic prediction at lower densities of selected SNP. The squared correlation between genomic breeding values estimated using imputed genotypes and those from the real 770K HD panel was 0.95 when the accuracy of imputation was 0.64. The presented method for SNP selection is straightforward in its application and can ensure high accuracies in genotype imputation of crossbred dairy populations in East Africa.
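The accuracy measure named in this abstract, the correlation between real and imputed genotypes averaged across SNP, can be sketched as below. The matrix layout (rows are animals, columns are SNPs), the function names, and the handling of monomorphic SNPs (skipped, because the correlation is undefined) are assumptions made for this illustration and are not taken from the study.

```python
from math import sqrt
from typing import List, Optional


def pearson(x: List[float], y: List[float]) -> Optional[float]:
    """Pearson correlation, or None when either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(real: List[List[float]], imputed: List[List[float]]) -> float:
    """Mean per-SNP correlation between real and imputed genotypes."""
    n_snps = len(real[0])
    correlations = []
    for j in range(n_snps):
        r = pearson([row[j] for row in real], [row[j] for row in imputed])
        if r is not None:  # assumption: monomorphic SNPs are left out of the mean
            correlations.append(r)
    if not correlations:
        raise ValueError("no SNP with defined correlation")
    return sum(correlations) / len(correlations)


# Made-up genotypes for four animals at two SNPs (0/1/2 allele counts).
real = [[0, 0], [1, 2], [2, 0], [1, 2]]
imputed = [[0, 0], [1, 2], [2, 2], [1, 0]]
print(imputed_accuracy := imputation_accuracy(real, imputed))  # 0.5
```

Because the measure is a correlation, it accepts imputed allele dosages (non-integer values) as well as hard genotype calls, unlike a concordance rate.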
Affiliation(s)
- H Aliloo
- School of Environmental and Rural Science, University of New England, Armidale, NSW 2350, Australia
- R Mrode
- International Livestock Research Institute (ILRI), PO Box 30709, Nairobi, Kenya; Scotland's Rural College, Easter Bush, Midlothian EH25 9RG, Scotland, United Kingdom
- A M Okeyo
- International Livestock Research Institute (ILRI), PO Box 30709, Nairobi, Kenya
- G Ni
- School of Environmental and Rural Science, University of New England, Armidale, NSW 2350, Australia
- M E Goddard
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia; Faculty of Veterinary and Agricultural Sciences, Department of Agriculture and Food Systems, The University of Melbourne, Parkville, VIC 3010, Australia
- J P Gibson
- School of Environmental and Rural Science, University of New England, Armidale, NSW 2350, Australia
|
26
|
Stratz P, Schiller KF, Wellmann R, Preuss S, Baes C, Bennewitz J. Genetic parameter estimates and targeted association analyses of growth, carcass, and meat quality traits in German Merinoland and Merinoland-cross lambs. J Anim Sci 2018; 96:398-406. [PMID: 29385607 DOI: 10.1093/jas/sky012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/05/2023] Open
Abstract
In this study, genetic parameters of nine growth, carcass, and meat quality (MQ) traits were estimated, and targeted association studies were conducted using mixed models. Phenotypic information was collected on 1,599 lambs, including both purebred Merinoland (ML) animals and five different F1 crosses. The F1 lambs were produced by mating rams of the meat-type breeds Charollais, Ile de France, German Blackheaded Mutton (Deutsches Schwarzköpfiges Fleischschaf), Suffolk, and Texel with ML ewes. Between four and six sires were used per sire breed. In total, 29 sires and 298 purebred ML sheep were genotyped with the Illumina OvineSNP50 BeadChip. All F1 individuals were genotyped for 289 SNPs located on the chromosomes 1, 2, 3, 18, and 21. These SNPs were used to impute SNPs on five chromosomes of the Illumina Ovine chip in the F1 individuals. Several Bonferroni-corrected significant associations were identified for shoulder width. A number of additional significant associations were found for other traits. Genetic parameters were estimated and single-marker association analyses were performed with breed-specific effects. Moderate heritability estimates were found for average daily gain (0.23), kidney fat weight (0.19), carcass length (0.15), shoulder width (0.33), subcutaneous fat thickness (0.22), and cutlet area (0.36). While heritability for cooking loss was found to be low (0.07), shear force (0.17) and dressing percentage (0.20) showed moderate heritability, and thus might be candidate traits to be included in the selection index in the population. In general, low phenotypic and low or moderate genetic correlations were detected between the traits.
Affiliation(s)
- Patrick Stratz
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- Robin Wellmann
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- Siegfried Preuss
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
- Christine Baes
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Canada
- Jörn Bennewitz
- Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
|
27
|
Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins. Genetica 2017; 146:137-149. [PMID: 29243001 DOI: 10.1007/s10709-017-0004-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/18/2017] [Accepted: 12/08/2017] [Indexed: 10/18/2022]
Abstract
SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies, and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications on the design of LD SNP chips for imputation-enabled genomic prediction.
|
28
|
Raoul J, Swan AA, Elsen JM. Using a very low-density SNP panel for genomic selection in a breeding program for sheep. Genet Sel Evol 2017; 49:76. [PMID: 29065868 PMCID: PMC5655911 DOI: 10.1186/s12711-017-0351-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 05/19/2017] [Accepted: 10/17/2017] [Indexed: 01/11/2023] Open
Abstract
Background: Building an efficient reference population for genomic selection is an issue when the recorded population is small and phenotypes are poorly informed, which is often the case in sheep breeding programs. Using stochastic simulation, we evaluated a genomic design based on a reference population with medium-density genotypes [around 45 K single nucleotide polymorphisms (SNPs)] of dams that were imputed from very low-density genotypes (≤ 1000 SNPs).
Methods: A population under selection for a maternal trait was simulated using real genotypes. Genetic gains realized from classical selection and genomic selection designs were compared. Genomic selection scenarios that differed in reference population structure (whether or not dams were included in the reference) and genotype quality (medium-density or imputed to medium-density from very low-density) were evaluated.
Results: The genomic design increased genetic gain by 26% when the reference population was based on sire medium-density genotypes and by 54% when the reference population included both sire and dam medium-density genotypes. When medium-density genotypes of male candidates and dams were replaced by imputed genotypes from very low-density SNP genotypes (1000 SNPs), the increase in gain was 22% for the sire reference population and 42% for the sire and dam reference population. The rate of increase in inbreeding was lower (from −20 to −34%) for the genomic design than for the classical design regardless of the genomic scenario.
Conclusions: We show that very low-density genotypes of male candidates and dams combined with an imputation process result in a substantial increase in genetic gain for small sheep breeding programs.
Affiliation(s)
- Jérôme Raoul
- Institut de l'Elevage, Castanet-Tolosan, France; GenPhySE, INRA, Castanet-Tolosan, France
- Andrew A Swan
- Animal Genetics and Breeding Unit, University of New England, Armidale, Australia
|
29
|
Ponomarenko P, Ryutov A, Maglinte DT, Baranova A, Tatarinova TV, Gai X. Clinical utility of the low-density Infinium QC genotyping Array in a genomics-based diagnostics laboratory. BMC Med Genomics 2017; 10:57. [PMID: 28985730 PMCID: PMC5639583 DOI: 10.1186/s12920-017-0297-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/17/2017] [Accepted: 10/02/2017] [Indexed: 11/10/2022] Open
Abstract
Background: With 15,949 markers, the low-density Infinium QC Array-24 BeadChip enables linkage analysis, HLA haplotyping, fingerprinting, ethnicity determination, mitochondrial genome variations, blood groups and pharmacogenomics. It represents an attractive independent QC option for NGS-based diagnostic laboratories, and provides cost-efficient means for determining gender, ethnic ancestry, and sample kinships, which are important for data interpretation of NGS-based genetic tests.
Methods: We evaluated accuracy and reproducibility of Infinium QC genotyping calls by comparing them with genotyping data of the same samples from other genotyping platforms and whole genome/exome sequencing. Accuracy and robustness of determining gender, provenance, and kinships were assessed.
Results: Concordance of genotype calls between Infinium QC and other platforms was above 99%. Here we show that the chip's ancestry informative markers are sufficient for ethnicity determination at continental and sometimes subcontinental levels, with assignment accuracy varying with the coverage for a particular region and ethnic groups. Mean accuracies of provenance prediction at a regional level varied from 81% for Asia to 100% for India (86% for Africa, 89% for the Americas, 97% for Oceania, and 98% for Europe). Mean accuracy of ethnicity assignment predictions was 63%. Pairwise concordances of AFR samples with the samples from any other super populations were the lowest (0.39–0.43), while the concordances within the same population were relatively high (0.55–0.61). For all populations except African, cross-population comparisons were similar in their concordance ranges to the range of within-population concordances (0.54–0.57). Gender determination was correct in all tested cases.
Conclusions: Our results indicate that the Infinium QC Array-24 chip is suitable for cost-efficient, independent QC assaying in the settings of an NGS-based molecular diagnostic laboratory; hence, we recommend its integration into the standard laboratory workflow. Low-density chips can provide sample-specific measures for variant call accuracy, prevent sample mix-ups, validate self-reported ethnicities, and detect consanguineous cases. Integration of low-density chips into QC procedures aids proper interpretation of candidate sequence variants. To enhance utility of this low-density chip, we recommend expansion of ADME and mitochondrial markers. Inexpensive Infinium-like low-density human chips have a potential to become a "Swiss army knife" among genotyping assays suitable for many applications requiring high-throughput assays.
Affiliation(s)
- Petr Ponomarenko
  - Department of Biology, University of La Verne, La Verne, CA, USA
- Alex Ryutov
  - Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, CA, USA
- Dennis T Maglinte
  - Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, CA, USA
- Ancha Baranova
  - School of Systems Biology, George Mason University, Fairfax, VA, USA
  - Research Center for Medical Genetics, Moscow, Russia
  - Atlas Biomed Group, Moscow, Russia
- Tatiana V Tatarinova
  - Department of Biology, University of La Verne, La Verne, CA, USA
  - School of Systems Biology, George Mason University, Fairfax, VA, USA
  - Atlas Biomed Group, Moscow, Russia
- Xiaowu Gai
  - Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, CA, USA
  - Department of Pathology and Laboratory Medicine, USC Keck School of Medicine, Los Angeles, CA, USA
30
Oliveira Júnior GA, Chud TCS, Ventura RV, Garrick DJ, Cole JB, Munari DP, Ferraz JBS, Mullart E, DeNise S, Smith S, da Silva MVGB. Genotype imputation in a tropical crossbred dairy cattle population. J Dairy Sci 2017; 100:9623-9634. [PMID: 28987572 DOI: 10.3168/jds.2017-12732] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/14/2017] [Accepted: 08/16/2017] [Indexed: 11/19/2022]
Abstract
The objective of this study was to investigate different strategies for genotype imputation in a population of crossbred Girolando (Gyr × Holstein) dairy cattle. The data set consisted of 478 Girolando, 583 Gyr, and 1,198 Holstein sires genotyped at high density with the Illumina BovineHD (Illumina, San Diego, CA) panel, which includes ∼777K markers. The accuracy of imputation from low (20K) and medium densities (50K and 70K) to the HD panel density and from low to 50K density was investigated. Seven scenarios using different reference populations (RPop) considering Girolando, Gyr, and Holstein breeds separately or combinations of animals of these breeds were tested for imputing genotypes of 166 randomly chosen Girolando animals. The population genotype imputation was performed using FImpute. Imputation accuracy was measured as the correlation between observed and imputed genotypes (CORR) and also as the proportion of genotypes that were imputed correctly (CR). This is the first paper on imputation accuracy in a Girolando population. The sample-specific imputation accuracies ranged from 0.38 to 0.97 (CORR) and from 0.49 to 0.96 (CR) when imputing from low and medium densities to HD, and from 0.41 to 0.95 (CORR) and from 0.50 to 0.94 (CR) for imputation from 20K to 50K. The CORRanim exceeded 0.96 (for 50K and 70K panels) when only Girolando animals were included in RPop (S1). We found smaller CORRanim when Gyr (S2) was used instead of Holstein (S3) as RPop. The same behavior was observed between S4 (Gyr + Girolando) and S5 (Holstein + Girolando) because the target animals were more related to the Holstein population than to the Gyr population. The highest imputation accuracies were observed for scenarios including Girolando animals in the reference population, whereas using only Gyr animals resulted in low imputation accuracies, suggesting that the haplotypes segregating in the Girolando population had a greater effect on accuracy than the purebred haplotypes.
All chromosomes had similar imputation accuracies (CORRsnp) within each scenario. Crossbred animals (Girolando) must be included in the reference population to provide the best imputation accuracies.
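
The two accuracy measures named in this abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as 0/1/2 allele counts in animal-by-SNP arrays; the function name `imputation_accuracy` and the toy genotypes are hypothetical and are not taken from the paper or from FImpute.

```python
import numpy as np

def imputation_accuracy(observed, imputed):
    """Per-animal imputation accuracy for genotypes coded 0/1/2.

    observed, imputed: arrays of shape (n_animals, n_snps).
    Returns (corr, cr): Pearson correlation between observed and imputed
    genotypes, and the proportion of genotypes imputed correctly.
    """
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)

    # CR: share of SNPs whose imputed genotype equals the observed one.
    cr = (observed == imputed).mean(axis=1)

    # CORR: Pearson correlation computed row by row (one value per animal).
    obs_c = observed - observed.mean(axis=1, keepdims=True)
    imp_c = imputed - imputed.mean(axis=1, keepdims=True)
    num = (obs_c * imp_c).sum(axis=1)
    den = np.sqrt((obs_c ** 2).sum(axis=1) * (imp_c ** 2).sum(axis=1))
    # The correlation is undefined (NaN) when an animal has no genotype variation.
    corr = np.divide(num, den, out=np.full_like(num, np.nan), where=den > 0)
    return corr, cr

# Toy data (made up): two animals, four SNPs, one imputation error.
observed = [[0, 1, 2, 1], [2, 2, 0, 1]]
imputed = [[0, 1, 2, 2], [2, 2, 0, 1]]
corr, cr = imputation_accuracy(observed, imputed)
```

Swapping the axis from 1 to 0 would give the per-SNP version of both measures instead of the per-animal one.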
Affiliation(s)
- Gerson A Oliveira Júnior
  - Departamento de Medicina Veterinária, Universidade de São Paulo (USP), Faculdade de Zootecnia e Engenharia de Alimentos, Pirassununga, SP, 13635-900, Brazil
- Tatiane C S Chud
  - Departamento de Ciências Exatas, Universidade Estadual Paulista (Unesp), Faculdade de Ciências Agrárias e Veterinárias, Jaboticabal, SP, 14884-900, Brazil
- Ricardo V Ventura
  - Beef Improvement Opportunities, Guelph, ON N1K1E5, Canada
  - Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON N1G2W1, Canada
- Dorian J Garrick
  - Department of Animal Science, Iowa State University, Ames 50011-3150
- John B Cole
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD, 20705-2350
- Danísio P Munari
  - Departamento de Ciências Exatas, Universidade Estadual Paulista (Unesp), Faculdade de Ciências Agrárias e Veterinárias, Jaboticabal, SP, 14884-900, Brazil
- José B S Ferraz
  - Departamento de Medicina Veterinária, Universidade de São Paulo (USP), Faculdade de Zootecnia e Engenharia de Alimentos, Pirassununga, SP, 13635-900, Brazil
31
Bolormaa S, Swan AA, Brown DJ, Hatcher S, Moghaddar N, van der Werf JH, Goddard ME, Daetwyler HD. Multiple-trait QTL mapping and genomic prediction for wool traits in sheep. Genet Sel Evol 2017; 49:62. [PMID: 28810834 PMCID: PMC5558709 DOI: 10.1186/s12711-017-0337-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 12/08/2016] [Accepted: 07/31/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The application of genomic selection to sheep breeding could lead to substantial increases in profitability of wool production due to the availability of accurate breeding values from single nucleotide polymorphism (SNP) data. Several key traits determine the value of wool and influence a sheep's susceptibility to fleece rot and fly strike. Our aim was to predict genomic estimated breeding values (GEBV) and to compare three methods of combining information across traits to map polymorphisms that affect these traits. METHODS GEBV for 5726 Merino and Merino crossbred sheep were calculated using BayesR and genomic best linear unbiased prediction (GBLUP) with real and imputed 510,174 SNPs for 22 traits (at yearling and adult ages) including wool production and quality, and breech conformation traits that are associated with susceptibility to fly strike. Accuracies of these GEBV were assessed using fivefold cross-validation. We also devised and compared three approximate multi-trait analyses to map pleiotropic quantitative trait loci (QTL): a multi-trait genome-wide association study and two multi-trait methods that use the output from BayesR analyses. One BayesR method used local GEBV for each trait, while the other used the posterior probabilities that a SNP had an effect on each trait. RESULTS BayesR and GBLUP resulted in similar average GEBV accuracies across traits (~0.22). BayesR accuracies were highest for wool yield and fibre diameter (>0.40) and lowest for skin quality and dag score (<0.10). Generally, accuracy was higher for traits with larger reference populations and higher heritability. In total, the three multi-trait analyses identified 206 putative QTL, of which 20 were common to the three analyses. The two BayesR multi-trait approaches mapped QTL in a more defined manner than the multi-trait GWAS. We identified genes with known effects on hair growth (i.e. FGF5, STAT3, KRT86, and ALX4) near SNPs with pleiotropic effects on wool traits. 
CONCLUSIONS The mean accuracy of genomic prediction across wool traits was around 0.22. The three multi-trait analyses identified 206 putative QTL across the ovine genome. Detailed phenotypic information helped to identify likely candidate genes.
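
The GBLUP-with-fivefold-cross-validation idea mentioned in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it uses simulated genotypes, a VanRaden-style genomic relationship matrix, a fixed assumed heritability `h2` rather than estimated variance components, and it reports accuracy as the plain correlation between predicted values and phenotypes. All function names are hypothetical.

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix from 0/1/2 genotypes (VanRaden-style scaling)."""
    p = M.mean(axis=0) / 2.0          # allele frequency per SNP
    Z = M - 2.0 * p                   # centered genotypes
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, h2):
    """Predict genomic breeding values of test animals from training phenotypes."""
    lam = (1.0 - h2) / h2             # ratio of residual to genetic variance (assumed known)
    mu = y[train].mean()
    lhs = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(lhs, y[train] - mu)
    return G[np.ix_(test, train)] @ alpha

def fivefold_accuracy(M, y, h2=0.3, seed=0):
    """Mean correlation between GEBV and phenotype over five validation folds."""
    rng = np.random.default_rng(seed)
    G = vanraden_g(M)
    folds = np.array_split(rng.permutation(len(y)), 5)
    accs = []
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        gebv = gblup_predict(G, y, train, test, h2)
        accs.append(np.corrcoef(gebv, y[test])[0, 1])
    return float(np.mean(accs))

def simulate(n=300, m=100, h2=0.5, seed=1):
    """Simulated genotypes and phenotypes; purely illustrative data."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.1, 0.9, size=m)
    M = rng.binomial(2, p, size=(n, m)).astype(float)
    g = (M - 2.0 * p) @ rng.normal(size=m)            # true genetic values
    e = rng.normal(scale=np.sqrt(g.var() * (1.0 - h2) / h2), size=n)
    return M, g + e

M, y = simulate()
acc = fivefold_accuracy(M, y, h2=0.5)
```

Each animal is predicted exactly once, using only phenotypes from the other four folds, which is what keeps the accuracy estimate from being inflated by an animal's own record.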
Affiliation(s)
- Sunduimijid Bolormaa
  - Agriculture Victoria Research, AgriBio Centre, Bundoora, VIC, 3083, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Andrew A Swan
  - Animal Genetics and Breeding Unit, University of New England, Armidale, NSW, 2351, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Daniel J Brown
  - Animal Genetics and Breeding Unit, University of New England, Armidale, NSW, 2351, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Sue Hatcher
  - NSW Department of Primary Industries, Orange Agricultural Institute, Orange, NSW, 2800, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Nasir Moghaddar
  - School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Julius H van der Werf
  - School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
- Michael E Goddard
  - Agriculture Victoria Research, AgriBio Centre, Bundoora, VIC, 3083, Australia
  - School of Land and Environment, University of Melbourne, Parkville, VIC, 3010, Australia
- Hans D Daetwyler
  - Agriculture Victoria Research, AgriBio Centre, Bundoora, VIC, 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3086, Australia
  - Cooperative Research Centre for Sheep Industry Innovation, Armidale, NSW, 2351, Australia
32
Abstract
Accurate genomic analyses are predicated on access to a large quantity of accurately genotyped and phenotyped animals. Because the cost of genotyping is often less than the cost of phenotyping, interest is increasing in generating genotypes for phenotyped animals. In some instances this may imply the requirement to genotype older animals with greater phenotypic information content. Biological material for these older informative animals may, however, no longer exist. The objective of the present study was to quantify the ability to impute 11,129 single nucleotide polymorphism (SNP) genotypes of non-genotyped animals (in this instance sires) from the genotypes of their progeny with or without including the genotypes of the progeny's dams (i.e. mates of the sire to be imputed). The impact on the accuracy of genotype imputation of including more progeny (and their dams') genotypes in the imputation reference population was also quantified. When genotypes of the dams were not available, genotypes of 41 sires with at least 15 genotyped progeny were used for the imputation; when genotypes of the dams were available, genotypes of 21 sires with at least 10 genotyped progeny were used for the imputation. Imputation was undertaken exploiting family and population level information. The mean and variability in the proportion of genotypes per individual that could not be imputed decreased as the number of progeny genotypes used per individual increased. Little improvement in the proportion of genotypes that could not be imputed was achieved once genotypes of seven progeny and their dams were used or genotypes of 11 progeny without their respective dam's genotypes were used. Mean imputation accuracy per individual (depicted by both concordance rates and correlation between true and imputed) increased with increasing progeny group size. Moreover, the range in mean imputation accuracy per individual decreased as more progeny genotypes were used in the imputation.
If the genotype of the mate of the sire was also used, high accuracy of imputation (mean genotype concordance rate per individual of 0.988), with little additional benefit thereafter, was achieved with seven genotyped progeny. In the absence of genotypes on the dam, similar imputation accuracy could not be achieved even using genotypes on up to 15 progeny. Results therefore suggest, at least for the SNP density used in the present study, that it is possible to accurately impute the genotypes of a non-genotyped parent from the genotypes of its progeny and there is a benefit of also including the genotype of the sire's mate (i.e. dam of the progeny).
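
Why dam genotypes help can be seen in a minimal, purely Mendelian sketch at a single SNP. This is only the family-level part of the idea; the study also exploited population-level information, which is not modelled here. Genotypes are assumed to be coded as the count of allele B (0, 1, or 2), the function names are hypothetical, and Mendelian inconsistencies are not checked.

```python
def sire_transmitted_allele(progeny, dam=None):
    """Allele (0 = A, 1 = B) the sire must have transmitted, or None if ambiguous."""
    if progeny == 0:
        return 0          # AA progeny: both parents transmitted A
    if progeny == 2:
        return 1          # BB progeny: both parents transmitted B
    # Heterozygous progeny: resolvable only if the dam is known and homozygous.
    if dam == 0:
        return 1          # dam gave A, so the sire gave B
    if dam == 2:
        return 0          # dam gave B, so the sire gave A
    return None           # dam unknown or heterozygous

def possible_sire_genotypes(progeny, dams=None):
    """Set of sire genotypes (0/1/2) compatible with the progeny at one SNP."""
    if dams is None:
        dams = [None] * len(progeny)
    seen = {sire_transmitted_allele(p, d) for p, d in zip(progeny, dams)}
    seen.discard(None)
    if seen == {0, 1}:
        return {1}        # both alleles transmitted: sire is heterozygous
    if seen == {0}:
        return {0, 1}     # only A observed: AA or AB
    if seen == {1}:
        return {1, 2}     # only B observed: AB or BB
    return {0, 1, 2}      # no informative progeny

# Toy example (made up): three progeny, two of them heterozygous.
without_dams = possible_sire_genotypes([1, 1, 0])
with_dams = possible_sire_genotypes([1, 1, 0], dams=[0, 1, 1])
```

In the toy example the heterozygous progeny are uninformative without their dams, so the sire stays ambiguous; once a homozygous dam is known, the same progeny pins the sire down as heterozygous.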
33
Ventura RV, Miller SP, Dodds KG, Auvray B, Lee M, Bixley M, Clarke SM, McEwan JC. Assessing accuracy of imputation using different SNP panel densities in a multi-breed sheep population. Genet Sel Evol 2016; 48:71. [PMID: 27663120 PMCID: PMC5035503 DOI: 10.1186/s12711-016-0244-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 10/19/2015] [Accepted: 08/31/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Genotype imputation is a key element of the implementation of genomic selection within the New Zealand sheep industry, but many factors can influence imputation accuracy. Our objective was to provide practical directions on the implementation of imputation strategies in a multi-breed sheep population genotyped with three single nucleotide polymorphism (SNP) panels: 5K, 50K and HD (600K SNPs). RESULTS Imputation from 5K to HD was slightly better (0.6 %) than imputation from 5K to 50K. Two-step imputation from 5K to 50K and then from 50K to HD outperformed direct imputation from 5K to HD. A slight loss in imputation accuracy was observed when a large fixed reference population was used compared to a smaller within-breed reference (including all 50K genotypes on animals from different breeds excluding those in the validation set i.e. to be imputed), but only for a few animals across all imputation scenarios from 5K to 50K. However, a major gain in imputation accuracy for a large proportion of animals (purebred and crossbred) justified the use of a fixed and large reference dataset for all situations. This study also investigated the loss in imputation accuracy specifically for SNPs located at the ends of each chromosome, and showed that only chromosome 26 had an overall imputation (5K to 50K) accuracy for 100 SNPs at each end higher than 60 % (r²). Most of the chromosomes displayed reduced imputation accuracy at least at one of their ends. Prediction of imputation accuracy based on the relatedness of low-density genotypes to those of the reference dataset, before imputation (without running imputation software), was also investigated. FIMPUTE V2.2 outperformed BEAGLE 3.3.2 across all imputation scenarios.
CONCLUSIONS Imputation accuracy in sheep breeds can be improved by following a set of recommendations on SNP panels, software, strategies of imputation (one- or two-step imputation), and choice of the animals to be genotyped using both high- and low-density SNP panels. We present a method that predicts imputation accuracy for individual animals at the low-density level, before running imputation, which can be used to restrict genomic prediction only to the animals that can be imputed with sufficient accuracy.
Affiliation(s)
- Ricardo V Ventura
  - Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, N1G2W1, Canada
  - Beef Improvement Opportunities, Guelph, ON, N1K1E5, Canada
- Stephen P Miller
  - Centre for Genetic Improvement of Livestock, University of Guelph, Guelph, ON, N1G2W1, Canada
  - Invermay Agricultural Centre, AgResearch Limited, Mosgiel, 9053, New Zealand
- Ken G Dodds
  - Invermay Agricultural Centre, AgResearch Limited, Mosgiel, 9053, New Zealand
- Benoit Auvray
  - Department of Mathematics and Statistics, University of Otago, Dunedin, 9016, New Zealand
- Michael Lee
  - Department of Mathematics and Statistics, University of Otago, Dunedin, 9016, New Zealand
- Matthew Bixley
  - Invermay Agricultural Centre, AgResearch Limited, Mosgiel, 9053, New Zealand
- Shannon M Clarke
  - Invermay Agricultural Centre, AgResearch Limited, Mosgiel, 9053, New Zealand
- John C McEwan
  - Invermay Agricultural Centre, AgResearch Limited, Mosgiel, 9053, New Zealand
34
Wu XL, Xu J, Feng G, Wiggans GR, Taylor JF, He J, Qian C, Qiu J, Simpson B, Walker J, Bauck S. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications. PLoS One 2016; 11:e0161719. [PMID: 27583971 PMCID: PMC5008792 DOI: 10.1371/journal.pone.0161719] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/11/2016] [Accepted: 08/10/2016] [Indexed: 11/19/2022] Open
Abstract
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. 
The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNPs selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal.
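
The "locus-averaged Shannon entropy" term in this abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so this assumes the simplest reading: the Shannon entropy of each biallelic locus computed from its allele frequency, averaged over the selected SNPs. The uniformity adjustment and the map-length term of the MOLO objective are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def locus_entropy(p):
    """Shannon entropy in bits of biallelic loci with allele frequency p."""
    p = np.asarray(p, dtype=float)
    q = 1.0 - p
    # Treat 0 * log2(0) as 0 so that fixed loci carry no information.
    with np.errstate(divide="ignore", invalid="ignore"):
        hp = np.where(p > 0, p * np.log2(p), 0.0)
        hq = np.where(q > 0, q * np.log2(q), 0.0)
    return -(hp + hq)

def locus_averaged_entropy(freqs, selected):
    """Mean per-locus entropy over the SNP indices chosen for a chip."""
    freqs = np.asarray(freqs, dtype=float)
    return float(locus_entropy(freqs[list(selected)]).mean())

# Toy allele frequencies (made up) for five candidate SNPs.
freqs = [0.5, 0.0, 0.5, 1.0, 0.5]
informative = locus_averaged_entropy(freqs, [0, 2, 4])
mixed = locus_averaged_entropy(freqs, [0, 1])
```

Under this reading, a chip built from intermediate-frequency SNPs scores higher than one that includes near-fixed SNPs, which matches the abstract's point that informative SNPs improve imputation over spacing alone.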
Affiliation(s)
- Xiao-Lin Wu
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Jiaqi Xu
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
  - Department of Statistics, University of Nebraska, Lincoln, Nebraska, United States of America
- Guofei Feng
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
  - Department of Statistics, University of Nebraska, Lincoln, Nebraska, United States of America
- George R. Wiggans
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America
- Jeremy F. Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
- Jun He
  - College of Animal Sciences and Technology, Hunan Agricultural University, Changsha, China
- Changsong Qian
  - Marketing and Business Development, Neogen Bio-Scientific Technology (Shanghai) Company Ltd., Shanghai, China
- Jiansheng Qiu
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Barry Simpson
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Jeremy Walker
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
- Stewart Bauck
  - Bioinformatics and Biostatistics, GeneSeek (a Neogen Company), Lincoln, Nebraska, United States of America
35
Friedenberg SG, Meurs KM. Genotype imputation in the domestic dog. Mamm Genome 2016; 27:485-94. [PMID: 27129452 DOI: 10.1007/s00335-016-9636-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 11/20/2015] [Accepted: 04/11/2016] [Indexed: 01/08/2023]
Abstract
Application of imputation methods to accurately predict a dense array of SNP genotypes in the dog could provide an important supplement to current analyses of array-based genotyping data. Here, we developed a reference panel of 4,885,283 SNPs in 83 dogs across 15 breeds using whole genome sequencing. We used this panel to predict the genotypes of 268 dogs across three breeds with 84,193 SNP array-derived genotypes as inputs. We then (1) performed breed clustering of the actual and imputed data; (2) evaluated several reference panel breed combinations to determine an optimal reference panel composition; and (3) compared the accuracy of two commonly used software algorithms (Beagle and IMPUTE2). Breed clustering was well preserved in the imputation process across eigenvalues representing 75 % of the variation in the imputed data. Using Beagle with a target panel from a single breed, genotype concordance was highest using a multi-breed reference panel (92.4 %) compared to a breed-specific reference panel (87.0 %) or a reference panel containing no breeds overlapping with the target panel (74.9 %). This finding was confirmed using target panels derived from two other breeds. Additionally, using the multi-breed reference panel, genotype concordance was slightly higher with IMPUTE2 (94.1 %) compared to Beagle; Pearson correlation coefficients were slightly higher for both software packages (0.946 for Beagle, 0.961 for IMPUTE2). Our findings demonstrate that genotype imputation from SNP array-derived data to whole genome-level genotypes is both feasible and accurate in the dog with appropriate breed overlap between the target and reference panels.
Affiliation(s)
- S G Friedenberg
  - Department of Clinical Sciences and Comparative Medicine Institute, North Carolina State University College of Veterinary Medicine, 1060 William Moore Drive, Raleigh, NC, 27607, USA
- K M Meurs
  - Department of Clinical Sciences and Comparative Medicine Institute, North Carolina State University College of Veterinary Medicine, 1060 William Moore Drive, Raleigh, NC, 27607, USA