1
Wolf MJ, Neumann GB, Kokuć P, Yin T, Brockmann GA, König S, May K. Genetic evaluations for endangered dual-purpose German Black Pied cattle using 50K SNPs, a breed-specific 200K chip, and whole-genome sequencing. J Dairy Sci 2023; 106:3345-3358. [PMID: 37028956 DOI: 10.3168/jds.2022-22665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 12/16/2022] [Indexed: 04/09/2023]
Abstract
Genetic evaluations of local cattle breeds are hampered by small reference groups or biased by the use of SNP effects estimated in other, larger populations. Against this background, few studies have addressed the possible advantage of whole-genome sequences (WGS), or of specific variants from WGS data, in genomic predictions for local breeds with small population size. Consequently, the aim of this study was to compare genetic parameters and accuracies of genomic estimated breeding values (GEBV) for 305-d production traits, fat-to-protein ratio (FPR) and somatic cell score (SCS) at the first test date after calving, and conformation traits of the endangered German Black Pied cattle (DSN) breed using 4 different marker panels: (1) the commercial 50K Illumina BovineSNP50 BeadChip, (2) a customized 200K chip designed for DSN (DSN200K), which considers the most important variants for DSN from WGS, (3) randomly generated 200K chips based on WGS data, and (4) a WGS panel. The same number of animals was considered for all marker panel analyses (i.e., 1,811 genotyped or sequenced cows for conformation traits, 2,383 cows for lactation production traits, and 2,420 cows for FPR and SCS). Mixed models for the estimation of genetic parameters directly included the respective genomic relationship matrix from the different marker panels plus the trait-specific fixed effects. For the calculation of GEBV accuracies, we applied repeated random subsampling validation. In separate cross-validations per trait, we created a validation set including 20% of the cows with masked phenotypes, and a training set comprising the remaining 80% of the cows. The cows were selected randomly in a procedure with 10 replicates considering replacements in the different scenarios. The accuracy was defined as the correlation between the direct GEBV and the phenotypes corrected for the corresponding fixed effects for the cows in the validation set.
For FPR and SCS, as well as for lactation production traits, heritabilities were largest based on WGS data, but the increase compared with the 50K or DSN200K applications was small, in the range from 0.01 to 0.03. For most of the conformation traits, heritabilities were also largest based on WGS and DSN200K data, but the increase was in the range of the corresponding standard error. Accordingly, GEBV accuracies for most of the studied traits were highest based on WGS data or when utilizing the DSN200K chip, but the accuracy differences across the marker panels were small and nonsignificant. In conclusion, WGS data and the DSN200K chip contributed only minor improvements in genomic predictions, still justifying the use of the commercial 50K chip. Nevertheless, WGS and the DSN200K chip harbor breed-specific variants, which are valuable for studying causal genetic mechanisms in the endangered DSN population.
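The validation scheme described in this abstract (10 replicates, 20% of cows masked, accuracy as a correlation) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the splits are drawn independently in each replicate, and `predict` stands in for whatever genomic model produces GEBV; none of this is taken from the authors' software.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def subsampling_splits(n_animals, n_replicates=10, validation_fraction=0.2, seed=1):
    """Yield (training, validation) index lists for repeated random subsampling.

    Assumption: each replicate is an independent random draw, so an animal
    may appear in the validation set of several replicates.
    """
    rng = random.Random(seed)
    n_val = round(n_animals * validation_fraction)
    for _ in range(n_replicates):
        shuffled = rng.sample(range(n_animals), n_animals)
        yield sorted(shuffled[n_val:]), sorted(shuffled[:n_val])


def mean_accuracy(corrected_phenotypes, predict, **split_kwargs):
    """Average correlation between GEBV and corrected phenotypes of masked animals.

    `predict(train_idx, val_idx)` is a hypothetical callback that fits a model
    on the training animals and returns one GEBV per validation animal.
    """
    accuracies = []
    for train, val in subsampling_splits(len(corrected_phenotypes), **split_kwargs):
        gebv = predict(train, val)
        observed = [corrected_phenotypes[i] for i in val]
        accuracies.append(pearson(gebv, observed))
    return sum(accuracies) / len(accuracies)
```

Passing the predictor in as a callback keeps the sketch independent of the marker panel, so the same splits could in principle be reused for the 50K, DSN200K, and WGS analyses.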
Affiliation(s)
- Manuel J Wolf
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
- Guilherme B Neumann
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Paula Kokuć
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Tong Yin
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
- Gudrun A Brockmann
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Sven König
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany.
- Katharina May
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
2
Faggion S, Carnier P, Franch R, Babbucci M, Pascoli F, Dalla Rovere G, Caggiano M, Chavanne H, Toffan A, Bargelloni L. Viral nervous necrosis resistance in gilthead sea bream (Sparus aurata) at the larval stage: heritability and accuracy of genomic prediction with different training and testing settings. Genet Sel Evol 2023; 55:22. [PMID: 37013478 PMCID: PMC10069116 DOI: 10.1186/s12711-023-00796-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 03/21/2023] [Indexed: 04/05/2023] Open
Abstract
BACKGROUND The gilthead sea bream (Sparus aurata) was long considered resistant to viral nervous necrosis (VNN) until significant mortalities caused by a reassortant nervous necrosis virus (NNV) strain were recently reported. Selective breeding to enhance resistance against NNV might be a preventive action. In this study, 972 sea bream larvae were subjected to an NNV challenge test and the symptomatology was recorded. All the experimental fish and their parents were genotyped using a genome-wide single nucleotide polymorphism (SNP) array consisting of over 26,000 markers.
RESULTS Estimates of pedigree-based and genomic heritabilities of VNN symptomatology were consistent with each other (0.21, highest posterior density interval at 95% (HPD95%): 0.1-0.4; 0.19, HPD95%: 0.1-0.3, respectively). The genome-wide association study suggested one genomic region, in linkage group (LG) 23, that might be involved in sea bream VNN resistance, although it was far from the genome-wide significance threshold. The accuracies (r) of the predicted estimated breeding values (EBV) provided by three Bayesian genomic regression models (Bayes B, Bayes C, and Ridge Regression) were consistent and on average equal to 0.90 when assessed in a set of cross-validation (CV) procedures. When genomic relationships between training and testing sets were minimized, accuracy decreased greatly (r = 0.53 for a validation based on genomic clustering, r = 0.12 for a validation based on a leave-one-family-out approach focused on the parents of the challenged fish). Classifying the phenotype with genomic predictions of either the phenotype or the pedigree-based EBV (estimated from all data) was moderately accurate (area under the ROC curve 0.60 and 0.66, respectively).
CONCLUSIONS The estimate of the heritability for VNN symptomatology indicates that it is feasible to implement selective breeding programs for increased resistance to VNN of sea bream larvae/juveniles. Exploiting genomic information offers the opportunity of developing prediction tools for VNN resistance, and genomic models can be trained on EBV using all data or phenotypes, with minimal differences in classification performance of the trait phenotype. In a long-term view, the weakening of the genomic ties between animals in the training and test sets leads to decreased genomic prediction accuracies, thus periodical update of the reference population with new data is mandatory.
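A leave-one-family-out validation of the kind mentioned in the results can be sketched as below. The abstract does not specify how families were delimited, so this sketch simply assumes that every record carries one family label; the function name and the label format are illustrative only.

```python
from collections import defaultdict


def leave_one_family_out(family_ids):
    """Yield (family, training_idx, testing_idx), holding out one family at a time.

    Assumption: `family_ids` holds one family label per animal; how families
    are defined (e.g. via shared parents) is left to the caller.
    """
    members = defaultdict(list)
    for idx, fam in enumerate(family_ids):
        members[fam].append(idx)
    for fam, test_idx in members.items():
        # Every animal outside the held-out family goes into the training set.
        train_idx = [i for i, f in enumerate(family_ids) if f != fam]
        yield fam, train_idx, test_idx
```

Because no close relatives of the tested family remain in the training set, this kind of split weakens the genomic ties between the two sets, which is the situation in which the abstract reports the lowest accuracy.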
Affiliation(s)
- Sara Faggion
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy.
- Paolo Carnier
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy
- Rafaella Franch
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy
- Massimiliano Babbucci
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy
- Francesco Pascoli
  - Division of Comparative Biomedical Sciences, OIE Reference Centre for Viral Encephalopathy and Retinopathy, Istituto Zooprofilattico Sperimentale delle Venezie (IZSVe), Padova, Italy
- Giulia Dalla Rovere
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy
- Massimo Caggiano
  - Panittica Italia Società Agricola S.R.L., Strada del Procaccio, 72016, Torre Canne di Fasano, Italy
- Hervé Chavanne
  - Panittica Italia Società Agricola S.R.L., Strada del Procaccio, 72016, Torre Canne di Fasano, Italy
- Anna Toffan
  - Division of Comparative Biomedical Sciences, OIE Reference Centre for Viral Encephalopathy and Retinopathy, Istituto Zooprofilattico Sperimentale delle Venezie (IZSVe), Padova, Italy
- Luca Bargelloni
  - Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell'Università, 16, 35020, Legnaro, PD, Italy
3
Anilkumar C, Sunitha NC, Devate NB, Ramesh S. Advances in integrated genomic selection for rapid genetic gain in crop improvement: a review. PLANTA 2022; 256:87. [PMID: 36149531 DOI: 10.1007/s00425-022-03996-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/20/2021] [Accepted: 09/11/2022] [Indexed: 06/16/2023]
Abstract
Genomic selection and its importance in crop breeding. Integration of GS with new breeding tools and development of an SOP for GS to achieve maximum genetic gain at low cost and in less time. Conventional breeding approaches alone are not sufficient to meet the demand of a growing population for nutritious food and other plant-based products. Marker-assisted selection (MAS), in turn, does not efficiently capture all the favorable alleles responsible for economic traits in the process of crop improvement. Genomic selection (GS), developed in livestock breeding and then adapted to plant breeding, promised to overcome the drawbacks of MAS and significantly improve complex traits controlled by genes/QTL with small effects. Large-scale deployment of GS in important crops, as well as simulation studies in a variety of contexts, addressed G × E interaction effects and non-additive effects while lowering breeding costs and time. The current study provides a complete overview of genomic selection, its process, and its importance in modern plant breeding, along with insights into its application. GS has been implemented in the improvement of complex traits, including tolerance to biotic and abiotic stresses. Furthermore, this review hypothesizes that using GS in conjunction with other crop improvement platforms accelerates the breeding process and increases genetic gain. The objective of this review is to highlight the development of an appropriate GS model, the global open source network for GS, and trans-disciplinary approaches for effective accelerated crop improvement. The current study focuses on the application of data science, including machine learning and deep learning tools, to enhance the accuracy of prediction models. The present study emphasizes developing plant breeding strategies centered on GS combined with routine conventional breeding principles, through a GS-SOP, to achieve enhanced genetic gain.
Affiliation(s)
- C Anilkumar
  - ICAR-National Rice Research Institute, Cuttack, India
- N C Sunitha
  - University of Agricultural Sciences, Bangalore, India
- S Ramesh
  - University of Agricultural Sciences, Bangalore, India.
4
Liu D, Xu Z, Zhao W, Wang S, Li T, Zhu K, Liu G, Zhao X, Wang Q, Pan Y, Ma P. Genetic parameters and genome-wide association for milk production traits and somatic cell score in different lactation stages of Shanghai Holstein population. Front Genet 2022; 13:940650. [PMID: 36134029 PMCID: PMC9483179 DOI: 10.3389/fgene.2022.940650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 08/04/2022] [Indexed: 11/13/2022] Open
Abstract
The aim of this study was to investigate the genetic parameters and genetic architectures of six milk production traits in the Shanghai Holstein population. The data used to estimate the genetic parameters consisted of 1,968,589 test-day records for 305,031 primiparous cows. Among the cows with phenotypes, 3,016 cows were genotyped with Illumina Bovine SNP50K BeadChip, GeneSeek Bovine 50K BeadChip, GeneSeek Bovine LD BeadChip v4, GeneSeek Bovine 150K BeadChip, or low-depth whole-genome sequencing. A genome-wide association study was performed to identify quantitative trait loci and genes associated with milk production traits in the Shanghai Holstein population using genotypes imputed to whole-genome sequence level, with both the fixed and random model circulating probability unification (FarmCPU) method and a mixed linear model in the rMVP software. Estimated heritabilities (h²) varied from 0.04 to 0.14 for somatic cell score (SCS), 0.07 to 0.22 for fat percentage (FP), 0.09 to 0.27 for milk yield (MY), 0.06 to 0.23 for fat yield (FY), 0.09 to 0.26 for protein yield (PY), and 0.07 to 0.35 for protein percentage (PP). Within lactation, genetic correlations for SCS, FP, MY, FY, PY, and PP at different stages of lactation, estimated with a random regression model, ranged from -0.02 to 0.99, 0.18 to 0.99, 0.04 to 0.99, 0.04 to 0.99, 0.01 to 0.99, and 0.33 to 0.99, respectively. The genetic correlations were highest between adjacent days in milk (DIM) and decreased as DIM moved further apart. Candidate genes included genes related to production traits (DGAT1, MGST1, PTK2, and SCRIB), disease (LY6K, COL22A1, TECPR2, and PLCB1), heat stress (ITGA9, NDST4, TECPR2, and HSF1), and reproduction (7SK and DOCK2). This study has shown that the genetic mechanisms of milk production traits differ between stages of lactation. Milk production traits at different stages of lactation should therefore be studied as different traits. Our results can also provide a theoretical basis for subsequent molecular breeding, especially for the novel genetic loci.
Affiliation(s)
- Dengying Liu
  - Shanghai Key Laboratory of Veterinary Biotechnology, Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Zhong Xu
  - Hubei Key Laboratory of Animal Embryo and Molecular Breeding, Institute of Animal Husbandry and Veterinary, Hubei Provincial Academy of Agricultural Sciences, Wuhan, China
- Wei Zhao
  - Shanghai Key Laboratory of Veterinary Biotechnology, Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Shiyi Wang
  - Shanghai Key Laboratory of Veterinary Biotechnology, Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Tuowu Li
  - Shanghai Key Laboratory of Veterinary Biotechnology, Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Kai Zhu
  - Shanghai Dairy Cattle Breeding Centre Co, Ltd, Shanghai, China
- Guanglei Liu
  - Shanghai Dairy Cattle Breeding Centre Co, Ltd, Shanghai, China
- Xiaoduo Zhao
  - Shanghai Dairy Cattle Breeding Centre Co, Ltd, Shanghai, China
- Qishan Wang
  - Department of Animal Breeding and Reproduction, College of Animal Science, Zhejiang University, Hangzhou, China
- Yuchun Pan
  - Department of Animal Breeding and Reproduction, College of Animal Science, Zhejiang University, Hangzhou, China
  - *Correspondence: Peipei Ma; Yuchun Pan
- Peipei Ma
  - Shanghai Key Laboratory of Veterinary Biotechnology, Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
  - *Correspondence: Peipei Ma; Yuchun Pan
5
Meher PK, Rustgi S, Kumar A. Performance of Bayesian and BLUP alphabets for genomic prediction: analysis, comparison and results. Heredity (Edinb) 2022; 128:519-530. [PMID: 35508540 DOI: 10.1038/s41437-022-00539-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/31/2021] [Revised: 04/19/2022] [Accepted: 04/19/2022] [Indexed: 11/09/2022] Open
Abstract
We evaluated the performances of three BLUP and five Bayesian methods for genomic prediction by using nine actual and 54 simulated datasets. The genomic prediction accuracy was measured using Pearson's correlation coefficient between the genomic estimated breeding value (GEBV) and the observed phenotypic data using a fivefold cross-validation approach with 100 replications. The Bayesian alphabets performed better for the traits governed by a few genes/QTLs with relatively larger effects. On the contrary, the BLUP alphabets (GBLUP and CBLUP) exhibited higher genomic prediction accuracy for the traits controlled by several small-effect QTLs. Additionally, Bayesian methods performed better for the highly heritable traits and, for other traits, performed at par with the BLUP methods. Further, genomic BLUP (GBLUP) was identified as the least biased method for the GEBV estimation. Among the Bayesian methods, the Bayesian ridge regression and Bayesian LASSO were less biased than other Bayesian alphabets. Nonetheless, genomic prediction accuracy increased with an increase in trait heritability, irrespective of the sample size, marker density, and the QTL type (major/minor effect). In sum, this study provides valuable information regarding the choice of the selection method for genomic prediction in different breeding programs.
Affiliation(s)
- Prabina Kumar Meher
  - Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-12, India.
- Sachin Rustgi
  - Department of Plant and Environmental Sciences, Clemson University Pee Dee Research and Education Center, Darlington, SC, USA.
- Anuj Kumar
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi-12, India
6
Rios EF, Andrade MHML, Resende MFR, Kirst M, de Resende MDV, de Almeida Filho JE, Gezan SA, Munoz P. Genomic prediction in family bulks using different traits and cross-validations in pine. G3-GENES GENOMES GENETICS 2021; 11:6321952. [PMID: 34544139 PMCID: PMC8496210 DOI: 10.1093/g3journal/jkab249] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/09/2021] [Accepted: 07/02/2021] [Indexed: 11/13/2022]
Abstract
Genomic prediction integrates statistical, genomic, and computational tools to improve the estimation of breeding values and increase genetic gain. Due to the broad diversity in mating systems, breeding schemes, propagation methods, and units of selection, no universal genomic prediction approach can be applied in all crops. In a genome-wide family prediction (GWFP) approach, the family is the basic unit of selection. We tested GWFP in two loblolly pine (Pinus taeda L.) datasets: a breeding population composed of 63 full-sib families (5–20 individuals per family), and a simulated population with the same pedigree structure. In both populations, phenotypic and genomic data were pooled at the family level in silico. Marker effects were estimated to compute genomic estimated breeding values (GEBV) at the individual and family (GWFP) levels. Fewer than six individuals per family produced inaccurate estimates of family phenotypic performance and allele frequency. Across the different scenarios tested, GWFP predictive ability was higher than that of GEBV in both populations. Validation sets composed of families with a phenotypic mean and variance similar to those of the training population yielded predictions that were consistently higher and more accurate than those of other validation sets. Results revealed potential for applying GWFP in breeding programs whose selection unit is the family, and for systems where families can serve as training sets. The GWFP approach is well suited for crops that are routinely genotyped and phenotyped at the plot level, but it can be extended to other breeding programs. The higher predictive ability obtained with GWFP would motivate the application of genomic prediction in these situations.
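The in silico pooling step described above can be pictured with a small sketch. It assumes diploid individuals, biallelic markers coded as 0/1/2 allele counts, and a simple arithmetic mean as the family phenotype; the function name and data layout are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def pool_by_family(family_ids, phenotypes, genotypes):
    """Return {family: (mean_phenotype, allele_frequencies)}.

    Assumptions: `genotypes` holds, per individual, a list of allele counts
    (0, 1 or 2) for each marker, and all individuals share the same markers.
    """
    groups = defaultdict(list)
    for fam, y, g in zip(family_ids, phenotypes, genotypes):
        groups[fam].append((y, g))
    pooled = {}
    for fam, records in groups.items():
        n = len(records)
        mean_y = sum(y for y, _ in records) / n
        n_markers = len(records[0][1])
        # Family allele frequency = counted alleles / (2 alleles per individual).
        freqs = [sum(g[m] for _, g in records) / (2 * n) for m in range(n_markers)]
        pooled[fam] = (mean_y, freqs)
    return pooled
```

With very few individuals per family, both the mean and the frequencies rest on a handful of observations, which is consistent with the abstract's remark that fewer than six individuals per family gave inaccurate family-level estimates.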
Affiliation(s)
- Esteban F Rios
  - Agronomy Department, University of Florida, Gainesville, FL 32611, USA
- Marcio F R Resende
  - Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Matias Kirst
  - School of Forest Resources and Conservation, University of Florida, Gainesville, FL 32611, USA
- Marcos D V de Resende
  - EMBRAPA Café/Department of Statistics, Federal University of Viçosa, Avenida PH Rolfs S/N, Viçosa 36570-000, Brazil
- Patricio Munoz
  - Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
7
Krishnappa G, Savadi S, Tyagi BS, Singh SK, Mamrutha HM, Kumar S, Mishra CN, Khan H, Gangadhara K, Uday G, Singh G, Singh GP. Integrated genomic selection for rapid improvement of crops. Genomics 2021; 113:1070-1086. [PMID: 33610797 DOI: 10.1016/j.ygeno.2021.02.007] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 08/22/2020] [Revised: 11/08/2020] [Accepted: 02/15/2021] [Indexed: 11/15/2022]
Abstract
An increase in the rate of crop improvement is essential for achieving sustained food production and meeting the other needs of an ever-increasing population. Genomic selection (GS) is a breeding tool that has been successfully employed in animal breeding and is being incorporated into plant breeding. GS promises accelerated breeding cycles through rapid selection of superior genotypes. Numerous empirical and simulation studies on GS, and its realized impacts on crop yield improvement, have recently been reported. For a holistic understanding of the technology, we briefly discuss the concept of genetic gain, GS methodology, its current status, advantages of GS over other breeding methods, prediction models, and the factors controlling prediction accuracy in GS. We also review the integration of speed breeding and other novel technologies, viz. high-throughput genotyping and phenotyping, for enhancing the efficiency and pace of GS, followed by its prospective applications in varietal development programs.
Affiliation(s)
- Satish Kumar
  - Indian Institute of Wheat and Barley Research, Karnal, India
- Hanif Khan
  - Indian Institute of Wheat and Barley Research, Karnal, India
- Gyanendra Singh
  - Indian Institute of Wheat and Barley Research, Karnal, India
8
Brunes LC, Baldi F, Lopes FB, Narciso MG, Lobo RB, Espigolan R, Costa MFO, Magnabosco CU. Genomic prediction ability for feed efficiency traits using different models and pseudo-phenotypes under several validation strategies in Nelore cattle. Animal 2020; 15:100085. [PMID: 33573965 DOI: 10.1016/j.animal.2020.100085] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/27/2020] [Revised: 09/09/2020] [Accepted: 09/15/2020] [Indexed: 10/22/2022] Open
Abstract
There is growing interest in improving feed efficiency (FE) traits in cattle. Genomic selection was proposed to improve these traits since they are difficult and expensive to measure. To date, few studies have addressed the implementation of genomic selection for FE traits in indicine cattle under different scenarios of pseudo-phenotypes, models, and validation strategies on a large commercial scale. Thus, the aim was to evaluate the feasibility of implementing genomic selection for FE traits in Nelore cattle by applying different models and pseudo-phenotypes under several validation strategies. Phenotypic and genotypic information from 4 329 and 3 467 animals, respectively, was used; the animals were tested for residual feed intake, DM intake, feed efficiency, feed conversion ratio, residual BW gain, and residual intake and BW gain. Six prediction methods were used: single-step genomic best linear unbiased prediction, Bayes A, Bayes B, Bayes Cπ, Bayesian least absolute shrinkage and selection operator (BLASSO), and Bayes R. Phenotypes adjusted for fixed effects (Y*), estimated breeding values (EBV), and deregressed EBV (DEBV) were used as pseudo-phenotypes. The validation approaches used were: (1) random: the data were randomly divided into ten subsets and validation was done in one subset at a time; (2) age: the partition into training and testing sets was based on year of birth, with testing animals born after 2016; and (3) EBV accuracy: the data were split into two groups, with animals with accuracy above 0.45 forming the training set and those below 0.45 the validation set. In the analyses that used Y* as the pseudo-phenotype, prediction ability (PA) was obtained by dividing the correlation between the pseudo-phenotype and the genomic EBV (GEBV) by the square root of the heritability of the trait. When EBV or DEBV was used as the pseudo-phenotype, its simple correlation with the GEBV was taken as PA.
The prediction methods showed similar results for PA and bias. The random cross-validation presented higher PA (0.17) than EBV accuracy (0.14) and age (0.13). The PA was higher for Y* than for EBV and DEBV (by 30.0 and 34.3%, respectively). Random validation presented the highest PA and is indicated for populations composed mainly of young animals and for traits with few generations of data recording. For high-heritability traits, the validation can be done by age, enabling the prediction of next-generation genetic merit. These results should help breeders identify the genomic approaches that are most viable for genomic prediction of FE-related traits.
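The two definitions of prediction ability given in this abstract amount to a one-line calculation, sketched here. The function name and the convention of passing `None` for EBV/DEBV pseudo-phenotypes are assumptions made for this illustration.

```python
import math


def prediction_ability(correlation, heritability=None):
    """Prediction ability (PA) as defined in the abstract.

    For adjusted phenotypes (Y*), pass the trait heritability and the
    correlation with GEBV is divided by its square root. For EBV or DEBV
    pseudo-phenotypes, leave `heritability` as None and the plain
    correlation is returned.
    """
    if heritability is None:
        return correlation
    if not 0 < heritability <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    return correlation / math.sqrt(heritability)
```

For example, a correlation of 0.10 for a trait with heritability 0.25 corresponds to a PA of about 0.20, which shows how the scaling raises PA for phenotype-based validation relative to the raw correlation.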
Affiliation(s)
- L C Brunes
  - Animal Science Department, Goiás Federal University, 74690-900 Goiânia, GO, Brazil; Embrapa Rice and Beans, GO-462, km 12, 75375-000 Santo Antônio de Goiás, GO, Brazil.
- F Baldi
  - Animal Science Department, São Paulo State University - Júlio de Mesquita Filho (UNESP), Prof. Paulo Donato Castelane, 14884-900 Jaboticabal, SP, Brazil
- F B Lopes
  - Cobb-Vantress, Inc., 72761 Siloam Springs, AR, USA
- M G Narciso
  - Embrapa Rice and Beans, GO-462, km 12, 75375-000 Santo Antônio de Goiás, GO, Brazil
- R B Lobo
  - National Association of Breeders and Researchers, 14020-230 Ribeirão Preto, Brazil
- R Espigolan
  - Department of Veterinary Medicine, Faculty of Animal Science and Food Engineering, University of Sao Paulo, 13635-900 Pirassununga, SP, Brazil
- M F O Costa
  - Embrapa Rice and Beans, GO-462, km 12, 75375-000 Santo Antônio de Goiás, GO, Brazil
- C U Magnabosco
  - Embrapa Cerrados, BR-020, 18 Sobradinho, 70770-901 Brasilia, DF, Brazil
9
High-frequency marker haplotypes in the genomic selection of dairy cattle. J Appl Genet 2019; 60:179-186. [PMID: 30877657 PMCID: PMC6483952 DOI: 10.1007/s13353-019-00489-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/29/2018] [Revised: 01/18/2019] [Accepted: 02/28/2019] [Indexed: 11/05/2022]
Abstract
The aim of this study was to predict the genomic breeding value (DGV) of production, selected conformation and reproductive traits, and somatic cell score of dairy cattle in Poland using high-frequency marker haplotypes. The dataset consisted of phenotypic, genotypic, and pedigree data of 1216 Polish Holstein-Friesian bulls. The genotypic data consisted of 54,000 single-nucleotide polymorphisms (SNPs). The data were divided into two subsets: a test dataset (n = 1064) and a validation dataset (n = 152). Genotypic data were selected using three criteria: the percentage of missing genotypes, minor allele frequency, and linkage disequilibrium. The purpose of the data selection was to identify blocks of SNPs that were then used for the construction of haplotypes. Only haplotypes with a frequency higher than 25% were selected. DGV was predicted using four variants of a linear model with random haplotype effects and deregressed breeding values as the response variables. The accuracy of genomic prediction was checked by comparing DGVs with estimated breeding values (EBVs) using two methods: Pearson’s correlations and the regression of EBV on DGV. The use of high-frequency haplotypes showed a tendency to underestimate DGVs. None of the models tested was clearly superior with regard to the traits studied. DGVs of production and conformation traits as well as somatic cell score (medium or high heritability traits) were more accurate than those estimated for fertility traits (low heritability traits).
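The two checks named in this abstract, Pearson's correlation and the regression of EBV on DGV, can be computed together as in the sketch below. The function name is hypothetical, and the reading of the slope is the usual convention in validation studies rather than something stated in the abstract.

```python
def correlation_and_slope(dgv, ebv):
    """Return (Pearson r, slope of the regression of EBV on DGV).

    A slope close to 1 is commonly read as DGV being on the same scale as
    EBV; a slope above 1 suggests DGV are compressed relative to EBV.
    """
    n = len(dgv)
    mean_d, mean_e = sum(dgv) / n, sum(ebv) / n
    s_de = sum((d - mean_d) * (e - mean_e) for d, e in zip(dgv, ebv))
    s_dd = sum((d - mean_d) ** 2 for d in dgv)
    s_ee = sum((e - mean_e) ** 2 for e in ebv)
    r = s_de / (s_dd * s_ee) ** 0.5
    slope = s_de / s_dd
    return r, slope
```

The correlation measures how well animals are ranked, while the slope captures scale, so a model can rank bulls well and still under- or overstate the spread of their breeding values.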
10
Wang X, Miao J, Chang T, Xia J, An B, Li Y, Xu L, Zhang L, Gao X, Li J, Gao H. Evaluation of GBLUP, BayesB and elastic net for genomic prediction in Chinese Simmental beef cattle. PLoS One 2019; 14:e0210442. [PMID: 30817758 PMCID: PMC6394919 DOI: 10.1371/journal.pone.0210442] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/20/2018] [Accepted: 12/21/2018] [Indexed: 11/24/2022] Open
Abstract
Chinese Simmental beef cattle are the most economically important cattle breed in China. Estimated breeding values for growth, carcass, and meat quality traits are commonly used as selection criteria in animal breeding. The objective of this study was to evaluate the accuracy of alternative statistical methods for the estimation of genomic breeding values. Analyses of the accuracy of genomic best linear unbiased prediction (GBLUP), BayesB, and elastic net (EN) were performed with an Illumina BovineHD BeadChip on 1,217 animals by applying 5-fold cross-validation. Overall, the accuracies ranged from 0.17 to 0.296 for ten traits, and the heritability estimates ranged from 0.36 to 0.63. The EN (alpha = 0.001) model provided the most accurate prediction, which was also slightly higher (0.2–2%) than that of GBLUP for most traits, such as average daily weight gain (ADG) and carcass weight (CW). BayesB was less accurate for each trait than were EN (alpha = 0.001) and GBLUP. These findings indicate the importance of using an appropriate variable selection method for the genomic selection of traits and suggest the influence of the genetic architecture of the traits we analyzed.
Collapse
Affiliation(s)
- Xiaoqiao Wang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Jian Miao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Tianpeng Chang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Jiangwei Xia
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Binxin An
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yan Li
- Veterinary Bureau of Wulagai Precinct in Xilin Gol League, Wulagai, China
| | - Lingyang Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lupei Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xue Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Junya Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- * E-mail: (HG); (JL)
| | - Huijiang Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- * E-mail: (HG); (JL)
| |
|
11
|
Karimi Z, Sargolzaei M, Robinson J, Schenkel F. Assessing haplotype-based models for genomic evaluation in Holstein cattle. Canadian Journal of Animal Science 2018. [DOI: 10.1139/cjas-2018-0009] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022]
Abstract
A single-nucleotide polymorphism-based genomic relationship matrix (GSNP) discriminates less well between identity-by-state and identity-by-descent (IBD) alleles than a multi-locus haplotype-based relationship matrix (GHAP), which can better capture IBD alleles and recent relationships. We aimed to compare the prediction reliability and prediction bias of genomic best linear unbiased prediction (GBLUP) using either GSNP or GHAP in Holstein cattle. A total of 57 traits with a wide range of heritability values were analyzed. Classical validation tests were done using a validation dataset comprising 50k genotype records of 561–669 proven bulls born in 2010–2011 with an official estimated breeding value (EBV) in 2016 and a training set of 5314–19 678 bulls born before 2010, depending on the trait. The method for building the genomic relationship matrix (G) had a significant but small effect on observed reliability (r²GEBV) (p < 0.0001) and bias (p < 0.0001). A significant interaction between G and the level of trait heritability on r²GEBV and bias was also observed (p < 0.0001). The small gains in r²GEBV and small reductions in bias from using GHAP-based GBLUP were larger when predicting moderate- to high-heritability traits than when predicting low-heritability traits.
Affiliation(s)
- Z. Karimi
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - M. Sargolzaei
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
- Semex Alliance, Guelph, ON N1H 6J2, Canada
| | - J.A.B. Robinson
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - F.S. Schenkel
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON N1G 2W1, Canada
| |
|
12
|
Momen M, Mehrgardi AA, Sheikhi A, Kranis A, Tusell L, Morota G, Rosa GJM, Gianola D. Predictive ability of genome-assisted statistical models under various forms of gene action. Sci Rep 2018; 8:12309. [PMID: 30120288 PMCID: PMC6098164 DOI: 10.1038/s41598-018-30089-2] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 11/01/2017] [Accepted: 07/24/2018] [Indexed: 11/09/2022]
Abstract
Recent work has suggested that the performance of prediction models for complex traits may depend on the architecture of the target traits. Here we compared several prediction models with respect to their ability to predict phenotypes under various statistical architectures of gene action: (1) purely additive, (2) additive and dominance, (3) additive, dominance, and two-locus epistasis, and (4) purely epistatic settings. Simulation and a real chicken dataset were used. Fourteen prediction models were compared: BayesA, BayesB, BayesC, Bayesian LASSO, Bayesian ridge regression, elastic net, genomic best linear unbiased prediction, a Gaussian process, LASSO, random forests, reproducing kernel Hilbert spaces regression, ridge regression (best linear unbiased prediction), relevance vector machines, and support vector machines. When the trait was under additive gene action, the parametric prediction models outperformed non-parametric ones. Conversely, when the trait was under epistatic gene action, the non-parametric prediction models provided more accurate predictions. Thus, prediction models must be selected according to the most probable underlying architecture of the traits. In the chicken dataset examined, most models had similar prediction performance. Our results corroborate the view that there is no universally best prediction model, and that the development of robust prediction models is an important research objective.
Affiliation(s)
- Mehdi Momen
- Department of Animal Science, University College of Agriculture, Shahid Bahonar University of Kerman (SBUK), Kerman, Iran
| | - Ahmad Ayatollahi Mehrgardi
- Department of Animal Science, University College of Agriculture, Shahid Bahonar University of Kerman (SBUK), Kerman, Iran.
| | - Ayyub Sheikhi
- Department of Statistical Science, University College of Mathematic and Statistical Science, Shahid Bahonar University of Kerman (SBUK), Kerman, Iran
| | - Andreas Kranis
- Roslin Institute, University of Edinburgh, Edinburgh, EH25 9PS, UK
| | - Llibertat Tusell
- INRA UMR1388/INPT ENSAT/INPT ENVT GenPhySE, F-31326, Castanet-Tolosan, France
| | - Gota Morota
- Department of Animal Science, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
| | - Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA.,Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
| | - Daniel Gianola
- Department of Animal Sciences, University of Wisconsin, Madison, WI, USA.,Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.,Department of Dairy Science, University of Wisconsin, Madison, WI, USA
| |
|
13
|
Muleta KT, Bulli P, Zhang Z, Chen X, Pumphrey M. Unlocking Diversity in Germplasm Collections via Genomic Selection: A Case Study Based on Quantitative Adult Plant Resistance to Stripe Rust in Spring Wheat. The Plant Genome 2017; 10. [PMID: 29293811 DOI: 10.3835/plantgenome2016.12.0124] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 05/20/2023]
Abstract
Harnessing diversity from germplasm collections is more feasible today because of the development of lower-cost and higher-throughput genotyping methods. However, the cost of phenotyping is still generally high, so efficient methods of sampling and exploiting useful diversity are needed. Genomic selection (GS) has the potential to enhance the use of desirable genetic variation in germplasm collections through predicting the genomic estimated breeding values (GEBVs) for all traits that have been measured. Here, we evaluated the effects of various scenarios of population genetic properties and marker density on the accuracy of GEBVs in the context of applying GS for wheat (Triticum aestivum L.) germplasm use. Empirical data for adult plant resistance to stripe rust (Puccinia striiformis f. sp. tritici) collected on 1163 spring wheat accessions and genotypic data based on the wheat 9K single nucleotide polymorphism (SNP) iSelect assay were used for various genomic prediction tests. Unsurprisingly, the results of the cross-validation tests demonstrated that prediction accuracy increased with an increase in training population size and marker density. It was evident that using all the available markers (5619) was unnecessary for capturing the trait variation in the germplasm collection, with no further gain in prediction accuracy beyond 1 SNP per 3.2 cM (∼1850 markers), which is close to the linkage disequilibrium decay rate in this population. Collectively, our results suggest that larger germplasm collections may be efficiently sampled via lower-density genotyping methods, whereas genetic relationships between the training and validation populations remain critical when exploiting GS to select from germplasm collections.
|
14
|
Silva RMO, Fragomeni BO, Lourenco DAL, Magalhães AFB, Irano N, Carvalheiro R, Canesin RC, Mercadante MEZ, Boligon AA, Baldi FS, Misztal I, Albuquerque LG. Accuracies of genomic prediction of feed efficiency traits using different prediction and validation methods in an experimental Nelore cattle population. J Anim Sci 2017; 94:3613-3623. [PMID: 27898889 DOI: 10.2527/jas.2016-0401] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 01/17/2023]
Abstract
Animal feeding is the most important economic component of beef production systems. Selection for feed efficiency has not been effective, mainly because the phenotypes are difficult and costly to obtain. The application of genomic selection using SNP can decrease the cost of animal evaluation as well as the generation interval. The objective of this study was to compare methods for genomic evaluation of feed efficiency traits using different cross-validation layouts in an experimental beef cattle population genotyped for a high-density SNP panel (BovineHD BeadChip assay 700k, Illumina Inc., San Diego, CA). After quality control, a total of 437,197 SNP genotypes were available for 761 Nelore animals from the Institute of Animal Science, Sertãozinho, São Paulo, Brazil. The studied traits were residual feed intake, feed conversion ratio, ADG, and DMI. Methods of analysis were traditional BLUP, single-step genomic BLUP (ssGBLUP), genomic BLUP (GBLUP), and a Bayesian regression method (BayesCπ). Direct genomic values (DGV) from the last 2 methods were compared directly or in an index that combines DGV with parent average. Three cross-validation approaches were used to validate the models: 1) YOUNG, in which the partition into training and testing sets was based on year of birth and testing animals were born after 2010; 2) UNREL, in which the data set was split into 3 less related subsets and the validation was done in each subset at a time; and 3) RANDOM, in which the data set was randomly divided into 4 subsets (considering the contemporary groups) and the validation was done in each subset at a time. On average, the RANDOM design provided the most accurate predictions. Average accuracies ranged from 0.10 to 0.58 using BLUP, from 0.09 to 0.48 using GBLUP, from 0.06 to 0.49 using BayesCπ, and from 0.22 to 0.49 using ssGBLUP. The most accurate and consistent predictions were obtained using ssGBLUP for all analyzed traits. The ssGBLUP seems to be more suitable to obtain genomic predictions for feed efficiency traits on an experimental population of genotyped animals.
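Two of the cross-validation layouts named in this abstract, YOUNG and RANDOM, can be sketched as simple index partitions. This is an illustrative sketch only: the birth years are invented, the function names are hypothetical, and the handling of contemporary groups in RANDOM and the relatedness-based UNREL split are deliberately left out because the abstract does not specify them.

```python
import numpy as np

def young_split(birth_year, cutoff=2010):
    """YOUNG layout: animals born after `cutoff` form the testing set."""
    birth_year = np.asarray(birth_year)
    train = np.flatnonzero(birth_year <= cutoff)
    test = np.flatnonzero(birth_year > cutoff)
    return train, test

def random_folds(n, k=4, seed=0):
    """RANDOM layout: shuffle the n animals and cut them into k subsets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

# Hypothetical birth years for eight animals (not data from the study).
birth_year = np.array([2007, 2008, 2009, 2010, 2011, 2012, 2011, 2009])
train, test = young_split(birth_year)
folds = random_folds(len(birth_year), k=4)

print("YOUNG train:", train, "test:", test)
for i, fold in enumerate(folds):
    print(f"RANDOM fold {i}: validate on animals {fold}")
```

Each RANDOM fold is validated in turn while the remaining folds serve as training data, which matches the "each subset at a time" wording above.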
|
15
|
Jenko J, Wiggans G, Cooper T, Eaglen S, Luff W, Bichard M, Pong-Wong R, Woolliams J. Cow genotyping strategies for genomic selection in a small dairy cattle population. J Dairy Sci 2017; 100:439-452. [DOI: 10.3168/jds.2016-11479] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 05/17/2016] [Accepted: 09/21/2016] [Indexed: 01/22/2023]
|
16
|
Using Genetic Distance to Infer the Accuracy of Genomic Prediction. PLoS Genet 2016; 12:e1006288. [PMID: 27589268 PMCID: PMC5010218 DOI: 10.1371/journal.pgen.1006288] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 09/04/2015] [Accepted: 08/10/2016] [Indexed: 12/12/2022]
Abstract
The prediction of phenotypic traits using high-density genomic data has many applications such as the selection of plants and animals of commercial interest; and it is expected to play an increasing role in medical diagnostics. Statistical models used for this task are usually tested using cross-validation, which implicitly assumes that new individuals (whose phenotypes we would like to predict) originate from the same population the genomic prediction model is trained on. In this paper we propose an approach based on clustering and resampling to investigate the effect of increasing genetic distance between training and target populations when predicting quantitative traits. This is important for plant and animal genetics, where genomic selection programs rely on the precision of predictions in future rounds of breeding. Therefore, estimating how quickly predictive accuracy decays is important in deciding which training population to use and how often the model has to be recalibrated. We find that the correlation between true and predicted values decays approximately linearly with respect to either FST or mean kinship between the training and the target populations. We illustrate this relationship using simulations and a collection of data sets from mice, wheat and human genetics.
The availability of increasing amounts of genomic data is making the use of statistical models to predict traits of interest a mainstay of many applications in life sciences. Applications range from medical diagnostics for common and rare diseases to breeding characteristics such as disease resistance in plants and animals of commercial interest. We explored an implicit assumption of how such prediction models are often assessed: that the individuals whose traits we would like to predict originate from the same population as those that are used to train the models. This is commonly not the case, especially in the case of plants and animals that are parts of selection programs. To study this problem we proposed a model-agnostic approach to infer the accuracy of prediction models as a function of two common measures of genetic distance. Using data from plant, animal and human genetics, we find that accuracy decays approximately linearly in either of those measures. Quantifying this decay has fundamental applications in all branches of genetics, as it measures how studies generalise to different populations.
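The approximately linear decay of accuracy with genetic distance reported here could be summarised with an ordinary least-squares line. The sketch below only illustrates that idea: the FST values and accuracies are invented (generated from an exact line so the fit is easy to check), and `fit_linear_decay` is a hypothetical helper, not part of the published method.

```python
import numpy as np

def fit_linear_decay(distance, accuracy):
    """Least-squares line: accuracy is modelled as intercept + slope * distance."""
    slope, intercept = np.polyfit(distance, accuracy, deg=1)
    return slope, intercept

# Invented illustration values, not results from the paper.
fst = np.array([0.00, 0.02, 0.04, 0.06, 0.08])   # distance between training and target sets
acc = 0.60 - 2.5 * fst                           # predictive correlation at each distance

slope, intercept = fit_linear_decay(fst, acc)
predicted = intercept + slope * 0.05             # expected accuracy at FST = 0.05
print(f"slope={slope:.2f}, intercept={intercept:.2f}, accuracy at FST=0.05: {predicted:.3f}")
```

With real data the points would scatter around the line, and the fitted slope would indicate how quickly a model needs recalibrating as the target population drifts away from the training population.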
|
17
|
Karaman E, Cheng H, Firat MZ, Garrick DJ, Fernando RL. An Upper Bound for Accuracy of Prediction Using GBLUP. PLoS One 2016; 11:e0161054. [PMID: 27529480 PMCID: PMC4986954 DOI: 10.1371/journal.pone.0161054] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 11/30/2015] [Accepted: 07/29/2016] [Indexed: 11/26/2022]
Abstract
This study aims at characterizing the asymptotic behavior of genomic prediction R² as the size of the reference population increases for common or rare QTL alleles through simulations. Haplotypes derived from whole-genome sequence of 85 Caucasian individuals from the 1000 Genomes Project were used to simulate random mating in a population of 10,000 individuals for at least 100 generations to create the LD structure in humans for a large number of individuals. To reduce computational demands, only SNPs within a 0.1 M region of each of the first 5 chromosomes were used in simulations, and therefore, the total genome length simulated was 0.5 M. When the genome length is 30 M, to get the same genomic prediction R² as with a 0.5 M genome would require a reference population 60 fold larger. Three scenarios were considered varying in minor allele frequency distributions of markers and QTL, for h² = 0.8 resembling height in humans. Total number of markers was 4,200 and QTL were 70 for each scenario. In this study, we considered the prediction accuracy in terms of an estimability problem, and thereby provided an upper bound for reliability of prediction, and thus, for prediction R². Genomic prediction methods GBLUP, BayesB and BayesC were compared. Our results imply that for human height variable selection methods BayesB and BayesC applied to a 30 M genome have no advantage over GBLUP when the size of reference population was small (<6,000 individuals), but are superior as more individuals are included in the reference population. All methods become asymptotically equivalent in terms of prediction R², which approaches genomic heritability when the size of the reference population reaches 480,000 individuals.
Affiliation(s)
- Emre Karaman
- Department of Animal Science, Faculty of Agriculture, Akdeniz University, 07059 Antalya, Turkey
| | - Hao Cheng
- Department of Animal Science, Iowa State University, 50011 Ames, Iowa, United States of America
- Department of Statistics, Iowa State University, 50011 Ames, Iowa, United States of America
| | - Mehmet Z. Firat
- Department of Animal Science, Faculty of Agriculture, Akdeniz University, 07059 Antalya, Turkey
| | - Dorian J. Garrick
- Department of Animal Science, Iowa State University, 50011 Ames, Iowa, United States of America
- Institute of Veterinary, Animal and Biomedical Science, Massey University, Palmerston North, New Zealand
| | - Rohan L. Fernando
- Department of Animal Science, Iowa State University, 50011 Ames, Iowa, United States of America
| |
|
18
|
Fernandes Júnior GA, Rosa GJM, Valente BD, Carvalheiro R, Baldi F, Garcia DA, Gordo DGM, Espigolan R, Takada L, Tonussi RL, de Andrade WBF, Magalhães AFB, Chardulo LAL, Tonhati H, de Albuquerque LG. Genomic prediction of breeding values for carcass traits in Nellore cattle. Genet Sel Evol 2016; 48:7. [PMID: 26830208 PMCID: PMC4734869 DOI: 10.1186/s12711-016-0188-y] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 06/19/2015] [Accepted: 01/18/2016] [Indexed: 01/20/2023]
Abstract
Background: The objective of this study was to evaluate the accuracy of genomic predictions for rib eye area (REA), backfat thickness (BFT), and hot carcass weight (HCW) in Nellore beef cattle from Brazilian commercial herds using different prediction models. Methods: Phenotypic data from 1756 Nellore steers from ten commercial herds in Brazil were used. Animals were offspring of 294 sires and 1546 dams, reared on pasture, feedlot finished, and slaughtered at approximately 2 years of age. All animals were genotyped using a 777k Illumina Bovine HD SNP chip. Accuracy of genomic predictions of breeding values was evaluated by using a 5-fold cross-validation scheme and considering three models: Bayesian ridge regression (BRR), Bayes C (BC) and Bayesian Lasso (BL), and two types of response variables: traditional estimated breeding value (EBV), and phenotype adjusted for fixed effects (Y*). Results: The prediction accuracies achieved with the BRR model were equal to 0.25 (BFT), 0.33 (HCW) and 0.36 (REA) when EBV was used as response variable, and 0.21 (BFT), 0.37 (HCW) and 0.46 (REA) when using Y*. Results obtained with the BC and BL models were similar. Accuracies increased for traits with a higher heritability, and using Y* instead of EBV as response variable resulted in higher accuracy when heritability was higher. Conclusions: Our results indicate that the accuracy of genomic prediction of carcass traits in Nellore cattle is moderate to high. Prediction of genomic breeding values from adjusted phenotypes Y* was more accurate than from EBV, especially for highly heritable traits. The three models considered (BRR, BC and BL) led to similar predictive abilities and, thus, either one could be used to implement genomic prediction for carcass traits in Nellore cattle.
Affiliation(s)
| | - Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA.
| | - Bruno D Valente
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, 53706, USA.
| | - Roberto Carvalheiro
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Fernando Baldi
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Diogo A Garcia
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Daniel G M Gordo
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Rafael Espigolan
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Luciana Takada
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Rafael L Tonussi
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Willian B F de Andrade
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Ana F B Magalhães
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Luis A L Chardulo
- Faculdade de Medicina Veterinária e Zootecnia, UNESP, Botucatu, SP, 18618-970, Brazil.
| | - Humberto Tonhati
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| | - Lucia G de Albuquerque
- Faculdade de Ciências Agrárias e Veterinárias, UNESP, Jaboticabal, SP, 14884-900, Brazil.
| |
|
19
|
Affiliation(s)
- T. Yin
- Department of Animal Breeding, University of Kassel, 37213 Witzenhausen, Germany
| | - S. König
- Department of Animal Breeding, University of Kassel, 37213 Witzenhausen, Germany
| |
|
20
|
Morota G, Gianola D. Kernel-based whole-genome prediction of complex traits: a review. Front Genet 2014; 5:363. [PMID: 25360145 PMCID: PMC4199321 DOI: 10.3389/fgene.2014.00363] [Citation(s) in RCA: 96] [Impact Index Per Article: 9.6] [Received: 08/09/2014] [Accepted: 09/29/2014] [Indexed: 01/18/2023]
Abstract
Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
Affiliation(s)
- Gota Morota
- Department of Animal Science, University of Nebraska-Lincoln Lincoln, NE, USA
| | - Daniel Gianola
- Department of Animal Sciences, University of Wisconsin-Madison Madison, WI, USA ; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison Madison, WI, USA ; Department of Dairy Science, University of Wisconsin-Madison Madison, WI, USA
| |
|
21
|
Calus MP, Schrooten C, Veerkamp RF. Genomic prediction of breeding values using previously estimated SNP variances. Genet Sel Evol 2014; 46:52. [PMID: 25928875 PMCID: PMC4176585 DOI: 10.1186/s12711-014-0052-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/07/2014] [Accepted: 07/17/2014] [Indexed: 11/10/2022]
Abstract
Background: Genomic prediction requires estimation of variances of effects of single nucleotide polymorphisms (SNPs), which is computationally demanding, and uses these variances for prediction. We have developed models with separate estimation of SNP variances, which can be applied infrequently, and genomic prediction, which can be applied routinely. Methods: SNP variances were estimated with Bayes Stochastic Search Variable Selection (BSSVS) and BayesC. Genome-enhanced breeding values (GEBV) were estimated with RR-BLUP (ridge regression best linear unbiased prediction), using either variances obtained from BSSVS (BLUP-SSVS) or BayesC (BLUP-C), or assuming equal variances for each SNP. Datasets used to estimate SNP variances comprised (1) all animals, (2) 50% random animals (RAN50), (3) 50% best animals (TOP50), or (4) 50% worst animals (BOT50). Traits analysed were protein yield, udder depth, somatic cell score, interval between first and last insemination, direct longevity, and longevity including information from predictors. Results: BLUP-SSVS and BLUP-C yielded similar GEBV as the equivalent Bayesian models that simultaneously estimated SNP variances. Reliabilities of these GEBV were consistently higher than from RR-BLUP, although only significantly for direct longevity. Across scenarios that used data subsets to estimate GEBV, observed reliabilities were generally higher for TOP50 than for RAN50, and much higher than for BOT50. Reliabilities of TOP50 were higher because the training data contained more ancestors of selection candidates. Using estimated SNP variances based on random or non-random subsets of the data, while using all data to estimate GEBV, did not affect reliabilities of the BLUP models. A convergence criterion of 10^-8 instead of 10^-10 for BLUP models yielded similar GEBV, while the required number of iterations decreased by 71 to 90%. Including a separate polygenic effect consistently improved reliabilities of the GEBV, but also substantially increased the required number of iterations to reach convergence with RR-BLUP. SNP variances converged faster for BayesC than for BSSVS. Conclusions: Combining Bayesian variable selection models to re-estimate SNP variances and BLUP models that use those SNP variances yields GEBV that are similar to those from full Bayesian models. Moreover, these combined models yield predictions with higher reliability and less bias than the commonly used RR-BLUP model.
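The idea of running RR-BLUP with previously estimated, SNP-specific variances can be sketched as a ridge-type system in which each SNP gets its own shrinkage term. This is a minimal sketch under stated assumptions, not the authors' implementation: the residual variance is treated as known, the only fixed effect is the mean, no polygenic effect is fitted, and the "previously estimated" variances are invented for the simulated data. The name `rr_blup` and all variables are hypothetical.

```python
import numpy as np

def rr_blup(Z, y, snp_var, resid_var):
    """Solve (Z'Z + D) a = Z'(y - mean) with D = diag(resid_var / snp_var).

    Equal entries in `snp_var` give ordinary RR-BLUP; unequal entries shrink
    each SNP according to its own (previously estimated) variance.
    """
    Zc = Z - Z.mean(axis=0)                        # centre genotype columns
    yc = y - y.mean()                              # mean is the only fixed effect here
    D = np.diag(resid_var / np.asarray(snp_var, dtype=float))
    effects = np.linalg.solve(Zc.T @ Zc + D, Zc.T @ yc)
    return effects, y.mean() + Zc @ effects        # SNP effects and fitted GEBV-like values

# Simulated stand-in data (not from the study): 100 animals, 50 SNPs, 5 with real effects.
rng = np.random.default_rng(2)
Z = rng.binomial(2, 0.4, size=(100, 50)).astype(float)
true = np.zeros(50)
true[:5] = 1.0
y = Z @ true + rng.normal(size=100)

equal_var = np.full(50, 0.1)                       # same variance for every SNP
weighted_var = np.where(true != 0, 1.0, 0.01)      # pretend output of an earlier Bayesian run

effects_eq, gebv_eq = rr_blup(Z, y, equal_var, resid_var=1.0)
effects_w, gebv_w = rr_blup(Z, y, weighted_var, resid_var=1.0)
print("largest |effect|, equal variances:   ", float(np.abs(effects_eq).max()))
print("largest |effect|, weighted variances:", float(np.abs(effects_w).max()))
```

Separating the two steps this way means the expensive variance estimation can be rerun only occasionally, while the linear solve is repeated routinely as new data arrive.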
Affiliation(s)
- Mario Pl Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, Wageningen, 6700 AH, The Netherlands.
| | | | - Roel F Veerkamp
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, Wageningen, 6700 AH, The Netherlands.
| |
|
22
|
Desta ZA, Ortiz R. Genomic selection: genome-wide prediction in plant improvement. Trends in Plant Science 2014; 19:592-601. [PMID: 24970707 DOI: 10.1016/j.tplants.2014.05.006] [Citation(s) in RCA: 278] [Impact Index Per Article: 27.8] [Received: 10/14/2013] [Revised: 05/08/2014] [Accepted: 05/23/2014] [Indexed: 05/18/2023]
Abstract
Association analysis is used to measure relations between markers and quantitative trait loci (QTL). Their estimation ignores genes with small effects that trigger underpinning quantitative traits. By contrast, genome-wide selection estimates marker effects across the whole genome on the target population based on a prediction model developed in the training population (TP). Whole-genome prediction models estimate all marker effects in all loci and capture small QTL effects. Here, we review several genomic selection (GS) models with respect to both the prediction accuracy and genetic gain from selection. Phenotypic selection or marker-assisted breeding protocols can be replaced by selection, based on whole-genome predictions in which phenotyping updates the model to build up the prediction accuracy.
Affiliation(s)
- Zeratsion Abera Desta
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Sundsvagen 14, Box 101, Alnarp, SE 23053, Sweden
| | - Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Sundsvagen 14, Box 101, Alnarp, SE 23053, Sweden.
| |
|
23
|
Yao C, Leng N, Weigel KA, Lee KE, Engelman CD, Meyers KJ. Prediction of genetic contributions to complex traits using whole genome sequencing data. BMC Proc 2014; 8:S68. [PMID: 25519339 PMCID: PMC4143683 DOI: 10.1186/1753-6561-8-s1-s68] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Although markers identified by genome-wide association studies have individually strong statistical significance, their performance in prediction remains limited. Our goal was to use animal breeding genomic prediction models to predict additive genetic contributions for systolic blood pressure (SBP) using whole genome sequencing data with different validation designs. The additive genetic contributions of SBP were estimated via a linear mixed model. Rare variants (MAF < 0.05) were collapsed through the k-means method to create "collapsed single-nucleotide polymorphisms." Prediction of the additive genomic contributions of SBP was conducted using genomic Best Linear Unbiased Predictor (GBLUP) and BayesCπ. Estimates of predictive accuracy were compared using common single-nucleotide polymorphisms (SNPs) versus common and collapsed SNPs, and for prediction within and across families. The additive genetic variance of SBP accounted for 18% of the phenotypic variance (h² = 0.18). BayesCπ had slightly better prediction accuracies than GBLUP. In both models, within-family predictions had higher accuracies in both the training and testing sets than did the across-family design. Collapsing rare variants via the k-means method and adding them to the common SNPs did not improve prediction accuracies. The prediction model, including both pedigree and genomic information, achieved a slightly higher accuracy than using either source of information alone. Prediction of genetic contributions to complex traits is feasible using whole genome sequencing and statistical methods borrowed from animal breeding. The relatedness of individuals between the training and testing set strongly affected the performance of prediction models. Methods for inclusion of rare variants in these models need more development.
Affiliation(s)
- Chen Yao: Department of Dairy Science, University of Wisconsin, 1675 Observatory Drive, Madison, WI 53706, USA
- Ning Leng: Department of Statistics, University of Wisconsin, 1220 Medical Sciences Center, 1300 University Ave, Madison, WI 53706, USA
- Kent A Weigel: Department of Dairy Science, University of Wisconsin, 1675 Observatory Drive, Madison, WI 53706, USA
- Kristine E Lee: Department of Ophthalmology and Visual Sciences, University of Wisconsin Medical School, 1069 WARF Building, 610 North Walnut Street, Madison, WI 53726, USA
- Corinne D Engelman: Department of Population Health Sciences, University of Wisconsin School of Medicine and Public Health, Madison, WI 53726, USA
- Kristin J Meyers: Department of Ophthalmology and Visual Sciences, University of Wisconsin Medical School, 1069 WARF Building, 610 North Walnut Street, Madison, WI 53726, USA
|
24
|
Accuracy of estimation of genomic breeding values in pigs using low-density genotypes and imputation. G3-GENES GENOMES GENETICS 2014; 4:623-31. [PMID: 24531728 PMCID: PMC4059235 DOI: 10.1534/g3.114.010504] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/22/2022]
Abstract
Genomic selection has the potential to increase genetic progress. Genotype imputation of high-density single-nucleotide polymorphism (SNP) genotypes can improve the cost efficiency of genomic breeding value (GEBV) prediction for pig breeding. Consequently, the objectives of this work were to: (1) estimate the accuracy of genomic evaluation and GEBV for three traits in a Yorkshire population and (2) quantify the loss of accuracy of genomic evaluation and GEBV when genotypes were imputed under two scenarios: a high-cost, high-accuracy scenario in which only selection candidates were imputed from a low-density platform and a low-cost, low-accuracy scenario in which all animals were imputed using a small reference panel of haplotypes. Phenotypes and genotypes obtained with the PorcineSNP60 BeadChip were available for 983 Yorkshire boars. Genotypes of selection candidates were masked and imputed using tagSNP in the GeneSeek Genomic Profiler (10K). Imputation was performed with BEAGLE using 128 or 1800 haplotypes as reference panels. GEBV were obtained through an animal-centric ridge regression model using de-regressed breeding values as response variables. Accuracy of genomic evaluation was estimated as the correlation between estimated breeding values and GEBV in a 10-fold cross-validation design. Accuracy of genomic evaluation using observed genotypes was high for all traits (0.65–0.68). Using genotypes imputed from a large reference panel (accuracy: R² = 0.95) for genomic evaluation did not significantly decrease accuracy, whereas a scenario with genotypes imputed from a small reference panel (R² = 0.88) did show a significant decrease in accuracy. Genomic evaluation based on imputed genotypes in selection candidates can be implemented at a fraction of the cost of a genomic evaluation using observed genotypes and still yield virtually the same accuracy. On the other hand, using a very small reference panel of haplotypes to impute training animals and candidates for selection results in lower accuracy of genomic evaluation.
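The 10-fold cross-validation design described above, with accuracy measured as a correlation in the held-out fold, can be sketched as follows. This is a minimal illustration in plain Python: the one-covariate ridge predictor `ridge_1d` is only a stand-in for the animal-centric ridge regression of the paper, and all function names and the simulated data are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(x, y, k, fit_predict, seed=0):
    """Mean over folds of cor(prediction, masked response)."""
    accs = []
    for fold in kfold_indices(len(y), k, seed):
        held = set(fold)
        train = [i for i in range(len(y)) if i not in held]
        pred = fit_predict([x[i] for i in train], [y[i] for i in train],
                           [x[i] for i in fold])
        accs.append(pearson(pred, [y[i] for i in fold]))
    return sum(accs) / k

def ridge_1d(x_train, y_train, x_new, lam=1.0):
    """Stand-in predictor: ridge regression on a single covariate (assumption)."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train))
    sxx = sum((a - mx) ** 2 for a in x_train)
    slope = sxy / (sxx + lam)                # shrunken slope
    return [my + slope * (a - mx) for a in x_new]

# Toy data (simulated, purely illustrative)
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [0.8 * a + rng.gauss(0, 0.6) for a in x]
folds = kfold_indices(len(y), 10)
acc = cv_accuracy(x, y, k=10, fit_predict=ridge_1d)
```

Any predictor with the same `(x_train, y_train, x_new)` signature can be passed as `fit_predict`, so the same loop could compare observed and imputed genotypes.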
|
25
|
Silva FF, Mulder HA, Knol EF, Lopes MS, Guimarães SEF, Lopes PS, Mathur PK, Viana JMS, Bastiaansen JWM. Sire evaluation for total number born in pigs using a genomic reaction norms approach. J Anim Sci 2014; 92:3825-34. [PMID: 24492557 DOI: 10.2527/jas.2013-6486] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 11/13/2022]
Abstract
In the era of genome-wide selection (GWS), genotype-by-environment (G×E) interactions can be studied using genomic information, thus enabling the estimation of SNP marker effects and the prediction of genomic estimated breeding values (GEBV) for young candidates for selection in different environments. Although G×E studies in pigs are scarce, the use of artificial insemination has enabled the distribution of genetic material from sires across multiple environments. Given the relevance of reproductive traits, such as the total number born (TNB) and the variation in environmental conditions encountered by commercial dams, understanding G×E interactions can be essential for choosing the best sires for different environments. The present work proposes a two-step reaction norm approach for G×E analysis using genomic information. The first step provided estimates of environmental effects (herd-year-season, HYS), and the second step provided estimates of the intercept and slope for the TNB across different HYS levels, obtained from the first step, using a random regression model. In both steps, pedigree (A) and genomic (G) relationship matrices were considered. The genetic parameters (variance components, h², and genetic correlations) were very similar when estimated using the A and G relationship matrices. The reaction norm graphs showed considerable differences in environmental sensitivity between sires, indicating a reranking of sires in terms of genetic merit across the HYS levels. Based on the G matrix analysis, SNP by environment interactions were observed. For some SNP, the effects increased at increasing HYS levels, while for others, the effects decreased at increasing HYS levels or showed no changes between HYS levels. Cross-validation analysis demonstrated better performance of the genomic approach with respect to traditional pedigrees for both the G×E and standard models. The genomic reaction norm model resulted in an accuracy of GEBV for "juvenile" boars varying from 0.14 to 0.44 across different HYS levels, while the accuracy of the standard genomic prediction model, without reaction norms, varied from 0.09 to 0.28. These results show that it is important and feasible to consider G×E interactions in evaluations of sires using genomic prediction models and that genomic information can increase the accuracy of selection across environments.
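The two-step reaction norm idea above can be sketched in a much simplified form. Here the second step fits an ordinary least-squares line per sire instead of the random regression model with A or G relationship matrices used in the paper; the records, the sire and HYS labels, and the function names are all hypothetical.

```python
from collections import defaultdict

def hys_effects(records):
    """Step 1: level of each HYS as its mean TNB deviation from the overall mean."""
    overall = sum(tnb for _, _, tnb in records) / len(records)
    by_hys = defaultdict(list)
    for _, hys, tnb in records:
        by_hys[hys].append(tnb)
    return {h: sum(v) / len(v) - overall for h, v in by_hys.items()}

def reaction_norms(records, env):
    """Step 2: per-sire intercept and slope of TNB regressed on the HYS level."""
    by_sire = defaultdict(list)
    for sire, hys, tnb in records:
        by_sire[sire].append((env[hys], tnb))
    norms = {}
    for sire, pts in by_sire.items():
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        sxx = sum((x - mx) ** 2 for x, _ in pts)
        slope = sum((x - mx) * (y - my) for x, y in pts) / sxx
        norms[sire] = (my - slope * mx, slope)   # (intercept, slope)
    return norms

def merit(norm, level):
    """Predicted performance of a sire at a given environmental level."""
    intercept, slope = norm
    return intercept + slope * level

# Hypothetical records: (sire, HYS, total number born in daughters' litters)
records = [
    ("A", "hys1", 10), ("A", "hys2", 12), ("A", "hys3", 14),
    ("B", "hys1", 12), ("B", "hys2", 12), ("B", "hys3", 12),
]
env = hys_effects(records)
norms = reaction_norms(records, env)

# Sire B ranks first in the poorest environment, sire A in the best one.
rank_low = sorted(norms, key=lambda s: merit(norms[s], env["hys1"]), reverse=True)
rank_high = sorted(norms, key=lambda s: merit(norms[s], env["hys3"]), reverse=True)
```

A non-zero difference in slopes between sires is what produces the reranking across HYS levels that the abstract reports.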
Affiliation(s)
- F F Silva: Departamento de Zootecnia, Universidade Federal de Viçosa, 36570-000 Viçosa, Brazil
- H A Mulder: Animal Breeding and Genomics Centre, Wageningen University, P.O. Box 338, 6700 AH Wageningen, the Netherlands
- E F Knol: TOPIGS Research Center IPG, P.O. Box 43, 6640 AA Beuningen, the Netherlands
- M S Lopes: TOPIGS Research Center IPG, P.O. Box 43, 6640 AA Beuningen, the Netherlands
- S E F Guimarães: Departamento de Zootecnia, Universidade Federal de Viçosa, 36570-000 Viçosa, Brazil
- P S Lopes: Departamento de Zootecnia, Universidade Federal de Viçosa, 36570-000 Viçosa, Brazil
- P K Mathur: TOPIGS Research Center IPG, P.O. Box 43, 6640 AA Beuningen, the Netherlands
- J M S Viana: Departamento de Biologia Geral, Universidade Federal de Viçosa, 36570-000 Viçosa, Brazil
- J W M Bastiaansen: Animal Breeding and Genomics Centre, Wageningen University, P.O. Box 338, 6700 AH Wageningen, the Netherlands
|
26
|
Vazquez AI, de los Campos G, Klimentidis YC, Rosa GJM, Gianola D, Yi N, Allison DB. A comprehensive genetic approach for improving prediction of skin cancer risk in humans. Genetics 2012; 192:1493-502. [PMID: 23051645 PMCID: PMC3512154 DOI: 10.1534/genetics.112.141705] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 05/03/2012] [Accepted: 09/07/2012] [Indexed: 01/09/2023]
Abstract
Prediction of genetic risk for disease is needed for preventive and personalized medicine. Genome-wide association studies have found unprecedented numbers of variants associated with complex human traits and diseases. However, these variants explain only a small proportion of genetic risk. Mounting evidence suggests that many traits, relevant to public health, are affected by large numbers of small-effect genes and that prediction of genetic risk to those traits and diseases could be improved by incorporating large numbers of markers into whole-genome prediction (WGP) models. We developed a WGP model incorporating thousands of markers for prediction of skin cancer risk in humans. We also considered other ways of incorporating genetic information into prediction models, such as family history or ancestry (using principal components, PCs, of informative markers). Prediction accuracy was evaluated using the area under the receiver operating characteristic curve (AUC) estimated in a cross-validation. Incorporation of genetic information (i.e., familial relationships, PCs, or WGP) yielded a significant increase in prediction accuracy: from an AUC of 0.53 for a baseline model that accounted for nongenetic covariates to AUCs of 0.58 (pedigree), 0.62 (PCs), and 0.64 (WGP). In summary, prediction of skin cancer risk could be improved by considering genetic information and using a large number of single-nucleotide polymorphisms (SNPs) in a WGP model, which allows for the detection of patterns of genetic risk that are above and beyond those that can be captured using family history. We discuss avenues for improving prediction accuracy and speculate on the possible use of WGP to prospectively identify individuals at high risk.
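The AUC used as the accuracy measure above can be computed directly from its rank interpretation: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, counting ties as one half. The sketch below is a generic illustration; the function name `auc` and the toy scores are hypothetical and do not reproduce the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    # Count a win when the case outscores the control, half a win for a tie.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Toy predicted risks for two controls (label 0) and two cases (label 1)
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
example_auc = auc(scores, labels)            # 3 of 4 case/control pairs are ordered correctly

# A predictor that gives everyone the same score carries no information.
flat_auc = auc([0.5, 0.5, 0.5, 0.5], labels)
```

On this scale 0.5 corresponds to an uninformative predictor and 1.0 to perfect separation, which puts the reported AUCs of 0.53 to 0.64 in context.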
Affiliation(s)
- Ana I Vazquez: Section on Statistical Genetics, Department of Biostatistics, University of Alabama, Birmingham, AL 35294, USA.
|
27
|
Abstract
Genomic selection relaxes the requirement of traditional selection tools to have phenotypic measurements on close relatives of all selection candidates. This opens up possibilities to select for traits that are difficult or expensive to measure. The objectives of this paper were to predict accuracy of and response to genomic selection for a new trait, considering that only a cow reference population of moderate size was available for the new trait, and that selection simultaneously targeted an index and this new trait. Accuracy of and response to selection were deterministically evaluated for three different breeding goals. Single-trait selection for the new trait based only on a limited cow reference population of up to 10 000 cows showed that maximum genetic responses of 0.20 and 0.28 genetic standard deviation (s.d.) per year can be achieved for traits with a heritability of 0.05 and 0.30, respectively. Adding information from the index based on a reference population of 5000 bulls, and assuming a genetic correlation of 0.5, increased genetic response for both heritability levels by up to 0.14 genetic s.d. per year. The scenario with simultaneous selection for the new trait and the index yielded a substantially lower response for the new trait, especially when the genetic correlation with the index was negative. Despite the lower response for the index, whenever the new trait had considerable economic value, including the cow reference population considerably improved the genetic response for the new trait. For scenarios with a zero or negative genetic correlation with the index and equal economic value for the index and the new trait, a reference population of 2000 cows increased genetic response for the new trait by at least 0.10 and 0.20 genetic s.d. per year, for heritability levels of 0.05 and 0.30, respectively. We conclude that for new traits with a very small or positive genetic correlation with the index, and a high positive economic value, considerable genetic response can already be achieved based on a cow reference population with only 2000 records, even when the reliability of individual genomic breeding values is much lower than currently accepted in dairy cattle breeding programs. New traits may generally have a negative genetic correlation with the index and a small positive economic value. For such new traits, cow reference populations of at least 10 000 cows may be required to achieve acceptable levels of genetic response for the new trait and for the whole breeding goal.
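To illustrate what a deterministic evaluation of accuracy and response can look like, the sketch below uses a widely used approximation in which the accuracy of genomic prediction depends on the reference population size N, the heritability h², and the number of independent chromosome segments Me, combined with the breeder's equation for annual response. The abstract does not state which formulas the paper used, so this choice is an assumption, and the values of `me`, `intensity`, and `gen_interval` are hypothetical.

```python
from math import sqrt

def genomic_accuracy(n_ref, h2, me):
    """Approximate accuracy: sqrt(N * h2 / (N * h2 + Me))."""
    return sqrt(n_ref * h2 / (n_ref * h2 + me))

def response_per_year(accuracy, intensity, gen_interval, sigma_g=1.0):
    """Breeder's equation: selection intensity * accuracy * sigma_g / generation interval."""
    return intensity * accuracy * sigma_g / gen_interval

# Hypothetical inputs, chosen only for illustration
me = 1000.0            # independent chromosome segments (assumption)
intensity = 2.0        # selection intensity (assumption)
gen_interval = 4.0     # generation interval in years (assumption)

results = {}
for h2 in (0.05, 0.30):
    for n_ref in (2000, 10000):
        acc = genomic_accuracy(n_ref, h2, me)
        results[(h2, n_ref)] = (acc, response_per_year(acc, intensity, gen_interval))
```

Under this approximation, accuracy and response rise with both heritability and reference population size, which mirrors the qualitative pattern described in the abstract; the numerical values depend entirely on the assumed inputs.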
|