Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Karaman E, Cheng H, Firat MZ, Garrick DJ, Fernando RL. An Upper Bound for Accuracy of Prediction Using GBLUP. PLoS One 2016;11:e0161054. [PMID: 27529480 PMCID: PMC4986954 DOI: 10.1371/journal.pone.0161054] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 11/30/2015] [Accepted: 07/29/2016] [Indexed: 11/26/2022] Open

Cited by Other Article(s)

Nilson SM, Burke JM, Murdoch BM, Morgan JLM, Lewis RM. Pedigree diversity and implications for genetic selection of Katahdin sheep. J Anim Breed Genet 2024;141:304-316. [PMID: 38108572 DOI: 10.1111/jbg.12842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 11/30/2023] [Accepted: 12/02/2023] [Indexed: 12/19/2023]

Abstract

The Katahdin hair breed gained popularity in the United States as low input and prolific, with a propensity to exhibit parasite resistance. With the introduction of genomically enhanced estimated breeding values (GEBV) to the Katahdin genetic evaluation, defining the diversity present in the breed is pertinent. Utilizing pedigree records (n = 92,030) from 1984 to 2019 from the National Sheep Improvement Program, our objectives were to (i) estimate the completeness and quality of the pedigree, (ii) calculate diversity statistics for the whole pedigree and relevant reference subpopulations and (iii) assess the impact of current diversity on genomic selection. Reference 1 was Katahdins born from 2017 to 2019 (n = 23,494), while reference 2 was a subset with at least three generations of Katahdin ancestry (n = 9327). The completeness of the whole pedigree, and the pedigrees of reference 1 and reference 2, were above 50% through the fourth, fifth and seventh generation of ancestors, respectively. Effective population size (Ne) averaged 111 animals with a range from 42.2 to 451.0. The average generation interval was 2.9 years for the whole pedigree and reference 1, and 2.8 years for reference 2. The mean individual inbreeding and average relatedness coefficients were 1.62% and 0.91%, 1.74% and 0.90% and 2.94% and 1.46% for the whole pedigree, reference 1, and reference 2, respectively. There were over 300 effective founders in the whole pedigree and reference 1, with 169 in reference 2. Effective number of ancestors were over 150 for the whole pedigree and reference 1, while there were 67 for reference 2. Prediction accuracies increased as the reference population grew from 1k to 7.5k and plateaued at 15k animals. Given the large number of founders and ancestors contributing to the base genetic variation in the breed, the Ne is sufficient to maintain diversity while achieving progress with selection. 
Stable low rates of inbreeding and relatedness suggest that incorporating genetic conservation in breeding decisions is currently not of high priority. Current Ne suggests that with limited genotyping, high levels of accuracy for genomic prediction can be achieved. However, intense selection on GEBV may cause loss of genetic diversity long term.

Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-González J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. Molecular Plant 2024;17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]

Fernández-González J, Haquin B, Combes E, Bernard K, Allard A, Isidro Y Sánchez J. Maximizing efficiency in sunflower breeding through historical data optimization. Plant Methods 2024;20:42. [PMID: 38493115 PMCID: PMC10943787 DOI: 10.1186/s13007-024-01151-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 01/30/2024] [Indexed: 03/18/2024]

Schneider H, Krizanac AM, Falker-Gieske C, Heise J, Tetens J, Thaller G, Bennewitz J. Genomic dissection of the correlation between milk yield and various health traits using functional and evolutionary information about imputed sequence variants of 34,497 German Holstein cows. BMC Genomics 2024;25:265. [PMID: 38461236 DOI: 10.1186/s12864-024-10115-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 02/13/2024] [Indexed: 03/11/2024] Open

Abstract

BACKGROUND

Over the last decades, it was subject of many studies to investigate the genomic connection of milk production and health traits in dairy cattle. Thereby, incorporating functional information in genomic analyses has been shown to improve the understanding of biological and molecular mechanisms shaping complex traits and the accuracies of genomic prediction, especially in small populations and across-breed settings. Still, little is known about the contribution of different functional and evolutionary genome partitioning subsets to milk production and dairy health. Thus, we performed a uni- and a bivariate analysis of milk yield (MY) and eight health traits using a set of ~34,497 German Holstein cows with 50K chip genotypes and ~17 million imputed sequence variants divided into 27 subsets depending on their functional and evolutionary annotation. In the bivariate analysis, eight trait-combinations were observed that contrasted MY with each health trait. Two genomic relationship matrices (GRM) were included, one consisting of the 50K chip variants and one consisting of each set of subset variants, to obtain subset heritabilities and genetic correlations. In addition, 50K chip heritabilities and genetic correlations were estimated applying merely the 50K GRM.

RESULTS

In general, 50K chip heritabilities were larger than the subset heritabilities. The largest heritabilities were found for MY, which was 0.4358 for the 50K and 0.2757 for the subset heritabilities. Whereas all 50K genetic correlations were negative, subset genetic correlations were both, positive and negative (ranging from -0.9324 between MY and mastitis to 0.6662 between MY and digital dermatitis). The subsets containing variants which were annotated as noncoding related, splice sites, untranslated regions, metabolic quantitative trait loci, and young variants ranked highest in terms of their contribution to the traits' genetic variance. We were able to show that linkage disequilibrium between subset variants and adjacent variants did not cause these subsets' high effect.

CONCLUSION

Our results confirm the connection of milk production and health traits in dairy cattle via the animals' metabolic state. In addition, they highlight the potential of including functional information in genomic analyses, which helps to dissect the extent and direction of the observed traits' connection in more detail.

Fernández-González J, Akdemir D, Isidro Y Sánchez J. A comparison of methods for training population optimization in genomic selection. Theoretical and Applied Genetics 2023;136:30. [PMID: 36892603 PMCID: PMC9998580 DOI: 10.1007/s00122-023-04265-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/21/2022] [Indexed: 06/18/2023]

Abstract

Maximizing CDmean and Avg_GRM_self were the best criteria for training set optimization. A training set size of 50-55% (targeted) or 65-85% (untargeted) is needed to obtain 95% of the accuracy. With the advent of genomic selection (GS) as a widespread breeding tool, mechanisms to efficiently design an optimal training set for GS models became more relevant, since they allow maximizing the accuracy while minimizing the phenotyping costs. The literature described many training set optimization methods, but there is a lack of a comprehensive comparison among them. This work aimed to provide an extensive benchmark among optimization methods and optimal training set size by testing a wide range of them in seven datasets, six different species, different genetic architectures, population structure, heritabilities, and with several GS models to provide some guidelines about their application in breeding programs. Our results showed that targeted optimization (uses information from the test set) performed better than untargeted (does not use test set data), especially when heritability was low. The mean coefficient of determination was the best targeted method, although it was computationally intensive. Minimizing the average relationship within the training set was the best strategy for untargeted optimization. Regarding the optimal training set size, maximum accuracy was obtained when the training set was the entire candidate set. Nevertheless, a 50-55% of the candidate set was enough to reach 95-100% of the maximum accuracy in the targeted scenario, while we needed a 65-85% for untargeted optimization. Our results also suggested that a diverse training set makes GS robust against population structure, while including clustering information was less effective. The choice of the GS model did not have a significant influence on the prediction accuracies.

Angarita Barajas BK, Cantet RJC, Steibel JP, Schrauf MF, Forneris NS. Heritability estimates and predictive ability for pig meat quality traits using identity-by-state and identity-by-descent relationships in an F₂ population. J Anim Breed Genet 2023;140:13-27. [PMID: 36300585 DOI: 10.1111/jbg.12742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 10/05/2022] [Indexed: 12/13/2022]

Wang B, Li P, Hou L, Zhou W, Tao W, Liu C, Liu K, Niu P, Zhang Z, Li Q, Su G, Huang R. Genome‐wide association study and genomic prediction for intramuscular fat content in Suhuai pigs using imputed whole‐genome sequencing data. Evol Appl 2022;15:2054-2066. [DOI: 10.1111/eva.13496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 08/22/2022] [Accepted: 10/04/2022] [Indexed: 11/29/2022] Open

Affiliation(s)

Binbin Wang: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark; Huaian Academy, Nanjing Agricultural University, China
Pinghua Li: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China
Liming Hou: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China
Wuduo Zhou: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China
Wei Tao: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China
Chenxi Liu: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China
Kaiyue Liu: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China
Peipei Niu: Huaian Academy, Nanjing Agricultural University, China
Zongping Zhang: Huaian Academy, Nanjing Agricultural University, China
Qiang Li: Huaiyin Xinhuai Pig Breeding Farm of Huaian City, China
Guosheng Su: Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
Ruihua Huang: Key Laboratory in Nanjing for Evaluation and Utilization of Pigs Resources, Ministry of Agriculture and Rural Areas of China, Institute of Swine Science, Nanjing Agricultural University, Nanjing, China; Huaian Academy, Nanjing Agricultural University, China

Mancin E, Mota LFM, Tuliozi B, Verdiglione R, Mantovani R, Sartori C. Improvement of Genomic Predictions in Small Breeds by Construction of Genomic Relationship Matrix Through Variable Selection. Front Genet 2022;13:814264. [PMID: 35664297 PMCID: PMC9158133 DOI: 10.3389/fgene.2022.814264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Accepted: 03/22/2022] [Indexed: 11/13/2022] Open

Abstract

Genomic selection has been increasingly implemented in the animal breeding industry, and it is becoming a routine method in many livestock breeding contexts. However, its use is still limited in several small-population local breeds, which are, nonetheless, an important source of genetic variability of great economic value. A major roadblock for their genomic selection is accuracy when population size is limited: to improve breeding value accuracy, variable selection models that assume heterogenous variance have been proposed over the last few years. However, while these models might outperform traditional and genomic predictions in terms of accuracy, they also carry a proportional increase of breeding value bias and dispersion. These mutual increases are especially striking when genomic selection is performed with a low number of phenotypes and high shrinkage value—which is precisely the situation that happens with small local breeds. In our study, we tested several alternative methods to improve the accuracy of genomic selection in a small population. First, we investigated the impact of using only a subset of informative markers regarding prediction accuracy, bias, and dispersion. We used different algorithms to select them, such as recursive feature eliminations, penalized regression, and XGBoost. We compared our results with the predictions of pedigree-based BLUP, single-step genomic BLUP, and weighted single-step genomic BLUP in different simulated populations obtained by combining various parameters in terms of number of QTLs and effective population size. We also investigated these approaches on a real data set belonging to the small local Rendena breed. Our results show that the accuracy of GBLUP in small-sized populations increased when performed with SNPs selected via variable selection methods both in simulated and real data sets. 
In addition, the use of variable selection models—especially those using XGBoost—in our real data set did not impact bias and the dispersion of estimated breeding values. We have discussed possible explanations for our results and how our study can help estimate breeding values for future genomic selection in small breeds.

Dzievit MJ, Guo T, Li X, Yu J. Comprehensive analytical and empirical evaluation of genomic prediction across diverse accessions in maize. The Plant Genome 2021;14:e20160. [PMID: 34661990 DOI: 10.1002/tpg2.20160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]

Vojgani E, Pook T, Martini JWR, Hölker AC, Mayer M, Schön CC, Simianer H. Accounting for epistasis improves genomic prediction of phenotypes with univariate and bivariate models across environments. Theoretical and Applied Genetics 2021;134:2913-2930. [PMID: 34115154 PMCID: PMC8354961 DOI: 10.1007/s00122-021-03868-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/08/2020] [Accepted: 05/24/2021] [Indexed: 06/12/2023]

Abstract

The accuracy of genomic prediction of phenotypes can be increased by including the top-ranked pairwise SNP interactions into the prediction model. We compared the predictive ability of various prediction models for a maize dataset derived from 910 doubled haploid lines from two European landraces (Kemater Landmais Gelb and Petkuser Ferdinand Rot), which were tested at six locations in Germany and Spain. The compared models were Genomic Best Linear Unbiased Prediction (GBLUP) as an additive model, Epistatic Random Regression BLUP (ERRBLUP) accounting for all pairwise SNP interactions, and selective Epistatic Random Regression BLUP (sERRBLUP) accounting for a selected subset of pairwise SNP interactions. These models have been compared in both univariate and bivariate statistical settings for predictions within and across environments. Our results indicate that modeling all pairwise SNP interactions into the univariate/bivariate model (ERRBLUP) is not superior in predictive ability to the respective additive model (GBLUP). However, incorporating only a selected subset of interactions with the highest effect variances in univariate/bivariate sERRBLUP can increase predictive ability significantly compared to the univariate/bivariate GBLUP. Overall, bivariate models consistently outperform univariate models in predictive ability. Across all studied traits, locations and landraces, the increase in prediction accuracy from univariate GBLUP to univariate sERRBLUP ranged from 5.9 to 112.4 percent, with an average increase of 47 percent. For bivariate models, the change ranged from -0.3 to +27.9 percent comparing the bivariate sERRBLUP to the bivariate GBLUP, with an average increase of 11 percent. This considerable increase in predictive ability achieved by sERRBLUP may be of interest for "sparse testing" approaches in which only a subset of the lines/hybrids of interest is observed at each location.

McGaugh SE, Lorenz AJ, Flagel LE. The utility of genomic prediction models in evolutionary genetics. Proc Biol Sci 2021;288:20210693. [PMID: 34344180 PMCID: PMC8334854 DOI: 10.1098/rspb.2021.0693] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/23/2021] [Accepted: 07/15/2021] [Indexed: 12/25/2022] Open

Dekkers JCM, Su H, Cheng J. Predicting the accuracy of genomic predictions. Genet Sel Evol 2021;53:55. [PMID: 34187354 PMCID: PMC8244147 DOI: 10.1186/s12711-021-00647-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/10/2020] [Accepted: 06/11/2021] [Indexed: 11/22/2022] Open

Abstract

Background

Mathematical models are needed for the design of breeding programs using genomic prediction. While deterministic models for selection on pedigree-based estimates of breeding values (PEBV) are available, these have not been fully developed for genomic selection, with a key missing component being the accuracy of genomic EBV (GEBV) of selection candidates. Here, a deterministic method was developed to predict this accuracy within a closed breeding population based on the accuracy of GEBV and PEBV in the reference population and the distance of selection candidates from their closest ancestors in the reference population.

Methods

The accuracy of GEBV was modeled as a combination of the accuracy of PEBV and of EBV based on genomic relationships deviated from pedigree (DEBV). Loss of the accuracy of DEBV from the reference to the target population was modeled based on the effective number of independent chromosome segments in the reference population (M_e). Measures of M_e derived from the inverse of the variance of relationships and from the accuracies of GEBV and PEBV in the reference population, derived using either a Fisher information or a selection index approach, were compared by simulation.

Results

Using simulation, both the Fisher and the selection index approach correctly predicted accuracy in the target population over time, both with and without selection. The index approach, however, resulted in estimates of M_e that were less affected by heritability, reference size, and selection, and which are, therefore, more appropriate as a population parameter. The variance of relationships underpredicted M_e and was greatly affected by selection. A leave-one-out cross-validation approach was proposed to estimate required accuracies of EBV in the reference population. Aspects of the methods were validated using real data.

Conclusions

A deterministic method was developed to predict the accuracy of GEBV in selection candidates in a closed breeding population. The population parameter M_e that is required for these predictions can be derived from an available reference data set, and applied to other reference data sets and traits for that population. This method can be used to evaluate the benefit of genomic prediction and to optimize genomic selection breeding programs.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12711-021-00647-w.

Salek Ardestani S, Jafarikia M, Sargolzaei M, Sullivan B, Miar Y. Genomic Prediction of Average Daily Gain, Back-Fat Thickness, and Loin Muscle Depth Using Different Genomic Tools in Canadian Swine Populations. Front Genet 2021;12:665344. [PMID: 34149806 PMCID: PMC8209496 DOI: 10.3389/fgene.2021.665344] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/07/2021] [Accepted: 04/15/2021] [Indexed: 12/12/2022] Open

Abstract

Improvement of prediction accuracy of estimated breeding values (EBVs) can lead to increased profitability for swine breeding companies. This study was performed to compare the accuracy of different popular genomic prediction methods and traditional best linear unbiased prediction (BLUP) for future performance of back-fat thickness (BFT), average daily gain (ADG), and loin muscle depth (LMD) in Canadian Duroc, Landrace, and Yorkshire swine breeds. In this study, 17,019 pigs were genotyped using Illumina 60K and Affymetrix 50K panels. After quality control and imputation steps, a total of 41,304, 48,580, and 49,102 single-nucleotide polymorphisms remained for Duroc (n = 6,649), Landrace (n = 5,362), and Yorkshire (n = 5,008) breeds, respectively. The breeding values of animals in the validation groups (n = 392–774) were predicted before performance test using BLUP, BayesC, BayesCπ, genomic BLUP (GBLUP), and single-step GBLUP (ssGBLUP) methods. The prediction accuracies were obtained using the correlation between the predicted breeding values and their deregressed EBVs (dEBVs) after performance test. The genomic prediction methods showed higher prediction accuracies than traditional BLUP for all scenarios. Although the accuracies of genomic prediction methods were not significantly (P > 0.05) different, ssGBLUP was the most accurate method for Duroc-ADG, Duroc-LMD, Landrace-BFT, Landrace-ADG, and Yorkshire-BFT scenarios, and BayesCπ was the most accurate method for Duroc-BFT, Landrace-LMD, and Yorkshire-ADG scenarios. Furthermore, BayesCπ method was the least biased method for Duroc-LMD, Landrace-BFT, Landrace-ADG, Yorkshire-BFT, and Yorkshire-ADG scenarios. Our findings can be beneficial for accelerating the genetic progress of BFT, ADG, and LMD in Canadian swine populations by selecting more accurate and unbiased genomic prediction methods.

Karaman E, Su G, Croue I, Lund MS. Genomic prediction using a reference population of multiple pure breeds and admixed individuals. Genet Sel Evol 2021;53:46. [PMID: 34058971 PMCID: PMC8168010 DOI: 10.1186/s12711-021-00637-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/09/2020] [Accepted: 05/11/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In dairy cattle populations in which crossbreeding has been used, animals show some level of diversity in their origins. In rotational crossbreeding, for instance, crossbred dams are mated with purebred sires from different pure breeds, and the genetic composition of crossbred animals is an admixture of the breeds included in the rotation. How to use the data of such individuals in genomic evaluations is still an open question. In this study, we aimed at providing methodologies for the use of data from crossbred individuals with an admixed genetic background together with data from multiple pure breeds, for the purpose of genomic evaluations for both purebred and crossbred animals. A three-breed rotational crossbreeding system was mimicked using simulations based on animals genotyped with the 50 K single nucleotide polymorphism (SNP) chip.

RESULTS

For purebred populations, within-breed genomic predictions generally led to higher accuracies than those from multi-breed predictions using combined data of pure breeds. Adding admixed population's (MIX) data to the combined pure breed data considering MIX as a different breed led to higher accuracies. When prediction models were able to account for breed origin of alleles, accuracies were generally higher than those from combining all available data, depending on the correlation of quantitative trait loci (QTL) effects between the breeds. Accuracies varied when using SNP effects from any of the pure breeds to predict the breeding values of MIX. Using those breed-specific SNP effects that were estimated separately in each pure breed, while accounting for breed origin of alleles for the selection candidates of MIX, generally improved the accuracies. Models that are able to accommodate MIX data with the breed origin of alleles approach generally led to higher accuracies than models without breed origin of alleles, depending on the correlation of QTL effects between the breeds.

CONCLUSIONS

Combining all available data, pure breeds' and admixed population's data, in a multi-breed reference population is beneficial for the estimation of breeding values for pure breeds with a small reference population. For MIX, such an approach can lead to higher accuracies than considering breed origin of alleles for the selection candidates, and using breed-specific SNP effects estimated separately in each pure breed. Including MIX data in the reference population of multiple breeds by considering the breed origin of alleles, accuracies can be further improved. Our findings are relevant for breeding programs in which crossbreeding is systematically applied, and also for populations that involve different subpopulations and between which exchange of genetic material is routine practice.

Cesarani A, Biffani S, Garcia A, Lourenco D, Bertolini G, Neglia G, Misztal I, Macciotta NPP. Genomic investigation of milk production in Italian buffalo. Italian Journal of Animal Science 2021. [DOI: 10.1080/1828051x.2021.1902404] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 01/18/2023]

Vojgani E, Pook T, Simianer H. Phenotype Prediction Under Epistasis. Methods Mol Biol 2021;2212:105-120. [PMID: 33733353 DOI: 10.1007/978-1-0716-0947-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]

Amini F, Franco FR, Hu G, Wang L. The look ahead trace back optimizer for genomic selection under transparent and opaque simulators. Sci Rep 2021;11:4124. [PMID: 33602979 PMCID: PMC7893003 DOI: 10.1038/s41598-021-83567-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2020] [Accepted: 02/02/2021] [Indexed: 11/29/2022] Open

Abstract

Recent advances in genomic selection (GS) have demonstrated the importance of not only the accuracy of genomic prediction but also the intelligence of selection strategies. The look ahead selection algorithm, for example, has been found to significantly outperform the widely used truncation selection approach in terms of genetic gain, thanks to its strategy of selecting breeding parents that may not necessarily be elite themselves but have the best chance of producing elite progeny in the future. This paper presents the look ahead trace back algorithm as a new variant of the look ahead approach, which introduces several improvements to further accelerate genetic gain especially under imperfect genomic prediction. Perhaps an even more significant contribution of this paper is the design of opaque simulators for evaluating the performance of GS algorithms. These simulators are partially observable, explicitly capture both additive and non-additive genetic effects, and simulate uncertain recombination events more realistically. In contrast, most existing GS simulation settings are transparent, either explicitly or implicitly allowing the GS algorithm to exploit certain critical information that may not be possible in actual breeding programs. Comprehensive computational experiments were carried out using a maize data set to compare a variety of GS algorithms under four simulators with different levels of opacity. These results reveal how differently a same GS algorithm would interact with different simulators, suggesting the need for continued research in the design of more realistic simulators. As long as GS algorithms continue to be trained in silico rather than in planta, the best way to avoid disappointing discrepancy between their simulated and actual performances may be to make the simulator as akin to the complex and opaque nature as possible.

Xiang R, MacLeod IM, Daetwyler HD, de Jong G, O’Connor E, Schrooten C, Chamberlain AJ, Goddard ME. Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations. Nat Commun 2021;12:860. [PMID: 33558518 PMCID: PMC7870883 DOI: 10.1038/s41467-021-21001-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Accepted: 11/23/2020] [Indexed: 02/08/2023] Open

Farooq M, van Dijk ADJ, Nijveen H, Aarts MGM, Kruijer W, Nguyen TP, Mansoor S, de Ridder D. Prior Biological Knowledge Improves Genomic Prediction of Growth-Related Traits in Arabidopsis thaliana. Front Genet 2021;11:609117. [PMID: 33552126 PMCID: PMC7855462 DOI: 10.3389/fgene.2020.609117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2020] [Accepted: 12/21/2020] [Indexed: 01/11/2023] Open

Abstract

Prediction of growth-related complex traits is highly important for crop breeding. Photosynthesis efficiency and biomass are direct indicators of overall plant performance and therefore even minor improvements in these traits can result in significant breeding gains. Crop breeding for complex traits has been revolutionized by technological developments in genomics and phenomics. Capitalizing on the growing availability of genomics data, genome-wide marker-based prediction models allow for efficient selection of the best parents for the next generation without the need for phenotypic information. Until now such models mostly predict the phenotype directly from the genotype and fail to make use of relevant biological knowledge. It is an open question to what extent the use of such biological knowledge is beneficial for improving genomic prediction accuracy and reliability. In this study, we explored the use of publicly available biological information for genomic prediction of photosynthetic light use efficiency (ΦPSII) and projected leaf area (PLA) in Arabidopsis thaliana. To explore the use of various types of knowledge, we mapped genomic polymorphisms to Gene Ontology (GO) terms and transcriptomics-based gene clusters, and applied these in a Genomic Feature Best Linear Unbiased Predictor (GFBLUP) model, which is an extension to the traditional Genomic BLUP (GBLUP) benchmark. Our results suggest that incorporation of prior biological knowledge can improve genomic prediction accuracy for both ΦPSII and PLA. The improvement achieved depends on the trait, type of knowledge and trait heritability. Moreover, transcriptomics offers complementary evidence to the Gene Ontology for improvement when used to define functional groups of genes. In conclusion, prior knowledge about trait-specific groups of genes can be directly translated into improved genomic prediction.

Naserkheil M, Lee DH, Mehrban H. Improving the accuracy of genomic evaluation for linear body measurement traits using single-step genomic best linear unbiased prediction in Hanwoo beef cattle. BMC Genet 2020;21:144. [PMID: 33267771 PMCID: PMC7709290 DOI: 10.1186/s12863-020-00928-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2020] [Accepted: 10/27/2020] [Indexed: 12/20/2022] Open

Abstract

BACKGROUND

Recently, there has been a growing interest in the genetic improvement of body measurement traits in farm animals. They are widely used as predictors of performance, longevity, and production traits, and it is worthwhile to investigate the prediction accuracies of genomic selection for these traits. In genomic prediction, the single-step genomic best linear unbiased prediction (ssGBLUP) method allows the inclusion of information from genotyped and non-genotyped relatives in the analysis. Hence, we aimed to compare the prediction accuracy obtained from a pedigree-based BLUP only on genotyped animals (PBLUP-G), a traditional pedigree-based BLUP (PBLUP), a genomic BLUP (GBLUP), and a single-step genomic BLUP (ssGBLUP) method for the following 10 body measurement traits at yearling age of Hanwoo cattle: body height (BH), body length (BL), chest depth (CD), chest girth (CG), chest width (CW), hip height (HH), hip width (HW), rump length (RL), rump width (RW), and thurl width (TW). The data set comprised 13,067 phenotypic records for body measurement traits and 1523 genotyped animals with 34,460 single-nucleotide polymorphisms. The accuracy for each trait and model was estimated only for genotyped animals using five-fold cross-validations.

RESULTS

The accuracies ranged from 0.02 to 0.19, 0.22 to 0.42, 0.21 to 0.44, and from 0.36 to 0.55 as assessed using the PBLUP-G, PBLUP, GBLUP, and ssGBLUP methods, respectively. The average predictive accuracies across traits were 0.13 for PBLUP-G, 0.34 for PBLUP, 0.33 for GBLUP, and 0.45 for ssGBLUP methods. Our results demonstrated that averaged across all traits, ssGBLUP outperformed PBLUP and GBLUP by 33 and 43%, respectively, in terms of prediction accuracy. Moreover, the least root of mean square error was obtained by ssGBLUP method.

CONCLUSIONS

Our findings suggest that considering the ssGBLUP model may be a promising way to ensure acceptable accuracy of predictions for body measurement traits, especially for improving the prediction accuracy of selection candidates in ongoing Hanwoo breeding programs.

Yu X, Leiboff S, Li X, Guo T, Ronning N, Zhang X, Muehlbauer GJ, Timmermans MC, Schnable PS, Scanlon MJ, Yu J. Genomic prediction of maize microphenotypes provides insights for optimizing selection and mining diversity. PLANT BIOTECHNOLOGY JOURNAL 2020;18:2456-2465. [PMID: 32452105 PMCID: PMC7680549 DOI: 10.1111/pbi.13420] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/22/2019] [Revised: 05/05/2020] [Accepted: 05/13/2020] [Indexed: 05/25/2023]

Garcia ALS, Masuda Y, Tsuruta S, Miller S, Misztal I, Lourenco D. Indirect predictions with a large number of genotyped animals using the algorithm for proven and young. J Anim Sci 2020;98:5831156. [PMID: 32374831 PMCID: PMC7263398 DOI: 10.1093/jas/skaa154] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2020] [Accepted: 04/30/2020] [Indexed: 11/21/2022] Open

Abstract

Reliable single-nucleotide polymorphism (SNP) effects from genomic best linear unbiased prediction (GBLUP) and single-step GBLUP (ssGBLUP) are needed to calculate indirect predictions (IP) for young genotyped animals and animals not included in official evaluations. Obtaining reliable SNP effects and IP requires a minimum number of animals and when a large number of genotyped animals are available, the algorithm for proven and young (APY) may be needed. Thus, the objectives of this study were to evaluate IP with an increasingly larger number of genotyped animals and to determine the minimum number of animals needed to compute reliable SNP effects and IP. Genotypes and phenotypes for birth weight, weaning weight, and postweaning gain were provided by the American Angus Association. The number of animals with phenotypes was more than 3.8 million. Genotyped animals were assigned to three cumulative year-classes: born until 2013 (N = 114,937), born until 2014 (N = 183,847), and born until 2015 (N = 280,506). A three-trait model was fitted using the APY algorithm with 19,021 core animals under two scenarios: 1) core 2013 (random sample of animals born until 2013) used for all year-classes and 2) core 2014 (random sample of animals born until 2014) used for year-class 2014 and core 2015 (random sample of animals born until 2015) used for year-class 2015. GBLUP used phenotypes from genotyped animals only, whereas ssGBLUP used all available phenotypes. SNP effects were predicted using genomic estimated breeding values (GEBV) from either all genotyped animals or only core animals. The correlations between GEBV from GBLUP and IP obtained using SNP effects from core 2013 were ≥0.99 for animals born in 2013 but as low as 0.07 for animals born in 2014 and 2015. Conversely, the correlations between GEBV from ssGBLUP and IP were ≥0.99 for animals born in all years.
IP predictive abilities computed with GEBV from ssGBLUP and SNP predictions based on only core animals were as high as those based on all genotyped animals. The correlations between GEBV and IP from ssGBLUP were ≥0.76, ≥0.90, and ≥0.98 when SNP effects were computed using 2k, 5k, and 15k core animals. Suitable IP based on GEBV from GBLUP can be obtained when SNP predictions are based on an appropriate number of core animals, but a considerable decline in IP accuracy can occur in subsequent years. Conversely, IP from ssGBLUP based on large numbers of phenotypes from non-genotyped animals have persistent accuracy over time.

Gualdrón Duarte JL, Gori AS, Hubin X, Lourenco D, Charlier C, Misztal I, Druet T. Performances of Adaptive MultiBLUP, Bayesian regressions, and weighted-GBLUP approaches for genomic predictions in Belgian Blue beef cattle. BMC Genomics 2020;21:545. [PMID: 32762654 PMCID: PMC7430838 DOI: 10.1186/s12864-020-06921-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2020] [Accepted: 07/17/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Genomic selection has been successfully implemented in many livestock and crop species. The genomic best linear unbiased predictor (GBLUP) approach, assigning equal variance to all SNP effects, is one of the reference methods. When large-effect variants contribute to complex traits, it has been shown that genomic prediction methods that assign a higher variance to subsets of SNP effects can achieve higher prediction accuracy. We herein compared the efficiency of several such approaches, including the Adaptive MultiBLUP (AM-BLUP) that uses local genomic relationship matrices (GRM) to automatically identify and weight genomic regions with large effects, to predict genetic merit in Belgian Blue beef cattle.

RESULTS

We used a population of approximately 10,000 genotyped cows and their phenotypes for 14 traits, mostly related to muscular development and body dimensions. According to the trait, we found that 4 to 25% of the genetic variance could be associated with 2 to 12 genomic regions harbouring large-effect variants. Notably, three previously identified recessive deleterious variants presented heterozygote advantage and were among the most significant SNPs for several traits. The AM-BLUP resulted in increased reliability of genomic predictions compared to GBLUP (+2%), but Bayesian methods proved more efficient (+3%). Overall, the reliability gains thus remained limited, although higher gains were observed for skin thickness, a trait affected by two genomic regions having particularly large effects. Higher accuracies than those from the original AM-BLUP were achieved when applying the Bayesian Sparse Linear Mixed Model to pre-select groups of SNPs with large effects and subsequently use their estimated variance to build a weighted GRM. Finally, the single-step GBLUP performed best and could be further improved (+3% prediction accuracy) by using these weighted GRM.

CONCLUSIONS

The AM-BLUP is an attractive method to automatically identify and weight genomic regions with large effects on complex traits. However, the method was less accurate than Bayesian methods. Overall, weighted methods achieved modest accuracy gains compared to GBLUP. Nevertheless, the computational efficiency of the AM-BLUP might be valuable at higher marker density, including with whole-genome sequencing data. Furthermore, weighted GRM are particularly useful to account for large variance loci in the single-step GBLUP.
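
The statement in the background that GBLUP assigns equal variance to all SNP effects can be illustrated with a small numerical sketch. It is not taken from the cited paper: the function names, the toy data and the ratio `lam` (residual variance divided by the common per-SNP effect variance) are assumptions made for illustration. Under that equal-variance assumption, ridge regression on the SNPs (SNP-BLUP) and GBLUP on the unscaled relationship matrix ZZ' give the same genomic values.

```python
import numpy as np

def snp_blup_gebv(z, y, lam):
    """Genomic values from ridge regression on SNPs: Z (Z'Z + lam I)^-1 Z'y.

    z   : (animals x SNPs) centered gene-content matrix
    y   : centered phenotypes
    lam : assumed ratio of residual variance to per-SNP effect variance
    """
    n_snps = z.shape[1]
    snp_effects = np.linalg.solve(z.T @ z + lam * np.eye(n_snps), z.T @ y)
    return z @ snp_effects

def gblup_gebv(z, y, lam):
    """Genomic values from the animal-level form: ZZ' (ZZ' + lam I)^-1 y."""
    n_animals = z.shape[0]
    k = z @ z.T  # unscaled genomic relationship matrix
    return k @ np.linalg.solve(k + lam * np.eye(n_animals), y)

# Toy data (hypothetical): 8 animals, 30 SNPs.
rng = np.random.default_rng(0)
z_demo = rng.normal(size=(8, 30))
y_demo = rng.normal(size=8)
y_demo -= y_demo.mean()
same = np.allclose(snp_blup_gebv(z_demo, y_demo, 5.0),
                   gblup_gebv(z_demo, y_demo, 5.0))
print("SNP-BLUP and GBLUP agree:", same)
```

Weighted approaches such as those compared in the abstract depart from this by giving some SNPs a larger variance, which breaks the simple equivalence with an unweighted relationship matrix.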

Lourenco D, Legarra A, Tsuruta S, Masuda Y, Aguilar I, Misztal I. Single-Step Genomic Evaluations from Theory to Practice: Using SNP Chips and Sequence Data in BLUPF90. Genes (Basel) 2020;11:E790. [PMID: 32674271 PMCID: PMC7397237 DOI: 10.3390/genes11070790] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2020] [Revised: 07/03/2020] [Accepted: 07/06/2020] [Indexed: 11/16/2022] Open

Ben Zaabza H, Mäntysaari EA, Strandén I. Using Monte Carlo method to include polygenic effects in calculation of SNP-BLUP model reliability. J Dairy Sci 2020;103:5170-5182. [PMID: 32253036 DOI: 10.3168/jds.2019-17255] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2019] [Accepted: 02/04/2020] [Indexed: 11/19/2022]

Misztal I, Lourenco D, Legarra A. Current status of genomic evaluation. J Anim Sci 2020;98:skaa101. [PMID: 32267923 PMCID: PMC7183352 DOI: 10.1093/jas/skaa101] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2020] [Accepted: 04/07/2020] [Indexed: 12/14/2022] Open

Abstract

Early application of genomic selection relied on SNP estimation with phenotypes or de-regressed proofs (DRP). Chips of 50k SNP seemed sufficient for an accurate estimation of SNP effects. Genomic estimated breeding values (GEBV) were composed of an index with parent average, direct genomic value, and deduction of a parental index to eliminate double counting. Use of SNP selection or weighting increased accuracy with small data sets but had minimal to no impact with large data sets. Efforts to include potentially causative SNP derived from sequence data or high-density chips showed limited or no gain in accuracy. After the implementation of genomic selection, EBV by BLUP became biased because of genomic preselection and DRP computed based on EBV required adjustments, and the creation of DRP for females is hard and subject to double counting. Genomic selection was greatly simplified by single-step genomic BLUP (ssGBLUP). This method based on combining genomic and pedigree relationships automatically creates an index with all sources of information, can use any combination of male and female genotypes, and accounts for preselection. To avoid biases, especially under strong selection, ssGBLUP requires that pedigree and genomic relationships are compatible. Because the inversion of the genomic relationship matrix (G) becomes costly with more than 100k genotyped animals, large data computations in ssGBLUP were solved by exploiting limited dimensionality of genomic data due to limited effective population size. With such dimensionality ranging from 4k in chickens to about 15k in cattle, the inverse of G can be created directly (e.g., by the algorithm for proven and young) at a linear cost. Due to its simplicity and accuracy, ssGBLUP is routinely used for genomic selection by the major chicken, pig, and beef industries. Single step can be used to derive SNP effects for indirect prediction and for genome-wide association studies, including computations of the P-values. 
Alternative single-step formulations exist that use SNP effects for genotyped or for all animals. Although genomics is the new standard in breeding and genetics, there are still some problems that need to be solved. This involves new validation procedures that are unaffected by selection, parameter estimation that accounts for all the genomic data used in selection, and strategies to address reduction in genetic variances after genomic selection was implemented.

Karaman E, Lund MS, Su G. Multi-trait single-step genomic prediction accounting for heterogeneous (co)variances over the genome. Heredity (Edinb) 2020;124:274-287. [PMID: 31641237 PMCID: PMC6972913 DOI: 10.1038/s41437-019-0273-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2019] [Revised: 09/05/2019] [Accepted: 09/06/2019] [Indexed: 11/23/2022] Open

Pocrnic I, Lourenco DAL, Masuda Y, Misztal I. Accuracy of genomic BLUP when considering a genomic relationship matrix based on the number of the largest eigenvalues: a simulation study. Genet Sel Evol 2019;51:75. [PMID: 31830899 PMCID: PMC6907194 DOI: 10.1186/s12711-019-0516-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Accepted: 12/04/2019] [Indexed: 12/20/2022] Open

Abstract

Background

The dimensionality of genomic information is limited by the number of independent chromosome segments (M_e), which is a function of the effective population size. This dimensionality can be determined approximately by singular value decomposition of the gene content matrix, by eigenvalue decomposition of the genomic relationship matrix (GRM), or by the number of core animals in the algorithm for proven and young (APY) that maximizes the accuracy of genomic prediction. In the latter, core animals act as proxies to linear combinations of M_e. Field studies indicate that a moderate accuracy of genomic selection is achieved with a small dataset, but that further improvement of the accuracy requires much more data. When only one quarter of the optimal number of core animals are used in the APY algorithm, the accuracy of genomic selection is only slightly below the optimal value. This suggests that genomic selection works on clusters of M_e.

Results

The simulation included datasets with different population sizes and amounts of phenotypic information. Computations were done by genomic best linear unbiased prediction (GBLUP) with selected eigenvalues and corresponding eigenvectors of the GRM set to zero. About four eigenvalues in the GRM explained 10% of the genomic variation, and less than 2% of the total eigenvalues explained 50% of the genomic variation. With limited phenotypic information, the accuracy of GBLUP was close to the peak where most of the smallest eigenvalues were set to zero. With a large amount of phenotypic information, accuracy increased as smaller eigenvalues were added.

Conclusions

A small amount of phenotypic data is sufficient to estimate only the effects of the largest eigenvalues and the associated eigenvectors that contain a large fraction of the genomic information, and a very large amount of data is required to estimate the remaining eigenvalues that account for a limited amount of genomic information. Core animals in the APY algorithm act as proxies of almost the same number of eigenvalues. By using an eigenvalues-based approach, it was possible to explain why the moderate accuracy of genomic selection based on small datasets only increases slowly as more data are added.
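
The eigenvalue-based view described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the function names and the simulated genotypes are assumptions, and the GRM is built with the commonly used centered gene-content form scaled by 2Σp(1−p). It counts how many of the largest eigenvalues are needed to explain a given fraction of the genomic variation and rebuilds a GRM with the remaining eigenvalues set to zero.

```python
import numpy as np

def vanraden_grm(genotypes):
    """GRM from an (animals x SNPs) matrix of 0/1/2 allele counts."""
    p = genotypes.mean(axis=0) / 2.0          # observed allele frequencies
    z = genotypes - 2.0 * p                   # centered gene content
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def n_eigenvalues_for_fraction(grm, fraction):
    """Number of largest eigenvalues whose sum reaches `fraction` of the total."""
    eigvals = np.clip(np.linalg.eigvalsh(grm)[::-1], 0.0, None)  # descending
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(eigvals))

def truncate_grm(grm, n_keep):
    """Rebuild the GRM keeping only the n_keep largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(grm)
    keep = np.argsort(eigvals)[::-1][:n_keep]
    return (eigvecs[:, keep] * eigvals[keep]) @ eigvecs[:, keep].T

# Simulated genotypes (hypothetical): 20 animals, 200 SNPs.
rng = np.random.default_rng(0)
genotypes_demo = rng.integers(0, 3, size=(20, 200)).astype(float)
grm_demo = vanraden_grm(genotypes_demo)
print(n_eigenvalues_for_fraction(grm_demo, 0.5),
      n_eigenvalues_for_fraction(grm_demo, 0.9))
```

Random, unrelated genotypes like these spread variation fairly evenly across eigenvalues; the concentration reported in the abstract arises from the population structure of real or properly simulated pedigrees.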

Fragomeni BO, Lourenco DAL, Legarra A, VanRaden PM, Misztal I. Alternative SNP weighting for single-step genomic best linear unbiased predictor evaluation of stature in US Holsteins in the presence of selected sequence variants. J Dairy Sci 2019;102:10012-10019. [PMID: 31495612 DOI: 10.3168/jds.2019-16262] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2019] [Accepted: 07/16/2019] [Indexed: 11/19/2022]

Abstract

Causal variants inferred from sequence data analysis are expected to increase accuracy of genomic selection. In this work we evaluated the gain in reliability of genomic predictions, for stature in US Holsteins, when adding selected sequence variants to a pre-existent SNP chip. Two prediction methods were tested: genomic BLUP (GBLUP) using de-regressed proofs and assuming heterogeneous residual variances, and single-step GBLUP (ssGBLUP) using actual phenotypes. Phenotypic data included 3,999,631 records for stature on 3,027,304 Holstein cows. Genotypes on 54,087 SNP markers (54k) were available for 26,877 bulls. Additionally, 16,648 selected sequence variants were combined with the 54k markers, for a total of 70,735 (70k) markers. In all methods, SNP in the genomic relationship matrix (G) were unweighted or weighted iteratively, with weights derived either by SNP effects squared or by a nonlinear method that resembles BayesA (nonlinear A). Reliabilities of genomic predictions were obtained by cross validation. With unweighted G derived from 54k markers, the reliabilities (× 100) were 72.4 for GBLUP and 75.3 for ssGBLUP. With unweighted G derived from 70k markers, the reliabilities were 73.4 and 76.0, respectively. Weighting by nonlinear A changed reliabilities to 73.3 and 75.9, respectively. Addition of selected sequence variants had a small effect on reliabilities. Weighting by quadratic functions reduced reliabilities. Weighting by nonlinear A increased reliabilities for GBLUP but had only a small effect in ssGBLUP. Reliabilities for direct genomic values extracted from ssGBLUP using unweighted G with 54k were higher than reliabilities by any GBLUP. Thus, ssGBLUP seems to capture more information than GBLUP and there is less room for extra reliability. 
Improvements in GBLUP may be because the weights in G change the covariance structure, which can explain a proportion of the variance that is accounted for when a heterogeneous residual variance is assumed by considering a different number of daughters per bull.

Improvement of genomic prediction by integrating additional single nucleotide polymorphisms selected from imputed whole genome sequencing data. Heredity (Edinb) 2019;124:37-49. [PMID: 31278370 PMCID: PMC6906477 DOI: 10.1038/s41437-019-0246-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2019] [Revised: 05/11/2019] [Accepted: 06/17/2019] [Indexed: 11/10/2022] Open

Hao Y, Wang H, Yang X, Zhang H, He C, Li D, Li H, Wang G, Wang J, Fu J. Genomic Prediction using Existing Historical Data Contributing to Selection in Biparental Populations: A Study of Kernel Oil in Maize. THE PLANT GENOME 2019;12. [PMID: 30951098 DOI: 10.3835/plantgenome2018.05.0025] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]

Impact of genotyping strategy on the accuracy of genomic prediction in simulated populations of purebred swine. Animal 2019;13:1804-1810. [PMID: 30616709 DOI: 10.1017/s1751731118003567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Genomic Prediction Using Multi-trait Weighted GBLUP Accounting for Heterogeneous Variances and Covariances Across the Genome. G3-GENES GENOMES GENETICS 2018;8:3549-3558. [PMID: 30194089 PMCID: PMC6222589 DOI: 10.1534/g3.118.200673] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Cheng H, Kizilkaya K, Zeng J, Garrick D, Fernando R. Genomic Prediction from Multiple-Trait Bayesian Regression Methods Using Mixture Priors. Genetics 2018. [PMID: 29514861 DOI: 10.1534/genetics.118.300650/-/dc1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/01/2023] Open

Zeng J, Garrick D, Dekkers J, Fernando R. A nested mixture model for genomic prediction using whole-genome SNP genotypes. PLoS One 2018;13:e0194683. [PMID: 29561877 PMCID: PMC5862491 DOI: 10.1371/journal.pone.0194683] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2017] [Accepted: 03/07/2018] [Indexed: 11/19/2022] Open

Abstract

Genomic prediction exploits single nucleotide polymorphisms (SNPs) across the whole genome for predicting genetic merit of selection candidates. In most models for genomic prediction, e.g. BayesA, B, C, R and GBLUP, independence of SNP effects is assumed. However, SNP effects are expected to be locally dependent given the presence of a nearby QTL because SNPs surrounding the QTL do not segregate independently. A consequence of ignoring this dependence is that SNPs with small effects may be overly shrunk, e.g. effects from markers with high minor allele frequencies (MAF) that flank QTL with low MAF. A nested mixture model (BayesN) is developed to account for the dependence of effects of SNPs that are closely linked, where the effects of SNPs in every non-overlapping genomic window a priori follow a point mass at zero for all SNPs or a mixture of some SNPs with nonzero effects and others with zero effects. It can be regarded as a parsimonious alternative to the existing antedependence model, antiBayesB, which allows a nonstationary dependence of SNP effects. Illumina 777K BovineHD genotypes from 948 Angus cattle were used to simulate 5,000 offspring, with 4,000 used for training and 1,000 for validation. Scenarios with 300 common (MAF > 0.05) or rare (MAF < 0.05) QTL randomly selected from segregating SNPs were replicated 8 times. SNPs corresponding to QTL were masked from a 600k panel comprising SNPs with MAF > 0.05 or a 50k evenly spaced subset of these. Compared with BayesB and a modified antiBayesB, BayesN improved the accuracy of prediction up to 2.0% with 50k SNPs and up to 7.0% with 600k SNPs, most improvements occurring in the rare QTL scenario. Computing time was reduced up to 60% with 50k SNPs and up to 75% with 600k SNPs. BayesN is an accurate and computationally efficient method for genomic prediction with whole-genome SNPs, especially for traits with rare QTL.

Genomic Prediction from Multiple-Trait Bayesian Regression Methods Using Mixture Priors. Genetics 2018. [PMID: 29514861 DOI: 10.1534/genetics.118.300650] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Morota G. ShinyGPAS: interactive genomic prediction accuracy simulator based on deterministic formulas. Genet Sel Evol 2017;49:91. [PMID: 29262775 PMCID: PMC5738850 DOI: 10.1186/s12711-017-0368-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2017] [Accepted: 12/11/2017] [Indexed: 11/10/2022] Open

Factors affecting GEBV accuracy with single-step Bayesian models. Heredity (Edinb) 2017;120:100-109. [PMID: 29167557 DOI: 10.1038/s41437-017-0010-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2017] [Revised: 09/04/2017] [Accepted: 09/14/2017] [Indexed: 12/23/2022] Open

Lourenco DAL, Fragomeni BO, Bradford HL, Menezes IR, Ferraz JBS, Aguilar I, Tsuruta S, Misztal I. Implications of SNP weighting on single-step genomic predictions for different reference population sizes. J Anim Breed Genet 2017;134:463-471. [PMID: 28833593 DOI: 10.1111/jbg.12288] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2017] [Accepted: 07/19/2017] [Indexed: 01/20/2023]

Fragomeni BO, Lourenco DAL, Masuda Y, Legarra A, Misztal I. Incorporation of causative quantitative trait nucleotides in single-step GBLUP. Genet Sel Evol 2017;49:59. [PMID: 28747171 PMCID: PMC5530494 DOI: 10.1186/s12711-017-0335-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2017] [Accepted: 07/17/2017] [Indexed: 11/23/2022] Open

Abstract

Background

Much effort is put into identifying causative quantitative trait nucleotides (QTN) in animal breeding, empowered by the availability of dense single nucleotide polymorphism (SNP) information. Genomic selection using traditional SNP information is easily implemented for any number of genotyped individuals using single-step genomic best linear unbiased predictor (ssGBLUP) with the algorithm for proven and young (APY). Our aim was to investigate whether ssGBLUP is useful for genomic prediction when some or all QTN are known.

Methods

Simulations included 180,000 animals across 11 generations. Phenotypes were available for all animals in generations 6 to 10. Genotypes for 60,000 SNPs across 10 chromosomes were available for 29,000 individuals. The genetic variance was fully accounted for by 100 or 1000 biallelic QTN. Raw genomic relationship matrices (GRM) were computed from (a) unweighted SNPs, (b) unweighted SNPs and causative QTN, (c) SNPs and causative QTN weighted with results obtained with genome-wide association studies, (d) unweighted SNPs and causative QTN with simulated weights, (e) only unweighted causative QTN, (f–h) as in (b–d) but using only the top 10% causative QTN, and (i) using only causative QTN with simulated weight. Predictions were computed by pedigree-based BLUP (PBLUP) and ssGBLUP. Raw GRM were blended with 1 or 5% of the numerator relationship matrix, or 1% of the identity matrix. Inverses of GRM were obtained directly or with APY.

Results

Accuracy of breeding values for 5000 genotyped animals in the last generation with PBLUP was 0.32, and for ssGBLUP it increased to 0.49 with an unweighted GRM, 0.53 after adding unweighted QTN, 0.63 when QTN weights were estimated, and 0.89 when QTN weights were based on true effects known from the simulation. When the GRM was constructed from causative QTN only, accuracy was 0.95 and 0.99 with blending at 5 and 1%, respectively. Accuracies simulating 1000 QTN were generally lower, with a similar trend. Accuracies using the APY inverse were equal to or higher than those with a regular inverse.

Conclusions

Single-step GBLUP can account for causative QTN via a weighted GRM. Accuracy gains are maximum when variances of causative QTN are known and blending is at 1%.
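
The two operations named in this abstract, weighting markers in the GRM and blending the raw GRM with another matrix, can be sketched as follows. This is a minimal sketch under stated assumptions rather than the procedure used in the cited study: the function names are hypothetical, the weighted GRM is written as Z D Z' / (2Σp(1−p)) with D a diagonal matrix of marker weights, and blending is taken to be a simple convex combination.

```python
import numpy as np

def weighted_grm(genotypes, weights):
    """Weighted GRM: Z D Z' / (2 * sum p(1-p)), with D = diag(weights).

    genotypes : (animals x markers) matrix of 0/1/2 allele counts
    weights   : one non-negative weight per marker (all ones = unweighted)
    """
    p = genotypes.mean(axis=0) / 2.0
    z = genotypes - 2.0 * p
    return (z * weights) @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def blend(grm, other, fraction):
    """Blend a raw GRM with `other` (e.g. pedigree matrix or identity)."""
    return (1.0 - fraction) * grm + fraction * other

# Simulated genotypes (hypothetical): 10 animals, 50 markers.
rng = np.random.default_rng(0)
genotypes_demo = rng.integers(0, 3, size=(10, 50)).astype(float)
raw = weighted_grm(genotypes_demo, np.ones(50))
blended = blend(raw, np.eye(10), 0.01)   # 1% of the identity matrix
print(np.linalg.eigvalsh(raw).min(), np.linalg.eigvalsh(blended).min())
```

A raw GRM built from centered genotypes is singular, so blending is one way to make it invertible; the smallest eigenvalue printed for the blended matrix is strictly positive.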

Neyhart JL, Tiede T, Lorenz AJ, Smith KP. Evaluating Methods of Updating Training Data in Long-Term Genomewide Selection. G3 (BETHESDA, MD.) 2017;7:1499-1510. [PMID: 28315831 PMCID: PMC5427505 DOI: 10.1534/g3.117.040550] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/14/2017] [Accepted: 03/10/2017] [Indexed: 12/22/2022]

Application of Whole-Genome Prediction Methods for Genome-Wide Association Studies: A Bayesian Approach. JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS 2017. [DOI: 10.1007/s13253-017-0277-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]

Brown PJ. Plant breeding: Effective use of genetic diversity. NATURE PLANTS 2016;2:16154. [PMID: 27701395 DOI: 10.1038/nplants.2016.154] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]

Yu X, Li X, Guo T, Zhu C, Wu Y, Mitchell SE, Roozeboom KL, Wang D, Wang ML, Pederson GA, Tesso TT, Schnable PS, Bernardo R, Yu J. Genomic prediction contributing to a promising global strategy to turbocharge gene banks. NATURE PLANTS 2016;2:16150. [PMID: 27694945 DOI: 10.1038/nplants.2016.150] [Citation(s) in RCA: 121] [Impact Index Per Article: 15.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/24/2016] [Accepted: 08/31/2016] [Indexed: 05/18/2023]