Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Raymond B, Bouwman AC, Wientjes YCJ, Schrooten C, Houwing-Duistermaat J, Veerkamp RF. Genomic prediction for numerically small breeds, using models with pre-selected and differentially weighted markers. Genet Sel Evol 2018;50:49. [PMID: 30314431 PMCID: PMC6186145 DOI: 10.1186/s12711-018-0419-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/20/2018] [Accepted: 10/01/2018] [Indexed: 01/22/2023]

Cited by Other Article(s)

Calderón-Chagoya R, Vega-Murillo VE, García-Ruiz A, Ríos-Utrera Á, Martínez-Velázquez G, Montaño-Bermúdez M. Discovering Genomic Regions Associated with Reproductive Traits and Frame Score in Mexican Simmental and Simbrah Cattle Using Individual SNP and Haplotype Markers. Genes (Basel) 2023;14:2004. [PMID: 38002947 PMCID: PMC10671695 DOI: 10.3390/genes14112004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 10/11/2023] [Accepted: 10/20/2023] [Indexed: 11/26/2023]

Zhu D, Zhao Y, Zhang R, Wu H, Cai G, Wu Z, Wang Y, Hu X. Genomic prediction based on selective linkage disequilibrium pruning of low-coverage whole-genome sequence variants in a pure Duroc population. Genet Sel Evol 2023;55:72. [PMID: 37853325 PMCID: PMC10583454 DOI: 10.1186/s12711-023-00843-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Accepted: 09/14/2023] [Indexed: 10/20/2023]

Abstract

BACKGROUND

Although the accumulation of whole-genome sequencing (WGS) data has accelerated the identification of mutations underlying complex traits, its impact on the accuracy of genomic predictions is limited. Reliable genotyping data and pre-selected beneficial loci can be used to improve prediction accuracy. Previously, we reported a low-coverage sequencing genotyping method that yielded 11.3 million highly accurate single-nucleotide polymorphisms (SNPs) in pigs. Here, we introduce a method termed selective linkage disequilibrium pruning (SLDP), which refines the set of SNPs that show a large gain during prediction of complex traits using whole-genome SNP data.

RESULTS

We used the SLDP method to identify and select markers among millions of SNPs based on genome-wide association study (GWAS) prior information. We evaluated the performance of SLDP with respect to three real traits and six simulated traits with varying genetic architectures using two representative models (genomic best linear unbiased prediction and BayesR) on samples from 3579 Duroc boars. SLDP was determined by testing 180 combinations of two core parameters (GWAS P-value thresholds and linkage disequilibrium r²). The parameters for each trait were optimized in the training population by five-fold cross-validation and then tested in the validation population. Similar to previous GWAS prior-based methods, the performance of SLDP was mainly affected by the genetic architecture of the traits analyzed. Specifically, SLDP performed better for traits controlled by major quantitative trait loci (QTL) or a small number of quantitative trait nucleotides (QTN). Compared with two commercial SNP chips, genotyping-by-sequencing data, and an unselected whole-genome SNP panel, the SLDP strategy led to significant improvements in prediction accuracy, which ranged from 0.84 to 3.22% for real traits controlled by major or moderate QTL and from 1.23 to 11.47% for simulated traits controlled by a small number of QTN.

CONCLUSIONS

The SLDP marker selection method can be incorporated into mainstream prediction models to yield accuracy improvements for traits with a relatively simple genetic architecture; however, it has no significant advantage for traits not controlled by major QTL. The main factors that affect its performance are the genetic architecture of traits and the reliability of GWAS prior information. Our findings can facilitate the application of WGS-based genomic selection.
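To make the two SLDP parameters concrete, the following is a minimal sketch of marker selection driven by a GWAS P-value threshold and an LD r² threshold. It is not the authors' algorithm, which the abstract does not spell out. It assumes one plausible reading: SNPs passing the P-value threshold are visited from most to least significant, and a SNP is dropped if it is in high LD with one already kept. The function name, the toy genotypes, and the use of squared Pearson correlation as the r² measure are all illustrative assumptions.

```python
import numpy as np

def prune_by_pvalue_and_ld(genotypes, p_values, p_threshold, r2_threshold):
    """Greedy LD pruning that visits SNPs from most to least significant.

    Hypothetical helper: a sketch of P-value-guided pruning, not the
    published SLDP procedure.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    # Squared Pearson correlation between SNP columns, used here as the LD r^2.
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    # Keep only SNPs that pass the GWAS threshold, most significant first.
    candidates = [j for j in np.argsort(p_values) if p_values[j] <= p_threshold]
    kept = []
    for j in candidates:
        # Retain the SNP only if it is not in high LD with any retained SNP.
        if all(r2[j, k] < r2_threshold for k in kept):
            kept.append(int(j))
    return kept

# Toy data (invented): 6 animals x 4 SNPs coded as 0/1/2 allele counts.
toy_genotypes = np.array([
    [0, 0, 0, 2],
    [1, 1, 0, 0],
    [2, 2, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 2, 0],
    [2, 2, 2, 2],
])
toy_p_values = np.array([1e-6, 1e-4, 1e-3, 0.5])

selected = prune_by_pvalue_and_ld(toy_genotypes, toy_p_values,
                                  p_threshold=0.01, r2_threshold=0.5)
print(selected)
```

In this toy case the second SNP duplicates the first and is pruned, and the fourth fails the P-value threshold. A grid search over the two thresholds, as described in the abstract, would call such a function once per parameter combination.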

Zhang R, Zhang Y, Liu T, Jiang B, Li Z, Qu Y, Chen Y, Li Z. Utilizing Variants Identified with Multiple Genome-Wide Association Study Methods Optimizes Genomic Selection for Growth Traits in Pigs. Animals (Basel) 2023;13:ani13040722. [PMID: 36830509 PMCID: PMC9952664 DOI: 10.3390/ani13040722] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/25/2022] [Revised: 02/09/2023] [Accepted: 02/15/2023] [Indexed: 02/22/2023]

Ros-Freixedes R, Johnsson M, Whalen A, Chen CY, Valente BD, Herring WO, Gorjanc G, Hickey JM. Genomic prediction with whole-genome sequence data in intensely selected pig lines. Genet Sel Evol 2022;54:65. [PMID: 36153511 PMCID: PMC9509613 DOI: 10.1186/s12711-022-00756-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/28/2022] [Accepted: 09/05/2022] [Indexed: 12/03/2022]

Abstract

Background

Early simulations indicated that whole-genome sequence data (WGS) could improve the accuracy of genomic predictions within and across breeds. However, empirical results have been ambiguous so far. Large datasets that capture most of the genomic diversity in a population must be assembled so that allele substitution effects are estimated with high accuracy. The objectives of this study were to use a large pig dataset from seven intensely selected lines to assess the benefits of using WGS for genomic prediction compared to using commercial marker arrays and to identify scenarios in which WGS provides the largest advantage.

Methods

We sequenced 6931 individuals from seven commercial pig lines with different numerical sizes. Genotypes of 32.8 million variants were imputed for 396,100 individuals (17,224 to 104,661 per line). We used BayesR to perform genomic prediction for eight complex traits. Genomic predictions were performed using either data from a standard marker array or variants preselected from WGS based on association tests.

Results

The accuracies of genomic predictions based on preselected WGS variants were not robust across traits and lines and the improvements in prediction accuracy that we achieved so far with WGS compared to standard marker arrays were generally small. The most favourable results for WGS were obtained when the largest training sets were available and standard marker arrays were augmented with preselected variants with statistically significant associations to the trait. With this method and training sets of around 80k individuals, the accuracy of within-line genomic predictions was on average improved by 0.025. With multi-line training sets, improvements of 0.04 compared to marker arrays could be expected.

Conclusions

Our results showed that WGS has limited potential to improve the accuracy of genomic predictions compared to marker arrays in intensely selected pig lines. Thus, although we expect that larger improvements in accuracy from the use of WGS are possible with a combination of larger training sets and optimised pipelines for generating and analysing such datasets, the use of WGS in the current implementations of genomic prediction should be carefully evaluated against the cost of large-scale WGS data on a case-by-case basis.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12711-022-00756-0.

Jighly A, Benhajali H, Liu Z, Goddard ME. MetaGS: an accurate method to impute and combine SNP effects across populations using summary statistics. Genet Sel Evol 2022;54:37. [PMID: 35655152 PMCID: PMC9164759 DOI: 10.1186/s12711-022-00725-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2021] [Accepted: 05/02/2022] [Indexed: 11/10/2022]

Elsen JM. Genomic Prediction of Complex Traits, Principles, Overview of Factors Affecting the Reliability of Genomic Prediction, and Algebra of the Reliability. Methods Mol Biol 2022;2467:45-76. [PMID: 35451772 DOI: 10.1007/978-1-0716-2205-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

Cazenave X, Petit B, Lateur M, Nybom H, Sedlak J, Tartarini S, Laurens F, Durel CE, Muranty H. Combining genetic resources and elite material populations to improve the accuracy of genomic prediction in apple. G3 (Bethesda) 2021;12:6459174. [PMID: 34893831 PMCID: PMC9210277 DOI: 10.1093/g3journal/jkab420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2021] [Accepted: 11/29/2021] [Indexed: 11/12/2022]

Gebreyesus G, Lund MS, Sahana G, Su G. Reliabilities of Genomic Prediction for Young Stock Survival Traits Using 54K SNP Chip Augmented With Additional Single-Nucleotide Polymorphisms Selected From Imputed Whole-Genome Sequencing Data. Front Genet 2021;12:667300. [PMID: 34349779 PMCID: PMC8326759 DOI: 10.3389/fgene.2021.667300] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/12/2021] [Accepted: 06/23/2021] [Indexed: 11/16/2022]

Abstract

This study investigated effects of integrating single-nucleotide polymorphisms (SNPs) selected based on previous genome-wide association studies (GWASs), from imputed whole-genome sequencing (WGS) data, in the conventional 54K chip on genomic prediction reliability of young stock survival (YSS) traits in dairy cattle. The WGS SNPs included two groups of SNP sets that were selected based on GWAS in the Danish Holstein for YSS index (YSS_SNPs, n = 98) and SNPs chosen as peaks of quantitative trait loci for the traits of Nordic total merit index in Denmark–Finland–Sweden dairy cattle populations (DFS_SNPs, n = 1,541). Additionally, the study also investigated the possibility of improving genomic prediction reliability for survival traits by modeling the SNPs within recessive lethal haplotypes (LET_SNP, n = 130) detected from the 54K chip in the Nordic Holstein. De-regressed proofs (DRPs) were obtained from 6,558 Danish Holstein bulls genotyped with either 54K chip or customized LD chip that includes SNPs in the standard LD chip and some of the selected WGS SNPs. The chip data were subsequently imputed to 54K SNP together with the selected WGS SNPs. Genomic best linear unbiased prediction (GBLUP) models were implemented to predict breeding values through either pooling the 54K and selected WGS SNPs together as one genetic component (a one-component model) or considering 54K SNPs and selected WGS SNPs as two separate genetic components (a two-component model). Across all the traits, inclusion of each of the selected WGS SNP sets led to negligible improvements in prediction accuracies (0.17 percentage points on average) compared to prediction using only 54K. Similarly, marginal improvement in prediction reliability was obtained when all the selected WGS SNPs were included (0.22 percentage points). 
No further improvement in prediction reliability was observed when considering random regression on genotype code of recessive lethal alleles in the model including both groups of the WGS SNPs. Additionally, there was no difference in prediction reliability from integrating the selected WGS SNP sets through the two-component model compared to the one-component GBLUP.
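The difference between the one-component and two-component GBLUP models comes down to how many genomic relationship matrices are built. The sketch below is an illustration under stated assumptions, not the study's pipeline: it uses the widely used VanRaden-style scaling with allele frequencies estimated from the sample, and randomly generated toy genotypes stand in for the 54K and selected WGS SNP sets. The function and variable names are hypothetical.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    Uses G = ZZ' / (2 * sum(p * (1 - p))) with sample allele frequencies.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))
    if scale <= 0:
        raise ValueError("all markers are monomorphic")
    return Z @ Z.T / scale

# Toy genotypes (invented): 6 animals, 40 "chip" SNPs and 5 "selected WGS" SNPs.
rng = np.random.default_rng(1)
chip = rng.integers(0, 3, size=(6, 40))
wgs = rng.integers(0, 3, size=(6, 5))

# One-component model: pool all SNPs into a single relationship matrix.
G_pooled = vanraden_grm(np.hstack([chip, wgs]))

# Two-component model: one matrix per SNP set, each with its own variance component.
G_chip = vanraden_grm(chip)
G_wgs = vanraden_grm(wgs)
```

In the pooled matrix the few selected SNPs are diluted among the chip SNPs, whereas the two-component model lets the selected set explain its own share of genetic variance. Fitting the variance components themselves requires mixed-model software and is not shown here.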

Berry DP. Invited review: Beef-on-dairy-The generation of crossbred beef × dairy cattle. J Dairy Sci 2021;104:3789-3819. [PMID: 33663845 DOI: 10.3168/jds.2020-19519] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 08/25/2020] [Accepted: 11/26/2020] [Indexed: 02/06/2023]

Abstract

Because a growing proportion of the beef output in many countries originates from dairy herds, the most critical decisions about the genetic merit of most carcasses harvested are being made by dairy producers. Interest in the generation of more valuable calves from dairy females is intensifying, and the most likely vehicle is the use of appropriately selected beef bulls for mating to the dairy females. This is especially true given the growing potential to undertake more beef × dairy matings as herd metrics improve (e.g., reproductive performance) and technological advances are more widely adopted (e.g., sexed semen). Clear breed differences (among beef breeds but also compared with dairy breeds) exist for a whole plethora of performance traits, but considerable within-breed variability has also been demonstrated. Although such variability has implications for the choice of bull to mate to dairy females, the fact that dairy females themselves exhibit such genetic variability implies that "one size fits all" may not be appropriate for bull selection. Although differences in a whole series of key performance indicators have been documented between beef and beef-on-dairy animals, of particular note is the reported lower environmental hoofprint associated with beef-on-dairy production systems if the environmental overhead of the mature cow is attributed to the milk she eventually produces. Despite the known contribution of beef (i.e., both surplus calves and cull cows) to the overall gross output of most dairy herds globally, and the fact that each dairy female contributes half her genetic merit to her progeny, proxies for meat yield (i.e., veal or beef) are not directly considered in the vast majority of dairy cow breeding objectives. 
Breeding objectives to identify beef bulls suitable for dairy production systems are now being developed and validated, demonstrating the financial benefit of using such breeding objectives over and above a focus on dairy bulls or easy-calving, short-gestation beef bulls. When this approach is complemented by management-based decision-support tools, considerable potential exists to improve the profitability and sustainability of modern dairy production systems by exploiting beef-on-dairy breeding strategies using the most appropriate beef bulls.

Marjanovic J, Hulsegge B, Calus MPL. Relatedness between numerically small Dutch Red dairy cattle populations and possibilities for multibreed genomic prediction. J Dairy Sci 2021;104:4498-4506. [PMID: 33551169 DOI: 10.3168/jds.2020-19573] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2020] [Accepted: 11/05/2020] [Indexed: 11/19/2022]

Abstract

Red dairy breeds are a valuable cultural and historical asset, and often a source of unique genetic diversity. However, they have difficulties competing with other, more productive, dairy breeds. Improving competitiveness of Red dairy breeds, by accelerating their genetic improvement using genomic selection, may be a promising strategy to secure their long-term future. For many Red dairy breeds, establishing a sufficiently large breed-specific reference population for genomic prediction is often not possible, but may be overcome by adding individuals from another breed. Relatedness between breeds strongly decides the benefit of adding another breed to the reference population. To prioritize among available breeds, the effective number of chromosome segments (M_e) can be used as an indicator of relatedness between individuals from different breeds. The M_e is also an important parameter in determining the accuracy of genomic prediction. The M_e can be estimated both within a population and between 2 populations or breeds, as the reciprocal of the variance of genomic relationships. We investigated relatedness between 6 Dutch Red cattle breeds, Groningen White Headed (GWH), Dutch Friesian (DF), Meuse-Rhine-Yssel (MRY), Dutch Belted (DB), Deep Red (DR), and Improved Red (IR), focusing primarily on the M_e, to predict which of those breeds may benefit from including reference animals of the other breeds. All of these breeds, except MRY, are under high risk of extinction. Our results indicated high variability of M_e, especially for the between-breed M_e, which ranged from ∼3,500 to ∼17,400, indicating different levels of relatedness between the breeds. Two clusters are especially important, one formed by MRY, DR, and IR, and the other comprising DF and DB. Although relatedness between breeds within each of these 2 clusters is high, across-breed genomic prediction is still limited by the current number of genotyped individuals, which for many breeds is low. 
However, adding MRY individuals would increase the reference population of DR substantially. We estimated that between 11 and 133 individuals from other breeds are needed to achieve accuracy of genomic prediction equivalent to using one additional individual from the same breed. Given the variation in size of the breeds in this study, the benefit of a multibreed reference population is expected to be lower for larger breeds than for the smaller ones.
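The abstract defines M_e as the reciprocal of the variance of genomic relationships, which is simple to compute once a genomic relationship matrix is available. The following is a minimal sketch under assumptions: it takes the variance of the raw relationships (a fuller treatment may first subtract pedigree-expected relationships), the helper names are hypothetical, and the 4 x 4 matrix holds invented values rather than anything from the study.

```python
import numpy as np

def effective_segments(relationships):
    """M_e as the reciprocal of the variance of the supplied relationships."""
    values = np.asarray(relationships, dtype=float).ravel()
    return 1.0 / np.var(values)

def within_population_me(G):
    """Use each pair of distinct animals once (upper triangle, no diagonal)."""
    G = np.asarray(G, dtype=float)
    return effective_segments(G[np.triu_indices_from(G, k=1)])

def between_population_me(G, idx_a, idx_b):
    """Use only relationships between animals of breed A and animals of breed B."""
    G = np.asarray(G, dtype=float)
    return effective_segments(G[np.ix_(idx_a, idx_b)])

# Invented relationship matrix: animals 0-1 belong to breed A, animals 2-3 to breed B.
toy_G = np.array([
    [1.00,  0.02,  0.01, -0.01],
    [0.02,  1.00, -0.01,  0.01],
    [0.01, -0.01,  1.00,  0.03],
    [-0.01, 0.01,  0.03,  1.00],
])

me_between = between_population_me(toy_G, [0, 1], [2, 3])
me_all_pairs = within_population_me(toy_G)
print(me_between, me_all_pairs)
```

A smaller variance of between-breed relationships gives a larger M_e, which corresponds to more distantly related breeds and a smaller expected benefit from pooling their reference populations.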

Lopez BIM, An N, Srikanth K, Lee S, Oh JD, Shin DH, Park W, Chai HH, Park JE, Lim D. Genomic Prediction Based on SNP Functional Annotation Using Imputed Whole-Genome Sequence Data in Korean Hanwoo Cattle. Front Genet 2021;11:603822. [PMID: 33552124 PMCID: PMC7859490 DOI: 10.3389/fgene.2020.603822] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/08/2020] [Accepted: 11/09/2020] [Indexed: 12/12/2022]

Abstract

Whole-genome sequence (WGS) data are increasingly being applied to genomic prediction, offering a higher predictive ability by including causal mutations or single-nucleotide polymorphisms (SNPs) putatively in strong linkage disequilibrium with causal mutations affecting the trait. This study aimed to improve the predictive performance of the customized Hanwoo 50 k SNP panel for four carcass traits in a commercial Hanwoo population by adding highly predictive variants from sequence data. A total of 16,892 Hanwoo cattle with phenotypes (i.e., backfat thickness, carcass weight, longissimus muscle area, and marbling score), 50 k genotypes, and WGS imputed genotypes were used. We partitioned imputed WGS data according to functional annotation [intergenic (IGR), intron (ITR), regulatory (REG), synonymous (SYN), and non-synonymous (NSY)] to characterize the genomic regions that will deliver higher predictive power for the traits investigated. Animals were assigned into two groups, the discovery set (7324 animals) used for predictive variant detection and the cross-validation set for genomic prediction. Genome-wide association studies were performed by trait for every genomic region and for the entire WGS data for the pre-selection of variants. Each set of pre-selected SNPs with a different density (1000, 3000, 5000, or 10,000) was added to the 50 k genotypes separately and the predictive performance of each set of genotypes was assessed using genomic best linear unbiased prediction (GBLUP). Results showed that the predictive performance of the customized Hanwoo 50 k SNP panel can be improved by the addition of pre-selected variants from the WGS data, particularly 3000 variants from each trait, which is sufficient to improve the prediction accuracy for all traits. 
When 12,000 pre-selected variants (3000 variants from each trait) were added to the 50 k genotypes, the prediction accuracies increased by 9.9, 9.2, 6.4, and 4.7% for backfat thickness, carcass weight, longissimus muscle area, and marbling score, respectively, compared to the regular 50 k SNP panel. In terms of prediction bias, regression coefficients for all sets of genotypes in all traits were close to 1, indicating an unbiased prediction. The strategy used to select variants based on functional annotation did not show a clear advantage compared to using the whole genome. Nonetheless, such pre-selected SNPs from the IGR region gave the highest improvement in prediction accuracy among genomic regions and the values were close to those obtained using the WGS data for all traits. We concluded that additional gain in prediction accuracy when using pre-selected variants appears to be trait-dependent, and using WGS data remained more accurate compared to using a specific genomic region.

Raymond B, Yengo L, Costilla R, Schrooten C, Bouwman AC, Hayes BJ, Veerkamp RF, Visscher PM. Using prior information from humans to prioritize genes and gene-associated variants for complex traits in livestock. PLoS Genet 2020;16:e1008780. [PMID: 32925905 PMCID: PMC7514049 DOI: 10.1371/journal.pgen.1008780] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/12/2020] [Revised: 09/24/2020] [Accepted: 07/21/2020] [Indexed: 01/13/2023]

Abstract

Genome-Wide Association Studies (GWAS) in large human cohorts have identified thousands of loci associated with complex traits and diseases. For identifying the genes and gene-associated variants that underlie complex traits in livestock, especially where sample sizes are limiting, it may help to integrate the results of GWAS for equivalent traits in humans as prior information. In this study, we sought to investigate the usefulness of results from a GWAS on human height as prior information for identifying the genes and gene-associated variants that affect stature in cattle, using GWAS summary data on sample sizes of 700,000 and 58,265 for humans and cattle, respectively. Using Fisher's exact test, we observed a significant proportion of cattle stature-associated genes (30/77) that are also associated with human height (odds ratio = 5.1, p = 3.1e-10). Results of randomized sampling tests showed that cattle orthologs of human height-associated genes, hereafter referred to as candidate genes (C-genes), were more enriched for cattle stature GWAS signals than random samples of genes in the cattle genome (p = 0.01). Randomly sampled SNPs within the C-genes also tend to explain more genetic variance for cattle stature (up to 13.2%) than randomly sampled SNPs within random cattle genes (p = 0.09). The most significant SNPs from a cattle GWAS for stature within the C-genes did not explain more genetic variance for cattle stature than the most significant SNPs within random cattle genes (p = 0.87). Altogether, our findings support previous studies that suggest a similarity in the genetic regulation of height across mammalian species. However, with the availability of a powerful GWAS for stature that combined data from 8 cattle breeds, prior information from human-height GWAS does not seem to provide any additional benefit with respect to the identification of genes and gene-associated variants that affect stature in cattle.

Boison S, Ding J, Leder E, Gjerde B, Bergtun PH, Norris A, Baranski M, Robinson N. QTLs Associated with Resistance to Cardiomyopathy Syndrome in Atlantic Salmon. J Hered 2020;110:727-737. [PMID: 31287894 PMCID: PMC6785937 DOI: 10.1093/jhered/esz042] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/18/2019] [Accepted: 07/01/2019] [Indexed: 11/24/2022]

Warburton CL, Engle BN, Ross EM, Costilla R, Moore SS, Corbet NJ, Allen JM, Laing AR, Fordyce G, Lyons RE, McGowan MR, Burns BM, Hayes BJ. Use of whole-genome sequence data and novel genomic selection strategies to improve selection for age at puberty in tropically-adapted beef heifers. Genet Sel Evol 2020;52:28. [PMID: 32460805 PMCID: PMC7251835 DOI: 10.1186/s12711-020-00547-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/09/2019] [Accepted: 05/15/2020] [Indexed: 12/14/2022]

Abstract

Background

In tropically-adapted beef heifers, application of genomic prediction for age at puberty has been limited due to low prediction accuracies. Our aim was to investigate novel methods of pre-selecting whole-genome sequence (WGS) variants and alternative analysis methodologies, including genomic best linear unbiased prediction (GBLUP) with multiple genomic relationship matrices (MGRM) and Bayesian (BayesR) analyses, to determine if prediction accuracy for age at puberty can be improved.

Methods

Genotypes and phenotypes were obtained from two research herds. In total, 868 Brahman and 960 Tropical Composite heifers were recorded in the first population and 3695 Brahman, Santa Gertrudis and Droughtmaster heifers were recorded in the second population. Genotypes were imputed to 23 million whole-genome sequence variants. Eight strategies were used to pre-select variants from genome-wide association study (GWAS) results using conditional or joint (COJO) analyses. Pre-selected variants were included in three models, GBLUP with a single genomic relationship matrix (SGRM), GBLUP MGRM and BayesR. Five-way cross-validation was used to test the effect of marker panel density (6 K, 50 K and 800 K), analysis model, and inclusion of pre-selected WGS variants on prediction accuracy.

Results

In all tested scenarios, prediction accuracies for age at puberty were highest in BayesR analyses. The addition of pre-selected WGS variants had little effect on the accuracy of prediction when BayesR was used. The inclusion of WGS variants that were pre-selected using a meta-analysis with COJO analyses by chromosome, fitted in a MGRM model, had the highest prediction accuracies in the GBLUP analyses, regardless of marker density. When the low-density (6 K) panel was used, the prediction accuracy of GBLUP was equal (0.42) to that with the high-density panel when only six additional sequence variants (identified using meta-analysis COJO by chromosome) were included.

Conclusions

While BayesR consistently outperforms other methods in terms of prediction accuracies, reasonable improvements in accuracy can be achieved when using GBLUP and low-density panels with the inclusion of a relatively small number of highly relevant WGS variants.
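Cross-validated prediction accuracy, as used in this and several other abstracts on this page, is usually reported as the correlation between predicted and observed values in held-out animals. The sketch below shows the general procedure under stated assumptions: ridge regression on SNP genotypes stands in for GBLUP (the two are closely related, but this is not the study's software or model), the ridge penalty is arbitrary, and the genotypes and phenotypes are simulated for illustration only.

```python
import numpy as np

def kfold_accuracy(X, y, k=5, ridge=1.0, seed=0):
    """Mean correlation between predicted and observed phenotypes over k folds."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        X_train, y_train = X[train_idx], y[train_idx]
        # Centre with training-set statistics only, to avoid leaking test data.
        mu = y_train.mean()
        col_mean = X_train.mean(axis=0)
        Xc = X_train - col_mean
        # Ridge solution: (X'X + lambda * I)^-1 X'(y - mu)
        lhs = Xc.T @ Xc + ridge * np.eye(X.shape[1])
        beta = np.linalg.solve(lhs, Xc.T @ (y_train - mu))
        predicted = mu + (X[test_idx] - col_mean) @ beta
        accuracies.append(np.corrcoef(predicted, y[test_idx])[0, 1])
    return float(np.mean(accuracies))

# Simulated data (invented): 100 animals, 10 SNPs, a strongly heritable trait.
rng = np.random.default_rng(42)
sim_genotypes = rng.integers(0, 3, size=(100, 10)).astype(float)
sim_effects = rng.normal(size=10)
sim_phenotypes = sim_genotypes @ sim_effects + rng.normal(scale=0.1, size=100)

accuracy = kfold_accuracy(sim_genotypes, sim_phenotypes, k=5)
print(round(accuracy, 3))
```

Comparing marker panels or models then amounts to calling the same routine with different genotype matrices or a different fitting step while keeping the fold assignment fixed.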

Raymond B, Wientjes YCJ, Bouwman AC, Schrooten C, Veerkamp RF. A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices. Genet Sel Evol 2020;52:21. [PMID: 32345213 PMCID: PMC7189707 DOI: 10.1186/s12711-020-00540-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/11/2019] [Accepted: 04/14/2020] [Indexed: 11/10/2022]

Abstract

BACKGROUND

A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix ([Formula: see text]) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy.

METHODS

Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population [Formula: see text] ([Formula: see text]), when individuals from two populations, [Formula: see text] and [Formula: see text], are combined in the training population and two [Formula: see text], made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation.

RESULTS

In general, the derived prediction equation resulted in accurate predictions of [Formula: see text] for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments ([Formula: see text]) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for [Formula: see text], calculated based on all SNPs, which is very large. However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait.

CONCLUSIONS

We developed an equation that gives insight into why, and under which conditions, the MPMG model outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.

The Impact of Non-additive Effects on the Genetic Correlation Between Populations. G3 (Bethesda) 2020;10:783-795. [PMID: 31857332 PMCID: PMC7003072 DOI: 10.1534/g3.119.400663] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 02/06/2023]

Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations. Genet Sel Evol 2019;51:72. [PMID: 31805849 PMCID: PMC6896509 DOI: 10.1186/s12711-019-0514-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 06/03/2019] [Accepted: 11/25/2019] [Indexed: 12/13/2022]

Abstract

Background

Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population of relatively less related target individuals, when using information on imputed WGS genotypes.

Methods

Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits and all had WGS imputed genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) based on a genome-wide association study (GWAS) and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, using both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep.

Results

A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants.

Conclusions

Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50k genotypes.

Song H, Ye S, Jiang Y, Zhang Z, Zhang Q, Ding X. Using imputation-based whole-genome sequencing data to improve the accuracy of genomic prediction for combined populations in pigs. Genet Sel Evol 2019;51:58. [PMID: 31638889 PMCID: PMC6805481 DOI: 10.1186/s12711-019-0500-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 04/03/2019] [Accepted: 10/07/2019] [Indexed: 11/17/2022] Open

Abstract

BACKGROUND

For genomic selection in populations with a small reference population, combining populations of the same breed or populations of related breeds is an effective way to increase the size of the reference population. However, genomic predictions based on single nucleotide polymorphism (SNP)-chip genotype data using combined populations with different genetic backgrounds or from different breeds have not shown a clear advantage over using within-population or within-breed predictions. The increasing availability of whole-genome sequencing (WGS) data provides new opportunities for combined population genomic prediction. Our objective was to investigate the accuracy of genomic prediction using imputation-based WGS data from combined populations in pigs. Using 80K SNP panel genotypes, WGS genotypes, or genotypes on WGS variants that were pruned based on linkage disequilibrium (LD), three methods [genomic best linear unbiased prediction (GBLUP), single-step (ss)GBLUP, and genomic feature (GF)BLUP] were implemented with different prior information to identify the best method to improve the accuracy of genomic prediction for combined populations in pigs.

RESULTS

In total, 2089 and 2043 individuals with production and reproduction phenotypes, respectively, from three Yorkshire populations with different genetic backgrounds were genotyped with the PorcineSNP80 panel. Imputation accuracy from 80K to WGS variants reached 92%. The results showed that use of the WGS data compared to the 80K SNP panel did not increase the accuracy of genomic prediction in a single population, but using WGS data with LD pruning and GFBLUP with prior information did yield higher accuracy than the 80K SNP panel. For the 80K SNP panel genotypes, using the combined population resulted in a slight improvement, no change, or even a slight decrease in accuracy in comparison with the single population for GBLUP and ssGBLUP, while accuracy increased by 1 to 2.4% when using WGS data. Notably, the GFBLUP method did not perform well for either the combined population or the single populations.

CONCLUSIONS

The use of WGS data was beneficial for combined population genomic prediction. Simply increasing the number of SNPs to the WGS level did not increase accuracy for a single population, while using pruned WGS data based on LD and GFBLUP with prior information could yield higher accuracy than the 80K SNP panel.

Theoretical Evaluation of Multi-Breed Genomic Prediction in Chinese Indigenous Cattle. Animals (Basel) 2019;9:ani9100789. [PMID: 31614691 PMCID: PMC6827096 DOI: 10.3390/ani9100789] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/30/2019] [Revised: 09/24/2019] [Accepted: 10/02/2019] [Indexed: 12/19/2022] Open

Abstract

Simple Summary

In order to evaluate the potential application of genomic selection (GS) for Chinese indigenous cattle, we assessed the influence of combining multiple populations on the reliability of genomic predictions for 10 indigenous breeds of Chinese cattle using simulated data. We found the predictive accuracies to be low when the reference and validation populations were sampled from different breeds. When using multiple breeds for the reference population, the predictive accuracies were higher if the reference was comprised of breeds with close relationships. In addition, the accuracy increased in all scenarios when the heritability increased, and the genetic architecture of the QTL can affect genomic prediction. Our study suggested that the application of meta-populations can increase accuracy in scenarios with a reduced size of reference populations.

Abstract

Genomic selection (GS) has been widely considered a valuable strategy for enhancing the rate of genetic gain in farm animals. However, the construction of a large reference population is a big challenge for small populations like indigenous cattle. In order to evaluate the potential application of GS for Chinese indigenous cattle, we assessed the influence of combining multiple populations on the reliability of genomic predictions for 10 indigenous breeds of Chinese cattle using simulated data. We also examined the effect of different genetic architectures on prediction accuracy. In this study, we simulated a set of genotype data by a resampling approach that can reflect the realistic linkage disequilibrium pattern for multiple populations. We found that within-breed evaluations yielded the highest accuracies, ranging from 0.64 to 0.68 for four different simulated genetic architectures. For scenarios using multiple breeds as reference, the predictive accuracies were higher when the reference was comprised of breeds with a close relationship, while the accuracies were low when predictions were carried out across breeds. In addition, the accuracy increased in all scenarios as the heritability increased. Our results suggested that using a meta-population as reference can increase the accuracy of genomic predictions for small populations. Moreover, multi-breed genomic selection was feasible for Chinese indigenous populations with genetic relationships.

Ye S, Gao N, Zheng R, Chen Z, Teng J, Yuan X, Zhang H, Chen Z, Zhang X, Li J, Zhang Z. Strategies for Obtaining and Pruning Imputed Whole-Genome Sequence Data for Genomic Prediction. Front Genet 2019;10:673. [PMID: 31379929 PMCID: PMC6650575 DOI: 10.3389/fgene.2019.00673] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/10/2019] [Accepted: 06/27/2019] [Indexed: 11/13/2022] Open

Abstract

Genomic prediction with imputed whole-genome sequencing (WGS) data is an attractive approach to improve predictive ability at low cost. However, high accuracy has not been realized using this method in livestock. In this study, we imputed 435 individuals from 600K single nucleotide polymorphism (SNP) chip data to WGS data using different reference panels. We also investigated the prediction accuracy of genomic best linear unbiased prediction (GBLUP) using imputed WGS data from different reference panels, linkage disequilibrium (LD)-based marker pruning, and pre-selected variants based on genome-wide association study (GWAS) results. Results showed that the imputation accuracies from 600K to WGS data were 0.873 ± 0.038, 0.906 ± 0.036, and 0.979 ± 0.010 for the internal, external, and combined reference panels, respectively. For most traits in chickens, the prediction accuracy of imputed WGS data obtained from the internal reference panel was greater than or equal to that of the combined reference panel; the external reference panel had the lowest prediction accuracy. Compared with 600K chip data, GBLUP with imputed WGS data had only a small increase (1-3%) in prediction accuracy. Using only variants selected from imputed WGS data based on GWAS results resulted in almost no increase for most traits and even increased the bias of the regression coefficient. The impact of the degree of LD of selected and remaining variants on prediction accuracy was different. For average daily gain (ADG), residual feed intake (RFI), intestine length (IL), and body weight at 91 days (BW91), the accuracy of GBLUP increased as the degree of LD of selected variants decreased, but the opposite relationship occurred for the remaining variants. For breast muscle weight (BMW) and average daily feed intake (ADFI), however, the accuracy of GBLUP increased as the degree of LD of selected variants increased, and the degree of LD of remaining variants had a small effect on prediction accuracy. Overall, the optimal imputation strategy to obtain WGS data for genomic prediction should consider the relationship between selected individuals and target population individuals to avoid heterogeneity of imputation. LD-based marker pruning can be used to improve the accuracy of genomic prediction using imputed WGS data.
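As a rough illustration of what LD-based marker pruning involves, the sketch below greedily keeps a SNP only if its squared genotype correlation (r²) with each recently kept SNP stays at or below a threshold. The abstract does not describe the authors' pruning procedure or settings, so the function names, the 0.8 threshold, the window size and the 0/1/2 allele-count coding are all assumptions made for this example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP carries no LD information; treat r^2 as 0.
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(genotypes, threshold=0.8, window=50):
    """Greedy LD pruning.

    genotypes : list of SNPs in map order, each a list of allele counts
                (0/1/2) with one entry per individual.
    threshold : a SNP is dropped if r^2 with any recently kept SNP exceeds it.
    window    : number of most recently kept SNPs to compare against
                (a simplification of a positional window).
    Returns the indices of the SNPs that are kept.
    """
    kept = []
    for j, snp in enumerate(genotypes):
        recent = kept[-window:]
        if all(r_squared(snp, genotypes[k]) <= threshold for k in recent):
            kept.append(j)
    return kept


if __name__ == "__main__":
    # Hypothetical toy data: 4 SNPs genotyped on 6 individuals.
    toy = [
        [0, 1, 2, 1, 0, 2],
        [0, 1, 2, 1, 0, 2],  # identical to SNP 0
        [2, 1, 0, 1, 2, 0],  # mirror image of SNP 0
        [0, 0, 1, 2, 2, 1],  # uncorrelated with SNP 0
    ]
    print(ld_prune(toy))
```

Established tools apply the same idea with positional windows and more efficient r² computation; this version only shows the logic.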

Vandenplas J, Calus MPL, Eding H, Vuik C. A second-level diagonal preconditioner for single-step SNPBLUP. Genet Sel Evol 2019;51:30. [PMID: 31238880 PMCID: PMC6593613 DOI: 10.1186/s12711-019-0472-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/05/2019] [Accepted: 06/07/2019] [Indexed: 12/30/2022] Open

Abstract

Background

The preconditioned conjugate gradient (PCG) method is an iterative solver for systems of linear equations that is commonly used in animal breeding. However, the PCG method has been shown to encounter convergence issues when applied to single-step single nucleotide polymorphism BLUP (ssSNPBLUP) models. Recently, we proposed a deflated PCG (DPCG) method for solving ssSNPBLUP efficiently. The DPCG method introduces a second-level preconditioner that annihilates the effect of the largest unfavourable eigenvalues of the ssSNPBLUP preconditioned coefficient matrix on the convergence of the iterative solver. While it solves the convergence issues of ssSNPBLUP, the DPCG method requires substantial additional computations in comparison to the PCG method. Accordingly, the aim of this study was to develop a second-level preconditioner that decreases the largest eigenvalues of the ssSNPBLUP preconditioned coefficient matrix at a lower cost than the DPCG method, in addition to comparing its performance to the (D)PCG methods applied to two different ssSNPBLUP models.

Results

Based on the properties of the ssSNPBLUP preconditioned coefficient matrix, we proposed a second-level diagonal preconditioner that decreases the largest eigenvalues of the ssSNPBLUP preconditioned coefficient matrix under some conditions. This proposed second-level preconditioner is easy to implement in current software and does not result in additional computing costs as it can be combined with the commonly used (block-)diagonal preconditioner. Tested on two different datasets and with two different ssSNPBLUP models, the second-level diagonal preconditioner led to a decrease of the largest eigenvalues and the condition number of the preconditioned coefficient matrices. It resulted in an improvement of the convergence pattern of the iterative solver. For the largest dataset, the convergence of the PCG method with the proposed second-level diagonal preconditioner was slower than the DPCG method, but it performed better than the DPCG method in terms of total computing time.

Conclusions

The proposed second-level diagonal preconditioner can improve the convergence of the (D)PCG methods applied to two ssSNPBLUP models. Based on our results, the PCG method combined with the proposed second-level diagonal preconditioner seems to be more efficient than the DPCG method in solving ssSNPBLUP. However, the optimal combination of ssSNPBLUP and solver will most likely be situation-dependent.
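For readers unfamiliar with the baseline solver, the following is a minimal sketch of the PCG method with an ordinary (first-level) diagonal, or Jacobi, preconditioner on a small dense symmetric positive-definite system. It does not implement the second-level diagonal preconditioner or the DPCG method described in the abstract, and it is not an ssSNPBLUP model; the function names, tolerance and toy matrix are assumptions for illustration only.

```python
def dot(u, v):
    """Inner product of two vectors stored as lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Product of a dense matrix (list of rows) and a vector."""
    return [dot(row, v) for row in A]


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Uses conjugate gradients with the inverse of diag(A) as preconditioner.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the starting guess x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    z = [inv_diag[i] * r[i] for i in range(n)]  # preconditioned residual
    p = list(z)  # first search direction
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        # Stop when the relative residual norm is small enough.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    # Hypothetical 2 x 2 system used only to exercise the solver.
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    solution, iterations = pcg(A, b)
    print(solution, iterations)
```

The number of iterations needed depends on the eigenvalue spectrum of the preconditioned coefficient matrix, which is why the abstract focuses on reducing its largest eigenvalues.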

Electronic supplementary material

The online version of this article (10.1186/s12711-019-0472-8) contains supplementary material, which is available to authorized users.

van den Berg S, Vandenplas J, van Eeuwijk FA, Lopes MS, Veerkamp RF. Significance testing and genomic inflation factor using high-density genotypes or whole-genome sequence data. J Anim Breed Genet 2019;136:418-429. [PMID: 31215703 PMCID: PMC6900143 DOI: 10.1111/jbg.12419] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/04/2019] [Revised: 05/21/2019] [Accepted: 05/29/2019] [Indexed: 01/02/2023]

Abstract

Significance testing for genome-wide association studies (GWAS) with increasing SNP density, up to whole-genome sequence (WGS) data, is not straightforward because of strong LD between SNPs and population stratification. Therefore, the objective of this study was to investigate genomic control and different significance testing procedures using data from a commercial pig breeding scheme. A GWAS was performed in GCTA with data of 4,964 Large White pigs using medium-density, high-density or imputed whole-genome sequence data, fitting a genomic relationship matrix based on a leave-one-chromosome-out approach to account for population structure. Subsequently, genomic inflation factors were assessed at the whole-genome level and the chromosome level. To establish a significance threshold, permutation testing, Bonferroni corrections using either the total number of SNPs or the number of independent chromosome fragments, and false discovery rates (FDR) using either the Benjamini–Hochberg procedure or the Benjamini and Yekutieli procedure were evaluated. We found that genomic inflation factors did not differ between genotype densities but did differ between chromosomes. Also, the leave-one-chromosome-out approach for GWAS or using the pedigree relationships did not account appropriately for population stratification and gave strong genomic inflation. Regarding different procedures for significance testing, when the aim is to find QTL regions that are associated with a trait of interest, we recommend applying the FDR following the Benjamini and Yekutieli approach to establish a significance threshold that is adjusted for multiple testing. When the aim is to pinpoint a specific mutation, the more conservative Bonferroni correction based on the total number of SNPs is more appropriate, until an appropriate method is established to adjust for the number of independent tests.
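The multiple-testing procedures compared in this abstract can be sketched in a few lines. The example below shows the Bonferroni threshold, the Benjamini–Hochberg and Benjamini–Yekutieli step-up rules, and the genomic inflation factor computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). It is a generic textbook sketch, not the authors' GCTA-based pipeline; the function names, the alpha level and the toy p-values are assumptions.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold controlling the family-wise error rate."""
    return alpha / n_tests


def fdr_reject(p_values, alpha=0.05, method="bh"):
    """Step-up FDR procedure.

    method="bh" : Benjamini-Hochberg
    method="by" : Benjamini-Yekutieli, which divides alpha by the
                  harmonic sum c(m) = 1 + 1/2 + ... + 1/m
    Returns a list of booleans in the original order of p_values.
    """
    if method not in ("bh", "by"):
        raise ValueError("method must be 'bh' or 'by'")
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1)) if method == "by" else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its critical value.
        if p_values[idx] <= rank * alpha / (m * c_m):
            largest_rank = rank
    reject = [False] * m
    for idx in order[:largest_rank]:
        reject[idx] = True
    return reject


def genomic_inflation_factor(chi2_stats):
    """Lambda: median observed 1-df chi-square statistic over expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical p-values for five SNPs.
    p = [0.5, 0.001, 0.2, 0.025, 0.012]
    print(bonferroni_threshold(len(p)))
    print(fdr_reject(p, method="bh"))
    print(fdr_reject(p, method="by"))
```

On the same p-values the Benjamini–Yekutieli rule can never reject more tests than Benjamini–Hochberg, because its critical values are smaller by the factor c(m).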
