Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 110 (from Reference Citation Analysis)

Article PDFs: 40

Cited by > 0: 54

Searched Name: Daniela Lourenco

Ramos PVB, de Oliveira Menezes GR, da Silva DA, Lourenco D, Santiago GG, Torres Júnior RAA, Silva FFE, Lopes PS, Veroneze R. Genomic analysis of feed efficiency traits in beef cattle using random regression models. J Anim Breed Genet 2024;141:291-303. [PMID: 38062881 DOI: 10.1111/jbg.12840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 10/31/2023] [Accepted: 11/24/2023] [Indexed: 04/08/2024]

Abstract

Feed efficiency plays a major role in the overall profitability and sustainability of the beef cattle industry, as it is directly related to the reduction of the animal demand for input and methane emissions. Traditionally, the average daily feed intake and weight gain are used to calculate feed efficiency traits. However, feed efficiency traits can be analysed longitudinally using random regression models (RRMs), which allow fitting random genetic and environmental effects over time by considering the covariance pattern between the daily records. Therefore, the objectives of this study were to: (1) propose genomic evaluations for dry matter intake (DMI), body weight gain (BWG), residual feed intake (RFI) and residual weight gain (RWG) data collected during an 84-day feedlot test period via RRMs; (2) compare the goodness-of-fit of RRM using Legendre polynomials (LP) and B-spline functions; (3) evaluate the genetic parameters behaviour for feed efficiency traits and their implication for new selection strategies. The datasets were provided by the EMBRAPA-GENEPLUS beef cattle breeding program and included 2920 records for DMI, 2696 records for BWG and 4675 genotyped animals. Genetic parameters and genomic breeding values (GEBVs) were estimated by RRMs under ssGBLUP for Nellore cattle using orthogonal LPs and B-spline. Models were compared based on the deviance information criterion (DIC). The ranking of the average GEBV of each test week and the overall GEBV average were compared by the percentage of individuals in common and the Spearman correlation coefficient (top 1%, 5%, 10% and 100%). The highest goodness-of-fit was obtained with linear B-Spline function considering heterogeneous residual variance. The heritability estimates across the test period for DMI, BWG, RFI and RWG ranged from 0.06 to 0.21, 0.11 to 0.30, 0.03 to 0.26 and 0.07 to 0.27, respectively. 
DMI and RFI presented within-trait genetic correlations ranging from low to high magnitude across different performance test-day. In contrast, BWG and RWG presented negative genetic correlations between the first 3 weeks and the other days of performance tests. DMI and RFI presented a high-ranking similarity between the GEBV average of week eight and the overall GEBV average, with Spearman correlations and percentages of individuals selected in common ranging from 0.95 to 1.00 and 93 to 100, respectively. Week 11 presented the highest Spearman correlations (ranging from 0.94 to 0.98) and percentages of individuals selected in common (ranging from 85 to 94) of BWG and RWG with the average GEBV of the entire period of the test. In conclusion, the RRM using linear B-splines is a feasible alternative for the genomic evaluation of feed efficiency. Heritability estimates of DMI, RFI, BWG and RWG indicate enough additive genetic variance to achieve a moderate response to selection. A new selection strategy can be adopted by reducing the performance test to 56 days for DMI and RFI selection and 77 days for BWG and RWG selection.
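
The ranking comparison described in this abstract (Spearman correlation plus the percentage of individuals selected in common) can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' code: the function names, the tie handling by average ranks, and the rounding rule for the size of the top group are all assumptions.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def percent_in_common(weekly_gebv, overall_gebv, fraction):
    """Percentage of animals in the top `fraction` of both rankings."""
    n_top = max(1, round(len(weekly_gebv) * fraction))  # assumed rounding rule

    def top(values):
        idx = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return set(idx[:n_top])

    return 100.0 * len(top(weekly_gebv) & top(overall_gebv)) / n_top
```

Calling `percent_in_common(week8, overall, 0.10)` on two hypothetical lists of GEBV for the same animals would give the share of the top 10% that both rankings select.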

Hollifield MK, Chen CY, Psota E, Holl J, Lourenco D, Misztal I. Estimating genetic parameters of digital behavior traits and their relationship with production traits in purebred pigs. Genet Sel Evol 2024;56:29. [PMID: 38627636 PMCID: PMC11022375 DOI: 10.1186/s12711-024-00902-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/08/2024] [Indexed: 04/19/2024] Open

Abstract

BACKGROUND

With the introduction of digital phenotyping and high-throughput data, traits that were previously difficult or impossible to measure directly have become easily accessible, offering the opportunity to enhance the efficiency and rate of genetic gain in animal production. It is of interest to assess how behavioral traits are indirectly related to the production traits during the performance testing period. The aim of this study was to assess the quality of behavior data extracted from day-wise video recordings and estimate the genetic parameters of behavior traits and their phenotypic and genetic correlations with production traits in pigs. Behavior was recorded for 70 days after on-test at about 10 weeks of age and ended at off-test for 2008 female purebred pigs, totaling 119,812 day-wise records. Behavior traits included time spent eating, drinking, laterally lying, sternally lying, sitting, standing, and meters of distance traveled. A quality control procedure was created for algorithm training and adjustment, standardizing recording hours, removing culled animals, and filtering unrealistic records.

RESULTS

Production traits included average daily gain (ADG), back fat thickness (BF), and loin depth (LD). Single-trait linear models were used to estimate heritabilities of the behavior traits and two-trait linear models were used to estimate genetic correlations between behavior and production traits. The results indicated that all behavior traits are heritable, with heritability estimates ranging from 0.19 to 0.57, and showed low-to-moderate phenotypic and genetic correlations with production traits. Two-trait linear models were also used to compare traits at different intervals of the recording period. To analyze the redundancies in behavior data during the recording period, the averages of various recording time intervals for the behavior and production traits were compared. Overall, the average of the 55- to 68-day recording interval had the strongest phenotypic and genetic correlation estimates with the production traits.

CONCLUSIONS

Digital phenotyping is a new and low-cost method to record behavior phenotypes, but thorough data cleaning procedures are needed. Evaluating behavioral traits at different time intervals offers a deeper insight into their changes throughout the growth periods and their relationship with production traits, which may be recorded at a less frequent basis.

Hollifield MK, Lourenco D, Misztal I. Estimation of heritability with genomic information by method R. J Anim Breed Genet 2024. [PMID: 38523564 DOI: 10.1111/jbg.12863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 03/26/2024]

Abstract

Estimating heritabilities with large genomic models by established methods such as restricted maximum likelihood (REML) or Bayesian via Gibbs sampling is computationally expensive. Alternatively, heritability can be estimated indirectly by method R and by maximum predictivity, referred to as MaxPred here, at a much lower computing cost. By method R, the heritability used for predictions with whole and partial data is considered the best estimate when the predictions based on partial data are unbiased relative to those with the complete data. By MaxPred, the heritability estimate is the one that maximizes predictivity. This study compared heritability estimation with genomic information using average information REML (AI-REML), method R and MaxPred. A simulated population was generated with ten generations of 5000 animals each and an effective population size of 80. Each animal had one record for a trait with a heritability of 0.3, a phenotypic variance of 10.0 and was genotyped at 50 k SNP. In method R, the heritability estimate is found when the expectation of a regression coefficient is equal to one. The regression is the EBV of selection candidates calculated with the whole dataset regressed on the EBV of candidates calculated from a partial dataset. In this study, we used the GBLUP framework and therefore, GEBV was calculated. The partial dataset was created by removing the last generation of phenotypes. Predictivity was defined as the correlation between the adjusted phenotypes of the selection candidates and their GEBV calculated from the partial data. We estimated the heritability for populations that included between three and 10 generations. In every scenario, predictivity increased as more data was used and was the highest at the simulated heritability. However, the predictivity for all data subsets and all heritabilities compared did not differ more than 0.01, suggesting MaxPred is not the best indication for heritability estimation. 
For the whole dataset, the heritability was estimated as 0.30 ± 0.01, 0.26 ± 0.01 and 0.30 ± 0.04 for AI-REML without genomics, AI-REML with genomics and method R with genomics, respectively. Heritability estimation with genomics by method R reduced timing by 83%, implying a reduction in computing time from 9.5 to 1.6 h, on average, compared to AI-REML with genomics. Method R has the potential to estimate heritabilities with large genomic information at a low cost when many generations of animals are present; however, the standard error can be high when only a few iterations are used.
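
The two quantities this abstract relies on, the method R regression coefficient and predictivity, are simple to compute once breeding values are available. The sketch below assumes the (G)EBV have already been obtained elsewhere for a trial heritability; the function names are hypothetical and the genomic evaluation itself is not shown.

```python
def _centered_sums(x, y):
    """Return (sum of cross products, sum of squares of x, sum of squares of y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy


def method_r_slope(gebv_whole, gebv_partial):
    """Regression of whole-data GEBV on partial-data GEBV for the candidates.

    Under method R as described in the abstract, the heritability estimate
    is the trial value for which this slope is expected to equal one.
    """
    sxy, sxx, _ = _centered_sums(gebv_partial, gebv_whole)
    return sxy / sxx


def predictivity(adjusted_phenotypes, gebv_partial):
    """Correlation between adjusted phenotypes and partial-data GEBV."""
    sxy, sxx, syy = _centered_sums(gebv_partial, adjusted_phenotypes)
    return sxy / (sxx * syy) ** 0.5
```

In practice one would presumably repeat the evaluation over a grid (or a root-finding search) of trial heritabilities and keep the value whose slope is closest to one; that outer loop is omitted here because it depends on the evaluation software.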

Bermann M, Legarra A, Munera AA, Misztal I, Lourenco D. Confidence intervals for validation statistics with data truncation in genomic prediction. Genet Sel Evol 2024;56:18. [PMID: 38459504 DOI: 10.1186/s12711-024-00883-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 01/31/2024] [Indexed: 03/10/2024] Open

Abstract

BACKGROUND

Validation by data truncation is a common practice in genetic evaluations because of the interest in predicting the genetic merit of a set of young selection candidates. Two of the most used validation methods in genetic evaluations use a single data partition: predictivity or predictive ability (correlation between pre-adjusted phenotypes and estimated breeding values (EBV) divided by the square root of the heritability) and the linear regression (LR) method (comparison of "early" and "late" EBV). Both methods compare predictions with the whole dataset and a partial dataset that is obtained by removing the information related to a set of validation individuals. EBV obtained with the partial dataset are compared against adjusted phenotypes for the predictivity or EBV obtained with the whole dataset in the LR method. Confidence intervals for predictivity and the LR method can be obtained by replicating the validation for different samples (or folds), or bootstrapping. Analytical confidence intervals would be beneficial to avoid running several validations and to test the quality of the bootstrap intervals. However, analytical confidence intervals are unavailable for predictivity and the LR method.

RESULTS

We derived standard errors and Wald confidence intervals for the predictivity and statistics included in the LR method (bias, dispersion, ratio of accuracies, and reliability). The confidence intervals for the bias, dispersion, and reliability depend on the relationships and prediction error variances and covariances across the individuals in the validation set. We developed approximations for large datasets that only need the reliabilities of the individuals in the validation set. The confidence intervals for the ratio of accuracies and predictivity were obtained through the Fisher transformation. We show the adequacy of both the analytical and approximated analytical confidence intervals and compare them versus bootstrap confidence intervals using two simulated examples. The analytical confidence intervals were closer to the simulated ones for both examples. Bootstrap confidence intervals tend to be narrower than the simulated ones. The approximated analytical confidence intervals were similar to those obtained by bootstrapping.

CONCLUSIONS

Estimating the sampling variation of predictivity and the statistics in the LR method without replication or bootstrap is possible for any dataset with the formulas presented in this study.
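
As a rough illustration of the statistics named in this abstract, the sketch below computes bias, dispersion, and the ratio of accuracies from "early" (partial) and "late" (whole) EBV, and a Fisher-transformation interval for a correlation. The definitions follow the form commonly used for the LR method, and the interval is the textbook one with standard error 1/sqrt(n - 3), which assumes independent observations. Neither is taken from the paper, whose own standard errors account for relationships and prediction error (co)variances.

```python
import math


def lr_statistics(ebv_partial, ebv_whole):
    """Bias, dispersion and ratio of accuracies (commonly used LR definitions)."""
    n = len(ebv_partial)
    mp = sum(ebv_partial) / n
    mw = sum(ebv_whole) / n
    cov = sum((p - mp) * (w - mw) for p, w in zip(ebv_partial, ebv_whole))
    var_p = sum((p - mp) ** 2 for p in ebv_partial)
    var_w = sum((w - mw) ** 2 for w in ebv_whole)
    return {
        "bias": mp - mw,                                   # mean difference
        "dispersion": cov / var_p,                         # slope of whole on partial
        "ratio_of_accuracies": cov / math.sqrt(var_p * var_w),  # correlation
    }


def fisher_interval(r, n, z_crit=1.959964):
    """Approximate 95% interval for a correlation via the Fisher z transform.

    Textbook version: assumes n independent pairs (an assumption here).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

For example, `fisher_interval(0.5, 103)` gives an interval of roughly 0.34 to 0.63, which shows how the transform keeps the bounds inside (-1, 1) and makes them asymmetric around the estimate.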

Richter J, Hidalgo J, Bussiman F, Breen V, Misztal I, Lourenco D. Temporal dynamics of genetic parameters and SNP effects for performance and disorder traits in poultry undergoing genomic selection. J Anim Sci 2024;102:skae097. [PMID: 38576313 PMCID: PMC11044709 DOI: 10.1093/jas/skae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/03/2024] [Indexed: 04/06/2024] Open

Abstract

Accurate genetic parameters are crucial for predicting breeding values and selection responses in breeding programs. Genetic parameters change with selection, reducing additive genetic variance and changing genetic correlations. This study investigates the dynamic changes in genetic parameters for residual feed intake (RFI), gain (GAIN), breast percentage (BP), and femoral head necrosis (FHN) in a broiler population that undergoes selection, both with and without the use of genomic information. Changes in single nucleotide polymorphism (SNP) effects were also investigated when including genomic information. The dataset containing 200,093 phenotypes for RFI, 42,895 for BP, 203,060 for GAIN, and 63,349 for FHN was obtained from 55 mating groups. The pedigree included 1,252,619 purebred broilers, of which 154,318 were genotyped with a 60K Illumina Chicken SNP BeadChip. A Bayesian approach within the GIBBSF90+ software was applied to estimate the genetic parameters for single-, two-, and four-trait models with sliding time intervals. For all models, we used genomic-based (GEN) and pedigree-based approaches (PED), meaning with or without genotypes. For GEN (PED), heritability varied from 0.19 to 0.2 (0.31 to 0.21) for RFI, 0.18 to 0.11 (0.25 to 0.14) for GAIN, 0.45 to 0.38 (0.61 to 0.47) for BP, and 0.35 to 0.24 (0.53 to 0.28) for FHN, across the intervals. Changes in genetic correlations estimated by GEN (PED) were 0.32 to 0.33 (0.12 to 0.25) for RFI-GAIN, -0.04 to -0.27 (-0.18 to -0.27) for RFI-BP, -0.04 to -0.07 (-0.02 to -0.08) for RFI-FHN, -0.04 to 0.04 (0.06 to 0.2) for GAIN-BP, -0.17 to -0.06 (-0.02 to -0.01) for GAIN-FHN, and 0.02 to 0.07 (0.06 to 0.07) for BP-FHN. Heritabilities tended to decrease over time while genetic correlations showed both increases and decreases depending on the traits.
Similar to heritabilities, correlations between SNP effects declined from 0.78 to 0.2 for RFI, 0.8 to 0.2 for GAIN, 0.73 to 0.16 for BP, and 0.71 to 0.14 for FHN over the eight intervals with genomic information, suggesting potential epistatic interactions affecting genetic trait architecture. Given rapid genetic architecture changes and differing estimates between genomic and pedigree-based approaches, using more recent data and genomic information to estimate variance components is recommended for populations undergoing genomic selection to avoid potential biases in genetic parameters.

Guinan FL, Wiggans GR, Norman HD, Dürr JW, Cole JB, Van Tassell CP, Misztal I, Lourenco D. Corrigendum to "Changes in genetic trends in US dairy cattle since the implementation of genomic selection" (J. Dairy Sci. 106:1110-1129). J Dairy Sci 2023;106:9911. [PMID: 38115381 DOI: 10.3168/jds.2023-106-12-9911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]

McWhorter TM, Sargolzaei M, Sattler CG, Utt MD, Tsuruta S, Misztal I, Lourenco D. Single-step genomic predictions for heat tolerance of production yields in US Holsteins and Jerseys. J Dairy Sci 2023;106:7861-7879. [PMID: 37641276 DOI: 10.3168/jds.2022-23144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 05/08/2023] [Indexed: 08/31/2023]

Abstract

The physiological stress caused by excessive heat affects dairy cattle health and production. This study sought to investigate the effect of heat stress on test-day yields in US Holstein and Jersey cows and develop single-step genomic predictions to identify heat tolerant animals. Data included 12.8 million and 2.1 million test-day records, respectively, for 923,026 Holstein and 153,710 Jersey cows in 27 US states. From 2015 through 2021, test-day records from the first 5 lactations included milk, fat, and protein yields (kg). Cow records were included if they had at least 5 test-day records per lactation. Heat stress was quantified by analyzing the effect of a 5-d hourly average temperature-humidity index (THI5d) on observed test-day yields. Using a multiple trait repeatability model, a heat threshold (THI threshold) was determined for each breed based on the point that the average adjusted yields started to decrease, which was 69 for Holsteins and 72 for Jerseys. An additive genetic component of general production and heat tolerance production were estimated using a multiple trait reaction norm model and single-step genomic BLUP methodology. Random effects were regressed on a function of 5-d hourly average (THI5d) and THI threshold. The proportion of test-day records that occurred on or above the respective heat thresholds was 15% for Holstein and 10% for Jersey. Heritability of milk, fat, and protein yields under heat stress for Holsteins increased, with a small standard error, indicating that the additive genetic component for heat tolerance of these traits was observed. This was not as evident in Jersey traits. For Jersey, the permanent environment explained the same or more of the variation in fat and protein yield under heat stress indicating that nongenetic factors may determine heat tolerance for these Jersey traits.
Correlations between the general genetic merit of production (in the absence of heat stress) and heat tolerance genetic merit of production traits were moderate in strength and negative. This indicated that selecting for general genetic merit without consideration of heat tolerance genetic merit of production may result in less favorable performance in hot and humid climates. A general genomic estimated breeding value for genetic merit and a heat tolerance genomic estimated breeding value were calculated for each animal. This study contributes to the investigation of the impact of heat stress on US dairy cattle production yields and offers a basis for the implementation of genomic selection. The results indicate that genomic selection for heat tolerance of production yields is possible for US Holsteins and Jerseys, but a study to validate the genomic predictions should be explored.
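
The abstract states that random effects were regressed on a function of the 5-d average THI and the breed threshold, without giving the function. A commonly used form in heat-stress reaction norm models is the excess over the threshold, max(0, THI - threshold); the sketch below uses that form as an assumption, together with the thresholds reported above (69 for Holsteins, 72 for Jerseys). Function names are hypothetical.

```python
# Breed thresholds as reported in the abstract.
THI_THRESHOLD = {"Holstein": 69, "Jersey": 72}


def heat_load(thi_5d, breed):
    """Heat-load covariate: THI units above the breed threshold, else zero.

    The max(0, THI - threshold) form is an assumption, not taken from the paper.
    """
    return max(0.0, float(thi_5d) - THI_THRESHOLD[breed])


def share_at_or_above_threshold(thi_values, breed):
    """Proportion of test-day records on or above the breed threshold."""
    threshold = THI_THRESHOLD[breed]
    return sum(1 for thi in thi_values if thi >= threshold) / len(thi_values)
```

With this covariate, a record taken at THI 75 contributes a heat load of 6 for a Holstein and 3 for a Jersey, while any record below the threshold contributes zero and so informs only the general production effect.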

Jang S, Ros-Freixedes R, Hickey JM, Chen CY, Holl J, Herring WO, Misztal I, Lourenco D. Using pre-selected variants from large-scale whole-genome sequence data for single-step genomic predictions in pigs. Genet Sel Evol 2023;55:55. [PMID: 37495982 PMCID: PMC10373252 DOI: 10.1186/s12711-023-00831-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/19/2022] [Accepted: 07/18/2023] [Indexed: 07/28/2023] Open

Abstract

BACKGROUND

Whole-genome sequence (WGS) data harbor causative variants that may not be present in standard single nucleotide polymorphism (SNP) chip data. The objective of this study was to investigate the impact of using preselected variants from WGS for single-step genomic predictions in maternal and terminal pig lines with up to 1.8k sequenced and 104k sequence imputed animals per line.

METHODS

Two maternal and four terminal lines were investigated for eight and seven traits, respectively. The number of sequenced animals ranged from 1365 to 1491 for the maternal lines and 381 to 1865 for the terminal lines. Imputation to sequence occurred within each line for 66k to 76k animals for the maternal lines and 29k to 104k animals for the terminal lines. Two preselected SNP sets were generated based on a genome-wide association study (GWAS). Top40k included the SNPs with the lowest p-value in each of the 40k genomic windows, and ChipPlusSign included significant variants integrated into the porcine SNP chip used for routine genotyping. We compared the performance of single-step genomic predictions between using preselected SNP sets assuming equal or different variances and the standard porcine SNP chip.

RESULTS

In the maternal lines, ChipPlusSign and Top40k showed an average increase in accuracy of 0.6 and 4.9%, respectively, compared to the regular porcine SNP chip. The greatest increase was obtained with Top40k, particularly for fertility traits, for which the initial accuracy based on the standard SNP chip was low. However, in the terminal lines, Top40k resulted in an average loss of accuracy of 1%. ChipPlusSign provided a positive, although small, gain in accuracy (0.9%). Assigning different variances for the SNPs slightly improved accuracies when using variances obtained from BayesR. However, increases were inconsistent across the lines and traits.

CONCLUSIONS

The benefit of using sequence data depends on the line, the size of the genotyped population, and how the WGS variants are preselected. When WGS data are available on hundreds of thousands of animals, using sequence data presents an advantage but this remains limited in pigs.
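
The Top40k idea described under METHODS, keeping the variant with the lowest GWAS p-value in each genomic window, can be sketched as follows. The abstract does not say how the 40k windows were defined, so the fixed-width windows by base-pair position used here, like the function name and the tuple layout, are assumptions for illustration only.

```python
def lowest_p_per_window(snps, window_size):
    """Pick the SNP with the lowest p-value in each fixed-width window.

    snps: iterable of (snp_id, position, p_value) on a single chromosome.
    window_size: window width in the same units as position (assumed fixed).
    Returns SNP ids ordered by window.
    """
    best = {}  # window index -> (snp_id, p_value)
    for snp_id, position, p_value in snps:
        window = position // window_size
        if window not in best or p_value < best[window][1]:
            best[window] = (snp_id, p_value)
    return [best[w][0] for w in sorted(best)]
```

A ChipPlusSign-style set would instead take the union of the routine chip SNPs and the variants passing a significance threshold; that variant is not shown.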

Leite NG, Knol E, Tsuruta S, Nuphaus S, Vogelzang R, Lourenco D. Using social interaction models for genetic analysis of skin damage in gilts. Genet Sel Evol 2023;55:52. [PMID: 37488486 PMCID: PMC10364388 DOI: 10.1186/s12711-023-00816-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 05/31/2023] [Indexed: 07/26/2023] Open

Abstract

BACKGROUND

Skin damage is a trait of economic and welfare importance that results from social interactions between animals. These interactions may produce wound signs on the gilt's skin as a result of damage behavior (i.e., fighting), biting syndromes (i.e., tail, vulva, or ear biting), and swine inflammation and necrosis syndrome. Although current selection for traits that are affected by social interactions primarily focuses on improving direct genetic effects, combined selection on direct and social genetic effects could increase genetic gain and avoid a negative response to selection in cases of competitive behavior. The objectives of this study were to (1) estimate variance components for combined skin damage (CSD), with or without accounting for social genetic effects, (2) investigate the impact of including genomic information on the prediction accuracy, bias, and dispersion of CSD estimated breeding values, and (3) perform a single-step genome-wide association study (ssGWAS) of CSD under a classical and a social interaction model.

RESULTS

Our results show that CSD is heritable and affected by social genetic effects. Modeling CSD with social interaction models increased the total heritable variance relative to the phenotypic variance by three-fold compared to the classical model. Including genomic information increased the prediction accuracy of direct, social, and total estimated breeding values for purebred sires by at least 21.2%. Bias and dispersion of estimated breeding values were reduced by including genomic information in classical and social interaction models but remained present. The ssGWAS did not identify any single nucleotide polymorphism that was significantly associated with social or direct genetic effects for CSD.

CONCLUSIONS

Combined skin damage is heritable, and genetic selection against this trait will increase the welfare of animals in the long term. Combined skin damage is affected by social genetic effects, and modeling this trait with a social interaction model increases the potential for genetic improvement. Including genomic information increases the prediction accuracy of estimated breeding values and reduces their bias and dispersion, although some biases persist. The results of the genome-wide association study indicate that CSD has a polygenic architecture and no major quantitative trait locus was detected.

Jang S, Tsuruta S, Leite NG, Misztal I, Lourenco D. Dimensionality of genomic information and its impact on genome-wide associations and variant selection for genomic prediction: a simulation study. Genet Sel Evol 2023;55:49. [PMID: 37460964 DOI: 10.1186/s12711-023-00823-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/12/2022] [Accepted: 07/03/2023] [Indexed: 07/20/2023] Open

Abstract

BACKGROUND

Identifying true positive variants in genome-wide associations (GWA) depends on several factors, including the number of genotyped individuals. The limited dimensionality of genomic information may give insights into the optimal number of individuals to be used in GWA. This study investigated different discovery set sizes based on the number of largest eigenvalues explaining a certain proportion of variance in the genomic relationship matrix (G). In addition, we investigated the impact on the prediction accuracy by adding variants, which were selected based on different set sizes, to the regular single nucleotide polymorphism (SNP) chips used for genomic prediction.

METHODS

We simulated sequence data that included 500k SNPs with 200 or 2000 quantitative trait nucleotides (QTN). A regular 50k panel included one in every ten simulated SNPs. Effective population size (Ne) was set to 20 or 200. GWA were performed using a number of genotyped animals equivalent to the number of largest eigenvalues of G (EIG) explaining 50, 60, 70, 80, 90, 95, 98, and 99% of the variance. In addition, the largest discovery set consisted of 30k genotyped animals. Limited or extensive phenotypic information was mimicked by changing the trait heritability. Significant and large-effect size SNPs were added to the 50k panel and used for single-step genomic best linear unbiased prediction (ssGBLUP).

RESULTS

Using a number of genotyped animals corresponding to at least EIG98 allowed the identification of QTN with the largest effect sizes when Ne was large. Populations with smaller Ne required more than EIG98. Furthermore, including genotyped animals with a higher reliability (i.e., a higher trait heritability) improved the identification of the most informative QTN. Prediction accuracy was highest when the significant or the large-effect SNPs representing twice the number of simulated QTN were added to the 50k panel.

CONCLUSIONS

Accurately identifying causative variants from sequence data depends on the effective population size and, therefore, on the dimensionality of genomic information. This dimensionality can help identify the most suitable sample size for GWA and could be considered for variant selection, especially when resources are restricted. Even when variants are accurately identified, their inclusion in prediction models has limited benefits.
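
The discovery set sizes in this abstract (EIG50 through EIG99) are the numbers of largest eigenvalues of G needed to explain a given share of its variance. Given eigenvalues that were computed elsewhere, that count is a cumulative sum; the sketch below assumes non-negative eigenvalues and uses a hypothetical function name.

```python
def n_eigenvalues_explaining(eigenvalues, proportion):
    """Number of largest eigenvalues whose sum reaches `proportion` of the total.

    eigenvalues: eigenvalues of the genomic relationship matrix G
                 (assumed non-negative, any order).
    proportion:  target share of variance, e.g. 0.98 for EIG98.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= proportion:
            return count
    return len(ordered)
```

For example, with eigenvalues 5, 3, 1 and 1, two eigenvalues are enough to explain 70% of the variance, while 95% requires all four.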

Steyn Y, Lawlor TJ, Lourenco D, Misztal I. The importance of historically popular sires on the accuracy of genomic predictions of young animals in the US Holstein population. JDS Commun 2023;4:260-264. [PMID: 37521061 PMCID: PMC10382817 DOI: 10.3168/jdsc.2022-0299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Accepted: 01/26/2023] [Indexed: 08/01/2023]

Abstract

The dairy industry is known for its extensive use of artificial insemination, which has resulted in a population where most animals can be traced back to only a few sires. Due to their relatedness to the population, old influential sires could still contribute to the accuracy of genomic predictions. The objective of the study was to identify the impact of historically influential sires on the recent population. This was tested by constructing a genomic relationship matrix using recursion with different sets of sires. Differences in prediction accuracies with different sets are indicative of how important each set is. Recursion coefficients linking young animals to those sets reveal the relative importance of specific sires to the prediction accuracy of recent animals. The data included ∼10 million scores for stature and fore udder attachment (FUA) measured from 1983. Genotypes of 569,404 animals were available. Sire sets included the 100 most popular sires born within different time periods. Computations were with single-step genomic BLUP. In general, the younger sires had higher prediction accuracies than the oldest sires, even though they generally have fewer progeny. The accuracy of evaluation for stature was increased from 0.54 with the most popular sires born before 1981 to 0.69 with sires born from 2001 to 2010, while the accuracy for FUA increased from 0.47 to 0.61. The accuracy achieved using the overall 100 most used sires was 0.66 for stature and 0.58 for FUA. All 100 sires from each period were combined in a subset to determine the importance of each sire relative to all 400 animals in the combined subset. The highest relative impact of a sire that was born within the different time sets was 1.97 for Valiant (before 1981), 1.94 for Blackstar (1981 to 1990), 4.38 for Shottle (1991 to 2000), and 3.09 for Planet (2001 to 2010). The 3 sires among the 400 with the greatest impact were Shottle, Goldwyn (3.73), and Planet. 
The relative impact of a sire was not strongly related to the number of progeny. For instance, the relative impact of Durham with 34K progeny was 2.29, whereas the impact of O Man with 15K progeny was 3.13. The impact of a sire is also influenced by whether it was used as a sire of sires. Results show that younger sires are more relevant to the accuracy of breeding value prediction in the recent population.

Hidalgo J, Lourenco D, Tsuruta S, Bermann M, Breen V, Herring W, Misztal I. Efficient ways to combine data from broiler and layer chickens to account for sequential genomic selection. J Anim Sci 2023:7186226. [PMID: 37249185 DOI: 10.1093/jas/skad177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Indexed: 05/31/2023] Open

Abstract

In broiler breeding, superior individuals for growth become parents and are later evaluated for reproduction in an independent evaluation; however, ignoring broiler data can produce inaccurate and biased predictions. This research aimed to determine the most accurate, unbiased, and time-efficient approach for jointly evaluating reproductive and broiler traits. The data comprised a pedigree with 577K birds, 146K genotypes, phenotypes for three reproductive [egg production (EP), fertility (FE), hatch of fertile eggs (HF); 9K each], and four broiler traits [body weight (BW), breast meat percent (BP), fat percent (FP), residual feed intake (RF); up to 467K]. Broiler data were added sequentially to assess the impact on the quality of predictions for reproductive traits. The baseline scenario (RE) included pedigrees, genotypes, and phenotypes for reproductive traits of selected animals; in RE2, we added their broiler phenotypes; in RE_BR, broiler phenotypes of non-selected animals, and in RE_BR_GE, their genotypes. We computed accuracy, bias, and dispersion of predictions for hens from the last two breeding cycles and their sires. We tested three core definitions for the algorithm of proven and young to find the most time-efficient approach: two random cores with 7K and 12K animals and one with 19K animals, containing parents and young animals. From RE to RE_BR_GE, changes in accuracy were null or minimal for EP (0.51 in hens, 0.59 in roosters) and HF (0.47 in hens, 0.49 in roosters); for FE in hens (roosters), it changed from 0.4 (0.49) to 0.47 (0.53). In hens (roosters), bias (additive SD units) decreased from 0.69 (0.7) to 0.04 (0.05) for EP, 1.48 (1.44) to 0.11 (0.03) for FE, and 1.06 (0.96) to 0.09 (0.02) for HF. Dispersion remained stable in hens (roosters) at ~ 0.93 (~ 1.03) for EP, and it improved from 0.57 (0.72) to 0.87 (1.0) for FE and from 0.8 (0.79) to 0.88 (0.87) for HF. Ignoring broiler data deteriorated the predictions' quality. 
The impact was significant for the low heritability trait (0.02; FE); bias (up to 1.5) and dispersion (as low as 0.57) were farther from the ideal value, and accuracy losses were up to 17.5%. Accuracy was maintained in traits with moderate heritability (~ 0.3; EP and HF), and bias and dispersion were less substantial. Adding information from the broiler phase maximized accuracy and unbiased predictions. The most time-efficient approach is a random core with 7K animals in the algorithm for proven and young.

Jang S, Ros-Freixedes R, Hickey JM, Chen CY, Herring WO, Holl J, Misztal I, Lourenco D. Multi-line ssGBLUP evaluation using preselected markers from whole-genome sequence data in pigs. Front Genet 2023;14:1163626. [PMID: 37252662 PMCID: PMC10213539 DOI: 10.3389/fgene.2023.1163626] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/11/2023] [Accepted: 05/03/2023] [Indexed: 05/31/2023] Open

Abstract

Genomic evaluations in pigs could benefit from using multi-line data along with whole-genome sequencing (WGS) if the data are large enough to represent the variability across populations. The objective of this study was to investigate strategies to combine large-scale data from different terminal pig lines in a multi-line genomic evaluation (MLE) through single-step GBLUP (ssGBLUP) models while including variants preselected from whole-genome sequence (WGS) data. We investigated single-line and multi-line evaluations for five traits recorded in three terminal lines. The number of sequenced animals in each line ranged from 731 to 1,865, with 60k to 104k imputed to WGS. Unknown parent groups (UPG) and metafounders (MF) were explored to account for genetic differences among the lines and improve the compatibility between pedigree and genomic relationships in the MLE. Sequence variants were preselected based on multi-line genome-wide association studies (GWAS) or linkage disequilibrium (LD) pruning. These preselected variant sets were used for ssGBLUP predictions without and with weights from BayesR, and the performances were compared to that of a commercial porcine single-nucleotide polymorphisms (SNP) chip. Using UPG and MF in MLE showed small to no gain in prediction accuracy (up to 0.02), depending on the lines and traits, compared to the single-line genomic evaluation (SLE). Likewise, adding selected variants from the GWAS to the commercial SNP chip resulted in a maximum increase of 0.02 in the prediction accuracy, only for average daily feed intake in the most numerous lines. In addition, no benefits were observed when using preselected sequence variants in multi-line genomic predictions. Weights from BayesR did not help improve the performance of ssGBLUP. This study revealed limited benefits of using preselected whole-genome sequence variants for multi-line genomic predictions, even when tens of thousands of animals had imputed sequence data. 
Correctly accounting for line differences with UPG or MF in MLE is essential to obtain predictions similar to SLE; however, the only observed benefit of an MLE is to have comparable predictions across lines. Further investigation into the amount of data and novel methods to preselect whole-genome causative variants in combined populations would be of significant interest.

Garcia A, Aguilar I, Legarra A, Tsuruta S, Misztal I, Lourenco D. Correction: Theoretical accuracy for indirect predictions based on SNP effects from single-step GBLUP. Genet Sel Evol 2023;55:26. [PMID: 37069505 PMCID: PMC10108542 DOI: 10.1186/s12711-023-00799-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2023] Open

Cesarani A, Bermann M, Dimauro C, Degano L, Vicario D, Lourenco D, Macciotta NPP. Strategies for choosing core animals in the algorithm for proven and young and their impact on the accuracy of single-step genomic predictions in cattle. Animal 2023;17:100766. [PMID: 37001441 DOI: 10.1016/j.animal.2023.100766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 02/24/2023] [Accepted: 02/28/2023] [Indexed: 03/16/2023] Open

Abstract

Nowadays, in some populations, the number of genotyped animals is too large to obtain the inverse of the genomic relationship matrix. The algorithm for proven and young animals (APY) can be used to overcome this problem. In the present work, different strategies for defining core animals in APY were tested using either simulated or real data. In particular, core definitions based on random choice or on the contribution to the genomic relationship matrix (G_CONTR) calculated using Principal Component Analysis were tested. Core sizes able to explain 90, 95, 98, and 99% of the total variance of the genomic relationship matrix (G) were used. Analyzed phenotypes were three simulated traits for 3 000 individuals, and milkability records for 136 406 Italian Simmental cows. The number of genotypes was 4 100 for the simulated dataset, and 11 636 for the Simmental data, respectively. The G_CONTR values in the Simmental dataset were moderately correlated with the analyzed phenotype, and they showed a decreasing trend according to the year of birth of genotyped animals. The accuracy increased as the size of the core increased in both datasets. The inclusion in the core of animals with the largest G_CONTR values led to the lowest accuracies (0.50 and 0.71 for the simulated and Simmental datasets, respectively; average across traits and core sizes). On the contrary, the selection of animals with the lowest rank according to their contribution to the G provided slightly higher accuracies, especially in the simulated dataset (0.68 for the simulated dataset, and 0.76 for the Simmental data; average across traits and core sizes). In real data, particularly for larger sizes of core animals, the criteria of choice appear less important, confirming the results of earlier studies. Nevertheless, the inclusion in the core of animals with the lowest values of G_CONTR led to increases in accuracy. 
These are preliminary results based on a small sample size that need to be confirmed on a larger number of genotypes.
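
The abstract above sizes the APY core by the share of the total variance of G that it explains (90, 95, 98, and 99%). A minimal Python sketch of that counting step follows. It assumes the core size is taken as the number of largest eigenvalues of G whose cumulative sum reaches the target share; the function name `n_eigen_for_variance` and the toy matrix are hypothetical and are not taken from the article.

```python
import numpy as np

def n_eigen_for_variance(G, fraction):
    """Count the largest eigenvalues of a symmetric matrix G needed to
    explain at least `fraction` of its total variance (sum of eigenvalues)."""
    eigvals = np.linalg.eigvalsh(G)[::-1]        # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)        # guard against tiny negative round-off
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(eigvals))

# Toy example (hypothetical): eigenvalues 5, 3, 1, 1, so the cumulative
# shares are 0.5, 0.8, 0.9, 1.0.
G_toy = np.diag([5.0, 3.0, 1.0, 1.0])
core_sizes = {f: n_eigen_for_variance(G_toy, f) for f in (0.75, 0.85, 0.95)}
print(core_sizes)
```

For a real genomic relationship matrix the same call would be made once per target share, and the resulting counts used as candidate core sizes.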

Leite NG, Knol EF, Nuphaus S, Vogelzang R, Tsuruta S, Wittmann M, Lourenco D. The genetic basis of swine inflammation and necrosis syndrome and its genetic association with post-weaning skin damage and production traits. J Anim Sci 2023;101:7067130. [PMID: 36860185 PMCID: PMC10050931 DOI: 10.1093/jas/skad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 02/27/2023] [Indexed: 03/03/2023] Open

Abstract

The swine inflammation and necrosis syndrome (SINS) is a syndrome visually characterized by the presence of inflamed and necrotic skin at extreme body parts, such as the teats, tail, ears, and claw coronary bands. This syndrome is associated with several environmental causes, but knowledge of the role of genetics is still limited. Moreover, piglets affected by SINS are believed to be phenotypically more susceptible to chewing and biting behaviors from pen mates, which could cause a chronic reduction in their welfare throughout the production process. Our objectives were to 1) investigate the genetic basis of SINS expressed on piglets' different body parts and 2) estimate SINS genetic relationship with post-weaning skin damage and pre- and post-weaning production traits. A total of 5,960 two to three-day-old piglets were scored for SINS on the teats, claws, tails, and ears as a binary phenotype. Later, those binary records were combined into a trait defined as TOTAL_SINS. For TOTAL_SINS, animals presenting no signs of SINS were scored as 1, whereas animals showing at least one affected part were scored as 2. Apart from SINS traits, piglets had their birth weight (BW) and weaning weight (WW) recorded, and up to 4,132 piglets were later evaluated for combined skin damage (CSD), carcass backfat (BF), and loin depth (LOD). In the first set of analyses, the heritability of SINS on different body parts was estimated with single-trait animal-maternal models, and pairwise genetic correlations between body parts were obtained from two-trait models. Later, we used four three-trait animal models with TOTAL_SINS, CSD, and an alternative production trait (i.e., BW, WW, LOD, BF) to assess trait heritabilities and genetic correlations between SINS and production traits. The maternal effect was included in the BW, WW, and TOTAL_SINS models. 
The direct heritability of SINS on different body parts ranged from 0.08 to 0.34, indicating that reducing SINS incidence through genetic selection is feasible. The direct genetic correlation between TOTAL_SINS and pre-weaning growth traits (BW and WW) was favorable and negative (from -0.40 to -0.30), indicating that selection for animals genetically less prone to present signs of SINS will positively affect the piglet's genetics for heavier weight at birth and weaning. The genetic correlations between TOTAL_SINS and BF and between TOTAL_SINS and LOD were weak or not significant (-0.16 to 0.05). However, the selection against SINS was shown to be genetically correlated with CSD, with estimates ranging from 0.19 to 0.50. That means that piglets genetically less likely to present SINS signs are also less likely to suffer CSD after weaning, having a long-term increase in their welfare throughout the production system.

Steyn Y, Lawlor T, Masuda Y, Tsuruta S, Legarra A, Lourenco D, Misztal I. Nonparallel genome changes within subpopulations over time contributed to genetic diversity within the US Holstein population. J Dairy Sci 2023;106:2551-2572. [PMID: 36797192 DOI: 10.3168/jds.2022-21914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 10/03/2022] [Indexed: 02/16/2023]

Abstract

Maintaining genetic variation in a population is important for long-term genetic gain. The existence of subpopulations within a breed helps maintain genetic variation and diversity. The 20,990 genotyped animals, representing the breeding animals in the year 2014, were identified as the sires of animals born after 2010 with at least 25 progenies, and females measured for type traits within the last 2 yr of data. K-means clustering with 5 clusters (C1, C2, C3, C4, and C5) was applied to the genomic relationship matrix based on 58,990 SNP markers to stratify the selected candidates into subpopulations. The general higher inbreeding resulting from within-cluster mating than across-cluster mating suggests the successful stratification into genetically different groups. The largest cluster (C4) contained animals that were less related to each animal within and across clusters. The average fixation index was 0.03, indicating that the populations were differentiated, and allele differences across the subpopulations were not due to drift alone. Starting with the selected candidates within each cluster, a family unit was identified by tracing back through the pedigree, identifying the genotyped ancestors, and assigning them to a pseudogeneration. Each of the 5 families (F1, F2, F3, F4, and F5) was traced back for 10 generations, allowing for changes in frequency of individual SNPs over time to be observed, which we call allele frequencies change. Alternative procedures were used to identify SNPs changing in a parallel or nonparallel way across families. For example, markers that have changed the most in the whole population, markers that have changed differently across families, and genes previously identified as those that have changed in allele frequency. The genomic trajectory taken by each family involves selective sweeps, polygenic changes, hitchhiking, and epistasis. 
The replicate frequency spectrum was used to measure the similarity of change across families and showed that populations have changed differently. The proportion of markers that reversed direction in allele frequency change varied from 0.00 to 0.02 if the rate of change was greater than 0.02 per generation, or from 0.14 to 0.24 if the rate of change was greater than 0.005 per generation within each family. Cluster-specific SNP effects for stature were estimated using only females and applied to obtain indirect genomic predictions for males. Reranking occurs depending on SNP effects used. Additive genetic correlations between clusters show possible differences in populations. Further research is required to determine how this knowledge can be applied to maintain diversity and optimize selection decisions in the future.

Garcia A, Tsuruta S, Gao G, Palti Y, Lourenco D, Leeds T. Genomic selection models substantially improve the accuracy of genetic merit predictions for fillet yield and body weight in rainbow trout using a multi-trait model and multi-generation progeny testing. Genet Sel Evol 2023;55:11. [PMID: 36759760 PMCID: PMC9912574 DOI: 10.1186/s12711-023-00782-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 01/16/2023] [Indexed: 02/11/2023] Open

Abstract

BACKGROUND

In aquaculture, the proportion of edible meat (FY = fillet yield) is of major economic importance, and breeding animals of superior genetic merit for this trait can improve efficiency and profitability. Achieving genetic gains for fillet yield is possible using a pedigree-based best linear unbiased prediction (PBLUP) model with direct and indirect selection. To investigate the feasibility of using genomic selection (GS) to improve FY and body weight (BW) in rainbow trout, the prediction accuracy of GS models was compared to that of PBLUP. In addition, a genome-wide association study (GWAS) was conducted to identify quantitative trait loci (QTL) for the traits. All analyses were performed using a two-trait model with FY and BW, and variance components, heritability, and genetic correlations were estimated without genomic information. The data used included 14,165 fish in the pedigree, of which 2742 and 12,890 had FY and BW phenotypic records, respectively, and 2484 had genotypes from the 57K single nucleotide polymorphism (SNP) array.

RESULTS

The heritabilities were moderate, at 0.41 and 0.33 for FY and BW, respectively. Both traits were lowly but positively correlated (genetic correlation; r = 0.24), which suggests potential favourable correlated genetic gains. GS models increased prediction accuracy compared to PBLUP by up to 50% for FY and 44% for BW. Evaluations were found to be biased when validation was performed on future performances but not when it was performed on future genomic estimated breeding values.

CONCLUSIONS

The low but positive genetic correlation between fillet yield and body weight indicates that some improvement in fillet yield may be achieved through indirect selection for body weight. Genomic information increases the prediction accuracy of breeding values and is an important tool to accelerate genetic progress for fillet yield and growth in the current rainbow trout population. No significant QTL were found for either trait, indicating that both traits are polygenic, and that marker-assisted selection will not be helpful to improve these traits in this population.

Guinan FL, Wiggans GR, Norman HD, Dürr JW, Cole JB, Van Tassell CP, Misztal I, Lourenco D. Changes in genetic trends in US dairy cattle since the implementation of genomic selection. J Dairy Sci 2023;106:1110-1129. [PMID: 36494224 DOI: 10.3168/jds.2022-22205] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 04/16/2022] [Accepted: 09/06/2022] [Indexed: 12/12/2022]

Abstract

Genomic selection increases accuracy and decreases generation interval, accelerating genetic changes in populations. Assumptions of genetic improvement must be addressed to quantify the magnitude and direction of change. Genetic trends of US dairy cattle breeds were examined to determine the genetic gain since the implementation of genomic evaluations in 2009. Inbreeding levels and generation intervals were also investigated. Breeds included Ayrshire, Brown Swiss, Guernsey, Holstein (HO), and Jersey (JE), which were characterized by the evaluation breed the animal received. Mean genomic predicted breeding values ($\overline{\mathrm{PBV}}$) were analyzed per year to calculate genetic trends for bulls and cows. The data set contained 154,008 bulls and 33,022,242 cows born since 1975. Breakpoints were estimated using linear regression, and nonlinear regression was used to fit the piecewise model for the small sample number in some years. Generation intervals and inbreeding levels were also investigated since 1975. Milk, fat, and protein yields, somatic cell score, productive life, daughter pregnancy rate, and livability $\overline{\mathrm{PBV}}$ were documented. In 2017, 100% of bulls in this data set were genotyped. The percentage of genotyped cows has increased 23 percentage points since 2010. Overall, production traits have increased steadily over time, as expected. The HO and JE breeds have benefited most from genomics, with up to 192% increase in genetic gain since 2009. Due to the low number of observations, trends for Ayrshire, Brown Swiss, and Guernsey are difficult to infer. Trends in fertility are most substantial; particularly, most breeds are trending downwards and daughter pregnancy rate for JE has been decreasing steadily since 1975 for bulls and cows. Levels of genomic inbreeding are increasing in HO bulls and cows. In 2017, genomic inbreeding levels were 12.7% for bulls and 7.9% for cows. 
A suggestion to control this is to include the genomic inbreeding coefficient with a negative weight to the selection index of bulls with high future genomic inbreeding levels. For sires of bulls, the current generation intervals are 2.2 yr in HO, 3.2 in JE, 4.4 in Brown Swiss, 5.1 in Ayrshire, and 4.3 in Guernsey. The number of colored breed bulls in the United States is currently at an extremely low level, and this number will only increase with a market incentive or additional breed association involvement. Increased education and extension could be beneficial to increase knowledge about inbreeding levels, use of genomics and genetic improvement, and genetic diversity in the genomic selection era.

Bermann M, Aguilar I, Lourenco D, Misztal I, Legarra A. Reliabilities of estimated breeding values in models with metafounders. Genet Sel Evol 2023;55:6. [PMID: 36690938 PMCID: PMC9869531 DOI: 10.1186/s12711-023-00778-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/29/2022] [Accepted: 01/04/2023] [Indexed: 01/24/2023] Open

Abstract

BACKGROUND

Reliabilities of best linear unbiased predictions (BLUP) of breeding values are defined as the squared correlation between true and estimated breeding values and are helpful in assessing risk and genetic gain. Reliabilities can be computed from the prediction error variances for models with a single base population but are undefined for models that include several base populations and when unknown parent groups are modeled as fixed effects. In such a case, the use of metafounders in principle enables reliabilities to be derived.

METHODS

We propose to compute the reliability of the contrast of an individual's estimated breeding value with that of a metafounder based on the prediction error variances of the individual and the metafounder, their prediction error covariance, and their genetic relationship. Computation of the required terms demands only little extra work once the sparse inverse of the mixed model equations is obtained, or they can be approximated. This also allows the reliabilities of the metafounders to be obtained. We studied the reliabilities for both BLUP and single-step genomic BLUP (ssGBLUP), using several definitions of reliability in a large dataset with 1,961,687 dairy sheep and rams, most of which had phenotypes and among which 27,000 rams were genotyped with a 50K single nucleotide polymorphism (SNP) chip. There were 23 metafounders with progeny sizes between 100,000 and 2000 individuals.

RESULTS

In models with metafounders, directly using the prediction error variance instead of the contrast with a metafounder leads to artificially low reliabilities because they refer to a population with maximum heterozygosity. When only one metafounder is fitted in the model, the reliability of the contrast is shown to be equivalent to the reliability of the individual in a model without metafounders. When there are several metafounders in the model, using a contrast with the oldest metafounder yields reliabilities that are on a meaningful scale and very close to reliabilities obtained from models without metafounders. The reliabilities using contrasts with ssGBLUP also resulted in meaningful values.

CONCLUSIONS

This work provides a general method to obtain reliabilities for both BLUP and ssGBLUP when several base populations are included through metafounders.
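
The METHODS paragraph above lists the ingredients of the contrast reliability (the two prediction error variances, their prediction error covariance, and the genetic relationship) without stating a formula. The expression below is a sketch of the generic reliability of a BLUP contrast built from those ingredients; it is an assumption for illustration, and the article's exact definitions may differ. Here $\hat{u}_i$ and $\hat{u}_m$ are the estimated breeding values of individual $i$ and metafounder $m$, $\mathrm{PEV}$ and $\mathrm{PEC}$ are prediction error (co)variances, $a_{ii}$, $a_{mm}$, and $a_{im}$ are elements of the relationship matrix, and $\sigma_u^2$ is the additive genetic variance.

```latex
$$
\mathrm{rel}\left(\hat{u}_i - \hat{u}_m\right)
  = 1 - \frac{\mathrm{PEV}_i + \mathrm{PEV}_m - 2\,\mathrm{PEC}_{i,m}}
             {\left(a_{ii} + a_{mm} - 2\,a_{im}\right)\sigma_u^2}
$$
```

The numerator is the prediction error variance of the contrast and the denominator is its prior genetic variance, so the expression has the usual form of one minus PEV over genetic variance.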

Hidalgo J, Lourenco D, Tsuruta S, Bermann M, Breen V, Misztal I. Derivation of indirect predictions using genomic recursions across generations in a broiler population. J Anim Sci 2023;101:skad355. [PMID: 37837636 PMCID: PMC10630029 DOI: 10.1093/jas/skad355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 10/12/2023] [Indexed: 10/16/2023] Open

Abstract

Genomic estimated breeding values (GEBV) of animals without phenotypes can be indirectly predicted using recursions on GEBV of a subset. To maximize predictive ability of indirect predictions (IP), the subset must represent the independent chromosome segments segregating in the population. We aimed to 1) determine the number of animals needed in recursions to maximize predictive ability, 2) evaluate equivalency IP-GEBV, and 3) investigate trends in predictive ability of IP derived from recent vs. distant generations or accumulating phenotypes from recent to past generations. Data comprised pedigree of 825K birds hatched over 12 overlapping generations, phenotypes for body weight (BW; 820K), residual feed intake (RF; 200K) and weight gain during a trial period (WG; 200K), and breast meat percent (BP; 43K). A total of 154K birds (last six generations) had genotypes. The number of animals that maximize predictive ability was assessed based on the number of largest eigenvalues explaining 99% of variation in the genomic relationship matrix (1Me = 7,131), twice (2Me), or a fraction of this number (i.e., 0.75, 0.50, or 0.25Me). Equivalency between IP and GEBV was measured by correlating these two sets of predictions. GEBV were obtained as if generation 12 (validation animals) was part of the evaluation. IP were derived from GEBV of animals from generations 8 to 11 or generations 11, 10, 9, or 8. IP predictive ability was defined as the correlation between IP and adjusted phenotypes. The IP predictive ability increased from 0.25Me to 1Me (11%, on average); the change from 1Me to 2Me was negligible (0.6%). The correlation IP-GEBV was the same when IP were derived from a subset of 1Me animals chosen randomly across generations (8 to 11) or from generation 11 (0.98 for BW, 0.99 for RF, WG, and BP). A marginal decline in the correlation was observed when IP were based on GEBV of animals from generation 8 (0.95 for BW, 0.98 for RF, WG, and BP). 
Predictive ability had a similar trend; from generation 11 to 8, it changed from 0.32 to 0.31 for BW, from 0.39 to 0.38 for BP, and was constant at 0.33(0.22) for RF(WG). Predictive ability had a slight to moderate increase accumulating up to four generations of phenotypes. 1Me animals provide accurate IP, equivalent to GEBV. A minimum decay in predictive ability is observed when IP are derived from GEBV of animals from four generations back, possibly because of strong selection or the model not being completely additive.
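
The abstract above derives indirect predictions (IP) by recursion on the GEBV of a subset of animals. A minimal Python sketch of one such recursion follows, assuming the APY-style form IP = G_tc G_cc^{-1} u_c, where "c" indexes the subset and "t" the target animals. The function name `indirect_predictions` and the toy marker matrix are hypothetical; the article's actual pipeline is not shown in the abstract.

```python
import numpy as np

def indirect_predictions(G, gebv_core, core_idx, target_idx):
    """Predict target animals from the GEBV of a core subset by recursion:
    IP = G_tc @ inv(G_cc) @ u_c (assumed APY-style recursion)."""
    G_cc = G[np.ix_(core_idx, core_idx)]      # relationships within the core
    G_tc = G[np.ix_(target_idx, core_idx)]    # relationships target x core
    return G_tc @ np.linalg.solve(G_cc, gebv_core)

# Toy data (hypothetical): 5 animals, 3 markers, purely additive marker effects.
W = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])
snp_effects = np.array([0.5, -1.0, 0.25])
G = W @ W.T                 # unscaled genomic relationships for the toy case
gebv = W @ snp_effects      # GEBV as the sum of marker effects

core_idx, target_idx = [0, 1, 2], [3, 4]
ip = indirect_predictions(G, gebv[core_idx], core_idx, target_idx)
print(ip, gebv[target_idx])
```

In this toy case the three core animals span all three marker dimensions, so the recursion reproduces the target GEBV. This loosely mirrors the abstract's point that the subset must represent the independent chromosome segments segregating in the population.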

Bussiman F, Chen CY, Holl J, Bermann M, Legarra A, Misztal I, Lourenco D. Boundaries for genotype, phenotype, and pedigree truncation in genomic evaluations in pigs. J Anim Sci 2023;101:skad273. [PMID: 37584978 PMCID: PMC10464514 DOI: 10.1093/jas/skad273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 08/10/2023] [Indexed: 08/17/2023] Open

Abstract

Historical data collection for genetic evaluation purposes is a common practice in animal populations; however, the larger the dataset, the higher the computing power needed to perform the analyses. Also, fitting the same model to historical and recent data may be inappropriate. Data truncation can reduce the number of equations to solve, consequently decreasing computing costs; however, the large volume of genotypes is responsible for most of the increase in computations. This study aimed to assess the impact of removing genotypes along with phenotypes and pedigree on the computing performance, reliability, and inflation of genomic predicted breeding value (GEBV) from single-step genomic best linear unbiased predictor for selection candidates. Data from two pig lines, a terminal sire (L1) and a maternal line (L2), were analyzed in this study. Four analyses were implemented: growth and "weaning to finish" mortality on L1, pre-weaning and reproductive traits on L2. Four genotype removal scenarios were proposed: removing genotyped animals without phenotypes and progeny (noInfo), removing genotyped animals based on birth year (Age), the combination of noInfo and Age scenarios (noInfo + Age), and no genotype removal (AllGen). In all scenarios, phenotypes were removed, based on birth year, and three pedigree depths were tested: two and three generations traced back and using the entire pedigree. The full dataset contained 1,452,257 phenotypes for growth traits, 324,397 for weaning to finish mortality, 517,446 for pre-weaning traits, and 7,853,629 for reproductive traits in pure and crossbred pigs. Pedigree files for lines L1 and L2 comprised 3,601,369 and 11,240,865 animals, of which 168,734 and 170,121 were genotyped, respectively. In each truncation scenario, the linear regression method was used to assess the reliability and dispersion of GEBV for genotyped parents (born after 2019). 
The number of years of data that could be removed without harming reliability depended on the number of records, type of analyses (multitrait vs. single trait), the heritability of the trait, and data structure. All scenarios had similar reliabilities, except for noInfo, which performed better in the growth analysis. Based on the data used in this study, considering the last ten years of phenotypes, tracing three generations back in the pedigree, and removing genotyped animals not contributing own or progeny phenotypes, increases computing efficiency with no change in the ability to predict breeding values.
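
The abstract above uses the linear regression (LR) method to compare predictions across truncation scenarios. A minimal Python sketch of the commonly used LR bias and dispersion statistics follows, assuming "partial" means GEBV from the reduced data and "whole" means GEBV from the complete data. The function name `lr_statistics` is hypothetical. The LR reliability estimator also needs the genetic variance and average inbreeding, so it is left out here.

```python
import numpy as np

def lr_statistics(ebv_partial, ebv_whole):
    """Bias, dispersion, and correlation between partial- and whole-data
    predictions for the same validation animals (LR-method style)."""
    p = np.asarray(ebv_partial, dtype=float)
    w = np.asarray(ebv_whole, dtype=float)
    cov_wp = np.cov(w, p, ddof=1)[0, 1]
    return {
        "bias": p.mean() - w.mean(),             # expected value 0 if unbiased
        "dispersion": cov_wp / p.var(ddof=1),    # slope of whole on partial; 1 is ideal
        "correlation": np.corrcoef(w, p)[0, 1],  # consistency between the two runs
    }

# Toy check (hypothetical numbers): whole = 2 * partial + 1.
partial = np.array([0.0, 1.0, 2.0, 3.0])
whole = 2.0 * partial + 1.0
stats = lr_statistics(partial, whole)
print(stats)
```

A dispersion slope below 1 is usually read as inflated partial predictions and a slope above 1 as deflated ones.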

McWhorter TM, Bermann M, Garcia ALS, Legarra A, Aguilar I, Misztal I, Lourenco D. Implication of the order of blending and tuning when computing the genomic relationship matrix in single-step GBLUP. J Anim Breed Genet 2023;140:60-78. [PMID: 35946919 PMCID: PMC10087221 DOI: 10.1111/jbg.12734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 07/12/2022] [Indexed: 12/13/2022]

Abstract

Single-step genomic BLUP (ssGBLUP) relies on the combination of the genomic ($\mathbf{G}$) and pedigree relationship matrices for all ($\mathbf{A}$) and genotyped ($\mathbf{A}_{22}$) animals. The procedure ensures $\mathbf{G}$ and $\mathbf{A}_{22}$ are compatible so that both matrices refer to the same genetic base ('tuning'). Then $\mathbf{G}$ is combined with a proportion of $\mathbf{A}_{22}$ ('blending') to avoid singularity problems and to account for the polygenic component not accounted for by markers. This computational procedure has been implemented in the reverse order (blending before tuning) following the sequential research developments. However, blending before tuning may result in less optimal tuning because the blended matrix already contains a proportion of $\mathbf{A}_{22}$. In this study, the impact of 'tuning before blending' was compared with 'blending before tuning' on genomic estimated breeding values (GEBV), single nucleotide polymorphism (SNP) effects and indirect predictions (IP) from ssGBLUP using American Angus Association and Holstein Association USA, Inc. data. Two slightly different tuning methods were used; one that adjusts the mean diagonals and off-diagonals of $\mathbf{G}$ to be similar to those in $\mathbf{A}_{22}$ and another one that adjusts based on the average difference between all elements of $\mathbf{G}$ and $\mathbf{A}_{22}$. Over 6 million Angus growth records and 5.9 million Holstein udder depth records were available. Genomic information was available on 51,478 Angus and 105,116 Holstein animals. Average realized relationship estimates among groups of animals were similar across scenarios. Scatterplots show that GEBV, SNP effects and IP did not noticeably change for all animals in the evaluation regardless of the order of computations and when using blending parameter of 0.05. 
Formulas were derived to determine the blending parameter that maximizes changes in the genomic relationship matrix and GEBV when changing the order of blending and tuning. Algebraically, the change is maximized when the blending parameter is equal to 0.5. Overall, tuning $\mathbf{G}$ before blending, regardless of blending parameter used, had a negligible impact on genomic predictions and SNP effects in this study.
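
The effect of the order of operations can be sketched on a toy example. The code below is an illustration under stated assumptions, not the authors' implementation. Blending is taken as (1 - β)G + βA22. Tuning is assumed to take the form G* = α + (1 - α/2)G, with α equal to the mean of all elements of A22 minus the mean of all elements of G; this form is one plausible reading of the "average difference" method and is not confirmed by the abstract. The matrices and function names are hypothetical. Under this assumed rule, the two orders differ by β(1 - β)(α/2)(A22 - G), which peaks at β = 0.5, in line with the abstract's statement.

```python
def mean_all(m):
    """Mean of all elements of a square matrix stored as nested lists."""
    values = [x for row in m for x in row]
    return sum(values) / len(values)


def tune(g, a22):
    """Assumed tuning rule: G* = alpha + (1 - alpha / 2) * G."""
    alpha = mean_all(a22) - mean_all(g)
    scale = 1.0 - alpha / 2.0
    return [[alpha + scale * x for x in row] for row in g]


def blend(g, a22, beta):
    """Blending: (1 - beta) * G + beta * A22, element by element."""
    return [
        [(1.0 - beta) * gij + beta * aij for gij, aij in zip(g_row, a_row)]
        for g_row, a_row in zip(g, a22)
    ]


def max_abs_diff(x, y):
    """Largest absolute element-wise difference between two matrices."""
    return max(
        abs(xij - yij)
        for x_row, y_row in zip(x, y)
        for xij, yij in zip(x_row, y_row)
    )


def order_gap(g, a22, beta):
    """Largest element-wise change caused by swapping the order."""
    tuned_then_blended = blend(tune(g, a22), a22, beta)
    blended_then_tuned = tune(blend(g, a22, beta), a22)
    return max_abs_diff(tuned_then_blended, blended_then_tuned)


# Toy relationship matrices for three animals (hypothetical values).
G = [[1.0, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.2, 1.1]]
A22 = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.25], [0.25, 0.25, 1.0]]

for beta in (0.05, 0.25, 0.5, 0.75):
    print(beta, order_gap(G, A22, beta))
```

With these toy matrices the gap is small at β = 0.05 and largest at β = 0.5.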

Leite NG, Chen CY, Herring WO, Holl J, Tsuruta S, Lourenco D. Leveraging low-density crossbred genotypes to offset crossbred phenotypes and their impact on purebred predictions. J Anim Sci 2022;100:6780296. [PMID: 36309902 PMCID: PMC9733505 DOI: 10.1093/jas/skac359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 10/27/2022] [Indexed: 12/15/2022] Open

Abstract

The objectives of this study were to 1) investigate the predictability and bias of genomic breeding values (GEBV) of purebred (PB) sires for crossbred (CB) performance when CB genotypes imputed from a low-density panel are available, 2) assess if the availability of those CB genotypes can be used to partially offset CB phenotypic recording, and 3) investigate the impact of including imputed CB genotypes in genomic analyses when using the algorithm for proven and young (APY). Two pig populations with up to 207,375 PB and 32,893 CB phenotypic records per trait and 138,026 PB and 32,893 CB genotypes were evaluated. PB sires were genotyped for a 50K panel, whereas CB animals were genotyped for a low-density panel of 600 SNP and imputed to 50K. The predictability and bias of GEBV of PB sires for backfat thickness (BFX) and average daily gain (ADGX) recorded on CB animals were assessed when CB genotypes were available or not in the analyses. In the first set of analyses, direct inverses of the genomic relationship matrix (G) were used with phenotypic datasets truncated at different time points. In the next step, we evaluated the APY algorithm with core compositions differing in the CB genotype contributions. After that, the performance of core compositions was compared with an analysis using a random PB core from a purely PB genomic set. The number of rounds to convergence was recorded for all APY analyses. With the direct inverse of G in the first set of analyses, adding CB genotypes imputed from a low-density panel (600 SNP) did not improve predictability or reduce the bias of PB sires' GEBV for CB performance, even for sires with fewer CB progeny phenotypes in the analysis. That indicates that the inclusion of CB genotypes primarily used for inferring pedigree in commercial farms is of no benefit to offset CB phenotyping. 
When CB genotypes were incorporated into APY, a random core composition or a core with no CB genotypes reduced bias and the number of rounds to convergence but did not affect predictability. Still, a PB random core composition from a genomic set with only PB genotypes resulted in the highest predictability and the smallest number of rounds to convergence, although bias increased. Genotyping CB individuals for low-density panels is a valuable identification tool for linking CB phenotypes to pedigree; however, the inclusion of those CB genotypes imputed from a low-density panel (600 SNP) might not benefit genomic predictions for PB individuals or offset CB phenotyping for the evaluated CB performance traits. Further studies will help understand the usefulness of those imputed CB genotypes for traits with lower PB-CB genetic correlations and traits not recorded in the PB environment, such as mortality and disease traits.

Garcia A, Aguilar I, Legarra A, Tsuruta S, Misztal I, Lourenco D. Theoretical accuracy for indirect predictions based on SNP effects from single-step GBLUP. Genet Sel Evol 2022;54:66. [PMID: 36162979 PMCID: PMC9513904 DOI: 10.1186/s12711-022-00752-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 08/23/2022] [Indexed: 11/13/2022] Open

Abstract

Background

Although single-step GBLUP (ssGBLUP) is an animal model, SNP effects can be backsolved from genomic estimated breeding values (GEBV). Predicted SNP effects allow computing an indirect prediction (IP) for each individual as the sum of the SNP effects multiplied by its gene content, which is helpful when the number of genotyped animals is large, for genotyped animals not in the official evaluations, and when interim evaluations are needed. Typically, IP are obtained for new batches of genotyped individuals, all of them young and without phenotypes. Individual (theoretical) accuracies for IP are rarely reported, but they are nevertheless of interest. Our first objective was to present equations to compute the individual accuracy of IP based on the prediction error covariance (PEC) of SNP effects, which, in turn, is obtained from the PEC of GEBV in ssGBLUP. The second objective was to test the algorithm for proven and young (APY) in PEC computations. With large datasets, it is impossible to handle the full PEC matrix; thus, the third objective was to examine the minimum number of genotyped animals needed in PEC computations to achieve IP accuracies that are equivalent to GEBV accuracies.

Results

Correlations between GEBV and IP for the validation animals using SNP effects from ssGBLUP evaluations were ≥ 0.99. When all available genotyped animals were used for PEC computations, correlations between GEBV and IP accuracy were ≥ 0.99. In addition, IP accuracies were compatible with GEBV accuracies either with direct inversion of the genomic relationship matrix (G) or using the algorithm for proven and young (APY) to obtain the inverse of G. As the number of genotyped animals included in the PEC computations decreased from around 55,000 to 15,000, correlations were still ≥ 0.96, but IP accuracies were biased downwards.

Conclusions

Theoretical accuracy of indirect prediction can be successfully obtained by computing SNP PEC out of GEBV PEC from ssGBLUP equations using direct or APY G inverse. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be underestimated. Further research is needed to approximate SNP PEC from ssGBLUP to limit the computational requirements with many genotyped animals.
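
The indirect prediction described in this abstract, and an accuracy derived from the PEC of SNP effects, can be sketched as follows. This is a minimal illustration and not the authors' equations. It assumes the prediction error variance of an IP is the quadratic form z'(PEC)z and that accuracy is sqrt(1 - PEV / genetic variance), ignoring inbreeding. Gene content is shown as raw allele counts (0, 1, 2), whereas centered coding is common in practice. All names and numbers are hypothetical.

```python
from math import sqrt


def indirect_prediction(gene_content, snp_effects):
    """IP for one animal: sum of SNP effects weighted by its gene content."""
    return sum(z * a for z, a in zip(gene_content, snp_effects))


def ip_accuracy(gene_content, snp_pec, genetic_variance):
    """Accuracy of an IP from the PEC matrix of SNP effects.

    Assumes PEV(IP) = z' PEC z and accuracy = sqrt(1 - PEV / sigma2_u),
    with inbreeding ignored.
    """
    pev = sum(
        zi * snp_pec[i][j] * zj
        for i, zi in enumerate(gene_content)
        for j, zj in enumerate(gene_content)
    )
    return sqrt(1.0 - pev / genetic_variance)


# Toy example with three SNP (hypothetical values).
z = [2, 1, 0]
a_hat = [0.5, -0.2, 0.1]
pec = [[0.04, 0.0, 0.0], [0.0, 0.09, 0.0], [0.0, 0.0, 0.01]]

print(indirect_prediction(z, a_hat))
print(ip_accuracy(z, pec, genetic_variance=1.0))
```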

Hollifield MK, Bermann M, Lourenco D, Misztal I. 36 Exploring the Statistical Nature of Independent Chromosome Segments. J Anim Sci 2022. [DOI: 10.1093/jas/skac247.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open

Abstract

Independent chromosome segments for a population with Ne effective population size and L genome length can be approximately defined as 4NeL non-overlapping haplotypes of L/2 length derived from any Ne animals. The number of independent chromosome segments (Me) can be approximated as 4NeL. The genetic selection with a genomic relationship matrix (GRM) using such haplotypes approaches that with a GRM using the SNP markers. The objective of this study was to investigate the statistical nature of independent chromosome segments. Data were simulated using QMSim and contained a population of ten non-overlapping generations, each including 2,000 animals with Ne equal to 20, and a polygenic trait with a heritability of 0.6. The last three generations were genotyped, and each genome contained ten 1 M long chromosomes, for a total genome length of 10 M and 50,000 SNP. Chromosome segments of each animal were organized in non-overlapping haplotypes by the SNP code using in-house software written in Fortran. The effects of the hypothetical independent chromosome segments were estimated using a model that included an overall mean plus the segment effects. To analyze the behavior around Me, the number of segments chosen to estimate segment effects varied around 4NeL. Accuracies were calculated for animals in the last generation by cor(TBV, Zsŝ), where Zsŝ is a vector of breeding values based on segments and TBV is a vector of the true breeding values outputted by QMSim. Accuracies of segment effects were compared with the true accuracy, i.e., cor(TBV, GEBV). The maximum accuracy of segment effects was 0.84, and the true accuracy was 0.96. The results suggest that 4NeL segments contain most of the additive information in the population. 
However, the accuracy of GBLUP is greater than that of the chromosome segment effects, suggesting that arranging chromosome segments based solely on the statistical nature is not enough to account for all the genetic variation.
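
As a worked example of the approximation Me ≈ 4NeL, the short sketch below plugs in the simulation settings reported in the abstract (Ne = 20, genome length of 10 Morgans, 50,000 SNP). The function name is hypothetical, and the average number of SNP per segment is derived arithmetic rather than a figure reported by the authors.

```python
def independent_segments(ne, genome_length_morgans):
    """Approximate number of independent chromosome segments, Me = 4 * Ne * L."""
    return 4 * ne * genome_length_morgans


me = independent_segments(20, 10)
snp_per_segment = 50_000 / me

print(me)               # 800
print(snp_per_segment)  # 62.5
```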

Bermann M, Lourenco D, Misztal I. 35 Young Scholar Award Talk: Computing Strategies for National Beef Cattle Evaluations. J Anim Sci 2022. [DOI: 10.1093/jas/skac247.356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open

Hollifield MK, Bermann M, Lourenco D, Misztal I. Impact of blending the genomic relationship matrix with different levels of pedigree relationships or the identity matrix on genetic evaluations. JDS Commun 2022;3:343-347. [PMID: 36340904 PMCID: PMC9623765 DOI: 10.3168/jdsc.2022-0229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 06/29/2022] [Indexed: 06/16/2023]

Abstract

Evaluations using single-step genomic BLUP require blending the genomic relationship matrix (G) with a positive definite matrix to ensure nonsingularity for solving the mixed model equations. Many organizations blend G with a proportion of the numerator relationship matrix for genotyped animals (A₂₂) to improve stability and possibly add a residual polygenic effect. However, when nearly all the polygenic variance is explained by G, blending with A₂₂ may cause inflation and add excess computing time; thus, blending with an identity matrix (I) multiplied by a small value may be a better solution. The objective of this study was to evaluate changes in reliability and inflation of genomic estimated breeding values, convergence rate, elapsed wall-clock time for blending G with different levels of A₂₂ or I, and develop a more time-efficient blending method. A US Holstein cattle data set was used with 9.7 million animals in the pedigree, 569,404 animals with genotypes, and 10.1 million stature phenotypes. Blending G by adding a small value to the diagonal elements had comparable performance to A₂₂ with fewer rounds to convergence required to solve the system of equations. Reliability and inflation of genomic estimated breeding values ranged from 0.63 to 0.68 and 0.86 to 0.89 for all blending scenarios tested. The current blending default in the BLUPF90 software is to replace G with (1 - β)G + βA₂₂, where β equals 0.05. In this study, β values of 0.30, 0.20, 0.05, 0.01, 0.005, and 0.001 were evaluated with A₂₂ and I. Negligible differences in elapsed computing time between the blending types and levels were observed. Subsequently, the current blending algorithm used in the BLUPF90 family of programs was optimized, reducing the blending time from approximately 2 h to 5 min for A₂₂ and less than 1 s for I. The new time difference between blending with A₂₂ or I is negligible and not computationally critical. 
The results indicate that blending G with A₂₂ does not have clear advantages over blending with a small proportion of I.
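
Why blending restores nonsingularity can be seen on a two-animal toy case. The sketch below is an illustration only. It uses the form (1 - β)G + βA₂₂ given in the abstract and assumes the analogous form (1 - β)G + βI for blending with the identity, which may differ in scaling from the "small value added to the diagonal" variant the authors tested. The matrices and function names are hypothetical.

```python
def blend(g, other, beta):
    """Blend G with another matrix: (1 - beta) * G + beta * other."""
    return [
        [(1.0 - beta) * gij + beta * oij for gij, oij in zip(g_row, o_row)]
        for g_row, o_row in zip(g, other)
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two animals with identical genotypes give a singular G (toy values).
G = [[1.0, 1.0], [1.0, 1.0]]
A22 = [[1.0, 0.5], [0.5, 1.0]]   # pedigree relationships, e.g. full sibs
I2 = [[1.0, 0.0], [0.0, 1.0]]

beta = 0.05
print(det2(G))                    # 0.0, so G cannot be inverted
print(det2(blend(G, A22, beta)))  # positive after blending with A22
print(det2(blend(G, I2, beta)))   # positive after blending with I
```

In this toy case both choices make the matrix invertible, and blending with the identity moves the determinant further from zero.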

Callister AN, Bermann M, Elms S, Bradshaw BP, Lourenco D, Brawner JT. Accounting for population structure in genomic predictions of Eucalyptus globulus. G3 Genes|Genomes|Genetics 2022;12:6654591. [PMID: 35920792 PMCID: PMC9434241 DOI: 10.1093/g3journal/jkac180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Accepted: 06/29/2022] [Indexed: 12/02/2022]

Bermann M, Lourenco D, Forneris NS, Legarra A, Misztal I. On the equivalence between marker effect models and breeding value models and direct genomic values with the Algorithm for Proven and Young. Genet Sel Evol 2022;54:52. [PMID: 35842585 PMCID: PMC9288049 DOI: 10.1186/s12711-022-00741-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Accepted: 06/29/2022] [Indexed: 12/04/2022] Open

Abstract

Background

Single-step genomic predictions obtained from a breeding value model require calculating the inverse of the genomic relationship matrix ($\mathbf{G}^{-1}$). The Algorithm for Proven and Young (APY) creates a sparse representation of $\mathbf{G}^{-1}$ with a low computational cost. APY consists of selecting a group of core animals and expressing the breeding values of the remaining animals as a linear combination of those from the core animals plus an error term. The objectives of this study were to: (1) extend APY to marker effects models; (2) derive equations for marker effect estimates when APY is used for breeding value models, and (3) show the implication of selecting a specific group of core animals in terms of a marker effects model.

Results

We derived a family of marker effects models called APY-SNP-BLUP. It differs from the classic marker effects model in that the row space of the genotype matrix is reduced and an error term is fitted for non-core animals. We derived formulas for marker effect estimates that take this error term into account. The prediction error variance (PEV) of the marker effect estimates depends on the PEV for core animals but not directly on the PEV of the non-core animals. We extended the APY-SNP-BLUP to include a residual polygenic effect and accommodate non-genotyped animals. We show that selecting a specific group of core animals is equivalent to selecting a subspace of the row space of the genotype matrix. As the number of core animals increases, subspaces corresponding to different sets of core animals tend to overlap, showing that random selection of core animals is algebraically justified.

Conclusions

The APY-(ss)GBLUP models can be expressed in terms of marker effect models. When the number of core animals is equal to the rank of the genotype matrix, APY-SNP-BLUP is identical to the classic marker effects model. If the number of core animals is less than the rank of the genotype matrix, genotypes for non-core animals are imputed as a linear combination of the genotypes of the core animals. For estimating SNP effects, only relationships and estimated breeding values for core animals are needed.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12711-022-00741-7.
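
The core idea of APY stated in the Background (a non-core breeding value written as a linear combination of core breeding values plus an error term) can be sketched for two core animals and one non-core animal. This is an illustration, not the authors' derivation. It assumes the usual conditional-expectation form: weights equal to g_nc Gcc⁻¹ and error variance equal to g_nn - g_nc Gcc⁻¹ g_cn. The function name and the toy relationships are hypothetical.

```python
def apy_noncore_terms(g_cc, g_nc, g_nn):
    """Regression weights and error variance for one non-core animal.

    g_cc: 2 x 2 genomic relationships among the two core animals.
    g_nc: relationships of the non-core animal with each core animal.
    g_nn: self-relationship (diagonal element) of the non-core animal.
    """
    (a, b), (c, d) = g_cc
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # closed-form 2 x 2 inverse

    # Row vector times matrix: weights = g_nc * inverse(g_cc).
    weights = [
        g_nc[0] * inv[0][0] + g_nc[1] * inv[1][0],
        g_nc[0] * inv[0][1] + g_nc[1] * inv[1][1],
    ]
    # Variance left unexplained by the core animals.
    error_variance = g_nn - sum(w * g for w, g in zip(weights, g_nc))
    return weights, error_variance


# Toy genomic relationships (hypothetical values).
g_cc = [[1.0, 0.2], [0.2, 1.0]]
g_nc = [0.5, 0.3]
g_nn = 1.0

weights, error_variance = apy_noncore_terms(g_cc, g_nc, g_nn)
print(weights)
print(error_variance)
```

Only the core block needs a dense inverse here, and each non-core animal contributes one scalar error variance, which is what makes the APY inverse sparse.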

Lozada-Soto EA, Lourenco D, Maltecca C, Fix J, Schwab C, Shull C, Tiezzi F. Genotyping and phenotyping strategies for genetic improvement of meat quality and carcass composition in swine. Genet Sel Evol 2022;54:42. [PMID: 35672700 PMCID: PMC9171933 DOI: 10.1186/s12711-022-00736-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/07/2021] [Accepted: 05/25/2022] [Indexed: 12/04/2022] Open

Abdollahi-Arpanahi R, Lourenco D, Misztal I. A comprehensive study on size and definition of the core group in the proven and young algorithm for single-step GBLUP. Genet Sel Evol 2022;54:34. [PMID: 35596130 PMCID: PMC9123737 DOI: 10.1186/s12711-022-00726-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/22/2021] [Accepted: 05/02/2022] [Indexed: 11/16/2022] Open

Abstract

Background

The algorithm for proven and young (APY) has been suggested as a solution for recursively computing a sparse representation for the inverse of a large genomic relationship matrix (G). In APY, a subset of genotyped individuals is used as the core and the remaining genotyped individuals are used as noncore. Size and definition of the core are relevant research subjects for the application of APY, especially given the ever-increasing number of genotyped individuals.

Methods

The aim of this study was to investigate several core definitions, including the most popular animals (MPA) (i.e., animals with high contributions to the genetic pool), the least popular males (LPM), the least popular females (LPF), a random set (Rnd), animals evenly distributed across genealogical paths (Ped), unrelated individuals (Unrel), or based on within-family selection (Fam), or on decomposition of the gene content matrix (QR). Each definition was evaluated for six core sizes based on prediction accuracy of single-step genomic best linear unbiased prediction (ssGBLUP) with APY. Prediction accuracy of ssGBLUP with the full inverse of G was used as the baseline. The dataset consisted of 357k pedigreed Duroc pigs with 111k pigs with genotypes and ~ 220k phenotypic records.

Results

When the core size was equal to the number of largest eigenvalues explaining 50% of the variation of G (n = 160), MPA and Ped core definitions delivered the highest average prediction accuracies (~ 0.41−0.53). As the core size increased to the number of eigenvalues explaining 99% of the variation in G (n = 7320), prediction accuracy was nearly identical for all core types and correlations with genomic estimated breeding values (GEBV) from ssGBLUP with the full inversion of G were greater than 0.99 for all core definitions. Cores that represent all generations, such as Rnd, Ped, Fam, and Unrel, were grouped together in the hierarchical clustering of GEBV.

Conclusions

For small core sizes, the definition of the core matters; however, as the size of the core reaches an optimal value equal to the number of largest eigenvalues explaining 99% of the variation of G, the definition of the core becomes arbitrary.
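
The core-size rule used in this abstract (the number of largest eigenvalues of G explaining a given share of its variation) can be sketched as below. This is a minimal illustration that assumes the eigenvalues of G have already been computed elsewhere. The function name and the toy eigenvalues are hypothetical and unrelated to the Duroc dataset.

```python
def core_size(eigenvalues, fraction):
    """Number of largest eigenvalues needed to explain `fraction` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    target = fraction * sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target:
            return count
    return len(ordered)


# Toy eigenvalues of a genomic relationship matrix (hypothetical values).
eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]

print(core_size(eigenvalues, 0.50))
print(core_size(eigenvalues, 0.85))
print(core_size(eigenvalues, 0.99))
```

The count grows quickly as the target share approaches 1, which mirrors the jump from n = 160 at 50% to n = 7320 at 99% reported in the Results.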

Junqueira VS, Lourenco D, Masuda Y, Cardoso FF, Lopes PS, Silva FFE, Misztal I. Is single-step genomic REML with the algorithm for proven and young more computationally efficient when less generations of data are present? J Anim Sci 2022;100:skac082. [PMID: 35289906 PMCID: PMC9118993 DOI: 10.1093/jas/skac082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/15/2022] [Accepted: 03/10/2022] [Indexed: 12/04/2022] Open

Abstract

Efficient computing techniques allow the estimation of variance components for virtually any traditional dataset. When genomic information is available, variance components can be estimated using genomic REML (GREML). If only a portion of the animals have genotypes, single-step GREML (ssGREML) is the method of choice. The genomic relationship matrix (G) used in both cases is dense, limiting computations depending on the number of genotyped animals. The algorithm for proven and young (APY) can be used to create a sparse inverse of G ($\mathbf{G}_{\mathrm{APY}}^{-1}$) with close to linear memory and computing requirements. In ssGREML, the inverse of the realized relationship matrix ($\mathbf{H}^{-1}$) also includes the inverse of the pedigree relationship matrix, which can be dense with a long pedigree but sparser with a short one. The main purpose of this study was to investigate whether costs of ssGREML can be reduced using APY with truncated pedigree and phenotypes. We also investigated the impact of truncation on variance components estimation when different numbers of core animals are used in APY. Simulations included 150K animals from 10 generations, with selection. Phenotypes ($h^2 = 0.3$) were available for all animals in generations 1-9. A total of 30K animals in generations 8 and 9, and 15K validation animals in generation 10 were genotyped for 52,890 SNP. Average information REML and ssGREML with $\mathbf{G}^{-1}$ and $\mathbf{G}_{\mathrm{APY}}^{-1}$ using 1K, 5K, 9K, and 14K core animals were compared. Variance components are impacted when the core group in APY represents the number of eigenvalues explaining a small fraction of the total variation in G. The most time-consuming operation was the inversion of G, with more than 50% of the total time. Next, numerical factorization consumed nearly 30% of the total computing time. On average, a 7% decrease in the computing time for ordering was observed by removing each generation of data. 
APY can be successfully applied to create the inverse of the genomic relationship matrix used in ssGREML for estimating variance components. To ensure reliable variance component estimation, it is important to use a core size that corresponds to the number of largest eigenvalues explaining around 98% of total variation in G. When APY is used, pedigrees can be truncated to increase the sparsity of H and slightly reduce computing time for ordering and symbolic factorization, with no impact on the estimates.

Bermann M, Cesarani A, Misztal I, Lourenco D. Past, present, and future developments in single-step genomic models. Italian Journal of Animal Science 2022. [DOI: 10.1080/1828051x.2022.2053366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

Cesarani A, Lourenco D, Tsuruta S, Legarra A, Nicolazzi E, VanRaden P, Misztal I. Multibreed genomic evaluation for production traits of dairy cattle in the United States using single-step genomic best linear unbiased predictor. J Dairy Sci 2022;105:5141-5152. [DOI: 10.3168/jds.2021-21505] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/30/2021] [Accepted: 01/27/2022] [Indexed: 01/01/2023]

Jang S, Lourenco D, Miller S. Inclusion of Sire by Herd interaction effect in the genomic evaluation for weaning weight of American Angus. J Anim Sci 2022;100:6537149. [PMID: 35213718 PMCID: PMC9030219 DOI: 10.1093/jas/skac057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/07/2021] [Accepted: 02/23/2022] [Indexed: 11/12/2022] Open

Abstract

A spurious negative genetic correlation between direct and maternal effects of weaning weight (WW) in beef cattle has historically been problematic for researchers and industry. Previous research has suggested the covariance between sires and herds may be contributing to this relationship. The objective of this study was to estimate the variance components (VC) for WW in American Angus with and without sire by herd (S×H) interaction effect when genomic information is used or not. Five subsets of ~100k animals for each subset were used. When genomic information was included, genotypes were added for 15,637 animals. Five replicates were performed. Four different models were tested, namely, M1: without S×H interaction effect and with covariance between direct and maternal effect (σam) ≠ 0; M2: with S×H interaction effect and σam ≠ 0; M3: without S×H interaction effect and with σam = 0; M4: with S×H interaction effect and σam = 0. VC were estimated using the restricted maximum likelihood (REML) and single-step genomic REML (ssGREML) with the average information algorithm. Breeding values were computed using single-step genomic BLUP for the models above and one additional model, which had the covariance zeroed after the estimation of VC (M5). The ability of each model to predict future breeding values was investigated with the linear regression method. Under REML, when the S×H interaction effect was added to the model, both direct and maternal genetic variances were greatly reduced, and the negative covariance became positive (i.e., when moving from M1 to M2). Similar patterns were observed under ssGREML, but with less reduction in the direct and maternal genetic variances and still a negative covariance. Models with the S×H interaction effect (M2 and M4) had a better fit according to the Akaike information criteria. Breeding values from those models were more accurate and had less bias than the other three models. 
The rankings and breeding values of artificial insemination sires (N = 1,977) greatly changed when the S×H interaction effect was fit in the model. Although the S×H interaction effect accounted for 3% to 5% of the total phenotypic variance and improved the model fit, this change in the evaluation model will cause severe reranking among animals.

A spurious negative genetic correlation between direct and maternal effects of weaning weight (WW) in beef cattle has been problematic for researchers and industry. Previous research suggested the covariance between sires and herds may contribute to this relationship. The objective of this study was to estimate the variance components (VC) for WW in American Angus with and without sire by herd (S×H) interaction effect when genomic information is used or not. Four models were designed to investigate the S×H effect. The restricted maximum likelihood (REML) and single-step genomic REML (ssGREML) were used to estimate VC. Breeding values were computed using single-step genomic BLUP and the validation was done through the linear regression method. Under REML, when the S×H was added to the model, both direct and maternal genetic variances were greatly reduced, and the negative covariance became positive. Similar patterns were observed under ssGREML, but with less reduction in the direct and maternal genetic variances and still a negative covariance. Breeding values from models with S×H were more accurate and had less bias than the other models. Although the S×H improved the model, this change in the evaluation model will cause severe reranking among key animals.

Campos GS, Cardoso FF, Gomes CCG, Domingues R, de Almeida Regitano LC, de Sena Oliveira MC, de Oliveira HN, Carvalheiro R, Albuquerque LG, Miller S, Misztal I, Lourenco D. Development of genomic predictions for Angus cattle in Brazil incorporating genotypes from related American sires. J Anim Sci 2022;100:6507787. [PMID: 35031806 PMCID: PMC8867558 DOI: 10.1093/jas/skac009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/06/2021] [Accepted: 01/12/2022] [Indexed: 11/24/2022] Open

Abstract

Genomic prediction has become the new standard for genetic improvement programs, and currently, there is a desire to implement this technology for the evaluation of Angus cattle in Brazil. Thus, the main objective of this study was to assess the feasibility of evaluating young Brazilian Angus (BA) bulls and heifers for 12 routinely recorded traits using single-step genomic BLUP (ssGBLUP) with and without genotypes from American Angus (AA) sires. The second objective was to obtain estimates of effective population size (N_e) and linkage disequilibrium (LD) in the Brazilian Angus population. The dataset contained phenotypic information for up to 277,661 animals belonging to the Promebo breeding program, pedigree for 362,900, of which 1,386 were genotyped for 50k, 77k, and 150k single nucleotide polymorphism (SNP) panels. After imputation and quality control, 61,666 SNPs were available for the analyses. In addition, genotypes from 332 American Angus (AA) sires widely used in Brazil were retrieved from the AA Association database to be used for genomic predictions. Bivariate animal models were used to estimate variance components, traditional EBV, and genomic EBV (GEBV). Validation was carried out with the linear regression method (LR) using young-genotyped animals born between 2013 and 2015 without phenotypes in the reduced dataset and with records in the complete dataset. Validation animals were further split into progeny of BA and AA sires to evaluate if their progenies would benefit by including genotypes from AA sires. The N_e was 254 based on pedigree and 197 based on LD, and the average LD (±SD) and distance between adjacent single nucleotide polymorphisms (SNPs) across all chromosomes were 0.27 (±0.27) and 40743.68 bp, respectively. Prediction accuracies with ssGBLUP outperformed BLUP for all traits, improving accuracies by, on average, 16% for BA young bulls and heifers. 
The GEBV prediction accuracies ranged from 0.37 (total maternal for weaning weight and tick count) to 0.54 (yearling precocity) across all traits, and dispersion (LR coefficients) fluctuated between 0.92 and 1.06. Inclusion of genotyped sires from the AA improved GEBV accuracies by 2%, on average, compared to using only the BA reference population. Our study indicated that genomic information could help us to improve GEBV accuracies and hence genetic progress in the Brazilian Angus population. The inclusion of genotypes from American Angus sires heavily used in Brazil just marginally increased the GEBV accuracies for selection candidates.

There was a desire to implement genomic selection for Angus cattle in Brazil since the technology has been proven to increase genetic gain in animal breeding programs. Single-step genomic best linear unbiased prediction (ssGBLUP), which simultaneously combines pedigree and genomic information, was used to estimate individuals’ genomic breeding values (GEBV) or genetic merit. Genomic selection can accelerate genetic progress by increasing accuracy, especially in young animals without progeny. The accuracy of GEBV can also be improved by combining data from other countries to increase the reference population (i.e., genotyped and phenotyped animals) in small, genotyped populations. Thus, the main objective of this study was to evaluate the accuracy of GEBV for young Brazilian Angus (BA) bulls and heifers with ssGBLUP, with or without the genotypes from American Angus sires. The accuracies with ssGBLUP were higher than those from traditional BLUP (EBV calculated from pedigree), improving accuracies by, on average, 16% for young bulls and heifers. Including genotypes from American Angus sires heavily used in Brazil only marginally increased the GEBV accuracies for selection candidates.

Sungkhapreecha P, Misztal I, Hidalgo J, Lourenco D, Buaban S, Chankitisakul V, Boonkum W. Validation of single-step genomic predictions using the linear regression method for milk yield and heat tolerance in a Thai-Holstein population. Vet World 2021;14:3119-3125. [PMID: 35153401 PMCID: PMC8829417 DOI: 10.14202/vetworld.2021.3119-3125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/06/2021] [Accepted: 11/02/2021] [Indexed: 12/03/2022]

Abstract

Background and Aim:

Genomic selection improves accuracy and decreases the generation interval, increasing the selection response. This study was conducted to assess the benefits of using single-step genomic best linear unbiased prediction (ssGBLUP) for genomic evaluations of milk yield and heat tolerance in Thai-Holstein cows and to test the value of old phenotypic data to maintain the accuracy of predictions.

Materials and Methods:

The dataset included 104,150 milk yield records collected from 1999 to 2018 from 15,380 cows. The pedigree contained 33,799 animals born between 1944 and 2016, of which 882 were genotyped. Analyses were performed with and without genomic information using ssGBLUP and BLUP, respectively. Statistics for bias, dispersion, the ratio of accuracies, and the accuracy of estimated breeding values were calculated using the linear regression (LR) method. A partial dataset excluded the phenotypes of the last generation, and 66 bulls were identified as validation individuals.

Results:

Bias was considerable for BLUP (0.44) but negligible (−0.04) for ssGBLUP; dispersion was similar for both techniques (0.84 vs. 1.06 for BLUP and ssGBLUP, respectively). The ratio of accuracies was 0.33 for BLUP and 0.97 for ssGBLUP, indicating more stable predictions for ssGBLUP. The accuracy of predictions was 0.18 for BLUP and 0.36 for ssGBLUP. Excluding the first 10 years of phenotypic data (i.e., 1999-2008) decreased the accuracy to 0.09 for BLUP and 0.32 for ssGBLUP. Genomic information doubled the accuracy and increased the persistence of genomic estimated breeding values when old phenotypes were removed.

Conclusion:

The LR method is useful for estimating accuracies and bias in complex models. When the population size is small, old data are useful, and even a small amount of genomic information can substantially improve the accuracy. The effect of heat stress on first parity milk yield is small.
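The LR statistics named in this abstract (bias, dispersion, ratio of accuracies) can be sketched as follows. This is a minimal illustration of the usual definitions of the linear regression method, comparing breeding values of the same validation animals from a partial and a whole dataset; the function name, variable names, and input values are hypothetical and are not taken from the study.

```python
import numpy as np

def lr_statistics(ebv_partial, ebv_whole):
    """LR-method statistics for a set of validation animals.

    ebv_partial: (G)EBV predicted from the partial dataset
    ebv_whole:   (G)EBV predicted from the whole dataset
    """
    p = np.asarray(ebv_partial, dtype=float)
    w = np.asarray(ebv_whole, dtype=float)
    cov_wp = np.cov(w, p, ddof=1)[0, 1]
    bias = p.mean() - w.mean()            # expected value 0 when unbiased
    dispersion = cov_wp / p.var(ddof=1)   # slope of whole on partial, expected 1
    ratio_acc = np.corrcoef(w, p)[0, 1]   # ratio of accuracies (partial vs. whole)
    return bias, dispersion, ratio_acc

# Hypothetical breeding values for five validation animals.
partial = np.array([0.10, -0.30, 0.50, 0.00, 0.20])
whole = np.array([0.20, -0.50, 0.70, -0.10, 0.30])
bias, dispersion, ratio_acc = lr_statistics(partial, whole)
print(bias, dispersion, ratio_acc)
```

The LR accuracy itself additionally needs the additive genetic variance and the average inbreeding of the validation animals, so it is left out of this sketch.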

Bermann M, Lourenco D, Misztal I. Efficient approximation of reliabilities for single-step genomic BLUP models with the Algorithm for Proven and Young. J Anim Sci 2021;100:6455777. [PMID: 34877603 PMCID: PMC8827023 DOI: 10.1093/jas/skab353] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2021] [Accepted: 11/20/2021] [Indexed: 11/14/2022]

Abstract

The objectives of this study were to develop an efficient algorithm for calculating prediction error variances (PEV) for GBLUP models using the Algorithm for Proven and Young (APY), to extend it to single-step GBLUP (ssGBLUP), and to apply this algorithm for approximating the theoretical reliabilities for single- and multiple-trait models in ssGBLUP. The PEV with APY was calculated by block-sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. The large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analyses, respectively, in Data1 and 335,325 in Data2. Correlations between estimated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the largest dataset (Data2), the elapsed time for the reliability estimation was eleven minutes. The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model. 
This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
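For orientation, the exact benchmark mentioned in this abstract (reliabilities obtained by inverting the left-hand side of the mixed model equations) can be sketched on a toy single-trait model. This is not the APY block-sparse approximation the authors propose; the relationship matrix `K`, the variance components, and the function name are hypothetical values chosen only for illustration.

```python
import numpy as np

def reliabilities_by_inversion(X, Z, K, var_u, var_e):
    """Exact reliabilities from the inverse of the mixed model equations.

    Model (assumed for this sketch): y = Xb + Zu + e,
    with var(u) = K * var_u and var(e) = I * var_e.
    """
    lam = var_e / var_u
    K_inv = np.linalg.inv(K)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * K_inv]])
    C = np.linalg.inv(lhs)                   # feasible only for small problems
    n_fixed = X.shape[1]
    pev = np.diag(C)[n_fixed:] * var_e       # prediction error variances
    return 1.0 - pev / (np.diag(K) * var_u)  # reliability per animal

# Toy data: four animals with one record each and an overall mean.
X = np.ones((4, 1))
Z = np.eye(4)
K = np.array([[1.00, 0.50, 0.00, 0.00],
              [0.50, 1.00, 0.00, 0.00],
              [0.00, 0.00, 1.00, 0.25],
              [0.00, 0.00, 0.25, 1.00]])
rel = reliabilities_by_inversion(X, Z, K, var_u=0.3, var_e=0.7)
print(rel)
```

The dense inverse scales cubically with the number of equations, which is why approximations are needed for datasets of the size described above.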

Abdollahi-Arpanahi R, Lourenco D, Legarra A, Misztal I. Dissecting genetic trends to understand breeding practices in livestock: a maternal pig line example. Genet Sel Evol 2021;53:89. [PMID: 34837954 PMCID: PMC8627101 DOI: 10.1186/s12711-021-00683-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/04/2021] [Accepted: 11/09/2021] [Indexed: 11/10/2022]

Abstract

Background

Understanding whether genomic selection has been effective in livestock and when the results of genomic selection became visible are essential questions which we have addressed in this paper. Three criteria were used to identify practices of breeding programs over time: (1) the point of divergence of estimated genetic trends based on pedigree-based best linear unbiased prediction (BLUP) versus single-step genomic BLUP (ssGBLUP), (2) the point of divergence of realized Mendelian sampling (RMS) trends based on BLUP and ssGBLUP, and (3) the partition of genetic trends into that contributed by genotyped and non-genotyped individuals and by males and females.

Methods

We used data on 282,035 animals from a commercial maternal line of pigs, of which 32,856 were genotyped for 36,612 single nucleotide polymorphisms (SNPs) after quality control. Phenotypic data included 228,427, 101,225, and 11,444 records for birth weight, average daily gain in the nursery, and feed intake, respectively. Breeding values were predicted in a multiple-trait framework using BLUP and ssGBLUP.

Results

The points of divergence of the genetic and RMS trends estimated by BLUP and ssGBLUP indicated that genomic selection effectively started in 2019. Partitioning the overall genetic trends into that for genotyped and non-genotyped individuals revealed that the contribution of genotyped animals to the overall genetic trend increased rapidly from ~ 74% in 2016 to 90% in 2019. The contribution of the female pathway to the genetic trend also increased since genomic selection was implemented in this pig population, which reflects the changes in the genotyping strategy in recent years.

Conclusions

Our results show that an assessment of breeding program practices can be done based on the point of divergence of genetic and RMS trends between BLUP and ssGBLUP and based on the partitioning of the genetic trend into contributions from different selection pathways. However, it should be noted that genetic trends can diverge before the onset of genomic selection if superior animals are genotyped retroactively. For the pig population example, the results showed that genomic selection was effective in this population.

Falchi L, Gaspa G, Cesarani A, Correddu F, Degano L, Vicario D, Lourenco D, Macciotta NPP. Investigation of β-hydroxybutyrate in early lactation of Simmental cows: Genetic parameters and genomic predictions. J Anim Breed Genet 2021;138:708-718. [PMID: 34180560 PMCID: PMC8518359 DOI: 10.1111/jbg.12637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2021] [Accepted: 05/28/2021] [Indexed: 11/28/2022]

Abstract

Genomic information allows for a more accurate calculation of relationships among animals than the pedigree information, leading to an increase in accuracy of breeding values. Here, we used pedigree-based and single-step genomic approaches to estimate variance components and breeding values for β-hydroxybutyrate milk content (BHB). Additionally, we performed a genome-wide association study (GWAS) to depict its genetic architecture. BHB concentrations within the first 90 days of lactation, estimated from milk medium infrared spectra, were available for 30,461 cows (70,984 records). Genotypes at 42,152 loci were available for 9,123 animals. Low heritabilities were found for BHB using pedigree-based (0.09 ± 0.01) and genomic (0.10 ± 0.01) approaches. Genetic correlation between BHB and milk traits ranged from -0.27 ± 0.06 (BHB and protein percentage) to 0.13 ± 0.07 (BHB and fat-to-protein ratio) using pedigree and from -0.26 ± 0.05 (BHB and protein percentage) to 0.13 ± 0.06 (BHB and fat-to-protein ratio) using genomics. Breeding values were validated for 344 genotyped cows using linear regression method. The genomic EBV (GEBV) had greater accuracy (0.51 vs. 0.45) and regression coefficient (0.98 vs. 0.95) compared to EBV. The correlation between two subsequent evaluations, without and with phenotypes for validation cows, was 0.85 for GEBV and 0.82 for EBV. Predictive ability (correlation between (G)EBV and adjusted phenotypes) was greater when genomic information was used (0.38) than in the pedigree-based approach (0.31). Validation statistics in the pairwise two-trait models (milk yield, fat and protein percentage, urea, fat/protein ratio, lactose and logarithmic transformation of somatic cells count) were very similar to the ones highlighted for the single-trait model. The GWAS allowed discovering four significant markers located on BTA20 (57.5-58.2 Mb), where the ANKH gene is mapped. This gene has been associated with lactose, alpha-lactalbumin and BHB. 
Results of this study confirmed the usefulness of genomic information to provide more accurate variance components and breeding values, and important insights about the genomic determination of BHB milk content.

Jang S, Tsuruta S, Leite N, Misztal I, Lourenco D. 34 Dimensionality of Genomic Information and Its Impact on GWA and Variant Selection: A Simulation Study. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]

Abstract

The ability to identify true-positive variants increases as more genotyped animals are available. Although thousands of animals can be genotyped, the dimensionality of the genomic information is limited. Therefore, there is a certain number of animals that represent all chromosome segments (Me) segregating in the population. The number of Me can be approximated from the eigenvalue decomposition of the genomic relationship matrix (G). Thus, the limited dimensionality may help to identify the number of animals to be used in genome-wide association (GWA). The first objective of this study was to examine different discovery set sizes for GWA, with set sizes based on the number of largest eigenvalues explaining a certain proportion of variance in G. Additionally, we investigated the impact of incorporating variants selected from different set sizes into the regular SNP chip used for genomic prediction. Sequence data were simulated that contained 500k SNP and 2k QTL, where the genetic variance was fully explained by QTL. The GWA was conducted using the number of genotyped animals equal to the number of largest eigenvalues of G (EIG) explaining 50, 60, 70, 80, 90, 95, 98, and 99 percent of the variance in G. Significant SNP had a p-value lower than 0.05 with Bonferroni correction. Further, SNP with the largest effect size (top 10, 100, 500, 1k, 2k, and 4k) were also selected to be incorporated into the 50k regular chip. Genomic predictions using the 50k combined with selected SNP were conducted using single-step GBLUP (ssGBLUP). Using the number of animals corresponding to at least EIG98 enabled the identification of the largest effect size QTL. The greatest accuracy of prediction was obtained when the top 2k SNP were combined with the 50k chip. The dimensionality of genomic information should be taken into account for variant selection in GWAS.
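The set sizes described above (the number of largest eigenvalues of G explaining a given proportion of its variance) can be computed with a few lines of NumPy. The sketch below assumes a symmetric relationship matrix; the function name and the toy matrix are hypothetical, and how G itself was constructed in the study is not stated in the abstract.

```python
import numpy as np

def n_eigenvalues_explaining(G, proportion):
    """Number of largest eigenvalues of G explaining `proportion` of its variance."""
    eigvals = np.linalg.eigvalsh(G)[::-1]    # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)    # guard against tiny negative values
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    count = int(np.searchsorted(cumulative, proportion)) + 1
    return min(count, eigvals.size)

# Toy symmetric matrix with known eigenvalues 4, 3, 2 and 1.
G_toy = np.diag([4.0, 3.0, 2.0, 1.0])
for prop in (0.50, 0.95):
    print(prop, n_eigenvalues_explaining(G_toy, prop))
```

For a real genomic relationship matrix, the same call with proportions of 0.50 to 0.99 would yield candidate discovery set sizes analogous to the EIG50 to EIG99 sets named in the abstract.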

Misztal I, Pocrnic I, Lourenco D. 40 Factors Influencing Accuracy of Genomic Selection with Sequence Information. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]

Abstract

Incorporating the sequence information only marginally increases the accuracy of genomic selection. The purpose of this study was to find out why by examining profiles of Quantitative Trait Nucleotides (QTN). Multiple populations were simulated with different effective population sizes and numbers of animals. One hundred equidistant QTN with identical substitution effects were included in 50k SNP genotypes. Analyses were by single-step GBLUP, with solutions converted to SNP values and subsequently to p-values for each SNP. Manhattan plots for standardized SNP solutions were noisy and were elevated only for a few QTNs. Manhattan plots for p-values were similar to those for SNP solutions, indicating little impact of population structure. The number of significant QTN was lower with lower effective population size and increased with larger data; at most about 20% of QTNs were detected. A QTN profile was created by averaging SNP solutions ±100 SNP around each QTN. The profile showed a normal-like response but with a distinct peak for the QTN. While the peak was higher with more data and higher effective population size, the normal-like response was smaller with higher effective population size. QTNs explained little variance because of shrinkage. The accuracy of genomic selection would be 100% if all QTNs were identified and their variances known, to prevent shrinking or inflation. This study shows the limits of applying QTN from sequence data to genomic selection. If all causative SNP are included in the data, only a fraction of them can be identified even under a very simplistic architecture. As variances of QTN are assumed constant or are crude approximations (as in BayesR), the estimated QTN effects are inaccurate. Additional complications in QTN detection are closely spaced QTN and false QTNs due to imputation. Small effective population size allows genomic selection by GBLUP but complicates the use of QTNs.
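The QTN profile described above (SNP solutions averaged over a window of ±100 SNP around each QTN) can be sketched as below. The function name, the handling of QTN near the ends of the SNP vector, and the toy data are assumptions made for illustration; the abstract does not say whether raw, standardized, or absolute solutions were averaged, so the values are used as given.

```python
import numpy as np

def qtn_profile(snp_solutions, qtn_positions, window=100):
    """Average SNP solutions at each offset within +/- window of the QTN."""
    s = np.asarray(snp_solutions, dtype=float)
    offsets = np.arange(-window, window + 1)
    rows = []
    for q in qtn_positions:
        # Assumption: QTN too close to either end are skipped.
        if q - window < 0 or q + window >= s.size:
            continue
        rows.append(s[q + offsets])
    if not rows:
        raise ValueError("no QTN has a complete window")
    return offsets, np.mean(rows, axis=0)

# Toy data: flat solutions with a unit peak at two hypothetical QTN.
solutions = np.zeros(1000)
qtn = [300, 700]
solutions[qtn] = 1.0
offsets, profile = qtn_profile(solutions, qtn, window=100)
print(profile[100])  # averaged solution at the QTN itself (offset 0)
```

Plotting `profile` against `offsets` would give the kind of curve the abstract describes, with the value at offset 0 corresponding to the QTN peak.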

Bermann M, Lourenco D, Breen V, Hawken R, Lopes FB, Misztal I. PSXII-9 Modeling genetic differences of combined broiler chicken populations in single-step GBLUP. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.464] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]

Abstract

The objectives of this study were to model the inclusion of a group of external birds into a local broiler chicken population for the purpose of genomic evaluations and to evaluate the behavior of two accuracy estimators under different model specifications. The pedigree comprised 242,413 birds, and genotypes were available for 107,216 birds. A five-trait model that included one growth, two yield, and two efficiency traits was used for the analyses. The strategies to model the introduction of external birds were to include a fixed effect representing the origin of parents and to use UPG or metafounders. Genomic estimated breeding values (GEBV) were obtained with single-step GBLUP (ssGBLUP) using the Algorithm for Proven and Young (APY). Bias, dispersion, and accuracy of GEBV for the validation birds, i.e., from the most recent generation, were computed. The bias and dispersion were estimated with the LR-method, whereas accuracy was estimated by the LR-method and predictive ability. Models with fixed UPG and estimated inbreeding or random UPG resulted in similar GEBV. The inclusion of an extra fixed effect in the model made the GEBV unbiased and reduced the inflation, while models without such an effect were significantly biased. Genomic predictions with metafounders were slightly biased and inflated due to the unbalanced number of observations assigned to each metafounder. When combining local and external populations, the greatest accuracy and smallest bias can be obtained by adding an extra fixed effect to account for the origin of parents plus UPG with estimated inbreeding or random UPG. To estimate the accuracy, the LR-method is more consistent among models, whereas predictive ability greatly depends on the model specification, that is, on the fixed effects included in the model. When changing model specification, the largest variation for the LR-method was 20%, while for predictive ability it was 110%.

Hidalgo J, Lourenco D, Tsuruta S, Masuda Y, Breen V, Hawken R, Bermann M, Misztal I. 44 Accuracy of Genomic Predictions over Time in Broilers. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]

Abstract

The objectives of this research were to investigate trends for accuracy of genomic predictions over time in a broiler population accumulating data, and to test if data from distant generations are useful in maintaining the accuracy of genomic predictions in selection candidates. The data contained 820k phenotypes for a growth trait (GROW), 200k for two feed efficiency traits (FE1 and FE2), and 42k for a dissection trait (DT). The pedigree included 1.2M animals across 7 years; over 100k from the last 4 years were genotyped. Accuracy was calculated by the linear regression method. Before genotypes became available for training populations, accuracy was nearly stable despite the accumulation of phenotypes and pedigrees. When the first year of genomic data was included in the training population, accuracy increased 56, 77, 39, and 111% for GROW, FE1, FE2, and DT, respectively. With genomic information, the accuracies increased every year except the last one, when they declined for GROW and FE2. The decay of accuracy over time was evaluated in progeny, grand-progeny, and great-grand-progeny of training populations. Without genotypes, the average decline in accuracy across traits was 41% from progeny to grand-progeny, and 19% from grand-progeny to great-grand-progeny. With genotypes, the average decline across traits was 14% from progeny to grand-progeny, and 2% from grand-progeny to great-grand-progeny. The accuracies in the last 3 generations were the same when the training population included 5 or 2 years of data, and a marginal decrease was observed when the training population included only 1 year of data. Training sets including genomic information provided an increased accuracy and persistence of genomic predictions compared to training sets without genomic data. The two most recent years of data were enough to maintain the accuracy of predictions in selection candidates.

Steyn Y, Lourenco D, Chen CY, Valente B, Holl JW, Herring WO, Misztal I. 28 Optimal Definition of Contemporary Groups for Crossbred Pigs in a Joint Purebred and Crossbred Genetic Evaluation. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]

Abstract

In the pig industry, purebred animals are raised in nucleus herds and selected to produce crossbred progeny for commercial environments. Crossbred and purebred performances are different, correlated traits. All purebreds in a pen are assessed together at the end of a performance test. However, only selected crossbreds are removed (based on visual inspection) and measured at different times, creating many small contemporary groups (CG). This may reduce EBV prediction accuracies. Considering this sequential recording of crossbreds, the objective was to investigate the impact of different CG definitions on genetic parameters and EBV prediction accuracy for crossbred traits. Growth rate (GP) and ultrasound backfat (BFP) records were available for purebreds. Lifetime growth (GX) and backfat (BFX) were recorded on crossbreds. Different CG were tested: CG_all included farm, sex, birth year and birth week, CG_week added slaughter week, and CG_day used slaughter day instead of week. Data of 124,709 crossbreds were used. The purebred phenotypes (62,274 animals) included 3 generations of purebred ancestors of these crossbreds and their CG mates. Variance components for 4-trait models with different CG definitions were estimated with AIREML. Purebred traits’ variance components remained stable across CG definitions and varied slightly for BFX. Additive genetic variances (and heritabilities) for GX fluctuated more: 812±36 (0.28±0.01), 257±15 (0.17±0.01) and 204±13 (0.15±0.01) for CG_all, CG_week, and CG_day, respectively. The predictive ability, linear regression (LR) accuracy, bias, and dispersion of crossbred traits in crossbreds favored CG_day, but correlations with unadjusted phenotypes favored CG_all. In purebreds, CG_all showed the best LR accuracy, while showing small relative differences in bias and dispersion. Different CG scenarios showed no relevant impact on BFX EBV. 
This study shows that different CG definitions may affect evaluation stability and ranking. Results suggest that ignoring slaughter dates in CG is more appropriate for estimating crossbred trait EBV for purebred animals.

McWhorter TM, Garcia A, Bermann M, Legarra A, Aguilar I, Misztal I, Lourenco D. 36 Effect of Blending and Tuning Relationship Matrices in Single-step Genomic BLUP. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]

Leite N, Chen CY, Herring WO, Tsuruta S, Lourenco D. 49 Predicting Breeding Values of Purebred Pigs for Crossbred Performance Using Crossbred Phenotypes and Genotypes. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]

Abstract

Phenotyping a large number of crossbred progeny for the evaluation of purebred animals can be expensive. As genotyping with low-density panels is becoming cheaper, we aimed to evaluate the tradeoff between having different percentages of genotypes and phenotypes for crossbred progeny of candidate boars. We used the linear regression (LR) method to investigate changes in accuracy, bias, and inflation of breeding values for crossbred traits in purebred boars. A total of 304,582 purebred and 147,474 crossbred animals were phenotyped for average daily gain (ADG) and backfat thickness (BF), out of which 46,691 purebred and 13,117 crossbred animals were genotyped. Genomic information consisted of imputed genotypes for 40,247 SNP markers after quality control. A four-trait animal model under single-step GBLUP was used that included phenotypes recorded in purebred and crossbred animals as correlated traits. The LR statistics were calculated based on breeding values of young purebred sires from complete and partial data. The first complete dataset included genotypes for purebreds and phenotypes for purebreds and crossbreds, whereas the second also included genotypes for crossbreds. The partial data included phenotypes on 50% or none of the progeny of validation sires, with or without genotypes for crossbred animals. When 50% of the progeny had phenotypes, adding genotypes for crossbred progeny marginally increased accuracy of ADG (0.77 vs 0.78) for 47 boars with more than 150 progeny with phenotypes. No increase was observed for BF. A small increase in bias and inflation by adding crossbred genotypes was observed for ADG but not for BF. When no phenotypes were available for crossbred progeny, accuracy for both traits was lower but improved with crossbred genotypes for ADG (0.61 vs 0.64) for boars with more than 150 progeny. The tradeoff between phenotypes and genotypes should be further investigated in larger datasets with more validation boars.

Hollifield MK, Lourenco D, Tsuruta S, Bermann M, Howard JT, Misztal I. 33 Impact of Including the Cause of Missing Records on Genetic Evaluations for Growth in Commercial Pigs. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Abstract

It is of interest to evaluate crossbred pigs for hot carcass weight (HCW) and birth weight (BW); however, obtaining an HCW record is dependent on livability (LIV) and retained tag (RT). The purpose of this study was to analyze how HCW evaluations are affected when herd removal and missing identification are included in the model and to examine whether accounting for the reasons for missing traits improves the accuracy of predicting breeding values. Pedigree information was available for 1,965,077 purebred and crossbred animals. Records for 503,716 commercial three-way crossbred terminal animals from 2014 to 2019 were provided by Smithfield Premium Genetics. Two pedigree-based models were compared; model 1 (M1) was a threshold-linear model with all four traits (BW, HCW, RT, and LIV), and model 2 (M2) was a linear model including only BW and HCW. The fixed effects used in the model were contemporary group, sex, age at harvest (for HCW only), and dam parity. The random effects included direct additive genetic and random litter effects. Accuracy, dispersion, bias, and Pearson correlations were estimated using the linear regression method. The heritabilities were 0.11, 0.07, 0.02, and 0.04 for BW, HCW, RT, and LIV, respectively, with standard errors less than 0.01. No difference was observed in heritabilities or accuracies for BW and HCW between M1 and M2. Accuracies were 0.33, 0.37, 0.19, and 0.23 for BW, HCW, RT, and LIV, respectively. The genetic correlation between BW and RT was 0.34 ± 0.03, and between BW and LIV was 0.56 ± 0.03. The positive and moderate genetic correlations between BW and other traits imply that a heavier BW resulted in a higher probability of surviving to harvest. Despite the heritable and correlated aspects of RT and LIV, results imply no major differences between M1 and M2; hence, it is unnecessary to include these traits in classical models for BW and HCW.

Abdollahi-Arpanahi R, Lourenco D, Misztal I. 35 Detecting Effective Starting Point of Genomic Selection by Divergent Trends from BLUP and ssGBLUP. J Anim Sci 2021. [DOI: 10.1093/jas/skab235.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]