1
Hidalgo J, Tsuruta S, Gonzalez D, de Oliveira G, Sanchez M, Kulkarni A, Przybyla C, Vargas G, Vukasinovic N, Misztal I, Lourenco D. Converting estimated breeding values from the observed to probability scale for health traits. J Dairy Sci 2024; 107:9628-9637. [PMID: 39004126 DOI: 10.3168/jds.2024-24767] [Received: 02/08/2024] [Accepted: 06/12/2024] [Indexed: 07/16/2024]
Abstract
Dairy cattle health traits are paramount from a welfare and economic viewpoint, and modern breeding programs therefore prioritize the genetic improvement of these traits. Estimated breeding values for health traits are published as the probability of animals staying healthy. They are obtained using threshold models, which assume that the observed binary phenotype (i.e., healthy or sick) is dictated by an underlying normally distributed liability exceeding or not exceeding a threshold. This methodology requires significant computing time and faces convergence challenges, as it implies a nonlinear system of equations. Linear models have more straightforward computations and provide a robust approximation to threshold models; thus, they could be used to overcome these challenges. However, linear models yield estimated breeding values on the observed scale, requiring an approximation to the liability scale analogous to that from threshold models to later obtain the estimated breeding values on the probability scale. In addition, the robustness of the approximation of linear to threshold models depends on the amount of information and the incidence of the trait, with extreme incidence (i.e., ≤5%) deviating from optimal approximation. Our objective was to test a transformation from the observed to the liability scale, and then to the probability scale, in the genetic evaluation of health traits with moderate and very low (extreme) incidence. Data comprised displaced abomasum (5.1 million), ketosis (3.6 million), lameness (5 million), and mastitis (6.3 million) records from a Holstein population with a pedigree of 6 million animals, of which 1.7 million were genotyped. Univariate threshold and linear models were fitted to predict breeding values. The agreement between estimated breeding values on the probability scale derived from threshold and linear models was assessed using Spearman rank correlations and a comparison of the estimated breeding value distributions. Correlations were at least 0.95, and estimated breeding value distributions almost entirely overlapped for all the traits but displaced abomasum, the trait with the lowest incidence (2%). Computing time was ∼3 times longer for threshold than for linear models. In this Holstein population, the approximation was suboptimal for a trait with extreme incidence (2%). However, when the incidence was ≥6%, the approximation was robust, and its use is recommended along with linear models for analyzing categorical traits in large populations to ease the computational burden.
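The abstract does not spell out the transformation itself, so the following is only a minimal sketch of one textbook way to go from the observed scale to the liability scale and then to a probability. It assumes a standard normal liability, an observed-scale EBV expressed as a deviation in disease risk (1 = sick), and the first-order approximation that divides that deviation by the normal density at the threshold. The function name `observed_to_probability` and its parameters are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability assumed standard normal (assumption)


def observed_to_probability(ebv_observed: float, incidence: float) -> float:
    """Approximate probability of staying healthy for one animal.

    ebv_observed: EBV from a linear model, as a deviation in disease risk
                  (hypothetical input, 1 = sick on the observed scale).
    incidence:    population frequency of the disease, between 0 and 1.
    """
    # Threshold that cuts off the diseased fraction of the liability distribution.
    threshold = STD_NORMAL.inv_cdf(1.0 - incidence)
    # Height of the normal density at the threshold.
    density = STD_NORMAL.pdf(threshold)
    # First-order approximation: a small shift in liability changes the
    # disease risk by roughly density * shift, so invert that relation.
    ebv_liability = ebv_observed / density
    # Shifting the animal's liability mean changes the area above the threshold.
    prob_sick = 1.0 - STD_NORMAL.cdf(threshold - ebv_liability)
    return 1.0 - prob_sick


if __name__ == "__main__":
    for ebv in (-0.02, 0.0, 0.02):
        print(ebv, round(observed_to_probability(ebv, incidence=0.06), 4))
```

Because the density at the threshold becomes very small when the incidence is extreme, the division step amplifies any error in the observed-scale EBV, which is in line with the weaker agreement the abstract reports for the 2% incidence trait.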
Affiliation(s)
- Jorge Hidalgo
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
- Shogo Tsuruta
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
- Miguel Sanchez
  - Zoetis Genetics and Precision Animal Health, Kalamazoo, MI 49007
- Asmita Kulkarni
  - Zoetis Genetics and Precision Animal Health, Kalamazoo, MI 49007
- Cory Przybyla
  - Zoetis Genetics and Precision Animal Health, Kalamazoo, MI 49007
- Giovana Vargas
  - Zoetis Genetics and Precision Animal Health, Kalamazoo, MI 49007
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
2
Bussiman F, Alves AAC, Richter J, Hidalgo J, Veroneze R, Oliveira T. Supervised Machine Learning Techniques for Breeding Value Prediction in Horses: An Example Using Gait Visual Scores. Animals (Basel) 2024; 14:2723. [PMID: 39335312 PMCID: PMC11429212 DOI: 10.3390/ani14182723] [Received: 08/13/2024] [Revised: 09/13/2024] [Accepted: 09/17/2024] [Indexed: 09/30/2024] Open
Abstract
Gait scores are widely used in the genetic evaluation of horses. However, the nature of such measurement may limit genetic progress since there is subjectivity in phenotypic information. This study aimed to assess the application of machine learning techniques in the prediction of breeding values for five visual gait scores in Campolina horses: dissociation, comfort, style, regularity, and development. The dataset contained over 5000 phenotypic records with 107,951 horses (14 generations) in the pedigree. A fixed model was used to estimate least-square solutions for fixed effects and adjusted phenotypes. Variance components and breeding values (EBV) were obtained via a multiple-trait model (MTM). Adjusted phenotypes and fixed effects solutions were used to train machine learning models (using the EBV from MTM as target variable): artificial neural network (ANN), random forest regression (RFR) and support vector regression (SVR). To validate the models, the linear regression method was used. Accuracy was comparable across all models (but it was slightly higher for ANN). The highest bias was observed for ANN, followed by MTM. Dispersion varied according to the trait; it was higher for ANN and the lowest for MTM. Machine learning is a feasible alternative to EBV prediction; however, this method will be slightly biased and over-dispersed for young animals.
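The validation statistics named in the abstract (bias, dispersion, accuracy) can be illustrated with a short sketch. It assumes that the "linear regression method" compares EBV predicted from a partial dataset with EBV predicted from the whole dataset, as that method is usually described. The function name `lr_statistics` and the toy inputs are hypothetical, and the correlation is only a relative accuracy measure, not the accuracy reported in the paper.

```python
from math import sqrt
from statistics import fmean


def lr_statistics(ebv_partial: list[float], ebv_whole: list[float]) -> dict[str, float]:
    """Bias, dispersion and correlation between partial and whole EBV."""
    n = len(ebv_partial)
    mean_p = fmean(ebv_partial)
    mean_w = fmean(ebv_whole)
    # Sample (co)variances with n - 1 in the denominator.
    cov_pw = sum((p - mean_p) * (w - mean_w)
                 for p, w in zip(ebv_partial, ebv_whole)) / (n - 1)
    var_p = sum((p - mean_p) ** 2 for p in ebv_partial) / (n - 1)
    var_w = sum((w - mean_w) ** 2 for w in ebv_whole) / (n - 1)
    return {
        # Expected to be 0 when predictions are unbiased.
        "bias": mean_p - mean_w,
        # Slope of whole on partial; expected to be 1 without over/under-dispersion.
        "dispersion": cov_pw / var_p,
        # Agreement between the two sets of predictions.
        "correlation": cov_pw / sqrt(var_p * var_w),
    }


if __name__ == "__main__":
    partial = [0.10, -0.25, 0.40, 0.05]  # toy values, not from the study
    whole = [0.12, -0.20, 0.35, 0.02]
    print(lr_statistics(partial, whole))
```

A slope below 1 indicates over-dispersed partial predictions, which is the pattern the abstract describes for young animals under the machine learning models.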
Affiliation(s)
- Fernando Bussiman
  - Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA
- Anderson A. C. Alves
  - Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA
- Jennifer Richter
  - Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA
- Jorge Hidalgo
  - Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA
- Renata Veroneze
  - Animal and Dairy Science Department, University of Georgia, Athens, GA 30602, USA
  - Animal Science Department, Federal University of Viçosa, Viçosa 36570-900, Brazil
- Tiago Oliveira
  - Statistics Department, State University of Paraíba, Campina Grande 58429-500, Brazil
3
Hollifield MK, Lourenco D, Misztal I. Estimation of heritability with genomic information by method R. J Anim Breed Genet 2024; 141:550-558. [PMID: 38523564 DOI: 10.1111/jbg.12863] [Received: 09/30/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 03/26/2024]
Abstract
Estimating heritabilities with large genomic models by established methods such as restricted maximum likelihood (REML) or Bayesian via Gibbs sampling is computationally expensive. Alternatively, heritability can be estimated indirectly by method R and by maximum predictivity, referred to as MaxPred here, at a much lower computing cost. By method R, the heritability used for predictions with whole and partial data is considered the best estimate when the predictions based on partial data are unbiased relative to those with the complete data. By MaxPred, the heritability estimate is the one that maximizes predictivity. This study compared heritability estimation with genomic information using average information REML (AI-REML), method R and MaxPred. A simulated population was generated with ten generations of 5000 animals each and an effective population size of 80. Each animal had one record for a trait with a heritability of 0.3 and a phenotypic variance of 10.0, and was genotyped at 50k SNP. In method R, the heritability estimate is found when the expectation of a regression coefficient is equal to one. The regression is the EBV of selection candidates calculated with the whole dataset regressed on the EBV of candidates calculated from a partial dataset. In this study, we used the GBLUP framework and therefore, GEBV was calculated. The partial dataset was created by removing the last generation of phenotypes. Predictivity was defined as the correlation between the adjusted phenotypes of the selection candidates and their GEBV calculated from the partial data. We estimated the heritability for populations that included between three and ten generations. In every scenario, predictivity increased as more data were used and was the highest at the simulated heritability. However, the predictivity for all data subsets and all heritabilities compared did not differ by more than 0.01, suggesting MaxPred is not the best indicator for heritability estimation. For the whole dataset, the heritability was estimated as 0.30 ± 0.01, 0.26 ± 0.01 and 0.30 ± 0.04 for AI-REML without genomics, AI-REML with genomics and method R with genomics, respectively. Heritability estimation with genomics by method R reduced computing time by 83% compared to AI-REML with genomics, from 9.5 to 1.6 h on average. Method R has the potential to estimate heritabilities with large genomic information at a low cost when many generations of animals are present; however, the standard error can be high when only a few iterations are used.
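The search that method R implies (find the heritability at which the regression of whole-data EBV on partial-data EBV equals one) can be sketched as a root-finding problem. The sketch below uses plain bisection, which is an assumption: the abstract does not say which search was used. The callable `regression_at` is hypothetical and stands for a full rerun of the GBLUP predictions with whole and partial data at a trial heritability; the toy stand-in in the usage lines exists only so the example runs.

```python
from typing import Callable


def method_r_heritability(
    regression_at: Callable[[float], float],
    low: float = 0.01,
    high: float = 0.99,
    tol: float = 1e-4,
) -> float:
    """Bisection search for the heritability where the regression equals 1.

    regression_at(h2) must return the regression of whole-data EBV on
    partial-data EBV obtained when h2 is assumed (hypothetical callable).
    """
    f_low = regression_at(low) - 1.0
    f_high = regression_at(high) - 1.0
    if f_low * f_high > 0:
        raise ValueError("regression does not cross 1 inside [low, high]")
    while high - low > tol:
        mid = 0.5 * (low + high)
        f_mid = regression_at(mid) - 1.0
        if f_low * f_mid <= 0:
            high = mid  # the crossing lies in the lower half
        else:
            low, f_low = mid, f_mid  # the crossing lies in the upper half
    return 0.5 * (low + high)


if __name__ == "__main__":
    # Toy stand-in: a regression that equals 1 exactly at h2 = 0.3.
    toy_regression = lambda h2: 1.0 + (0.3 - h2)
    print(round(method_r_heritability(toy_regression), 3))
```

In practice each evaluation of `regression_at` is a full prediction run, so the number of trial heritabilities, and the number of partial-data replicates behind each regression, drives both the cost and the standard error mentioned in the abstract.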
Affiliation(s)
- Mary Kate Hollifield
  - Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, USA
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, USA
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, USA
4
Melo TP, Zwirtes AK, Silva AA, Lázaro SF, Oliveira HR, Silveira KR, Santos JCG, Andrade WBF, Kluska S, Evangelho LA, Oliveira HN, Tonhati H. Unknown parent groups and truncated pedigree in single-step genomic evaluations of Murrah buffaloes. J Dairy Sci 2024:S0022-0302(24)00847-6. [PMID: 38825116 DOI: 10.3168/jds.2023-24608] [Received: 12/23/2023] [Accepted: 04/16/2024] [Indexed: 06/04/2024]
Abstract
Missing pedigree may produce bias in genomic evaluations. Thus, strategies to deal with this problem have been proposed, such as using unknown parent groups (UPG) or truncated pedigrees. The aim of this study was to investigate the impact of modeling missing pedigree under ssGBLUP evaluations for productive and reproductive traits in dairy buffaloes using different approaches: 1) traditional BLUP without UPG (BLUP), 2) traditional BLUP including UPG (BLUP/UPG), 3) ssGBLUP without UPG (ssGBLUP), 4) ssGBLUP including UPG in the A and A22 matrices (ssGBLUP/A_UPG), 5) ssGBLUP including UPG in all elements of the H matrix (ssGBLUP/H_UPG), 6) BLUP with pedigree truncation for the last 3 generations (BLUP/truncated), and 7) ssGBLUP with pedigree truncation for the last 3 generations (ssGBLUP/truncated). UPGs were not used in the scenarios with truncated pedigree. A total of 3,717, 4,126 and 3,823 first-lactation records for accumulated 305 d milk yield (MY), age at first calving (AFC) and lactation length (LL), respectively, were used. Accuracies ranged from 0.27 for LL (BLUP) to 0.46 for MY (BLUP), bias ranged from -0.62 for MY (ssGBLUP) to 0.0002 for AFC (BLUP/truncated), and dispersion ranged from 0.88 for MY (BLUP/ A_UPG) to 1.13 for LL (BLUP). Genetic trends showed genetic gains for all traits across 20 years of selection, and the impact of including genomic information, UPG or pedigree truncation on GEBV accuracies varied among the evaluated traits. Overall, methods using UPG, pedigree truncation and genomic information exhibited potential to improve GEBV accuracies, bias and dispersion for all traits compared with other methods. Truncated scenarios promoted high genetic gains. In small populations with few genotyped animals, combining truncated pedigree or UPG with genomic information is a feasible approach to deal with missing pedigrees.
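Pedigree truncation can be illustrated with a short sketch. The abstract only states that the pedigree was truncated to the last 3 generations, so the interpretation below is an assumption: starting from a set of focal animals (for example, those with records or genotypes), keep ancestors up to three generations back and treat every parent beyond that depth as unknown. The function name `truncate_pedigree` and the dictionary layout are hypothetical.

```python
from typing import Optional

Parents = tuple[Optional[str], Optional[str]]  # (sire, dam); None means unknown


def truncate_pedigree(
    pedigree: dict[str, Parents],
    focal_animals: set[str],
    generations: int = 3,
) -> dict[str, Parents]:
    """Keep focal animals plus ancestors up to `generations` back."""
    kept = set(focal_animals)
    frontier = set(focal_animals)
    for _ in range(generations):
        parents = set()
        for animal in frontier:
            for parent in pedigree.get(animal, (None, None)):
                if parent is not None:
                    parents.add(parent)
        # Only expand ancestors that were not already visited at a shallower depth.
        frontier = parents - kept
        kept |= parents
    truncated = {}
    for animal in kept:
        sire, dam = pedigree.get(animal, (None, None))
        # Parents outside the kept set become unknown in the truncated pedigree.
        truncated[animal] = (
            sire if sire in kept else None,
            dam if dam in kept else None,
        )
    return truncated


if __name__ == "__main__":
    # Toy single-line pedigree: a <- b <- c <- d <- e (sire line only).
    toy = {"a": ("b", None), "b": ("c", None), "c": ("d", None),
           "d": ("e", None), "e": (None, None)}
    print(truncate_pedigree(toy, {"a"}))
```

In the toy pedigree, animal `e` is four generations behind the focal animal, so it is dropped and `d` becomes a founder with unknown parents.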
Affiliation(s)
- T P Melo
  - Department of Animal Science, Federal University of Santa Maria (UFSM), Santa Maria, 97105-900, Rio Grande do Sul, Brazil
- A K Zwirtes
  - Department of Animal Science, Federal University of Santa Maria (UFSM), Santa Maria, 97105-900, Rio Grande do Sul, Brazil
- A A Silva
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
- S F Lázaro
  - Department of Animal Biosciences, University of Guelph, Guelph, N1G 1Y2, Ontario, Canada
- H R Oliveira
  - Department of Animal Sciences, Purdue University, West Lafayette, 47906, Indiana, USA
- K R Silveira
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
- J C G Santos
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
- W B F Andrade
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
- S Kluska
  - Brazilian Association of Girolando Breeders
- L A Evangelho
  - Department of Animal Science, Federal University of Santa Maria (UFSM), Santa Maria, 97105-900, Rio Grande do Sul, Brazil
- H N Oliveira
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
- H Tonhati
  - Department of Animal Science, Sao Paulo State University (UNESP), Jaboticabal 14884-900, Sao Paulo, Brazil
5
Sosa-Madrid BS, Maniatis G, Ibáñez-Escriche N, Avendaño S, Kranis A. Genetic Variance Estimation over Time in Broiler Breeding Programmes for Growth and Reproductive Traits. Animals (Basel) 2023; 13:3306. [PMID: 37958060 PMCID: PMC10649193 DOI: 10.3390/ani13213306] [Received: 08/18/2023] [Revised: 10/12/2023] [Accepted: 10/19/2023] [Indexed: 11/15/2023] Open
Abstract
Monitoring the genetic variance of traits is a key priority to ensure the sustainability of breeding programmes in populations under directional selection, since directional selection can decrease genetic variation over time. Studies monitoring changes in genetic variation have typically used long-term data from small experimental populations selected for a handful of traits. Here, we used a large dataset from a commercial breeding line spread over a period of twenty-three years. A total of 2,059,869 records and 2,062,112 animals in the pedigree were used for the estimations of variance components for the traits: body weight (BWT; 2,059,869 records) and hen-housed egg production (HHP; 45,939 records). Data were analysed with three estimation approaches: sliding overlapping windows, under frequentist (restricted maximum likelihood (REML)) and Bayesian (Gibbs sampling) methods; expected variances using coefficients of the full relationship matrix; and a "double trait covariances" analysis by computing correlations and covariances between the same trait in two distinct consecutive windows. The genetic variance showed marginal fluctuations in its estimation over time. Whereas genetic, maternal permanent environmental, and residual variances were similar for BWT in both the REML and Gibbs methods, variance components when using the Gibbs method for HHP were smaller than the variances estimated when using REML. Large amounts of data were needed to estimate variance components and detect their changes. For Gibbs (REML), the changes in genetic variance from 1999-2001 to 2020-2022 were 82.29 to 93.75 (82.84 to 93.68) for BWT and 76.68 to 95.67 (98.42 to 109.04) for HHP. Heritability presented a similar pattern as the genetic variance estimation, changing from 0.32 to 0.36 (0.32 to 0.36) for BWT and 0.16 to 0.15 (0.21 to 0.18) for HHP. On the whole, genetic parameters tended to increase slightly over time. The expected variance estimates were lower than the estimates when using overlapping windows. This indicates the low effect of the drift-selection process on the genetic variance, or likely, the presence of genetic variation sources compensating for the loss. Double trait covariance analysis confirmed the maintenance of variances over time, presenting genetic correlations >0.86 for BWT and >0.82 for HHP. Monitoring genetic variance in broiler breeding programmes is important to sustain genetic progress. Although the genetic variances of both traits fluctuated over time, in some windows, particularly between 2003 and 2020, increasing trends were observed, which warrants further research on the impact of other factors, such as novel mutations, operating on the dynamics of genetic variance.
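The sliding overlapping windows mentioned in the abstract can be sketched as follows. The window labels 1999-2001 and 2020-2022 suggest three-year windows, but the step between consecutive windows is not stated, so the one-year step used here is an assumption. The function name `sliding_windows` and its parameters are hypothetical.

```python
def sliding_windows(
    first_year: int,
    last_year: int,
    width: int = 3,
    step: int = 1,
) -> list[tuple[int, int]]:
    """Return (start_year, end_year) pairs for overlapping windows.

    width: number of calendar years per window (3 matches the labels
           in the abstract).
    step:  shift between consecutive windows (assumed, not stated).
    """
    windows = []
    start = first_year
    # Stop once a full-width window would run past the last year.
    while start + width - 1 <= last_year:
        windows.append((start, start + width - 1))
        start += step
    return windows


if __name__ == "__main__":
    for start, end in sliding_windows(1999, 2022):
        # Each window would select the records used in one variance
        # component analysis (REML or Gibbs sampling in the study).
        print(f"{start}-{end}")
```

With a step smaller than the width, consecutive windows share records, which smooths the estimated trend but also makes neighbouring estimates correlated.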
Affiliation(s)
- Bolívar Samuel Sosa-Madrid
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
  - Institute for Animal Science and Technology, Universitat Politècnica de València, P.O. Box 2201, 46071 Valencia, Spain
- Noelia Ibáñez-Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de València, P.O. Box 2201, 46071 Valencia, Spain
- Andreas Kranis
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian EH25 9RG, UK
  - Aviagen Ltd., Newbridge, Edinburgh EH28 8SZ, UK