1
Nilforooshan MA. Short Communication: Reduced GBLUP equations to core animals in the algorithm for proven and young (APY). Vet Anim Sci 2024; 23:100334. [PMID: 38283332 PMCID: PMC10820638 DOI: 10.1016/j.vas.2024.100334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 01/02/2024] [Accepted: 01/02/2024] [Indexed: 01/30/2024] Open
Abstract
The number of animal genotypes is rapidly increasing, and a major challenge for animal models is inverting the genomic relationship matrix (G). Matrix G has limited dimensionality, and the algorithm for proven and young (APY) makes inverting a large G possible via the inverse of a diagonal block of G whose size equals the dimensionality of G. APY divides genotyped animals into core and non-core groups, and the breeding values of non-core animals are conditioned on those of core animals. The equations for non-core animals can therefore be omitted from the model. A methodology is presented for reducing APY genomic BLUP (GBLUP) to equations for core animals only. Using a small example dataset, the method was validated by showing that the full and reduced models gave identical results. Absorbing the fixed effect equations into the random effect equations reduced the number of equations to solve while producing the same random effect solutions. Extending the method to APY single-step GBLUP (ssGBLUP) was not computationally justifiable. Other reduction techniques exist for ssGBLUP (with or without APY) that work by reducing the number of equations for non-genotyped animals. The number of equations can be reduced further by data pruning.
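As a rough illustration of the APY idea summarized in this abstract, the Python sketch below builds an APY-style inverse of G from the inverse of the core block plus a diagonal matrix of conditional variances for non-core animals. It follows the commonly described APY form and is not code from the paper; the function name `apy_inverse`, the `core` argument, and the dense NumPy storage are assumptions made for illustration.

```python
import numpy as np

def apy_inverse(G, core):
    """Sketch of an APY-style inverse of G (hypothetical helper).

    Only the core block of G is inverted directly. Each non-core animal
    contributes one scalar: its variance conditional on the core animals.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # regression of non-core on core animals

    # Diagonal of Gnn - Gnc Gcc^{-1} Gcn; off-diagonals are ignored by APY.
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    G_inv[np.ix_(core, noncore)] = -P * m_inv
    G_inv[np.ix_(noncore, core)] = (-P * m_inv).T
    G_inv[noncore, noncore] = m_inv  # diagonal non-core block
    return G_inv
```

With a single non-core animal the result coincides with the ordinary inverse, because the conditional variance block is then a scalar. With more non-core animals it is an approximation whose quality depends on how well the core animals span G.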
2
Liu H, Yu S. A dimensionality-reduction genomic prediction method without direct inverse of the genomic relationship matrix for large genomic data. Plant Cell Reports 2023; 42:1825-1832. [PMID: 37750948 DOI: 10.1007/s00299-023-03069-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/08/2023] [Indexed: 09/27/2023]
Abstract
KEY MESSAGE A new genomic prediction method (RHPP) was developed by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of a core population, and the preconditioned conjugate gradient (PCG) algorithm.
Computational efficiency is becoming a pressing issue in the practical application of genomic prediction because of the large amount of data generated by high-throughput genotyping technology. In this study, we developed a fast genomic prediction method, RHPP, by combining randomized Haseman-Elston regression (RHE-reg), PCR based on genomic information of a core population, and the preconditioned conjugate gradient (PCG) algorithm. The simulation results demonstrated similar prediction accuracy for RHPP and GBLUP, and significantly higher computational efficiency for RHPP as the number of individuals increased. The results for real datasets of both bread wheat and loblolly pine demonstrated that RHPP had similar or better predictive accuracy than GBLUP in most cases. In the future, RHPP may be an attractive choice for analyzing large-scale and high-dimensional data.
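To make the PCG component mentioned in this abstract more concrete, here is a minimal Python sketch of a preconditioned conjugate gradient solver with a Jacobi (diagonal) preconditioner for a symmetric positive definite system. It is a generic textbook form, not the authors' RHPP implementation; the function name `pcg`, the choice of preconditioner, and the stopping rule are assumptions.

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (illustrative sketch).

    Uses the inverse of diag(A) as preconditioner, so A is never inverted.
    """
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return x  # trivial right-hand side

    m_inv = 1.0 / np.diag(A)      # Jacobi preconditioner
    r = b - A @ x                 # initial residual
    z = m_inv * r                 # preconditioned residual
    p = z.copy()                  # first search direction
    rz = r @ z

    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

Each iteration needs only one matrix-vector product, which is why solvers of this kind are attractive when the coefficient matrix is too large to invert directly.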
Affiliation(s)
- Hailan Liu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China.
- Shizhou Yu
  - Molecular Genetics Key Laboratory of China Tobacco, Guizhou Academy of Tobacco Science, Guiyang, 550081, Guizhou, China.
3
Vandenplas J, Ten Napel J, Darbaghshahi SN, Evans R, Calus MPL, Veerkamp R, Cromie A, Mäntysaari EA, Strandén I. Efficient large-scale single-step evaluations and indirect genomic prediction of genotyped selection candidates. Genet Sel Evol 2023; 55:37. [PMID: 37291510 DOI: 10.1186/s12711-023-00808-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Accepted: 04/28/2023] [Indexed: 06/10/2023] Open
Abstract
BACKGROUND Single-step genomic best linear unbiased prediction (ssGBLUP) models allow the combination of genomic, pedigree, and phenotypic data into a single model, which is computationally challenging for large genotyped populations. In practice, genotypes of animals without their own phenotype and progeny, so-called genotyped selection candidates, can become available after genomic breeding values have been estimated by ssGBLUP. In some breeding programmes, genomic estimated breeding values (GEBV) for these animals should be known shortly after obtaining genotype information, but recomputing GEBV using the full ssGBLUP takes too much time. In this study, we first compare two equivalent formulations of ssGBLUP models, i.e. one that is based on the Woodbury matrix identity applied to the inverse of the genomic relationship matrix, and one that is based on marker equations. Second, we present computationally fast approaches to indirectly compute GEBV for genotyped selection candidates, without the need to run the full ssGBLUP evaluation.
RESULTS The indirect approaches use information from the latest ssGBLUP evaluation and rely on the decomposition of GEBV into its components. The two equivalent ssGBLUP models and the indirect approaches were tested on a six-trait calving difficulty model using Irish dairy and beef cattle data that include 2.6 million genotyped animals, of which about 500,000 were considered as genotyped selection candidates. When using the same computational approaches, the solving phase of the two equivalent ssGBLUP models showed similar requirements for memory and time per iteration. The computational differences between them were due to the preprocessing phase of the genomic information. Regarding the indirect approaches, compared to GEBV obtained from single-step evaluations including all genotypes, indirect GEBV had correlations higher than 0.99 for all traits while showing little dispersion and level bias.
CONCLUSIONS ssGBLUP predictions for the genotyped selection candidates were accurately approximated using the presented indirect approaches, which are more memory efficient and computationally faster than solving a full ssGBLUP evaluation. Thus, indirect approaches can be used even on a weekly basis to estimate GEBV for newly genotyped animals, while the full single-step evaluation is done only a few times within a year.
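The Woodbury matrix identity mentioned in this abstract can be sketched in a few lines of Python. The example below inverts a matrix of the form diag(c) + U U' by solving only a small k x k system, where k is the number of columns of U. It is a generic illustration, not the formulation used in the paper: the diagonal matrix stands in for whatever easily inverted component the real model uses, and the names `woodbury_inverse`, `c_diag`, and `U` are hypothetical.

```python
import numpy as np

def woodbury_inverse(c_diag, U):
    """Inverse of diag(c_diag) + U @ U.T via the Woodbury identity (sketch).

    (C + U U')^{-1} = C^{-1} - C^{-1} U (I + U' C^{-1} U)^{-1} U' C^{-1}
    Only a k x k system is solved, with k = U.shape[1].
    """
    c_inv = 1.0 / np.asarray(c_diag, dtype=float)
    CU = c_inv[:, None] * U                  # C^{-1} U
    k = U.shape[1]
    small = np.eye(k) + U.T @ CU             # I + U' C^{-1} U
    return np.diag(c_inv) - CU @ np.linalg.solve(small, CU.T)
```

The identity pays off when U has far fewer columns than rows, since the cost is then dominated by the small system rather than by the full matrix.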
Affiliation(s)
- Jeremie Vandenplas
  - Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands.
- Jan Ten Napel
  - Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Ross Evans
  - Irish Cattle Breeding Federation, Highfield House, Newcestown Road, Bandon, Cork, Ireland
- Mario P L Calus
  - Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Roel Veerkamp
  - Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Andrew Cromie
  - Irish Cattle Breeding Federation, Highfield House, Newcestown Road, Bandon, Cork, Ireland
- Ismo Strandén
  - Natural Resources Institute Finland (Luke), Jokioinen, Finland
4
Ben Zaabza H, Van Tassell CP, Vandenplas J, VanRaden P, Liu Z, Eding H, McKay S, Haugaard K, Lidauer MH, Mäntysaari EA, Strandén I. Invited review: Reliability computation from the animal model era to the single-step genomic model era. J Dairy Sci 2023; 106:1518-1532. [PMID: 36567247 DOI: 10.3168/jds.2022-22629] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2022] [Accepted: 11/07/2022] [Indexed: 12/24/2022]
Abstract
The calculation of exact reliabilities involving the inversion of mixed model equations poses a heavy computational challenge when the system of equations is large. This has prompted the development of different approximation methods. We give an overview of the various methods and computational approaches in calculating reliability from the era before the animal model to the era of single-step genomic models. The different methods are discussed in terms of modeling, development, and applicability in large dairy cattle populations. The paper also describes the problems faced in reliability computation. Many details dispersed throughout the literature are presented in this paper. It is clear that a universal solution applicable to every model and input data may not be possible, but we point out several efficient and accurate algorithms developed recently for a variety of very large genomic evaluations.
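To show why exact reliabilities become expensive, the following Python sketch computes them for a small mixed model by inverting the full mixed model equations and reading prediction error variances from the diagonal of the inverse. It assumes a single-trait model y = Xb + Zu + e with var(u) = K*sigma2_u and var(e) = I*sigma2_e; the function name and all argument names are hypothetical, and none of the approximation methods covered by the review are represented here.

```python
import numpy as np

def exact_reliabilities(X, Z, K, sigma2_u, sigma2_e):
    """Exact reliabilities from the inverse of the mixed model equations.

    Sketch for a single-trait model; cost grows cubically with the number
    of equations, which is what motivates approximation methods.
    """
    lam = sigma2_e / sigma2_u
    p = X.shape[1]                           # number of fixed effects
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(K)],
    ])
    C = np.linalg.inv(lhs)                   # the expensive step
    pev = np.diag(C)[p:] * sigma2_e          # prediction error variances
    return 1.0 - pev / (np.diag(K) * sigma2_u)
```

The function returns one value per random effect level, each between 0 and 1 up to rounding, provided X has full column rank and K is positive definite.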
Affiliation(s)
- Hafedh Ben Zaabza
  - Department of Animal and Veterinary Sciences, University of Vermont, Burlington 05405; Animal Improvement Programs Laboratory, Agricultural Research Service, US Department of Agriculture, Beltsville, MD 20705-2350.
- Curtis P Van Tassell
  - Animal Improvement Programs Laboratory, Agricultural Research Service, US Department of Agriculture, Beltsville, MD 20705-2350
- Jeremie Vandenplas
  - Animal Breeding and Genomics, Wageningen University & Research, P.O. Box 338, 6700 AH, Wageningen, the Netherlands
- Paul VanRaden
  - Animal Improvement Programs Laboratory, Agricultural Research Service, US Department of Agriculture, Beltsville, MD 20705-2350
- Zengting Liu
  - IT Solutions for Animal Production (vit), Heinrich-Schröder-Weg 1, D-27283 Verden, Germany
- Herwin Eding
  - CRV BV, Wassenaarweg, 20, 6843 NW, Arnhem, the Netherlands
- Stephanie McKay
  - Department of Animal and Veterinary Sciences, University of Vermont, Burlington 05405
- Martin H Lidauer
  - Natural Resources Institute Finland (Luke), FI-31600 Jokioinen, Finland
- Esa A Mäntysaari
  - Natural Resources Institute Finland (Luke), FI-31600 Jokioinen, Finland
- Ismo Strandén
  - Natural Resources Institute Finland (Luke), FI-31600 Jokioinen, Finland
5
Garcia A, Aguilar I, Legarra A, Tsuruta S, Misztal I, Lourenco D. Theoretical accuracy for indirect predictions based on SNP effects from single-step GBLUP. Genet Sel Evol 2022; 54:66. [PMID: 36162979 PMCID: PMC9513904 DOI: 10.1186/s12711-022-00752-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 08/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background Although single-step GBLUP (ssGBLUP) is an animal model, SNP effects can be backsolved from genomic estimated breeding values (GEBV). Predicted SNP effects allow the computation of an indirect prediction (IP) for each individual as the sum of the SNP effects multiplied by its gene content, which is helpful when the number of genotyped animals is large, for genotyped animals not in the official evaluations, and when interim evaluations are needed. Typically, IP are obtained for new batches of genotyped individuals, all of them young and without phenotypes. Individual (theoretical) accuracies for IP are rarely reported, but they are nevertheless of interest. Our first objective was to present equations to compute the individual accuracy of IP based on the prediction error covariance (PEC) of SNP effects, which in turn is obtained from the PEC of GEBV in ssGBLUP. The second objective was to test the algorithm for proven and young (APY) in PEC computations. With large datasets, it is impossible to handle the full PEC matrix; thus, the third objective was to examine the minimum number of genotyped animals needed in PEC computations to achieve IP accuracies that are equivalent to GEBV accuracies.
Results Correlations between GEBV and IP for the validation animals using SNP effects from ssGBLUP evaluations were ≥ 0.99. When all available genotyped animals were used for PEC computations, correlations between GEBV and IP accuracy were ≥ 0.99. In addition, IP accuracies were compatible with GEBV accuracies either with direct inversion of the genomic relationship matrix (G) or using APY to obtain the inverse of G. As the number of genotyped animals included in the PEC computations decreased from around 55,000 to 15,000, correlations were still ≥ 0.96, but IP accuracies were biased downwards.
Conclusions Theoretical accuracy of indirect prediction can be successfully obtained by computing SNP PEC from GEBV PEC from ssGBLUP equations using the direct or APY inverse of G. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be underestimated. Further research is needed to approximate SNP PEC from ssGBLUP to limit the computational requirements with many genotyped animals.
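The backsolving and indirect prediction steps described in this abstract can be sketched as follows in Python. The sketch assumes the simplest possible setting, which is not necessarily the one used in the paper: an unweighted G = ZZ'/k with no blending or pedigree component, Z already centred, and G invertible. The function names `backsolve_snp_effects` and `indirect_prediction` are hypothetical.

```python
import numpy as np

def backsolve_snp_effects(Z, gebv):
    """Backsolve SNP effects from GEBV, assuming G = Z Z' / k (sketch).

    Z    : (animals x SNPs) centred gene content of evaluated animals
    gebv : GEBV of the same animals
    """
    k = Z.shape[1]
    G = Z @ Z.T / k
    return Z.T @ np.linalg.solve(G, gebv) / k

def indirect_prediction(Z_new, snp_effects):
    """IP = sum over SNPs of gene content times SNP effect."""
    return Z_new @ snp_effects
```

Under these assumptions, applying the backsolved effects to the animals they came from reproduces their GEBV exactly, which is a convenient sanity check before predicting new genotyped animals.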
Affiliation(s)
- Andre Garcia
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA.
- Ignacio Aguilar
  - Instituto Nacional de Investigación Agropecuaria (INIA), 11500, Montevideo, Uruguay
- Andres Legarra
  - UMR GenPhySE, INRA Toulouse, BP52626, 31326, Castanet Tolosan, France
- Shogo Tsuruta
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, 30602, USA