1
|
Modelling missing pedigree with metafounders and validating single-step genomic predictions in a small dairy cattle population with a great influence of foreign genetics. J Dairy Sci 2024:S0022-0302(24)00054-7. [PMID: 38310956 DOI: 10.3168/jds.2023-23732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 12/22/2023] [Indexed: 02/06/2024]
Abstract
Genetic improvement in small countries relies heavily on foreign genetics. In an importing country such as Uruguay, consideration of unknown parent groups (UPG) for foreign sires is essential. However, the use of UPG in genomic model evaluations may lead to bias in genomic estimated breeding values. The objective of this study was to compare different models including UPG or metafounders (MF) in the Uruguayan Holstein evaluation and to analyze bias, dispersion, and accuracy of (G)EBV predictions in BLUP and ssGBLUP. The gamma matrix (Γ) was estimated either by using base allele population frequencies obtained by bounded linear regression (MFbounded), or by using 2 values to design Γ, i.e., a single value for the diagonal and a different value for the off-diagonal (MFrobust). Both Γ estimators performed well in terms of GEBV predictions, but MFbounded was the best option. There is, however, some bias whose origin was not completely understood. UPG or MF seem to correctly model genetic progress for unknown parents except for the very first groups (earlier time period). As for validation bulls, bias was observed across all models, whereas for validation cows it was only observed with UPG in BLUP. Overdispersion was found in all models, but it was mostly detected in validation bulls. Ratio of accuracies indicated that ssGBLUP gave better predictions than BLUP.
|
2
|
Transmission ratio distortion regions in the context of genomic evaluation and their effects on reproductive traits in cattle. J Dairy Sci 2023; 106:7786-7798. [PMID: 37210358 DOI: 10.3168/jds.2022-23062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 04/19/2023] [Indexed: 05/22/2023]
Abstract
Transmission ratio distortion (TRD), which is a deviation from Mendelian expectations, has been associated with basic mechanisms of life such as sperm and ova fertility and viability at developmental stages of the reproductive cycle. In this study different models including TRD regions were tested for different reproductive traits [days from first service to conception (FSTC), number of services, first service nonreturn rate (NRR), and stillbirth (SB)]. Thus, in addition to a basic model with systematic and random effects, including genetic effects modeled through a genomic relationship matrix, we developed 2 additional models, including a second genomic relationship matrix based on TRD regions, and TRD regions as a random effect assuming heterogeneous variances. The analyses were performed with 10,623 cows and 1,520 bulls genotyped for 47,910 SNPs, 590 TRD regions, and a number of records ranging from 9,587 (FSTC) to 19,667 (SB). The results of this study showed the ability of TRD regions to capture some additional genetic variance for some traits; however, this did not translate into higher accuracy for genomic prediction. This could be explained by the nature of TRD itself, which may arise in different stages of the reproductive cycle. Nevertheless, important effects of TRD regions were found on SB (31 regions) and NRR (18 regions) when comparing at-risk versus control matings, especially for regions with allelic TRD pattern. Particularly for NRR, the probability of observing a nonpregnant cow increased by up to 27% for specific TRD regions, and the probability of observing stillbirth increased by up to 254%. These results support the relevance of several TRD regions on some reproductive traits, especially those with allelic patterns that have not received as much attention as recessive TRD patterns.
|
3
|
Microbiability of milk composition and genetic control of microbiota effects in sheep. J Dairy Sci 2023; 106:6288-6298. [PMID: 37474364 DOI: 10.3168/jds.2022-22948] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/26/2022] [Accepted: 02/28/2023] [Indexed: 07/22/2023]
Abstract
Recently, high-dimensional omics data are becoming available in larger quantities, and models have been developed that integrate them with genomics to understand in finer detail the relationship between genotype and phenotype, and thus improve the performance of genetic evaluations. Our objectives are to quantify the effect of including microbiome data in the genetic evaluation for dairy traits in sheep by estimating the heritability, the microbiability, and the decomposition of the microbiome effect on dairy traits into genetic and nongenetic parts. In this study we analyzed milk and rumen samples of 795 Lacaune dairy ewes. We included, as phenotype, dairy traits and milk fatty acid and protein composition; as omics measurements, 16S rRNA rumen bacterial abundances; and as genotyping, 54K SNP chip for all ewes. Two nested genomic models were used: a first model to predict the individual contributions of the genetic and microbial abundances to phenotypes, and a second model to predict the additive genetic effect of the microbial community. In addition, microbiome-wide association studies for all dairy traits were applied using the 2,059 rumen bacterial abundances, and the genetic correlations between microbiome principal components and dairy traits were estimated. Results showed that in general the inclusion of both genetic and microbiome effects did not improve the fit of the model compared with the model with the genetic effect only. In addition, for all dairy traits the total heritability was equal to the direct heritability after fitting microbiota effects, because the microbiability was almost zero for most dairy traits and the heritability of the microbial community was very close to zero. Microbiome-wide association studies did not show operational taxonomic units with major effect for any of the dairy traits evaluated, and the genetic correlations between the first 5 principal components and dairy traits were low to moderate. 
So far, we can conclude that, using a substantial data set of 795 Lacaune dairy ewes, rumen bacterial abundances do not provide improved genetic evaluation for dairy traits in sheep.
|
4
|
Partitioning of the genetic trends of French dairy sheep in Mendelian samplings and long-term contributions. J Dairy Sci 2023; 106:6275-6287. [PMID: 37419742 DOI: 10.3168/jds.2022-23009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Accepted: 02/28/2023] [Indexed: 07/09/2023]
Abstract
The genetic trend of milk yield for 4 French dairy sheep breeds (Lacaune, Basco-Béarnaise, Manech Tête Noire, and Manech Tête Rousse) was partitioned in Mendelian sampling trends by categories of animals defined by sex and by selection pathways. Five categories were defined, as follows: (1) artificial insemination (AI) males (after progeny testing), (2) males discarded after progeny testing, (3) natural mating males, (4) dams of males, and (5) dams of females. Dams of males and AI males were the most important sources of genetic progress, as observed in the decomposition in Mendelian sampling trends. The yearly contributions were more erratic for AI males than for dams of males, as AI males are averaged across a smaller number of individuals. Natural mating males and discarded males did not contribute to the trend in terms of Mendelian sampling, as their estimated Mendelian sampling term is either null (natural mating males) or negative (discarded males). Overall, in terms of Mendelian sampling, females contributed more than males to the total genetic gain, and we interpret that this is because females constitute a larger pool of genetic diversity. In addition, we computed long-term contributions from each individual to the following pseudo-generations (one pseudo-generation spanning 4 years). With this information, we studied the selection decisions (selected or not selected) for females, and the contributions to the following generations. Mendelian sampling was more important than parent average to determine the selection of individuals and their long-term contributions. Long-term contributions were greater for AI males (with larger progeny sizes than females) and in Basco-Béarnaise than in Lacaune (with the latter being a larger population).
|
5
|
Effect of subdivision of the Lacaune dairy sheep breed on the accuracy of genomic prediction. J Dairy Sci 2023; 106:5570-5581. [PMID: 37349212 DOI: 10.3168/jds.2022-23114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 02/16/2023] [Indexed: 06/24/2023]
Abstract
Genomic selection was deployed in the Lacaune dairy breed in 2015. The Lacaune population split in 1972 into 2 breeding companies with associated flocks, and there have been very few exchanges of animals between the subpopulations, leading to divergence of the 2 subpopulations. In spite of that, there is a joint genomic prediction. The objective of this study is to understand how this structuring affects prediction accuracy. We analyzed all the data available from the Lacaune breeding program for milk yield: around 6 million phenotypes, 2 million animals in the pedigree and more than 29,000 genotyped animals, including 3,434 and 2,868 AI rams for each company. To consider missing pedigree, we set up genetic groups using the theory of metafounders. First, we studied the pedigree and genomic structures of the 2 subpopulations calculating Fst, evolution of average pedigree relationships across time and principal components analysis of genomic relationships. In a second part, we compared the reliability between different scenarios: an evaluation with a single reference population (Alone), an evaluation with a joint reference population (Together) and an evaluation of one subpopulation based on the reference population of the other group (Indirect). The low Fst value (0.02) reveals that the 2 subpopulations are still genetically close. Nevertheless, a low and constant average relationship between the animals of the 2 subpopulations confirms the absence of recent connections between them. We can see with principal component analysis results that even if they are close, they diverge over time. Finally, we observe small gains in accuracy of Together versus Alone, in spite of the doubling of the reference population size in Together. 
These gains vary across years and subpopulations: less than 0.08 (0.46 to 0.54; ratio of accuracy for the partial and whole evaluations-corresponding to the greatest change in this ratio for breeding company 1, observed for the cohort 2016) for one subpopulation and between 0.03 (0.55 to 0.58) and 0.17 (0.48 to 0.65) for the other. To conclude, the 2 subpopulations remain close enough genetically so that their combined evaluation is advantageous, even if only slightly.
|
6
|
Nonparallel genome changes within subpopulations over time contributed to genetic diversity within the US Holstein population. J Dairy Sci 2023; 106:2551-2572. [PMID: 36797192 DOI: 10.3168/jds.2022-21914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 10/03/2022] [Indexed: 02/16/2023]
Abstract
Maintaining genetic variation in a population is important for long-term genetic gain. The existence of subpopulations within a breed helps maintain genetic variation and diversity. The 20,990 genotyped animals, representing the breeding animals in the year 2014, were identified as the sires of animals born after 2010 with at least 25 progenies, and females measured for type traits within the last 2 yr of data. K-means clustering with 5 clusters (C1, C2, C3, C4, and C5) was applied to the genomic relationship matrix based on 58,990 SNP markers to stratify the selected candidates into subpopulations. The generally higher inbreeding resulting from within-cluster mating than from across-cluster mating suggests the successful stratification into genetically different groups. The largest cluster (C4) contained animals that were less related to each animal within and across clusters. The average fixation index was 0.03, indicating that the populations were differentiated, and allele differences across the subpopulations were not due to drift alone. Starting with the selected candidates within each cluster, a family unit was identified by tracing back through the pedigree, identifying the genotyped ancestors, and assigning them to a pseudogeneration. Each of the 5 families (F1, F2, F3, F4, and F5) was traced back for 10 generations, allowing for changes in frequency of individual SNPs over time to be observed, which we call allele frequency change. Alternative procedures were used to identify SNPs changing in a parallel or nonparallel way across families. Examples include markers that have changed the most in the whole population, markers that have changed differently across families, and genes previously identified as those that have changed in allele frequency. The genomic trajectory taken by each family involves selective sweeps, polygenic changes, hitchhiking, and epistasis. 
The replicate frequency spectrum was used to measure the similarity of change across families and showed that populations have changed differently. The proportion of markers that reversed direction in allele frequency change varied from 0.00 to 0.02 if the rate of change was greater than 0.02 per generation, or from 0.14 to 0.24 if the rate of change was greater than 0.005 per generation within each family. Cluster-specific SNP effects for stature were estimated using only females and applied to obtain indirect genomic predictions for males. Reranking occurs depending on SNP effects used. Additive genetic correlations between clusters show possible differences in populations. Further research is required to determine how this knowledge can be applied to maintain diversity and optimize selection decisions in the future.
|
7
|
Genomic evaluation methods to include intermediate correlated features such as high-throughput or omics phenotypes. JDS COMMUNICATIONS 2022; 4:55-60. [PMID: 36713125 PMCID: PMC9873823 DOI: 10.3168/jdsc.2022-0276] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/14/2022] [Accepted: 09/26/2022] [Indexed: 12/05/2022]
Abstract
Gene expression is supposed to be an intermediate between DNA and the phenotype, and it can be measured. Thus, for a trait, we may have intermediate measures, which are in fact a series of genetically controlled traits. Similarly, several traits may be measured or predicted using infrared spectra, accelerometers, and similar high-throughput measures that we will call "omics." Although these measurements have errors, many of them are heritable, and they may be more accurate or easier to record than the trait of interest. It is therefore important to develop methods to use intermediate measurements in selection. Here, we present methods and perspectives for selection based on massively recorded intermediate traits (omics). Recent developments allow a hierarchical integrated framework for prediction, in which a trait is partially controlled by omics. In addition, the omics measures are themselves partly controlled by genetics ("mediated breeding values") and partly by environment or residual factors. Thus, a part of the genetic determinism of a trait is mediated by omics, whereas the remaining part is not mediated, which results in "residual breeding values." In such a framework, genetic evaluations consist of 2 nested genomic BLUP-based models. In the first, the effect of omics on the trait (which can be seen as an improved estimate of the phenotype) and the residual breeding values are estimated. The second model extracts the mediated breeding values from the improved estimate of the phenotype, considering that omics themselves are heritable. The whole procedure is called GOBLUP (genomics omics BLUP) and it allows measures in only some individuals; that is, it is a "single-step"-like method. In this model, heritability is split into "mediated" and "not mediated" parts. This decomposition allows us to predict how accurate the omics measure of the trait would be compared with the direct measure. 
The ideal omics measure is heritable and explains a large part of the phenotypic variation of the trait. Ideally, this could be the case for some traits with low heritability. However, even if the omics measure explains only a small part of the phenotypic variation, when omics measurements themselves are heritable, the use of such a model would lead to more accurate selection. Expressions for upper bounds of reliability given omics measurements are also presented. More studies are needed to confirm the usefulness of omics or high-throughput prediction. Usefulness of the technology likely needs to be checked on a case-by-case basis.
|
8
|
High genetic correlation for milk yield across Manech and Latxa dairy sheep from France and Spain. JDS COMMUNICATIONS 2022; 3:260-264. [PMID: 36338014 PMCID: PMC9623675 DOI: 10.3168/jdsc.2021-0195] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/01/2021] [Accepted: 03/15/2022] [Indexed: 06/16/2023]
Abstract
Spanish Latxa and French Manech are dairy sheep breeds that split into Blond (Latxa Cara Rubia, LCR; Manech Tête Rousse, MTR) and Black (Latxa Cara Negra of Navarre, LCN; Manech Tête Noire, MTN) strains. Exchange of genetic material (artificial insemination doses) is becoming more and more frequent across these breeds, within color, to boost both genomic precision using a larger reference population and genetic progress using a larger selection base. This exchange leads to some rams having descendance across both countries. However, additional gains can only be achieved if the selected traits are genetically similar across countries. The objective of this work was to estimate the genetic correlation across breeds for milk yield. We combine across-country, within-color records, pedigree, and marker information. The number of animals with records oscillates from 65,000 (LCN) to 544,000 (MTR), whereas the number of connecting artificial insemination rams (with more than 10 daughters in the other country) is 381 MTR rams in LCR and 58 MTN rams in LCN. Blond strains had a stronger and more extended-in-time connection. The number of genotyped rams goes from 328 (LCN) to 4,901 (MTR). The relatedness of populations was assessed by principal component analysis and Fst coefficients. The genetic correlation was estimated using 2 (one per color) 2-trait models (each country a trait), including all available data (records, pedigree and genotypes), by maximum profile likelihood while fixing other variance components to within-population estimates. Results showed a closer genetic relationship of Blond strains than of Black strains (Fst: 0.01 vs. 0.05, respectively). Genetic correlation estimates for milk yield were 0.70 in both cases. Based on Fst distances, we expected a lower correlation for Black strains than for Blond ones if dominance or epistasis are important. 
Thus, we attribute the value of this correlation not being close to 1 mostly to genotype-by-environment interaction, including on-farm management and trait modeling. Regardless, the correlation of 0.7 across populations is encouraging for future joint work of Latxa and Manech breeders, including joint genetic evaluations.
|
9
|
Removing data and using metafounders alleviates biases for all traits in Lacaune dairy sheep predictions. J Dairy Sci 2022; 105:2439-2452. [PMID: 35033343 DOI: 10.3168/jds.2021-20860] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/11/2021] [Accepted: 11/23/2021] [Indexed: 11/19/2022]
Abstract
Bias in dairy genetic evaluations, when it exists, has to be understood and properly addressed. The origin of biases is not always clear. We analyzed 40 yr of records from the Lacaune dairy sheep breeding program to evaluate the extent of bias, assess possible corrections, and emit hypotheses on its origin. The data set included 7 traits (milk yield, fat and protein contents, somatic cell score, teat angle, udder cleft, and udder depth) with records from 600,000 to 5 million depending on the trait, ∼1,900,000 animals, and ∼5,900 genotyped elite artificial insemination rams. For the ∼8% animals with missing sire, we fit 25 unknown parent groups. We used the linear regression method to compare "partial" and "whole" predictions of young rams before and after progeny testing, with 7 cut-off points, and we obtained estimates of their bias, (over)dispersion, and accuracy in early proofs. We tried (1) several scenarios as follows: multiple or single trait, the "official" (routine) evaluation, which is a mixture of both single and multiple trait, and "deletion" of data before 1990; and (2) several models as follows: BLUP and single-step genomic (SSG)BLUP with fixed unknown parent groups or metafounders, where, for metafounders, their relationship matrix gamma was estimated using either a model for inbreeding trend, or base allele frequencies estimated by peeling. The estimate of gamma obtained by modeling the inbreeding trend resulted in an estimated increase of inbreeding, based on markers, faster than the pedigree-based one. The estimated genetic trends were similar for most models and scenarios across all traits, but were shrunken when gamma was estimated by peeling. This was due to shrinking of the estimates of metafounders in the latter case. Across scenarios, all traits showed bias, generally as an overestimate of genetic trend for milk yield and an underestimate for the other traits. 
As for the slope, it showed overdispersion of estimated breeding values for all traits. Using multiple-trait models slightly reduced the overestimate of genetic trend and the overdispersion, as did including genomic information (i.e., SSGBLUP) when the gamma matrix was estimated by the model for inbreeding trend. However, only deletion of historical data before 1990 resulted in elimination of both kinds of bias. The SSGBLUP resulted in more accurate early proofs than BLUP for all traits. We considered that a snowball effect of small errors in each genetic evaluation, combined with selection, may have resulted in biased evaluations. Improving statistical methods reduced some bias but not all, and a simple solution for this data set was to remove historical records.
|
10
|
Multibreed genomic evaluation for production traits of dairy cattle in the United States using single-step genomic best linear unbiased predictor. J Dairy Sci 2022; 105:5141-5152. [DOI: 10.3168/jds.2021-21505] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/30/2021] [Accepted: 01/27/2022] [Indexed: 01/01/2023]
|
11
|
Islands of runs of homozygosity indicate selection signatures in Ovis aries 6 (OAR6) of French dairy sheep. JDS COMMUNICATIONS 2021; 2:132-136. [PMID: 36339500 PMCID: PMC9623631 DOI: 10.3168/jdsc.2020-0011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/17/2020] [Accepted: 02/05/2021] [Indexed: 11/21/2022]
Abstract
The presence of runs of homozygosity is not randomly distributed across the genome. Islands of runs of homozygosity may be the result of selection pressure. Concordance existed between islands of runs of homozygosity and selection signatures on OAR6. Candidate genes NCAPG and LCORL on OAR6 have agricultural and adaptive importance.
Runs of homozygosity (ROH) are contiguous homozygous segments of the genome where the haplotypes inherited from each parent are identical. The occurrence of ROH is not randomly distributed across the genome, and ROH islands across many animals may be the result of selective pressure. The objective of this study was to demonstrate that the presence of ROH islands may be indicative of selection signatures in French dairy sheep breeds and subpopulations. The data set available included animals (artificial insemination males) from various breeds and subpopulations: Basco-Béarnaise breed (321 individuals), Manech Tête Noire breed (329 individuals), Manech Tête Rousse breed (1,906 individuals), Lacaune Confederation subpopulation (3,030 individuals), and Lacaune Ovitest subpopulation (3,114 individuals). Animals were genotyped with the Illumina OvineSNP50 BeadChip. After applying filtering criteria, the genomic data included 38,287 autosomal SNP distributed across 26 chromosomes and 8,700 individuals. One island of ROH was detected on OAR6 in the same genomic position across animals (between 30 and 40 Mb). Global Wright's differentiation coefficients for 2 SNP within this ROH island were high (0.67–0.68). The linkage disequilibrium between both SNP was also elevated (0.98). The divergence in allele frequencies in those SNP grouped Basco-Béarnaise, Manech Tête Noire, and Manech Tête Rousse breeds in one cluster and Lacaune Confederation and Lacaune Ovitest subpopulations in another cluster. The closest candidate genes are NCAPG and LCORL, which have been reported to be under positive selection and suggested to control weight and height in sheep. The preliminary identification of ROH suggests the presence of selection. However, for the identification of potential candidate genes, ROH detection should be combined with other approaches to improve mapping accuracy.
|
12
|
Genomic and pedigree estimation of inbreeding depression for semen traits in the Basco-Béarnaise dairy sheep breed. J Dairy Sci 2020; 104:3221-3230. [PMID: 33358787 DOI: 10.3168/jds.2020-18761] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/21/2020] [Accepted: 10/05/2020] [Indexed: 01/09/2023]
Abstract
Inbreeding depression is associated with a decrease in performance and fitness of the animals. The goal of this study was to evaluate pedigree-based and genomic methods to estimate the level of inbreeding and inbreeding depression for 3 semen traits (volume, concentration, and motility score) in the Basco-Béarnaise sheep breed. Data comprised 16,196 (or 15,071) phenotypic records from 620 rams (of which 533 rams had genotypes of 36,464 SNPs). The pedigree included 8,266 animals, composed of the 620 rams and their ancestors. The number of equivalent complete generations for the 620 rams was 7.04. Inbreeding coefficients were estimated using genomic and pedigree-based information. Genomic inbreeding coefficients were estimated from individual SNP and using segments of homozygous SNP (runs of homozygosity, ROH). Short ROH are of old origin, whereas long ROH are due to recent inbreeding. Considering that the equivalent number of generations in Basco-Béarnaise was 6, inbreeding coefficients for ROH with a length >4 Mb refer to all (recent + old) inbreeding, those with a length >17 Mb correspond to recent inbreeding, and the difference between them indicates old inbreeding. Pedigree-based inbreeding coefficients were also estimated classically, or accounting for nonzero relationships for unknown parents, or including metafounder relationships (estimated using markers) to account for missing pedigree information. Finally, inbreeding coefficients combining genotyped and nongenotyped animal information were computed from matrix H of the single-step approach, also including metafounders. Inbreeding depression was estimated differently depending on the approach used to compute inbreeding coefficients. These 8 estimators of inbreeding coefficients were included as covariates in different animal models. No inbreeding depression was detected for sperm volume or sperm concentration. Inbreeding depression was significant for the motility of spermatozoa. 
The effect of old and recent inbreeding on motility was null and negative, respectively, demonstrating the existence of purging by selection of deleterious recessive alleles affecting motility. A 10% increase in inbreeding would result in a reduction in mean motility ranging between 0.09 and 0.22 points in the score (from 0 to 5). Motility is unfavorably affected by increasing recent inbreeding but the impact is very small. Runs of homozygosity and metafounders allow us to accurately estimate inbreeding depression and detect recent inbreeding.
|
13
|
Genome-wide association study for feed efficiency in collective cage-raised rabbits under full and restricted feeding. Anim Genet 2020; 51:799-810. [PMID: 32697387 PMCID: PMC7540659 DOI: 10.1111/age.12988] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/20/2019] [Revised: 06/16/2020] [Accepted: 06/26/2020] [Indexed: 01/30/2023]
Abstract
Feed efficiency (FE) is one of the most economically and environmentally relevant traits in the animal production sector. The objective of this study was to gain knowledge about the genetic control of FE in rabbits. To this end, GWASs were conducted for individual growth under two feeding regimes (full feeding and restricted) and FE traits collected from cage groups, using 114 604 autosome SNPs segregating in 438 rabbits. Two different models were implemented: (1) an animal model with a linear regression on each SNP allele for growth trait; and (2) a two‐trait animal model, jointly fitting the performance trait and each SNP allele content, for FE traits. This last modeling strategy is a new tool applied to GWAS and allows information to be considered from non‐genotyped individuals whose contribution is relevant in the group average traits. A total of 189 SNPs in 17 chromosomal regions were declared to be significantly associated with any of the five analyzed traits at a chromosome‐wide level. In 12 of these regions, 20 candidate genes were proposed to explain the variation of the analyzed traits, including genes such as FTO, NDUFAF6 and CEBPA previously associated with growth and FE traits in monogastric species. Candidate genes associated with behavioral patterns were also identified. Overall, our results can be considered as the foundation for future functional research to unravel the actual causal mutations regulating growth and FE in rabbits.
|
14
|
Exploring the inclusion of genomic information and metafounders in Latxa dairy sheep genetic evaluations. J Dairy Sci 2020; 103:6346-6353. [DOI: 10.3168/jds.2019-18033] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/10/2019] [Accepted: 02/25/2020] [Indexed: 11/19/2022]
|
15
|
Inbreeding, effective population size, and coancestry in the Latxa dairy sheep breed. J Dairy Sci 2020; 103:5215-5226. [PMID: 32253040 DOI: 10.3168/jds.2019-17743] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/14/2019] [Accepted: 02/03/2020] [Indexed: 12/24/2022]
Abstract
Traditionally, breeding programs have estimated and managed inbreeding based on pedigree information. The availability of genomic marker panels has made possible new alternatives to achieve more precise estimates, for example in case of missing pedigree. The objective of the present study was to assess and compare different estimation methods (pedigree-based methodologies, a single SNP-based approach (homozygosity), and a runs of homozygosity-based method) to analyze the evolution of genetic diversity, measured as inbreeding or as coancestry, of 3 selected populations of Latxa dairy sheep (Latxa Cara Rubia and Latxa Cara Negra from Euskadi and Navarre). Genomic data came from 972 artificial insemination rams genotyped with the Illumina OvineSNP50 BeadChip (Illumina Inc., San Diego, CA) whose genealogy consisted of 4,484 animals. Inbreeding estimates based on molecular data were more similar to each other than to those based on pedigree information. However, the SNP-based approach estimations of effective population size differed more, reflecting the sensitivity of effective population size to small changes in the evolution of inbreeding. The 2 Latxa Cara Negra populations showed increases of inbreeding rates with time and effective population sizes between 64 and 103 animals, depending on breed and methodology used. The Latxa Cara Rubia population did not show an increase in inbreeding rate, mainly due to semen importation from the related French population of Manech Tête Rousse. The effective size estimates based on coancestry increase show a higher variability and they are more sensitive to the source of information and the data structure considered. Realized effective population sizes based on individual increase in inbreeding were in agreement with the previous estimates. Coancestry evolution analysis based on DNA information showed an increase in coancestry during the last 10 yr in all breeds, as a consequence of the selection process. Moreover, the increase in coancestry was more noticeable between Latxa Cara Rubia and Manech Tête Rousse than within each of those breeds.
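The realized effective population size mentioned in this abstract is commonly derived from individual increases in inbreeding. The following is a minimal sketch, assuming the usual definitions ΔF_i = 1 − (1 − F_i)^(1/(t_i − 1)) and N_e = 1/(2 · mean ΔF_i); the function name and the input values are hypothetical and are not taken from the study.

```python
import numpy as np

def realized_ne(f, t):
    """Realized effective size from individual increases in inbreeding.

    f: inbreeding coefficients F_i (hypothetical inputs)
    t: equivalent complete generations t_i, each assumed to be > 1
    """
    f = np.asarray(f, dtype=float)
    t = np.asarray(t, dtype=float)
    # Individual increase in inbreeding: dF_i = 1 - (1 - F_i)^(1 / (t_i - 1))
    delta_f = 1.0 - (1.0 - f) ** (1.0 / (t - 1.0))
    # Effective size from the mean individual increase
    return 1.0 / (2.0 * delta_f.mean())

# Two hypothetical animals with F = 0.005 after 2 generations
ne_example = realized_ne([0.005, 0.005], [2, 2])
```

Because N_e is the reciprocal of a small number, small changes in ΔF move the estimate a lot, which matches the sensitivity noted in the abstract.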
|
16
|
Short communication: Methods to compute genomic inbreeding for ungenotyped individuals. J Dairy Sci 2020; 103:3363-3367. [PMID: 32057428 DOI: 10.3168/jds.2019-17750] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/15/2019] [Accepted: 12/18/2019] [Indexed: 11/19/2022]
Abstract
The genomic measure of inbreeding is closer to the actual inbreeding than the pedigree-based measure. However, it cannot be computed for ungenotyped animals. An estimate of genomic inbreeding comes from the diagonal of matrix H used in single-step methods. This matrix projects genomic relationships to all ungenotyped members of the pedigree. Each diagonal element of H, minus 1, gives an estimate of the genomic inbreeding coefficient. However, so far no computational methods have been available to compute the diagonal of H. Here we propose 3 exact methods to compute this diagonal. The first uses an already-existing algorithm to compute, for each ungenotyped individual, products of the form Hx to obtain the corresponding diagonal element of H. The second method computes, for each ungenotyped individual, a term that can be written as a quadratic form involving pedigree and genomic relationships. For both methods, the computational burden is linear in the number of ungenotyped animals. The last method reorders the computations of the second method so that they become linear in the number of genotyped animals, which is usually much smaller. We tested the methods in 3 small data sets (with ~2,000 genotyped animals and 30,000-500,000 animals in pedigree) and in a large simulated population (with 1,220,000 animals in pedigree and 36,000 genotyped animals). Tests resulted in satisfactory computing times (<10 min in the largest example using 10 parallel threads). Computing times were much shorter for the third method, as expected. Using these methods, estimates of genomic inbreeding in ungenotyped animals can be obtained on a regular basis.
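The quadratic form mentioned in this abstract can be illustrated on a toy example. This is a minimal dense sketch, assuming the standard single-step expression for the ungenotyped block, H11 = A11 + A12 A22^-1 (G − A22) A22^-1 A21; it is not the authors' implementation, and the function name, pedigree values, and G matrix are hypothetical.

```python
import numpy as np

def genomic_inbreeding_ungenotyped(a11_diag, a12, a22, g):
    """Return F_i = H_ii - 1 for each ungenotyped animal.

    a11_diag: pedigree diagonal for ungenotyped animals
    a12: pedigree relationships, ungenotyped (rows) x genotyped (columns)
    a22: pedigree relationships among genotyped animals
    g: genomic relationships among genotyped animals
    """
    a22_inv = np.linalg.inv(a22)
    middle = a22_inv @ (g - a22) @ a22_inv
    # One quadratic form per animal: h_ii = a_ii + a_i' middle a_i
    h_diag = a11_diag + np.einsum("ij,jk,ik->i", a12, middle, a12)
    return h_diag - 1.0

# Hypothetical case: one ungenotyped sire with two genotyped half-sib offspring
a11_diag = np.array([1.0])
a12 = np.array([[0.5, 0.5]])
a22 = np.array([[1.0, 0.25], [0.25, 1.0]])
g = np.array([[1.05, 0.30], [0.30, 0.98]])  # made-up genomic relationships
f_example = genomic_inbreeding_ungenotyped(a11_diag, a12, a22, g)
```

When G equals A22 the correction term vanishes and the estimate falls back to the pedigree value, which gives a convenient sanity check.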
|
17
|
Behavior of the Linear Regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models. J Dairy Sci 2019; 103:529-544. [PMID: 31704008 DOI: 10.3168/jds.2019-16603] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/12/2019] [Accepted: 09/13/2019] [Indexed: 11/19/2022]
Abstract
Bias in genetic evaluations has been a constant concern in animal genetics. The interest in this topic has increased in the last years, since many studies have detected overestimation (bias) in estimated breeding values (EBV). Detecting the existence of bias, and the realized accuracy of predictions, is therefore of importance, yet this is difficult when studying small data sets or breeds. In this study, we tested by simulation the recently presented method Linear Regression (LR) for estimation of bias, slope, and accuracy of pedigree EBV. The LR method computes statistics by comparing EBV from a data set containing old, partial information with EBV from a data set containing all information (old and new, a whole data set) for the same individuals. The method proposes an estimator for bias ($\hat{\Delta}_p$), an estimator of slope ($\hat{b}_p$), and 3 estimators related to accuracies: the ratio between accuracies [Formula: see text], the reliability of the partial data set ($\widehat{acc}_p^2$), and the ratio of reliabilities ($\hat{\rho}_{p,w}^2$). We simulated a dairy scheme for low (0.10) and moderate (0.30) heritabilities. In both cases, we checked the behavior of the estimators for 3 scenarios: (1) when the evaluation model is the same as the model used to simulate the data; (2) when the evaluation model uses an incorrect heritability; and (3) when the data includes an environmental trend. For scenarios in which the evaluation model was correct, the LR method was capable of correctly estimating bias, slope, and accuracies, with better performance for higher heritability [i.e., $\mathrm{corr}(b_p, \hat{b}_p)$ was 0.45 for $h^2$ = 0.10 and 0.59 for $h^2$ = 0.30]. In cases of the use of incorrect heritabilities in the evaluation model, the bias was correctly estimated in direction but not in magnitude. In the same way, the magnitudes of bias and of slope were underestimated in scenarios with environmental trends in data, except for cases in which contemporary groups were random and greatly shrunken.
In general, accuracies were well estimated in all scenarios. The LR method is capable of checking bias and accuracy in all cases, if the evaluation model is reasonably correct or robust, and its estimations are more precise with more information (e.g., high heritability). If the model uses an incorrect heritability or a hidden trend exists in the data, it is still possible to estimate the direction and existence of bias and slope but not always their magnitudes.
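The comparison of partial and whole EBV described in this abstract reduces to a few sample moments. The sketch below assumes the commonly cited LR estimators (bias as the difference of means, slope as the regression of whole on partial EBV, ratio of accuracies as their correlation, ratio of reliabilities as cov/var of whole EBV); the function name and the data are hypothetical, and the reliability of the partial data set is omitted because it also requires the genetic variance and average inbreeding.

```python
import numpy as np

def lr_statistics(ebv_partial, ebv_whole):
    """LR-style statistics for the same validation animals (illustrative only)."""
    p = np.asarray(ebv_partial, dtype=float)
    w = np.asarray(ebv_whole, dtype=float)
    cov_wp = np.cov(w, p)[0, 1]  # sample covariance (ddof=1)
    return {
        "bias": p.mean() - w.mean(),                    # expected 0 if unbiased
        "slope": cov_wp / p.var(ddof=1),                # expected 1 if no over/underdispersion
        "ratio_accuracies": np.corrcoef(w, p)[0, 1],    # correlation of whole and partial EBV
        "ratio_reliabilities": cov_wp / w.var(ddof=1),  # cov(w, p) / var(w)
    }

# Hypothetical EBV: whole-data EBV are a rescaled, shifted copy of partial EBV
partial = np.array([0.1, 0.4, -0.2, 0.7])
whole = 2.0 * partial + 1.0
stats_example = lr_statistics(partial, whole)
```

A slope below 1 would indicate that partial EBV are overdispersed relative to whole EBV; here the made-up data give a slope of 2 simply because of how they were constructed.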
|
18
|
Dissecting total genetic variance into additive and dominance components of purebred and crossbred pig traits. Animal 2019; 13:2429-2439. [PMID: 31120005 DOI: 10.1017/s1751731119001046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022] Open
Abstract
The partition of the total genetic variance into its additive and non-additive components can differ from trait to trait, and between purebred and crossbred populations. A quantification of these genetic variance components will determine the extent to which it would be of interest to account for dominance in genomic evaluations or to establish mate allocation strategies along different populations and traits. This study aims at assessing the contribution of the additive and dominance genomic variances to the phenotype expression of several purebred Piétrain and crossbred (Piétrain × Large White) pig performances. A total of 636 purebred and 720 crossbred male piglets were phenotyped for 22 traits that can be classified into six groups of traits: growth rate and feed efficiency, carcass composition, meat quality, behaviour, boar taint and puberty. Additive and dominance variances estimated in univariate genotypic models, including additive and dominance genotypic effects, and a genomic inbreeding covariate allowed to retrieve the additive and dominance single nucleotide polymorphism variances for purebred and crossbred performances. These estimated variances were used, together with the allelic frequencies of the parental populations, to obtain additive and dominance variances in terms of genetic breeding values and dominance deviations. Estimates of the Piétrain and Large White allelic contributions to the crossbred variance were of about the same magnitude in all the traits. Estimates of additive genetic variances were similar regardless of the inclusion of dominance. Some traits showed relevant amount of dominance genetic variance with respect to phenotypic variance in both populations (i.e. growth rate 8%, feed conversion ratio 9% to 12%, backfat thickness 14% to 12%, purebreds-crossbreds). Other traits showed higher amount in crossbreds (i.e. 
ham cut 8% to 13%, loin 7% to 16%, pH semimembranosus 13% to 18%, pH longissimus dorsi 9% to 14%, androstenone 5% to 13% and estradiol 6% to 11%, purebreds-crossbreds). No clear common pattern of dominance expression was found between groups of analysed traits or between populations. These estimates give initial hints regarding which traits could benefit from accounting for dominance, for example to improve genomic estimated breeding value accuracy in genetic evaluations or to boost the total genetic value of progeny by means of assortative mating.
|
19
|
Alternative SNP weighting for single-step genomic best linear unbiased predictor evaluation of stature in US Holsteins in the presence of selected sequence variants. J Dairy Sci 2019; 102:10012-10019. [PMID: 31495612 DOI: 10.3168/jds.2019-16262] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/07/2019] [Accepted: 07/16/2019] [Indexed: 11/19/2022]
Abstract
Causal variants inferred from sequence data analysis are expected to increase accuracy of genomic selection. In this work we evaluated the gain in reliability of genomic predictions, for stature in US Holsteins, when adding selected sequence variants to a pre-existent SNP chip. Two prediction methods were tested: genomic BLUP (GBLUP) using de-regressed proofs and assuming heterogeneous residual variances, and single-step GBLUP (ssGBLUP) using actual phenotypes. Phenotypic data included 3,999,631 records for stature on 3,027,304 Holstein cows. Genotypes on 54,087 SNP markers (54k) were available for 26,877 bulls. Additionally, 16,648 selected sequence variants were combined with the 54k markers, for a total of 70,735 (70k) markers. In all methods, SNP in the genomic relationship matrix (G) were unweighted or weighted iteratively, with weights derived either by SNP effects squared or by a nonlinear method that resembles BayesA (nonlinear A). Reliabilities of genomic predictions were obtained by cross validation. With unweighted G derived from 54k markers, the reliabilities (× 100) were 72.4 for GBLUP and 75.3 for ssGBLUP. With unweighted G derived from 70k markers, the reliabilities were 73.4 and 76.0, respectively. Weighting by nonlinear A changed reliabilities to 73.3 and 75.9, respectively. Addition of selected sequence variants had a small effect on reliabilities. Weighting by quadratic functions reduced reliabilities. Weighting by nonlinear A increased reliabilities for GBLUP but had only a small effect in ssGBLUP. Reliabilities for direct genomic values extracted from ssGBLUP using unweighted G with 54k were higher than reliabilities by any GBLUP. Thus, ssGBLUP seems to capture more information than GBLUP and there is less room for extra reliability.
Improvements in GBLUP may be because the weights in G change the covariance structure, which can explain a proportion of the variance that is accounted for when a heterogeneous residual variance is assumed by considering a different number of daughters per bull.
|
20
|
Inbreeding and effective population size in French dairy sheep: Comparison between genomic and pedigree estimates. J Dairy Sci 2019; 102:4227-4237. [PMID: 30827541 DOI: 10.3168/jds.2018-15405] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/18/2018] [Accepted: 12/23/2018] [Indexed: 01/11/2023]
Abstract
Before availability of dense SNP data, genetic diversity was characterized and managed with pedigree-based information. Besides this classical approach, 2 methodologies have been proposed in recent years to characterize and manage diversity from dense SNP data: the SNP-by-SNP approach and the alternative based on runs of homozygosity (ROH). The establishment of criteria to identify ROH is a current constraint in the literature dealing with ROH. The objective of this study was, using a medium-density SNP chip, to quantify by 3 methods (pedigree, SNP-by-SNP, and ROH) the genetic diversity on 5 selected French dairy sheep subpopulations and breeds and to assess the effect of the definition of ROH on these estimates. The data set available included individuals from the breeds Basco-Béarnaise, Manech Tête Noire, Manech Tête Rousse, and 2 subpopulations of Lacaune: Lacaune Confederation and Lacaune Ovitest. Animals were genotyped with the Illumina OvineSNP50 BeadChip (Illumina Inc., San Diego, CA). After filtering, the genomic data included 38,287 autosomal SNP and 8,700 individuals, which comprised 72,803 animals in the pedigree. The results indicated that no significant differences were observed in effective population size estimates obtained from pedigree or genomic (SNP-by-SNP or ROH) information. In general, estimates of effective population size were above 200 in Lacaune Confederation and Lacaune Ovitest subpopulations and below 200 in Basco-Béarnaise, Manech Tête Noire, and Manech Tête Rousse breeds. The minimum length that constituted a ROH, the minimum number of SNP that constituted a ROH, as well as the minimum density and the maximum distance allowed between 2 homozygous SNP are ROH-defining factors with important implications in the estimation of the rate of inbreeding. The ROH-based rates of inbreeding in concordance with those obtained from pedigree information require a specific set of values. 
This particular set of values is different from that identified to obtain ROH-based rates of inbreeding similar to those obtained on a SNP-by-SNP basis. Factors to define ROH do not change the results much unless extreme values are considered, although further research on ROH-based inbreeding is still required.
|
21
|
Modeling missing pedigree in single-step genomic BLUP. J Dairy Sci 2019; 102:2336-2346. [DOI: 10.3168/jds.2018-15434] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/24/2018] [Accepted: 11/12/2018] [Indexed: 11/19/2022]
|
22
|
Pedigree-based estimation of covariance between dominance deviations and additive genetic effects in closed rabbit lines considering inbreeding and using a computationally simpler equivalent model. J Anim Breed Genet 2017; 134:184-195. [PMID: 28508486 DOI: 10.1111/jbg.12267] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/20/2016] [Accepted: 02/05/2017] [Indexed: 12/01/2022]
Abstract
Inbreeding generates covariances between additive and dominance effects (breeding values and dominance deviations). In this work, we developed and applied models for estimation of dominance and additive genetic variances and their covariance, a model that we call "full dominance," from pedigree and phenotypic data. Estimates with this model such as presented here are very scarce both in livestock and in wild genetics. First, we estimated pedigree-based condensed probabilities of identity using recursion. Second, we developed an equivalent linear model in which variance components can be estimated using closed-form algorithms such as REML or Gibbs sampling and existing software. Third, we present a new method to refer the estimated variance components to meaningful parameters in a particular population, i.e., final partially inbred generations as opposed to outbred base populations. We applied these developments to three closed rabbit lines (A, V and H) selected for number of weaned at the Polytechnic University of Valencia. Pedigree and phenotypes are complete and span 43, 39 and 14 generations, respectively. Estimates of broad-sense heritability are 0.07, 0.07 and 0.05 at the base versus 0.07, 0.07 and 0.09 in the final generations. Narrow-sense heritability estimates are 0.06, 0.06 and 0.02 at the base versus 0.04, 0.04 and 0.01 at the final generations. There is also a reduction in the genotypic variance due to the negative additive-dominance correlation. Thus, the contribution of dominance variation is fairly large and increases with inbreeding and (over)compensates for the loss in additive variation. In addition, estimates of the additive-dominance correlation are -0.37, -0.31 and 0.00, in agreement with the few published estimates and theoretical considerations.
|
23
|
Technical note: Genomic evaluation for crossbred performance in a single-step approach with metafounders. J Anim Sci 2017; 95:1472-1480. [PMID: 28464109 DOI: 10.2527/jas.2016.1155] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 11/13/2022] Open
Abstract
A single-step genomic BLUP method (ssGBLUP) has been successfully developed and applied for purebred and crossbred performance in pigs. However, it requires phasing the genotypes and inferring the breed origin of alleles in crossbred animals, which is somewhat inconvenient. Recently, a new concept of metafounders that considers the relationship within and across base populations was developed. With this concept of metafounders, regular methods to build and invert the pedigree relationships matrix can be used with only minor modifications and, moreover, genomic relationships and pedigree-based relationships are automatically compatible in the ssGBLUP. In this study, data for the total number of piglets born in Danish Landrace, Yorkshire, and 2-way crossbred pigs and models for purebred and crossbred performance were revisited by use of ssGBLUP with 2 metafounders. Genetic variances and genetic correlations between purebred and crossbred performances were first reestimated. Then, model-based reliabilities of purebred boars for their crossbred performance and predictive abilities for crossbred animals were compared in different scenarios. Results in this study were compared to those in a previous study with identical data but with models that required known breed origin of crossbred genotypes. Results show that relationships for base individuals within Landrace and within Yorkshire are similar and that the ancestor populations for Landrace and Yorkshire are related. In terms of model-based reliabilities and predictive abilities, ssGBLUP with metafounders performs at least as well as the single-step method requiring phasing at a lower complexity.
|
24
|
193 Including causative variants into single step genomic BLUP. J Anim Sci 2017. [DOI: 10.2527/asasann.2017.193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
25
|
Role of inbreeding depression, non-inbred dominance deviations and random year-season effect in genetic trends for prolificacy in closed rabbit lines. J Anim Breed Genet 2017; 134:441-452. [PMID: 28685498 DOI: 10.1111/jbg.12284] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/20/2016] [Accepted: 06/02/2017] [Indexed: 11/28/2022]
Abstract
In closed rabbit lines selected for prolificacy at the Polytechnic University of Valencia, genetic responses are predicted using BLUP. With a standard additive BLUP model and year-season (YS) effects fitted as fixed, genetic trends were overestimated compared to responses estimated using control populations obtained from frozen embryos. In these lines, there is a confounding between genetic trend, YS effects and inbreeding, and the role of dominance is uncertain. This is a common situation in data from reproductively closed selection lines. This paper fits different genetic evaluation models to data of these lines, aiming to identify the source of these biases: dominance, inbreeding depression and/or an ill-conditioned model due to the strong collinearity between YS, inbreeding and genetic trend. The study involved three maternal lines (A, V and H) and analysed two traits, total born (TB) and the number of kits at weaning (NW). Models fitting YS effect as fixed or random were implemented, in addition to additive genetic, permanent environment effects and non-inbred dominance deviations effects. When YS was fitted as a fixed effect, the genetic trends were overestimated compared to control populations, inbreeding had an apparent positive effect on litter size and the environmental trends were negative. When YS was fitted as random, the genetic trends were compatible with control populations results, inbreeding had a negative effect (lower prolificacy) and environmental trends were flat. The model fitting random YS, inbreeding and non-inbred dominance deviations yielded the following ratios of additive and dominance variances to total variance for NW: 0.06 and 0.01 for line A, 0.06 and 0.00 for line V and 0.01 and 0.08 for line H. Except for line H, dominance deviations seem to be of low relevance. When it is confounded with inbreeding as in these lines, fitting YS effect as random allows correct estimation of genetic trends.
|
26
|
Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient. J Anim Sci 2017; 95:49-52. [PMID: 28177357 DOI: 10.2527/jas.2016.0699] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open
Abstract
This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals by a vector. The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix including genotyped animals and their ancestors. The elements of this sparse inverse were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. The matrix-by-vector product was implemented as a series of sparse matrix-vector multiplications. Diagonal elements of the inverse for genotyped animals, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation was compared with explicit inversion of the relationship matrix for genotyped animals with 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 sec, 3 min, and 5 min, respectively, for setting up. Only <1 sec was required for the multiplication in each PCG iteration for any data sets. When the equations in ssGBLUP are solved with the PCG algorithm, the inverse of the numerator relationship matrix for genotyped animals is no longer a limiting factor in the computations.
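The decomposition described in this abstract rests on a block-inverse identity: if the inverse of the full numerator relationship matrix is partitioned into genotyped (g) and non-genotyped (n) animals, then the inverse of the genotyped block equals A^gg − A^gn (A^nn)^-1 A^ng. The sketch below checks this on a tiny dense example; the function name and pedigree values are hypothetical, and a real implementation would keep every block sparse and replace the dense solve with sparse operations.

```python
import numpy as np

def a22_inverse_times_vector(a_inv, genotyped, q):
    """Multiply the inverse of the genotyped block of A by q without forming it.

    a_inv: inverse of the full numerator relationship matrix (dense here)
    genotyped: indices of genotyped animals
    q: vector with one entry per genotyped animal
    """
    n = a_inv.shape[0]
    geno = np.asarray(genotyped)
    non = np.setdiff1d(np.arange(n), geno)
    a_gg = a_inv[np.ix_(geno, geno)]
    a_gn = a_inv[np.ix_(geno, non)]
    a_nn = a_inv[np.ix_(non, non)]
    # A22^{-1} q = A^{gg} q - A^{gn} (A^{nn})^{-1} A^{ng} q
    return a_gg @ q - a_gn @ np.linalg.solve(a_nn, a_gn.T @ q)

# Hypothetical pedigree: animal 0 is an ungenotyped sire of genotyped animals 1 and 2
a = np.array([[1.0, 0.5, 0.5], [0.5, 1.0, 0.25], [0.5, 0.25, 1.0]])
q = np.array([1.0, 2.0])
product = a22_inverse_times_vector(np.linalg.inv(a), [1, 2], q)
```

Only matrix-vector products and one solve are needed per call, which is what makes the approach attractive inside an iterative solver such as PCG.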
|
27
|
Technical note: Genomic evaluation for crossbred performance in a single-step approach with metafounders. J Anim Sci 2017. [DOI: 10.2527/jas2016.1155] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
|
28
|
Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient. J Anim Sci 2017. [DOI: 10.2527/jas2016.0699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
29
|
0292 Dimensionality of genomic information and APY inverse of genomic relationship matrix. J Anim Sci 2016. [DOI: 10.2527/jam2016-0292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
30
|
Estimates of the actual relationship between half-sibs in a pig population. J Anim Breed Genet 2016; 134:109-118. [PMID: 27670252 DOI: 10.1111/jbg.12236] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/22/2016] [Accepted: 08/04/2016] [Indexed: 12/26/2022]
Abstract
Genomic relationships based on markers capture the actual instead of the expected (based on pedigree) proportion of genome shared identical by descent (IBD). Several methods exist to estimate genomic relationships. In this research, we compare four such methods that were tested looking at the empirical distribution of the estimated relationships across 6704 pairs of half-sibs from a cross-bred pig population. The first method based on multiple marker linkage analysis displayed a mean and standard deviation (SD) in close agreement with the expected ones and was robust to changes in the minor allele frequencies (MAF). A single marker method that accounts for linkage disequilibrium (LD) and inbreeding came second, showing more sensitivity to changes in the MAF. Another single marker method that considers neither inbreeding nor LD showed the smallest empirical SD and was the most sensitive to changes in MAF. A higher mean and SD were displayed by VanRaden's method, which was not sensitive to changes in MAF. Therefore, the method based on multiple marker linkage analysis and the single marker method that considers LD and inbreeding performed closer to theoretical values and were consistent with the estimates reported in literature for human half-sibs.
|
31
|
A comparison of methods to estimate genomic relationships using pedigree and markers in livestock populations. J Anim Breed Genet 2016; 133:452-462. [PMID: 27135179 DOI: 10.1111/jbg.12217] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/11/2015] [Accepted: 03/30/2016] [Indexed: 12/20/2022]
Abstract
Accurate prediction of breeding values depends on capturing the variability in genome sharing of relatives with the same pedigree relationship. Here, we compare two approaches to set up genomic relationship matrices for precision of genomic relationships (GR) and accuracy of genomic estimated breeding values (GEBV). Real and simulated data (pigs, 60k SNP) were analysed, and GR were estimated using two approaches: (i) identity by state, corrected with either the observed (GVR-O) or the base population (GVR-B) allele frequencies and (ii) identity by descent using linkage analysis (GIBD-L). Estimators were evaluated for precision and empirical bias with respect to true pedigree IBD GR. All three estimators had very low bias. GIBD-L displayed the lowest sampling error and the highest correlation with true genome-shared values. GVR-B approximated GIBD-L's correlation and had lower error than GVR-O. Accuracy of GEBV for selection candidates was significantly higher when GIBD-L was used and identical between GVR-O and GVR-B. In real data, GIBD-L's sampling standard deviation was the closest to the theoretical value for each pedigree relationship. Use of pedigree to calculate GR improved the precision of estimates and the accuracy of GEBV.
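The identity-by-state estimators in this abstract differ only in which allele frequencies are used for centering and scaling. The sketch below assumes the widely used VanRaden form, G = ZZ' / (2 Σ p(1 − p)) with Z = M − 2p; the function name, the optional `allele_freq` argument, and the genotypes are hypothetical. Passing no frequencies uses the observed ones (as in GVR-O), whereas passing base-population frequencies corresponds to GVR-B.

```python
import numpy as np

def vanraden_g(genotypes, allele_freq=None):
    """Genomic relationship matrix from 0/1/2 genotype counts (animals x SNP)."""
    m = np.asarray(genotypes, dtype=float)
    if allele_freq is None:
        p = m.mean(axis=0) / 2.0  # observed allele frequencies
    else:
        p = np.asarray(allele_freq, dtype=float)  # e.g. base-population frequencies
    z = m - 2.0 * p  # center each SNP by twice its allele frequency
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

# Hypothetical genotypes for 3 animals and 4 SNP
genotypes = np.array([[0, 1, 2, 1],
                      [1, 1, 0, 2],
                      [2, 0, 1, 1]])
g_observed = vanraden_g(genotypes)
g_base = vanraden_g(genotypes, allele_freq=[0.4, 0.3, 0.5, 0.6])
```

With observed frequencies the columns of Z sum to zero, so the elements of G average to zero; this is one reason the choice of frequencies matters when G is compared with pedigree relationships.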
|
32
|
A combined coalescence gene-dropping tool for evaluating genomic selection in complex scenarios (ms2gs). J Anim Breed Genet 2016; 133:85-91. [PMID: 26995218 DOI: 10.1111/jbg.12200] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/28/2015] [Accepted: 12/07/2015] [Indexed: 11/28/2022]
Abstract
We present ms2gs, a combined coalescence - gene dropping (i.e. backward-forward) simulator for complex traits. It therefore aims at combining the advantages of both approaches. It is primarily conceived for very short term, recent scenarios such as those that are of interest in animal and plant breeding. It is very flexible in terms of defining QTL architecture and SNP ascertainment bias, and it allows for easy modelling of alternative markers such as RADs. It can use real sequence or chip data or generate molecular polymorphisms via the coalescence. It can generate QTL conditional on extant molecular information, such as low-density genotyping. It models (simplistically) sequence, imputation or genotyping errors. It requires as input both genotypic data in plink or ms formats, and a pedigree that is used to perform the gene dropping. By default, it compares accuracy for BLUP, SNP ascertained data, sequence, and causal SNPs. It employs VanRaden's linear (GBLUP) and nonlinear method for incorporating molecular information. To illustrate the program, we present a small application in a half-sib population and a multiparental (MAGIC) cross. The program, manual and examples are available at https://github.com/mperezenciso/ms2gs.
|
33
|
Genetic evaluation using single-step genomic best linear unbiased predictor in American Angus. J Anim Sci 2016; 93:2653-62. [PMID: 26115253 DOI: 10.2527/jas.2014-8836] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Indexed: 12/30/2022] Open
Abstract
Predictive ability of genomic EBV when using single-step genomic BLUP (ssGBLUP) in Angus cattle was investigated. Over 6 million records were available on birth weight (BiW) and weaning weight (WW), almost 3.4 million on postweaning gain (PWG), and over 1.3 million on calving ease (CE). Genomic information was available on, at most, 51,883 animals, which included high and low EBV accuracy animals. Traditional EBV was computed by BLUP and genomic EBV by ssGBLUP and indirect prediction based on SNP effects was derived from ssGBLUP; SNP effects were calculated based on the following reference populations: ref_2k (contains top bulls and top cows that had an EBV accuracy for BiW ≥0.85), ref_8k (contains all parents that were genotyped), and ref_33k (contains all genotyped animals born up to 2012). Indirect prediction was obtained as direct genomic value (DGV) or as an index of DGV and parent average (PA). Additionally, runs with ssGBLUP used the inverse of the genomic relationship matrix calculated by an algorithm for proven and young animals (APY) that uses recursions on a small subset of reference animals. An extra reference subset included 3,872 genotyped parents of genotyped animals (ref_4k). Cross-validation was used to assess predictive ability on a validation population of 18,721 animals born in 2013. Computations for growth traits used multiple-trait linear model and, for CE, a bivariate CE-BiW threshold-linear model. With BLUP, predictivities were 0.29, 0.34, 0.23, and 0.12 for BiW, WW, PWG, and CE, respectively. With ssGBLUP and ref_2k, predictivities were 0.34, 0.35, 0.27, and 0.13 for BiW, WW, PWG, and CE, respectively, and with ssGBLUP and ref_33k, predictivities were 0.39, 0.38, 0.29, and 0.13 for BiW, WW, PWG, and CE, respectively. Low predictivity for CE was due to low incidence rate of difficult calving. Indirect predictions with ref_33k were as accurate as with full ssGBLUP. 
Using the APY and recursions on ref_4k gave 88% of the gains of full ssGBLUP, and using the APY and recursions on ref_8k gave 97% of the gains of full ssGBLUP. Genomic evaluation in beef cattle with ssGBLUP is feasible while keeping the models (maternal, multiple trait, and threshold) already used in regular BLUP. Gains in predictivity depend on the composition of the reference population. Indirect predictions via SNP effects derived from ssGBLUP allow for accurate genomic predictions on young animals, with no advantage of including PA in the index if the reference population is large. With the APY conditioning on about 10,000 reference animals, ssGBLUP is potentially applicable to a large number of genotyped animals without compromising predictive ability.
|
34
|
Application of single-step genomic evaluation for crossbred performance in pig. J Anim Sci 2016; 94:936-48. [DOI: 10.2527/jas.2015-9930] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 01/28/2023]
|
35
|
Implementation of genomic recursions in single-step genomic best linear unbiased predictor for US Holsteins with a large number of genotyped animals. J Dairy Sci 2016; 99:1968-1974. [PMID: 26805987 DOI: 10.3168/jds.2015-10540] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 10/19/2015] [Accepted: 12/01/2015] [Indexed: 11/19/2022]
Abstract
The objectives of this study were to develop and evaluate an efficient implementation of the computation of the inverse of the genomic relationship matrix with the recursion algorithm, called the algorithm for proven and young (APY), in single-step genomic BLUP. We validated genomic predictions for young bulls with more than 500,000 genotyped animals in final score for US Holsteins. Phenotypic data included 11,626,576 final scores on 7,093,380 US Holstein cows, and genotypes were available for 569,404 animals. Daughter deviations for young bulls with no classified daughters in 2009, but at least 30 classified daughters in 2014, were computed using all the phenotypic data. Genomic predictions for the same bulls were calculated with single-step genomic BLUP using phenotypes up to 2009. We calculated the inverse of the genomic relationship matrix GAPY(-1) based on a direct inversion of the genomic relationship matrix on a small subset of genotyped animals (core animals) and extended that information to noncore animals by recursion. We tested several sets of core animals including 9,406 bulls with at least 1 classified daughter, 9,406 bulls and 1,052 classified dams of bulls, 9,406 bulls and 7,422 classified cows, and random samples of 5,000 to 30,000 animals. Validation reliability was assessed by the coefficient of determination from regression of daughter deviation on genomic predictions for the predicted young bulls. The reliabilities were 0.39 with 5,000 randomly chosen core animals, 0.45 with the 9,406 bulls and 7,422 cows as core animals, and 0.44 with the remaining sets. With phenotypes truncated in 2009 and the preconditioned conjugate gradient to solve mixed model equations, the number of rounds to convergence for core animals defined by bulls was 1,343; defined by bulls and cows, 2,066; and defined by 10,000 random animals, at most 1,629. With complete phenotype data, the number of rounds decreased to 858, 1,299, and at most 1,092, respectively.
Setting up GAPY(-1) for 569,404 genotyped animals with 10,000 core animals took 1.3 h and 57 GB of memory. The validation reliability with APY reaches a plateau when the number of core animals is at least 10,000. Predictions with APY show little difference in reliability among definitions of core animals. Single-step genomic BLUP with APY is applicable to millions of genotyped animals.
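The validation statistic described in this abstract, the coefficient of determination from regressing daughter deviations on genomic predictions, can be illustrated with a minimal Python sketch. The function name `validation_reliability` and the toy numbers are assumptions made for illustration only; the study's own software and any further adjustments it may have applied are not described in the abstract.

```python
def validation_reliability(daughter_deviations, genomic_predictions):
    """Regress daughter deviations (y) on genomic predictions (x).

    Returns (slope, r_squared) of the simple linear regression.
    Hypothetical helper written for illustration, not the authors' code.
    """
    n = len(genomic_predictions)
    if n != len(daughter_deviations) or n < 2:
        raise ValueError("need two equally long samples with at least 2 bulls")
    mean_x = sum(genomic_predictions) / n
    mean_y = sum(daughter_deviations) / n
    # Centered sums of squares and cross-products
    sxx = sum((x - mean_x) ** 2 for x in genomic_predictions)
    syy = sum((y - mean_y) ** 2 for y in daughter_deviations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genomic_predictions, daughter_deviations))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary across bulls")
    slope = sxy / sxx                  # regression coefficient of y on x
    r_squared = sxy * sxy / (sxx * syy)  # coefficient of determination
    return slope, r_squared


if __name__ == "__main__":
    # Toy values for four validation bulls (made up for the example)
    gebv = [0.0, 1.0, 2.0, 3.0]
    dd = [0.0, 1.0, 1.0, 2.0]
    print(validation_reliability(dd, gebv))
```

In a simple regression the coefficient of determination equals the squared correlation between the two variables, which is why the sketch needs only the centered sums of squares and cross-products.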
|
36
|
Hot topic: Use of genomic recursions in single-step genomic best linear unbiased predictor (BLUP) with a large number of genotypes. J Dairy Sci 2015; 98:4090-4. [PMID: 25864050 DOI: 10.3168/jds.2014-9125] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 11/18/2014] [Accepted: 03/13/2015] [Indexed: 11/19/2022]
Abstract
The purpose of this study was to evaluate the accuracy of genomic selection in single-step genomic BLUP (ssGBLUP) when the inverse of the genomic relationship matrix (G) is derived by the "algorithm for proven and young animals" (APY). This algorithm implements genomic recursions on a subset of "proven" animals. Only a relationship matrix for animals treated as "proven" needs to be inverted, and the extra costs of adding animals treated as "young" are linear. Analyses involved 10,102,702 final scores on 6,930,618 Holstein cows. Final score, which is a composite of type traits, is a popular trait in the United States and was easily available for this study. A total of 100,000 animals with genotypes were used in the analyses and included 23,000 sires (16,000 with >5 progeny), 27,000 cows, and 50,000 young animals. Genomic EBV (GEBV) were calculated with a regular inverse of G, and with the G inverse approximated by APY. Animals in the proven subset included only sires (23,000), sires+cows (50,000), only cows (27,000), or sires with >5 progeny (16,000). The correlations of GEBV with APY and regular GEBV for young genotyped animals were 0.994, 0.995, 0.992, and 0.992, respectively. Later, animals in the proven subset were randomly sampled from all genotyped animals in sets of 2,000, 5,000, 10,000, 15,000, and 20,000; each sample was replicated 4 times. Respective correlations were 0.97 (5,000 sample), 0.98 (10,000 sample), and 0.99 (20,000 sample), with minimal difference between samples of the same size. Genomic EBV with APY were accurate when the number of animals used in the subset was between 10,000 and 20,000, with little difference between the ways of creating the subset. Due to the approximately linear cost of APY, ssGBLUP with APY could support any number of genotyped animals without affecting accuracy.
|
37
|
Differences between genomic-based and pedigree-based relationships in a chicken population, as a function of quality control and pedigree links among individuals. J Anim Breed Genet 2014; 131:445-51. [PMID: 25039816 DOI: 10.1111/jbg.12109] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 12/05/2013] [Accepted: 06/24/2014] [Indexed: 12/01/2022]
Abstract
This work studied differences between expected (calculated from pedigree) and realized (genomic, from markers) relationships in a real population, the influence of quality control on these differences, and their fit to current theory. Data included 4940 pure line chickens across five generations genotyped for 57,636 SNP. Pedigrees (5762 animals) were available for the five generations, with the pedigree starting in the first one. Three levels of quality control were used. With no quality control, the mean difference between realized and expected relationships for different types of relationships was ≤ 0.04 with standard deviation ≤ 0.10. With strong quality control (call rate ≥ 0.9, parent-progeny conflicts, minor allele frequency and use of only autosomal chromosomes), these numbers reduced to ≤ 0.02 and ≤ 0.04, respectively. While the maximum difference was 1.02 with the complete data, it was only 0.18 with the latest three generations of genotypes (but including all pedigrees). Variation of expected minus realized relationships agreed with theoretical developments and suggests an effective number of loci of 70 for this population. When the pedigree is complete and as deep as the genotypes, the standard deviation of the difference between the expected and realized relationships is around 0.04, all categories confounded. A standard deviation of differences larger than 0.10 suggests bad quality control, mistakes in pedigree recording or genotype labelling, or insufficient depth of pedigree.
|
38
|
Using recursion to compute the inverse of the genomic relationship matrix. J Dairy Sci 2014; 97:3943-52. [PMID: 24679933 DOI: 10.3168/jds.2013-7752] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Received: 11/22/2013] [Accepted: 02/10/2014] [Indexed: 11/19/2022]
Abstract
Computing the inverse of the genomic relationship matrix using recursion was investigated. A traditional algorithm to invert the numerator relationship matrix is based on the observation that the conditional expectation for an additive effect of 1 animal given the effects of all other animals depends on the effects of its sire and dam only, each with a coefficient of 0.5. With genomic relationships, such an expectation depends on all other genotyped animals, and the coefficients do not have any set value. For each animal, the coefficients plus the conditional variance can be called a genomic recursion. If such recursions are known, the mixed model equations can be solved without explicitly creating the inverse of the genomic relationship matrix. Several algorithms were developed to create genomic recursions. In an algorithm with sequential updates, genomic recursions are created animal by animal. That algorithm can also be used to update a known inverse of a genomic relationship matrix for additional genotypes. In an algorithm with forward updates, a newly computed recursion is immediately applied to update recursions for remaining animals. The computing costs for both algorithms depend on the sparsity pattern of the genomic recursions, but are lower than or equal to those of regular inversion. An algorithm for proven and young animals assumes that the genomic recursions for young animals contain coefficients only for proven animals. Such an algorithm generates exact genomic EBV in genomic BLUP and is an approximation in single-step genomic BLUP. That algorithm has a cubic cost for the number of proven animals and a linear cost for the number of young animals. The genomic recursions can provide new insight into genomic evaluation and possibly reduce costs of genetic predictions with extremely large numbers of genotypes.
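The algorithm for proven and young animals summarized in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `invert` and `apy_inverse` are hypothetical, the matrix is held as dense nested lists, and the proven animals are assumed to be ordered first. Each young animal contributes a recursion (coefficients on the proven animals plus a conditional variance), and only the proven block is inverted directly.

```python
def invert(matrix):
    """Invert a small dense matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def apy_inverse(g, n_proven):
    """Sparse-recursion inverse of g; the first n_proven animals are 'proven'."""
    n = len(g)
    gpp_inv = invert([row[:n_proven] for row in g[:n_proven]])
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n_proven):
        for j in range(n_proven):
            inv[i][j] = gpp_inv[i][j]
    for young in range(n_proven, n):
        g_yp = g[young][:n_proven]
        # Recursion coefficients of this young animal on the proven animals
        coef = [sum(g_yp[k] * gpp_inv[k][j] for k in range(n_proven))
                for j in range(n_proven)]
        # Conditional variance of the young animal given the proven animals
        cond_var = g[young][young] - sum(c * x for c, x in zip(coef, g_yp))
        # Add w w' / cond_var, where w holds -coef for proven and 1 for itself
        idx = list(range(n_proven)) + [young]
        w = [-c for c in coef] + [1.0]
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                inv[i][j] += w[a] * w[b] / cond_var
    return inv


if __name__ == "__main__":
    # Toy genomic relationship matrix: 2 proven animals, 2 young animals
    g = [[1.0, 0.2, 0.5, 0.3],
         [0.2, 1.0, 0.4, 0.1],
         [0.5, 0.4, 1.1, 0.3],
         [0.3, 0.1, 0.3, 1.0]]
    for row in apy_inverse(g, 2):
        print(row)
```

The only direct inversion is on the proven block, and each young animal adds one update whose size depends only on the number of proven animals, which matches the cubic and linear costs stated in the abstract. In this sketch the young-by-young block of the result is diagonal, so the result equals the regular inverse only in special cases, for example when there is a single young animal.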
|
39
|
Within- and across-breed genomic predictions and genomic relationships for Western Pyrenees dairy sheep breeds Latxa, Manech, and Basco-Béarnaise. J Dairy Sci 2014; 97:3200-12. [PMID: 24630656 DOI: 10.3168/jds.2013-7745] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 11/21/2013] [Accepted: 02/02/2014] [Indexed: 01/13/2023]
Abstract
Genotypes, phenotypes and pedigrees of 6 breeds of dairy sheep (including subdivisions of Latxa, Manech, and Basco-Béarnaise) from the Spanish and French Western Pyrenees were used to estimate genetic relationships across breeds (together with genotypes from the Lacaune dairy sheep) and to verify by forward cross-validation single-breed or multiple-breed genetic evaluations. The number of rams genotyped fluctuated between 100 and 1,300 but generally represented the last 10 cohorts of progeny-tested rams within each breed. Genetic relationships were assessed by principal components analysis of the genomic relationship matrices and also by the conservation of linkage disequilibrium patterns at given physical distances in the genome. Genomic and pedigree-based evaluations used daughter yield performances of all rams, although some of them were not genotyped. A pseudo-single step method was used in this case for genomic predictions. Results showed a clear structure in blond and black breeds for Manech and Latxa, reflecting historical exchanges, and isolation of Basco-Béarnaise and Lacaune. Relatedness between any 2 breeds was, however, lower than expected. Single-breed genomic predictions had accuracies comparable with other breeds of dairy sheep or small breeds of dairy cattle. They were more accurate than pedigree predictions for 5 out of 6 breeds, with absolute increases in accuracy ranging from 0.05 to 0.30 points. They were significantly better, as assessed by bootstrapping of candidates, for 2 of the breeds. Predictions using multiple populations only marginally increased the accuracy for a couple of breeds. Pooling populations does not increase the accuracy of genomic evaluations in dairy sheep; however, single-breed genomic predictions are more accurate, even for small breeds, and make the consideration of genomic schemes in dairy sheep interesting.
|
40
|
Assessment of accuracy of genomic prediction for French Lacaune dairy sheep. J Dairy Sci 2014; 97:1107-16. [DOI: 10.3168/jds.2013-7135] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 06/13/2013] [Accepted: 10/22/2013] [Indexed: 11/19/2022]
|
41
|
The coefficient of dominance is not (always) estimable with biallelic markers. J Anim Breed Genet 2014; 131:97-104. [PMID: 24397385 DOI: 10.1111/jbg.12076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/11/2013] [Accepted: 11/29/2013] [Indexed: 11/29/2022]
Abstract
The genetic relationship among individuals at one locus is characterized by nine coefficients of identity. The coefficients of inbreeding, coancestry and dominance (or fraternity) are just linear functions of them. Here, it is shown how they can be estimated using biallelic and triallelic markers using the method of moments, and comparisons are made with other methods based on molecular coancestry or molecular covariance. It is concluded that in the general case of dominance and inbreeding with biallelic markers, only the coefficients of inbreeding and coancestry can be estimated, but neither the single coefficients of identity nor the coefficient of dominance can be estimated. More than two alleles are required for a full estimation as illustrated with the triallelic situation.
|
42
|
Unknown-parent groups in single-step genomic evaluation. J Anim Breed Genet 2013; 130:252-8. [DOI: 10.1111/jbg.12025] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 09/18/2012] [Accepted: 11/30/2012] [Indexed: 11/29/2022]
|
43
|
Computational strategies for national integration of phenotypic, genomic, and pedigree data in a single-step best linear unbiased prediction. J Dairy Sci 2012; 95:4629-45. [PMID: 22818478 DOI: 10.3168/jds.2011-4982] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Received: 09/27/2011] [Accepted: 04/03/2012] [Indexed: 11/19/2022]
Abstract
The single-step genomic BLUP (SSGBLUP) is a method that can integrate pedigree and genotypes at molecular markers in an optimal way. However, its present form (regular SSGBLUP) has a high computational cost (cubic in the number of genotyped animals) and may need extensive rewriting of genetic evaluation software. In this work, we propose several strategies to implement the single step in a simpler manner. The first one expands the single-step mixed-model equations to obtain equivalent equations from which the regular (including pedigree and records only) mixed-model equations are a subset. These new equations (unsymmetric extended SSGBLUP) have low computational cost, but require a nonsymmetric solver such as the biconjugate gradient stabilized method or successive underrelaxation, which is a variant of successive overrelaxation, with a relaxation factor lower than 1. In addition, we show a new derivation of the single-step method, which includes, as an extra effect, deviations from strictly polygenic breeding values. As a result, the same set of equations as above is obtained. We show that, whereas the new derivation shows apparent problems of nonpositive definiteness for certain covariance matrices, a proper equivalent model including imaginary effects always exists, leading always to the regular SSGBLUP mixed model equations. The system of equations can be solved (iterative SSGBLUP) by iterating between a pedigree and records evaluation and a genomic evaluation (each one solved by any iterative or direct method), whereas global iteration can use a block version of successive underrelaxation, which ensures convergence. The genomic evaluation can explicitly include marker or haplotype effects and possibly involve nonlinear (e.g., Bayesian by Markov chain Monte Carlo) methods. In a simulated example with 28,800 individuals and 1,800 genotyped individuals, all methods converged quickly to the same solutions. 
Using existing efficient methods with limited memory requirements to compute the products Gt and A(22)t for any t (where G and A(22) are genomic and pedigree relationships for genotyped animals, and t is a vector), all strategies can be converted to iteration on data procedures for which the total number of operations is linear in the number of animals + number of genotyped animals × number of markers.
|
44
|
Computation of deregressed proofs for genomic selection when own phenotypes exist with an application in French show-jumping horses. J Anim Sci 2012; 91:1076-85. [PMID: 23230121 DOI: 10.2527/jas.2012-5256] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/24/2022]
Abstract
Genomic evaluations often use as pseudo-phenotypes corrected means of progeny performances, like daughter yield deviations (DYD) in dairy species. In horse breeding, own performances are also available and performances from other relatives (as half sibs) may play an important part in the EBV because the number of progeny remains low, even for stallions. The first step for genomic selection in horses is therefore to generate pseudo-phenotypes for genomic analysis when parental or own information is considered. This work presents an easy method to compute deregressed EBV from regular pedigree-based genetic evaluations (EBV, reliabilities) to be used in genomic evaluations. The proposed methodology builds deregressed proofs so that they combine own performances (from genotyped individuals) and performances of relatives (outside of the genotyped sample). An application to show jumping horse data is presented. A sample of 908 stallions specialized in show jumping [71% Selle Français (SF), 17% foreign sport horses (FH), 13% Anglo Arab (AA)] were genotyped. Genotyping was performed using the Illumina Equine SNP50 BeadChip, and after quality tests, 44,444 SNP were retained. Two methods were used for genomic evaluation: GBLUP and BayesCπ, and 6 validation data sets were compared, chosen according to breeds SF + FH + AA or SF + FH, family structure (more than 3 half sibs), reliability of sires (>0.97) or sons (>0.72). In spite of a favorable genetic structure [linkage disequilibrium equal to 0.24 at 50 kb pairs], results showed low advantage of genomic evaluation. On the validation sample SF + FH + AA, the correlation between deregressed proofs and GBLUP or BayesCπ predictions was 0.39, 0.37, 0.51 according to the different validation data sets, compared with 0.36, 0.33, 0.53 obtained with BLUP predictions. Correlations were much lower on the SF + FH sample. 
Research is pursued to understand this low advantage of genomic selection and to improve the methodology for genomic evaluation in this context, which is less favorable than dairy cattle breeding.
|
45
|
A genome scan for QTL affecting resistance to Haemonchus contortus in sheep. J Anim Sci 2012; 90:4690-705. [DOI: 10.2527/jas.2012-5121] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 11/13/2022]
|
46
|
Application of Bayesian least absolute shrinkage and selection operator (LASSO) and BayesCπ methods for genomic selection in French Holstein and Montbéliarde breeds. J Dairy Sci 2012; 96:575-91. [PMID: 23127905 DOI: 10.3168/jds.2011-5225] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 12/05/2011] [Accepted: 09/14/2012] [Indexed: 11/19/2022]
Abstract
Recently, the amount of available single nucleotide polymorphism (SNP) marker data has considerably increased in dairy cattle breeds, both for research purposes and for application in commercial breeding and selection programs. Bayesian methods are currently used in the genomic evaluation of dairy cattle to handle very large sets of explanatory variables with a limited number of observations. In this study, we applied 2 Bayesian methods, BayesCπ and Bayesian least absolute shrinkage and selection operator (LASSO), to 2 genotyped and phenotyped reference populations consisting of 3,940 Holstein bulls and 1,172 Montbéliarde bulls with approximately 40,000 polymorphic SNP. We compared the accuracy of the Bayesian methods for the prediction of 3 traits (milk yield, fat content, and conception rate) with pedigree-based BLUP, genomic BLUP, partial least squares (PLS) regression, and sparse PLS regression, a variable selection PLS variant. The results showed that the correlations between observed and predicted phenotypes were similar in BayesCπ (with or without pedigree information) and Bayesian LASSO for most of the traits, regardless of breed. In the Holstein breed, Bayesian methods led to higher correlations than other approaches for fat content and were similar to genomic BLUP for milk yield and to genomic BLUP and PLS regression for the conception rate. In the Montbéliarde breed, no method dominated the others, except BayesCπ for fat content. The better performances of the Bayesian methods for fat content in Holstein and Montbéliarde breeds are probably due to the effect of the DGAT1 gene. The SNP identified by the BayesCπ, Bayesian LASSO, and sparse PLS regression methods, based on their effect on the different traits of interest, were located at almost the same position on the genome.
As the Bayesian methods resulted in regressions of direct genomic values on daughter trait deviations closer to 1 than for the other methods tested in this study, Bayesian methods are suggested for genomic evaluations of French dairy cattle.
|
47
|
Methods to approximate reliabilities in single-step genomic evaluation. J Dairy Sci 2012; 96:647-54. [PMID: 23127903 DOI: 10.3168/jds.2012-5656] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 04/24/2012] [Accepted: 09/18/2012] [Indexed: 11/19/2022]
Abstract
Reliability of predictions from single-step genomic BLUP (ssGBLUP) can be calculated by matrix inversion, but that is not feasible for large data sets. Two methods of approximating reliability were developed based on the decomposition of a function of reliability into contributions from records, pedigrees, and genotypes. Those contributions can be expressed in record or daughter equivalents. The first approximation method involved inversion of a matrix that contains inverses of the genomic relationship matrix and the pedigree relationship matrix for genotyped animals. The second approximation method involved only the diagonal elements of those inverses. The 2 approximation methods were tested with a simulated data set. The correlations between ssGBLUP and approximated contributions from genomic information were 0.92 for the first approximation method and 0.56 for the second approximation method; contributions were inflated by 62 and 258%, respectively. The respective correlations for reliabilities were 0.98 and 0.72. After empirical correction for inflation, those correlations increased to 0.99 and 0.89. Approximations of reliabilities of predictions by ssGBLUP are accurate and computationally feasible for populations with up to 100,000 genotyped animals. A critical part of the approximations is quality control of information from single nucleotide polymorphisms and proper scaling of the genomic relationship matrix.
|
48
|
LDSO: a program to simulate pedigrees and molecular information under various evolutionary forces. J Anim Breed Genet 2012; 129:417-21. [PMID: 22963363 DOI: 10.1111/j.1439-0388.2011.00986.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Simulations are a major tool to evaluate new statistical methods and optimize experimental designs in the genomic era. However, this can only be achieved when the simulations are close enough to reality, as well as diverse enough to be realistic. For mapping studies, it is thus critical to re-create as much as possible the forces generating linkage (mutation, random drift, changes in population sizes, selection and pedigree structure) and the mechanisms producing trait genetic architecture (additivity, dominance, epistasis). We present here a computer program (ldso) simulating these phenomena. Optional outputs provide statistics on the linkage disequilibrium (LD) structure and the identity by descent between chromosomal segments, facilitating further data analyses. Furthermore, ldso enables the simulation of genomic data in known pedigrees, which sticks as precisely as possible to recent population history and structures of the long-range LD, allowing optimization of fine-mapping strategies.
|
49
|
A comparison of partial least squares (PLS) and sparse PLS regressions in genomic selection in French dairy cattle. J Dairy Sci 2012; 95:2120-31. [PMID: 22459857 DOI: 10.3168/jds.2011-4647] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 06/22/2011] [Accepted: 12/09/2011] [Indexed: 01/25/2023]
Abstract
Genomic selection involves computing a prediction equation from the estimated effects of a large number of DNA markers based on a limited number of genotyped animals with phenotypes. The number of observations is much smaller than the number of independent variables, and the challenge is to find methods that perform well in this context. Partial least squares regression (PLS) and sparse PLS were used with a reference population of 3,940 genotyped and phenotyped French Holstein bulls and 39,738 polymorphic single nucleotide polymorphism markers. Partial least squares regression reduces the number of variables by projecting independent variables onto latent structures. Sparse PLS combines variable selection and modeling in a one-step procedure. Correlations between observed phenotypes and phenotypes predicted by PLS and sparse PLS were similar, but sparse PLS highlighted some genome regions more clearly. Both PLS and sparse PLS were more accurate than pedigree-based BLUP but generally provided lower correlations between observed and predicted phenotypes than did genomic BLUP. Furthermore, PLS and sparse PLS required similar computing time to genomic BLUP for the study of 6 traits.
|
50
|
Genomic selection in the French Lacaune dairy sheep breed. J Dairy Sci 2012; 95:2723-33. [DOI: 10.3168/jds.2011-4980] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 09/27/2011] [Accepted: 01/05/2012] [Indexed: 11/19/2022]
|