1
Misztal I, Aguilar I, Lourenco D, Ma L, Steibel JP, Toro M. Emerging issues in genomic selection. J Anim Sci 2021; 99:skab092. [PMID: 33773494 PMCID: PMC8186541 DOI: 10.1093/jas/skab092] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/23/2021] [Accepted: 03/26/2021] [Indexed: 12/22/2022] Open
Abstract
Genomic selection (GS) is now practiced successfully across many species. However, many questions remain, such as long-term effects, estimations of genomic parameters, robustness of genome-wide association study (GWAS) with small and large datasets, and stability of genomic predictions. This study summarizes presentations from the authors at the 2020 American Society of Animal Science (ASAS) symposium. The focus of many studies until now is on linkage disequilibrium between two loci. Ignoring higher-level equilibrium may lead to phantom dominance and epistasis. The Bulmer effect leads to a reduction of the additive variance; however, the selection for increased recombination rate can release anew genetic variance. With genomic information, estimates of genetic parameters may be biased by genomic preselection, but costs of estimation can increase drastically due to the dense form of the genomic information. To make the computation of estimates feasible, genotypes could be retained only for the most important animals, and methods of estimation should use algorithms that can recognize dense blocks in sparse matrices. GWASs using small genomic datasets frequently find many marker-trait associations, whereas studies using much bigger datasets find only a few. Most of the current tools use very simple models for GWAS, possibly causing artifacts. These models are adequate for large datasets where pseudo-phenotypes such as deregressed proofs indirectly account for important effects for traits of interest. Artifacts arising in GWAS with small datasets can be minimized by using data from all animals (whether genotyped or not), realistic models, and methods that account for population structure. Recent developments permit the computation of P-values from genomic best linear unbiased prediction (GBLUP), where models can be arbitrarily complex but restricted to genotyped animals only, and single-step GBLUP that also uses phenotypes from ungenotyped animals. 
Stability was an important part of nongenomic evaluations, where genetic predictions were stable in the absence of new data even with low prediction accuracies. Unfortunately, genomic evaluations for such animals change because all animals with genotypes are connected. A top-ranked animal can easily drop in the next evaluation, causing a crisis of confidence in genomic evaluations. While correlations between consecutive genomic evaluations are high, outliers can have differences as high as 1 SD. A solution to fluctuating genomic evaluations is to base selection decisions on groups of animals. Although many issues in GS have been solved, many new issues that require additional research continue to surface.
Affiliation(s)
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Ignacio Aguilar
  - Instituto Nacional de Investigación Agropecuaria (INIA), 90200 Canelones, Uruguay
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602, USA
- Li Ma
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Juan Pedro Steibel
  - Department of Animal Science, Michigan State University, East Lansing, MI 48824, USA
- Miguel Toro
  - Departamento de Producción Agraria, Universidad Politécnica de Madrid, Madrid, Spain
2
Genetic consistency between gait analysis by accelerometry and evaluation scores at breeding shows for the selection of jumping competition horses. PLoS One 2020; 15:e0244064. [PMID: 33326505 PMCID: PMC7743953 DOI: 10.1371/journal.pone.0244064] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/07/2020] [Accepted: 12/02/2020] [Indexed: 01/16/2023] Open
Abstract
The aim was to assess the efficiency of gait characteristics in improving the jumping performance of sport horses and to compare accelerometer measurements with judges' scores for this purpose. A sample of 1,477 young jumping horses was measured using accelerometers for walk, trot, and canter. Of these, 702 were genotyped with 541,175 SNPs after quality control. A dataset of 26,914 horses scored by judges in breeding shows for gaits and a dataset of 142,682 horses that performed in jumping competitions were used. Analysis of the accelerometric data defined three principal components for each gait, explaining 64% to 89% of the variability. Animal mixed models were used to estimate genetic parameters with the inclusion of up to 308,105 ancestors in the relationship matrix. Fixed effects for the accelerometric variables included velocity, gender, age, and event. A GWAS was performed on residuals with the fixed effect of each SNP. The GWAS revealed no QTL for gait traits other than the one related to height at withers. Heritability was high for the accelerometric principal component linked to stride frequency and dorsoventral displacement at trot (0.53) and canter (0.41) and moderate for the one linked to longitudinal activities (0.33 for trot, 0.19 for canter). Low heritabilities were found for the walk traits. The genetic correlations of the accelerometric principal components with the jumping competition were essentially nil, except for a negative correlation with longitudinal activity at canter (-0.19). The genetic correlation between the judges' scores and the jumping competition reached 0.45 for canter (0.31 for trot and 0.17 for walk). But these correlations turned negative when the scores were corrected for the known parental breeding value for competition at the time of the judging. In conclusion, gait traits were not helpful to select for jumping performance. Different gaits may be suitable for a good jumping horse.
3
Aguilar I, Legarra A, Cardoso F, Masuda Y, Lourenco D, Misztal I. Frequentist p-values for large-scale-single step genome-wide association, with an application to birth weight in American Angus cattle. Genet Sel Evol 2019; 51:28. [PMID: 31221101 PMCID: PMC6584984 DOI: 10.1186/s12711-019-0469-3] [Citation(s) in RCA: 82] [Impact Index Per Article: 13.7] [Received: 01/03/2019] [Accepted: 05/27/2019] [Indexed: 11/14/2022] Open
Abstract
Background Single-step genomic best linear unbiased prediction (SSGBLUP) is a comprehensive method for genomic prediction. Point estimates of marker effects from SSGBLUP are often used for genome-wide association studies (GWAS) without a formal framework of hypothesis testing. Our objective was to implement p-values for single-marker GWAS studies within the single-step GWAS (SSGWAS) framework by deriving computational algorithms and procedures, and by applying these to a large beef cattle population. Methods P-values were obtained based on the prediction error (co)variances for single nucleotide polymorphisms (SNPs), which were obtained from the prediction error (co)variances of genomic predictions based on the inverse of the coefficient matrix and formulas to estimate SNP effects. Results Computation of p-values took a negligible time for a dataset with almost 2 million animals in the pedigree and 1424 genotyped sires, and no inflation of statistics was observed. The SNPs that passed the Bonferroni threshold of 10^-5.9 were the same as those that explained the highest proportion of additive genetic variance, but even at the same significance levels and effects, some of them explained less genetic variance due to lower allele frequency. Conclusions The use of a p-value for SSGWAS is a very general and efficient strategy to identify quantitative trait loci (QTL). It can be used for complex datasets such as those used in animal breeding, where only a proportion of the pedigreed animals are genotyped.
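As a rough illustration of the testing step described in this abstract, the sketch below turns a SNP effect estimate and its standard error (the square root of its prediction error variance) into a two-sided normal p-value and compares it with a Bonferroni threshold. The function names, the example numbers, and the choice of alpha = 0.05 are assumptions made for illustration; the abstract does not state the number of tests or the alpha level used, and this is not the authors' implementation.

```python
import math

def two_sided_p(effect, std_err):
    """Two-sided p-value for effect / std_err under a standard normal null.

    std_err is assumed to be the square root of the prediction error
    variance of the SNP effect (hypothetical input, not from the paper).
    """
    z = abs(effect) / std_err
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2))
    return math.erfc(z / math.sqrt(2.0))

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumption."""
    return alpha / n_tests

# Illustrative values only.
p = two_sided_p(effect=0.12, std_err=0.02)
threshold = bonferroni_threshold(n_tests=39716)
print(p < threshold, -math.log10(threshold))
```

With alpha = 0.05, a threshold near 10^-5.9 corresponds to roughly 40,000 tests; that figure is back-calculated here and is not taken from the abstract.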
Affiliation(s)
- Ignacio Aguilar
  - Instituto Nacional de Investigación Agropecuaria (INIA), 90200, Canelones, Uruguay
- Andres Legarra
  - UMR GenPhySE, INRA Toulouse, BP52626, 31326, Castanet Tolosan, France
- Fernando Cardoso
  - Department of Animal Science, Federal University of Pelotas, Rio Grande do Sul, Brazil
  - Embrapa Pecuária Sul, Bagé, RS, 96400-031, Brazil
- Yutaka Masuda
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
- Daniela Lourenco
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
- Ignacy Misztal
  - Department of Animal and Dairy Science, University of Georgia, Athens, GA, USA
4
González-Prendes R, Quintanilla R, Mármol-Sánchez E, Pena RN, Ballester M, Cardoso TF, Manunza A, Casellas J, Cánovas Á, Díaz I, Noguera JL, Castelló A, Mercadé A, Amills M. Comparing the mRNA expression profile and the genetic determinism of intramuscular fat traits in the porcine gluteus medius and longissimus dorsi muscles. BMC Genomics 2019; 20:170. [PMID: 30832586 PMCID: PMC6399881 DOI: 10.1186/s12864-019-5557-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/19/2018] [Accepted: 02/22/2019] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Intramuscular fat (IMF) content and composition have a strong impact on the nutritional and organoleptic properties of porcine meat. The goal of the current work was to compare the patterns of gene expression and the genetic determinism of IMF traits in the porcine gluteus medius (GM) and longissimus dorsi (LD) muscles. RESULTS A comparative analysis of the mRNA expression profiles of the pig GM and LD muscles in 16 Duroc pigs with available microarray mRNA expression measurements revealed 106 differentially expressed probes (fold-change > 1.5 and q-value < 0.05). Amongst the genes displaying the most significant differential expression, several loci belonging to the Hox transcription factor family were either upregulated (HOXA9, HOXA10, HOXB6, HOXB7 and TBX1) or downregulated (ARX) in the GM muscle. Differences in the expression of genes with key roles in carbohydrate and lipid metabolism (e.g. FABP3, ORMDL1 and SLC37A1) were also detected. By performing a GWAS for IMF content and composition traits recorded in the LD and GM muscles of 350 Duroc pigs, we identified one region on SSC14 (110-114 Mb) displaying significant associations with C18:0, C18:1(n-7), saturated and unsaturated fatty acid contents in both GM and LD muscles. Moreover, we detected several genome-wide significant associations that were not consistently found in both muscles. Further studies should be performed to confirm whether these associations are muscle-specific. Finally, an eQTL scan for 74 genes, located within GM QTL regions and with available microarray measurements of gene expression, made it possible to identify 14 cis-eQTL regulating the expression of 14 loci, six of which were confirmed by RNA-Seq.
CONCLUSIONS We have detected significant differences in the mRNA expression patterns of the porcine LD and GM muscles, evidencing that the transcriptomic profile of the skeletal muscle tissue is affected by anatomical, metabolic and functional factors. A highly significant association with IMF composition on SSC14 was replicated in both muscles, highlighting the existence of a common genetic determinism, but we also observed the existence of a few associations whose magnitude and significance varied between LD and GM muscles.
Affiliation(s)
- Rayner González-Prendes
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Raquel Quintanilla
  - Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Rovira Roure 191, 25198 Lleida, Spain
- Emilio Mármol-Sánchez
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Ramona N. Pena
  - Departament de Ciència Animal, Universitat de Lleida-Agrotecnio Centre, 25198 Lleida, Spain
- Maria Ballester
  - Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Rovira Roure 191, 25198 Lleida, Spain
- Tainã Figueiredo Cardoso
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
  - CAPES Foundation, Ministry of Education of Brazil, Brasilia, DF 70.040-020 Brazil
- Arianna Manunza
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Joaquim Casellas
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Ángela Cánovas
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1 Canada
- Isabel Díaz
  - Institute for Research and Technology in Food and Agriculture (IRTA), Tecnologia dels Aliments, 17121 Monells, Spain
- José Luis Noguera
  - Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Rovira Roure 191, 25198 Lleida, Spain
- Anna Castelló
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Anna Mercadé
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Marcel Amills
  - Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
5
GWAS by GBLUP: Single and Multimarker EMMAX and Bayes Factors, with an Example in Detection of a Major Gene for Horse Gait. G3: Genes, Genomes, Genetics 2018; 8:2301-2308. [PMID: 29748199 PMCID: PMC6027892 DOI: 10.1534/g3.118.200336] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 12/31/2022]
Abstract
Bayesian models for genomic prediction and association mapping are being increasingly used in the genetic analysis of quantitative traits. Given a point estimate of variance components, the popular methods SNP-BLUP and GBLUP result in joint estimates of the effect of all markers on the analyzed trait; single and multiple marker frequentist tests (EMMAX) can be constructed from these estimates. Indeed, BLUP methods can be seen simultaneously as Bayesian or frequentist methods. So far there is no formal method to produce Bayesian statistics from GBLUP. Here we show that the Bayes Factor (BF), a commonly admitted statistical procedure, can be computed as the ratio of two normal densities: the first, of the estimate of the marker effect over its posterior standard deviation; the second, of the null hypothesis (a value of 0 over the prior standard deviation). We extend the BF to pool evidence from several markers and from several traits. We analyze, with our method and with existing ones, a real data set of 630 horses genotyped for 41711 polymorphic SNPs for the trait “outcome of the qualification test” (which addresses gait, or ambling, of horses), for which a known major gene exists. In the horse data, single marker EMMAX shows a significant effect at the right place at Bonferroni level. The BF points to the same location although with low numerical values. The strength of evidence combining information from several consecutive markers increases using the BF and decreases using EMMAX, which comes from a fundamental difference in the Bayesian and frequentist schools of hypothesis testing. We conclude that our BF method complements frequentist EMMAX analyses because it provides a better pooling of evidence across markers, although its use for primary detection is unclear due to the lack of defined rejection thresholds.
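One way to read the "ratio of two normal densities" in this abstract is as a Savage-Dickey density ratio: the prior density of the marker effect at zero divided by its posterior density at zero. The sketch below implements that reading for a normal prior and a normal posterior. Whether this matches the paper's exact formula is an assumption; the function names and example values are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bayes_factor(effect_hat, post_sd, prior_sd):
    """Savage-Dickey style BF in favour of a non-null marker effect.

    Assumed inputs: effect_hat is the posterior mean of the marker effect,
    post_sd its posterior standard deviation, prior_sd the prior standard
    deviation of a marker effect centred on 0.
    """
    prior_at_zero = normal_pdf(0.0, 0.0, prior_sd)
    posterior_at_zero = normal_pdf(0.0, effect_hat, post_sd)
    return prior_at_zero / posterior_at_zero

# Illustrative values only: a marker whose posterior barely moved from the prior.
print(bayes_factor(effect_hat=0.01, post_sd=0.0099, prior_sd=0.01))
```

When the data shrink the posterior standard deviation only slightly below the prior one, as is typical for a single SNP in GBLUP, the ratio stays close to 1 unless the estimate is several posterior standard deviations from zero, which is consistent with the low numerical values the abstract reports.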
6
Gross A, Tönjes A, Scholz M. On the impact of relatedness on SNP association analysis. BMC Genet 2017; 18:104. [PMID: 29212447 PMCID: PMC5719591 DOI: 10.1186/s12863-017-0571-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/13/2017] [Accepted: 11/23/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND When testing for SNP (single nucleotide polymorphism) associations in related individuals, observations are not independent. Simple linear regression assuming independent normally distributed residuals results in an increased type I error, and the power of the test is also affected in a more complicated manner. Inflation of type I error is often successfully corrected by genomic control. However, this reduces the power of the test when relatedness is of concern. In the present paper, we derive explicit formulae to investigate how heritability and strength of relatedness contribute to variance inflation of the effect estimate of the linear model. Further, we study the consequences of variance inflation on hypothesis testing and compare the results with those of genomic control correction. We apply the developed theory to the publicly available HapMap trio data (N=129), the Sorbs (a self-contained population with N=977 characterised by a cryptic relatedness structure) and synthetic family studies with different sample sizes (ranging from N=129 to N=999) and different degrees of relatedness. RESULTS We derive explicit and easy-to-apply approximation formulae to estimate the impact of relatedness on the variance of the effect estimate of the linear regression model. Variance inflation increases with increasing heritability. Relatedness structure also impacts the degree of variance inflation, as shown for example family structures. Variance inflation is smallest for HapMap trios, followed by a synthetic family study corresponding to the trio data but with larger sample size than HapMap. The next strongest inflation is observed for the Sorbs, and finally, for a synthetic family study with a more extreme relatedness structure but with similar sample size as the Sorbs. Type I error increases rapidly with increasing inflation. However, for smaller significance levels, power increases with increasing inflation while the opposite holds for larger significance levels.
When genomic control is applied, type I error is preserved while power decreases rapidly with increasing variance inflation. CONCLUSIONS Stronger relatedness as well as higher heritability result in increased variance of the effect estimate of simple linear regression analysis. While type I error rates are generally inflated, the behaviour of power is more complex since power can be increased or reduced in dependence on relatedness and the heritability of the phenotype. Genomic control cannot be recommended to deal with inflation due to relatedness. Although it preserves type I error, the loss in power can be considerable. We provide a simple formula for estimating variance inflation given the relatedness structure and the heritability of a trait of interest. As a rule of thumb, variance inflation below 1.05 does not require correction and simple linear regression analysis is still appropriate.
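For readers unfamiliar with the genomic control correction that this abstract evaluates, a minimal sketch follows. It estimates the inflation factor lambda as the median observed 1-df chi-square statistic divided by the theoretical median (about 0.455), then rescales every statistic by lambda. This is the standard textbook procedure, not the variance inflation formula derived in the paper; the function name, the clamping of lambda at 1, and the example statistics are assumptions.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics, corrected p-values).

    chi2_stats: 1-df chi-square association statistics, one per SNP.
    Lambda is clamped at 1 so that statistics are never inflated
    (a common convention, assumed here).
    """
    lam = max(statistics.median(chi2_stats) / CHI2_1DF_MEDIAN, 1.0)
    corrected = [x / lam for x in chi2_stats]
    # Survival function of chi-square(1 df): P(X > x) = erfc(sqrt(x / 2)).
    p_values = [math.erfc(math.sqrt(x / 2.0)) for x in corrected]
    return lam, corrected, p_values

# Illustrative statistics only.
lam, corrected, p_values = genomic_control([0.2, 0.9, 1.1, 4.0, 12.5])
print(round(lam, 3), [round(p, 4) for p in p_values])
```

Because every statistic is divided by the same lambda, truly associated SNPs are deflated along with the null ones, which is the loss of power the abstract warns about.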
Affiliation(s)
- Arnd Gross
  - Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Haertelstrasse 16-18, Leipzig, 04107, Germany
  - LIFE - Leipzig Research Center for Civilization Diseases, University of Leipzig, Philipp-Rosenthal-Strasse 27, Leipzig, 04103, Germany
- Anke Tönjes
  - Department of Medicine, University of Leipzig, Liebigstrasse 18, Leipzig, 04103, Germany
- Markus Scholz
  - Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Haertelstrasse 16-18, Leipzig, 04107, Germany
  - LIFE - Leipzig Research Center for Civilization Diseases, University of Leipzig, Philipp-Rosenthal-Strasse 27, Leipzig, 04103, Germany
7
Rincent R, Kuhn E, Monod H, Oury FX, Rousset M, Allard V, Le Gouis J. Optimization of multi-environment trials for genomic selection based on crop models. Theoretical and Applied Genetics 2017; 130:1735-1752. [PMID: 28540573 PMCID: PMC5511605 DOI: 10.1007/s00122-017-2922-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/23/2017] [Accepted: 05/11/2017] [Indexed: 05/20/2023]
Abstract
We propose a statistical criterion to optimize multi-environment trials to predict genotype × environment interactions more efficiently, by combining crop growth models and genomic selection models. Genotype × environment interactions (GEI) are common in plant multi-environment trials (METs). In this context, models developed for genomic selection (GS), which refers to the use of genome-wide information for predicting breeding values of selection candidates, need to be adapted. One promising way to increase prediction accuracy in various environments is to combine ecophysiological and genetic modelling through crop growth models (CGM) incorporating genetic parameters. The efficiency of this approach relies on the quality of the parameter estimates, which depends on the environments composing the MET used for calibration. The objective of this study was to determine a method to optimize the set of environments composing the MET for estimating genetic parameters in this context. A criterion called OptiMET was defined for this purpose and was evaluated on simulated and real data, with the example of wheat phenology. The MET defined with OptiMET allowed estimating the genetic parameters with lower error, leading to higher QTL detection power and higher prediction accuracies. The MET defined with OptiMET was on average more efficient than a random MET composed of twice as many environments, in terms of quality of the parameter estimates. OptiMET is thus a valuable tool to determine optimal experimental conditions to best exploit MET and the phenotyping tools that are currently being developed.
Affiliation(s)
- R Rincent
  - INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France
  - Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France
- E Kuhn
  - INRA, MaIAGE, INRA, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- H Monod
  - INRA, MaIAGE, INRA, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- F-X Oury
  - INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France
  - Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France
- M Rousset
  - INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France
  - Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France
- V Allard
  - INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France
  - Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France
- J Le Gouis
  - INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France
  - Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France
8
Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis. G3: Genes, Genomes, Genetics 2016; 6:3241-3256. [PMID: 27520956 PMCID: PMC5068945 DOI: 10.1534/g3.116.034256] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 12/02/2022]
Abstract
Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions.
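The estimator this abstract refers to can be written out as follows. The notation is an assumption chosen for illustration (the abstract gives no symbols beyond G, p and m): y is the phenotype vector, X holds the intercept and the tested marker(s), and the genomic and residual variances are treated as known.

```latex
% Assumed notation: y phenotypes, X intercept plus tested marker(s),
% G genomic relationship matrix, \sigma^2_g and \sigma^2_e variance components.
\begin{aligned}
  y &= X\beta + g + e, \qquad
  g \sim N(0,\, G\sigma^2_g), \qquad
  e \sim N(0,\, I\sigma^2_e), \\
  V &= G\sigma^2_g + I\sigma^2_e, \\
  \hat{\beta} &= \left(X' V^{-1} X\right)^{-1} X' V^{-1} y .
\end{aligned}
```

In these terms, the invariance result stated in the abstract says that the same estimate is obtained whether G is built from all markers or from all markers except those in X, provided the two variance components are unaffected by that exclusion.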
9
Correa K, Lhorente JP, López ME, Bassini L, Naswa S, Deeb N, Di Genova A, Maass A, Davidson WS, Yáñez JM. Genome-wide association analysis reveals loci associated with resistance against Piscirickettsia salmonis in two Atlantic salmon (Salmo salar L.) chromosomes. BMC Genomics 2015; 16:854. [PMID: 26499328 PMCID: PMC4619534 DOI: 10.1186/s12864-015-2038-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 06/15/2015] [Accepted: 10/08/2015] [Indexed: 12/15/2022] Open
Abstract
Background Piscirickettsia salmonis is the causal agent of Salmon Rickettsial Syndrome (SRS), which affects salmon species and causes severe economic losses. Selective breeding for disease resistance represents one approach for controlling SRS in farmed Atlantic salmon. Knowledge concerning the architecture of the resistance trait is needed before deciding on the most appropriate approach to enhance artificial selection for P. salmonis resistance in Atlantic salmon. The purpose of the study was to dissect the genetic variation in the resistance to this pathogen in Atlantic salmon. Methods 2,601 Atlantic salmon smolts were experimentally challenged against P. salmonis by means of intra-peritoneal injection. These smolts were the progeny of 40 sires and 118 dams from a Chilean breeding population. Mortalities were recorded daily and the experiment ended at day 40 post-inoculation. Fish were genotyped using a 50K Affymetrix® Axiom® myDesign™ Single Nucleotide Polymorphism (SNP) Genotyping Array. A Genome Wide Association Analysis was performed on data from the challenged fish. Linear regression and logistic regression models were tested. Results Genome Wide Association Analysis indicated that resistance to P. salmonis is a moderately polygenic trait. There were five SNPs in chromosomes Ssa01 and Ssa17 significantly associated with the traits analysed. The proportion of the phenotypic variance explained by each marker is small, ranging from 0.007 to 0.045. Candidate genes including interleukin receptors and fucosyltransferase have been found to be physically linked with these genetic markers and may play an important role in the differential immune response against this pathogen. Conclusions Due to the small amount of variance explained by each significant marker we conclude that genetic resistance to this pathogen can be more efficiently improved with the implementation of genetic evaluations incorporating genotype information from a dense SNP array.
Affiliation(s)
- Katharina Correa
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Av Santa Rosa 11735, Santiago, Chile
- María E López
  - Facultad de Ciencias Agronómicas, Universidad de Chile, Av Santa Rosa 11315, Santiago, Chile
- Liane Bassini
  - Facultad de Ciencias Agronómicas, Universidad de Chile, Av Santa Rosa 11315, Santiago, Chile
- Sudhir Naswa
  - Genus plc, 100 Bluegrass Commons Blvd. Suite 2200, Hendersonville, TN, 37075, USA
- Nader Deeb
  - Genus plc, 100 Bluegrass Commons Blvd. Suite 2200, Hendersonville, TN, 37075, USA
- Alex Di Genova
  - Laboratory of Bioinformatics and Mathematics of the Genome, Center for Mathematical Modeling (UMI 2807 CNRS) and Center for Genome Regulation, Universidad de Chile, Beauchef 851, Santiago, Chile
- Alejandro Maass
  - Laboratory of Bioinformatics and Mathematics of the Genome, Center for Mathematical Modeling (UMI 2807 CNRS) and Center for Genome Regulation, Universidad de Chile, Beauchef 851, Santiago, Chile
- William S Davidson
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada
- José M Yáñez
  - Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Av Santa Rosa 11735, Santiago, Chile
10
Romé H, Varenne A, Hérault F, Chapuis H, Alleno C, Dehais P, Vignal A, Burlot T, Le Roy P. GWAS analyses reveal QTL in egg layers that differ in response to diet differences. Genet Sel Evol 2015; 47:83. [PMID: 26482360 PMCID: PMC4617898 DOI: 10.1186/s12711-015-0160-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/07/2015] [Accepted: 10/06/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The genetic architecture of egg production and egg quality traits, i.e. the quantitative trait loci (QTL) that influence these traits, is still poorly known. To date, 33 studies have focused on the detection of QTL for laying traits in chickens, but less than 10 genes have been identified. The availability of a high-density SNP (single nucleotide polymorphism) chicken array developed by Affymetrix, i.e. the 600K Affymetrix® Axiom® HD genotyping array, offers the possibility to narrow down the localization of previously detected QTL and to detect new QTL. This high-density array is also anticipated to take research beyond the classical hypothesis of additivity of QTL effects or of QTL and environmental effects. The aim of our study was to search for QTL that influence laying traits using the 600K SNP chip and to investigate whether the effects of these QTL differed between diets and age at egg collection. RESULTS One hundred and thirty-one QTL were detected for 16 laying traits and were spread across all marked chromosomes, except chromosomes 16 and 25. The percentage of variance explained by a QTL varied from 2 to 10% for the various traits, depending on diet and age at egg collection. Chromosomes 3, 9, 10 and Z were overrepresented, with more than eight QTL on each one. Among the 131 QTL, 60 had a significantly different effect, depending on diet or age at egg collection. For egg production traits, when the QTL × environment interaction was significant, numerous inversions of sign of the SNP effects were observed, whereas for egg quality traits, the QTL × environment interaction was mostly due to a difference of magnitude of the SNP effects. CONCLUSIONS Our results show that numerous QTL influence egg production and egg quality traits and that the genomic regions, which are involved in shaping the ability of layer chickens to adapt to their environment for egg production, vary depending on the environmental conditions.
The next question will be to address what the impact of these genotype × environment interactions is on selection.
Affiliation(s)
- Hélène Romé
- INRA, UMR1348 PEGASE, Domaine de La Prise, 35590, Saint-Gilles, France; Agrocampus Ouest, UMR1348 PEGASE, 65 Rue de Saint Brieuc, 35042, Rennes, France.
- Frédéric Hérault
- INRA, UMR1348 PEGASE, Domaine de La Prise, 35590, Saint-Gilles, France; Agrocampus Ouest, UMR1348 PEGASE, 65 Rue de Saint Brieuc, 35042, Rennes, France.
- Hervé Chapuis
- SYSAAF, INRA UR83 Recherches Avicoles, 37380, Nouzilly, France.
- Christophe Alleno
- Zootests, Parc Technologique Du Zoopôle, 5 Rue Gabriel Calloet Kerbrat, 22440, Ploufragan, France.
- Patrice Dehais
- INRA, UMR1388 GenPhySe, Auzeville BP52627, 31326, Castanet-Tolosan, France.
- Alain Vignal
- INRA, UMR1388 GenPhySe, Auzeville BP52627, 31326, Castanet-Tolosan, France.
- Pascale Le Roy
- INRA, UMR1348 PEGASE, Domaine de La Prise, 35590, Saint-Gilles, France; Agrocampus Ouest, UMR1348 PEGASE, 65 Rue de Saint Brieuc, 35042, Rennes, France.
|
11
|
Ricard A. Does heterozygosity at the DMRT3 gene make French trotters better racers? Genet Sel Evol 2015; 47:10. [PMID: 25886871 PMCID: PMC4340234 DOI: 10.1186/s12711-015-0095-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/07/2014] [Accepted: 01/16/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recently, a mutation was discovered in the DMRT3 gene that controls pacing in horses. The mutant allele A is fixed in the American Standardbred trotter breed, while in the French trotter breed, the frequency of the wild-type allele C is still 24%. This study aimed at measuring the effect of DMRT3 genotypes on the performance of French trotters and explaining why the polymorphism still occurs in this breed. Using a mixed animal model, genetic parameters and environmental effects on performance traits were estimated from data on 173 176 French trotter races. The effect of the DMRT3 gene was then estimated by the effect of genotype at the highly linked SNP BIEC2-620109 (C-C, A-T) for 630 horses. A selection scheme that included qualification and racing performances was modeled to (1) verify whether the observed superiority of heterozygous CT horses at this SNP could be explained by selection alone and (2) understand why allele C has not disappeared in French trotters. RESULTS Heritability of racing performance traits was high for the qualification test (0.56), moderate for annual earnings per finished race (0.26 to 0.31) and low for proportion of disqualified races (0.06 to 0.09). Genotype CC was always unfavorable compared to genotype TT for qualification: the probability of being qualified was 20% for CC vs. 48% for TT, and earnings were 0.96 σy lower for CC than for TT. Genotype CT was also unfavorable for qualification (40%) and earnings at 3 years (-0.21 σy), but favorable for earnings at ages greater than 5 years: +0.41 σy (P = 7 × 10⁻⁴). Selection on qualification could not explain more than 19% of the difference between genotypes CC and CT in earnings at ages greater than 5 years. Only a scenario in which genotype CT has a favorable effect on the performance of horses older than 5 years could explain why the polymorphism at the DMRT3 gene still exists in the French trotter breed. CONCLUSIONS The use of mature horses in the French racing circuit can explain why the CA genotype is still present in French trotter horses.
Affiliation(s)
- Anne Ricard
- INRA, UMR 1313 Génétique Animale et Biologie Intégrative, 78352, Jouy-en-Josas, France; IFCE, Recherche et Innovation, 61310, Exmes, France.
|
12
|
Legarra A, Croiseau P, Sanchez MP, Teyssèdre S, Sallé G, Allais S, Fritz S, Moreno CR, Ricard A, Elsen JM. A comparison of methods for whole-genome QTL mapping using dense markers in four livestock species. Genet Sel Evol 2015; 47:6. [PMID: 25885597 PMCID: PMC4324410 DOI: 10.1186/s12711-015-0087-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 01/10/2014] [Accepted: 01/06/2015] [Indexed: 12/17/2022] Open
Abstract
Background With dense genotyping, many choices exist for methods to detect quantitative trait loci (QTL) in livestock populations. However, no across-species study has been conducted on the performance of different methods using real data. We compared three methods that correct for relatedness either implicitly or explicitly: linkage and linkage disequilibrium haplotype-based analysis (LDLA), efficient mixed-model association (EMMA) analysis, and Bayesian whole-genome regression (BayesC). We analyzed one chromosome in each of five datasets (dairy cattle, beef cattle, sheep, horses, and pigs) using real genotypes based on dense single nucleotide polymorphisms and phenotypes. The P values corrected for multiple testing or Bayes factors greater than 150 were considered to be significant. To complete the real data study, we also simulated quantitative trait loci (QTL) for the same datasets based on the real genotypes. Several scenarios were chosen, with different QTL effects and linkage disequilibrium patterns. A pseudo-null statistical distribution was chosen to make the significance thresholds comparable across methods. Results For the real data, the three methods generally agreed within 1 or 2 cM for the locations of QTL regions and disagreed when no signals were significant (e.g. in pigs). For certain datasets, LDLA had more significant signals than EMMA or BayesC, but they were concentrated around the same peaks. Therefore, the three methods detected approximately the same number of QTL regions. For the simulated data, LDLA was slightly less powerful and accurate than either EMMA or BayesC but this depended strongly on how thresholds were set in the simulations. Conclusions All three methods performed similarly for real and simulated data. No method was clearly superior across all datasets or for any particular dataset. For computational efficiency and ease of interpretation, EMMA is recommended, but using more than one method is suggested. 
Electronic supplementary material The online version of this article (doi:10.1186/s12711-015-0087-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Andres Legarra
- INRA, UMR 1388 GenPhySE, BP52627, 31326, Castanet Tolosan, France.
- Pascal Croiseau
- INRA, UMR 1313 GABI, Domaine de Vilvert, 78352, Jouy-en-Josas, France.
- Simon Teyssèdre
- INRA, UMR 1388 GenPhySE, BP52627, 31326, Castanet Tolosan, France; Current address: RAGT-R2n, Le bourg, 12510, Druelle, France.
- Guillaume Sallé
- INRA, UMR1282 Infectiologie et Santé Publique, F-37380, Nouzilly, France; Université François Rabelais de Tours, UMR1282 Infectiologie et Santé Publique, 37000, Tours, France.
- Sophie Allais
- Agrocampus Ouest, UMR1348 Pegase, F-35000, Rennes, France; INRA, UMR1348 Pegase, F-35590, Saint-Gilles, France; Université Européenne de Bretagne, Rennes, France.
- Anne Ricard
- INRA, UMR 1313 GABI, Domaine de Vilvert, 78352, Jouy-en-Josas, France; Recherche et Innovation, IFCE, 61310 Exmes, Paris, France.
|
13
|
Brard S, Ricard A. Genome-wide association study for jumping performances in French sport horses. Anim Genet 2014; 46:78-81. [PMID: 25515185 DOI: 10.1111/age.12245] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Accepted: 09/26/2014] [Indexed: 11/28/2022]
Abstract
A genome-wide association study was performed to identify single nucleotide polymorphisms (SNPs) associated with jumping performances of warmbloods in France. The 999 horses included in the study for jumping performances were sport horses [mostly Selle Français (68%), Anglo-Arabians (13%) and horses from the other European studbooks]. Horses were genotyped using the Illumina EquineSNP50 BeadChip. Of the 54,602 SNPs available on this chip, 44,424 were retained after quality testing. Phenotypes were obtained by deregressing official breeding values for jumping competitions to use all available information, that is, the performances of each horse as well as those of its relatives. Two models were used to test the effects of the genotypes on deregressed phenotypes: a single-marker mixed model and a haplotype-based mixed model (significant: P < 1E-05; suggestive: P < 1E-04). Both models included a polygenic effect to take into account familial structures. For jumping performances, one suggestive quantitative trait locus (QTL) located on chromosome 1 (BIEC2_31196 and BIEC2_31198) was detected with both models. This QTL explains 0.7% of the phenotypic variance. RYR2, a gene encoding a major calcium channel in cardiac muscle in humans and mice, is located 0.55 Mb from this potential QTL.
Affiliation(s)
- S Brard
- INRA, GenPhySE (Génétique Physiologie et Systèmes d'Elevage), F-31326, Castanet-Tolosan, France; INP, ENSAT, GenPhySE (Génétique Physiologie et Systèmes d'Elevage), Université de Toulouse, F-31326, Castanet-Tolosan, France; INP, ENVT, GenPhySE (Génétique Physiologie et Systèmes d'Elevage), Université de Toulouse, F-31076, Toulouse, France
|
14
|
Rincent R, Moreau L, Monod H, Kuhn E, Melchinger AE, Malvar RA, Moreno-Gonzalez J, Nicolas S, Madur D, Combes V, Dumas F, Altmann T, Brunel D, Ouzunova M, Flament P, Dubreuil P, Charcosset A, Mary-Huard T. Recovering power in association mapping panels with variable levels of linkage disequilibrium. Genetics 2014; 197:375-87. [PMID: 24532779 PMCID: PMC4012494 DOI: 10.1534/genetics.113.159731] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 11/15/2013] [Accepted: 02/09/2014] [Indexed: 11/18/2022] Open
Abstract
Association mapping has permitted the discovery of major QTL in many species. Because it is applied to existing populations, it is generally necessary to account for structure and relatedness among individuals in the statistical model to control false positives. We analytically studied power in association studies by computing the noncentrality parameter of the tests and its relationship with parameters characterizing diversity (genetic differentiation between groups and allele frequencies) and kinship between individuals. Investigation of three different maize diversity panels genotyped with the 50k SNP array highlighted contrasting average power among panels and revealed gaps in the power of classical mixed models in regions with high linkage disequilibrium (LD). These gaps could be related to the fact that markers are used both for testing association and for estimating relatedness. We thus considered two alternative approaches to estimating the kinship matrix to recover power in regions of high LD. In the first, we estimated the kinship with all the markers that are not located on the same chromosome as the tested SNP. In the second, correlation between markers was taken into account to weight the contribution of each marker to the kinship. Simulations revealed that these two approaches controlled false positives efficiently and were more powerful than classical models.
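The first approach in this abstract (a kinship matrix built only from markers on chromosomes other than that of the tested SNP) can be illustrated with a minimal Python sketch. The kinship estimator below is a VanRaden-style genomic relationship matrix, which is an assumption made for illustration and is not stated in the abstract. The function names `vanraden_kinship` and `loco_kinships` and the toy genotypes are likewise hypothetical.

```python
import numpy as np


def vanraden_kinship(genotypes):
    """Genomic relationship matrix from an (individuals x markers) matrix of 0/1/2 allele counts.

    Assumed estimator (not taken from the abstract): G = Z Z' / (2 * sum(p * (1 - p))),
    where Z holds allele counts centered by twice the sample allele frequency p.
    """
    freqs = genotypes.mean(axis=0) / 2.0
    centered = genotypes - 2.0 * freqs
    scale = 2.0 * np.sum(freqs * (1.0 - freqs))
    return centered @ centered.T / scale


def loco_kinships(genotypes, chromosomes):
    """Return one kinship matrix per chromosome, built from markers on all OTHER chromosomes."""
    chromosomes = np.asarray(chromosomes)
    return {
        chrom.item(): vanraden_kinship(genotypes[:, chromosomes != chrom])
        for chrom in np.unique(chromosomes)
    }


# Toy data (hypothetical): 6 individuals, 30 markers spread evenly over 3 chromosomes.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(6, 30)).astype(float)
chromosomes = np.repeat([1, 2, 3], 10)

kinships = loco_kinships(genotypes, chromosomes)
# When testing a SNP on chromosome 1, the mixed model would use kinships[1],
# which was computed without any chromosome 1 marker.
```

In this sketch the tested marker never contributes to the relatedness estimate used in its own test, which is the property the abstract associates with recovering power in high-LD regions.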
Affiliation(s)
- Renaud Rincent
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Biogemma, Genetics and Genomics in Cereals, 63720 Chappes, France
- Kleinwanzlebener Saatzucht Saat AG, 37555 Einbeck, Germany
- Limagrain, site d’Ulice, BP173, 63204 Riom Cedex, France
- Laurence Moreau
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Hervé Monod
- Institut National de la Recherche Agronomique, Unité de Mathématique et Informatique Appliquées, 78352 Jouy-en-Josas, France
- Estelle Kuhn
- Institut National de la Recherche Agronomique, Unité de Mathématique et Informatique Appliquées, 78352 Jouy-en-Josas, France
- Albrecht E. Melchinger
- Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70599, Stuttgart, Germany
- Rosa A. Malvar
- Misión Biológica de Galicia, Spanish National Research Council, 36080 Pontevedra, Spain
- Stéphane Nicolas
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Delphine Madur
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Valérie Combes
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Fabrice Dumas
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Thomas Altmann
- Max-Planck Institute for Molecular Plant Physiology, 14476 Potsdam-Golm and Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Gatersleben, Germany
- Dominique Brunel
- Institut National de la Recherche Agronomique, Etude du Polymorphisme des Génomes Végétaux, Commissariat à l'Energie Atomique Institut de Génomique, Centre National de Génotypage, 91057 Evry, France
- Pascal Flament
- Limagrain, site d’Ulice, BP173, 63204 Riom Cedex, France
- Pierre Dubreuil
- Biogemma, Genetics and Genomics in Cereals, 63720 Chappes, France
- Alain Charcosset
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Tristan Mary-Huard
- Unité Mixte de Recherche de Génétique Végétale, Institut National de la Recherche Agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, 91190 Gif-sur-Yvette, France
- Institut National de la Recherche Agronomique/AgroParisTech, Unité Mixte de Recherche 518, 75231, Paris, France
|