1
An Efficient Genome-Wide Multilocus Epistasis Search. Genetics 2015; 201:865-70. [PMID: 26405029 DOI: 10.1534/genetics.115.182444] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 12/08/2014] [Accepted: 09/15/2015] [Indexed: 01/04/2023]
Abstract
There has been a continuing interest in approaches that analyze pairwise locus-by-locus (epistasis) interactions using multilocus association models in genome-wide data sets. In this paper, we suggest an approach that uses sure independence screening to first lower the dimension of the problem by considering the marginal importance of each interaction term within the huge loop. Subsequent multilocus association steps are executed using an extended Bayesian least absolute shrinkage and selection operator (LASSO) model and fast generalized expectation-maximization estimation algorithms. The potential of this approach is illustrated and compared with PLINK software using data examples where phenotypes have been simulated conditionally on marker data from the Quantitative Trait Loci Mapping and Marker Assisted Selection (QTLMAS) Workshop 2008 and real pig data sets.
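The screening step described in this abstract can be illustrated with a minimal sketch. It assumes that marginal importance is measured by the absolute Pearson correlation between each pairwise product term and the phenotype; the abstract does not specify the statistic, so the function name `sis_pairwise`, the `keep` parameter, and the simulated data are all hypothetical.

```python
import numpy as np
from itertools import combinations

def sis_pairwise(X, y, keep):
    """Rank pairwise interaction terms x_j * x_k by their absolute
    marginal correlation with y and return the `keep` best pairs."""
    yc = y - y.mean()
    scores = []
    for j, k in combinations(range(X.shape[1]), 2):
        z = X[:, j] * X[:, k]              # interaction term for loci j and k
        zc = z - z.mean()
        denom = np.sqrt((zc @ zc) * (yc @ yc))
        r = (zc @ yc) / denom if denom > 0 else 0.0
        scores.append((abs(r), (j, k)))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [pair for _, pair in scores[:keep]]

# Hypothetical data: 500 individuals, 20 markers coded 0/1/2,
# with one simulated interaction between markers 3 and 7.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(500, 20)).astype(float)
y = 2.0 * X[:, 3] * X[:, 7] + rng.normal(size=500)
selected = sis_pairwise(X, y, keep=5)
```

Only the retained pairs would then be passed on to a multilocus model, which is what makes the subsequent estimation step tractable.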
2
Why breeding values estimated using familial data should not be used for genome-wide association studies. G3: Genes, Genomes, Genetics 2014; 4:341-7. [PMID: 24362310 PMCID: PMC3931567 DOI: 10.1534/g3.113.008706] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 11/21/2022]
Abstract
In animal breeding, the genetic potential of an animal is summarized as its estimated breeding value, which is derived from its own performance as well as the performance of related individuals. Here, we illustrate why estimated breeding values are not suitable as a phenotype for genome-wide association studies. We simulated human-type and pig-type pedigrees with a range of quantitative trait loci (QTL) effects (0.5–3% of phenotypic variance) and heritabilities (0.3–0.8). We analyzed 1000 replicates of each scenario with four models: (a) a full mixed model including a polygenic effect, (b) a regression analysis using the residual of a mixed model as a trait score (the so-called GRAMMAR approach), (c) a regression analysis using the estimated breeding value as a trait score, and (d) a regression analysis that uses the raw phenotype as a trait score. We show that using breeding values as a trait score gives very high false-positive rates (up to 14% in human pedigrees and >60% in pig pedigrees). Simulations based on a real pedigree show that additional generations of pedigree increase the type I error. Including the family relationship as a random effect provides the greatest power to detect QTL while controlling for type I error at the desired level and providing the most accurate estimates of the QTL effect. Both the use of residuals and the use of breeding values result in deflated estimates of the QTL effect. We derive the contributions of QTL effects to the breeding value and residual and show how this affects the estimates.
3
Knürr T, Läärä E, Sillanpää MJ. Impact of prior specifications in a shrinkage-inducing Bayesian model for quantitative trait mapping and genomic prediction. Genet Sel Evol 2013; 45:24. [PMID: 23834140 PMCID: PMC3750442 DOI: 10.1186/1297-9686-45-24] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/12/2012] [Accepted: 06/10/2013] [Indexed: 01/24/2023]
Abstract
BACKGROUND In quantitative trait mapping and genomic prediction, Bayesian variable selection methods have gained popularity in conjunction with the increase in marker data and computational resources. Whereas shrinkage-inducing methods are common tools in genomic prediction, rigorous decision making in mapping studies using such models is not well established and the robustness of posterior results is subject to misspecified assumptions because of weak biological prior evidence. METHODS Here, we evaluate the impact of prior specifications in a shrinkage-based Bayesian variable selection method which is based on a mixture of uniform priors applied to genetic marker effects that we presented in a previous study. Unlike most other shrinkage approaches, the use of a mixture of uniform priors provides a coherent framework for inference based on Bayes factors. To evaluate the robustness of genetic association under varying prior specifications, Bayes factors are compared as signals of positive marker association, whereas genomic estimated breeding values are considered for genomic selection. The impact of specific prior specifications is reduced by calculation of combined estimates from multiple specifications. A Gibbs sampler is used to perform Markov chain Monte Carlo estimation (MCMC) and a generalized expectation-maximization algorithm as a faster alternative for maximum a posteriori point estimation. The performance of the method is evaluated by using two publicly available data examples: the simulated QTLMAS XII data set and a real data set from a population of pigs. RESULTS Combined estimates of Bayes factors were very successful in identifying quantitative trait loci, and the ranking of Bayes factors was fairly stable among markers with positive signals of association under varying prior assumptions, but their magnitudes varied considerably. 
Genomic estimated breeding values using the mixture of uniform priors compared well to other approaches for both data sets and loss of accuracy with the generalized expectation-maximization algorithm was small as compared to that with MCMC. CONCLUSIONS Since no error-free method to specify priors is available for complex biological phenomena, exploring a wide variety of prior specifications and combining results provides some solution to this problem. For this purpose, the mixture of uniform priors approach is especially suitable, because it comprises a wide and flexible family of distributions and computationally intensive estimation can be carried out in a reasonable amount of time.
Affiliation(s)
- Timo Knürr
- Department of Mathematics and Statistics, P.O. Box 68, University of Helsinki, Helsinki, FIN-00014, Finland
- Esa Läärä
- Department of Mathematical Sciences/Statistics, P.O. Box 3000, University of Oulu, Oulu, FIN-90014, Finland
- Mikko J Sillanpää
- Department of Mathematics and Statistics, P.O. Box 68, University of Helsinki, Helsinki, FIN-00014, Finland
- Department of Mathematical Sciences/Statistics, P.O. Box 3000, University of Oulu, Oulu, FIN-90014, Finland
- Department of Biology and Biocenter Oulu, P.O. Box 3000, University of Oulu, Oulu, FIN-90014, Finland
- Department of Agricultural Sciences, P.O. Box 27, University of Helsinki, Helsinki, FIN-00014, Finland
4
Kileh-Wais M, Elsen JM, Vignal A, Feves K, Vignoles F, Fernandez X, Manse H, Davail S, André JM, Bastianelli D, Bonnal L, Filangi O, Baéza E, Guéméné D, Genêt C, Bernadet MD, Dubos F, Marie-Etancelin C. Detection of QTL controlling metabolism, meat quality, and liver quality traits of the overfed interspecific hybrid mule duck. J Anim Sci 2012; 91:588-604. [PMID: 23148259 DOI: 10.2527/jas.2012-5411] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
The mule duck, an interspecific hybrid obtained by crossing common duck (Anas platyrhynchos) females with Muscovy (Cairina moschata) drakes, is widely used for fatty liver production. The purpose of the present study was to detect and map single and pleiotropic QTL that segregate in the common duck species, and influence the expression of traits in their overfed mule duck offspring. To this end, we generated a common duck backcross (BC) population by crossing Kaiya and heavy Pekin experimental lines, which differ notably in regard to the BW and overfeeding ability of their mule progeny. The BC females were mated to Muscovy drakes and, on average, 4 male mule ducks hatched per BC female (1600 in total) and were measured for growth, metabolism during growth and the overfeeding period, overfeeding ability, and the quality of their breast meat and fatty liver. The phenotypic value of BC females was estimated for each trait by assigning to each female the mean value of the phenotypes of her offspring. Estimations allowed for variance, which depended on the number of male offspring per BC and the heritability of the trait considered. The genetic map used for QTL detection consisted of 91 microsatellite markers aggregated into 16 linkage groups (LG) covering a total of 778 cM. Twenty-two QTL were found to be significant at the 1% chromosome-wide threshold level using the single-trait detection option of the QTLMap software. Most of the QTL detected were related to the quality of breast meat and fatty liver: QTL for meat pH 20 min post mortem were mapped to LG4 (at the 1% genome-wide significance level), and QTL for meat lipid content and cooking losses were mapped to LG2a. The QTL related to fatty liver weight and liver protein and lipid content were for the most part detected on LG2c and LG9. Multitrait analysis highlighted the pleiotropic effects of QTL in these chromosome regions. 
Apart from the strong QTL for plasma triglyceride content at the end of the overfeeding period mapped to chromosome Z using single-trait analysis, all metabolic trait QTL were detected with the multitrait approach: the QTL mapped to LG14 and LG21 affected the plasma cholesterol and triglyceride contents, whereas the QTL mapped to LG2a seemed to impact glycemia and the basal plasma corticosterone content. A greater density genetic map will be needed to further fine map the QTL.
Affiliation(s)
- M Kileh-Wais
- Institut National de la Recherche Agronomique, SAGA Station d'Amélioration Génétique des Animaux, UR631, 31 326 Castanet Tolosan, France
5
Kärkkäinen HP, Sillanpää MJ. Robustness of Bayesian multilocus association models to cryptic relatedness. Ann Hum Genet 2012; 76:510-23. [PMID: 22971009 DOI: 10.1111/j.1469-1809.2012.00729.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 01/16/2023]
Abstract
Population-based association analyses are more powerful than within-family analyses in identifying genetic loci associated with a phenotype of interest. However, if the population or sample structure is omitted from the model, population stratification and cryptic relatedness may lead to false positive and negative signals caused by relatedness between individuals, rather than association due to close linkage of the marker and the trait loci. Therefore it is important to correct or account for these confounders in population-based association analyses. However, there is cumulative evidence that when fitting a multilocus association model, the genetic relationships between the individuals can be captured by the markers themselves, bringing about a possibility to use the models without an additional correction for the population or sample structure. In this work we have further investigated this possibility in the Bayesian multilocus association model context using the extended Bayesian LASSO and the indicator-based variable selection. In particular, we have studied whether these multilocus models benefit from an insertion of an additional polygenic term representing the genetic variation not captured by the markers and taking account of the residual dependencies between the individuals. We have found that although the models may benefit from the insertion of the polygenic component, omitting the component does not damage the model performance severely.
Affiliation(s)
- Hanni P Kärkkäinen
- Department of Agricultural Sciences, University of Helsinki, Helsinki FIN-00014, Finland
6
Li Z, Sillanpää MJ. Overview of LASSO-related penalized regression methods for quantitative trait mapping and genomic selection. Theor Appl Genet 2012; 125:419-435. [PMID: 22622521 DOI: 10.1007/s00122-012-1892-9] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.0] [Received: 11/14/2011] [Accepted: 04/27/2012] [Indexed: 06/01/2023]
Abstract
Quantitative trait loci (QTL)/association mapping aims at finding genomic loci associated with the phenotypes, whereas genomic selection focuses on breeding value prediction based on genomic data. Variable selection is key to both of these tasks, as it makes it possible to (1) detect clear mapping signals of QTL activity, and (2) predict the genome-enhanced breeding values accurately. In this paper, we provide an overview of a statistical method called least absolute shrinkage and selection operator (LASSO) and two of its generalizations, the elastic net and the adaptive LASSO, in the contexts of QTL mapping and genomic breeding value prediction in plants (or animals). We also briefly summarize the Bayesian interpretation of LASSO and the hierarchical Bayesian models it inspired. We illustrate the implementation and examine the performance of the methods using three public data sets: (1) North American barley data with 127 individuals and 145 markers, (2) the simulated QTLMAS XII data with 5,865 individuals and 6,000 markers for both QTL mapping and genomic selection, and (3) wheat data with 599 individuals and 1,279 markers for genomic selection only.
Affiliation(s)
- Zitong Li
- Department of Mathematics and Statistics, University of Helsinki, PO Box 68, 00014, Helsinki, Finland
7
Sahana G, Mailund T, Lund MS, Guldbrandtsen B. Local genealogies in a linear mixed model for genome-wide association mapping in complex pedigreed populations. PLoS One 2011; 6:e27061. [PMID: 22073255 PMCID: PMC3206889 DOI: 10.1371/journal.pone.0027061] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/28/2011] [Accepted: 10/10/2011] [Indexed: 11/18/2022]
Abstract
INTRODUCTION The state-of-the-art for dealing with multiple levels of relationship among the samples in genome-wide association studies (GWAS) is unified mixed model analysis (MMA). This approach is very flexible, can be applied to both family-based and population-based samples, and can be extended to incorporate other effects in a straightforward and rigorous fashion. Here, we present a complementary approach, called 'GENMIX (genealogy based mixed model)' which combines advantages from two powerful GWAS methods: genealogy-based haplotype grouping and MMA. SUBJECTS AND METHODS We validated GENMIX using genotyping data of Danish Jersey cattle and simulated phenotype and compared to the MMA. We simulated scenarios for three levels of heritability (0.21, 0.34, and 0.64), seven levels of MAF (0.05, 0.10, 0.15, 0.20, 0.25, 0.35, and 0.45) and five levels of QTL effect (0.1, 0.2, 0.5, 0.7 and 1.0 in phenotypic standard deviation unit). Each of these 105 possible combinations (3 h² × 7 MAF × 5 effects) of scenarios was replicated 25 times. RESULTS GENMIX provides a better ranking of markers close to the causative locus' location. GENMIX outperformed MMA when the QTL effect was small and the MAF at the QTL was low. In scenarios where MAF was high or the QTL affecting the trait had a large effect both GENMIX and MMA performed similarly. CONCLUSION In discovery studies, where high-ranking markers are identified and later examined in validation studies, we therefore expect GENMIX to enrich candidates brought to follow-up studies with true positives over false positives more than the MMA would.
Affiliation(s)
- Goutam Sahana
- Department of Molecular Biology and Genetics, Faculty of Science and Technology, Aarhus University, Tjele, Denmark.
8
Strandén I, Christensen OF. Allele coding in genomic evaluation. Genet Sel Evol 2011; 43:25. [PMID: 21703021 PMCID: PMC3154140 DOI: 10.1186/1297-9686-43-25] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Received: 02/14/2011] [Accepted: 06/26/2011] [Indexed: 12/03/2022]
Abstract
Background Genomic data are used in animal breeding to assist genetic evaluation. Several models to estimate genomic breeding values have been studied. In general, two approaches have been used. One approach estimates the marker effects first and then, genomic breeding values are obtained by summing marker effects. In the second approach, genomic breeding values are estimated directly using an equivalent model with a genomic relationship matrix. Allele coding is the method chosen to assign values to the regression coefficients in the statistical model. A common allele coding is zero for the homozygous genotype of the first allele, one for the heterozygote, and two for the homozygous genotype for the other allele. Another common allele coding changes these regression coefficients by subtracting a value from each marker such that the mean of regression coefficients is zero within each marker. We call this centered allele coding. This study considered effects of different allele coding methods on inference. Both marker-based and equivalent models were considered, and restricted maximum likelihood and Bayesian methods were used in inference. Results Theoretical derivations showed that parameter estimates and estimated marker effects in marker-based models are the same irrespective of the allele coding, provided that the model has a fixed general mean. For the equivalent models, the same results hold, even though different allele coding methods lead to different genomic relationship matrices. Calculated genomic breeding values are independent of allele coding when the estimate of the general mean is included into the values. Reliabilities of estimated genomic breeding values calculated using elements of the inverse of the coefficient matrix depend on the allele coding because different allele coding methods imply different models. Finally, allele coding affects the mixing of Markov chain Monte Carlo algorithms, with the centered coding being the best. 
Conclusions Different allele coding methods lead to the same inference in the marker-based and equivalent models when a fixed general mean is included in the model. However, reliabilities of genomic breeding values are affected by the allele coding method used. The centered coding has some numerical advantages when Markov chain Monte Carlo methods are used.
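The invariance result stated in this abstract can be checked numerically in the simplest setting. The sketch below is an assumption-laden illustration, not the paper's analysis: it uses ordinary least squares with an intercept instead of restricted maximum likelihood or Bayesian inference, and the data, effect sizes, and the helper `fit` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 5
M = rng.integers(0, 3, size=(n, m)).astype(float)   # 0/1/2 allele coding
true_effects = np.array([0.5, -0.3, 0.0, 0.8, 0.1])  # hypothetical marker effects
y = 10.0 + M @ true_effects + rng.normal(size=n)

def fit(markers, y):
    """Least-squares fit with a fixed general mean (intercept column)."""
    design = np.column_stack([np.ones(len(y)), markers])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

mean_raw, effects_raw = fit(M, y)                    # 0/1/2 coding
mean_ctr, effects_ctr = fit(M - M.mean(axis=0), y)   # centered coding

# Fitted values: general mean plus summed marker effects under each coding.
fitted_raw = mean_raw + M @ effects_raw
fitted_ctr = mean_ctr + (M - M.mean(axis=0)) @ effects_ctr
```

In this setting the estimated marker effects and the fitted values agree between the two codings, while the estimated general mean differs. That is why the values only coincide when the general mean is included in them.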
Affiliation(s)
- Ismo Strandén
- Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, Finland.
9
Shepherd RK, Meuwissen THE, Woolliams JA. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers. BMC Bioinformatics 2010; 11:529. [PMID: 20969788 PMCID: PMC3098088 DOI: 10.1186/1471-2105-11-529] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 06/04/2010] [Accepted: 10/22/2010] [Indexed: 12/19/2022]
Abstract
Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time.
Affiliation(s)
- Ross K Shepherd
- School of Information and Communication Technology, CQUniversity, Rockhampton 4702, Australia.
10
Erbe M, Ytournel F, Pimentel E, Sharifi A, Simianer H. Power and robustness of three whole genome association mapping approaches in selected populations. J Anim Breed Genet 2010; 128:3-14. [DOI: 10.1111/j.1439-0388.2010.00885.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
11
Simianer H, Pimentel ECG. Robust QTL fine mapping by applying a quantitative transmission disequilibrium test to the Mendelian sampling term. J Anim Breed Genet 2009; 126:432-42. [PMID: 19912417 DOI: 10.1111/j.1439-0388.2009.00812.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Abstract
In many farm animal populations, high-density single nucleotide polymorphism (SNP) genotypes are becoming available on a large scale, and routine estimation of breeding values is implemented for a multiplicity of traits. We propose to apply the basic principle of the quantitative transmission disequilibrium test (QTDT) to estimated Mendelian sampling terms. A two-step procedure is suggested, where in the first step additive breeding values are estimated with a mixed linear model and the Mendelian sampling terms are calculated from the estimated breeding values. In the second step, the QTDT is applied to these estimated Mendelian sampling terms. The resulting test is expected to yield significant results if the SNP is in sufficient linkage disequilibrium and linkage with quantitative trait loci (QTL). This principle is illustrated with a simulated data set comprising 4665 individuals genotyped for 6000 SNP and 15 true QTL. Thirteen of the fifteen QTL were significant on a genome-wide 0.1% error level. Results for the empirical power are derived from repeated samples of 1000 and 3000 genotyped individuals, respectively. General properties and potential extensions of the methodology are indicated. Owing to its computational simplicity and speed, the suggested procedure is well suited to scan whole genomes with high-density SNP coverage in samples of substantial size and for a multiplicity of different traits.
Affiliation(s)
- H Simianer
- Department of Animal Science, Animal Breeding and Genetics Group, Georg-August-University, Goettingen, Germany.
12
Besnier F, Carlborg O. A genetic algorithm based method for stringent haplotyping of family data. BMC Genet 2009; 10:57. [PMID: 19761594 PMCID: PMC2754495 DOI: 10.1186/1471-2156-10-57] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/13/2009] [Accepted: 09/17/2009] [Indexed: 12/02/2022]
Abstract
Background The linkage phase, or haplotype, is an extra level of information that in addition to genotype and pedigree can be useful for reconstructing the inheritance pattern of the alleles in a pedigree, and computing for example Identity By Descent probabilities. If a haplotype is provided, the precision of estimated IBD probabilities increases, as long as the haplotype is estimated without errors. It is therefore important to only use haplotypes that are strongly supported by the available data for IBD estimation, to avoid introducing new errors due to erroneous linkage phases. Results We propose a genetic algorithm based method for haplotype estimation in family data that includes a stringency parameter. This allows the user to decide the error tolerance level when inferring parental origin of the alleles. This is a novel feature compared to existing methods for haplotype estimation. We show that using a high stringency produces haplotype data with few errors, whereas a low stringency provides haplotype estimates in most situations, but with an increased number of errors. Conclusion By including a stringency criterion in our haplotyping method, the user is able to maintain the error rate at a suitable level for the particular study; one can select anything from haplotyped data with very small proportion of errors and a higher proportion of non-inferred haplotypes, to data with phase estimates for every marker, when haplotype errors are tolerable. Giving this choice makes the method more flexible and useful in a wide range of applications as it is able to fulfil different requirements regarding the tolerance for haplotype errors, or uncertain marker-phases.
Affiliation(s)
- Francois Besnier
- Linnaeus Centre for Bioinformatics, Uppsala University, SE-75124 Uppsala, Sweden.
13
Abstract
Highly recombinant populations derived from inbred lines, such as advanced intercross lines and heterogeneous stocks, can be used to map loci far more accurately than is possible with standard intercrosses. However, the varying degrees of relatedness that exist between individuals complicate analysis, potentially leading to many false positive signals. We describe a method to deal with these problems that does not require pedigree information and accounts for model uncertainty through model averaging. In our method, we select multiple quantitative trait loci (QTL) models using forward selection applied to resampled data sets obtained by nonparametric bootstrapping and subsampling. We provide model-averaged statistics about the probability of loci or of multilocus regions being included in model selection, and this leads to more accurate identification of QTL than by single-locus mapping. The generality of our approach means it can potentially be applied to any population of unknown structure.
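The resampling idea in this abstract can be sketched as follows, under simplifying assumptions that are not taken from the paper: forward selection here adds a fixed number of loci by residual sum of squares instead of using a significance-based stopping rule, only nonparametric bootstrapping is shown (not subsampling), and the function names and simulated data are hypothetical.

```python
import numpy as np

def forward_select(X, y, max_terms):
    """Greedy forward selection: at each step add the column that most
    reduces the residual sum of squares of a least-squares fit."""
    n, p = X.shape
    chosen = []
    for _ in range(max_terms):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in chosen:
                continue
            design = np.column_stack([np.ones(n), X[:, chosen + [j]]])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
    return chosen

def inclusion_proportion(X, y, max_terms, n_boot, seed=0):
    """Proportion of bootstrap resamples in which each locus is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample individuals with replacement
        for j in forward_select(X[idx], y[idx], max_terms):
            counts[j] += 1
    return counts / n_boot

# Hypothetical data: 300 individuals, 15 loci, two simulated QTL (loci 2 and 9).
rng = np.random.default_rng(2)
X = rng.integers(0, 3, size=(300, 15)).astype(float)
y = 1.5 * X[:, 2] - 1.0 * X[:, 9] + rng.normal(size=300)
props = inclusion_proportion(X, y, max_terms=2, n_boot=50)
```

The resulting proportions play the role of the model-averaged inclusion statistics mentioned in the abstract: loci selected in nearly every resample are better supported than loci selected only occasionally.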
14
Bink MCAM, van Eeuwijk FA. A Bayesian QTL linkage analysis of the common dataset from the 12th QTLMAS workshop. BMC Proc 2009; 3 Suppl 1:S4. [PMID: 19278543 PMCID: PMC2654498 DOI: 10.1186/1753-6561-3-s1-s4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
Background To compare the power of various QTL mapping methodologies, a dataset was simulated within the framework of 12th QTLMAS workshop. A total of 5865 diploid individuals was simulated, spanning seven generations, with known pedigree. Individuals were genotyped for 6000 SNPs across six chromosomes. We present an illustration of a Bayesian QTL linkage analysis, as implemented in the special purpose software FlexQTL. Most importantly, we treated the number of bi-allelic QTL as a random variable and used Bayes Factors to infer plausible QTL models. We investigated the power of our analysis in relation to the number of phenotyped individuals and SNPs. Results We report clear posterior evidence for 12 QTL that jointly explained 30% of the phenotypic variance, which was very close to the total of included simulation effects, when using all phenotypes and a set of 600 SNPs. Decreasing the number of phenotyped individuals from 4665 to 1665 and/or the number of SNPs in the analysis from 600 to 120 dramatically reduced the power to identify and locate QTL. Posterior estimates of genome-wide breeding values for a small set of individuals were given. Conclusion We presented a successful Bayesian linkage analysis of a simulated dataset with a pedigree spanning several generations. Our analysis identified all regions that contained QTL with effects explaining more than one percent of the phenotypic variance. We showed how the results of a Bayesian QTL mapping can be used in genomic prediction.
Affiliation(s)
- Marco C A M Bink
- Biometris, Wageningen University & Research centre, Bornsesteeg 47, 6708 PD, Wageningen, Netherlands.
15
Nettelblad C, Holmgren S, Crooks L, Carlborg Ö. cnF2freq: Efficient Determination of Genotype and Haplotype Probabilities in Outbred Populations Using Markov Models. Bioinformatics and Computational Biology 2009. [DOI: 10.1007/978-3-642-00727-9_29] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/17/2022]