1
|
QTL detection from regression analysis of 'generalized de-regressed proof' information. J Anim Breed Genet 2012; 129:336-42. [PMID: 22775266 DOI: 10.1111/j.1439-0388.2011.00981.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Abstract
QTL detection by regressing phenotypes on transmission probabilities is widely used when large families are available. In three-generation designs, the use of a 'de-regressed proof' as the phenotype to be analysed was proposed by Weller et al. (1990) and Tribout et al. (2008). Our work generalizes this approach. We describe a score (which we define as a 'generalized de-regressed proof') that combines performance phenotypes recorded in multigenerational offspring of genotyped individuals. Estimation of the QTL effect on this score with a simple regression is unbiased. The link between this score and the BLUP animal model of the polygenic effect is demonstrated. The theory is developed and two simple examples illustrate how this technique can be implemented.
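The regression step named in this abstract can be sketched as follows. This is a minimal illustration only: the variable names (`transmission_prob`, `score`) and the toy data are assumptions, and the sketch does not compute the 'generalized de-regressed proof' itself, only a simple regression of a score on the probability that an offspring inherited a given sire allele.

```python
def simple_regression(x, y):
    """Ordinary least squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: probability that each offspring received the sire's "Q" allele
transmission_prob = [0.05, 0.10, 0.50, 0.90, 0.95, 0.20, 0.80]
# Hypothetical scores built as 1.0 + 2.0 * p, so the regression recovers them exactly
score = [1.0 + 2.0 * p for p in transmission_prob]

intercept, qtl_effect = simple_regression(transmission_prob, score)
```

The slope plays the role of the estimated QTL substitution effect within the family; with real data the scores would carry residual noise and the estimate would have a standard error.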
|
2
|
QTL mapping in outbred half-sib families using Bayesian model selection. Heredity (Edinb) 2011; 107:265-76. [PMID: 21487433 DOI: 10.1038/hdy.2011.15] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/26/2023] Open
Abstract
In this article, we propose a model selection method, the Bayesian composite model space approach, to map quantitative trait loci (QTL) in a half-sib population for continuous and binary traits. In our method, the identity-by-descent-based variance component model is used. To demonstrate the performance of this model, the method was applied to map QTL underlying production traits on BTA6 in a Chinese half-sib dairy cattle population. A total of four QTL were detected, whereas only one QTL was identified using the traditional least squares (LS) method. We also conducted two simulation experiments to validate the efficiency of our method. The results suggest that the proposed method, based on a multiple-QTL model, is efficient in mapping multiple QTL in an outbred half-sib population and is more powerful than the LS method based on a single-QTL model.
|
3
|
Genome-wide evaluation for quantitative trait loci under the variance component model. Genetica 2010; 138:1099-109. [PMID: 20835884 PMCID: PMC2948655 DOI: 10.1007/s10709-010-9497-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/26/2010] [Accepted: 09/01/2010] [Indexed: 12/02/2022]
Abstract
The identity-by-descent (IBD) based variance component analysis is an important method for mapping quantitative trait loci (QTL) in outbred populations. The interval-mapping approach and its various modified versions may have limited use in evaluating the genetic variances of the entire genome because they require evaluation of multiple models and model selection. In this study, we developed a multiple variance component model for genome-wide evaluation using both the maximum likelihood (ML) method and the MCMC-implemented Bayesian method. We placed one QTL every few cM across the entire genome and estimated the QTL variances and positions simultaneously in a single model. Genomic regions that have no QTL usually showed no evidence of QTL, while regions with large QTL always showed strong evidence of QTL. While the Bayesian method produced the optimal result, the ML method is computationally more efficient. Simulation experiments were conducted to demonstrate the efficacy of the new methods.
|
4
|
Bayesian structural equation models for inferring relationships between phenotypes: a review of methodology, identifiability, and applications. J Anim Breed Genet 2010; 127:3-15. [DOI: 10.1111/j.1439-0388.2009.00835.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 12/01/2022]
|
5
|
Bayesian model averaging for evaluation of candidate gene effects. Genetica 2010; 138:395-407. [PMID: 20049510 DOI: 10.1007/s10709-009-9433-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 07/27/2009] [Accepted: 12/16/2009] [Indexed: 10/20/2022]
Abstract
Statistical assessment of candidate gene effects can be viewed as a problem of variable selection and model comparison. Given a certain number of genes to be considered, many possible models may fit the data well, each including a specific set of gene effects and possibly their interactions. The question arises as to which of these models is most plausible. Inference about candidate gene effects based on a specific model ignores uncertainty about model choice. Here, a Bayesian model averaging approach is proposed for evaluation of candidate gene effects. The method is implemented through simultaneous sampling of multiple models. By averaging over a set of competing models, the Bayesian model averaging approach incorporates model uncertainty into inferences about candidate gene effects. Features of the method are demonstrated using a simulated data set with ten candidate genes under consideration.
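The idea of averaging over competing models can be illustrated with a small sketch. Note the substitution: the abstract describes simultaneous sampling of multiple models, whereas the code below enumerates every subset of three hypothetical candidate genes and weights each model by a BIC approximation to its posterior probability. The gene codings, the noise values and all function names are assumptions made for illustration.

```python
import itertools
import math

def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gaussian elimination with pivoting
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(k + 1, p):
            factor = A[r][k] / A[k][k]
            for c in range(k, p + 1):
                A[r][c] -= factor * A[k][c]
    beta = [0.0] * p
    for k in reversed(range(p)):
        beta[k] = (A[k][p] - sum(A[k][c] * beta[c] for c in range(k + 1, p))) / A[k][k]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def inclusion_probabilities(genes, y):
    """Approximate posterior inclusion probability of each gene from BIC model weights."""
    n, m = len(y), len(genes)
    models = [s for k in range(m + 1) for s in itertools.combinations(range(m), k)]
    bic = [n * math.log(ols_rss([genes[j] for j in s], y) / n) + (len(s) + 1) * math.log(n)
           for s in models]
    best = min(bic)
    weights = [math.exp(-0.5 * (value - best)) for value in bic]
    total = sum(weights)
    return [sum(w for w, s in zip(weights, models) if j in s) / total for j in range(m)]

# Hypothetical genotypes (0/1/2 allele counts) for 12 individuals at three candidate genes
gene_a = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
gene_b = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
gene_c = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, -0.15, 0.05, -0.1, 0.1, 0.05, -0.05]
# Only gene_a has a real effect in this toy data set
phenotype = [3.0 * g + e for g, e in zip(gene_a, noise)]

probs = inclusion_probabilities([gene_a, gene_b, gene_c], phenotype)
```

Each entry of `probs` sums the weights of all models containing that gene, so the evidence for a gene no longer depends on committing to one selected model.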
|
6
|
Bayesian multilocus association mapping on ordinal and censored traits and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 2009; 118:865-880. [PMID: 19132337 DOI: 10.1007/s00122-008-0945-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 09/18/2008] [Accepted: 12/02/2008] [Indexed: 05/27/2023]
Abstract
Association mapping can be a powerful tool for detecting quantitative trait loci (QTLs) without requiring line-crossing experiments. We previously proposed a Bayesian approach for simultaneously mapping multiple QTLs by a regression method that directly incorporates estimates of the population structure. In the present study, we extended our method to analyze ordinal and censored traits, since both types of traits are common in the evaluation of germplasm collections. Ordinal-probit and tobit models were employed to analyze ordinal and censored traits, respectively. In both models, we postulated the existence of a latent continuous variable associated with the observable data, and we used a Markov-chain Monte Carlo algorithm to sample the latent variable and determine the model parameters. We evaluated the efficiency of our approach by using simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our models could reduce both false-positive and false-negative rates in detecting QTLs to reasonable levels. Simulation analyses based on highly polymorphic marker data, which were generated by coalescent simulations, showed that our models could be applied to genotype data based on highly polymorphic marker systems, like simple sequence repeats. For the real traits, we analyzed heading date as a censored trait and amylose content and the shape of milled rice grains as ordinal traits. We found significant markers that may be linked to previously reported QTLs. Our approach will be useful for whole-genome association mapping of ordinal and censored traits in rice germplasm collections.
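The latent-variable step for a censored trait can be sketched as below. This is an assumption-laden illustration rather than the authors' sampler: it shows only how a latent value for a right-censored record could be drawn from a normal distribution truncated above the censoring point, using the inverse-CDF method. The mean, standard deviation, censoring day and heading dates are hypothetical.

```python
import random
from statistics import NormalDist

def sample_censored_latent(mean, sd, censor_point, rng):
    """Draw a latent value from N(mean, sd^2) truncated to (censor_point, +inf)."""
    std_normal = NormalDist()
    lower = std_normal.cdf((censor_point - mean) / sd)
    u = lower + (1.0 - lower) * rng.random()
    # Keep the argument of inv_cdf strictly inside (0, 1)
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + sd * std_normal.inv_cdf(u)

rng = random.Random(1)
# Hypothetical heading dates; None marks a line that had not headed when scoring stopped
observed = [98.0, 105.0, None, 110.0, None]
censor_day = 120.0

# Observed records are kept; censored records are replaced by a latent draw
latent = [y if y is not None else sample_censored_latent(108.0, 8.0, censor_day, rng)
          for y in observed]
```

In a full MCMC scheme this draw would be repeated at every iteration, with the mean for each line taken from the current values of the model parameters.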
|
7
|
Bayesian multilocus association mapping on ordinal and censored traits and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 2009. [PMID: 19132337 DOI: 10.1007/s00122-008-0945-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/08/2023]
|
8
|
Hierarchical modeling of clinical and expression quantitative trait loci. Heredity (Edinb) 2008; 101:271-84. [DOI: 10.1038/hdy.2008.58] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022] Open
|
9
|
Bayesian joint mapping of quantitative trait loci for Gaussian and categorical characters in line crosses. Genetica 2008; 135:367-77. [DOI: 10.1007/s10709-008-9283-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/08/2008] [Accepted: 06/03/2008] [Indexed: 11/29/2022]
|
10
|
A Bayesian method for simultaneously detecting Mendelian and imprinted quantitative trait loci in experimental crosses of outbred species. Genetics 2008; 178:527-38. [PMID: 18202392 DOI: 10.1534/genetics.107.081521] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open
Abstract
Genomic imprinting is a phenomenon in which some genes inherited from one parent are not completely expressed because of modification of the genome during gametogenesis. Consequently, the expression level of an allele at an imprinted gene depends on its parental origin, which is referred to as the parent-of-origin effect. In livestock, some QTL for reproductive performance and meat productivity have been reported to be imprinted. So far, methods for detecting imprinted QTL have been based on interval mapping, where only a single QTL is tested at a time. In this study, we developed a Bayesian method for simultaneously mapping multiple QTL that allows inference about the expression modes of QTL in an outbred F2 family. Inference about whether a QTL is Mendelian or imprinted was made using Markov chain Monte Carlo estimation by comparing the goodness of fit between models that assume the presence or the absence of a parent-of-origin effect at a QTL. We showed by analyses of simulated data sets that the Bayesian method can effectively detect both Mendelian QTL and imprinted QTL.
|
11
|
Bayesian analysis of genetic architecture of quantitative trait using data of crosses of multiple inbred lines. Genetica 2008; 134:367-75. [PMID: 18278559 DOI: 10.1007/s10709-008-9244-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/10/2007] [Accepted: 02/05/2008] [Indexed: 10/22/2022]
Abstract
Using data from crosses of multiple inbred lines to map QTL can increase detection power compared with using a single cross of two inbred lines. Although many fixed-effect model methods have been proposed to analyze such data, they are largely based on a one-QTL model or a main-effect model, and the interaction effects between QTL are always neglected. However, effectively separating the interaction effects from the residual error can increase statistical power. In this article, we extended both the Bayesian model selection method and the Bayesian shrinkage estimation approach to multiple inbred line crosses. With these two extensions, interacting QTL are effectively detected with high resolution; in addition, the posterior variances for both main effects and interaction effects are subjected to fully Bayesian estimation, which is preferable to the two-step approach involved in maximum likelihood. A series of simulation experiments was conducted to demonstrate the performance of the methods. The computer program, written in FORTRAN, is freely available on request.
|
12
|
Inclusive composite interval mapping (ICIM) for digenic epistasis of quantitative traits in biparental populations. Theor Appl Genet 2008; 116:243-60. [PMID: 17985112 DOI: 10.1007/s00122-007-0663-5] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Received: 05/23/2007] [Accepted: 10/02/2007] [Indexed: 05/18/2023]
Abstract
It has long been recognized that epistasis, or interaction between non-allelic genes, plays an important role in the genetic control and evolution of quantitative traits. However, the detection of epistasis and estimation of epistatic effects are difficult due to the complexity of epistatic patterns, insufficient sample size of mapping populations and lack of efficient statistical methods. Under the assumption of additivity of QTL effects on the phenotype of a trait of interest, the additive effect of a QTL can be completely absorbed by the flanking marker variables, and the epistatic effect between two QTL can be completely absorbed by the four marker-pair multiplication variables between the two pairs of flanking markers. Based on this property, we proposed inclusive composite interval mapping (ICIM), which simultaneously considers marker variables and marker-pair multiplications in a linear model. Stepwise regression was applied to identify the most significant markers and marker-pair multiplications. Then a two-dimensional scan (or interval mapping) was conducted to identify QTL with significant digenic epistasis using adjusted phenotypic values based on the best multiple regression model. The adjusted values retain the information of QTL on the two current mapping intervals but exclude the influence of QTL on other intervals and chromosomes. Epistatic QTL can be identified by ICIM, no matter whether the two interacting QTL have any additive effects. Simulated populations and one barley doubled haploid (DH) population were used to demonstrate the efficiency of ICIM in mapping both additive QTL and digenic interactions.
|
13
|
Bayesian mapping of quantitative trait loci for multiple complex traits with the use of variance components. Am J Hum Genet 2007; 81:304-20. [PMID: 17668380 PMCID: PMC1950806 DOI: 10.1086/519495] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Received: 02/07/2007] [Accepted: 05/07/2007] [Indexed: 11/03/2022] Open
Abstract
Complex traits important for humans are often correlated phenotypically and genetically. Joint mapping of quantitative-trait loci (QTLs) for multiple correlated traits plays an important role in unraveling the genetic architecture of complex traits. Compared with single-trait analysis, joint mapping addresses more questions and has advantages for power of QTL detection and precision of parameter estimation. Some statistical methods have been developed to map QTLs underlying multiple traits, most of which are based on maximum-likelihood methods. We develop here a multivariate version of the Bayes methodology for joint mapping of QTLs, using the Markov chain Monte Carlo (MCMC) algorithm. We adopt a variance-components method to model complex traits in outbred populations (e.g., humans). The method is robust, can deal with an arbitrary number of alleles with arbitrary patterns of gene actions (such as additive and dominant), and allows for multiple phenotype data of various types in the joint analysis (e.g., multiple continuous traits and mixtures of continuous traits and discrete traits). Under a Bayesian framework, parameters, including the number of QTLs, are estimated on the basis of their marginal posterior samples, which are generated through two samplers, the Gibbs sampler and the reversible-jump MCMC. In addition, we calculate the Bayes factor related to each identified QTL, to test coincident linkage versus pleiotropy. The performance of our method is evaluated in simulations with full-sib families. The results show that our proposed Bayesian joint-mapping method performs well for mapping multiple QTLs in situations of either bivariate continuous traits or mixed data types. Compared with the analysis for each trait separately, Bayesian joint mapping improves statistical power, provides stronger evidence of QTL detection, and increases precision in estimation of parameter and QTL position. We also applied the proposed method to a set of real data and detected a coincident linkage responsible for determining bone mineral density and areal bone size of wrist in humans.
|
14
|
Bayesian association mapping of multiple quantitative trait loci and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 2007; 114:1437-49. [PMID: 17356864 DOI: 10.1007/s00122-007-0529-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 08/06/2006] [Accepted: 02/16/2007] [Indexed: 05/03/2023]
Abstract
One way to use a crop germplasm collection directly to map QTLs without line-crossing experiments is whole-genome association mapping. A major problem with association mapping is the presence of population structure, which can lead to both false positives and failure to detect genuine associations (i.e., false negatives). Particularly in highly selfing species such as Asian cultivated rice, high levels of population structure are expected and therefore the efficiency of association mapping remains almost unknown. Here, we propose an approach that combines a Bayesian method for mapping multiple QTLs with a regression method that directly incorporates estimates of population structure. That is, the effects due to both multiple QTLs and population structure were included in our statistical model. We evaluated the efficiency of our approach in simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that, compared with single-QTL models, our model could reduce both false-positive and false-negative rates and the error of estimation of genetic effects, indicating that it has statistically desirable attributes. For the real traits, we analyzed the size and shape of milled rice grains and found significant markers that may be linked to previously reported QTLs. Association mapping should have good prospects in highly selfing species such as rice if proper methods are adopted. Our approach will be useful for whole-genome association mapping of various selfing crop species.
|
15
|
Abstract
Many quantitative traits are measured repeatedly during the life of an organism. Such traits are called dynamic traits. The pattern of the changes of a dynamic trait is called the growth trajectory. Studying the growth trajectory may enhance our understanding of its genetic architecture. Recently, we developed an interval-mapping procedure to map QTL for dynamic traits under the maximum-likelihood framework. We fit the growth trajectory by Legendre polynomials. The method was intended to map one QTL at a time, and the entire QTL analysis involved scanning the entire genome by fitting multiple single-QTL models. In this study, we propose a Bayesian shrinkage analysis for estimating and mapping multiple QTL in a single model. The method combines shrinkage mapping for individual quantitative traits with the Legendre polynomial analysis for dynamic traits. The multiple-QTL model is implemented in two ways: (1) a fixed-interval approach where a QTL is placed in each marker interval and (2) a moving-interval approach where the position of a QTL can be searched in a range that covers many marker intervals. A simulation study shows that the Bayesian shrinkage method generates much better signals for QTL than the interval-mapping approach. We propose several alternative methods to present the results of the Bayesian shrinkage analysis. In particular, we found that the Wald test-statistic profile can serve as a mechanism to test the significance of a putative QTL.
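The use of Legendre polynomials to describe a growth trajectory can be sketched as follows. The polynomial order, the age range and the coefficient values are hypothetical and chosen only for illustration; the sketch evaluates a trajectory from given coefficients and does not perform any QTL mapping or shrinkage estimation.

```python
def legendre_basis(t, order):
    """Values P_0(t) .. P_order(t) for t in [-1, 1], via the Bonnet recurrence."""
    values = [1.0, t]
    for n in range(1, order):
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]

def rescale(age, age_min, age_max):
    """Map an age in [age_min, age_max] onto the Legendre domain [-1, 1]."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)

def trajectory(age, coefficients, age_min, age_max):
    """Trait value at a given age as a weighted sum of Legendre polynomials."""
    t = rescale(age, age_min, age_max)
    basis = legendre_basis(t, len(coefficients) - 1)
    return sum(c * p for c, p in zip(coefficients, basis))

# Hypothetical second-order trajectory measured between week 1 and week 10
coefficients = [10.0, 4.0, 1.0]
start_value = trajectory(1.0, coefficients, 1.0, 10.0)
end_value = trajectory(10.0, coefficients, 1.0, 10.0)
```

Under such a parameterization a QTL effect on a dynamic trait is a vector of coefficient shifts, one per polynomial order, rather than a single number.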
|
16
|
Study on mapping quantitative trait loci for animal complex binary traits using Bayesian-Markov chain Monte Carlo approach. ACTA ACUST UNITED AC 2007; 49:552-9. [PMID: 17312993 DOI: 10.1007/s11427-006-2024-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Abstract
Mapping quantitative trait loci (QTL) underlying complex discrete traits with conventional statistical methods is challenging, because such traits usually show a discontinuous distribution and carry less information. The Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is a key procedure for mapping QTL for complex binary traits; it provides a complete posterior distribution for QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from their posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred full-sib animal families with different family structures for a complex binary trait controlled by both a QTL and a polygenic effect. Under the identity-by-descent-based variance component random model, three MCMC-based samplers, namely Gibbs sampling, the Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population design with large family size is the optimal mapping strategy.
|
18
|
A modified algorithm for the improvement of composite interval mapping. Genetics 2007; 175:361-74. [PMID: 17110476 PMCID: PMC1775001 DOI: 10.1534/genetics.106.066811] [Citation(s) in RCA: 397] [Impact Index Per Article: 23.4] [Received: 10/13/2006] [Accepted: 10/24/2006] [Indexed: 11/18/2022] Open
Abstract
Composite interval mapping (CIM) is the most commonly used method for mapping quantitative trait loci (QTL) with populations derived from biparental crosses. However, the algorithm implemented in the popular QTL Cartographer software may not completely ensure all its advantageous properties. In addition, different background marker selection methods may give very different mapping results, and the nature of the preferred method is not clear. A modified algorithm called inclusive composite interval mapping (ICIM) is proposed in this article. In ICIM, marker selection is conducted only once through stepwise regression by considering all marker information simultaneously, and the phenotypic values are then adjusted by all markers retained in the regression equation except the two markers flanking the current mapping interval. The adjusted phenotypic values are finally used in interval mapping (IM). The modified algorithm has a simpler form than that used in CIM, but a faster convergence speed. ICIM retains all advantages of CIM over IM and avoids the possible increase of sampling variance and the complicated background marker selection process in CIM. Extensive simulations using two genomes and various genetic models indicated that ICIM has increased detection power, a reduced false detection rate, and less biased estimates of QTL effects.
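The phenotype adjustment step described in this abstract can be sketched as below. The sketch assumes that a stepwise regression has already been run and that its retained markers and coefficients are available; the marker codings, coefficient values and names (`adjust_phenotypes`, `flanking`) are hypothetical, and neither the stepwise selection nor the subsequent interval mapping is shown.

```python
def adjust_phenotypes(y, markers, coefficients, flanking):
    """Remove fitted effects of retained markers, except those flanking the current interval."""
    adjusted = []
    for i, yi in enumerate(y):
        background = sum(b * markers[j][i] for j, b in coefficients.items()
                         if j not in flanking)
        adjusted.append(yi - background)
    return adjusted

# Hypothetical marker codes (-1 / 1) for four individuals at three markers
markers = {
    0: [1, -1, 1, -1],
    1: [1, 1, -1, -1],
    2: [-1, 1, 1, -1],
}
# Hypothetical coefficients of the markers retained by a prior stepwise regression
coefficients = {0: 0.5, 2: 2.0}
phenotypes = [10.0, 11.0, 12.0, 13.0]

# Scanning the interval between markers 0 and 1: only marker 2 is treated as background
adjusted = adjust_phenotypes(phenotypes, markers, coefficients, flanking={0, 1})
```

Because the coefficients come from a single regression, the same fit is reused for every interval and only the set of excluded flanking markers changes as the scan moves along the genome.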
|
19
|
A modified algorithm for the improvement of composite interval mapping. Genetics 2007. [PMID: 17110476 DOI: 10.1534/genetics.106.066811.40] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023] Open
|
20
|
Association mapping of complex trait loci with context-dependent effects and unknown context variable. Genetics 2006; 174:1597-611. [PMID: 17028339 PMCID: PMC1667093 DOI: 10.1534/genetics.106.061275] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 05/24/2006] [Accepted: 08/28/2006] [Indexed: 11/18/2022] Open
Abstract
A novel method for Bayesian analysis of genetic heterogeneity and multilocus association in random population samples is presented. The method is valid for quantitative and binary traits as well as for multiallelic markers. In the method, individuals are stochastically assigned to two etiological groups, each of which can have its own, possibly different, subset of trait-associated (disease-predisposing) loci or alleles. The method is especially useful when etiological models are stratified by factors that are unknown or went unmeasured, that is, when genetic heterogeneity is due to, for example, unknown gene x environment or gene x gene interactions. Additionally, a heterogeneity structure for the phenotype does not need to follow the structure of the general population; it can have a distinct selection history. The performance of the method is illustrated with a simulated example of gene x environment interaction (a quantitative trait with loosely linked markers) and compared to the results of single-group analysis in the presence of missing data. Additionally, example analyses of previously analyzed cystic fibrosis and type 2 diabetes data sets (binary traits with closely linked markers) are presented. The implementation (written in WinBUGS) is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
|
21
|
Biased estimators of quantitative trait locus heritability and location in interval mapping. Heredity (Edinb) 2005; 95:476-84. [PMID: 16189542 DOI: 10.1038/sj.hdy.6800747] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022] Open
Abstract
In many empirical studies, it has been observed that genome scans yield biased estimates of heritability, as well as genetic effects. It is widely accepted that quantitative trait locus (QTL) mapping is a model selection procedure, and that the overestimation of genetic effects is the result of using the same data for model selection as estimation of parameters. There are two key steps in QTL modeling, each of which biases the estimation of genetic effects. First, test procedures are employed to select the regions of the genome for which there is significant evidence for the presence of QTL. Second, and most important for this demonstration, estimates of the genetic effects are reported only at the locations for which the evidence is maximal. We demonstrate that even when we know there is just one QTL present (ignoring the testing bias), and we use interval mapping to estimate its location and effect, the estimator of the effect will be biased. As evidence, we present results of simulations investigating the relative importance of the two sources of bias and the dependence of bias of heritability estimators on the true QTL heritability, sample size, and the length of the investigated part of the genome. Moreover, we present results of simulations demonstrating the skewness of the distribution of estimators of QTL locations and the resulting bias in estimation of location. We use computer simulations to investigate the dependence of this bias on the true QTL location, heritability, and the sample size.
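The second source of bias, reporting the estimate only where the evidence is maximal, can be illustrated with a toy simulation. This is not interval mapping: it assumes independent normal estimation noise at each scanned position and an expected estimate that decays geometrically away from a single true QTL. All parameter values and the function name are hypothetical.

```python
import random

def simulate_reported_effect(true_effect, decay, n_positions, noise_sd, n_reps, seed):
    """Mean estimate at the true QTL position vs. mean of the largest estimate in the scan."""
    rng = random.Random(seed)
    # Expected estimate per scanned position: largest at the QTL (index 0), decaying with distance
    expected = [true_effect * decay ** k for k in range(n_positions)]
    sum_at_true, sum_at_max = 0.0, 0.0
    for _ in range(n_reps):
        estimates = [mu + rng.gauss(0.0, noise_sd) for mu in expected]
        sum_at_true += estimates[0]
        sum_at_max += max(estimates)
    return sum_at_true / n_reps, sum_at_max / n_reps

mean_at_true, mean_at_max = simulate_reported_effect(
    true_effect=0.3, decay=0.9, n_positions=20, noise_sd=0.2, n_reps=2000, seed=42)
```

The estimate at the true position averages close to the true effect, whereas the maximum over positions is never smaller than it in any replicate, so its average is pushed upwards even though only one QTL is present.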
|
22
|
Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics 2005; 170:1333-44. [PMID: 15911579 PMCID: PMC1451197 DOI: 10.1534/genetics.104.040386] [Citation(s) in RCA: 115] [Impact Index Per Article: 6.1] [Received: 12/29/2004] [Accepted: 04/04/2005] [Indexed: 11/18/2022] Open
Abstract
The problem of identifying complex epistatic quantitative trait loci (QTL) across the entire genome continues to be a formidable challenge for geneticists. The complexity of genome-wide epistatic analysis results mainly from the number of QTL being unknown and the number of possible epistatic effects being huge. In this article, we use a composite model space approach to develop a Bayesian model selection framework for identifying epistatic QTL for complex traits in experimental crosses from two inbred lines. By placing a liberal constraint on the upper bound of the number of detectable QTL we restrict attention to models of fixed dimension, greatly simplifying calculations. Indicators specify which main and epistatic effects of putative QTL are included. We detail how to use prior knowledge to bound the number of detectable QTL and to specify prior distributions for indicators of genetic effects. We develop a computationally efficient Markov chain Monte Carlo (MCMC) algorithm using the Gibbs sampler and Metropolis-Hastings algorithm to explore the posterior distribution. We illustrate the proposed method by detecting new epistatic QTL for obesity in a backcross of CAST/Ei mice onto M16i.
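The role of the indicators in a fixed-dimension model can be written schematically as follows. This is a sketch of the general idea only, not the authors' exact specification: the upper bound L on the number of QTL, the genotype codes x_{ij}, the effects \beta, the indicators \gamma and the residual e_i are notation assumed here for illustration.

```latex
y_i = \mu
    + \sum_{j=1}^{L} \gamma_j \, \beta_j \, x_{ij}
    + \sum_{j<k} \gamma_{jk} \, \beta_{jk} \, x_{ij} x_{ik}
    + e_i,
\qquad \gamma_j,\ \gamma_{jk} \in \{0, 1\}
```

Setting an indicator to zero drops the corresponding main or epistatic effect, so models with different numbers of QTL are all represented inside one parameter space of fixed size.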
|
23
|
Abstract
In this article, a unified Markov chain Monte Carlo (MCMC) framework is proposed to identify multiple quantitative trait loci (QTL) for complex traits in experimental designs, based on a composite space representation of the problem that has fixed dimension. The proposed unified approach includes the existing Bayesian QTL mapping methods using reversible jump MCMC algorithm as special cases. We also show that a variety of Bayesian variable selection methods using Gibbs sampling can be applied to the composite model space for mapping multiple QTL. The unified framework not only results in some new algorithms, but also gives useful insight into some of the important factors governing the performance of Gibbs sampling and reversible jump for mapping multiple QTL. Finally, we develop strategies to improve the performance of MCMC algorithms.
|
24
|
Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci. Genetics 2005; 167:989-99. [PMID: 15238547 PMCID: PMC1470914 DOI: 10.1534/genetics.103.021683] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.9] [Indexed: 02/01/2023] Open
Abstract
The problem of locating multiple interacting quantitative trait loci (QTL) can be addressed as a multiple regression problem, with marker genotypes being the regressor variables. An important and difficult part in fitting such a regression model is the estimation of the QTL number and respective interactions. Among the many model selection criteria that can be used to estimate the number of regressor variables, none are used to estimate the number of interactions. Our simulations demonstrate that epistatic terms appearing in a model without the related main effects cause the standard model selection criteria to have a strong tendency to overestimate the number of interactions, and so the QTL number. With this as our motivation we investigate the behavior of the Schwarz Bayesian information criterion (BIC) by explaining the phenomenon of the overestimation and proposing a novel modification of BIC that allows the detection of main effects and pairwise interactions in a backcross population. Results of an extensive simulation study demonstrate that our modified version of BIC performs very well in practice. Our methodology can be extended to general populations and higher-order interactions.
|
25
|
Bayesian mapping of QTL in outbred F2 families allowing inference about whether F0 grandparents are homozygous or heterozygous at QTL. Heredity (Edinb) 2005; 94:326-37. [PMID: 15674384 DOI: 10.1038/sj.hdy.6800638] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022] Open
Abstract
In this paper, we propose a new Bayesian method for QTL analysis in outbred F2 families based on Markov chain Monte Carlo (MCMC) estimation, allowing inference about whether each of the F0 founders (grandparents) is homozygous or heterozygous at a QTL. This, in turn, allows us to select a model that accurately explains the observed phenotypes of F2 individuals. The proposed method fits a statistical model of the two possible QTL states in each F0 grandparent, that is, homozygous and heterozygous at the QTL, and gives a posterior distribution for the QTL states in each F0 grandparent. We confine ourselves to the discrimination of two QTL states, homozygous or heterozygous, for each of the F0 grandparents without taking into consideration whether common alleles are shared by F0 grandparents. The statistical model includes allelic effects and dominance effects for each QTL. The number of parameters representing allelic effects and dominance effects therefore changes depending on the QTL states. A reversible jump MCMC technique is used for transitions between models of different dimensions. The effectiveness of the proposed method was investigated using simulation experiments. It was practicable to estimate the QTL states of F0 grandparents as well as the number, the locations and the effects of QTL segregating in an outbred F2 family.
|
26
|
Abstract
In plants and laboratory animals, QTL mapping is commonly performed using F(2) or BC individuals derived from the cross of two inbred lines. Typical QTL mapping statistics assume that each F(2) individual is genotyped for the markers and phenotyped for the trait. For plant traits with low heritability, it has been suggested to use the average phenotypic values of F(3) progeny derived from selfing F(2) plants in place of the F(2) phenotype itself. All F(3) progeny derived from the same F(2) plant belong to the same F(2:3) family, denoted by F(2:3). If the size of each F(2:3) family (the number of F(3) progeny) is sufficiently large, the average value of the family will represent the genotypic value of the F(2) plant, and thus the power of QTL mapping may be significantly increased. The strategy of using F(2) marker genotypes and F(3) average phenotypes for QTL mapping in plants is quite similar to the daughter design of QTL mapping in dairy cattle. We study the fundamental principle of the plant version of the daughter design and develop a new statistical method to map QTL under this F(2:3) strategy. We also propose to combine both the F(2) phenotypes and the F(2:3) average phenotypes to further increase the power of QTL mapping. The statistical method developed in this study differs from published ones in that the new method fully takes advantage of the mixture distribution for F(2:3) families of heterozygous F(2) plants. Incorporation of this new information has significantly increased the statistical power of QTL detection relative to the classical F(2) design, even if only a single F(3) progeny is collected from each F(2:3) family. The mixture model is developed on the basis of a single-QTL model and implemented via the EM algorithm. Substantial computer simulation was conducted to demonstrate the improved efficiency of the mixture model. Extension of the mixture model to multiple QTL analysis is developed using a Bayesian approach. 
The computer program performing the Bayesian analysis of the simulated data is available to users for real data analysis.
|
27
|
Bayesian association-based fine mapping in small chromosomal segments. Genetics 2005; 169:427-39. [PMID: 15371355 PMCID: PMC1448870 DOI: 10.1534/genetics.104.032680] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 06/22/2004] [Accepted: 09/16/2004] [Indexed: 11/18/2022] Open
Abstract
A Bayesian method for fine mapping is presented, which deals with multiallelic markers (with two or more alleles), unknown phase, missing data, multiple causal variants, and both continuous and binary phenotypes. We consider small chromosomal segments spanned by a dense set of closely linked markers and putative genes only at marker points. In the phenotypic model, locus-specific indicator variables are used to control inclusion in or exclusion from marker contributions. To account for covariance between consecutive loci and to control fluctuations in association signals along a candidate region we introduce a joint prior for the indicators that depends on genetic or physical map distances. The potential of the method, including posterior estimation of trait-associated loci, their effects, linkage disequilibrium pattern due to close linkage of loci, and the age of a causal variant (time to most recent common ancestor), is illustrated with the well-known cystic fibrosis and Friedreich ataxia data sets by assuming that haplotypes were not available. In addition, simulation analysis with large genetic distances is shown. Estimation of model parameters is based on Markov chain Monte Carlo (MCMC) sampling and is implemented using WinBUGS. The model specification code is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
|
28
|
On the Metropolis-Hastings acceptance probability to add or drop a quantitative trait locus in Markov chain Monte Carlo-based Bayesian analyses. Genetics 2004; 166:641-3. [PMID: 15020452 PMCID: PMC1470712 DOI: 10.1534/genetics.166.1.641] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
The Metropolis-Hastings algorithm used in analyses that estimate the number of QTL segregating in a mapping population requires the calculation of an acceptance probability to add or drop a QTL from the model. Expressions for this acceptance probability need to recognize that sets of QTL are unordered such that the number of equivalent sets increases with the factorial of the QTL number. Here, we show how accounting for this fact affects the acceptance probability and review expressions found in the literature.
|
29
|
Abstract
A maximum likelihood method was developed for QTL mapping in half-sib designs and compared to the regression method in analyses of both field and simulated data. The field data consisted of milk production evaluations of 433 progeny tested sons of 6 sires and 64 microsatellite markers distributed over 12 chromosomes. Based on permutation tests, 5 significant QTL were detected in the field data by the regression method compared with 10 by the maximum likelihood method (P < 0.05). In field data analysis, the maximum likelihood method detected more significant QTL and had a smaller residual variance than the regression method. The simulation included 9 scenarios differing in number of families, family size, QTL variance, and marker density, each replicated 100 times. The simulation results suggested that, as for the regression method, the precision of estimating QTL from the maximum likelihood method improves with increasing number of sons per sire, increasing the ratio of QTL to phenotypic variance, and decreasing marker interval. The maximum likelihood method had a smaller dispersion of estimated QTL positions than the regression method in 6 of 9 scenarios simulated. Overall, the maximum likelihood method shows potential advantage in QTL detection over the regression method, especially in the situations with less favorable conditions for QTL detection.
|
30
|
Abstract
In plants and laboratory animals, QTL mapping is commonly performed using F2 or BC individuals derived from the cross of two inbred lines. Typical QTL mapping statistics assume that each F2 individual is genotyped for the markers and phenotyped for the trait. For plant traits with low heritability, it has been suggested to use the average phenotypic values of F3 progeny derived from selfing F2 plants in place of the F2 phenotype itself. All F3 progeny derived from the same F2 plant belong to the same F2:3 family, denoted by F2:3. If the size of each F2:3 family (the number of F3 progeny) is sufficiently large, the average value of the family will represent the genotypic value of the F2 plant, and thus the power of QTL mapping may be significantly increased. The strategy of using F2 marker genotypes and F3 average phenotypes for QTL mapping in plants is quite similar to the daughter design of QTL mapping in dairy cattle. We study the fundamental principle of the plant version of the daughter design and develop a new statistical method to map QTL under this F2:3 strategy. We also propose to combine both the F2 phenotypes and the F2:3 average phenotypes to further increase the power of QTL mapping. The statistical method developed in this study differs from published ones in that the new method fully takes advantage of the mixture distribution for F2:3 families of heterozygous F2 plants. Incorporation of this new information has significantly increased the statistical power of QTL detection relative to the classical F2 design, even if only a single F3 progeny is collected from each F2:3 family. The mixture model is developed on the basis of a single-QTL model and implemented via the EM algorithm. Substantial computer simulation was conducted to demonstrate the improved efficiency of the mixture model. Extension of the mixture model to multiple QTL analysis is developed using a Bayesian approach. 
The computer program performing the Bayesian analysis of the simulated data is available to users for real data analysis.
|
31
|
Abstract
In this article, we consider the problem of the estimation of quantitative trait loci (QTL), those chromosomal regions at which genetic information affecting some quantitative trait is encoded. Generally the number of such encoding sites is unknown, and associations between neutral molecular marker genotypes and observed trait phenotypes are sought to locate them. We consider a Bayesian model for simple experimental designs, and discuss the existing approaches to inference for this problem. In particular, we focus on locating positions of the best candidate markers segregating for the trait, a situation which is of primary interest in comparative mapping. We introduce a loss function for estimating both the number of QTL and their location, and we illustrate its application via simulated and real data.
|
32
|
Discovering Disease Genes: Multipoint Linkage Analysis via a New Markov Chain Monte Carlo Approach. Stat Sci 2003. [DOI: 10.1214/ss/1081443233] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
|
33
|
Abstract
Most complex traits of animals, plants, and humans are influenced by multiple genetic and environmental factors. Interactions among multiple genes play fundamental roles in the genetic control and evolution of complex traits. Statistical modeling of interaction effects in quantitative trait loci (QTL) analysis must accommodate a very large number of potential genetic effects, which presents a major challenge to determining the genetic model with respect to the number of QTL, their positions, and their genetic effects. In this study, we use the methodology of Bayesian model and variable selection to develop strategies for identifying multiple QTL with complex epistatic patterns in experimental designs with two segregating genotypes. Specifically, we develop a reversible jump Markov chain Monte Carlo algorithm to determine the number of QTL and to select main and epistatic effects. With the proposed method, we can jointly infer the genetic model of a complex trait and the associated genetic parameters, including the number, positions, and main and epistatic effects of the identified QTL. Our method can map a large number of QTL with any combination of main and epistatic effects. Utility and flexibility of the method are demonstrated using both simulated data and a real data set. Sensitivity of posterior inference to prior specifications of the number and genetic effects of QTL is investigated.
|
34
|
Abstract
A Bayesian model-based method for multilocus association analysis of quantitative and qualitative (binary) traits is presented. The method selects a trait-associated subset of markers among candidates, and is equally applicable for analyzing wide chromosomal segments (genome scans) and small candidate regions. The method can be applied in situations involving missing genotype data. The number of trait loci, their marker positions, and the magnitudes of their gene effects (strengths of association) are all estimated simultaneously. The inference of parameters is based on their posterior distributions, which are obtained through Markov chain Monte Carlo simulations. The strengths of the approach are: 1) flexible use of oligogenic models with unknown number of loci, 2) performing the estimation of association jointly with model selection, and 3) avoidance of the multiple testing problem, which typically complicates the approaches based on association testing. The performance of the method was tested and compared to the multilocus conditional search procedure by analyzing two simulated data sets. We also applied the method to cystic fibrosis haplotype data (two-locus haplotypes), where gene position has already been identified. The method is implemented as a software package, which is freely available for research purposes under the name BAMA.
|
35
|
Fine mapping of complex trait genes combining pedigree and linkage disequilibrium information: a Bayesian unified framework. Genetics 2003; 163:1497-510. [PMID: 12702692 PMCID: PMC1462504 DOI: 10.1093/genetics/163.4.1497] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.2] [Indexed: 11/15/2022] Open
Abstract
We present a Bayesian method that combines linkage and linkage disequilibrium (LDL) information for quantitative trait locus (QTL) mapping. This method uses jointly all marker information (haplotypes) and all available pedigree information; i.e., it is not restricted to any specific experimental design and it is not required that phases are known. Infinitesimal genetic effects or environmental noise ("fixed") effects can equally be fitted. A diallelic QTL is assumed and both additive and dominant effects can be estimated. We have implemented a combined Gibbs/Metropolis-Hastings sampling to obtain the marginal posterior distributions of the parameters of interest. We have also implemented a Bayesian variant of usual disequilibrium measures like D' and r(2) between QTL and markers. We illustrate the method with simulated data in "simple" (two-generation full-sib families) and "complex" (four-generation) pedigrees. We compared the estimates with and without using linkage disequilibrium information. In general, using LDL resulted in estimates of QTL position that were much better than linkage-only estimates when there was complete disequilibrium between the mutant QTL allele and the marker. This advantage, however, decreased when the association was only partial. In all cases, additive and dominant effects were estimated accurately either with or without disequilibrium information.
|
36
|
A Unified Approach to Joint Modeling of Multiple Quantitative and Qualitative Traits in Gene Mapping. J Theor Biol 2002. [DOI: 10.1006/jtbi.2002.3090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
37
|
Bayesian MCMC Mapping of Quantitative Trait Loci in a Half-sib Design: a Graphical Model Perspective. Int Stat Rev 2002. [DOI: 10.1111/j.1751-5823.2002.tb00362.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022]
|
38
|
Abstract
The choice of an appropriate genetic model describing the genetic architecture underlying a character of interest is an inherent part of the gene mapping studies of human and other living organisms. The genetic model specifies the statistical parameters for the number of genes, their positions, and the types and magnitudes of their contributions to the phenotype. There are many considerations involved in model formulation (choice) ranging from the assumptions concerning the data, the role of environment, and the number of oligogenes (or quantitative trait loci) influencing the trait behavior. There are several model selection procedures and criteria under specific sampling designs in the genetic literature. These approaches often have their origin in computer science or in general statistical theory. Our aim here is to give an overview of the most popular statistical criteria and to present principles behind them. Bayesian model averaging is suggested as a robust alternative for such methods.
|
39
|
A note on estimating the posterior density of a quantitative trait locus from a Markov chain Monte Carlo sample. Genet Epidemiol 2002; 22:369-76. [PMID: 11984868 DOI: 10.1002/gepi.01125] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
We provide an overview of the use of kernel smoothing to summarize the quantitative trait locus posterior distribution from a Markov chain Monte Carlo sample. More traditional distributional summary statistics based on the histogram depend both on the bin width and on the sideways shift of the bin grid used. These factors influence both the overall mapping accuracy and the estimated location of the mode of the distribution. Replacing the histogram by kernel smoothing helps to alleviate these problems. Using simulated data, we performed numerical comparisons between the two approaches. The results clearly illustrate the superiority of the kernel method. The kernel approach is particularly efficient when one needs to point out the best putative quantitative trait locus position on the marker map. In such situations, the smoothness of the posterior estimate is especially important because rough posterior estimates easily produce biased mode estimates. Different kernel implementations are available from Rolf Nevanlinna Institute's web page (http://www.rni.helsinki.fi/~fjh).
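The dependence of a histogram mode on the bin grid, and the kernel alternative, can be sketched as follows. This is an illustrative example, not one of the implementations mentioned in the abstract: the draws are synthetic stand-ins for MCMC samples of a QTL position, and the function names, the Gaussian kernel, the bandwidth and the grid step are all assumptions chosen for the sketch.

```python
import math
import random

def histogram_mode(samples, bin_width, shift=0.0):
    """Centre of the fullest bin for a bin grid that starts at `shift`."""
    counts = {}
    for x in samples:
        idx = math.floor((x - shift) / bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    best = max(counts, key=counts.get)
    return shift + (best + 0.5) * bin_width

def kernel_mode(samples, bandwidth, grid_step=0.1):
    """Grid point that maximises an unnormalised Gaussian kernel density estimate."""
    lo, hi = min(samples), max(samples)
    n_steps = int((hi - lo) / grid_step) + 1
    best_x, best_density = lo, -1.0
    for i in range(n_steps):
        x = lo + i * grid_step
        density = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        if density > best_density:
            best_x, best_density = x, density
    return best_x

rng = random.Random(0)
# Synthetic stand-in for posterior draws of QTL position (in cM).
draws = [rng.gauss(42.0, 3.0) for _ in range(500)]

# The histogram mode moves when only the grid origin changes.
hist_modes = [histogram_mode(draws, bin_width=5.0, shift=s)
              for s in (0.0, 1.0, 2.0, 3.0, 4.0)]
kde_mode = kernel_mode(draws, bandwidth=1.5)
print("histogram modes for shifted grids:", hist_modes)
print("kernel mode:", round(kde_mode, 1))
```

The kernel estimate has no grid origin, so only the bandwidth remains to be chosen; the histogram needs both a bin width and a shift.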
|
40
|
Abstract
Genetic mapping in analysis of medical disease is performed under several assumptions and (experimental) conditions, which are made about the data in general and the disease in particular. Here we discuss these conditions, what they mean, and what kind of deleterious effects they might have on the analysis. We also illustrate how to proceed and what kind of possibilities the statistical analysis may provide to medical scientists.
|
41
|
Abstract
A Bayesian method for multipoint oligogenic analysis of quantitative and qualitative traits is presented. This method can be applied to general pedigrees, which do not necessarily have to be "peelable" and can have large numbers of markers. The number of quantitative/qualitative trait loci (QTL), their map positions in the genome, and phenotypic effects (modes of inheritance) are all estimated simultaneously within the same framework. The summaries of the estimated parameters are based on the marginal posterior distributions that are obtained through Markov chain Monte Carlo (MCMC) methods. The method uses founder alleles together with segregation indicators in order to determine the genotypes of the trait loci of all individuals in the pedigree. To improve mixing properties of the sampler, we propose (1) joint sampling of map position and segregation indicators, (2) omitting data augmentation for untyped or uninformative markers (homozygous parent), and (3) updating several markers jointly within a single block. The performance of the method was tested with two replicate GAW10 data sets (considering two levels of available marker information). The results were concordant and similar to those presented earlier with other methods. These analyses clearly illustrate the utility and wide applicability of the method.
|
42
|
Abstract
We describe a general statistical framework for the genetic analysis of quantitative trait data in inbred line crosses. Our main result is based on the observation that, by conditioning on the unobserved QTL genotypes, the problem can be split into two statistically independent and manageable parts. The first part involves only the relationship between the QTL and the phenotype. The second part involves only the location of the QTL in the genome. We developed a simple Monte Carlo algorithm to implement Bayesian QTL analysis. This algorithm simulates multiple versions of complete genotype information on a genomewide grid of locations using information in the marker genotype data. Weights are assigned to the simulated genotypes to capture information in the phenotype data. The weighted complete genotypes are used to approximate quantities needed for statistical inference of QTL locations and effect sizes. One advantage of this approach is that only the weights are recomputed as the analyst considers different candidate models. This device allows the analyst to focus on modeling and model comparisons. The proposed framework can accommodate multiple interacting QTL, nonnormal and multivariate phenotypes, covariates, missing genotype data, and genotyping errors in any type of inbred line cross. A software tool implementing this procedure is available. We demonstrate our approach to QTL analysis using data from a mouse backcross population that is segregating multiple interacting QTL associated with salt-induced hypertension.
|
43
|
Abstract
The existence of a quantitative trait locus (QTL) is usually tested using the likelihood of the quantitative trait on the basis of phenotypic character data plus the recombination fraction between QTL and flanking markers. When doing this, the likelihood is calculated for all possible locations on the linkage map. When multiple QTL are suspected close by, it is impractical to calculate the likelihood for all possible combinations of numbers and locations of QTL. Here, we propose a genetic algorithm (GA) for the heuristic solution of this problem. GA can globally search the optimum by improving the "genotype" with alterations called "recombination" and "mutation." The "genotype" of our GA is the number and location of QTL. The "fitness" is a function based on the likelihood plus Akaike's information criterion (AIC), which helps avoid false-positive QTL. A simulation study comparing the new method with existing QTL mapping packages shows the advantage of the new GA. The GA reliably distinguishes multiple QTL located in a single marker interval.
|
44
|
|
45
|
Abstract
Quantitative trait loci (QTL) are easily studied in a biallelic system. Such a system requires the cross of two inbred lines presumably fixed for alternative alleles of the QTL. However, development of inbred lines can be time consuming and cost ineffective for species with long generation intervals and severe inbreeding depression. In addition, restriction of the investigation to a biallelic system can sometimes be misleading because many potentially important allelic interactions do not have a chance to express and thus fail to be detected. A complicated mating design involving multiple alleles mimics the actual breeding system. However, it is difficult to develop the statistical model and algorithm using the classical maximum-likelihood method. In this study, we investigate the application of a Bayesian method implemented via the Markov chain Monte Carlo (MCMC) algorithm to QTL mapping under arbitrarily complicated mating designs. We develop the method under a mixed-model framework where the genetic values of founder alleles are treated as random and the nongenetic effects are treated as fixed. With the MCMC algorithm, we first draw the gene flows from the founders to the descendants for each QTL and then draw samples of the genetic parameters. Finally, we are able to simultaneously infer the posterior distribution of the number, the additive and dominance variances, and the chromosomal locations of all identified QTL.
|
46
|
Abstract
We develop a mixed model approach of quantitative trait locus (QTL) mapping for a hybrid population derived from the crosses of two or more distinguished outbred populations. Under the mixed model, we treat the mean allelic value of each source population as the fixed effect and the allelic deviations from the mean as random effects so that we can partition the total genetic variance into between- and within-population variances. Statistical inference of the QTL parameters is obtained by using the Bayesian method implemented by Markov chain Monte Carlo (MCMC). This unified QTL mapping algorithm treats the fixed and random model approaches as special cases of the general mixed model methodology. Utility and flexibility of the method are demonstrated by using a set of simulated data.
|
47
|
A note on algorithms for genotype and allele elimination in complex pedigrees with incomplete genotype data. Genetics 2000; 156:2051-62. [PMID: 11102395 PMCID: PMC1461391 DOI: 10.1093/genetics/156.4.2051] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022] Open
Abstract
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.
|
48
|
Abstract
There is a growing need for statistical techniques capable of mapping quantitative trait loci (QTL) in general outbred animal populations. Currently used variance component methods, which correctly account for the complex relationships that may exist between individuals, are challenged by unknown marker genotypes, inbred individuals, partially known or unknown marker phases, and multigenerational data. In this article, a two-step variance component approach that enables practitioners to routinely map QTL in populations with these difficulties is explored. The performance of the QTL mapping methodology is assessed via its application to simulated data. The capacity of the technique to accurately estimate parameters is examined for a range of scenarios.
|
49
|
Bayesian mapping of quantitative trait loci under the identity-by-descent-based variance component model. Genetics 2000; 156:411-22. [PMID: 10978304 PMCID: PMC1461251 DOI: 10.1093/genetics/156.1.411] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022] Open
Abstract
Variance component analysis of quantitative trait loci (QTL) is an important strategy of genetic mapping for complex traits in humans. The method is robust because it can handle an arbitrary number of alleles with arbitrary modes of gene action. The variance component method is usually implemented using the proportion of alleles shared identical-by-descent (IBD) by relatives. As a result, information about marker linkage phases in the parents is not required. The method has been studied extensively under either the maximum-likelihood framework or the sib-pair regression paradigm. However, virtually all investigations are limited to normally distributed traits under a single QTL model. In this study, we develop a Bayes method to map multiple QTL. We also extend the Bayesian mapping procedure to identify QTL responsible for the variation of complex binary diseases in humans under a threshold model. The method can also treat the number of QTL as a parameter and infer its posterior distribution. We use the reversible jump Markov chain Monte Carlo method to infer the posterior distributions of parameters of interest. The Bayesian mapping procedure ends with an estimation of the joint posterior distribution of the number of QTL and the locations and variances of the identified QTL. The utility of the method is demonstrated using a simulated population consisting of multiple full-sib families.
|
50
|
Abstract
A complex binary trait is a character that has a dichotomous expression but a polygenic genetic background. Mapping quantitative trait loci (QTL) for such traits is difficult because of the discrete nature and the reduced variation in the phenotypic distribution. Bayesian statistics have proved to be a powerful tool for solving complicated genetic problems, such as multiple QTL with nonadditive effects, and have been successfully applied to QTL mapping for continuous traits. In this study, we show that Bayesian statistics are particularly useful for mapping QTL for complex binary traits. We model the binary trait under the classical threshold model of quantitative genetics. The Bayesian mapping statistics are developed on the basis of the idea of data augmentation. This treatment allows an easy way to generate the value of a hypothetical underlying variable (called the liability) and a threshold, which in turn allow the use of existing Bayesian statistics. The reversible jump Markov chain Monte Carlo algorithm is used to simulate the posterior samples of all unknowns, including the number of QTL, the locations and effects of identified QTL, genotypes of each individual at both the QTL and markers, and eventually the liability of each individual. The Bayesian mapping ends with an estimation of the joint posterior distribution of the number of QTL and the locations and effects of the identified QTL. The utility of the method is demonstrated using a simulated outbred full-sib family. A computer program written in FORTRAN is freely available on request.
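The data augmentation step described in this abstract can be sketched for a single individual as follows. This is only an illustrative sketch, not the authors' FORTRAN program: it assumes unit residual variance, a fixed threshold, and a known genetic mean `mu`, and the function name `sample_liability` is hypothetical. The liability is drawn from a normal distribution truncated to the side of the threshold that agrees with the observed binary phenotype.

```python
import random
from statistics import NormalDist


def sample_liability(y, mu, threshold=0.0, rng=random):
    """Draw a liability from N(mu, 1) truncated to agree with phenotype y.

    y == 1 means the liability lies above the threshold, y == 0 below it.
    Sampling uses the inverse-CDF method on the allowed probability range.
    """
    dist = NormalDist(mu, 1.0)
    p_t = dist.cdf(threshold)          # probability mass below the threshold
    lo, hi = (p_t, 1.0) if y == 1 else (0.0, p_t)
    u = rng.uniform(lo, hi)
    # inv_cdf requires 0 < u < 1, so clamp away from the endpoints.
    eps = 1e-12
    u = min(max(u, eps), 1.0 - eps)
    return dist.inv_cdf(u)


rng = random.Random(1)
affected = [sample_liability(1, 0.3, 0.0, rng) for _ in range(5)]
unaffected = [sample_liability(0, 0.3, 0.0, rng) for _ in range(5)]
print(affected)
print(unaffected)
```

Once liabilities have been drawn in this way, they can be treated as continuous phenotypes in the remaining sampling steps, which is what lets methods for continuous traits be reused. In the full procedure described in the abstract, the genetic mean and the threshold would themselves be updated within the Markov chain rather than held fixed.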
|