1
Wu H, Gao B, Zhang R, Huang Z, Yin Z, Hu X, Yang CX, Du ZQ. Residual network improves the prediction accuracy of genomic selection. Anim Genet 2024; 55:599-611. [PMID: 38746973 DOI: 10.1111/age.13445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 04/21/2024] [Accepted: 04/29/2024] [Indexed: 07/04/2024]
Abstract
Genetic improvement of complex traits in animal and plant breeding depends on the efficient and accurate estimation of breeding values. Deep learning methods have been shown not to outperform traditional genomic selection (GS) methods, partly because of the degradation problem (i.e. as model depth increases, the performance of the deeper model deteriorates). Because the residual network (ResNet), a deep learning method, is designed to address this degradation, we examined its performance and the factors related to its prediction accuracy in GS. We compared the prediction accuracy of conventional genomic best linear unbiased prediction, Bayesian methods (BayesA, BayesB, BayesC, and Bayesian Lasso), and two deep learning methods, convolutional neural network and ResNet, on three datasets (wheat, simulated, and real pig data). ResNet outperformed the other methods in both Pearson's correlation coefficient (PCC) and mean squared error (MSE) on the wheat and simulated data. For the pig backfat depth trait, ResNet still had the lowest MSE, whereas Bayesian Lasso had the highest PCC. We further clustered the pig data into four groups and, in one separate group, ResNet had the highest prediction accuracy (by both PCC and MSE). Transfer learning was also adopted and enhanced the performance of both convolutional neural network and ResNet. Taken together, our findings indicate that ResNet could improve GS prediction accuracy, potentially depending on factors such as the genetic architecture of complex traits, data volume, and heterogeneity.
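The abstract reports two accuracy metrics, PCC and MSE, and notes that they can favour different methods (lowest MSE for ResNet, highest PCC for Bayesian Lasso on pig backfat depth). The following is a minimal sketch of the two standard formulas, not the authors' evaluation code; the function names and the toy values are illustrative assumptions. It shows how predictions that are perfectly correlated with the observations can still carry a non-zero MSE when they are shifted by a constant.

```python
import math

def pearson_correlation(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Toy values (assumed for illustration): predictions are the observations plus 1.
observed = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]

pcc = pearson_correlation(observed, shifted)  # ranking is preserved exactly
mse = mean_squared_error(observed, shifted)   # the constant shift is penalised
print(pcc, mse)
```

Because PCC ignores shifts and rescaling while MSE does not, a method can lead on one metric and trail on the other.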
Affiliation(s)
- Huaxuan Wu
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Bingxi Gao
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Rong Zhang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zehang Huang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zongjun Yin
  - College of Animal Science and Technology, Anhui Agricultural University, Hefei, Anhui, China
- Xiaoxiang Hu
  - State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing, China
- Cai-Xia Yang
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China
- Zhi-Qiang Du
  - College of Animal Science and Technology, Yangtze University, Jingzhou, Hubei, China

2
Chen C, Bhuiyan SA, Ross E, Powell O, Dinglasan E, Wei X, Atkin F, Deomano E, Hayes B. Genomic prediction for sugarcane diseases including hybrid Bayesian-machine learning approaches. Front Plant Sci 2024; 15:1398903. [PMID: 38751840 PMCID: PMC11095127 DOI: 10.3389/fpls.2024.1398903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 04/15/2024] [Indexed: 05/18/2024]
Abstract
Sugarcane smut and Pachymetra root rot are two serious diseases of sugarcane, with susceptible infected crops losing over 30% of yield. A heritable component to both diseases has been demonstrated, suggesting selection could improve disease resistance. Genomic selection could accelerate gains even further, enabling early selection of resistant seedlings for breeding and clonal propagation. In this study we evaluated four types of algorithms for genomic prediction of clonal performance for disease resistance. The first three were: genomic best linear unbiased prediction (GBLUP), including extensions to model dominance and epistasis; Bayesian methods, including BayesC and BayesR; and machine learning methods, including random forest, multilayer perceptron (MLP), modified convolutional neural network (CNN), and attention networks designed to capture epistasis across the genome-wide markers. The fourth type comprised simple hybrid methods that first used BayesR/GWAS to identify a subset of 1000 markers with moderate to large marginal additive effects, then used attention networks to derive predictions from these effects and their interactions. The hypothesis for this approach was that using a subset of markers more likely to have an effect would enable better estimation of interaction effects than when there is an extremely large number of possible interactions, especially with our limited data set size. To evaluate the methods, we applied both random five-fold cross-validation and a structured PCA-based cross-validation that separated 4702 sugarcane clones (with disease phenotypes and genotypes for 26k genome-wide SNP markers) by genomic relationship. The Bayesian methods (BayesR and BayesC) gave the highest accuracy of prediction, followed closely by hybrid methods with attention networks. The hybrid methods with attention networks gave the lowest variation in accuracy of prediction across validation folds (and the lowest MSE), which may be a criterion worth considering in practical breeding programs. This suggests that hybrid methods incorporating the attention mechanism could be useful for genomic prediction of clonal performance, particularly where non-additive effects may be important.
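The random five-fold cross-validation mentioned in the abstract can be sketched as a shuffled partition of the 4702 clones into five validation folds. This is only an illustration under stated assumptions: the function name `k_fold_indices` and the seed are hypothetical, and the sketch covers the random scheme only, not the structured PCA-based split by genomic relationship.

```python
import random

def k_fold_indices(n_records, k=5, seed=0):
    """Shuffle record indices and deal them into k folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

folds = k_fold_indices(4702, k=5)

for fold_id, validation in enumerate(folds):
    held_out = set(validation)
    training = [i for i in range(4702) if i not in held_out]
    # A prediction model would be fitted on `training` and scored on `validation`.
    print(fold_id, len(training), len(validation))
```

Each clone appears in exactly one validation fold, so every phenotype is predicted once from a model that did not see it.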
Affiliation(s)
- Chensong Chen
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Shamsul A. Bhuiyan
  - Sugar Research Australia, Woodford, QLD, Australia
  - Queensland Micro- and Nanotechnology Centre, Griffith University, Nathan, QLD, Australia
- Elizabeth Ross
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Owen Powell
  - Center for Crop Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Eric Dinglasan
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Xianming Wei
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Emily Deomano
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Ben Hayes
  - Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia

3
Haque MA, Lee YM, Ha JJ, Jin S, Park B, Kim NY, Won JI, Kim JJ. Genomic Predictions in Korean Hanwoo Cows: A Comparative Analysis of Genomic BLUP and Bayesian Methods for Reproductive Traits. Animals (Basel) 2023; 14:27. [PMID: 38200758 PMCID: PMC10778388 DOI: 10.3390/ani14010027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 12/07/2023] [Accepted: 12/18/2023] [Indexed: 01/12/2024]
Abstract
This study aimed to assess the accuracy of genomic estimated breeding values (GEBVs) for reproductive traits in Hanwoo cows using the GBLUP, BayesB, BayesLASSO, and BayesR methods. The traits were age at first calving (AFC), calving interval (CI), gestation length (GL), and number of artificial inseminations per conception (NAIPC). Accuracy estimates of GEBVs were derived through fivefold cross-validation, analyzing a dataset comprising 11,348 animals genotyped with an Illumina Bovine 50K SNP chip. GBLUP showed an accuracy of 0.26 for AFC, while BayesB, BayesLASSO, and BayesR demonstrated values of 0.28, 0.29, and 0.29, respectively. For CI, GBLUP attained an accuracy of 0.19, whereas BayesB, BayesLASSO, and BayesR scored 0.21, 0.24, and 0.25, respectively. The accuracy for GL was uniform across GBLUP, BayesB, and BayesR at 0.31, whereas BayesLASSO showed a slightly higher accuracy of 0.33. For NAIPC, GBLUP showed an accuracy of 0.24, while BayesB, BayesLASSO, and BayesR recorded 0.22, 0.27, and 0.30, respectively. The variation in genomic prediction accuracy among methods indicated that Bayesian approaches slightly outperformed GBLUP. The findings suggest that Bayesian methods, notably BayesLASSO and BayesR, offer improved predictive capabilities for reproductive traits. Future research may explore more advanced genomic approaches to enhance predictive accuracy and genetic gains in Hanwoo cattle breeding programs.
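The per-trait accuracies quoted in the abstract can be tabulated to check the stated conclusion. The sketch below copies the sixteen reported values and takes an unweighted mean across the four traits; that mean is a summary assumed here for illustration and is not a statistic reported by the authors.

```python
# Accuracies copied from the abstract; trait order: AFC, CI, GL, NAIPC.
accuracies = {
    "GBLUP":      [0.26, 0.19, 0.31, 0.24],
    "BayesB":     [0.28, 0.21, 0.31, 0.22],
    "BayesLASSO": [0.29, 0.24, 0.33, 0.27],
    "BayesR":     [0.29, 0.25, 0.31, 0.30],
}

# Unweighted mean across traits (an illustrative summary, not from the paper).
mean_accuracy = {
    method: round(sum(values) / len(values), 4)
    for method, values in accuracies.items()
}

# Methods ordered from highest to lowest mean accuracy.
ranking = sorted(mean_accuracy, key=mean_accuracy.get, reverse=True)

for method in ranking:
    print(f"{method:<10} {mean_accuracy[method]:.4f}")
```

On these numbers BayesR and BayesLASSO lead and GBLUP comes last on average, although BayesB falls below GBLUP for NAIPC (0.22 versus 0.24), so the advantage of the Bayesian methods is not uniform across traits.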
Affiliation(s)
- Md Azizul Haque
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Republic of Korea
- Yun-Mi Lee
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Republic of Korea
- Jae-Jung Ha
  - Gyeongbuk Livestock Research Institute, Yeongju 36052, Republic of Korea
- Shil Jin
  - Hanwoo Research Institute, National Institute of Animal Science, Pyeongchang 25340, Republic of Korea
- Byoungho Park
  - Hanwoo Research Institute, National Institute of Animal Science, Pyeongchang 25340, Republic of Korea
- Nam-Young Kim
  - Hanwoo Research Institute, National Institute of Animal Science, Pyeongchang 25340, Republic of Korea
- Jeong-Il Won
  - Hanwoo Research Institute, National Institute of Animal Science, Pyeongchang 25340, Republic of Korea
- Jong-Joo Kim
  - Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Republic of Korea

4
Önder H, Sitskowska B, Kurnaz B, Piwczyński D, Kolenda M, Şen U, Tırınk C, Çanga Boğa D. Multi-Trait Single-Step Genomic Prediction for Milk Yield and Milk Components for Polish Holstein Population. Animals (Basel) 2023; 13:3070. [PMID: 37835676 PMCID: PMC10572056 DOI: 10.3390/ani13193070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 09/27/2023] [Accepted: 09/28/2023] [Indexed: 10/15/2023]
Abstract
The objective of our study was to evaluate the predictive ability of a multi-trait genomic prediction model that accounts for interactions between marker effects to estimate heritability and genetic correlations of traits including 305-day milk yield, milk fat percentage, milk protein percentage, milk lactose percentage, and milk dry matter percentage in the Polish Holstein Friesian cow population. To this end, 14,742 SNP genotype records for 586 Polish Holstein Friesian dairy cows were used. Single-Trait-ssGBLUP (ST) and Multi-Trait-ssGBLUP (MT) methods were used for estimation. We examined 305-day milk yield (MY, kg), milk fat percentage (MF, %), milk protein percentage (MP, %), milk lactose percentage (ML, %), and milk dry matter percentage (MDM, %). The highest marker effect rank correlation was found between MF and MDM. The weakest marker effect rank correlation was found between ML and all other traits. The accuracies obtained in this study were between 0.770 and 0.882 for MT and between 0.773 and 0.876 for ST, which were acceptable values. All estimated bias values were positive, indicating underestimation. The highest heritability was obtained for MP (0.3029) and the lowest for ML (0.2171). Estimated heritability values were low for milk yield and milk composition, as expected. The strongest genetic correlation was estimated between MDM and MF (0.4990) and the weakest between MY and ML (0.001). The genetic relations with milk yield were negative and can be ignored as they were not significant. In conclusion, multi-trait genomic prediction can be more beneficial than single-trait genomic prediction.
Affiliation(s)
- Hasan Önder
  - Department of Animal Science, Ondokuz Mayis University, Samsun 55139, Türkiye
- Beata Sitskowska
  - Department of Animal Biotechnology and Genetic, Faculty of Animal Breeding and Biology, Bydgoszcz University of Science and Technology, 85084 Bydgoszcz, Poland
- Burcu Kurnaz
  - Department of Animal Science, Ondokuz Mayis University, Samsun 55139, Türkiye
- Dariusz Piwczyński
  - Department of Animal Biotechnology and Genetic, Faculty of Animal Breeding and Biology, Bydgoszcz University of Science and Technology, 85084 Bydgoszcz, Poland
- Magdalena Kolenda
  - Department of Animal Biotechnology and Genetic, Faculty of Animal Breeding and Biology, Bydgoszcz University of Science and Technology, 85084 Bydgoszcz, Poland
- Uğur Şen
  - Department of Agricultural Biotechnology, Ondokuz Mayis University, Samsun 55139, Türkiye
- Cem Tırınk
  - Department of Animal Science, Iğdır University, Iğdır 76000, Türkiye
- Demet Çanga Boğa
  - Department of Chemistry and Chemical Processing, Osmaniye Korkut Ata University, Osmaniye 80050, Türkiye

5
Vu NT, Phuc TH, Nguyen NH, Van Sang N. Effects of common full-sib families on accuracy of genomic prediction for tagging weight in striped catfish Pangasianodon hypophthalmus. Front Genet 2023; 13:1081246. [PMID: 36685869 PMCID: PMC9845282 DOI: 10.3389/fgene.2022.1081246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 12/06/2022] [Indexed: 01/06/2023]
Abstract
Common full-sib family effects (c²) make up a substantial proportion of total phenotypic variation in traits of commercial importance in aquaculture species, and omission or inclusion of c² can change genetic parameter estimates and re-rank estimated breeding values. However, the impacts of common full-sib families on the accuracy of genomic prediction for traits of economic importance are not well known in many species, including aquatic animals. This research explored the impacts of common full-sib families on the accuracy of genomic prediction for tagging weight in a population of striped catfish comprising 11,918 fish traced back to the base population (four generations), in which 560 individuals had genotype records for 14,154 SNPs. Our single-step genomic best linear unbiased prediction (ssGBLUP) showed that the accuracy of genomic prediction for tagging weight was reduced by 96.5%-130.3% when the common full-sib families were included in statistical models. The reduction in prediction accuracy was smaller in multivariate analysis than in univariate models. Imputation of missing genotypes somewhat reduced the upward biases in the prediction accuracy for tagging weight. It is therefore suggested that genomic evaluation models for traits recorded during the early phase of growth should account for common full-sib families to minimise possible biases in the accuracy of genomic prediction and, hence, in selection response.
Affiliation(s)
- Nguyen Thanh Vu
  - School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
  - Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Tran Huu Phuc
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam
- Nguyen Hong Nguyen (corresponding author)
  - School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, QLD, Australia
  - Center for Bio-Innovation, University of the Sunshine Coast, Maroochydore, QLD, Australia
- Nguyen Van Sang (corresponding author)
  - Research Institute for Aquaculture No. 2, Ho Chi Minh City, Vietnam

6
Atashi H, Bastin C, Wilmot H, Vanderick S, Hubin X, Gengler N. Genome-wide association study for selected cheese-making properties in Dual-Purpose Belgian Blue cows. J Dairy Sci 2022; 105:8972-8988. [PMID: 36175238 DOI: 10.3168/jds.2022-21780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 06/21/2022] [Indexed: 01/05/2023]
Abstract
This study aimed to estimate genetic parameters and identify genomic region(s) associated with selected cheese-making properties (CMP) in Dual-Purpose Belgian Blue (DPBB) cows. Edited data were 46,301 test-day records of milk yield, fat percentage, protein percentage, casein percentage, milk calcium content (CC), coagulation time (CT), curd firmness 30 min after rennet addition (a30), and milk titratable acidity (MTA) collected from 2014 to 2020 on 4,077 first-parity (26,027 test-day records) and 3,258 second-parity DPBB cows (20,274 test-day records) distributed in 124 herds in the Walloon Region of Belgium. Genotypes for 28,266 SNPs located on 29 Bos taurus autosomes (BTA) from 1,699 animals were used. Random regression test-day models were used to estimate genetic parameters through the Bayesian Gibbs sampling method. The SNP solutions were estimated using a single-step genomic BLUP approach. The proportion of the total additive genetic variance explained by windows of 25 consecutive SNPs (with an average size of ∼2 Mb) was calculated, and regions accounting for at least 1.0% of the total additive genetic variance were used to search for candidate genes. Heritability estimates for the included CMP ranged from 0.19 (CC) to 0.50 (MTA) in the first parity and from 0.24 (CC) to 0.41 (MTA) in the second parity. The genetic correlation estimated between CT and a30 varied from -0.61 to -0.41 and from -0.55 to -0.38 in the first and second lactations, respectively. Negative genetic correlations were found between CT and milk yield and composition, while those estimated between curd firmness and milk composition were positive. Genome-wide association analyses identified 4 genomic regions (BTA1, BTA3, BTA7, and BTA11) associated with the considered CMP. The identified genomic regions showed contrasting results between parities and among the different stages of each parity. This suggests that different sets of candidate genes underlie the phenotypic expression of the considered CMP between parities and lactation stages of each parity. The findings of this study can be used for future implementation and use of genomic evaluation to improve the cheese-making traits in DPBB cows.
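The window rule described in the abstract (25 consecutive SNPs, flagged when they explain at least 1.0% of the total additive genetic variance) can be sketched as follows. This is a simplified illustration, not the authors' single-step pipeline: it assumes a per-SNP variance contribution is already available and additive across SNPs, uses non-overlapping windows, and all names and toy values are hypothetical.

```python
def window_variance_shares(snp_variances, window_size=25):
    """Return (start_index, share_of_total_variance) for consecutive SNP windows."""
    total = sum(snp_variances)
    shares = []
    for start in range(0, len(snp_variances), window_size):
        window = snp_variances[start:start + window_size]
        shares.append((start, sum(window) / total))
    return shares

def flag_windows(shares, threshold=0.01):
    """Keep the start index of every window at or above the variance threshold."""
    return [start for start, share in shares if share >= threshold]

# Toy data (assumed): 5000 SNPs with equal small contributions,
# plus one large-effect SNP at index 60.
snp_variances = [1.0] * 5000
snp_variances[60] = 101.0

shares = window_variance_shares(snp_variances)
flagged = flag_windows(shares)
print(flagged)
```

With equal contributions each window explains about 0.5% of the variance, so only the window containing the large-effect SNP clears the 1.0% threshold.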
Affiliation(s)
- H Atashi
  - TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, 5030 Gembloux, Belgium
  - Department of Animal Science, Shiraz University, 71441-65186 Shiraz, Iran
- C Bastin
  - Walloon Breeders Association, 5590 Ciney, Belgium
- H Wilmot
  - TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, 5030 Gembloux, Belgium
  - National Fund for Scientific Research (FRS-FNRS), Rue d'Egmont 5, B-1000 Brussels, Belgium
- S Vanderick
  - TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, 5030 Gembloux, Belgium
- X Hubin
  - Walloon Breeders Association, 5590 Ciney, Belgium
- N Gengler
  - TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, 5030 Gembloux, Belgium

7
Hardner CM, Fikere M, Gasic K, da Silva Linge C, Worthington M, Byrne D, Rawandoozi Z, Peace C. Multi-environment genomic prediction for soluble solids content in peach (Prunus persica). Front Plant Sci 2022; 13:960449. [PMID: 36275520 PMCID: PMC9583944 DOI: 10.3389/fpls.2022.960449] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/03/2022] [Accepted: 08/01/2022] [Indexed: 06/16/2023]
Abstract
Genotype-by-environment interaction (G × E) is a common phenomenon influencing genetic improvement in plants, and a good understanding of this phenomenon is important for breeding and cultivar deployment strategies. However, there is little information on G × E in horticultural tree crops, mostly due to evaluation costs, leading to a focus on the development and deployment of locally adapted germplasm. Using sweetness (measured as soluble solids content, SSC) in peach/nectarine assessed at four trials from three US peach-breeding programs as a case study, we evaluated the hypotheses that (i) complex data from multiple breeding programs can be connected using GBLUP models to improve the knowledge of G × E for breeding and deployment and (ii) accounting for a known large-effect quantitative trait locus (QTL) improves the prediction accuracy. Following a structured strategy using univariate and multivariate models containing additive and dominance genomic effects on SSC, a model that included a previously detected QTL and background genomic effects was a significantly better fit than a genome-wide model with completely anonymous markers. Estimates of an individual's narrow-sense and broad-sense heritability for SSC were high (0.57-0.73 and 0.66-0.80, respectively), with 19-32% of total genomic variance explained by the QTL. Genome-wide dominance effects and QTL effects were stable across environments. Significant G × E was detected for background genome effects, mostly due to the low correlation of these effects across seasons within a particular trial. The expected prediction accuracy, estimated from the linear model, was higher than the realised prediction accuracy estimated by cross-validation, suggesting that these two parameters measure different qualities of the prediction models. While prediction accuracy was improved in some cases by combining data across trials, particularly when phenotypic data for untested individuals were available from other trials, this improvement was not consistent. This study confirms that complex data can be combined into a single analysis using GBLUP methods to improve understanding of G × E and also incorporate known QTL effects. In addition, the study generated baseline information to account for population structure in genomic prediction models in horticultural crop improvement.
Affiliation(s)
- Craig M. Hardner
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Mulusew Fikere
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Ksenija Gasic
  - Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
- Cassia da Silva Linge
  - Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States
- Margaret Worthington
  - Faculty Horticulture, University of Arkansas System Division of Agriculture, Fayetteville, AR, United States
- David Byrne
  - College of Agriculture and Life Sciences, Texas A&M University, College Station, TX, United States
- Zena Rawandoozi
  - College of Agriculture and Life Sciences, Texas A&M University, College Station, TX, United States
- Cameron Peace
  - Department of Horticulture, Washington State University, Pullman, WA, United States

8
Flutre T, Le Cunff L, Fodor A, Launay A, Romieu C, Berger G, Bertrand Y, Terrier N, Beccavin I, Bouckenooghe V, Roques M, Pinasseau L, Verbaere A, Sommerer N, Cheynier V, Bacilieri R, Boursiquot JM, Lacombe T, Laucou V, This P, Péros JP, Doligez A. A genome-wide association and prediction study in grapevine deciphers the genetic architecture of multiple traits and identifies genes under many new QTLs. G3 (Bethesda) 2022; 12:6575896. [PMID: 35485948 PMCID: PMC9258538 DOI: 10.1093/g3journal/jkac103] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 01/21/2022] [Accepted: 04/21/2022] [Indexed: 12/11/2022]
Abstract
To cope with the challenges facing agriculture, speeding up breeding programs is a worthy endeavor, especially for perennial species such as grapevine, but it requires understanding the genetic architecture of target traits. To go beyond the mapping of quantitative trait loci (QTLs) in bi-parental crosses, we exploited a diversity panel of 279 Vitis vinifera L. cultivars planted in 5 blocks in the vineyard. This panel was phenotyped over several years for 127 traits including yield components, organic acids, aroma precursors, polyphenols, and a water stress indicator. The panel was genotyped for 63k single nucleotide polymorphisms (SNPs) by combining an 18K microarray and genotyping-by-sequencing. The experimental design allowed reliable assessment of the genotypic values for most traits. Marker densification via genotyping-by-sequencing markedly increased the proportion of genetic variance explained by SNPs, and 2 multi-SNP models identified QTLs not found by a SNP-by-SNP model. Overall, 489 reliable QTLs were detected for 41% more response variables than by a SNP-by-SNP model with microarray-only SNPs, many of them new compared with the results from bi-parental crosses. A prediction accuracy higher than 0.42 was obtained for 50% of the response variables. Our overall approach, as well as the QTL and prediction results, provides insights into the genetic architecture of target traits. New candidate genes and the application to breeding are discussed.
Affiliation(s)
- Timothée Flutre
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE-Le Moulon, 91190 Gif-sur-Yvette, France
- Loïc Le Cunff
  - UMT Géno-Vigne, 34398 Montpellier, France
  - IFV, 30240 Le Grau-du-Roi, France
- Agota Fodor
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Amandine Launay
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Charles Romieu
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Gilles Berger
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Yves Bertrand
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Nancy Terrier
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
- Maryline Roques
  - UMT Géno-Vigne, 34398 Montpellier, France
  - IFV, 30240 Le Grau-du-Roi, France
- Lucie Pinasseau
  - SPO, Univ Montpellier, INRAE, Institut Agro, 34060 Montpellier, France
- Arnaud Verbaere
  - SPO, Univ Montpellier, INRAE, Institut Agro, 34060 Montpellier, France
- Nicolas Sommerer
  - SPO, Univ Montpellier, INRAE, Institut Agro, 34060 Montpellier, France
- Roberto Bacilieri
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Jean-Michel Boursiquot
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Thierry Lacombe
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Valérie Laucou
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Patrice This
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Jean-Pierre Péros
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France
- Agnès Doligez
  - AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398 Montpellier, France
  - UMT Géno-Vigne, 34398 Montpellier, France

9
Breen EJ, MacLeod IM, Ho PN, Haile-Mariam M, Pryce JE, Thomas CD, Daetwyler HD, Goddard ME. BayesR3 enables fast MCMC blocked processing for largescale multi-trait genomic prediction and QTN mapping analysis. Commun Biol 2022; 5:661. [PMID: 35790806 PMCID: PMC9256732 DOI: 10.1038/s42003-022-03624-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/31/2021] [Accepted: 06/22/2022] [Indexed: 01/26/2023]
Abstract
Bayesian methods, such as BayesR, for predicting the genetic value or risk of individuals from their genotypes, such as single nucleotide polymorphisms (SNPs), are often implemented using a Markov chain Monte Carlo (MCMC) process. However, the generation of Markov chains is computationally slow. We introduce a form of blocked Gibbs sampling for estimating SNP effects from Markov chains that greatly reduces computational time by sampling each SNP effect iteratively n times from conditional block posteriors. Subsequent iteration over all blocks m times produces chains of length m × n. We use this strategy to solve large-scale genomic prediction and fine-mapping problems using the Bayesian MCMC mixed-effects genetic model, BayesR3. We validate the method using simulated data, followed by analysis of empirical dairy cattle data using high-dimensional milk mid-infrared spectra as an example of “omics” data, and show its use to increase the precision of mapping variants affecting milk, fat, and protein yields relative to a univariate analysis of milk, fat, and protein. BayesR3 samples the polymorphisms affecting complex traits at reduced computational cost to predict the genetic value, breeding value, or individual risk of genotypes.
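The sampling order described in the abstract (n draws per SNP within a block, repeated over all blocks m times, giving chains of length m × n) can be sketched as below. This is not BayesR3: the sketch assumes a single normal prior on SNP effects with known variances instead of the BayesR mixture, and it updates the full residual vector at every draw, so it reproduces only the loop structure and not the computational shortcut. All names, parameters, and toy data are hypothetical.

```python
import random

def blocked_gibbs(X, y, block_size, m_outer, n_inner, var_e=1.0, var_b=1.0, seed=0):
    """Gibbs sampler for y = Xb + e with b_j ~ N(0, var_b), visited block by block."""
    rng = random.Random(seed)
    n_rec, n_snp = len(X), len(X[0])
    lam = var_e / var_b                      # shrinkage from the normal prior
    b = [0.0] * n_snp
    resid = list(y)                          # equals y - Xb while b is all zeros
    xtx = [sum(X[i][j] ** 2 for i in range(n_rec)) for j in range(n_snp)]
    chains = [[] for _ in range(n_snp)]
    blocks = [range(s, min(s + block_size, n_snp)) for s in range(0, n_snp, block_size)]
    for _outer in range(m_outer):            # m passes over all blocks
        for block in blocks:
            for _inner in range(n_inner):    # n draws per SNP inside the block
                for j in block:
                    # Right-hand side with SNP j's current effect added back.
                    rhs = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n_rec))
                    denom = xtx[j] + lam
                    new_b = rng.gauss(rhs / denom, (var_e / denom) ** 0.5)
                    delta = new_b - b[j]
                    for i in range(n_rec):
                        resid[i] -= X[i][j] * delta
                    b[j] = new_b
                    chains[j].append(new_b)
    return chains

# Toy genotypes (0/1/2 allele counts) and phenotypes, assumed for illustration.
X = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
y = [1.0, 0.5, 2.0, 1.5]
chains = blocked_gibbs(X, y, block_size=2, m_outer=3, n_inner=4)
print([len(chain) for chain in chains])
```

With m = 3 outer passes and n = 4 inner draws, every SNP ends with a chain of 3 × 4 = 12 samples, matching the m × n chain length stated in the abstract.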

10
Wolc A, Dekkers JCM. Application of Bayesian genomic prediction methods to genome-wide association analyses. Genet Sel Evol 2022; 54:31. [PMID: 35562659 PMCID: PMC9103490 DOI: 10.1186/s12711-022-00724-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022]
Abstract
Background: Bayesian genomic prediction methods were developed to simultaneously fit all genotyped markers to a set of available phenotypes for prediction of breeding values for quantitative traits, allowing for differences in the genetic architecture (distribution of marker effects) of traits. These methods also provide a flexible and reliable framework for genome-wide association (GWA) studies. The objective here was to review developments in Bayesian hierarchical and variable selection models for GWA analyses. Results: By fitting all genotyped markers simultaneously, Bayesian GWA methods implicitly account for population structure and the multiple-testing problem of classical single-marker GWA. Implemented using Markov chain Monte Carlo methods, Bayesian GWA methods allow for control of error rates using probabilities obtained from posterior distributions. Power of GWA studies using Bayesian methods can be enhanced by using informative priors based on previous association studies, gene expression analyses, or functional annotation information. Applied to multiple traits, Bayesian GWA analyses can give insight into pleiotropic effects by multi-trait, structural equation, or graphical models. Bayesian methods can also be used to combine genomic, transcriptomic, proteomic, and other -omics data to infer causal genotype to phenotype relationships and to suggest external interventions that can improve performance. Conclusions: Bayesian hierarchical and variable selection methods provide a unified and powerful framework for genomic prediction, GWA, integration of prior information, and integration of information from other -omics platforms to identify causal mutations for complex quantitative traits.
Affiliation(s)
- Anna Wolc
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA; Hy-Line International, 2583 240th Street, Dallas Center, IA, 50063, USA
- Jack C M Dekkers
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA.
11
Shen F, Bianco L, Wu B, Tian Z, Wang Y, Wu T, Xu X, Han Z, Velasco R, Fontana P, Zhang X. A bulked segregant analysis tool for out-crossing species (BSATOS) and QTL-based genomics-assisted prediction of complex traits in apple. J Adv Res 2022; 42:149-162. [PMID: 36513410 PMCID: PMC9788957 DOI: 10.1016/j.jare.2022.03.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/13/2022] [Revised: 03/06/2022] [Accepted: 03/22/2022] [Indexed: 12/27/2022]
Abstract
INTRODUCTION Genomic heterozygosity, self-incompatibility, and abundant somatic mutations hinder the molecular breeding efficiency of outcrossing plants. OBJECTIVES We attempted to develop an efficient integrated strategy to identify quantitative trait loci (QTLs) and trait-associated genes, to develop gene markers, and to construct genomics-assisted prediction (GAP) models. METHODS A novel protocol, bulked segregant analysis tool for out-crossing species (BSATOS), is presented here, which takes full advantage of all segregation patterns (including AB × AB markers) and haplotype information. To verify the effectiveness of the protocol in dealing with the complex traits of outbreeding species, three apple cross populations with 9,654 individuals were used. RESULTS Using BSATOS, 90, 60, and 77 significant QTLs were identified and candidate genes were predicted for apple fruit weight (FW), fruit ripening date (FRD), and fruit soluble solid content (SSC), respectively. Gene-based markers were developed and genotyped for 1,396 individuals in a training population, including 145 Malus accessions and 1,251 F1 plants of the three full-sib families. GAP models were trained using marker genotype effect estimates of the training population. The prediction accuracy was 0.7658, 0.6455, and 0.3758 for FW, FRD, and SSC, respectively. CONCLUSION The BSATOS and GAP models provided a convenient and efficient methodology for candidate gene mining and molecular breeding in out-crossing plant species. The BSATOS pipeline can be freely downloaded from: https://github.com/maypoleflyn/BSATOS.
Affiliation(s)
- Fei Shen
- College of Horticulture, China Agricultural University, Beijing 100193, China; Research and Innovation Center, Edmund Mach Foundation, 38010 S. Michele all’Adige, Italy; Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Luca Bianco
- Research and Innovation Center, Edmund Mach Foundation, 38010 S. Michele all’Adige, Italy
- Bei Wu
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Zhendong Tian
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Yi Wang
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Ting Wu
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Xuefeng Xu
- College of Horticulture, China Agricultural University, Beijing 100193, China
- Zhenhai Han
- College of Horticulture, China Agricultural University, Beijing 100193, China (corresponding author)
- Riccardo Velasco
- Research Centre for Viticulture and Enology, CREA, Conegliano, Italy
- Paolo Fontana
- Research and Innovation Center, Edmund Mach Foundation, 38010 S. Michele all’Adige, Italy (corresponding author)
- Xinzhong Zhang
- College of Horticulture, China Agricultural University, Beijing 100193, China (corresponding author)
12
Brault C, Doligez A, Cunff L, Coupel-Ledru A, Simonneau T, Chiquet J, This P, Flutre T. Harnessing multivariate, penalized regression methods for genomic prediction and QTL detection of drought-related traits in grapevine. G3-GENES GENOMES GENETICS 2021; 11:6325507. [PMID: 34544146 PMCID: PMC8496232 DOI: 10.1093/g3journal/jkab248] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/11/2021] [Accepted: 07/02/2021] [Indexed: 11/13/2022]
Abstract
Viticulture has to cope with climate change and to decrease pesticide inputs, while maintaining yield and wine quality. Breeding is a key lever to meet this challenge, and genomic prediction a promising tool to accelerate breeding programs. Multivariate methods are potentially more accurate than univariate ones. Moreover, some prediction methods also provide marker selection, thus allowing quantitative trait loci (QTLs) detection and the identification of positional candidate genes. To study both genomic prediction and QTL detection for drought-related traits in grapevine, we applied several methods, interval mapping (IM) as well as univariate and multivariate penalized regression, in a bi-parental progeny. With a dense genetic map, we simulated two traits under four QTL configurations. The simulation results favored the penalized regression method Elastic Net (EN) for genomic prediction, and controlling the marginal False Discovery Rate on EN-selected markers to prioritize the QTLs. Indeed, penalized methods were more powerful than IM for QTL detection across various genetic architectures. Multivariate prediction did not perform better than its univariate counterpart, despite strong genetic correlation between traits. Using 14 traits measured in semi-controlled conditions under different watering conditions, penalized regression methods proved very efficient for intra-population prediction whatever the genetic architecture of the trait, with predictive abilities reaching 0.68. Compared to a previous study on the same traits, these methods applied on a denser map found new QTLs controlling traits linked to drought tolerance and provided relevant candidate genes. Overall, these findings provide a strong evidence base for implementing genomic prediction in grapevine breeding.
Affiliation(s)
- Charlotte Brault
- Institut Français de la Vigne et du Vin, Montpellier F-34398, France; UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier F-34398, France; UMT Geno-Vigne®, IFV-INRAE-Institut Agro, Montpellier F-34398, France
- Agnès Doligez
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier F-34398, France; UMT Geno-Vigne®, IFV-INRAE-Institut Agro, Montpellier F-34398, France
- Le Cunff
- Institut Français de la Vigne et du Vin, Montpellier F-34398, France; UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier F-34398, France; UMT Geno-Vigne®, IFV-INRAE-Institut Agro, Montpellier F-34398, France
- Aude Coupel-Ledru
- LEPSE, Univ Montpellier, INRAE, Institut Agro, Montpellier 34000, France
- Thierry Simonneau
- LEPSE, Univ Montpellier, INRAE, Institut Agro, Montpellier 34000, France
- Patrice This
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier F-34398, France; UMT Geno-Vigne®, IFV-INRAE-Institut Agro, Montpellier F-34398, France
- Timothée Flutre
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE-Le Moulon, Gif-sur-Yvette 91190, France
13
Alam MJ, Mydam J, Hossain MR, Islam SMS, Mollah MNH. Robust regression based genome-wide multi-trait QTL analysis. Mol Genet Genomics 2021; 296:1103-1119. [PMID: 34170407 DOI: 10.1007/s00438-021-01801-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2020] [Accepted: 06/01/2021] [Indexed: 10/21/2022]
Abstract
In genome-wide quantitative trait locus (QTL) mapping studies, multiple quantitative traits are often measured along with the marker genotypes. Multi-trait QTL (MtQTL) analysis, which includes multiple quantitative traits together in a single model, is an efficient technique to increase the power of QTL identification. The two most widely used classical approaches for MtQTL mapping are Gaussian Mixture Model-based MtQTL (GMM-MtQTL) and Linear Regression Model-based MtQTL (LRM-MtQTL) analyses. There are two types of LRM-MtQTL approach known as least squares-based LRM-MtQTL (LS-LRM-MtQTL) and maximum likelihood-based LRM-MtQTL (ML-LRM-MtQTL). These three classical approaches are equivalent alternatives for QTL detection, but ML-LRM-MtQTL is computationally faster than GMM-MtQTL and LS-LRM-MtQTL. However, one major limitation common to all the above classical approaches is that they are very sensitive to outliers, which leads to misleading results. Therefore, in this study, we developed an LRM-based robust MtQTL approach, called LRM-RobMtQTL, for the backcross population based on the robust estimation of regression parameters by maximizing the β-likelihood function induced from the β-divergence with multivariate normal distribution. When β = 0, the proposed LRM-RobMtQTL method reduces to the classical ML-LRM-MtQTL approach. Simulation studies showed that both ML-LRM-MtQTL and LRM-RobMtQTL methods identified the same QTL positions in the absence of outliers. However, in the presence of outliers, only the proposed method was able to identify all the true QTL positions. Real data analysis results revealed that in the presence of outliers only our LRM-RobMtQTL approach can identify all the QTL positions as those identified in the absence of outliers by both methods. We conclude that our proposed LRM-RobMtQTL analysis approach outperforms the classical MtQTL analysis methods.
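As a hedged illustration of why β = 0 recovers the classical maximum-likelihood approach, the following writes the β-likelihood in a form common in the β-divergence literature. The symbols f, θ, y_i, and n are assumptions introduced here for illustration, and the paper's exact notation and normalization may differ.

```latex
% Assumed notation: f(y; \theta) is the multivariate normal density of the
% trait vector y under the regression model with parameters \theta, and
% y_1, ..., y_n are the observed trait vectors.
L_\beta(\theta)
  = \frac{1}{n} \sum_{i=1}^{n} \frac{f(y_i;\theta)^{\beta} - 1}{\beta}
    \;-\; \frac{1}{1+\beta} \int f(y;\theta)^{1+\beta} \, dy ,
  \qquad \beta > 0 .

% Limit as \beta \to 0: since (f^{\beta} - 1)/\beta \to \log f and
% \int f(y;\theta)\,dy = 1,
\lim_{\beta \to 0} L_\beta(\theta)
  = \frac{1}{n} \sum_{i=1}^{n} \log f(y_i;\theta) \;-\; 1 .
```

In this form the limit differs from the average log-likelihood only by a constant, so the maximizer coincides with the maximum-likelihood estimate. For β > 0, each observation's contribution to the estimating equation is weighted by f(y_i; θ)^β, which is small for outlying observations; this is the usual explanation for the robustness of such estimators.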
Affiliation(s)
- Md Jahangir Alam
- Bioinformatics Laboratory, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
- Janardhan Mydam
- Division of Neonatology, Department of Pediatrics, John H. Stroger, Jr. Hospital of Cook County, 1969 Ogden Avenue, Chicago, IL, 60612, USA
- Department of Pediatrics, Rush Medical Center, Chicago, USA
- Md Ripter Hossain
- Bioinformatics Laboratory, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh
- S M Shahinul Islam
- Institute of Biological Science, University of Rajshahi, Rajshahi, 6205, Bangladesh
- Md Nurul Haque Mollah
- Bioinformatics Laboratory, Department of Statistics, University of Rajshahi, Rajshahi, 6205, Bangladesh.
14
Meuwissen T, van den Berg I, Goddard M. On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL. Genet Sel Evol 2021; 53:19. [PMID: 33637049 PMCID: PMC7908738 DOI: 10.1186/s12711-021-00607-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/03/2020] [Accepted: 01/25/2021] [Indexed: 11/10/2022]
Abstract
Background Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method, that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions and assess its quantitative trait locus (QTL) mapping precision. Methods The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs, which are, a priori, most likely to be included into the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. Results The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. 
Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Conclusions Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
Affiliation(s)
- Theo Meuwissen
- Norwegian University of Life Sciences, Box 5003, 1432, Ås, Norway.
- Mike Goddard
- Agriculture Victoria, Bundoora, Australia; Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, Australia
15
Xiang R, MacLeod IM, Daetwyler HD, de Jong G, O’Connor E, Schrooten C, Chamberlain AJ, Goddard ME. Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations. Nat Commun 2021; 12:860. [PMID: 33558518 PMCID: PMC7870883 DOI: 10.1038/s41467-021-21001-0] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 02/18/2020] [Accepted: 11/23/2020] [Indexed: 02/08/2023]
Abstract
The difficulty in finding causative mutations has hampered their use in genomic prediction. Here, we present a methodology to fine-map potentially causal variants genome-wide by integrating the functional, evolutionary and pleiotropic information of variants using GWAS, variant clustering and Bayesian mixture models. Our analysis of 17 million sequence variants in 44,000+ Australian dairy cattle for 34 traits suggests, on average, one pleiotropic QTL existing in each 50 kb chromosome-segment. We selected a set of 80k variants representing potentially causal variants within each chromosome segment to develop a bovine XT-50K genotyping array. The custom array contains many pleiotropic variants with biological functions, including splicing QTLs and variants at conserved sites across 100 vertebrate species. This biology-informed custom array outperformed the standard array in predicting genetic value of multiple traits across populations in independent datasets of 90,000+ dairy cattle from the USA, Australia and New Zealand.
Affiliation(s)
- Ruidong Xiang
- Faculty of Veterinary and Agricultural Science, The University of Melbourne, Parkville, VIC, Australia; Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
- Iona M. MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
- Hans D. Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
- Amanda J. Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
- Michael E. Goddard
- Faculty of Veterinary and Agricultural Science, The University of Melbourne, Parkville, VIC, Australia; Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
16
Fernandes SB, Zhang KS, Jamann TM, Lipka AE. How Well Can Multivariate and Univariate GWAS Distinguish Between True and Spurious Pleiotropy? Front Genet 2021; 11:602526. [PMID: 33584799 PMCID: PMC7873880 DOI: 10.3389/fgene.2020.602526] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/03/2020] [Accepted: 12/11/2020] [Indexed: 11/13/2022]
Abstract
Quantification of the simultaneous contributions of loci to multiple traits, a phenomenon called pleiotropy, is facilitated by the increased availability of high-throughput genotypic and phenotypic data. To understand the prevalence and nature of pleiotropy, the ability of multivariate and univariate genome-wide association study (GWAS) models to distinguish between pleiotropic and non-pleiotropic loci in linkage disequilibrium (LD) first needs to be evaluated. Therefore, we used publicly available maize and soybean genotypic data to simulate multiple pairs of traits that were either (i) controlled by quantitative trait nucleotides (QTNs) on separate chromosomes, (ii) controlled by QTNs in various degrees of LD with each other, or (iii) controlled by a single pleiotropic QTN. We showed that multivariate GWAS could not distinguish between QTNs in LD and a single pleiotropic QTN. In contrast, a unique QTN detection rate pattern was observed for univariate GWAS whenever the simulated QTNs were in high LD or pleiotropic. Collectively, these results suggest that multivariate and univariate GWAS should both be used to infer whether or not causal mutations underlying peak GWAS associations are pleiotropic. Therefore, we recommend that future studies use a combination of multivariate and univariate GWAS models, as both models could be useful for identifying and narrowing down candidate loci with potential pleiotropic effects for downstream biological experiments.
Affiliation(s)
- Samuel B. Fernandes
- Department of Crop Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States
- Alexander E. Lipka
- Department of Crop Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States
17
Klápště J, Dungey HS, Telfer EJ, Suontama M, Graham NJ, Li Y, McKinley R. Marker Selection in Multivariate Genomic Prediction Improves Accuracy of Low Heritability Traits. Front Genet 2020; 11:499094. [PMID: 33193595 PMCID: PMC7662070 DOI: 10.3389/fgene.2020.499094] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/19/2019] [Accepted: 09/18/2020] [Indexed: 11/13/2022]
Abstract
Multivariate analysis using mixed models allows for the exploration of genetic correlations between traits. Additionally, the transition to a genomic based approach is simplified by substituting classic pedigrees with a marker-based relationship matrix. It also enables the investigation of correlated responses to selection, trait integration and modularity in different kinds of populations. This study investigated a strategy for the construction of a marker-based relationship matrix that prioritized markers using Partial Least Squares. The efficiency of this strategy was found to depend on the correlation structure between investigated traits. In terms of accuracy, we found no benefit of this strategy compared with the all-marker-based multivariate model for the primary trait of diameter at breast height (DBH) in a radiata pine (Pinus radiata) population, possibly due to the presence of strong and well-estimated correlation with other highly heritable traits. Conversely, we did see benefit in a shining gum (Eucalyptus nitens) population, where the primary trait had low or only moderate genetic correlation with other low/moderately heritable traits. Marker selection in multivariate analysis can therefore be an efficient strategy to improve prediction accuracy for low heritability traits due to improved precision in poorly estimated low/moderate genetic correlations. Additionally, our study identified the genetic diversity as a factor contributing to the efficiency of marker selection in multivariate approaches due to higher precision of genetic correlation estimates.
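To make "a marker-based relationship matrix" concrete, here is a minimal Python sketch of one widely used construction, in which allele counts are centered by twice the allele frequency and scaled by 2Σp(1−p). The abstract does not say which formula the authors used, so this choice is an assumption. The function name and the `marker_subset` argument are hypothetical, and the subset simply stands in for whatever markers a prioritization step (such as the Partial Least Squares ranking mentioned above) would retain.

```python
def genomic_relationship_matrix(genotypes, marker_subset=None):
    """Build a marker-based relationship matrix (illustrative sketch).

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2).
    marker_subset: optional list of marker column indices to use
                   (hypothetical stand-in for a marker-prioritization step).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one individual is required")
    if marker_subset is None:
        columns = list(range(len(genotypes[0])))
    else:
        columns = list(marker_subset)

    # Allele frequency of the counted allele at each retained marker.
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in columns]

    # Center each allele count by its expected value 2p.
    centered = [
        [ind[j] - 2.0 * p for j, p in zip(columns, freqs)]
        for ind in genotypes
    ]

    # Scaling factor 2 * sum(p * (1 - p)) over retained markers.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all retained markers are monomorphic")

    # G = Z Z' / scale, computed entry by entry.
    return [
        [sum(a * b for a, b in zip(row_a, row_b)) / scale for row_b in centered]
        for row_a in centered
    ]


# Toy data: three individuals genotyped at four markers.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 0],
]
G_all = genomic_relationship_matrix(toy_genotypes)
G_subset = genomic_relationship_matrix(toy_genotypes, marker_subset=[0, 2])
```

Because the allele frequencies are estimated from the same individuals, every row of the resulting matrix sums to zero, which is a quick sanity check on the centering step.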
Affiliation(s)
- Jaroslav Klápště
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Heidi S Dungey
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Emily J Telfer
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Mari Suontama
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand; Skogforsk, Umeå, Sweden
- Natalie J Graham
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
- Yongjun Li
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand; Agriculture Victoria, AgriBio Center, Bundoora, VIC, Australia
- Russell McKinley
- Scion (New Zealand Forest Research Institute Ltd.), Rotorua, New Zealand
18
Stolpovsky YA, Piskunov AK, Svishcheva GR. Genomic Selection. I: Latest Trends and Possible Ways of Development. RUSS J GENET+ 2020. [DOI: 10.1134/s1022795420090148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
19
Genetic Basis of Maize Resistance to Multiple Insect Pests: Integrated Genome-Wide Comparative Mapping and Candidate Gene Prioritization. Genes (Basel) 2020; 11:genes11060689. [PMID: 32599710 PMCID: PMC7349181 DOI: 10.3390/genes11060689] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/19/2020] [Revised: 05/30/2020] [Accepted: 06/01/2020] [Indexed: 01/01/2023]
Abstract
Several species of herbivores feed on maize in field and storage setups, making the development of multiple insect resistance a critical breeding target. In this study, an association mapping panel of 341 tropical maize lines was evaluated in three field environments for resistance to fall armyworm (FAW), whilst bulked grains were subjected to a maize weevil (MW) bioassay and genotyped with Diversity Array Technology's single nucleotide polymorphism (SNP) markers. A multi-locus genome-wide association study (GWAS) revealed 62 quantitative trait nucleotides (QTNs) associated with FAW and MW resistance traits on all 10 maize chromosomes, of which 47 and 31 were discovered at stringent Bonferroni genome-wide significance levels of 0.05 and 0.01, respectively, and located within or close to multiple insect resistance genomic regions (MIRGRs) concerning FAW, SB, and MW. Sixteen QTNs influenced multiple traits, of which six were associated with resistance to both FAW and MW, suggesting a pleiotropic genetic control. Functional prioritization of candidate genes (CGs) located within 10-30 kb of the QTNs revealed 64 putative GWAS-based CGs (GbCGs) showing evidence of involvement in plant defense mechanisms. Only one GbCG was associated with each of five of the six combined resistance QTNs, thus reinforcing the pleiotropy hypothesis. In addition, through in silico co-functional network inferences, an additional 107 network-based CGs (NbCGs), biologically connected to the 64 GbCGs, and differentially expressed under biotic or abiotic stress, were revealed within MIRGRs. The provided multiple insect resistance physical map should contribute to the development of combined insect resistance in maize.
20
Melo D, Marroig G, Wolf JB. Genomic Perspective on Multivariate Variation, Pleiotropy, and Evolution. J Hered 2020; 110:479-493. [PMID: 30986303 DOI: 10.1093/jhered/esz011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/30/2018] [Accepted: 02/13/2019] [Indexed: 11/14/2022]
Abstract
Multivariate quantitative genetics provides a powerful framework for understanding patterns and processes of phenotypic evolution. Quantitative genetics parameters, like trait heritability or the G-matrix for sets of traits, can be used to predict evolutionary response or to understand the evolutionary history of a population. These population-level approaches have proven to be extremely successful, but the underlying genetics of multivariate variation and evolutionary change typically remain a black box. Establishing a deeper empirical understanding of how individual genetic effects lead to genetic (co)variation is then crucial to our understanding of the evolutionary process. To delve into this black box, we exploit an experimental population of mice composed from lineages derived by artificial selection. We develop an approach to estimate the multivariate effect of loci and characterize these vectors of effects in terms of their magnitude and alignment with the direction of evolutionary divergence. Using these estimates, we reconstruct the traits in the ancestral populations and quantify how much of the divergence is due to genetic effects. Finally, we also use these vectors to decompose patterns of genetic covariation and examine the relationship between these components and the corresponding distribution of pleiotropic effects. We find that additive effects are much larger than dominance effects and are more closely aligned with the direction of selection and divergence, with larger effects being more aligned than smaller effects. Pleiotropic effects are highly variable but are, on average, modular. These results are consistent with pleiotropy being partly shaped by selection while reflecting underlying developmental constraints.
Affiliation(s)
- Diogo Melo
- Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, Brasil
- Gabriel Marroig
- Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, Brasil
- Jason B Wolf
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, UK
21
Improved polygenic prediction by Bayesian multiple regression on summary statistics. Nat Commun 2019; 10:5086. [PMID: 31704910 PMCID: PMC6841727 DOI: 10.1038/s41467-019-12653-0] [Citation(s) in RCA: 287] [Impact Index Per Article: 47.8] [Received: 01/30/2019] [Accepted: 08/30/2019] [Indexed: 01/21/2023]
Abstract
Accurate prediction of an individual’s phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to one that utilises summary statistics from genome-wide association studies (GWAS), SBayesR. In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding. Various approaches are being used for polygenic prediction including Bayesian multiple regression methods that require access to individual-level genotype data. Here, the authors extend BayesR to utilise GWAS summary statistics (SBayesR) and show that it outperforms other summary statistic-based methods.
22
Mehrban H, Lee DH, Naserkheil M, Moradi MH, Ibáñez-Escriche N. Comparison of conventional BLUP and single-step genomic BLUP evaluations for yearling weight and carcass traits in Hanwoo beef cattle using single trait and multi-trait models. PLoS One 2019; 14:e0223352. [PMID: 31609979 PMCID: PMC6791548 DOI: 10.1371/journal.pone.0223352] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/09/2019] [Accepted: 09/19/2019] [Indexed: 11/20/2022]
Abstract
Hanwoo, an important indigenous and popular breed of beef cattle in Korea, shows rapid growth and has high meat quality. Its yearling weight (YW) and carcass traits (backfat thickness, carcass weight [CW], eye muscle area, and marbling score) are economically important for selection of young and proven bulls. However, measuring carcass traits is difficult and expensive, and can only be performed postmortem. Genomic selection has become an appealing procedure for genetic evaluation of these traits (by inclusion of the genomic data) along with the possibility of multi-trait analysis. The aim of this study was to compare conventional best linear unbiased prediction (BLUP) and single-step genomic BLUP (ssGBLUP) methods, using both single-trait (ST-BLUP, ST-ssGBLUP) and multi-trait (MT-BLUP, MT-ssGBLUP) models to investigate the improvement of breeding-value accuracy for carcass traits and YW. The data comprised 15,279 phenotypic records for YW and 5,824 records for carcass traits, and 1,541 genotyped animals for 34,479 single-nucleotide polymorphisms. Accuracy for each trait and model was estimated only for genotyped animals by five-fold cross-validation. ssGBLUP models (ST-ssGBLUP and MT-ssGBLUP) showed ~19% and ~36% greater accuracy than conventional BLUP models (ST-BLUP and MT-BLUP) for YW and carcass traits, respectively. Within ssGBLUP models, the accuracy of the genomically estimated breeding value for CW increased by 19% when ST-ssGBLUP was replaced with the MT-ssGBLUP model, as the inclusion of YW in the analysis led to a strong genetic correlation with CW (0.76). For backfat thickness, eye muscle area, and marbling score, ST- and MT-ssGBLUP models yielded similar accuracy. Thus, combining pedigree and genomic data via the ssGBLUP model may be a promising way to ensure acceptable accuracy of predictions, especially among young animals, for ongoing Hanwoo cattle breeding programs.
MT-ssGBLUP is highly recommended when phenotypic records are limited for one of the two highly correlated genetic traits.
Collapse
Affiliation(s)
- Hossein Mehrban
- Department of Animal Sciences, Shahrekord University, Shahrekord, Iran
- Deuk Hwan Lee
- Department of Animal Life and Environment Sciences, Hankyong National University, Jungang-ro 327, Anseong-si, Gyeonggi-do, Korea
- Masoumeh Naserkheil
- Department of Animal Sciences, University College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran
- Mohammad Hossein Moradi
- Department of Animal Sciences, Faculty of Agriculture and Natural Resources, Arak University, Arak, Iran
- Noelia Ibáñez-Escriche
- Institute for Animal Science and Technology, Universitat Politècnica de València, València, Spain
|
23
|
Liu X, Wang H, Hu X, Li K, Liu Z, Wu Y, Huang C. Improving Genomic Selection With Quantitative Trait Loci and Nonadditive Effects Revealed by Empirical Evidence in Maize. Frontiers in Plant Science 2019; 10:1129. [PMID: 31620155 PMCID: PMC6759780 DOI: 10.3389/fpls.2019.01129] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/06/2019] [Accepted: 08/15/2019] [Indexed: 05/20/2023]
Abstract
Genomic selection (GS), a tool developed for molecular breeding, is used by plant breeders to improve breeding efficacy by shortening the breeding cycle and to facilitate the selection of candidate lines for creating hybrids without phenotyping in various environments. Association and linkage mapping have been widely used to explore and detect candidate genes in order to understand the genetic mechanisms of quantitative traits. In the current study, phenotypic and genotypic data from three experimental populations, including data on six agronomic traits (plant height, ear height, ear length, ear diameter, grain yield per plant, and hundred-kernel weight), were used to evaluate the effect of trait-relevant markers (TRMs) on prediction accuracy. Integrating information from mapping into a statistical model can efficiently improve prediction performance compared with using stochastically selected markers to perform GS. The prediction accuracy can reach a plateau when a total of 500-1,000 TRMs are utilized in GS. The prediction accuracy can be significantly enhanced by including nonadditive effects and TRMs in the GS model when GS is performed on genotypic data with a high proportion of heterozygous alleles and on complex agronomic traits for which nonadditive variance makes up a high proportion of the phenotypic variance. In addition, taking information on population structure into account can slightly improve prediction performance when the genetic relationship between the training and testing sets is influenced by population stratification due to different allele frequencies. In conclusion, GS is a useful approach for prescreening candidate lines, and the empirical evidence provided by the current study for TRMs and nonadditive effects can inform plant breeding and in turn contribute to the improvement of selection efficiency in practical GS-assisted breeding programs.
|