1
Yuan C, Gualdrón Duarte JL, Takeda H, Georges M, Druet T. Evaluation of heritability partitioning approaches in livestock populations. BMC Genomics 2024; 25:690. [PMID: 39003468; PMCID: PMC11246585; DOI: 10.1186/s12864-024-10600-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 07/08/2024] [Indexed: 07/15/2024]
Abstract
BACKGROUND: Heritability partitioning approaches estimate the contribution of different functional classes, such as coding or regulatory variants, to the genetic variance. This information allows a better understanding of the genetic architecture of complex traits, including complex diseases, but can also help improve the accuracy of genomic selection in livestock species. However, methods have mainly been tested on human genomic data, whereas livestock populations have specific characteristics, such as high levels of relatedness, small effective population size or long-range levels of linkage disequilibrium.
RESULTS: Here, we used data from 14,762 cows, imputed at the whole-genome sequence level for 11,537,240 variants, to simulate traits in a typical livestock population and evaluate the accuracy of two state-of-the-art heritability partitioning methods, GREML and a Bayesian mixture model. In simulations where a single functional class had increased contribution to heritability, we observed that the estimators were unbiased but had low precision. When causal variants were enriched in variants with low (< 0.05) or high (> 0.20) minor allele frequency or low (below 1st quartile) or high (above 3rd quartile) linkage disequilibrium scores, it was necessary to partition the genetic variance into multiple classes defined on the basis of allele frequencies or LD scores to obtain unbiased results. When multiple functional classes had variable contributions to heritability, estimators showed higher levels of variation and confounding between certain categories was observed. In addition, estimators from small categories were particularly imprecise. However, the estimates and their ranking were still informative about the contribution of the classes. We also demonstrated that using methods that estimate the contribution of a single category at a time, a commonly used approach, results in an overestimation. Finally, we applied the methods to phenotypes for muscular development and height and estimated that, on average, variants in open chromatin regions had a higher contribution to the genetic variance (> 45%), while variants in coding regions had the strongest individual effects (> 25-fold enrichment on average). Conversely, variants in intergenic or intronic regions showed lower levels of enrichment (0.2 and 0.6-fold on average, respectively).
CONCLUSIONS: Heritability partitioning approaches should be used cautiously in livestock populations, in particular for small categories. Two-component approaches that fit only one functional category at a time lead to biased estimators and should not be used.
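Note on the enrichment figures: the abstract reports fold enrichment per functional class without defining it. A minimal sketch follows, assuming the common definition (a class's share of the genetic variance divided by its share of the variants). The function name `fold_enrichment` and all numbers are hypothetical and are not taken from the study.

```python
def fold_enrichment(variance_by_class, variants_by_class):
    """Fold enrichment per class, assuming the definition
    (share of genetic variance) / (share of variants)."""
    total_variance = sum(variance_by_class.values())
    total_variants = sum(variants_by_class.values())
    result = {}
    for name, variance in variance_by_class.items():
        variance_share = variance / total_variance
        variant_share = variants_by_class[name] / total_variants
        result[name] = variance_share / variant_share
    return result


# Hypothetical inputs for illustration only (not values from the paper).
variance = {"coding": 0.10, "open_chromatin": 0.45, "other": 0.45}
counts = {"coding": 10_000, "open_chromatin": 190_000, "other": 800_000}
enrichment = fold_enrichment(variance, counts)
```

Under this definition a value above 1 means the class explains more variance than its variant count alone would suggest, and a value below 1 means it explains less. A small class can therefore show strong enrichment while still explaining a modest share of the total variance.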
Affiliation(s)
- Can Yuan
- Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine, University of Liège, Avenue de L'Hôpital, 1, 4000, Liège, Belgium.
- Haruko Takeda
- Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine, University of Liège, Avenue de L'Hôpital, 1, 4000, Liège, Belgium
- Michel Georges
- Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine, University of Liège, Avenue de L'Hôpital, 1, 4000, Liège, Belgium
- Tom Druet
- Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine, University of Liège, Avenue de L'Hôpital, 1, 4000, Liège, Belgium
2
Jiang J. MPH: fast REML for large-scale genome partitioning of quantitative genetic variation. Bioinformatics 2024; 40:btae298. [PMID: 38688661; PMCID: PMC11093526; DOI: 10.1093/bioinformatics/btae298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 04/24/2024] [Accepted: 04/29/2024] [Indexed: 05/02/2024]
Abstract
MOTIVATION: Genome partitioning of quantitative genetic variation is useful for dissecting the genetic architecture of complex traits. However, existing methods, such as Haseman-Elston regression and linkage disequilibrium score regression, often face limitations when handling extensive farm animal datasets, as demonstrated in this study.
RESULTS: To overcome this challenge, we present MPH, a novel software tool designed for efficient genome partitioning analyses using restricted maximum likelihood. The computational efficiency of MPH primarily stems from two key factors: the utilization of stochastic trace estimators and the comprehensive implementation of parallel computation. Evaluations with simulated and real datasets demonstrate that MPH achieves comparable accuracy and significantly enhances convergence, speed, and memory efficiency compared to widely used tools like GCTA and LDAK. These advancements facilitate large-scale, comprehensive analyses of complex genetic architectures in farm animals.
AVAILABILITY AND IMPLEMENTATION: The MPH software is available at https://jiang18.github.io/mph/.
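Note on stochastic trace estimation: the abstract does not say which estimator MPH uses. As a generic illustration only, the sketch below shows a Hutchinson-type estimator with Rademacher probe vectors, which approximates the trace of a symmetric matrix from matrix-vector products alone. The names `hutchinson_trace` and `make_matvec` and the small matrix `A` are hypothetical and are not part of MPH.

```python
import random


def hutchinson_trace(matvec, n, num_probes=200, seed=0):
    """Estimate trace(A) as the average of z^T A z over random +/-1 vectors z.

    `matvec` is any function returning A @ v for a length-n list v, so the
    matrix itself never has to be formed or inverted explicitly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        az = matvec(z)
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / num_probes


def make_matvec(a):
    """Wrap a dense matrix (list of rows) as a matrix-vector product function."""
    def matvec(v):
        return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]
    return matvec


# Small symmetric example; a real REML problem would be far larger.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]
estimate = hutchinson_trace(make_matvec(A), n=3, num_probes=2000)
exact = sum(A[i][i] for i in range(3))
```

The estimate approaches the exact trace as the number of probes grows. The appeal in a REML setting is that each probe needs only matrix-vector products, which parallelize well.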
Affiliation(s)
- Jicai Jiang
- Department of Animal Science, North Carolina State University, Raleigh, NC 27695, United States
3
Ye H, Xu Z, Bello SF, Zhu Q, Kong S, Zheng M, Fang X, Jia X, Xu H, Zhang X, Nie Q. Haplotype analysis of genomic prediction by incorporating genomic pathway information based on high-density SNP marker in Chinese yellow-feathered chicken. Poult Sci 2023; 102:102549. [PMID: 36907129; PMCID: PMC10024239; DOI: 10.1016/j.psj.2023.102549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 01/16/2023] [Accepted: 01/27/2023] [Indexed: 02/09/2023]
Abstract
Genomic selection using single nucleotide polymorphism (SNP) markers is now intensively investigated in breeding and has been widely utilized for genetic improvement. Several studies have used haplotypes (consisting of multiallelic SNPs) for genomic prediction and revealed their performance advantage. In this study, we comprehensively evaluated the performance of haplotype models for genomic prediction in 15 traits, including 6 growth, 5 carcass, and 4 feeding traits in a Chinese yellow-feathered chicken population. We adopted 3 methods to define haplotypes from high-density SNP panels, and our strategy included combining Kyoto Encyclopedia of Genes and Genomes pathway information and considering linkage disequilibrium (LD) information. Our results showed that haplotypes changed prediction accuracy by -0.04% to 27.16% across all traits, with significant improvements in 12 traits. The estimates of haplotype epistasis heritability were strongly correlated with the accuracy increase by haplotype models. In addition, incorporating genomic annotation information could further increase the accuracy of the haplotype model, where the further increase in accuracy is significantly related to the increase of relative haplotype epistasis heritability. The genomic prediction using LD information for constructing haplotypes has the best prediction performance among the 4 traits. These results showed that haplotype methods were beneficial for genomic prediction, and that the accuracy could be further increased by incorporating genomic annotation information. Moreover, using LD information would potentially improve the performance of genomic prediction.
Affiliation(s)
- Haoqiang Ye
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Zhenqiang Xu
- Wen's Nanfang Poultry Breeding Co. Ltd, Guangdong Province, Yunfu 527400, China
- Semiu Folaniyi Bello
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Qianghui Zhu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China
- Shaofen Kong
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Ming Zheng
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Xiang Fang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Xinzheng Jia
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, Foshan University, Foshan, 528225 China
- Haiping Xu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Xiquan Zhang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China
- Qinghua Nie
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, 510642 China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, 510642 China.
4
Ogawa S, Taniguchi Y, Watanabe T, Iwaisaki H. Fitting Genomic Prediction Models with Different Marker Effects among Prefectures to Carcass Traits in Japanese Black Cattle. Genes (Basel) 2022; 14:24. [PMID: 36672767; PMCID: PMC9859149; DOI: 10.3390/genes14010024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 12/16/2022] [Accepted: 12/20/2022] [Indexed: 12/25/2022]
Abstract
We fitted statistical models, which assumed single-nucleotide polymorphism (SNP) marker effects differing across the fattened steers marketed into different prefectures, to the records for cold carcass weight (CW) and marbling score (MS) of 1036, 733, and 279 Japanese Black fattened steers marketed into Tottori, Hiroshima, and Hyogo prefectures in Japan, respectively. Genotype data on 33,059 SNPs were used. Five models were fitted: only SNP effects common to all the steers (model 1), common effects plus SNP effects differing between the steers marketed into Hyogo prefecture and others (model 2), only the SNP effects differing between Hyogo steers and others (model 3), common effects plus SNP effects specific to each prefecture (model 4), and only the effects specific to each prefecture (model 5). For both traits, all other models gave slightly lower estimates of residual variance than model 1. Estimated genetic correlations among the prefectures in models 2 and 4 ranged from 0.53 to 0.71, all <0.8. These results might support that the SNP effects differ among the prefectures to some degree, although we discussed the necessity of careful consideration to interpret the current results.
Affiliation(s)
- Shinichiro Ogawa
- Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan
- Division of Meat Animal and Poultry Research, Institute of Livestock and Grassland Science, Tsukuba 305-0901, Japan
- Yukio Taniguchi
- Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan
- Toshio Watanabe
- National Livestock Breeding Center, Fukushima 961-8511, Japan
- Maebashi Institute of Animal Science, Livestock Improvement Association of Japan, Inc., Maebashi 371-0121, Japan
- Hiroaki Iwaisaki
- Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan
- Sado Island Center for Ecological Sustainability, Niigata University, Niigata 952-0103, Japan
5
Hao X, Liang A, Plastow G, Zhang C, Wang Z, Liu J, Salzano A, Gasparrini B, Campanile G, Zhang S, Yang L. An Integrative Genomic Prediction Approach for Predicting Buffalo Milk Traits by Incorporating Related Cattle QTLs. Genes (Basel) 2022; 13:1430. [PMID: 36011341; PMCID: PMC9408041; DOI: 10.3390/genes13081430] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 11/16/2022]
Abstract
Background: The 90K Axiom Buffalo SNP Array is expected to improve and speed up various genomic analyses for the buffalo (Bubalus bubalis). Genomic prediction is an effective approach in animal breeding to improve selection and reduce costs. As buffalo genome research is lagging behind that of the cow and production records are also limited, genomic prediction performance will be relatively poor. To improve the genomic prediction in buffalo, we introduced a new approach (pGBLUP) for genomic prediction of six buffalo milk traits by incorporating QTL information from the cattle milk traits in order to help improve the prediction performance for buffalo. Results: In simulations, the pGBLUP could outperform BayesR and the GBLUP if the prior biological information (i.e., the known causal loci) was appropriate; otherwise, it performed slightly worse than BayesR and equal to or better than the GBLUP. In real data, the heritability of the buffalo genomic region corresponding to the cattle milk trait QTLs was enriched (fold of enrichment > 1) in four buffalo milk traits (FY270, MY270, PY270, and PM) when the EBV was used as the response variable. The DEBV as the response variable yielded more reliable genomic predictions than the traditional EBV, as has been shown by previous research. The performance of the three approaches (GBLUP, BayesR, and pGBLUP) did not vary greatly in this study, probably due to the limited sample size, incomplete prior biological information, and less artificial selection in buffalo. Conclusions: To our knowledge, this study is the first to apply genomic prediction to buffalo by incorporating prior biological information. The genomic prediction of buffalo traits can be further improved with a larger sample size, higher-density SNP chips, and more precise prior biological information.
Affiliation(s)
- Xingjie Hao
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Correspondence: (X.H.); (L.Y.)
- Aixin Liang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Graham Plastow
- Livestock Gentec Center, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2C8, Canada
- Chunyan Zhang
- Livestock Gentec Center, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2C8, Canada
- Zhiquan Wang
- Livestock Gentec Center, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2C8, Canada
- Jiajia Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Angela Salzano
- Department of Veterinary Medicine and Animal Productions, University of Naples “Federico II”, 80137 Naples, Italy
- Bianca Gasparrini
- Department of Veterinary Medicine and Animal Productions, University of Naples “Federico II”, 80137 Naples, Italy
- Giuseppe Campanile
- Department of Veterinary Medicine and Animal Productions, University of Naples “Federico II”, 80137 Naples, Italy
- Shujun Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Liguo Yang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China
- Correspondence: (X.H.); (L.Y.)
6
Wolc A, Dekkers JCM. Application of Bayesian genomic prediction methods to genome-wide association analyses. Genet Sel Evol 2022; 54:31. [PMID: 35562659; PMCID: PMC9103490; DOI: 10.1186/s12711-022-00724-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022]
Abstract
Background: Bayesian genomic prediction methods were developed to simultaneously fit all genotyped markers to a set of available phenotypes for prediction of breeding values for quantitative traits, allowing for differences in the genetic architecture (distribution of marker effects) of traits. These methods also provide a flexible and reliable framework for genome-wide association (GWA) studies. The objective here was to review developments in Bayesian hierarchical and variable selection models for GWA analyses.
Results: By fitting all genotyped markers simultaneously, Bayesian GWA methods implicitly account for population structure and the multiple-testing problem of classical single-marker GWA. Implemented using Markov chain Monte Carlo methods, Bayesian GWA methods allow for control of error rates using probabilities obtained from posterior distributions. Power of GWA studies using Bayesian methods can be enhanced by using informative priors based on previous association studies, gene expression analyses, or functional annotation information. Applied to multiple traits, Bayesian GWA analyses can give insight into pleiotropic effects by multi-trait, structural equation, or graphical models. Bayesian methods can also be used to combine genomic, transcriptomic, proteomic, and other -omics data to infer causal genotype to phenotype relationships and to suggest external interventions that can improve performance.
Conclusions: Bayesian hierarchical and variable selection methods provide a unified and powerful framework for genomic prediction, GWA, integration of prior information, and integration of information from other -omics platforms to identify causal mutations for complex quantitative traits.
Affiliation(s)
- Anna Wolc
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA
- Hy-Line International, 2583 240th Street, Dallas Center, IA, 50063, USA
- Jack C M Dekkers
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA.
7
Ogawa S, Matsuda H, Taniguchi Y, Watanabe T, Sugimoto Y, Iwaisaki H. Estimation of the autosomal contribution to total additive genetic variability of carcass traits in Japanese Black cattle. Anim Sci J 2022; 93:e13710. [PMID: 35416392; DOI: 10.1111/asj.13710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Revised: 02/18/2022] [Accepted: 03/18/2022] [Indexed: 11/29/2022]
Abstract
We attempted to estimate the additive genetic variance explained by each autosome, using genotype data of 33,657 single nucleotide polymorphism (SNP) markers in 2271 Japanese Black fattened steers. Traits were cold carcass weight, ribeye area, rib thickness, subcutaneous fat thickness, estimated yield percentage, and marbling score. Two mixed linear models were used: model 1 incorporated a single genomic relationship matrix (G matrix) constructed from all available SNPs, and model 2 incorporated two G matrices, one constructed from the SNPs on one autosome and the other from the SNPs on the remaining autosomes. Genomic heritabilities estimated using model 1 were moderate to high. Summed over autosomes, the proportions of the total genetic variance explained by each autosome, as estimated using model 2, were >90%. For carcass weight, the proportions explained by Bos taurus autosomes 6, 8, and 14 were higher than those explained by the remaining autosomes. In some cases, the estimated proportion was close to 0. The results obtained from model 2 could provide a novel insight into the genetic architecture, such as heritability per chromosome, of carcass traits in Japanese Black cattle, although further careful investigation would be required.
Affiliation(s)
- Yukio Taniguchi
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Yoshikazu Sugimoto
- Shirakawa Institute of Animal Genetics, Japan Livestock Technology Association, Tokyo, Japan
8
Shi S, Zhang Z, Li B, Zhang S, Fang L. Incorporation of Trait-Specific Genetic Information into Genomic Prediction Models. Methods Mol Biol 2022; 2467:329-340. [PMID: 35451781; DOI: 10.1007/978-1-0716-2205-6_11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Due to the rapid development of high-throughput sequencing technology, we can easily obtain not only the genetic variants at the whole-genome sequence level (e.g., from 1000 Genomes project and 1000 Bull Genomes project), but also a wide range of functional annotations (e.g., enhancers and promoters from ENCODE, FAANG, and FarmGTEx projects) across a wide range of tissues, cell types, developmental stages, and environmental conditions. This huge amount of information leads to a revolution in studying genetics and genomics of complex traits in humans, livestock, and plant species. In this chapter, we focused on and reviewed the genomic prediction methods that incorporate external biological information into genomic prediction, such as sequence ontology, linkage disequilibrium (LD) of SNPs, quantitative trait loci (QTL), and multi-layer omics data (e.g., transcriptome, epigenome, and microbiome).
Affiliation(s)
- Shaolei Shi
- College of Animal Science and Technology, China Agricultural University, Beijing, China
- Zhe Zhang
- Department of Animal Breeding and Genetics, College of Animal Science, South China Agricultural University (SCAU), Guangzhou, China
- Bingjie Li
- The Roslin Institute Building, Scotland's Rural College, Edinburgh, UK
- Shengli Zhang
- College of Animal Science and Technology, China Agricultural University, Beijing, China
- Lingzhao Fang
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK.
9
Rice BR, Lipka AE. Diversifying maize genomic selection models. Molecular Breeding: New Strategies in Plant Improvement 2021; 41:33. [PMID: 37309328; PMCID: PMC10236107; DOI: 10.1007/s11032-021-01221-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/16/2020] [Accepted: 03/07/2021] [Indexed: 06/14/2023]
Abstract
Genomic selection (GS) is one of the most powerful tools available for maize breeding. Its use of genome-wide marker data to estimate breeding values translates to increased genetic gains with fewer breeding cycles. In this review, we cover the history of GS and highlight particular milestones during its adaptation to maize breeding. We discuss how GS can be applied to developing superior maize inbreds and hybrids. Additionally, we characterize refinements in GS models that could enable the encapsulation of non-additive genetic effects, genotype by environment interactions, and multiple levels of the biological hierarchy, all of which could ultimately result in more accurate predictions of breeding values. Finally, we suggest the stages in a maize breeding program where it would be beneficial to apply GS. Given the current sophistication of high-throughput phenotypic, genotypic, and other -omic level data currently available to the maize community, now is the time to explore the implications of their incorporation into GS models and thus ensure that genetic gains are being achieved as quickly and efficiently as possible.
Affiliation(s)
- Brian R. Rice
- Department of Crop Sciences, University of Illinois, Urbana, IL USA
10
Farooq M, van Dijk ADJ, Nijveen H, Aarts MGM, Kruijer W, Nguyen TP, Mansoor S, de Ridder D. Prior Biological Knowledge Improves Genomic Prediction of Growth-Related Traits in Arabidopsis thaliana. Front Genet 2021; 11:609117. [PMID: 33552126; PMCID: PMC7855462; DOI: 10.3389/fgene.2020.609117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/22/2020] [Accepted: 12/21/2020] [Indexed: 01/11/2023]
Abstract
Prediction of growth-related complex traits is highly important for crop breeding. Photosynthesis efficiency and biomass are direct indicators of overall plant performance and therefore even minor improvements in these traits can result in significant breeding gains. Crop breeding for complex traits has been revolutionized by technological developments in genomics and phenomics. Capitalizing on the growing availability of genomics data, genome-wide marker-based prediction models allow for efficient selection of the best parents for the next generation without the need for phenotypic information. Until now such models mostly predict the phenotype directly from the genotype and fail to make use of relevant biological knowledge. It is an open question to what extent the use of such biological knowledge is beneficial for improving genomic prediction accuracy and reliability. In this study, we explored the use of publicly available biological information for genomic prediction of photosynthetic light use efficiency (ΦPSII) and projected leaf area (PLA) in Arabidopsis thaliana. To explore the use of various types of knowledge, we mapped genomic polymorphisms to Gene Ontology (GO) terms and transcriptomics-based gene clusters, and applied these in a Genomic Feature Best Linear Unbiased Predictor (GFBLUP) model, which is an extension to the traditional Genomic BLUP (GBLUP) benchmark. Our results suggest that incorporation of prior biological knowledge can improve genomic prediction accuracy for both ΦPSII and PLA. The improvement achieved depends on the trait, type of knowledge and trait heritability. Moreover, transcriptomics offers complementary evidence to the Gene Ontology for improvement when used to define functional groups of genes. In conclusion, prior knowledge about trait-specific groups of genes can be directly translated into improved genomic prediction.
Affiliation(s)
- Muhammad Farooq
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
- Molecular Virology and Gene Silencing Lab, Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering (NIBGE), Punjab, Pakistan
- Aalt D. J. van Dijk
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
- Biometris, Wageningen University, Wageningen, Netherlands
- Harm Nijveen
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
- Mark G. M. Aarts
- Laboratory of Genetics, Wageningen University, Wageningen, Netherlands
- Willem Kruijer
- Biometris, Wageningen University, Wageningen, Netherlands
- Thu-Phuong Nguyen
- Laboratory of Genetics, Wageningen University, Wageningen, Netherlands
- Shahid Mansoor
- Molecular Virology and Gene Silencing Lab, Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering (NIBGE), Punjab, Pakistan
- Dick de Ridder
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
11
Genomic Prediction Informed by Biological Processes Expands Our Understanding of the Genetic Architecture Underlying Free Amino Acid Traits in Dry Arabidopsis Seeds. G3: Genes, Genomes, Genetics 2020; 10:4227-4239. [PMID: 32978264; PMCID: PMC7642941; DOI: 10.1534/g3.120.401240] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/29/2023]
Abstract
Plant growth, development, and nutritional quality depend upon amino acid homeostasis, especially in seeds. However, our understanding of the underlying genetics influencing amino acid content and composition remains limited, with only a few candidate genes and quantitative trait loci identified to date. Improved knowledge of the genetics and biological processes that determine amino acid levels will enable researchers to use this information for plant breeding and biological discovery. Toward this goal, we used genomic prediction to identify biological processes that are associated with, and therefore potentially influence, free amino acid (FAA) composition in seeds of the model plant Arabidopsis thaliana. Markers were split into categories based on metabolic pathway annotations and fit using a genomic partitioning model to evaluate the influence of each pathway on heritability explained, model fit, and predictive ability. Selected pathways included processes known to influence FAA composition, albeit to an unknown degree, and spanned four categories: amino acid, core, specialized, and protein metabolism. Using this approach, we identified associations for pathways containing known variants for FAA traits, in addition to finding new trait-pathway associations. Markers related to amino acid metabolism, which are directly involved in FAA regulation, improved predictive ability for branched chain amino acids and histidine. The use of genomic partitioning also revealed patterns across biochemical families, in which serine-derived FAAs were associated with protein-related annotations and aromatic FAAs were associated with specialized metabolic pathways. Taken together, these findings provide evidence that genomic partitioning is a viable strategy to uncover the relative contributions of biological processes to FAA traits in seeds, offering a promising framework to guide hypothesis testing and narrow the search space for candidate genes.
|
12
|
Rohde PD, Fourie Sørensen I, Sørensen P. qgg: an R package for large-scale quantitative genetic analyses. Bioinformatics 2019; 36:2614-2615. [DOI: 10.1093/bioinformatics/btz955] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 01/14/2019] [Revised: 12/16/2019] [Accepted: 12/23/2019] [Indexed: 01/03/2023] Open
Abstract
Summary
Here, we present the R package qgg, which provides an environment for large-scale genetic analyses of quantitative traits and diseases. The qgg package provides an infrastructure for efficient processing of large-scale genetic data and functions for estimating genetic parameters, and performing single and multiple marker association analyses and genomic-based predictions of phenotypes.
Availability and implementation
The qgg package is freely available. For the latest updates, user guides and example scripts, consult the main page http://psoerensen.github.io/qgg. The current release is available from CRAN (https://CRAN.R-project.org/package=qgg) for all major operating systems.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Palle Duun Rohde
- Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
- Peter Sørensen
- Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
|
13
|
Abstract
Mastitis is a prevalent and costly disease on dairy farms. Improved management and hygiene can reduce the risk of infection by contagious or environmental pathogens, and genetic selection can confer permanent improvement in mastitis resistance. National veterinary recording systems in the Nordic countries have allowed direct selection for sire families with low incidence of clinical mastitis for 3 decades, whereas other countries have practiced indirect selection for lower somatic cell count. Recently, pooling of producer-recorded data from on-farm herd management software programs has enabled selection for reduced incidence of clinical mastitis in the United States and other leading dairy countries.
Affiliation(s)
- Kent A Weigel
- Department of Dairy Science, University of Wisconsin-Madison, 1675 Observatory Drive, Madison, WI 53706-1205, USA.
- George E Shook
- Department of Dairy Science, University of Wisconsin-Madison, 1675 Observatory Drive, Madison, WI 53706-1205, USA
|
14
|
Laodim T, Elzo MA, Koonawootrittriron S, Suwanasopee T, Jattawa D. Pathway enrichment and protein interaction network analysis for milk yield, fat yield and age at first calving in a Thai multibreed dairy population. Asian-Australasian Journal of Animal Sciences 2018; 32:508-518. [PMID: 30056656 PMCID: PMC6409460 DOI: 10.5713/ajas.18.0382] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/12/2018] [Accepted: 07/15/2018] [Indexed: 01/01/2023]
Abstract
Objective This research aimed to determine biological pathways and protein-protein interaction (PPI) networks for 305-d milk yield (MY), 305-d fat yield (FY), and age at first calving (AFC) in the Thai multibreed dairy population. Methods Genotypic information contained 75,776 imputed and actual single nucleotide polymorphisms (SNP) from 2,661 animals. Single-step genomic best linear unbiased predictions were utilized to estimate SNP genetic variances for MY, FY, and AFC. Fixed effects included herd-year-season, breed regression and heterosis regression effects. Random effects were animal additive genetic and residual. Individual SNP explaining at least 0.001% of the genetic variance for each trait were used to identify nearby genes in the National Center for Biotechnology Information database. Pathway enrichment analysis was performed. The PPI among genes were identified and visualized as a PPI network. Results Identified genes were involved in 16 enriched pathways related to MY, FY, and AFC. Most genes had two or more connections with other genes in the PPI network. Genes associated with MY, FY, and AFC based on the biological pathways and PPI were primarily involved in cellular processes. The percent of the genetic variance explained by genes in enriched pathways (303) was 2.63% for MY, 2.59% for FY, and 2.49% for AFC. Genes in the PPI network (265) explained 2.28% of the genetic variance for MY, 2.26% for FY, and 2.12% for AFC. Conclusion These sets of SNP associated with genes in the set of enriched pathways and the PPI network could be used as genomic selection targets in the Thai multibreed dairy population. This study should be continued both in this and other populations subject to a variety of environmental conditions because predicted SNP values will likely differ across populations subject to different environmental conditions and change over time.
Affiliation(s)
- Thawee Laodim
- Department of Animal Science, Kasetsart University, Bangkok 10900, Thailand
- Mauricio A Elzo
- Department of Animal Sciences, University of Florida, Gainesville, FL 32611-0910, USA
- Danai Jattawa
- Department of Animal Science, Kasetsart University, Bangkok 10900, Thailand
|
15
|
Rohde PD, Østergaard S, Kristensen TN, Sørensen P, Loeschcke V, Mackay TFC, Sarup P. Functional Validation of Candidate Genes Detected by Genomic Feature Models. G3 (Bethesda, Md.) 2018; 8:1659-1668. [PMID: 29519937 PMCID: PMC5940157 DOI: 10.1534/g3.118.200082] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/29/2018] [Accepted: 03/07/2018] [Indexed: 12/11/2022]
Abstract
Understanding the genetic underpinnings of complex traits requires knowledge of the genetic variants that contribute to phenotypic variability. Reliable statistical approaches are needed to obtain such knowledge. In genome-wide association studies, variants are tested for association with trait variability to pinpoint loci that contribute to the quantitative trait. Because stringent genome-wide significance thresholds are applied to control the false positive rate, many true causal variants can remain undetected. To ameliorate this problem, many alternative approaches have been developed, such as genomic feature models (GFM). The GFM approach tests for association of a set of genomic markers, and predicts genomic values from genomic data utilizing prior biological knowledge. We investigated to what degree the findings from GFM have biological relevance. We used the Drosophila Genetic Reference Panel to investigate locomotor activity, and applied genomic feature prediction models to identify gene ontology (GO) categories predictive of this phenotype. Next, we applied the covariance association test to partition the genomic variance of the predictive GO terms to the genes within these terms. We then functionally assessed whether the identified candidate genes affected locomotor activity by reducing gene expression using RNA interference. In five of the seven candidate genes tested, reduced gene expression altered the phenotype. The ranking of genes within the predictive GO term was highly correlated with the magnitude of the phenotypic consequence of gene knockdown. This study provides evidence for five new candidate genes for locomotor activity, and provides support for the reliability of the GFM approach.
Affiliation(s)
- Palle Duun Rohde
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, 8000 Aarhus, Denmark
- Center for Integrative Sequencing, Aarhus University, 8000 Aarhus, Denmark
- Solveig Østergaard
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Torsten Nygaard Kristensen
- Section for Genetics, Ecology and Evolution, Department of Bioscience, Aarhus University, 8000 Aarhus, Denmark
- Section for Biology and Environmental Science, Department of Chemistry and Bioscience, Aalborg University, 9220 Aalborg, Denmark
- Peter Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Volker Loeschcke
- Section for Genetics, Ecology and Evolution, Department of Bioscience, Aarhus University, 8000 Aarhus, Denmark
- Trudy F C Mackay
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695
- Program in Genetics, North Carolina State University, Raleigh, North Carolina 27695
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, North Carolina 27695
- Pernille Sarup
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
|
16
|
Laodim T, Elzo MA, Koonawootrittriron S, Suwanasopee T, Jattawa D. Identification of SNP markers associated with milk and fat yields in multibreed dairy cattle using two genetic group structures. Livest Sci 2017. [DOI: 10.1016/j.livsci.2017.10.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
|
17
|
Fang L, Sahana G, Ma P, Su G, Yu Y, Zhang S, Lund MS, Sørensen P. Use of biological priors enhances understanding of genetic architecture and genomic prediction of complex traits within and between dairy cattle breeds. BMC Genomics 2017; 18:604. [PMID: 28797230 PMCID: PMC5553760 DOI: 10.1186/s12864-017-4004-z] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 12/18/2016] [Accepted: 08/02/2017] [Indexed: 02/08/2023] Open
Abstract
Background A better understanding of the genetic architecture underlying complex traits (e.g., the distribution of causal variants and their effects) may aid genomic prediction. Here, we hypothesized that the genomic variants of complex traits might be enriched in a subset of genomic regions defined by genes grouped on the basis of “Gene Ontology” (GO), and that incorporating this independent biological information into genomic prediction models might improve their predictive ability. Results Four complex traits (i.e., milk, fat and protein yields, and mastitis) together with imputed sequence variants in Holstein (HOL) and Jersey (JER) cattle were analysed. We first carried out a post-GWAS analysis in a HOL training population to assess the degree of enrichment of the association signals in the gene regions defined by each GO term. We then extended the genomic best linear unbiased prediction model (GBLUP) to a genomic feature BLUP (GFBLUP) model, including an additional genomic effect quantifying the joint effect of a group of variants located in a genomic feature. The GBLUP model using a single random effect assumes that all genomic variants contribute to the genomic relationship equally, whereas GFBLUP attributes different weights to the individual genomic relationships in the prediction equation based on the estimated genomic parameters. Our results demonstrate that the immune-relevant GO terms were more associated with mastitis than milk production, and several biologically meaningful GO terms improved the prediction accuracy with GFBLUP for the four traits, as compared with GBLUP. The improvement of the genomic prediction between breeds (the average increase across the four traits was 0.161) was more apparent than within HOL (the average increase across the four traits was 0.020).
Conclusions Our genomic feature modelling approaches provide a framework to simultaneously explore the genetic architecture and genomic prediction of complex traits by taking advantage of independent biological knowledge.
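The contrast between GBLUP and GFBLUP drawn in this abstract can be illustrated with a minimal sketch. It assumes a VanRaden-style genomic relationship matrix and 0/1/2 genotype coding, neither of which is stated in the abstract; the function names, the toy genotypes and the variance components are hypothetical.

```python
import numpy as np

def grm(genotypes):
    """VanRaden-style genomic relationship matrix from 0/1/2 genotype codes.

    genotypes: (n_animals, n_markers) array. This scaling is an assumption;
    the abstract does not say how relationships were computed.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0                 # allele frequency per marker
    z = m - 2.0 * p                          # centre each marker column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gfblup_covariance(genotypes, feature_idx, var_feature, var_rest):
    """Genetic covariance with separate variances for feature and other markers."""
    n_markers = genotypes.shape[1]
    in_feature = np.zeros(n_markers, dtype=bool)
    in_feature[feature_idx] = True
    g_feature = grm(genotypes[:, in_feature])    # relationships from feature markers
    g_rest = grm(genotypes[:, ~in_feature])      # relationships from remaining markers
    return var_feature * g_feature + var_rest * g_rest

rng = np.random.default_rng(1)
geno = rng.integers(0, 3, size=(6, 40))      # toy data: 6 animals, 40 markers
feature = np.arange(10)                       # hypothetical: first 10 markers form the feature
K = gfblup_covariance(geno, feature, var_feature=0.3, var_rest=0.7)
```

In an actual analysis the two variance components would be estimated from the data rather than supplied by hand; they are fixed here only to show how the two relationship matrices receive different weights.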
Affiliation(s)
- Lingzhao Fang
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark; Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Peipei Ma
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Guosheng Su
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Ying Yu
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Shengli Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Peter Sørensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
|
18
|
Rohde PD, Gaertner B, Ward K, Sørensen P, Mackay TFC. Genomic Analysis of Genotype-by-Social Environment Interaction for Drosophila melanogaster Aggressive Behavior. Genetics 2017; 206:1969-1984. [PMID: 28550016 PMCID: PMC5560801 DOI: 10.1534/genetics.117.200642] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/30/2017] [Accepted: 05/22/2017] [Indexed: 02/06/2023] Open
Abstract
Human psychiatric disorders such as schizophrenia, bipolar disorder, and attention-deficit/hyperactivity disorder often include adverse behaviors including increased aggressiveness. Individuals with psychiatric disorders often exhibit social withdrawal, which can further increase the probability of conducting a violent act. Here, we used the inbred, sequenced lines of the Drosophila Genetic Reference Panel (DGRP) to investigate the genetic basis of variation in male aggressive behavior for flies reared in a socialized and socially isolated environment. We identified genetic variation for aggressive behavior, as well as significant genotype-by-social environmental interaction (GSEI); i.e., variation among DGRP genotypes in the degree to which social isolation affected aggression. We performed genome-wide association (GWA) analyses to identify genetic variants associated with aggression within each environment. We used genomic prediction to partition genetic variants into gene ontology (GO) terms and constituent genes, and identified GO terms and genes with high prediction accuracies in both social environments and for GSEI. The top predictive GO terms significantly increased the proportion of variance explained, compared to prediction models based on all segregating variants. We performed genomic prediction across environments, and identified genes in common between the social environments that turned out to be enriched for genome-wide associated variants. A large proportion of the associated genes have previously been associated with aggressive behavior in Drosophila and mice. Further, many of these genes have human orthologs that have been associated with neurological disorders, indicating partially shared genetic mechanisms underlying aggression in animal models and human psychiatric disorders.
Affiliation(s)
- Palle Duun Rohde
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8000 Aarhus, Denmark
- ISEQ, Center for Integrative Sequencing, Aarhus University, 8000 Aarhus, Denmark
- Bryn Gaertner
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695
- Program in Genetics, North Carolina State University, Raleigh, North Carolina 27695
- W.M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, North Carolina 27695
- Kirsty Ward
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695
- Program in Genetics, North Carolina State University, Raleigh, North Carolina 27695
- W.M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, North Carolina 27695
- Peter Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Trudy F C Mackay
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695
- Program in Genetics, North Carolina State University, Raleigh, North Carolina 27695
- W.M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, North Carolina 27695
|
19
|
Sørensen IF, Edwards SM, Rohde PD, Sørensen P. Multiple Trait Covariance Association Test Identifies Gene Ontology Categories Associated with Chill Coma Recovery Time in Drosophila melanogaster. Sci Rep 2017; 7:2413. [PMID: 28546557 PMCID: PMC5445101 DOI: 10.1038/s41598-017-02281-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/17/2017] [Accepted: 04/10/2017] [Indexed: 12/29/2022] Open
Abstract
The genomic best linear unbiased prediction (GBLUP) model has proven to be useful for prediction of complex traits as well as estimation of population genetic parameters. Improved inference and prediction accuracy of GBLUP may be achieved by identifying genomic regions enriched for causal genetic variants. We aimed at searching for patterns in GBLUP-derived single-marker statistics, by including them in genetic marker set tests, that could reveal associations between a set of genetic markers (genomic feature) and a complex trait. GBLUP-derived set tests proved to be powerful for detecting genomic features, here defined by gene ontology (GO) terms, enriched for causal variants affecting a quantitative trait in a population with low degree of relatedness. Different set test approaches were compared using simulated data illustrating the impact of trait- and genomic feature-specific factors on detection power. We extended the most powerful single trait set test, covariance association test (CVAT), to a multiple trait setting. The multiple trait CVAT (MT-CVAT) identified functionally relevant GO categories associated with the quantitative trait, chill coma recovery time, in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel.
Affiliation(s)
- Izel Fourie Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.
- Stefan M Edwards
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark; The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
- Palle Duun Rohde
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark; Centre for Integrative Sequencing, iSEQ, Aarhus University, 8000, Aarhus, Denmark; iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8000, Aarhus, Denmark
- Peter Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark
|
20
|
Fang L, Sahana G, Ma P, Su G, Yu Y, Zhang S, Lund MS, Sørensen P. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection. Genet Sel Evol 2017; 49:44. [PMID: 28499345 PMCID: PMC5427631 DOI: 10.1186/s12711-017-0319-0] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 11/02/2016] [Accepted: 05/03/2017] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND A better understanding of the genetic architecture of complex traits can help improve genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. RESULTS We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P < 0.05).
CONCLUSIONS GFBLUP provides a framework for integrating multiple layers of biological knowledge to provide novel insights into the biological basis of complex traits, and to improve the accuracy of genomic prediction. The SNP set test might be used as a first step to improve GFBLUP models. Approaches like GFBLUP and SNP set test will become increasingly useful, as the functional annotations of genomes keep accumulating for a range of species and traits.
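The SNP set test mentioned in this abstract could be sketched as follows. This is a simplified illustration, assuming that the set statistic is the sum of squared single-SNP test statistics within the feature and that its null distribution is approximated by drawing random marker sets of equal size; the abstract specifies neither choice, and all names and data below are hypothetical.

```python
import numpy as np

def snp_set_test(stat_sq, feature_idx, n_draws=999, seed=0):
    """Empirical p-value for enrichment of association signal in a marker set.

    stat_sq: squared single-SNP test statistics, one per marker.
    feature_idx: indices of the markers belonging to the genomic feature.
    """
    rng = np.random.default_rng(seed)
    stat_sq = np.asarray(stat_sq, dtype=float)
    feature_idx = np.asarray(feature_idx)
    observed = stat_sq[feature_idx].sum()
    k = feature_idx.size
    # Null distribution: the same summary for random marker sets of size k.
    null = np.array([
        stat_sq[rng.choice(stat_sq.size, size=k, replace=False)].sum()
        for _ in range(n_draws)
    ])
    # Add-one correction keeps the empirical p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (n_draws + 1)

rng = np.random.default_rng(42)
stats = rng.chisquare(df=1, size=1000)       # toy null statistics for 1000 SNPs
feature = np.arange(50)                       # hypothetical feature: first 50 SNPs
stats[feature] += 10.0                        # inject signal into the feature
p_enriched = snp_set_test(stats, feature)
p_random = snp_set_test(stats, np.arange(500, 550))
```

Drawing marker sets uniformly at random ignores linkage disequilibrium between neighbouring SNPs, so a real implementation would likely need a null that preserves the correlation structure along the genome.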
Affiliation(s)
- Lingzhao Fang
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark; Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Peipei Ma
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Guosheng Su
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Ying Yu
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Shengli Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
- Peter Sørensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
|
21
|
Fang L, Sahana G, Su G, Yu Y, Zhang S, Lund MS, Sørensen P. Integrating Sequence-based GWAS and RNA-Seq Provides Novel Insights into the Genetic Basis of Mastitis and Milk Production in Dairy Cattle. Sci Rep 2017; 7:45560. [PMID: 28358110 PMCID: PMC5372096 DOI: 10.1038/srep45560] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 12/01/2016] [Accepted: 02/28/2017] [Indexed: 02/06/2023] Open
Abstract
Connecting genome-wide association study (GWAS) to biological mechanisms underlying complex traits is a major challenge. Mastitis resistance and milk production are complex traits of economic importance in the dairy sector and are associated with intra-mammary infection (IMI). Here, we integrated IMI-relevant RNA-Seq data from Holstein cattle and sequence-based GWAS data from three dairy cattle breeds (i.e., Holstein, Nordic red cattle, and Jersey) to explore the genetic basis of mastitis resistance and milk production using post-GWAS analyses and a genomic feature linear mixed model. At 24 h post-IMI, genes responsive to IMI in the mammary gland were preferentially enriched for genetic variants associated with mastitis resistance rather than milk production. Response genes in the liver were mainly enriched for variants associated with mastitis resistance at an early time point (3 h) post-IMI, whereas responsive genes at later stages were enriched for associated variants with milk production. The up- and down-regulated genes were enriched for associated variants with mastitis resistance and milk production, respectively. The patterns were consistent across breeds, indicating that different breeds shared similarities in the genetic basis of these traits. Our approaches provide a framework for integrating multiple layers of data to understand the genetic architecture underlying complex traits.
Affiliation(s)
- Lingzhao Fang
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark; Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, 100193, Beijing, China
- Goutam Sahana
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Ying Yu
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, 100193, Beijing, China
- Shengli Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, 100193, Beijing, China
- Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Peter Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
|
22
|
Dadousis C, Pegolo S, Rosa GJM, Gianola D, Bittante G, Cecchinato A. Pathway-based genome-wide association analysis of milk coagulation properties, curd firmness, cheese yield, and curd nutrient recovery in dairy cattle. J Dairy Sci 2016; 100:1223-1231. [PMID: 27988128 DOI: 10.3168/jds.2016-11587] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 06/10/2016] [Accepted: 10/20/2016] [Indexed: 01/02/2023]
Abstract
It is becoming common to complement genome-wide association studies (GWAS) with gene-set enrichment analysis to deepen the understanding of the biological pathways affecting quantitative traits. Our objective was to conduct a gene ontology and pathway-based analysis to identify possible biological mechanisms involved in the regulation of bovine milk technological traits: coagulation properties, curd firmness modeling, individual cheese yield (CY), and milk nutrient recovery into the curd (REC) or whey loss traits. Results from 2 previous GWAS using 1,011 cows genotyped for 50k single nucleotide polymorphisms were used. Overall, the phenotypes analyzed consisted of 3 traditional milk coagulation property measures [RCT: rennet coagulation time defined as the time (min) from addition of enzyme to the beginning of coagulation; k20: the interval (min) from RCT to the time at which a curd firmness of 20 mm is attained; a30: a measure of the extent of curd firmness (mm) 30 min after coagulant addition], 6 curd firmness modeling traits [RCTeq: RCT estimated through the CF equation (min); CFP: potential asymptotic curd firmness (mm); kCF: curd-firming rate constant (% × min⁻¹); kSR: syneresis rate constant (% × min⁻¹); CFmax: maximum curd firmness (mm); and tmax: time to CFmax (min)], 3 individual CY-related traits expressing the weight of fresh curd (%CYCURD), curd solids (%CYSOLIDS), and curd moisture (%CYWATER) as a percentage of weight of milk processed and 4 milk nutrient and energy recoveries in the curd (RECFAT, RECPROTEIN, RECSOLIDS, and RECENERGY calculated as the % ratio between the nutrient in curd and the corresponding nutrient in processed milk), milk pH, and protein percentage. Each trait was analyzed separately. In total, 13,269 annotated genes were used in the analysis. The Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway databases were queried for enrichment analyses.
Overall, 21 Gene Ontology and 17 Kyoto Encyclopedia of Genes and Genomes categories were significantly associated (false discovery rate at 5%) with 7 traits (RCT, RCTeq, kCF, %CYSOLIDS, RECFAT, RECSOLIDS, and RECENERGY), with some being in common between traits. The significantly enriched categories included calcium signaling pathway, salivary secretion, metabolic pathways, carbohydrate digestion and absorption, the tight junction and the phosphatidylinositol pathways, as well as pathways related to the bovine mammary gland health status, and contained a total of 150 genes spanning all chromosomes but 9, 20, and 27. This study provided new insights into the regulation of bovine milk coagulation and cheese ability that were not captured by the GWAS.
Affiliation(s)
- C Dadousis: Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
- S Pegolo: Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
- G J M Rosa: Department of Animal Sciences, University of Wisconsin, Madison 53706; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison 53706
- D Gianola: Department of Animal Sciences, University of Wisconsin, Madison 53706; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison 53706
- G Bittante: Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
- A Cecchinato: Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
|
23
|
The 'heritability' of domestication and its functional partitioning in the pig. Heredity (Edinb) 2016; 118:160-168. [PMID: 27649617 DOI: 10.1038/hdy.2016.78] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/02/2015] [Revised: 07/04/2016] [Accepted: 07/04/2016] [Indexed: 11/08/2022]
Abstract
We propose to estimate the proportion of variance explained by regression on genome-wide markers (or genomic heritability) when wild/domestic status is considered the phenotype of interest. This approach differs from the standard Fst in that it can accommodate genetic similarity between individuals in a general form. We apply this strategy to complete genome data from 47 wild and domestic pigs from Asia and Europe. When we partitioned the total genomic variance into components associated with subsets of single nucleotide polymorphisms (SNPs) defined in terms of their annotation, we found that potentially deleterious non-synonymous mutations (9566 SNPs) explained as much genetic variance as the whole set of 25 million SNPs. This suggests that domestication may have affected protein sequence to a larger extent than regulatory or other kinds of mutations. A pathway-guided analysis revealed ovarian steroidogenesis and leptin signaling as highly relevant in domestication. The genomic regression approach proposed in this study revealed molecular processes not apparent through typical differentiation statistics. We propose that at least some of these processes are likely new discoveries because domestication is a dynamic process of genetic selection, which may not be completely characterized by a static metric like Fst. Nevertheless, and despite some particularly influential mutation types or pathways, our analyses tend to rule out a simplistic genetic basis for the domestication process: neither a single pathway nor a unique set of SNPs can explain the process as a whole.
|
24
|
Genomic Prediction for Quantitative Traits Is Improved by Mapping Variants to Gene Ontology Categories in Drosophila melanogaster. Genetics 2016; 203:1871-83. [PMID: 27235308 DOI: 10.1534/genetics.116.187161] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 01/15/2016] [Accepted: 05/19/2016] [Indexed: 01/28/2023]
Abstract
Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weights the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits.
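The contrast between GBLUP and GFBLUP described above can be sketched as two genomic relationship matrices, one for markers inside the genomic feature and one for the remaining markers, each with its own variance component. This is a minimal illustration, not the authors' implementation: the abstract does not specify how relationships are computed, so the VanRaden-style scaling, the function names, and the toy variance components below are all assumptions.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Genomic relationship matrix from an (individuals x markers) 0/1/2 matrix.

    Assumes a VanRaden-style construction; the marker set must contain
    at least one polymorphic marker so the scaling term is non-zero.
    """
    p = genotypes.mean(axis=0) / 2.0        # allele frequency per marker
    centered = genotypes - 2.0 * p          # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))     # sum of expected marker variances
    return centered @ centered.T / scale

def gfblup_covariance(genotypes, in_feature, var_feature, var_rest, var_resid):
    """Phenotypic covariance with separate weights for feature and other markers."""
    g_feature = vanraden_grm(genotypes[:, in_feature])
    g_rest = vanraden_grm(genotypes[:, ~in_feature])
    n = genotypes.shape[0]
    return var_feature * g_feature + var_rest * g_rest + var_resid * np.eye(n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes = rng.integers(0, 3, size=(6, 20)).astype(float)  # toy data
    in_feature = np.zeros(20, dtype=bool)
    in_feature[:5] = True                   # hypothetical feature: first 5 markers
    # Hypothetical variance components; in practice these are estimated (e.g. by REML).
    v = gfblup_covariance(genotypes, in_feature, 0.4, 0.2, 0.4)
    print(v.shape)
```

Under this sketch, GBLUP corresponds to a single relationship matrix built from all markers with one genetic variance component, whereas letting `var_feature` and `var_rest` differ is what allows the feature markers to be weighted differently.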
|