1
|
Peters SO, Kızılkaya K, Sinecen M, Mestav B, Thiruvenkadan AK, Thomas MG. Genomic Prediction Accuracies for Growth and Carcass Traits in a Brangus Heifer Population. Animals (Basel) 2023; 13:ani13071272. [PMID: 37048528 PMCID: PMC10093372 DOI: 10.3390/ani13071272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 03/30/2023] [Accepted: 04/04/2023] [Indexed: 04/14/2023] Open
Abstract
The predictive abilities and accuracies of genomic best linear unbiased prediction (GBLUP) and the Bayesian (BayesA, BayesB, BayesC and Lasso) genomic selection (GS) methods for economically important growth (birth, weaning, and yearling weights) and carcass (depth of rib fat, percent intramuscular fat, and longissimus muscle area) traits were characterized by estimating the linkage disequilibrium (LD) structure in Brangus heifers using single nucleotide polymorphism (SNP) markers. Sharp declines in LD were observed as the distance between SNP markers increased. The application of GBLUP and the Bayesian methods to obtain genomic estimated breeding values (GEBV) for growth and carcass traits within k-means and random clusters showed that k-means and random clustering gave quite similar heritability estimates, but the Bayesian methods resulted in lower heritability estimates, between 0.06 and 0.21 for growth and carcass traits, compared with those between 0.21 and 0.35 from GBLUP. Although the predictive ability of GBLUP and the Bayesian methods was quite similar for growth and carcass traits, the Bayesian methods overestimated the accuracies of GEBV because of their lower heritability estimates for growth and carcass traits. However, GBLUP resulted in accuracies of GEBV for growth and carcass traits that parallel previous reports.
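The link between heritability and reported accuracy in this abstract can be illustrated with a small sketch. It assumes, as is common in genomic prediction studies but not stated in the abstract itself, that accuracy is obtained by dividing predictive ability (the correlation between GEBV and phenotype) by the square root of the heritability estimate. The function name and all numbers below are hypothetical.

```python
import math


def gebv_accuracy(predictive_ability: float, heritability: float) -> float:
    """Scale predictive ability by sqrt(h2) (assumed convention, not from the abstract)."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    return predictive_ability / math.sqrt(heritability)


# Hypothetical values: the same predictive ability under two heritability estimates.
predictive_ability = 0.20
acc_low_h2 = gebv_accuracy(predictive_ability, 0.10)   # lower h2 estimate
acc_high_h2 = gebv_accuracy(predictive_ability, 0.30)  # higher h2 estimate

print(f"accuracy with h2 = 0.10: {acc_low_h2:.3f}")
print(f"accuracy with h2 = 0.30: {acc_high_h2:.3f}")
```

Under this assumption, an identical predictive ability yields a higher accuracy when the heritability estimate is lower, which is consistent with the overestimation the authors describe for the Bayesian methods.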
Affiliation(s)
- Sunday O Peters
- Department of Animal Science, Berry College, Mount Berry, GA 30149, USA
| | - Kadir Kızılkaya
- Department of Animal Science, Faculty of Agriculture, Aydin Adnan Menderes University, Aydin 09100, Turkey
| | - Mahmut Sinecen
- Department of Computer Engineering, Faculty of Engineering, Aydin Adnan Menderes University, Aydin 09100, Turkey
| | - Burcu Mestav
- Department of Statistics, Faculty of Arts and Sciences, Çanakkale Onsekiz Mart University, Terzioğlu Campus, Çanakkale 17100, Turkey
| | - Aranganoor K Thiruvenkadan
- Department of Animal Genetics and Breeding, Veterinary College and Research Institute, Tamil Nadu Veterinary and Animal Sciences University, Salem 637002, Tamil Nadu, India
| | | |
|
2
|
Li H, Wang Z, Xu L, Li Q, Gao H, Ma H, Cai W, Chen Y, Gao X, Zhang L, Gao H, Zhu B, Xu L, Li J. Genomic prediction of carcass traits using different haplotype block partitioning methods in beef cattle. Evol Appl 2022; 15:2028-2042. [PMID: 36540636 PMCID: PMC9753827 DOI: 10.1111/eva.13491] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/20/2022] [Accepted: 09/18/2022] [Indexed: 09/22/2023] Open
Abstract
Genomic prediction (GP) based on haplotype alleles can capture quantitative trait loci (QTL) effects and increase predictive ability because the haplotypes are expected to be in linkage disequilibrium (LD) with QTL. In this study, we constructed haploblocks using LD-based and fixed number of single nucleotide polymorphisms (fixed-SNP) methods with the Illumina BovineHD chip in beef cattle. To evaluate the performance of different haplotype block partitioning methods, we constructed haploblocks based on LD thresholds (from r² > 0.2 to r² > 0.8) and the number of fixed SNPs (5, 10, 20). The performance of predictive methods for three carcass traits, liveweight (LW), dressing percentage (DP), and longissimus dorsi muscle weight (LDMW), was evaluated using three approaches: GBLUP and BayesB models based on the SNPs; GHBLUP and BayesBH models based on the haploblocks; and GHBLUP+GBLUP and BayesBH+BayesB models based on the combined haploblocks and the nonblocked SNPs located between blocks. We found that the accuracies of the LD-based and fixed-SNP haplotype Bayesian methods outperformed the Bayesian models (up to 8.54 ± 7.44% and 5.74 ± 2.95%, respectively). GHBLUP showed a high improvement (up to 11.29 ± 9.87%) compared with GBLUP. The Bayesian models had higher accuracies than the BLUP models in most scenarios. The average computing time of the BayesBH+BayesB model was 29.3% lower than that of the BayesB model. The prediction accuracies using the LD-based haplotype method showed higher improvements than the fixed-SNP haplotype method. In addition, to avoid the influence of rare haplotypes generated from haplotype construction, we compared the performance of GP when filtering at four minor haplotype allele frequency (MHAF) thresholds (0.01, 0.025, 0.05, and 0.1) under different conditions (LD levels were set at r² > 0.3, and the fixed number of SNPs was 5). We found the optimal MHAF threshold for LW was 0.01, and the optimal MHAF threshold for DP and LDMW was 0.025.
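The fixed-SNP partitioning and MHAF filtering described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy phased haplotypes, and the 0.2 threshold used for the toy data are all hypothetical, and the handling of a shorter final block is an assumption.

```python
from collections import Counter


def fixed_snp_blocks(n_snps, block_size):
    """Return (start, end) index pairs for consecutive blocks of block_size SNPs.

    The last block may be shorter if n_snps is not a multiple of block_size
    (an assumption; the abstract does not say how remainders are handled).
    """
    return [(s, min(s + block_size, n_snps)) for s in range(0, n_snps, block_size)]


def haplotype_allele_freqs(haplotypes, start, end):
    """Frequency of each distinct haplotype allele observed within one block."""
    counts = Counter(tuple(h[start:end]) for h in haplotypes)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def filter_by_mhaf(freqs, mhaf):
    """Keep only haplotype alleles whose frequency is at least the MHAF threshold."""
    return {allele: f for allele, f in freqs.items() if f >= mhaf}


# Toy phased haplotypes: 8 haplotypes x 4 biallelic SNPs coded 0/1 (hypothetical data).
haplotypes = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
]

blocks = fixed_snp_blocks(n_snps=4, block_size=2)
first_block_freqs = haplotype_allele_freqs(haplotypes, *blocks[0])
kept = filter_by_mhaf(first_block_freqs, mhaf=0.2)
print(blocks, kept)
```

In the study the thresholds were 0.01 to 0.1 on a real population; the larger toy threshold is only there so that the filter removes something in an eight-haplotype example.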
Affiliation(s)
- Hongwei Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zezhao Wang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lei Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Qian Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Han Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Haoran Ma
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Wentao Cai
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yan Chen
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xue Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lupei Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Huijiang Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Bo Zhu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lingyang Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Junya Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| |
|
3
|
Yan X, Zhang T, Liu L, Yu Y, Yang G, Han Y, Gong G, Wang F, Zhang L, Liu H, Li W, Yan X, Mao H, Li Y, Du C, Li J, Zhang Y, Wang R, Lv Q, Wang Z, Zhang J, Liu Z, Wang Z, Su R. Accuracy of Genomic Selection for Important Economic Traits of Cashmere and Meat Goats Assessed by Simulation Study. Front Vet Sci 2022; 9:770539. [PMID: 35372544 PMCID: PMC8966406 DOI: 10.3389/fvets.2022.770539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2021] [Accepted: 01/24/2022] [Indexed: 11/13/2022] Open
Abstract
Genomic selection in plants and animals has become a standard tool for breeding because of the advantages of high accuracy and short generation intervals. Implementation of this technology is hindered by the high cost of genotyping and other factors. The aim of this study was to determine an optional marker density panel and reference population size for genomic selection in goats, with speculation on the number of QTLs that affect the important economic traits of goats. In addition, the effect of buck population size in the reference population on the accuracy of genomic estimated breeding values (GEBV) was discussed. Based on previous genetic evaluation results for Inner Mongolia White Cashmere Goats, live body weight (LBW, h² = 0.11) and fiber diameter (FD, h² = 0.34) were chosen to perform genomic selection in this study. Reasonable genome parameters and generation transmission processes were set, and phenotypic and genotype data for the two traits were simulated. Then, different sizes of reference and validation populations were selected from the progeny. The GEBVs were obtained by six methods: GBLUP (Genomic Best Linear Unbiased Prediction), ssGBLUP (Single Step Genomic Best Linear Unbiased Prediction), BayesA, BayesB, Bayesian ridge regression, and Bayesian LASSO. The correlation coefficient between the predicted and realized phenotypes from simulation was calculated and used as a measure of the accuracy of GEBV for each trait. The results showed that the medium marker density panel (45 K) could be used for genomic selection in goats while ensuring the accuracy of the GEBV. A reference population size of 1,500 achieved greater genetic progress in genomic selection for fiber diameter and live body weight in goats than smaller reference populations. The accuracy of the GEBV for live body weight and fiber diameter was better when the number of QTLs was 100 and 50, respectively. 
Additionally, the accuracy of GEBV was discovered to be good when the buck population size was up to 200. Meanwhile, the accuracy of the GEBV for medium heritability traits (FDs) was found to be higher than the accuracy of the GEBV for low heritability traits (LBWs). These findings will provide theoretical guidance for genomic selection in goats by using real data.
Affiliation(s)
- Xiaochun Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Tao Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Inner Mongolia Bigvet Co., Ltd., Hohhot, China
| | - Lichun Liu
- College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot, China
| | - Yongsheng Yu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Guang Yang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Yaqian Han
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Gao Gong
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Fenghong Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Lei Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Hongfu Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Wenze Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Xiaomin Yan
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Haoyu Mao
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Yaming Li
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Chen Du
- Department of Obstetrics and Gynaecology, Inner Mongolia Medical University, Hohhot, China
| | - Jinquan Li
- Key Laboratory of Mutton Sheep Genetics and Breeding, Ministry of Agriculture, Hohhot, China
- Key Laboratory of Animal Genetics, Breeding and Reproduction in Inner Mongolia Autonomous Region, Hohhot, China
- Engineering Research Centre for Goat Genetics and Breeding, Inner Mongolia Autonomous Region, Hohhot, China
| | - Yanjun Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Ruijun Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Qi Lv
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Zhixin Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Jiaxin Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Zhihong Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
| | - Zhiying Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- *Correspondence: Zhiying Wang
| | - Rui Su
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Rui Su
| |
|
4
|
Ahmar S, Ballesta P, Ali M, Mora-Poblete F. Achievements and Challenges of Genomics-Assisted Breeding in Forest Trees: From Marker-Assisted Selection to Genome Editing. Int J Mol Sci 2021; 22:10583. [PMID: 34638922 PMCID: PMC8508745 DOI: 10.3390/ijms221910583] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/03/2021] [Revised: 09/26/2021] [Accepted: 09/27/2021] [Indexed: 12/23/2022] Open
Abstract
Forest tree breeding efforts have focused mainly on improving traits of economic importance, selecting trees suited to new environments, or generating trees that are more resilient to biotic and abiotic stressors. This review describes various methods of forest tree selection assisted by genomics and the main technological challenges and achievements in research at the genomic level. Due to the long rotation time of a forest plantation and the resulting long generation times necessary to complete a breeding cycle, the use of advanced techniques alongside traditional breeding has been necessary, allowing the use of more precise methods for determining the genetic architecture of traits of interest, such as genome-wide association studies (GWASs) and genomic selection (GS). In this sense, the main factors that determine the accuracy of genomic prediction models are also addressed. In turn, the introduction of genome editing opens the door to new possibilities in forest trees, especially clustered regularly interspaced short palindromic repeats and CRISPR-associated protein 9 (CRISPR/Cas9). It is a highly efficient genome editing technique that has been used to implement targeted changes at specific places in the genome of a forest tree. However, forest trees still lack an efficient transformation method and a sufficient number of genotypes suitable for CRISPR/Cas9. This challenge could be addressed with the use of the newly developing technique GRF-GIF with speed breeding.
Affiliation(s)
- Sunny Ahmar
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
| | - Paulina Ballesta
- The National Fund for Scientific and Technological Development, Av. del Agua 3895, Talca 3460000, Chile
| | - Mohsin Ali
- Department of Forestry and Range Management, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
| | - Freddy Mora-Poblete
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile
| |
|
5
|
Gogolev YV, Ahmar S, Akpinar BA, Budak H, Kiryushkin AS, Gorshkov VY, Hensel G, Demchenko KN, Kovalchuk I, Mora-Poblete F, Muslu T, Tsers ID, Yadav NS, Korzun V. OMICs, Epigenetics, and Genome Editing Techniques for Food and Nutritional Security. PLANTS (BASEL, SWITZERLAND) 2021; 10:1423. [PMID: 34371624 PMCID: PMC8309286 DOI: 10.3390/plants10071423] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/13/2021] [Revised: 06/30/2021] [Accepted: 07/07/2021] [Indexed: 12/22/2022]
Abstract
The incredible success of crop breeding and agricultural innovation in the last century greatly contributed to the Green Revolution, which significantly increased yields and ensured food security despite the population explosion. However, new challenges such as rapid climate change, deteriorating soil, and the accumulation of pollutants require much faster responses and more effective solutions that cannot be achieved through traditional breeding. Further prospects for increasing the efficiency of agriculture are undoubtedly associated with including in the breeding strategy new knowledge obtained using high-throughput technologies, and new tools that will make it possible to design new plant genomes and predict the desired phenotype. This article provides an overview of the current state of research in these areas, as well as the study of soil and plant microbiomes and the prospective use of their potential in the new field of microbiome engineering. In terms of genomic and phenomic predictions, we also propose an integrated approach that combines high-density genotyping and high-throughput phenotyping techniques, which can improve the prediction accuracy of quantitative traits in crop species.
Affiliation(s)
- Yuri V. Gogolev
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Kazan Institute of Biochemistry and Biophysics, 420111 Kazan, Russia;
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Laboratory of Plant Infectious Diseases, 420111 Kazan, Russia;
| | - Sunny Ahmar
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile; (S.A.); (F.M.-P.)
| | | | - Hikmet Budak
- Montana BioAg Inc., Missoula, MT 59802, USA; (B.A.A.); (H.B.)
| | - Alexey S. Kiryushkin
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute of the Russian Academy of Sciences, 197376 Saint Petersburg, Russia; (A.S.K.); (K.N.D.)
| | - Vladimir Y. Gorshkov
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Kazan Institute of Biochemistry and Biophysics, 420111 Kazan, Russia;
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Laboratory of Plant Infectious Diseases, 420111 Kazan, Russia;
| | - Goetz Hensel
- Centre for Plant Genome Engineering, Institute of Plant Biochemistry, Heinrich-Heine-University, 40225 Dusseldorf, Germany;
- Centre of the Region Haná for Biotechnological and Agricultural Research, Czech Advanced Technology and Research Institute, Palacký University Olomouc, 78371 Olomouc, Czech Republic
| | - Kirill N. Demchenko
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute of the Russian Academy of Sciences, 197376 Saint Petersburg, Russia; (A.S.K.); (K.N.D.)
| | - Igor Kovalchuk
- Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (I.K.); (N.S.Y.)
| | - Freddy Mora-Poblete
- Institute of Biological Sciences, University of Talca, 1 Poniente 1141, Talca 3460000, Chile; (S.A.); (F.M.-P.)
| | - Tugdem Muslu
- Faculty of Engineering and Natural Sciences, Sabanci University, 34956 Istanbul, Turkey;
| | - Ivan D. Tsers
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Laboratory of Plant Infectious Diseases, 420111 Kazan, Russia;
| | - Narendra Singh Yadav
- Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada; (I.K.); (N.S.Y.)
| | - Viktor Korzun
- Federal Research Center Kazan Scientific Center of Russian Academy of Sciences, Laboratory of Plant Infectious Diseases, 420111 Kazan, Russia;
- KWS SAAT SE & Co. KGaA, Grimsehlstr. 31, 37555 Einbeck, Germany
| |
|
6
|
Al-Khudhair A, VanRaden PM, Null DJ, Li B. Marker selection and genomic prediction of economically important traits using imputed high-density genotypes for 5 breeds of dairy cattle. J Dairy Sci 2021; 104:4478-4485. [PMID: 33612229 DOI: 10.3168/jds.2020-19260] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/09/2020] [Accepted: 11/22/2020] [Indexed: 11/19/2022]
Abstract
Marker sets used in US dairy genomic predictions were previously expanded by including high-density (HD) or sequence markers with the largest effects for Holstein breed only. Other non-Holstein breeds lacked enough HD genotyped animals to be used as a reference population at that time, and thus were not included in the genomic prediction. Recently, numbers of non-Holstein breeds genotyped using HD panels reached an acceptable level for imputation and marker selection, allowing HD genomic prediction and HD marker selection for Holstein plus 4 other breeds. Genotypes for 351,461 Holsteins, 347,570 Jerseys, 42,346 Brown Swiss, 9,364 Ayrshires (including Red dairy cattle), and 4,599 Guernseys were imputed to the HD marker list that included 643,059 SNP. The separate HD reference populations included Illumina BovineHD (San Diego, CA) genotypes for 4,012 Holsteins, 407 Jerseys, 181 Brown Swiss, 527 Ayrshires, and 147 Guernseys. The 643,059 variants included the HD SNP and all 79,254 (80K) genetic markers and QTL used in routine national genomic evaluations. Before imputation, approximately 91 to 97% of genotypes were unknown for each breed; after imputation, 1.1% of Holstein, 3.2% of Jersey, 6.7% of Brown Swiss, 4.8% of Ayrshire, and 4.2% of Guernsey alleles remained unknown due to lower density haplotypes that had no matching HD haplotype. The higher remaining missing rates in non-Holstein breeds are mainly due to fewer HD genotyped animals in the imputation reference populations. Allele effects for up to 39 traits were estimated separately within each breed using phenotypic reference populations that included up to 6,157 Jersey males and 110,130 Jersey females. Correlations of HD with 80K genomic predictions for young animals averaged 0.986, 0.989, 0.985, 0.992, and 0.978 for Jersey, Ayrshire, Brown Swiss, Guernsey, and Holstein breeds, respectively. 
Correlations were highest for yield traits (about 0.991) and lowest for foot angle and rear legs-side view (0.981 and 0.982, respectively). Some HD effects were more than twice as large as the largest 80K SNP effect, and HD markers had larger effects than nearby 80K markers for many breed-trait combinations. Previous studies selected and included markers with large effects for Holstein traits; the newly selected HD markers should also improve non-Holstein and crossbred genomic predictions and were added to official US genomic predictions in April 2020.
Affiliation(s)
- A Al-Khudhair
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350
| | - P M VanRaden
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350.
| | - D J Null
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350
| | - B Li
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350
| |
|
7
|
Peters SO, Kızılkaya K, Ibeagha-Awemu EM, Sinecen M, Zhao X. Comparative accuracies of genetic values predicted for economically important milk traits, genome-wide association, and linkage disequilibrium patterns of Canadian Holstein cows. J Dairy Sci 2020; 104:1900-1916. [PMID: 33358789 DOI: 10.3168/jds.2020-18489] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/09/2020] [Accepted: 08/10/2020] [Indexed: 11/19/2022]
Abstract
Genomic selection methodologies and genome-wide association studies use powerful statistical procedures that correlate large amounts of high-density SNP genotypes and phenotypic data. Actual 305-d milk (MY), fat (FY), and protein (PY) yield data on 695 cows and 76,355 genotyping-by-sequencing-generated SNP marker genotypes from Canadian Holstein dairy cows were used to characterize linkage disequilibrium (LD) structure of Canadian Holstein cows. Also, the comparison of pedigree-based BLUP, genomic BLUP (GBLUP), and Bayesian (BayesB) statistical methods in the genomic selection methodologies and the comparison of Bayesian ridge regression and BayesB statistical methods in the genome-wide association studies were carried out for MY, FY, and PY. Results from LD analysis revealed that as marker distance decreases, LD increases through chromosomes. However, unexpected high peaks in LD were observed between marker pairs with larger marker distances on all chromosomes. The GBLUP and BayesB models resulted in similar heritability estimates through 10-fold cross-validation for MY and PY; however, the GBLUP model resulted in higher heritability estimates than BayesB model for FY. The predictive ability of GBLUP model was significantly lower than that of BayesB for MY, FY, and PY. Association analyses indicated that 28 high-effect markers and markers on Bos taurus autosome 14 located within 6 genes (DOP1B, TONSL, CPSF1, ADCK5, PARP10, and GRINA) associated significantly with FY.
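The relationship between marker distance and LD reported in this abstract can be illustrated with a short sketch. It assumes LD is measured as the squared Pearson correlation between genotype dosages (0/1/2) of two SNPs, which is one common genotype-based estimate; the abstract does not say which estimator the authors used. The SNP names, positions, and genotypes are hypothetical toy data.

```python
from itertools import combinations


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNP dosage vectors (0/1/2)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    s11 = sum((a - m1) ** 2 for a in g1)
    s22 = sum((b - m2) ** 2 for b in g2)
    if s11 == 0 or s22 == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return (s12 * s12) / (s11 * s22)


def pairwise_ld(genotypes, positions):
    """Return (distance_bp, r2) for every SNP pair on one chromosome, nearest pairs first."""
    pairs = []
    for a, b in combinations(sorted(genotypes), 2):
        distance = abs(positions[a] - positions[b])
        pairs.append((distance, genotype_r2(genotypes[a], genotypes[b])))
    return sorted(pairs)


# Hypothetical dosages for six animals at three SNPs on the same chromosome.
genotypes = {
    "snp1": [0, 1, 2, 1, 0, 2],
    "snp2": [0, 1, 2, 1, 0, 2],
    "snp3": [2, 0, 1, 1, 0, 2],
}
positions = {"snp1": 1_000, "snp2": 6_000, "snp3": 900_000}

ld = pairwise_ld(genotypes, positions)
for distance, r2 in ld:
    print(f"{distance:>8} bp  r2 = {r2:.4f}")
```

Averaging r² within distance bins over all pairs would give the usual LD decay curve; unexpectedly high values at large distances, as the authors observed, would show up as outliers in the far bins.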
Affiliation(s)
- Sunday O Peters
- Department of Animal Science, Berry College, Mount Berry, GA 30149; Department of Animal and Dairy Science, University of Georgia, Athens 30602.
| | - Kadir Kızılkaya
- Department of Animal Science, Faculty of Agriculture, Aydin Adnan Menderes University, Aydin, 09100, Turkey
| | - Eveline M Ibeagha-Awemu
- Agriculture and Agri-Food Canada, Sherbrooke Research and Development Centre, 2000 Rue College, Sherbrooke, QC, J1M 0C8 Canada
| | - Mahmut Sinecen
- Department of Computer Engineering, Faculty of Engineering, Aydin Adnan Menderes University, Aydin, 09100, Turkey
| | - Xin Zhao
- Department of Animal Science, McGill University, 21,111 Lakeshore Road, Ste-Anne-De-Bellevue, QC, H9S 3V9 Canada
| |
|
8
|
Ge F, Jia C, Bao P, Wu X, Liang C, Yan P. Accuracies of Genomic Prediction for Growth Traits at Weaning and Yearling Ages in Yak. Animals (Basel) 2020; 10:E1793. [PMID: 33023134 PMCID: PMC7650705 DOI: 10.3390/ani10101793] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/06/2020] [Revised: 09/26/2020] [Accepted: 09/28/2020] [Indexed: 12/20/2022] Open
Abstract
Genomic selection is a promising breeding strategy that has been used in considerable numbers of breeding projects due to its highly accurate results. Yak are rare mammals that are remarkable because of their ability to survive in the extreme and harsh conditions predominantly at the so-called "roof of the world"-the Qinghai-Tibetan Plateau. In the current study, we conducted an exploration of the feasibility of genomic evaluation and compared the predictive accuracy of early growth traits with five different approaches. In total, four growth traits were measured in 354 yaks, including body weight, withers height, body length, and chest girth in two early stages of development (weaning and yearling). Genotyping was implemented using the Illumina BovineHD BeadChip. The predictive accuracy was calculated through five-fold cross-validation in five classical statistical methods including genomic best linear unbiased prediction (GBLUP) and four Bayesian methods. Body weights at 30 months in the same yak population were also measured to evaluate the prediction at 6 months. The results indicated that the predictive accuracy for the early growth traits of yak ranged from 0.147 to 0.391. Similar performance was found for the GBLUP and Bayesian methods for most growth traits. Among the Bayesian methods, Bayes B outperformed Bayes A in the majority of traits. The average correlation coefficient between the prediction at 6 months using different methods and observations at 30 months was 0.4. These results indicate that genomic prediction is feasible for early growth traits in yak. Considering that genomic selection is necessary in yak breeding projects, the present study provides promising reference for future applications.
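The five-fold cross-validation used in this abstract to measure predictive accuracy can be sketched as below. The fold assignment and the correlation step follow the general procedure; the prediction model is a deliberately simple stand-in (a one-covariate linear regression) rather than GBLUP or a Bayesian method, and the data, function names, and seed are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_accuracy(x, y, k=5, seed=0):
    """Mean over folds of corr(predicted, observed) in the held-out animals."""
    fold_accuracies = []
    for test in k_fold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        # Stand-in model: simple linear regression fitted on the training folds only.
        mx = sum(x[i] for i in train) / len(train)
        my = sum(y[i] for i in train) / len(train)
        slope = (sum((x[i] - mx) * (y[i] - my) for i in train)
                 / sum((x[i] - mx) ** 2 for i in train))
        predicted = [my + slope * (x[i] - mx) for i in test]
        observed = [y[i] for i in test]
        fold_accuracies.append(pearson(predicted, observed))
    return sum(fold_accuracies) / len(fold_accuracies)


# Hypothetical data: 20 animals, one covariate, phenotype with a deterministic wobble.
x = [float(v) for v in range(20)]
y = [2.0 * v + (1.0 if int(v) % 2 else -1.0) for v in x]
accuracy = cross_validated_accuracy(x, y, k=5)
print(f"mean held-out correlation: {accuracy:.3f}")
```

Replacing the stand-in regression with a genomic model leaves the rest of the loop unchanged, which is why the same cross-validation scheme can compare GBLUP with the Bayesian methods.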
Affiliation(s)
| | | | | | | | - Chunnian Liang
- Key Laboratory of Yak Breeding Engineering of Gansu Province, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (F.G.); (C.J.); (P.B.); (X.W.)
| | - Ping Yan
- Key Laboratory of Yak Breeding Engineering of Gansu Province, Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lanzhou 730050, China; (F.G.); (C.J.); (P.B.); (X.W.)
| |
|
9
|
Ma X, Christensen OF, Gao H, Huang R, Nielsen B, Madsen P, Jensen J, Ostersen T, Li P, Shirali M, Su G. Prediction of breeding values for group-recorded traits including genomic information and an individually recorded correlated trait. Heredity (Edinb) 2020; 126:206-217. [PMID: 32665691 PMCID: PMC7852592 DOI: 10.1038/s41437-020-0339-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/09/2020] [Revised: 06/25/2020] [Accepted: 06/25/2020] [Indexed: 12/25/2022] Open
Abstract
Records on groups of individuals could be valuable for predicting breeding values when a trait is difficult or costly to measure on single individuals, such as feed intake and egg production. Adding genomic information has shown improvement in the accuracy of genetic evaluation of quantitative traits with individual records. Here, we investigated the value of genomic information for traits with group records. Besides, we investigated the improvement in accuracy of genetic evaluation for group-recorded traits when including information on a correlated trait with individual records. The study was based on a simulated pig population, including three scenarios of group structure and size. The results showed that both the genomic information and a correlated trait increased the accuracy of estimated breeding values (EBVs) for traits with group records. The accuracies of EBV obtained from group records with a size 24 were much lower than those with a size 12. Random assignment of animals to pens led to lower accuracy due to the weaker relationship between individuals within each group. It suggests that group records are valuable for genetic evaluation of a trait that is difficult to record on individuals, and the accuracy of genetic evaluation can be considerably increased using genomic information. Moreover, the genetic evaluation for a trait with group records can be greatly improved using a bivariate model, including correlated traits that are recorded individually. For efficient use of group records in genetic evaluation, relatively small group size and close relationships between individuals within one group are recommended.
Affiliation(s)
- Xiang Ma
- Institute of Swine Science, Nanjing Agricultural University, Nanjing, 210095, China; College of Animal Science and Technology, College of Veterinary Medicine, Zhejiang Agriculture and Forest University, Hangzhou, 311300, China; Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Ole F Christensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Hongding Gao
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Ruihua Huang
- Institute of Swine Science, Nanjing Agricultural University, Nanjing, 210095, China
| | | | - Per Madsen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Just Jensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Tage Ostersen
- SEGES, Pig Research Centre, 1609, Copenhagen, Denmark
| | - Pinghua Li
- Institute of Swine Science, Nanjing Agricultural University, Nanjing, 210095, China
| | - Mahmoud Shirali
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark
| | - Guosheng Su
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark.
| |
Collapse
|
10
|
Wang S, Xu Y, Qu H, Cui Y, Li R, Chater JM, Yu L, Zhou R, Ma R, Huang Y, Qiao Y, Hu X, Xie W, Jia Z. Boosting predictabilities of agronomic traits in rice using bivariate genomic selection. Brief Bioinform 2020; 22:5867560. [PMID: 34020535 DOI: 10.1093/bib/bbaa103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/01/2020] [Revised: 04/27/2020] [Accepted: 05/04/2020] [Indexed: 12/24/2022] Open
Abstract
The multivariate genomic selection (GS) models have not been adequately studied and their potential remains unclear. In this study, we developed a highly efficient bivariate (2D) GS method and demonstrated its significant advantages over the univariate (1D) rival methods using a rice dataset, where four traditional traits (i.e. yield, 1000-grain weight, grain number and tiller number) as well as 1000 metabolomic traits were analyzed. The novelty of the method is the incorporation of the HAT methodology in the 2D BLUP GS model such that the computational efficiency has been dramatically increased by avoiding the conventional cross-validation. The results indicated that (1) the 2D BLUP-HAT GS analysis generally produces higher predictabilities for two traits than those achieved by the analysis of individual traits using 1D GS model, and (2) selected metabolites may be utilized as ancillary traits in the new 2D BLUP-HAT GS method to further boost the predictability of traditional traits, especially for agronomically important traits with low 1D predictabilities.
Affiliation(s)
- Han Qu
- University of California, Riverside, USA
- Yanru Cui
- Hebei Agricultural University, China
- Ruidong Li
- University of California, Riverside, USA
- Lei Yu
- University of California, Riverside, USA
- Rui Zhou
- South China University of Technology, China
- Yiru Qiao
- University of California, Riverside, USA
- Xuehai Hu
- Huazhong Agricultural University, China
- Weibo Xie
- Huazhong Agricultural University, China
- Zhenyu Jia
- University of California, Riverside, USA
|
11
|
Bayesian modeling reveals host genetics associated with rumen microbiota jointly influence methane emission in dairy cows. ISME JOURNAL 2020; 14:2019-2033. [PMID: 32366970 PMCID: PMC7368015 DOI: 10.1038/s41396-020-0663-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 05/07/2019] [Revised: 03/25/2020] [Accepted: 04/15/2020] [Indexed: 12/11/2022]
Abstract
Reducing methane emissions from livestock production is of great importance for the sustainable management of the Earth’s environment. Rumen microbiota play an important role in producing biogenic methane. However, knowledge of how host genetics influences variation in ruminal microbiota and their joint effects on methane emission is limited. We analyzed data from 750 dairy cows, using a Bayesian model to simultaneously assess the impact of host genetics and microbiota on host methane emission. We estimated that host genetics and microbiota explained 24% and 7%, respectively, of variation in host methane levels. In this Bayesian model, one bacterial genus explained up to 1.6% of the total microbiota variance. Further analysis was performed by a mixed linear model to estimate variance explained by host genomics in abundances of microbial genera and operational taxonomic units (OTU). Highest estimates were observed for a bacterial OTU with 33%, for an archaeal OTU with 26%, and for a microbial genus with 41% heritability. However, after multiple testing correction for the number of genera and OTUs modeled, none of the effects remained significant. We also used a mixed linear model to test effects of individual host genetic markers on microbial genera and OTUs. In this analysis, genetic markers inside host genes ABS4 and DNAJC10 were found associated with microbiota composition. We show that a Bayesian model can be utilized to model complex structure and relationship between microbiota simultaneously and their interaction with host genetics on methane emission. The host genome explains a significant fraction of between-individual variation in microbial abundance. Individual microbial taxonomic groups each only explain a small amount of variation in methane emissions. The identification of genes and genetic markers suggests that it is possible to design strategies for breeding cows with desired microbiota composition associated with phenotypes.
|
12
|
Genomic prediction and GWAS of yield, quality and disease-related traits in spring barley and winter wheat. Sci Rep 2020; 10:3347. [PMID: 32099054 PMCID: PMC7042356 DOI: 10.1038/s41598-020-60203-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/27/2019] [Accepted: 02/07/2020] [Indexed: 11/09/2022] Open
Abstract
Genome-wide association study (GWAS) and genomic prediction (GP) are extensively employed to accelerate genetic gain and identify QTL in plant breeding. In this study, 1,317 spring barley and 1,325 winter wheat breeding lines from a commercial breeding program were genotyped with the Illumina 9 K barley or 15 K wheat SNP-chip, and phenotyped in multiple years and locations. For GWAS, in spring barley, a QTL on chr. 4H associated with powdery mildew and ramularia resistance was found. There were several SNPs on chr. 4H showing genome-wide significance with yield traits. In winter wheat, GWAS identified two SNPs on chr. 6A, and one SNP on chr. 1B, significantly associated with the quality trait moisture, as well as one SNP located on chr. 5B associated with starch content in the seeds. The significant SNPs identified by multiple trait GWAS were generally the same as those found in single trait GWAS. GWAS including genotype-location information in the model identified significant SNPs in each tested location, which were not found previously when including all locations in the GWAS. For GP, in spring barley, GP using the Bayesian Power Lasso model had higher accuracy than ridge regression BLUP (rrBLUP) for powdery mildew and yield traits, whereas the prediction accuracies were similar using the Bayesian Power Lasso model and rrBLUP for yield traits in winter wheat.
|
13
|
Kristensen PS, Jensen J, Andersen JR, Guzmán C, Orabi J, Jahoor A. Genomic Prediction and Genome-Wide Association Studies of Flour Yield and Alveograph Quality Traits Using Advanced Winter Wheat Breeding Material. Genes (Basel) 2019; 10:genes10090669. [PMID: 31480460 PMCID: PMC6770321 DOI: 10.3390/genes10090669] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/08/2019] [Revised: 08/26/2019] [Accepted: 08/29/2019] [Indexed: 12/02/2022] Open
Abstract
Use of genetic markers and genomic prediction might improve genetic gain for quality traits in wheat breeding programs. Here, flour yield and Alveograph quality traits were inspected in 635 F6 winter wheat breeding lines from two breeding cycles. Genome-wide association studies revealed single nucleotide polymorphisms (SNPs) on chromosome 5D significantly associated with flour yield, Alveograph P (dough tenacity), and Alveograph W (dough strength). Additionally, SNPs on chromosome 1D were associated with Alveograph P and W, SNPs on chromosome 1B were associated with Alveograph P, and SNPs on chromosome 4A were associated with Alveograph L (dough extensibility). Predictive abilities based on genomic best linear unbiased prediction (GBLUP) models ranged from 0.50 for flour yield to 0.79 for Alveograph W based on a leave-one-out cross-validation strategy. Predictive abilities were negatively affected by smaller training set sizes, lower genetic relationship between lines in training and validation sets, and by genotype–environment (G×E) interactions. Bayesian Power Lasso models and genomic feature models resulted in similar or slightly improved predictions compared to GBLUP models. SNPs with the largest effects can be used for screening large numbers of lines in early generations in breeding programs to select lines that potentially have good quality traits. In later generations, genomic predictions might be used for a more accurate selection of high quality wheat lines.
Affiliation(s)
- Just Jensen
- Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- Carlos Guzmán
- Departamento de Genética, Escuela Técnica Superior de Ingeniería Agronómica y de Montes, Edificio Gregor Mendel, Campus de Rabanales, Universidad de Córdoba, CeiA3, 14071 Córdoba, Spain
- Ahmed Jahoor
- Nordic Seed A/S, 8300 Odder, Denmark
- Department of Plant Breeding, The Swedish University of Agricultural Sciences, 23053 Alnarp, Sweden
|
14
|
Improvement of genomic prediction by integrating additional single nucleotide polymorphisms selected from imputed whole genome sequencing data. Heredity (Edinb) 2019; 124:37-49. [PMID: 31278370 PMCID: PMC6906477 DOI: 10.1038/s41437-019-0246-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 01/30/2019] [Revised: 05/11/2019] [Accepted: 06/17/2019] [Indexed: 11/10/2022] Open
Abstract
The availability of whole genome sequencing (WGS) data enables the discovery of causative single nucleotide polymorphisms (SNPs) or SNPs in high linkage disequilibrium with causative SNPs. This study investigated effects of integrating SNPs selected from imputed WGS data into the data of 54K chip on genomic prediction in Danish Jersey. The WGS SNPs, mainly including peaks of quantitative trait loci, structure variants, regulatory regions of genes, and SNPs within genes with strong effects predicted with variant effect predictor, were selected in previous analyses for dairy breeds in Denmark–Finland–Sweden (DFS) and France (FRA). Animals genotyped with 54K chip, standard LD chip, and customized LD chip which covered selected WGS SNPs and SNPs in the standard LD chip, were imputed to 54K together with DFS and FRA SNPs. Genomic best linear unbiased prediction (GBLUP) and Bayesian four-distribution mixture models considering 54K and selected WGS SNPs as one (a one-component model) or two separate genetic components (a two-component model) were used to predict breeding values. For milk production traits and mastitis, both DFS (0.025) and FRA (0.029) sets of additional WGS SNPs improved reliabilities, and inclusions of all selected WGS SNPs generally achieved highest improvements of reliabilities (0.034). A Bayesian four-distribution model yielded higher reliabilities than a GBLUP model for milk and protein, but extra gains in reliabilities from using selected WGS SNPs were smaller for a Bayesian four-distribution model than a GBLUP model. Generally, no significant difference was observed between one-component and two-component models, except for using GBLUP models for milk.
|
15
|
Ma P, Lund MS, Aamand GP, Su G. Use of a Bayesian model including QTL markers increases prediction reliability when test animals are distant from the reference population. J Dairy Sci 2019; 102:7237-7247. [PMID: 31155255 DOI: 10.3168/jds.2018-15815] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/09/2018] [Accepted: 03/31/2019] [Indexed: 01/23/2023]
Abstract
Relatedness between reference and test animals has an important effect on the reliability of genomic prediction for test animals. Because genomic prediction has been widely applied in practical cattle breeding and bulls have been selected according to genomic breeding value without progeny testing, the sires or grandsires of candidates might not have phenotypic information and might not be in the reference population when the candidates are selected. The objective of this study was to investigate the decreasing trend of the reliability of genomic prediction given distant reference populations, using genomic best linear unbiased prediction (GBLUP) and Bayesian variable selection models with or without including the quantitative trait locus (QTL) markers detected from sequencing data. The data used in this study consisted of 22,242 bulls genotyped using the 54K SNP array from EuroGenomics. Among them, 1,444 Danish bulls born from 2006 to 2010 were selected as test animals. Different reference populations with varying relationships to test animals were created according to pedigree-based relationships. The reference individuals having a relationship with one or more test animals higher than 0.4 (scenario ρ < 0.4), 0.2 (ρ < 0.2), or 0.1 (ρ < 0.1, where ρ = relationship coefficient) were removed from reference sets; these represented the distance between reference and test animals being 2 generations, 3 generations, and 4 generations, respectively. Imputed whole-genome sequencing data of bulls from Denmark were used to conduct a genome-wide association study (GWAS). A small number of significant variants (QTL markers) from the GWAS were added to the array data. To compare the effects of different models, the basic GBLUP model, a Bayesian selection variable model, a GBLUP model with 2 components of genetic effects, and a Bayesian model with pooled array data and QTL markers were used for estimating genomic estimated breeding values (GEBV) of test animals. 
The reliability of genomic prediction decreased when the test animals were more generations away from the reference population. The reliability of genomic prediction was 0.461 for 1 generation away and 0.396 for 3 generations away, with the same number of individuals in the reference set, using a GBLUP model with chip markers only. The results showed that using the Bayesian method and QTL markers improved the reliability of genomic prediction in all scenarios of relationship between test and reference animals, with gains ranging from 1.3% to 65.1% (the latter for 4 generations away with only 841 individuals in the reference set). However, most gains were for predictions of milk yield and fat yield. There was little improvement for predictions of protein yield and mastitis, and no improvement for prediction of fertility, except for scenario ρ < 0.1, in which there was a large improvement for predictions of all traits. On the other hand, models including more than 10% polygenic effect decreased prediction reliability when the relationship between test and reference animals was distant.
Affiliation(s)
- Peipei Ma
- Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, P.R. China; Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830, Aarhus, Denmark
- Mogens S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830, Aarhus, Denmark
- Gert P Aamand
- NAV Nordic Cattle Genetic Evaluation, DK-8200, Aarhus, Denmark
- Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830, Aarhus, Denmark
|
16
|
Abstract
Genomic Selection (GS) is a method in plant breeding to predict the genetic value of untested lines based on genome-wide marker data. The method has been widely explored with simulated data and also in real plant breeding programs. However, the optimal strategy and stage for implementation of GS in a plant-breeding program is still uncertain. The accuracy of GS has proven to be affected by the data used in the GS model, including size of the training population, relationships between individuals, marker density, and use of pedigree information. GS is commonly used to predict the additive genetic value of a line, whereas non-additive genetics are often disregarded. In this review, we provide a background knowledge on genomic prediction models used for GS and a view on important considerations concerning data used in these models. We compare within- and across-breeding cycle strategies for implementation of GS in cereal breeding and possibilities for using GS to select untested lines as parents. We further discuss the difference of estimating additive and non-additive genetic values and its usefulness to either select new parents, or new candidate varieties.
|
17
|
Rinta-Aho MJ, Sillanpää MJ. Stochastic search variable selection based on two mixture components and continuous-scale weighting. Biom J 2018; 61:729-746. [PMID: 30537402 DOI: 10.1002/bimj.201800118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2018] [Revised: 09/05/2018] [Accepted: 10/12/2018] [Indexed: 11/10/2022]
Abstract
Stochastic search variable selection (SSVS) is a Bayesian variable selection method that employs covariate-specific discrete indicator variables to select which covariates (e.g., molecular markers) are included in or excluded from the model. We present a new variant of SSVS where, instead of discrete indicator variables, we use continuous-scale weighting variables (which can also take values between zero and one) to select covariates into the model. The improved model performance is shown and compared to standard SSVS using simulated and real quantitative trait locus mapping datasets. Decisions about phenotype-genotype associations in our SSVS variant are based on the median of the posterior distribution or on Bayes factors. We also show here that by using continuous-scale weighting variables it is possible to improve the mixing properties of Markov chain Monte Carlo sampling substantially compared to standard SSVS. Also, the separation of association signals and nonsignals (control of noise level) seems to be more efficient compared to the standard SSVS. Thus, the novel method provides an efficient new framework for SSVS analysis that additionally provides the whole posterior distribution for pseudo-indicators, which means more information and may help in decision making.
Affiliation(s)
- Marko J Rinta-Aho
- Department of Mathematical Sciences and Biocenter Oulu, University of Oulu, Oulu, Finland
- Mikko J Sillanpää
- Department of Mathematical Sciences and Biocenter Oulu, University of Oulu, Oulu, Finland; Infotech Oulu, University of Oulu, Oulu, Finland
|
18
|
Genomic Prediction Using Multi-trait Weighted GBLUP Accounting for Heterogeneous Variances and Covariances Across the Genome. G3-GENES GENOMES GENETICS 2018; 8:3549-3558. [PMID: 30194089 PMCID: PMC6222589 DOI: 10.1534/g3.118.200673] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/24/2022]
Abstract
Implicit assumption of common (co)variance for all loci in multi-trait Genomic Best Linear Unbiased Prediction (GBLUP) results in a genomic relationship matrix (G) that is common to all traits. When this assumption is violated, Bayesian whole genome regression methods may be superior to GBLUP by accounting for unequal (co)variance for all loci or genome regions. This study aimed to develop a strategy to improve the accuracy of GBLUP for multi-trait genomic prediction, using (co)variance estimates of SNP effects from Bayesian whole genome regression methods. Five generations (G1-G5, test populations) of genotype data were available by simulations based on data of 2,200 Danish Holstein cows (G0, reference population). Two correlated traits with heritabilities of 0.1 or 0.4, and a genetic correlation of 0.45 were generated. First, SNP effects and breeding values were estimated using BayesAS method, assuming (co)variance was the same for SNPs within a genome region, and different between regions. Region size was set as one SNP, 100 SNPs, a whole chromosome or whole genome. Second, posterior (co)variances of SNP effects were used to weight SNPs in construction of G matrices. In general, region size of 100 SNPs led to highest prediction accuracies using BayesAS, and wGBLUP outperformed GBLUP at this region size. Our results suggest that when genetic architectures of traits favor Bayesian methods, the accuracy of multi-trait GBLUP can be as high as the Bayesian method if SNPs are weighted by the Bayesian posterior (co)variances.
|
19
|
The impact of genomic relatedness between populations on the genomic estimated breeding values. J Anim Sci Biotechnol 2018; 9:64. [PMID: 30147871 PMCID: PMC6094871 DOI: 10.1186/s40104-018-0279-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/06/2018] [Accepted: 07/19/2018] [Indexed: 11/10/2022] Open
Abstract
In genomic selection, prediction accuracy is strongly driven by the number of animals in the reference population (RP). Combining related populations from different countries and regions, or using a related population with a large RP, has been considered a viable strategy in cattle breeding. The genetic relationship between related populations is important for improving the genomic predictive ability. In this study, we used 122 French bulls as test individuals. The genomic estimated breeding values (GEBVs) evaluated using the French RP, American RP and Chinese RP were compared. The results showed that the GEBVs were in higher concordance when using the French RP and American RP than when using the Chinese RP. The persistence analysis, kinship analysis and the principal component analysis (PCA) were performed for 270 French bulls, 270 American bulls and 270 Chinese bulls to interpret the results. All the analyses illustrated that the genetic relationship of French bulls with American bulls was closer than that with Chinese bulls. Another reason could be that the RP in China was smaller than the other two RPs. In conclusion, using the RP of a related population to predict GEBVs of the animals in a target population is feasible when these two populations have a close genetic relationship and the related population is large.
|
20
|
Wang X, Xu Y, Hu Z, Xu C. Genomic selection methods for crop improvement: Current status and prospects. The Crop Journal 2018. [DOI: 10.1016/j.cj.2018.03.001] [Citation(s) in RCA: 138] [Impact Index Per Article: 23.0] [Indexed: 11/25/2022]
|
21
|
Kristensen PS, Jahoor A, Andersen JR, Cericola F, Orabi J, Janss LL, Jensen J. Genome-Wide Association Studies and Comparison of Models and Cross-Validation Strategies for Genomic Prediction of Quality Traits in Advanced Winter Wheat Breeding Lines. FRONTIERS IN PLANT SCIENCE 2018; 9:69. [PMID: 29456546 PMCID: PMC5801407 DOI: 10.3389/fpls.2018.00069] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 10/27/2017] [Accepted: 01/12/2018] [Indexed: 05/19/2023]
Abstract
The aim of this study was to identify SNP markers associated with five important wheat quality traits (grain protein content, Zeleny sedimentation, test weight, thousand-kernel weight, and falling number), and to investigate the predictive abilities of GBLUP and Bayesian Power Lasso models for genomic prediction of these traits. In total, 635 winter wheat lines from two breeding cycles in the Danish plant breeding company Nordic Seed A/S were phenotyped for the quality traits and genotyped for 10,802 SNPs. GWAS were performed using single marker regression and Bayesian Power Lasso models. SNPs with large effects on Zeleny sedimentation were found on chromosomes 1B, 1D, and 5D. However, GWAS failed to identify single SNPs with significant effects on the other traits, indicating that these traits were controlled by many QTL with small effects. The predictive abilities of the models for genomic prediction were studied using different cross-validation strategies. Leave-One-Out cross-validations resulted in correlations between observed phenotypes corrected for fixed effects and genomic estimated breeding values of 0.50 for grain protein content, 0.66 for thousand-kernel weight, 0.70 for falling number, 0.71 for test weight, and 0.79 for Zeleny sedimentation. Alternative cross-validations showed that the genetic relationship between lines in training and validation sets had a bigger impact on predictive abilities than the number of lines included in the training set. Using Bayesian Power Lasso instead of GBLUP models gave similar or slightly higher predictive abilities. Genomic prediction based on all SNPs was more effective than prediction based on a few associated SNPs.
Affiliation(s)
- Peter S. Kristensen
- Nordic Seed A/S, Odder, Denmark
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
- *Correspondence: Peter S. Kristensen
- Ahmed Jahoor
- Nordic Seed A/S, Odder, Denmark
- Department of Plant Breeding, The Swedish University of Agricultural Sciences, Alnarp, Sweden
- Fabio Cericola
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
- Luc L. Janss
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
- Just Jensen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
|
22
|
Li S, Wang Q, Lin X, Jin X, Liu L, Wang C, Chen Q, Liu J, Liu H. The Use of "Omics" in Lactation Research in Dairy Cows. Int J Mol Sci 2017; 18:ijms18050983. [PMID: 28475129 PMCID: PMC5454896 DOI: 10.3390/ijms18050983] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 03/20/2017] [Revised: 04/17/2017] [Accepted: 04/25/2017] [Indexed: 02/07/2023] Open
Abstract
“Omics” is the application of genomics, transcriptomics, proteomics, and metabolomics in biological research. Over the years, tremendous amounts of biological information have been gathered regarding the changes in gene, mRNA and protein expression as well as metabolites in different physiological conditions and regulations, which has greatly advanced our understanding of the regulation of many physiological and pathophysiological processes. The aim of this review is to comprehensively describe the advances in our knowledge regarding lactation, mainly in dairy cows, that were obtained from the “omics” studies. The “omics” technologies have continuously been preferred as the technical tools in lactation research aiming to develop new nutritional, genetic, and management strategies to improve milk production and milk quality in dairy cows.
Affiliation(s)
- Shanshan Li
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Quanjuan Wang
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Xiujuan Lin
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Xiaolu Jin
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Lan Liu
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Caihong Wang
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Qiong Chen
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Jianxin Liu
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
- Hongyun Liu
- Institute of Dairy Science, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
|
23
|
Guo X, Christensen OF, Ostersen T, Wang Y, Lund MS, Su G. Genomic prediction using models with dominance and imprinting effects for backfat thickness and average daily gain in Danish Duroc pigs. Genet Sel Evol 2016; 48:67. [PMID: 27623617 PMCID: PMC5022243 DOI: 10.1186/s12711-016-0245-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/08/2015] [Accepted: 09/02/2016] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Dominance and imprinting genetic effects have been shown to contribute to genetic variance for certain traits but are usually ignored in genomic prediction of complex traits in livestock. The objectives of this study were to estimate variances of additive, dominance and imprinting genetic effects and to evaluate predictions of genetic merit based on genomic data for average daily gain (DG) and backfat thickness (BF) in Danish Duroc pigs. METHODS Corrected phenotypes of 8113 genotyped pigs from breeding and multiplier herds were used. Four Bayesian mixture models that differed in the type of genetic effects included: (A) additive genetic effects, (AD) additive and dominance genetic effects, (AI) additive and imprinting genetic effects, and (ADI) additive, dominance and imprinting genetic effects were compared using Bayes factors. The ability of the models to predict genetic merit was compared with regard to prediction reliability and bias. RESULTS Based on model ADI, narrow-sense heritabilities of 0.18 and 0.31 were estimated for DG and BF, respectively. Dominance and imprinting genetic effects accounted for 4.0 to 4.6 and 1.3 to 1.4 % of phenotypic variance, respectively, which were statistically significant. Across the four models, reliabilities of the predicted total genetic values (GTV, sum of all genetic effects) ranged from 16.1 (AI) to 18.4 % (AD) for DG and from 30.1 (AI) to 31.4 % (ADI) for BF. The least biased predictions of GTV were obtained with model AD, with regression coefficients of corrected phenotypes on GTV equal to 0.824 (DG) and 0.738 (BF). Reliabilities of genomic estimated breeding values (GBV, additive genetic effects) did not differ significantly among models for DG (between 16.5 and 16.7 %); however, for BF, model AD provided a significantly higher reliability (31.3 %) than model A (30.7 %). The least biased predictions of GBV were obtained with model AD with regression coefficients of 0.872 for DG and 0.764 for BF. 
CONCLUSIONS Dominance and genomic imprinting effects contribute significantly to the genetic variation of BF and DG in Danish Duroc pigs. Genomic prediction models that include dominance genetic effects can improve accuracy and reduce bias of genomic predictions of genetic merit.
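The bias measure quoted in this abstract, the regression coefficient of corrected phenotypes on predicted genetic values, can be sketched as below. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the squared correlation is only a raw stand-in because the abstract does not give the exact reliability formula used.

```python
import numpy as np

def bias_slope(corrected_phenotype, predicted_value):
    """Slope of the regression of corrected phenotype on prediction.

    A slope of 1.0 means predictions are on the same scale as the
    phenotypes; values below 1.0 indicate inflated predictions.
    """
    y = np.asarray(corrected_phenotype, dtype=float)
    g = np.asarray(predicted_value, dtype=float)
    g_centered = g - g.mean()
    return float(np.dot(g_centered, y - y.mean()) / np.dot(g_centered, g_centered))

def squared_correlation(corrected_phenotype, predicted_value):
    """Raw squared correlation (assumption: the paper may scale this further)."""
    r = np.corrcoef(corrected_phenotype, predicted_value)[0, 1]
    return float(r ** 2)
```

With this convention, the reported coefficients of 0.824 and 0.738 would correspond to predictions that vary more than the corrected phenotypes they are meant to predict.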
Affiliation(s)
- Xiangyu Guo
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Ole Fredslund Christensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Tage Ostersen
- Danish Pig Research Centre, SEGES P/S, 1609 Copenhagen, Denmark
| | - Yachun Wang
- College of Animal Science and Technology, China Agricultural University, Beijing, 100193 People’s Republic of China
| | - Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
| | - Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
|
24
|
Fangmann A, Bergfelder-Drüing S, Tholen E, Simianer H, Erbe M. Can multi-subpopulation reference sets improve the genomic predictive ability for pigs? J Anim Sci 2016; 93:5618-30. [PMID: 26641171 DOI: 10.2527/jas.2015-9508] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
In most countries and for most livestock species, genomic evaluations are obtained from within-breed analyses. To achieve reliable breeding values, however, a sufficient reference sample size is essential. To increase this size, the use of multibreed reference populations for small populations is considered a suitable option in other species. Over decades, the separate breeding work of different pig breeding organizations in Germany has led to stratified subpopulations in the breed German Large White. Due to this fact and the limited number of Large White animals available in each organization, there was a pressing need for ascertaining if multi-subpopulation genomic prediction is superior compared with within-subpopulation prediction in pigs. Direct genomic breeding values were estimated with genomic BLUP for the trait "number of piglets born alive" using genotype data (Illumina Porcine 60K SNP BeadChip) from 2,053 German Large White animals from five different commercial pig breeding companies. To assess the prediction accuracy of within- and multi-subpopulation reference sets, a random 5-fold cross-validation with 20 replications was performed. The five subpopulations considered were only slightly differentiated from each other. However, the prediction accuracy of the multi-subpopulations approach was not better than that of the within-subpopulation evaluation, for which the predictive ability was already high. Reference sets composed of closely related multi-subpopulation sets performed better than sets of distantly related subpopulations but not better than the within-subpopulation approach. Despite the low differentiation of the five subpopulations, the genetic connectedness between these different subpopulations seems to be too small to improve the prediction accuracy by applying multi-subpopulation reference sets. Consequently, resources should be used for enlarging the reference population within subpopulation, for example, by adding genotyped females.
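The validation design described here (random 5-fold cross-validation repeated 20 times) can be sketched as an index generator. This is a hedged illustration only: the function name and the seed are assumptions, and the model fitted within each fold is left to the caller.

```python
import numpy as np

def repeated_kfold_indices(n_animals, n_folds=5, n_reps=20, seed=0):
    """Yield (replicate, fold, train_idx, test_idx) for repeated random k-fold CV."""
    rng = np.random.default_rng(seed)
    for rep in range(n_reps):
        # A fresh random permutation per replicate gives a new fold assignment.
        folds = np.array_split(rng.permutation(n_animals), n_folds)
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            yield rep, k, train_idx, test_idx
```

For the 2,053 animals mentioned in the abstract, this yields 100 train/test splits, and accuracy would typically be averaged over them.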
|
25
|
Calus MPL, Bouwman AC, Schrooten C, Veerkamp RF. Efficient genomic prediction based on whole-genome sequence data using split-and-merge Bayesian variable selection. Genet Sel Evol 2016; 48:49. [PMID: 27357580 PMCID: PMC4926307 DOI: 10.1186/s12711-016-0225-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 02/17/2016] [Accepted: 06/16/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND Use of whole-genome sequence data is expected to increase persistency of genomic prediction across generations and breeds but affects model performance and requires increased computing time. In this study, we investigated whether the split-and-merge Bayesian stochastic search variable selection (BSSVS) model could overcome these issues. BSSVS is performed first on subsets of sequence-based variants and then on a merged dataset containing variants selected in the first step. RESULTS We used a dataset that included 4,154,064 variants after editing and de-regressed proofs for 3415 reference and 2138 validation bulls for somatic cell score, protein yield and interval first to last insemination. In the first step, BSSVS was performed on 106 subsets each containing ~39,189 variants. In the second step, 1060 up to 472,492 variants, selected from the first step, were included to estimate the accuracy of genomic prediction. Accuracies were at best equal to those achieved with the commonly used Bovine 50k-SNP chip, although the number of variants within a few well-known quantitative trait loci regions was considerably enriched. When variant selection and the final genomic prediction were performed on the same data, predictions were biased. Predictions computed as the average of the predictions computed for each subset achieved the highest accuracies, i.e. 0.5 to 1.1 % higher than the accuracies obtained with the 50k-SNP chip, and yielded the least biased predictions. Finally, the accuracy of genomic predictions obtained when all sequence-based variants were included was similar or up to 1.4 % lower compared to that based on the average predictions across the subsets. By applying parallelization, the split-and-merge procedure was completed in 5 days, while the standard analysis including all sequence-based variants took more than three months. 
CONCLUSIONS The split-and-merge approach splits one large computational task into many much smaller ones, which allows the use of parallel processing and thus efficient genomic prediction based on whole-genome sequence data. The split-and-merge approach did not improve prediction accuracy, probably because we used data on a single breed for which relationships between individuals were high. Nevertheless, the split-and-merge approach may have potential for applications on data from multiple breeds.
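The "average of the predictions computed for each subset" idea can be sketched as follows. Note the substitutions: ridge regression stands in for the BSSVS model, contiguous marker subsets are an assumption (the abstract does not say how subsets were formed), and all function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression marker effects (a stand-in for BSSVS, not the paper's model)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def split_and_average_predict(X_ref, y_ref, X_val, n_subsets, lam=1.0):
    """Fit one model per marker subset and average the validation predictions."""
    subsets = np.array_split(np.arange(X_ref.shape[1]), n_subsets)
    y_mean = y_ref.mean()
    predictions = []
    for cols in subsets:
        beta = ridge_fit(X_ref[:, cols], y_ref - y_mean, lam)
        predictions.append(y_mean + X_val[:, cols] @ beta)
    # Each subset is independent, so this loop could run in parallel.
    return np.mean(predictions, axis=0)
```

Because the per-subset fits do not depend on each other, they are the part that parallelization would speed up.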
Affiliation(s)
- Mario P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 338, 6700 AH, Wageningen, The Netherlands.
| | - Aniek C Bouwman
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 338, 6700 AH, Wageningen, The Netherlands
| | | | - Roel F Veerkamp
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 338, 6700 AH, Wageningen, The Netherlands
|
26
|
Do DN, Janss LLG, Jensen J, Kadarmideen HN. SNP annotation-based whole genomic prediction and selection: an application to feed efficiency and its component traits in pigs. J Anim Sci 2016; 93:2056-63. [PMID: 26020301 DOI: 10.2527/jas.2014-8640] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 12/23/2022]
Abstract
The study investigated genetic architecture and predictive ability using genomic annotation of residual feed intake (RFI) and its component traits (daily feed intake [DFI], ADG, and back fat [BF]). A total of 1,272 Duroc pigs had both genotypic and phenotypic records, and the records were split into a training (968 pigs) and a validation dataset (304 pigs) by assigning records as before and after January 1, 2012, respectively. SNP were annotated into 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO, Bayesian A, B, and Cπ, and genomic BLUP (GBLUP) methods. Predictive accuracy ranged from 0.508 to 0.531, 0.506 to 0.532, 0.276 to 0.357, and 0.308 to 0.362 for DFI, RFI, ADG, and BF, respectively. BayesCπ100.1 increased accuracy slightly compared to the GBLUP model and other methods. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy for the traits considered here, but this may differ for other traits. It is the first study to provide useful insights into biological classes of SNP driving the whole genomic prediction for complex traits in pigs.
|
27
|
Gao H, Madsen P, Nielsen US, Aamand GP, Su G, Byskov K, Jensen J. Including different groups of genotyped females for genomic prediction in a Nordic Jersey population. J Dairy Sci 2015; 98:9051-9. [PMID: 26433419 DOI: 10.3168/jds.2015-9947] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/11/2015] [Accepted: 08/17/2015] [Indexed: 12/24/2022]
Abstract
Including genotyped females in a reference population (RP) is an obvious way to increase the RP in genomic selection, especially for dairy breeds of limited population size. However, the incorporation of these females must be conducted cautiously because of the potential preferential treatment of the genotyped cows and lower reliabilities of phenotypes compared with the proven pseudo-phenotypes of bulls. Breeding organizations in Denmark, Finland, and Sweden have implemented a female-genotyping project with the possibility of genotyping entire herds using the low-density (LD) chip. In the present study, 5 scenarios for building an RP were investigated in the Nordic Jersey population: (1) bulls only, (2) bulls with females from the LD project, (3) bulls with females from the LD project plus non-LD project females genotyped before their first calving, (4) bulls with females from the LD project plus non-LD project females genotyped after their first calving, and (5) bulls with all genotyped females. The genomically enhanced breeding value (GEBV) was predicted for 8 traits in the Nordic total merit index through a genomic BLUP model using deregressed proof (DRP) as the response variable in all scenarios. In addition, (daughter) yield deviation and raw phenotypic data were studied as response variables for comparison with the DRP, using stature as a model trait. The validation population was formed using a cut-off birth year of 2005 based on the genotyped Nordic Jersey bulls with DRP. The average increment in reliability of the GEBV across the 8 traits investigated was 1.9 to 4.5 percentage points compared with using only bulls in the RP (scenario 1). The addition of all the genotyped females to the RP resulted in the highest gain in reliability (scenario 5), followed by scenario 3, scenario 2, and scenario 4. All scenarios led to inflated GEBV because the regression coefficients are less than 1. 
However, scenario 2 and scenario 3 led to less bias of genomic predictions than scenario 5, with regression coefficients showing less deviation from scenario 1. For the study on stature, the (daughter) yield deviation performed slightly better than the DRP as the response variable in the genomic BLUP (GBLUP) model. Therefore, adding unselected females in the RP could significantly improve the reliabilities and tended to reduce the prediction bias compared with adding selectively genotyped females. Although the DRP has performed robustly so far, the use of raw data is recommended with a single-step model as an optimal solution for future genomic evaluations.
Affiliation(s)
- H Gao
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark.
| | - P Madsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | | | - G P Aamand
- Nordic Cattle Genetic Evaluation, DK-8200 Aarhus N, Denmark
| | - G Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
| | - K Byskov
- Seges, DK-8200 Aarhus N, Denmark
| | - J Jensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
|
28
|
Cuyabano B, Su G, Rosa G, Lund M, Gianola D. Bootstrap study of genome-enabled prediction reliabilities using haplotype blocks across Nordic Red cattle breeds. J Dairy Sci 2015; 98:7351-63. [DOI: 10.3168/jds.2015-9360] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/20/2015] [Accepted: 06/16/2015] [Indexed: 12/30/2022]
|
29
|
An Efficient Genome-Wide Multilocus Epistasis Search. Genetics 2015; 201:865-70. [PMID: 26405029 DOI: 10.1534/genetics.115.182444] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 12/08/2014] [Accepted: 09/15/2015] [Indexed: 01/04/2023]
Abstract
There has been a continuing interest in approaches that analyze pairwise locus-by-locus (epistasis) interactions using multilocus association models in genome-wide data sets. In this paper, we suggest an approach that uses sure independence screening to first lower the dimension of the problem by considering the marginal importance of each interaction term within the huge loop. Subsequent multilocus association steps are executed using an extended Bayesian least absolute shrinkage and selection operator (LASSO) model and fast generalized expectation-maximization estimation algorithms. The potential of this approach is illustrated and compared with PLINK software using data examples where phenotypes have been simulated conditionally on marker data from the Quantitative Trait Loci Mapping and Marker Assisted Selection (QTLMAS) Workshop 2008 and real pig data sets.
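The screening step mentioned in this abstract ranks each pairwise interaction term by its marginal association with the phenotype and keeps only the strongest terms for the later multilocus model. A minimal sketch follows, assuming absolute marginal correlation as the ranking score and a product of centered genotypes as the interaction term; both choices and the function name are illustrative, not taken from the paper.

```python
import numpy as np
from itertools import combinations

def screen_pairwise_interactions(X, y, n_keep):
    """Return the n_keep marker pairs whose interaction term correlates most with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    y_norm = np.linalg.norm(yc)
    scores = []
    for i, j in combinations(range(X.shape[1]), 2):
        term = Xc[:, i] * Xc[:, j]
        term = term - term.mean()
        denom = np.linalg.norm(term) * y_norm
        # Absolute marginal correlation; 0.0 if the term or y is constant.
        score = abs(float(term @ yc)) / denom if denom > 0 else 0.0
        scores.append((score, i, j))
    scores.sort(reverse=True)
    return [(i, j) for _, i, j in scores[:n_keep]]
```

The loop visits every pair once, so its cost grows quadratically with the number of markers, which is why the screening score has to be cheap.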
|
30
|
Cuyabano BC, Su G, Lund MS. Selection of haplotype variables from a high-density marker map for genomic prediction. Genet Sel Evol 2015; 47:61. [PMID: 26232271 PMCID: PMC4522081 DOI: 10.1186/s12711-015-0143-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 06/18/2014] [Accepted: 07/23/2015] [Indexed: 01/04/2023]
Abstract
BACKGROUND Using haplotype blocks as predictors rather than individual single nucleotide polymorphisms (SNPs) may improve genomic predictions, since haplotypes are in stronger linkage disequilibrium with the quantitative trait loci than are individual SNPs. It has also been hypothesized that an appropriate selection of a subset of haplotype blocks can result in similar or better predictive ability than when using the whole set of haplotype blocks. This study investigated genomic prediction using a set of haplotype blocks that contained the SNPs with large effects estimated from an individual SNP prediction model. We analyzed protein yield, fertility and mastitis of Nordic Holstein cattle, and used high-density markers (about 770k SNPs). To reach an optimum number of haplotype variables for genomic prediction, predictions were performed using subsets of haplotype blocks that contained a range of 1000 to 50 000 main SNPs. RESULTS The use of haplotype blocks improved the prediction reliabilities, even when selection focused on only a group of haplotype blocks. In this case, the use of haplotype blocks that contained the 20 000 to 50 000 SNPs with the highest effect was sufficient to outperform the model that used all individual SNPs as predictors (up to 1.3 % improvement in prediction reliability for mastitis, compared to individual SNP approach), and the achieved reliabilities were similar to those using all haplotype blocks available in the genome data (from 0.6 % lower to 0.8 % higher reliability). CONCLUSIONS Haplotype blocks used as predictors can improve the reliability of genomic prediction compared to the individual SNP model. Furthermore, the use of a subset of haplotype blocks that contains the main SNP effects from genomic data could be a feasible approach to genomic prediction in dairy cattle, given an increase in density of genotype data available. 
The predictive ability of the models that use a subset of haplotype blocks was similar to that obtained using either all haplotype blocks or all individual SNPs, with the benefit of having a much lower computational demand.
Affiliation(s)
- Beatriz Cd Cuyabano
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark.
| | - Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark.
| | - Mogens S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark.
|
31
|
Wu X, Lund MS, Sun D, Zhang Q, Su G. Impact of relationships between test and training animals and among training animals on reliability of genomic prediction. J Anim Breed Genet 2015; 132:366-75. [PMID: 26010512 DOI: 10.1111/jbg.12165] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 09/30/2014] [Accepted: 04/18/2015] [Indexed: 12/29/2022]
Abstract
One of the factors affecting the reliability of genomic prediction is the relationship among the animals of interest. This study investigated the reliability of genomic prediction in various scenarios with regard to the relationship between test and training animals, and among animals within the training data set. Different training data sets were generated from EuroGenomics data and a group of Nordic Holstein bulls (born in 2005 and afterwards) as a common test data set. Genomic breeding values were predicted using a genomic best linear unbiased prediction model and a Bayesian mixture model. The results showed that a closer relationship between test and training animals led to a higher reliability of genomic predictions for the test animals, while a closer relationship among training animals resulted in a lower reliability. In addition, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction, especially for the scenario of distant relationships between training and test animals. Therefore, to prevent a decrease in reliability, constant updates of the training population with animals from more recent generations are required. Moreover, a training population consisting of less-related animals is favourable for reliability of genomic prediction.
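One way to quantify "relationship between test and training animals" is through a genomic relationship matrix. The sketch below uses the widely used VanRaden form, G = ZZ' / (2 Σ p(1 − p)), and summarizes each test animal by its closest training relative. The abstract does not state how relationships were measured, so both the matrix and the summary statistic are assumptions, and the function names are hypothetical.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (animals x markers)."""
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0            # allele frequency per marker
    Z = M - 2.0 * p                     # center by twice the allele frequency
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def mean_max_relationship(G, test_idx, train_idx):
    """Average, over test animals, of the largest relationship to any training animal."""
    block = G[np.ix_(test_idx, train_idx)]
    return float(block.max(axis=1).mean())
```

A larger value means the test animals have closer relatives in the training set, which is the situation the abstract associates with higher reliability.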
Affiliation(s)
- X Wu
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark; Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - M S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
| | - D Sun
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Q Zhang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - G Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
|
32
|
Berger S, Pérez-Rodríguez P, Veturi Y, Simianer H, de los Campos G. Effectiveness of shrinkage and variable selection methods for the prediction of complex human traits using data from distantly related individuals. Ann Hum Genet 2015; 79:122-35. [PMID: 25600682 PMCID: PMC4428155 DOI: 10.1111/ahg.12099] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/30/2014] [Accepted: 12/03/2014] [Indexed: 02/02/2023]
Abstract
Genome‐wide association studies (GWAS) have detected large numbers of variants associated with complex human traits and diseases. However, the proportion of variance explained by GWAS‐significant single nucleotide polymorphisms has usually been small. This brought interest in the use of whole‐genome regression (WGR) methods. However, there has been limited research on the factors that affect prediction accuracy (PA) of WGRs when applied to human data of distantly related individuals. Here, we examine, using real human genotypes and simulated phenotypes, how trait complexity, marker‐quantitative trait loci (QTL) linkage disequilibrium (LD), and the model used affect the performance of WGRs. Our results indicated that the estimated rate of missing heritability is dependent on the extent of marker‐QTL LD. However, this parameter was not greatly affected by trait complexity. Regarding PA, our results indicated that: (a) under perfect marker‐QTL LD, WGR can achieve moderately high prediction accuracy, and with simple genetic architectures variable selection methods outperform shrinkage procedures; and (b) under imperfect marker‐QTL LD, variable selection methods can achieve reasonably good PA with simple or moderately complex genetic architectures; however, the PA of these methods deteriorated as trait complexity increased, and with highly complex traits variable selection and shrinkage methods both performed poorly. This was confirmed with an analysis of human height.
Affiliation(s)
- Swetlana Berger
- Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University Goettingen, Albrecht-Thaer-Weg 3, Goettingen, Germany
|
33
|
Cuyabano BCD, Su G, Lund MS. Genomic prediction of genetic merit using LD-based haplotypes in the Nordic Holstein population. BMC Genomics 2014; 15:1171. [PMID: 25539631 PMCID: PMC4367958 DOI: 10.1186/1471-2164-15-1171] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 04/11/2013] [Accepted: 12/12/2014] [Indexed: 11/17/2022]
Abstract
Background A haplotype approach to genomic prediction using high density data in dairy cattle as an alternative to single-marker methods is presented. With the assumption that haplotypes are in stronger linkage disequilibrium (LD) with quantitative trait loci (QTL) than single markers, this study focuses on the use of haplotype blocks (haploblocks) as explanatory variables for genomic prediction. Haploblocks were built based on the LD between markers, which allowed variable reduction. The haploblocks were then used to predict three economically important traits (milk protein, fertility and mastitis) in the Nordic Holstein population. Results The haploblock approach improved prediction accuracy compared with the commonly used individual single nucleotide polymorphism (SNP) approach. Furthermore, using an average LD threshold to define the haploblocks (LD≥0.45 between any two markers) increased the prediction accuracies for all three traits, although the improvement was most significant for milk protein (up to 3.1 % improvement in prediction accuracy, compared with the individual SNP approach). Hotelling’s t-tests were performed, confirming the improvement in prediction accuracy for milk protein. Because the phenotypic values were in the form of de-regressed proofs, the improved accuracy for milk protein may be due to higher reliability of the data for this trait compared with the reliability of the mastitis and fertility data. Comparisons between best linear unbiased prediction (BLUP) and Bayesian mixture models also indicated that the Bayesian model produced the most accurate predictions in every scenario for the milk protein trait, and in some scenarios for fertility. Conclusions The haploblock approach to genomic prediction is a promising method for genomic selection in animal breeding. Building haploblocks based on LD reduced the number of variables without the loss of information. 
This method may play an important role in future genomic prediction involving whole genome sequences.
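Building haploblocks from pairwise LD with a threshold (LD ≥ 0.45 between any two markers in a block) can be sketched with a simple greedy pass over ordered markers. Several choices here are assumptions not stated in the abstract: LD is measured as the squared genotype correlation, blocks are contiguous, markers are polymorphic, and the function name is hypothetical.

```python
import numpy as np

def build_ld_blocks(genotypes, threshold=0.45):
    """Greedily group adjacent markers so every pair in a block has r^2 >= threshold.

    genotypes: animals x markers array of 0/1/2 codes, markers in map order.
    Assumes every marker is polymorphic (a constant column gives undefined r^2).
    """
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    blocks, current = [], [0]
    for marker in range(1, genotypes.shape[1]):
        # Extend the block only if the new marker is in LD with ALL current members.
        if np.all(r2[marker, current] >= threshold):
            current.append(marker)
        else:
            blocks.append(current)
            current = [marker]
    blocks.append(current)
    return blocks
```

Each resulting block would then be recoded into haplotype variables, which is where the reduction in the number of predictors comes from.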
Affiliation(s)
| | - Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Denmark.
|
34
|
Liu T, Qu H, Luo C, Li X, Shu D, Lund MS, Su G. Genomic selection for the improvement of antibody response to Newcastle disease and avian influenza virus in chickens. PLoS One 2014; 9:e112685. [PMID: 25401767 PMCID: PMC4234505 DOI: 10.1371/journal.pone.0112685] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/02/2014] [Accepted: 10/10/2014] [Indexed: 12/18/2022]
Abstract
Newcastle disease (ND) and avian influenza (AI) are the most feared diseases in the poultry industry worldwide. They can cause flock mortality up to 100%, resulting in a catastrophic economic loss. This is the first study to investigate the feasibility of genomic selection for antibody response to Newcastle disease virus (Ab-NDV) and antibody response to Avian Influenza virus (Ab-AIV) in chickens. The data were collected from a crossbred population. Breeding values for Ab-NDV and Ab-AIV were estimated using a pedigree-based best linear unbiased prediction model (BLUP) and a genomic best linear unbiased prediction model (GBLUP). Single-trait and multiple-trait analyses were implemented. According to the analysis using the pedigree-based model, the heritability for Ab-NDV estimated from the single-trait and multiple-trait models was 0.478 and 0.487, respectively. The heritability for Ab-AIV estimated from the two models was 0.301 and 0.291, respectively. The estimated genetic correlation between the two traits was 0.438. A four-fold cross-validation was used to assess the accuracy of the estimated breeding values (EBV) in the two validation scenarios. In the family sample scenario each half-sib family is randomly allocated to one of four subsets and in the random sample scenario the individuals are randomly divided into four subsets. In the family sample scenario, compared with the pedigree-based model, the accuracy of the genomic prediction increased from 0.086 to 0.237 for Ab-NDV and from 0.080 to 0.347 for Ab-AIV. In the random sample scenario, the accuracy was improved from 0.389 to 0.427 for Ab-NDV and from 0.281 to 0.367 for Ab-AIV. The multiple-trait GBLUP model led to a slightly higher accuracy of genomic prediction for both traits. These results indicate that genomic selection for antibody response to ND and AI in chickens is promising.
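The two four-fold validation scenarios differ only in what gets randomized: whole half-sib families or individual animals. A minimal sketch of both fold assignments is given below; the function names and seeds are hypothetical, and the abstract does not say whether folds were balanced in size.

```python
import numpy as np

def family_sample_folds(family_ids, n_folds=4, seed=0):
    """Assign each half-sib family, as a whole, to one of n_folds subsets."""
    rng = np.random.default_rng(seed)
    families, inverse = np.unique(np.asarray(family_ids), return_inverse=True)
    family_fold = rng.permutation(len(families)) % n_folds
    return family_fold[inverse]          # fold label per animal

def random_sample_folds(n_animals, n_folds=4, seed=0):
    """Assign individual animals at random to n_folds subsets of near-equal size."""
    rng = np.random.default_rng(seed)
    folds = np.empty(n_animals, dtype=int)
    folds[rng.permutation(n_animals)] = np.arange(n_animals) % n_folds
    return folds
```

In the family scenario no test animal has a half-sib in the reference data, which is consistent with the lower accuracies reported for that scenario.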
Affiliation(s)
- Tianfei Liu
- College of Animal Science and Technology, Sichuan Agricultural University, Yaan, China
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
- State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China
| | - Hao Qu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China
| | - Chenglong Luo
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China
| | - Xuewei Li
- College of Animal Science and Technology, Sichuan Agricultural University, Yaan, China
- * E-mail: (XL); (DS); (GS)
| | - Dingming Shu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- State Key Laboratory of Livestock and Poultry Breeding, Guangzhou, China
- * E-mail: (XL); (DS); (GS)
| | - Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
| | - Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
- * E-mail: (XL); (DS); (GS)
|
35
|
Liu T, Qu H, Luo C, Shu D, Wang J, Lund MS, Su G. Accuracy of genomic prediction for growth and carcass traits in Chinese triple-yellow chickens. BMC Genet 2014; 15:110. [PMID: 25316160 PMCID: PMC4201679 DOI: 10.1186/s12863-014-0110-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/30/2014] [Accepted: 10/01/2014] [Indexed: 11/10/2022]
Abstract
Background: Growth and carcass traits are very important traits for broiler chickens. However, carcass traits can only be measured postmortem. Genomic selection may be a powerful tool for such traits because it can accurately predict breeding values of animals that have no phenotypic records of their own. This study investigated the efficiency of genomic prediction in Chinese triple-yellow chickens. As a new line, Chinese triple-yellow chicken was developed by cross-breeding and had a small effective population. Two growth traits and three carcass traits were analyzed: body weight at 6 weeks, body weight at 12 weeks, eviscerating percentage, breast muscle percentage and leg muscle percentage.
Results: Genomic prediction was assessed using a 4-fold cross-validation procedure for two validation scenarios. In the first scenario, each test data set comprised two half-sib families (family sample) and the rest represented the reference data. In the second scenario, the whole data were randomly divided into four subsets (random sample). In each fold of validation, one subset was used as the test data and the others as the reference data. Genomic breeding values were predicted using a genomic best linear unbiased prediction model, a Bayesian least absolute shrinkage and selection operator model, and a Bayesian mixture model with four distributions. The accuracy of genomic estimated breeding value (GEBV) was measured as the correlation between GEBV and the corrected phenotypic value. Using the three models, the correlations ranged from 0.448 to 0.468 for the two growth traits and from 0.176 to 0.255 for the three carcass traits in the family sample scenario, and were between 0.487 and 0.536 for growth traits and between 0.312 and 0.430 for carcass traits in the random sample scenario. The differences in the prediction accuracies between the three models were very small; the Bayesian mixture model was slightly more accurate. According to the results from the random sample scenario, the accuracy of GEBV was 0.197 higher than that of the conventional pedigree index, averaged over the five traits.
Conclusions: The results indicated that genomic selection could greatly improve the accuracy of selection in chickens, compared with conventional selection. Genomic selection for growth and carcass traits in broiler chickens is promising.
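The random-sample validation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses a random 4-fold split, ridge regression on SNP genotypes as a stand-in for GBLUP, and a fixed, hypothetical shrinkage value `ridge` rather than estimated variance components. The function name `kfold_accuracy` and its arguments are illustrative only.

```python
import numpy as np

def kfold_accuracy(genotypes, phenotypes, n_folds=4, ridge=1.0, seed=0):
    """Correlation between predicted and observed values in each test fold.

    genotypes:  (animals x SNPs) array of allele counts
    phenotypes: vector of (corrected) phenotypic values
    ridge:      assumed shrinkage parameter, not estimated from the data
    """
    rng = np.random.default_rng(seed)
    n = len(phenotypes)
    # Randomly divide the animals into n_folds subsets.
    folds = np.array_split(rng.permutation(n), n_folds)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        x_train = genotypes[train_mask]
        y_train = phenotypes[train_mask]
        mean_x = x_train.mean(axis=0)
        mean_y = y_train.mean()
        xc = x_train - mean_x
        # Ridge regression solved in its dual form: (XX' + ridge * I) a = y - mean.
        kernel = xc @ xc.T + ridge * np.eye(xc.shape[0])
        alpha = np.linalg.solve(kernel, y_train - mean_y)
        snp_effects = xc.T @ alpha
        # Predict the held-out animals from the reference (training) data only.
        predicted = mean_y + (genotypes[test_idx] - mean_x) @ snp_effects
        accuracies.append(np.corrcoef(predicted, phenotypes[test_idx])[0, 1])
    return np.array(accuracies)
```

For the family-sample scenario, the random permutation would be replaced by a split that keeps whole half-sib families together in one fold, which requires family labels that this sketch does not model.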
Affiliation(s)
- Dingming Shu
- Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China.
|
36
|
Su G, Christensen O, Janss L, Lund M. Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances. J Dairy Sci 2014; 97:6547-59. [DOI: 10.3168/jds.2014-8210] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 04/06/2014] [Accepted: 07/07/2014] [Indexed: 12/24/2022]
|
37
|
Strategies for imputation to whole genome sequence using a single or multi-breed reference population in cattle. BMC Genomics 2014; 15:728. [PMID: 25164068 PMCID: PMC4152568 DOI: 10.1186/1471-2164-15-728] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 03/03/2014] [Accepted: 06/18/2014] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND The advent of low cost next generation sequencing has made it possible to sequence a large number of dairy and beef bulls which can be used as a reference for imputation of whole genome sequence data. The aim of this study was to investigate the accuracy and speed of imputation from a high density SNP marker panel to whole genome sequence level. Data contained 132 Holstein, 42 Jersey, 52 Nordic Red and 16 Brown Swiss bulls with whole genome sequence data; 16 Holstein, 27 Jersey and 29 Nordic Reds had previously been typed with the bovine high density SNP panel and were used for validation. We investigated the effect of enlarging the reference population by combining data across breeds on the accuracy of imputation, and the accuracy and speed of both IMPUTE2 and BEAGLE using either genotype probability reference data or pre-phased reference data. All analyses were done on Bovine autosome 29 using 387,436 bi-allelic variants and 13,612 SNP markers from the bovine HD panel. RESULTS A combined breed reference population led to higher imputation accuracies than did a single breed reference. The highest accuracy of imputation for all three test breeds was achieved when using BEAGLE with un-phased reference data (mean genotype correlations of 0.90, 0.89 and 0.87 for Holstein, Jersey and Nordic Red respectively) but IMPUTE2 with un-phased reference data gave similar accuracies for Holsteins and Nordic Red. Pre-phasing the reference data only led to a minor decrease in the imputation accuracy, but gave a large improvement in computation time. Pre-phasing with BEAGLE was substantially faster than pre-phasing with SHAPEIT2 (2.5 hours vs. 52 hours for 242 individuals), and imputation with pre-phased data was faster in IMPUTE2 than in BEAGLE (5 minutes vs. 50 minutes per individual). 
CONCLUSION Combining reference populations across breeds is a good option to increase the size of the reference data and in turn the accuracy of imputation when only few animals are available. Pre-phasing the reference data only slightly decreases the accuracy but gives substantial improvements in speed. Using BEAGLE for pre-phasing and IMPUTE2 for imputation is a fast and accurate strategy.
|
38
|
|
39
|
Su G, Guldbrandtsen B, Aamand GP, Strandén I, Lund MS. Genomic relationships based on X chromosome markers and accuracy of genomic predictions with and without X chromosome markers. Genet Sel Evol 2014; 46:47. [PMID: 25080199 PMCID: PMC4137273 DOI: 10.1186/1297-9686-46-47] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/27/2013] [Accepted: 06/18/2014] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND Although the X chromosome is the second largest bovine chromosome, markers on the X chromosome are not used for genomic prediction in some countries and populations. In this study, we presented a method for computing genomic relationships using X chromosome markers, investigated the accuracy of imputation from a low density (7K) to the 54K SNP (single nucleotide polymorphism) panel, and compared the accuracy of genomic prediction with and without using X chromosome markers. METHODS The impact of considering X chromosome markers on prediction accuracy was assessed using data from Nordic Holstein bulls and different sets of SNPs: (a) the 54K SNPs for reference and test animals, (b) SNPs imputed from the 7K to the 54K SNP panel for test animals, (c) SNPs imputed from the 7K to the 54K panel for half of the reference animals, and (d) the 7K SNP panel for all animals. Beagle and Findhap were used for imputation. GBLUP (genomic best linear unbiased prediction) models with or without X chromosome markers and with or without a residual polygenic effect were used to predict genomic breeding values for 15 traits. RESULTS Averaged over the two imputation datasets, correlation coefficients between imputed and true genotypes for autosomal markers, pseudo-autosomal markers, and X-specific markers were 0.971, 0.831 and 0.935 when using Findhap, and 0.983, 0.856 and 0.937 when using Beagle. Estimated reliabilities of genomic predictions based on the imputed datasets using Findhap or Beagle were very close to those using the real 54K data. Genomic prediction using all markers gave slightly higher reliabilities than predictions without X chromosome markers. Based on our data which included only bulls, using a G matrix that accounted for sex-linked relationships did not improve prediction, compared with a G matrix that did not account for sex-linked relationships. 
A model that included a polygenic effect did not recover the loss of prediction accuracy from exclusion of X chromosome markers. CONCLUSIONS The results from this study suggest that markers on the X chromosome contribute to accuracy of genomic predictions and should be used for routine genomic evaluation.
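For readers unfamiliar with the G matrix that GBLUP models rely on, the sketch below builds a genomic relationship matrix for autosomal markers using the widely used construction G = ZZ' / (2 Σ p(1 − p)), commonly attributed to VanRaden. It is an assumption-laden illustration only: it does not reproduce the X-chromosome method presented in this study, allele frequencies are taken from the sample itself, and the function name `genomic_relationship_matrix` is hypothetical.

```python
import numpy as np

def genomic_relationship_matrix(genotypes):
    """Autosomal genomic relationship matrix from allele counts coded 0, 1, 2.

    genotypes: (animals x SNPs) array. Allele frequencies are estimated
    from the same sample, which is an assumption of this sketch.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0              # allele frequency per SNP
    z = m - 2.0 * p                       # centre each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))   # expected heterozygosity summed over SNPs
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    return z @ z.T / scale
```

Because the columns are centred with sample frequencies, each row of the resulting matrix sums to zero, which is a quick sanity check on the construction.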
Affiliation(s)
- Guosheng Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele DK-8830, Denmark.
|
40
|
Zhou L, Heringstad B, Su G, Guldbrandtsen B, Meuwissen THE, Svendsen M, Grove H, Nielsen US, Lund MS. Genomic predictions based on a joint reference population for the Nordic Red cattle breeds. J Dairy Sci 2014; 97:4485-96. [PMID: 24792791 DOI: 10.3168/jds.2013-7580] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 10/09/2013] [Accepted: 03/13/2014] [Indexed: 12/31/2022]
Abstract
The main aim of this study was to compare accuracies of imputation and genomic predictions based on single and joint reference populations for Norwegian Red (NRF) and a composite breed (DFS) consisting of Danish Red, Finnish Ayrshire, and Swedish Red. The single nucleotide polymorphism (SNP) data for NRF consisted of 2 data sets: one including 25,000 markers (NRF25K) and the other including 50,000 markers (NRF50K). The NRF25K data set had 2,572 bulls, and the NRF50K data set had 1,128 bulls. Four hundred forty-two bulls were genotyped in both data sets (double-genotyped bulls). The DFS data set (DFS50K) included 50,000 markers of 13,472 individuals, of which around 4,700 were progeny-tested bulls. The NRF25K data set was imputed to 50,000 density using the software Beagle. The average error rate for the imputation of NRF25K decreased slightly from 0.023 to 0.021, and the correlation between observed and imputed genotypes changed from 0.935 to 0.936 when comparing the NRF50K reference and the NRF50K-DFS50K joint reference imputations. A genomic BLUP (GBLUP) model and a Bayesian 4-component mixture model were used to predict genomic breeding values for the NRF and DFS bulls based on the single and joint NRF and DFS reference populations. In the multiple population predictions, accuracies of genomic breeding values increased for the 3 production traits (milk, fat, and protein yields) for both NRF and DFS. Accuracies increased by 6 and 1.3 percentage points, on average, for the NRF and DFS bulls, respectively, using the GBLUP model, and by 9.3 and 1.3 percentage points, on average, using the Bayesian 4-component mixture model. However, accuracies for health or reproduction traits did not increase from the multiple population predictions. Among the 3 DFS populations, Swedish Red gained most in accuracies from the multiple population predictions, presumably because Swedish Red has a closer genetic relationship with NRF than Danish Red and Finnish Ayrshire. 
The Bayesian 4-component mixture model performed better than the GBLUP model for most production traits for both NRF and DFS, whereas no advantage was found for health or reproduction traits. In general, combining NRF and DFS reference populations was useful in genomic predictions for both the NRF and DFS bulls.
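The two imputation metrics quoted in this abstract (error rate and correlation between observed and imputed genotypes) can be computed as in the sketch below. It assumes genotypes coded as allele counts 0, 1, 2 and pools all animals and markers into a single correlation; the abstract does not state how the authors averaged, so that choice and the function name `imputation_accuracy` are assumptions.

```python
import numpy as np

def imputation_accuracy(observed, imputed):
    """Return (correlation, error_rate) between observed and imputed genotypes.

    Both inputs are (animals x SNPs) arrays of allele counts coded 0, 1, 2.
    The correlation is pooled over all genotype calls (an assumption).
    """
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    # Error rate: proportion of genotype calls that differ.
    error_rate = float(np.mean(observed != imputed))
    # Pearson correlation over all genotype calls.
    correlation = float(np.corrcoef(observed.ravel(), imputed.ravel())[0, 1])
    return correlation, error_rate
```

A per-animal or per-marker version would apply the same two calculations along one axis and then average the results.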
Affiliation(s)
- L Zhou
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
- B Heringstad
- Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Box 5003, 1432 Ås, Norway.
- G Su
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark.
- B Guldbrandtsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
- T H E Meuwissen
- Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Box 5003, 1432 Ås, Norway
- M Svendsen
- Geno Breeding and AI Association, 1432 Ås, Norway
- H Grove
- Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Box 5003, 1432 Ås, Norway
- U S Nielsen
- Danish Agriculture Advisory Service, DK-8200 Aarhus N, Denmark
- M S Lund
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark
|
41
|
Pryce JE, Johnston J, Hayes BJ, Sahana G, Weigel KA, McParland S, Spurlock D, Krattenmacher N, Spelman RJ, Wall E, Calus MPL. Imputation of genotypes from low density (50,000 markers) to high density (700,000 markers) of cows from research herds in Europe, North America, and Australasia using 2 reference populations. J Dairy Sci 2014; 97:1799-811. [PMID: 24472132 DOI: 10.3168/jds.2013-7368] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 08/14/2013] [Accepted: 12/03/2013] [Indexed: 12/30/2022]
Abstract
Combining data from research herds may be advantageous, especially for difficult or expensive-to-measure traits (such as dry matter intake). Cows in research herds are often genotyped using low-density single nucleotide polymorphism (SNP) panels. However, the precision of quantitative trait loci detection in genome-wide association studies and the accuracy of genomic selection may increase when the low-density genotypes are imputed to higher density. Genotype data were available from 10 research herds: 5 from Europe [Denmark, Germany, Ireland, the Netherlands, and the United Kingdom (UK)], 2 from Australasia (Australia and New Zealand), and 3 from North America (Canada and the United States). Heifers from the Australian and New Zealand research herds were already genotyped at high density (approximately 700,000 SNP). The remaining genotypes were imputed from around 50,000 SNP to 700,000 using 2 reference populations. Although it was not possible to use a combined reference population, which would probably result in the highest accuracies of imputation, differences arising from using 2 high-density reference populations on imputing 50,000-marker genotypes of 583 animals (from the UK) were quantified. The European genotypes (n=4,097) were imputed as 1 data set, using a reference population of 3,150 that included genotypes from 835 Australian and 1,053 New Zealand females, with the remainder being males. Imputation was undertaken using population-wide linkage disequilibrium with no family information exploited. The UK animals were also included in the North American data set (n=1,579) that was imputed to high density using a reference population of 2,018 bulls. After editing, 591,213 genotypes on 5,999 animals from 10 research herds remained. The correlation between imputed allele frequencies of the 2 imputed data sets was high (>0.98) and even stronger (>0.99) for the UK animals that were part of each imputation data set. 
For the UK genotypes, 2.2% were imputed differently in the 2 high-density reference data sets used. Only 0.025% of these were homozygous switches. The number of discordant SNP was lower for animals that had sires that were genotyped. Discordant imputed SNP genotypes were most common when a large difference existed in allele frequency between the 2 imputed genotype data sets. For SNP that had ≥ 20% discordant genotypes, the difference between imputed data sets of allele frequencies of the UK (imputed) genotypes was 0.07, whereas the difference in allele frequencies of the (reference) high-density genotypes was 0.30. In fact, regions existed across the genome where the frequency of discordant SNP was higher. For example, on chromosome 10 (centered on 520,948 bp), 52 SNP (out of a total of 103 SNP) had ≥ 20% discordant SNP. Four hundred and eight SNP had more than 20% discordant genotypes and were removed from the final set of imputed genotypes. We concluded that both discordance of imputed SNP genotypes and differences in allele frequencies, after imputation using different reference data sets, may be used to identify and remove poorly imputed SNP.
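The discordance-based filter described in this abstract can be sketched as follows: the same animals are imputed twice with different reference populations, the per-SNP proportion of differing genotype calls is computed, and SNP above a threshold are dropped. The 20% cut-off follows the abstract's statement that SNP with more than 20% discordant genotypes were removed; the function name `discordance_filter`, the array layout, and the handling of SNP exactly at the threshold are assumptions.

```python
import numpy as np

def discordance_filter(imputed_a, imputed_b, max_discordance=0.20):
    """Flag SNP whose imputed genotypes disagree too often between two runs.

    imputed_a, imputed_b: (animals x SNPs) genotype arrays for the same
    animals and SNP, imputed with two different reference populations.
    Returns (discordance, keep), where keep is a boolean mask over SNP.
    """
    imputed_a = np.asarray(imputed_a)
    imputed_b = np.asarray(imputed_b)
    if imputed_a.shape != imputed_b.shape:
        raise ValueError("both imputed data sets must have the same shape")
    # Proportion of animals with a different genotype call, per SNP.
    discordance = np.mean(imputed_a != imputed_b, axis=0)
    # Keep SNP at or below the threshold; remove those above it.
    keep = discordance <= max_discordance
    return discordance, keep
```

The returned mask can then be applied to either imputed data set, for example `imputed_a[:, keep]`, to obtain the filtered genotypes.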
Affiliation(s)
- J E Pryce
- Department of Environment and Primary Industries, Agribio, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia; Dairy Futures Cooperative Research Centre, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia; La Trobe University, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia.
- J Johnston
- Canadian Dairy Network, Guelph, Ontario, N1K 1E5, Canada
- B J Hayes
- Department of Environment and Primary Industries, Agribio, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia; Dairy Futures Cooperative Research Centre, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia; La Trobe University, 5 Ring Road, La Trobe University, Bundoora, VIC 3083, Australia
- G Sahana
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830 Tjele, Denmark
- K A Weigel
- Department of Dairy Science, University of Wisconsin, Madison 53706
- S McParland
- Animal & Grassland Research and Innovation Centre, Teagasc, Moorepark, Co. Cork, Ireland
- D Spurlock
- Department of Animal Science, Iowa State University, Ames 50011
- N Krattenmacher
- Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, 24118 Kiel, Germany
- R J Spelman
- LIC, Private Bag 3016, Hamilton 3240, New Zealand
- E Wall
- Animal and Veterinary Sciences, Scotland's Rural College (SRUC), Kings Buildings, West Mains Road, Edinburgh, EH9 3JG, United Kingdom
- M P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 8200 AB Lelystad, the Netherlands
|