1
Freitas LA, Savegnago RP, Alves AAC, Stafuzza NB, Pedrosa VB, Rocha RA, Rosa GJM, Paz CCP. Genome-enabled prediction of indicator traits of resistance to gastrointestinal nematodes in sheep using parametric models and artificial neural networks. Res Vet Sci 2024; 166:105099. [PMID: 38091815 DOI: 10.1016/j.rvsc.2023.105099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 11/15/2023] [Accepted: 11/19/2023] [Indexed: 01/01/2024]
Abstract
This study aimed to assess the predictive ability of parametric models and an artificial neural network method for genomic prediction of the following indicator traits of resistance to gastrointestinal nematodes in Santa Inês sheep: packed cell volume (PCV), fecal egg count (FEC), and Famacha© method (FAM). After quality control, 551 (PCV), 548 (FEC), and 565 (FAM) genotyped animals and 41,676 SNPs remained. The average prediction accuracy (ACC), calculated as the Pearson correlation between observed and predicted values, and the mean squared error (MSE) were obtained using genomic best linear unbiased predictor (GBLUP), BayesA, BayesB, Bayesian least absolute shrinkage and selection operator (BLASSO), and Bayesian regularized artificial neural network (three and four hidden neurons, BRANN_3 and BRANN_4, respectively) in a 5-fold cross-validation. The average ACC varied from moderate to high according to the trait and model, ranging between 0.418 and 0.546 (PCV), between 0.646 and 0.793 (FEC), and between 0.414 and 0.519 (FAM). Parametric models presented nearly the same ACC and MSE for the studied traits and provided better accuracies than BRANN. The GBLUP, BayesA, BayesB and BLASSO models provided better accuracies than the BRANN_3 method, increasing by around 23% for PCV and 18.5% for FEC. In conclusion, parametric models are suitable for genome-enabled prediction of indicator traits of resistance to gastrointestinal nematodes in sheep. Given the small differences in accuracy among them, the GBLUP model is recommended because of its lower computational cost.
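The evaluation scheme in this abstract (5-fold cross-validation, ACC as the Pearson correlation between observed and predicted values, plus MSE) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names and the single-predictor regression that stands in for GBLUP or the Bayesian models are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def simple_regression(train_x, train_y, test_x):
    """Hypothetical stand-in model: least-squares line on one predictor."""
    n = len(train_x)
    mx, my = sum(train_x) / n, sum(train_y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(train_x, train_y))
             / sum((a - mx) ** 2 for a in train_x))
    return [my + slope * (x - mx) for x in test_x]


def cross_validate(features, phenotypes, fit_predict, k=5, seed=0):
    """Return the ACC and MSE averaged over the k validation folds."""
    accs, mses = [], []
    for test in k_fold_indices(len(phenotypes), k, seed):
        held_out = set(test)
        train = [i for i in range(len(phenotypes)) if i not in held_out]
        predicted = fit_predict([features[i] for i in train],
                                [phenotypes[i] for i in train],
                                [features[i] for i in test])
        observed = [phenotypes[i] for i in test]
        accs.append(pearson(observed, predicted))
        mses.append(mse(observed, predicted))
    return sum(accs) / k, sum(mses) / k
```

Any model can be passed as `fit_predict`, provided it accepts training features, training phenotypes, and test features and returns one prediction per test animal.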
Affiliation(s)
- L A Freitas
  - University of Sao Paulo, Department of Genetics, Ribeirão Preto, São Paulo 14049-900, Brazil; University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA.
- R P Savegnago
  - Michigan State University, Department of Animal Science, MI 48864, USA.
- A A C Alves
  - University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA.
- N B Stafuzza
  - Sustainable Livestock Research Center, Animal Science Institute, São José do Rio Preto, São Paulo 15130-000, Brazil.
- V B Pedrosa
  - State University of Ponta Grossa, Ponta Grossa, Paraná 84030-900, Brazil.
- R A Rocha
  - State University of Ponta Grossa, Ponta Grossa, Paraná 84030-900, Brazil.
- G J M Rosa
  - University of Wisconsin, Department of Animal and Dairy Sciences, Madison 53706, USA.
- C C P Paz
  - University of Sao Paulo, Department of Genetics, Ribeirão Preto, São Paulo 14049-900, Brazil; Sustainable Livestock Research Center, Animal Science Institute, São José do Rio Preto, São Paulo 15130-000, Brazil.

2
Chafai N, Hayah I, Houaga I, Badaoui B. A review of machine learning models applied to genomic prediction in animal breeding. Front Genet 2023; 14:1150596. [PMID: 37745853 PMCID: PMC10516561 DOI: 10.3389/fgene.2023.1150596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 08/22/2023] [Indexed: 09/26/2023] Open
Abstract
The advent of modern genotyping technologies has revolutionized genomic selection in animal breeding. Large marker datasets have shown several drawbacks for traditional genomic prediction methods in terms of flexibility, accuracy, and computational power. Recently, the application of machine learning models in animal breeding has gained a lot of interest due to their tremendous flexibility and their ability to capture patterns in large noisy datasets. Here, we present a general overview of a handful of machine learning algorithms and their application in genomic prediction to provide a meta-picture of their performance in genomic estimated breeding values estimation, genotype imputation, and feature selection. Finally, we discuss a potential adoption of machine learning models in genomic prediction in developing countries. The results of the reviewed studies showed that machine learning models have indeed performed well in fitting large noisy data sets and modeling minor nonadditive effects in some of the studies. However, sometimes conventional methods outperformed machine learning models, which confirms that there's no universal method for genomic prediction. In summary, machine learning models have great potential for extracting patterns from single nucleotide polymorphism datasets. Nonetheless, the level of their adoption in animal breeding is still low due to data limitations, complex genetic interactions, a lack of standardization and reproducibility, and the lack of interpretability of machine learning models when trained with biological data. Consequently, there is no remarkable outperformance of machine learning methods compared to traditional methods in genomic prediction. Therefore, more research should be conducted to discover new insights that could enhance livestock breeding programs.
Affiliation(s)
- Narjice Chafai
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Ichrak Hayah
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Isidore Houaga
  - Centre for Tropical Livestock Genetics and Health, The Roslin Institute, Royal (Dick) School of Veterinary Medicine, The University of Edinburgh, Edinburgh, United Kingdom
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Bouabid Badaoui
  - Laboratory of Biodiversity, Ecology, and Genome, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
  - African Sustainable Agriculture Research Institute (ASARI), Mohammed VI Polytechnic University (UM6P), Laayoune, Morocco

3
Wolf MJ, Neumann GB, Kokuć P, Yin T, Brockmann GA, König S, May K. Genetic evaluations for endangered dual-purpose German Black Pied cattle using 50K SNPs, a breed-specific 200K chip, and whole-genome sequencing. J Dairy Sci 2023; 106:3345-3358. [PMID: 37028956 DOI: 10.3168/jds.2022-22665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 12/16/2022] [Indexed: 04/09/2023]
Abstract
Genetic evaluations of local cattle breeds are hampered due to small reference groups or biased due to the utilization of SNP effects estimated in other large populations. Against this background, there is a lack of studies addressing the possible advantage of whole-genome sequences (WGS) or consideration of specific variants from WGS data in genomic predictions for local breeds with small population size. Consequently, the aim of this study was to compare genetic parameters and accuracies of genomic estimated breeding values (GEBV) for 305-d production traits, fat-to-protein ratio (FPR), and somatic cell score (SCS) at the first test date after calving and conformation traits of the endangered German Black Pied cattle (DSN) breed using 4 different marker panels: (1) the commercial 50K Illumina BovineSNP50 BeadChip, (2) a customized 200K chip designed for DSN (DSN200K) which considers the most important variants for DSN from WGS, (3) randomly generated 200K chips based on WGS data, and (4) a WGS panel. The same number of animals was considered for all marker panel analyses (i.e., 1,811 genotyped or sequenced cows for conformation traits, 2,383 cows for lactation production traits, and 2,420 cows for FPR and SCS). Mixed models for the estimation of genetic parameters directly included the respective genomic relationship matrix from the different marker panels plus the trait-specific fixed effects. For the calculation of GEBV accuracies, we applied repeated random subsampling validation. In the process of separate cross-validations per trait, we created a validation set including 20% of cows with masked phenotypes, and a training set comprising 80% of the cows. The cows were selected randomly in a procedure with 10 replicates considering replacements in the different scenarios. The accuracy was defined as the correlation between the direct GEBV and the phenotypes with subtracted corresponding fixed effects for the cows in the validation set.
For FPR and SCS, as well as for lactation production traits, heritabilities were largest based on WGS data, but the increase compared with the 50K or DSN200K applications was quite small, in the range from 0.01 to 0.03. Also, for most of the conformation traits, heritabilities were largest based on WGS and DSN200K data, but the increase was in the range of the corresponding standard error. Accordingly, GEBV accuracies for most of the studied traits were highest based on WGS data or when utilizing the DSN200K chip, but the accuracy differences across the marker panels were quite small and nonsignificant. In conclusion, WGS data and the DSN200K chip only contributed to minor improvements in genomic predictions, still justifying the use of the commercial 50K chip. Nevertheless, WGS and the DSN200K chip harbor breed-specific variants, which are valuable for studying causal genetic mechanisms in the endangered DSN population.
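The repeated random subsampling validation described in this abstract (10 replicates, 20% of cows in the validation set with masked phenotypes, 80% in the training set) can be sketched as below. The function name and the seed are hypothetical, and the sketch reads "considering replacements" as meaning that each replicate is drawn independently, so a cow may appear in the validation set of several replicates; that reading is an assumption.

```python
import random


def subsample_splits(animal_ids, n_replicates=10, validation_fraction=0.2, seed=1):
    """Draw independent training/validation splits, one per replicate.

    Within a replicate the two sets are disjoint; across replicates the
    same animal may be drawn into the validation set more than once.
    """
    rng = random.Random(seed)
    n_validation = round(len(animal_ids) * validation_fraction)
    splits = []
    for _ in range(n_replicates):
        validation = set(rng.sample(animal_ids, n_validation))
        training = [a for a in animal_ids if a not in validation]
        splits.append((training, sorted(validation)))
    return splits
```

For the 2,420 cows with FPR and SCS records, each replicate would hold out 484 cows and train on the remaining 1,936.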
Affiliation(s)
- Manuel J Wolf
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
- Guilherme B Neumann
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Paula Kokuć
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Tong Yin
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
- Gudrun A Brockmann
  - Animal Breeding Biology and Molecular Genetics, Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Humboldt Universität zu Berlin, 10115 Berlin, Germany
- Sven König
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany
- Katharina May
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, 35390 Gießen, Germany

4
Montesinos-López OA, Montesinos-López A, Mosqueda-Gonzalez BA, Montesinos-López JC, Crossa J, Ramirez NL, Singh P, Valladares-Anguiano FA. A zero altered Poisson random forest model for genomic-enabled prediction. G3-GENES GENOMES GENETICS 2021; 11:6042695. [PMID: 33693599 PMCID: PMC8022945 DOI: 10.1093/g3journal/jkaa057] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/03/2020] [Accepted: 12/10/2020] [Indexed: 12/23/2022]
Abstract
In genomic selection choosing the statistical machine learning model is of paramount importance. In this paper, we present an application of a zero altered random forest model with two versions (ZAP_RF and ZAPC_RF) to deal with excess zeros in count response variables. The proposed model was compared with the conventional random forest (RF) model and with the conventional Generalized Poisson Ridge regression (GPR) using two real datasets, and we found that, in terms of prediction performance, the proposed zero inflated random forest model outperformed the conventional RF and GPR models.
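For readers unfamiliar with the term, a zero-altered (hurdle) Poisson distribution assigns its own probability to a zero count and spreads the remaining mass over a zero-truncated Poisson. The sketch below shows only that standard distribution; it is not the ZAP_RF or ZAPC_RF model of the paper, and the function and parameter names are assumptions.

```python
import math


def zap_pmf(y, pi_zero, lam):
    """P(Y = y) under a zero-altered Poisson.

    pi_zero is the probability of a zero count; positive counts follow a
    Poisson(lam) truncated at zero and scaled by (1 - pi_zero).
    """
    if y < 0:
        return 0.0
    if y == 0:
        return pi_zero
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return (1.0 - pi_zero) * poisson / (1.0 - math.exp(-lam))


def zap_mean(pi_zero, lam):
    """Expected count: (1 - pi_zero) times the zero-truncated Poisson mean."""
    return (1.0 - pi_zero) * lam / (1.0 - math.exp(-lam))
```

Because the zero probability is a free parameter, the same form can describe either an excess or a deficit of zeros relative to an ordinary Poisson.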
Affiliation(s)
- Abelardo Montesinos-López
  - Departamento de Matemáticas, Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, 44430 Guadalajara, Jalisco, México
- José Crossa
  - Colegio de Postgraduados, Montecillos, Edo. de México CP 56230, México; International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Edo. de México, México
- Nerida Lozano Ramirez
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Edo. de México, México
- Pawan Singh
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera Mexico-Veracruz, CP 52640, Edo. de México, México

5
Alves AAC, Espigolan R, Bresolin T, Costa RM, Fernandes Júnior GA, Ventura RV, Carvalheiro R, Albuquerque LG. Genome-enabled prediction of reproductive traits in Nellore cattle using parametric models and machine learning methods. Anim Genet 2020; 52:32-46. [PMID: 33191532 DOI: 10.1111/age.13021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Accepted: 10/13/2020] [Indexed: 12/31/2022]
Abstract
This study aimed to assess the predictive ability of different machine learning (ML) methods for genomic prediction of reproductive traits in Nellore cattle. The studied traits were age at first calving (AFC), scrotal circumference (SC), early pregnancy (EP) and stayability (STAY). The numbers of genotyped animals and SNP markers available were 2342 and 321 419 (AFC), 4671 and 309 486 (SC), 2681 and 319 619 (STAY) and 3356 and 319 108 (EP). Predictive ability of support vector regression (SVR), Bayesian regularized artificial neural network (BRANN) and random forest (RF) were compared with results obtained using parametric models (genomic best linear unbiased predictor, GBLUP, and Bayesian least absolute shrinkage and selection operator, BLASSO). A 5-fold cross-validation strategy was performed and the average prediction accuracy (ACC) and mean squared errors (MSE) were computed. The ACC was defined as the linear correlation between predicted and observed breeding values for categorical traits (EP and STAY) and as the correlation between predicted and observed adjusted phenotypes divided by the square root of the estimated heritability for continuous traits (AFC and SC). The average ACC varied from low to moderate depending on the trait and model under consideration, ranging between 0.56 and 0.63 (AFC), 0.27 and 0.36 (SC), 0.57 and 0.67 (EP), and 0.52 and 0.62 (STAY). SVR provided slightly better accuracies than the parametric models for all traits, increasing the prediction accuracy for AFC by around 6.3 and 4.8% compared with GBLUP and BLASSO respectively. Likewise, there was an increase of 8.3% for SC, 4.5% for EP and 4.8% for STAY, comparing SVR with both GBLUP and BLASSO. In contrast, the RF and BRANN did not present competitive predictive ability compared with the parametric models. The results indicate that SVR is a suitable method for genome-enabled prediction of reproductive traits in Nellore cattle.
Further, the optimal kernel bandwidth parameter in the SVR model was trait-dependent; thus, fine-tuning this hyperparameter in the training phase is crucial.
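The accuracy definition used for the continuous traits (correlation between predicted and observed adjusted phenotypes divided by the square root of the heritability) reduces to a one-line scaling, sketched below. The helper names are hypothetical, and reading the reported percentage gains as relative differences between two accuracies is an assumption rather than something the abstract states.

```python
import math


def scaled_accuracy(correlation, heritability):
    """Predictive correlation divided by the square root of heritability."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return correlation / math.sqrt(heritability)


def relative_gain_percent(acc_new, acc_reference):
    """Percentage change of one accuracy relative to a reference accuracy."""
    return 100.0 * (acc_new - acc_reference) / acc_reference
```

For example, a predictive correlation of 0.3 for a trait with a heritability of 0.25 corresponds to a scaled accuracy of 0.6.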
Affiliation(s)
- A A C Alves
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil
- R Espigolan
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil
- T Bresolin
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil
- R M Costa
  - Department of Exact Sciences, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 4884-900, Brazil
- G A Fernandes Júnior
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil
- R V Ventura
  - Department of Animal Nutrition and Production, School of Veterinary Medicine and Animal Science, University of Sao Paulo (USP), Pirassununga, 13635-900, Brazil
- R Carvalheiro
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil; National Council of Technological and Scientific Development (CNPq), Brasília, 71605-001, Brazil
- L G Albuquerque
  - Department of Animal Science, School of Agricultural and Veterinary Sciences, Sao Paulo State University (UNESP), Jaboticabal, 14884-900, Brazil; National Council of Technological and Scientific Development (CNPq), Brasília, 71605-001, Brazil

6
Abstract
The current livestock management landscape is transitioning to a high-throughput digital era in which large amounts of information captured by electro-optical, acoustical, mechanical, and biosensor systems are stored and analyzed on a daily and hourly basis, and actionable decisions are made based on quantitative and qualitative analytic results. While traditional animal breeding prediction methods have been used with great success until recently, the deluge of information is starting to create a computational and storage bottleneck that could have negative long-term impacts on herd management strategies if not handled properly. A plethora of machine learning approaches, successfully used in various industrial and scientific applications, have made their way into mainstream livestock breeding techniques, and current results show that such methods have the potential to match or surpass the traditional approaches, while in most cases being more scalable from a computational and storage perspective. This article provides a succinct view of which traditional and novel prediction methods are currently used in the livestock breeding field, how successful they are, and how the future of the field looks in the new digital agriculture era.
7
A machine learning approach for the identification of population-informative markers from high-throughput genotyping data: application to several pig breeds. Animal 2019; 14:223-232. [PMID: 31603060 DOI: 10.1017/s1751731119002167] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/07/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) able to describe population differences can be used for important applications in livestock, including breed assignment of individual animals, authentication of mono-breed products and parentage verification among several other applications. To identify the most discriminating SNPs among thousands of markers in the available commercial SNP chip tools, several methods have been used. Random forest (RF) is a machine learning technique that has been proposed for this purpose. In this study, we used RF to analyse PorcineSNP60 BeadChip array genotyping data obtained from a total of 2737 pigs of 7 Italian pig breeds (3 cosmopolitan-derived breeds: Italian Large White, Italian Duroc and Italian Landrace, and 4 autochthonous breeds: Apulo-Calabrese, Casertana, Cinta Senese and Nero Siciliano) to identify breed informative and reduced SNP panels using the mean decrease in the Gini Index and the Mean Decrease in Accuracy parameters with stability evaluation. Other reduced informative SNP panels were obtained using Delta, Fixation index and principal component analysis statistics, and their performances were compared with those obtained using the RF-defined panels using the RF classification method and its derived Out Of Bag rates and correct prediction proportions. Therefore, the performances of a total of six reduced panels were evaluated. The proportion of animals correctly assigned to their breed was close to 100% for all tested approaches. Porcine chromosome 8 harboured the largest number of selected SNPs across all panels. Many SNPs were included in genomic regions in which previous studies identified signatures of selection or genes (e.g. ESR1, KITL and LCORL) that could contribute to explaining, at least in part, phenotypically or economically relevant traits that might differentiate cosmopolitan and autochthonous pig breeds.
Random forest used as preselection statistics highlighted informative SNPs that were not the same as those identified by other methods. This might be due to specific features of this machine learning methodology. It will be interesting to explore if the adaptation of RF methods for the identification of selection signature regions could be able to describe population-specific features that are not captured by other approaches.
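Two of the comparison statistics named in this abstract, Delta and the fixation index, can be illustrated for a single biallelic SNP and a pair of breeds. The sketch assumes the common textbook forms (Delta as the absolute allele-frequency difference, Fst as the heterozygosity-based ratio with equally weighted populations); the paper's exact estimators are not given here, and all names below are hypothetical.

```python
def delta(p_a, p_b):
    """Absolute difference in allele frequency between two populations."""
    return abs(p_a - p_b)


def fst_two_populations(p_a, p_b):
    """Fst = (H_T - H_S) / H_T for one biallelic SNP, equal weights assumed."""
    p_bar = (p_a + p_b) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in the pooled sample
    h_within = (2.0 * p_a * (1.0 - p_a) + 2.0 * p_b * (1.0 - p_b)) / 2.0
    return (h_total - h_within) / h_total


def rank_snps(freqs_a, freqs_b, top_n):
    """Return the top_n SNP names ordered by decreasing Delta."""
    scores = {snp: delta(freqs_a[snp], freqs_b[snp]) for snp in freqs_a}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

A reduced panel in this spirit would keep the highest-ranked SNPs for each breed comparison and then check assignment accuracy with a classifier.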
8
Tang M, Hu P, Wang CF, Yu CQ, Sheng J, Ma SJ. Prediction Model of Cardiac Risk for Dental Extraction in Elderly Patients with Cardiovascular Diseases. Gerontology 2019; 65:591-598. [PMID: 31048587 DOI: 10.1159/000497424] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/11/2018] [Accepted: 02/03/2019] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND With the rapidly increasing population of elderly people, dental extraction in elderly individuals with cardiovascular diseases (CVDs) has become quite common. The issue of how to assure the safety of elderly patients with CVDs undergoing dental extraction has perplexed dentists and internists for many years. And it is important to derive an appropriate risk prediction tool for this population. OBJECTIVES The aim of this retrospective, observational study was to establish and validate a prediction model based on the random forest (RF) algorithm for the risk of cardiac complications of dental extraction in elderly patients with CVDs. METHODS Between August 2017 and May 2018, a total of 603 patients who fulfilled the inclusion criteria were used to create a training set. An independent test set contained 230 patients between June 2018 and July 2018. Data regarding clinical parameters, laboratory tests, clinical examinations before dental extraction, and 1-week follow-up were retrieved. Predictors were identified by using logistic regression (LR) with penalized LASSO (least absolute shrinkage and selection operator) variable selection. Then, a prediction model was constructed based on the RF algorithm by using a 5-fold cross-validation method. RESULTS The training set, based on 603 participants, including 282 men and 321 women, had an average participant age of 72.38 ± 8.31 years. Using feature selection methods, 11 predictors for risk of cardiac complications were screened out. When the RF model was constructed, its overall classification accuracy was 0.82 at the optimal cutoff value of 18.5%. In comparison to the LR model, the RF model showed a superior predictive performance. The AUROC (area under the receiver operating characteristic curve) scores of the RF and LR models were 0.83 and 0.80, respectively, in the independent test set. 
The AUPRC (area under the precision-recall curve) scores of the RF and LR models were 0.56 and 0.35, respectively, in the independent test set. CONCLUSION The RF-based prediction model is expected to be applicable for preoperative clinical assessment for preventing cardiac complications in elderly patients with CVDs undergoing dental extraction. The findings may aid physicians and dentists in making more informed recommendations to prevent cardiac complications in this patient population.
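The two headline metrics in this abstract, AUROC and classification accuracy at a probability cutoff, can be computed as sketched below. The AUROC uses the rank-based (Mann-Whitney) formulation; the function names and the toy inputs in the usage note are assumptions, not data from the study.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at_cutoff(labels, scores, cutoff):
    """Share of cases classified correctly when score >= cutoff means 'at risk'."""
    predictions = [1 if s >= cutoff else 0 for s in scores]
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)
```

With toy labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ordered correctly, so the AUROC is 0.75.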
Affiliation(s)
- Min Tang
  - Department of Geriatrics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ping Hu
  - Department of Geriatrics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Cao-Feng Wang
  - Department of Geriatrics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Chuang-Qi Yu
  - Department of Oral Surgery, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jing Sheng
  - Department of Geriatrics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shao-Jun Ma
  - Department of Geriatrics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

9
Yin T, König S. Genome-wide associations and detection of potential candidate genes for direct genetic and maternal genetic effects influencing dairy cattle body weight at different ages. Genet Sel Evol 2019; 51:4. [PMID: 30727969 PMCID: PMC6366057 DOI: 10.1186/s12711-018-0444-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/28/2018] [Accepted: 12/20/2018] [Indexed: 12/27/2022] Open
Abstract
Background Body weight (BW) at different ages are of increasing importance in dairy cattle breeding schemes, because of their strong correlation with energy efficiency traits, and their impact on cow health, longevity and farm economy. In total, 15,921 dairy cattle from 56 large-scale test-herds with BW records were genotyped for 45,613 single nucleotide polymorphisms (SNPs). This dataset was used for genome-wide association studies (GWAS), in order to localize potential candidate genes for direct and maternal genetic effects on BW recorded at birth (BW0), at 2 to 3 months of age (BW23), and at 13 to 14 months of age (BW1314). Results The first 20 principal components (PC) of the genomic relationship matrix ($\mathbf{G}$) grouped the genotyped cattle into three clusters. In the statistical models used for GWAS, correction for population structure was done by including polygenic effects with various genetic similarity matrices, such as the pedigree-based relationship matrix ($\mathbf{A}$), the $\mathbf{G}$-matrix, the reduced $\mathbf{G}$-matrix LOCO (i.e. exclusion of the chromosome on which the candidate SNP is located), and LOCO plus chromosome-wide PC. Inflation factors for direct genetic effects using $\mathbf{A}$ and LOCO were larger than 1.17. For $\mathbf{G}$ and LOCO plus chromosome-wide PC, inflation factors were very close to 1.0. According to Bonferroni correction, ten, two and seven significant SNPs were detected for the direct genetic effect on BW0, BW23, and BW1314, respectively. Seventy-six candidate genes contributed to direct genetic effects on BW with four involved in growth and developmental processes: FGF6, FGF23, TNNT3, and OMD. For maternal genetic effects on BW0, only three significant SNPs (according to Bonferroni correction), and four potential candidate genes, were identified. The most significant SNP on chromosome 19 explained only 0.14% of the maternal de-regressed proof variance for BW0. Conclusions For correction of population structure in GWAS, we suggest a statistical model that considers LOCO plus chromosome-wide PC. Regarding direct genetic effects, several SNPs had a significant effect on BW at different ages, and only two SNPs on chromosome 5 had a significant effect on all three BW traits. Thus, different potential candidate genes regulate BW at different ages. Maternal genetic effects followed an infinitesimal model. Electronic supplementary material The online version of this article (10.1186/s12711-018-0444-4) contains supplementary material, which is available to authorized users.
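The two GWAS diagnostics mentioned in this abstract, a Bonferroni significance threshold and the genomic inflation factor, can be sketched as follows. The inflation factor is computed in its usual form, as the median 1-df chi-square statistic implied by the p-values divided by the median of a 1-df chi-square distribution (about 0.4549). The function names are hypothetical, and the 0.05 family-wise level in the usage note is an assumption, since the abstract does not state it.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


def inflation_factor(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values in (0, 1]."""
    normal = NormalDist()
    # A two-sided p-value corresponds to |z| = -inv_cdf(p / 2); chi2 = z ** 2.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

With the 45,613 SNPs reported above and an assumed level of 0.05, `bonferroni_threshold(0.05, 45613)` gives a per-SNP threshold of roughly 1.1e-6. Values of lambda near 1.0 indicate that the test statistics are not systematically inflated by population structure.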
Affiliation(s)
- Tong Yin
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, Ludwigstr. 21b, 35390, Giessen, Germany
- Sven König
  - Institute of Animal Breeding and Genetics, Justus-Liebig-University Gießen, Ludwigstr. 21b, 35390, Giessen, Germany

10
Naderi S, Bohlouli M, Yin T, König S. Genomic breeding values, SNP effects and gene identification for disease traits in cow training sets. Anim Genet 2018; 49:178-192. [PMID: 29624705 DOI: 10.1111/age.12661] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Accepted: 02/19/2018] [Indexed: 12/30/2022]
Abstract
Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single-step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large-scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de-regressed proofs (DRPs) and pre-corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUP) was achieved via single-step GBLUP applications. Correlations between genome-wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments.
RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.
Affiliation(s)
- S Naderi
  - Institute of Animal Breeding and Genetics, University of Gießen, Ludwigstr. 21b, 35390, Gießen, Germany
- M Bohlouli
  - Institute of Animal Breeding and Genetics, University of Gießen, Ludwigstr. 21b, 35390, Gießen, Germany
- T Yin
  - Institute of Animal Breeding and Genetics, University of Gießen, Ludwigstr. 21b, 35390, Gießen, Germany
- S König
  - Institute of Animal Breeding and Genetics, University of Gießen, Ludwigstr. 21b, 35390, Gießen, Germany

11
Yin T, König S. Heritabilities and genetic correlations in the same traits across different strata of herds created according to continuous genomic, genetic, and phenotypic descriptors. J Dairy Sci 2018; 101:2171-2186. [DOI: 10.3168/jds.2017-13575] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/26/2017] [Accepted: 10/25/2017] [Indexed: 11/19/2022]