1
Bhowmik N, Seaborn T, Ringwall KA, Dahlen CR, Swanson KC, Hulsman Hanna LL. Genetic Distinctness and Diversity of American Aberdeen Cattle Compared to Common Beef Breeds in the United States. Genes (Basel) 2023; 14:1842. [PMID: 37895190 PMCID: PMC10606367 DOI: 10.3390/genes14101842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 09/10/2023] [Accepted: 09/19/2023] [Indexed: 10/29/2023] Open
Abstract
American Aberdeen (AD) cattle in the USA descend from an Aberdeen Angus herd originally brought to the Trangie Agricultural Research Centre, New South Wales, Australia. Although put under specific selection pressure for yearling growth rate, AD remain genomically uncharacterized. The objective was to characterize the genetic diversity and structure of purebred and crossbred AD cattle relative to seven common USA beef breeds using available whole-genome SNP data. A total of 1140 animals, consisting of 404 purebred (n = 8 types) and 736 admixed individuals (n = 10 types), were used. Genetic diversity metrics, an analysis of molecular variance, and a discriminant analysis of principal components were employed. When linkage disequilibrium was not accounted for, markers influenced basic diversity parameter estimates, especially for AD cattle. Even so, intrapopulation and interpopulation estimates separated AD cattle from other purebred types (e.g., Latter's pairwise FST ranged from 0.1129 to 0.2209), with AD cattle being less heterozygous and having lower allelic richness than other purebred types. The admixed AD-influenced cattle were intermediate to other admixed types for similar parameters. The separation and differences in the diversity metrics support strong artificial selection pressures during and after AD breed development, shaping the evolution of the breed and making it genomically distinct from similar breeds.
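To make the heterozygosity and FST comparison in this abstract concrete, the following is a minimal sketch and not the authors' pipeline. It assumes genotypes coded as 0/1/2 copies of one allele, and it uses a simple Wright-style ratio (HT - HS) / HT with equal weight per population rather than Latter's estimator named in the abstract. All function names are hypothetical.

```python
from statistics import mean


def allele_freq(genotypes):
    """Frequency of the counted allele at one locus from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))


def expected_heterozygosity(pop):
    """Mean of 2p(1-p) over loci; pop is a list of per-animal genotype lists."""
    loci = list(zip(*pop))  # transpose: one tuple of genotypes per locus
    return mean(2 * p * (1 - p) for p in map(allele_freq, loci))


def simple_fst(pop_a, pop_b):
    """Wright-style FST = (HT - HS) / HT for two populations (illustrative only)."""
    # HS: average within-population expected heterozygosity
    h_s = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2
    # HT: expected heterozygosity at the unweighted mean allele frequency
    p_bar = [
        (allele_freq(a) + allele_freq(b)) / 2
        for a, b in zip(zip(*pop_a), zip(*pop_b))
    ]
    h_t = mean(2 * p * (1 - p) for p in p_bar)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


if __name__ == "__main__":
    breed_a = [[0, 1, 2], [0, 1, 1], [0, 2, 2]]  # toy data: 3 animals x 3 SNPs
    breed_b = [[2, 1, 0], [2, 0, 0], [1, 1, 0]]
    print(expected_heterozygosity(breed_a), simple_fst(breed_a, breed_b))
```

Under this sketch, a population with lower expected heterozygosity and larger pairwise FST against every other breed would appear as the more distinct group, which is the pattern the abstract reports for AD cattle.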
Affiliation(s)
- Nayan Bhowmik
  - Department of Animal Sciences, North Dakota State University, Fargo, ND 58108, USA
- Travis Seaborn
  - School of Natural Resource Sciences, North Dakota State University, Fargo, ND 58108, USA
- Kris A. Ringwall
  - Dickinson Research Extension Center, North Dakota State University, Dickinson, ND 58601, USA
- Carl R. Dahlen
  - Department of Animal Sciences, North Dakota State University, Fargo, ND 58108, USA
- Kendall C. Swanson
  - Department of Animal Sciences, North Dakota State University, Fargo, ND 58108, USA
2
Zhao C, Wang D, Teng J, Yang C, Zhang X, Wei X, Zhang Q. Breed identification using breed-informative SNPs and machine learning based on whole genome sequence data and SNP chip data. J Anim Sci Biotechnol 2023; 14:85. [PMID: 37259083 DOI: 10.1186/s40104-023-00880-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 04/05/2023] [Indexed: 06/02/2023] Open
Abstract
BACKGROUND Breed identification is useful in a variety of biological contexts. Breed identification usually involves two stages, i.e., detection of breed-informative SNPs and breed assignment. Several methods have been proposed for both stages. However, which combination of these methods is optimal remains unclear. In this study, using the whole genome sequence data available for 13 cattle breeds from Run 8 of the 1,000 Bull Genomes Project, we compared the combinations of three methods (Delta, FST, and In) for breed-informative SNP detection and five machine learning methods (KNN, SVM, RF, NB, and ANN) for breed assignment with respect to different reference population sizes and different numbers of most breed-informative SNPs. In addition, we evaluated the accuracy of breed identification using SNP chip data of different densities. RESULTS We found that all combinations performed quite well, with identification accuracies over 95% in all scenarios. However, no single combination was both the best performing and robust across all scenarios. We proposed to integrate the three breed-informative SNP detection methods, named DFI, and to integrate three of the machine learning methods, KNN, SVM, and RF, named KSR. We found that the combination of these two integrated methods outperformed the other combinations, with accuracies over 99% in most cases, and was very robust in all scenarios. The accuracies from using SNP chip data were only slightly lower than those from using sequence data in most cases. CONCLUSIONS The current study showed that the combination of DFI and KSR was the optimal strategy. Using sequence data resulted in higher accuracies than using chip data in most cases. However, the differences were generally small. In view of the cost of genotyping, using chip data is also a good option for breed identification.
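As an illustration of the first stage described in this abstract, the sketch below ranks SNPs by Delta, taken here as the absolute allele frequency difference between two breeds, and keeps the top k. It covers only the two-breed case; how the study extends Delta to 13 breeds, and how DFI combines Delta, FST, and In, is not stated in the abstract and is not reproduced. The function names are hypothetical.

```python
def delta_scores(freq_a, freq_b):
    """Absolute allele frequency difference per SNP between two breeds."""
    return [abs(a - b) for a, b in zip(freq_a, freq_b)]


def top_informative_snps(freq_a, freq_b, k):
    """Indices of the k SNPs with the largest Delta, most informative first."""
    scores = delta_scores(freq_a, freq_b)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy allele frequencies for four SNPs in two breeds (made-up numbers).
    breed_a = [0.1, 0.9, 0.5, 0.2]
    breed_b = [0.2, 0.1, 0.5, 0.9]
    print(top_informative_snps(breed_a, breed_b, k=2))
```

The selected SNP indices would then feed a classifier in the second stage (breed assignment); that step is omitted here.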
Affiliation(s)
- Changheng Zhao
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Dan Wang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Jun Teng
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Cheng Yang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Xinyi Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Xianming Wei
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
- Qin Zhang
  - Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Veterinary Medicine, Shandong Agricultural University, Tai'an, 271018, China
3
Ryan CA, Berry DP, O’Brien A, Pabiou T, Purfield DC. Evaluating the use of statistical and machine learning methods for estimating breed composition of purebred and crossbred animals in thirteen cattle breeds using genomic information. Front Genet 2023; 14:1120312. [PMID: 37274789 PMCID: PMC10237237 DOI: 10.3389/fgene.2023.1120312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 05/03/2023] [Indexed: 06/07/2023] Open
Abstract
Introduction: The ability to accurately predict breed composition using genomic information has many potential uses, including increasing the accuracy of genetic evaluations, optimising mating plans, and serving as a parameter for genotype quality control. The objective of the present study was to use a database of genotyped purebred and crossbred cattle to compare breed composition predictions from freely available software, Admixture, with those from a single nucleotide polymorphism Best Linear Unbiased Prediction (SNP-BLUP) approach; a supplementary objective was to determine the accuracy and general robustness of low-density genotype panels for predicting breed composition. Methods: All animals had genotype information on 49,213 autosomal single nucleotide polymorphisms (SNPs). Thirteen breeds were included in the analysis and 500 purebred animals per breed were used to establish the breed training populations. Accuracy of breed composition prediction was determined using a separate validation population of 3,146 verified purebred and 4,330 two- and three-way crossbred cattle. Results: When all 49,213 autosomal SNPs were used for breed prediction, a minimal absolute mean difference of 0.04 between Admixture and SNP-BLUP breed predictions was evident. For crossbreds, the average absolute difference in breed prediction estimates generated using SNP-BLUP and Admixture was 0.068 with a root mean square error of 0.08. Breed predictions from low-density SNP panels were generated using both SNP-BLUP and Admixture and compared to breed prediction estimates using all 49,213 SNPs (representing the gold standard). Predicting the breed composition of crossbreds required more SNPs than predicting the breed composition of purebreds. SNP-BLUP required ≥3,000 SNPs to predict crossbred breed composition, but only 2,000 SNPs were required to predict purebred breed status. The absolute mean (standard deviation) difference across all panels <2,000 SNPs was 0.091 (0.054) and 0.315 (0.316) when predicting the breed composition of all animals using Admixture and SNP-BLUP, respectively, compared to the gold standard prediction. Discussion: Nevertheless, a negligible absolute mean (standard deviation) difference of 0.009 (0.123) in breed prediction existed between SNP-BLUP and Admixture once ≥3,000 SNPs were considered, indicating that the prediction of breed composition could be readily integrated into SNP-BLUP pipelines used for genomic evaluations, thereby avoiding the necessity for stand-alone software.
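The agreement statistics quoted in this abstract can be sketched as follows. This is an illustrative reading only: it assumes "absolute mean difference" means the mean of absolute differences between paired breed-proportion predictions, which the abstract does not spell out, and the function names and toy values are hypothetical.

```python
from math import sqrt


def mean_abs_diff(pred_x, pred_y):
    """Mean of |x - y| over paired breed-proportion predictions."""
    return sum(abs(x - y) for x, y in zip(pred_x, pred_y)) / len(pred_x)


def rmse(pred_x, pred_y):
    """Root mean square error between paired predictions."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(pred_x, pred_y)) / len(pred_x))


if __name__ == "__main__":
    # Toy predicted proportions of one breed for three animals (made-up numbers).
    method_one = [1.0, 0.5, 0.0]
    method_two = [0.9, 0.5, 0.2]
    print(mean_abs_diff(method_one, method_two), rmse(method_one, method_two))
```

The same two functions apply whether the pair being compared is two methods on the full SNP set or a low-density panel against the full-panel prediction.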
Affiliation(s)
- C. A. Ryan
  - Teagasc, Co. Cork, Ireland
  - Munster Technological University, Cork, Ireland
- T. Pabiou
  - Irish Cattle Breeding Federation, Cork, Ireland
4
Bina M. Defining Candidate Imprinted loci in Bos taurus. Genes (Basel) 2023; 14:1036. [PMID: 37239396 PMCID: PMC10217866 DOI: 10.3390/genes14051036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 04/27/2023] [Accepted: 04/30/2023] [Indexed: 05/28/2023] Open
Abstract
Using a whole-genome assembly of Bos taurus, I applied my bioinformatics strategy to locate candidate imprinting control regions (ICRs) genome-wide. In mammals, genomic imprinting plays essential roles in embryogenesis. In my strategy, peaks in plots mark the locations of known, inferred, and candidate ICRs. Genes in the vicinity of candidate ICRs correspond to potential imprinted genes. By displaying my datasets on the UCSC genome browser, one could view peak positions with respect to genomic landmarks. I give two examples of candidate ICRs in loci that influence spermatogenesis in bulls: CNNM1 and CNR1. I also give examples of candidate ICRs in loci that influence muscle development: SIX1 and BCL6. By examining the ENCODE data reported for mice, I deduced regulatory clues about cattle. I focused on DNase I hypersensitive sites (DHSs). Such sites reveal accessibility of chromatin to regulators of gene expression. For inspection, I chose DHSs in chromatin from mouse embryonic stem cells (ESCs) ES-E14, mesoderm, brain, heart, and skeletal muscle. The ENCODE data revealed that the SIX1 promoter was accessible to the transcription initiation apparatus in mouse ESCs, mesoderm, and skeletal muscles. The data also revealed accessibility of BCL6 locus to regulatory proteins in mouse ESCs and examined tissues.
Affiliation(s)
- Minou Bina
- Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA
5
Stolpovsky YA, Kuznetsov SB, Solodneva EV, Shumov ID. New Cattle Genotyping System Based on DNA Microarray Technology. RUSS J GENET+ 2022. [DOI: 10.1134/s1022795422080099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
6
Gao Z, Zhang Y, Li Z, Zeng Q, Yang F, Song Y, Song Y, He J. Genomic breed composition of Ningxiang pig via different SNP panels. J Anim Physiol Anim Nutr (Berl) 2021; 106:783-791. [PMID: 34260785 DOI: 10.1111/jpn.13603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2021] [Revised: 06/17/2021] [Accepted: 06/21/2021] [Indexed: 11/30/2022]
Abstract
The genomic breed composition (GBC) reflects the genetic relationship between an individual animal and its ancestor breeds in composite or hybrid breeds. It can also be used to estimate the genomic contribution of each ancestor breed to the genome of each individual animal. Using genomic SNP information to estimate Ningxiang pig GBC is of great significance. First, GBC has been widely used in cattle to good effect, but there is almost no experience of its use in Chinese endemic pig breeds. Second, high-density SNP panels are expensive, but costs can be reduced by deploying a relatively small number of highly informative SNPs scattered evenly across the genome. Moreover, the impact of the low-density SNP selection strategy on estimating the GBC of individual animals has not been fully explained. Using SNP data from different databases and organizations, we established reference (N = 2015) and verification (N = 302) data sets. Twelve SNP panels of four sizes (500, 1K, 5K, 10K) were built from the SNPs in the reference data using three selection methods (uniform distribution, maximized Euclidean distance (MED), and random distribution). For each panel, the GBC of Ningxiang pigs in the reference dataset was estimated. Then, combining Shannon entropy and the GBC results, the optimal panel (the 10K SNP panel constructed by the MED method) was selected to estimate the GBC of the verification Ningxiang pigs, which showed that 230 individuals were purebred Ningxiang pigs and that the remaining 72 impure individuals contained 6.44% ancestry related to Rongchang pigs and 4.09% related to Bamaxiang pigs in the verification Ningxiang population. Finally, the genetic structure of the verification population was analyzed by combining the GBC results with multi-dimensional scaling (MDS) analysis and hierarchical cluster analysis. These results showed that (a) GBC could accurately identify purebred Ningxiang pigs and quantify the genomic contribution of each breed to each hybrid animal, and (b) GBC could be used to analyze population genetic structure and to understand the genetic background of Ningxiang pigs. Such findings highlight a variety of opportunities to better protect and identify other endangered local breeds in China facing the same situation as the Ningxiang pig, and provide more accurate, economical, and efficient technical support for GBC estimation in breeding work.
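The abstract does not say exactly how Shannon entropy was combined with the GBC results, so the following is only one plausible reading, sketched under that assumption: compute the entropy of each animal's estimated breed proportions and average it per panel, so that a panel assigning purebred reference animals more decisively yields a lower mean entropy. The function names and the use of base-2 logarithms are assumptions.

```python
from math import log2


def shannon_entropy(gbc):
    """Entropy (bits) of one animal's breed proportions; zero terms are skipped."""
    return -sum(p * log2(p) for p in gbc if p > 0)


def mean_panel_entropy(gbc_rows):
    """Average entropy across animals for one SNP panel's GBC estimates."""
    return sum(shannon_entropy(row) for row in gbc_rows) / len(gbc_rows)


if __name__ == "__main__":
    # Toy GBC estimates (rows: animals, columns: candidate breeds); made-up numbers.
    panel_estimates = [[0.98, 0.01, 0.01], [0.90, 0.06, 0.04]]
    print(mean_panel_entropy(panel_estimates))
```

An animal assigned entirely to one breed scores 0 bits, while an even split between two breeds scores 1 bit.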
Affiliation(s)
- Zhendong Gao
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Yuebo Zhang
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Zhi Li
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Qinhua Zeng
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Fang Yang
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Yuexiang Song
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Yukun Song
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
- Jun He
  - College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
7
Dominik S, Duff CJ, Byrne AI, Daetwyler H, Reverter A. Ultra-small SNP panels to uniquely identify individuals in thousands of samples. ANIMAL PRODUCTION SCIENCE 2021. [DOI: 10.1071/an21123] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/17/2022]
Abstract
Context
Genomic profiles are the only information source that can uniquely identify an individual but have not yet been strongly considered in the context of paddock to plate traceability due to the lack of value proposition.
Aim
The aim of this study was to define the minimum number of single nucleotide polymorphisms (SNP) required to distinguish a unique genotype profile for each individual sample within a large given population. At the same time, ad hoc approaches were explored to reduce SNP density, and therefore, the size of the dataset to improve computing efficiency and storage requirements while maintaining informativeness to distinguish individuals.
Methods
Data for this study included two datasets. One included 78 411 high-density SNP genotypes from commercial Angus cattle and the other included 2107 from a research dataset (1000-bull genome data). In a stepwise approach, different-size SNP panels were explored, with the last step being a successive removal resulting in the smallest set of SNPs that still produced the maximum number of unique genotypes.
Key results
This is the first study to demonstrate, for large datasets, that ultra-small SNP panels with 20–23 SNPs can generate unique genotypes for up to ~80 000 individuals, allowing for 100% matching accuracy.
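The successive-removal step described in the Methods can be illustrated with a small sketch. It is not the authors' procedure: it assumes 0/1/2 genotype codes, tries SNPs in the order given, and drops a SNP whenever the number of unique multi-SNP genotypes is unchanged. The function names are hypothetical.

```python
def count_unique_profiles(genotypes, snp_idx):
    """Number of distinct genotype profiles when only the SNPs in snp_idx are used."""
    return len({tuple(animal[i] for i in snp_idx) for animal in genotypes})


def prune_panel(genotypes, panel):
    """Greedy backward removal keeping the unique-profile count at its starting value."""
    target = count_unique_profiles(genotypes, panel)
    panel = list(panel)
    for snp in list(panel):
        trial = [s for s in panel if s != snp]
        # Keep the removal only if every profile that was unique stays unique.
        if trial and count_unique_profiles(genotypes, trial) == target:
            panel = trial
    return panel


if __name__ == "__main__":
    # Toy data: four animals, three SNPs; the third SNP carries no information.
    animals = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
    print(prune_panel(animals, [0, 1, 2]))
```

A greedy pass like this depends on the order in which SNPs are tried and is not guaranteed to find the smallest possible panel. For scale, 20 SNPs with three genotype classes each allow 3^20 (about 3.5 billion) distinct profiles in principle, far more than the ~80 000 individuals reported.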
Conclusions
Ultra-small SNP panels could provide an efficient method to approach the large-scale task of the traceability of beef products through the beef supply chain.
Implications
Genomic tools could enhance supply-chain traceability.