1
Liu H, Zhu B, Wang T, Dong Y, Ju Y, Li Y, Su W, Zhang R, Dong S, Wang H, Zhou Y, Zhu Y, Wang L, Zhang Z, Zhao P, Zhang S, Guo R, A E, Zhang Y, Liu X, Tamate HB, Liang Q, Ma D, Xing X. Population genomics of sika deer reveals recent speciation and genetic selective signatures during evolution and domestication. BMC Genomics 2025; 26:364. [PMID: 40217144 PMCID: PMC11987376 DOI: 10.1186/s12864-025-11541-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2024] [Accepted: 03/28/2025] [Indexed: 04/15/2025] Open
Abstract
BACKGROUND: Population genomic analysis can reconstruct the phylogenetic relationships and demographic history of a species and identify its genomic signatures of selection. To date, fundamental aspects of sika deer population genomics, such as intraspecific taxonomy, evolutionary history, and adaptive evolution, have not been systematically investigated. Furthermore, accumulating evidence shows that incorrect species delimitation can mislead conservation decisions and even lead to irreversible mistakes for threatened species. RESULTS: In this study, we resequenced 81 wild and 71 domesticated sika deer representing 10 main geographic populations and two farms to clarify the species delimitation, demographic and divergence histories, and adaptive evolution of this species. First, our analyses of whole genomes, Y chromosomes and mitochondrial genomes revealed substantial genetic differentiation between the continental and Japanese lineages of sika deer, representing two phylogenetically distinct species. Second, sika deer in Japan were inferred to have experienced a "divergence-mixing-isolation" evolutionary scenario. Third, we identified four candidate genes (XKR4, NPAS3, CTNNA3, and CNTNAP5) possibly involved in body size regulation of sika deer by selective sweep analysis. Furthermore, we detected two candidate genes (NRP2 and EDIL3), possibly associated with an economically important trait (antler weight), that were under selection during domestication. CONCLUSION: Population genomic analyses revealed that the continental and Japanese lineages represent distinct phylogenetic species. Moreover, our results provide insights into the genetic selection signatures related to body size differences and a valuable genomic resource for future genetic studies and genomics-informed breeding of sika deer.
Affiliation(s)
- Huamiao Liu
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Bo Zhu
- Novogene Bioinformatics Institute, Beijing, 100083, China
- Tianjiao Wang
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Yimeng Dong
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Yan Ju
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Yang Li
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Weilin Su
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Ranran Zhang
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Shiwu Dong
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Hongliang Wang
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Yongna Zhou
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Yanmin Zhu
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Lei Wang
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Zhengyi Zhang
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Pei Zhao
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China
- Shuyan Zhang
- Administration of Zhejiang Qingliangfeng National Nature Reserve, Hangzhou, 310000, China
- Rui Guo
- Administration of Zhejiang Qingliangfeng National Nature Reserve, Hangzhou, 310000, China
- E A
- Sichuan Tiebu Sika Deer Nature Reserve, Aba, 624000, China
- Yuwen Zhang
- Administrative Office of Liugong Island National Forest Park, Weihai, 264200, China
- Xin Liu
- Northeast Forestry University, Harbin, 150006, China
- Qiqi Liang
- Glbizzia Bioinformatics Institute, Beijing, 102208, China.
- De Ma
- Novogene Bioinformatics Institute, Beijing, 100083, China.
- Xiumei Xing
- State Key Laboratory for Molecular Biology of Special Economic Animals, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, 130112, China.
2
Miron-Toruno MF, Morett E, Aguilar-Ordonez I, Reynolds AW. Genome-Wide Selection Scans in Mexican Indigenous Populations Reveal Recent Signatures of Pathogen and Diet Adaptation. Genome Biol Evol 2025; 17:evaf043. [PMID: 40070201 PMCID: PMC11954594 DOI: 10.1093/gbe/evaf043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/05/2025] [Indexed: 04/01/2025] Open
Abstract
Whole-genome scans for natural selection signatures across Mexican indigenous populations remain underrepresented in the literature. Here, we conducted the first comparative analysis of genetic adaptation in Mexican indigenous populations using whole-genome sequencing data from 76 individuals representing 27 different ethnic groups in Mexico. We divided the cohort into northern, central, and southern populations and identified signals of natural selection within and across populations. We find evidence of adaptation to pathogenic environments in all our populations, including significant signatures in the Duffy blood group gene in central Mexican indigenous populations. Despite each region exhibiting unique local adaptation profiles, selection signatures on ARHGAP15, VGLL4, LINGO2, SYNDIG1, and TFAP2B were common to all populations. Our results also suggest that selection signatures falling within enhancers or promoters are usually connected to noncoding features, with notable exceptions like ARHGAP15 and GTDC1. This paper provides new evidence on the selection landscape of Mexican indigenous populations and lays the foundation for additional work on Mexican phenotypic characterization.
Affiliation(s)
- Maria Fernanda Miron-Toruno
- Department of Anthropology, Baylor University, Waco, TX 76706, USA
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX 76107, USA
- Enrique Morett
- Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México (UNAM), México, Morelos 62210, México
- Israel Aguilar-Ordonez
- Jefatura de Supercómputo, Subdirección de Bioinformática, Instituto Nacional de Medicina Genomica (INMEGEN), Ciudad de México 14610, México
- Austin W Reynolds
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX 76107, USA
3
Carvajal-Rodríguez A. iHDSel software: The price equation and the population stability index to detect genomic patterns compatible with selective sweeps. An example with SARS-CoV-2. Biol Methods Protoc 2024; 9:bpae089. [PMID: 39679303 PMCID: PMC11646571 DOI: 10.1093/biomethods/bpae089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 11/19/2024] [Accepted: 11/25/2024] [Indexed: 12/17/2024] Open
Abstract
A large number of methods have been developed and continue to evolve for detecting the signatures of selective sweeps in genomes. Significant advances have been made, including the combination of different statistical strategies and the incorporation of artificial intelligence (machine learning) methods. Despite these advances, several common problems persist, such as the unknown null distribution of the statistics used, necessitating simulations and resampling to assign significance to the statistics. Additionally, it is not always clear how deviations from the specific assumptions of each method might affect the results. In this work, allelic classes of haplotypes are used along with the informational interpretation of the Price equation to design a statistic with a known distribution that can detect genomic patterns caused by selective sweeps. The statistic consists of Jeffreys divergence, also known as the population stability index, applied to the distribution of allelic classes of haplotypes in two samples. Results with simulated data show optimal performance of the statistic in detecting divergent selection. Analysis of real severe acute respiratory syndrome coronavirus 2 genome data also shows that some of the sites playing key roles in the virus's fitness and immune escape capability are detected by the method. The new statistic, called JHAC, is incorporated into the iHDSel (informed HacDivSel) software available at https://acraaj.webs.uvigo.es/iHDSel.html.
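As an illustration of the quantity named in this abstract, the following is a minimal sketch of the Jeffreys divergence (population stability index) between two discrete distributions, J(P, Q) = Σ (p_i − q_i) ln(p_i / q_i). It is not the JHAC statistic or the iHDSel implementation: the function names, the example class counts, and the small `eps` floor used to avoid division by zero are all assumptions made for this sketch.

```python
import math


def normalise(counts):
    """Turn raw class counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def jeffreys_divergence(p, q, eps=1e-12):
    """Jeffreys divergence (symmetrised Kullback-Leibler divergence).

    p, q: sequences of probabilities over the same classes.
    eps:  hypothetical floor applied to each probability so that
          empty classes do not produce log(0) or division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of classes")
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, eps)
        qi = max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


# Hypothetical counts of haplotypes per allelic class in two samples.
sample_a = normalise([40, 30, 30])
sample_b = normalise([10, 30, 60])
print(jeffreys_divergence(sample_a, sample_b))
```

The divergence is zero when both samples have identical class frequencies and grows as the two distributions separate, which is why it can serve as a contrast between two populations.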
Affiliation(s)
- Antonio Carvajal-Rodríguez
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, Vigo, 36310 Spain
4
Yang H, Zhu M, Wang M, Zhou H, Zheng J, Qiu L, Fan W, Yang J, Yu Q, Yang Y, Zhang W. Genome-wide comparative analysis reveals selection signatures for reproduction traits in prolific Suffolk sheep. Front Genet 2024; 15:1404031. [PMID: 38911299 PMCID: PMC11193351 DOI: 10.3389/fgene.2024.1404031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 05/20/2024] [Indexed: 06/25/2024] Open
Abstract
The identification of genome-wide selection signatures can reveal the potential genetic mechanisms involved in the generation of new breeds through natural or artificial selection. In this study, we screened the genome-wide selection signatures of prolific Suffolk sheep, a new strain of multiparous mutton sheep, to identify candidate genes for reproduction traits and unravel the germplasm characteristics and population genetic evolution of this new strain of Suffolk sheep. Whole-genome resequencing was performed at an effective sequencing depth of 20× for genomic diversity and population structure analysis. Additionally, selection signatures were investigated in prolific Suffolk sheep, Suffolk sheep, and Hu sheep using fixation index (FST) and heterozygosity (H) analysis. A total of 5,236.338 Gb of high-quality genomic data and 28,767,952 SNPs were obtained for prolific Suffolk sheep. Moreover, 99 selection signals spanning candidate genes were identified. Twenty-three genes were significantly associated with KEGG pathway and Gene Ontology terms related to reproduction, growth, immunity, and metabolism. Through selective signal analysis, genes such as ARHGEF4, CATIP, and CCDC115 were found to be significantly correlated with reproductive traits in prolific Suffolk sheep and were highly associated with the mTOR signaling pathway, the melanogenic pathway, and the Hippo signaling pathways, among others. These results contribute to the understanding of the evolution of artificial selection in prolific Suffolk sheep and provide candidate reproduction-related genes that may be beneficial for the establishment of new sheep breeds.
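For readers unfamiliar with the two summary statistics named in this abstract, the sketch below computes expected heterozygosity, H = 2p(1 − p), and the textbook fixation index, FST = (H_T − H_S) / H_T, at a single biallelic SNP in two populations. The abstract does not say which estimator or window scheme the authors used, so this is only the basic definition under the assumption of equal sample sizes; the function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Basic FST = (H_T - H_S) / H_T for two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Returns 0.0 when the locus is monomorphic overall (H_T == 0), which is
    a convention chosen for this sketch.
    """
    # Mean within-population heterozygosity.
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    # Heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one SNP in two breeds.
print(fst_two_populations(0.2, 0.8))
```

In a selection scan, values like these are typically averaged over sliding windows, and windows that combine high FST with low heterozygosity in the selected population are flagged as candidates.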
Affiliation(s)
- Hua Yang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- Mengting Zhu
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- College of Animal Science, Xinjiang Agricultural University, Urumqi, China
- Mingyuan Wang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- College of Animal Science and Technology, Shihezi University, Shihezi, China
- Huaqian Zhou
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- College of Animal Science and Technology, Shihezi University, Shihezi, China
- Jingjing Zheng
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- College of Animal Science and Technology, Shihezi University, Shihezi, China
- Lixia Qiu
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- Wenhua Fan
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- College of Animal Science and Technology, Shihezi University, Shihezi, China
- Jinghui Yang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- Qian Yu
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- Yonglin Yang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
- Wenzhe Zhang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi, China
5
Panigrahi M, Rajawat D, Nayak SS, Ghildiyal K, Sharma A, Jain K, Lei C, Bhushan B, Mishra BP, Dutt T. Landmarks in the history of selective sweeps. Anim Genet 2023; 54:667-688. [PMID: 37710403 DOI: 10.1111/age.13355] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/12/2023] [Accepted: 08/28/2023] [Indexed: 09/16/2023]
Abstract
Half a century ago, a seminal article on the hitchhiking effect by Smith and Haigh inaugurated the concept of the selection signature. Selective sweeps are characterised by the rapid spread of an advantageous genetic variant through a population and hence play an important role in shaping evolution and research on genetic diversity. The process by which a beneficial allele arises and becomes fixed in a population, leading to an increase in the frequency of other linked alleles, is known as genetic hitchhiking or genetic draft. Kimura's neutral theory and hitchhiking theory are complementary, with Kimura's neutral evolution as the 'null model' and positive selection as the 'signal'. Both are widely accepted in evolution, especially with genomics enabling precise measurements. Significant advances in genomic technologies, such as next-generation sequencing, high-density SNP arrays and powerful bioinformatics tools, have made it possible to systematically investigate selection signatures in a variety of species. Although the history of selection signatures is relatively recent, progress has been made in the last two decades, owing to the increasing availability of large-scale genomic data and the development of computational methods. In this review, we embark on a journey through the history of research on selective sweeps, ranging from early theoretical work to recent empirical studies that utilise genomic data.
Affiliation(s)
- Manjit Panigrahi
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Divya Rajawat
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Kanika Ghildiyal
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Anurodh Sharma
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Karan Jain
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Chuzhao Lei
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Bharat Bhushan
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Bishnu Prasad Mishra
- Division of Animal Biotechnology, ICAR-National Bureau of Animal Genetic Resources, Karnal, India
- Triveni Dutt
- Livestock Production and Management Section, Indian Veterinary Research Institute, Bareilly, India
6
Matthews DG, Dial TR, Lauder GV. Genes, Morphology, Performance, and Fitness: Quantifying Organismal Performance to Understand Adaptive Evolution. Integr Comp Biol 2023; 63:843-859. [PMID: 37422435 DOI: 10.1093/icb/icad096] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/02/2023] [Revised: 06/06/2023] [Accepted: 06/22/2023] [Indexed: 07/10/2023] Open
Abstract
To understand the complexities of morphological evolution, we must understand the relationships between genes, morphology, performance, and fitness in complex traits. Genomicists have made tremendous progress in finding the genetic basis of many phenotypes, including a myriad of morphological characters. Similarly, field biologists have greatly advanced our understanding of the relationship between performance and fitness in natural populations. However, the connection from morphology to performance has primarily been studied at the interspecific level, meaning that in most cases we lack a mechanistic understanding of how evolutionarily relevant variation among individuals affects organismal performance. Therefore, functional morphologists need methods that will allow for the analysis of fine-grained intraspecific variation in order to close the path from genes to fitness. We suggest three methodological areas that we believe are well suited for this research program and provide examples of how each can be applied within fish model systems to build our understanding of microevolutionary processes. Specifically, we believe that structural equation modeling, biological robotics, and simultaneous multi-modal functional data acquisition will open up fruitful collaborations among biomechanists, evolutionary biologists, and field biologists. It is only through the combined efforts of all three fields that we will understand the connection between evolution (acting at the level of genes) and natural selection (acting on fitness).
Affiliation(s)
- David G Matthews
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
- Terry R Dial
- Department of Biology and Ecology Center, Utah State University, Moab, UT 84322, USA
- Department of Environment and Society, Utah State University, Moab, UT 84322, USA
- George V Lauder
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
7
The study of selection signature and its applications on identification of candidate genes using whole genome sequencing data in chicken - a review. Poult Sci 2023; 102:102657. [PMID: 37054499 PMCID: PMC10123265 DOI: 10.1016/j.psj.2023.102657] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/07/2023] [Revised: 03/09/2023] [Accepted: 03/10/2023] [Indexed: 03/17/2023] Open
Abstract
Chicken is a major source of protein for the increasing human population and is useful for research purposes. There are almost 1,600 distinct regional breeds of chicken across the globe, among which a large body of genetic and phenotypic variation has accumulated due to extensive natural and artificial selection. Moreover, natural selection is a crucial force for animal domestication. Several approaches have been adopted to detect selection signatures in different breeds of chicken using whole genome sequencing (WGS) data, including the integrated haplotype score (iHS), the cross-population extended haplotype homozygosity test (XP-EHH), the fixation index (FST), the cross-population composite likelihood ratio (XP-CLR), nucleotide diversity (Pi), and others. In addition, gene enrichment analyses are utilized to determine KEGG pathways and gene ontology (GO) terms related to traits of interest in chicken. Herein, we review different studies that have adopted diverse approaches to detect selection signatures in different breeds of chicken. This review systematically summarizes different findings on selection signatures and related candidate genes in chickens. Future studies could combine different selection signature approaches to strengthen the quality of the results, thereby providing more affirmative inference. This would further aid in deciphering the importance of selection in chicken conservation for the increasing human population.
8
Strandén I, Kantanen J, Lidauer MH, Mehtiö T, Negussie E. Animal board invited review: Genomic-based improvement of cattle in response to climate change. Animal 2022; 16:100673. [PMID: 36402112 DOI: 10.1016/j.animal.2022.100673] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/13/2022] [Revised: 10/18/2022] [Accepted: 10/20/2022] [Indexed: 12/24/2022] Open
Abstract
Climate change brings challenges to cattle production, such as the need to adapt to new climates and pressure to reduce greenhouse gas emissions (GHG). In general, the improvement of traits in current breeding goals is favourably correlated with the reduction of GHG. Current breeding goals and tools for increasing cattle production efficiency have reduced GHG. The same amount of production can be achieved by a much smaller number of animals. Genomic selection (GS) may offer a cost-effective way of using an efficient breeding approach, even in low- and middle-income countries. As climate change increases the intensity of heatwaves, adaptation to heat stress leads to lower efficiency of production and, thus, is unfavourable to the goal of reducing GHG. Furthermore, there is evidence that heat stress during cow pregnancy can have negative effects on milk production that last for many generations. Both adaptation and reduction of GHG are among the difficult-to-measure traits for which GS is more efficient and suitable than the traditional non-genomic breeding evaluation approach. Nevertheless, the commonly used within-breed selection may be insufficient to meet the new challenges; thus, cross-breeding based on selecting highly efficient and highly adaptive breeds may be needed. Genomic introgression offers an efficient approach for cross-breeding that is expected to provide high genetic progress with a low rate of inbreeding. However, well-adapted breeds may have a small number of animals, which is a source of concern from a genetic biodiversity point of view. Furthermore, low animal numbers also limit the efficiency of genomic introgression. Sustainable cattle production in countries that have already intensified production is likely to emphasise better health, reproduction, feed efficiency, heat stress and other adaptation traits instead of higher production. This may require the application of innovative technologies for phenotyping and further use of new big data techniques to extract information for breeding.
Affiliation(s)
- I Strandén
- Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland.
- J Kantanen
- Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland
- M H Lidauer
- Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland
- T Mehtiö
- Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland
- E Negussie
- Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland
9
Gu X. A Simple Evolutionary Model of Genetic Robustness After Gene Duplication. J Mol Evol 2022; 90:352-361. [PMID: 35913597 DOI: 10.1007/s00239-022-10065-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Accepted: 06/23/2022] [Indexed: 10/16/2022]
Abstract
When a dispensable gene is duplicated (referred to as ancestral dispensability, denoted by O+), genetic buffering and duplicate compensation together maintain the duplicate redundancy, whereas duplicate compensation is the only mechanism when an essential gene is duplicated (referred to as ancestral essentiality, denoted by O-). To investigate these evolutionary scenarios of genetic robustness, I formulated a simple mixture model for analyzing duplicate pairs with one of the following states: double dispensable (DD), semi-dispensable (one dispensable, one essential; DE), or double essential (EE). This model was applied to the yeast duplicate pairs from a whole-genome duplication (WGD) that occurred about 100 million years ago (mya), and the mouse duplicate pairs from a WGD that occurred more than 500 mya. Both case studies revealed that the proportion of essentiality for those duplicates with ancestral essentiality [PE(O-)] was much higher than that for those with ancestral dispensability [PE(O+)]. While it was negligible in the yeast duplicate pairs, PE(O+) (about 20%) was shown to be statistically significant in the mouse duplicate pairs. These findings, together, support the hypothesis that both sub-functionalization and neo-functionalization may play some roles after gene duplication, though the former may be much faster than the latter.
Affiliation(s)
- Xun Gu
- The Laurence H. Baker Center in Bioinformatics on Biological Statistics, Department of Genetics, Development and Cell Biology, Program of Ecological and Evolutionary Biology, Iowa State University, Ames, IA, 50011, USA.
10
Gabián M, Morán P, Saura M, Carvajal-Rodríguez A. Detecting Local Adaptation between North and South European Atlantic Salmon Populations. Biology 2022; 11:933. [PMID: 35741456 PMCID: PMC9219887 DOI: 10.3390/biology11060933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 06/09/2022] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
Pollution and other anthropogenic effects have driven a decrease in Atlantic salmon (Salmo salar) in the Iberian Peninsula. The restocking effort carried out in the 1980s, with salmon from northern latitudes with the aim of mitigating the decline of native populations, failed, probably due to the deficiency in adaptation of foreign salmon from northern Europe to the warm waters of the Iberian Peninsula. This result would imply that the Iberian populations of Atlantic salmon have experienced local adaptation in their past evolutionary history, as has been described for other populations of this species and other salmonids. Local adaptation can occur by divergent selection between environments, favoring the fixation of alleles that increase the fitness of a population in the environment it inhabits relative to other alleles favored in another population. In this work, we compared the genomes of different populations from the Iberian Peninsula (Atlantic and Cantabric basins) and Scotland in order to provide tentative evidence of candidate SNPs responsible for the adaptive differences between populations, which may explain the failures of restocking carried out during the 1980s. For this purpose, the samples were genotyped with a 220,000 high-density SNP array (Affymetrix) specific to Atlantic salmon. Our results revealed potential evidence of local adaptation for North Spanish and Scottish populations. As expected, most differences concerned the comparison of the Iberian Peninsula with Scotland, although there were also differences between Atlantic and Cantabric populations. A high proportion of the genes identified are related to development and cellular metabolism, DNA transcription and anatomical structure. A particular SNP was identified within the NADP-dependent malic enzyme-2 (mMEP-2*), previously reported by independent studies as a candidate for local adaptation in salmon from the Iberian Peninsula. Interestingly, the corresponding SNP within the mMEP-2* region was consistent with a genomic pattern of divergent selection.
Affiliation(s)
- María Gabián
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain; (M.G.); (P.M.)
- Paloma Morán
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain; (M.G.); (P.M.)
- María Saura
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), 28040 Madrid, Spain;
- Antonio Carvajal-Rodríguez
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain; (M.G.); (P.M.)
11
Kumar H, Panigrahi M, Panwar A, Rajawat D, Nayak SS, Saravanan KA, Kaisa K, Parida S, Bhushan B, Dutt T. Machine-Learning Prospects for Detecting Selection Signatures Using Population Genomics Data. J Comput Biol 2022; 29:943-960. [PMID: 35639362 DOI: 10.1089/cmb.2021.0447] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022] Open
Abstract
Natural selection has been given a lot of attention because it relates to the adaptation of populations to their environments, both biotic and abiotic. An allele is selected when it is favored by natural selection. Consequently, the favored allele increases in frequency in the population and neighboring linked variation diminishes, causing so-called selective sweeps. A high-throughput genomic sequence allows one to disentangle the evolutionary forces at play in populations. With the development of high-throughput genome sequencing technologies, it has become easier to detect these selective sweeps/selection signatures. Various methods can be used to detect selective sweeps, from simple implementations using summary statistics to complex statistical approaches. One of the important problems of these statistical models is the potential to provide inaccurate results when their assumptions are violated. The use of machine learning (ML) in population genetics has been introduced as an alternative method of detecting selection by treating the problem of detecting selection signatures as a classification problem. Since the availability of population genomics data is increasing, researchers may incorporate ML into these statistical models to infer signatures of selection with higher predictive accuracy and better resolution. This article describes how ML can be used to aid in detecting and studying natural selection patterns using population genomic data.
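The summary-statistic end of the spectrum described above can be sketched with a windowed diversity scan: a sweep removes linked variation, so a window with unusually low nucleotide diversity is a candidate. This is a minimal illustration only, not a method from the article; the haplotypes, the window size and the "lowest window" rule are all hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean proportion of differing sites over all pairs of haplotypes."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total_diffs / (len(pairs) * n_sites)


def windowed_diversity(haplotypes, window):
    """Diversity in consecutive, non-overlapping windows of `window` sites."""
    n_sites = len(haplotypes[0])
    return [
        nucleotide_diversity([h[start:start + window] for h in haplotypes])
        for start in range(0, n_sites - window + 1, window)
    ]


# Hypothetical toy data: 4 haplotypes, 12 biallelic sites coded 0/1.
# Sites 4-7 are identical in every haplotype, as expected after a sweep.
haplotypes = [
    "010011110101",
    "100111110011",
    "001011111100",
    "110011110110",
]
pi = windowed_diversity(haplotypes, window=4)
# Flag the window(s) with the lowest diversity as sweep candidates.
sweep_candidates = [i for i, value in enumerate(pi) if value == min(pi)]
```

In real data the threshold would be set from a genome-wide distribution or from neutral simulations, which is exactly where the assumption-violation problem mentioned in the abstract arises.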
Affiliation(s)
- Harshit Kumar
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Manjit Panigrahi
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Anuradha Panwar
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Divya Rajawat
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Sonali Sonejita Nayak
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- K A Saravanan
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Kaiho Kaisa
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Subhashree Parida
  - Divisions of Pharmacology and Toxicology, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Bharat Bhushan
  - Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Triveni Dutt
  - Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, India
|
12
|
Silver LW, Cheng Y, Quigley BL, Robbins A, Timms P, Hogg CJ, Belov K. A targeted approach to investigating immune genes of an iconic Australian marsupial. Mol Ecol 2022; 31:3286-3303. [PMID: 35510793 PMCID: PMC9325493 DOI: 10.1111/mec.16493] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/29/2021] [Revised: 03/02/2022] [Accepted: 04/05/2022] [Indexed: 11/30/2022]
Abstract
Disease is a contributing factor to the decline of wildlife populations across the globe. Koalas, iconic yet declining Australian marsupials, are predominantly impacted by two pathogens, Chlamydia and koala retrovirus. Chlamydia is an obligate intracellular bacterium and one of the most widespread sexually transmitted infections in humans worldwide. In koalas, Chlamydia infections can present as asymptomatic or can cause a range of ocular and urogenital disease signs, such as conjunctivitis, cystitis and infertility. In this study, we looked at differences in response to Chlamydia in two northern populations of koalas using a targeted gene sequencing of 1209 immune genes in addition to genome‐wide reduced representation data. We identified two MHC Class I genes associated with Chlamydia disease progression as well as 25 single nucleotide polymorphisms across 17 genes that were associated with resolution of Chlamydia infection. These genes are involved in the innate immune response (TLR5) and defence (TLR5, IFNγ, SERPINE1, STAT2 and STX4). This study deepens our understanding of the role that genetics plays in disease progression in koalas and leads into future work that will use whole genome resequencing of a larger sample set to investigate in greater detail regions identified in this study. Elucidation of the role of host genetics in disease progression and resolution in koalas will directly contribute to better design of Chlamydia vaccines and management of koala populations which have recently been listed as “endangered.”
Affiliation(s)
- Luke W Silver
  - School of Life and Environmental Sciences, The University of Sydney, New South Wales, 2006, Australia
- Yuanyuan Cheng
  - School of Life and Environmental Sciences, The University of Sydney, New South Wales, 2006, Australia
- Bonnie L Quigley
  - Genecology Research Centre, University of the Sunshine Coast, 90 Sippy Downs Drive, Sippy Downs, Queensland, 4556, Australia
  - Provectus Algae Pty Ltd, 5 Bartlett Road, Noosaville, Queensland, 4566, Australia
- Amy Robbins
  - Genecology Research Centre, University of the Sunshine Coast, 90 Sippy Downs Drive, Sippy Downs, Queensland, 4556, Australia
  - Endeavour Veterinary Ecology Pty Ltd, 1695 Pumicestone Road, Toorbul, Queensland, 4510, Australia
- Peter Timms
  - Genecology Research Centre, University of the Sunshine Coast, 90 Sippy Downs Drive, Sippy Downs, Queensland, 4556, Australia
- Carolyn J Hogg
  - School of Life and Environmental Sciences, The University of Sydney, New South Wales, 2006, Australia
- Katherine Belov
  - School of Life and Environmental Sciences, The University of Sydney, New South Wales, 2006, Australia
|
13
|
Nguembang Fadja A, Riguzzi F, Bertorelle G, Trucchi E. Identification of natural selection in genomic data with deep convolutional neural network. BioData Min 2021; 14:51. [PMID: 34863217 PMCID: PMC8642854 DOI: 10.1186/s13040-021-00280-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 10/25/2021] [Indexed: 11/10/2022]
Abstract
Background With the increase in the size of genomic datasets describing variability in populations, extracting relevant information becomes increasingly useful as well as complex. Recently, computational methodologies such as Supervised Machine Learning, and specifically Convolutional Neural Networks, have been proposed to make inferences on demographic and adaptive processes using genomic data. Although it has already been shown to be powerful and efficient in other fields of investigation, Supervised Machine Learning has yet to be fully explored to unfold its potential in evolutionary genomics. Results The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy.
Affiliation(s)
- Arnaud Nguembang Fadja
  - Dipartimento di Matematica e Informatica, University of Ferrara, Via Saragat 1, Ferrara, I-44122, Italy
- Fabrizio Riguzzi
  - Dipartimento di Matematica e Informatica, University of Ferrara, Via Saragat 1, Ferrara, I-44122, Italy
- Giorgio Bertorelle
  - Dipartimento di Scienze della Vita e Biotecnologie, University of Ferrara, Via Luigi Borsari 46, Ferrara, I-44121, Italy
- Emiliano Trucchi
  - Dipartimento di Scienze della Vita e dell'Ambiente, Marche Polytechnic University, Via Brecce Bianche, Ancona, I-60131, Italy
|
14
|
Luqman H, Widmer A, Fior S, Wegmann D. Identifying loci under selection via explicit demographic models. Mol Ecol Resour 2021; 21:2719-2737. [PMID: 33964107 PMCID: PMC8596768 DOI: 10.1111/1755-0998.13415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/20/2020] [Revised: 04/03/2021] [Accepted: 04/28/2021] [Indexed: 01/28/2023]
Abstract
Adaptive genetic variation is a function of both selective and neutral forces. To accurately identify adaptive loci, it is thus critical to account for demographic history. Theory suggests that signatures of selection can be inferred using the coalescent, following the premise that genealogies of selected loci deviate from neutral expectations. Here, we build on this theory to develop an analytical framework to identify loci under selection via explicit demographic models (LSD). Under this framework, signatures of selection are inferred through deviations in demographic parameters, rather than through summary statistics directly, and demographic history is accounted for explicitly. Leveraging the property of demographic models to incorporate directionality, we show that LSD can provide information on the environment in which selection acts on a population. This can prove useful in elucidating the selective processes underlying local adaptation, by characterizing genetic trade-offs and extending the concepts of antagonistic pleiotropy and conditional neutrality from ecological theory to practical application in genomic data. We implement LSD via approximate Bayesian computation and demonstrate, via simulations, that LSD (a) has high power to identify selected loci across a large range of demographic-selection regimes, (b) outperforms commonly applied genome-scan methods under complex demographies and (c) accurately infers the directionality of selection for identified candidates. Using the same simulations, we further characterize the behaviour of isolation-with-migration models conducive to the study of local adaptation under regimes of selection. Finally, we demonstrate an application of LSD by detecting loci and characterizing genetic trade-offs underlying flower colour in Antirrhinum majus.
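The abstract states that LSD is implemented via approximate Bayesian computation (ABC). The sketch below shows only the generic ABC rejection step, on a deliberately simple toy problem (estimating an allele frequency from a sampled count); it is not the LSD framework, and the prior, the simulator, the tolerance and all names are assumptions made for illustration.

```python
import random


def simulate_count(p, n, rng):
    """Simulate the number of derived alleles among n sampled chromosomes."""
    return sum(rng.random() < p for _ in range(n))


def abc_rejection(observed, n, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary statistic is close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a uniform prior on [0, 1]
        if abs(simulate_count(p, n, rng) - observed) <= tolerance:
            accepted.append(p)
    return accepted


rng = random.Random(42)  # fixed seed so the sketch is reproducible
posterior = abc_rejection(observed=30, n=40, n_draws=5000, tolerance=1, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

In a framework of the kind the abstract describes, the parameter would be a demographic quantity estimated per locus, and loci whose posterior departs from the genome-wide estimate would be the candidates for selection.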
Affiliation(s)
- Hirzi Luqman
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Alex Widmer
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Simone Fior
  - Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Daniel Wegmann
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
  - Swiss Institute of Bioinformatics, Fribourg, Switzerland
|
15
|
Meeks KAC, Bentley AR, Adeyemo AA, Rotimi CN. Evolutionary forces in diabetes and hypertension pathogenesis in Africans. Hum Mol Genet 2021; 30:R110-R118. [PMID: 33734377 DOI: 10.1093/hmg/ddaa238] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/29/2020] [Revised: 10/16/2020] [Accepted: 10/22/2020] [Indexed: 11/12/2022]
Abstract
Rates of type 2 diabetes (T2D) and hypertension are increasing rapidly in urbanizing sub-Saharan Africa (SSA). While lifestyle factors drive the increases in T2D and hypertension prevalence, evidence across populations shows that genetic variation, which is driven by evolutionary forces including the natural selection that shaped the human genome, also plays a role. Here we report the evidence for the effect of selection in African genomes on mechanisms underlying T2D and hypertension, including energy metabolism, adipose tissue biology, insulin action and salt retention. Selection effects found for variants in genes PPARA and TCF7L2 may have enabled Africans to respond to nutritional challenges by altering carbohydrate and lipid metabolism. Likewise, African-ancestry-specific characteristics of adipose tissue biology (low visceral adipose tissue [VAT], high intermuscular adipose tissue and a strong association between VAT and adiponectin) may have been selected for in response to nutritional and infectious disease challenges in the African environment. Evidence for selection effects on insulin action, including insulin resistance and secretion, has been found for several genes including MPHOSPH9, TMEM127, ZRANB3 and MC3R. These effects may have been historically adaptive in critical conditions, such as famine and inflammation. A strong correlation between hypertension susceptibility variants and latitude supports the hypothesis of selection for salt retention mechanisms in warm, humid climates. Nevertheless, adaptive genomics studies in African populations are scarce. More work is needed, particularly genomics studies covering the wide diversity of African populations in SSA and Africans in the diaspora, as well as further functional assessment of established risk loci.
Affiliation(s)
- Karlijn A C Meeks
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Amy R Bentley
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Adebowale A Adeyemo
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Charles N Rotimi
  - Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
16
|
Genetic Signatures of Selection for Cashmere Traits in Chinese Goats. Animals (Basel) 2020; 10:ani10101905. [PMID: 33080940 PMCID: PMC7603090 DOI: 10.3390/ani10101905] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/22/2020] [Revised: 10/12/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022]
Abstract
Simple Summary Cashmere goats are a unique husbandry resource in China. These goats are well known for producing the highest cashmere yield and best fiber quality in the world. Although cashmere is highly valued and also known as “fiber gem” and “soft gold”, few studies have examined the genetic basis of cashmere traits in cashmere goats. Here, we identified selection signals by comparing Fst and XP-EHH (the cross-population extended haplotype homozygosity test) of a non-cashmere breed (Huanghuai goat) with those of two cashmere breeds (Inner Mongolia and Liaoning cashmere goats). Two genes (WNT10A and CSN3) were potentially associated with cashmere traits. This information may be valuable for studying the genetic uniqueness of cashmere goats and elucidating the mechanisms underlying cashmere traits in cashmere goats. Abstract Inner Mongolia and Liaoning cashmere goats in China are well known for their cashmere quality and yield. Thus, they are great models for identifying genomic regions associated with cashmere traits. Herein, 53 Inner Mongolia cashmere goats, Liaoning cashmere goats and Huanghuai goats were genotyped, and 53,347 single-nucleotide polymorphisms (SNPs) were produced using the Illumina Caprine 50K SNP chip. Additionally, we identified some positively selected SNPs by analyzing Fst and XP-EHH. The top 5% of SNPs were considered to carry selection signatures. After gene annotation, 222 and 173 candidate genes were identified in Inner Mongolia and Liaoning cashmere goats, respectively. Several genes were related to hair follicle development, such as TRPS1, WDR74, LRRC14, SPTLC3, IGF1R, PADI2, FOXP1, WNT10A and CSN3. Gene enrichment analysis of these cashmere trait-associated genes revealed 67 enriched signaling pathways that mainly participate in hair follicle development and stem cell pluripotency regulation. Furthermore, we identified 20 overlapping genes that were selected in both cashmere goat breeds. Among these overlapping genes, WNT10A and CSN3, which are associated with hair follicle development, are potentially involved in cashmere production. These findings may improve molecular breeding of cashmere goats in the future.
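The Fst scan with a top-5% cutoff described above can be sketched as follows. The abstract does not say which Fst estimator was used, so this example assumes Wright's definition for a biallelic SNP with two equally weighted populations; the allele frequencies are invented and the function names are hypothetical. XP-EHH, which needs phased haplotypes, is not shown.

```python
def fst(p1, p2):
    """Wright's F_ST for one biallelic SNP from allele frequencies in two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # SNP fixed for the same allele in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def top_fraction(scores, fraction=0.05):
    """Indices of the highest-scoring SNPs (at least one), in genomic order."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical allele frequencies at 20 SNPs in a cashmere and a non-cashmere breed.
# The first 19 SNPs differ only slightly; the last is strongly differentiated.
cashmere = [0.30 + 0.02 * i for i in range(19)] + [0.95]
non_cashmere = [0.32 + 0.02 * i for i in range(19)] + [0.05]

scores = [fst(a, b) for a, b in zip(cashmere, non_cashmere)]
candidates = top_fraction(scores, fraction=0.05)
```

With 20 SNPs the 5% cutoff keeps a single marker; on a 50K chip the same rule would keep a few thousand, which are then mapped to nearby genes.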
|
17
|
Torada L, Lorenzon L, Beddis A, Isildak U, Pattini L, Mathieson S, Fumagalli M. ImaGene: a convolutional neural network to quantify natural selection from genomic data. BMC Bioinformatics 2019; 20:337. [PMID: 31757205 PMCID: PMC6873651 DOI: 10.1186/s12859-019-2927-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 05/21/2019] [Accepted: 05/31/2019] [Indexed: 12/25/2022]
Abstract
BACKGROUND The genetic bases of many complex phenotypes are still largely unknown, mostly due to the polygenic nature of the traits and the small effect of each associated mutation. An alternative approach to classic association studies to determining such genetic bases is an evolutionary framework. As sites targeted by natural selection are likely to harbor important functionalities for the carrier, the identification of selection signatures in the genome has the potential to unveil the genetic mechanisms underpinning human phenotypes. Popular methods of detecting such signals rely on compressing genomic information into summary statistics, resulting in the loss of information. Furthermore, few methods are able to quantify the strength of selection. Here we explored the use of deep learning in evolutionary biology and implemented a program, called ImaGene, to apply convolutional neural networks on population genomic data for the detection and quantification of natural selection. RESULTS ImaGene enables genomic information from multiple individuals to be represented as abstract images. Each image is created by stacking aligned genomic data and encoding distinct alleles into separate colors. To detect and quantify signatures of positive selection, ImaGene implements a convolutional neural network which is trained using simulations. We show how the method implemented in ImaGene can be affected by data manipulation and learning strategies. In particular, we show how sorting images by row and column leads to accurate predictions. We also demonstrate how the misspecification of the correct demographic model for producing training data can influence the quantification of positive selection. We finally illustrate an approach to estimate the selection coefficient, a continuous variable, using multiclass classification techniques. 
CONCLUSIONS While the use of deep learning in evolutionary genomics is in its infancy, here we demonstrated its potential to detect informative patterns from large-scale genomic data. We implemented methods to process genomic data for deep learning in a user-friendly program called ImaGene. The joint inference of the evolutionary history of mutations and their functional impact will facilitate mapping studies and provide novel insights into the molecular mechanisms associated with human phenotypes.
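The image representation and the row and column sorting mentioned in the results can be illustrated with a small matrix. This is a sketch of the general idea only and does not use ImaGene's own code or API; the sorting criteria (haplotype frequency for rows, derived-allele count for columns), the data and the function names are assumptions.

```python
from collections import Counter


def sort_rows_by_frequency(image):
    """Order haplotypes (rows) from most to least common; ties keep their input order."""
    counts = Counter(tuple(row) for row in image)
    return sorted(image, key=lambda row: -counts[tuple(row)])


def sort_columns_by_frequency(image):
    """Order sites (columns) by decreasing derived-allele count."""
    n_cols = len(image[0])
    col_sums = [sum(row[c] for row in image) for c in range(n_cols)]
    order = sorted(range(n_cols), key=lambda c: -col_sums[c])
    return [[row[c] for c in order] for row in image]


# Hypothetical aligned data: rows are haplotypes, columns are sites (1 = derived allele).
image = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
]
sorted_image = sort_columns_by_frequency(sort_rows_by_frequency(image))
```

Sorting removes the arbitrary ordering of sampled individuals, so that similar genealogies produce similar images regardless of how the rows happened to be listed.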
Affiliation(s)
- Luis Torada
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
- Lucrezia Lorenzon
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, piazza Leonardo da Vinci 32, Milan, 20133 Italy
- Alice Beddis
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
- Ulas Isildak
  - Department of Biological Sciences, Middle East Technical University, METU Üniversiteler Mah. Dumlupınar Blv. No:1, Ankara, 06800 Çankaya Turkey
- Linda Pattini
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, piazza Leonardo da Vinci 32, Milan, 20133 Italy
- Sara Mathieson
  - Department of Computer Science, Swarthmore College, 500 College Ave, Swarthmore, 19081 PA USA
- Matteo Fumagalli
  - Department of Life Sciences, Silwood Park campus, Imperial College London, Buckhurst Road, Ascot, SL5 7PY UK
|
18
|
Vergara-Lope A, Jabalameli MR, Horscroft C, Ennis S, Collins A, Pengelly RJ. Linkage disequilibrium maps for European and African populations constructed from whole genome sequence data. Sci Data 2019; 6:208. [PMID: 31624256 PMCID: PMC6797713 DOI: 10.1038/s41597-019-0227-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/07/2019] [Accepted: 08/29/2019] [Indexed: 11/08/2022]
Abstract
Quantification of linkage disequilibrium (LD) patterns in the human genome is essential for genome-wide association studies, selection signature mapping and studies of recombination. Whole genome sequence (WGS) data provides optimal source data for this quantification as it is free from biases introduced by the design of array genotyping platforms. The Malécot-Morton model of LD allows the creation of a cumulative map for each chromosome, analogous to an LD form of a linkage map. Here we report LD maps generated from WGS data for a large population of European ancestry, as well as populations of Baganda, Ethiopian and Zulu ancestry. We achieve high average genetic marker densities of 2.3-4.6/kb. These maps show good agreement with prior, low resolution maps and are consistent between populations. Files are provided in BED format to allow researchers to readily utilise this resource.
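As a reminder of what quantifying LD between two markers involves, the sketch below computes the common pairwise statistic r² from phased haplotypes. This is not the Malécot-Morton model used to build the maps, which fits the decline of association with distance; it only shows the underlying pairwise quantity, with invented data and a hypothetical function name.

```python
def r_squared(haplotypes, i, j):
    """Squared correlation (r^2) between alleles at sites i and j of phased 0/1 haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # allele frequency at site i
    p_b = sum(h[j] for h in haplotypes) / n          # allele frequency at site j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic site: r^2 is undefined, reported as 0 in this sketch
    return d * d / denom


# Hypothetical phased haplotypes at three sites.
haplotypes = [
    (1, 1, 0),
    (1, 1, 1),
    (0, 0, 0),
    (0, 0, 1),
]
ld_linked = r_squared(haplotypes, 0, 1)    # sites 0 and 1 always co-occur
ld_unlinked = r_squared(haplotypes, 0, 2)  # sites 0 and 2 are independent here
```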
Affiliation(s)
- Alejandra Vergara-Lope
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
- M Reza Jabalameli
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
- Clare Horscroft
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
- Sarah Ennis
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
- Andrew Collins
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
- Reuben J Pengelly
  - Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Southampton, UK
|