51
Bolormaa S, Gore K, van der Werf JHJ, Hayes BJ, Daetwyler HD. Design of a low-density SNP chip for the main Australian sheep breeds and its effect on imputation and genomic prediction accuracy. Anim Genet 2015; 46:544-56. [DOI: 10.1111/age.12340] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Accepted: 07/03/2015] [Indexed: 12/20/2022]
Affiliation(s)
- S. Bolormaa
  - AgriBio; Centre for AgriBioscience; DEDJTR; Bundoora VIC 3083 Australia
  - Cooperative Research Centre for Sheep Industry Innovation; Armidale NSW 2351 Australia
- K. Gore
  - School of Environmental and Rural Science; University of New England; Armidale NSW 2351 Australia
- J. H. J. van der Werf
  - Cooperative Research Centre for Sheep Industry Innovation; Armidale NSW 2351 Australia
  - School of Environmental and Rural Science; University of New England; Armidale NSW 2351 Australia
- B. J. Hayes
  - AgriBio; Centre for AgriBioscience; DEDJTR; Bundoora VIC 3083 Australia
  - Cooperative Research Centre for Sheep Industry Innovation; Armidale NSW 2351 Australia
  - School of Applied Systems Biology; La Trobe University; Bundoora VIC 3086 Australia
- H. D. Daetwyler
  - AgriBio; Centre for AgriBioscience; DEDJTR; Bundoora VIC 3083 Australia
  - Cooperative Research Centre for Sheep Industry Innovation; Armidale NSW 2351 Australia
  - School of Applied Systems Biology; La Trobe University; Bundoora VIC 3086 Australia
52
Balick DJ, Do R, Cassa CA, Reich D, Sunyaev SR. Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck. PLoS Genet 2015; 11:e1005436. [PMID: 26317225 PMCID: PMC4552954 DOI: 10.1371/journal.pgen.1005436] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 11/13/2014] [Accepted: 07/09/2015] [Indexed: 11/30/2022] Open
Abstract
Population bottlenecks followed by re-expansions have been common throughout the history of many populations. The response of alleles under selection to such demographic perturbations has been a subject of great interest in population genetics. On the basis of theoretical analysis and computer simulations, we suggest that this response qualitatively depends on dominance. The number of dominant or additive deleterious alleles per haploid genome is expected to be slightly increased following the bottleneck and re-expansion. In contrast, the number of completely or partially recessive alleles should be sharply reduced. Changes of population size expose differences between recessive and additive selection, potentially providing insight into the prevalence of dominance in natural populations. Specifically, we use a simple statistic, $B_R \equiv \sum_i x_i^{\mathrm{pop1}} / \sum_j x_j^{\mathrm{pop2}}$, where $x_i$ represents the derived allele frequency, to compare the number of mutations in different populations, and detail its functional dependence on the strength of selection and the intensity of the population bottleneck. We also provide empirical evidence showing that gene sets associated with autosomal recessive disease in humans may have a $B_R$ indicative of recessive selection. Together, these theoretical predictions and empirical observations show that complex demographic history may facilitate rather than impede inference of parameters of natural selection.
Dominance has played a central role in classical genetics since its inception. However, the effect of dominance introduces substantial technical complications into theoretical models describing dynamics of alleles in populations. As a result, dominance is often ignored in population genetic models. Statistical tests for selection built on these models do not discriminate between recessive and additive alleles. We show that historical changes in population size can provide a way to differentiate between recessive and additive selection. Our analysis compares two sub-populations with different demographic histories. The history of our own species provides plenty of examples of sub-populations that went through population bottlenecks followed by re-expansions. We show that demographic differences, which generally complicate the analysis, can instead aid in the inference of features of natural selection.
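The ratio statistic defined in the abstract can be illustrated with a minimal Python sketch. It assumes the derived allele frequencies for each population are already available as plain lists; the function name `burden_ratio` and the example frequencies are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def burden_ratio(freqs_pop1: Sequence[float], freqs_pop2: Sequence[float]) -> float:
    """Sum of derived allele frequencies in pop1 divided by the sum in pop2."""
    total_pop1 = sum(freqs_pop1)
    total_pop2 = sum(freqs_pop2)
    if total_pop2 == 0:
        raise ValueError("sum of frequencies in pop2 is zero; ratio undefined")
    return total_pop1 / total_pop2


if __name__ == "__main__":
    # Hypothetical derived allele frequencies at a shared set of sites.
    bottlenecked = [0.10, 0.00, 0.25, 0.05]
    reference = [0.05, 0.10, 0.20, 0.15]
    print(burden_ratio(bottlenecked, reference))
```

A value below 1 in this sketch means fewer derived alleles per haploid genome in the first population than in the second; how to read that in terms of dominance is the subject of the paper itself.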
Affiliation(s)
- Daniel J. Balick
  - Division of Genetics, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Ron Do
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
  - The Center for Statistical Genetics, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Christopher A. Cassa
  - Division of Genetics, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- David Reich
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America
- Shamil R. Sunyaev
  - Division of Genetics, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
53
Meyer WK, Venkat A, Kermany AR, van de Geijn B, Zhang S, Przeworski M. Evolutionary history inferred from the de novo assembly of a nonmodel organism, the blue-eyed black lemur. Mol Ecol 2015. [PMID: 26198179 PMCID: PMC4557055 DOI: 10.1111/mec.13327] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022]
Abstract
Lemurs, the living primates most distantly related to humans, demonstrate incredible diversity in behaviour, life history patterns and adaptive traits. Although many lemur species are endangered within their native Madagascar, there is no high-quality genome assembly from this taxon, limiting population and conservation genetic studies. One critically endangered lemur is the blue-eyed black lemur Eulemur flavifrons. This species is fixed for blue irises, a convergent trait that evolved at least four times in primates and was subject to positive selection in humans, where 5′ regulatory variation of OCA2 explains most of the brown/blue eye colour differences. We built a de novo genome assembly for E. flavifrons, providing the most complete lemur genome to date, and a high confidence consensus sequence for close sister species E. macaco, the (brown-eyed) black lemur. From diversity and divergence patterns across the genomes, we estimated a recent split time of the two species (160 Kya) and temporal fluctuations in effective population sizes that accord with known environmental changes. By looking for regions of unusually low diversity, we identified potential signals of directional selection in E. flavifrons at MITF, a melanocyte development gene that regulates OCA2 and has previously been associated with variation in human iris colour, as well as at several other genes involved in melanin biosynthesis in mammals. Our study thus illustrates how whole-genome sequencing of a few individuals can illuminate the demographic and selection history of nonmodel species.
Affiliation(s)
- Wynn K Meyer
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Aarti Venkat
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Amir R Kermany
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Howard Hughes Medical Institute, University of Chicago, Chicago, IL, 60637, USA
- Bryce van de Geijn
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, 60637, USA
- Sidi Zhang
  - Biological Sciences Collegiate Division, University of Chicago, Chicago, IL, 60637, USA
- Molly Przeworski
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
  - Howard Hughes Medical Institute, University of Chicago, Chicago, IL, 60637, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, 60637, USA
54
Zhang Q, Calus MPL, Guldbrandtsen B, Lund MS, Sahana G. Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds. BMC Genet 2015. [PMID: 26195126 PMCID: PMC4509611 DOI: 10.1186/s12863-015-0227-7] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Indexed: 11/10/2022] Open
Abstract
Background: Levels of inbreeding in cattle populations have increased in the past due to the use of a limited number of bulls for artificial insemination. High levels of inbreeding lead to reduced genetic diversity and inbreeding depression. Various estimators based on different sources, e.g., pedigree or genomic data, have been used to estimate inbreeding coefficients in cattle populations. However, the comparative advantage of using full sequence data to assess inbreeding is unknown. We used pedigree and genomic data at different densities, from 50k to full sequence variants, to compare how different methods performed for the estimation of inbreeding levels in three different cattle breeds.
Results: Five different estimates of inbreeding were calculated and compared in this study: pedigree-based inbreeding coefficients ($F_{PED}$); run of homozygosity (ROH)-based inbreeding coefficients ($F_{ROH}$); genomic relationship matrix (GRM)-based inbreeding coefficients ($F_{GRM}$); inbreeding coefficients based on excess of homozygosity ($F_{HOM}$); and correlation of uniting gametes ($F_{UNI}$). Estimates using ROH provided direct estimates of the levels of autozygosity in the current populations and are free of the effects of allele frequencies and incomplete pedigrees, which may increase inaccuracy in the estimation of inbreeding. The highest correlations were observed between $F_{ROH}$ estimated from the full sequence variants and $F_{ROH}$ estimated from 50k SNP (single nucleotide polymorphism) genotypes. The estimator based on the correlation between uniting gametes ($F_{UNI}$) using full genome sequences was also strongly correlated with $F_{ROH}$ detected from sequence data.
Conclusions: Estimates based on ROH directly reflected levels of homozygosity and were not influenced by allele frequencies, unlike the three other estimates evaluated ($F_{GRM}$, $F_{HOM}$ and $F_{UNI}$), which depended on estimated allele frequencies. $F_{PED}$ suffered from limited pedigree depth. Marker density affects ROH estimation. Detecting ROH based on 50k chip data was observed to give estimates similar to ROH from sequence data. In the absence of full sequence data, ROH based on 50k can be used to assess homozygosity levels in individuals. However, genotypes denser than 50k are required to accurately detect short ROH that are most likely identical by descent (IBD).
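The abstract does not spell out how the ROH-based coefficient is computed. A common definition, assumed here, is the total length of an individual's ROH segments divided by the length of the genome covered by markers. The sketch below follows that definition; the function name `f_roh`, the segment format, and the example numbers are hypothetical, and the segments are assumed to be non-overlapping.

```python
from typing import Iterable, Tuple


def f_roh(
    roh_segments: Iterable[Tuple[int, int]],
    genome_length_bp: int,
    min_length_bp: int = 0,
) -> float:
    """Fraction of the covered genome lying in ROH of at least min_length_bp.

    roh_segments holds (start_bp, end_bp) pairs, assumed non-overlapping.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = 0
    for start, end in roh_segments:
        length = end - start
        if length < 0:
            raise ValueError("segment end precedes start")
        if length >= min_length_bp:
            covered += length
    return covered / genome_length_bp


if __name__ == "__main__":
    # Hypothetical individual: two ROH on a 100 Mb covered genome.
    segments = [(0, 5_000_000), (10_000_000, 12_000_000)]
    print(f_roh(segments, 100_000_000))
    # Counting only runs of at least 4 Mb drops the shorter segment.
    print(f_roh(segments, 100_000_000, min_length_bp=4_000_000))
```

The `min_length_bp` threshold mirrors the point made in the conclusions: which runs are detectable, and therefore counted, depends on marker density.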
Affiliation(s)
- Qianqian Zhang
  - Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, DK-8830, Denmark.
  - Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, Wageningen, 6700 AH, The Netherlands.
- Mario P L Calus
  - Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, Wageningen, 6700 AH, The Netherlands.
- Bernt Guldbrandtsen
  - Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, DK-8830, Denmark.
- Mogens S Lund
  - Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, DK-8830, Denmark.
- Goutam Sahana
  - Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, DK-8830, Denmark.
55
Fuller ZL, Niño EL, Patch HM, Bedoya-Reina OC, Baumgarten T, Muli E, Mumoki F, Ratan A, McGraw J, Frazier M, Masiga D, Schuster S, Grozinger CM, Miller W. Genome-wide analysis of signatures of selection in populations of African honey bees (Apis mellifera) using new web-based tools. BMC Genomics 2015; 16:518. [PMID: 26159619 PMCID: PMC4496815 DOI: 10.1186/s12864-015-1712-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/26/2014] [Accepted: 06/22/2015] [Indexed: 11/10/2022] Open
Abstract
Background: With the development of inexpensive, high-throughput sequencing technologies, it has become feasible to examine questions related to population genetics and molecular evolution of non-model species in their ecological contexts on a genome-wide scale. Here, we employed a newly developed suite of integrated, web-based programs to examine population dynamics and signatures of selection across the genome using several well-established tests, including $F_{ST}$, pN/pS, and McDonald-Kreitman. We applied these techniques to study populations of honey bees (Apis mellifera) in East Africa. In Kenya, there are several described A. mellifera subspecies, which are thought to be localized to distinct ecological regions.
Results: We performed whole genome sequencing of 11 worker honey bees from apiaries distributed throughout Kenya and identified 3.6 million putative single-nucleotide polymorphisms. The dense coverage allowed us to apply several computational procedures to study population structure and the evolutionary relationships among the populations, and to detect signs of adaptive evolution across the genome. While there is considerable gene flow among the sampled populations, there are clear distinctions between populations from the northern desert region and those from the temperate, savannah region. We identified several genes showing population genetic patterns consistent with positive selection within African bee populations, and between these populations and European A. mellifera or Asian Apis florea.
Conclusions: These results lay the groundwork for future studies of adaptive ecological evolution in honey bees, and demonstrate the use of new, freely available web-based tools and workflows (http://usegalaxy.org/r/kenyanbee) that can be applied to any model system with genomic information.
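Of the tests named in the abstract, the McDonald-Kreitman comparison is easy to sketch. The Python below is an illustration only and is not the web-based tool the authors describe: it takes the four counts of the standard 2x2 table and returns the neutrality index together with the derived quantity alpha, one minus that index. The function name `mk_summary` and the counts in the example are hypothetical.

```python
from typing import Tuple


def mk_summary(dn: int, ds: int, pn: int, ps: int) -> Tuple[float, float]:
    """Return (neutrality_index, alpha) from McDonald-Kreitman counts.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if min(dn, ds, pn, ps) <= 0:
        raise ValueError("all four counts must be positive for these ratios")
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1.0 - neutrality_index
    return neutrality_index, alpha


if __name__ == "__main__":
    # Hypothetical counts for a single gene.
    ni, alpha = mk_summary(dn=10, ds=20, pn=5, ps=20)
    print(ni, alpha)
```

A neutrality index below 1 (alpha above 0) indicates an excess of nonsynonymous divergence relative to polymorphism, the pattern usually read as consistent with positive selection. A significance test on the 2x2 table would still be needed and is omitted here.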
Affiliation(s)
- Zachary L Fuller
  - Department of Biology, Pennsylvania State University, University Park, PA, USA.
- Elina L Niño
  - Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, PA, USA.
- Harland M Patch
  - Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, PA, USA.
- Oscar C Bedoya-Reina
  - Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA, USA.
- Tracey Baumgarten
  - Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, PA, USA.
- Elliud Muli
  - Department of Biological Sciences, South Eastern Kenya University (SEKU), P.O. Box 170-90200, Kitui, Kenya.
- Fiona Mumoki
  - The International Center of Insect Physiology and Ecology (icipe), PO Box 30772-00100, Nairobi, Kenya.
- Aakrosh Ratan
  - Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA, USA.
- John McGraw
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA.
- Maryann Frazier
  - Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, PA, USA.
- Daniel Masiga
  - The International Center of Insect Physiology and Ecology (icipe), PO Box 30772-00100, Nairobi, Kenya.
- Stephen Schuster
  - Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA, USA.
- Christina M Grozinger
  - Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, PA, USA.
- Webb Miller
  - Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA, USA
56
Beynon SE, Slavov GT, Farré M, Sunduimijid B, Waddams K, Davies B, Haresign W, Kijas J, MacLeod IM, Newbold CJ, Davies L, Larkin DM. Population structure and history of the Welsh sheep breeds determined by whole genome genotyping. BMC Genet 2015; 16:65. [PMID: 26091804 PMCID: PMC4474581 DOI: 10.1186/s12863-015-0216-x] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 11/28/2014] [Accepted: 05/13/2015] [Indexed: 11/10/2022] Open
Abstract
Background: One of the most economically important areas within the Welsh agricultural sector is sheep farming, contributing around £230 million to the UK economy annually. Phenotypic selection over several centuries has generated a number of native sheep breeds, which are presumably adapted to the diverse and challenging landscape of Wales. Little is known about the history, genetic diversity and relationships of these breeds with other European breeds. We genotyped 353 individuals from 18 native Welsh sheep breeds using the Illumina OvineSNP50 array and characterised the genetic structure of these breeds. Our genotyping data were then combined with, and compared to, those from a set of 74 worldwide breeds, previously collected during the International Sheep Genome Consortium HapMap project.
Results: Model-based clustering of the Welsh and European breeds indicated shared ancestry. This finding was supported by multidimensional scaling analysis (MDS), which revealed separation of the European, African and Asian breeds. As expected, the commercial Texel and Merino breeds appeared to have extensive co-ancestry with most European breeds. Consistently high levels of haplotype sharing were observed between native Welsh and other European breeds. The Welsh breeds did not, however, form a genetically homogeneous group, with pairwise $F_{ST}$ between breeds averaging 0.107 and ranging between 0.020 and 0.201. Four subpopulations were identified within the 18 native breeds, with high homogeneity observed amongst the majority of mountain breeds. Recent effective population sizes estimated from linkage disequilibrium ranged from 88 to 825.
Conclusions: Welsh breeds are highly diverse with low to moderate effective population sizes and form at least four distinct genetic groups. Our data suggest common ancestry between the native Welsh and European breeds. These findings provide the basis for future genome-wide association studies and a first step towards developing genomics assisted breeding strategies in the UK.
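The abstract reports pairwise F_ST values but does not name the estimator used. As a rough illustration only, the sketch below uses a simple heterozygosity-based form for two equally weighted populations, (H_T - H_S) / H_T, with heterozygosities summed over loci before taking the ratio. This is an assumed stand-in and may differ from the authors' method; the function name `fst_two_pops` and the allele frequencies are hypothetical.

```python
from typing import Sequence


def fst_two_pops(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Heterozygosity-based F_ST for two populations from allele frequencies.

    p1[k] and p2[k] are frequencies of the same allele at locus k.
    """
    if len(p1) != len(p2):
        raise ValueError("frequency lists must cover the same loci")
    h_s_total = 0.0  # mean within-population expected heterozygosity
    h_t_total = 0.0  # expected heterozygosity of the pooled population
    for a, b in zip(p1, p2):
        h_s_total += (2 * a * (1 - a) + 2 * b * (1 - b)) / 2
        p_bar = (a + b) / 2
        h_t_total += 2 * p_bar * (1 - p_bar)
    if h_t_total == 0:
        raise ValueError("no variation at any locus; F_ST undefined")
    return (h_t_total - h_s_total) / h_t_total


if __name__ == "__main__":
    # Hypothetical allele frequencies at three SNPs in two breeds.
    breed_a = [0.10, 0.50, 0.80]
    breed_b = [0.30, 0.45, 0.60]
    print(fst_two_pops(breed_a, breed_b))
```

Identical frequencies give 0 and fixed differences give 1, which brackets the 0.020 to 0.201 range reported for the Welsh breeds.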
Affiliation(s)
- Sarah E Beynon
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- Gancho T Slavov
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- Marta Farré
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
  - Royal Veterinary College, University of London, Royal College Street, London, NW1 0TU, UK.
- Bolormaa Sunduimijid
  - Victorian Department of Environment and Primary Industries, Bundoora, VIC, 3083, Australia.
- Kate Waddams
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- Brian Davies
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- William Haresign
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- James Kijas
  - Commonwealth Scientific and Industrial Research Organisation (CSIRO), 306 Carmody Road, St Lucia, QLD, 4067, Australia.
- Iona M MacLeod
  - Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia.
- C Jamie Newbold
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
- Lynfa Davies
  - Hybu Cig Cymru, Meat Promotion Wales, Tŷ Rheidol, Parc Merlin, Aberystwyth, SY23 3FF, UK.
- Denis M Larkin
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Penglais, Aberystwyth, Ceredigion, SY23 3DA, UK.
  - Royal Veterinary College, University of London, Royal College Street, London, NW1 0TU, UK.
57
Bosse M, Megens HJ, Madsen O, Crooijmans RPMA, Ryder OA, Austerlitz F, Groenen MAM, de Cara MAR. Using genome-wide measures of coancestry to maintain diversity and fitness in endangered and domestic pig populations. Genome Res 2015; 25:970-81. [PMID: 26063737 PMCID: PMC4484394 DOI: 10.1101/gr.187039.114] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 11/12/2014] [Accepted: 05/13/2015] [Indexed: 01/08/2023]
Abstract
Conservation and breeding programs aim at maintaining the most diversity, thereby avoiding deleterious effects of inbreeding while maintaining enough variation from which traits of interest can be selected. Theoretically, the most diversity is maintained using optimal contributions based on many markers to calculate coancestries, but this can decrease fitness by maintaining linked deleterious variants. The heterogeneous patterns of coancestry displayed in pigs make them an excellent model to test these predictions. We propose methods to measure coancestry and fitness from resequencing data and use them in population management. We analyzed the resequencing data of Sus cebifrons, a highly endangered porcine species from the Philippines, and genotype data from the Pietrain domestic breed. By analyzing the demographic history of Sus cebifrons, we inferred two past bottlenecks that resulted in some inbreeding load. In Pietrain, we analyzed signatures of selection possibly associated with commercial traits. We also simulated the management of each population to assess the performance of different optimal contribution methods to maintain diversity, fitness, and selection signatures. Maximum genetic diversity was maintained using marker-by-marker coancestry, and least using genealogical coancestry. Using a measure of coancestry based on shared segments of the genome achieved the best results in terms of diversity and fitness. However, this segment-based management eliminated signatures of selection. We demonstrate that maintaining both diversity and fitness depends on the genomic distribution of deleterious variants, which is shaped by demographic and selection histories. Our findings show the importance of genomic and next-generation sequencing information in the optimal design of breeding or conservation programs.
Affiliation(s)
- Mirte Bosse
  - ABGC Wageningen University, 6700 Wageningen, The Netherlands
- Ole Madsen
  - ABGC Wageningen University, 6700 Wageningen, The Netherlands
- Oliver A Ryder
  - San Diego Zoo Institute for Conservation Research, Escondido, California 92027, USA
58
Deinum EE, Halligan DL, Ness RW, Zhang YH, Cong L, Zhang JX, Keightley PD. Recent Evolution in Rattus norvegicus Is Shaped by Declining Effective Population Size. Mol Biol Evol 2015; 32:2547-58. [PMID: 26037536 PMCID: PMC4576703 DOI: 10.1093/molbev/msv126] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022] Open
Abstract
The brown rat, Rattus norvegicus, is both a notorious pest and a frequently used model in biomedical research. By analyzing genome sequences of 12 wild-caught brown rats from their presumed ancestral range in NE China, along with the sequence of a black rat, Rattus rattus, we investigate the selective and demographic forces shaping variation in the genome. We estimate that the recent effective population size ($N_e$) of this species is $1.24 \times 10^5$, based on silent site diversity. We compare patterns of diversity in these genomes with patterns in multiple genome sequences of the house mouse (Mus musculus castaneus), which has a much larger $N_e$. This reveals an important role for variation in the strength of genetic drift in mammalian genome evolution. By a Pairwise Sequentially Markovian Coalescent analysis of demographic history, we infer that there has been a recent population size bottleneck in wild rats, which we date to approximately 20,000 years ago. Consistent with this, wild rat populations have experienced an increased flux of mildly deleterious mutations, which segregate at higher frequencies in protein-coding genes and conserved noncoding elements. This leads to negative estimates of the rate of adaptive evolution (α) in proteins and conserved noncoding elements, a result which we discuss in relation to the strongly positive estimates observed in wild house mice. As a consequence of the population bottleneck, wild rats also show a markedly slower decay of linkage disequilibrium with physical distance than wild house mice.
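The abstract does not give the formula or the mutation rate behind the effective population size estimate. Under the standard neutral model for a diploid population, which is assumed here and not confirmed by the abstract, nucleotide diversity at silent sites relates to effective population size as shown below. The symbols $\pi_{\mathrm{silent}}$ (silent-site diversity) and $\mu$ (per-site, per-generation mutation rate) are introduced only for illustration.

```latex
\pi_{\mathrm{silent}} \approx 4 N_e \mu
\quad\Longrightarrow\quad
N_e \approx \frac{\pi_{\mathrm{silent}}}{4\mu}
```

With purely hypothetical values $\pi_{\mathrm{silent}} = 0.002$ and $\mu = 5 \times 10^{-9}$, this gives $N_e \approx 0.002 / (2 \times 10^{-8}) = 10^5$, the same order of magnitude as the estimate reported above.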
Affiliation(s)
- Eva E Deinum
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Daniel L Halligan
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Rob W Ness
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Yao-Hua Zhang
  - State Key Laboratory of Integrated Management of Pest Insects and Rodents in Agriculture, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Lin Cong
  - Institute of Plant Protection, Heilongjiang Academy of Agricultural Sciences, Harbin, China
- Jian-Xu Zhang
  - State Key Laboratory of Integrated Management of Pest Insects and Rodents in Agriculture, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Peter D Keightley
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
59
Mészáros G, Boison SA, Pérez O'Brien AM, Ferenčaković M, Curik I, Da Silva MVB, Utsunomiya YT, Garcia JF, Sölkner J. Genomic analysis for managing small and endangered populations: a case study in Tyrol Grey cattle. Front Genet 2015; 6:173. [PMID: 26074948 PMCID: PMC4443735 DOI: 10.3389/fgene.2015.00173] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 11/15/2014] [Accepted: 04/20/2015] [Indexed: 11/30/2022] Open
Abstract
Analysis of genomic data is increasingly becoming part of the livestock industry. Therefore, the routine collection of genomic information would be an invaluable resource for effective management of breeding programs in small, endangered populations. The objective of the paper was to demonstrate how genomic data could be used to analyse (1) linkage disequilibrium (LD), LD decay and the effective population size ($N_{eLD}$); (2) inbreeding level and effective population size ($N_{eROH}$) based on runs of homozygosity (ROH); (3) prediction of genomic breeding values (GEBV) using small within-breed reference populations and genomic information from other breeds. The Tyrol Grey population was used as an example, with the goal of highlighting the potential of genomic analyses for small breeds. In addition to our own results, we discuss additional uses of genomics to assess relatedness, admixture proportions, and inheritance of harmful variants. The example data set consisted of 218 Tyrol Grey bull genotypes, which were all available AI bulls in the population. After standard quality control restrictions, 34,581 SNPs remained for the analysis. A separate quality control was applied to determine ROH levels based on Illumina GenCall and Illumina GenTrain scores, resulting in 211 bulls and 33,604 SNPs. LD was computed as the squared correlation coefficient between SNPs within a 10 megabase pair (Mb) region. ROHs were derived based on regions covering at least 4, 8, and 16 Mb, suggesting that animals had common ancestors approximately 12, 6, and 3 generations ago, respectively. The corresponding mean inbreeding coefficients ($F_{ROH}$) were 4.0% for 4 Mb, 2.9% for 8 Mb and 1.6% for 16 Mb runs. With an average generation interval of 5.66 years, estimated $N_{eROH}$ was 125 for runs >16 Mb, 186 for runs >8 Mb and 370 for runs >4 Mb, indicating strict avoidance of close inbreeding in the population. LD was used as an alternative method to infer the population history and the $N_e$. The results show a continuous decrease in $N_{eLD}$, to 780, 120, and 80 for 100, 10, and 5 generations ago, respectively. Genomic selection was developed for and is working well in large breeds. The same methodology was applied in Tyrol Grey cattle, using different reference populations. Contrary to expectations, the accuracies of GEBVs with very small within-breed reference populations were very high, ranging from 0.13 to 0.91 and from 0.12 to 0.63 when estimated breeding values and deregressed breeding values were used as pseudo-phenotypes, respectively. Subsequent analyses confirmed that the high accuracies were a consequence of low reliabilities of pseudo-phenotypes in the validation set, which were thus heavily influenced by parent averages. Multi-breed and across-breed reference sets gave inconsistent and lower accuracies. Genomic information may have a crucial role in the management of small breeds, even if its primary usage differs from that of large breeds. It allows relatedness between individuals and trends in inbreeding to be assessed, and decisions to be taken accordingly. These decisions would be based on the real genome architecture, rather than conventional pedigree information, which can be missing or incomplete. We strongly suggest the routine genotyping of all individuals that belong to a small breed in order to facilitate the effective management of endangered livestock populations.
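Two of the calculations mentioned in the abstract can be sketched in a few lines of Python. The first converts a minimum ROH length into an approximate number of generations to the common ancestor, using the expectation that a run inherited from an ancestor g generations back spans about 100/(2g) cM, together with the assumption that 1 cM corresponds to roughly 1 Mb. The abstract does not state this conversion, but it reproduces the reported figures of roughly 12, 6, and 3 generations. The second computes LD as the squared correlation between genotype dosages at two SNPs. The function names and the example genotypes are hypothetical.

```python
from typing import Sequence


def generations_from_roh_mb(length_mb: float, cm_per_mb: float = 1.0) -> float:
    """Approximate generations to the common ancestor for a ROH of given length.

    Assumes expected ROH length of 100 / (2 * g) cM and a uniform
    recombination rate of cm_per_mb (an assumption, 1 cM per Mb by default).
    """
    if length_mb <= 0 or cm_per_mb <= 0:
        raise ValueError("length and recombination rate must be positive")
    return 100.0 / (2.0 * length_mb * cm_per_mb)


def ld_r2(geno_a: Sequence[float], geno_b: Sequence[float]) -> float:
    """Squared Pearson correlation between genotype dosages (0/1/2) at two SNPs."""
    n = len(geno_a)
    if n != len(geno_b) or n < 2:
        raise ValueError("need two genotype vectors of equal length >= 2")
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var_a * var_b)


if __name__ == "__main__":
    for length in (4, 8, 16):
        print(length, "Mb ->", generations_from_roh_mb(length), "generations")
    # Hypothetical dosages for six animals at two SNPs.
    print(ld_r2([0, 1, 2, 1, 0, 2], [0, 1, 2, 2, 0, 1]))
```

Real recombination rates vary along the genome, so the generation figures are order-of-magnitude guides only.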
Affiliation(s)
- Gábor Mészáros
- Division of Livestock Sciences, University of Natural Resources and Life Sciences Vienna, Austria
| | - Solomon A Boison
- Division of Livestock Sciences, University of Natural Resources and Life Sciences Vienna, Austria
| | - Ana M Pérez O'Brien
- Division of Livestock Sciences, University of Natural Resources and Life Sciences Vienna, Austria
| | - Ino Curik
- Department of Animal Science, University of Zagreb, Zagreb, Croatia
| | - Jose F Garcia
- UNESP-Universidade Estadual Paulista, Jaboticabal, Brazil
| | - Johann Sölkner
- Division of Livestock Sciences, University of Natural Resources and Life Sciences Vienna, Austria
| |
|
60
|
Genome flux and stasis in a five millennium transect of European prehistory. Nat Commun 2014; 5:5257. [PMID: 25334030 PMCID: PMC4218962 DOI: 10.1038/ncomms6257] [Citation(s) in RCA: 395] [Impact Index Per Article: 35.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2014] [Accepted: 09/11/2014] [Indexed: 12/19/2022] Open
Abstract
The Great Hungarian Plain was a crossroads of cultural transformations that have shaped European prehistory. Here we analyse a 5,000-year transect of human genomes, sampled from petrous bones giving consistently excellent endogenous DNA yields, from 13 Hungarian Neolithic, Copper, Bronze and Iron Age burials including two to high (~22×) and seven to ~1× coverage, to investigate the impact of these on Europe’s genetic landscape. These data suggest genomic shifts with the advent of the Neolithic, Bronze and Iron Ages, with interleaved periods of genome stability. The earliest Neolithic context genome shows a European hunter-gatherer genetic signature and a restricted ancestral population size, suggesting direct contact between cultures after the arrival of the first farmers into Europe. The latest, Iron Age, sample reveals an eastern genomic influence concordant with introduced Steppe burial rites. We observe a transition towards lighter pigmentation and, surprisingly, no Neolithic presence of lactase persistence. Recent advances in high-throughput sequencing techniques have enabled the analysis of ancient human genomes. Here the authors sequence ancient human genomes that span a period of 5,000 years, to understand the ancestral influence on Europe's genetic landscape.
|
61
|
The effects of demography and long-term selection on the accuracy of genomic prediction with sequence data. Genetics 2014; 198:1671-84. [PMID: 25233989 DOI: 10.1534/genetics.114.168344] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
The use of dense SNPs to predict the genetic value of an individual for a complex trait is often referred to as "genomic selection" in livestock and crops, but is also relevant to human genetics to predict, for example, complex genetic disease risk. The accuracy of prediction depends on the strength of linkage disequilibrium (LD) between SNPs and causal mutations. If sequence data were used instead of dense SNPs, accuracy should increase because causal mutations are present, but demographic history and long-term negative selection also influence accuracy. We therefore evaluated genomic prediction, using simulated sequence in two contrasting populations: one reducing from an ancestrally large effective population size (Ne) to a small one, with high LD common in domestic livestock, while the second had a large constant-sized Ne with low LD similar to that in some human or outbred plant populations. There were two scenarios in each population; causal variants were either neutral or under long-term negative selection. For large Ne, sequence data led to a 22% increase in accuracy relative to ∼600K SNP chip data with a Bayesian analysis and a more modest advantage with a BLUP analysis. This advantage increased when causal variants were influenced by negative selection, and accuracy persisted when 10 generations separated reference and validation populations. However, in the reducing Ne population, there was little advantage for sequence even with negative selection. This study demonstrates the joint influence of demography and selection on accuracy of prediction and improves our understanding of how best to exploit sequence for genomic prediction.
|
62
|
|
63
|
Abstract
Although the analysis of linkage disequilibrium (LD) plays a central role in many areas of population genetics, the sampling variance of LD is known to be very large, with high sensitivity to the numbers of nucleotide sites and individuals sampled. Here we show that a genome-wide analysis of the distribution of heterozygous sites within a single diploid genome can yield highly informative patterns of LD as a function of physical distance. The proposed statistic, the correlation of zygosity, is closely related to the conventional population-level measure of LD, but is agnostic with respect to allele frequencies and hence likely less prone to outlier artifacts. Application of the method to several vertebrate species leads to the conclusion that >80% of recombination events are typically resolved by gene-conversion-like processes unaccompanied by crossovers, with average conversion-patch lengths on the order of one to several kilobases. Thus, contrary to common assumptions, the recombination rate between sites does not scale linearly with distance, often even up to distances of 100 kb. In addition, the amount of LD between sites separated by <200 bp is uniformly much greater than can be explained by the conventional neutral model, possibly because of the nonindependent origin of mutations within this spatial scale. These results raise questions about the application of conventional population-genetic interpretations to LD on short spatial scales and also about the use of spatial patterns of LD to infer demographic histories.
|
64
|
Toward genomic prediction from whole-genome sequence data: impact of sequencing design on genotype imputation and accuracy of predictions. Heredity (Edinb) 2013; 112:39-47. [PMID: 23549338 DOI: 10.1038/hdy.2013.13] [Citation(s) in RCA: 144] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2012] [Revised: 01/14/2013] [Accepted: 01/15/2013] [Indexed: 12/13/2022] Open
Abstract
Genomic prediction from whole-genome sequence data is attractive, as the accuracy of genomic prediction is no longer bounded by the extent of linkage disequilibrium between DNA markers and causal mutations affecting the trait, given the causal mutations are in the data set. A cost-effective strategy could be to sequence a small proportion of the population, and impute sequence data to the rest of the reference population. Here, we describe strategies for selecting individuals for sequencing, based on either pedigree relationships or haplotype diversity. Performance of these strategies (number of variants detected and accuracy of imputation) was evaluated in sequence data simulated through a real Belgian Blue cattle pedigree. A strategy (AHAP), which selected a subset of individuals for sequencing that maximized the number of unique haplotypes (from single-nucleotide polymorphism panel data) sequenced, gave good performance across a range of variant minor allele frequencies. We then investigated the optimum number of individuals to sequence by fold coverage given a maximum total sequencing effort. At 600 total fold coverage (600×), the optimum strategy was to sequence 75 individuals at eightfold coverage. Finally, we investigated the accuracy of genomic predictions that could be achieved. The advantage of using imputed sequence data compared with dense SNP array genotypes was highly dependent on the allele frequency spectrum of the causative mutations affecting the trait. When this followed a neutral distribution, the advantage of the imputed sequence data was small; however, when the causal mutations all had low minor allele frequencies, using the sequence data improved the accuracy of genomic prediction by up to 30%.
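The trade-off between the number of animals sequenced and per-animal coverage at a fixed total effort can be made concrete with a small sketch. It only enumerates designs that spend a budget exactly; it does not evaluate variant detection or imputation accuracy, which is what the study used to pick the optimum. The function name and the list of candidate coverages are assumptions for illustration.

```python
def designs_at_budget(total_fold, coverages):
    """List (n_individuals, coverage) designs that spend total_fold exactly.

    total_fold: total sequencing effort in fold coverage (e.g. 600).
    coverages:  candidate per-individual coverages (hypothetical choices).
    """
    return [(total_fold // c, c) for c in coverages if total_fold % c == 0]


# Candidate coverages below are illustrative, not taken from the paper.
for n_animals, coverage in designs_at_budget(600, [1, 2, 4, 8, 12]):
    print(f"{n_animals} individuals at {coverage}x")
```

One of the enumerated designs is 75 individuals at 8×, the combination the abstract reports as optimal for a total effort of 600×.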
|