1
Cheng X, Steinrücken M. Population Genomic Scans for Natural Selection and Demography. Annu Rev Genet 2024; 58:319-339. [PMID: 39227130 DOI: 10.1146/annurev-genet-111523-102651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Uncovering the fundamental processes that shape genomic variation in natural populations is a primary objective of population genetics. These processes include demographic effects such as past changes in effective population size or gene flow between structured populations. Furthermore, genomic variation is affected by selection on nonneutral genetic variants, for example, through the adaptation of beneficial alleles or balancing selection that maintains genetic variation. In this article, we discuss the characterization of these processes using population genetic models, and we review methods developed on the basis of these models to unravel the underlying processes from modern population genomic data sets. We briefly discuss the conditions in which these approaches can be used to infer demography or identify specific nonneutral genetic variants and cases in which caution is warranted. Moreover, we summarize the challenges of jointly inferring demography and selective processes that affect neutral variation genome-wide.
Affiliation(s)
- Xiaoheng Cheng
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
- Matthias Steinrücken
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
2
Russo CAM, Eyre-Walker A, Katz LA, Gaut BS. Forty Years of Inferential Methods in the Journals of the Society for Molecular Biology and Evolution. Mol Biol Evol 2024; 41:msad264. [PMID: 38197288 PMCID: PMC10763999 DOI: 10.1093/molbev/msad264] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/08/2023] [Accepted: 11/27/2023] [Indexed: 01/11/2024] Open
Abstract
We are launching a series to celebrate the 40th anniversary of the first issue of Molecular Biology and Evolution. In 2024, we will publish virtual issues containing selected papers published in the Society for Molecular Biology and Evolution journals, Molecular Biology and Evolution and Genome Biology and Evolution. Each virtual issue will be accompanied by a perspective that highlights the historic and contemporary contributions of our journals to a specific topic in molecular evolution. This perspective, the first in the series, presents an account of the broad array of methods that have been published in the Society for Molecular Biology and Evolution journals, including methods to infer phylogenies, to test hypotheses in a phylogenetic framework, and to infer population genetic processes. We also mention many of the software implementations that make methods tractable for empiricists. In short, the Society for Molecular Biology and Evolution community has much to celebrate after four decades of publishing high-quality science including numerous important inferential methods.
Affiliation(s)
- Claudia A M Russo
  - Departamento de Genética, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Laura A Katz
  - Department of Biological Sciences, Smith College, Northampton, MA, USA
- Brandon S Gaut
  - School of Biological Sciences, University of California, Irvine, CA, USA
3
Panigrahi M, Rajawat D, Nayak SS, Ghildiyal K, Sharma A, Jain K, Lei C, Bhushan B, Mishra BP, Dutt T. Landmarks in the history of selective sweeps. Anim Genet 2023; 54:667-688. [PMID: 37710403 DOI: 10.1111/age.13355] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/12/2023] [Accepted: 08/28/2023] [Indexed: 09/16/2023]
Abstract
Half a century ago, a seminal article on the hitchhiking effect by Smith and Haigh inaugurated the concept of the selection signature. Selective sweeps are characterised by the rapid spread of an advantageous genetic variant through a population and hence play an important role in shaping evolution and research on genetic diversity. The process by which a beneficial allele arises and becomes fixed in a population, leading to an increase in the frequency of other linked alleles, is known as genetic hitchhiking or genetic draft. Kimura's neutral theory and hitchhiking theory are complementary, with Kimura's neutral evolution as the 'null model' and positive selection as the 'signal'. Both are widely accepted in evolution, especially with genomics enabling precise measurements. Significant advances in genomic technologies, such as next-generation sequencing, high-density SNP arrays and powerful bioinformatics tools, have made it possible to systematically investigate selection signatures in a variety of species. Although the history of selection signatures is relatively recent, progress has been made in the last two decades, owing to the increasing availability of large-scale genomic data and the development of computational methods. In this review, we embark on a journey through the history of research on selective sweeps, ranging from early theoretical work to recent empirical studies that utilise genomic data.
Affiliation(s)
- Manjit Panigrahi
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Divya Rajawat
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Kanika Ghildiyal
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Anurodh Sharma
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Karan Jain
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Chuzhao Lei
  - College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Bharat Bhushan
  - Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Bishnu Prasad Mishra
  - Division of Animal Biotechnology, ICAR-National Bureau of Animal Genetic Resources, Karnal, India
- Triveni Dutt
  - Livestock Production and Management Section, Indian Veterinary Research Institute, Bareilly, India
4
Nait Saada J, Tsangalidou Z, Stricker M, Palamara PF. Inference of Coalescence Times and Variant Ages Using Convolutional Neural Networks. Mol Biol Evol 2023; 40:msad211. [PMID: 37738175 PMCID: PMC10581698 DOI: 10.1093/molbev/msad211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 09/11/2023] [Accepted: 09/18/2023] [Indexed: 09/24/2023] Open
Abstract
Accurate inference of the time to the most recent common ancestor (TMRCA) between pairs of individuals and of the age of genomic variants is key in several population genetic analyses. We developed a likelihood-free approach, called CoalNN, which uses a convolutional neural network to predict pairwise TMRCAs and allele ages from sequencing or SNP array data. CoalNN is trained through simulation and can be adapted to varying parameters, such as demographic history, using transfer learning. Across several simulated scenarios, CoalNN matched or outperformed the accuracy of model-based approaches for pairwise TMRCA and allele age prediction. We applied CoalNN to settings for which model-based approaches are under-developed and performed analyses to gain insights into the set of features it uses to perform TMRCA prediction. We next used CoalNN to analyze 2,504 samples from 26 populations in the 1000 Genomes Project data set, inferring the age of ∼80 million variants. We observed substantial variation across populations and for variants predicted to be pathogenic, reflecting heterogeneous demographic histories and the action of negative selection. We used CoalNN's predicted allele ages to construct genome-wide annotations capturing the signature of past negative selection. We performed LD-score regression analysis of heritability using summary association statistics from 63 independent complex traits and diseases (average N=314k), observing increased annotation-specific effects on heritability compared to a previous allele age annotation. These results highlight the effectiveness of using likelihood-free, simulation-trained models to infer properties of gene genealogies in large genomic data sets.
Affiliation(s)
- Pier Francesco Palamara
  - Department of Statistics, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
5
Soni V, Vos M, Eyre-Walker A. A new test suggests hundreds of amino acid polymorphisms in humans are subject to balancing selection. PLoS Biol 2022; 20:e3001645. [PMID: 35653351 PMCID: PMC9162324 DOI: 10.1371/journal.pbio.3001645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Accepted: 04/25/2022] [Indexed: 11/18/2022] Open
Abstract
The role that balancing selection plays in the maintenance of genetic diversity remains unresolved. Here, we introduce a new test, based on the McDonald–Kreitman test, in which the number of polymorphisms that are shared between populations is contrasted to those that are private at selected and neutral sites. We show that this simple test is robust to a variety of demographic changes, and that it can also give a direct estimate of the number of shared polymorphisms that are directly maintained by balancing selection. We apply our method to population genomic data from humans and provide some evidence that hundreds of nonsynonymous polymorphisms are subject to balancing selection.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Michiel Vos
  - European Centre for Environment and Human Health, University of Exeter Medical School, Environment and Sustainability Institute, Penryn, United Kingdom
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
6
Cheng X, DeGiorgio M. BalLeRMix+: mixture model approaches for robust joint identification of both positive selection and long-term balancing selection. Bioinformatics 2021; 38:861-863. [PMID: 34664624 PMCID: PMC8756184 DOI: 10.1093/bioinformatics/btab720] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2021] [Revised: 09/13/2021] [Accepted: 10/13/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: The growing availability of genomewide polymorphism data has fueled interest in detecting diverse selective processes affecting population diversity. However, no model-based approaches exist to jointly detect and distinguish the two complementary processes of balancing and positive selection. We extend the BalLeRMix B-statistic framework described in Cheng and DeGiorgio (2020) for detecting balancing selection and present BalLeRMix+, which implements five B statistic extensions based on mixture models to robustly identify both types of selection. BalLeRMix+ is implemented in Python and computes the composite likelihood ratios and associated model parameters for each genomic test position. AVAILABILITY AND IMPLEMENTATION: BalLeRMix+ is freely available at https://github.com/bioXiaoheng/BallerMixPlus. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
7
Cheng X, DeGiorgio M. Flexible Mixture Model Approaches That Accommodate Footprint Size Variability for Robust Detection of Balancing Selection. Mol Biol Evol 2020; 37:3267-3291. [PMID: 32462188 PMCID: PMC7820363 DOI: 10.1093/molbev/msaa134] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/16/2022] Open
Abstract
Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, and therefore most detection approaches only achieve optimal performances when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window sizes and suffer substantial losses in power when windows are large. Here, we employ mixture models to construct a set of five composite likelihood ratio test statistics, which we collectively term B statistics. These statistics are agnostic to window sizes and can operate on diverse forms of input data. Through simulations, we show that they exhibit comparable power to the best-performing current methods, and retain substantially high power regardless of window sizes. They also display considerable robustness to high mutation rates and uneven recombination landscapes, as well as an array of other common confounding scenarios. Moreover, we applied a specific version of the B statistics, termed B2, to a human population-genomic data set and recovered many top candidates from prior studies, including the then-uncharacterized STPG2 and CCDC169-SOHLH2, both of which are related to gamete functions. We further applied B2 on a bonobo population-genomic data set. In addition to the MHC-DQ genes, we uncovered several novel candidate genes, such as KLRD1, involved in viral defense, and SCN9A, associated with pain perception. Finally, we show that our methods can be extended to account for multiallelic balancing selection and integrated the set of statistics into open-source software named BalLeRMix for future applications by the scientific community.
Affiliation(s)
- Xiaoheng Cheng
  - Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA
  - Department of Biology, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
8
Horscroft C, Ennis S, Pengelly RJ, Sluckin TJ, Collins A. Sequencing era methods for identifying signatures of selection in the genome. Brief Bioinform 2020; 20:1997-2008. [PMID: 30053138 DOI: 10.1093/bib/bby064] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/26/2018] [Revised: 05/16/2018] [Indexed: 12/12/2022] Open
Abstract
Insights into genetic loci which are under selection and their functional roles contribute to increased understanding of the patterns of phenotypic variation we observe today. The availability of whole-genome sequence data, for humans and other species, provides opportunities to investigate adaptation and evolution at unprecedented resolution. Many analytical methods have been developed to interrogate these large data sets and characterize signatures of selection in the genome. We review here recently developed methods and consider the impact of increased computing power and data availability on the detection of selection signatures. Consideration of demography, recombination and other confounding factors is important, and use of a range of methods in combination is a powerful route to resolving different forms of selection in genome sequence data. Overall, a substantial improvement in methods for application to whole-genome sequencing is evident, although further work is required to develop robust and computationally efficient approaches which may increase reproducibility across studies.
Affiliation(s)
- Clare Horscroft
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Sarah Ennis
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Reuben J Pengelly
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Timothy J Sluckin
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
  - Mathematical Sciences, University of Southampton, Highfield, Southampton, UK
- Andrew Collins
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
9
Abstract
Trans-species polymorphism has been widely used as a key sign of long-term balancing selection across multiple species. However, such sites are often rare in the genome and could result from mutational processes or technical artifacts. Few methods are yet available to specifically detect footprints of trans-species balancing selection without using trans-species polymorphic sites. In this study, we develop summary- and model-based approaches that are each specifically tailored to uncover regions of long-term balancing selection shared by a set of species by using genomic patterns of intraspecific polymorphism and interspecific fixed differences. We demonstrate that our trans-species statistics have substantially higher power than single-species approaches to detect footprints of trans-species balancing selection, and are robust to those that do not affect all tested species. We further apply our model-based methods to human and chimpanzee whole-genome sequencing data. In addition to the previously established major histocompatibility complex and malaria resistance-associated FREM3/GYPE regions, we also find outstanding genomic regions involved in barrier integrity and innate immunity, such as the GRIK1/CLDN17 intergenic region, and the SLC35F1 and ABCA13 genes. Our findings not only echo the significance of pathogen defense but also reveal novel candidates in maintaining balanced polymorphisms across human and chimpanzee lineages. Finally, we show that these trans-species statistics can be applied to and work well for an arbitrary number of species, and integrate them into open-source software packages for ease of use by the scientific community.
Affiliation(s)
- Xiaoheng Cheng
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
  - Department of Biology, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
  - Department of Biology, Pennsylvania State University, University Park, PA
  - Department of Statistics, Pennsylvania State University, University Park, PA
  - Institute for CyberScience, Pennsylvania State University, University Park, PA
10
High-throughput inference of pairwise coalescence times identifies signals of selection and enriched disease heritability. Nat Genet 2018; 50:1311-1317. [PMID: 30104759 PMCID: PMC6145075 DOI: 10.1038/s41588-018-0177-x] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 08/21/2017] [Accepted: 06/21/2018] [Indexed: 12/19/2022]
Abstract
Interest in reconstructing demographic histories has motivated the development of methods to estimate locus-specific pairwise coalescence times from whole-genome sequence data. Here we introduce a powerful new method, ASMC, that can estimate coalescence times using only SNP array data, and is orders of magnitude faster than previous approaches. We applied ASMC to detect recent positive selection in 113,851 phased British samples from the UK Biobank, and detected 12 genome-wide significant signals, including 6 novel loci. We also applied ASMC to sequencing data from 498 Dutch individuals to detect background selection at deeper time scales. We detected strong heritability enrichment in regions of high background selection in an analysis of 20 independent diseases and complex traits using stratified LD score regression, conditioned on a broad set of functional annotations (including other background selection annotations). These results underscore the widespread effects of background selection on the genetic architecture of complex traits.
11
Detecting Recent Positive Selection with a Single Locus Test Bipartitioning the Coalescent Tree. Genetics 2017; 208:791-805. [PMID: 29217523 DOI: 10.1534/genetics.117.300401] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/14/2017] [Accepted: 12/01/2017] [Indexed: 01/09/2023] Open
Abstract
Many population genomic studies have been conducted in the past to search for traces of recent events of positive selection. These traces, however, can be obscured by temporal variation of population size or other demographic factors. To reduce the confounding impact of demography, the coalescent tree topology has been used as an additional source of information for detecting recent positive selection in a population or a species. Based on the branching pattern at the root, we partition the hypothetical coalescent tree, inferred from a sequence sample, into two subtrees. The reasoning is that positive selection could impose a strong impact on branch length in one of the two subtrees while demography has the same effect on average on both subtrees. Thus, positive selection should be detectable by comparing statistics calculated for the two subtrees. Simulations demonstrate that the proposed test based on these principles has high power to detect recent positive selection even when DNA polymorphism data from only one locus is available, and that it is robust to the confounding effect of demography. One feature is that all components in the summary statistics ([Formula: see text]) can be computed analytically. Moreover, misinference of derived and ancestral alleles is seen to have only a limited effect on the test, and it therefore avoids a notorious problem when searching for traces of recent positive selection.
12
Figueiró HV, Li G, Trindade FJ, Assis J, Pais F, Fernandes G, Santos SHD, Hughes GM, Komissarov A, Antunes A, Trinca CS, Rodrigues MR, Linderoth T, Bi K, Silveira L, Azevedo FCC, Kantek D, Ramalho E, Brassaloti RA, Villela PMS, Nunes ALV, Teixeira RHF, Morato RG, Loska D, Saragüeta P, Gabaldón T, Teeling EC, O’Brien SJ, Nielsen R, Coutinho LL, Oliveira G, Murphy WJ, Eizirik E. Genome-wide signatures of complex introgression and adaptive evolution in the big cats. Sci Adv 2017; 3:e1700299. [PMID: 28776029 PMCID: PMC5517113 DOI: 10.1126/sciadv.1700299] [Citation(s) in RCA: 108] [Impact Index Per Article: 13.5] [Received: 01/27/2017] [Accepted: 06/19/2017] [Indexed: 05/05/2023]
Abstract
The great cats of the genus Panthera comprise a recent radiation whose evolutionary history is poorly understood. Their rapid diversification poses challenges to resolving their phylogeny while offering opportunities to investigate the historical dynamics of adaptive divergence. We report the sequence, de novo assembly, and annotation of the jaguar (Panthera onca) genome, a novel genome sequence for the leopard (Panthera pardus), and comparative analyses encompassing all living Panthera species. Demographic reconstructions indicated that all of these species have experienced variable episodes of population decline during the Pleistocene, ultimately leading to small effective sizes in present-day genomes. We observed pervasive genealogical discordance across Panthera genomes, caused by both incomplete lineage sorting and complex patterns of historical interspecific hybridization. We identified multiple signatures of species-specific positive selection, affecting genes involved in craniofacial and limb development, protein metabolism, hypoxia, reproduction, pigmentation, and sensory perception. There was remarkable concordance in pathways enriched in genomic segments implicated in interspecies introgression and in positive selection, suggesting that these processes were connected. We tested this hypothesis by developing exome capture probes targeting ~19,000 Panthera genes and applying them to 30 wild-caught jaguars. We found at least two genes (DOCK3 and COL4A5, both related to optic nerve development) bearing significant signatures of interspecies introgression and within-species positive selection. These findings indicate that post-speciation admixture has contributed genetic material that facilitated the adaptive evolution of big cat lineages.
Affiliation(s)
- Henrique V. Figueiró
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
- Gang Li
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Fernanda J. Trindade
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
- Juliana Assis
  - Centro de Pesquisa René Rachou, FIOCRUZ/Minas, Belo Horizonte, Minas Gerais, Brazil
- Fabiano Pais
  - Centro de Pesquisa René Rachou, FIOCRUZ/Minas, Belo Horizonte, Minas Gerais, Brazil
- Gabriel Fernandes
  - Centro de Pesquisa René Rachou, FIOCRUZ/Minas, Belo Horizonte, Minas Gerais, Brazil
- Sarah H. D. Santos
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
- Graham M. Hughes
  - School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Aleksey Komissarov
  - Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
- Agostinho Antunes
  - Departamento de Biologia, Faculdade de Ciências and CIIMAR/CIMAR, Universidade do Porto, Porto, Portugal
- Cristine S. Trinca
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
- Maíra R. Rodrigues
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
- Tyler Linderoth
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720–3140, USA
- Ke Bi
  - Computational Genomics Resource Laboratory, California Institute for Quantitative Biosciences and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA 94720, USA
- Fernando C. C. Azevedo
  - Universidade Federal de São João Del Rey, São João Del Rey, Minas Gerais, Brazil
  - Instituto Pró-Carnívoros, Atibaia, São Paulo, Brazil
- Daniel Kantek
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
  - Instituto Chico Mendes de Conservação da Biodiversidade, Brasília, Distrito Federal, Brazil
- Emiliano Ramalho
  - Instituto Pró-Carnívoros, Atibaia, São Paulo, Brazil
  - Instituto de Desenvolvimento Sustentável Mamirauá, Tefé, Amazonas, Brazil
- Ricardo A. Brassaloti
  - Escola Superior de Agricultura Luiz de Queiroz (ESALQ-USP), Piracicaba, São Paulo, Brazil
- Rodrigo H. F. Teixeira
  - Zoológico Municipal de Sorocaba, Sorocaba, São Paulo, Brazil
  - Programa de Pós-Graduação em Animais Selvagens, Universidade Estadual Paulista–Botucatu, São Paulo, Brazil
- Ronaldo G. Morato
  - Instituto Pró-Carnívoros, Atibaia, São Paulo, Brazil
  - Instituto Chico Mendes de Conservação da Biodiversidade, Brasília, Distrito Federal, Brazil
- Damian Loska
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Toni Gabaldón
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Emma C. Teeling
  - School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
- Stephen J. O’Brien
  - Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720–3140, USA
- Luiz L. Coutinho
  - Escola Superior de Agricultura Luiz de Queiroz (ESALQ-USP), Piracicaba, São Paulo, Brazil
- Guilherme Oliveira
  - Centro de Pesquisa René Rachou, FIOCRUZ/Minas, Belo Horizonte, Minas Gerais, Brazil
  - Instituto Tecnológico Vale, Belém, Pará, Brazil
- William J. Murphy
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Eduardo Eizirik
  - Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil
  - Instituto Pró-Carnívoros, Atibaia, São Paulo, Brazil
13
Pajic P, Lin YL, Xu D, Gokcumen O. The psoriasis-associated deletion of late cornified envelope genes LCE3B and LCE3C has been maintained under balancing selection since Human Denisovan divergence. BMC Evol Biol 2016; 16:265. [PMID: 27919236 PMCID: PMC5139038 DOI: 10.1186/s12862-016-0842-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 06/18/2016] [Accepted: 11/23/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A common 32 kb deletion of LCE3B and LCE3C genes is strongly associated with psoriasis. We recently found that this deletion is ancient, predating Human-Denisovan divergence. However, it was not clear why negative selection has not removed this deletion from the population. RESULTS: Here, we show that the haplotype block that harbors the deletion (i) retains high allele frequency among extant and ancient human populations; (ii) harbors unusually high nucleotide variation (π, P < 4.1 × 10⁻³); (iii) contains an excess of intermediate frequency variants (Tajima's D, P < 3.9 × 10⁻³); and (iv) has an unusually long time to coalescence to the most recent common ancestor (TSel, 0.1 quantile). CONCLUSIONS: Our results are most parsimonious with the scenario where the LCE3BC deletion has evolved under balancing selection in humans. More broadly, this is consistent with the hypothesis that a balance between autoimmunity and natural vaccination through increased exposure to pathogens maintains this deletion in humans.
Affiliation(s)
- Petar Pajic
  - Department of Biological Sciences, University at Buffalo, Cooke 639, Buffalo, NY, 14260, USA
- Yen-Lung Lin
  - Department of Biological Sciences, University at Buffalo, Cooke 639, Buffalo, NY, 14260, USA
- Duo Xu
  - Department of Biological Sciences, University at Buffalo, Cooke 639, Buffalo, NY, 14260, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, Cooke 639, Buffalo, NY, 14260, USA
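Illustrative note on entry 13: the abstract cites nucleotide diversity (π) and Tajima's D as evidence of balancing selection. The sketch below shows how these two statistics can be computed from a small set of aligned haplotypes using the standard Tajima (1989) constants. It is not the authors' pipeline; the function name `tajimas_d` and the toy 0/1 haplotype strings are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Return (pi, S, D) for equal-length haplotype strings.

    pi is the mean number of pairwise differences, S the number of
    segregating sites, and D is Tajima's D (None when S == 0).
    Hypothetical helper for illustration; assumes no missing data.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])

    # Count sites where more than one allele is observed in the sample.
    S = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)

    # Mean pairwise differences across all n*(n-1)/2 haplotype pairs.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    if S == 0:
        return pi, S, None

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    d = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, d


if __name__ == "__main__":
    # Toy data: every variant at intermediate frequency (2 of 4 copies).
    balanced = ["0000", "0000", "1111", "1111"]
    # Toy data: every variant is a singleton.
    singletons = ["1000", "0100", "0010", "0001"]
    print("intermediate-frequency sample:", tajimas_d(balanced))
    print("singleton-only sample:", tajimas_d(singletons))
```

An excess of intermediate-frequency variants inflates π relative to the estimate based on the number of segregating sites, which pushes D above zero; an excess of rare variants has the opposite effect.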
14
Sugden LA, Ramachandran S. Integrating the signatures of demic expansion and archaic introgression in studies of human population genomics. Curr Opin Genet Dev 2016; 41:140-149. [PMID: 27743539 DOI: 10.1016/j.gde.2016.09.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/28/2016] [Revised: 09/19/2016] [Accepted: 09/23/2016] [Indexed: 12/12/2022]
Abstract
Human population genomic studies have repeatedly observed a decrease in heterozygosity and an increase in linkage disequilibrium with geographic distance from Africa. While multiple demographic models can generate these patterns, many studies invoke the serial founder effect model, in which populations expand from a single origin and each new population's founders represent a subset of genetic variation in the previous population. The model assumes no admixture with archaic hominins; however, recent studies have identified loci in Homo sapiens bearing signatures of archaic introgression. These results appear to contradict the validity of analyses invoking the serial founder effect model, but we show these two perspectives are compatible. We also propose using the serial founder effect model as a null model for determining the signature of archaic admixture in modern human genomes at different geographic and genomic scales.
Affiliation(s)
- Lauren Alpert Sugden
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
- Sohini Ramachandran
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
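Illustrative note on entry 14: the serial founder effect model described in the abstract predicts that heterozygosity declines with each successive founding event. A minimal sketch of that expectation follows, under a strong simplifying assumption that is not taken from the paper: each founding event is treated as a single generation of drift among `founders` diploid individuals, so expected heterozygosity is multiplied by (1 - 1/(2 × founders)) at every step, with no mutation, migration, or admixture. The function name and all parameter values are hypothetical.

```python
def serial_founder_heterozygosity(h0, founders, steps):
    """Expected heterozygosity after 0..steps successive founder events.

    Assumes (for illustration only) that each event is one generation of
    drift among `founders` diploid individuals, which scales expected
    heterozygosity by (1 - 1 / (2 * founders)).
    """
    decay = 1.0 - 1.0 / (2.0 * founders)
    values = [h0]
    for _ in range(steps):
        values.append(values[-1] * decay)
    return values


if __name__ == "__main__":
    # Hypothetical starting heterozygosity and founder size.
    for k, h in enumerate(serial_founder_heterozygosity(0.30, 50, 10)):
        print(f"founder event {k}: expected heterozygosity {h:.4f}")
```

Under these assumptions the decline is geometric in the number of founding events, which is one simple way to see why heterozygosity would fall with distance from the origin of the expansion.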