1
Rider DF, Wolf ACE, Murray J, de Flamingh A, dos Santos ALC, Lanoë F, Zedeño MN, DeGiorgio M, Lindo J, Malhi RS. Genomic analyses correspond with deep persistence of peoples of Blackfoot Confederacy from glacial times. Sci Adv 2024; 10:eadl6595. [PMID: 38569022 PMCID: PMC10990285 DOI: 10.1126/sciadv.adl6595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 02/28/2024] [Indexed: 04/05/2024]
Abstract
Mutually beneficial partnerships between genomics researchers and North American Indigenous Nations are rare yet becoming more common. Here, we present one such partnership that provides insight into the peopling of the Americas and furnishes another line of evidence that can be used to further treaty and Indigenous rights. We show that the genomics of sampled individuals from the Blackfoot Confederacy belong to a previously undescribed ancient lineage that diverged from other genomic lineages in the Americas in Late Pleistocene times. Using multiple complementary forms of knowledge, we provide a scenario for Blackfoot population history that fits with oral tradition and provides a plausible model for the evolutionary process of the peopling of the Americas.
Affiliation(s)
- John Murray
  - Blackfeet Tribal Historic Preservation Office, Browning, MT 59417, USA
- Alida de Flamingh
  - Center for Indigenous Science, Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
- François Lanoë
  - Bureau of Applied Research in Anthropology, School of Anthropology, The University of Arizona, Tucson, AZ 85721, USA
- Maria N. Zedeño
  - Bureau of Applied Research in Anthropology, School of Anthropology, The University of Arizona, Tucson, AZ 85721, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- John Lindo
  - Department of Anthropology, Emory University, Atlanta, GA 30322, USA
- Ripan S. Malhi
  - Center for Indigenous Science, Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA
2
Campelo dos Santos AL, DeGiorgio M, Assis R. Predicting evolutionary targets and parameters of gene deletion from expression data. Bioinform Adv 2024; 4:vbae002. [PMID: 38282974 PMCID: PMC10812876 DOI: 10.1093/bioadv/vbae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 12/08/2023] [Accepted: 01/04/2024] [Indexed: 01/30/2024]
Abstract
Motivation: Gene deletion is traditionally thought of as a nonadaptive process that removes functional redundancy from genomes, such that it generally receives less attention than duplication in evolutionary turnover studies. Yet, mounting evidence suggests that deletion may promote adaptation via the "less-is-more" evolutionary hypothesis, as it often targets genes harboring unique sequences, expression profiles, and molecular functions. Hence, predicting the relative prevalence of redundant and unique functions among genes targeted by deletion, as well as the parameters underlying their evolution, can shed light on the role of gene deletion in adaptation. Results: Here, we present CLOUDe, a suite of machine learning methods for predicting evolutionary targets of gene deletion events from expression data. Specifically, CLOUDe models expression evolution as an Ornstein-Uhlenbeck process, and uses multi-layer neural network, extreme gradient boosting, random forest, and support vector machine architectures to predict whether deleted genes are "redundant" or "unique", as well as several parameters underlying their evolution. We show that CLOUDe boasts high power and accuracy in differentiating between classes, and high accuracy and precision in estimating evolutionary parameters, with optimal performance achieved by its neural network architecture. Application of CLOUDe to empirical data from Drosophila suggests that deletion primarily targets genes with unique functions, with further analysis showing these functions to be enriched for protein deubiquitination. Thus, CLOUDe represents a key advance in learning about the role of gene deletion in functional evolution and adaptation. Availability and implementation: CLOUDe is freely available on GitHub (https://github.com/anddssan/CLOUDe).
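For readers unfamiliar with the Ornstein-Uhlenbeck process named in this abstract, the following is a minimal Python sketch of a single-trait simulation. It is not taken from CLOUDe: the function name `simulate_ou`, the parameter names (`theta` for the optimum, `alpha` for the pull strength, `sigma` for the noise scale), and all numeric values are assumptions chosen for illustration only.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, t_total, n_steps, rng):
    """Simulate dX = alpha * (theta - X) dt + sigma dW on an even time grid.

    Uses the exact Gaussian transition of the process, so the step size
    does not introduce discretization error. All names are hypothetical.
    """
    dt = t_total / n_steps
    decay = math.exp(-alpha * dt)
    # Standard deviation of the exact one-step transition density.
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    path = [x0]
    for _ in range(n_steps):
        # The conditional mean relaxes toward the optimum theta.
        mean = theta + (path[-1] - theta) * decay
        path.append(rng.gauss(mean, step_sd))
    return path


# Illustrative values only: expression starts at 0 and is pulled toward 5.
rng = random.Random(0)
path = simulate_ou(x0=0.0, theta=5.0, alpha=2.0, sigma=0.5,
                   t_total=10.0, n_steps=1000, rng=rng)
```

Under these assumed values the path drifts from its starting point toward `theta` and then fluctuates around it with stationary standard deviation `sigma / sqrt(2 * alpha)`.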
Affiliation(s)
- Andre Luiz Campelo dos Santos
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, United States
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, United States
- Raquel Assis
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, United States
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431, United States
3
Adams R, Cain Z, Assis R, DeGiorgio M. Robust phylogenetic regression. Syst Biol 2023:syad070. [PMID: 38035624 DOI: 10.1093/sysbio/syad070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Indexed: 12/02/2023]
Abstract
Modern comparative biology owes much to phylogenetic regression. At its conception, this technique sparked a revolution that armed biologists with phylogenetic comparative methods (PCMs) for disentangling evolutionary correlations from those arising from hierarchical phylogenetic relationships. Over the past few decades, the phylogenetic regression framework has become a paradigm of modern comparative biology that has been widely embraced as a remedy for shared ancestry. However, recent evidence has sown doubt over the efficacy of phylogenetic regression, and PCMs more generally, with the suggestion that many of these methods fail to provide an adequate defense against unreplicated evolution, the primary justification for using them in the first place. Importantly, some of the most compelling examples of biological innovation in nature result from abrupt lineage-specific evolutionary shifts, which current regression models are largely ill-equipped to deal with. Here we explore a solution to this problem by applying robust linear regression to comparative trait data. We formally introduce robust phylogenetic regression to the PCM toolkit with linear estimators that are less sensitive to model violations than the standard least-squares estimator, while still retaining high power to detect true trait associations. Our analyses also highlight an ingenuity of the original algorithm for phylogenetic regression based on independent contrasts, whereby robust estimators are particularly effective. Collectively, we find that robust estimators hold promise for improving tests of trait associations and offer a path forward in scenarios where classical approaches may fail. Our study joins recent arguments for increased vigilance against unreplicated evolution and a better understanding of evolutionary model performance in challenging yet biologically important settings.
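The contrast between least-squares and robust linear estimators described in this abstract can be illustrated with a small Python sketch. The abstract does not say which robust estimators the authors use, so the Huber M-estimator fitted by iteratively reweighted least squares is an assumption made here for illustration. The data are invented: they stand in for already-computed trait contrasts, with one point shifted to mimic an abrupt lineage-specific change.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_fit(x, y, k=1.345, n_iter=50):
    """Huber M-estimate via iteratively reweighted least squares."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        intercept, slope = weighted_line_fit(x, y, w)
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: rescaled median absolute deviation.
        med = statistics.median(resid)
        scale = 1.4826 * statistics.median([abs(r - med) for r in resid])
        if scale == 0.0:
            break
        # Residuals beyond k * scale are down-weighted, not discarded.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
             for r in resid]
    return intercept, slope


# Hypothetical data: y = 2x + 1 except for a single shifted point.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[9] = 60.0

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
robust_intercept, robust_slope = huber_fit(x, y)
```

In this toy setting the single shifted point drags the least-squares slope well away from 2, while the reweighted fit stays much closer to it. This shows only the general idea of bounded influence and is not a reproduction of the paper's analyses.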
Affiliation(s)
- Richard Adams
  - Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, United States
  - Agricultural Statistics Laboratory, University of Arkansas, Fayetteville, AR, United States
- Zoe Cain
  - Department of Biological and Environmental Sciences, Georgia College, Milledgeville, GA, United States
- Raquel Assis
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, United States
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL, United States
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, United States
4
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Sandipan Paul Arnab
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
5
Adams R, DeGiorgio M. Likelihood-based tests of species tree hypotheses. Mol Biol Evol 2023:msad159. [PMID: 37440530 PMCID: PMC10368450 DOI: 10.1093/molbev/msad159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 06/20/2023] [Accepted: 07/06/2023] [Indexed: 07/15/2023]
Abstract
Likelihood-based tests of phylogenetic trees are a foundation of modern systematics. Over the past decade, an enormous wealth and diversity of model-based approaches have been developed for phylogenetic inference of both gene trees and species trees. However, while many techniques exist for conducting formal likelihood-based tests of gene trees, such frameworks are comparatively underdeveloped and underutilized for testing species tree hypotheses. To date, widely-used tests of tree topology are designed to assess the fit of classical models of molecular sequence data and individual gene trees, and thus, are not readily applicable to the problem of species tree inference. To address this issue, we derive several analogous likelihood-based approaches for testing topologies using modern species tree models and heuristic algorithms that use gene tree topologies as input for maximum likelihood estimation under the multispecies coalescent. For the purpose of comparing support for species trees, these tests leverage the statistical procedures of their original gene tree-based counterparts that have an extended history for testing phylogenetic hypotheses at a single locus. We discuss and demonstrate a number of applications, limitations, and important considerations of these tests using simulated and empirical phylogenomic datasets that include both bifurcating topologies and reticulate network models of species relationships. Finally, we introduce the open-source R package SpeciesTopoTestR (Species Topology Tests in R) that includes a suite of functions for conducting formal likelihood-based tests of species topologies given a set of input gene tree topologies.
Affiliation(s)
- Richard Adams
  - Agricultural Statistics Laboratory, University of Arkansas, Fayetteville, AR
  - Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
6
Arnab SP, Amin MR, DeGiorgio M. Uncovering footprints of natural selection through spectral analysis of genomic summary statistics. Mol Biol Evol 2023:msad157. [PMID: 37433019 PMCID: PMC10365025 DOI: 10.1093/molbev/msad157] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/07/2022] [Revised: 06/28/2023] [Accepted: 07/06/2023] [Indexed: 07/13/2023]
Abstract
Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
Affiliation(s)
- Sandipan Paul Arnab
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Md Ruhul Amin
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
7
Piya AA, DeGiorgio M, Assis R. Predicting gene expression divergence between single-copy orthologs in two species. Genome Biol Evol 2023; 15:evad078. [PMID: 37170892 PMCID: PMC10220509 DOI: 10.1093/gbe/evad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 04/21/2023] [Accepted: 05/02/2023] [Indexed: 05/13/2023]
Abstract
Predicting gene expression divergence is integral to understanding the emergence of new biological functions and associated traits. Whereas several sophisticated methods have been developed for this task, their applications are either limited to duplicate genes or require expression data from more than two species. Thus, here we present PiXi, the first machine learning framework for predicting gene expression divergence between single-copy orthologs in two species. PiXi models gene expression evolution as an Ornstein-Uhlenbeck process, and overlays this model with multi-layer neural network, random forest, and support vector machine architectures for making predictions. It outputs the predicted class "conserved" or "diverged" for each pair of orthologs, as well as their predicted expression optima in the two species. We show that PiXi has high power and accuracy in predicting gene expression divergence between single-copy orthologs, as well as high accuracy and precision in estimating their expression optima in the two species, across a wide range of evolutionary scenarios, with the globally best performance achieved by a multi-layer neural network. Moreover, application of our best performing PiXi predictor to empirical gene expression data from single-copy orthologs residing at different loci in two species of Drosophila reveals that approximately 23% underwent expression divergence after positional relocation. Further analysis shows that several of these "diverged" genes are involved in the electron transport chain of the mitochondrial membrane, suggesting that new chromatin environments may impact energy production in Drosophila. Thus, by providing a toolkit for predicting gene expression divergence between single-copy orthologs in two species, PiXi can shed light on the origins of novel phenotypes across diverse biological processes and study systems.
Affiliation(s)
- Antara Anika Piya
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, USA
- Raquel Assis
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, USA
  - Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, Florida, USA
8
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
9
Joseph SK, Migliore NR, Olivieri A, Torroni A, Owings AC, DeGiorgio M, Ordóñez WG, Aguilú JO, González-Andrade F, Achilli A, Lindo J. Genomic evidence for adaptation to tuberculosis in the Andes before European contact. iScience 2023; 26:106034. [PMID: 36824277 PMCID: PMC9941198 DOI: 10.1016/j.isci.2023.106034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 11/11/2022] [Accepted: 01/17/2023] [Indexed: 01/25/2023]
Abstract
Most studies focusing on human high-altitude adaptation in the Andean highlands have thus far been focused on Peruvian populations. We present high-coverage whole genomes from Indigenous people living in the Ecuadorian highlands and perform multi-method scans to detect positive natural selection. We identified regions of the genome that show signals of strong selection to both cardiovascular and hypoxia pathways, which are distinct from those uncovered in Peruvian populations. However, the strongest signals of selection were related to regions of the genome that are involved in immune function related to tuberculosis. Given our estimated timing of this selection event, the Indigenous people of Ecuador may have adapted to Mycobacterium tuberculosis thousands of years before the arrival of Europeans. Furthermore, we detect a population collapse that coincides with the arrival of Europeans, which is more severe than other regions of the Andes, suggesting differing effects of contact across high-altitude populations.
Affiliation(s)
- Sophie K. Joseph
  - Department of Anthropology, Emory University, Atlanta, GA 30322, USA
- Nicola Rambaldi Migliore
  - Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, Pavia 27100, Italy
- Anna Olivieri
  - Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, Pavia 27100, Italy
- Antonio Torroni
  - Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, Pavia 27100, Italy
- Amanda C. Owings
  - Department of Biology, University of Iowa, Iowa City, IA 52242, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Fabricio González-Andrade
  - Translational Medicine Unit, Central University of Ecuador, Faculty of Medical Sciences, Iquique N14-121 y Sodiro-Itchimbia, Sector El Dorado, 170403 Quito, Ecuador (corresponding author)
- Alessandro Achilli
  - Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, Pavia 27100, Italy (corresponding author)
- John Lindo
  - Department of Anthropology, Emory University, Atlanta, GA 30322, USA (corresponding author)
10
Campelo dos Santos AL, Owings A, Sullasi HSL, Gokcumen O, DeGiorgio M, Lindo J. Genomic evidence for ancient human migration routes along South America's Atlantic coast. Proc Biol Sci 2022; 289:20221078. [PMID: 36322514 PMCID: PMC9629774 DOI: 10.1098/rspb.2022.1078] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
Abstract
An increasing body of archaeological and genomic evidence has hinted at a complex settlement process of the Americas by humans. This is especially true for South America, where unexpected ancestral signals have raised perplexing scenarios for the early migrations into different regions of the continent. Here, we present ancient human genomes from the archaeologically rich Northeast Brazil and compare them to ancient and present-day genomic data. We find a distinct relationship between ancient genomes from Northeast Brazil, Lagoa Santa, Uruguay and Panama, representing evidence for ancient migration routes along South America's Atlantic coast. To further add to the existing complexity, we also detect greater Denisovan than Neanderthal ancestry in ancient Uruguay and Panama individuals. Moreover, we find a strong Australasian signal in an ancient genome from Panama. This work sheds light on the deep demographic history of eastern South America and presents a starting point for future fine-scale investigations on the regional level.
Affiliation(s)
- Andre Luiz Campelo dos Santos
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
  - Department of Archaeology, Federal University of Pernambuco, Recife, Pernambuco 50670-901, Brazil
- Amanda Owings
  - Department of Anthropology, Emory University, Atlanta, GA 30322, USA
- Omer Gokcumen
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14260, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- John Lindo
  - Department of Anthropology, Emory University, Atlanta, GA 30322, USA
11
Schield DR, Perry BW, Adams RH, Holding ML, Nikolakis ZL, Gopalan SS, Smith CF, Parker JM, Meik JM, DeGiorgio M, Mackessy SP, Castoe TA. The roles of balancing selection and recombination in the evolution of rattlesnake venom. Nat Ecol Evol 2022; 6:1367-1380. [PMID: 35851850 PMCID: PMC9888523 DOI: 10.1038/s41559-022-01829-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2021] [Accepted: 06/15/2022] [Indexed: 02/02/2023]
Abstract
The origin of snake venom involved duplication and recruitment of non-venom genes into venom systems. Several studies have predicted that directional positive selection has governed this process. Venom composition varies substantially across snake species and venom phenotypes are locally adapted to prey, leading to coevolutionary interactions between predator and prey. Venom origins and contemporary snake venom evolution may therefore be driven by fundamentally different selection regimes, yet investigations of population-level patterns of selection have been limited. Here, we use whole-genome data from 68 rattlesnakes to test hypotheses about the factors that drive genomic diversity and differentiation in major venom gene regions. We show that selection has resulted in long-term maintenance of genetic diversity within and between species in multiple venom gene families. Our findings are inconsistent with a dominant role of directional positive selection and instead support a role of long-term balancing selection in shaping venom evolution. We also detect rapid decay of linkage disequilibrium due to high recombination rates in venom regions, suggesting that venom genes have reduced selective interference with nearby loci, including other venom paralogues. Our results provide an example of long-term balancing selection that drives trans-species polymorphism and help to explain how snake venom keeps pace with prey resistance.
Affiliation(s)
- Drew R Schield
  - Department of Biology, University of Texas at Arlington, Arlington, TX, USA
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA
- Blair W Perry
  - Department of Biology, University of Texas at Arlington, Arlington, TX, USA
  - School of Biological Sciences, Washington State University, Pullman, WA, USA
- Richard H Adams
  - Department of Biological and Environmental Sciences, Georgia College and State University, Milledgeville, GA, USA
- Cara F Smith
  - School of Biological Sciences, University of Northern Colorado, Greeley, CO, USA
- Joshua M Parker
  - Life Science Department, Fresno City College, Fresno, CA, USA
- Jesse M Meik
  - Department of Biological Sciences, Tarleton State University, Stephenville, TX, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
- Stephen P Mackessy
  - School of Biological Sciences, University of Northern Colorado, Greeley, CO, USA
- Todd A Castoe
  - Department of Biology, University of Texas at Arlington, Arlington, TX, USA
12
Lindo J, De La Rosa R, Santos ALCD, Sans M, DeGiorgio M, Figueiro G. The genomic prehistory of the Indigenous peoples of Uruguay. PNAS Nexus 2022; 1:pgac047. [PMID: 36713318 PMCID: PMC9802099 DOI: 10.1093/pnasnexus/pgac047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/12/2021] [Accepted: 04/14/2022] [Indexed: 02/01/2023]
Abstract
The prehistory of the people of Uruguay is greatly complicated by the dramatic and severe effects of European contact, as with most of the Americas. After the series of military campaigns that exterminated the last remnants of nomadic peoples, Uruguayan official history masked and diluted the former Indigenous ethnic diversity into the narrative of a singular people that all but died out. Here, we present the first whole genome sequences of the Indigenous people of the region before the arrival of Europeans, from an archaeological site in eastern Uruguay that dates from 2,000 years before present. We find a surprising connection to ancient individuals from Panama and eastern Brazil, but not to modern Amazonians. This result may be indicative of a migration route into South America that may have occurred along the Atlantic coast. We also find a distinct ancestry previously undetected in South America. Though this work begins to piece together some of the demographic nuance of the region, the sequencing of ancient individuals from across Uruguay is needed to better understand the ancient prehistory and genetic diversity that existed before European contact, thereby helping to rebuild the history of the Indigenous population of what is now Uruguay.
Affiliation(s)
- John Lindo
- To whom correspondence should be addressed:
| | | | - Andre L C d Santos
- Department of Archeology, Federal University of Pernambuco, Recife, Brazil; Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Mónica Sans
- Departamento de Antropología Biológica, Facultad de Humanidades y Ciencias de la Educación, Universidad de la República, Montevideo, Uruguay
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
13
|
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023] Open
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software. Identifying regions of the genome that contain adaptive variation is of fundamental interest in evolutionary biology, providing insight into an organism’s history and biology. When positive selection is recent or ongoing, we expect to find genomic patterns such as high frequency haplotypes and low genetic diversity in the vicinity of the adaptive locus. Here we develop a statistic to identify these regions based on distortions of the haplotype frequency spectrum from a background distribution. 
We evaluate the performance of this statistic under numerous realistic settings of interest to empiricists and demonstrate its superior performance relative to other haplotype-based selection statistics. We also apply this statistic to real population-genetic data. As a positive control, we explore two well-studied loci, LCT and MHC, in a European and an African human population that show strong evidence for selection. We also apply this statistic to the genomes of an urban brown rat population, where we uncover evidence for adaptation in olfactory perception genes. We release user-friendly software implementing this statistic.
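As a rough illustration of the quantity this abstract refers to, the sketch below computes a haplotype frequency spectrum for a single genomic window: the frequencies of the distinct haplotypes, sorted from most to least common. It is not the authors' software or their likelihood model. The function name `haplotype_frequency_spectrum`, the truncation parameter `k`, and the toy window are assumptions made for this example.

```python
from collections import Counter


def haplotype_frequency_spectrum(haplotypes, k=None):
    """Return haplotype frequencies in one window, most common first.

    `haplotypes` is a list of equal-length strings, one per sampled
    chromosome. `k` is a hypothetical truncation parameter: if given,
    only the k most common haplotype frequencies are returned.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Convert counts to frequencies and sort in decreasing order.
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    return freqs if k is None else freqs[:k]


# Toy window: one haplotype at high frequency, as expected near a sweep.
window = ["0101", "0101", "0101", "0110", "1001", "0101"]
print(haplotype_frequency_spectrum(window))
```

A sweep-like window shows one or a few dominant entries at the front of this list, whereas a neutral window tends to have a flatter spectrum. The method in the abstract compares such spectra against a genome-wide background, which this sketch does not attempt.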
Affiliation(s)
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
- * E-mail: (MD); (ZAS)
| | - Zachary A. Szpiech
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
- * E-mail: (MD); (ZAS)
|
14
|
Harris AM, DeGiorgio M. Admixture and Ancestry Inference from Ancient and Modern Samples through Measures of Population Genetic Drift. Hum Biol 2022. [DOI: 10.1353/hub.2017.0006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
15
|
Cheng X, DeGiorgio M. BalLeRMix+: mixture model approaches for robust joint identification of both positive selection and long-term balancing selection. Bioinformatics 2021; 38:861-863. [PMID: 34664624 PMCID: PMC8756184 DOI: 10.1093/bioinformatics/btab720] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2021] [Revised: 09/13/2021] [Accepted: 10/13/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY The growing availability of genomewide polymorphism data has fueled interest in detecting diverse selective processes affecting population diversity. However, no model-based approaches exist to jointly detect and distinguish the two complementary processes of balancing and positive selection. We extend the BalLeRMix B-statistic framework described in Cheng and DeGiorgio (2020) for detecting balancing selection and present BalLeRMix+, which implements five B statistic extensions based on mixture models to robustly identify both types of selection. BalLeRMix+ is implemented in Python and computes the composite likelihood ratios and associated model parameters for each genomic test position. AVAILABILITY AND IMPLEMENTATION BalLeRMix+ is freely available at https://github.com/bioXiaoheng/BallerMixPlus. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
16
|
Adams RH, Castoe TA, DeGiorgio M. PhyloWGA: chromosome-aware phylogenetic interrogation of whole genome alignments. Bioinformatics 2021; 37:1923-1925. [PMID: 33051672 DOI: 10.1093/bioinformatics/btaa884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2019] [Revised: 09/16/2020] [Accepted: 09/29/2020] [Indexed: 11/13/2022] Open
Abstract
SUMMARY Here, we present PhyloWGA, an open source R package for conducting phylogenetic analysis and investigation of whole genome data. AVAILABILITY AND IMPLEMENTATION Available at GitHub (https://github.com/radamsRHA/PhyloWGA). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Richard H Adams
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Todd A Castoe
- Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
17
|
Mughal MR, DeGiorgio M. Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals. Genetics 2021; 220:6321956. [PMID: 34849832 PMCID: PMC8733448 DOI: 10.1093/genetics/iyab090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Accepted: 05/26/2021] [Indexed: 11/14/2022] Open
Abstract
The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as population structure and introgression events. Inclusion of related or inbred individuals can bias such statistics, which may often lead to the filtering of such individuals. Here, we derive statistical properties of the F- and D-statistics, including their biases due to the inclusion of related or inbred individuals, their variances, and their corresponding mean squared errors. Moreover, for those statistics that are biased, we develop unbiased estimators and evaluate the variances of these new quantities. Comparisons of the new unbiased statistics to the originals demonstrate that our newly derived statistics often have lower error across a wide population parameter space. Furthermore, we apply these unbiased estimators using several global human populations with the inclusion of related individuals to highlight their application on an empirical dataset. Finally, we implement these unbiased estimators in the open-source software package funbiased for easy application by the scientific community.
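For readers unfamiliar with the statistics named in this abstract, the following is a minimal sketch of the standard Patterson D-statistic computed from per-site allele frequencies in four populations. It is the plain plug-in form and does not reproduce the corrected estimators or the funbiased package described above. The function name `d_statistic`, the argument order, and the sign convention are assumptions for illustration; conventions differ between papers and tools.

```python
def d_statistic(p1, p2, p3, p4):
    """Plug-in Patterson D from per-site allele frequencies.

    Each argument is a sequence of allele frequencies (one value per
    site) for one of four populations. The sign convention used here
    is an assumption for illustration.
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations must cover the same sites")
    sites = list(zip(p1, p2, p3, p4))
    # Numerator: covariance-like sum of frequency differences.
    num = sum((a - b) * (c - d) for a, b, c, d in sites)
    # Denominator: normalizing term that bounds D within [-1, 1].
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in sites)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


# Identical frequencies in the first pair give D = 0.
print(d_statistic([0.3, 0.7], [0.3, 0.7], [0.1, 0.9], [0.5, 0.2]))
```

In practice the frequencies are estimated from sampled individuals, which is where relatedness and inbreeding enter and where the abstract's bias analysis applies.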
Affiliation(s)
- Mehreen R Mughal
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
18
|
Vegesna R, Tomaszkiewicz M, Ryder OA, Campos-Sánchez R, Medvedev P, DeGiorgio M, Makova KD. Ampliconic Genes on the Great Ape Y Chromosomes: Rapid Evolution of Copy Number but Conservation of Expression Levels. Genome Biol Evol 2021; 12:842-859. [PMID: 32374870 PMCID: PMC7313670 DOI: 10.1093/gbe/evaa088] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 04/28/2020] [Indexed: 12/16/2022] Open
Abstract
Multicopy ampliconic gene families on the Y chromosome play an important role in spermatogenesis. Thus, studying their genetic variation in endangered great ape species is critical. We estimated the sizes (copy number) of nine Y ampliconic gene families in population samples of chimpanzee, bonobo, and orangutan with droplet digital polymerase chain reaction, combined these estimates with published data for human and gorilla, and produced genome-wide testis gene expression data for great apes. Analyzing this comprehensive data set within an evolutionary framework, we, first, found high inter- and intraspecific variation in gene family size, with larger families exhibiting higher variation as compared with smaller families, a pattern consistent with random genetic drift. Second, for four gene families, we observed significant interspecific size differences, sometimes even between sister species—chimpanzee and bonobo. Third, despite substantial variation in copy number, Y ampliconic gene families’ expression levels did not differ significantly among species, suggesting dosage regulation. Fourth, for three gene families, size was positively correlated with gene expression levels across species, suggesting that, given sufficient evolutionary time, copy number influences gene expression. Our results indicate high variability in size but conservation in gene expression levels in Y ampliconic gene families, significantly advancing our understanding of Y-chromosome evolution in great apes.
Affiliation(s)
- Rahulsimham Vegesna
- Bioinformatics and Genomics Graduate Program, The Huck Institutes for the Life Sciences, Pennsylvania State University, University Park
| | | | - Oliver A Ryder
- Institute for Conservation Research, San Diego Zoo Global, San Diego, California
| | | | - Paul Medvedev
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park; Department of Computer Science and Engineering, Pennsylvania State University, University Park; Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park; Center for Medical Genomics, Pennsylvania State University, University Park
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park; Institute for Computational and Data Science, Pennsylvania State University, University Park
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park; Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park; Center for Medical Genomics, Pennsylvania State University, University Park
|
19
|
Guiblet WM, DeGiorgio M, Cheng X, Chiaromonte F, Eckert KA, Huang YF, Makova KD. Selection and thermostability suggest G-quadruplexes are novel functional elements of the human genome. Genome Res 2021; 31:1136-1149. [PMID: 34187812 PMCID: PMC8256861 DOI: 10.1101/gr.269589.120] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/30/2020] [Accepted: 05/24/2021] [Indexed: 12/11/2022]
Abstract
Approximately 1% of the human genome has the ability to fold into G-quadruplexes (G4s), noncanonical strand-specific DNA structures forming at G-rich motifs. G4s regulate several key cellular processes (e.g., transcription) and have been hypothesized to participate in others (e.g., firing of replication origins). Moreover, G4s differ in their thermostability, and this may affect their function. Yet, G4s may also hinder replication, transcription, and translation and may increase genome instability and mutation rates. Therefore, depending on their genomic location, thermostability, and functionality, G4 loci might evolve under different selective pressures, which has never been investigated. Here we conducted the first genome-wide analysis of G4 distribution, thermostability, and selection. We found an overrepresentation, high thermostability, and purifying selection for G4s within genic components in which they are expected to be functional: promoters, CpG islands, and 5' and 3' UTRs. A similar pattern was observed for G4s within replication origins, enhancers, eQTLs, and TAD boundary regions, strongly suggesting their functionality. In contrast, G4s on the nontranscribed strand of exons were underrepresented, were unstable, and evolved neutrally. In general, G4s on the nontranscribed strand of genic components had lower density and were less stable than those on the transcribed strand, suggesting that the former are avoided at the RNA level. Across the genome, purifying selection was stronger at stable G4s. Our results suggest that purifying selection preserves the sequences of functional G4s, whereas nonfunctional G4s are too costly to be tolerated in the genome. Thus, G4s are emerging as fundamental, functional genomic elements.
Affiliation(s)
- Wilfried M Guiblet
- Bioinformatics and Genomics Graduate Program, Penn State University, University Park, Pennsylvania 16802, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431, USA
| | - Xiaoheng Cheng
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
| | - Francesca Chiaromonte
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
- Sant'Anna School of Advanced Studies, 56127 Pisa, Italy
| | - Kristin A Eckert
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
- Department of Pathology, Penn State University, College of Medicine, Hershey, Pennsylvania 17033, USA
| | - Yi-Fei Huang
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
| | - Kateryna D Makova
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
|
20
|
Abstract
Learning about the roles that duplicate genes play in the origins of novel phenotypes requires an understanding of how their functions evolve. A previous method for achieving this goal, CDROM, employs gene expression distances as proxies for functional divergence and then classifies the evolutionary mechanisms retaining duplicate genes from comparisons of these distances in a decision tree framework. However, CDROM does not account for stochastic shifts in gene expression or leverage advances in contemporary statistical learning for performing classification, nor is it capable of predicting the parameters driving duplicate gene evolution. Thus, here we develop CLOUD, a multi-layer neural network built on a model of gene expression evolution that can both classify duplicate gene retention mechanisms and predict their underlying evolutionary parameters. We show that not only is the CLOUD classifier substantially more powerful and accurate than CDROM, but that it also yields accurate parameter predictions, enabling a better understanding of the specific forces driving the evolution and long-term retention of duplicate genes. Further, application of the CLOUD classifier and predictor to empirical data from Drosophila recapitulates many previous findings about gene duplication in this lineage, showing that new functions often emerge rapidly and asymmetrically in younger duplicate gene copies, and that functional divergence is driven by strong natural selection. Hence, CLOUD represents a major advancement in classifying retention mechanisms and predicting evolutionary parameters of duplicate genes, thereby highlighting the utility of incorporating sophisticated statistical learning techniques to address long-standing questions about evolution after gene duplication.
Affiliation(s)
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431; Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431
| | - Raquel Assis
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431; Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431
|
21
|
Adams RH, Blackmon H, DeGiorgio M. Of Traits and Trees: Probabilistic Distances under Continuous Trait Models for Dissecting the Interplay among Phylogeny, Model, and Data. Syst Biol 2021; 70:660-680. [PMID: 33587145 PMCID: PMC8208806 DOI: 10.1093/sysbio/syab009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2019] [Accepted: 02/01/2021] [Indexed: 12/03/2022] Open
Abstract
Stochastic models of character trait evolution have become a cornerstone of evolutionary biology in an array of contexts. While probabilistic models have been used extensively for statistical inference, they have largely been ignored for the purpose of measuring distances between phylogeny-aware models. Recent contributions to the problem of phylogenetic distance computation have highlighted the importance of explicitly considering evolutionary model parameters and their impacts on molecular sequence data when quantifying dissimilarity between trees. By comparing two phylogenies in terms of their induced probability distributions that are functions of many model parameters, these distances can be more informative than traditional approaches that rely strictly on differences in topology or branch lengths alone. Currently, however, these approaches are designed for comparing models of nucleotide substitution and gene tree distributions, and thus, are unable to address other classes of traits and associated models that may be of interest to evolutionary biologists. Here, we expand the principles of probabilistic phylogenetic distances to compute tree distances under models of continuous trait evolution along a phylogeny. By explicitly considering both the degree of relatedness among species and the evolutionary processes that collectively give rise to character traits, these distances provide a foundation for comparing models and their predictions, and for quantifying the impacts of assuming one phylogenetic background over another while studying the evolution of a particular trait. We demonstrate the properties of these approaches using theory, simulations, and several empirical data sets that highlight potential uses of probabilistic distances in many scenarios. 
We also introduce an open-source R package named PRDATR for easy application by the scientific community for computing phylogenetic distances under models of character trait evolution. [Brownian motion; comparative methods; phylogeny; quantitative traits.]
Affiliation(s)
- Richard H Adams
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Heath Blackmon
- Department of Biology, Texas A&M University, College Station, TX 77843, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
22
|
Abstract
Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, PA; Molecular, Cellular, and Integrative Biosciences, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
|
23
|
Lindo J, DeGiorgio M. Understanding the Adaptive Evolutionary Histories of South American Ancient and Present-Day Populations via Genomics. Genes (Basel) 2021; 12:360. [PMID: 33801556 PMCID: PMC8001801 DOI: 10.3390/genes12030360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2021] [Revised: 02/18/2021] [Accepted: 02/22/2021] [Indexed: 12/03/2022] Open
Abstract
The South American continent is remarkably diverse in its ecological zones, spanning the Amazon rainforest, the high-altitude Andes, and Tierra del Fuego. Yet the original human populations of the continent successfully inhabited all these zones, well before the buffering effects of modern technology. Therefore, it is likely that the various cultures were successful, in part, due to positive natural selection that allowed them to successfully establish populations for thousands of years. Detecting positive selection in these populations is still in its infancy, as the ongoing effects of European contact have decimated many of these populations and introduced gene flow from outside of the continent. In this review, we explore hypotheses of possible human biological adaptation, methods to identify positive selection, the utilization of ancient DNA, and the integration of modern genomes through the identification of genomic tracts that reflect the ancestry of the first populations of the Americas.
Affiliation(s)
- John Lindo
- Department of Anthropology, Emory University, Atlanta, GA 30322, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
24
|
Koch H, DeGiorgio M. Maximum Likelihood Estimation of Species Trees from Gene Trees in the Presence of Ancestral Population Structure. Genome Biol Evol 2020; 12:3977-3995. [PMID: 32022857 PMCID: PMC7061232 DOI: 10.1093/gbe/evaa022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 01/23/2020] [Indexed: 11/12/2022] Open
Abstract
Though large multilocus genomic data sets have led to overall improvements in phylogenetic inference, they have posed the new challenge of addressing conflicting signals across the genome. In particular, ancestral population structure, which has been uncovered in a number of diverse species, can skew gene tree frequencies, thereby hindering the performance of species tree estimators. Here we develop a novel maximum likelihood method, termed TASTI (Taxa with Ancestral structure Species Tree Inference), that can infer phylogenies under such scenarios, and find that it has increasing accuracy with increasing numbers of input gene trees, contrasting with the relatively poor performances of methods not tailored for ancestral structure. Moreover, we propose a supertree approach that allows TASTI to scale computationally with increasing numbers of input taxa. We use genetic simulations to assess TASTI's performance in the three- and four-taxon settings and demonstrate the application of TASTI on a six-species Afrotropical mosquito data set. Finally, we have implemented TASTI in an open-source software package for ease of use by the scientific community.
Affiliation(s)
- Hillary Koch
- Department of Statistics, Pennsylvania State University
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University
|
25
|
Abstract
Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, and therefore most detection approaches only achieve optimal performances when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window sizes and suffer substantial losses in power when windows are large. Here, we employ mixture models to construct a set of five composite likelihood ratio test statistics, which we collectively term B statistics. These statistics are agnostic to window sizes and can operate on diverse forms of input data. Through simulations, we show that they exhibit comparable power to the best-performing current methods, and retain substantially high power regardless of window sizes. They also display considerable robustness to high mutation rates and uneven recombination landscapes, as well as an array of other common confounding scenarios. Moreover, we applied a specific version of the B statistics, termed B2, to a human population-genomic data set and recovered many top candidates from prior studies, including the then-uncharacterized STPG2 and CCDC169-SOHLH2, both of which are related to gamete functions. We further applied B2 on a bonobo population-genomic data set. In addition to the MHC-DQ genes, we uncovered several novel candidate genes, such as KLRD1, involved in viral defense, and SCN9A, associated with pain perception. Finally, we show that our methods can be extended to account for multiallelic balancing selection and integrated the set of statistics into open-source software named BalLeRMix for future applications by the scientific community.
Affiliation(s)
- Xiaoheng Cheng
- Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA
- Department of Biology, Pennsylvania State University, University Park, PA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
|
26
|
Mughal MR, Koch H, Huang J, Chiaromonte F, DeGiorgio M. Learning the properties of adaptive regions with functional data analysis. PLoS Genet 2020; 16:e1008896. [PMID: 32853200 PMCID: PMC7480868 DOI: 10.1371/journal.pgen.1008896] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/13/2019] [Revised: 09/09/2020] [Accepted: 05/29/2020] [Indexed: 12/12/2022] Open
Abstract
Identifying regions of positive selection in genomic data remains a challenge in population genetics. Most current approaches rely on comparing values of summary statistics calculated in windows. We present an approach termed SURFDAWave, which translates measures of genetic diversity calculated in genomic windows to functional data. By transforming our discrete data points to be outputs of continuous functions defined over genomic space, we are able to learn the features of these functions that signify selection. This enables us to confidently identify complex modes of natural selection, including adaptive introgression. We are also able to predict important selection parameters that are responsible for shaping the inferred selection events. By applying our model to human population-genomic data, we recapitulate previously identified regions of selective sweeps, such as OCA2 in Europeans, and predict that its beneficial mutation reached a frequency of 0.02 before it swept 1,802 generations ago, a time when humans were relatively new to Europe. In addition, we identify BNC2 in Europeans as a target of adaptive introgression, and predict that it harbors a beneficial mutation that arose in an archaic human population that split from modern humans within the hypothesized modern human-Neanderthal divergence range.
Affiliation(s)
- Mehreen R. Mughal
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Hillary Koch
- Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Jinguo Huang
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Francesca Chiaromonte
- Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
| |
|
27
|
Abstract
Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term [Formula: see text] and [Formula: see text] to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
- Molecular, Cellular, and Integrative Biosciences at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431
| |
|
28
|
Mei H, Arbeithuber B, Cremona MA, DeGiorgio M, Nekrutenko A. A High-Resolution View of Adaptive Event Dynamics in a Plasmid. Genome Biol Evol 2020; 11:3022-3034. [PMID: 31539047 PMCID: PMC6827461 DOI: 10.1093/gbe/evz197] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Accepted: 09/08/2019] [Indexed: 11/30/2022] Open
Abstract
Coadaptation between bacterial hosts and plasmids frequently results in adaptive changes restricted exclusively to the host genome, leaving plasmids unchanged. To better understand this remarkable stability, we transformed naïve Escherichia coli cells with a plasmid carrying an antibiotic-resistance gene and forced them to adapt in a turbidostat environment. We then drew population samples at regular intervals and subjected them to duplex sequencing, a technique specifically designed for identification of low-frequency mutations. Variants at ten sites implicated in plasmid copy number control emerged almost immediately, tracked consistently across the experiment’s time points, and faded below detectable frequencies toward the end. This variation crash coincided with the emergence of mutations on the host chromosome. Mathematical modeling of trajectories for adaptive changes affecting plasmid copy number showed that such mutations cannot readily fix or even reach appreciable frequencies. We conclude that there is strong selection against alterations of copy number even if they can provide a degree of growth advantage. This incentive is likely rooted in the complex interplay between mutated and wild-type plasmids constrained within a single cell and underscores the importance of understanding intracellular plasmid variability.
Affiliation(s)
- Han Mei
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University
| | | | - Marzia A Cremona
- Department of Statistics, The Pennsylvania State University; Department of Operations and Decision Systems, Université Laval
| | - Michael DeGiorgio
- Department of Biology, The Pennsylvania State University; Department of Statistics, The Pennsylvania State University; Institute for CyberScience, The Pennsylvania State University
| | - Anton Nekrutenko
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University
| |
|
29
|
Abstract
Identifying genomic locations of natural selection from sequence data is an ongoing challenge in population genetics. Current methods utilizing information combined from several summary statistics typically assume no correlation of summary statistics regardless of the genomic location from which they are calculated. However, due to linkage disequilibrium, summary statistics calculated at nearby genomic positions are highly correlated. We introduce an approach termed Trendsetter that accounts for the similarity of statistics calculated from adjacent genomic regions through trend filtering, while reducing the effects of multicollinearity through regularization. Our penalized regression framework has high power to detect sweeps, is capable of classifying sweep regions as either hard or soft, and can be applied to other selection scenarios as well. We find that Trendsetter is robust to both extensive missing data and strong background selection, and has comparable power to similar current approaches. Moreover, the model learned by Trendsetter can be viewed as a set of curves modeling the spatial distribution of summary statistics in the genome. Application to human genomic data revealed positively selected regions previously discovered such as LCT in Europeans and EDAR in East Asians. We also identified a number of novel candidates and show that populations with greater relatedness share more sweep signals.
Affiliation(s)
- Mehreen R Mughal
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
| | - Michael DeGiorgio
- Departments of Biology and Statistics, Pennsylvania State University, University Park, PA
- Institute for CyberScience, Pennsylvania State University, University Park, PA
| |
|
30
|
Abstract
Trans-species polymorphism has been widely used as a key sign of long-term balancing selection across multiple species. However, such sites are often rare in the genome and could result from mutational processes or technical artifacts. Few methods are yet available to specifically detect footprints of trans-species balancing selection without using trans-species polymorphic sites. In this study, we develop summary- and model-based approaches that are each specifically tailored to uncover regions of long-term balancing selection shared by a set of species by using genomic patterns of intraspecific polymorphism and interspecific fixed differences. We demonstrate that our trans-species statistics have substantially higher power than single-species approaches to detect footprints of trans-species balancing selection, and are robust to those that do not affect all tested species. We further apply our model-based methods to human and chimpanzee whole-genome sequencing data. In addition to the previously established major histocompatibility complex and malaria resistance-associated FREM3/GYPE regions, we also find outstanding genomic regions involved in barrier integrity and innate immunity, such as the GRIK1/CLDN17 intergenic region, and the SLC35F1 and ABCA13 genes. Our findings not only echo the significance of pathogen defense but also reveal novel candidates in maintaining balanced polymorphisms across human and chimpanzee lineages. Finally, we show that these trans-species statistics can be applied to and work well for an arbitrary number of species, and integrate them into open-source software packages for ease of use by the scientific community.
Affiliation(s)
- Xiaoheng Cheng
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Department of Biology, Pennsylvania State University, University Park, PA
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, PA
- Department of Statistics, Pennsylvania State University, University Park, PA
- Institute for CyberScience, Pennsylvania State University, University Park, PA
| |
|
31
|
Lindo J, Rogers M, Mallott EK, Petzelt B, Mitchell J, Archer D, Cybulski JS, Malhi RS, DeGiorgio M. Patterns of Genetic Coding Variation in a Native American Population before and after European Contact. Am J Hum Genet 2018; 102:806-815. [PMID: 29706345 PMCID: PMC5986697 DOI: 10.1016/j.ajhg.2018.03.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 11/27/2017] [Accepted: 03/05/2018] [Indexed: 12/11/2022] Open
Abstract
The effects of European colonization on the genomes of Native Americans may have produced excesses of potentially deleterious features, mainly due to the severe reductions in population size and corresponding losses of genetic diversity. This assumption, however, neither considers actual genomic patterns that existed before colonization nor does it adequately capture the effects of admixture. In this study, we analyze the whole-exome sequences of modern and ancient individuals from a Northwest Coast First Nation, with a demographic history similar to other indigenous populations from the Americas. We show that in approximately ten generations from initial European contact, the modern individuals exhibit reduced levels of novel and low-frequency variants, a lower proportion of potentially deleterious alleles, and decreased heterozygosity when compared to their ancestors. This pattern can be explained by a dramatic population decline, resulting in the loss of potentially damaging low-frequency variants, and subsequent admixture. We also find evidence that the indigenous population was on a steady decline in effective population size for several thousand years before contact, which emphasizes regional demography over the common conception of a uniform expansion after entry into the Americas. This study examines the genomic consequences of colonialism on an indigenous group and describes the continuing role of gene flow among modern populations.
Affiliation(s)
- John Lindo
- Department of Anthropology, Emory University, Atlanta, GA 30322, USA
| | - Mary Rogers
- Department of Anthropology, University of Illinois, Urbana, IL 61821, USA
| | - Elizabeth K Mallott
- Department of Anthropology, Northwestern University, Evanston, IL 60208, USA
| | | | | | - David Archer
- Department of Anthropology, Northwest Community College, Prince Rupert, BC V8J 3P6, Canada
| | - Jerome S Cybulski
- Research, Canadian Museum of History, Gatineau, QC K1A 0M8, Canada; Department of Anthropology, University of Western Ontario, London, ON N6A 3K7, Canada; Department of Archaeology, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Ripan S Malhi
- Department of Anthropology, University of Illinois, Urbana, IL 61821, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61820, USA.
| | - Michael DeGiorgio
- Departments of Biology and Statistics, Pennsylvania State University, University Park, PA 16801, USA; Institute for CyberScience, Pennsylvania State University, University Park, PA 16801, USA.
| |
|
32
|
Ye D, Zaidi AA, Tomaszkiewicz M, Anthony K, Liebowitz C, DeGiorgio M, Shriver MD, Makova KD. High Levels of Copy Number Variation of Ampliconic Genes across Major Human Y Haplogroups. Genome Biol Evol 2018; 10:1333-1350. [PMID: 29718380 PMCID: PMC6007357 DOI: 10.1093/gbe/evy086] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Accepted: 04/27/2018] [Indexed: 01/11/2023] Open
Abstract
Because of its highly repetitive nature, the human male-specific Y chromosome remains understudied. It is important to investigate variation on the Y chromosome to understand its evolution and contribution to phenotypic variation, including infertility. Approximately 20% of the human Y chromosome consists of ampliconic regions which include nine multi-copy gene families. These gene families are expressed exclusively in testes and usually implicated in spermatogenesis. Here, to gain a better understanding of the role of the Y chromosome in human evolution and in determining sexually dimorphic traits, we studied ampliconic gene copy number variation in 100 males representing ten major Y haplogroups worldwide. Copy number was estimated with droplet digital PCR. In contrast to the low nucleotide diversity observed on the Y in previous studies, here we show that ampliconic gene copy number diversity is very high. A total of 98 copy-number-based haplotypes were observed among 100 individuals, and haplotypes were sometimes shared by males from very different haplogroups, suggesting homoplasies. The resulting haplotypes did not cluster according to major Y haplogroups. Overall, only two gene families (RBMY and TSPY) showed significant differences in copy number among major Y haplogroups, and the haplogroup of a male could not be predicted based on his ampliconic gene copy numbers. Finally, we did not find significant correlations either between copy number variation and an individual's height, or between the former and facial masculinity/femininity. Our results suggest rapid evolution of ampliconic gene copy numbers on the human Y, and we discuss its causes.
Affiliation(s)
- Danling Ye
- Department of Biology, Pennsylvania State University, University Park
| | - Arslan A Zaidi
- Department of Biology, Pennsylvania State University, University Park
| | | | - Kate Anthony
- Department of Biology, Pennsylvania State University, University Park
| | - Corey Liebowitz
- Department of Anthropology, Pennsylvania State University, University Park
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park
| | - Mark D Shriver
- Department of Anthropology, Pennsylvania State University, University Park
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park
| |
|
33
|
Cheng X, Xu C, DeGiorgio M. Fast and robust detection of ancestral selective sweeps. Mol Ecol 2017; 26:6871-6891. [DOI: 10.1111/mec.14416] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 05/15/2017] [Revised: 10/16/2017] [Accepted: 10/23/2017] [Indexed: 01/01/2023]
Affiliation(s)
- Xiaoheng Cheng
- Huck Institutes of Life Sciences; Pennsylvania State University; University Park PA USA
- Department of Biology; Pennsylvania State University; University Park PA USA
| | - Cheng Xu
- Huck Institutes of Life Sciences; Pennsylvania State University; University Park PA USA
| | - Michael DeGiorgio
- Department of Biology; Pennsylvania State University; University Park PA USA
- Department of Statistics; Pennsylvania State University; University Park PA USA
- Institute for CyberScience; Pennsylvania State University; University Park PA USA
| |
|
34
|
Xu D, Pavlidis P, Taskent RO, Alachiotis N, Flanagan C, DeGiorgio M, Blekhman R, Ruhl S, Gokcumen O. Archaic Hominin Introgression in Africa Contributes to Functional Salivary MUC7 Genetic Variation. Mol Biol Evol 2017; 34:2704-2715. [PMID: 28957509 PMCID: PMC5850612 DOI: 10.1093/molbev/msx206] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 12/14/2022] Open
Abstract
One of the most abundant proteins in human saliva, mucin-7, is encoded by the MUC7 gene, which harbors copy number variable subexonic repeats (PTS-repeats) that affect the size and glycosylation potential of this protein. We recently documented the adaptive evolution of MUC7 subexonic copy number variation among primates. Yet, the evolution of MUC7 genetic variation in humans remained unexplored. Here, we found that PTS-repeat copy number variation has evolved recurrently in the human lineage, thereby generating multiple haplotypic backgrounds carrying five or six PTS-repeat copy number alleles. Contrary to previous studies, we found no associations between the copy number of PTS-repeats and protection against asthma. Instead, we revealed a significant association of MUC7 haplotypic variation with the composition of the oral microbiome. Furthermore, based on in-depth simulations, we conclude that a divergent MUC7 haplotype likely originated in an unknown African hominin population and introgressed into ancestors of modern Africans.
Affiliation(s)
- Duo Xu
- Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY
| | - Pavlos Pavlidis
- Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas, Heraklion, Crete, Greece
| | - Recep Ozgur Taskent
- Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY
| | - Nikolaos Alachiotis
- Institute of Computer Science (ICS), Foundation for Research and Technology - Hellas, Heraklion, Crete, Greece
| | - Colin Flanagan
- Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY
| | - Michael DeGiorgio
- Department of Biology and the Institute for CyberScience, Pennsylvania State University, University Park, PA
| | - Ran Blekhman
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Twin Cities, MN
| | - Stefan Ruhl
- Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, NY
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY
| |
|
35
|
Prada C, Hanna B, Budd AF, Woodley CM, Schmutz J, Grimwood J, Iglesias-Prieto R, Pandolfi JM, Levitan D, Johnson KG, Knowlton N, Kitano H, DeGiorgio M, Medina M. Empty Niches after Extinctions Increase Population Sizes of Modern Corals. Curr Biol 2016; 26:3190-3194. [PMID: 27866895 DOI: 10.1016/j.cub.2016.09.039] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Received: 06/13/2016] [Revised: 09/20/2016] [Accepted: 09/21/2016] [Indexed: 01/21/2023]
Abstract
Large environmental fluctuations often cause mass extinctions, extirpating species and transforming communities [1, 2]. While the effects on community structure are evident in the fossil record, demographic consequences for populations of individual species are harder to evaluate because fossils reveal relative, but not absolute, abundances. However, genomic analyses of living species that have survived a mass extinction event offer the potential for understanding the demographic effects of such environmental fluctuations on extant species. Here, we show how environmental variation since the Pliocene has shaped demographic changes in extant corals of the genus Orbicella, major extant reef builders in the Caribbean that today are endangered. We use genomic approaches to estimate previously unknown current and past population sizes over the last 3 million years. Populations of all three Orbicella declined around 2-1 million years ago, coincident with the extinction of at least 50% of Caribbean coral species. The estimated changes in population size are consistent across the three species despite their ecological differences. Subsequently, two shallow-water specialists expanded their population sizes at least 2-fold, over a time that overlaps with the disappearance of their sister competitor species O. nancyi (the organ-pipe Orbicella). Our study suggests that populations of Orbicella species are capable of rebounding from reductions in population size under suitable conditions and that the effective population size of modern corals provides rich standing genetic variation for corals to adapt to climate change. For conservation genetics, our study suggests the need to evaluate genetic variation under appropriate demographic models.
Affiliation(s)
- Carlos Prada
- Department of Biology, The Pennsylvania State University, 208 Mueller Lab, State College, PA 16802, USA; Smithsonian Tropical Research Institute, Smithsonian Institution, 9100 Panama City PL, Washington, DC 20521, USA.
| | - Bishoy Hanna
- Department of Biology, The Pennsylvania State University, 208 Mueller Lab, State College, PA 16802, USA
| | - Ann F Budd
- Department of Earth and Environmental Sciences, University of Iowa, 115 Trowbridge Hall, Iowa City, IA 52242, USA
| | - Cheryl M Woodley
- CCEHBR, Hollings Marine Laboratory, NCCOS, National Ocean Service, US National Oceanic and Atmospheric Administration, 331 Fort Johnson Road, Charleston, SC 29412, USA
| | - Jeremy Schmutz
- HudsonAlpha Institute of Biotechnology, 601 Genome Way Northwest, Huntsville, AL 35806, USA
| | - Jane Grimwood
- HudsonAlpha Institute of Biotechnology, 601 Genome Way Northwest, Huntsville, AL 35806, USA
| | - Roberto Iglesias-Prieto
- Department of Biology, The Pennsylvania State University, 208 Mueller Lab, State College, PA 16802, USA; Instituto de Ciencias del Mar y Limnología, Universidad Nacional Autónoma de México, Prol. Av. Niños Héroes, Puerto Morelos C.P. 77580, Q. Roo, Cancún, Mexico
| | - John M Pandolfi
- Australian Research Council Centre of Excellence for Coral Reef Studies, The University of Queensland, Brisbane, 4072, Queensland, Australia; School of Biological Sciences, The University of Queensland, Brisbane, 4072, Queensland, Australia
| | - Don Levitan
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
| | - Kenneth G Johnson
- Department of Earth Sciences, Natural History Museum, Cromwell Road, London SW7 5BD, UK
| | - Nancy Knowlton
- Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, 10th and Constitution Avenue, NW Washington, DC 20560-0163, USA
| | - Hiroaki Kitano
- The Systems Biology Institute, Falcon Building 5F, Shirokanedai, Minato, Tokyo 108-0071, Japan
| | - Michael DeGiorgio
- Department of Biology, The Pennsylvania State University, 208 Mueller Lab, State College, PA 16802, USA.
| | - Mónica Medina
- Department of Biology, The Pennsylvania State University, 208 Mueller Lab, State College, PA 16802, USA; Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, 10th and Constitution Avenue, NW Washington, DC 20560-0163, USA; Smithsonian Tropical Research Institute, Smithsonian Institution, 9100 Panama City PL, Washington, DC 20521, USA.
| |
|
36
|
Pagani L, Lawson DJ, Jagoda E, Mörseburg A, Eriksson A, Mitt M, Clemente F, Hudjashov G, DeGiorgio M, Saag L, Wall JD, Cardona A, Mägi R, Wilson Sayres MA, Kaewert S, Inchley C, Scheib CL, Järve M, Karmin M, Jacobs GS, Antao T, Iliescu FM, Kushniarevich A, Ayub Q, Tyler-Smith C, Xue Y, Yunusbayev B, Tambets K, Mallick CB, Saag L, Pocheshkhova E, Andriadze G, Muller C, Westaway MC, Lambert DM, Zoraqi G, Turdikulova S, Dalimova D, Sabitov Z, Sultana GNN, Lachance J, Tishkoff S, Momynaliev K, Isakova J, Damba LD, Gubina M, Nymadawa P, Evseeva I, Atramentova L, Utevska O, Ricaut FX, Brucato N, Sudoyo H, Letellier T, Cox MP, Barashkov NA, Skaro V, Mulahasanovic L, Primorac D, Sahakyan H, Mormina M, Eichstaedt CA, Lichman DV, Abdullah S, Chaubey G, Wee JTS, Mihailov E, Karunas A, Litvinov S, Khusainova R, Ekomasova N, Akhmetova V, Khidiyatova I, Marjanović D, Yepiskoposyan L, Behar DM, Balanovska E, Metspalu A, Derenko M, Malyarchuk B, Voevoda M, Fedorova SA, Osipova LP, Lahr MM, Gerbault P, Leavesley M, Migliano AB, Petraglia M, Balanovsky O, Khusnutdinova EK, Metspalu E, Thomas MG, Manica A, Nielsen R, Villems R, Willerslev E, Kivisild T, Metspalu M. Genomic analyses inform on migration events during the peopling of Eurasia. Nature 2016; 538:238-242. [PMID: 27654910 PMCID: PMC5164938 DOI: 10.1038/nature19792] [Citation(s) in RCA: 219] [Impact Index Per Article: 27.4] [Received: 09/20/2015] [Accepted: 08/24/2016] [Indexed: 12/20/2022]
Affiliation(s)
- Luca Pagani
- Estonian Biocentre, Tartu, Estonia; Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom; Department of Biological, Geological and Environmental Sciences, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
| | - Daniel John Lawson
- Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol BS8 2BN, UK
| | - Evelyn Jagoda
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom; Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Alexander Mörseburg
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
| | - Anders Eriksson
- Integrative Systems Biology Lab, Division of Biological and Environmental Sciences & Engineering, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia; Department of Zoology, University of Cambridge, Cambridge, UK
| | - Mario Mitt
- Estonian Genome Center, University of Tartu, Tartu, Estonia; Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Florian Clemente
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom; Institut de Biologie Computationnelle, Université Montpellier 2, Montpellier, France
| | - Georgi Hudjashov
- Estonian Biocentre, Tartu, Estonia; Department of Psychology, University of Auckland, Auckland, 1142, New Zealand; Statistics and Bioinformatics Group, Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
| | | | - Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, California 94143, USA
| | - Alexia Cardona
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom; MRC Epidemiology Unit, University of Cambridge, Institute of Metabolic Science, Box 285, Addenbrooke's Hospital, Hills Road, Cambridge, CB2 0QQ
| | - Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu, Estonia
| | - Melissa A Wilson Sayres
- School of Life Sciences, Tempe, AZ, 85287 USA; Center for Evolution and Medicine, The Biodesign Institute, Tempe, AZ, 85287 USA
| | - Sarah Kaewert
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
| | - Charlotte Inchley
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
| | - Christiana L Scheib
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
| | | | - Monika Karmin
- Estonian Biocentre, Tartu, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Guy S Jacobs
- Mathematical Sciences, University of Southampton, Southampton SO17 1BJ, UK; Institute for Complex Systems Simulation, University of Southampton, Southampton SO17 1BJ, UK
| | - Tiago Antao
- Division of Biological Sciences, University of Montana, Missoula, MT, USA
| | - Florin Mircea Iliescu
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
| | - Alena Kushniarevich
- Estonian Biocentre, Tartu, Estonia; Institute of Genetics and Cytology, National Academy of Sciences, Minsk, Belarus
| | - Qasim Ayub
- The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom
| | - Chris Tyler-Smith
- The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom
| | - Yali Xue
- The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, United Kingdom
| | - Bayazit Yunusbayev
- Estonian Biocentre, Tartu, Estonia; Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia
| | | | | | - Lehti Saag
- Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | | | - George Andriadze
- Scientific-Research Center of the Caucasian Ethnic Groups, St. Andrews Georgian University, Georgia
| | - Craig Muller
- Center for GeoGenetics, University of Copenhagen, Denmark
| | - Michael C Westaway
- Research Centre for Human Evolution, Environmental Futures Research Institute, Griffith University, Nathan, Australia
| | - David M Lambert
- Research Centre for Human Evolution, Environmental Futures Research Institute, Griffith University, Nathan, Australia
| | - Grigor Zoraqi
- Center of Molecular Diagnosis and Genetic Research, University Hospital of Obstetrics and Gynecology, Tirana, Albania
| | | | - Dilbar Dalimova
- Institute of Bioorganic Chemistry Academy of Science, Republic of Uzbekistan
| | | | - Gazi Nurun Nahar Sultana
- Centre for Advanced Research in Sciences (CARS), DNA Sequencing Research Laboratory, University of Dhaka, Dhaka-1000, Bangladesh
| | - Joseph Lachance
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104-6145, USA; School of Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Sarah Tishkoff
- Departments of Genetics and Biology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | | | - Jainagul Isakova
- Institute of Molecular Biology and Medicine, Bishkek, Kyrgyz Republic
| | - Larisa D Damba
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Marina Gubina
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | | | - Irina Evseeva
- Northern State Medical University, Arkhangelsk, Russia.,Anthony Nolan, London, United Kingdom
| | | | - Olga Utevska
- V. N. Karazin Kharkiv National University, Kharkiv, Ukraine
| | - François-Xavier Ricaut
- Evolutionary Medicine group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, UMR 5288, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, France
| | - Nicolas Brucato
- Evolutionary Medicine group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, UMR 5288, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, France
| | - Herawati Sudoyo
- Genome Diversity and Diseases Laboratory, Eijkman Institute for Molecular Biology, Jakarta, Indonesia
| | - Thierry Letellier
- Evolutionary Medicine group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, UMR 5288, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, France
| | - Murray P Cox
- Statistics and Bioinformatics Group, Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
| | - Nikolay A Barashkov
- Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Yakutsk, Russia.,Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, Yakutsk, Russia
| | - Vedrana Skaro
- Genos, DNA laboratory, Zagreb, Croatia.,University of Osijek, Medical School, Osijek, Croatia
| | | | - Dragan Primorac
- St. Catherine Speciality Hospital, Zabok, Croatia.,Eberly College of Science, The Pennsylvania State University, University Park, PA, USA.,University of Split, Medical School, Split, Croatia.,University of Osijek, Medical School, Osijek, Croatia
| | - Hovhannes Sahakyan
- Estonian Biocentre, Tartu, Estonia.,Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Republic of Armenia, 7 Hasratyan Street, 0014, Yerevan, Armenia
| | - Maru Mormina
- Department of Applied Social Sciences, University of Winchester, Sparkford Road, Winchester SO22 4NR, UK
| | - Christina A Eichstaedt
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom.,Thoraxclinic at the University Hospital Heidelberg, Heidelberg, Germany
| | - Daria V Lichman
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.,Novosibirsk State University, Novosibirsk, Russia
| | | | | | | | | | - Alexandra Karunas
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Sergei Litvinov
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia.,Estonian Biocentre, Tartu, Estonia
| | - Rita Khusainova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Natalya Ekomasova
- Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Vita Akhmetova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia
| | - Irina Khidiyatova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Damir Marjanović
- Department of Genetics and Bioengineering, Faculty of Engineering and Information Technologies, International Burch University, Sarajevo, Bosnia and Herzegovina.,Institute for Anthropological Researches, Zagreb, Croatia
| | - Levon Yepiskoposyan
- Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Republic of Armenia, 7 Hasratyan Street, 0014, Yerevan, Armenia
| | | | - Elena Balanovska
- Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow 115478, Russia
| | - Andres Metspalu
- Department of Zoology, University of Cambridge, Cambridge, UK.,Estonian Genome Center, University of Tartu, Tartu, Estonia
| | - Miroslava Derenko
- Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, Magadan, Russia
| | - Boris Malyarchuk
- Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, Magadan, Russia
| | - Mikhail Voevoda
- Institute of Internal Medicine, Siberian Branch of Russian Academy of Medical Sciences, Novosibirsk, Russia.,Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.,Novosibirsk State University, Novosibirsk, Russia
| | - Sardana A Fedorova
- Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, Yakutsk, Russia.,Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Yakutsk, Russia
| | - Ludmila P Osipova
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia.,Novosibirsk State University, Novosibirsk, Russia
| | - Marta Mirazón Lahr
- Leverhulme Centre for Human Evolutionary Studies, Department of Archaeology and Anthropology, University of Cambridge, Cambridge, United Kingdom
| | - Pascale Gerbault
- Research Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Matthew Leavesley
- Department of Archaeology, University of Papua New Guinea, University PO Box 320, NCD, Papua New Guinea.,College of Arts, Society and Education, James Cook University, PO Box 6811, Cairns QLD 4870, Australia
| | | | - Michael Petraglia
- Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, D-07743 Jena, Germany
| | - Oleg Balanovsky
- Vavilov Institute for General Genetics, Russian Academy of Sciences, Moscow, Russia.,Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow 115478, Russia
| | - Elza K Khusnutdinova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Ene Metspalu
- Estonian Biocentre, Tartu, Estonia.,Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Mark G Thomas
- Research Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Andrea Manica
- Department of Zoology, University of Cambridge, Cambridge, UK
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California Berkeley, Berkeley 94720, CA, USA
| | - Richard Villems
- Estonian Biocentre, Tartu, Estonia.,Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.,Estonian Academy of Sciences, 6 Kohtu Street, Tallinn 10130, Estonia
| | | | - Toomas Kivisild
- Department of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom.,Estonian Biocentre, Tartu, Estonia
| | | |
|
37
|
Fungtammasan A, Tomaszkiewicz M, Campos-Sánchez R, Eckert KA, DeGiorgio M, Makova KD. Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats. Mol Biol Evol 2016; 33:2744-58. [PMID: 27413049 PMCID: PMC5026258 DOI: 10.1093/molbev/msw139] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/11/2022]
Abstract
Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA–DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD.
Affiliation(s)
- Arkarachai Fungtammasan
- Integrative Biosciences, Bioinformatics and Genomics Option, Pennsylvania State University.,Department of Biology, Pennsylvania State University.,Center for Medical Genomics, Pennsylvania State University.,Huck Institute of Genome Sciences, Pennsylvania State University
| | - Marta Tomaszkiewicz
- Department of Biology, Pennsylvania State University.,Center for Medical Genomics, Pennsylvania State University
| | - Rebeca Campos-Sánchez
- Department of Biology, Pennsylvania State University.,Center for Medical Genomics, Pennsylvania State University
| | - Kristin A Eckert
- Center for Medical Genomics, Pennsylvania State University.,Department of Pathology, The Jake Gittlen Laboratories for Cancer Research, The Pennsylvania State University College of Medicine
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University.,Center for Medical Genomics, Pennsylvania State University.,Institute for CyberScience, Pennsylvania State University
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University.,Center for Medical Genomics, Pennsylvania State University.,Huck Institute of Genome Sciences, Pennsylvania State University
| |
|
38
|
DeGiorgio M, Huber CD, Hubisz MJ, Hellmann I, Nielsen R. SweepFinder2: increased sensitivity, robustness and flexibility. Bioinformatics 2016; 32:1895-7. [PMID: 27153702 DOI: 10.1093/bioinformatics/btw051] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9] [Received: 05/28/2015] [Accepted: 01/19/2016] [Indexed: 12/14/2022]
Abstract
SweepFinder is a widely used program that implements a powerful likelihood-based method for detecting recent positive selection, or selective sweeps. Here, we present SweepFinder2, an extension of SweepFinder with increased sensitivity and robustness to the confounding effects of mutation rate variation and background selection. Moreover, SweepFinder2 has increased flexibility that enables the user to specify test sites, set the distance between test sites and utilize a recombination map. Availability and implementation: SweepFinder2 is a freely-available (www.personal.psu.edu/mxd60/sf2.html) software package that is written in C and can be run from a Unix command line. Contact: mxd60@psu.edu.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology and Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
| | - Christian D Huber
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
| | - Melissa J Hubisz
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
| | - Ines Hellmann
- Department Biologie II, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, CA, USA
| |
|
39
|
Huber CD, DeGiorgio M, Hellmann I, Nielsen R. Detecting recent selective sweeps while controlling for mutation rate and background selection. Mol Ecol 2015; 25:142-56. [PMID: 26290347 PMCID: PMC5082542 DOI: 10.1111/mec.13351] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Received: 03/31/2015] [Revised: 07/31/2015] [Accepted: 08/17/2015] [Indexed: 12/19/2022]
Abstract
A composite likelihood ratio test implemented in the program sweepfinder is a commonly used method for scanning a genome for recent selective sweeps. sweepfinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson–Kreitman–Aguadé test, we suggest adding fixed differences relative to an out‐group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.
Affiliation(s)
- Christian D Huber
- Max F. Perutz Laboratory, University of Vienna, Vienna, Austria.,Vienna Graduate School of Population Genetics, University of Veterinary Medicine, Vienna, Austria.,Department of Ecology and Evolutionary Biology, University of California, Los Angeles, 621 Charles E. Young Drive South, Los Angeles, CA, 90095-1606, USA
| | - Michael DeGiorgio
- Departments of Biology and Statistics, Pennsylvania State University, University Park, PA, USA.,Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
| | - Ines Hellmann
- Department Biologie II, Ludwig-Maximilians-Universität München, Großhaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Rasmus Nielsen
- Departments of Integrative Biology and Statistics, University of California, Berkeley, CA, USA
| |
|
40
|
Raghavan M, Steinrücken M, Harris K, Schiffels S, Rasmussen S, DeGiorgio M, Albrechtsen A, Valdiosera C, Ávila-Arcos MC, Malaspinas AS, Eriksson A, Moltke I, Metspalu M, Homburger JR, Wall J, Cornejo OE, Moreno-Mayar JV, Korneliussen TS, Pierre T, Rasmussen M, Campos PF, de Barros Damgaard P, Allentoft ME, Lindo J, Metspalu E, Rodríguez-Varela R, Mansilla J, Henrickson C, Seguin-Orlando A, Malmström H, Stafford T, Shringarpure SS, Moreno-Estrada A, Karmin M, Tambets K, Bergström A, Xue Y, Warmuth V, Friend AD, Singarayer J, Valdes P, Balloux F, Leboreiro I, Vera JL, Rangel-Villalobos H, Pettener D, Luiselli D, Davis LG, Heyer E, Zollikofer CPE, Ponce de León MS, Smith CI, Grimes V, Pike KA, Deal M, Fuller BT, Arriaza B, Standen V, Luz MF, Ricaut F, Guidon N, Osipova L, Voevoda MI, Posukh OL, Balanovsky O, Lavryashina M, Bogunov Y, Khusnutdinova E, Gubina M, Balanovska E, Fedorova S, Litvinov S, Malyarchuk B, Derenko M, Mosher MJ, Archer D, Cybulski J, Petzelt B, Mitchell J, Worl R, Norman PJ, Parham P, Kemp BM, Kivisild T, Tyler-Smith C, Sandhu MS, Crawford M, Villems R, Smith DG, Waters MR, Goebel T, Johnson JR, Malhi RS, Jakobsson M, Meltzer DJ, Manica A, Durbin R, Bustamante CD, Song YS, Nielsen R, Willerslev E. POPULATION GENETICS. Genomic evidence for the Pleistocene and recent population history of Native Americans. Science 2015. [PMID: 26198033 DOI: 10.1126/science.aab3884] [Citation(s) in RCA: 252] [Impact Index Per Article: 28.0] [Indexed: 12/27/2022]
Abstract
How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we found that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (ka) and after no more than an 8000-year isolation period in Beringia. After their arrival to the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 ka, one that is now dispersed across North and South America and the other restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative "Paleoamerican" relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.
Affiliation(s)
- Maanasa Raghavan
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Matthias Steinrücken
- Computer Science Division, University of California, Berkeley, CA 94720, USA.,Department of Statistics, University of California, Berkeley, CA 94720, USA.,Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA 01003, USA
| | - Kelley Harris
- Department of Mathematics, University of California, Berkeley, CA 94720, USA
| | - Stephan Schiffels
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Simon Rasmussen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet, Building 208, 2800 Kongens Lyngby, Denmark
| | - Michael DeGiorgio
- Departments of Biology and Statistics, Pennsylvania State University, 502 Wartik Laboratory, University Park, PA 16802, USA
| | - Anders Albrechtsen
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Cristina Valdiosera
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.,Department of Archaeology and History, La Trobe University, Melbourne, Victoria 3086, Australia
| | - María C Ávila-Arcos
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.,Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA
| | - Anna-Sapfo Malaspinas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Anders Eriksson
- Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK.,Integrative Systems Biology Laboratory, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Ida Moltke
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Mait Metspalu
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia.,Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Julian R Homburger
- Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA
| | - Jeff Wall
- Institute for Human Genetics, University of California San Francisco, 513 Parnassus Avenue, San Francisco, CA 94143, USA
| | - Omar E Cornejo
- School of Biological Sciences, Washington State University, PO Box 644236, Heald 429, Pullman, Washington 99164, USA
| | - J Víctor Moreno-Mayar
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Thorfinn S Korneliussen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Tracey Pierre
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Morten Rasmussen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.,Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA
| | - Paula F Campos
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.,CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Rua dos Bragas 289, 4050-123 Porto, Portugal
| | - Peter de Barros Damgaard
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Morten E Allentoft
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - John Lindo
- Department of Anthropology, University of Illinois at Urbana-Champaign, 607 S. Mathews Ave, Urbana, IL 61801, USA
| | - Ene Metspalu
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia.,Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Ricardo Rodríguez-Varela
- Centro Mixto, Universidad Complutense de Madrid-Instituto de Salud Carlos III de Evolución y Comportamiento Humano, Madrid, Spain
| | - Josefina Mansilla
- Instituto Nacional de Antropología e Historia, Moneda 13, Centro, Cuauhtémoc, 06060 Mexico Mexico City, Mexico
| | - Celeste Henrickson
- University of Utah, Department of Anthropology, 270 S 1400 E, Salt Lake City, Utah 84112, USA
| | - Andaine Seguin-Orlando
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Helena Malmström
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
| | - Thomas Stafford
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.,AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, 8000 Aarhus, Denmark
| | - Suyash S Shringarpure
- Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA
| | - Andrés Moreno-Estrada
- Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA.,Laboratorio Nacional de Genómica para la Biodiversidad (LANGEBIO), CINVESTAV, Irapuato, Guanajuato 36821, Mexico
| | - Monika Karmin
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia.,Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Kristiina Tambets
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia
| | - Anders Bergström
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Yali Xue
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Vera Warmuth
- UCL Genetics Institute, Gower Street, London WC1E 6BT, UK.,Evolutionsbiologiskt Centrum, Norbyvägen 18D, 75236 Uppsala, Sweden
| | - Andrew D Friend
- Department of Geography, University of Cambridge, Downing Place, Cambridge CB2 3EN, UK
| | - Joy Singarayer
- Centre for Past Climate Change and Department of Meteorology, University of Reading, Earley Gate, PO Box 243, Reading, UK
| | - Paul Valdes
- School of Geographical Sciences, University Road, Clifton, Bristol BS8 1SS, UK
| | | | - Ilán Leboreiro
- Instituto Nacional de Antropología e Historia, Moneda 13, Centro, Cuauhtémoc, 06060 Mexico Mexico City, Mexico
| | - Jose Luis Vera
- Escuela Nacional de Antropología e Historia, Periférico Sur y Zapote s/n. Colonia Isidro Fabela, Tlalpan, Isidro Fabela, 14030 Mexico City, Mexico
| | | | - Davide Pettener
- Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Via Selmi 3, 40126 Bologna, Italy
| | - Donata Luiselli
- Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Via Selmi 3, 40126 Bologna, Italy
| | - Loren G Davis
- Department of Anthropology, Oregon State University, 238 Waldo Hall, Corvallis, OR, 97331 USA
| | - Evelyne Heyer
- Museum National d'Histoire Naturelle, CNRS, Université Paris 7 Diderot, Sorbonne Paris Cité, Sorbonne Universités, Unité Eco-Anthropologie et Ethnobiologie (UMR7206), Paris, France
| | - Christoph P E Zollikofer
- Anthropological Institute and Museum, University of Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland
| | - Marcia S Ponce de León
- Anthropological Institute and Museum, University of Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland
| | - Colin I Smith
- Department of Archaeology and History, La Trobe University, Melbourne, Victoria 3086, Australia
| | - Vaughan Grimes
- Department of Archaeology, Memorial University, Queen's College, 210 Prince Philip Drive, St. John's, Newfoundland, A1C 5S7, Canada.,Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, Leipzig 04103, Germany
| | - Kelly-Anne Pike
- Department of Archaeology, Memorial University, Queen's College, 210 Prince Philip Drive, St. John's, Newfoundland, A1C 5S7, Canada
| | - Michael Deal
- Department of Archaeology, Memorial University, Queen's College, 210 Prince Philip Drive, St. John's, Newfoundland, A1C 5S7, Canada
| | - Benjamin T Fuller
- Department of Earth System Science, University of California, Irvine, Keck CCAMS Group, B321 Croul Hall, Irvine, California, 92697, USA
| | - Bernardo Arriaza
- Instituto de Alta Investigación, Universidad de Tarapacá, 18 de Septiembre 2222, Casilla 6-D Arica, Chile
| | - Vivien Standen
- Departamento de Antropologia, Universidad de Tarapacá, 18 de Septiembre 2222. Casilla 6-D Arica, Chile
| | - Maria F Luz
- Fundação Museu do Homem Americano, Centro Cultural Sérgio Motta, Campestre, 64770-000 Sao Raimundo Nonato, Brazil
| | - Francois Ricaut
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse UMR-5288, CNRS, Université de Toulouse, 31073 Toulouse, France
| | - Niede Guidon
- Fundação Museu do Homem Americano, Centro Cultural Sérgio Motta, Campestre, 64770-000 Sao Raimundo Nonato, Brazil
| | - Ludmila Osipova
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090 Novosibirsk, Russia.,Novosibirsk State University, 2 Pirogova Str., 630090 Novosibirsk, Russia
| | - Mikhail I Voevoda
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090 Novosibirsk, Russia.,Institute of Internal Medicine, Siberian Branch of RAS, 175/1 ul. B. Bogatkova, Novosibirsk 630089, Russia.,Novosibirsk State University, Laboratory of Molecular Epidemiology and Bioinformatics, 630090 Novosibirsk, Russia
| | - Olga L Posukh
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090 Novosibirsk, Russia.,Novosibirsk State University, 2 Pirogova Str., 630090 Novosibirsk, Russia
| | - Oleg Balanovsky
- Vavilov Institute of General Genetics, Gubkina 3, 119333 Moscow, Russia.,Research Centre for Medical Genetics, Moskvorechie 1, 115478 Moscow, Russia
| | | | - Yuri Bogunov
- Vavilov Institute of General Genetics, Gubkina 3, 119333 Moscow, Russia
| | - Elza Khusnutdinova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Prospekt Oktyabrya 71, 450054 Ufa, Russia.,Department of Genetics and Fundamental Medicine, Bashkir State University, Zaki Validi 32, 450076 Ufa, Russia
| | - Marina Gubina
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090 Novosibirsk, Russia
| | - Elena Balanovska
- Research Centre for Medical Genetics, Moskvorechie 1, 115478 Moscow, Russia
| | - Sardana Fedorova
- Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Sergelyahskoe Shosse 4, 677010 Yakutsk, Russia.,Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, 677000 Yakutsk, Russia
| | - Sergey Litvinov
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia.,Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, Prospekt Oktyabrya 71, 450054 Ufa, Russia
| | - Boris Malyarchuk
- Institute of Biological Problems of the North, Russian Academy of Sciences, Portovaya Street 18, Magadan 685000, Russia
| | - Miroslava Derenko
- Institute of Biological Problems of the North, Russian Academy of Sciences, Portovaya Street 18, Magadan 685000, Russia
| | - M J Mosher
- Department of Anthropology, Western Washington University, Bellingham Washington 98225, USA
| | - David Archer
- Department of Anthropology, Northwest Community College, 353 Fifth Street, Prince Rupert, British Columbia V8J 3L6, Canada
| | - Jerome Cybulski
- Canadian Museum of History, 100 Rue Laurier, Gatineau, Quebec K1A 0M8, Canada.,University of Western Ontario, London, Ontario N6A 3K7, Canada.,Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
| | - Barbara Petzelt
- Metlakatla Treaty Office, PO Box 224, Prince Rupert, BC, Canada V8J 3P6
| | | | - Rosita Worl
- Sealaska Heritage Institute, 105 S. Seward Street, Juneau, Alaska 99801, USA
| | - Paul J Norman
- Department of Structural Biology, Stanford University School of Medicine, D100 Fairchild Science Building, Stanford, California 94305-5126, USA
| | - Peter Parham
- Department of Structural Biology, Stanford University School of Medicine, D100 Fairchild Science Building, Stanford, California 94305-5126, USA
| | - Brian M Kemp
- School of Biological Sciences, Washington State University, PO Box 644236, Heald 429, Pullman, Washington 99164, USA.,Department of Anthropology, Washington State University, Pullman Washington 99163, USA
| | - Toomas Kivisild
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia.,Division of Biological Anthropology, University of Cambridge, Henry Wellcome Building, Fitzwilliam Street, CB2 1QH, Cambridge, UK
| | - Chris Tyler-Smith
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Manjinder S Sandhu
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.,Dept of Medicine, University of Cambridge, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
| | - Michael Crawford
- Laboratory of Biological Anthropology, University of Kansas, 1415 Jayhawk Blvd., 622 Fraser Hall, Lawrence, Kansas 66045, USA
| | - Richard Villems
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia; Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - David Glenn Smith
- Molecular Anthropology Laboratory, 209 Young Hall, Department of Anthropology, University of California, One Shields Avenue, Davis, California 95616, USA
| | - Michael R Waters
- Center for the Study of the First Americans, Texas A&M University, College Station, Texas 77843-4352, USA; Department of Anthropology, Texas A&M University, College Station, Texas 77843-4352, USA; Department of Geography, Texas A&M University, College Station, Texas 77843-4352, USA
| | - Ted Goebel
- Center for the Study of the First Americans, Texas A&M University, College Station, Texas 77843-4352, USA
| | - John R Johnson
- Santa Barbara Museum of Natural History, 2559 Puesta del Sol, Santa Barbara, CA 93105, USA
| | - Ripan S Malhi
- Department of Anthropology, University of Illinois at Urbana-Champaign, 607 S. Mathews Ave, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, 61801, USA
| | - Mattias Jakobsson
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
| | - David J Meltzer
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark; Department of Anthropology, Southern Methodist University, Dallas, Texas 75275, USA
| | - Andrea Manica
- Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
| | - Richard Durbin
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Carlos D Bustamante
- Department of Genetics, School of Medicine, Stanford University, 300 Pasteur Dr. Lane Bldg Room L331, Stanford, California 94305, USA
| | - Yun S Song
- Computer Science Division, University of California, Berkeley, CA 94720, USA; Department of Statistics, University of California, Berkeley, CA 94720, USA; Department of Integrative Biology, University of California, 3060 Valley Life Sciences Bldg #3140, Berkeley, CA 94720, USA
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, 3060 Valley Life Sciences Bldg #3140, Berkeley, CA 94720, USA
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
|
41
|
Karmin M, Saag L, Vicente M, Wilson Sayres MA, Järve M, Talas UG, Rootsi S, Ilumäe AM, Mägi R, Mitt M, Pagani L, Puurand T, Faltyskova Z, Clemente F, Cardona A, Metspalu E, Sahakyan H, Yunusbayev B, Hudjashov G, DeGiorgio M, Loogväli EL, Eichstaedt C, Eelmets M, Chaubey G, Tambets K, Litvinov S, Mormina M, Xue Y, Ayub Q, Zoraqi G, Korneliussen TS, Akhatova F, Lachance J, Tishkoff S, Momynaliev K, Ricaut FX, Kusuma P, Razafindrazaka H, Pierron D, Cox MP, Sultana GNN, Willerslev R, Muller C, Westaway M, Lambert D, Skaro V, Kovačevic L, Turdikulova S, Dalimova D, Khusainova R, Trofimova N, Akhmetova V, Khidiyatova I, Lichman DV, Isakova J, Pocheshkhova E, Sabitov Z, Barashkov NA, Nymadawa P, Mihailov E, Seng JWT, Evseeva I, Migliano AB, Abdullah S, Andriadze G, Primorac D, Atramentova L, Utevska O, Yepiskoposyan L, Marjanovic D, Kushniarevich A, Behar DM, Gilissen C, Vissers L, Veltman JA, Balanovska E, Derenko M, Malyarchuk B, Metspalu A, Fedorova S, Eriksson A, Manica A, Mendez FL, Karafet TM, Veeramah KR, Bradman N, Hammer MF, Osipova LP, Balanovsky O, Khusnutdinova EK, Johnsen K, Remm M, Thomas MG, Tyler-Smith C, Underhill PA, Willerslev E, Nielsen R, Metspalu M, Villems R, Kivisild T. A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res 2015; 25:459-66. [PMID: 25770088 PMCID: PMC4381518 DOI: 10.1101/gr.186684.114] [Citation(s) in RCA: 231] [Impact Index Per Article: 25.7] [Received: 11/06/2014] [Accepted: 02/13/2015] [Indexed: 11/25/2022]
Abstract
It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50–100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192–307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47–52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.
Affiliation(s)
- Monika Karmin
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Lauri Saag
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Botany, Institute of Ecology and Earth Sciences, University of Tartu, Tartu, 51010, Estonia
| | - Mário Vicente
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Melissa A Wilson Sayres
- Department of Integrative Biology, University of California Berkeley, Berkeley, California 94720, USA; School of Life Sciences and The Biodesign Institute, Tempe, Arizona 85287-5001, USA
| | - Mari Järve
- Estonian Biocentre, Tartu, 51010, Estonia
| | - Ulvi Gerst Talas
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | | | - Anne-Mai Ilumäe
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu, 51010, Estonia
| | - Mario Mitt
- Estonian Genome Center, University of Tartu, Tartu, 51010, Estonia; Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Luca Pagani
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Tarmo Puurand
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Zuzana Faltyskova
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Florian Clemente
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Alexia Cardona
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Ene Metspalu
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Hovhannes Sahakyan
- Estonian Biocentre, Tartu, 51010, Estonia; Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Yerevan, 0014, Armenia
| | - Bayazit Yunusbayev
- Estonian Biocentre, Tartu, 51010, Estonia; Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia
| | - Georgi Hudjashov
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Psychology, University of Auckland, Auckland, 1142, New Zealand
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | | | - Christina Eichstaedt
- Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
| | - Mikk Eelmets
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | | | | | - Sergei Litvinov
- Estonian Biocentre, Tartu, 51010, Estonia; Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia
| | - Maru Mormina
- Department of Applied Social Sciences, University of Winchester, Winchester, SO22 4NR, United Kingdom
| | - Yali Xue
- The Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Qasim Ayub
- The Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Grigor Zoraqi
- Center of Molecular Diagnosis and Genetic Research, University Hospital of Obstetrics and Gynecology, Tirana, ALB1005, Albania
| | - Thorfinn Sand Korneliussen
- Department of Integrative Biology, University of California Berkeley, Berkeley, California 94720, USA; Center for GeoGenetics, University of Copenhagen, Copenhagen, DK-1350, Denmark
| | - Farida Akhatova
- Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, 450074, Russia; Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, 420008, Russia
| | - Joseph Lachance
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6145, USA; School of Biology, Georgia Institute of Technology, Atlanta, 30332, Georgia, USA
| | - Sarah Tishkoff
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6145, USA; Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6313, USA
| | | | - François-Xavier Ricaut
- Evolutionary Medicine Group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, 31073, France
| | - Pradiptajati Kusuma
- Evolutionary Medicine Group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, 31073, France; Eijkman Institute for Molecular Biology, Jakarta, 10430, Indonesia
| | - Harilanto Razafindrazaka
- Evolutionary Medicine Group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, 31073, France
| | - Denis Pierron
- Evolutionary Medicine Group, Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, 31073, France
| | - Murray P Cox
- Statistics and Bioinformatics Group, Institute of Fundamental Sciences, Massey University, Palmerston North, 4442, New Zealand
| | - Gazi Nurun Nahar Sultana
- Centre for Advanced Research in Sciences (CARS), DNA Sequencing Research Laboratory, University of Dhaka, Dhaka, Dhaka-1000, Bangladesh
| | - Rane Willerslev
- Arctic Research Centre, Aarhus University, Aarhus, DK-8000, Denmark
| | - Craig Muller
- Center for GeoGenetics, University of Copenhagen, Copenhagen, DK-1350, Denmark
| | - Michael Westaway
- Environmental Futures Research Institute, Griffith University, Nathan, 4111, Australia
| | - David Lambert
- Environmental Futures Research Institute, Griffith University, Nathan, 4111, Australia
| | - Vedrana Skaro
- Genos, DNA Laboratory, Zagreb, 10000, Croatia; University of Osijek, Medical School, Osijek, 31000, Croatia
| | | | - Shahlo Turdikulova
- Institute of Bioorganic Chemistry, Academy of Science, Tashkent, 100143, Uzbekistan
| | - Dilbar Dalimova
- Institute of Bioorganic Chemistry, Academy of Science, Tashkent, 100143, Uzbekistan
| | - Rita Khusainova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia; Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, 450074, Russia
| | - Natalya Trofimova
- Estonian Biocentre, Tartu, 51010, Estonia; Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia
| | - Vita Akhmetova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia
| | - Irina Khidiyatova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia; Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, 450074, Russia
| | - Daria V Lichman
- Institute of Cytology and Genetics, Novosibirsk, 630090, Russia
| | - Jainagul Isakova
- Institute of Molecular Biology and Medicine, Bishkek, 720040, Kyrgyzstan
| | | | - Zhaxylyk Sabitov
- L.N. Gumilyov Eurasian National University, Astana, 010008, Kazakhstan; Center for Life Sciences, Nazarbayev University, Astana, 010000, Kazakhstan
| | - Nikolay A Barashkov
- Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Yakutsk, 677010, Russia; Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, Yakutsk, 677000, Russia
| | | | - Evelin Mihailov
- Estonian Genome Center, University of Tartu, Tartu, 51010, Estonia
| | | | - Irina Evseeva
- Northern State Medical University, Arkhangelsk, 163000, Russia; Anthony Nolan, London, NW3 2NU, United Kingdom
| | | | | | - George Andriadze
- Scientific-Research Center of the Caucasian Ethnic Groups, St. Andrews Georgian University, Tbilisi, 0162, Georgia
| | - Dragan Primorac
- University of Osijek, Medical School, Osijek, 31000, Croatia; St. Catherine Specialty Hospital, Zabok, 49210, Croatia; Eberly College of Science, Pennsylvania State University, University Park, Pennsylvania 16802, USA; University of Split, Medical School, Split, 21000, Croatia
| | | | - Olga Utevska
- V.N. Karazin Kharkiv National University, Kharkiv, 61022, Ukraine
| | - Levon Yepiskoposyan
- Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Yerevan, 0014, Armenia
| | - Damir Marjanovic
- Genos, DNA Laboratory, Zagreb, 10000, Croatia; Department of Genetics and Bioengineering, Faculty of Engineering and Information Technologies, International Burch University, Sarajevo, 71000, Bosnia and Herzegovina
| | - Alena Kushniarevich
- Estonian Biocentre, Tartu, 51010, Estonia; Institute of Genetics and Cytology, National Academy of Sciences, Minsk, 220072, Belarus
| | | | - Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Lisenka Vissers
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Joris A Veltman
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, 6525 GA, The Netherlands
| | - Elena Balanovska
- Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow, 115478, Russia
| | - Miroslava Derenko
- Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, Magadan, 685000, Russia
| | - Boris Malyarchuk
- Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, Magadan, 685000, Russia
| | - Andres Metspalu
- Estonian Genome Center, University of Tartu, Tartu, 51010, Estonia
| | - Sardana Fedorova
- Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Yakutsk, 677010, Russia; Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, Yakutsk, 677000, Russia
| | - Anders Eriksson
- Department of Zoology, University of Cambridge, Cambridge, CB2 3EJ, United Kingdom; Integrative Systems Biology Lab, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
| | - Andrea Manica
- Department of Zoology, University of Cambridge, Cambridge, CB2 3EJ, United Kingdom
| | - Fernando L Mendez
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305-5120, USA
| | - Tatiana M Karafet
- ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA
| | - Krishna R Veeramah
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794-5245, USA
| | - Neil Bradman
- The Henry Stewart Group, London, WC1A 2HN, United Kingdom
| | - Michael F Hammer
- ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA
| | | | - Oleg Balanovsky
- Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow, 115478, Russia; Vavilov Institute for General Genetics, Russian Academy of Sciences, Moscow, 119991, Russia
| | - Elza K Khusnutdinova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, 450054, Russia; Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, 450074, Russia
| | - Knut Johnsen
- University Hospital of North Norway, Tromsøe, N-9038, Norway
| | - Maido Remm
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Mark G Thomas
- Research Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, United Kingdom
| | - Chris Tyler-Smith
- The Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Peter A Underhill
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305-5120, USA
| | - Eske Willerslev
- Center for GeoGenetics, University of Copenhagen, Copenhagen, DK-1350, Denmark
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California Berkeley, Berkeley, California 94720, USA
| | - Mait Metspalu
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
| | - Richard Villems
- Estonian Biocentre, Tartu, 51010, Estonia; Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia; Estonian Academy of Sciences, Tallinn, 10130, Estonia
| | - Toomas Kivisild
- Estonian Biocentre, Tartu, 51010, Estonia; Division of Biological Anthropology, University of Cambridge, Cambridge, CB2 1QH, United Kingdom
|
42
|
Clemente F, Cardona A, Inchley C, Peter B, Jacobs G, Pagani L, Lawson D, Antão T, Vicente M, Mitt M, DeGiorgio M, Faltyskova Z, Xue Y, Ayub Q, Szpak M, Mägi R, Eriksson A, Manica A, Raghavan M, Rasmussen M, Rasmussen S, Willerslev E, Vidal-Puig A, Tyler-Smith C, Villems R, Nielsen R, Metspalu M, Malyarchuk B, Derenko M, Kivisild T. A Selective Sweep on a Deleterious Mutation in CPT1A in Arctic Populations. Am J Hum Genet 2014; 95:584-589. [PMID: 25449608 DOI: 10.1016/j.ajhg.2014.09.016] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 08/01/2014] [Accepted: 09/29/2014] [Indexed: 10/24/2022]
Abstract
Arctic populations live in an environment characterized by extreme cold and the absence of plant foods for much of the year and are likely to have undergone genetic adaptations to these environmental conditions in the time they have been living there. Genome-wide selection scans based on genotype data from native Siberians have previously highlighted a 3 Mb chromosome 11 region containing 79 protein-coding genes as the strongest candidates for positive selection in Northeast Siberians. However, it was not possible to determine which of the genes might be driving the selection signal. Here, using whole-genome high-coverage sequence data, we identified the most likely causative variant as a nonsynonymous G>A transition (rs80356779; c.1436C>T [p.Pro479Leu] on the reverse strand) in CPT1A, a key regulator of mitochondrial long-chain fatty-acid oxidation. Remarkably, the derived allele is associated with hypoketotic hypoglycemia and high infant mortality yet occurs at high frequency in Canadian and Greenland Inuits and was also found at 68% frequency in our Northeast Siberian sample. We provide evidence of one of the strongest selective sweeps reported in humans; this sweep has driven this variant to high frequency in circum-Arctic populations within the last 6-23 ka despite associated deleterious consequences, possibly as a result of the selective advantage it originally provided to either a high-fat diet or a cold environment.
|
43
|
Malaspinas AS, Lao O, Schroeder H, Rasmussen M, Raghavan M, Moltke I, Campos PF, Sagredo FS, Rasmussen S, Gonçalves VF, Albrechtsen A, Allentoft ME, Johnson PLF, Li M, Reis S, Bernardo DV, DeGiorgio M, Duggan AT, Bastos M, Wang Y, Stenderup J, Moreno-Mayar JV, Brunak S, Sicheritz-Ponten T, Hodges E, Hannon GJ, Orlando L, Price TD, Jensen JD, Nielsen R, Heinemeier J, Olsen J, Rodrigues-Carvalho C, Lahr MM, Neves WA, Kayser M, Higham T, Stoneking M, Pena SDJ, Willerslev E. Two ancient human genomes reveal Polynesian ancestry among the indigenous Botocudos of Brazil. Curr Biol 2014; 24:R1035-7. [PMID: 25455029 PMCID: PMC4370112 DOI: 10.1016/j.cub.2014.09.078] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Indexed: 12/01/2022]
Abstract
Understanding the peopling of the Americas remains an important and challenging question. Here, we present 14C dates, and morphological, isotopic and genomic sequence data from two human skulls from the state of Minas Gerais, Brazil, part of one of the indigenous groups known as 'Botocudos'. We find that their genomic ancestry is Polynesian, with no detectable Native American component. Radiocarbon analysis of the skulls shows that the individuals had died prior to the beginning of the 19th century. Our findings could either represent genomic evidence of Polynesians reaching South America during their Pacific expansion, or European-mediated transport.
Affiliation(s)
- Anna-Sapfo Malaspinas
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Oscar Lao
- Department of Forensic Molecular Biology, Erasmus MC University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, Netherlands
| | - Hannes Schroeder
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark; Faculty of Archaeology, Leiden University, PO Box 9515, 2300 Leiden, The Netherlands
| | - Morten Rasmussen
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark; Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Maanasa Raghavan
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Ida Moltke
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA; The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløesvej 5, Copenhagen 2200, Denmark
| | - Paula F Campos
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Francisca Santana Sagredo
- Oxford Radiocarbon Accelerator Unit, Research Laboratory for Archaeology and the History of Art, South Parks Road, Dyson Perrins Building, Oxford University, OX1 3QY, UK
| | - Simon Rasmussen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet 208, Kgs. Lyngby, DK-2800, Denmark
| | - Vanessa F Gonçalves
- Centre for Addiction and Mental Health, Toronto, Canada, Department of Psychiatry, University of Toronto, Toronto, Canada
| | - Anders Albrechtsen
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløesvej 5, Copenhagen 2200, Denmark
| | - Morten E Allentoft
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Philip L F Johnson
- Department of Biology, Emory University, 1510 Clifton Rd NE, Rm 2006, Atlanta, GA 30322
| | - Mingkun Li
- Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Deutscher Platz 6, D-04103 Leipzig, Germany
| | - Silvia Reis
- Setor de Antropologia Biológica, Museu Nacional, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Danilo V Bernardo
- Instituto de Ciências Humanas e da Informação - ICHI, Universidade Federal do Rio Grande, Rio Grande, RS, Brazil
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, 502 Wartik Laboratory, University Park, Pennsylvania 16802, USA
| | - Ana T Duggan
- Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Deutscher Platz 6, D-04103 Leipzig, Germany
| | - Murilo Bastos
- Setor de Antropologia Biológica, Museu Nacional, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Yong Wang
- Centre for Theoretical Evolutionary Genomics, Departments of Integrative Biology and Statistics, University of California, Berkeley, CA 94720-3140; Ancestry.com DNA LLC, San Francisco, CA 94107, USA
| | - Jesper Stenderup
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - J Victor Moreno-Mayar
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Søren Brunak
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet 208, Kgs. Lyngby, DK-2800, Denmark
| | - Thomas Sicheritz-Ponten
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet 208, Kgs. Lyngby, DK-2800, Denmark
| | - Emily Hodges
- Watson School of Biological Sciences, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA, Howard Hughes Medical Institute
| | - Gregory J Hannon
- Watson School of Biological Sciences, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA, Howard Hughes Medical Institute
| | - Ludovic Orlando
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - T Douglas Price
- Department of Anthropology 5240 W.H. Sewell Social Science Building 1180 Observatory Dr. University of Wisconsin Madison, WI 53706, USA
| | - Jeffrey D Jensen
- Ecole Polytechnique Fédérale de Lausanne (EPFL), School of Life Sciences, Station 15, CH-1015 Lausanne, Switzerland
| | - Rasmus Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark; Centre for Theoretical Evolutionary Genomics, Departments of Integrative Biology and Statistics, University of California, Berkeley, CA 94720-3140
| | - Jan Heinemeier
- AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, DK-8000 Aarhus C, Denmark
| | - Jesper Olsen
- AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, DK-8000 Aarhus C, Denmark
| | - Claudia Rodrigues-Carvalho
- Setor de Antropologia Biológica, Museu Nacional, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Marta Mirazón Lahr
- LCHES, Department of Archaeology and Anthropology, University of Cambridge, Fitzwilliam St, Cambridge CB2 1QH, UK
| | - Walter A Neves
- Laboratory for Human Evolutionary Studies, Department of Genetics and Evolutionary Biology, Institute of Bioscience, University of São Paulo, Brazil
| | - Manfred Kayser
- Department of Forensic Molecular Biology, Erasmus MC University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, Netherlands
| | - Thomas Higham
- Oxford Radiocarbon Accelerator Unit, Research Laboratory for Archaeology and the History of Art, South Parks Road, Dyson Perrins Building, Oxford University, OX1 3QY, UK
| | - Mark Stoneking
- Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Deutscher Platz 6, D-04103 Leipzig, Germany.
| | - Sergio D J Pena
- Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil.
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark.
|
44
|
Raghavan M, DeGiorgio M, Albrechtsen A, Moltke I, Skoglund P, Korneliussen TS, Grønnow B, Appelt M, Gulløv HC, Friesen TM, Fitzhugh W, Malmström H, Rasmussen S, Olsen J, Melchior L, Fuller BT, Fahrni SM, Stafford T, Grimes V, Renouf MAP, Cybulski J, Lynnerup N, Lahr MM, Britton K, Knecht R, Arneborg J, Metspalu M, Cornejo OE, Malaspinas AS, Wang Y, Rasmussen M, Raghavan V, Hansen TVO, Khusnutdinova E, Pierre T, Dneprovsky K, Andreasen C, Lange H, Hayes MG, Coltrain J, Spitsyn VA, Götherström A, Orlando L, Kivisild T, Villems R, Crawford MH, Nielsen FC, Dissing J, Heinemeier J, Meldgaard M, Bustamante C, O'Rourke DH, Jakobsson M, Gilbert MTP, Nielsen R, Willerslev E. The genetic prehistory of the New World Arctic. Science 2014; 345:1255832. [PMID: 25170159 DOI: 10.1126/science.1255832] [Citation(s) in RCA: 228] [Impact Index Per Article: 22.8] [Indexed: 12/16/2022]
Abstract
The New World Arctic, the last region of the Americas to be populated by humans, has a relatively well-researched archaeology, but an understanding of its genetic history is lacking. We present genome-wide sequence data from ancient and present-day humans from Greenland, Arctic Canada, Alaska, Aleutian Islands, and Siberia. We show that Paleo-Eskimos (~3000 BCE to 1300 CE) represent a migration pulse into the Americas independent of both Native American and Inuit expansions. Furthermore, the genetic continuity characterizing the Paleo-Eskimo period was interrupted by the arrival of a new population, representing the ancestors of present-day Inuit, with evidence of past gene flow between these lineages. Despite periodic abandonment of major Arctic regions, a single Paleo-Eskimo metapopulation likely survived in near-isolation for more than 4000 years, only to vanish around 700 years ago.
Affiliation(s)
- Maanasa Raghavan
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Michael DeGiorgio
- Department of Biology, Pennsylvania State University, 502 Wartik Laboratory, University Park, PA 16802, USA
| | - Anders Albrechtsen
- Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark
| | - Ida Moltke
- Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, 2200 Copenhagen, Denmark. Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Pontus Skoglund
- Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, 75236 Uppsala, Sweden. Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Thorfinn S Korneliussen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Bjarne Grønnow
- Arctic Centre at the Ethnographic Collections (SILA), National Museum of Denmark, Frederiksholms Kanal 12, 1220 Copenhagen, Denmark
| | - Martin Appelt
- Arctic Centre at the Ethnographic Collections (SILA), National Museum of Denmark, Frederiksholms Kanal 12, 1220 Copenhagen, Denmark
| | - Hans Christian Gulløv
- Arctic Centre at the Ethnographic Collections (SILA), National Museum of Denmark, Frederiksholms Kanal 12, 1220 Copenhagen, Denmark
| | - T Max Friesen
- Department of Anthropology, University of Toronto, Toronto, Ontario M5S 2S2, Canada
| | - William Fitzhugh
- Arctic Studies Center, Post Office Box 37012, Department of Anthropology, MRC 112, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012, USA
| | - Helena Malmström
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark. Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, 75236 Uppsala, Sweden
| | - Simon Rasmussen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet, 2800 Kongens Lyngby, Denmark
| | - Jesper Olsen
- AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, 8000 Aarhus C, Denmark
| | - Linea Melchior
- Anthropological Laboratory, Institute of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Frederik V's Vej 11, 2100 Copenhagen, Denmark
| | - Benjamin T Fuller
- Department of Earth System Science, University of California, Irvine, CA 92697, USA
| | - Simon M Fahrni
- Department of Earth System Science, University of California, Irvine, CA 92697, USA
| | - Thomas Stafford
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark. AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, 8000 Aarhus C, Denmark
| | - Vaughan Grimes
- Department of Archaeology, Memorial University, Queen's College, 210 Prince Philip Drive, St. John's, Newfoundland, A1C 5S7, Canada. Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
| | - M A Priscilla Renouf
- Department of Archaeology, Memorial University, Queen's College, 210 Prince Philip Drive, St. John's, Newfoundland, A1C 5S7, Canada
| | - Jerome Cybulski
- Canadian Museum of History, 100 Rue Laurier, Gatineau, Quebec K1A 0M8, Canada. Department of Anthropology, University of Western Ontario, 1151 Richmond Street North, London N6A 5C2, Canada
| | - Niels Lynnerup
- Anthropological Laboratory, Institute of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Frederik V's Vej 11, 2100 Copenhagen, Denmark
| | - Marta Mirazon Lahr
- Leverhulme Centre for Human Evolutionary Studies, Department of Archaeology and Anthropology, University of Cambridge, Cambridge CB2 1QH, UK
| | - Kate Britton
- Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany. Department of Archaeology, University of Aberdeen, St. Mary's Building, Elphinstone Road, Aberdeen AB24 3UF, Scotland, UK
| | - Rick Knecht
- Department of Archaeology, University of Aberdeen, St. Mary's Building, Elphinstone Road, Aberdeen AB24 3UF, Scotland, UK
| | - Jette Arneborg
- National Museum of Denmark, Frederiksholms kanal 12, 1220 Copenhagen, Denmark. School of Geosciences, University of Edinburgh, Edinburgh EH8 9XP, UK
| | - Mait Metspalu
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia. Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Omar E Cornejo
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305, USA. School of Biological Sciences, Washington State University, Post Office Box 644236, Pullman, WA 99164, USA
| | - Anna-Sapfo Malaspinas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Yong Wang
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA. Ancestry.com DNA LLC, San Francisco, CA 94107, USA
| | - Morten Rasmussen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Vibha Raghavan
- Informatics and Bio-computing, Ontario Institute for Cancer Research, 661 University Avenue, Suite 510, Toronto, Ontario, M5G 0A3, Canada
| | - Thomas V O Hansen
- Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark
| | - Elza Khusnutdinova
- Institute of Biochemistry and Genetics, Ufa Scientific Center of Russian Academy of Sciences, Ufa, Russia. Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Bashkortostan 450074, Russia
| | - Tracey Pierre
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Kirill Dneprovsky
- State Museum for Oriental Art, 12a, Nikitsky Boulevard, Moscow 119019, Russia
| | - Claus Andreasen
- Greenland National Museum and Archives, Post Office Box 145, 3900 Nuuk, Greenland
| | - Hans Lange
- Greenland National Museum and Archives, Post Office Box 145, 3900 Nuuk, Greenland
| | - M Geoffrey Hayes
- Division of Endocrinology, Metabolism and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA. Department of Anthropology, Weinberg College of Arts and Sciences, Northwestern University, Evanston, IL 60208, USA. Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - Joan Coltrain
- Department of Anthropology, University of Utah, Salt Lake City, UT 84112, USA
| | - Victor A Spitsyn
- Research Centre for Medical Genetics of Russian Academy of Medical Sciences, 1 Moskvorechie, Moscow 115478, Russia
| | - Anders Götherström
- Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden
| | - Ludovic Orlando
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Toomas Kivisild
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia. Department of Archaeology and Anthropology, University of Cambridge, Cambridge CB2 1QH, UK
| | - Richard Villems
- Estonian Biocentre, Evolutionary Biology Group, Tartu 51010, Estonia. Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Michael H Crawford
- Laboratory of Biological Anthropology, University of Kansas, Lawrence, KS 66045, USA
| | - Finn C Nielsen
- Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark
| | - Jørgen Dissing
- Anthropological Laboratory, Institute of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Frederik V's Vej 11, 2100 Copenhagen, Denmark
| | - Jan Heinemeier
- AMS 14C Dating Centre, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, 8000 Aarhus C, Denmark
| | - Morten Meldgaard
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Carlos Bustamante
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305, USA
| | - Dennis H O'Rourke
- Department of Anthropology, University of Utah, Salt Lake City, UT 84112, USA
| | - Mattias Jakobsson
- Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, 75236 Uppsala, Sweden
| | - M Thomas P Gilbert
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.
|
45
|
DeGiorgio M, Lohmueller KE, Nielsen R. A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genet 2014; 10:e1004561. [PMID: 25144706 PMCID: PMC4140648 DOI: 10.1371/journal.pgen.1004561] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 05/31/2013] [Accepted: 06/26/2014] [Indexed: 01/19/2023]
Abstract
While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates.
In the past, balancing selection was a topic of great theoretical interest that received much attention. However, there has been little focus toward developing methods to identify regions of the genome that are under balancing selection. In this article, we present the first set of likelihood-based methods that explicitly model the spatial distribution of polymorphism expected near a site under long-term balancing selection. Simulation results show that our methods outperform commonly-used summary statistics for identifying regions under balancing selection. Finally, we performed a scan for balancing selection in Africans and Europeans using our new methods and identified a gene called FANK1 as our top candidate outside the HLA region. We hypothesize that the maintenance of polymorphism at FANK1 is the result of segregation distortion.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- * E-mail:
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, United States of America
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
46
|
Malaspinas AS, Tange O, Moreno-Mayar JV, Rasmussen M, DeGiorgio M, Wang Y, Valdiosera CE, Politis G, Willerslev E, Nielsen R. bammds: a tool for assessing the ancestry of low-depth whole-genome data using multidimensional scaling (MDS). Bioinformatics 2014; 30:2962-4. [PMID: 24974206 PMCID: PMC4184259 DOI: 10.1093/bioinformatics/btu410] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 11/17/2022]
Abstract
Summary: We present bammds, a practical tool that allows visualization of samples sequenced by second-generation sequencing when compared with a reference panel of individuals (usually genotypes) using a multidimensional scaling algorithm. Our tool is aimed at determining the ancestry of unknown samples, typical of ancient DNA data, particularly when only low amounts of data are available for those samples.
Availability and implementation: The software package is available under GNU General Public License v3 and is freely available together with test datasets at https://savannah.nongnu.org/projects/bammds/. It uses R (http://www.r-project.org/), parallel (http://www.gnu.org/software/parallel/), and samtools (https://github.com/samtools/samtools).
Contact: bammds-users@nongnu.org
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anna-Sapfo Malaspinas, Ole Tange, José Víctor Moreno-Mayar, Morten Rasmussen, Michael DeGiorgio, Yong Wang, Cristina E Valdiosera, Gustavo Politis, Eske Willerslev, Rasmus Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen K, Denmark; Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305; Department of Biology, Pennsylvania State University, Wartik Laboratory, University Park, PA 16802; Centre for Theoretical Evolutionary Genomics, Departments of Integrative Biology and Statistics, University of California, Berkeley, CA 94720-3140; Ancestry.com DNA LLC, San Francisco, CA 94107; Department of Archaeology, Environment and Community Planning, Faculty of Humanities and Social Sciences, La Trobe University, Melbourne, VIC 3086, Australia; INCUAPA-CONICET, Universidad del Centro de la Provincia de Buenos Aires, 7600 Olavarría, Argentina; and Facultad de Ciencias Naturales y Museo de La Plata, 1900 La Plata, Argentina
|
47
|
DeGiorgio M, Syring J, Eckert AJ, Liston A, Cronn R, Neale DB, Rosenberg NA. An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines. BMC Evol Biol 2014; 14:67. [PMID: 24678701 PMCID: PMC4021425 DOI: 10.1186/1471-2148-14-67] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 08/14/2013] [Accepted: 02/10/2014] [Indexed: 12/26/2022]
Abstract
Background: As it becomes increasingly possible to obtain DNA sequences of orthologous genes from diverse sets of taxa, species trees are frequently being inferred from multilocus data. However, the behavior of many methods for performing this inference has remained largely unexplored. Some methods have been proven to be consistent given certain evolutionary models, whereas others rely on criteria that, although appropriate for many parameter values, have peculiar zones of the parameter space in which they fail to converge on the correct estimate as data sets increase in size.
Results: Here, using North American pines, we empirically evaluate the behavior of 24 strategies for species tree inference using three alternative outgroups (72 strategies total). The data consist of 120 individuals sampled in eight ingroup species from subsection Strobus and three outgroup species from subsection Gerardianae, spanning ∼47 kilobases of sequence at 121 loci. Each “strategy” for inferring species trees consists of three features: a species tree construction method, a gene tree inference method, and a choice of outgroup. We use multivariate analysis techniques such as principal components analysis and hierarchical clustering to identify tree characteristics that are robustly observed across strategies, as well as to identify groups of strategies that produce trees with similar features. We find that strategies that construct species trees using only topological information cluster together and that strategies that use additional non-topological information (e.g., branch lengths) also cluster together. Strategies that utilize more than one individual within a species to infer gene trees tend to produce estimates of species trees that contain clades present in trees estimated by other strategies. Strategies that use the minimize-deep-coalescences criterion to construct species trees tend to produce species tree estimates that contain clades that are not present in trees estimated by the Concatenation, RTC, SMRT, STAR, and STEAC methods, and that in general are more balanced than those inferred by these other strategies.
Conclusions: When constructing a species tree from a multilocus set of sequences, our observations provide a basis for interpreting differences in species tree estimates obtained via different approaches that have a two-stage structure in common, one step for gene tree estimation and a second step for species tree estimation. The methods explored here employ a number of distinct features of the data, and our analysis suggests that recovery of the same results from multiple methods that tend to differ in their patterns of inference can be a valuable tool for obtaining reliable estimates.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA.
|
48
|
Raghavan M, Skoglund P, Graf KE, Metspalu M, Albrechtsen A, Moltke I, Rasmussen S, Stafford TW, Orlando L, Metspalu E, Karmin M, Tambets K, Rootsi S, Mägi R, Campos PF, Balanovska E, Balanovsky O, Khusnutdinova E, Litvinov S, Osipova LP, Fedorova SA, Voevoda MI, DeGiorgio M, Sicheritz-Ponten T, Brunak S, Demeshchenko S, Kivisild T, Villems R, Nielsen R, Jakobsson M, Willerslev E. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 2013; 505:87-91. [PMID: 24256729 DOI: 10.1038/nature12736] [Citation(s) in RCA: 431] [Impact Index Per Article: 39.2] [Received: 07/14/2013] [Accepted: 10/04/2013] [Indexed: 12/19/2022]
Abstract
The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians, there is no consensus with regard to which specific Old World populations they are closest to. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal'ta in south-central Siberia, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers, and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians. Sequencing of another south-central Siberian, Afontova Gora-2 dating to approximately 17,000 years ago, revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum. 
Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.
Affiliation(s)
- Maanasa Raghavan
- [1] Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark [2]
| | - Pontus Skoglund
- [1] Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, Uppsala 752 36, Sweden [2]
| | - Kelly E Graf
- Center for the Study of the First Americans, Texas A&M University, TAMU-4352, College Station, Texas 77845-4352, USA
| | - Mait Metspalu
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia [2] Department of Integrative Biology, University of California, Berkeley, California 94720, USA [3] Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Anders Albrechtsen
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen 2200, Denmark
| | - Ida Moltke
- [1] The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen 2200, Denmark [2] Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Simon Rasmussen
- Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Thomas W Stafford
- [1] Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark [2] AMS 14C Dating Centre, Department of Physics and Astronomy, University of Aarhus, Ny Munkegade 120, Aarhus DK-8000, Denmark
| | - Ludovic Orlando
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Ene Metspalu
- Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Monika Karmin
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia [2] Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Kristiina Tambets
- Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia
| | - Siiri Rootsi
- Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia
| | - Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu 51010, Estonia
| | - Paula F Campos
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
| | - Elena Balanovska
- Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moskvorechie Street 1, Moscow 115479, Russia
| | - Oleg Balanovsky
- [1] Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moskvorechie Street 1, Moscow 115479, Russia [2] Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina Street 3, Moscow 119991, Russia
| | - Elza Khusnutdinova
- [1] Institute of Biochemistry and Genetics, Ufa Scientific Centre, Russian Academy of Sciences, Ufa, Bashkortostan 450054, Russia [2] Biology Department, Bashkir State University, Ufa, Bashkortostan 450074, Russia
| | - Sergey Litvinov
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia [2] Institute of Biochemistry and Genetics, Ufa Scientific Centre, Russian Academy of Sciences, Ufa, Bashkortostan 450054, Russia
| | - Ludmila P Osipova
- The Institute of Cytology and Genetics, Center for Brain Neurobiology and Neurogenetics, Siberian Branch of the Russian Academy of Sciences, Lavrentyeva Avenue, Novosibirsk 630090, Russia
| | - Sardana A Fedorova
- Department of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Russian Academy of Medical Sciences and North-Eastern Federal University, Yakutsk, Sakha (Yakutia) 677010, Russia
| | - Mikhail I Voevoda
- [1] The Institute of Cytology and Genetics, Center for Brain Neurobiology and Neurogenetics, Siberian Branch of the Russian Academy of Sciences, Lavrentyeva Avenue, Novosibirsk 630090, Russia [2] Institute of Internal Medicine, Siberian Branch of the Russian Academy of Medical Sciences, Borisa Bogatkova 175/1, Novosibirsk 630089, Russia
| | - Michael DeGiorgio
- Department of Integrative Biology, University of California, Berkeley, California 94720, USA
| | - Thomas Sicheritz-Ponten
- [1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby 2800, Denmark [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Søren Brunak
- [1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby 2800, Denmark [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | | | - Toomas Kivisild
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia [2] Department of Biological Anthropology, University of Cambridge, Cambridge CB2 1QH, UK
| | - Richard Villems
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia [2] Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia [3] Estonian Academy of Sciences, Tallinn 10130, Estonia
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, California 94720, USA
| | - Mattias Jakobsson
- [1] Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, Uppsala 752 36, Sweden [2] Science for Life Laboratory, Uppsala University, Norbyvägen 18D, 752 36 Uppsala, Sweden
| | - Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
|
49
|
Abstract
To infer species trees from gene trees estimated from phylogenomic data sets, tractable methods are needed that can handle dozens to hundreds of loci. We examine several computationally efficient approaches (MP-EST, STAR, STEAC, STELLS, and STEM) for inferring species trees from gene trees estimated using maximum likelihood (ML) and Bayesian approaches. Among the methods examined, we found that topology-based methods often performed better using ML gene trees and methods employing coalescent times typically performed better using Bayesian gene trees, with MP-EST, STAR, STEAC, and STELLS outperforming STEM under most conditions. We examine why the STEM tree (also called GLASS or Maximum Tree) is less accurate on estimated gene trees by comparing estimated and true coalescence times, performing species tree inference using simulations, and analyzing a great ape data set keeping track of false positive and false negative rates for inferred clades. We find that although true coalescence times are more ancient than speciation times under the multispecies coalescent model, estimated coalescence times are often more recent than speciation times. This underestimation can lead to increased bias and lack of resolution with increased sampling (either alleles or loci) when gene trees are estimated with ML. The problem appears to be less severe using Bayesian gene-tree estimates.
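As a rough illustration of the GLASS / Maximum Tree idea named in the abstract, the following Python sketch takes the minimum coalescence time for each species pair across loci and then clusters by single linkage. It assumes one sampled allele per species and pairwise coalescence times that have already been read off each gene tree. The function names (`min_pairwise_times`, `glass_tree`), the species labels, and all numeric times are hypothetical and are not taken from the paper or from the STEM software.

```python
from itertools import combinations


def min_pairwise_times(loci):
    """Return the minimum coalescence time across loci for each species pair."""
    minima = {}
    for locus in loci:
        for pair, time in locus.items():
            key = frozenset(pair)
            minima[key] = min(time, minima.get(key, float("inf")))
    return minima


def glass_tree(species, loci):
    """Single-linkage clustering on minimum pairwise coalescence times.

    Returns (topology as nested tuples, list of (sorted clade members, join time)).
    """
    minima = min_pairwise_times(loci)
    clusters = {frozenset([s]): s for s in species}  # clade members -> subtree
    joins = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(list(clusters), 2):
            # Distance between clades is the smallest time between any members.
            d = min(minima[frozenset((x, y))] for x in a for y in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters[a | b] = (clusters.pop(a), clusters.pop(b))
        joins.append((sorted(a | b), d))
    (tree,) = clusters.values()
    return tree, joins


# Hypothetical pairwise coalescence times (arbitrary units) from two gene trees.
loci = [
    {("H", "C"): 6.0, ("H", "G"): 9.0, ("C", "G"): 9.0},
    {("H", "C"): 10.0, ("H", "G"): 10.0, ("C", "G"): 7.5},
]
tree, joins = glass_tree(["H", "C", "G"], loci)
print(joins)

# A third locus whose H-G time is underestimated changes the inferred clade.
biased = loci + [{("H", "C"): 8.0, ("H", "G"): 2.0, ("C", "G"): 8.0}]
print(glass_tree(["H", "C", "G"], biased)[1])
```

Because the estimate depends on a minimum, one locus with an underestimated time is enough to change the first inferred clade in this toy example, which is in line with the sensitivity to underestimated coalescence times that the abstract reports.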
Affiliation(s)
- Michael DeGiorgio
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA; Department of Biology, Pennsylvania State University, University Park, PA 16802, USA; and Department of Mathematics and Statistics, University of New Mexico, 1 University of New Mexico, Albuquerque, NM 87131, USA
|
50
|
Abstract
Principal component (PC) maps, which plot the values of a given PC estimated on the basis of allele frequency variation at the geographic sampling locations of a set of populations, are often used to investigate the properties of past range expansions. Some studies have argued that in a range expansion, the axis of greatest variation (i.e., the first PC) is parallel to the axis of expansion. In contrast, others have identified a pattern in which the axis of greatest variation is perpendicular to the axis of expansion. Here, we seek to understand this difference in outcomes by investigating the effect of the geographic sampling scheme on the direction of the axis of greatest variation under a two-dimensional range expansion model. From datasets simulated using each of two different schemes for the geographic sampling of populations under the model, we create PC maps for the first PC. We find that depending on the geographic sampling scheme, the axis of greatest variation can be either parallel or perpendicular to the axis of expansion. We provide an explanation for this result in terms of intra- and interpopulation coalescence times.
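A PC map as described in the abstract pairs each population's sampling location with its score on a principal component of allele frequency variation. The sketch below is a minimal pure-Python version that computes first-PC scores by power iteration and attaches them to grid coordinates. The grid, the allele frequencies, and the function name `first_pc_scores` are hypothetical toy choices; the code does not simulate a range expansion and does not reproduce the paper's results.

```python
import math


def first_pc_scores(freqs, iterations=200):
    """Return each population's score on the first principal component.

    freqs: one row of allele frequencies per population (rows of equal length).
    """
    n_pops, n_loci = len(freqs), len(freqs[0])
    means = [sum(row[j] for row in freqs) / n_pops for j in range(n_loci)]
    centered = [[row[j] - means[j] for j in range(n_loci)] for row in freqs]

    # Power iteration on X^T X to find the leading loading vector.
    v = [float(j + 1) for j in range(n_loci)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(centered[i][j] * scores[i] for i in range(n_pops))
             for j in range(n_loci)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:
            raise ValueError("allele frequencies show no variation")
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


# Toy sampling scheme: a 3 x 2 grid of populations (hypothetical coordinates).
coords = [(x, y) for x in range(3) for y in range(2)]
# Two loci vary strongly along x, one varies weakly along y (made-up values).
freqs = [[0.2 + 0.2 * x, 0.8 - 0.2 * x, 0.5 + 0.05 * y] for x, y in coords]

pc_map = list(zip(coords, first_pc_scores(freqs)))
for (x, y), score in pc_map:
    print(f"location ({x}, {y}): PC1 = {score:+.3f}")
```

In this toy data set the first PC varies along the x direction because most of the frequency variation was placed there by construction. The sign of a PC is arbitrary, so only the pattern of scores across locations is meaningful.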
Affiliation(s)
- Michael DeGiorgio
- Department of Integrative Biology, University of California, Berkeley.
|