1. Shpak M, Lawrence KN, Pool JE. The Precision and Power of Population Branch Statistics in Identifying the Genomic Signatures of Local Adaptation. Genome Biol Evol 2025; 17:evaf080. [PMID: 40326284 DOI: 10.1093/gbe/evaf080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 04/21/2025] [Accepted: 04/29/2025] [Indexed: 05/07/2025]
Abstract
Population branch statistics, which estimate the degree of genetic differentiation along a focal population's lineage, have been used as an alternative to FST-based genome-wide scans for identifying loci associated with local selective sweeps. Beyond the population branch statistic (PBS), the normalized PBSn1 adjusts focal branch length with respect to outgroup branch lengths at the same locus, whereas population branch excess (PBE) incorporates median branch lengths at other loci. PBSn1 and PBE were proposed to be more specific to local selective sweeps as opposed to geographically ubiquitous selection. However, the accuracy and statistical power of branch statistics have not been systematically assessed. To do so, we simulate genomes in representative large and small populations with varying proportions of sites evolving under genetic drift or (approximated) background selection, with local selective sweeps or geographically parallel selective sweeps. We then assess the probability that local selective sweep loci are correctly identified as outliers by FST and by each of the branch statistics. We find that branch statistics consistently outperform FST at identifying local sweeps. Particularly when parallel sweeps are introduced, PBSn1 and PBE correctly identify local sweeps among their top outliers more frequently than PBS. Additionally, we evaluate versions of these statistics based on maximal site differentiation within a window, finding that site-based PBE and PBSn1 are particularly effective at identifying local soft sweeps. These results validate the greater specificity of the rescaled branch statistics PBE and PBSn1 to detect population-specific positive selection, supporting their use in genomic studies focused on local adaptation.
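The three branch statistics named in this abstract can be sketched in a few lines. The formulas below are the commonly cited definitions (PBS from log-transformed pairwise FST, PBSn1 as PBS divided by one plus the sum of the three branch lengths, PBE as observed PBS minus a median-scaled expectation); they are recalled from the wider literature rather than stated in the abstract, so treat them as assumptions. The function names, the FST clipping value, and the population labels A (focal), B, and C are illustrative only.

```python
import math
from statistics import median

def branch_length(fst, cap=0.999999):
    """Log transform T = -ln(1 - FST); FST is clipped to [0, cap] so T stays finite."""
    fst = min(max(fst, 0.0), cap)
    return -math.log(1.0 - fst)

def pbs_triplet(fst_ab, fst_ac, fst_bc):
    """Return (PBS_A, PBS_B, PBS_C) for one locus from the three pairwise FST values."""
    t_ab, t_ac, t_bc = (branch_length(f) for f in (fst_ab, fst_ac, fst_bc))
    pbs_a = (t_ab + t_ac - t_bc) / 2.0
    pbs_b = (t_ab + t_bc - t_ac) / 2.0
    pbs_c = (t_ac + t_bc - t_ab) / 2.0
    return pbs_a, pbs_b, pbs_c

def pbsn1(fst_ab, fst_ac, fst_bc):
    """Focal branch length rescaled by all three branch lengths at the same locus."""
    a, b, c = pbs_triplet(fst_ab, fst_ac, fst_bc)
    return a / (1.0 + a + b + c)

def pbe(loci):
    """PBE for focal population A at each locus.

    loci is a list of (fst_ab, fst_ac, fst_bc) tuples. The expected PBS at a
    locus is its B-C branch length scaled by the ratio of genome-wide medians.
    """
    pbs_vals = [pbs_triplet(*locus)[0] for locus in loci]
    t_bc_vals = [branch_length(locus[2]) for locus in loci]
    t_bc_median = median(t_bc_vals)
    if t_bc_median == 0.0:
        raise ValueError("median B-C branch length is zero; PBE is undefined")
    scale = median(pbs_vals) / t_bc_median
    return [p - t * scale for p, t in zip(pbs_vals, t_bc_vals)]

if __name__ == "__main__":
    # Three background loci plus one locus differentiated only in population A.
    loci = [(0.1, 0.1, 0.1)] * 3 + [(0.5, 0.5, 0.1)]
    for locus, value in zip(loci, pbe(loci)):
        print(locus, round(pbsn1(*locus), 4), round(value, 4))
```

In this toy input the background loci have a PBE near zero by construction, while the locus that is differentiated only along the focal branch stands out, which is the behaviour the abstract attributes to the rescaled statistics.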
Affiliation(s)
- Max Shpak
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
- Kadee N Lawrence
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
- John E Pool
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, USA
2. Soni V, Jensen JD. Inferring demographic and selective histories from population genomic data using a 2-step approach in species with coding-sparse genomes: an application to human data. G3 (Bethesda) 2025; 15:jkaf019. [PMID: 39883523 PMCID: PMC12005166 DOI: 10.1093/g3journal/jkaf019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2024] [Revised: 01/14/2025] [Accepted: 01/27/2025] [Indexed: 01/31/2025]
Abstract
The demographic history of a population, and the distribution of fitness effects (DFE) of newly arising mutations in functional genomic regions, are fundamental factors dictating both genetic variation and evolutionary trajectories. Although both demographic and DFE inference has been performed extensively in humans, these approaches have generally either been limited to simple demographic models involving a single population, or, where a complex population history has been inferred, without accounting for the potentially confounding effects of selection at linked sites. Taking advantage of the coding-sparse nature of the genome, we propose a 2-step approach in which coalescent simulations are first used to infer a complex multi-population demographic model, utilizing large non-functional regions that are likely free from the effects of background selection. We then use forward-in-time simulations to perform DFE inference in functional regions, conditional on the complex demography inferred and utilizing expected background selection effects in the estimation procedure. Throughout, recombination and mutation rate maps were used to account for the underlying empirical rate heterogeneity across the human genome. Importantly, within this framework it is possible to utilize and fit multiple aspects of the data, and this inference scheme represents a generalized approach for such large-scale inference in species with coding-sparse genomes.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
- Jeffrey D Jensen
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
3. Arnab SP, Dos Santos ALC, Fumagalli M, DeGiorgio M. Efficient detection and characterization of targets of natural selection using transfer learning. bioRxiv 2025:2025.03.05.641710. [PMID: 40093065 PMCID: PMC11908262 DOI: 10.1101/2025.03.05.641710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2025]
Abstract
Natural selection leaves detectable patterns of altered spatial diversity within genomes, and identifying affected regions is crucial for understanding species evolution. Recently, machine learning approaches applied to raw population genomic data have been developed to uncover these adaptive signatures. Convolutional neural networks (CNNs) are particularly effective for this task, as they handle large data arrays while maintaining element correlations. However, shallow CNNs may miss complex patterns due to their limited capacity, while deep CNNs can capture these patterns but require extensive data and computational power. Transfer learning addresses these challenges by utilizing a deep CNN pre-trained on a large dataset as a feature extraction tool for downstream classification and evolutionary parameter prediction. This approach reduces extensive training data generation requirements and computational needs while maintaining high performance. In this study, we developed TrIdent, a tool that uses transfer learning to enhance detection of adaptive genomic regions from image representations of multilocus variation. We evaluated TrIdent across various genetic, demographic, and adaptive settings, in addition to unphased data and other confounding factors. TrIdent demonstrated improved detection of adaptive regions compared to recent methods using similar data representations. We further explored model interpretability through class activation maps and adapted TrIdent to infer selection parameters for identified adaptive candidates. Using whole-genome haplotype data from European and African populations, TrIdent effectively recapitulated known sweep candidates and identified novel cancer, and other disease-associated genes as potential sweeps.
Affiliation(s)
- Sandipan Paul Arnab
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
- Matteo Fumagalli
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - The Alan Turing Institute, London, UK
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
4. Huang K, Ostevik KL, Jahani M, Todesco M, Bercovich N, Andrew RL, Owens GL, Rieseberg LH. Inversions contribute disproportionately to parallel genomic divergence in dune sunflowers. Nat Ecol Evol 2025; 9:325-335. [PMID: 39633041 PMCID: PMC11807836 DOI: 10.1038/s41559-024-02593-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 10/30/2024] [Indexed: 12/07/2024]
Abstract
The probability of parallel genetic evolution is a function of the strength of selection and constraints imposed by genetic architecture. Inversions capture locally adapted alleles and suppress recombination between them, which limits the range of adaptive responses. In addition, the combined phenotypic effect of alleles within inversions is likely to be greater than that of individual alleles; this should further increase the contributions of inversions to parallel evolution. We tested the hypothesis that inversions contribute disproportionately to parallel genetic evolution in independent dune ecotypes of Helianthus petiolaris. We analysed habitat data and identified variables underlying parallel habitat shifts. Genotype-environment association analyses of these variables indicated parallel responses of inversions to shared selective pressures. We also confirmed larger seed size across the dunes and performed quantitative trait locus mapping with multiple crosses. Quantitative trait loci shared between locations fell into inversions more than expected by chance. We used whole-genome sequencing data to identify selective sweeps in the dune ecotypes and found that the majority of shared swept regions were found within inversions. Phylogenetic analyses of shared regions indicated that within inversions, the same allele typically was found in the dune habitat at both sites. These results confirm predictions that inversions drive parallel divergence in the dune ecotypes.
Affiliation(s)
- Kaichi Huang
  - School of Ecology, Sun Yat-sen University, Shenzhen, China
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Kate L Ostevik
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, USA
- Mojtaba Jahani
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Marco Todesco
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Biology, University of British Columbia, Kelowna, British Columbia, Canada
- Natalia Bercovich
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Rose L Andrew
  - School of Environmental and Rural Science, University of New England, Armidale, New South Wales, Australia
- Gregory L Owens
  - Department of Biology, University of Victoria, Victoria, British Columbia, Canada
- Loren H Rieseberg
  - Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
5. Nocchi G, Whiting JR, Yeaman S. Repeated global adaptation across plant species. Proc Natl Acad Sci U S A 2024; 121:e2406832121. [PMID: 39705310 DOI: 10.1073/pnas.2406832121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 11/09/2024] [Indexed: 12/22/2024]
Abstract
Global adaptation occurs when all populations of a species undergo selection toward a common optimum. This can occur by a hard selective sweep with the emergence of a new globally advantageous allele that spreads throughout a species' natural range until reaching fixation. This evolutionary process leaves a temporary trace in the region affected, which is detectable using population genomic methods. While selective sweeps have been identified in many species, there have been few comparative and systematic studies of the genes involved in global adaptation. Building upon recent findings showing repeated genetic basis of local adaptation across independent populations and species, we asked whether certain genes play a more significant role in driving global adaptation across plant species. To address this question, we scanned the genomes of 17 plant species to identify signals of repeated global selective sweeps. Despite the substantial evolutionary distance between the species analyzed, we identified several gene families with strong evidence of repeated positive selection. These gene families tend to be enriched for reduced pleiotropy, consistent with predictions from Fisher's evolutionary model and the cost of complexity hypothesis. We also found that genes with repeated sweeps exhibit elevated levels of gene duplication. Our findings contrast with recent observations of increased pleiotropy in genes driving local adaptation, consistent with predictions based on the theory of migration-selection balance.
Affiliation(s)
- Gabriele Nocchi
  - Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
- James R Whiting
  - Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
- Samuel Yeaman
  - Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
6. Amin MR, Hasan M, DeGiorgio M. Digital Image Processing to Detect Adaptive Evolution. Mol Biol Evol 2024; 41:msae242. [PMID: 39565932 PMCID: PMC11631197 DOI: 10.1093/molbev/msae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 10/28/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024]
Abstract
In recent years, advances in image processing and machine learning have fueled a paradigm shift in detecting genomic regions under natural selection. Early machine learning techniques employed population-genetic summary statistics as features, which focus on specific genomic patterns expected by adaptive and neutral processes. Though such engineered features are important when training data are limited, the ease with which simulated data can now be generated has led to the recent development of approaches that take in image representations of haplotype alignments and automatically extract important features using convolutional neural networks. Digital image processing methods termed α-molecules are a class of techniques for multiscale representation of objects that can extract a diverse set of features from images. One such α-molecule method, termed wavelet decomposition, lends greater control over high-frequency components of images. Another α-molecule method, termed curvelet decomposition, is an extension of the wavelet concept that considers events occurring along curves within images. We show that application of these α-molecule techniques to extract features from image representations of haplotype alignments yields high true positive rate and accuracy to detect hard and soft selective sweep signatures from genomic data with both linear and nonlinear machine learning classifiers. Moreover, we find that such models are easy to visualize and interpret, with performance rivaling that of contemporary deep learning approaches for detecting sweeps.
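To make the idea of wavelet decomposition of a haplotype image concrete, here is a minimal sketch of one level of a two-dimensional Haar transform in plain Python. It is a generic textbook construction, not the authors' α-molecule pipeline, and the function names and the averaging (unnormalized) convention are assumptions made for illustration. After the transform, the top-left quadrant holds the low-frequency approximation and the other three quadrants hold the high-frequency detail coefficients.

```python
def haar_1d(values):
    """One Haar level: pairwise averages followed by pairwise half-differences."""
    if len(values) % 2 != 0:
        raise ValueError("length must be even")
    averages = [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    return averages + details

def haar_2d(matrix):
    """Apply haar_1d to every row, then to every column of the result."""
    by_rows = [haar_1d(list(row)) for row in matrix]
    by_cols = [haar_1d(list(col)) for col in zip(*by_rows)]
    # by_cols is stored column-major, so transpose back to row-major order.
    return [list(row) for row in zip(*by_cols)]

if __name__ == "__main__":
    # Toy haplotype image: rows are haplotypes, columns are sites, 1 = derived allele.
    haplotypes = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 1],
    ]
    for row in haar_2d(haplotypes):
        print(row)
```

The coefficients from such a transform could then be flattened into a feature vector for a classifier; which coefficients to keep and how many levels to apply are design choices this sketch leaves open.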
Affiliation(s)
- Md Ruhul Amin
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
7. Witt KE, Villanea FA. Computational Genomics and Its Applications to Anthropological Questions. Am J Biol Anthropol 2024; 186 Suppl 78:e70010. [PMID: 40071816 PMCID: PMC11898561 DOI: 10.1002/ajpa.70010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 10/14/2024] [Accepted: 12/19/2024] [Indexed: 03/15/2025]
Abstract
The advent of affordable genome sequencing and the development of new computational tools have established a new era of genomic knowledge. Sequenced human genomes number in the tens of thousands, including thousands of ancient human genomes. The abundance of data has been met with new analysis tools that can be used to understand populations' demographic and evolutionary histories. Thus, a variety of computational methods now exist that can be leveraged to answer anthropological questions. This includes novel likelihood and Bayesian methods, machine learning techniques, and a vast array of population simulators. These computational tools provide powerful insights gained from genomic datasets, although they are generally inaccessible to those with less computational experience. Here, we outline the theoretical workings behind computational genomics methods, limitations and other considerations when applying these computational methods, and examples of how computational methods have already been applied to anthropological questions. We hope this review will empower other anthropologists to utilize these powerful tools in their own research.
Affiliation(s)
- Kelsey E. Witt
  - Department of Genetics and Biochemistry and Center for Human Genetics, Clemson University, Clemson, South Carolina, USA
8. Soni V, Jensen JD. Inferring demographic and selective histories from population genomic data using a two-step approach in species with coding-sparse genomes: an application to human data. bioRxiv 2024:2024.09.19.613979. [PMID: 39605418 PMCID: PMC11601476 DOI: 10.1101/2024.09.19.613979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
The demographic history of a population, and the distribution of fitness effects (DFE) of newly arising mutations in functional genomic regions, are fundamental factors dictating both genetic variation and evolutionary trajectories. Although both demographic and DFE inference has been performed extensively in humans, these approaches have generally either been limited to simple demographic models involving a single population, or, where a complex population history has been inferred, without accounting for the potentially confounding effects of selection at linked sites. Taking advantage of the coding-sparse nature of the genome, we propose a 2-step approach in which coalescent simulations are first used to infer a complex multi-population demographic model, utilizing large non-functional regions that are likely free from the effects of background selection. We then use forward-in-time simulations to perform DFE inference in functional regions, conditional on the complex demography inferred and utilizing expected background selection effects in the estimation procedure. Throughout, recombination and mutation rate maps were used to account for the underlying empirical rate heterogeneity across the human genome. Importantly, within this framework it is possible to utilize and fit multiple aspects of the data, and this inference scheme represents a generalized approach for such large-scale inference in species with coding-sparse genomes.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ, US
- Jeffrey D. Jensen
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ, US
9. Gering E, Johnsson M, Theunissen D, Martin Cerezo ML, Steep A, Getty T, Henriksen R, Wright D. Signals of selection and ancestry in independently feral Gallus gallus populations. Mol Ecol 2024; 33:e17336. [PMID: 38553993 DOI: 10.1111/mec.17336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 03/15/2024] [Accepted: 03/20/2024] [Indexed: 10/18/2024]
Abstract
Recent work indicates that feralisation is not a simple reversal of domestication, and therefore raises questions about the predictability of evolution across replicated feral populations. In the present study we compare genes and traits of two independently established feral populations of chickens (Gallus gallus) that inhabit archipelagos within the Pacific and Atlantic regions to test for evolutionary parallelism and/or divergence. We find that feral populations from each region are genetically closer to one another than other domestic breeds, despite their geographical isolation and divergent colonisation histories. Next, we used genome scans to identify genomic regions selected during feralisation (selective sweeps) in two independently feral populations from Bermuda and Hawaii. Three selective sweep regions (each identified by multiple detection methods) were shared between feral populations, and this overlap is inconsistent with a null model in which selection targets are randomly distributed throughout the genome. In the case of the Bermudian population, many of the genes present within the selective sweeps were either not annotated or of unknown function. Of the nine genes that were identifiable, five were related to behaviour, with the remaining genes involved in bone metabolism, eye development and the immune system. Our findings suggest that a subset of feralisation loci (i.e. genomic targets of recent selection in feral populations) are shared across independently established populations, raising the possibility that feralisation involves some degree of parallelism or convergence and the potential for a shared feralisation 'syndrome'.
Affiliation(s)
- E Gering
  - Department of Biological Sciences, Halmos College of Arts and Sciences, Nova Southeastern University, Fort Lauderdale, Florida, USA
- M Johnsson
  - AVIAN Behavioural Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
  - Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
- D Theunissen
  - AVIAN Behavioural Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
- M L Martin Cerezo
  - AVIAN Behavioural Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
- A Steep
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, Michigan, USA
- T Getty
  - Kellogg Biological Station, Michigan State University, Hickory Corners, Michigan, USA
- R Henriksen
  - AVIAN Behavioural Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
- D Wright
  - AVIAN Behavioural Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
10. Götsch H, Bürger R. Polygenic dynamics underlying the response of quantitative traits to directional selection. Theor Popul Biol 2024; 158:21-59. [PMID: 38677378 DOI: 10.1016/j.tpb.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 04/14/2024] [Accepted: 04/19/2024] [Indexed: 04/29/2024]
Abstract
We study the response of a quantitative trait to exponential directional selection in a finite haploid population, both at the genetic and the phenotypic level. We assume an infinite sites model, in which the number of new mutations per generation in the population follows a Poisson distribution (with mean Θ) and each mutation occurs at a new, previously monomorphic site. Mutation effects are beneficial and drawn from a distribution. Sites are unlinked and contribute additively to the trait. Assuming that selection is stronger than random genetic drift, we model the initial phase of the dynamics by a supercritical Galton-Watson process. This enables us to obtain time-dependent results. We show that the copy-number distribution of the mutant in generation n, conditioned on non-extinction until n, is described accurately by the deterministic increase from an initial distribution with mean 1. This distribution is related to the absolutely continuous part W+ of the random variable, typically denoted W, that characterizes the stochasticity accumulating during the mutant's sweep. A suitable transformation yields the approximate dynamics of the mutant frequency distribution in a Wright-Fisher population of size N. Our expression provides a very accurate approximation except when mutant frequencies are close to 1. On this basis, we derive explicitly the (approximate) time dependence of the expected mean and variance of the trait and of the expected number of segregating sites. Unexpectedly, we obtain highly accurate approximations for all times, even for the quasi-stationary phase when the expected per-generation response and the trait variance have equilibrated. The latter refine classical results. In addition, we find that Θ is the main determinant of the pattern of adaptation at the genetic level, i.e., whether the initial allele-frequency dynamics are best described by sweep-like patterns at few loci or small allele-frequency shifts at many. 
The number of segregating sites is an appropriate indicator for these patterns. The selection strength determines primarily the rate of adaptation. The accuracy of our results is tested by comprehensive simulations in a Wright-Fisher framework. We argue that our results apply to more complex forms of directional selection.
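The branching-process approximation described in this abstract can be illustrated with a small simulation of a supercritical Galton-Watson process that starts from a single mutant copy. This is a sketch under stated assumptions, not the authors' model: the offspring distribution is taken to be Poisson with mean 1 + s, and the function names, parameter values, and the simple Poisson sampler (Knuth's multiplication method, suitable only for small means) are all illustrative choices.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def mutant_copies(s, generations, rng):
    """Copy number after `generations`, starting from one mutant.

    Each copy independently leaves Poisson(1 + s) offspring (an assumption).
    """
    copies = 1
    for _ in range(generations):
        copies = sum(poisson(1.0 + s, rng) for _ in range(copies))
        if copies == 0:
            break  # the lineage is extinct
    return copies

def simulate(s=0.1, generations=20, replicates=2000, seed=1):
    """Return (fraction of surviving lineages, mean copy number among survivors)."""
    rng = random.Random(seed)
    counts = [mutant_copies(s, generations, rng) for _ in range(replicates)]
    survivors = [c for c in counts if c > 0]
    mean_surviving = sum(survivors) / len(survivors) if survivors else 0.0
    return len(survivors) / replicates, mean_surviving

if __name__ == "__main__":
    survival, mean_copies = simulate()
    print("fraction surviving:", survival)
    print("mean copies among survivors:", mean_copies)
```

Conditioning on non-extinction, as the abstract does, corresponds to keeping only the replicates with a positive copy number. For small s and Poisson offspring the surviving fraction should be roughly of order 2s, which gives a quick sanity check on the output.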
Affiliation(s)
- Hannah Götsch
  - Faculty of Mathematics, University of Vienna, 1090 Vienna, Austria
  - Vienna Graduate School of Population Genetics, Austria
- Reinhard Bürger
  - Faculty of Mathematics, University of Vienna, 1090 Vienna, Austria
11. Gendron EMS, Qing X, Sevigny JL, Li H, Liu Z, Blaxter M, Powers TO, Thomas WK, Porazinska DL. Comparative mitochondrial genomics in Nematoda reveal astonishing variation in compositional biases and substitution rates indicative of multi-level selection. BMC Genomics 2024; 25:615. [PMID: 38890582 PMCID: PMC11184840 DOI: 10.1186/s12864-024-10500-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 06/05/2024] [Indexed: 06/20/2024]
Abstract
BACKGROUND: Nematodes are the most abundant and diverse metazoans on Earth, and are known to significantly affect ecosystem functioning. A better understanding of their biology and ecology, including potential adaptations to diverse habitats and lifestyles, is key to understanding their response to global change scenarios. Mitochondrial genomes offer high species level characterization, low cost of sequencing, and an ease of data handling that can provide insights into nematode evolutionary pressures.
RESULTS: Generally, nematode mitochondrial genomes exhibited similar structural characteristics (e.g., gene size and GC content), but displayed remarkable variability around these general patterns. Compositional strand biases showed strong codon position specific G skews and relationships with nematode life traits (especially parasitic feeding habits) equal to or greater than with predicted phylogeny. On average, nematode mitochondrial genomes showed low non-synonymous substitution rates, but also high clade specific deviations from these means. Despite the presence of significant mutational saturation, non-synonymous (dN) and synonymous (dS) substitution rates could still be significantly explained by feeding habit and/or habitat. Low ratios of dN:dS rates, particularly associated with the parasitic lifestyles, suggested the presence of strong purifying selection.
CONCLUSIONS: Nematode mitochondrial genomes demonstrated a capacity to accumulate diversity in composition, structure, and content while still maintaining functional genes. Moreover, they demonstrated a capacity for rapid evolutionary change pointing to a potential interaction between multi-level selection pressures and rapid evolution. In conclusion, this study helps establish a background for our understanding of the potential evolutionary pressures shaping nematode mitochondrial genomes, while outlining likely routes of future inquiry.
Affiliation(s)
- Eli M S Gendron
  - Department of Entomology and Nematology, University of Florida, Gainesville, FL, USA
- Xue Qing
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China
- Joseph L Sevigny
  - Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
  - Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH, USA
- Hongmei Li
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China
- Zhiyin Liu
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China
- Thomas O Powers
  - Department of Plant Pathology, University of Nebraska, Lincoln, NE, USA
- W Kelly Thomas
  - Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
  - Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH, USA
- Dorota L Porazinska
  - Department of Entomology and Nematology, University of Florida, Gainesville, FL, USA
12. Rossi M, Hausmann AE, Alcami P, Moest M, Roussou R, Van Belleghem SM, Wright DS, Kuo CY, Lozano-Urrego D, Maulana A, Melo-Flórez L, Rueda-Muñoz G, McMahon S, Linares M, Osman C, McMillan WO, Pardo-Diaz C, Salazar C, Merrill RM. Adaptive introgression of a visual preference gene. Science 2024; 383:1368-1373. [PMID: 38513020 PMCID: PMC7616200 DOI: 10.1126/science.adj9201] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/24/2023] [Accepted: 01/30/2024] [Indexed: 03/23/2024]
Abstract
Visual preferences are important drivers of mate choice and sexual selection, but little is known of how they evolve at the genetic level. In this study, we took advantage of the diversity of bright warning patterns displayed by Heliconius butterflies, which are also used during mate choice. Combining behavioral, population genomic, and expression analyses, we show that two Heliconius species have evolved the same preferences for red patterns by exchanging genetic material through hybridization. Neural expression of regucalcin1 correlates with visual preference across populations, and disruption of regucalcin1 with CRISPR-Cas9 impairs courtship toward conspecific females, providing a direct link between gene and behavior. Our results support a role for hybridization during behavioral evolution and show how visually guided behaviors contributing to adaptation and speciation are encoded within the genome.
Collapse
Affiliation(s)
- Matteo Rossi
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Pepe Alcami
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Markus Moest
- Department of Ecology and Research Department for Limnology, Mondsee; University of Innsbruck, Innsbruck, Austria
- Rodaria Roussou
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Chi-Yun Kuo
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Smithsonian Tropical Research Institute; Gamboa, Panama
- Daniela Lozano-Urrego
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Faculty of Natural Sciences, Universidad del Rosario; Bogotá, Colombia
- Arif Maulana
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Lina Melo-Flórez
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Faculty of Natural Sciences, Universidad del Rosario; Bogotá, Colombia
- Geraldine Rueda-Muñoz
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Faculty of Natural Sciences, Universidad del Rosario; Bogotá, Colombia
- Saoirse McMahon
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Mauricio Linares
- Faculty of Natural Sciences, Universidad del Rosario; Bogotá, Colombia
- Christof Osman
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Camilo Salazar
- Faculty of Natural Sciences, Universidad del Rosario; Bogotá, Colombia
- Richard M. Merrill
- Faculty of Biology, Ludwig Maximilian University; Munich, Germany
- Smithsonian Tropical Research Institute; Gamboa, Panama
|
13
|
Brandt DYC, Huber CD, Chiang CWK, Ortega-Del Vecchyo D. The Promise of Inferring the Past Using the Ancestral Recombination Graph. Genome Biol Evol 2024; 16:evae005. [PMID: 38242694 PMCID: PMC10834162 DOI: 10.1093/gbe/evae005] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 05/31/2023] [Revised: 12/11/2023] [Accepted: 12/17/2023] [Indexed: 01/21/2024] Open
Abstract
The ancestral recombination graph (ARG) is a structure that represents the history of coalescent and recombination events connecting a set of sequences (Hudson RR. In: Futuyma D, Antonovics J, editors. Gene genealogies and the coalescent process. In: Oxford Surveys in Evolutionary Biology; 1991. p. 1-44.). The full ARG can be represented as a set of genealogical trees at every locus in the genome, annotated with recombination events that change the topology of the trees between adjacent loci and the mutations that occurred along the branches of those trees (Griffiths RC, Marjoram P. An ancestral recombination graph. In: Donnelly P, Tavare S, editors. Progress in population genetics and human evolution. Springer; 1997. p. 257-270.). Valuable insights can be gained into past evolutionary processes, such as demographic events or the influence of natural selection, by studying the ARG. It is regarded as the "holy grail" of population genetics (Hubisz M, Siepel A. Inference of ancestral recombination graphs using ARGweaver. In: Dutheil JY, editors. Statistical population genomics. New York, NY: Springer US; 2020. p. 231-266.) since it encodes the processes that generate all patterns of allelic and haplotypic variation from which all commonly used summary statistics in population genetic research (e.g. heterozygosity and linkage disequilibrium) can be derived. Many previous evolutionary inferences relied on summary statistics extracted from the genotype matrix. Evolutionary inferences using the ARG represent a significant advancement as the ARG is a representation of the evolutionary history of a sample that shows the past history of recombination, coalescence, and mutation events across a particular sequence. This representation in theory contains as much information, if not more, than the combination of all independent summary statistics that could be derived from the genotype matrix.
Consistent with this idea, some of the first ARG-based analyses have proven to be more powerful than summary statistic-based analyses (Speidel L, Forest M, Shi S, Myers SR. A method for genome-wide genealogy estimation for thousands of samples. Nat Genet. 2019:51(9):1321-1329.; Stern AJ, Wilton PR, Nielsen R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet. 2019:15(9):e1008384.; Hubisz MJ, Williams AL, Siepel A. Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genet. 2020:16(8):e1008895.; Fan C, Mancuso N, Chiang CWK. A genealogical estimate of genetic relationships. Am J Hum Genet. 2022:109(5):812-824.; Fan C, Cahoon JL, Dinh BL, Ortega-Del Vecchyo D, Huber C, Edge MD, Mancuso N, Chiang CWK. A likelihood-based framework for demographic inference from genealogical trees. bioRxiv. 2023.10.10.561787. 2023.; Hejase HA, Mo Z, Campagna L, Siepel A. A deep-learning approach for inference of selective sweeps from the ancestral recombination graph. Mol Biol Evol. 2022:39(1):msab332.; Link V, Schraiber JG, Fan C, Dinh B, Mancuso N, Chiang CWK, Edge MD. Tree-based QTL mapping with expected local genetic relatedness matrices. bioRxiv. 2023.04.07.536093. 2023.; Zhang BC, Biddanda A, Gunnarsson ÁF, Cooper F, Palamara PF. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat Genet. 2023:55(5):768-776.). As such, there has been significant interest in the field to investigate 2 main problems related to the ARG: (i) How can we estimate the ARG based on genomic data, and (ii) how can we extract information of past evolutionary processes from the ARG?
In this perspective, we highlight 3 topics that pertain to these main issues: The development of computational innovations that enable the estimation of the ARG; remaining challenges in estimating the ARG; and methodological advances for deducing evolutionary forces and mechanisms using the ARG. This perspective serves to introduce the readers to the types of questions that can be explored using the ARG and to highlight some of the most pressing issues that must be addressed in order to make ARG-based inference an indispensable tool for evolutionary research.
Affiliation(s)
- Débora Y C Brandt
- Department of Genetics Evolution and Environment, University College London, London, UK
- Christian D Huber
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Charleston W K Chiang
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Diego Ortega-Del Vecchyo
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma De México, Querétaro, Querétaro, Mexico
|
14
|
Panigrahi M, Rajawat D, Nayak SS, Ghildiyal K, Sharma A, Jain K, Lei C, Bhushan B, Mishra BP, Dutt T. Landmarks in the history of selective sweeps. Anim Genet 2023; 54:667-688. [PMID: 37710403 DOI: 10.1111/age.13355] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/12/2023] [Accepted: 08/28/2023] [Indexed: 09/16/2023]
Abstract
Half a century ago, a seminal article on the hitchhiking effect by Smith and Haigh inaugurated the concept of the selection signature. Selective sweeps are characterised by the rapid spread of an advantageous genetic variant through a population and hence play an important role in shaping evolution and research on genetic diversity. The process by which a beneficial allele arises and becomes fixed in a population, leading to an increase in the frequency of other linked alleles, is known as genetic hitchhiking or genetic draft. Kimura's neutral theory and hitchhiking theory are complementary, with Kimura's neutral evolution as the 'null model' and positive selection as the 'signal'. Both are widely accepted in evolution, especially with genomics enabling precise measurements. Significant advances in genomic technologies, such as next-generation sequencing, high-density SNP arrays and powerful bioinformatics tools, have made it possible to systematically investigate selection signatures in a variety of species. Although the history of selection signatures is relatively recent, progress has been made in the last two decades, owing to the increasing availability of large-scale genomic data and the development of computational methods. In this review, we embark on a journey through the history of research on selective sweeps, ranging from early theoretical work to recent empirical studies that utilise genomic data.
Affiliation(s)
- Manjit Panigrahi
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Divya Rajawat
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Kanika Ghildiyal
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Anurodh Sharma
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Karan Jain
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Chuzhao Lei
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Bharat Bhushan
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Bishnu Prasad Mishra
- Division of Animal Biotechnology, ICAR-National Bureau of Animal Genetic Resources, Karnal, India
- Triveni Dutt
- Livestock Production and Management Section, Indian Veterinary Research Institute, Bareilly, India
|
15
|
Gretgrix LJ, Decker O, Green PT, Köhler F, Moussalli A, Murphy NP. Genetic diversity of a short-ranged endemic terrestrial snail. Ecol Evol 2023; 13:e10785. [PMID: 38034337 PMCID: PMC10684984 DOI: 10.1002/ece3.10785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 11/02/2023] [Accepted: 11/17/2023] [Indexed: 12/02/2023] Open
Abstract
The factors that influence population structure and connectivity are unknown for most terrestrial invertebrates but are of particular interest both for understanding the impacts of disturbance and for determining accurate levels of biodiversity and local endemism. The main objective of this study was to determine the historical patterns of genetic differentiation and contemporary gene flow in the terrestrial snail, Austrochloritis kosciuszkoensis (Shea & O. L. Griffiths, 2010). Snails were collected in the Mt Buffalo and Alpine National Parks in Victoria, in a bid to understand how populations of this species are connected both within continuous habitat and between adjacent, yet separate environments. Utilising both mitochondrial DNA (mtDNA) and single nucleotide polymorphism (SNP) data, the degree of population structure was determined within and between sites. Very high levels of genetic divergence were found between the Mt Buffalo and Alpine snails, with no evidence for genetic exchange detected between the two regions, indicating speciation has possibly occurred between the two regions. Our analyses of the combined mtDNA and nDNA (generated from SNPs) data have revealed patterns of genetic diversity that are consistent with a history of long-term isolation and limited connectivity. This history may be related to past cycles of changes to the climate over hundreds of thousands of years, which have, in part, caused the fragmentation of Australian forests. Within both regions, extremely limited gene flow between separate populations suggests that these land snails have very limited dispersal capabilities across existing landscape barriers, especially at Mt Buffalo: here, populations only 5 km apart from each other are genetically differentiated. 
The distinct genetic divergences and clearly reduced dispersal ability detected in this data explain the likely existence of at least two previously unnamed cryptic Austrochloritis species within a 30-50 km radius, and highlight the need for more concentrated efforts to understand population structure and gene flow in terrestrial invertebrates.
Affiliation(s)
- Lachlan J. Gretgrix
- Department of Environment and Genetics, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
- Orsi Decker
- Department of Environment and Genetics, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
- Bavarian National Park, Nationalparkverwaltung Bayerischer Wald, Grafenau, Germany
- Peter T. Green
- Department of Environment and Genetics, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
- Nicholas P. Murphy
- Department of Environment and Genetics, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, Victoria, Australia
|
16
|
Bazzicalupo E, Ratkiewicz M, Seryodkin IV, Okhlopkov I, Galsandorj N, Yarovenko YA, Ozolins J, Saveljev AP, Melovski D, Gavashelishvili A, Schmidt K, Godoy JA. Genome-environment association analyses reveal geographically restricted adaptive divergence across the range of the widespread Eurasian carnivore Lynx lynx (Linnaeus, 1758). Evol Appl 2023; 16:1773-1788. [PMID: 38029067 PMCID: PMC10681490 DOI: 10.1111/eva.13570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 05/18/2023] [Accepted: 06/06/2023] [Indexed: 12/01/2023] Open
Abstract
Local adaptations to the environment are an important aspect of the diversity of a species, and their discovery, description and quantification have important implications for the fields of taxonomy, evolutionary and conservation biology. In this study, we scan genomes from several populations across the distributional range of the Eurasian lynx, with the objective of finding genomic windows under positive selection which may underlie local adaptations to different environments. A total of 394 genomic windows are found to be associated with local environmental conditions, and they are enriched for genes involved in metabolism, behaviour, synaptic organization and neural development. Adaptive genetic structure, reconstructed from SNPs in candidate windows, is considerably different from the neutral genetic structure of the species. A widespread adaptively homogeneous group is recovered occupying areas of harsher snow and temperature climatic conditions in the north-western, central and eastern parts of the distribution. Adaptively divergent populations are recovered in the westernmost part of the range, especially within the Baltic population, but also predicted for different patches in the western and southern part of the range, associated with different snow and temperature regimes. Adaptive differentiation driven by climate does not correlate much with the subspecies taxonomic delimitations, suggesting that subspecific divergences are mostly driven by neutral processes of genetic drift and gene flow. Our results will aid the selection of source populations for assisted gene flow or genetic rescue programs by identifying what climatic patterns to look for as predictors of pre-adaptation of individuals. Particularly, the Carpathian population is confirmed as the best source of individuals for the genetic rescue of the endangered, isolated and genetically eroded Balkan population.
Additionally, reintroductions in central and western Europe, currently based mostly on Carpathian lynxes, could consider the Baltic population as an additional source to increase adaptive variation and likely improve adaptation to their milder climate.
Affiliation(s)
- Enrico Bazzicalupo
- Department of Ecology and Evolution, Estación Biológica de Doñana (CSIC), Seville, Spain
- Ivan V. Seryodkin
- Laboratory of Ecology and Conservation of Animals, Pacific Institute of Geography of Far East Branch of Russian Academy of Sciences, Vladivostok, Russia
- Innokentiy Okhlopkov
- Institute for Biological Problems of Cryolithozone, Siberian Branch of the Russian Academy of Sciences, Yakutsk, Russia
- Yuriy A. Yarovenko
- Pre-Caspian Institute of Biological Resources, Dagestan Federal Scientific Centre of RAS, Makhachkala, Russia
- Janis Ozolins
- Department of Hunting and Wildlife Management, Latvijas Valsts mežzinātnes institūts "Silava", Salaspils, Latvia
- Alexander P. Saveljev
- Department of Animal Ecology, Russian Research Institute of Game Management and Fur Farming, Kirov, Russia
- Dime Melovski
- Macedonian Ecological Society (MES), Skopje, North Macedonia
- José A. Godoy
- Department of Ecology and Evolution, Estación Biológica de Doñana (CSIC), Seville, Spain
|
17
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
18
|
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. Evolution 2023; 77:2113-2127. [PMID: 37395482 PMCID: PMC10547124 DOI: 10.1093/evolut/qpad120] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/20/2023] [Revised: 06/15/2023] [Accepted: 06/30/2023] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates are in excess of true-positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
|
19
|
Heraghty SD, Jackson JM, Lozier JD. Whole genome analyses reveal weak signatures of population structure and environmentally associated local adaptation in an important North American pollinator, the bumble bee Bombus vosnesenskii. Mol Ecol 2023; 32:5479-5497. [PMID: 37702957 DOI: 10.1111/mec.17125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2023] [Revised: 08/21/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023]
Abstract
Studies of species that experience environmental heterogeneity across their distributions have become an important tool for understanding mechanisms of adaptation and predicting responses to climate change. We examine population structure, demographic history and environmentally associated genomic variation in Bombus vosnesenskii, a common bumble bee in the western USA, using whole genome resequencing of populations distributed across a broad range of latitudes and elevations. We find that B. vosnesenskii exhibits minimal population structure and weak isolation by distance, confirming results from previous studies using other molecular marker types. Similarly, demographic analyses with Sequentially Markovian Coalescent models suggest that minimal population structure may have persisted since the last interglacial period, with genomes from different parts of the species range showing similar historical effective population size trajectories and relatively small fluctuations through time. Redundancy analysis revealed a small amount of genomic variation explained by bioclimatic variables. Environmental association analysis with latent factor mixed modelling (LFMM2) identified few outlier loci that were sparsely distributed throughout the genome and although a few putative signatures of selective sweeps were identified, none encompassed particularly large numbers of loci. Some outlier loci were in genes with known regulatory relationships, suggesting the possibility of weak selection, although compared with other species examined with similar approaches, evidence for extensive local adaptation signatures in the genome was relatively weak. Overall, results indicate B. vosnesenskii is an example of a generalist with a high degree of flexibility in its environmental requirements that may ultimately benefit the species under periods of climate change.
Affiliation(s)
- Sam D Heraghty
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
- Jason M Jackson
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
- Jeffrey D Lozier
- Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, USA
|
20
|
Arnab SP, Amin MR, DeGiorgio M. Uncovering Footprints of Natural Selection Through Spectral Analysis of Genomic Summary Statistics. Mol Biol Evol 2023; 40:msad157. [PMID: 37433019 PMCID: PMC10365025 DOI: 10.1093/molbev/msad157] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/07/2022] [Revised: 06/28/2023] [Accepted: 07/06/2023] [Indexed: 07/13/2023] Open
Abstract
Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
Affiliation(s)
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
|
21
|
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. bioRxiv 2023:2023.06.15.545166. [PMID: 37398347 PMCID: PMC10312679 DOI: 10.1101/2023.06.15.545166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modelled by a realistic mutation rate and as part of a realistic distribution of fitness effects (DFE), as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modelled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false positive rates are in excess of true positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Teaser Text: Outlier-based genomic scans have proven a popular approach for identifying loci that have potentially experienced recent positive selection.
However, it has previously been shown that an evolutionarily appropriate baseline model that incorporates non-equilibrium population histories, purifying and background selection, and variation in mutation and recombination rates is necessary to reduce often extreme false positive rates when performing genomic scans. Here we evaluate the power to detect recurrent selective sweeps using common SFS-based and haplotype-based methods under these increasingly realistic models. We find that while these appropriate evolutionary baselines are essential to reduce false positive rates, the power to accurately detect recurrent selective sweeps is generally low across much of the biologically relevant parameter space.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Present address: Department of Biology, Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
|
22
|
Tobler R, Souilmi Y, Huber CD, Bean N, Turney CSM, Grey ST, Cooper A. The role of genetic selection and climatic factors in the dispersal of anatomically modern humans out of Africa. Proc Natl Acad Sci U S A 2023; 120:e2213061120. [PMID: 37220274 PMCID: PMC10235988 DOI: 10.1073/pnas.2213061120] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/29/2022] [Accepted: 03/14/2023] [Indexed: 05/25/2023] Open
Abstract
The evolutionarily recent dispersal of anatomically modern humans (AMH) out of Africa (OoA) and across Eurasia provides a unique opportunity to examine the impacts of genetic selection as humans adapted to multiple new environments. Analysis of ancient Eurasian genomic datasets (~1,000 to 45,000 y old) reveals signatures of strong selection, including at least 57 hard sweeps after the initial AMH movement OoA, which have been obscured in modern populations by extensive admixture during the Holocene. The spatiotemporal patterns of these hard sweeps provide a means to reconstruct early AMH population dispersals OoA. We identify a previously unsuspected extended period of genetic adaptation lasting ~30,000 y, potentially in the Arabian Peninsula area, prior to a major Neandertal genetic introgression and subsequent rapid dispersal across Eurasia as far as Australia. Consistent functional targets of selection initiated during this period, which we term the Arabian Standstill, include loci involved in the regulation of fat storage, neural development, skin physiology, and cilia function. Similar adaptive signatures are also evident in introgressed archaic hominin loci and modern Arctic human groups, and we suggest that this signal represents selection for cold adaptation. Surprisingly, many of the candidate selected loci across these groups appear to directly interact and coordinately regulate biological processes, with a number associated with major modern diseases including the ciliopathies, metabolic syndrome, and neurodegenerative disorders. This expands the potential for ancestral human adaptation to directly impact modern diseases, providing a platform for evolutionary medicine.
Affiliation(s)
- Raymond Tobler
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
| | - Yassine Souilmi
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
- Environment Institute, The University of Adelaide, Adelaide, SA5005, Australia
| | - Christian D. Huber
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
| | - Nigel Bean
- Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, SA5005, Australia
- School of Mathematical Sciences, The University of Adelaide, Adelaide, SA5005, Australia
| | - Chris S. M. Turney
- Division of Research, University of Technology Sydney, Ultimo, NSW2007, Australia
| | - Shane T. Grey
- School of Biotechnology and Biomolecular Sciences, Faculty of Science, University of New South Wales, Sydney, NSW2052, Australia
- Transplantation Immunology Group, Translation Science Pillar, Garvan Institute of Medical Research, Darlinghurst, NSW2010, Australia
| | - Alan Cooper
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
- Blue Sky Genetics, Ashton, SA5137, Australia
| |
|
23
|
Souilmi Y, Tobler R, Johar A, Williams M, Grey ST, Schmidt J, Teixeira JC, Rohrlach A, Tuke J, Johnson O, Gower G, Turney C, Cox M, Cooper A, Huber CD. Admixture has obscured signals of historical hard sweeps in humans. Nat Ecol Evol 2022; 6:2003-2015. [PMID: 36316412 PMCID: PMC9715430 DOI: 10.1038/s41559-022-01914-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 11/21/2021] [Accepted: 09/16/2022] [Indexed: 11/06/2022]
Abstract
The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.
Affiliation(s)
- Yassine Souilmi
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia.
| | - Raymond Tobler
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia.
- Evolution of Cultural Diversity Initiative, Australian National University, Canberra, Australian Capital Territory, Australia.
| | - Angad Johar
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia.
- Department of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA.
| | - Matthew Williams
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
| | - Shane T Grey
- Transplantation Immunology Group, Immunology Division, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
- St Vincent's Clinical School, Faculty of Medicine, UNSW, Darlinghurst, New South Wales, Australia
| | - Joshua Schmidt
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
| | - João C Teixeira
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
| | - Adam Rohrlach
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
| | - Jonathan Tuke
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- School of Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
| | - Olivia Johnson
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
| | - Graham Gower
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
| | - Chris Turney
- Chronos 14Carbon-Cycle Facility and Earth and Sustainability Science Research Centre, University of New South Wales, Sydney, New South Wales, Australia
| | - Murray Cox
- Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
| | - Alan Cooper
- South Australian Museum, Adelaide, South Australia, Australia.
- BlueSky Genetics, Ashton, South Australia, Australia.
| | - Christian D Huber
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia.
- Department of Biology, Penn State University, University Park, PA, USA.
| |
|
24
|
Koller D, Wendt FR, Pathak GA, De Lillo A, De Angelis F, Cabrera-Mendoza B, Tucci S, Polimanti R. Denisovan and Neanderthal archaic introgression differentially impacted the genetics of complex traits in modern populations. BMC Biol 2022; 20:249. [PMID: 36344982 PMCID: PMC9641937 DOI: 10.1186/s12915-022-01449-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/30/2022] [Accepted: 10/24/2022] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND: Introgression from extinct Neanderthal and Denisovan human species has been shown to contribute to the genetic pool of modern human populations and their phenotypic spectrum. Evidence of how Neanderthal introgression shaped the genetics of human traits and diseases has been extensively studied in populations of European descent, with signatures of admixture reported for instance in genes associated with pigmentation, immunity, and metabolic traits. However, limited information is currently available about the impact of archaic introgression on other ancestry groups. Additionally, to date, no study has been conducted with respect to the impact of Denisovan introgression on the health and disease of modern populations. Here, we compare the way evolutionary pressures shaped the genetics of complex traits in East Asian and European populations, and provide evidence of the impact of Denisovan introgression on the health of East Asian and Central/South Asian populations.
RESULTS: Leveraging genome-wide association statistics from the Biobank Japan and UK Biobank, we assessed whether Denisovan and Neanderthal introgression together with other evolutionary genomic signatures were enriched for the heritability of physiological and pathological conditions in populations of East Asian and European descent. In EAS, Denisovan-introgressed loci were enriched for coronary artery disease heritability (1.69-fold enrichment, p=0.003). No enrichment for archaic introgression was observed in EUR. We also performed a phenome-wide association study of Denisovan and Neanderthal alleles in six ancestry groups available in the UK Biobank. In EAS, the Denisovan-introgressed SNP rs62391664 in the major histocompatibility complex region was associated with albumin/globulin ratio (beta=-0.17, p=3.57×10⁻⁷). Neanderthal-introgressed alleles were associated with psychiatric and cognitive traits in EAS (e.g., "No Bipolar or Depression" - rs79043717 beta=-1.5, p=1.1×10⁻⁷), and with blood biomarkers (e.g., alkaline phosphatase - rs11244089 beta=0.1, p=3.69×10⁻¹¹⁶) and red hair color (rs60733936 beta=-0.86, p=4.49×10⁻¹⁶⁵) in EUR. In the other ancestry groups, Neanderthal alleles were associated with several traits, also including the use of certain medications (e.g., Central/South East Asia: indapamide - rs732632 beta=-2.38, p=5.22×10⁻⁷).
CONCLUSIONS: Our study provides novel evidence regarding the impact of archaic introgression on the genetics of complex traits in worldwide populations, highlighting the specific contribution of Denisovan introgression in EAS populations.
Affiliation(s)
- Dora Koller
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- VA CT Healthcare Center, West Haven, CT, 06516, USA
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, University of Barcelona, Barcelona, Catalonia, 08028, Spain
| | - Frank R Wendt
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- VA CT Healthcare Center, West Haven, CT, 06516, USA
| | - Gita A Pathak
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- VA CT Healthcare Center, West Haven, CT, 06516, USA
| | - Antonella De Lillo
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- Department of Biology, University of Rome "Tor Vergata", Rome, 00133, Italy
| | - Flavio De Angelis
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- VA CT Healthcare Center, West Haven, CT, 06516, USA
- Department of Biology, University of Rome "Tor Vergata", Rome, 00133, Italy
| | - Brenda Cabrera-Mendoza
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA
- VA CT Healthcare Center, West Haven, CT, 06516, USA
| | - Serena Tucci
- Department of Anthropology, Yale University, New Haven, CT, 06511, USA
| | - Renato Polimanti
- Department of Psychiatry, Yale University School of Medicine, West Haven, CT, 06516, USA.
- VA CT Healthcare Center, West Haven, CT, 06516, USA.
| |
|
25
|
Brooks E, Slender AL, Cu S, Breed MF, Stangoulis JCR. A range-wide analysis of population structure and genomic variation within the critically endangered spiny daisy (Acanthocladium dockeri). Conserv Genet 2022. [DOI: 10.1007/s10592-022-01468-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Understanding population structure and genetic diversity is important for designing effective conservation strategies. As a critically endangered shrub, the six remaining extant populations of spiny daisy (Acanthocladium dockeri) are restricted to country roadsides in the mid-north of South Australia, where the species faces many ongoing abiotic and biotic threats to survival. Currently the spiny daisy is managed by selecting individuals from the extant populations and translocating them to establish insurance populations. However, there is little information available on the genetic differentiation between populations and diversity within source populations, which are essential components of planning translocations. To help fill this knowledge gap, we analysed population structure within and among all six of its known wild populations using 7,742 SNPs generated by a genotyping-by-sequencing approach. Results indicated that each population was strongly differentiated and had low levels of genetic diversity, and that there was no evidence of inter-population gene flow. Individuals within each population were generally closely related; however, the Melrose population consisted entirely of clones. Our results suggest genetic rescue should be applied to wild spiny daisy populations to increase genetic diversity that will subsequently lead to greater intra-population fitness and adaptability. As a starting point, we suggest focussing on improving seed viability via inter-population crosses such as through hand pollination experiments to experimentally assess their sexual compatibility with the hope of increasing spiny daisy sexual reproduction and long-term reproductive fitness.
|
26
|
Cortés AJ, López-Hernández F, Blair MW. Genome-Environment Associations, an Innovative Tool for Studying Heritable Evolutionary Adaptation in Orphan Crops and Wild Relatives. Front Genet 2022; 13:910386. [PMID: 35991553 PMCID: PMC9389289 DOI: 10.3389/fgene.2022.910386] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/01/2022] [Accepted: 05/30/2022] [Indexed: 11/23/2022] Open
Abstract
Leveraging innovative tools to speed up prebreeding and discovery of genotypic sources of adaptation from landraces, crop wild relatives, and orphan crops is a key prerequisite to accelerate genetic gain of abiotic stress tolerance in annual crops such as legumes and cereals, many of which are still orphan species despite advances in major row crops. Here, we review a novel, interdisciplinary approach to combine ecological climate data with evolutionary genomics under the paradigm of a new field of study: genome-environment associations (GEAs). We first exemplify how GEA utilizes in situ georeferencing from genotypically characterized, gene bank accessions to pinpoint genomic signatures of natural selection. We later discuss the necessity to update the current GEA models to predict both regional- and local- or micro-habitat-based adaptation with mechanistic ecophysiological climate indices and cutting-edge GWAS-type genetic association models. Furthermore, to account for polygenic evolutionary adaptation, we encourage the community to start gathering genomic estimated adaptive values (GEAVs) for genomic prediction (GP) and multi-dimensional machine learning (ML) models. The latter two should ideally be weighted by de novo GWAS-based GEA estimates and optimized for a scalable marker subset. We end the review by envisioning avenues to make adaptation inferences more robust through the merging of high-resolution data sources, such as environmental remote sensing and summary statistics of the genomic site frequency spectrum, with the epigenetic molecular functionality responsible for plastic inheritance in the wild. Ultimately, we believe that coupling evolutionary adaptive predictions with innovations in ecological genomics such as GEA will help capture hidden genetic adaptations to abiotic stresses based on crop germplasm resources to assist responses to climate change. 
"I shall endeavor to find out how nature's forces act upon one another, and in what manner the geographic environment exerts its influence on animals and plants. In short, I must find out about the harmony in nature." Alexander von Humboldt, letter to Karl Freiesleben, June 1799.
Affiliation(s)
- Andrés J. Cortés
- Corporacion Colombiana de Investigacion Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
| | - Felipe López-Hernández
- Corporacion Colombiana de Investigacion Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
| | - Matthew W. Blair
- Department of Agricultural & Environmental Sciences, Tennessee State University, Nashville, TN, United States
| |
|
27
|
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023] Open
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the "width" of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
Affiliation(s)
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
| | - Zachary A. Szpiech
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
| |
|
28
|
Moinet A, Schlichta F, Peischl S, Excoffier L. Strong neutral sweeps occurring during a population contraction. Genetics 2022; 220:6529544. [PMID: 35171980 PMCID: PMC8982045 DOI: 10.1093/genetics/iyac021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/24/2021] [Accepted: 01/22/2022] [Indexed: 11/14/2022] Open
Abstract
A strong reduction in diversity around a specific locus is often interpreted as a recent rapid fixation of a positively selected allele, a phenomenon called a selective sweep. Rapid fixation of neutral variants can however lead to a similar reduction in local diversity, especially when the population experiences changes in population size, e.g. bottlenecks or range expansions. The fact that demographic processes can lead to signals of nucleotide diversity very similar to signals of selective sweeps is at the core of an ongoing discussion about the roles of demography and natural selection in shaping patterns of neutral variation. Here, we quantitatively investigate the shape of such neutral valleys of diversity under a simple model of a single population size change, and we compare it to signals of a selective sweep. We analytically describe the expected shape of such "neutral sweeps" and show that selective sweep valleys of diversity are, for the same fixation time, wider than neutral valleys. On the other hand, it is always possible to parametrize our model to find a neutral valley that has the same width as a given selected valley. Our findings provide further insight into how simple demographic models can create valleys of genetic diversity similar to those attributed to positive selection.
Affiliation(s)
- Antoine Moinet
- Interfaculty Bioinformatics Unit, University of Bern, Bern 3012, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
| | - Flávia Schlichta
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
| | - Stephan Peischl
- Interfaculty Bioinformatics Unit, University of Bern, Bern 3012, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Corresponding author
| | - Laurent Excoffier
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
| |
|
29
|
Ortega-Del Vecchyo D, Lohmueller KE, Novembre J. Haplotype-based inference of the distribution of fitness effects. Genetics 2022; 220:6501446. [PMID: 35100400 PMCID: PMC8982047 DOI: 10.1093/genetics/iyac002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/13/2021] [Accepted: 12/18/2021] [Indexed: 11/13/2022] Open
Abstract
Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some non-equilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive or negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.
Affiliation(s)
- Diego Ortega-Del Vecchyo
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, 76230, México
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
| | - Kirk E Lohmueller
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
| | - John Novembre
- Department of Human Genetics, University of Chicago, Chicago, Illinois, 60637, United States of America
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, 60637, United States of America
| |
|
30
|
Cheng JY, Stern AJ, Racimo F, Nielsen R. Detecting Selection in Multiple Populations by Modeling Ancestral Admixture Components. Mol Biol Evol 2022; 39:msab294. [PMID: 34626111 PMCID: PMC8763095 DOI: 10.1093/molbev/msab294] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/26/2022] Open
Abstract
One of the most powerful and commonly used approaches for detecting local adaptation in the genome is the identification of extreme allele frequency differences between populations. In this article, we present a new maximum likelihood method for finding regions under positive selection. It is based on a Gaussian approximation to allele frequency changes and it incorporates admixture between populations. The method can analyze multiple populations simultaneously and retains power to detect selection signatures specific to ancestry components that are not representative of any extant populations. Using simulated data, we compare our method to related approaches, and show that it is orders of magnitude faster than the state-of-the-art, while retaining similar or higher power for most simulation scenarios. We also apply it to human genomic data and identify loci with extreme genetic differentiation between major geographic groups. Many of the genes identified are previously known selected loci relating to hair pigmentation and morphology, skin, and eye pigmentation. We also identify new candidate regions, including various selected loci in the Native American component of admixed Mexican-Americans. These involve diverse biological functions, such as immunity, fat distribution, food intake, vision, and hair development.
Affiliation(s)
- Jade Yu Cheng
- Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Aaron J Stern
- Graduate Group in Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Fernando Racimo
- Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Rasmus Nielsen
- Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
| |
|
31
|
Qiu J, Zhou Q, Ye W, Chen Q, Bao YJ. SweepCluster: A SNP clustering tool for detecting gene-specific sweeps in prokaryotes. BMC Bioinformatics 2022; 23:19. [PMID: 34991447 PMCID: PMC8734265 DOI: 10.1186/s12859-021-04533-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2021] [Accepted: 12/13/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The gene-specific sweep is a selection process where an advantageous mutation along with the nearby neutral sites in a gene region increases in frequency in the population. It has been demonstrated to play important roles in ecological differentiation or phenotypic divergence in microbial populations. Therefore, identifying gene-specific sweeps in microorganisms will not only provide insights into the evolutionary mechanisms, but also unravel potential genetic markers associated with biological phenotypes. However, current methods were mainly developed for detecting selective sweeps in eukaryotic data of sparse genotypes and are not readily applicable to prokaryotic data. Furthermore, some challenges have not been sufficiently addressed by the methods, such as the low spatial resolution of sweep regions and lack of consideration of the spatial distribution of mutations.
RESULTS: We proposed a novel gene-centric and spatial-aware approach for identifying gene-specific sweeps in prokaryotes and implemented it in a Python tool, SweepCluster. Our method searches for gene regions with a high level of spatial clustering of pre-selected polymorphisms in genotype datasets assuming a null distribution model of neutral selection. The pre-selection of polymorphisms is based on their genetic signatures, such as elevated population subdivision, excessive linkage disequilibrium, or significant phenotype association. Performance evaluation using simulation data showed that the sensitivity and specificity of the clustering algorithm in SweepCluster are above 90%. The application of SweepCluster in two real datasets from the bacteria Streptococcus pyogenes and Streptococcus suis showed that the impact of pre-selection was dramatic and significantly reduced the uninformative signals. We validated our method using the genotype data from Vibrio cyclitrophicus, the only available dataset of gene-specific sweeps in bacteria, and obtained a concordance rate of 78%. We noted that the concordance rate could be underestimated due to distinct reference genomes and clustering strategies. The application to the human genotype datasets showed that SweepCluster is also applicable to eukaryotic data and is able to recover 80% of a catalog of known sweep regions.
CONCLUSION: SweepCluster is applicable to a broad category of datasets. It will be valuable for detecting gene-specific sweeps in diverse genotypic data and provide novel insights on adaptive evolution.
Affiliation(s)
- Junhui Qiu
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
| | - Qi Zhou
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
| | - Weicai Ye
- School of Computer Science and Engineering, Guangdong Province Key Laboratory of Computational Science, and National Engineering Laboratory for Big Data Analysis and Application, Sun Yat-Sen University, Guangzhou, 510275, China
| | - Qianjun Chen
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China.
| | - Yun-Juan Bao
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-Resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China.
| |
|
32
|
Yeaman S. Evolution of polygenic traits under global vs local adaptation. Genetics 2022; 220:iyab134. [PMID: 35134196 PMCID: PMC8733419 DOI: 10.1093/genetics/iyab134] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 06/30/2021] [Accepted: 08/05/2021] [Indexed: 12/14/2022] Open
Abstract
Observations about the number, frequency, effect size, and genomic distribution of alleles associated with complex traits must be interpreted in light of evolutionary process. These characteristics, which constitute a trait's genetic architecture, can dramatically affect evolutionary outcomes in applications from agriculture to medicine, and can provide a window into how evolution works. Here, I review theoretical predictions about the evolution of genetic architecture under spatially homogeneous, global adaptation as compared with spatially heterogeneous, local adaptation. Due to the tension between divergent selection and migration, local adaptation can favor "concentrated" genetic architectures that are enriched for alleles of larger effect, clustered in a smaller number of genomic regions, relative to expectations under global adaptation. However, the evolution of such architectures may be limited by many factors, including the genotypic redundancy of the trait, mutation rate, and temporal variability of environment. I review the circumstances in which predictions differ for global vs local adaptation and discuss where progress can be made in testing hypotheses using data from natural populations and lab experiments. As the field of comparative population genomics expands in scope, differences in architecture among traits and species will provide insights into how evolution works, and such differences must be interpreted in light of which kind of selection has been operating.
Affiliation(s)
- Sam Yeaman
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
|
33
|
Ma Y, Wariss HM, Liao R, Zhang R, Yun Q, Olmstead RG, Chau JH, Milne RI, Van de Peer Y, Sun W. Genome-wide analysis of butterfly bush (Buddleja alternifolia) in three uplands provides insights into biogeography, demography and speciation. New Phytol 2021; 232:1463-1476. [PMID: 34292587 PMCID: PMC9291457 DOI: 10.1111/nph.17637] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/31/2021] [Accepted: 07/19/2021] [Indexed: 05/06/2023]
Abstract
Understanding processes that generate and maintain large disjunctions within plant species can provide valuable insights into plant diversity and speciation. The butterfly bush Buddleja alternifolia has an unusual disjunct distribution, occurring in the Himalaya, Hengduan Mountains (HDM) and the Loess Plateau (LP) in China. We generated a high-quality, chromosome-level genome assembly of B. alternifolia, the first within the family Scrophulariaceae. Whole-genome re-sequencing data from 48 populations plus morphological and petal colour reflectance data covering its full distribution range were collected. Three distinct genetic lineages of B. alternifolia were uncovered, corresponding to Himalayan, HDM and LP populations, with the last also differentiated morphologically and phenologically, indicating occurrence of allopatric speciation likely to be facilitated by geographic isolation and divergent adaptation to distinct ecological niches. Moreover, speciation with gene flow between populations from either side of a mountain barrier could be under way within LP. The current disjunctions within B. alternifolia might result from vicariance of a once widespread distribution, followed by several past contraction and expansion events, possibly linked to climate fluctuations promoted by the Kunlun-Yellow river tectonic movement. Several adaptive genes are likely to be either uniformly or diversely selected among regions, providing a footprint of local adaptations. These findings provide new insights into plant biogeography, adaptation and different processes of allopatric speciation.
Affiliation(s)
- Yong‐Peng Ma
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- Hafiz Muhammad Wariss
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- Rong‐Li Liao
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- Fuzhou Botanical Garden, Fuzhou 350012, China
- Ren‐Gang Zhang
- Beijing Ori‐Gene Science and Technology Co. Ltd, Beijing 102206, China
- Quan‐Zheng Yun
- Beijing Ori‐Gene Science and Technology Co. Ltd, Beijing 102206, China
- Richard G. Olmstead
- Department of Biology and Burke Museum, University of Washington, Box 351800, Seattle, WA 98195, USA
- John H. Chau
- Centre for Ecological Genomics and Wildlife Conservation, Department of Zoology, University of Johannesburg, PO Box 524, Auckland Park 2006, South Africa
- Richard I. Milne
- Institute of Molecular Plant Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent B‐9052, Belgium
- VIB Center for Plant Systems Biology, Ghent B‐9052, Belgium
- College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Arcadia 0007, South Africa
- Wei‐Bang Sun
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
|
34
|
Luqman H, Widmer A, Fior S, Wegmann D. Identifying loci under selection via explicit demographic models. Mol Ecol Resour 2021; 21:2719-2737. [PMID: 33964107 PMCID: PMC8596768 DOI: 10.1111/1755-0998.13415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/20/2020] [Revised: 04/03/2021] [Accepted: 04/28/2021] [Indexed: 01/28/2023]
Abstract
Adaptive genetic variation is a function of both selective and neutral forces. To accurately identify adaptive loci, it is thus critical to account for demographic history. Theory suggests that signatures of selection can be inferred using the coalescent, following the premise that genealogies of selected loci deviate from neutral expectations. Here, we build on this theory to develop an analytical framework to identify loci under selection via explicit demographic models (LSD). Under this framework, signatures of selection are inferred through deviations in demographic parameters, rather than through summary statistics directly, and demographic history is accounted for explicitly. Leveraging the property of demographic models to incorporate directionality, we show that LSD can provide information on the environment in which selection acts on a population. This can prove useful in elucidating the selective processes underlying local adaptation, by characterizing genetic trade-offs and extending the concepts of antagonistic pleiotropy and conditional neutrality from ecological theory to practical application in genomic data. We implement LSD via approximate Bayesian computation and demonstrate, via simulations, that LSD (a) has high power to identify selected loci across a large range of demographic-selection regimes, (b) outperforms commonly applied genome-scan methods under complex demographies and (c) accurately infers the directionality of selection for identified candidates. Using the same simulations, we further characterize the behaviour of isolation-with-migration models conducive to the study of local adaptation under regimes of selection. Finally, we demonstrate an application of LSD by detecting loci and characterizing genetic trade-offs underlying flower colour in Antirrhinum majus.
Affiliation(s)
- Hirzi Luqman
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Alex Widmer
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Simone Fior
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Daniel Wegmann
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Swiss Institute of Bioinformatics, Fribourg, Switzerland
|
35
|
Bisschop G, Lohse K, Setter D. Sweeps in time: leveraging the joint distribution of branch lengths. Genetics 2021; 219:iyab119. [PMID: 34849880 PMCID: PMC8633083 DOI: 10.1093/genetics/iyab119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/11/2021] [Accepted: 07/10/2021] [Indexed: 11/14/2022] Open
Abstract
Current methods of identifying positively selected regions in the genome are limited in two key ways: the underlying models cannot account for the timing of adaptive events and the comparison between models of selective sweeps and sequence data is generally made via simple summaries of genetic diversity. Here, we develop a tractable method of describing the effect of positive selection on the genealogical histories in the surrounding genome, explicitly modeling both the timing and context of an adaptive event. In addition, our framework allows us to go beyond analyzing polymorphism data via the site frequency spectrum or summaries thereof and instead leverage information contained in patterns of linked variants. Tests on both simulations and a human data example, as well as a comparison to SweepFinder2, show that even with very small sample sizes, our analytic framework has higher power to identify old selective sweeps and to correctly infer both the time and strength of selection. Finally, we derived the marginal distribution of genealogical branch lengths at a locus affected by selection acting at a linked site. This provides a much-needed link between our analytic understanding of the effects of sweeps on sequence variation and recent advances in simulation and heuristic inference procedures that allow researchers to examine the sequence of genealogical histories along the genome.
Affiliation(s)
- Gertjan Bisschop
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
- Derek Setter
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
|
36
|
Colella JP, Tigano A, Dudchenko O, Omer AD, Khan R, Bochkov ID, Aiden EL, MacManes MD. Limited Evidence for Parallel Evolution Among Desert-Adapted Peromyscus Deer Mice. J Hered 2021; 112:286-302. [PMID: 33686424 PMCID: PMC8141686 DOI: 10.1093/jhered/esab009] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/25/2020] [Accepted: 02/27/2021] [Indexed: 01/14/2023] Open
Abstract
Warming climate and increasing desertification urge the identification of genes involved in heat and dehydration tolerance to better inform and target biodiversity conservation efforts. Comparisons among extant desert-adapted species can highlight parallel or convergent patterns of genome evolution through the identification of shared signatures of selection. We generate a chromosome-level genome assembly for the canyon mouse (Peromyscus crinitus) and test for a signature of parallel evolution by comparing signatures of selective sweeps across population-level genomic resequencing data from another congeneric desert specialist (Peromyscus eremicus) and a widely distributed habitat generalist (Peromyscus maniculatus), that may be locally adapted to arid conditions. We identify few shared candidate loci involved in desert adaptation and do not find support for a shared pattern of parallel evolution. Instead, we hypothesize divergent molecular mechanisms of desert adaptation among deer mice, potentially tied to species-specific historical demography, which may limit or enhance adaptation. We identify a number of candidate loci experiencing selective sweeps in the P. crinitus genome that are implicated in osmoregulation (Trypsin, Prostasin) and metabolic tuning (Kallikrein, eIF2-alpha kinase GCN2, APPL1/2), which may be important for accommodating hot and dry environmental conditions.
Affiliation(s)
- Jocelyn P Colella
- Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH; Hubbard Genome Center, University of New Hampshire, Durham, NH; Biodiversity Institute, University of Kansas, Lawrence, KS
- Anna Tigano
- Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH; Hubbard Genome Center, University of New Hampshire, Durham, NH
- Olga Dudchenko
- Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Center for Theoretical and Biological Physics, Rice University, Houston, TX; Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX
- Arina D Omer
- Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
- Ruqayya Khan
- Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX
- Ivan D Bochkov
- Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX
- Erez L Aiden
- Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Center for Theoretical and Biological Physics, Rice University, Houston, TX; Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX; Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai 201210, China; School of Agriculture and Environment, University of Western Australia, Perth, WA, Australia
- Matthew D MacManes
- Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH; Hubbard Genome Center, University of New Hampshire, Durham, NH
|
37
|
Lloret-Villas A, Bhati M, Kadri NK, Fries R, Pausch H. Investigating the impact of reference assembly choice on genomic analyses in a cattle breed. BMC Genomics 2021; 22:363. [PMID: 34011274 PMCID: PMC8132449 DOI: 10.1186/s12864-021-07554-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/15/2021] [Accepted: 03/22/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Reference-guided read alignment and variant genotyping are prone to reference allele bias, particularly for samples that are greatly divergent from the reference genome. A Hereford-based assembly is the widely accepted bovine reference genome. Haplotype-resolved genomes that exceed the current bovine reference genome in quality and continuity have been assembled for different breeds of cattle. Using whole genome sequencing data of 161 Brown Swiss cattle, we compared the accuracy of read mapping and sequence variant genotyping as well as downstream genomic analyses between the bovine reference genome (ARS-UCD1.2) and a highly continuous Angus-based assembly (UOA_Angus_1). RESULTS Read mapping accuracy did not differ notably between the ARS-UCD1.2 and UOA_Angus_1 assemblies. We discovered 22,744,517 and 22,559,675 high-quality variants from ARS-UCD1.2 and UOA_Angus_1, respectively. The concordance between sequence- and array-called genotypes was high and the number of variants deviating from Hardy-Weinberg proportions was low at segregating sites for both assemblies. More artefactual INDELs were genotyped from UOA_Angus_1 than ARS-UCD1.2 alignments. Using the composite likelihood ratio test, we detected 40 and 33 signatures of selection from ARS-UCD1.2 and UOA_Angus_1, respectively, but the overlap between both assemblies was low. Using the 161 sequenced Brown Swiss cattle as a reference panel, we imputed sequence variant genotypes into a mapping cohort of 30,499 cattle that had microarray-derived genotypes using a two-step imputation approach. The accuracy of imputation (Beagle R²) was very high (0.87) for both assemblies. Genome-wide association studies between imputed sequence variant genotypes and six dairy traits as well as stature produced almost identical results from both assemblies. CONCLUSIONS The ARS-UCD1.2 and UOA_Angus_1 assemblies are suitable for reference-guided genome analyses in Brown Swiss cattle. Although differences in read mapping and genotyping accuracy between both assemblies are negligible, the choice of the reference genome has a large impact on detecting signatures of selection that already reached fixation using the composite likelihood ratio test. We developed a workflow that can be adapted and reused to compare the impact of reference genomes on genome analyses in various breeds, populations and species.
Affiliation(s)
- Meenu Bhati
- Animal Genomics, ETH Zürich, Lindau, 8315 Switzerland
- Ruedi Fries
- Chair of Animal Breeding, TU München, Freising-Weihenstephan, 85354 Germany
- Hubert Pausch
- Animal Genomics, ETH Zürich, Lindau, 8315 Switzerland
|
38
|
Harris AM, DeGiorgio M. A Likelihood Approach for Uncovering Selective Sweep Signatures from Haplotype Data. Mol Biol Evol 2021; 37:3023-3046. [PMID: 32392293 PMCID: PMC7530616 DOI: 10.1093/molbev/msaa115] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022] Open
Abstract
Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, PA; Molecular, Cellular, and Integrative Biosciences, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
|
39
|
Hill T, Unckless RL. Adaptation, ancestral variation and gene flow in a 'Sky Island' Drosophila species. Mol Ecol 2021; 30:83-99. [PMID: 33089581 PMCID: PMC7945764 DOI: 10.1111/mec.15701] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/16/2020] [Revised: 09/28/2020] [Accepted: 10/08/2020] [Indexed: 02/06/2023]
Abstract
Over time, populations of species can expand, contract, fragment and become isolated, creating subpopulations that must adapt to local conditions. Understanding how species maintain variation after divergence as well as adapt to these changes in the face of gene flow is of great interest, especially as the current climate crisis has caused range shifts and frequent migrations for many species. Here, we characterize how a mycophagous fly species, Drosophila innubila, came to inhabit and adapt to its current range, which includes mountain forests in south-western USA separated by large expanses of desert. Using population genomic data from more than 300 wild-caught individuals, we examine four populations to determine their population history in these mountain forests, looking for signatures of local adaptation. In this first extensive study, establishing D. innubila as a key genomic "Sky Island" model, we find D. innubila spread northwards during the previous glaciation period (30-100 KYA) and has recently expanded even further (0.2-2 KYA). D. innubila shows little evidence of population structure, consistent with a recent establishment and genetic variation maintained since before geographic stratification. We also find some signatures of recent selective sweeps in chorion proteins and population differentiation in antifungal immune genes, suggesting differences in the environments to which flies are adapting. However, we find little support for long-term recurrent selection in these genes. In contrast, we find evidence of long-term recurrent positive selection in immune pathways such as the Toll signalling system and the Toll-regulated antimicrobial peptides.
Affiliation(s)
- Tom Hill
- 4055 Haworth Hall, The Department of Molecular Biosciences, University of Kansas, 1200 Sunnyside Avenue, Lawrence, KS 66045
- Robert L. Unckless
- 4055 Haworth Hall, The Department of Molecular Biosciences, University of Kansas, 1200 Sunnyside Avenue, Lawrence, KS 66045
|
40
|
Wendt FR, Pathak GA, Overstreet C, Tylee DS, Gelernter J, Atkinson EG, Polimanti R. Characterizing the effect of background selection on the polygenicity of brain-related traits. Genomics 2021; 113:111-119. [PMID: 33278486 PMCID: PMC7855394 DOI: 10.1016/j.ygeno.2020.11.032] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/06/2020] [Revised: 11/20/2020] [Accepted: 11/30/2020] [Indexed: 01/10/2023]
Abstract
BACKGROUND Genome-wide association studies (GWAS) have demonstrated that psychopathology phenotypes are affected by many risk alleles with small effect (polygenicity). It is unclear how ubiquitously evolutionary pressures influence the genetic architecture of these traits. METHODS We partitioned SNP heritability to assess the contribution of background (BGS) and positive selection, Neanderthal local ancestry, functional significance, and genotype networks in 75 brain-related traits (8411 ≤ N ≤ 1,131,181, mean N = 205,289). We applied binary annotations by dichotomizing each measure based on top 2%, 1%, and 0.5% of all scores genome-wide. Effect size distribution features were calculated using GENESIS. We tested the relationship between effect size distribution descriptive statistics and natural selection. In a subset of traits, we explore the inclusion of diagnostic heterogeneity (e.g., number of diagnostic combinations and total symptoms) in the tested relationship. RESULTS SNP-heritability was enriched (false discovery rate q < 0.05) for loci with elevated BGS (7 phenotypes) and in genic (34 phenotypes) and loss-of-function (LoF)-intolerant regions (67 phenotypes). These effects were strongest in GWAS of schizophrenia (1.90-fold BGS, 1.16-fold genic, and 1.92-fold LoF), educational attainment (1.86-fold BGS, 1.12-fold genic, and 1.79-fold LoF), and cognitive performance (2.29-fold BGS, 1.12-fold genic, and 1.79-fold LoF). BGS (top 2%) significantly predicted effect size variance for trait-associated loci (σ² parameter) in 75 brain-related traits (β = 4.39 × 10⁻⁵, p = 1.43 × 10⁻⁵, model r² = 0.548). Considering the number of DSM-5 diagnostic combinations per psychiatric disorder improved model fit (σ² ~ BTop2% × Genic × diagnostic combinations; model r² = 0.661). CONCLUSIONS Brain-related phenotypes with larger variance in risk locus effect sizes are associated with loci under BGS. We show exploratory results suggesting that diagnostic complexity may also contribute to the increased polygenicity of psychiatric disorders.
Affiliation(s)
- Frank R Wendt
- Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Gita A Pathak
- Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Cassie Overstreet
- National Center for Posttraumatic Stress Disorder, Clinical Neurosciences Division, VA CT Healthcare System and Department of Psychiatry, Yale University School of Medicine, USA
- Daniel S Tylee
- Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Joel Gelernter
- Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA; Departments of Genetics and Neuroscience, Yale University School of Medicine, New Haven, CT 06510, USA
- Elizabeth G Atkinson
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Renato Polimanti
- Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
|
41
|
Schneider K, White TJ, Mitchell S, Adams CE, Reeve R, Elmer KR. The pitfalls and virtues of population genetic summary statistics: Detecting selective sweeps in recent divergences. J Evol Biol 2020; 34:893-909. [DOI: 10.1111/jeb.13738] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/15/2020] [Revised: 10/22/2020] [Accepted: 10/24/2020] [Indexed: 12/12/2022]
Affiliation(s)
- Kevin Schneider
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Tom J. White
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Sonia Mitchell
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Colin E. Adams
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Scottish Centre for Ecology and the Natural Environment, Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Richard Reeve
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
- Kathryn R. Elmer
- Institute of Biodiversity, Animal Health & Comparative Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, UK
|
42
|
Siddiqui SS, Vaill M, Do R, Khan N, Verhagen AL, Zhang W, Lenz HJ, Johnson-Pais TL, Leach RJ, Fraser G, Wang C, Feng GS, Varki N, Varki A. Human-specific polymorphic pseudogenization of SIGLEC12 protects against advanced cancer progression. FASEB Bioadv 2020; 3:69-82. [PMID: 33615152 PMCID: PMC7876704 DOI: 10.1096/fba.2020-00092] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/24/2020] [Revised: 10/08/2020] [Accepted: 10/09/2020] [Indexed: 12/22/2022] Open
Abstract
Compared with our closest living evolutionary cousins, humans appear unusually prone to develop carcinomas (cancers arising from epithelia). The SIGLEC12 gene, which encodes the Siglec-XII protein expressed on epithelial cells, has several uniquely human features: a fixed homozygous missense mutation inactivating its natural ligand recognition property; a polymorphic frameshift mutation eliminating full-length protein expression in ~60%-70% of worldwide human populations; and, genomic features suggesting a negative selective sweep favoring the pseudogene state. Despite the loss of canonical sialic acid binding, Siglec-XII still recruits Shp2 and accelerates tumor growth in a mouse model. We hypothesized that dysfunctional Siglec-XII facilitates human carcinoma progression, correlating with known tumorigenic signatures of Shp2-dependent cancers. Immunohistochemistry was used to detect Siglec-XII expression on tissue microarrays. PC-3 prostate cancer cells were transfected with Siglec-XII and transcription of genes enriched with Siglec-XII was determined. Genomic SIGLEC12 status was determined for four different cancer cohorts. Finally, a dot blot analysis of human urinary epithelial cells was established to determine the Siglec-XII expressors versus non-expressors. Forced expression in a SIGLEC12 null carcinoma cell line enriched transcription of genes associated with cancer progression. While Siglec-XII was detected as expected in ~30%-40% of normal epithelia, ~80% of advanced carcinomas showed strong expression. Notably, >80% of late-stage colorectal cancers had a functional SIGLEC12 allele, correlating with overall increased mortality. Thus, advanced carcinomas are much more likely to occur in individuals whose genomes have an intact SIGLEC12 gene, likely because the encoded Siglec-XII protein recruits Shp2-related oncogenic pathways. The finding has prognostic, diagnostic, and therapeutic implications.
Affiliation(s)
- Shoib S Siddiqui
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA.,Present address: Department of Biotechnology, American University of Ras Al Khaimah (AURAK), American University of Ras Al Khaimah Road, Al Burairat Area, Ras Al Khaimah, UAE
| | - Michael Vaill
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Raymond Do
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Naazneen Khan
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Andrea L Verhagen
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Wu Zhang
- University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA, USA
| | - Heinz-Josef Lenz
- University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA, USA
| | | | - Robin J Leach
- Department of Urology, University of Texas Health Science Center, San Antonio, TX, USA.,Departments of Cell Systems and Anatomy, University of Texas Health Science Center, San Antonio, TX, USA
| | - Gary Fraser
- School of Public Health, Loma Linda University, Loma Linda, CA, USA
| | - Charles Wang
- School of Public Health, Loma Linda University, Loma Linda, CA, USA
| | - Gen-Sheng Feng
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Nissi Varki
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| | - Ajit Varki
- Departments of Medicine, Cellular and Molecular Medicine, and Pathology, Glycobiology Research and Training Center and Center for Academic Research and Training in Anthropogeny, University of California San Diego, CA, USA
| |
|
43
|
Whitelaw BL, Cooke IR, Finn J, da Fonseca RR, Ritschard EA, Gilbert MTP, Simakov O, Strugnell JM. Adaptive venom evolution and toxicity in octopods is driven by extensive novel gene formation, expansion, and loss. Gigascience 2020; 9:giaa120. [PMID: 33175168 PMCID: PMC7656900 DOI: 10.1093/gigascience/giaa120] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/11/2020] [Revised: 08/10/2020] [Accepted: 10/06/2020] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: Cephalopods represent a rich system for investigating the genetic basis underlying organismal novelties. This diverse group of specialized predators has evolved many adaptations including proteinaceous venom. Of particular interest is the blue-ringed octopus genus (Hapalochlaena), whose members are the only octopods known to store large quantities of the potent neurotoxin, tetrodotoxin, within their tissues and venom gland. FINDINGS: To reveal genomic correlates of organismal novelties, we conducted a comparative study of 3 octopod genomes, including the Southern blue-ringed octopus (Hapalochlaena maculosa). We present the genome of this species and reveal highly dynamic evolutionary patterns at both non-coding and coding organizational levels. Gene family expansions previously reported in Octopus bimaculoides (e.g., zinc finger and cadherins, both associated with neural functions), as well as formation of novel gene families, dominate the genomic landscape in all octopods. Examination of tissue-specific genes in the posterior salivary gland revealed that expression was dominated by serine proteases in non-tetrodotoxin-bearing octopods, while this family was a minor component in H. maculosa. Moreover, voltage-gated sodium channels in H. maculosa contain a resistance mutation found in pufferfish and garter snakes, which is exclusive to the genus. Analysis of the posterior salivary gland microbiome revealed a diverse array of bacterial species, including genera that can produce tetrodotoxin, suggestive of a possible production source. CONCLUSIONS: We present the first genome of a tetrodotoxin-bearing octopod, H. maculosa, which displays lineage-specific adaptations to tetrodotoxin acquisition. This genome, along with other recently published cephalopod genomes, represents a valuable resource from which future work could advance our understanding of the evolution of genomic novelty in this family.
Affiliation(s)
- Brooke L Whitelaw
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, 1 James Cook Dr, Douglas QLD 4811, Australia
- Sciences, Museum Victoria, 11 Nicholson St, Carlton, Victoria 3053, Australia
| | - Ira R Cooke
- College of Public Health, Medical and Vet Sciences, James Cook University, 1 James Cook Dr, Douglas QLD 4811, Australia
- La Trobe Institute of Molecular Science, La Trobe University, Plenty Rd & Kingsbury Dr, Bundoora, Melbourne, Victoria 3086, Australia
| | - Julian Finn
- Sciences, Museum Victoria, 11 Nicholson St, Carlton, Victoria 3053, Australia
| | - Rute R da Fonseca
- Center for Macroecology, Evolution and Climate (CMEC), GLOBE Institute, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen, Denmark
| | - Elena A Ritschard
- Department of Neurosciences and Developmental Biology, University of Vienna, Universitätsring 1, 1010 Wien, Vienna, Austria
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Naples, Italy
| | - M T P Gilbert
- Center for Evolutionary Hologenomics, GLOBE Institute, University of Copenhagen, Øster Voldgade 5–7, 1350 Copenhagen, Denmark
| | - Oleg Simakov
- Department of Neurosciences and Developmental Biology, University of Vienna, Universitätsring 1, 1010 Wien, Vienna, Austria
| | - Jan M Strugnell
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, 1 James Cook Dr, Douglas QLD 4811, Australia
- Department of Ecology, Environment and Evolution, La Trobe University, Plenty Rd & Kingsbury Dr, Bundoora, Melbourne, Victoria 3086, Australia
| |
|
44
|
Cooke I, Ying H, Forêt S, Bongaerts P, Strugnell JM, Simakov O, Zhang J, Field MA, Rodriguez-Lanetty M, Bell SC, Bourne DG, van Oppen MJ, Ragan MA, Miller DJ. Genomic signatures in the coral holobiont reveal host adaptations driven by Holocene climate change and reef specific symbionts. Science Advances 2020; 6(48):eabc6318. [PMID: 33246955 PMCID: PMC7695477 DOI: 10.1126/sciadv.abc6318] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 05/09/2020] [Accepted: 10/15/2020] [Indexed: 05/24/2023]
Abstract
Genetic signatures caused by demographic and adaptive processes during past climatic shifts can inform predictions of species' responses to anthropogenic climate change. To identify these signatures in Acropora tenuis, a reef-building coral threatened by global warming, we first assembled the genome from long reads and then used shallow whole-genome resequencing of 150 colonies from the central inshore Great Barrier Reef to inform population genomic analyses. We identify population structure in the host that reflects a Pleistocene split, whereas photosymbiont differences between reefs most likely reflect contemporary (Holocene) conditions. Signatures of selection in the host were associated with genes linked to diverse processes including osmotic regulation, skeletal development, and the establishment and maintenance of symbiosis. Our results suggest that adaptation to post-glacial climate change in A. tenuis has involved selection on many genes, while differences in symbiont specificity between reefs appear to be unrelated to host population structure.
Affiliation(s)
- Ira Cooke
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia.
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
| | - Hua Ying
- Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - Sylvain Forêt
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- ARC Centre of Excellence for Coral Reef Studies, Australian National University, Canberra, ACT, Australia
| | - Pim Bongaerts
- California Academy of Sciences, Golden Gate Park, San Francisco, CA, USA
| | - Jan M Strugnell
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, Queensland, Australia
- Department of Ecology, Environment and Evolution, School of Life Sciences, La Trobe University, Melbourne, Australia
- College of Science and Engineering, James Cook University, Townsville, Queensland, Australia
| | - Oleg Simakov
- Department of Molecular Evolution and Development, University of Vienna, Austria
| | - Jia Zhang
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
| | - Matt A Field
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, Queensland, Australia
| | - Mauricio Rodriguez-Lanetty
- Institute of Environment and Department of Biological Sciences, Florida International University, Miami, FL 33199, USA
| | - Sara C Bell
- Australian Institute of Marine Science, Townsville, Queensland, Australia
| | - David G Bourne
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- College of Science and Engineering, James Cook University, Townsville, Queensland, Australia
- Australian Institute of Marine Science, Townsville, Queensland, Australia
| | - Madeleine JH van Oppen
- Australian Institute of Marine Science, Townsville, Queensland, Australia
- School of BioSciences, University of Melbourne, Melbourne, Australia
| | - Mark A Ragan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - David J Miller
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia.
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
| |
|
45
|
Cortés AJ, López-Hernández F, Osorio-Rodriguez D. Predicting Thermal Adaptation by Looking Into Populations' Genomic Past. Front Genet 2020; 11:564515. [PMID: 33101385 PMCID: PMC7545011 DOI: 10.3389/fgene.2020.564515] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/21/2020] [Accepted: 08/24/2020] [Indexed: 12/18/2022] Open
Abstract
Molecular evolution offers an insightful theory to interpret the genomic consequences of thermal adaptation to previous events of climate change beyond range shifts. However, disentangling often mixed footprints of selective and demographic processes from those due to lineage sorting, recombination rate variation, and genomic constraints is not trivial. Therefore, here we condense current and historical population genomic tools to study thermal adaptation and outline key developments (genomic prediction, machine learning) that might assist their utilization for improving forecasts of populations' responses to thermal variation. We start by summarizing how recent thermal-driven selective and demographic responses can be inferred by coalescent methods and in turn how quantitative genetic theory offers suitable multi-trait predictions over a few generations via the breeder's equation. We later assume that enough generations have passed to display genomic signatures of divergent selection to thermal variation and describe how these footprints can be reconstructed using genome-wide association and selection scans or, alternatively, may be used for forward prediction over multiple generations under an infinitesimal genomic prediction model. Finally, we move deeper in time to comprehend the genomic consequences of thermal shifts at an evolutionary time scale by relying on phylogeographic approaches that allow for reticulate evolution and ecological parapatric speciation, and end by envisioning the potential of modern machine learning techniques to better inform long-term predictions. We conclude that foreseeing future thermal adaptive responses requires bridging the multiple spatial scales of historical and predictive environmental change research under modern cohesive approaches such as genomic prediction and machine learning frameworks.
Affiliation(s)
- Andrés J Cortés
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia.,Departamento de Ciencias Forestales, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia - Sede Medellín, Medellín, Colombia
| | - Felipe López-Hernández
- Corporación Colombiana de Investigación Agropecuaria AGROSAVIA, C.I. La Selva, Rionegro, Colombia
| | - Daniela Osorio-Rodriguez
- Division of Geological and Planetary Sciences, California Institute of Technology (Caltech), Pasadena, CA, United States
| |
|
46
|
Horscroft C, Ennis S, Pengelly RJ, Sluckin TJ, Collins A. Sequencing era methods for identifying signatures of selection in the genome. Brief Bioinform 2020; 20:1997-2008. [PMID: 30053138 DOI: 10.1093/bib/bby064] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/26/2018] [Revised: 05/16/2018] [Indexed: 12/12/2022] Open
Abstract
Insights into genetic loci which are under selection and their functional roles contribute to increased understanding of the patterns of phenotypic variation we observe today. The availability of whole-genome sequence data, for humans and other species, provides opportunities to investigate adaptation and evolution at unprecedented resolution. Many analytical methods have been developed to interrogate these large data sets and characterize signatures of selection in the genome. We review here recently developed methods and consider the impact of increased computing power and data availability on the detection of selection signatures. Consideration of demography, recombination and other confounding factors is important, and use of a range of methods in combination is a powerful route to resolving different forms of selection in genome sequence data. Overall, a substantial improvement in methods for application to whole-genome sequencing is evident, although further work is required to develop robust and computationally efficient approaches which may increase reproducibility across studies.
Affiliation(s)
- Clare Horscroft
- Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK.,Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
| | - Sarah Ennis
- Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK.,Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
| | - Reuben J Pengelly
- Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK.,Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
| | - Timothy J Sluckin
- Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK.,Mathematical Sciences, University of Southampton, Highfield, Southampton, UK
| | - Andrew Collins
- Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK.,Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
| |
|
47
|
Lewis JJ, Van Belleghem SM, Papa R, Danko CG, Reed RD. Many functionally connected loci foster adaptive diversification along a neotropical hybrid zone. Science Advances 2020; 6(39):eabb8617. [PMID: 32978147 PMCID: PMC7518860 DOI: 10.1126/sciadv.abb8617] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/23/2020] [Accepted: 08/11/2020] [Indexed: 05/02/2023]
Abstract
Characterizing the genetic complexity of adaptation and trait evolution is a major emphasis of evolutionary biology and genetics. Incongruent findings from genetic studies have resulted in conceptual models ranging from a few large-effect loci to massively polygenic architectures. Here, we combine chromatin immunoprecipitation sequencing, Hi-C, RNA sequencing, and 40 whole-genome sequences from Heliconius butterflies to show that red color pattern diversification occurred via many genomic loci. We find that the red wing pattern master regulatory transcription factor Optix binds dozens of loci also under selection, which frequently form three-dimensional adaptive hubs with selection acting on multiple physically interacting genes. Many Optix-bound genes under selection are tied to pigmentation and wing development, and these loci collectively maintain separation between adaptive red color pattern phenotypes in natural populations. We propose a model of trait evolution where functional connections between loci may resolve much of the disparity between large-effect and polygenic evolutionary models.
Affiliation(s)
- James J Lewis
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA.
- Baker Institute for Animal Health, Cornell University, Ithaca, NY, USA
| | | | - Riccardo Papa
- Department of Biology, University of Puerto Rico-Rio Piedras, San Juan, Puerto Rico
- Molecular Sciences and Research Center, University of Puerto Rico, San Juan, Puerto Rico
| | - Charles G Danko
- Baker Institute for Animal Health, Cornell University, Ithaca, NY, USA
| | - Robert D Reed
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
| |
|
48
|
Muntané G, Farré X, Bosch E, Martorell L, Navarro A, Vilella E. The shared genetic architecture of schizophrenia, bipolar disorder and lifespan. Hum Genet 2020; 140:441-455. [PMID: 32772156 DOI: 10.1007/s00439-020-02213-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/11/2020] [Accepted: 07/27/2020] [Indexed: 12/11/2022]
Abstract
Psychiatric disorders such as Schizophrenia (SCZ) and Bipolar Disorder (BD) represent an evolutionary paradox, as they exhibit strong negative effects on fitness, such as decreased fecundity and early mortality, yet they persist at a worldwide prevalence of approximately 1%. Molecular mechanisms affecting lifespan, which may be widely common among complex diseases with fitness effects, can be studied by the integrated analysis of data from genome-wide association studies (GWAS) of human longevity together with any disease of interest. Here, we report the first of such studies, focusing on the genetic overlap (pleiotropy) between two psychiatric disorders with shortened lifespan, SCZ and BD, and human parental lifespan (PLS) as a surrogate of life expectancy. Our results are twofold: first, we demonstrate extensive polygenic overlap between SCZ and PLS and, to a lesser extent, between BD and PLS. Second, we identified novel loci shared between PLS and SCZ (n = 39), and BD (n = 8). Whereas most of the identified SCZ (66%) and BD (62%) pleiotropic risk alleles were associated with reduced lifespan, we also detected some antagonistic protective alleles associated with shorter lifespans. In fact, the top SCZ-associated SNPs seem to explain longevity variance (longevity variance explained, LVE) better than many other life-threatening diseases, including Type 2 diabetes and most cancers, probably due to a high overlap with smoking-related pathways. Overall, our study provides evidence of a genetic burden driven through premature mortality among people with SCZ, which can have profound implications for understanding, and potentially treating, the mortality gap associated with this psychiatric disorder.
Affiliation(s)
- Gerard Muntané
- Biomedical Network Research Centre on Mental Health (CIBERSAM), Hospital Universitari Institut Pere Mata, IISPV Universitat Rovira i Virgili, Reus, Spain.,Departament de Ciències Experimentals i de la Salut, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.
| | - Xavier Farré
- Departament de Ciències Experimentals i de la Salut, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Elena Bosch
- Departament de Ciències Experimentals i de la Salut, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Lourdes Martorell
- Biomedical Network Research Centre on Mental Health (CIBERSAM), Hospital Universitari Institut Pere Mata, IISPV Universitat Rovira i Virgili, Reus, Spain
| | - Arcadi Navarro
- Departament de Ciències Experimentals i de la Salut, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.,Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.,Institució Catalana de Recerca i Estudis Avançats, ICREA, Barcelona, Spain.,Barcelonaβeta Brain Research Center, Fundació Pasqual Maragall, Barcelona, Spain
| | - Elisabet Vilella
- Biomedical Network Research Centre on Mental Health (CIBERSAM), Hospital Universitari Institut Pere Mata, IISPV Universitat Rovira i Virgili, Reus, Spain
| |
|
49
|
VolcanoFinder: Genomic scans for adaptive introgression. PLoS Genet 2020; 16:e1008867. [PMID: 32555579 PMCID: PMC7326285 DOI: 10.1371/journal.pgen.1008867] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 07/19/2019] [Revised: 06/30/2020] [Accepted: 05/18/2020] [Indexed: 12/16/2022] Open
Abstract
Recent research shows that introgression between closely-related species is an important source of adaptive alleles for a wide range of taxa. Typically, detection of adaptive introgression from genomic data relies on comparative analyses that require sequence data from both the recipient and the donor species. However, in many cases, the donor is unknown or the data is not currently available. Here, we introduce a genome-scan method, VolcanoFinder, to detect recent events of adaptive introgression using polymorphism data from the recipient species only. VolcanoFinder detects adaptive introgression sweeps from the pattern of excess intermediate-frequency polymorphism they produce in the flanking region of the genome, a pattern which appears as a volcano-shape in pairwise genetic diversity. Using coalescent theory, we derive analytical predictions for these patterns. Based on these results, we develop a composite-likelihood test to detect signatures of adaptive introgression relative to the genomic background. Simulation results show that VolcanoFinder has high statistical power to detect these signatures, even for older sweeps and for soft sweeps initiated by multiple migrant haplotypes. Finally, we implement VolcanoFinder to detect archaic introgression in European and sub-Saharan African human populations, and uncover interesting candidates in both populations, such as TSHR in Europeans and TCHH-RPTN in Africans. We discuss their biological implications and provide guidelines for identifying and circumventing artifactual signals during empirical applications of VolcanoFinder.
The process by which beneficial alleles are introduced into a species from a closely-related species is termed adaptive introgression. We present an analytically-tractable model for the effects of adaptive introgression on non-adaptive genetic variation in the genomic region surrounding the beneficial allele. The result we describe is a characteristic volcano-shaped pattern of increased variability that arises around the positively-selected site, and we introduce an open-source method VolcanoFinder to detect this signal in genomic data. Importantly, VolcanoFinder is a population-genetic likelihood-based approach, rather than a comparative-genomic approach, and can therefore probe genomic variation data from a single population for footprints of adaptive introgression, even from a priori unknown and possibly extinct donor species.
|
50
|
Harris RB, Jensen JD. Considering Genomic Scans for Selection as Coalescent Model Choice. Genome Biol Evol 2020; 12:871-877. [PMID: 32396636 PMCID: PMC7313662 DOI: 10.1093/gbe/evaa093] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Accepted: 05/06/2020] [Indexed: 12/17/2022] Open
Abstract
First inspired by the seminal work of Lewontin and Krakauer (1973. Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74(1):175-195.) and Maynard Smith and Haigh (1974. The hitch-hiking effect of a favourable gene. Genet Res. 23(1):23-35.), genomic scans for positive selection remain a widely utilized tool in modern population genomic analysis. Yet, the relative frequency and genomic impact of selective sweeps have remained a contentious point in the field for decades, largely owing to an inability to accurately identify their presence and quantify their effects, with current methodologies generally being characterized by low true-positive rates and/or high false-positive rates under many realistic demographic models. Most of these approaches are based on Wright-Fisher assumptions and the Kingman coalescent and generally rely on detecting outlier regions which do not conform to these neutral expectations. However, previous theoretical results have demonstrated that selective sweeps are well characterized by an alternative class of model known as the multiple-merger coalescent. Taken together, this suggests the possibility of not simply identifying regions which reject the Kingman, but rather explicitly testing the relative fit of a genomic window to the multiple-merger coalescent. We describe the advantages of such an approach, which owe to the branching structure differentiating selective and neutral models, and demonstrate improved power under certain demographic scenarios relative to a commonly used approach. However, regions of the demographic parameter space continue to exist in which neither this approach nor existing methodologies have sufficient power to detect selective sweeps.
|