1
|
Long-distance migration is a major factor driving local adaptation at continental scale in Coho salmon. Mol Ecol 2023; 32:542-559. [PMID: 35000273 DOI: 10.1111/mec.16339] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/26/2021] [Revised: 11/19/2021] [Accepted: 12/23/2021] [Indexed: 01/25/2023]
Abstract
Inferring the genomic basis of local adaptation is a long-standing goal of evolutionary biology. Beyond its fundamental evolutionary implications, such knowledge can guide conservation decisions for populations of conservation and management concern. Here, we investigated the genomic basis of local adaptation in the Coho salmon (Oncorhynchus kisutch) across its entire North American range. We hypothesized that extensive spatial variation in environmental conditions and the species' homing behaviour may promote the establishment of local adaptation. We genotyped 7829 individuals representing 217 sampling locations at more than 100,000 high-quality RADseq loci to investigate how recombination might affect the detection of loci putatively under selection and took advantage of the precise description of the demographic history of the species from our previous work to draw accurate population genomic inferences about local adaptation. The results indicated that genetic differentiation scans and genetic-environment association analyses were both significantly affected by variation in recombination rate as low recombination regions displayed an increased number of outliers. By taking these confounding factors into consideration, we revealed that migration distance was the primary selective factor driving local adaptation and partial parallel divergence among distant populations. Moreover, we identified several candidate single nucleotide polymorphisms associated with long-distance migration and altitude including a gene known to be involved in adaptation to altitude in other species. The evolutionary implications of our findings are discussed along with conservation applications.
|
2
|
Comparing mixed models and Random Forest association tests using naturalGWAS and a Striped Bass SNP dataset. Mol Ecol Resour 2022; 23:145-158. [PMID: 35980658 DOI: 10.1111/1755-0998.13701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2020] [Revised: 08/11/2022] [Accepted: 08/12/2022] [Indexed: 11/29/2022]
Abstract
In this study, we used the phenotype simulation package naturalGWAS to test the performance of Zhao's Random Forest method in comparison to an uncorrected Random Forest test, latent factor mixed models (LFMM), genome-wide efficient mixed models (GEMMA), and confounder adjusted linear regression (CATE). We created 400 sets of phenotypes, corresponding to five effect sizes and 2, 5, 15, or 30 causal loci, simulated from two empirical datasets containing SNPs from Striped Bass representing three and 13 populations. All association methods were evaluated for their ability to detect genotype-phenotype associations based on power, false discovery rates, and number of false positives. Genomic inflation was highest for uncorrected Random Forest and LFMM tests and lowest for GEMMA and Zhao's Random Forest. All association tests had similar power to detect causal loci, and Zhao's Random Forest had the lowest false discovery rate in all scenarios. To measure the performance of association tests in small datasets with few loci surrounding a causal gene, we also reran the analyses after removing causal loci from each dataset. All association tests were only able to find true positives, defined as loci located within 30 kb of a causal locus, in 3%-18% of simulations. In contrast, at least one false positive was found in 17%-44% of simulations. Zhao's Random Forest again identified the fewest false positives of all association tests studied. The ability to test the power of association tests for individual empirical datasets can be an extremely useful first step when designing a GWAS study.
|
3
|
Implications of Large-Effect Loci for Conservation: A Review and Case Study with Pacific Salmon. J Hered 2022; 113:121-144. [PMID: 35575083 DOI: 10.1093/jhered/esab069] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/01/2021] [Accepted: 11/07/2021] [Indexed: 11/13/2022] Open
Abstract
The increasing feasibility of assembling large genomic datasets for non-model species presents both opportunities and challenges for applied conservation and management. A popular theme in recent studies is the search for large-effect loci that explain substantial portions of phenotypic variance for a key trait(s). If such loci can be linked to adaptations, 2 important questions arise: 1) Should information from these loci be used to reconfigure conservation units (CUs), even if this conflicts with overall patterns of genetic differentiation? 2) How should this information be used in viability assessments of populations and larger CUs? In this review, we address these questions in the context of recent studies of Chinook salmon and steelhead (anadromous form of rainbow trout) that show strong associations between adult migration timing and specific alleles in one small genomic region. Based on the polygenic paradigm (most traits are controlled by many genes of small effect) and genetic data available at the time showing that early-migrating populations are most closely related to nearby late-migrating populations, adult migration differences in Pacific salmon and steelhead were considered to reflect diversity within CUs rather than separate CUs. Recent data, however, suggest that specific alleles are required for early migration, and that these alleles are lost in populations where conditions do not support early-migrating phenotypes. Contrasting determinations under the US Endangered Species Act and the State of California's equivalent legislation illustrate the complexities of incorporating genomics data into CU configuration decisions. 
Regardless of how CUs are defined, viability assessments should consider that 1) early-migrating phenotypes experience disproportionate risks across large geographic areas, so it becomes important to identify early-migrating populations that can serve as reliable sources for these valuable genetic resources; and 2) genetic architecture, especially the existence of large-effect loci, can affect evolutionary potential and adaptability.
|
4
|
Non-Lethal Sampling Supports Integrative Movement Research in Freshwater Fish. Front Genet 2022; 13:795355. [PMID: 35547248 PMCID: PMC9081360 DOI: 10.3389/fgene.2022.795355] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/14/2021] [Accepted: 03/17/2022] [Indexed: 11/13/2022] Open
Abstract
Freshwater ecosystems and fishes are enormous resources for human uses and biodiversity worldwide. However, anthropogenic climate change and factors such as dams and environmental contaminants threaten these freshwater systems. One way that researchers can address conservation issues in freshwater fishes is via integrative non-lethal movement research. We review different methods for studying movement, such as with acoustic telemetry. Methods for connecting movement and physiology are then reviewed, by using non-lethal tissue biopsies to assay environmental contaminants, isotope composition, protein metabolism, and gene expression. Methods for connecting movement and genetics are reviewed as well, such as by using population genetics or quantitative genetics and genome-wide association studies. We present further considerations for collecting molecular data, the ethical foundations of non-lethal sampling, integrative approaches to research, and management decisions. Ultimately, we argue that non-lethal sampling is effective for conducting integrative, movement-oriented research in freshwater fishes. This research has the potential for addressing critical issues in freshwater systems in the future.
|
5
|
IIb-RAD-sequencing coupled with random forest classification indicates regional population structuring and sex-specific differentiation in salmon lice (Lepeophtheirus salmonis). Ecol Evol 2022; 12:e8809. [PMID: 35414904 PMCID: PMC8986551 DOI: 10.1002/ece3.8809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 03/18/2022] [Accepted: 03/22/2022] [Indexed: 11/29/2022] Open
Abstract
The aquaculture industry has long been dealing with salmon lice, which pose a serious threat to salmonid farming. Several treatment approaches have been used to control the parasite. Treatment effectiveness must be optimized, and the systematic genetic differences between subpopulations must be studied to monitor louse species and enhance targeted control measures. We used IIb-RAD sequencing in tandem with a random forest classification algorithm to detect the regional genetic structure of Norwegian salmon lice and to identify important markers for sex differentiation in this species. We identified 19,428 single nucleotide polymorphisms (SNPs) from 95 individual salmon lice. These SNPs, however, were not able to distinguish the differential structure of lice populations. Using the random forest algorithm, we selected 91 SNPs important for geographical classification and 14 SNPs important for sex classification. The geographically important SNP data substantially improved the genetic understanding of the population structure and classified regional demographic clusters along the Norwegian coast. We also uncovered SNP markers that could help determine the sex of the salmon louse. A large portion of the SNPs identified as being under directional selection was also ranked highly important by random forest. According to our findings, there is a regional population structure of salmon lice associated with geographical location along the Norwegian coastline.
|
6
|
Population genomics and geographic dispersal in Chagas disease vectors: Landscape drivers and evidence of possible adaptation to the domestic setting. PLoS Genet 2022; 18:e1010019. [PMID: 35120121 PMCID: PMC8849464 DOI: 10.1371/journal.pgen.1010019] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/12/2021] [Revised: 02/16/2022] [Accepted: 01/06/2022] [Indexed: 12/19/2022] Open
Abstract
Accurate prediction of vector dispersal, as well as identification of adaptations that allow blood-feeding vectors to thrive in built environments, are a basis for effective disease control. Here we adopted a landscape genomics approach to assay gene flow, possible local adaptation, and drivers of population structure in Rhodnius ecuadoriensis, an important vector of Chagas disease. We used a reduced-representation sequencing technique (2b-RADseq) to obtain 2,552 SNP markers across 272 R. ecuadoriensis samples from 25 collection sites in southern Ecuador. Evidence of high and directional gene flow between seven wild and domestic population pairs across our study site indicates insecticide-based control will be hindered by repeated re-infestation of houses from the forest. Preliminary genome scans across multiple population pairs revealed shared outlier loci potentially consistent with local adaptation to the domestic setting, which we mapped to genes involved with embryogenesis and saliva production. Landscape genomic models showed elevation is a key barrier to R. ecuadoriensis dispersal. Together our results shed early light on the genomic adaptation in triatomine vectors and facilitate vector control by predicting that spatially-targeted, proactive interventions would be more efficacious than current, reactive approaches. Re-infestation of recently insecticide-treated houses by wild/secondary triatomines, their potential adaptation to this new environment and capabilities to geographically disperse across multiple human communities jeopardise sustainable Chagas disease control. This is the first study in Chagas disease vectors that identifies genomic regions possibly linked to adaptations to the built environment and describes landscape drivers for accurate prediction of geographic dispersal. We sampled multiple domestic and wild Rhodnius ecuadoriensis population pairs across a mountainous terrain in southern Ecuador. 
We showed that triatomine movement from forest to built environments occurs at a high rate. In these highly connected population pairs we detected loci possibly linked to local adaptation among the genomic markers we evaluated, and in doing so we pave the way for future triatomine genomic research. We highlighted that current haphazard vector control in the zone will be hindered by reinfestation of triatomines from the forest. Instead, we recommend frequent and spatially-targeted vector control and provide a landscape genomic model that identifies highly connected and isolated triatomine populations to facilitate efficient vector control.
|
7
|
Hierarchical genetic structure and implications for conservation of the world's largest salmonid, Hucho taimen. Sci Rep 2021; 11:20508. [PMID: 34654859 PMCID: PMC8520000 DOI: 10.1038/s41598-021-99530-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2021] [Accepted: 09/20/2021] [Indexed: 11/09/2022] Open
Abstract
Population genetic analyses can evaluate how evolutionary processes shape diversity and inform conservation and management of imperiled species. Taimen (Hucho taimen), the world’s largest freshwater salmonid, is threatened, endangered, or extirpated across much of its range due to anthropogenic activity including overfishing and habitat degradation. We generated genetic data using high throughput sequencing of reduced representation libraries for taimen from multiple drainages in Mongolia and Russia. Nucleotide diversity estimates were within the range documented in other salmonids, suggesting moderate diversity despite widespread population declines. Similar to other recent studies, our analyses revealed pronounced differentiation among the Arctic (Selenge) and Pacific (Amur and Tugur) drainages, suggesting historical isolation among these systems. However, we found evidence for finer-scale structure within the Pacific drainages, including unexpected differentiation between tributaries and the mainstem of the Tugur River. Differentiation across the Amur and Tugur basins together with coalescent-based demographic modeling suggests the ancestors of Tugur tributary taimen likely diverged in the eastern Amur basin, prior to eventual colonization of the Tugur basin. Our results suggest the potential for differentiation of taimen at different geographic scales, and suggest more thorough geographic and genomic sampling may be needed to inform conservation and management of this iconic salmonid.
|
8
|
Expanding the conservation genomics toolbox: Incorporating structural variants to enhance genomic studies for species of conservation concern. Mol Ecol 2021; 30:5949-5965. [PMID: 34424587 PMCID: PMC9290615 DOI: 10.1111/mec.16141] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/20/2021] [Revised: 07/28/2021] [Accepted: 08/18/2021] [Indexed: 12/28/2022]
Abstract
Structural variants (SVs) are large rearrangements (>50 bp) within the genome that impact gene function and the content and structure of chromosomes. As a result, SVs are a significant source of functional genomic variation, that is, variation at genomic regions underpinning phenotype differences, that can have large effects on individual and population fitness. While there are increasing opportunities to investigate functional genomic variation in threatened species via single nucleotide polymorphism (SNP) data sets, SVs remain understudied despite their potential influence on fitness traits of conservation interest. In this future-focused Opinion, we contend that characterizing SVs offers the conservation genomics community an exciting opportunity to complement SNP-based approaches to enhance species recovery. We also leverage the existing literature (predominantly in human health, agriculture and ecoevolutionary biology) to identify approaches for readily characterizing SVs and consider how integrating these into the conservation genomics toolbox may transform the way we manage some of the world's most threatened species.
|
9
|
Low-coverage whole-genome sequencing reveals molecular markers for spawning season and sex identification in Gulf of Maine Atlantic cod (Gadus morhua, Linnaeus 1758). Ecol Evol 2021; 11:10659-10671. [PMID: 34367604 PMCID: PMC8328444 DOI: 10.1002/ece3.7878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2021] [Revised: 06/17/2021] [Accepted: 06/18/2021] [Indexed: 11/28/2022] Open
Abstract
Atlantic cod (Gadus morhua, Linnaeus 1758) in the western Gulf of Maine are managed as a single stock despite several lines of evidence supporting two spawning groups (spring and winter) that overlap spatially, while exhibiting seasonal spawning isolation. Low-coverage whole-genome sequencing was used to evaluate the genomic population structure of Atlantic cod spawning groups in the western Gulf of Maine and Georges Bank using 222 individuals collected over multiple years. Results indicated low total genomic differentiation, while also showing strong differentiation between spring and winter-spawning groups at specific regions of the genome. Guided regularized random forest and ranked FST methods were used to select panels of single nucleotide polymorphisms (SNPs) that could reliably distinguish spring and winter-spawning Atlantic cod (88.5% assignment rate), as well as males and females (95.0% assignment rate) collected in the western Gulf of Maine. These SNP panels represent a valuable tool for fisheries research and management of Atlantic cod in the western Gulf of Maine that will aid investigations of stock production and support accuracy of future assessments.
|
10
|
A zero altered Poisson random forest model for genomic-enabled prediction. G3-GENES GENOMES GENETICS 2021; 11:6042695. [PMID: 33693599 PMCID: PMC8022945 DOI: 10.1093/g3journal/jkaa057] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/03/2020] [Accepted: 12/10/2020] [Indexed: 12/23/2022]
Abstract
In genomic selection, choosing the statistical machine learning model is of paramount importance. In this paper, we present an application of a zero altered random forest model with two versions (ZAP_RF and ZAPC_RF) to deal with excess zeros in count response variables. The proposed model was compared with the conventional random forest (RF) model and with the conventional Generalized Poisson Ridge regression (GPR) using two real datasets, and we found that, in terms of prediction performance, the proposed zero inflated random forest model outperformed the conventional RF and GPR models.
|
11
|
Identification and Functional Annotation of Genes Related to Bone Stability in Laying Hens Using Random Forests. Genes (Basel) 2021; 12:genes12050702. [PMID: 34066823 PMCID: PMC8151682 DOI: 10.3390/genes12050702] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/19/2021] [Revised: 05/05/2021] [Accepted: 05/06/2021] [Indexed: 12/20/2022] Open
Abstract
Skeletal disorders, including fractures and osteoporosis, in laying hens cause major welfare and economic problems. Although genetics have been shown to play a key role in bone integrity, little is yet known about the underlying genetic architecture of the traits. This study aimed to identify genes associated with bone breaking strength and bone mineral density of the tibiotarsus and the humerus in laying hens. Potentially informative single nucleotide polymorphisms (SNP) were identified using Random Forests classification. We then searched for genes known to be related to bone stability in close proximity to the SNPs and identified 16 potential candidates. Some of them had human orthologues. Based on our findings, we can support the assumption that multiple genes determine bone strength, with each of them having a rather small effect, as illustrated by our SNP effect estimates. Furthermore, the enrichment analysis showed that some of these candidates are involved in metabolic pathways critical for bone integrity. In conclusion, the identified candidates represent genes that may play a role in the bone integrity of chickens. Although further studies are needed to determine causality, the genes reported here are promising in terms of alleviating bone disorders in laying hens.
|
12
|
Abstract
Diadromy, the predictable movements of individuals between marine and freshwater environments, is biogeographically and phylogenetically widespread across fishes. Thus, despite the high energetic and potential fitness costs involved in moving between distinct environments, diadromy appears to be an effective life history strategy. Yet, the origin and molecular mechanisms that underpin this migratory behavior are not fully understood. In this review, we aim first to summarize what is known about diadromy in fishes; this includes the phylogenetic relationship among diadromous species, a description of the main hypotheses regarding its origin, and a discussion of the presence of non-migratory populations within diadromous species. Second, we discuss how recent research based on -omics approaches (chiefly genomics, transcriptomics, and epigenomics) is beginning to provide answers to questions on the genetic bases and origin(s) of diadromy. Finally, we suggest future directions for -omics research that can help tackle questions on the evolution of diadromy.
|
13
|
Comparing the Performance of Microsatellites and RADseq in Population Genetic Studies: Analysis of Data for Pike (Esox lucius) and a Synthesis of Previous Studies. Front Genet 2020; 11:218. [PMID: 32231687 PMCID: PMC7082332 DOI: 10.3389/fgene.2020.00218] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 11/19/2019] [Accepted: 02/24/2020] [Indexed: 01/06/2023] Open
Abstract
Population genetic studies reveal biodiversity patterns and inform about drivers of evolutionary differentiation and adaptation, including gene flow, drift and selection. This can advance our understanding and aid decision making regarding management and conservation efforts. Microsatellites have long been used in population genetic studies. Thanks to the development of newer techniques, sequencing approaches such as restriction site associated DNA sequencing (RADseq) are on their way to replacing microsatellites for some applications. However, the performance of these two marker types in population genetics has rarely been systematically compared. We utilized three neutrally and adaptively differentiated populations of anadromous pike (Esox lucius) to assess the relative performance of microsatellites and RADseq with respect to resolution and conclusiveness of estimates of population differentiation and genetic structure. To this end, the same set of individuals (N = 64) was genotyped with both RADseq and microsatellite markers. To assess effects of sample size, the same subset of 10 randomly chosen individuals from each population (N = 30 in total) was also genotyped with both methods. Comparisons of estimated genetic diversity and structure showed that both markers were able to uncover genetic structuring. The full RADseq dataset provided the clearest detection of the finer scaled genetic structuring, and the other three datasets (full and subset microsatellite, and subset RADseq) provided comparable results. A search for outlier loci performed on the full SNP dataset pointed to signs of selection potentially associated with salinity and temperature, exemplifying the utility of RADseq to inform about the importance of different environmental factors. To evaluate whether performance differences between the markers are general or context specific, the results of previous studies that have investigated population structure using both marker types were synthesized. 
The synthesis revealed that RADseq performed as well as, or better than, microsatellites in detecting genetic structuring in the included studies. The differences in the ability to detect population structure, both in the present and the previous studies, are likely explained by the higher number of loci typically utilized in RADseq compared to microsatellite analysis, as increasing the number of markers will (regardless of the marker type) increase power and allow for clearer detection and higher resolution of genetic structure.
|
14
|
|
15
|
Absence of founder effect and evidence for adaptive divergence in a recently introduced insular population of white‐tailed deer (Odocoileus virginianus). Mol Ecol 2019; 29:86-104. [DOI: 10.1111/mec.15317] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/25/2018] [Revised: 10/25/2019] [Accepted: 10/29/2019] [Indexed: 12/16/2022]
|
16
|
Abstract
Salmon were among the first nonmodel species for which systematic population genetic studies of natural populations were conducted, often to support management and conservation. The genomics revolution has improved our understanding of the evolutionary ecology of salmon in two major ways: (a) Large increases in the numbers of genetic markers (from dozens to 10^4-10^6) provide greater power for traditional analyses, such as the delineation of population structure, hybridization, and population assignment, and (b) qualitatively new insights that were not possible with traditional genetic methods can be achieved by leveraging detailed information about the structure and function of the genome. Studies of the first type have been more common to date, largely because it has taken time for the necessary tools to be developed to fully understand the complex salmon genome. We expect that the next decade will witness many new studies that take full advantage of salmonid genomic resources.
|
17
|
Mitochondria, sex and variation in routine metabolic rate. Mol Ecol 2019; 28:4608-4619. [PMID: 31529542 DOI: 10.1111/mec.15244] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/28/2019] [Accepted: 09/12/2019] [Indexed: 12/14/2022]
Abstract
Variation in the metabolic costs associated with organismal maintenance may play a key role in determining fitness, and thus these differences among individuals are likely to be subject to natural selection. Although the evolvability of maintenance metabolism depends on its underlying genetic architecture, relatively little is known about the nature of genetic variation that underlies this trait. To address this, we measured variation in routine metabolic rate (ṀO2 routine), an index of maintenance metabolism, within and among three populations of Atlantic killifish, Fundulus heteroclitus, including a population from a region of genetic admixture between two subspecies. Polygenic association tests among individuals from the admixed population identified 54 single nucleotide polymorphisms (SNPs) that were associated with ṀO2 routine, and these SNPs accounted for 43% of interindividual variation in this trait. However, genetic associations with ṀO2 routine involved different SNPs if females and males were analysed separately, and there was a sex-dependent effect of mitochondrial genotype on variation in routine metabolism. These results imply that there are sex-specific genetic mechanisms, and potential mitonuclear interactions, that underlie variation in ṀO2 routine. Additionally, there was evidence for epistatic interactions between 17% of the possible pairs of trait-associated SNPs, suggesting that epistatic effects on ṀO2 routine are common. These data demonstrate not only that phenotypic variation in this ecologically important trait has a polygenic basis with considerable epistasis among loci, but also that these underlying genetic mechanisms, and particularly the role of mitochondrial genotype, may be sex-specific.
|
18
|
What can be learned by scanning the genome for molecular convergence in wild populations? Ann N Y Acad Sci 2019; 1476:23-42. [PMID: 31241191 PMCID: PMC7586825 DOI: 10.1111/nyas.14177] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/29/2019] [Revised: 05/24/2019] [Accepted: 06/04/2019] [Indexed: 12/11/2022]
Abstract
Convergent evolution, where independent lineages evolve similar phenotypes in response to similar challenges, can provide valuable insight into how selection operates and the limitations it encounters. However, it has only recently become possible to explore how convergent evolution is reflected at the genomic level. The overlapping outlier approach (OOA), where genome scans of multiple independent lineages are used to find outliers that overlap and therefore identify convergently evolving loci, is becoming popular. Here, we present a quantitative analysis of 34 studies that used this approach across many sampling designs, taxa, and sampling intensities. We found that OOA studies with increased biological sampling power within replicates have increased likelihood of finding overlapping, "convergent" signals of adaptation between them. When identifying convergent loci as overlapping outliers, it is tempting to assume that any false-positive outliers derived from individual scans will fail to overlap across replicates, but this cannot be guaranteed. We highlight how population demographics and genomic context can contribute toward both true convergence and false positives in OOA studies. We finish with an exploration of emerging methods that couple genome scans with phenotype and environmental measures, leveraging added information from genome data to more directly test hypotheses of the likelihood of convergent evolution.
|
19
|
Integrative Population and Physiological Genomics Reveals Mechanisms of Adaptation in Killifish. Mol Biol Evol 2019; 35:2639-2653. [PMID: 30102365 DOI: 10.1093/molbev/msy154] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/13/2022] Open
Abstract
Adaptive divergence between marine and freshwater (FW) environments is important in generating phyletic diversity within fishes, but the genetic basis of this process remains poorly understood. Genome selection scans can identify adaptive loci, but incomplete knowledge of genotype-phenotype connections makes interpreting their significance difficult. In contrast, association mapping (genome-wide association mapping [GWAS], random forest [RF] analyses) links genotype to phenotype, but offers limited insight into the evolutionary forces shaping variation. Here, we combined GWAS, RF, and selection scans to identify loci important in adaptation to FW environments. We utilized FW-native and brackish water (BW)-native populations of Atlantic killifish (Fundulus heteroclitus) as well as a naturally admixed population between the two. We measured morphology and multiple physiological traits that differ between populations and may contribute to osmotic adaptation (salinity tolerance, hypoxia tolerance, metabolic rate, body shape) and used a reduced representation approach for genome-wide genotyping. Our results show patterns of population divergence in physiological capabilities that are consistent with local adaptation. Population genomic scans between BW-native and FW-native populations identified genomic regions evolving by natural selection, whereas association mapping revealed loci that contribute to variation for each trait. There was substantial overlap in the genomic regions putatively under selection and loci associated with phenotypic traits, particularly for salinity tolerance, suggesting that these regions and genes are important for adaptive divergence between BW and FW environments. Together, these data provide insight into the mechanisms that enable diversification of fishes across osmotic boundaries.
|
20
|
Transition-transversion encoding and genetic relationship metric in ReliefF feature selection improves pathway enrichment in GWAS. BioData Min 2018; 11:23. [PMID: 30410580 PMCID: PMC6215626 DOI: 10.1186/s13040-018-0186-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/08/2018] [Accepted: 10/22/2018] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: ReliefF is a nearest-neighbor based feature selection algorithm that efficiently detects variants that are important due to statistical interactions or epistasis. For categorical predictors, like genotypes, the standard metric used in ReliefF has been a simple (binary) mismatch difference. In this study, we develop new metrics of varying complexity that incorporate allele sharing, adjustment for allele frequency heterogeneity via the genetic relationship matrix (GRM), and physicochemical differences of variants via a new transition/transversion encoding. METHODS: We introduce a new two-dimensional transition/transversion genotype encoding for ReliefF, and we implement three ReliefF attribute metrics: (1) genotype mismatch (GM), which is the ReliefF standard; (2) allele mismatch (AM), which accounts for heterozygous differences and has not been used previously in ReliefF; and (3) the new transition/transversion metric. We incorporate these attribute metrics into the ReliefF nearest neighbor calculation with a Manhattan metric, and we introduce GRM as a new ReliefF nearest-neighbor metric to adjust for allele frequency heterogeneity. RESULTS: We apply ReliefF with each metric to a GWAS of major depressive disorder and compare the detection of genes in pathways implicated in depression, including Axon Guidance, Neuronal System, and G Protein-Coupled Receptor Signaling. We also compare with detection by Random Forest and Lasso as well as random/null selection to assess pathway size bias. CONCLUSIONS: Our results suggest that using more genetically motivated encodings, such as transition/transversion, and metrics that adjust for allele frequency heterogeneity, such as GRM, leads to ReliefF attribute scores with improved pathway enrichment.
|
21
|
The Peril of Gene-Targeted Conservation. Trends Ecol Evol 2018; 33:827-839. [DOI: 10.1016/j.tree.2018.08.011] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 05/07/2018] [Revised: 08/27/2018] [Accepted: 08/28/2018] [Indexed: 01/01/2023]
|
22
|
Tolerance traits related to climate change resilience are independent and polygenic. Glob Chang Biol 2018; 24:5348-5360. [PMID: 29995321 DOI: 10.1111/gcb.14386] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/03/2018] [Accepted: 06/06/2018] [Indexed: 05/21/2023]
Abstract
The resilience of organisms to climate change through adaptive evolution is dependent on the extent of genetically based variation in key phenotypic traits and the nature of genetic associations between them. For aquatic animals, upper thermal tolerance and hypoxia tolerance are likely to be important determinants of sensitivity to climate change. To determine the genetic basis of these traits and to detect associations between them, we compared naturally occurring populations of two subspecies of Atlantic killifish, Fundulus heteroclitus, that differ in both thermal and hypoxia tolerance. Multilocus association mapping demonstrated that 47 and 35 single nucleotide polymorphisms (SNPs) explained 43.4% and 51.9% of variation in thermal and hypoxia tolerance, respectively, suggesting that genetic mechanisms underlie a substantial proportion of variation in each trait. However, no explanatory SNPs were shared between traits, and upper thermal tolerance varied approximately linearly with latitude, whereas hypoxia tolerance exhibited a steep phenotypic break across the contact zone between the subspecies. These results suggest that upper thermal tolerance and hypoxia tolerance are neither phenotypically correlated nor genetically associated, and thus that rates of adaptive change in these traits can be independently fine-tuned by natural selection. This modularity of important traits can underpin the evolvability of organisms to complex future environmental change.
|
23
|
Genomics and conservation units: The genetic basis of adult migration timing in Pacific salmonids. Evol Appl 2018; 11:1518-1526. [PMID: 30344624 PMCID: PMC6183503 DOI: 10.1111/eva.12687] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 04/19/2018] [Revised: 07/18/2018] [Accepted: 07/20/2018] [Indexed: 01/01/2023] Open
Abstract
It is now routinely possible to generate genomics-scale datasets for nonmodel species; however, many questions remain about how best to use these data for conservation and management. Some recent genomics studies of anadromous Pacific salmonids have reported a strong association between alleles at one or a very few genes and a key life history trait (adult migration timing) that has played an important role in defining conservation units. Publication of these results has already spurred a legal challenge to the existing framework for managing these species, which was developed under the paradigm that most phenotypic traits are controlled by many genes of small effect, and that parallel evolution of life history traits is common. But what if a key life history trait can only be expressed if a specific allele is present? Does the current framework need to be modified to account for the new genomics results, as some now propose? Although this real-world example focuses on Pacific salmonids, the issues regarding how genomics can inform us about the genetic basis of phenotypic traits, and what that means for applied conservation, are much more general. In this perspective, we consider these issues and outline a general process that can be used to help generate the types of additional information that would be needed to make informed decisions about the adequacy of existing conservation and management frameworks.
|
24
|
Selection and Utility of Single Nucleotide Polymorphism Markers to Reveal Fine-Scale Population Structure in Human Malaria Parasite Plasmodium falciparum. Front Ecol Evol 2018. [DOI: 10.3389/fevo.2018.00145] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023] Open
|
25
|
Genomic Prediction of Breeding Values Using a Subset of SNPs Identified by Three Machine Learning Methods. Front Genet 2018; 9:237. [PMID: 30023001 PMCID: PMC6039760 DOI: 10.3389/fgene.2018.00237] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 03/25/2018] [Accepted: 06/14/2018] [Indexed: 12/22/2022] Open
Abstract
The analysis of large genomic data is hampered by issues such as a small number of observations and a large number of predictive variables (commonly known as “large P small N”), high dimensionality or highly correlated data structures. Machine learning methods are renowned for dealing with these problems. To date machine learning methods have been applied in Genome-Wide Association Studies for identification of candidate genes, epistasis detection, gene network pathway analyses and genomic prediction of phenotypic values. However, the utility of two machine learning methods, Gradient Boosting Machine (GBM) and Extreme Gradient Boosting Method (XgBoost), in identifying a subset of SNP markers for genomic prediction of breeding values has never been explored before. In this study, using 38,082 SNP markers and body weight phenotypes from 2,093 Brahman cattle (1,097 bulls as a discovery population and 996 cows as a validation population), we examined the efficiency of three machine learning methods, namely Random Forests (RF), GBM and XgBoost, in (a) the identification of top 400, 1,000, and 3,000 ranked SNPs; (b) using the subsets of SNPs to construct genomic relationship matrices (GRMs) for the estimation of genomic breeding values (GEBVs). For comparison purposes, we also calculated the GEBVs from (1) 400, 1,000, and 3,000 SNPs that were randomly selected and evenly spaced across the genome, and (2) from all the SNPs. We found that RF and especially GBM are efficient methods in identifying a subset of SNPs with direct links to candidate genes affecting the growth trait. In comparison to the estimate of prediction accuracy of GEBVs from using all SNPs (0.43), the 3,000 top SNPs identified by RF (0.42) and GBM (0.46) had similar values to those of the whole SNP panel. The performance of the subsets of SNPs from RF and GBM was substantially better than that of evenly spaced subsets across the genome (0.18–0.29). Of the three methods, RF and GBM consistently outperformed the XgBoost in genomic prediction accuracy.
|
26
|
Genomewide association analyses of fitness traits in captive-reared Chinook salmon: Applications in evaluating conservation strategies. Evol Appl 2018; 11:853-868. [PMID: 29928295 PMCID: PMC5999212 DOI: 10.1111/eva.12599] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/30/2017] [Accepted: 01/09/2018] [Indexed: 12/20/2022] Open
Abstract
A novel application of genomewide association analyses is to use trait-associated loci to monitor the effects of conservation strategies on potentially adaptive genetic variation. Comparisons of fitness between captive- and wild-origin individuals, for example, do not reveal how captive rearing affects genetic variation underlying fitness traits or which traits are most susceptible to domestication selection. Here, we used data collected across four generations to identify loci associated with six traits in adult Chinook salmon (Oncorhynchus tshawytscha) and then determined how two alternative management approaches for captive rearing affected variation at these loci. Loci associated with date of return to freshwater spawning grounds (return timing), length and weight at return, age at maturity, spawn timing, and daily growth coefficient were identified using 9108 restriction site-associated markers and random forest, an approach suitable for polygenic traits. Mapping of trait-associated loci, gene annotations, and integration of results across multiple studies revealed candidate regions involved in several fitness-related traits. Genotypes at trait-associated loci were then compared between two hatchery populations that were derived from the same source but are now managed as separate lines, one integrated with and one segregated from the wild population. While no broad-scale change was detected across four generations, there were numerous regions where trait-associated loci overlapped with signatures of adaptive divergence previously identified in the two lines. Many regions, primarily with loci linked to return and spawn timing, were either unique to or more divergent in the segregated line, suggesting that these traits may be responding to domestication selection. This study is one of the first to utilize genomic approaches to demonstrate the effectiveness of a conservation strategy, managed gene flow, on trait-associated (and potentially adaptive) loci. The results will promote the development of trait-specific tools to better monitor genetic change in captive and wild populations.
|
27
|
Comparing methods for detecting multilocus adaptation with multivariate genotype-environment associations. Mol Ecol 2018; 27:2215-2233. [DOI: 10.1111/mec.14584] [Citation(s) in RCA: 267] [Impact Index Per Article: 44.5] [Received: 03/18/2017] [Revised: 03/16/2018] [Accepted: 03/19/2018] [Indexed: 12/18/2022]
|
28
|
A practical introduction to Random Forest for genetic association studies in ecology and evolution. Mol Ecol Resour 2018; 18:755-766. [PMID: 29504715 DOI: 10.1111/1755-0998.12773] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 10/30/2017] [Revised: 02/08/2018] [Accepted: 02/17/2018] [Indexed: 12/25/2022]
Abstract
Large genomic studies are becoming increasingly common with advances in sequencing technology, and our ability to understand how genomic variation influences phenotypic variation between individuals has never been greater. The exploration of such relationships first requires the identification of associations between molecular markers and phenotypes. Here, we explore the use of Random Forest (RF), a powerful machine-learning algorithm, in genomic studies to discern loci underlying both discrete and quantitative traits, particularly when studying wild or nonmodel organisms. RF is becoming increasingly used in ecological and population genetics because, unlike traditional methods, it can efficiently analyse thousands of loci simultaneously and account for nonadditive interactions. However, understanding both the power and limitations of Random Forest is important for its proper implementation and the interpretation of results. We therefore provide a practical introduction to the algorithm and its use for identifying associations between molecular markers and phenotypes, discussing such topics as data limitations, algorithm initiation and optimization, as well as interpretation. We also provide short R tutorials as examples, with the aim of providing a guide to the implementation of the algorithm. Topics discussed here are intended to serve as an entry point for molecular ecologists interested in employing Random Forest to identify trait associations in genomic data sets.
|
29
|
Association mapping reveals candidate loci for resistance and anaemic response to an emerging temperature-driven parasitic disease in a wild salmonid fish. Mol Ecol 2018; 27:1385-1401. [PMID: 29411465 DOI: 10.1111/mec.14509] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/19/2017] [Accepted: 01/08/2018] [Indexed: 02/06/2023]
Abstract
Even though parasitic infections are often costly or deadly for the host, we know very little about which genes influence parasite susceptibility and disease severity. Proliferative kidney disease is an emerging and, at elevated water temperatures, potentially deadly disease of salmonid fishes that is caused by the myxozoan parasite Tetracapsuloides bryosalmonae. By screening >7.6 K SNPs in 255 wild brown trout (Salmo trutta) and combining association mapping and Random Forest approaches, we identified several candidate genes for both the parasite resistance (inverse of relative parasite load; RPL) and the severe anaemic response to the parasite. The strongest RPL-associated SNP mapped to a noncoding region of the congeneric Atlantic salmon (S. salar) chromosome 10, whereas the second strongest RPL-associated SNP mapped to an intronic region of the PRICKLE2 gene, which is a part of the planar cell polarity signalling pathway involved in kidney development. The top SNP associated with anaemia mapped to the intron of the putative PRKAG2 gene. The human ortholog of this gene has been associated with haematocrit and other blood-related traits, making it a prime candidate influencing parasite-triggered anaemia in brown trout. Our findings demonstrate the power of association mapping to pinpoint genomic regions and potential causative genes underlying climate change-driven parasitic disease resistance and severity. Furthermore, this work illustrates the first steps towards dissecting genotype-phenotype links in a wild fish population using closely related genome information.
|
30
|
Oceanographic variation influences spatial genomic structure in the sea scallop, Placopecten magellanicus. Ecol Evol 2018; 8:2824-2841. [PMID: 29531698 PMCID: PMC5838053 DOI: 10.1002/ece3.3846] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/22/2017] [Revised: 12/05/2017] [Accepted: 12/06/2017] [Indexed: 01/03/2023] Open
Abstract
Environmental factors can influence diversity and population structure in marine species and accurate understanding of this influence can both improve fisheries management and help predict responses to environmental change. We used 7163 SNPs derived from restriction site-associated DNA sequencing genotyped in 245 individuals of the economically important sea scallop, Placopecten magellanicus, to evaluate the correlations between oceanographic variation and a previously identified latitudinal genomic cline. Sea scallops span a broad latitudinal area (>10 degrees), and we hypothesized that climatic variation significantly drives clinal trends in allele frequency. Using a large environmental dataset, including temperature, salinity, chlorophyll a, and nutrient concentrations, we identified a suite of SNPs (285-621, depending on analysis and environmental dataset) potentially under selection through correlations with environmental variation. Principal components analysis of different outlier SNPs and environmental datasets revealed similar northern and southern clusters, with significant associations between the first axes of each (adjusted R² = .66-.79). Multivariate redundancy analysis of outlier SNPs and the environmental principal components indicated that environmental factors explained more than 32% of the variance. Similarly, multiple linear regressions and random-forest analysis identified winter average and minimum ocean temperatures as significant parameters in the link between genetic and environmental variation. This work indicates that oceanographic variation is associated with the observed genomic cline in this species and that seasonal periods of extreme cold may restrict gene flow along a latitudinal gradient in this marine benthic bivalve. Incorporating this finding into management may improve accuracy of management strategies and future predictions.
|
31
|
|
32
|
Genomics and telemetry suggest a role for migration harshness in determining overwintering habitat choice, but not gene flow, in anadromous Arctic Char. Mol Ecol 2017; 26:6784-6800. [DOI: 10.1111/mec.14393] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 05/16/2017] [Revised: 08/25/2017] [Accepted: 10/02/2017] [Indexed: 12/11/2022]
|
33
|
Resolving neutral and deterministic contributions to genomic structure in Syntrichia ruralis (Bryophyta, Pottiaceae) informs propagule sourcing for dryland restoration. Conserv Genet 2017. [DOI: 10.1007/s10592-017-1026-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
|
34
|
Signatures of polygenic adaptation associated with climate across the range of a threatened fish species with high genetic connectivity. Mol Ecol 2017; 26:6253-6269. [DOI: 10.1111/mec.14368] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 04/13/2017] [Revised: 09/22/2017] [Accepted: 09/25/2017] [Indexed: 12/25/2022]
|
35
|
Applications of random forest feature selection for fine-scale genetic population assignment. Evol Appl 2017; 11:153-165. [PMID: 29387152 PMCID: PMC5775496 DOI: 10.1111/eva.12524] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 09/19/2016] [Accepted: 07/11/2017] [Indexed: 01/10/2023] Open
Abstract
Genetic population assignment used to inform wildlife management and conservation efforts requires panels of highly informative genetic markers and sensitive assignment tests. We explored the utility of machine‐learning algorithms (random forest, regularized random forest and guided regularized random forest) compared with FST ranking for selection of single nucleotide polymorphisms (SNP) for fine‐scale population assignment. We applied these methods to an unpublished SNP data set for Atlantic salmon (Salmo salar) and a published SNP data set for Alaskan Chinook salmon (Oncorhynchus tshawytscha). In each species, we identified the minimum panel size required to obtain a self‐assignment accuracy of at least 90% using each method to create panels of 50–700 markers. Panels of SNPs identified using random forest‐based methods performed up to 7.8 and 11.2 percentage points better than FST‐selected panels of similar size for the Atlantic salmon and Chinook salmon data, respectively. Self‐assignment accuracy ≥90% was obtained with panels of 670 and 384 SNPs for each data set, respectively, a level of accuracy never reached for these species using FST‐selected panels. Our results demonstrate a role for machine‐learning approaches in marker selection across large genomic data sets to improve assignment for management and conservation of exploited populations.
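For readers unfamiliar with the FST-ranking baseline that the random forest panels are compared against, a minimal sketch follows. It is an assumption-laden illustration rather than the authors' pipeline: it uses the simple two-population form FST = (HT - HS) / HT with equal weighting and no sampling-error correction, and the SNP names and allele frequencies are invented for the example.

```python
def fst_two_pops(p1, p2):
    """Per-SNP FST for one biallelic locus from the allele frequencies
    in two populations, weighting both populations equally."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # monomorphic locus carries no information
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def top_fst_panel(freqs, panel_size):
    """Return the `panel_size` SNP names with the highest FST.
    `freqs` maps SNP name -> (frequency in pop 1, frequency in pop 2)."""
    ranked = sorted(freqs, key=lambda snp: fst_two_pops(*freqs[snp]), reverse=True)
    return ranked[:panel_size]


if __name__ == "__main__":
    # Hypothetical allele frequencies for three SNPs.
    example = {"snp_a": (0.5, 0.5), "snp_b": (0.9, 0.1), "snp_c": (0.6, 0.4)}
    print(top_fst_panel(example, 2))
```

Ranking each SNP independently in this way ignores redundancy among linked markers, which is one plausible reason a multivariate method could assemble a more informative panel of the same size.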
|
36
|
The evolutionary basis of premature migration in Pacific salmon highlights the utility of genomics for informing conservation. Sci Adv 2017; 3:e1603198. [PMID: 28835916 PMCID: PMC5559211 DOI: 10.1126/sciadv.1603198] [Citation(s) in RCA: 114] [Impact Index Per Article: 16.3] [Received: 12/28/2016] [Accepted: 07/19/2017] [Indexed: 05/15/2023]
Abstract
The delineation of conservation units (CUs) is a challenging issue that has profound implications for minimizing the loss of biodiversity and ecosystem services. CU delineation typically seeks to prioritize evolutionary significance, and genetic methods play a pivotal role in the delineation process by quantifying overall differentiation between populations. Although CUs that primarily reflect overall genetic differentiation do protect adaptive differences between distant populations, they do not necessarily protect adaptive variation within highly connected populations. Advances in genomic methodology facilitate the characterization of adaptive genetic variation, but the potential utility of this information for CU delineation is unclear. We use genomic methods to investigate the evolutionary basis of premature migration in Pacific salmon, a complex behavioral and physiological phenotype that exists within highly connected populations and has experienced severe declines. Strikingly, we find that premature migration is associated with the same single locus across multiple populations in each of two different species. Patterns of variation at this locus suggest that the premature migration alleles arose from a single evolutionary event within each species and were subsequently spread to distant populations through straying and positive selection. Our results reveal that complex adaptive variation can depend on rare mutational events at a single locus, demonstrate that CUs reflecting overall genetic differentiation can fail to protect evolutionarily significant variation that has substantial ecological and societal benefits, and suggest that a supplemental framework for protecting specific adaptive variation will sometimes be necessary to prevent the loss of significant biodiversity and ecosystem services.
|
37
|
Sex matters in massive parallel sequencing: Evidence for biases in genetic parameter estimation and investigation of sex determination systems. Mol Ecol 2017; 26:6767-6783. [DOI: 10.1111/mec.14217] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 12/20/2016] [Revised: 03/23/2017] [Accepted: 03/29/2017] [Indexed: 12/26/2022]
|
38
|
Genetic basis of adult migration timing in anadromous steelhead discovered through multivariate association testing. Proc Biol Sci 2017; 283:rspb.2015.3064. [PMID: 27170720 DOI: 10.1098/rspb.2015.3064] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Received: 12/21/2015] [Accepted: 04/14/2016] [Indexed: 01/21/2023] Open
Abstract
Migration traits are presumed to be complex and to involve interaction among multiple genes. We used both univariate analyses and a multivariate random forest (RF) machine learning algorithm to conduct association mapping of 15 239 single nucleotide polymorphisms (SNPs) for adult migration-timing phenotype in steelhead (Oncorhynchus mykiss). Our study focused on a model natural population of steelhead that exhibits two distinct migration-timing life histories with high levels of admixture in nature. Neutral divergence was limited between fish exhibiting summer- and winter-run migration owing to high levels of interbreeding, but a univariate mixed linear model found three SNPs from a major effect gene to be significantly associated with migration timing (p < 0.000005) that explained 46% of trait variation. Alignment to the annotated Salmo salar genome provided evidence that all three SNPs localize within a 46 kb region overlapping GREB1-like (an oestrogen target gene) on chromosome Ssa03. Additionally, multivariate analyses with RF identified that these three SNPs plus 15 additional SNPs explained up to 60% of trait variation. These candidate SNPs may provide the ability to predict adult migration timing of steelhead to facilitate conservation management of this species, and this study demonstrates the benefit of multivariate analyses for association studies.
|
39
|
Functional Annotation of All Salmonid Genomes (FAASG): an international initiative supporting future salmonid research, conservation and aquaculture. BMC Genomics 2017; 18:484. [PMID: 28655320 PMCID: PMC5488370 DOI: 10.1186/s12864-017-3862-8] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Received: 10/11/2016] [Accepted: 06/14/2017] [Indexed: 11/21/2022] Open
Abstract
We describe an emerging initiative - the 'Functional Annotation of All Salmonid Genomes' (FAASG), which will leverage the extensive trait diversity that has evolved since a whole genome duplication event in the salmonid ancestor, to develop an integrative understanding of the functional genomic basis of phenotypic variation. The outcomes of FAASG will have diverse applications, ranging from improved understanding of genome evolution, to improving the efficiency and sustainability of aquaculture production, supporting the future of fundamental and applied research in an iconic fish lineage of major societal importance.
|
40
|
The role of allochrony in speciation. Mol Ecol 2017; 26:3330-3342. [DOI: 10.1111/mec.14126] [Citation(s) in RCA: 92] [Impact Index Per Article: 13.1] [Received: 01/17/2017] [Revised: 03/17/2017] [Accepted: 03/20/2017] [Indexed: 12/15/2022]
|
41
|
RADseq provides unprecedented insights into molecular ecology and evolutionary genetics: comment on Breaking RAD by Lowry et al. (2016). Mol Ecol Resour 2017; 17:356-361. [DOI: 10.1111/1755-0998.12649] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 09/22/2016] [Revised: 12/20/2016] [Accepted: 12/20/2016] [Indexed: 01/26/2023]
|
42
|
Saving the spandrels? Adaptive genomic variation in conservation and fisheries management. J Fish Biol 2016; 89:2697-2716. [PMID: 27723095 DOI: 10.1111/jfb.13168] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 06/13/2016] [Accepted: 09/06/2016] [Indexed: 06/06/2023]
Abstract
As highlighted by many of the papers in this issue, research on the genomic basis of adaptive phenotypic variation in natural populations has made spectacular progress in the past few years, largely due to the advances in sequencing technology and analysis. Without question, the resulting genomic data will improve the understanding of regions of the genome under selection and extend knowledge of the genetic basis of adaptive evolution. What is far less clear, but has been the focus of active discussion, is how such information can or should transfer into conservation practice to complement more typical conservation applications of genetic data. Before such applications can be realized, the evolutionary importance of specific targets of selection relative to the genome-wide diversity of the species as a whole must be evaluated. The key issues for the incorporation of adaptive genomic variation in conservation and management are discussed here, using published examples of adaptive genomic variation associated with specific phenotypes in salmonids and other taxa to highlight practical considerations for incorporating such information into conservation programmes. Scenarios are described in which adaptive genomic data could be used in conservation or restoration, constraints on its utility and the importance of validating inferences drawn from new genomic data before applying them in conservation practice. Finally, it is argued that an excessive focus on preserving the adaptive variation that can be measured, while ignoring the vast unknown majority that cannot, is a modern twist on the adaptationist programme that Gould and Lewontin critiqued almost 40 years ago.
|
43
|
Detecting polygenic selection in marine populations by combining population genomics and quantitative genetics approaches. Curr Zool 2016; 62:603-616. [PMID: 29491948 PMCID: PMC5804256 DOI: 10.1093/cz/zow088] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 03/14/2016] [Accepted: 07/21/2016] [Indexed: 12/27/2022]
Abstract
Highly fecund marine species with dispersive life-history stages often display large population sizes and wide geographic distribution ranges. Consequently, they are expected to experience reduced genetic drift, efficient selection fueled by frequent adaptive mutations, and high migration loads. This has important consequences for understanding how local adaptation proceeds in the sea. A key issue in this regard relates to the genetic architecture underlying fitness traits. Theory predicts that adaptation may involve many genes but with a high variance in effect size. Therefore, the effect of selection on allele frequencies may be substantial for the largest effect size loci, but insignificant for small effect genes. In such a context, the performance of population genomic methods to unravel the genetic basis of adaptation depends on the fraction of adaptive genetic variance explained by the cumulative effect of outlier loci. Here, we address some methodological challenges associated with the detection of local adaptation using molecular approaches. We provide an overview of genome scan methods to detect selection, including those assuming complex demographic models that better describe spatial population structure. We then focus on quantitative genetics approaches that search for genotype-phenotype associations at different genomic scales, including genome-wide methods evaluating the cumulative effect of variants. We argue that the limited power of single locus tests can be alleviated by the use of polygenic scores to estimate the joint contribution of candidate variants to phenotypic variation.
|
44
|
On the maintenance of genetic variation and adaptation to environmental change: considerations from population genomics in fishes. JOURNAL OF FISH BIOLOGY 2016; 89:2519-2556. [PMID: 27687146 DOI: 10.1111/jfb.13145] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 04/03/2016] [Accepted: 08/23/2016] [Indexed: 05/18/2023]
Abstract
The first goal of this paper was to overview modern approaches to local adaptation, with a focus on the use of population genomics data to detect signals of natural selection in fishes. Several mechanisms are discussed that may enhance the maintenance of genetic variation and evolutionary potential, which have been overlooked and should be considered in future theoretical development and predictive models: the prevalence of soft sweeps, polygenic basis of adaptation, balancing selection and transient polymorphisms, parallel evolution, as well as epigenetic variation. Research on fish population genomics has provided ample evidence for local adaptation at the genome level. Pervasive adaptive evolution, however, seems to almost never involve the fixation of beneficial alleles. Instead, adaptation apparently proceeds most commonly by soft sweeps entailing shifts in frequencies of alleles being shared between differentially adapted populations. One obvious factor contributing to the maintenance of standing genetic variation in the face of selective pressures is that adaptive phenotypic traits are most often highly polygenic, and consequently the response to selection should derive mostly from allelic co-variances among causative loci rather than pronounced allele frequency changes. Balancing selection in its various forms may also play an important role in maintaining adaptive genetic variation and the evolutionary potential of species to cope with environmental change. A large body of literature on fishes also shows that repeated evolution of adaptive phenotypes is a ubiquitous evolutionary phenomenon that seems to occur most often via different genetic solutions, further adding to the potential options of species to cope with a changing environment. 
Moreover, a paradox is emerging from recent fish studies whereby populations with highly reduced effective population sizes and impoverished genetic diversity can apparently retain their adaptive potential in some circumstances. Although more empirical support is needed, several recent studies suggest that epigenetic variation could account for this apparent paradox. Therefore, epigenetic variation should be fully integrated with considerations pertaining to the role of soft sweeps, polygenic and balancing selection, as well as repeated adaptation involving different genetic bases, towards improving models predicting the evolutionary potential of species to cope with a changing world.
|
45
|
Sewage treatment plant associated genetic differentiation in the blue mussel from the Baltic Sea and Swedish west coast. PeerJ 2016; 4:e2628. [PMID: 27812424 PMCID: PMC5088577 DOI: 10.7717/peerj.2628] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/17/2016] [Accepted: 09/29/2016] [Indexed: 12/31/2022]
Abstract
Human-derived environmental pollutants and nutrients that reach the aquatic environment through sewage effluents, agricultural and industrial processes are constantly contributing to environmental changes that serve as drivers for adaptive responses and evolutionary changes in many taxa. In this study, we examined how two types of point sources of aquatic environmental pollution, harbors and sewage treatment plants, affect gene diversity and genetic differentiation in the blue mussel in the Baltic Sea area and off the Swedish west coast (Skagerrak). Reference sites (REF) were geographically paired with sites from sewage treatment plants (STP) and harbors (HAR) with a nested sampling scheme, and genetic differentiation was evaluated using a high-resolution marker, amplified fragment length polymorphism (AFLP). This study showed that genetic composition in the Baltic Sea blue mussel was associated with exposure to sewage treatment plant effluents. In addition, mussel populations from harbors were genetically divergent, in contrast to the sewage treatment plant populations, suggesting that there is an effect of pollution from harbors but that the direction is divergent and site specific, while the pollution effect from sewage treatment plants on the genetic composition of blue mussel populations acts in the same direction in the investigated sites.
|
46
|
Genomic signatures among Oncorhynchus nerka ecotypes to inform conservation and management of endangered Sockeye Salmon. Evol Appl 2016; 9:1285-1300. [PMID: 27877206 PMCID: PMC5108219 DOI: 10.1111/eva.12412] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 12/02/2015] [Accepted: 07/25/2016] [Indexed: 01/04/2023]
Abstract
Conservation of life history variation is an important consideration for many species with trade-offs in migratory characteristics. Many salmonid species exhibit both resident and migratory strategies that capitalize on benefits in freshwater and marine environments. In this study, we investigated genomic signatures for migratory life history in collections of resident and anadromous Oncorhynchus nerka (Kokanee and Sockeye Salmon, respectively) from two lake systems, using ~2,600 SNPs from restriction-site-associated DNA sequencing (RAD-seq). Differing demographic histories were evident in the two systems where one pair was significantly differentiated (Redfish Lake, FST = 0.091 [95% confidence interval: 0.087 to 0.095]) but the other pair was not (Alturas Lake, FST = -0.007 [-0.008 to -0.006]). Outlier and association analyses identified several candidate markers in each population pair, but there was limited evidence for parallel signatures of genomic variation associated with migration. Despite lack of evidence for consistent markers associated with migratory life history in this species, candidate markers were mapped to functional genes and provide evidence for adaptive genetic variation within each lake system. Life history variation has been maintained in these nearly extirpated populations of O. nerka, and conservation efforts to preserve this diversity are important for long-term resiliency of this species.
|
47
|
Detecting Polygenic Evolution: Problems, Pitfalls, and Promises. Trends Genet 2016; 32:155-164. [DOI: 10.1016/j.tig.2015.12.004] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 04/26/2015] [Revised: 12/21/2015] [Accepted: 12/22/2015] [Indexed: 10/22/2022]
|
48
|
RAD sequencing reveals within-generation polygenic selection in response to anthropogenic organic and metal contamination in North Atlantic Eels. Mol Ecol 2015; 25:219-37. [DOI: 10.1111/mec.13466] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Received: 05/18/2015] [Revised: 11/06/2015] [Accepted: 11/06/2015] [Indexed: 12/14/2022]
|
49
|
Effectiveness of managed gene flow in reducing genetic divergence associated with captive breeding. Evol Appl 2015; 8:956-71. [PMID: 26640521 PMCID: PMC4662342 DOI: 10.1111/eva.12331] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 07/02/2015] [Accepted: 09/02/2015] [Indexed: 12/28/2022]
Abstract
Captive breeding has the potential to rebuild depressed populations. However, associated genetic changes may decrease restoration success and negatively affect the adaptive potential of the entire population. Thus, approaches that minimize genetic risks should be tested in a comparative framework over multiple generations. Genetic diversity in two captive-reared lines of a species of conservation interest, Chinook salmon (Oncorhynchus tshawytscha), was surveyed across three generations using genome-wide approaches. Genetic divergence from the source population was minimal in an integrated line, which implemented managed gene flow by using only naturally-born adults as captive broodstock, but significant in a segregated line, which bred only captive-origin individuals. Estimates of effective number of breeders revealed that the rapid divergence observed in the latter was largely attributable to genetic drift. Three independent tests for signatures of adaptive divergence also identified temporal change within the segregated line, possibly indicating domestication selection. The results empirically demonstrate that using managed gene flow for propagating a captive-reared population reduces genetic divergence over the short term compared to one that relies solely on captive-origin parents. These findings complement existing studies of captive breeding, which typically focus on a single management strategy and examine the fitness of one or two generations.
|
50
|
An integrated linkage map reveals candidate genes underlying adaptive variation in Chinook salmon (Oncorhynchus tshawytscha). Mol Ecol Resour 2015; 16:769-83. [DOI: 10.1111/1755-0998.12479] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 07/01/2015] [Revised: 10/08/2015] [Accepted: 10/14/2015] [Indexed: 12/31/2022]
|