1
Grant AR, Johnson KP, Stanley EL, Baldwin-Brown J, Kolenčík S, Allen JM. Rapid Targeted Assembly of the Proteome Reveals Evolutionary Variation of GC Content in Avian Lice. Bioinform Biol Insights 2024; 18:11779322241257991. [PMID: 38860163 PMCID: PMC11163934 DOI: 10.1177/11779322241257991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 05/02/2024] [Indexed: 06/12/2024] Open
Abstract
Nucleotide base composition plays an influential role in the molecular mechanisms involved in gene function, phenotype, and amino acid composition. GC content (proportion of guanine and cytosine in DNA sequences) shows a high level of variation within and among species. Many studies measure GC content in a small number of genes, which may not be representative of genome-wide GC variation. One challenge when assembling extensive genomic data sets for these studies is the significant amount of resources (monetary and computational) associated with data processing, and many bioinformatic tools have not been optimized for resource efficiency. Using a high-performance computing (HPC) cluster, we manipulated resources provided to the targeted gene assembly program, automated target restricted assembly method (aTRAM), to determine an optimum way to run the program to maximize resource use. Using our optimum assembly approach, we assembled and measured GC content of all of the protein-coding genes of a diverse group of parasitic feather lice. Of the 499 426 genes assembled across 57 species, feather lice were GC-poor (mean GC = 42.96%) with a significant amount of variation within and between species (GC range = 19.57%-73.33%). We found a significant correlation between GC content and standard deviation per taxon for overall GC and GC3, which could indicate selection for G and C nucleotides in some species. Phylogenetic signal of GC content was detected in both GC and GC3. This research provides a large-scale investigation of GC content in parasitic lice laying the foundation for understanding the basis of variation in base composition across species.
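The two quantities this abstract reports, overall GC content and GC3 (GC content at third codon positions), can be illustrated with a minimal Python sketch. The function names and the choice to ignore ambiguous bases such as N are assumptions made for illustration; this is not the authors' pipeline and does not involve aTRAM.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    # Return NaN when there is nothing to count (empty or all-ambiguous input).
    return gc / total if total else float("nan")


def gc3_content(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3   # drop a trailing partial codon, if any
    third_positions = cds[2:usable:3]  # 0-based indices 2, 5, 8, ...
    return gc_content(third_positions)


if __name__ == "__main__":
    example = "ATGGCCAAATGA"  # hypothetical in-frame coding sequence
    print(f"GC  = {gc_content(example):.2%}")
    print(f"GC3 = {gc3_content(example):.2%}")
```

Averaging these per-gene values within a taxon, and taking their standard deviation, would give the kind of per-species summaries the abstract describes.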
Affiliation(s)
- Avery R Grant
  - Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Kevin P Johnson
  - Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL, USA
- Edward L Stanley
  - Department of Natural History, Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
- Stanislav Kolenčík
  - Faculty of Mathematics, Natural Sciences, and Information Technologies, University of Primorska, Koper, Slovenia
- Julie M Allen
  - Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
2
Galtier N. Half a Century of Controversy: The Neutralist/Selectionist Debate in Molecular Evolution. Genome Biol Evol 2024; 16:evae003. [PMID: 38311843 PMCID: PMC10839204 DOI: 10.1093/gbe/evae003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/01/2024] [Indexed: 02/06/2024] Open
Abstract
The neutral and nearly neutral theories, introduced more than 50 yr ago, have raised and still raise passionate discussion regarding the forces governing molecular evolution and their relative importance. The debate, initially focused on the amount of within-species polymorphism and constancy of the substitution rate, has spread, matured, and now underlies a wide range of topics and questions. The neutralist/selectionist controversy has structured the field and influences the way molecular evolutionary scientists conceive their research.
Affiliation(s)
- Nicolas Galtier
  - ISEM, CNRS, IRD, Université de Montpellier, Montpellier, France
3
Bourret J, Borvető F, Bravo IG. Subfunctionalisation of paralogous genes and evolution of differential codon usage preferences: The showcase of polypyrimidine tract binding proteins. J Evol Biol 2023; 36:1375-1392. [PMID: 37667674 DOI: 10.1111/jeb.14212] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/10/2023] [Revised: 07/11/2023] [Accepted: 07/12/2023] [Indexed: 09/06/2023]
Abstract
Gene paralogs are copies of an ancestral gene that appear after gene or full genome duplication. When two sister gene copies are maintained in the genome, redundancy may release certain evolutionary pressures, allowing one of them to access novel functions. Here, we focus on the evolutionary history of the three polypyrimidine tract binding protein (PTBP) gene paralogs and the concurrent evolution of their differential codon usage preferences (CUPrefs) in vertebrate species. PTBP1-3 show high identity at the amino acid level (up to 80%) but display strongly different nucleotide composition, divergent CUPrefs and, in humans and in many other vertebrates, distinct tissue-specific expression levels. Our phylogenetic inference results show that the duplication events leading to the three extant PTBP1-3 lineages predate the basal diversification within vertebrates, and genomic context analysis illustrates that local synteny has been well preserved over time for the three paralogs. We identify a distinct evolutionary pattern towards GC3-enriching substitutions in PTBP1, concurrent with enrichment in frequently used codons and with a tissue-wide expression. In contrast, PTBP2s are enriched in AT-ending, rare codons, and display tissue-restricted expression. As a result of this substitution trend, CUPrefs sharply differ between mammalian PTBP1s and the rest of PTBPs. Genomic context analysis suggests that GC3-rich nucleotide composition in PTBP1s is driven by local substitution processes, while the evidence in this direction is thinner for PTBP2-3. An actual lack of co-variation between the observed GC composition of PTBP2-3 and that of the surrounding non-coding genomic environment would raise questions about the origin of CUPrefs, warranting further research on a putative tissue-specific translational selection. Finally, we report an intriguing trend for the use of the UUG-Leu codon, which matches the trends of AT-ending codons. Our results are compatible with a scenario in which a combination of directional mutation-selection processes would have differentially shaped CUPrefs of PTBPs in vertebrates: the observed GC-enrichment of PTBP1 in placental mammals may be linked to genomic location and to the strong and broad tissue-expression, while AT-enrichment of PTBP2 and PTBP3 would be associated with rare CUPrefs and thus, possibly, with specialized spatio-temporal expression. Our interpretation is consistent with a gene subfunctionalisation process by differential expression regulation associated with the evolution of specific CUPrefs.
Affiliation(s)
- Jérôme Bourret
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
- Fanni Borvető
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
- Ignacio G Bravo
  - Laboratoire MIVEGEC (CNRS IRD Univ Montpellier), Centre National de la Recherche Scientifique (CNRS), Montpellier, France
4
Smith SA, Walker-Hale N, Parins-Fukuchi CT. Compositional shifts associated with major evolutionary transitions in plants. New Phytol 2023; 239:2404-2415. [PMID: 37381083 DOI: 10.1111/nph.19099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 06/04/2023] [Indexed: 06/30/2023]
Abstract
Heterogeneity in gene trees, morphological characters, and composition has been associated with several major plant clades. Here, we examine heterogeneity in composition across a large transcriptomic dataset of plants to better understand whether locations of shifts in composition are shared across gene regions and whether directions of shifts within clades are shared across gene regions. We estimate mixed models of composition for both nucleotide and amino acids across a recent large-scale transcriptomic dataset for plants. We find shifts in composition across both nucleotide and amino acid datasets, with more shifts detected in nucleotides. We find that Chlorophytes and lineages within experience the most shifts. However, many shifts occur at the origins of land, vascular, and seed plants. While genes in these clades do not typically share the same composition, they tend to shift in the same direction. We discuss potential causes of these patterns. Compositional heterogeneity has been highlighted as a potential problem for phylogenetic analysis, but the variation presented here highlights the need to further investigate these patterns for the signal of biological processes.
Affiliation(s)
- Stephen A Smith
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48103, USA
5
Cope AL, Shah P. Intragenomic variation in non-adaptive nucleotide biases causes underestimation of selection on synonymous codon usage. PLoS Genet 2022; 18:e1010256. [PMID: 35714134 PMCID: PMC9246145 DOI: 10.1371/journal.pgen.1010256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Revised: 06/30/2022] [Accepted: 05/13/2022] [Indexed: 11/20/2022] Open
Abstract
Patterns of non-uniform usage of synonymous codons vary across genes in an organism and between species across all domains of life. This codon usage bias (CUB) is due to a combination of non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most models quantify the effects of mutation bias and selection on CUB assuming uniform mutational and other non-adaptive forces across the genome. However, non-adaptive nucleotide biases can vary within a genome due to processes such as biased gene conversion (BGC), potentially obfuscating signals of selection on codon usage. Moreover, genome-wide estimates of non-adaptive nucleotide biases are lacking for non-model organisms. We combine an unsupervised learning method with a population genetics model of synonymous coding sequence evolution to assess the impact of intragenomic variation in non-adaptive nucleotide bias on quantification of natural selection on synonymous codon usage across 49 Saccharomycotina yeasts. We find that in the absence of a priori information, unsupervised learning can be used to identify genes evolving under different non-adaptive nucleotide biases. We find that the impact of intragenomic variation in non-adaptive nucleotide bias varies widely, even among closely-related species. We show that the overall strength and direction of translational selection can be underestimated by failing to account for intragenomic variation in non-adaptive nucleotide biases. Interestingly, genes falling into clusters identified by machine learning are also physically clustered across chromosomes. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable non-adaptive nucleotide biases on codon frequencies. Codon usage bias (CUB), or the unequal usage of codons of the same amino acid (i.e. synonymous codons), has been observed in species across all domains of life. 
CUB is known to be shaped by both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolution. A key challenge for researchers is disentangling the role of various processes shaping codon usage, often for the purpose of identifying codons favored by natural selection, sometimes referred to as “optimal” or “preferred” codons. Despite large variation in non-adaptive nucleotide biases within a genome, most methods to quantify natural selection typically ignore this variation for the sake of simplicity. Here, we combine a population genetics model with unsupervised machine learning to identify genes evolving under different non-adaptive nucleotide biases across 49 budding yeast species. We find that ignoring variation in non-adaptive nucleotide biases can obfuscate signals of selection on codon usage. Our results indicate the need for more nuanced models of coding sequence evolution.
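As a simple illustration of "unequal usage of codons of the same amino acid", the sketch below computes relative synonymous codon usage (RSCU), a standard descriptive index in which 1.0 means a codon is used exactly as often as expected under uniform usage within its synonymous family. RSCU is not the population genetics model or the unsupervised learning method used by the authors; the function name and the use of the standard genetic code are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds: str) -> dict[str, float]:
    """Relative synonymous codon usage for each codon of every observed amino acid."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TO_AA)

    # Group codons into synonymous families by encoded amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        expected = total / len(family)  # count per codon under uniform usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


if __name__ == "__main__":
    # Hypothetical sequence: three CTG and one TTA codon, all encoding leucine.
    for codon, value in sorted(rscu("CTGCTGCTGTTA").items()):
        print(codon, round(value, 2))
```

A descriptive index like this cannot by itself separate mutation bias from selection, which is the problem the abstract addresses.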
Affiliation(s)
- Alexander L. Cope
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
  - Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, United States of America
- Premal Shah
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
6
An extensive evaluation of codon usage pattern and bias of structural proteins p30, p54 and, p72 of the African swine fever virus (ASFV). Virusdisease 2021; 32:810-822. [PMID: 34901328 DOI: 10.1007/s13337-021-00719-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/05/2021] [Accepted: 06/23/2021] [Indexed: 10/20/2022] Open
Abstract
African swine fever virus (ASFV) belongs to the genus Asfivirus in the family Asfarviridae. ASFV causes a hemorrhagic illness with a high mortality rate and hence commercial losses in the swine industry. ASFV is characterized by variation in codon usage caused by high mutation rates and natural selection. Its evolution is driven mainly by mutation pressure and by regulation of protein gene expression. Codon usage bias analysis was performed on publicly accessible nucleotide sequences of ASFV and its hosts (pig and tick). Because no approved effective vaccine is available to date, it is very important to analyze the codon usage bias of the p30, p54, and p72 proteins of ASFV in order to produce an effective and efficient vaccine to control the disease. Although codon usage bias has been evaluated earlier, the evaluation of the codon usage pattern specific to p30, p54, and p72 of ASFV is inadequate. In all the protein-coding sequences, the nucleotide base T and codons terminating with T were most frequent, and the mean effective number of codons (Nc) was high, indicating the presence of codon usage bias. The GC contents and dinucleotide frequencies also indicated codon usage bias of ASFV in pig and tick. Nc plot, parity plot, and neutrality plot analyses revealed that natural selection as well as mutation pressure were the major constraints shaping the codon bias of ASFV. No substantial differences were found in the codon usage of ASFV in pig and tick. Supplementary Information: The online version contains supplementary material available at 10.1007/s13337-021-00719-x.
7
Krasovec M, Rickaby REM, Filatov DA. Evolution of Mutation Rate in Astronomically Large Phytoplankton Populations. Genome Biol Evol 2021; 12:1051-1059. [PMID: 32645145 PMCID: PMC7486954 DOI: 10.1093/gbe/evaa131] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Accepted: 07/02/2020] [Indexed: 02/06/2023] Open
Abstract
Genetic diversity is expected to be proportional to population size, yet there is a well-known but unexplained lack of genetic diversity in large populations, known as "Lewontin's paradox." Larger populations are expected to evolve lower mutation rates, which may help to explain this paradox. Here, we test this conjecture by measuring the spontaneous mutation rate in a ubiquitous unicellular marine phytoplankton species Emiliania huxleyi (Haptophyta) that has modest genetic diversity despite an astronomically large population size. Genome sequencing of E. huxleyi mutation accumulation lines revealed 455 mutations, with an unusual GC-biased mutation spectrum. This yielded an estimate of the per site mutation rate µ = 5.55×10⁻¹⁰ (CI 95%: 5.05×10⁻¹⁰ to 6.09×10⁻¹⁰), which corresponds to an effective population size Ne ∼ 2.7×10⁶. Such a modest Ne is surprising for a ubiquitous and abundant species that accounts for up to 10% of global primary productivity in the oceans. Our results indicate that even exceptionally large populations do not evolve mutation rates lower than ∼10⁻¹⁰ per nucleotide per cell division. Consequently, the extreme disparity between modest genetic diversity and astronomically large population size in the plankton species cannot be explained by an unusually low mutation rate.
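The abstract does not say how Ne was derived from the mutation rate. One common approach, assumed here for illustration only, uses the neutral equilibrium expectation that nucleotide diversity π equals 4·Ne·µ for a diploid population (2·Ne·µ for a haploid one) and solves for Ne. In the sketch below the function name, the diploid default and the diversity value are hypothetical; only µ is taken from the abstract.

```python
def effective_population_size(pi: float, mu: float, ploidy: int = 2) -> float:
    """Solve pi = 2 * ploidy * Ne * mu for Ne (neutral equilibrium expectation)."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (2 * ploidy * mu)


if __name__ == "__main__":
    MU = 5.55e-10            # per-site mutation rate reported in the abstract
    HYPOTHETICAL_PI = 0.006  # illustrative diversity value, NOT from the abstract

    ne = effective_population_size(HYPOTHETICAL_PI, MU)
    print(f"Ne is roughly {ne:.2e} under the diploid assumption")
```

Because Ne scales inversely with µ under this relationship, a mutation rate that is not unusually low cannot account for modest diversity in a very large census population, which is the argument the abstract makes.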
Affiliation(s)
- Marc Krasovec
  - Department of Plant Sciences, University of Oxford, United Kingdom
- Dmitry A Filatov
  - Department of Plant Sciences, University of Oxford, United Kingdom
8
Genome-wide identification and expression pattern analysis of lipoxygenase gene family in banana. Sci Rep 2021; 11:9948. [PMID: 33976263 PMCID: PMC8113564 DOI: 10.1038/s41598-021-89211-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/22/2020] [Accepted: 04/19/2021] [Indexed: 01/19/2023] Open
Abstract
The LOX genes have been identified and characterized in many plant species, but studies on the banana LOX genes are very limited. In this study, we respectively identified 18 MaLOX, 11 MbLOX, and 12 MiLOX genes from the Musa acuminata, M. balbisiana and M. itinerans genome data, investigated their gene structures and characterized the physicochemical properties of their encoded proteins. Banana LOXs showed a preference for using and ending with G/C and their encoded proteins can be classified into 9-LOX, Type I 13-LOX and Type II 13-LOX subfamilies. The expansion of the MaLOXs might result from the combined actions of genome-wide, tandem, and segmental duplications. However, tandem and segmental duplications contribute to the expansion of MbLOXs. Transcriptome data based gene expression analysis showed that MaLOX1, 4, and 7 were highly expressed in fruit and their expression levels were significantly regulated by ethylene. And 11, 12 and 7 MaLOXs were found to be low temperature-, high temperature-, and Fusarium oxysporum f. sp. Cubense tropical race 4 (FocTR4)-responsive, respectively. MaLOX8, 9 and 13 are responsive to all the three stresses, MaLOX4 and MaLOX12 are high temperature- and FocTR4-responsive; MaLOX6 and MaLOX17 are significantly induced by low temperature and FocTR4; and the expression of MaLOX7 and MaLOX16 are only affected by high temperature. Quantitative real-time PCR (qRT-PCR) analysis revealed that the expression levels of several MaLOXs are regulated by MeJA and FocTR4, indicating that they can increase the resistance of banana by regulating the JA pathway. Additionally, the weighted gene co-expression network analysis (WGCNA) of MaLOXs revealed 3 models respectively for 5 (MaLOX7-11), 3 (MaLOX6, 13, and 17), and 1 (MaLOX12) MaLOX genes. 
Our findings can provide valuable information for the characterization, evolution, diversity and functionality of MaLOX, MbLOX and MiLOX genes and are helpful for understanding the roles of LOXs in banana growth and development and adaptations to different stresses.
9
Burgarella C, Berger A, Glémin S, David J, Terrier N, Deu M, Pot D. The Road to Sorghum Domestication: Evidence From Nucleotide Diversity and Gene Expression Patterns. Front Plant Sci 2021; 12:666075. [PMID: 34527004 PMCID: PMC8435843 DOI: 10.3389/fpls.2021.666075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Accepted: 07/20/2021] [Indexed: 05/17/2023]
Abstract
Native African cereals (sorghum, millets) ensure food security to millions of low-income people from low fertility and drought-prone regions of Africa and Asia. In spite of their agronomic importance, the genetic bases of their phenotype and adaptations are still not well understood. Here we focus on Sorghum bicolor, which is the fifth cereal worldwide for grain production and constitutes the staple food for around 500 million people. We leverage transcriptomic resources to address the adaptive consequences of the domestication process. Gene expression and nucleotide variability were analyzed in 11 domesticated and nine wild accessions. We documented a downregulation of expression and a reduction of diversity both in nucleotide polymorphism (30%) and gene expression levels (18%) in domesticated sorghum. These findings at the genome-wide level support the occurrence of a global reduction of diversity during the domestication process, although several genes also showed patterns consistent with the action of selection. Nine hundred and forty-nine genes were significantly differentially expressed between wild and domesticated gene pools. Their functional annotation points to metabolic pathways most likely contributing to the sorghum domestication syndrome, such as photosynthesis and auxin metabolism. Coexpression network analyses revealed 21 clusters of genes sharing similar expression patterns. Four clusters (totaling 2,449 genes) were significantly enriched in differentially expressed genes between the wild and domesticated pools and two were also enriched in domestication and improvement genes previously identified in sorghum. These findings reinforce the evidence that the combined and intricate effects of the domestication and improvement processes not only affected the behavior of a few genes but also led to a large rewiring of the transcriptome. Overall, these analyses pave the way toward the identification of key domestication genes valuable for genetic resources characterization and breeding purposes.
Affiliation(s)
- Concetta Burgarella
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
  - *Correspondence: Concetta Burgarella
- Angélique Berger
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Sylvain Glémin
  - CNRS, Univ. Rennes, ECOBIO – UMR 6553, Rennes, France
  - Department of Ecology and Evolution, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Jacques David
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Nancy Terrier
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Monique Deu
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- David Pot
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, Univ F-34398 Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
10
Demographic history and adaptive synonymous and nonsynonymous variants of nuclear genes in Rhododendron oldhamii (Ericaceae). Sci Rep 2020; 10:16658. [PMID: 33028947 PMCID: PMC7542430 DOI: 10.1038/s41598-020-73748-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2020] [Accepted: 09/22/2020] [Indexed: 11/23/2022] Open
Abstract
Demographic events are important in shaping the population genetic structure and exon variation can play roles in adaptive divergence. Twelve nuclear genes were used to investigate the species-level phylogeography of Rhododendron oldhamii, test the difference in the average GC content of coding sites and of third codon positions with that of surrounding non-coding regions, and test exon variants associated with environmental variables. Spatial expansion was suggested by R2 index of the aligned intron sequences of all genes of the regional samples and sum of squared deviations statistic of the aligned intron sequences of all genes individually and of all genes of the regional and pooled samples. The level of genetic differentiation was significantly different between regional samples. Significantly lower and higher average GC contents across 94 sequences of the 12 genes at third codon positions of coding sequences than that of surrounding non-coding regions were found. We found seven exon variants associated strongly with environmental variables. Our results demonstrated spatial expansion of R. oldhamii in the late Pleistocene and the optimal third codon position could end in A or T rather than G or C as frequent alleles and could have been important for adaptive divergence in R. oldhamii.
11
Hämälä T, Tiffin P. Biased Gene Conversion Constrains Adaptation in Arabidopsis thaliana. Genetics 2020; 215:831-846. [PMID: 32414868 PMCID: PMC7337087 DOI: 10.1534/genetics.120.303335] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/22/2020] [Accepted: 05/14/2020] [Indexed: 02/01/2023] Open
Abstract
Reduction of fitness due to deleterious mutations imposes a limit to adaptive evolution. By characterizing features that influence this genetic load we may better understand constraints on responses to both natural and human-mediated selection. Here, using whole-genome, transcriptome, and methylome data from >600 Arabidopsis thaliana individuals, we set out to identify important features influencing selective constraint. Our analyses reveal that multiple factors underlie the accumulation of maladaptive mutations, including gene expression level, gene network connectivity, and gene-body methylation. We then focus on a feature with major effect, nucleotide composition. The ancestral vs. derived status of segregating alleles suggests that GC-biased gene conversion, a recombination-associated process that increases the frequency of G and C nucleotides regardless of their fitness effects, shapes sequence patterns in A. thaliana. Through estimation of mutational effects, we present evidence that biased gene conversion hinders the purging of deleterious mutations and contributes to a genome-wide signal of decreased efficacy of selection. By comparing these results to two outcrossing relatives, Arabidopsis lyrata and Capsella grandiflora, we find that protein evolution in A. thaliana is as strongly affected by biased gene conversion as in the outcrossing species. Last, we perform simulations to show that natural levels of outcrossing in A. thaliana are sufficient to facilitate biased gene conversion despite increased homozygosity due to selfing. Together, our results show that even predominantly selfing taxa are susceptible to biased gene conversion, suggesting that it may constitute an important constraint to adaptation among plant species.
Affiliation(s)
- Tuomas Hämälä
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota 55108
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota 55108
12
Martin G, Cardi C, Sarah G, Ricci S, Jenny C, Fondi E, Perrier X, Glaszmann JC, D'Hont A, Yahiaoui N. Genome ancestry mosaics reveal multiple and cryptic contributors to cultivated banana. Plant J 2020; 102:1008-1025. [PMID: 31930580 PMCID: PMC7317953 DOI: 10.1111/tpj.14683] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 07/04/2019] [Revised: 12/18/2019] [Accepted: 01/02/2020] [Indexed: 05/24/2023]
Abstract
Hybridizations between closely related species commonly occur in the domestication process of many crops. Banana cultivars are derived from such hybridizations between species and subspecies of the Musa genus that have diverged in various tropical Southeast Asian regions and archipelagos. Among the diploid and triploid hybrids generated, those with seedless parthenocarpic fruits were selected by humans and thereafter dispersed through vegetative propagation. Musa acuminata subspecies contribute to most of these cultivars. We analyzed sequence data from 14 M. acuminata wild accessions and 10 M. acuminata-based cultivars, including diploids and one triploid, to characterize the ancestral origins along their chromosomes. We used multivariate analysis and single nucleotide polymorphism clustering and identified five ancestral groups as contributors to these cultivars. Four of these corresponded to known M. acuminata subspecies. A fifth group, found only in cultivars, was defined based on the 'Pisang Madu' cultivar and represented two uncharacterized genetic pools. Diverse ancestral contributions along cultivar chromosomes were found, resulting in mosaics with at least three and up to five ancestries. The commercially important triploid Cavendish banana cultivar had contributions from at least one of the uncharacterized genetic pools and three known M. acuminata subspecies. Our results highlighted that cultivated banana origins are more complex than expected - involving multiple hybridization steps - and also that major wild banana ancestors have yet to be identified. This study revealed the extent to which admixture has framed the evolution and domestication of a crop plant.
Collapse
Affiliation(s)
- Guillaume Martin
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Céline Cardi
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Gautier Sarah
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Sébastien Ricci
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- CARBAP, Rue Dinde, No. 110, Bonanjo, BP 832, Douala, Cameroon
- CIRAD, UMR AGAP, F-97130, Capesterre Belle Eau, France
| | - Christophe Jenny
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Emmanuel Fondi
- CARBAP, Rue Dinde, No. 110, Bonanjo, BP 832, Douala, Cameroon
| | - Xavier Perrier
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Jean-Christophe Glaszmann
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Angélique D'Hont
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Nabila Yahiaoui
- CIRAD, UMR AGAP, F-34398, Montpellier, France
- AGAP, Univ. Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
|
13
|
Diop SI, Subotic O, Giraldo-Fonseca A, Waller M, Kirbis A, Neubauer A, Potente G, Murray-Watson R, Boskovic F, Bont Z, Hock Z, Payton AC, Duijsings D, Pirovano W, Conti E, Grossniklaus U, McDaniel SF, Szövényi P. A pseudomolecule-scale genome assembly of the liverwort Marchantia polymorpha. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 101:1378-1396. [PMID: 31692190 DOI: 10.1111/tpj.14602] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/14/2019] [Accepted: 10/28/2019] [Indexed: 05/07/2023]
Abstract
Marchantia polymorpha has recently become a prime model for cellular, evo-devo, synthetic biological, and evolutionary investigations. We present a pseudomolecule-scale assembly of the M. polymorpha genome, making comparative genome structure analysis and classical genetic mapping approaches feasible. We anchored 88% of the M. polymorpha draft genome to a high-density linkage map resulting in eight pseudomolecules. We found that the overall genome structure of M. polymorpha is in some respects different from that of the model moss Physcomitrella patens. Specifically, genome collinearity between the two bryophyte genomes and vascular plants is limited, suggesting extensive rearrangements since divergence. Furthermore, recombination rates are greatest in the middle of the chromosome arms in M. polymorpha like in most vascular plant genomes, which is in contrast with P. patens where recombination rates are evenly distributed along the chromosomes. Nevertheless, some other properties of the genome are shared with P. patens. As in P. patens, DNA methylation in M. polymorpha is spread evenly along the chromosomes, which is in stark contrast with the angiosperm model Arabidopsis thaliana, where DNA methylation is strongly enriched at the centromeres. Nevertheless, DNA methylation and recombination rate are anticorrelated in all three species. Finally, M. polymorpha and P. patens centromeres are of similar structure and marked by high abundance of retroelements unlike in vascular plants. Taken together, the highly contiguous genome assembly we present opens unexplored avenues for M. polymorpha research by linking the physical and genetic maps, making novel genomic and genetic analyses, including map-based cloning, feasible.
Affiliation(s)
- Seydina I Diop
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
- BaseClear B.V., Sylviusweg 74, 2333 BE, Leiden, the Netherlands
| | - Oliver Subotic
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
- BaseClear B.V., Sylviusweg 74, 2333 BE, Leiden, the Netherlands
| | - Alejandro Giraldo-Fonseca
- Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Manuel Waller
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Alexander Kirbis
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Anna Neubauer
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Giacomo Potente
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
- BaseClear B.V., Sylviusweg 74, 2333 BE, Leiden, the Netherlands
| | - Rachel Murray-Watson
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Filip Boskovic
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
- Cavendish Laboratory, University of Cambridge, JJ Thompson Avenue, CB3 0HE, Cambridge, UK
| | - Zoe Bont
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
- Institute of Plant Sciences, University of Bern, Altenbergrain 21, 3013, Bern, Switzerland
| | - Zsofia Hock
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Adam C Payton
- Department of Biology, University of Florida, 876 Newell Drive, Gainesville, FL, 32611, USA
| | | | - Walter Pirovano
- BaseClear B.V., Sylviusweg 74, 2333 BE, Leiden, the Netherlands
| | - Elena Conti
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Ueli Grossniklaus
- Department of Plant and Microbial Biology & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
| | - Stuart F McDaniel
- Department of Biology, University of Florida, 876 Newell Drive, Gainesville, FL, 32611, USA
| | - Péter Szövényi
- Department of Systematic and Evolutionary Botany & Zurich-Basel Plant Science Center, University of Zurich, Zollikerstrasse 107, 8008, Zurich, Switzerland
|
14
|
Gros‐Balthazard M, Besnard G, Sarah G, Holtz Y, Leclercq J, Santoni S, Wegmann D, Glémin S, Khadari B. Evolutionary transcriptomics reveals the origins of olives and the genomic changes associated with their domestication. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 100:143-157. [PMID: 31192486 PMCID: PMC6851578 DOI: 10.1111/tpj.14435] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 04/01/2019] [Revised: 05/29/2019] [Accepted: 06/03/2019] [Indexed: 05/11/2023]
Abstract
The olive (Olea europaea L. subsp. europaea) is one of the oldest and most socio-economically important cultivated perennial crops in the Mediterranean region. Yet, its origins are still under debate and the genetic bases of the phenotypic changes associated with its domestication are unknown. We generated RNA-sequencing data for 68 wild and cultivated olive trees to study the genetic diversity and structure both at the transcription and sequence levels. To localize putative genes or expression pathways targeted by artificial selection during domestication, we employed a two-step approach in which we identified differentially expressed genes and screened the transcriptome for signatures of selection. Our analyses support a major domestication event in the eastern part of the Mediterranean basin followed by dispersion towards the West and subsequent admixture with western wild olives. While we found large changes in gene expression when comparing cultivated and wild olives, we found no major signature of selection on coding variants and weak signals primarily affected transcription factors. Our results indicated that the domestication of olives resulted in only moderate genomic consequences and that the domestication syndrome is mainly related to changes in gene expression, consistent with its evolutionary history and life history traits.
Affiliation(s)
- Muriel Gros‐Balthazard
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Present address: New York University Abu Dhabi (NYUAD), Center for Genomics and Systems Biology, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | | | - Gautier Sarah
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Yan Holtz
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Julie Leclercq
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Sylvain Santoni
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Daniel Wegmann
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Swiss Institute of Bioinformatics, Fribourg, Switzerland
| | - Sylvain Glémin
- CNRS, Université de Rennes, ECOBIO (Ecosystèmes, biodiversité, évolution) − UMR 6553, F‐35000 Rennes, France
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Bouchaib Khadari
- AGAP, University Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Conservatoire Botanique National Méditerranéen, UMR AGAP, Montpellier, France
|
15
|
Borges R, Szöllősi GJ, Kosiol C. Quantifying GC-Biased Gene Conversion in Great Ape Genomes Using Polymorphism-Aware Models. Genetics 2019; 212:1321-1336. [PMID: 31147380 PMCID: PMC6707462 DOI: 10.1534/genetics.119.302074] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/13/2018] [Accepted: 05/20/2019] [Indexed: 11/18/2022]
Abstract
As multi-individual population-scale data become available, more complex modeling strategies are needed to quantify genome-wide patterns of nucleotide usage and associated mechanisms of evolution. Recently, the multivariate neutral Moran model was proposed. However, it was shown to be insufficient to explain the distribution of alleles in great apes. Here, we propose a new model that includes allelic selection. Our theoretical results constitute the basis of a new Bayesian framework to estimate mutation rates and selection coefficients from population data. We apply the new framework to a great ape dataset, where we found patterns of allelic selection that match those of genome-wide GC-biased gene conversion (gBGC). In particular, we show that great apes have patterns of allelic selection that vary in intensity, a feature that we correlated with great apes' distinct demographies. We also demonstrate that the AT/GC toggling effect decreases the probability of a substitution, promoting more polymorphisms in the base composition of great ape genomes. We further assess the impact of GC-bias in molecular analysis, and find that mutation rates and genetic distances are estimated under bias when gBGC is not properly accounted for. Our results contribute to the discussion on the tempo and mode of gBGC evolution, while stressing the need for gBGC-aware models in population genetics and phylogenetics.
Affiliation(s)
- Rui Borges
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Wien, Austria
| | - Gergely J Szöllősi
- Department of Biological Physics, MTA-ELTE "Lendulet" Evolutionary Genomics Research Group, Eötvös University, Pázmány P. stny. 1A, Budapest 1117, Hungary
| | - Carolin Kosiol
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Wien, Austria
- Centre for Biological Diversity, School of Biology, University of St Andrews, Fife KY16 9TH, UK
|
16
|
LaBella AL, Opulente DA, Steenwyk JL, Hittinger CT, Rokas A. Variation and selection on codon usage bias across an entire subphylum. PLoS Genet 2019; 15:e1008304. [PMID: 31365533 PMCID: PMC6701816 DOI: 10.1371/journal.pgen.1008304] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 07/05/2019] [Revised: 08/20/2019] [Accepted: 07/11/2019] [Indexed: 01/04/2023]
Abstract
Variation in synonymous codon usage is abundant across multiple levels of organization: between codons of an amino acid, between genes in a genome, and between genomes of different species. It is now well understood that variation in synonymous codon usage is influenced by mutational bias coupled with both natural selection for translational efficiency and genetic drift, but how these processes shape patterns of codon usage bias across entire lineages remains unexplored. To address this question, we used a rich genomic data set of 327 species that covers nearly one third of the known biodiversity of the budding yeast subphylum Saccharomycotina. We found that, while genome-wide relative synonymous codon usage (RSCU) for all codons was highly correlated with the GC content of the third codon position (GC3), the usage of codons for the amino acids proline, arginine, and glycine was inconsistent with the neutral expectation where mutational bias coupled with genetic drift drive codon usage. Examination of the relationship between genes’ effective numbers of codons and their GC3 contents in individual genomes revealed that nearly a quarter of genes (381,174/1,683,203; 23%), as well as most genomes (308/327; 94%), significantly deviate from the neutral expectation. Finally, by evaluating the imprint of translational selection on codon usage, measured as the degree to which genes’ adaptiveness to the tRNA pool was correlated with selective pressure, we show that translational selection is widespread in budding yeast genomes (264/327; 81%). These results suggest that the contribution of translational selection and drift to patterns of synonymous codon usage across budding yeasts varies across codons, genes, and genomes; whereas drift is the primary driver of global codon usage across the subphylum, the codon bias of large numbers of genes in the majority of genomes is influenced by translational selection.
Synonymous mutations in genes have no effect on the encoded proteins and were once thought to be evolutionarily neutral. By examining codon usage bias across codons, genes, and genomes of 327 species in the budding yeast subphylum, we show that synonymous codon usage is shaped by both neutral processes and selection for translational efficiency. Specifically, whereas codon usage bias for most codons appears to be strongly associated with mutational bias and largely driven by genetic drift across the entire subphylum, patterns of codon usage bias in a few codons, as well as in many genes in nearly all genomes of budding yeasts, deviate from neutral expectations. Rather, the synonymous codons used within genes in most budding yeast genomes are adapted to the tRNAs present within each genome, a result most likely due to translational selection that optimizes codons to match the tRNAs. Our results suggest that patterns of codon usage bias in budding yeasts, and perhaps more broadly in fungi and other microbial eukaryotes, are shaped by both neutral and selective processes.
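As a rough illustration of the two quantities compared in the abstract above, the following Python sketch computes GC3 and relative synonymous codon usage (RSCU) for a single coding sequence. The function names and the toy sequence are assumptions made for illustration and are not the authors' pipeline; RSCU is taken here as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1) in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def gc3(seq):
    """Fraction of codons whose third position is G or C."""
    thirds = [c[2] for c in codons(seq)]
    return sum(b in "GC" for b in thirds) / len(thirds)


def rscu(seq):
    """RSCU per codon, for amino acids that occur at least once in the sequence."""
    counts = Counter(c for c in codons(seq) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for aa, members in families.items():
        total = sum(counts[m] for m in members)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for m in members:
            values[m] = counts[m] * len(members) / total
    return values


# Hypothetical toy sequence: four proline codons (CCG, CCG, CCA, CCT).
toy = "CCGCCGCCACCT"
print(gc3(toy))
print(rscu(toy)["CCG"])
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above 1 indicate over-representation within its synonymous family.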
Affiliation(s)
- Abigail L. LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Dana A. Opulente
- Laboratory of Genetics, Genome Center of Wisconsin, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin–Madison, Wisconsin, United States of America
| | - Jacob L. Steenwyk
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Chris Todd Hittinger
- Laboratory of Genetics, Genome Center of Wisconsin, DOE Great Lakes Bioenergy Research Center, Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin–Madison, Wisconsin, United States of America
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
|
17
|
Glémin S, Scornavacca C, Dainat J, Burgarella C, Viader V, Ardisson M, Sarah G, Santoni S, David J, Ranwez V. Pervasive hybridizations in the history of wheat relatives. SCIENCE ADVANCES 2019; 5:eaav9188. [PMID: 31049399 PMCID: PMC6494498 DOI: 10.1126/sciadv.aav9188] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 10/31/2018] [Accepted: 03/20/2019] [Indexed: 05/18/2023]
Abstract
Cultivated wheats are derived from an intricate history of three genomes, A, B, and D, present in both diploid and polyploid species. It was recently proposed that the D genome originated from an ancient hybridization between the A and B lineages. However, this result has been questioned, and a robust phylogeny of wheat relatives is still lacking. Using transcriptome data from all diploid species and a new methodological approach, our comprehensive phylogenomic analysis revealed that more than half of the species descend from an ancient hybridization event but with a more complex scenario involving a different parent than previously thought (Aegilops mutica, an overlooked wild species) instead of the B genome. We also detected other extensive gene flow events that could explain long-standing controversies in the classification of wheat relatives.
Affiliation(s)
- Sylvain Glémin
- CNRS, Univ Rennes, ECOBIO (Ecosystèmes, biodiversité, évolution)–UMR 6553, F-35042 Rennes, France
- Department of Ecology and Genetics, Evolutionary Biology Center, Uppsala University, Norbyvägen 18D, 752 36 Uppsala, Sweden
| | - Celine Scornavacca
- Institut des Sciences de l’Evolution Université de Montpellier, CNRS, IRD, EPHE CC 064, Place Eugène Bataillon, 34095 Montpellier, cedex 05, France
| | - Jacques Dainat
- National Bioinformatics Infrastructure Sweden (NBIS), SciLifeLab, Uppsala Biomedicinska Centrum (BMC), Husargatan 3, S-751 23 Uppsala, Sweden
- IMBIM–Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala Biomedicinska Centrum (BMC), Husargatan 3, Box 582, S-751 23 Uppsala, Sweden
| | - Concetta Burgarella
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- CIRAD, UMR AGAP, F-34398 Montpellier, France
| | - Véronique Viader
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Morgane Ardisson
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Gautier Sarah
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- South Green Bioinformatics Platform, BIOVERSITY, CIRAD, INRA, IRD, Montpellier SupAgro, Montpellier, France
| | - Sylvain Santoni
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Jacques David
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - Vincent Ranwez
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
|
18
|
Galtier N, Roux C, Rousselle M, Romiguier J, Figuet E, Glémin S, Bierne N, Duret L. Codon Usage Bias in Animals: Disentangling the Effects of Natural Selection, Effective Population Size, and GC-Biased Gene Conversion. Mol Biol Evol 2019; 35:1092-1103. [PMID: 29390090 DOI: 10.1093/molbev/msy015] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Indexed: 12/13/2022]
Abstract
Selection on codon usage bias is well documented in a number of microorganisms. Whether codon usage is also generally shaped by natural selection in large organisms, despite their relatively small effective population size (Ne), is unclear. In animals, the population genetics of codon usage bias has only been studied in a handful of model organisms so far, and can be affected by confounding, nonadaptive processes such as GC-biased gene conversion and experimental artefacts. Using population transcriptomics data, we analyzed the relationship between codon usage, gene expression, allele frequency distribution, and recombination rate in 30 nonmodel species of animals, each from a different family, covering a wide range of effective population sizes. We disentangled the effects of translational selection and GC-biased gene conversion on codon usage by separately analyzing GC-conservative and GC-changing mutations. We report evidence for effective translational selection on codon usage in large-Ne species of animals, but not in small-Ne ones, in agreement with the nearly neutral theory of molecular evolution. C- and T-ending codons tend to be preferred over synonymous G- and A-ending ones, for reasons that remain to be determined. In contrast, we uncovered a conspicuous effect of GC-biased gene conversion, which is widespread in animals and the main force determining the fate of AT↔GC mutations. Intriguingly, the strength of its effect was uncorrelated with Ne.
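The separation of GC-conservative from GC-changing mutations described in the abstract above can be sketched as a small classifier. This is an illustrative assumption about how such a partition might be coded, not the authors' implementation; the labels "WS" (weak to strong, AT→GC) and "SW" (strong to weak, GC→AT) and the function name are chosen here for the example.

```python
STRONG = {"G", "C"}  # bases forming three hydrogen bonds
WEAK = {"A", "T"}    # bases forming two hydrogen bonds


def classify_mutation(ancestral, derived):
    """Classify a point mutation by its effect on GC content.

    Returns "WS" for AT->GC changes, "SW" for GC->AT changes, and
    "GC-conservative" for A<->T or G<->C changes, which leave GC content
    unchanged and are therefore not expected to be affected by gBGC.
    """
    a, d = ancestral.upper(), derived.upper()
    if a == d or not {a, d} <= STRONG | WEAK:
        raise ValueError(f"not a valid point mutation: {ancestral}>{derived}")
    if a in WEAK and d in STRONG:
        return "WS"
    if a in STRONG and d in WEAK:
        return "SW"
    return "GC-conservative"


# Hypothetical (ancestral, derived) pairs.
for pair in [("A", "G"), ("C", "T"), ("A", "T"), ("G", "C")]:
    print(pair, classify_mutation(*pair))
```

Analysing the GC-conservative class separately is what allows a signal of translational selection to be read without the confounding effect of GC-biased gene conversion, which acts only on the WS and SW classes.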
Affiliation(s)
- Nicolas Galtier
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Camille Roux
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France; Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; UMR 8198 - Evo-Eco-Paleo, CNRS, Université de Lille-Sciences et Technologies, Villeneuve d'Ascq, France
| | - Marjolaine Rousselle
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Jonathan Romiguier
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France; Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Emeric Figuet
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Sylvain Glémin
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France; Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Nicolas Bierne
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, CNRS, Université de Lyon, Université Lyon 1, Villeurbanne, France
|
19
|
Rife TW, Graybosch RA, Poland JA. Genomic Analysis and Prediction within a US Public Collaborative Winter Wheat Regional Testing Nursery. THE PLANT GENOME 2018; 11:180012. [PMID: 30512033 DOI: 10.3835/plantgenome2018.02.0012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 05/26/2023]
Abstract
The development of inexpensive, whole-genome profiling enables a transition to allele-based breeding using genomic prediction models. These models consider alleles shared between lines to predict phenotypes and select new lines based on estimated breeding values. This approach can leverage highly unbalanced datasets that are common to breeding programs. The Southern Regional Performance Nursery (SRPN) is a public nursery established by the USDA-ARS in 1931 to characterize performance and quality of near-release wheat (Triticum aestivum L.) varieties from breeding programs in the US Central Plains. New entries are submitted annually and can be re-entered only once. The trial is grown at >30 locations each year and lines are evaluated for grain yield, disease resistance, and agronomic traits. Overall genetic gain is measured across years by including common check cultivars for comparison. We have generated whole-genome profiles via genotyping-by-sequencing (GBS) for 939 SRPN entries dating back to 1992 to explore the potential use of the nursery as a genomic selection (GS) training population (TP). The GS prediction models across years (average r = 0.33) outperformed year-to-year phenotypic correlation for yield (r = 0.27) for a majority of the years evaluated, suggesting that genomic selection has the potential to outperform low heritability selection on yield in these highly variable environments. We also examined the predictability of programs using both program-specific and whole-set TPs. Generally, the predictability of a program was similar with both approaches. These results suggest that wheat breeding programs can collaboratively leverage the immense datasets that are generated from regional testing networks.
|
20
|
Distinguishing Among Evolutionary Forces Acting on Genome-Wide Base Composition: Computer Simulation Analysis of Approximate Methods for Inferring Site Frequency Spectra of Derived Mutations. G3-GENES GENOMES GENETICS 2018; 8:1755-1769. [PMID: 29588382 PMCID: PMC5940166 DOI: 10.1534/g3.117.300512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Inferred ancestral nucleotide states are increasingly employed in analyses of within- and between-species genome variation. Although numerous studies have focused on ancestral inference among distantly related lineages, approaches to infer ancestral states in polymorphism data have received less attention. Recently developed approaches that employ complex transition matrices allow us to infer ancestral nucleotide sequence in various evolutionary scenarios of base composition. However, the requirement of a single gene tree to calculate a likelihood is an important limitation for conducting ancestral inference using within-species variation in recombining genomes. To resolve this problem, and to extend the applicability of ancestral inference in studies of base composition evolution, we first evaluate three previously proposed methods to infer ancestral nucleotide sequences among within- and between-species sequence variation data. The methods employ a single allele, bifurcating tree, or a star tree for within-species variation data. Using simulated nucleotide sequences, we employ ancestral inference to infer fixations and polymorphisms. We find that all three methods show biased inference. We modify the bifurcating tree method to include weights to adjust for an expected site frequency spectrum, “bifurcating tree with weighting” (BTW). Our simulation analysis shows that the BTW method can substantially improve the reliability and robustness of ancestral inference in a range of scenarios that include non-neutral and/or non-stationary base composition evolution.
|
21
|
Niklas KJ, Dunker AK, Yruela I. The evolutionary origins of cell type diversification and the role of intrinsically disordered proteins. JOURNAL OF EXPERIMENTAL BOTANY 2018; 69:1437-1446. [PMID: 29394379 DOI: 10.1093/jxb/erx493] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 12/22/2017] [Accepted: 12/19/2017] [Indexed: 05/26/2023]
Abstract
The evolution of complex multicellular life forms occurred multiple times and was attended by cell type specialization. We review seven lines of evidence indicating that intrinsically disordered/ductile proteins (IDPs) played a significant role in the evolution of multicellularity and cell type specification: (i) most eukaryotic transcription factors (TFs) and multifunctional enzymes contain disproportionately long IDP sequences (≥30 residues in length), whereas highly conserved enzymes are normally IDP region poor; (ii) ~80% of the proteome involved in development are IDPs; (iii) the majority of proteins undergoing alternative splicing (AS) of pre-mRNA contain significant IDP regions; (iv) proteins encoded by DNA regions flanking crossing-over 'hot spots' are significantly enriched in IDP regions; (v) IDP regions are disproportionately subject to combinatorial post-translational modifications (PTMs) as well as AS; (vi) proteins involved in transcription and RNA processing are enriched in IDP regions; and (vii) a strong positive correlation exists between the number of different cell types and the IDP proteome fraction across a broad spectrum of uni- and multicellular algae, plants, and animals. We argue that the multifunctionalities conferred by IDPs and the disproportionate involvement of IDPs with AS and PTMs provided an IDP-AS-PTM 'motif' that significantly contributed to the evolution of multicellularity in all major eukaryotic lineages.
Affiliation(s)
- Karl J Niklas
- Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, USA
| | - A Keith Dunker
- Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Inmaculada Yruela
- Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas (EEAD-CSIC), Avda. Montañana, Zaragoza, Spain
- Grupo de Bioquímica, Biofísica y Biología Computacional (BIFI, UNIZAR), Unidad Asociada al CSIC, Spain
|
22
|
Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J. Codon usage and codon pair patterns in non-grass monocot genomes. ANNALS OF BOTANY 2017; 120:893-909. [PMID: 29155926 PMCID: PMC5710610 DOI: 10.1093/aob/mcx112] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 01/17/2017] [Accepted: 09/19/2017] [Indexed: 05/19/2023]
Abstract
BACKGROUND AND AIMS Studies on codon usage in monocots have focused on grasses, and observed patterns of this taxon were generalized to all monocot species. Here, non-grass monocot species were analysed to investigate the differences between grass and non-grass monocots. METHODS First, studies of codon usage in monocots were reviewed. The current information was then extended regarding codon usage, as well as codon-pair context bias, using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. Measurements were taken regarding relative synonymous codon usage, effective number of codons, derived optimal codon and GC content and then the relationships investigated to infer the underlying evolutionary forces. KEY RESULTS The research identified optimal codons, rare codons and preferred codon-pair context in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content in third codon position) in grasses, non-grass monocots showed a unimodal distribution. Disproportionate use of G and C (and of A and T) in two- and four-codon amino acids detected in the analysis rules out the mutational bias hypothesis as an explanation of genomic variation in GC content. There was found to be a positive relationship between CAI (codon adaptation index; predicts the level of expression of a gene) and GC3. In addition, a strong correlation was observed between coding and genomic GC content and negative correlation of GC3 with gene length, indicating a strong impact of GC-biased gene conversion (gBGC) in shaping codon usage and nucleotide composition in non-grass monocots. CONCLUSION Optimal codons in these non-grass monocots show a preference for G/C in the third codon position. These results support the concept that codon usage and nucleotide composition in non-grass monocots are mainly driven by gBGC.
Affiliation(s)
- Purabi Mazumdar
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
| | - RofinaYasmin Binti Othman
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Katharina Mebus
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- N Ramakrishnan
- Electrical and Computer System Engineering, School of Engineering, Monash University Malaysia, Bandar Sunway, Malaysia
- Jennifer Ann Harikrishna
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- For correspondence. E-mail:
23
Ranwez V, Serra A, Pot D, Chantret N. Domestication reduces alternative splicing expression variations in sorghum. PLoS One 2017; 12:e0183454. [PMID: 28886042 PMCID: PMC5590825 DOI: 10.1371/journal.pone.0183454] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/19/2017] [Accepted: 08/06/2017] [Indexed: 01/09/2023] Open
Abstract
Domestication is known to strongly reduce genomic diversity through population bottlenecks. The resulting loss of polymorphism has been thoroughly documented in numerous cultivated species. Here we investigate the impact of domestication on the diversity of alternative transcript expression using RNAseq data obtained from cultivated and wild sorghum accessions (ten accessions for each pool). To this end, we focus on genes expressing two isoforms in sorghum and estimate the ratio between the expression levels of those isoforms in each accession. Notably, for a given gene, one isoform can be either overexpressed or underexpressed in some wild accessions, whereas in the cultivated accessions the balance between the two isoforms of the same gene appears to be much more homogeneous. Indeed, we observe in sorghum significantly more variation in isoform expression balance among wild accessions than among domesticated accessions. It is possible that the loss of nucleotide diversity due to domestication affects regulatory elements controlling the transcription or degradation of these isoforms. The impact on the isoform expression balance is discussed. As far as we know, this is the first time that the impact of domestication on transcript isoform balance has been studied at the genomic scale. This could pave the way towards the identification of key domestication genes whose isoform expression is finely tuned in domesticated accessions while being highly variable in their wild relatives.
Affiliation(s)
- Audrey Serra
- Montpellier SupAgro, UMR AGAP, Montpellier, France
- David Pot
- CIRAD, UMR AGAP, Montpellier, France