1
Bénitière F, Lefébure T, Duret L. Variation in the fitness impact of translationally optimal codons among animals. Genome Res 2025; 35:446-458. [PMID: 39929724 PMCID: PMC11960461 DOI: 10.1101/gr.279837.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Accepted: 01/30/2025] [Indexed: 03/05/2025]
Abstract
Early studies in invertebrate model organisms (fruit flies, nematodes) showed that their synonymous codon usage is under selective pressure to optimize translation efficiency in highly expressed genes (a process called translational selection). In contrast, mammals show little evidence of selection for translationally optimal codons. To understand this difference, we examined the use of synonymous codons in 223 metazoan species, covering a wide range of animal clades. For each species, we predicted the set of optimal codons based on the pool of tRNA genes present in its genome, and we analyzed how the frequency of optimal codons correlates with gene expression to quantify the intensity of translational selection (S). We observed that few metazoans show clear signs of translational selection. As predicted by the nearly neutral theory, the highest values of S are observed in species with large effective population sizes (Ne). Overall, however, Ne appears to be a poor predictor of the intensity of translational selection, suggesting important differences in the fitness effect of synonymous codon usage across taxa. We propose that the few animal taxa that are clearly affected by translational selection correspond to organisms with strong constraints for a very rapid growth rate.
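As a rough illustration of the quantity this abstract describes, the sketch below computes the frequency of optimal codons (Fop) for each gene and its rank correlation with expression. The optimal-codon set, the toy sequences, the expression values, and all function names are hypothetical. The authors predict optimal codons from tRNA gene pools and estimate S with their own model, neither of which is reproduced here.

```python
# Hypothetical set of optimal codons, for illustration only.
OPTIMAL = {"CTG", "GCC", "AAG", "GAG"}
# Met and Trp have a single codon each, and stop codons are not scored.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping any partial tail."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def fop(cds, optimal=OPTIMAL):
    """Fraction of informative codons that belong to the optimal set."""
    informative = [c for c in split_codons(cds) if c not in EXCLUDED]
    if not informative:
        raise ValueError("no informative codons in sequence")
    return sum(c in optimal for c in informative) / len(informative)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Toy genes: (coding sequence, expression level); values are made up.
GENES = {
    "high": ("ATGCTGGCCAAGGAGTAA", 100.0),
    "mid": ("ATGCTGGCAAAGGAATAA", 10.0),
    "low": ("ATGCTAGCAAAAGAATAA", 1.0),
}

fop_values = [fop(seq) for seq, _ in GENES.values()]
expression = [level for _, level in GENES.values()]
rho = spearman(fop_values, expression)
```

A positive `rho` on real data would only say that optimal codons are enriched in highly expressed genes. Turning that into an estimate of selection intensity needs the population-genetic modelling described in the paper.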
Affiliation(s)
- Florian Bénitière
  - Laboratoire de Biométrie et Biologie Évolutive, Université Lyon 1, UMR CNRS 5558, Villeurbanne, France
  - Université Claude Bernard Lyon 1, LEHNA UMR 5023, CNRS, ENTPE, F-69622, Villeurbanne, France
- Tristan Lefébure
  - Université Claude Bernard Lyon 1, LEHNA UMR 5023, CNRS, ENTPE, F-69622, Villeurbanne, France
- Laurent Duret
  - Laboratoire de Biométrie et Biologie Évolutive, Université Lyon 1, UMR CNRS 5558, Villeurbanne, France
2
Radrizzani S, Kudla G, Izsvák Z, Hurst LD. Selection on synonymous sites: the unwanted transcript hypothesis. Nat Rev Genet 2024; 25:431-448. [PMID: 38297070 DOI: 10.1038/s41576-023-00686-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/04/2023] [Indexed: 02/02/2024]
Abstract
Although translational selection to favour codons that match the most abundant tRNAs is not readily observed in humans, there is nonetheless selection in humans on synonymous mutations. We hypothesize that much of this synonymous site selection can be explained in terms of protection against unwanted RNAs - spurious transcripts, mis-spliced forms or RNAs derived from transposable elements or viruses. We propose not only that selection on synonymous sites functions to reduce the rate of creation of unwanted transcripts (for example, through selection on exonic splice enhancers and cryptic splice sites) but also that high-GC content (but low-CpG content), together with intron presence and position, is both particular to functional native mRNAs and used to recognize transcripts as native. In support of this hypothesis, transcription, nuclear export, liquid phase condensation and RNA degradation have all recently been shown to promote GC-rich transcripts and suppress AU/CpG-rich ones. With such 'traps' being set against AU/CpG-rich transcripts, the codon usage of native genes has, in turn, evolved to avoid such suppression. That parallel filters against AU/CpG-rich transcripts also affect the endosomal import of RNAs further supports the unwanted transcript hypothesis of synonymous site selection and explains the similar design rules that have enabled the successful use of transgenes and RNA vaccines.
Affiliation(s)
- Sofia Radrizzani
  - Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
  - Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute for Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- Zsuzsanna Izsvák
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
3
Lewin LE, Daniels KG, Hurst LD. Genes for highly abundant proteins in Escherichia coli avoid 5' codons that promote ribosomal initiation. PLoS Comput Biol 2023; 19:e1011581. [PMID: 37878567 PMCID: PMC10599525 DOI: 10.1371/journal.pcbi.1011581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 10/09/2023] [Indexed: 10/27/2023]
Abstract
In many species highly expressed genes (HEGs) over-employ the synonymous codons that match the more abundant iso-acceptor tRNAs. Bacterial transgene codon randomization experiments report, however, that enrichment with such "translationally optimal" codons has little to no effect on the resultant protein level. By contrast, consistent with the view that ribosomal initiation is rate limiting, synonymous codon usage following the 5' ATG greatly influences protein levels, at least in part by modifying RNA stability. For the design of bacterial transgenes, for simple codon based in silico inference of protein levels and for understanding selection on synonymous mutations, it would be valuable to computationally determine initiation optimality (IO) scores for codons for any given species. One attractive approach is to characterize the 5' codon enrichment of HEGs compared with the most lowly expressed genes, just as translational optimality scores of codons have been similarly defined employing the full gene body. Here we determine the viability of this approach employing a unique opportunity: for Escherichia coli there is both the most extensive protein abundance data for native genes and a unique large-scale transgene codon randomization experiment enabling objective definition of the 5' codons that cause, rather than just correlate with, high protein abundance (that we equate with initiation optimality, broadly defined). Surprisingly, the 5' ends of native genes that specify highly abundant proteins avoid such initiation optimal codons. We find that this is probably owing to conflicting selection pressures particular to native HEGs, including selection favouring low initiation rates, this potentially enabling high efficiency of ribosomal usage and low noise. 
While the classical HEG enrichment approach does not work, rendering simple prediction of native protein abundance from 5' codon content futile, we report evidence that initiation optimality scores derived from the transgene experiment may hold relevance for in silico transgene design for a broad spectrum of bacteria.
Affiliation(s)
- Loveday E. Lewin
  - The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Kate G. Daniels
  - The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
- Laurence D. Hurst
  - The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, United Kingdom
4
Rezvannejad E, Mousavizadeh S. Identification genetic variations in some heat shock protein genes of Tali goat breed and study their structural and functional effects on relevant proteins. Vet Med Sci 2023; 9:2247-2259. [PMID: 37530404 PMCID: PMC10508551 DOI: 10.1002/vms3.1231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 07/16/2023] [Accepted: 07/21/2023] [Indexed: 08/03/2023]
Abstract
BACKGROUND Animals of different regions have adapted to adverse environmental conditions by modifying their phenotypic and genotypic characteristics in the long run. OBJECTIVES In this study, the effects of genetic variations in 10 heat shock protein (HSP) genes (HSP70A4, HSP70A9, HSP40C17, HSP40C27, HSP90AA1, HSP90AB1, HSPB7, HSPB11, HSPD1 and HSPE1) on the three-dimensional structure and function of the proteins in Tali goat (a tropical breed) were studied and compared with Saanen goat (a sensitive breed). METHODS Pooled DNA of 15 blood samples was sequenced and mapped to the goat reference sequence. Bioinformatics analysis was used to identify nsSNPs in the Tali breed, which were compared with the Saanen goat. Four online bioinformatics tools (Sorting Intolerant from Tolerant, Protein Variation Effect Analyzer, Polymorphism Phenotyping version 2 and Single Nucleotide Polymorphism Database and Gene Ontology) showed three deleterious missense nsSNPs and seven natural missense SNPs in these HSP genes of Tali goat. RESULTS Out of 10 reported nsSNPs, 5 nsSNPs in HSP70A4, 1 nsSNP in HSP70A9, 2 nsSNPs in HSP40C17, 1 nsSNP in HSP40C27 and 1 nsSNP in HSPD1 were detected. ConSurf tools showed that the majority of the predicted nsSNPs occur in conserved sites. Moreover, several post-translational modification (PTM) predictors computed the probability of post-translational changes at the nsSNPs. The putative phosphorylation and glycosylation sites in HSP proteins were substitutions rs669769139 and rs666336692 of the Tali goat breed. CONCLUSION These results on the effect of the type of genetic variant on the function of HSP proteins will help to predict resistance to harsh conditions in goat breeds.
Notably, the identified SNP rs669769139 (S248), located in the N-terminal ATPase domain of HSP70A4, is a PTM site with a high conservation score. The substitution is predicted to be benign but to change protein stability, so it may affect the functional and structural characteristics of HSP proteins in adaptation to the local climate.
Affiliation(s)
- Elham Rezvannejad
  - Department of Biotechnology, Institute of Sciences and High Technology and Environmental Sciences, Graduate University of Advanced Technology, Kerman, Iran
5
Picard MAL, Leblay F, Cassan C, Willemsen A, Daron J, Bauffe F, Decourcelle M, Demange A, Bravo IG. Transcriptomic, proteomic, and functional consequences of codon usage bias in human cells during heterologous gene expression. Protein Sci 2023; 32:e4576. [PMID: 36692287 PMCID: PMC9926478 DOI: 10.1002/pro.4576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/17/2022] [Revised: 01/12/2023] [Accepted: 01/14/2023] [Indexed: 01/25/2023]
Abstract
Differences in codon frequency between genomes, genes, or positions along a gene, modulate transcription and translation efficiency, leading to phenotypic and functional differences. Here, we present a multiscale analysis of the effects of synonymous codon recoding during heterologous gene expression in human cells, quantifying the phenotypic consequences of codon usage bias at different molecular and cellular levels, with an emphasis on translation elongation. Six synonymous versions of an antibiotic resistance gene were generated, fused to a fluorescent reporter, and independently expressed in HEK293 cells. Multiscale phenotype was analyzed by means of quantitative transcriptome and proteome assessment, as proxies for gene expression; cellular fluorescence, as a proxy for single-cell level expression; and real-time cell proliferation in absence or presence of antibiotic, as a proxy for the cell fitness. We show that differences in codon usage bias strongly impact the molecular and cellular phenotype: (i) they result in large differences in mRNA levels and protein levels, leading to differences of over 15 times in translation efficiency; (ii) they introduce unpredicted splicing events; (iii) they lead to reproducible phenotypic heterogeneity; and (iv) they lead to a trade-off between the benefit of antibiotic resistance and the burden of heterologous expression. In human cells in culture, codon usage bias modulates gene expression by modifying mRNA availability and suitability for translation, leading to differences in protein levels and eventually eliciting functional phenotypic changes.
Affiliation(s)
- Marion A. L. Picard
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Fiona Leblay
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Cécile Cassan
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Anouk Willemsen
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Josquin Daron
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Frédérique Bauffe
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Mathilde Decourcelle
  - BioCampus Montpellier (University of Montpellier, CNRS, INSERM), Montpellier, France
- Antonin Demange
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
- Ignacio G. Bravo
  - French National Center for Scientific Research, Laboratory MIVEGEC (CNRS, IRD, University of Montpellier), Montpellier, France
6
Xing ZP, Liang X, Wang X, Hu HY, Huang YX. Novel gene rearrangement pattern in mitochondrial genome of Ooencyrtus plautus Huang & Noyes, 1994: new gene order in Encyrtidae (Hymenoptera, Chalcidoidea). Zookeys 2022; 1124:1-21. [PMID: 36762364 PMCID: PMC9836654 DOI: 10.3897/zookeys.1124.83811] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/15/2022] [Accepted: 09/14/2022] [Indexed: 11/12/2022]
Abstract
Studies of mitochondrial genomes have a wide range of applications in phylogeny, population genetics, and evolutionary biology. In this study, we sequenced and analyzed the mitochondrial genome of Ooencyrtus plautus Huang & Noyes, 1994 (Hymenoptera, Encyrtidae). The nearly complete mitogenome of O. plautus was 15,730 bp in size, including 13 PCGs (protein-coding genes), 22 tRNAs, 2 rRNAs, and a nearly complete control region. The nucleotide composition was significantly biased toward adenine and thymine, with an A + T content of 84.6%. We used the reference sequence of Chouioia cunea and calculated the Ka/Ks ratio for each set of PCGs. The highest Ka/Ks ratio among the 13 PCGs was found in nad2 (1.1), suggesting that it was subjected to positive selection. This phenomenon was first discovered in Encyrtidae. Compared with other encyrtid mitogenomes, a translocation of trnW was found in O. plautus, which was the first of its kind to be reported in Encyrtidae. Compared with the ancestral arrangement pattern, wasps show extensive gene rearrangements. Although these insects have a high frequency of gene rearrangement, species from the same family and genus tend to have similar gene sequences. As the number of sequenced mitochondrial genomes in Chalcidoidea increases, we summarize some of the rules of gene rearrangement in Chalcidoidea, namely four gene clusters with frequent gene rearrangements. Ten mitogenomes were included to reconstruct the phylogenetic trees of Encyrtidae based on both 13 PCGs (nucleotides of protein coding genes) and AA matrix (amino acids of protein coding genes) using the maximum likelihood and Bayesian inference methods. The phylogenetic tree reconstructed by Bayesian inference based on the AA data set showed that Aenasius arizonensis and Metaphycus eriococci formed a clade representing Tetracneminae. The remaining six species formed a monophyletic clade representing Encyrtinae.
In Encyrtinae, Encyrtus forms a monophyletic clade as a sister group to the clade formed by O. plautus and Diaphorencyrtus aligarhensis. Encyrtus sasakii and Encyrtus rhodooccisiae were the most closely related species in this monophyletic clade. In addition, gene rearrangements can provide valuable information for molecular phylogenetic reconstruction. These results enhance our understanding of phylogenetic relationships among Encyrtidae.
Affiliation(s)
- Zhi-Ping Xing
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co-founded by Anhui Province and Ministry of Education, Wuhu, Anhui 241000, China
  - School of Ecology and Environment, Anhui Normal University, Wuhu, Anhui 241000, China
- Xin Liang
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co-founded by Anhui Province and Ministry of Education, Wuhu, Anhui 241000, China
  - School of Ecology and Environment, Anhui Normal University, Wuhu, Anhui 241000, China
- Xu Wang
  - School of Ecology and Environment, Anhui Normal University, Wuhu, Anhui 241000, China
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, China
- Hao-Yuan Hu
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co-founded by Anhui Province and Ministry of Education, Wuhu, Anhui 241000, China
  - School of Ecology and Environment, Anhui Normal University, Wuhu, Anhui 241000, China
- Yi-Xin Huang
  - Collaborative Innovation Center of Recovery and Reconstruction of Degraded Ecosystem in Wanjiang Basin Co-founded by Anhui Province and Ministry of Education, Wuhu, Anhui 241000, China
  - School of Ecology and Environment, Anhui Normal University, Wuhu, Anhui 241000, China
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, China
7
A novel statistical method predicts mutability of the genomic segments of the SARS-CoV-2 virus. QRB DISCOVERY 2021; 3:e1. [PMID: 35106478 PMCID: PMC8795775 DOI: 10.1017/qrd.2021.13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2020] [Revised: 05/28/2021] [Accepted: 11/26/2021] [Indexed: 11/06/2022]
Abstract
The SARS-CoV-2 virus has caused the largest pandemic of the 21st century, with hundreds of millions of cases and tens of millions of fatalities. Scientists all around the world are racing to develop vaccines and new pharmaceuticals to overcome the pandemic and offer effective treatments for COVID-19 disease. Consequently, there is an essential need to better understand how the pathogenesis of SARS-CoV-2 is affected by viral mutations and to determine the conserved segments in the viral genome that can serve as stable targets for novel therapeutics. Here, we introduce a text-mining method to estimate the mutability of genomic segments directly from a reference (ancestral) whole genome sequence. The method relies on calculating the importance of genomic segments based on their spatial distribution and frequency over the whole genome. To validate our approach, we perform a large-scale analysis of the viral mutations in nearly 80,000 publicly available SARS-CoV-2 predecessor whole genome sequences and show that these results are highly correlated with the segments predicted by the statistical method used for keyword detection. Importantly, these correlations are found to hold at the codon and gene levels, as well as for gene coding regions. Using the text-mining method, we further identify codon sequences that are potential candidates for siRNA-based antiviral drugs. Significantly, one of the candidates identified in this work corresponds to the first seven codons of an epitope of the spike glycoprotein, which is the only SARS-CoV-2 immunogenic peptide without a match to a human protein.
8
Abrahams L, Savisaar R, Mordstein C, Young B, Kudla G, Hurst LD. Evidence in disease and non-disease contexts that nonsense mutations cause altered splicing via motif disruption. Nucleic Acids Res 2021; 49:9665-9685. [PMID: 34469537 PMCID: PMC8464065 DOI: 10.1093/nar/gkab750] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/26/2021] [Revised: 08/17/2021] [Accepted: 08/19/2021] [Indexed: 12/21/2022]
Abstract
Transcripts containing premature termination codons (PTCs) can be subject to nonsense-associated alternative splicing (NAS). Two models have been evoked to explain this, scanning and splice motif disruption. The latter postulates that exonic cis motifs, such as exonic splice enhancers (ESEs), are disrupted by nonsense mutations. We employ genome-wide transcriptomic and k-mer enrichment methods to scrutinize this model. First, we show that ESEs are prone to disruptive nonsense mutations owing to their purine richness and paucity of TGA, TAA and TAG. The motif model correctly predicts that NAS rates should be low (we estimate 5–30%) and approximately in line with estimates for the rate at which random point mutations disrupt splicing (8–20%). Further, we find that, as expected, NAS-associated PTCs are predictable from nucleotide-based machine learning approaches to predict splice disruption and, at least for pathogenic variants, are enriched in ESEs. Finally, we find that both in and out of frame mutations to TAA, TGA or TAG are associated with exon skipping. While a higher relative frequency of such skip-inducing mutations in-frame than out of frame lends some credence to the scanning model, these results reinforce the importance of considering splice motif modulation to understand the etiology of PTC-associated disease.
Affiliation(s)
- Liam Abrahams
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Christine Mordstein
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
  - Aarhus University, Department of Molecular Biology and Genetics, C F Møllers Allé 3, 8000 Aarhus, Denmark
- Bethan Young
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, The University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
9
Rezvannejad E, Mousavizadeh SA, Lotfi S, Kargar N. Determine genetic variations in heat shock factor gene family (HSFs) and study their effect on the functional and structural characterization of protein in Tali goat. Anim Biotechnol 2021; 34:236-245. [PMID: 34370605 DOI: 10.1080/10495398.2021.1954935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
In this study, the effects of genetic variations in four heat shock transcription factor genes (HSF1, HSF2, HSF4, and HSF5) on 3D protein structure and function were studied. We defined the breed-specific genetic variations in pooled DNA of Tali goat that differed from the goat reference sequence (CHI2.0). Disordered regions of HSF proteins were predicted using PONDR. Post-translational modifications were studied using several online prediction servers. Then, the structure of the ordered regions of the proteins was predicted using SWISS-MODEL. Tali goat HSF genes contain a total of 181, 679, 91, and 301 SNPs for HSF1, 2, 4, and 5, respectively. Also, 5 and 3 variants were identified as nsSNPs in the coding region of HSF4 and HSF5, respectively. (r.145A/S), (r.322P/Y), (r.379T/C) in HSF4 and (r.300Q/P), (r.573E/Q) in HSF5 were predicted to be tolerated with high confidence (SIFT score). More than half of these proteins are predicted to be disordered (56, 50, 52, and 50%, respectively for HSF1, 2, 4, and 5). Phosphorylation, acetylation, glycosylation, and sumoylation sites of HSFs were compared between Tali goat and reference goat. Three residues S145, S263, and S322 of HSF4 in Tali goat were phosphorylation sites, and in HSF5, the reference goat has a phosphorylation site in S593.
Affiliation(s)
- Elham Rezvannejad
  - Department of Biotechnology, Institute of Sciences and High Technology and Environmental Sciences, Graduate University of Advanced Technology, Kerman, Iran
- Safa Lotfi
  - Department of Biotechnology, Institute of Sciences and High Technology and Environmental Sciences, Graduate University of Advanced Technology, Kerman, Iran
- Najmeh Kargar
  - Department of Animal Science, Kerman Agricultural and Natural Resources Research and Education Center, Kerman, Iran
10
Weak selection on synonymous codons substantially inflates dN/dS estimates in bacteria. Proc Natl Acad Sci U S A 2021; 118:2023575118. [PMID: 33972434 DOI: 10.1073/pnas.2023575118] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 01/21/2023]
Abstract
Synonymous codon substitutions are not always selectively neutral as revealed by several types of analyses, including studies of codon usage patterns among genes. We analyzed codon usage in 13 bacterial genomes sampled from across a large order of bacteria, Enterobacterales, and identified presumptively neutral and selected classes of synonymous substitutions. To estimate substitution rates, given a neutral/selected classification of synonymous substitutions, we developed a flexible dN/dS substitution model that allows multiple classes of synonymous substitutions. Under this multiclass synonymous substitution (MSS) model, the denominator of dN/dS includes only the strictly neutral class of synonymous substitutions. On average, the value of dN/dS under the MSS model was 80% of that under the standard codon model in which all synonymous substitutions are assumed to be neutral. The indication is that conventional dN/dS analyses overestimate these values and thus overestimate the frequency of positive diversifying selection and underestimate the strength of purifying selection. To quantify the strength of selection necessary to explain this reduction, we developed a model of selected compensatory codon substitutions. The reduction in synonymous substitution rate, and thus the contribution that selection makes to codon bias variation among genes, can be adequately explained by very weak selection, with a mean product of population size and selection coefficient, [Formula: see text].
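For orientation, the ratio named in the title (dN/dS) and the change of denominator described in the abstract can be written as below. The symbol ω, the subscripts, and the superscript "neutral" are notation assumed here for illustration and may not match the paper's own symbols. The factor of 0.8 is the average reported in the abstract.

```latex
\omega_{\text{standard}} = \frac{d_N}{d_S},
\qquad
\omega_{\text{MSS}} = \frac{d_N}{d_S^{\text{neutral}}},
\qquad
\omega_{\text{MSS}} \approx 0.8\,\omega_{\text{standard}} \quad \text{(on average)}.
```

Read this way, a smaller ratio under the MSS model corresponds to a larger denominator: if the selected class of synonymous substitutions accumulates more slowly than the neutral class, a pooled synonymous rate understates the neutral rate and inflates the conventional ratio.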
11
The association between polymorphism of norepinephrine transporter G1287A and major depressive disorder, antidepressant response: a meta-analysis. Psychiatr Genet 2020; 30:101-109. [PMID: 32459709 DOI: 10.1097/ypg.0000000000000254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Extensive research has examined the cause of major depressive disorder (MDD), and accumulating evidence has revealed that the gene for the norepinephrine transporter (NET) is involved in the etiology of MDD as well as in the antidepressant response. The G1287A polymorphism (rs5569, GRCh38, Chromosome 16, 55697923) is located in the exon 9 region of the SLC6A2 gene. It was found to be associated with MDD and antidepressant response in people of different genetic ancestries. However, the results are still inconsistent. METHODS A meta-analysis was conducted to evaluate the overall association of rs5569 polymorphisms with MDD and the antidepressant response. RESULTS Sixteen articles that studied the association between the G1287A polymorphism and MDD or antidepressant response were identified, and their outcomes revealed a significant association between the polymorphism and both MDD and antidepressant response. Our study indicated that the GG genotype may be a protective factor against the development of MDD [odds ratio (OR) = 0.78, 95% confidence interval (CI) = 0.64-0.96, P = 0.02 for Asian population; OR = 0.79, 95% CI = 0.63-0.98, P = 0.03 for Han Chinese population] while the GG genotype had a worse antidepressant response (OR = 0.49, 95% CI = 0.25-0.94, P = 0.03). CONCLUSIONS NET G1287A polymorphisms are involved in the etiology of MDD and antidepressant response.
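To make the reported statistics concrete, here is a minimal sketch of how an odds ratio and its 95% confidence interval are obtained from a single 2x2 table with the standard log (Woolf) method. The counts and the function name are hypothetical. A meta-analysis pools many such tables with study weights, which this sketch does not attempt.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases with the genotype      b: controls with the genotype
    c: cases without the genotype   d: controls without the genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the meta-analysis.
estimate, low, high = odds_ratio_ci(a=40, b=60, c=60, d=60)
```

An interval that lies entirely below 1, like the ones quoted in the abstract, is what supports describing a genotype as protective at the chosen confidence level.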
12
Abrahams L, Hurst LD. A Depletion of Stop Codons in lincRNA is Owing to Transfer of Selective Constraint from Coding Sequences. Mol Biol Evol 2020; 37:1148-1164. [PMID: 31841162 PMCID: PMC7086181 DOI: 10.1093/molbev/msz299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
Abstract
Although the constraints on a gene’s sequence are often assumed to reflect the functioning of that gene, here we propose transfer selection, a constraint operating on one class of genes transferred to another, mediated by shared binding factors. We show that such transfer can explain an otherwise paradoxical depletion of stop codons in long intergenic noncoding RNAs (lincRNAs). Serine/arginine-rich proteins direct the splicing machinery by binding exonic splice enhancers (ESEs) in immature mRNA. As coding exons cannot contain stop codons in one reading frame, stop codons should be rare within ESEs. We confirm that the stop codon density (SCD) in ESE motifs is low, even accounting for nucleotide biases. Given that serine/arginine-rich proteins binding ESEs also facilitate lincRNA splicing, a low SCD could transfer to lincRNAs. As predicted, multiexon lincRNA exons are depleted in stop codons, a result not explained by open reading frame (ORF) contamination. Consistent with transfer selection, stop codon depletion in lincRNAs is most acute in exonic regions with the highest ESE density, disappears when ESEs are masked, is consistent with stop codon usage skews in ESEs, and is diminished in both single-exon lincRNAs and introns. Owing to low SCD, the maximum lengths of pseudo-ORFs frequently exceed null expectations. This has implications for ORF annotation and the evolution of de novo protein-coding genes from lincRNAs. We conclude that not all constraints operating on genes need be explained by the functioning of the gene but may instead be transferred owing to shared binding factors.
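The abstract does not spell out how stop codon density (SCD) is calculated, so the sketch below assumes one simple definition: the fraction of codons in a given reading frame that are TAA, TAG, or TGA. It also reports the longest stop-free run of codons in a frame, as a stand-in for pseudo-ORF length. Function names and the toy sequence are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def frame_codons(seq, frame):
    """Codons read from offset `frame` (0, 1, or 2), ignoring a partial tail."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def stop_codon_density(seq, frame):
    """Assumed definition: stop codons divided by total codons in the frame."""
    codons = frame_codons(seq, frame)
    if not codons:
        return 0.0
    return sum(c in STOPS for c in codons) / len(codons)


def longest_stop_free_run(seq, frame):
    """Length, in codons, of the longest stretch without a stop in the frame."""
    best = current = 0
    for codon in frame_codons(seq, frame):
        if codon in STOPS:
            current = 0
        else:
            current += 1
            best = max(best, current)
    return best


toy = "ATGTAAGGGTGA"  # made-up sequence
densities = [stop_codon_density(toy, f) for f in range(3)]
```

Comparing such densities against a null that preserves nucleotide composition, as the abstract says the authors do, is the step that distinguishes depletion from base-content bias. That control is not shown here.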
Affiliation(s)
- Liam Abrahams
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom

13
Vallejos-Vidal E, Reyes-Cerpa S, Rivas-Pardo JA, Maisey K, Yáñez JM, Valenzuela H, Cea PA, Castro-Fernandez V, Tort L, Sandino AM, Imarai M, Reyes-López FE. Single-Nucleotide Polymorphisms (SNP) Mining and Their Effect on the Tridimensional Protein Structure Prediction in a Set of Immunity-Related Expressed Sequence Tags (EST) in Atlantic Salmon (Salmo salar). Front Genet 2020; 10:1406. [PMID: 32174954 PMCID: PMC7056891 DOI: 10.3389/fgene.2019.01406] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/04/2018] [Accepted: 12/24/2019] [Indexed: 12/12/2022]
Abstract
Single-nucleotide polymorphisms (SNPs) are single genetic code variations considered one of the most common forms of nucleotide modifications. Such SNPs can be located in genes associated with the immune response and, therefore, they may have direct implications for the phenotype of susceptibility to infections affecting the productive sector. In this study, a set of immune-related genes (cc motif chemokine 19 precursor [ccl19], integrin β2 [itβ2, also named cd18], glutathione transferase omega-1 [gsto-1], heat shock 70 kDa protein [hsp70], major histocompatibility complex class I [mhc-I]) were analyzed to identify SNPs by data mining. These genes were chosen based on their previously reported expression in the infectious pancreatic necrosis virus (IPNV)-infected Atlantic salmon phenotype. The available EST sequences for these genes were obtained from the Unigene database. Twenty-eight SNPs were found in the genes evaluated, and most of them were identified as transition base changes. The effect of the SNPs located in the 5'-untranslated region (UTR) or 3'-UTR upon transcription factor binding sites and alternative splicing regulatory motifs was assessed and ranked with a low-medium predicted FASTSNP score risk. Synonymous SNPs were found in itβ2 (c.2275G > A), gsto-1 (c.558G > A), and hsp70 (c.1950C > T) with low FASTSNP predicted score risk. The difference in the relative synonymous codon usage (RSCU) value between the variant codons and the wild-type codon (ΔRSCU) showed one negative (hsp70 c.1950C > T) and two positive ΔRSCU values (itβ2 c.2275G > A; gsto-1 c.558G > A), suggesting that these synonymous SNPs (sSNPs) may be associated with differences in the local rate of elongation. Nonsynonymous SNPs (nsSNPs) in the gsto-1 translatable gene region were ranked, using the SIFT and POLYPHEN web tools, with the second highest (c.205A > G; c.484T > C) and the highest (c.499T > C; c.769A > C) predicted score risk possible. Using homology modeling to predict the effect of these nonsynonymous SNPs, the most relevant nucleotide changes for gsto-1 were observed for the nsSNPs c.205A > G, c.484T > C, and c.769A > C. Molecular dynamics was used to analyze whether these GSTO-1 variants differ significantly in their conformational dynamics, and the results suggest that these SNPs could have allosteric effects modulating its catalysis. Altogether, these results suggest that the candidate SNPs identified may play a crucial role in the immune response of Atlantic salmon.
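The ΔRSCU comparison described in this abstract can be sketched as follows, assuming the usual definition of RSCU (the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid). The codon counts below are hypothetical placeholders, not data from the study, and the function names `rscu` and `delta_rscu` are introduced here only for illustration.

```python
# Minimal sketch of RSCU and delta-RSCU for one amino acid family.
# Assumption: RSCU(codon) = n_synonyms * count(codon) / total family count.


def rscu(counts):
    """Map each codon in a synonymous family to its RSCU value."""
    total = sum(counts.values())
    n_synonyms = len(counts)
    if total == 0:
        return {codon: 0.0 for codon in counts}
    return {codon: n_synonyms * x / total for codon, x in counts.items()}


def delta_rscu(counts, wild_type, variant):
    """RSCU of the variant codon minus RSCU of the wild-type codon."""
    values = rscu(counts)
    return values[variant] - values[wild_type]


if __name__ == "__main__":
    # Hypothetical usage counts for the four alanine codons
    alanine = {"GCT": 30, "GCC": 50, "GCA": 15, "GCG": 5}
    print(rscu(alanine))
    # Negative value: the variant codon is used less often than the wild type
    print(delta_rscu(alanine, wild_type="GCC", variant="GCT"))
```

Under this definition, a negative ΔRSCU means the SNP replaces a frequently used codon with a rarer synonym, and a positive value means the reverse.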
Affiliation(s)
- Eva Vallejos-Vidal
- Department of Cell Biology, Physiology and Immunology, Faculty of Biosciences, Universitat Autònoma de Barcelona, Barcelona, Spain
- Sebastián Reyes-Cerpa
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Escuela de Biotecnología, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Jaime Andrés Rivas-Pardo
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Escuela de Biotecnología, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Kevin Maisey
- Centro de Biotecnología Acuícola, Departamento de Biología, Facultad de Química y Biología, Universidad de Santiago de Chile, Santiago, Chile
- José M. Yáñez
- Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
- Hector Valenzuela
- Centro de Biotecnología Acuícola, Departamento de Biología, Facultad de Química y Biología, Universidad de Santiago de Chile, Santiago, Chile
- Pablo A. Cea
- Facultad de Ciencias, Universidad de Chile, Santiago, Chile
- Lluis Tort
- Department of Cell Biology, Physiology and Immunology, Faculty of Biosciences, Universitat Autònoma de Barcelona, Barcelona, Spain
- Ana M. Sandino
- Centro de Biotecnología Acuícola, Departamento de Biología, Facultad de Química y Biología, Universidad de Santiago de Chile, Santiago, Chile
- Mónica Imarai
- Centro de Biotecnología Acuícola, Departamento de Biología, Facultad de Química y Biología, Universidad de Santiago de Chile, Santiago, Chile
- Felipe E. Reyes-López
- Department of Cell Biology, Physiology and Immunology, Faculty of Biosciences, Universitat Autònoma de Barcelona, Barcelona, Spain

14
Pollo-Oliveira L, de Crécy-Lagard V. Can Protein Expression Be Regulated by Modulation of tRNA Modification Profiles? Biochemistry 2018; 58:355-362. [PMID: 30511849 DOI: 10.1021/acs.biochem.8b01035] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/20/2022]
Abstract
tRNAs are the central adaptor molecules in translation. Their decoding properties are influenced by post-transcriptional modifications, particularly in the critical anticodon-stem-loop (ASL) region. Synonymous codon choice, also called codon usage bias, affects both translation efficiency and accuracy, and ASL modifications play key roles in both of these processes. In combination with a handful of historical examples, recent studies integrating ribosome profiling, proteomics, codon-usage analyses, and modification quantifications show that levels of tRNA modifications can change under stress, during development, or under specific metabolic conditions and can modulate the expression of specific genes. Deconvoluting the different responses (global or specific) to tRNA modification deficiencies can be difficult because of pleiotropic effects, but, as more cases emerge, it does seem that tRNA modification changes could add another layer of regulation in the transfer of information from DNA to protein.
Affiliation(s)
- Leticia Pollo-Oliveira
- Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida 32603, United States
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida 32603, United States; University of Florida Genetics Institute, Gainesville, Florida 32608, United States

15
Savisaar R, Hurst LD. Exonic splice regulation imposes strong selection at synonymous sites. Genome Res 2018; 28:1442-1454. [PMID: 30143596 PMCID: PMC6169883 DOI: 10.1101/gr.233999.117] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/11/2018] [Accepted: 07/31/2018] [Indexed: 01/17/2023]
Abstract
What proportion of coding sequence nucleotides have roles in splicing, and how strong is the selection that maintains them? Despite a large body of research into exonic splice regulatory signals, these questions have not been answered. This is because, to our knowledge, previous investigations have not explicitly disentangled the frequency of splice regulatory elements from the strength of the evolutionary constraint under which they evolve. Current data are consistent both with a scenario of weak and diffuse constraint, enveloping large swaths of sequence, as well as with well-defined pockets of strong purifying selection. In the former case, natural selection on exonic splice enhancers (ESEs) might primarily act as a slight modifier of codon usage bias. In the latter, mutations that disrupt ESEs are likely to have large fitness and, potentially, clinical effects. To distinguish between these scenarios, we used several different methods to determine the distribution of selection coefficients for new mutations within ESEs. The analyses converged to suggest that ∼15%-20% of fourfold degenerate sites are part of functional ESEs. Most of these sites are under strong evolutionary constraint. Therefore, exonic splice regulation does not simply impose a weak bias that gently nudges coding sequence evolution in a particular direction. Rather, the selection to preserve these motifs is a strong force that severely constrains the evolution of a substantial proportion of coding nucleotides. Thus synonymous mutations that disrupt ESEs should be considered as a potentially common cause of single-locus genetic disorders.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom

16
Abrahams L, Hurst LD. Adenine Enrichment at the Fourth CDS Residue in Bacterial Genes Is Consistent with Error Proofing for +1 Frameshifts. Mol Biol Evol 2018; 34:3064-3080. [PMID: 28961919 PMCID: PMC5850271 DOI: 10.1093/molbev/msx223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/21/2022]
Abstract
Beyond selection for optimal protein functioning, coding sequences (CDSs) are under selection at the RNA and DNA levels. Here, we identify a possible signature of “dual-coding,” namely extensive adenine (A) enrichment at bacterial CDS fourth sites. In 99.07% of studied bacterial genomes, fourth site A use is greater than expected given genomic A-starting codon use. Arguing for nucleotide level selection, A-starting serine and arginine second codons are heavily utilized when compared with their non-A starting synonyms. Several models have the ability to explain some of this trend. In part, A-enrichment likely reduces 5′ mRNA stability, promoting translation initiation. However T/U, which may also reduce stability, is avoided. Further, +1 frameshifts on the initiating ATG encode a stop codon (TGA) provided A is the fourth residue, acting either as a frameshift “catch and destroy” or a frameshift stop and adjust mechanism and hence implicated in translation initiation. Consistent with both, genomes lacking TGA stop codons exhibit weaker fourth site A-enrichment. Sequences lacking a Shine–Dalgarno sequence and those without upstream leader genes, that may be more error prone during initiation, have greater utilization of A, again suggesting a role in initiation. The frameshift correction model is consistent with the notion that many genomic features are error-mitigation factors and provides the first evidence for site-specific out of frame stop codon selection. We conjecture that the NTG universal start codon may have evolved as a consequence of TGA being a stop codon and the ability of NTGA to rapidly terminate or adjust a ribosome.
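The frameshift argument in this abstract rests on a simple sequence fact: when the fourth CDS residue is A, a +1 slip on an NTG start codon puts the ribosome directly on TGA. The sketch below checks that property for a given CDS; the function name `plus_one_frameshift_hits_stop` and the example sequences are hypothetical and are used only to make the reasoning concrete.

```python
# Minimal sketch: does a +1 frameshift at the start codon land on a stop codon?
STOPS = {"TAA", "TAG", "TGA"}


def plus_one_frameshift_hits_stop(cds):
    """True if residues 2-4 of the CDS (the +1 frame codon) form a stop codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) < 4:
        raise ValueError("need at least four residues to read the +1 codon")
    return cds[1:4] in STOPS


if __name__ == "__main__":
    # Hypothetical CDS starts: ATG followed by A gives the +1 codon TGA
    print(plus_one_frameshift_hits_stop("ATGAAACGT"))  # fourth residue A
    print(plus_one_frameshift_hits_stop("ATGCAACGT"))  # fourth residue C
```

Because the second and third residues of any NTG start codon are T and G, the +1 codon is a stop only when the fourth residue is A, which is the site the abstract reports as enriched.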
Affiliation(s)
- Liam Abrahams
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom

17
Liu L, Yu S, Chen R, Lv X, Pan C. A novel synonymous SNP (A47A) of the TMEM95 gene is significantly associated with the reproductive traits related to testis in male piglets. Arch Anim Breed 2017. [DOI: 10.5194/aab-60-235-2017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Transmembrane protein 95 (TMEM95) is located on the acrosomal membrane of the sperm head involved in the acrosome reaction; thus, it is regarded as affecting spermatogenesis and reproduction traits. The aim of this study was to explore the novel single nucleotide polymorphisms (SNPs) within the pig TMEM95 gene as well as to evaluate their associations with the testicular sizes in male Landrace (LD) and Large White (LW) breeds. After pool sequencing and bioinformatics analysis, only one novel coding SNP was found in exon 1, namely NC_010454.3: g.341T > C, resulting in a synonymous mutation (A47A). This SNP could be genotyped using the StuI polymerase chain reaction–restriction fragment length polymorphism (PCR-RFLP) assay. The minor allelic frequencies (MAFs) were 0.259 and 0.480 in the LD and LW breeds. Their polymorphism information content (PIC) values were 0.310 and 0.375. The LW population was at the Hardy–Weinberg equilibrium (HWE) (p > 0.05), whereas the LD population was not (p < 0.05). Association analyses demonstrated that a significant relationship was found between this A47A polymorphism and testis weight at 40 days of age in the LW population (p = 0.047), and the heterozygote individuals showed lower testis weight than those with other genotypes. Moreover, this SNP was significantly associated with three testis measurement traits at 15 days of age in the LW population (p < 0.05); the individuals with genotypes TT and TC showed consistently superior testis measurement traits than those with genotype CC. These findings demonstrate that the A47A polymorphism had a significant effect on testis measurement traits, suggesting that the TMEM95 gene could be a candidate gene associated with reproductive traits. These results could contribute to breeding and genetics programs in the pig industry via DNA marker-assisted selection (MAS).
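The PIC values quoted in this abstract can be reproduced from the reported minor allele frequencies, assuming the standard formula for a biallelic marker, PIC = 1 - (p² + q²) - 2p²q². The abstract does not state which formula the authors used, so this is an assumption, and the function name `pic_biallelic` is introduced here for illustration.

```python
# Minimal sketch: polymorphism information content for a biallelic SNP,
# assuming the standard formula PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2.


def pic_biallelic(maf):
    """Return PIC given the minor allele frequency of a two-allele marker."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    q = maf
    p = 1.0 - maf
    expected_heterozygosity = 1.0 - (p ** 2 + q ** 2)
    return expected_heterozygosity - 2.0 * p ** 2 * q ** 2


if __name__ == "__main__":
    # MAFs reported in the abstract for the Landrace and Large White breeds
    for breed, maf in (("LD", 0.259), ("LW", 0.480)):
        print(f"{breed}: PIC = {pic_biallelic(maf):.3f}")
```

With MAFs of 0.259 and 0.480 this gives 0.310 and 0.375, matching the values in the abstract, which supports the assumption about the formula. The Hardy–Weinberg test is not sketched because the abstract does not report genotype counts.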

18
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK

19
McCarthy C, Carrea A, Diambra L. Bicodon bias can determine the role of synonymous SNPs in human diseases. BMC Genomics 2017; 18:227. [PMID: 28288557 PMCID: PMC5347174 DOI: 10.1186/s12864-017-3609-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 08/09/2016] [Accepted: 03/04/2017] [Indexed: 01/09/2023]
Abstract
Background: For a long time, synonymous single nucleotide polymorphisms were considered silent mutations. However, it is now well known that they can affect protein conformation and function, leading to altered disease susceptibilities, differential prognosis and/or drug responses, among other clinically relevant genetic traits. This occurs through different mechanisms: by disrupting the splicing signals of precursor mRNAs, affecting regulatory binding sites of transcription factors and miRNAs, or by modifying the secondary structure of mRNAs. Results: In this paper we considered 22 human genetic diseases or traits, linked to 35 synonymous single nucleotide polymorphisms in 27 different genes. We performed a local sequence context analysis in terms of the ribosomal pause propensity affected by synonymous single nucleotide polymorphisms. We found that synonymous mutations related to the above-mentioned mechanisms presented small pause propensity changes, whereas synonymous mutations that were not related to those mechanisms presented large pause propensity changes. On the other hand, we did not observe large variations in the codon usage of codons associated with these mutations. Furthermore, we showed that the changes in the pause propensity associated with benign sSNPs are significantly lower than the pause propensity changes related to sSNPs associated with diseases. Conclusions: These results suggest that the genetic diseases or traits related to synonymous mutations with large pause propensity changes could be the consequence of another mechanism underlying non-silent synonymous mutations, namely alternative protein configuration related, in turn, to alterations in the ribosome-mediated translational attenuation program encoded by pairs of consecutive codons rather than by single codons. These findings shed light on the latter mechanism based on the perturbation of the co-translational folding process.
Affiliation(s)
- Christina McCarthy
- Centro Regional de Estudio Génomicos, Universidad Nacional de La Plata, Boulevard 120, La Plata, Argentina; CONICET, Buenos Aires, Argentina; Departamento de Informática y Tecnología, Escuela de Ciencias Agrarias, Naturales y Ambientales, Universidad Nacional del Noroeste de la Provincia de Buenos Aires, Pergamino, Argentina
- Alejandra Carrea
- Centro Regional de Estudio Génomicos, Universidad Nacional de La Plata, Boulevard 120, La Plata, Argentina; CONICET, Buenos Aires, Argentina
- Luis Diambra
- Centro Regional de Estudio Génomicos, Universidad Nacional de La Plata, Boulevard 120, La Plata, Argentina; CONICET, Buenos Aires, Argentina

20
Abstract
Synonymous mutations do not change the sequence of the polypeptide but they may still influence fitness. We investigated in Salmonella enterica how four synonymous mutations in the rpsT gene (encoding ribosomal protein S20) reduce fitness (i.e., growth rate) and the mechanisms by which this cost can be genetically compensated. The reduced growth rates of the synonymous mutants were correlated with reduced levels of the rpsT transcript and S20 protein. In an adaptive evolution experiment, these fitness impairments could be compensated by mutations that either caused up-regulation of S20 through increased gene dosage (due to duplications), increased transcription of the rpsT gene (due to an rpoD mutation or mutations in rpsT), or increased translation from the rpsT transcript (due to rpsT mutations). We suggest that the reduced levels of S20 in the synonymous mutants result in production of a defective subpopulation of 30S subunits lacking S20 that reduce protein synthesis and bacterial growth and that the compensatory mutations restore S20 levels and the number of functional ribosomes. Our results demonstrate how specific synonymous mutations can cause substantial fitness reductions and that many different types of intra- and extragenic compensatory mutations can efficiently restore fitness. Furthermore, this study highlights that also synonymous sites can be under strong selection, which may have implications for the use of dN/dS ratios as signature for selection.
Affiliation(s)
- Anna Knöppel
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Joakim Näsvall
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Dan I Andersson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden

21
Wu X, Hurst LD. Determinants of the Usage of Splice-Associated cis-Motifs Predict the Distribution of Human Pathogenic SNPs. Mol Biol Evol 2016; 33:518-29. [PMID: 26545919 PMCID: PMC4866546 DOI: 10.1093/molbev/msv251] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 11/03/2015] [Revised: 10/21/2015] [Accepted: 10/25/2015] [Indexed: 12/11/2022]
Abstract
Where in genes do pathogenic mutations tend to occur and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular, we focus on the intragene distribution of exonic splicing enhancers (ESE) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, five factors established as trends in ESE usage, are considered: relative position in exons, relative position in genes, flanking intron size, splice sites usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends where ESEs generally reside, this being 13% more than expected. Overall from enrichment of pathogenic SNPs at exon ends, we estimate that approximately 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they are found to occur preferentially in the terminal half of genes, in exons flanked by short introns and at the ends of phase (0,0) exons with 3' non-"AGgt" splice site. We suggest the concept of the "fragile" exon, one home to pathogenic SNPs owing to its vulnerability to splice disruption owing to low ESE density.
Affiliation(s)
- XianMing Wu
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Laurence D Hurst
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom

22
Abstract
Exonic splice enhancers (ESEs) are short nucleotide motifs, enriched near exon ends, that enhance the recognition of the splice site and thus promote splicing. Are intronless genes under selection to avoid these motifs so as not to attract the splicing machinery to an mRNA that should not be spliced, thereby preventing the production of an aberrant transcript? Consistent with this possibility, we find that ESEs in putative recent retrocopies are at a higher density and evolving faster than those in other intronless genes, suggesting that they are being lost. Moreover, intronless genes are less dense in putative ESEs than intron-containing ones. However, this latter difference is likely due to the skewed base composition of intronless sequences, a skew that is in line with the general GC richness of few exon genes. Indeed, after controlling for such biases, we find that both intronless and intron-containing genes are denser in ESEs than expected by chance. Importantly, nucleotide-controlled analysis of evolutionary rates at synonymous sites in ESEs indicates that the ESEs in intronless genes are under purifying selection in both human and mouse. We conclude that on the loss of introns, some but not all, ESE motifs are lost, the remainder having functions beyond a role in splice promotion. These results have implications for the design of intronless transgenes and for understanding the causes of selection on synonymous sites.
Affiliation(s)
- Rosina Savisaar
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, United Kingdom

23
Beh LY, Müller MM, Muir TW, Kaplan N, Landweber LF. DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome. Genome Res 2015; 25:1727-38. [PMID: 26330564 PMCID: PMC4617968 DOI: 10.1101/gr.188516.114] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 12/15/2014] [Accepted: 08/20/2015] [Indexed: 12/13/2022]
Abstract
A conserved hallmark of eukaryotic chromatin architecture is the distinctive array of well-positioned nucleosomes downstream from transcription start sites (TSS). Recent studies indicate that trans-acting factors establish this stereotypical array. Here, we present the first genome-wide in vitro and in vivo nucleosome maps for the ciliate Tetrahymena thermophila. In contrast with previous studies in yeast, we find that the stereotypical nucleosome array is preserved in the in vitro reconstituted map, which is governed only by the DNA sequence preferences of nucleosomes. Remarkably, this average in vitro pattern arises from the presence of subsets of nucleosomes, rather than the whole array, in individual Tetrahymena genes. Variation in GC content contributes to the positioning of these sequence-directed nucleosomes and affects codon usage and amino acid composition in genes. Given that the AT-rich Tetrahymena genome is intrinsically unfavorable for nucleosome formation, we propose that these “seed” nucleosomes—together with trans-acting factors—may facilitate the establishment of nucleosome arrays within genes in vivo, while minimizing changes to the underlying coding sequences.
Affiliation(s)
- Leslie Y Beh
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08544, USA
- Manuel M Müller
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Tom W Muir
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Noam Kaplan
- Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Laura F Landweber
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08544, USA

24
Bush SJ, Kover PX, Urrutia AO. Lineage-specific sequence evolution and exon edge conservation partially explain the relationship between evolutionary rate and expression level in A. thaliana. Mol Ecol 2015; 24:3093-106. [PMID: 25930165 PMCID: PMC4480654 DOI: 10.1111/mec.13221] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/04/2014] [Revised: 04/21/2015] [Accepted: 04/28/2015] [Indexed: 02/06/2023]
Abstract
Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the ‘edges’ of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters.
Affiliation(s)
- Stephen J Bush
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Paula X Kover
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Araxi O Urrutia
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK

25
Effects of codon usage on gene expression: empirical studies on Drosophila. J Mol Evol 2015; 80:219-26. [PMID: 25838108 PMCID: PMC4408374 DOI: 10.1007/s00239-015-9675-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 01/16/2015] [Accepted: 03/29/2015] [Indexed: 10/28/2022]
Abstract
For most amino acids, more than one codon can be used. Many hypotheses have been put forward to account for patterns of uneven use of synonymous codons (codon usage bias) that most often have been indirectly tested primarily by analyses of patterns. Direct experimental tests of effects of synonymous codon usage are available for unicellular organisms, however empirical data addressing this problem in multicellular eukaryotes are sparse. We have developed a flexible transfecting plasmid that allows us to empirically test the effects of different codons on transcription and translation and present data from Drosophila. We could detect no significant effects of codon usage on transcription. With regard to translation, optimal codons (most used) produce higher levels of protein expression compared to non-optimal codons if the effect of difference in thermodynamic stability of secondary structure of the 5' mRNA ribosome-binding site is controlled for. These results are consistent with what has been found in bacteria and thus expand the generality of these principles to multicellular eukaryotes.
26
Abstract
Numerous computational methods exist to assess the mode and strength of natural selection in protein-coding sequences, yet how distinct methods relate to one another remains largely unknown. Here, we elucidate the relationship between two widely used phylogenetic modeling frameworks: dN/dS models and mutation-selection (MutSel) models. We derive a mathematical relationship between dN/dS and scaled selection coefficients, the focal parameters of MutSel models, and use this relationship to gain deeper insight into the behaviors, limitations, and applicabilities of these two modeling frameworks. We prove that, if all synonymous changes are neutral, standard MutSel models correspond to dN/dS ≤ 1. However, if synonymous codons differ in fitness, dN/dS can take on arbitrarily high values even if all selection is purifying. Thus, the MutSel modeling framework cannot necessarily accommodate positive, diversifying selection, while dN/dS cannot distinguish between purifying selection on synonymous codons and positive selection on amino acids. We further propose a new benchmarking strategy of dN/dS inferences against MutSel simulations and demonstrate that the widely used Goldman-Yang-style dN/dS models yield substantially biased dN/dS estimates on realistic sequence data. In contrast, the less frequently used Muse-Gaut-style models display much less bias. Strikingly, the least-biased and most precise dN/dS estimates are never found in the models with the best fit to the data, measured through both AIC and BIC scores. Thus, selecting models based on goodness-of-fit criteria can yield poor parameter estimates if the models considered do not precisely correspond to the underlying mechanism that generated the data. In conclusion, establishing mathematical links among modeling frameworks represents a novel, powerful strategy to pinpoint previously unrecognized model limitations and strengths.
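The bound dN/dS ≤ 1 under neutral synonymous change can be checked numerically. The following is a minimal sketch, not the authors' code and not necessarily their exact derivation. It assumes a Halpern-Bruno-style relative fixation rate S/(1 - exp(-S)), uniform symmetric mutation across all single-nucleotide changes (so stationary codon frequencies are proportional to exp(F)), and one scaled fitness per amino acid shared by all of its synonymous codons. The names `fixation_factor` and `dnds_from_fitness` are hypothetical.

```python
import itertools
import math

# Standard genetic code, bases ordered T, C, A, G (third position varies fastest).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa
        for c, aa in zip(itertools.product(BASES, repeat=3), AMINO)}
SENSE = [c for c, aa in CODE.items() if aa != "*"]


def fixation_factor(s):
    """Relative fixation rate s / (1 - exp(-s)); equals 1 in the neutral limit."""
    if abs(s) < 1e-10:
        return 1.0
    return s / (1.0 - math.exp(-s))


def dnds_from_fitness(aa_fitness):
    """dN/dS implied by per-amino-acid scaled fitnesses (assumed model).

    Synonymous changes are neutral here, so dS is 1 and the result is the
    expected nonsynonymous rate divided by its neutral expectation.
    """
    fit = {c: aa_fitness[CODE[c]] for c in SENSE}
    z = sum(math.exp(f) for f in fit.values())
    pi = {c: math.exp(f) / z for c, f in fit.items()}  # stationary frequencies
    observed = neutral = 0.0
    for c in SENSE:
        for pos in range(3):
            for b in BASES:
                if b == c[pos]:
                    continue
                d = c[:pos] + b + c[pos + 1:]
                # Skip stop codons and synonymous neighbours.
                if CODE[d] == "*" or CODE[d] == CODE[c]:
                    continue
                observed += pi[c] * fixation_factor(fit[d] - fit[c])
                neutral += pi[c]
    return observed / neutral
```

Under these assumptions, equal fitness for all amino acids gives a ratio of 1, and any fitness differences among amino acids push the ratio below 1.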
Affiliation(s)
- Stephanie J Spielman
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute of Cellular and Molecular Biology, The University of Texas at Austin
- Claus O Wilke
- Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute of Cellular and Molecular Biology, The University of Texas at Austin
27
Wu X, Hurst LD. Why Selection Might Be Stronger When Populations Are Small: Intron Size and Density Predict within and between-Species Usage of Exonic Splice Associated cis-Motifs. Mol Biol Evol 2015; 32:1847-61. [PMID: 25771198 PMCID: PMC4476162 DOI: 10.1093/molbev/msv069] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 01/26/2023] Open
Abstract
The nearly neutral theory predicts that small effective population size provides the conditions for weakened selection. This is postulated to explain why our genome is more “bloated” than that of, for example, yeast, ours having large introns and large intergene spacer. If a bloated genome is also an error prone genome might it, however, be the case that selection for error-mitigating properties is stronger in our genome? We examine this notion using splicing as an exemplar, not least because large introns can predispose to noisy splicing. We thus ask whether, owing to genomic decay, selection for splice error-control mechanisms is stronger, not weaker, in species with large introns and small populations. In humans much information defining splice sites is in cis-exonic motifs, most notably exonic splice enhancers (ESEs). These act as splice-error control elements. Here then we ask whether within and between-species intron size is a predictor of the commonality of exonic cis-splicing motifs. We show that, as predicted, the proportion of synonymous sites that are ESE-associated and under selection in humans is weakly positively correlated with the size of the flanking intron. In a phylogenetically controlled framework, we observe, also as expected, that mean intron size is both predicted by Ne.μ and is a good predictor of cis-motif usage across species, this usage coevolving with splice site definition. Unexpectedly, however, across taxa intron density is a better predictor of cis-motif usage than intron size. We propose that selection for splice-related motifs is driven by a need to avoid decoy splice sites that will be more common in genes with many and large introns. That intron number and density predict ESE usage within human genes is consistent with this, as is the finding of intragenic heterogeneity in ESE density. 
As intronic content and splice site usage across species is also well predicted by Ne.μ, the result also suggests an unusual circumstance in which selection (for cis-modifiers of splicing) might be stronger when population sizes are smaller, as here splicing is noisier, resulting in a greater need to control error-prone splicing.
Affiliation(s)
- XianMing Wu
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
28
Haerty W, Ponting CP. Unexpected selection to retain high GC content and splicing enhancers within exons of multiexonic lncRNA loci. RNA (New York, N.Y.) 2015; 21:333-46. [PMID: 25589248 PMCID: PMC4338330 DOI: 10.1261/rna.047324.114] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 07/16/2014] [Accepted: 11/25/2014] [Indexed: 06/04/2023]
Abstract
If sequencing was possible only for genomes, and not for RNAs or proteins, then functional protein-coding exons would be recognizable by their unusual patterns of nucleotide composition, specifically a high GC content across the body of exons, and an unusual nucleotide content near their edges. RNAs and proteins can, of course, be sequenced but the extent of functionality of intergenic long noncoding RNAs (lncRNAs) remains under question owing to their low nucleotide conservation. Inspired by the nucleotide composition patterns of protein-coding exons, we sought evidence for functionality across lncRNA loci from diverse species. We found that such patterns across multiexonic lncRNA loci mirror those of protein-coding genes, although to a lesser degree: specifically, compared with introns, lncRNA exons are GC rich. Additionally, we report evidence for the action of purifying selection to preserve exonic splicing enhancers within human multiexonic lncRNAs and nucleotide composition in fruit fly lncRNAs. Our findings provide evidence for selection for more efficient rates of transcription and splicing within lncRNA loci. Despite only a minor proportion of their RNA bases being constrained, multiexonic intergenic lncRNAs appear to require accurate splicing of their exons to transact their function.
29
Schüler A, Ghanbarian AT, Hurst LD. Purifying selection on splice-related motifs, not expression level nor RNA folding, explains nearly all constraint on human lincRNAs. Mol Biol Evol 2014; 31:3164-83. [PMID: 25158797 PMCID: PMC4245815 DOI: 10.1093/molbev/msu249] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/28/2022] Open
Abstract
There are two strong and equally important predictors of rates of human protein evolution: The amount the gene is expressed and the proportion of exonic sequence devoted to control splicing, mediated largely by selection on exonic splice enhancer (ESE) motifs. Is the same true for noncoding RNAs, known to be under very weak purifying selection? Prior evidence suggests that selection at splice sites in long intergenic noncoding RNAs (lincRNAs) is important. We now report multiple lines of evidence indicating that the great majority of purifying selection operating on lincRNAs in humans is splice related. Splice-related parameters explain much of the between-gene variation in evolutionary rate in humans. Expression rate is not a relevant predictor, although expression breadth is weakly so. In contrast to protein-coding RNAs, we observe no relationship between evolutionary rate and lincRNA stability. As in protein-coding genes, ESEs are especially abundant near splice junctions and evolve slower than non-ESE sequence equidistant from boundaries. Nearly all constraint in lincRNAs is at exon ends (N.B. the same is not witnessed in Drosophila). Although we cannot definitely answer the question as to why splice-related selection is so important, we find no evidence that splicing might enable the nonsense-mediated decay pathway to capture transcripts incorrectly processed by ribosomes. We find evidence consistent with the notion that splicing modifies the underlying chromatin through recruitment of splice-coupled chromatin modifiers, such as CHD1, which in turn might modulate neighbor gene activity. We conclude that most selection on human lincRNAs is splice mediated and suggest that the possibility of splice-chromatin coupling is worthy of further scrutiny.
Affiliation(s)
- Andreas Schüler
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Avazeh T Ghanbarian
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
30
Zhou K, Kuo A, Grigoriev IV. Reverse transcriptase and intron number evolution. Stem Cell Investig 2014; 1:17. [PMID: 27358863 DOI: 10.3978/j.issn.2306-9759.2014.08.01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/15/2014] [Accepted: 08/04/2014] [Indexed: 11/14/2022]
Abstract
BACKGROUND: Introns are universal in eukaryotic genomes and play important roles in transcriptional regulation, mRNA export to the cytoplasm, nonsense-mediated decay as both a regulatory and a splicing quality control mechanism, R-loop avoidance, alternative splicing, chromatin structure, and evolution by exon-shuffling. METHODS: Sixteen complete fungal genomes were used, 13 of which were sequenced and annotated by JGI. Ustilago maydis, Cryptococcus neoformans, and Coprinus cinereus (also named Coprinopsis cinerea) were from the Broad Institute. Gene models from JGI-annotated genomes were taken from the GeneCatalog track that contained the best representative gene models. Varying fractions of the GeneCatalog were manually curated by external users. For clarity, we used the JGI unique database identifier. RESULTS: The last common ancestor of eukaryotes (LECA) has an estimated 6.4 coding exons per gene (EPG) and evolved into the diverse eukaryotic life forms, which is recapitulated by the development of a stem cell. We found a parallel between the simulated reverse transcriptase (RT)-mediated intron loss and the comparative analysis of 16 fungal genomes that spanned a wide range of intron density. Although footprints of RT (RTF) were dynamic, relative intron location (RIL) to the 5'-end of mRNA faithfully traced RT-mediated intron loss and revealed 7.7 EPG for LECA. The mode of exon length distribution was conserved in simulated intron loss, which was exemplified by the shared mode of 75 nt between fungal and Chlamydomonas genomes. The dominant ancient exon length was corroborated by the average exon length of the most intron-rich genes in fungal genomes and consistent with ancient protein modules being ~25 aa. Combined with the conservation of a protein length of 400 aa, the earliest ancestor of eukaryotes could have 16 EPG. During earlier evolution, Ascomycota's ancestor had significantly more 3'-biased RT-mediated intron loss that was followed by dramatic RTF loss.
There was a downward trend in EPG from more conserved to less conserved genes. Moreover, species-specific genes have higher exon-densities, shorter exons, and longer introns when compared to genes conserved at the phylum level. However, intron length in species-specific genes became shorter than that of genes conserved in all species after genomes experienced drastic intron loss. The estimated EPG from the most frequent exon length is more than double that from the RIL method. CONCLUSIONS: This implies significant intron loss during the very early period of eukaryotic evolution. De novo gene-birth contributes to shorter exons, longer introns, and higher exon-density in species-specific genes relative to conserved genes.
Affiliation(s)
- Kemin Zhou
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Alan Kuo
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Igor V Grigoriev
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
31
Kessler MD, Dean MD. Effective population size does not predict codon usage bias in mammals. Ecol Evol 2014; 4:3887-900. [PMID: 25505518 PMCID: PMC4242573 DOI: 10.1002/ece3.1249] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/22/2014] [Revised: 08/04/2014] [Accepted: 08/07/2014] [Indexed: 12/20/2022] Open
Abstract
Synonymous codons are not used at equal frequency throughout the genome, a phenomenon termed codon usage bias (CUB). It is often assumed that interspecific variation in the intensity of CUB is related to species differences in effective population sizes (Ne), with selection on CUB operating less efficiently in species with small Ne. Here, we specifically ask whether variation in Ne predicts differences in CUB in mammals and report two main findings. First, across 41 mammalian genomes, CUB was not correlated with two indirect proxies of Ne (body mass and generation time), even though there was statistically significant evidence of selection shaping CUB across all species. Interestingly, autosomal genes showed higher codon usage bias compared to X-linked genes, and high-recombination genes showed higher codon usage bias compared to low recombination genes, suggesting intraspecific variation in Ne predicts variation in CUB. Second, across six mammalian species with genetic estimates of Ne (human, chimpanzee, rabbit, and three mouse species: Mus musculus, M. domesticus, and M. castaneus), Ne and CUB were weakly and inconsistently correlated. At least in mammals, interspecific divergence in Ne does not strongly predict variation in CUB. One hypothesis is that each species responds to a unique distribution of selection coefficients, confounding any straightforward link between Ne and CUB.
Affiliation(s)
- Michael D Kessler
- Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
- Matthew D Dean
- Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
32
Ma L, Cui P, Zhu J, Zhang Z, Zhang Z. Translational selection in human: more pronounced in housekeeping genes. Biol Direct 2014; 9:17. [PMID: 25011537 PMCID: PMC4100034 DOI: 10.1186/1745-6150-9-17] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 03/21/2014] [Accepted: 07/02/2014] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND: Translational selection is a ubiquitous and significant mechanism to regulate protein expression in prokaryotes and unicellular eukaryotes. Recent evidence has shown that translational selection is weakly operative in highly expressed genes in human and other vertebrates. However, it remains unclear whether translational selection acts differentially on human genes depending on their expression patterns. RESULTS: Here we report that human housekeeping (HK) genes, strictly defined as genes that are expressed ubiquitously and consistently in most or all tissues, are under stronger translational selection. CONCLUSIONS: These observations clearly show that translational selection is also closely associated with expression pattern. Our results suggest that human HK genes are more efficiently and/or accurately translated into proteins, which will inevitably open up a new understanding of HK genes and the regulation of gene expression. REVIEWERS: This article was reviewed by Yuan Yuan, Baylor College of Medicine; Han Liang, University of Texas MD Anderson Cancer Center (nominated by Dr Laura Landweber); Eugene Koonin, NCBI, NLM, NIH, United States of America; and Sandor Pongor, International Centre for Genetic Engineering and Biotechnology (ICGEB), Italy.
Affiliation(s)
- Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No. 1 Beichen West Road, Chaoyang District, Beijing 100101, China
33
Han F, Peng Y, Xu L, Xiao P. Identification, characterization, and utilization of single copy genes in 29 angiosperm genomes. BMC Genomics 2014; 15:504. [PMID: 24950957 PMCID: PMC4092219 DOI: 10.1186/1471-2164-15-504] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 11/19/2013] [Accepted: 06/17/2014] [Indexed: 01/01/2023] Open
Abstract
Background: Single copy genes are common across angiosperm genomes. With sufficiently high-quality sequenced genomes, the identification of large-scale single copy genes among multiple species is possible. Although some characteristics have been reported, our study provides novel insights into single copy genes. Results: We identified single copy genes across 29 angiosperm genomes. A significant negative correlation was found between the number of duplicate blocks and the number of single copy genes. We found that a considerable number of single copy genes are located in organelles, showing a preference for binding and catalytic activity. The analysis of effective number of codons (Nc) illustrates that single copy genes have a stronger codon bias than non-single copy genes in eudicots. The relatively high expression level of single copy genes was partially confirmed by the RNA-seq data, rather than the Codon Adaptation Index (CAI). Unlike in most other species, a strongly negative correlation occurs between Nc and GC3 among single copy genes in grass genomes. When compared to all non-single copy genes, single copy genes show more conservation (as indicated by Ka and Ks values). But our alternative splicing (AS) results reveal that selective constraints are weaker in single copy genes than in low copy family genes (1–10 in-paralogs) and stronger than in high copy family genes (>10 in-paralogs). Using concatenated shared single copy genes, we obtained a well-resolved phylogenetic tree. With the addition of intron sequences, the branch support is improved, but striking incongruences are also evident. Therefore, it is noteworthy that inclusion of intron sequences seems more appropriate for phylogenetic reconstruction at lower taxonomic levels. Conclusions: Our analysis provides insight into the evolutionary characteristics of single copy genes across 29 angiosperm genomes.
The results suggest that there are key differences in evolutionary constraints between single copy genes and non-single copy genes. And to some extent, these evolutionary constraints show some species-specific differences, especially between eudicots and monocots. Our preliminary evidence also suggests that the concatenated shared single copy genes are well suited for use in resolving phylogenetic relationships.
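GC3, which the abstract relates to Nc, is the G+C fraction at third codon positions. A minimal sketch of that statistic follows, assuming an in-frame coding sequence and counting every codon (some definitions exclude Met, Trp, and stop codons, which this sketch does not do). The function name `gc3` is hypothetical and is not taken from the study.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame sequence."""
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 long")
    thirds = seq[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG, GCC, AAG, TAA have third bases G, C, G, A.
print(gc3("ATGGCCAAGTAA"))  # 0.75
```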
Affiliation(s)
- Peigen Xiao
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Beijing 100193, PR China
34
Hunt RC, Simhadri VL, Iandoli M, Sauna ZE, Kimchi-Sarfaty C. Exposing synonymous mutations. Trends Genet 2014; 30:308-21. [PMID: 24954581 DOI: 10.1016/j.tig.2014.04.006] [Citation(s) in RCA: 231] [Impact Index Per Article: 21.0] [Received: 12/18/2013] [Revised: 04/16/2014] [Accepted: 04/17/2014] [Indexed: 12/12/2022]
Abstract
Synonymous codon changes, which do not alter protein sequence, were previously thought to have no functional consequence. Although this concept has been overturned in recent years, there is no unique mechanism by which these changes exert biological effects. A large repertoire of both experimental and bioinformatic methods has been developed to understand the effects of synonymous variants. Results from this body of work have provided global insights into how biological systems exploit the degeneracy of the genetic code to control gene expression, protein folding efficiency, and the coordinated expression of functionally related gene families. Although it is now clear that synonymous variants are important in a variety of contexts, from human disease to the safety and efficacy of therapeutic proteins, there is no clear consensus on the approaches to identify and validate these changes. Here, we review the diverse methods to understand the effects of synonymous mutations.
Affiliation(s)
- Ryan C Hunt
- Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Vijaya L Simhadri
- Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Matthew Iandoli
- Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Zuben E Sauna
- Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Chava Kimchi-Sarfaty
- Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
35
Falanga A, Stojanović O, Kiffer-Moreira T, Pinto S, Millán JL, Vlahoviček K, Baralle M. Exonic splicing signals impose constraints upon the evolution of enzymatic activity. Nucleic Acids Res 2014; 42:5790-8. [PMID: 24692663 PMCID: PMC4027185 DOI: 10.1093/nar/gku240] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023] Open
Abstract
Exon splicing enhancers (ESEs) overlap with amino acid coding sequences implying a dual evolutionary selective pressure. In this study, we map ESEs in the placental alkaline phosphatase gene (ALPP), absent in the corresponding exon of the ancestral tissue-non-specific alkaline phosphatase gene (ALPL). The ESEs are associated with amino acid differences between the transcripts in an area otherwise conserved. We switched out the ALPP ESEs sequences with the sequence from the related ALPL, introducing the associated amino acid changes. The resulting enzymes, produced by cDNA expression, showed different kinetic characteristics than ALPL and ALPP. In the organism, this enzyme will never be subjected to selection because gene splicing analysis shows exon skipping due to loss of the ESE. Our data prove that ESEs restrict the evolution of enzymatic activity. Thus, suboptimal proteins may exist in scenarios when coding nucleotide changes and consequent amino acid variation cannot be reconciled with the splicing function.
Affiliation(s)
- Alessia Falanga
- Molecular Pathology Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), Padriciano 99, 34149 Trieste, Italy
- Ozren Stojanović
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia
- Tina Kiffer-Moreira
- Sanford Children's Health Research Center, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Sofia Pinto
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia
- José Luis Millán
- Sanford Children's Health Research Center, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Kristian Vlahoviček
- Bioinformatics Group, Department of Molecular Biology, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000 Zagreb, Croatia; Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316 Oslo, Norway
- Marco Baralle
- Molecular Pathology Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), Padriciano 99, 34149 Trieste, Italy
36
Yona AH, Bloom-Ackermann Z, Frumkin I, Hanson-Smith V, Charpak-Amikam Y, Feng Q, Boeke JD, Dahan O, Pilpel Y. tRNA genes rapidly change in evolution to meet novel translational demands. eLife 2013; 2:e01339. [PMID: 24363105 PMCID: PMC3868979 DOI: 10.7554/elife.01339] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 12/05/2022] Open
Abstract
Changes in expression patterns may occur when organisms are presented with new environmental challenges, for example following migration or genetic changes. To elucidate the mechanisms by which the translational machinery adapts to such changes, we perturbed the tRNA pool of Saccharomyces cerevisiae by tRNA gene deletion. We then evolved the deletion strain and observed that the genetic adaptation was recurrently based on a strategic mutation that changed the anticodon of other tRNA genes to match that of the deleted one. Strikingly, a systematic search in hundreds of genomes revealed that anticodon mutations occur throughout the tree of life. We further show that the evolution of the tRNA pool also depends on the need to properly couple translation to protein folding. Together, our observations shed light on the evolution of the tRNA pool, demonstrating that mutation in the anticodons of tRNA genes is a common adaptive mechanism when meeting new translational demands.
DOI: http://dx.doi.org/10.7554/eLife.01339.001
Genes contain the blueprints for the proteins that are essential for countless biological functions and processes, and the path that leads from a particular gene to the corresponding protein is long and complex. The genetic information stored in the DNA must first be transcribed to produce a messenger RNA molecule, which then has to be translated to produce a string of amino acids that fold to form a protein. The translation step is performed by a molecular machine called the ribosome, with transfer RNA molecules bringing the amino acids that are needed to make the protein. The information in messenger RNA is stored as a series of letters, with groups of three letters called codons representing the different amino acids. Since there are four letters (A, C, G and U), it is possible to form 64 different codons.
And since there are only 20 amino acids, two or more different codons can specify the same amino acid (for example, AGU and AGC both specify serine), and two or more different transfer RNA molecules can take this amino acid to the ribosome. Moreover, some codons are found more often than others in the messenger RNA molecules, so the genes that encode the related transfer RNA molecules are more common than the genes for other transfer RNA molecules. Environmental pressures mean that organisms must adapt to survive, with some genes and proteins increasing in importance, and others becoming less important. Clearly the relative numbers of the different transfer RNA molecules will also need to change to reflect these evolutionary changes, but the details of how this happens were not understood. Now Yona et al. have explored this issue by studying yeast cells that lack a gene for one of the less common transfer RNA molecules (corresponding to the codon AGG, which specifies the amino acid arginine). At first this mutation resulted in slower growth of the yeast cells, but after being allowed to evolve over 200 generations, the rate of growth matched that of a normal strain with all transfer RNA genes. Yona et al. found that the gene for a more common transfer RNA molecule, corresponding to the codon AGA, which also specifies arginine, had mutated to AGG. As a result, the mutated yeast was eventually able to produce proteins as quickly as wild type yeast. Moreover, further experiments showed that the levels of some transfer RNAs are kept deliberately low in order to slow down the production of proteins so as to ensure that the proteins assume their correct structure. But does the way these cells evolved in the lab resemble what happened in nature? To address this question Yona et al. examined a database of transfer RNA sequences from more than 500 species, and found evidence for the same codon-based switching mechanism in many species across the tree of life. 
DOI:http://dx.doi.org/10.7554/eLife.01339.002
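The counting argument in the digest (four letters, 64 codons, 20 amino acids, several codons per amino acid) can be reproduced directly. This is a small illustrative sketch using the standard genetic code written in RNA letters. The table layout (bases ordered U, C, A, G) and the variable names are choices made here and do not come from the article.

```python
from collections import Counter
from itertools import product

# Standard genetic code; "*" marks stop codons. Third position varies fastest.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons assigned to each amino acid (stops excluded).
codons_per_aa = Counter(aa for aa in CODE.values() if aa != "*")

print(len(CODE))                  # 64
print(len(codons_per_aa))         # 20
print(CODE["AGU"], CODE["AGC"])   # S S  (both serine)
print(CODE["AGA"], CODE["AGG"])   # R R  (both arginine)
```

AGA and AGG are the two arginine codons involved in the yeast experiment described above.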
Affiliation(s)
- Avihu H Yona
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
37
Cáceres EF, Hurst LD. The evolution, impact and properties of exonic splice enhancers. Genome Biol 2013; 14:R143. [PMID: 24359918 PMCID: PMC4054783 DOI: 10.1186/gb-2013-14-12-r143] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 08/05/2013] [Accepted: 12/20/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In humans, much of the information specifying splice sites is not at the splice site. Exonic splice enhancers are one of the principal non-splice site motifs. Four high-throughput studies have provided a compendium of motifs that function as exonic splice enhancers, but only one, RESCUE-ESE, has been generally employed to examine the properties of enhancers. Here we consider these four datasets to ask whether there is any consensus on the properties and impacts of exonic splice enhancers. RESULTS: While only about 1% of all the identified hexamer motifs are common to all analyses, we can define reasonably sized sets that are found in most datasets. These consensus intersection datasets we presume reflect the true properties of exonic splice enhancers. Given prior evidence for the properties of enhancers and splice-associated mutations, we ask for all datasets whether the exonic splice enhancers considered are purine enriched; enriched near exon boundaries; able to predict trends in relative codon usage; slow evolving at synonymous sites; rare in SNPs; associated with weak splice sites; and enriched near longer introns. While the intersect datasets match expectations, only one original dataset, RESCUE-ESE, does. Unexpectedly, a fully experimental dataset identifies motifs that commonly behave opposite to the consensus, for example, being enriched in exon cores where splice-associated mutations are rare. CONCLUSIONS: Prior analyses that used the RESCUE-ESE set of hexamers captured the properties of consensus exonic splice enhancers. We estimate that at least 4% of synonymous mutations are deleterious owing to an effect on enhancer functioning.
|
38
|
Yang YF, Zhu T, Niu DK. Association of intron loss with high mutation rate in Arabidopsis: implications for genome size evolution. Genome Biol Evol 2013; 5:723-33. [PMID: 23516254 PMCID: PMC4104619 DOI: 10.1093/gbe/evt043] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 12/13/2022]
Abstract
Despite the prevalence of intron losses during eukaryotic evolution, the selective forces acting on them have not been extensively explored. Arabidopsis thaliana lost half of its genome and experienced an elevated rate of intron loss after diverging from A. lyrata. The selective force for genome reduction was suggested to have driven the intron loss. However, the evolutionary mechanism of genome reduction is still a matter of debate. In this study, we found that intron-lost genes have high synonymous substitution rates. Assuming that differences in mutability among different introns are conserved among closely related species, we used the nucleotide substitution rate between orthologous introns in other species as the proxy of the mutation rate of Arabidopsis introns, either lost or extant. The lost introns were found to have higher mutation rates than extant introns. At the genome-wide level, A. thaliana has a higher mutation rate than A. lyrata, which correlates with the higher rate of intron loss and rapid genome reduction of A. thaliana. Our results indicate that selection to minimize mutational hazards might be the selective force for intron loss, and possibly also for genome reduction, in the evolution of A. thaliana. Small genome size and lower genome-wide intron density were widely reported to be correlated with phenotypic features, such as high metabolic rates and rapid growth. We argue that the mutational-hazard hypothesis is compatible with these correlations, by suggesting that selection for rapid growth might indirectly increase mutational hazards.
Affiliation(s)
- Yu-Fei Yang
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, China
|
39
|
Familial Alzheimer's disease coding mutations reduce Presenilin-1 expression in a novel genomic locus reporter model. Neurobiol Aging 2013; 35:443.e5-443.e16. [PMID: 24011544 DOI: 10.1016/j.neurobiolaging.2013.07.026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/18/2013] [Revised: 07/28/2013] [Accepted: 07/31/2013] [Indexed: 01/13/2023]
Abstract
We have generated a physiologically relevant bacterial artificial chromosome (BAC)-based genomic DNA expression model to study PS1 gene expression and function. The PS1-WT-BAC construct restored γ-secretase function, whereas the mutant PS1 BACs demonstrated partial to complete loss of enzymatic activity when stably expressed in a PS double knock-out clonal cell line. We then engineered WT and mutant human PS1-BAC-Luciferase whole genomic locus reporter transgenes, which we transiently transduced in mouse and human non-neuronal and neuronal-like cells, respectively. PS1 ΔE9 and C410Y FAD were found to lower PS1 gene expression in both cell lines, whereas PS1-M146V showed a neuron-specific effect. The nonclinical γ-secretase inactive PS1-D257A mutation did not alter gene expression in either cell line. This is the first time that pathogenic coding mutations in the PS1 gene have been shown to lower PS1 gene expression. These findings may represent a pathologic mechanism for PS1 FAD mutations independent of their effects on γ-secretase activity and demonstrate how dominant PS1 mutations may exert their pathogenic effects by a loss-of-function mechanism.
|
40
|
De Maio N, Schlötterer C, Kosiol C. Linking great apes genome evolution across time scales using polymorphism-aware phylogenetic models. Mol Biol Evol 2013; 30:2249-62. [PMID: 23906727 PMCID: PMC3773373 DOI: 10.1093/molbev/mst131] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 12/17/2022]
Abstract
The genomes of related species contain valuable information on the history of the considered taxa. Great apes in particular exhibit variation of evolutionary patterns along their genomes. However, the great ape data also bring new challenges, such as the presence of incomplete lineage sorting and ancestral shared polymorphisms. Previous methods for genome-scale analysis are restricted to very few individuals or cannot disentangle the contribution of mutation rates and fixation biases. This represents a limitation both for the understanding of these forces as well as for the detection of regions affected by selection. Here, we present a new model designed to estimate mutation rates and fixation biases from genetic variation within and between species. We relax the assumption of instantaneous substitutions, modeling substitutions as mutational events followed by a gradual fixation. Hence, we straightforwardly account for shared ancestral polymorphisms and incomplete lineage sorting. We analyze genome-wide synonymous site alignments of human, chimpanzee, and two orangutan species. From each taxon, we include data from several individuals. We estimate mutation rates and GC-biased gene conversion intensity. We find that both mutation rates and biased gene conversion vary with GC content. We also find lineage-specific differences, with weaker fixation biases in orangutan species, suggesting a reduced historical effective population size. Finally, our results are consistent with directional selection acting on coding sequences in relation to exonic splicing enhancers.
Affiliation(s)
- Nicola De Maio
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
|
41
|
Doherty A, McInerney JO. Translational selection frequently overcomes genetic drift in shaping synonymous codon usage patterns in vertebrates. Mol Biol Evol 2013; 30:2263-7. [PMID: 23883522 DOI: 10.1093/molbev/mst128] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 01/15/2023]
Abstract
Synonymous codon usage patterns are shaped by a balance between mutation, drift, and natural selection. To date, detection of translational selection in vertebrates has proven to be a challenging task, obscured by small long-term effective population sizes in larger animals and the existence of isochores in some species. The consensus is that, in such species, natural selection is either completely ineffective at overcoming mutational pressures and genetic drift or perhaps is effective but so weak that it is not detectable. The aim of this research is to understand the interplay between mutation, selection, and genetic drift in vertebrates. We observe that although variation in mutational bias is undoubtedly the dominant force influencing codon usage, translational selection acts as a weak additional factor influencing synonymous codon usage. These observations indicate that translational selection is a widespread phenomenon in vertebrates and is not limited to a few species.
Affiliation(s)
- Aoife Doherty
- Bioinformatics and Molecular Evolution Unit, Department of Biology, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland
|
42
|
Behura SK, Singh BK, Severson DW. Antagonistic relationships between intron content and codon usage bias of genes in three mosquito species: functional and evolutionary implications. Evol Appl 2013; 6:1079-89. [PMID: 24187589 PMCID: PMC3804240 DOI: 10.1111/eva.12088] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 04/08/2013] [Accepted: 06/14/2013] [Indexed: 12/17/2022]
Abstract
Genome biology of mosquitoes holds potential in developing knowledge-based control strategies against vectorborne diseases such as malaria, dengue, West Nile, and others. Although the genomes of three major vector mosquitoes have been sequenced, attempts to elucidate the relationship between intron and codon usage bias across species in phylogenetic contexts are limited. In this study, we investigated the relationship between intron content and codon bias of orthologous genes among three vector mosquito species. We found an antagonistic relationship between codon usage bias and the intron number of genes in each mosquito species. The pattern is further evident among the intronless and the intron-containing orthologous genes associated with either low or high codon bias among the three species. Furthermore, the covariance between codon bias and intron number has a directional component associated with the species phylogeny when compared with other nonmosquito insects. By applying a maximum likelihood-based continuous regression method, we show that codon bias and intron content of genes vary among the insects in a phylogeny-dependent manner, but with no evidence of adaptive radiation or species-specific adaptation. We discuss the functional and evolutionary significance of antagonistic relationships between intron content and codon bias.
Affiliation(s)
- Susanta K Behura
- Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
|
43
|
Synonymous codon changes in the oncogenes of the cottontail rabbit papillomavirus lead to increased oncogenicity and immunogenicity of the virus. Virology 2013; 438:70-83. [PMID: 23433866 DOI: 10.1016/j.virol.2013.01.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 11/28/2012] [Revised: 12/21/2012] [Accepted: 01/09/2013] [Indexed: 12/30/2022]
Abstract
Papillomaviruses use rare codons with respect to the host. The reasons for this are incompletely understood but among the hypotheses is the concept that rare codons result in low protein production and this allows the virus to escape immune surveillance. We changed rare codons in the oncogenes E6 and E7 of the cottontail rabbit papillomavirus to make them more mammalian-like and tested the mutant genomes in our in vivo animal model. While the amino acid sequences of the proteins remained unchanged, the oncogenic potential of some of the altered genomes increased dramatically. In addition, increased immunogenicity, as measured by spontaneous regression, was observed as the numbers of codon changes increased. This work suggests that codon usage may modify protein production in ways that influence disease outcome and that evaluation of synonymous codons should be included in the analysis of genetic variants of infectious agents and their association with disease.
|
44
|
Wu X, Tronholm A, Cáceres EF, Tovar-Corona JM, Chen L, Urrutia AO, Hurst LD. Evidence for deep phylogenetic conservation of exonic splice-related constraints: splice-related skews at exonic ends in the brown alga Ectocarpus are common and resemble those seen in humans. Genome Biol Evol 2013; 5:1731-45. [PMID: 23902749 PMCID: PMC3787667 DOI: 10.1093/gbe/evt115] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Accepted: 07/25/2013] [Indexed: 12/22/2022]
Abstract
The control of RNA splicing is often modulated by exonic motifs near splice sites. Chief among these are exonic splice enhancers (ESEs). Well-described ESEs in mammals are purine rich and cause predictable skews in codon and amino acid usage toward exonic ends. Looking across species, those with relatively abundant intronic sequence are those with the more profound end of exon skews, indicative of exonization of splice site recognition. To date, the only intron-rich species that have been analyzed are mammals, precluding any conclusions about the likely ancestral condition. Here, we examine the patterns of codon and amino acid usage in the vicinity of exon-intron junctions in the brown alga Ectocarpus siliculosus, a species with abundant large introns, known SR proteins, and classical splice sites. We find that amino acids and codons preferred/avoided at both 3' and 5' ends in Ectocarpus, of which there are many, tend, on average, to also be preferred/avoided at the same exon ends in humans. Moreover, the preferences observed at the 5' ends of exons are largely the same as those at the 3' ends, a symmetry trend only previously observed in animals. We predict putative hexameric ESEs in Ectocarpus and show that these are purine rich and that there are many more of these identified as functional ESEs in humans than expected by chance. These results are consistent with deep phylogenetic conservation of SR protein binding motifs. Assuming codons preferred near boundaries are "splice optimal" codons, in Ectocarpus, unlike Drosophila, splice optimal and translationally optimal codons are not mutually exclusive. The exclusivity of translationally optimal and splice optimal codon sets is thus not universal.
Affiliation(s)
- XianMing Wu
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Ana Tronholm
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Present address: Department of Biological Sciences, University of Alabama, Mary Harmon Bryant Hall, Tuscaloosa, AL
- Eva Fernández Cáceres
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Jaime M. Tovar-Corona
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Lu Chen
- Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, United Kingdom
- Araxi O. Urrutia
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
- Laurence D. Hurst
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
|
45
|
Hershberg R, Petrov DA. On the limitations of using ribosomal genes as references for the study of codon usage: a rebuttal. PLoS One 2012; 7:e49060. [PMID: 23284622 PMCID: PMC3527481 DOI: 10.1371/journal.pone.0049060] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 11/29/2011] [Accepted: 10/05/2012] [Indexed: 01/08/2023]
Abstract
In a recent paper published in PLOS ONE, Wang et al. challenge our finding that the identity of optimal codons in different genomes follows a set of clear rules. Here we provide a rebuttal of their paper and demonstrate that the results of our original PLOS Genetics paper stand. This provides us with an opportunity to bring up an aspect of how codon usage has been studied that should be of general interest. The Wang et al. study, as well as many other studies, used ribosomal genes as a reference set for the study of patterns of codon usage. We discuss here the assumptions that are made in order to justify using ribosomal genes to study codon bias, suggest that this practice can at times be problematic, and discuss its limitations.
Affiliation(s)
- Ruth Hershberg
- Rachel & Menachem Mendelovitch Evolutionary Processes of Mutation & Natural Selection Research Laboratory, Department of Genetics, Technion-Israel Institute of Technology, Haifa, Israel.
|
46
|
Williams C, Hoppe HJ, Rezgui D, Strickland M, Forbes BE, Grutzner F, Frago S, Ellis RZ, Wattana-Amorn P, Prince SN, Zaccheo OJ, Nolan CM, Mungall AJ, Jones EY, Crump MP, Hassan AB. An exon splice enhancer primes IGF2:IGF2R binding site structure and function evolution. Science 2012; 338:1209-13. [PMID: 23197533 PMCID: PMC4658703 DOI: 10.1126/science.1228633] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 12/16/2022]
Abstract
Placental development and genomic imprinting coevolved with parental conflict over resource distribution to mammalian offspring. The imprinted genes IGF2 and IGF2R code for the growth promoter insulin-like growth factor 2 (IGF2) and its inhibitor, mannose 6-phosphate (M6P)/IGF2 receptor (IGF2R), respectively. M6P/IGF2R of birds and fish do not recognize IGF2. In monotremes, which lack imprinting, IGF2 specifically bound M6P/IGF2R via a hydrophobic CD loop. We show that the DNA coding the CD loop in monotremes functions as an exon splice enhancer (ESE) and that structural evolution of binding site loops (AB, HI, FG) improved therian IGF2 affinity. We propose that ESE evolution led to the fortuitous acquisition of IGF2 binding by M6P/IGF2R that drew IGF2R into parental conflict; subsequent imprinting may then have accelerated affinity maturation.
Affiliation(s)
- Christopher Williams
- Department of Organic and Biological Chemistry, School of Chemistry, University of Bristol, Bristol BS8 1TS, UK
|
47
|
Liou SW, Huang YF. An exon/intron disparity framework based on the nucleotide profile of single sequence. Network Modeling Analysis in Health Informatics and Bioinformatics 2012. [DOI: 10.1007/s13721-012-0007-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
|
48
|
Künstner A, Nabholz B, Ellegren H. Significant selective constraint at 4-fold degenerate sites in the avian genome and its consequence for detection of positive selection. Genome Biol Evol 2011; 3:1381-9. [PMID: 22042333 PMCID: PMC3242499 DOI: 10.1093/gbe/evr112] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Accepted: 10/24/2011] [Indexed: 12/15/2022]
Abstract
A major conclusion from comparative genomics is that many sequences that do not code for proteins are conserved beyond neutral expectations, indicating that they evolve under the influence of purifying selection and are likely to have functional roles. Due to the degeneracy of the genetic code, synonymous sites within protein-coding genes have previously been seen as "silent" with respect to function and thereby invisible to selection. However, there are indications that synonymous sites of vertebrate genomes are also subject to selection and this is not necessarily because of potential codon bias. We used divergence in ancestral repeats as a neutral reference to estimate the constraint on 4-fold degenerate sites of avian genes in a whole-genome approach. In the pairwise comparison of chicken and zebra finch, constraint was estimated at 24-32%. Based on three-species alignments of chicken, turkey, and zebra finch, lineage-specific estimates of constraint were 43%, 29%, and 24%, respectively. The finding of significant constraint at 4-fold degenerate sites from data on interspecific divergence was replicated in an analysis of intraspecific diversity in the chicken genome. These observations corroborate recent data from mammalian genomes and call for a reappraisal of the use of synonymous substitution rates as neutral standards in molecular evolutionary analysis, for example, in the use of the well-known d(N)/d(S) ratio and in inferences on positive selection. We show by simulations that the rate of false positives in the detection of positively selected genes and sites increases several-fold at the levels of constraint at 4-fold degenerate sites found in this study.
Affiliation(s)
- Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
49
|
Lin MF, Kheradpour P, Washietl S, Parker BJ, Pedersen JS, Kellis M. Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes. Genome Res 2011; 21:1916-28. [PMID: 21994248 DOI: 10.1101/gr.108753.110] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Indexed: 02/05/2023]
Abstract
The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes, especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.
Affiliation(s)
- Michael F Lin
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
50
|
Zago P, Buratti E, Stuani C, Baralle FE. Evolutionary connections between coding and splicing regulatory regions in the fibronectin EDA exon. J Mol Biol 2011; 411:1-15. [PMID: 21663748 DOI: 10.1016/j.jmb.2011.05.031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2011] [Revised: 05/16/2011] [Accepted: 05/20/2011] [Indexed: 01/03/2023]
Abstract
Research on exonic coding sequences has demonstrated that many substitutions at the amino acid level may also reflect profound changes at the level of splicing regulatory regions. These results have revealed that, for many alternatively spliced exons, there is considerable pressure to strike a balance between two different and sometimes conflicting forces: the drive to improve the quality and production efficiency of proteins and the maintenance of proper exon recognition by the splicing machinery. Up to now, the systems used to investigate these connections have mostly focused on short alternatively spliced exons that contain a high density of splicing regulatory elements. Although this is obviously a desirable feature in order to maximize the chances of spotting connections, it also complicates the process of drawing straightforward evolutionary pathways between different species (because of the numerous alternative pathways through which the same end point can be achieved). The alternatively spliced fibronectin extra domain A exon (also referred to as EDI or EIIIA) does not have these limitations, as its inclusion is already known to depend on a single exonic splicing enhancer element within its sequence. In this study, we have compared the rat and human fibronectin EDA exons with regard to RNA structure, exonic splicing enhancer strengths, and SR protein occupancy. The results gained from these analyses have then been used to perform an accurate evaluation of EDA sequences observed in a wide range of animal species. This comparison strongly suggests the existence of an evolutionary connection between changes at the nucleotide levels and the need to maintain efficient EDA recognition in different species.
Affiliation(s)
- Paola Zago
- International Center for Genetic Engineering and Biotechnology, Trieste, Italy
|